Large Scale Gene Expression Analysis Using DNA Microarrays
May 10-15, 2004, in Turku, Finland
Organised by Turku Centre for Biotechnology, University of Turku and
Åbo Akademi University.
Report
1. Programme and organisers
2. List of participants (May 10, minisymposium)
3. Minisymposium lecture presentations, synopsis and conclusions
4. Workshop groups and participants
5. Concluding remarks of the workshop
1. Programme and organisers
Minisymposium Session I: Microarray techniques and applications
9.00-10.00 Joerg Hoheisel (DKFZ, Heidelberg, Germany): Use of complex DNA- and antibody-microarrays as tools in functional analyses
10.00-10.30 Coffee
10.30-11.0 Laszlo Puskas (Biological Research Center, Hungarian Academy of Sciences, University of Szeged, Szeged, Hungary): Gene expression profiling the effects of dietary omega-3 polyunsaturated fatty acids in brain
11.30-12.00 Tapio Visakorpi (University of Tampere): Microarrays and prostate cancer
12.00-13.00 Lunch
13.00-13.30 Olli Kallioniemi (VTT Biotechnology, Turku, Finland): Development and applications of novel biochip technologies
13.30-14.00 Riitta Lahesmaa (Turku Centre for Biotechnology, Turku, Finland): Microarrays in the analysis of lymphocyte response
14.00-14.30 Coffee
Session II: Bioinformatics and microarray data analysis
14.30-15.00 Mark Reimers (Genomics and Bioinformatics Group, Laboratory of Molecular Pharmacology, Bethesda, MD): Bio-statistics and network analysis
15.00-15.30 Stephen Rudd (Bioinformatics, Turku Centre for Biotechnology, Turku, Finland): Bioinformatics and microarray experiments: is this more than just statistics?
15.30-16.00 John N Weinstein (Bioinformatics Group, Laboratory of Molecular Pharmacology, Bethesda, MD): Title to be announced
16.00 Closing remarks
Workshop programme
Lecture day 1, May 11
- Welcome
- DNA Microarrays
- Oligonucleotide microarrays (Affymetrix GeneChips)
- Oligonucleotide microarrays versus cDNA Microarrays
- Experimental design, special issues to be taken into account when working with microarrays
- Microarrays in research, case study of research done with microarrays
- Introduction of hands on training day, what to expect
Experiment Day, May 12
Carrying out a microarray experiment. Hands-on day: making a
one-slide microarray experiment. Participants can bring their own
RNAs (control and sample) for the array or obtain them from us.
Both human and mouse arrays are available.
Data Extraction, May 13-14
Data extraction
Scanning of the slides and transforming the images into numbers.
Introduction to hybridization quality control software.
Lecture Day 2, May 15
- Data filtering
- Microarray Normalization
- Statistics in microarrays
- Finding differentially expressed genes
- Clustering and classification of genes
Organisers
Riitta Lahesmaa (Director of the Turku Centre for Biotechnology, University of Turku and Åbo Akademi University)
Tapio Salakoski (Head of the Department of Information Technology, University of Turku)
Stephen Rudd (Head of Bioinformatics Laboratory, Turku Centre for Biotechnology, University of Turku and Åbo Akademi University)
Annika Brandt (Microarray Team Group-Leader, DNA Microarray Centre, Turku Centre for Biotechnology, University of Turku and Åbo Akademi University)
2. List of participants (minisymposium May 10, 2004)
Altogether, the number of participants (96 plus organizers) was 101.
Those who had preregistered and signed the list are shown below:
| Name | Gender | Age | Country |
| Ahlfors Helena | F | 25 | Finland |
| Alatalo Ira | F | 25 | Finland |
| Aranko Kari | M | 42 | Finland |
| Eriksson Susann | F | 24 | Finland |
| Haapaben-Paananen Saija | F | 36 | Finland |
| Haaranen Paivi | F | 27 | Finland |
| Hamilton Hamish | M | 24 | England |
| Hoti Fabian | M | 24 | Turkey |
| Jansen Tove | M | 27 | Finland |
| Junni Paivi | F | 35 | Finland |
| Jarvinen Anna | F | 25 | Finland |
| Kaur Sippy | F | 26 | India |
| Kyttä Kaisa | F | 26 | Finland |
| Kakonen Sanna | F | 27 | Finland |
| Laakso Sanna | F | 25 | Finland |
| Lemmetyinen Juha | M | 42 | Finland |
| Leonard Paul | M | 60 | France |
| Lundan Tuija | F | 36 | Finland |
| Mohammed Reza Dawoudi | M | 37 | Turkey |
| Naillat Florence | F | 27 | France |
| Nikula Tuomas | M | 26 | Finland |
| Oikarinen Anne | F | 25 | Finland |
| Ojala Kirsi | F | 27 | Finland |
| Pedrono Eric | M | 58 | Italy |
| Pelkonen Jenni | F | 26 | Finland |
| Pessi Anna-Mari | F | 26 | Finland |
| Piippo Mirva | F | 27 | Russia |
| Pivanovich Irina | F | 27 | Finland |
| Pohjanvirta Raimo | M | 26 | Finland |
| Pulkkinen Leena | F | 27 | Finland |
| Saloranta Carola | F | 46 | Finland |
| Shichao Ge | M | 42 | China |
| Siitari Harri | M | 45 | Finland |
| Sipila Petra | F | 26 | Finland |
| Soitamo Arto | M | 45 | Finland |
| Suominen Tiina | F | 37 | Finland |
| Tasa Eeva | F | 42 | Finland |
| Teerijoki Heli | F | 36 | Finland |
| Thekkedeth Kurian Dominic | M | 27 | France |
| Tikkanen Mikko | M | 26 | Finland |
| Timala Jarno | M | 41 | Finland |
| Valve Eeva | F | 43 | Finland |
| Vauhkonen Hanna | F | 26 | Finland |
| Venho Reija | F | 39 | Finland |
| Vuoristo Jussi | M | 41 | Finland |
| Xiujuan Li | M | 27 | China |
| Aledje Balde | M | 41 | Portugal |
| Hoheisel Joerg | M | 38 | Germany |
| Riitta Lahesmaa | F | 43 | Finland |
| Bartolomeu A. Santos | M | 30 | Portugal |
| Patricia Maciel | F | 35 | Portugal |
| Ana Rodrigues | F | 22 | Portugal |
| Rocio Martinez-A | F | 26 | Spain |
| Monica Sebastiana | F | 36 | Portugal |
| Sissel Monstad | F | 35 | Norway |
| Anette Knudsen | F | 29 | Norway |
| Antonio Duarte | M | 35 | Portugal |
| Gigliotti Sandra | F | 26 | Luxembourg |
| Hatzenbichler Evelyn | F | 27 | Austria |
| Andreia Figueiredo | F | 24 | Portugal |
| Clabaut Celine | F | 24 | France |
| Ogorman Grace | F | 26 | Ireland |
| Corcoran Deirdre | F | 24 | Ireland |
| Corin Irina | F | 34 | Sweden |
| Andrea Chini | M | 30 | England |
| Leinberger Dirk | M | 27 | Germany |
| Piotr Bielecki | M | 24 | Poland |
| Gumma Elkhabbuli | M | 45 | England |
| Osman Sezerman | M | 42 | Turkey |
| Heikki Koskinen | M | 36 | Finland |
| Johanna Tuomela | F | 27 | Finland |
| Jing_Jiang Zhou | M | 35 | England |
| Tapio Lonnberg | M | 48 | Finland |
| Berit Eitrem | F | 34 | Norway |
| Laila Stordrange | F | 30 | Norway |
| Signe Indahl | F | 27 | Norway |
| Tiina Tomperi | F | 22 | Finland |
| Miia Antikainen | F | 22 | Finland |
| Juha Mykkanen | M | 36 | Finland |
| Sultana Akter | M | 24 | Turkey |
| Pirkko Heino | F | 46 | Finland |
| Nyyssonen Mari | F | 25 | Finland |
| Leena Ahonen | F | 24 | Finland |
| Asta Varis | F | 24 | Finland |
| Benny Abraham | M | 30 | Germany |
| Daniel Picart | M | 56 | Turkey |
| Moussa Hommady | M | 27 | France |
| Leif Viklund | M | 30 | Finland |
| Petri Susi | M | 27 | Finland |
| Ozgur Gul | M | 24 | Turkey |
| Silvia Barth | F | 33 | Germany |
| Senay Vural Korkut | F | 35 | Turkey |
| Laszlo Puskas | M | 36 | Hungary |
| Janos Kelemen | M | 24 | Hungary |
| Claudina PerezNovo | F | 33 | Belgium |
| Rosanne Asselta | F | 34 | Italy |
3. Lectures May 10, minisymposium
Dr Joerg Hoheisel, Head of Functional Genome Analysis Group,
DKFZ German Cancer Research Centre, Germany
The Division of Functional Genome Analysis at the DKFZ is
involved in the development of technologies for the analysis
of DNA-encoded function and its regulation. Current work emphasizes
DNA, protein and peptide microarrays. Apart from addressing
chemical and biophysical issues, such as highly parallel in
situ peptide synthesis and optimisation of the surfaces of protein
arrays, the group immediately puts the resulting methods to the
test in relevant, biologically driven projects. Besides analyses
of all genes of various model organisms, the system is being
developed into a tool for early diagnosis, prognosis and
evaluation of the success of disease treatment.
Use of complex DNA- and antibody microarrays as tools in functional analyses
Microarray technology has come a long way and is now applied
as a routine method in biological and biomedical research.
Transcriptional profiling and detection of single nucleotide
polymorphisms (SNPs) are by far the most widely applied forms of
analysis. Raw information on SNPs is required before associations
between polymorphisms and phenotypic variation can be identified
in epidemiological studies. Epigenomic analysis relies on C to T
conversions; 4% of all cytosines in the human genome are methylated
at the C-5 position. DNA oligonucleotide microarrays have, however,
not been successful in epigenetic analyses that aim to cover,
e.g., all methylation sites in a single experiment, although
methods such as on-chip primer extension and minisequencing were
mentioned as having improved this type of analysis considerably.
The application of peptide nucleic acids (PNA) in an array format
could offer an alternative to DNA oligonucleotide array
technology. PNA oligomers are synthetic DNA mimics with an
amide backbone and have several advantageous features: they
are stable in acidic conditions, they are resistant to
nucleases and proteases, and their neutral backbone
increases the binding strength to complementary DNA compared
with the corresponding DNA:DNA duplex. PNA probes can therefore
be shorter than DNA oligonucleotide probes. In addition, mismatches
have a more destabilising effect in PNA probes than in DNA
oligonucleotide probes. PNA hybridizations can be performed in
low-salt or no-salt conditions owing to the neutral backbone,
which leads to less secondary structure in the target DNA and
better accessibility to the probe molecules. The most important
feature of PNA probes lies in the mode of detection: PNA:DNA
or PNA:RNA duplexes can be visualized by time-of-flight secondary
ion mass spectrometry (TOF-SIMS) through detection of the
phosphates that are constituents of DNA but not of PNA molecules.
Combining PNA microarrays with TOF-SIMS detection therefore has
the potential to provide a highly sensitive method for the
detection of unlabelled DNA or RNA.
Production methods of PNA arrays: 1) the in situ Spot method, or
2) application of prefabricated PNA molecules. Production is
expensive, since PNA chemistry is not widely used. 3) An Fmoc
chemistry developed by J. Hoheisel's group permits a fully
automated process. After synthesis, only the full-length
molecules are attached to the microarray surface, by selective
binding of their terminal thiol or biotin groups, while shorter,
incomplete reaction products are washed away.
Antibody microarrays could have an enormous impact on
functional analysis by expression profiling. This type of
analysis could also become invaluable in disease diagnosis.
One of the major problems is that the array surface has a
profound influence on the results. The attachment of the
antibodies also creates problems, since it influences their
functional properties. Antigens in a mixture should all bind to
their cognate antibodies regardless of their distinct structural
features. Parameters that can be varied include the modification
of the glass surface, the kind and length of crosslinkers, the
composition and pH of the spotting buffer, the type of blocking
reagents, the antibody concentration, and the antibody storage
buffer.
In conclusion, Dr Hoheisel summarized that expression profiling
in the array format is having an enormous impact, and that it is
impossible to predict at present where the development will lead.
The mainstream is concentrating on gene expression profiling and
SNP genotyping experiments as well as on epigenetic analyses.
Protein arrays are more difficult to optimize and await further
development.
Laszlo Puskas (Biological Research Center, Hungarian
Academy of Sciences, University of Szeged, Szeged, Hungary):
Gene expression profiling the effects of dietary omega-3
polyunsaturated fatty acids in brain
The dietary effects of n-3 polyunsaturated fatty acids were
analysed in rat brains using self-made rat brain, liver and
ganglion cDNA microarrays.
Study 1: Rats were fed either a high linolenic acid (perilla oil)
or a high eicosapentaenoic + docosahexaenoic acid (fish oil) diet
(8%), and the fatty acid and molecular species composition of
ethanolamine phosphoglycerides was determined. The gene expression
pattern resulting from the feeding of n-3 fatty acids was also
studied. Perilla oil feeding, in contrast to fish oil feeding, was
not reflected in the total fatty acid composition of ethanolamine
phosphoglycerides. Levels of the alkenylacyl subclass of
ethanolamine phosphoglycerides increased in response to feeding.
In the same fashion, the levels of diacyl phosphatidylethanolamine
molecular species containing docosahexaenoic acid (18:0/22:6)
were higher in perilla-fed or fish oil-fed rat brains, in
contrast to those in ethanolamine plasmalogens, which remained
unchanged. Using cDNA microarrays, 55 genes were found to
be overexpressed and 47 suppressed relative to controls
by both dietary regimens. The altered genes included those
controlling synaptic plasticity, cytoskeleton and membrane
association, signal transduction, ion channel formation, energy
metabolism, and regulatory proteins. The effect seems to be
independent of the chain length of the fatty acids, but the n-3
structure appears to be important.
Study 2: Rats were fed from conception until adulthood either
with normal rat chow with a linoleic (LA) to linolenic acid (LNA)
ratio of 8.2:1 or with rat chow supplemented with a mixture of
perilla and soybean oil giving an LA to LNA ratio of 4.7:1. The
fat content of the feed was 5%. The fatty acid and molecular
species composition of ethanolamine phosphoglycerides was
determined, and the effect of this diet on gene expression was
also studied. Docosahexaenoic (DHA) and arachidonic acids (AA)
accumulated in the brains of the experimental animals. Changes
were observed in the ratio of sn-1 saturated, sn-2 docosahexaenoic
to sn-1 monounsaturated, sn-2 docosahexaenoic species. Twenty
genes were found to be overexpressed in response to the 4.7:1
mixture diet and four were down-regulated compared to normal rat
chow. Among them were genes related to energy metabolism, lipid
metabolism and respiration.
It was concluded that the brain sensitively reflects the fatty
acid composition of the diet. It was suggested that "alteration
in membrane architecture and function coupled with alterations
in gene expression profiles may contribute to the observed
beneficial impact of n-3 type polyunsaturated fatty acids
on cognitive functions".
Study 3: Advanced age is associated with reduced brain levels of
the long-chain polyunsaturated fatty acids arachidonic acid (AA)
and docosahexaenoic acid (DHA). Memory impairment is also a common
phenomenon at this age. Two-year-old, essential fatty
acid-sufficient rats were fed with fish oil (11% DHA) for 1 month,
and the fatty acid and molecular species composition of the major
phospholipids, phosphatidylcholine and phosphatidylethanolamine
(PE), was compared with that of 2-month-old rats on the same diet.
DHA but not AA was significantly reduced in the brains of old
rats but was restored to the level of young rats when fish oil
was included in the regular chow. This effect was pronounced
for the diacyl 18:0/22:6 PE species, whereas levels of 18:1/22:6
and 16:0/22:6 remained unchanged in all three PE subclasses.
Fish oil reduced AA in the old rat brains; diacyl and
alkenylacyl 18:0/20:4 PE were most affected. Phosphatidylcholines
gave a less pronounced response. Six genes were up-regulated in
young rat brains, whereas no significant changes were observed in
the brains of old rats receiving fish oil for 1 month. None of
them except synuclein in young rat brains could be related to
mental functions. Old rats on the fish-oil diet did not perform
better in the Morris water maze test than the controls. A 10%
increase in levels of diacyl 18:0/22:6 PE in young rat brains
resulted in a significant improvement of learning ability. The
results are interpreted in terms of the roles of different
phospholipid molecular species in cognitive functions, coupled
with differential responsiveness of the genetic machinery of
neurons to n-3 polyunsaturated fatty acids.
In conclusion, the rat brain responds to n-3 fatty acids in
an age- and feeding-time-dependent manner. It is still unclear
whether DHA acts directly on gene expression in neurons
or whether the gene expression changes and the changes in
membrane architecture are unrelated events.
Tapio Visakorpi (Institute of Medical Technology, University
of Tampere):
Microarrays and prostate cancer
The rationale for studying the molecular mechanisms of
malignancies is that they may provide better tools for the
diagnostics, prognostics and treatment of cancer. Good examples
are trastuzumab, an antibody against the ERBB2 oncoprotein, and
imatinib, a tyrosine kinase inhibitor that suppresses the
activity of the ABL oncogene. Neither drug is effective in
prostate cancer. Recently Visakorpi's group has shown
amplification of the uPA gene, and PC-3 cells carrying this
amplification are sensitive to uPA inhibitors. The amplification
of the uPA gene seems to indicate that the invasive property of
these cells depends on uPA activity. Amplification of uPA is,
however, not frequent in prostate cancer, and it is not a
relevant target in the common form of the disease.
2. Genetic predisposition, as indicated by studies of mono- and
dizygotic twins. The genetic predisposition may be attributable
to high- and low-penetrance genes, which increase the risk of
cancer severalfold (hereditary cancers). Linkage analyses have
revealed several chromosomal loci and three putative
susceptibility genes: ELAC2, RNASEL, and MSR1. Some susceptibility
genes (RB1, PTEN, TP53, APC) are also mutated in sporadic
malignancies. Therefore ELAC2, RNASEL and MSR1 were screened for
somatic mutations in 50 unselected prostate carcinomas. No
evidently functional mutations were found; such mutations are
thus rare in prostate cancer. Polymorphisms of many genes, such as
the androgen receptor (AR) gene, have been suggested to be
associated with the risk of prostate cancer. At present none of
the sequence variations can be regarded as definitely associated
with prostate cancer.
3. Somatic alterations. Cytogenetic studies, analysis of loss of
heterozygosity (LOH) and comparative genomic hybridization (CGH)
have been used to detect chromosomal aberrations in prostate
cancer. However, because prostate cancer cells do not grow well
in vitro, traditional cytogenetics has been uninformative.
Only a few genome-wide LOH analyses have been performed.
CGH has mainly been used to detect gains and losses of DNA
sequence copy number, and has revealed that losses are more
common than gains or amplifications. Chromosomal losses
are detected at early stages of prostate cancer, whereas gains
and amplifications are seen in hormone-refractory tumors.
Chromosomal losses in prostate cancer occur at 6q, 8p, 10q, 13q,
16q and 18q, indicating the locations of tumor suppressor genes.
Hormone-refractory and metastatic tumors show gains at 7p/q, 8q
and Xq in CGH.
The two most common deletions are at 8p and 13q. The minimally
deleted regions at 8p are 8p21 and 8p22, with NKX3.1 (a homeobox
gene) as the most promising target, as well as N33, FEZ1 and
PRTLS. The second most common deletion occurs at 13q and is
associated with the aggressiveness of prostate cancer. The
strongest targets identified are RB1 and EDNRB (an endothelin
receptor gene reported to be hypermethylated and downregulated in
prostate cancer).
Gain of 8q is most frequent in hormone-refractory and
metastatic tumors. The most intensively studied gene in this
region is MYC, which is also amplified in prostate carcinomas.
Using suppression subtraction and cDNA microarrays, Visakorpi's
group has identified four putative target genes for the 8q gain:
Elongin C, EIF3S3, KIAA0196, and RAD21. They seem to be amplified
in 20-30% of hormone-refractory prostate carcinomas. EIF3S3 is
associated with a high Gleason score and an advanced clinical
stage of the disease. Other suggested target genes include PSCA
and TRPS1.
GSTP1 hypermethylation has been suggested as a diagnostic marker
for prostate cancer (glutathione S-transferases are detoxifying
enzymes that protect cells from carcinogenic factors). The
human homeobox gene NKX3.1 is frequently deleted in prostate
cancer. Homozygous and heterozygous mutant mice develop prostate
cancer. PTEN functions as a lipid phosphatase and targets
PIP-3. By dephosphorylating PIP-3, PTEN downregulates the
Akt/PKB pathway, which promotes cell survival and inhibits
apoptosis. Deletions and mutations of the PTEN gene have been
reported infrequently in prostate carcinomas. The frequency of
LOH at the PTEN locus has been reported to be higher (40%) than
the rate of mutations in prostate cancer. Alternative mechanisms
could involve haploinsufficiency. TP53 is the most commonly
mutated gene in human cancers. In response to DNA damage, TP53
can either induce apoptosis or arrest the cell cycle for DNA
repair. Mutated TP53 has a prolonged half-life, leading to
nuclear accumulation of the abnormal protein, which can be
detected immunohistochemically. It is rare in early prostate
carcinomas but is found in 20-40% of advanced cases. Nuclear
localization is associated with poor prognosis. AR signalling
pathways are re-activated during progression of hormone-refractory
prostate cancer. AR mutations are detected in 20-25% of patients
treated with anti-androgens. Visakorpi's group has demonstrated
that the AR gene is amplified in 30% of hormone-refractory
prostate carcinomas from patients treated with androgen
withdrawal, which selects for the gene amplification. The
mechanism of AR overexpression without amplification is unknown.
Summary:
Genetic predisposition to prostate cancer is an extremely
complicated issue. It is likely that no high-penetrance prostate
cancer genes exist (comparable to BRCA1 and BRCA2 in breast
cancer). ELAC2, RNASEL and MSR1 contribute to a minute fraction
of prostate carcinomas. Traditional epidemiological studies
have not been able to identify the major environmental risk
factors. Is the hypermethylation of GSTP1 a primary event?
NKX3.1 and PTEN have been shown to possess haploinsufficiency
characteristics. How common is this mechanism? AR signaling
is the best known aberrant pathway in prostate cancer. How
can this mechanism be utilized? Model systems are also limited.
Xenografts have recently been established. Hopefully these
models will boost the efforts to develop novel targeted treatments
for prostate cancer.
Riitta Lahesmaa (Turku Centre for Biotechnology, University of
Turku and Åbo Akademi University, Turku, Finland)
Defects in the polarization of the T helper subtypes Th1 and
Th2 can result in various immune-mediated diseases such as
asthma. To understand the development of these diseases
it is essential to know the process at the molecular
level. Both Th1 and Th2 cells originate from a common precursor
cell, Thp. The differentiation is initiated in response to
activation through the T cell receptor, costimulatory molecules
and cytokine receptors. The main cytokine that directs Th1
commitment is IL12, whereas IL4 drives Th2 polarization. The
effects of IL12 and IL4 are mediated through Stat4 and Stat6,
respectively. Other key regulatory factors involved in Th1 and
Th2 differentiation are T-bet and Gata-3, respectively.
We and others have conducted large-scale gene expression analyses
using oligonucleotide arrays to identify genes involved in the
differentiation process after 2 days or later. In order to
resolve the molecular mechanisms leading to Th1 and Th2 lineage
commitment, it is crucial to define the upstream factors at
the very earliest phase of Th cell commitment. The aims of our
study include the identification of the immediate early genes
that are differentially regulated in response to activation and
to the Th1- and Th2-inducing cytokines IL12 and IL4.
Affymetrix U95Av2 arrays containing probes for ~9300 genes
were used to study the changes in gene expression profiles
after 2 and 6 hours of CD3/CD28 activation and induction of
Th1 and Th2 polarization. After 2 h, activation alone had
upregulated 437 and downregulated 361 probe sets as compared
to the Thp cells. The peak of activation-mediated changes
was seen after 6 h, when activation had induced upregulation
of 832 and downregulation of 856 probe sets.
To identify the genes differentially expressed by cells
induced to polarize in the Th1 or Th2 direction, the gene
expression profiles of cells cultured for 2 h or 6 h in Th1
polarizing conditions (anti-CD3/anti-CD28/IL12) were compared
with those in Th2 polarizing conditions (anti-CD3/anti-CD28/IL4).
As a result, a total of 63 genes were identified as differentially
expressed by cells induced to polarize in the Th1 or Th2
direction.
To further characterize the genes regulated by IL12 or IL4,
the expression profiles of cells induced to polarize in the Th1
or Th2 direction were compared with the expression profiles of
CD3/CD28-activated cells cultured without polarizing cytokines.
This comparison revealed that the early changes in gene
expression were mainly driven by IL4. The only genes that
were regulated by IL12 after 6 h were IFN-gamma (1.87-fold)
and GBP1 (1.62-fold). Of the 63 genes differentially regulated
in Th1 and Th2 conditions, 26 showed regulation by IL4 at both
2 h and 6 h. The early polarization is thus mainly driven by IL4,
since the activated Th cells seem unresponsive to IL12.
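The comparison described above (a fold change relative to the activated-only cells, then keeping the genes that respond at both time points) can be sketched as follows. This is only an illustration, not the group's actual analysis pipeline: the gene names, the signal values, the 1.5-fold cut-off and the function names are all assumptions made for the example.

```python
def fold_change(treated, control):
    """Ratio treated/control for every gene measured in both conditions."""
    return {
        gene: treated[gene] / control[gene]
        for gene in treated
        if gene in control and control[gene] > 0
    }


def regulated(ratios, cutoff=1.5):
    """Genes changed at least `cutoff`-fold up or down (cut-off is an assumption)."""
    return {g for g, r in ratios.items() if r >= cutoff or r <= 1.0 / cutoff}


# Toy signal values; gene names and numbers are hypothetical.
activated_2h = {"GENE_A": 100.0, "GENE_B": 200.0, "GENE_C": 50.0}
il4_2h = {"GENE_A": 250.0, "GENE_B": 210.0, "GENE_C": 20.0}
activated_6h = {"GENE_A": 120.0, "GENE_B": 180.0, "GENE_C": 60.0}
il4_6h = {"GENE_A": 300.0, "GENE_B": 400.0, "GENE_C": 55.0}

hits_2h = regulated(fold_change(il4_2h, activated_2h))
hits_6h = regulated(fold_change(il4_6h, activated_6h))

# Genes regulated by the cytokine at both time points.
regulated_at_both = hits_2h & hits_6h
print(sorted(regulated_at_both))
```

In practice the cut-off would be chosen together with a statistical test and replicate measurements; a bare ratio threshold is used here only to keep the sketch short.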
To illustrate the putative functional roles of the newly identified
immediate targets of IL4, the genes were grouped into functional
categories based on Gene Ontology annotations. The dominating
functional groups consisted of transcription factors, cell
adhesion molecules and receptors, enzymes, and other intracellular
signaling molecules. Most of the genes in these groups were
induced by IL4 by 2 hours of polarization.
As expected from previous studies, the genes that displayed
constant changes throughout the early polarization were the
well-known mediators of Th1 and Th2 differentiation GATA3,
MAF and IFNG. Functional classification of the immediate target
genes of IL4 revealed that one of the dominating functional
groups consisted of transcription factors. Although the role of
many of these factors in Th1 and Th2 polarization is currently
unknown, genes such as SATB1 and TCF7 have functions similar to
those of GATA3 and have been associated with pathogenesis
mediated by Th1 or Th2 responses.
In a distinct setting, where early polarization of CD4+ lymphocytes
was studied, the effects of the presence and absence of TGF-β
were examined. About 20 novel genes were identified during
Th1/Th2 polarization, together with further target genes
associated with the function of IL-12, IL-4 and TGF-β. A subset
of the identified target genes was observed to be coregulated by
IL-12, IL-4 and TGF-β (TNFSF9, E4BP4/NFIL3, CTLA1/GZMB, ID2,
Cox-2, GNAI1, PLA2G4A, and BCL2A1). The antagonizing effect of
TGF-β on the expression of these genes regulated by IL-12 or IL-4
could in part explain the inhibitory effect of TGF-β on Th
differentiation. In mouse models TGF-β1 has suppressed the airway
inflammation associated with asthma. Therefore TGF-β target genes
in Th2 differentiation could also serve as potential drug targets
for therapeutic approaches to treat asthma and allergy. Mouse
studies have indicated that the inhibitory actions of TGF-β
involve suppression of T-bet and Gata3 in Th1 and Th2
differentiation, respectively. In Lahesmaa's studies these genes
were not found among the numerous primary genes regulated by
TGF-β in human cells. The basis of these differences in gene
regulation during early Th differentiation between mouse and
human remains to be elucidated.
In summary, to resolve the precise role in T helper cell
differentiation of the genes identified by Riitta Lahesmaa's
group, the following questions remain to be answered: 1) the
hierarchical order of the target genes, 2) upstream factors,
3) interacting factors in the signaling pathways involved in the
differentiation process, 4) the functions of the target genes,
5) their significance for Th differentiation, and 6) whether they
can be used to modulate Th1 and Th2 responses. For further studies
of the potential target genes in Th differentiation, RNAi was
mentioned as the method of choice for studying the function
of these genes.
Olli Kallioniemi (VTT Medical Biotechnology, Turku,
Finland)
Development and applications of novel biochip technologies
Olli Kallioniemi presented an extensive overview of different DNA
microarray technologies, their uses and applications, and a
comparison between distinct platforms. He also covered the
applicability of protein arrays and the major problems in their
use and application. Olli Kallioniemi's group has developed and
applied three general strategies to facilitate "genome-scale"
translational cancer research, as well as validation and extension
of traditional microarray experiments:
1) Application of CGH microarrays or NMD (nonsense-mediated RNA
decay) microarrays to guide the search towards genes that could be
primary genetic alterations or causative events in the multi-step
progression of cancer. These could also represent attractive
drug targets. Parallel CGH and cDNA microarray studies have
revealed 270 such candidate targets in breast cancer. NMD
microarrays, in turn, may highlight mutated transcripts with
a possible tumor suppressive function.
2) Cell-based microarrays using the reverse transfection approach
(Ziauddin & Sabatini, 2001), which are based on printing
cDNAs, siRNAs, drugs or other reagents on microscope slides
and plating cells to grow on top of the array to establish
a high-density molecular matrix for exploring cell function.
Functional cell-based microarray studies provide fundamentally
different data as compared to traditional microarrays. Most
importantly, they establish cause-and-effect relationships.
This enables cell biological studies in a highly parallel,
miniaturized format on a genome-wide scale.
3) Sample-based microarray strategies (tissue microarrays)
facilitate the analysis of individual DNA, RNA, or protein
targets in thousands of samples. For example, a large-scale
clinical study of thousands of patients can be carried out on
a single microscope slide in order to establish definitive
clinical correlations for molecular targets, or to assess
drug target distributions at the population level.
The DNA microarray platforms are based on distinct protocols
for manufacturing, hybridization and image analysis, with
proprietary data analysis steps, which makes comparison of data
between platforms difficult. This inevitably restricts the
effective use of publicly available large data sets. At present,
there are only a limited number of publications in which distinct
microarray technologies have been compared.
Olli Kallioniemi's group has compared the results from Affymetrix,
Agilent, and custom-made microarrays to determine the comparability
of these platforms. They compared four breast cancer
cell lines (BT-474, MCF-7, MDA-MB-436, and MDA-MB-361) and
the reference cell line HBL-100 (ATCC). Total RNA was isolated
with TRIzol, followed by Qiagen RNeasy column purification.
Affymetrix U95-Av2 arrays were applied for expression analysis
without technical replicates. In addition to MAS 5.0, the data
were analysed with the Robust Multichip Average (RMA) method,
using quantile normalization and a model fitted by median polish.
The Agilent cDNA arrays contained 13,156 clones from Incyte's
human cDNA library and were analysed with a dye-swap method. The
slides were analysed with Feature Extraction software (version
A.4.0.45). Variation within platform was calculated using the
GenBank accession number as an identifier.
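The quantile normalization step mentioned for the RMA analysis can be illustrated with a short sketch. This is a generic, simplified version of the technique written in plain Python, not the code used by the RMA implementation or by the group; the function name and the toy intensities are assumptions, and ties are resolved by position rather than averaged.

```python
def quantile_normalize(arrays):
    """Give every array the same intensity distribution.

    `arrays` is a list of equal-length lists, one list of probe
    intensities per array. The k-th smallest value of each array is
    replaced by the mean of the k-th smallest values across arrays.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must contain the same number of probes")

    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean across arrays at each rank.
    reference = [
        sum(a[k] for a in sorted_arrays) / len(arrays) for k in range(n)
    ]

    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized


# Toy intensities for three arrays with four probes each (hypothetical).
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(arrays)
```

After normalization every array contains exactly the same set of values, and only the assignment of values to probes differs between arrays.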
The custom-printed microarray contained 11,520 clones from the
Incyte Genomics IRAL cDNA library and 1136 clones from Research
Genetics. The cDNA clones were spotted on poly-L-lysine slides
with an OmniGrid arrayer. Samples were hybridized on three
replicate slides. The slides were scanned using a scanner from
Agilent Technologies, and the image analysis was performed
with DeArray software. The data analysis was performed with two
approaches. 1) Spots with a ratio quality value below 0.5 were
discarded (1 = good, 0 = poor quality), and within-slide
normalization was calculated by the ratio statistics method using
all spots in the array. 2) Quality filtering with Bayesian
networks was used to classify spots as good or bad, and Lowess
normalization was performed for print-tip groups.
The comparisons were made with the UniGene cluster ID as identifier:
2340 were found on each of the platforms, with 1147 common and possible
to evaluate.
The correlation coefficients for ratio values between the custom-made
array and either of the two commercial microarrays were 0.62-0.76,
and those between the two commercial ones were 0.78-0.86.
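The matching-and-correlation step described above can be sketched as follows. This is a minimal illustration, assuming that each platform's results are available as a mapping from UniGene cluster ID to log ratio; the function names, IDs and values are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def cross_platform_correlation(platform_a, platform_b):
    """Correlate log ratios over the cluster IDs present on both platforms."""
    shared = sorted(set(platform_a) & set(platform_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared identifiers")
    r = pearson([platform_a[k] for k in shared],
                [platform_b[k] for k in shared])
    return r, len(shared)


# Hypothetical log2 ratios keyed by made-up UniGene-style cluster IDs.
custom = {"Hs.1": 1.0, "Hs.2": -0.5, "Hs.3": 2.0, "Hs.4": 0.3}
commercial = {"Hs.1": 0.8, "Hs.2": -0.7, "Hs.3": 1.5, "Hs.5": 0.1}
r, n_shared = cross_platform_correlation(custom, commercial)
```

Only identifiers found on both platforms enter the calculation, which is why the number of evaluable genes is smaller than the number on either array.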
To estimate the variability between platforms, the percentage
of genes showing more than a twofold difference was determined.
Between the two commercial arrays, 5.0% of genes showed this type
of variation; for the custom arrays the figure was 9.0% when
compared to Affymetrix and 11.5% when compared to Agilent. However,
the biological difference between the cell lines was more prominent
than the variation between platforms, and the cell lines were
clustered correctly on all platforms.
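One possible reading of the "more than twofold" measure is the share of common genes whose log2 ratios differ by more than 1 between two platforms. The sketch below follows that reading; the exact definition used in the study is not given in this report, and the function name and values are hypothetical.

```python
def percent_twofold_discordant(log2_a, log2_b):
    """Percentage of shared genes whose ratios differ more than twofold.

    Both arguments map a gene identifier to a log2 ratio; a difference of
    more than 1 on the log2 scale corresponds to more than twofold.
    """
    shared = set(log2_a) & set(log2_b)
    if not shared:
        raise ValueError("no shared identifiers")
    discordant = sum(1 for g in shared if abs(log2_a[g] - log2_b[g]) > 1.0)
    return 100.0 * discordant / len(shared)


# Hypothetical log2 ratios for four genes on two platforms.
platform_a = {"g1": 1.0, "g2": 0.2, "g3": -1.5, "g4": 0.0}
platform_b = {"g1": 1.2, "g2": 1.5, "g3": -1.4, "g4": 0.1}
pct = percent_twofold_discordant(platform_a, platform_b)
```

In this toy example only g2 differs by more than twofold, so one gene in four is counted as discordant.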
The variations observed in this study point to the following
challenges of microarray (MA) analysis:
1) Results of one MA analysis cannot necessarily be reproduced
with another platform
2) Data in public databases are difficult to integrate
3) Clone errors
4) Other methods for validating studies
5) Varying results with distinct MA platforms?
Notably, about 16% of the clones in custom-made arrays are
associated with wrong annotation information. For self-made
arrays it would therefore be important to determine
the percentage of wrong clones and to report this in publications.
It is also important to note that short oligonucleotides
hybridize more specifically to their respective targets than
long probes or cDNA fragments. The estimated proportion of
alternatively spliced transcripts in the human genome is 30-50%,
and the different hybridization properties of short and long
cDNAs may lie behind the varying results. Also, the cDNA arrays may
give misleading information due to the fact that both sense
and antisense strands may react, whereas oligonucleotides
give results for the correct strand only.
When different analysis methods were compared, the results
emphasised the importance of depositing primary data, which
is necessary for re-evaluation of the data.
In summary, optimisation of data preprocessing, QC and normalization
for each platform is necessary, and should be considered when
choosing the platform for a long-term study, as well as when
comparing the data available in public databases. Expression
profiling was, however, found to be robust and redundant
for diagnostic classifications. Differences can arise from
clone errors, annotation mistakes, and technical
differences (oligos vs cDNA), which should be taken into
consideration when choosing the proper platform for a particular
study.
In summary, microarray strategies can be applied in a large
number of molecular, cell biological and clinical studies,
thereby expanding the traditional concept of "microarray
analysis". It is likely that the integration of the
various types of microarrays will be needed in systems biology
studies, translational cancer research and drug target discovery,
since no single analysis platform will provide a complete
answer to complex biological and clinical questions. This
poses new challenges to bioinformatics analyses in the future.
Before closing his talk, Olli showed a figure of the publications
on DNA microarrays: the number of DNA microarray articles
has been on an exponential track for the past years, and is
still increasing strongly.
A second, heartwarming picture showed the comparative quality
of cDNA slides manufactured commercially (e.g. MWG) and the
slides manufactured at the Turku Centre for Biotechnology:
OUR SLIDES WERE RANKED AS THE BEST QUALITY OF ALL SLIDES
TESTED BY KALLIONIEMI'S GROUP!
Stephen Rudd (Turku Centre for Biotechnology, University of Turku
and Åbo Akademi University, Turku, Finland)
Bioinformatics and microarray experiments: is this more
than just statistics?
Stephen has been involved in the development of a web-based tool
for comparative genomics at his previous post in the Institute
for Bioinformatics at the GSF Research Centre, Munich. The
definitions of openSputnik are cited below:
openSputnik forms a core environment for the address of specific
genomics questions. The core infrastructure for comparative
genomics has already been implemented within the area of "reconstructomics".
With the ability to perform large scale analysis, archival
and interpretation within a single framework and utilising
some of the most contemporary bioinformatics methods (everything
is XML defined and external transformation methods can be
defined) - openSputnik surely has some potential within a
genomics pipeline. openSputnik has a rather interesting pedigree.
Before Sputnik was born there was a GABI online resource called
miniPEDANT (it may still exist somewhere). The idea was a
multi-user, flexible, online resource for up-to-date and contemporary
bioinformatics methods without the need to know what is really
hot. openSputnik needs to present and display some rather
complicated data to a less computationally aware audience.
While I feel that XML is the best way to display everything
- I believe that most biologists will disagree. I have chosen
to use Zope as a web-application server system and have implemented
a Zope product, openZputnik (does anyone have a better name?),
that allows for the administration of the core openSputnik
server as well as the display of all contained data. Since
this is Zope it is relatively trivial to setup and maintain.
Why openSputnik? Sputnik is a pipeline and infrastructure
aimed at both plant genomics and comparative genomics that
was written and is maintained at the Institute for Bioinformatics
at the GSF Research Centre near Munich. The platform was originally
implemented to satisfy the needs of a consortium of German
sugarbeet researchers (GABIBEET), and was later adapted to
allow a more generic but high throughput analysis of plant-based
biological data. Sputnik was implemented as a collection of
loosely interacting Python scripts, a PostgreSQL database
and a simple Apache webserver. The last release of Sputnik
(version 4.0) can still be viewed at MIPS.
"I have the feeling that Sputnik is of more value to
the scientific community as a core computing infrastructure
than as just a collection of pre-digested results. With the
transition from Munich to Turku I decided to recreate the
essence of Sputnik in a different language while solving some
of the problems and rethinking the rationale underlying the
computational platform. As a result I have started the openSputnik
project and hope that this may make some form of impact in
other research groups."
Sputnik was first written to automate the analysis of EST
annotation for comparative genomics. openSputnik both maintains
the concept behind Sputnik and continues to develop as an
optimal solution for the processing of large EST collections.
The core concept behind successful EST annotation is to create
an object relational infrastructure where for example a unigene
cluster inherits the attributes of the underlying EST sequence
data. Such annotative attributes include, for example,
the mouse strain from which the ESTs were sequenced,
the developmental stage at which the clones were sequenced,
and so on. With the derivation and annotation of a peptide
sequence we can consider the other extreme. With a single
EST sequence that we have previously shown to stem from a
candidate gene we can associate annotation that stems from
peptide domains that are not associated with this EST, but
rather from ESTs that assemble either directly or indirectly
with this sequence.
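The two directions of annotation flow described above (a cluster inheriting the attributes of its ESTs, and a single EST gaining annotation from domains found via other cluster members) can be sketched with two small classes. This is an illustration only and does not reflect the actual openSputnik schema; all class, method and accession names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EST:
    accession: str
    attributes: dict = field(default_factory=dict)  # e.g. strain, stage


@dataclass
class UnigeneCluster:
    cluster_id: str
    members: list = field(default_factory=list)
    peptide_domains: set = field(default_factory=set)

    def inherited_attributes(self):
        """Collect every attribute value seen among the member ESTs."""
        merged = {}
        for est in self.members:
            for key, value in est.attributes.items():
                merged.setdefault(key, set()).add(value)
        return merged

    def annotation_for(self, accession):
        """Domains an EST gains through membership of this cluster."""
        if not any(est.accession == accession for est in self.members):
            raise KeyError(accession)
        return set(self.peptide_domains)


# Made-up accessions, attributes and domain name.
est1 = EST("EST0001", {"strain": "C57BL/6", "stage": "embryo"})
est2 = EST("EST0002", {"strain": "BALB/c", "stage": "embryo"})
cluster = UnigeneCluster("cluster_1", [est1, est2], {"kinase domain"})
attrs = cluster.inherited_attributes()
domains = cluster.annotation_for("EST0001")
```

Keeping the attributes on the ESTs and deriving the cluster view on demand means the cluster annotation stays consistent when ESTs are added or re-clustered.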
The focus of my research group is firmly embedded in comparative
genomics - a wide variety of methods have been implemented
in openSputnik that allow for the selection of lineage specific
transcripts, transcript families, lineage specific domains
or domain architectures. We have all plant EST collections
with more than 5,000 ESTs clustered, assembled and placed
within the openSputnik comparative framework. The next step
(the hard one) is to make some sense of this data and to present
it in a meaningful manner.
In addition to plant EST analysis we have on-going collaborations
with various research groups working on molecular markers
in pig, mouse, chickpea and barley. We are also looking at
EST collections from some of the more exotic genomes including
Hydra magnipapillata, Bombyx mori, Cycas rumphii and Ginkgo
biloba.
In the context of ESTs and molecular markers, Sputnik and openSputnik
have been mentioned in the following publications:
Brenner, E. D., Stevenson, D. W., McCombie, R. W., Katari,
M. S., Rudd, S. A., Mayer, K. F., Palenchar, P. M., Runko,
S. J., Twigg, R. W., Dai, G., et al. (2003). Expressed sequence
tag analysis in Cycas, the most primitive living seed plant.
Genome Biol 4, R78.
Kota, R., Rudd, S., Facius, A., Kolesov, G., Thiel, T., Zhang,
H., Stein, N., Mayer, K., and Graner, A. (2003). Snipping
polymorphisms from large EST collections in barley (Hordeum
vulgare L.). Mol Genet Genomics.
Rudd, S. (2003). Expressed sequence tags: alternative or complement
to whole genome sequences? Trends Plant Sci 8, 321-329.
Rudd, S., Mewes, H. W., and Mayer, K. F. (2003). Sputnik:
a database platform for comparative plant genomics. Nucleic
Acids Res 31, 128-132.
Mark Reimers (Genomics and Bioinformatics Group, Laboratory
of Molecular Pharmacology, Bethesda, MD)
Microarray data analysis
"The design of scientific experiments is an art of balancing
considerations: skill, cost, equipment, and accuracy."
For a comparative analysis, care should be taken at the planning
stage to keep hybridisation conditions constant. Conditions such
as RNA preservation medium, the protocols of hybridisation,
and even regional ozone levels, can introduce systematic biases
comparable in size to the biological differences you wish
to detect. Taking a great deal of care to standardize conditions
will pay off in much higher discovery rates. To do a series
of two-color hybridisations, you want to prepare enough common
reference to serve for all experiments. Chip failures are
common, and it is wise to prepare more labelled cDNA than
you expect to use.
How many microarrays is enough? If an exploratory study aims
to find large (more than two-fold) differences between two
conditions, then a design with three samples per condition
is usually adequate. If the aim is to find smaller differences,
or almost all of the large differences, then five samples
per group are necessary to obtain sufficiently reliable
estimates of variation among samples within conditions, in
order to distinguish true differences between conditions.
Six samples per condition allows meaningful permutation tests,
which can give more accurate, and less conservative, estimates
of p-values and false discovery rates. If there are more than
two conditions, and the treatments do not drastically alter
the cell physiology, then the number of samples within any
one condition can be somewhat less; with four or more conditions,
one can obtain reasonable estimates of within-condition variation
with only four samples per condition. All of these suggestions
assume that there are no outlying samples, which should be
discarded; it is wise to do one or two more per condition
in clinical situations, where outliers occur commonly, and
it is safer to do one more for animal experiments, where sometimes
one animal in a condition appears very different than all
the others.
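The remark about permutation tests can be made concrete with a small sketch of an exact two-group permutation test on the difference of means. This is a generic illustration, not the speaker's software; the function name and the toy expression values are assumptions.

```python
from itertools import combinations
from math import comb
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means.

    Every way of splitting the pooled samples into groups of the
    original sizes is enumerated, so this is only practical for the
    small group sizes typical of microarray studies.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    hits = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
        count += 1
    return hits / count


# Perfectly separated toy groups: the most extreme result possible.
p_six = permutation_p_value([1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                            [7.0, 8.0, 9.0, 10.0, 11.0, 12.0])
p_three = permutation_p_value([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
n_splits_six = comb(12, 6)
```

With three samples per condition there are only 20 possible splits, so even perfectly separated groups cannot give a two-sided p-value below 0.1. With six per condition there are 924 splits and the smallest attainable p-value drops to about 0.002, which is what makes the test meaningful.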
"To do meaningful clustering requires at least 20 samples,
and generally many more. The key issue for clustering genes
is how many different types of samples there are, because
the different conditions expose the correlations in gene regulation.
It is not useful to try to cluster genes from only two groups,
as is sometimes done, and rarely useful to cluster genes from
a study of fewer than five groups."
Pooling?
There is considerable disagreement about whether to pool individual
samples, among practitioners and also among statisticians.
Sometimes the amount of material from any one individual
is insufficient for hybridization and in that case, pooling
is a practical necessity. In theory, if the variation of a
gene among different individuals is approximately normally
distributed, then pooling n independent samples would result
in reduction of variance given by the formula:
$$\sigma^2_{\mathrm{pool}} = \frac{\sigma^2}{n}$$
where $\sigma^2$ is the variance of the expression estimates of any
one gene across samples. In principle we could then reduce
further the variation by making replicates of the pool, and
hybridising to replicate arrays. Since technical variation
is usually less than (roughly half of) individual variation,
this strategy would in theory give us more accurate estimates
of the group means for each gene. See also Pritchard et al.,
"Project Normal", PNAS (2002).
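The pooling argument can be checked numerically with a short sketch. It assumes an idealized model in which biological and technical variation are independent and additive, and in which a pool behaves like the average of its members; the function name and the numbers (technical variance set to half the individual variance) are illustrative assumptions only.

```python
def variance_of_group_mean(sigma2, tau2, n_individuals, n_arrays, pooled):
    """Variance of the estimated group mean for one gene.

    sigma2: variance between individuals (biological)
    tau2:   variance between replicate arrays (technical)
    pooled: if True, all individuals go into one pool hybridised to
            n_arrays replicate arrays; if False, each individual gets
            its own array and n_arrays is ignored.
    """
    if pooled:
        return sigma2 / n_individuals + tau2 / n_arrays
    return (sigma2 + tau2) / n_individuals


sigma2 = 1.0  # assumed individual variance
tau2 = 0.5    # assumed technical variance, half of sigma2

# Same number of arrays (three) in both designs.
pooled_12_on_3 = variance_of_group_mean(sigma2, tau2, 12, 3, pooled=True)
separate_3 = variance_of_group_mean(sigma2, tau2, 3, 3, pooled=False)
```

Under these assumptions, a pool of twelve individuals on three arrays gives a variance of 0.25, compared with 0.5 for three individuals hybridised separately. The price is that the pooled design gives no estimate of the variation between individuals.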
Dye swap experiments
A reference sample design:
i. extends easily to other experiments, if the common reference
is preserved;
ii. is robust to multiple chip failures; and
iii. reduces incidence of laboratory mistakes, because each
sample is handled the same way.
A reliable alternative is a common reference obtained by pooling
all samples. This enables samples to be compared with each
other indirectly. A pooled reference sample reduces the number
of extreme gene ratios (which have large errors) on each chip.
Some labs take this further and create a 'universal reference':
a pool of mRNA derived from several standard cell lines, which
they use most often in their experiments. Using a universal
reference enables them to compare results for all their experiments.
One complication in two-color arrays is that the two dyes
don't get taken up equally well, so that the amount of label
per amount of RNA differs (dye bias).
However, the dye-swap is the basis for most other efficient
designs: the general principles of a good two-color design
are that
i. it should be balanced: every sample appears equally often
in red and green;
ii. the samples whose ratios are most interesting should appear
on the same chips most often.
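The two design principles above can be checked mechanically for a planned set of hybridisations. The sketch below assumes a design is written as a list of (red sample, green sample) pairs, one per slide; this representation and the function names are hypothetical conveniences.

```python
from collections import Counter


def dye_balance(hybridizations):
    """Return {sample: (times in red, times in green)} for a design."""
    red, green = Counter(), Counter()
    for red_sample, green_sample in hybridizations:
        red[red_sample] += 1
        green[green_sample] += 1
    return {s: (red[s], green[s]) for s in set(red) | set(green)}


def is_balanced(hybridizations):
    """True if every sample appears equally often in red and green."""
    return all(r == g for r, g in dye_balance(hybridizations).values())


def pair_counts(hybridizations):
    """Count how often each pair of samples shares a slide."""
    return Counter(frozenset(pair) for pair in hybridizations)


# A dye-swap pair, and a reference design with a fixed green reference.
dye_swap = [("treated", "control"), ("control", "treated")]
reference = [("A", "ref"), ("B", "ref")]
swap_ok = is_balanced(dye_swap)
reference_ok = is_balanced(reference)
swap_pairs = pair_counts(dye_swap)
```

The dye-swap pair is balanced and puts the comparison of interest on both slides, whereas the simple reference design is unbalanced because the reference always carries the same dye.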
A good design for KO studies (e.g. KO-receptor with ligand)
is to hybridise several dye-swap pairs between the treatment
and control within each group, and perhaps to hybridise one
or two slides between WT treated and KO treated, and between
WT control and KO control. This design gives fairly accurate
estimates of both effects of treatment vs. control (in WT
and mutant), which enables accurate comparisons between the
effects; there is less accurate information about the direct
comparison between WT and mutant.
Microarrays often give many parallel measures for the same
target, and there is usually a good deal of cross-talk between
measures of different targets, as well as discrepancy between
measures of the same target. A classical statistical strategy
for this is a general linear model.
Biological measures give a large number of outliers, which
a robust linear model can accommodate.
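As an illustration of a robust fit, the sketch below applies Tukey's median polish to a small probes-by-arrays table, fitting an additive model (overall level + probe effect + array effect) with medians instead of means. Median polish is one common robust procedure and is named here as an example; it is not necessarily the robust linear model used by the speaker, and the table values are invented.

```python
from statistics import median


def median_polish(table, max_iter=20, tol=1e-9):
    """Fit value = overall + row effect + column effect using medians.

    Returns (overall, row_effects, col_effects, residuals). Outliers end
    up in the residuals instead of distorting the fitted effects.
    """
    n_rows, n_cols = len(table), len(table[0])
    resid = [[float(v) for v in row] for row in table]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(max_iter):
        change = 0.0
        # Row sweep: move each row's median into its row effect.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
            change += abs(m)
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Column sweep: move each column's median into its column effect.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
            change += abs(m)
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
        if change < tol:
            break
    return overall, row_eff, col_eff, resid


# Rows are probes, columns are arrays; the first value is an outlier
# (27 instead of the 7 that the additive pattern would give).
table = [[27, 9, 11],
         [8, 10, 12],
         [9, 11, 13]]
overall, probe_eff, array_eff, resid = median_polish(table)
```

For this table the fit recovers an overall level of 10, probe effects of -1, 0 and 1, and array effects of -2, 0 and 2, while the outlier is isolated as a residual of 20. A plain average over probes would instead have inflated the estimate for the first array.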
The application of a robust linear model to the Affymetrix
system seems to give five- to ten-fold increases in accuracy
for RNAs of signaling proteins (as measured by variance between
replicate arrays). The raw data show surprisingly
uneven hybridization within the Affymetrix arrays.
Hybridization is also influenced by where the sequences are
located on the array. Notably, in the new 2.0 Plus arrays
the design differs from the U133 design, which results
in different hybridization results due to the distinct local
sequence environment of each PM probe.
John N. Weinstein (Bioinformatics Group, Laboratory of Molecular
Pharmacology, Bethesda, MD)
Integrating data from microarrays at the DNA, RNA, and
protein levels
The new genomic and proteomic technologies are immensely powerful,
but they pose novel challenges to researchers. Partly
these challenges relate to what happens after the experiment
is finished. The first challenge is the statistical analysis
of the microarray data. The second challenge is the
biological interpretation of the results. The third challenge
is to integrate the microarray data with
other types of molecular and pharmacological information,
which could be called "Integromics". Our group
has developed a number of practical software tools for meeting
those challenges:
MedMiner: speeds up the organization of biomedical literature
on genes and drugs 5-10 fold. It searches and organizes
the biomedical literature on genes, gene-gene relationships,
and gene-drug relationships. It uses GeneCards, PubMed, syntactic
analysis, truncated-keyword filtering of relationals, and
user-controlled sculpting of Boolean queries to generate key
sentences from pertinent abstracts. Selected abstracts can
be automatically entered into EndNote (Biotechniques 1999;27:1210).
CIM Maker: produces flexible, color-coded Clustered Image Maps
(CIMs or "heat maps") to represent "high-dimensional" data
sets such as gene expression profiles. We introduc |