focus on protein interaction
An extensive meeting report and selected papers are being published
in the October issue of Comparative
and Functional Genomics.
The central theme of the workshop was the comparison of a variety
of methods that aim to describe the "complete"
protein interaction network that operates in a living cell.
This goal poses two problems: one is the collection of experimental
data by high throughput methods and the second is its organization
in user-friendly annotated databases.
The workshop addressed both problems as well as their integration.
The sixteen invited speakers and the participants included
experts in fields such as genetic methods of identifying protein
partners, protein and peptide arrays, mass spectrometry and
The main purpose of the workshop was to bring these very different
areas of expertise together, with the objective of defining strategies
and bottlenecks in the effort of collecting reliable protein
interaction information. Both the retrieval of information
from the large but poorly accessible repository of scientific
literature and the implementation of new experimental approaches,
which make use of recently developed high-throughput technologies,
received attention during the workshop.
The workshop was organised into four sessions.
A brief description of the topics discussed is presented here.
Legrain (Hybrigenics, France) discussed recent reports
on whole genome protein interaction maps obtained by the two-hybrid
method and compared the results of two efforts aimed
at characterising the complete interaction map of S. cerevisiae.
Surprisingly, he noted that there is little overlap between
the two proposed maps. The different strategies were compared
and the "superiority" of an approach that tests
interaction of protein domains instead of whole proteins was
proposed. Andreas Pluckthun (Biochemisches Institut,
Switzerland) illustrated the advantages of PCA which, by working
in E. coli, permits the testing of larger libraries. The feasibility
of a library versus library experiment was also reported.
Although still in its developmental phase, ribosome display
has the clear advantage of being an in vitro technique and
is therefore not affected by the limitations of transformation
experiments. Finally Brian Kay (University of Wisconsin,
USA) reported how selection of peptide ligands from repertoires
displayed on filamentous bacteriophage can be used to infer
the natural ligands of any protein domain.
Peptide and protein chips
The talks by Jens Schneider-Mergener (JERINI, Germany),
Dolores Cahill (MPI Molecular Genetics, Germany) and
Ian Humphery-Smith (Utrecht University, the Netherlands)
presented a clear overview of this rapidly developing field.
Pep Spot (high density peptides synthesised on solid supports)
is possibly the most established technique and a variety of
applications of "biological relevance" were reported.
The assembly of large protein or antibody arrays still poses
a variety of technical problems (collection of clones, expression
of the protein repertoire in a soluble form, attachment to
a solid support without disturbing protein structure, background
etc.) but there is confidence that these will soon be solved.
Roepstorff (University of Southern Denmark, Denmark) in
his lecture highlighted some recent developments and applications
of mass spectrometry (MS) in proteome analysis. This technique
is now accepted as the method of choice for the identification
of low abundance proteins and characterisation of post-translational
modifications. Developments in MS over the past few years
now allow the characterisation of subpicomole quantities of
gel-separated proteins. The molecular weight and sequence
information derived from MS experiments can be used to interrogate
large protein and nucleotide databases for the identification
of the rapidly growing number of known proteins or alternatively,
for the cloning of novel proteins.
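The database interrogation mentioned above can be illustrated with a minimal peptide mass mapping sketch. It assumes tryptic cleavage (after K or R, but not before P) and ranks database entries by how many observed masses fall within a tolerance of a theoretical peptide mass. The function names, the toy sequences and the tolerance are assumptions made for illustration only; real search engines use far more elaborate scoring.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # H2O added to the summed residues of a free peptide


def tryptic_peptides(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed_masses, sequence, tolerance=0.2):
    """Number of observed masses explained by a tryptic peptide of `sequence`."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(
        any(abs(m - t) <= tolerance for t in theoretical)
        for m in observed_masses
    )


def rank_candidates(observed_masses, database, tolerance=0.2):
    """Rank database entries (name -> sequence) by number of matched masses."""
    scores = [
        (name, count_matches(observed_masses, seq, tolerance))
        for name, seq in database.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Calling `rank_candidates` with a list of measured peptide masses and a dictionary of candidate sequences returns the candidates ordered by the number of masses they explain.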
Benedetta Mattei (University of Rome La Sapienza, Italy)
illustrated the combination of surface plasmon resonance (SPR)
biosensors and MS as a tool to couple sensitive affinity capture
and characterisation of binding events with the ability to
identify and characterise interacting molecules. The biosensor
can confirm and quantify specific binding events to a target
and it is possible to identify interacting proteins eluted
from the chip or even from tryptic digests performed directly
on sensor surfaces.
Carol Robinson (Oxford Centre for Molecular Sciences,
UK) gave several examples of how electrospray ionization mass
spectrometry (ESI-MS) can be used to study protein interactions
driven by non-covalent forces, including the study of the
70S ribosomal particle and the complex of the spliceosome.
The gentleness of the electrospray ionization process allows
intact protein complexes to be directly detected by mass spectrometry,
allowing questions about non-covalent assembly to be addressed
by the direct observation of gas phase complexes, their assembly
in real time and their disassembly by perturbation of solution
or instrument conditions. For the study of protein interactions,
ESI-MS can be a valid complement to other biophysical methods,
such as NMR and X-ray crystallography.
Finally, Giulio Superti-Furga (CellZome, Germany) reported
on an ongoing project which aims to identify all of the yeast
protein complexes by genomic tagging and purification of more
than 6000 gene products.
The session covered two major topics: the annotation and architecture
of biological databases (Mike Sternberg and Thure
Etzold) and the possibility of deriving information on
protein-protein interactions using computational methods (Alfonso Valencia,
Rita Casadio, Jong Park and Manuela Helmer Citterich).
Michael Sternberg (ICRF, London) opened the session
with a talk about "Structural and functional annotation
of genomes". Different methods, PSI-BLAST and 3D-PSSM,
were discussed. These can be used on translated ORFs in comparative
genomics, screening, drug design and pathway reconstruction.
Thure Etzold (EMBL-EBI, Cambridge, UK), who is involved
in the development of the SRS system, discussed the possibility
of integrating biological databases.
Rita Casadio (University of Bologna, Italy) described
a new neural network approach to identify protein residues
involved in protein-protein interaction. An overview of
the mapping of the different folds involved in the formation
of protein complexes was given by Jong Park (MRC Dunn,
Cambridge, UK). Manuela Helmer Citterich (University
of Rome Tor Vergata, Italy) described the SPOT method and
the MINT database, dedicated to protein interaction.
Alfonso Valencia (CNB-CSIC Madrid, Spain) closed the
session with a talk about the prediction of protein interaction
from sequence information using a new method based on the
comparison of phylogenetic trees. Some time was also devoted
to methods for the extraction of biological information from
How to compare experimental data on protein interactions?
Protein interaction maps are, with gene expression profiles,
among the first examples of datasets generated without specific
knowledge on functions of genes. These are technology-driven
experiments rather than hypothesis-driven experiments. They
are valuable tools for protein function prediction, despite
the occurrence of typical artefacts. These approaches are
still in their early stages.
The matrix approach uses the same collection of proteins as
both bait and prey. The library screening approach identifies
for each interacting prey protein the domain of interaction
with a given bait. The rate of false positive interactions
is difficult to evaluate and is largely dependent on the criteria
applied for the significance of the interactions, such as
the reproducibility of results. Moreover, the two exhaustive matrix
studies of the yeast proteome have failed to recapitulate
as many as 90% of interactions previously described in the
literature, suggesting a very high level of false negatives.
Evaluation of false positives and reproducibility requires
access to primary data. Thus, bioinformatics tools might also
contribute to identifying false positives.
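As a rough illustration of how two interaction maps might be compared once primary data are accessible, the sketch below treats each map as a set of unordered protein pairs and reports their overlap together with the fraction of a literature reference set that each map recovers. The function names and the choice of the Jaccard index are assumptions made for this example, not a method presented at the workshop.

```python
def normalise(pairs):
    """Treat A-B and B-A as the same undirected interaction."""
    return {frozenset(pair) for pair in pairs}


def overlap_report(map_a, map_b, reference):
    """Compare two interaction maps with each other and with a reference set."""
    a, b, ref = normalise(map_a), normalise(map_b), normalise(reference)
    shared = a & b
    union = a | b

    def recall(found):
        # Fraction of reference interactions recovered by `found`.
        return len(found & ref) / len(ref) if ref else 0.0

    return {
        "shared": len(shared),
        "jaccard": len(shared) / len(union) if union else 0.0,
        "recall_a": recall(a),
        "recall_b": recall(b),
        # Reference interactions missed by both maps (apparent false negatives).
        "missed_by_both": 1.0 - recall(union) if ref else 0.0,
    }
```

A low Jaccard index corresponds to the small overlap noted between the two yeast maps, while a high `missed_by_both` value corresponds to a high apparent false negative rate.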
Bioinformatics clustering of protein interactions represents
a powerful annotation tool which will become more and more
useful as the interaction data accumulate. However, one major
hurdle in these bioinformatics prediction algorithms is clearly
the lack of independently validated methods. In order to be
used successfully for appropriate functional annotation of
protein clusters, the data needs to be stored in elaborate
structures that allow each individual scientist to test his/her
own hypothesis against complex heterogeneous primary data
and then to design further experimental settings to validate
the functional assignment.
Mapping protein-protein interactions with combinatorial
Once the sequence of a genome is in hand, understanding the function
of its encoded proteins becomes a task of paramount importance.
Much like the biochemists who first outlined different biochemical
pathways, many genomic scientists are engaged in determining
which proteins interact with which proteins, thereby establishing
a protein interaction network. These interactions have evolved
specificity, affinity and cellular function over billions
of years; however, in the laboratory it is possible to isolate
peptides from combinatorial libraries that bind to the same
proteins with similar specificity and affinity and primary
structures that resemble those of the natural interacting
proteins. We have termed this phenomenon 'convergent evolution'.
Thus, a fruitful approach for mapping protein-protein interactions
is to isolate peptide ligands via phage-display to a target
protein and identify candidate interacting proteins in a sequenced
genome by computer analysis. We have applied this method to
dissecting molecular interactions in the protein machinery
involved in receptor-mediated endocytosis.
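The computer analysis step can be sketched as a motif scan: a consensus derived from phage-selected peptides is converted into a regular expression and searched against every protein sequence of a genome. The consensus syntax (lowercase `x` for any residue), the function names and the toy sequences are assumptions for illustration; a real analysis would typically use position-specific scoring rather than an exact pattern.

```python
import re


def consensus_to_regex(consensus):
    """Turn a consensus such as 'PxxP' into a regex; 'x' matches any residue.

    The lookahead wrapper makes overlapping matches visible to finditer.
    """
    return re.compile("(?=(" + consensus.replace("x", ".") + "))")


def scan_proteome(consensus, proteome):
    """Return (protein name, 1-based position, matched segment) for each hit."""
    pattern = consensus_to_regex(consensus)
    hits = []
    for name, sequence in proteome.items():
        for match in pattern.finditer(sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits
```

Each hit is only a candidate interacting protein and would still need experimental confirmation.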
Mapping Protein-Protein-Interactions with Synthetic Peptide
and Protein Domain Arrays
Synthetic peptide and protein domain arrays prepared using the SPOT
technology are increasingly applied to map the interactions
between antibodies/antigens, receptors/ligands and proteins/nucleic
acids (1,2). The SPOT technology involves different aspects,
such as array preparation (4), types of molecules selected for
the binding studies (3,5) and the different types of binding
assays performed on the arrays (1).
(1) Frank, R. and Schneider-Mergener, J. (2001). SPOT-synthesis:
scope and applications. Introduction to: Peptide arrays on
membrane supports - synthesis and applications. Springer lab
manual (Koch/Mahler eds.), in press.
(2) Reineke, U., Volkmer-Engert, R. and Schneider-Mergener, J.
(2001). Applications of peptide arrays prepared by the Spot
technology. Curr. Opin. Biotech. 12, 59-64.
(3) Töpert, F., Pires, R., Landgraf, C., Oschkinat, H. and
Schneider-Mergener, J. (2001). Synthesis of an array comprising
837 variants of the hYAP WW protein domain. Angew. Chemie
Int. Ed. 40, 897-900.
(4) Wenschuh, H., Volkmer-Engert, R., Schmidt, M., Schulz, M.,
Schneider-Mergener, J. and Reineke, U. (2000). Coherent membrane
supports for parallel micro-synthesis of bioactive peptides.
Biopolymers 55, 188-206.
(5) Reineke, U., Kramer, A. and Schneider-Mergener, J. (1999).
Knowledge- and library-based mapping of discontinuous protein-protein-interactions
by spot synthesis. Curr. Top. Microbiol. Immunol. 243, 23-36.
Generation and Applications of High Density Protein Arrays
A full understanding of the expression profile of a tissue or
organism requires the screening of many genetic and/or protein
samples in parallel as rapidly as possible. In our laboratory,
we have automated and miniaturised various steps to enable
a high-throughput and highly parallel approach to large-scale
cDNA and protein analysis; specifically, the generation and
picking of cDNA expression libraries, and arraying of clones
into microtitre-plates. A technique known as oligonucleotide
fingerprinting has been developed to characterise cDNA libraries,
which allows the generation of non-redundant UNIgene sets.
To apply this technology to generate protein arrays, we have
clonally expressed proteins from arrayed cDNA expression libraries,
producing high density protein arrays on filters and chips.
These protein arrays have been screened with antibodies, which
detected specific protein products. This approach makes translated
gene products directly amenable to high-throughput experimentation,
allowing a link between expressed protein and sequence. We
have obtained initial results in characterising antibody specificity
and profiling autoimmune sera on protein arrays, as well as
developing antibody arrays.
Modification-Specific Proteomics: the next level in proteome
Advances in DNA sequencing and the rapidly increasing amount
of genome sequence data becoming available have changed the
scope of protein analysis. Databases now contain the sequences
of more than 450,000 proteins and this number is rapidly increasing.
Consequently, the amino acid sequence of a protein of interest
is likely to be available in a database. Sequencing of complete
genomes raises a number of questions:
- "Which of the genes are expressed in the organism?"
or, if the organism is a higher eukaryote, "which genes
are expressed in which cell types?" The complete 2D-PAGE
map of the proteins expressed in a given cell type has recently
been termed the "proteome" (1). The proteins in relevant
gel spots are identified using mass spectrometric peptide
mass mapping or sequencing after in-gel proteolytic digestion
of the proteins.
- "Does a given protein contain post-translational modifications
and, if so, which and where?" Numerous such modifications
have been reported in proteins, the most common being truncation,
glycosylation, phosphorylation and acylation. Genome and cDNA
sequencing, however, do not give information about the presence
of these modifications: they must be studied at the protein
level and mass spectrometry is the key analytical tool for
such studies. The information generated in the protein identification
step can sometimes be used directly to assign the type of
post-translational modification. Additional mass spectrometric
experiments can be performed to fully characterize a protein;
however, specific detection of selected modifications would
- The final question concerns the function of the protein
- with what does the protein interact and how? This is a much
more complex problem which requires a number of biochemical
and chemical analytical procedures to solve it, among which
mass spectrometry plays an important role.
We have attempted to develop fast, sensitive and highly specific
tools based on combining PAGE, mass spectrometry and sequence
information in databases (2). Recently, we have initiated a new
approach which we have termed 'Modification Specific Proteomics' (3).
It is based on specific detection of post-translational modifications
in the 2D-gel, by specific 'pull-out' of modified proteins/peptides,
or by selective detection of the specific type of modification
in the mass spectrometer. Examples from our recent research
include mass spectrometric identification of the proteins
in the 2D-gel, determination of secondary modifications in
the identified proteins and our initial attempts to perform
modification specific proteomics (4-6). We plan to extend our
mass spectrometric analysis to studies of protein interaction.
(1) Wilkins, M.R. et al. (1996) Bio/Technology, 14, 61-65.
(2) Jensen O.N., Larsen M.L., and Roepstorff, P. (1998) PROTEINS:
Structure, Function, and Genetics Suppl. 2, 74-89.
(3) Jensen ON, (2000) Proteomics: A trends guide, 36-42.
(4) Larsen M.R. and Roepstorff P. (2000) Fresenius' J Anal
Chem. 366, 677-690.
(5) Larsen MR, Soerensen GL, Fey SJ, Larsen PM, Roepstorff
P. (2001) Proteomics 1, 223-238.
(6) Larsen MR, Larsen PM, Fey SJ, Roepstorff P. (2001) Electrophoresis
Combination of Surface Plasmon Resonance (SPR) biosensors
and mass spectrometry
Surface plasmon resonance (SPR) biosensors are important tools for
ligand screening due to their versatility. They allow interactions to be
measured in real time, require very little material and
usually little or no chemical modification of the interactants.
Once a sample is identified as containing a possible ligand,
the problem of identifying the new ligand arises. The identification
of proteins at the femtomole level is made possible by the
use of sensitive mass spectrometers and advanced database
searching algorithms. Applications of biosensor technology
coupled with mass spectrometry have been developed, allowing
the characterization of proteins eluted from sensor surfaces and the
identification of proteins from tryptic digestions performed directly
on the sensor chip.
A combination of SPR and MALDI-TOF mass spectrometry was used
to affinity purify peptides of the enzyme polygalacturonase
(PG) that are recognized by its inhibitor PGIP, from a peptide
mixture obtained by limited proteolysis of the native enzyme.
One peptide was identified: it comprises residues 181-244
and includes the three catalytic residues D191, D212 and D213.
Site-directed mutagenesis and SPR data have shown that these
residues are not involved in the interaction with PGIP, while
the mutation of residues in the same peptide that are located
at the entrance of the active site cleft causes a significant
decrease in the affinity for PGIP. From these data we can
hypothesize a mechanism of inhibition based on a network of
contacts that are close to the active site but not buried
inside the cleft, thus providing steric hindrance to substrate
entry. The influence of post-translational modifications on
the binding has also been studied by performing enzymatic
deglycosylation of PG and characterizing by mass spectrometry
the protein eluted from the sensor surface.
Dynamic protein complexes: insights from mass spectrometry
Carol V. Robinson
The recent sequencing of the human genome has revealed that the
human cell contains only twice as many genes as, for
example, cells of the worm or fly (1). This implies that the
regulation of gene products and their interactions accounts
for the increased biological complexity of higher organisms.
Consequently, in order to exploit the wealth of information
provided by genome sequencing it is essential to be able to
study both stable and transient macromolecular complexes.
While the yeast two-hybrid system and cross linking combined
with mass spectrometry (MS) are exceptionally powerful approaches
to defining stable complexes within the proteome, MS has the
additional potential to describe transient, dynamic complexes
through two major developments. These are (i) the ability
to probe molecular dynamics through the coupling of MS with
hydrogen/deuterium exchange technologies, and (ii) the control
of the conditions within the mass spectrometer such that non-covalent
interactions between proteins and cofactors can be examined.
Over the last decade, hydrogen/deuterium exchange in conjunction
with MS has developed to an extent where it can probe the
exchange behaviour of regions of secondary structure in macromolecular
complexes (2) and even individual residues in smaller proteins (3).
In parallel with these developments, major advances have been
made in the ability to study non-covalent complexes in the
gas phase. Specifically, the coupling of time-of-flight methods
with electrospray and the refinement of this process to a
nanoflow technique have enabled the study of simple dimeric
complexes (4), homo- (5) and hetero-oligomeric complexes (6) and even
whole particles (7,8). Consequently, the ability to probe both
non-covalent interactions and hydrogen/deuterium exchange
enables definition of not only the stoichiometry of interacting
subunits but also their conformational dynamics. The fact
that complexes can be observed in the mass spectrometer enables
their stability and folding to be probed in the presence of
a wide range of ligands and cofactors as well as in response
to thermal and chemical denaturation.
Central to the success of protein folding in vivo is the prevention
of aggregation, a role ascribed to molecular chaperones. The
most widely studied of all molecular chaperones is the Escherichia
coli chaperonin GroEL and its co-chaperonin GroES. GroES forms
a single heptameric ring of seven identical subunits (70 kDa).
The 14 GroEL subunits (57 kDa) are joined through non-covalent
forces to form a double toroidal structure with a molecular
mass of 800 kDa. Using a quadrupole time-of-flight (Q-ToF)
mass spectrometer and a carefully controlled balance of pressures,
conditions were found whereby the GroEL and GroES chaperone
assemblies remained intact. For the GroES heptamer a population
of monomeric subunits was always observed, consistent with
the micromolar Kd measured for this oligomeric complex. The
GroEL 14-mer was found to be remarkably stable, but acceleration
and collision-induced dissociation of this complex within
the collision cell of a Q-ToF mass spectrometer revealed the
topology of the interacting subunits (5).
This protocol of maintaining many low energy collisions to
absorb the excess translational energy of the ions or alternatively
inducing their dissociation by high energy collisions has
recently been applied to probe the subunit arrangement in
a newly described molecular chaperone, MtGimC. MtGimC is an
archaeal homologue related to the eukaryotic chaperonin cofactor
GimC/prefoldin, involved in the folding of actin and tubulin (9).
The complex was characterized by first defining the molecular
weight of the intact complex (10). This had been analysed previously
by size exclusion chromatography but the precision afforded
by the MS method enabled an unequivocal determination of the
stoichiometry. This corresponded to a well-defined hexamer
of two α and four β subunits (10). Dissociation of the complex
within the gas phase was used to probe the quaternary arrangement
and two central subunits, both α, and four peripheral β subunits,
consistent with these measurements, were proposed. In an extension
to this study, a thermally controlled nanoflow device was
constructed to monitor the thermal stability of this heat
shock complex. The results demonstrated that a significant
proportion of the MtGimC hexamer remains intact under low-salt
conditions even at 70°C. In addition, it was possible
to monitor in real-time the assembly of the MtGimC hexamer
from its component subunits. A mixture of the two subunits
in a 1:2 ratio of α:β subunits was placed in the nanoflow
capillary and, after the dead time of the experiment, spectra
were recorded continuously. The mass spectra showed the absence
of any intermediates, demonstrating that the assembly process
is highly cooperative, leading exclusively to the hexamer.
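The way an accurate intact mass constrains stoichiometry can be sketched as a small enumeration: every combination of subunit copy numbers is summed and kept if it falls within the mass tolerance of the measurement. The subunit masses used in the usage example below are illustrative placeholders, not the measured MtGimC values, and the function names are assumptions. A helper for the m/z of a protonated ion of a given charge is included for orientation.

```python
from itertools import product

PROTON = 1.00728  # mass of a proton in Da


def candidate_stoichiometries(measured_mass, subunit_masses, tolerance, max_copies=8):
    """List copy-number combinations whose summed mass matches the measurement.

    subunit_masses maps a subunit name to its mass in Da.
    """
    names = list(subunit_masses)
    matches = []
    for counts in product(range(max_copies + 1), repeat=len(names)):
        if not any(counts):
            continue  # skip the empty complex
        total = sum(n * subunit_masses[name] for n, name in zip(counts, names))
        if abs(total - measured_mass) <= tolerance:
            matches.append(dict(zip(names, counts)))
    return matches


def mz(mass, charge):
    """m/z of an ion carrying `charge` protons."""
    return (mass + charge * PROTON) / charge
```

With hypothetical subunit masses of 15,700 Da (alpha) and 13,900 Da (beta), a measured mass of 87,000 Da and a tolerance of 50 Da, `candidate_stoichiometries(87000, {"alpha": 15700, "beta": 13900}, 50)` leaves a single solution of two alpha and four beta subunits. A coarser mass estimate, as obtained by size exclusion chromatography, corresponds to a larger tolerance and may admit several solutions.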
Despite the relative size and complexity of ribosomes, which
in E. coli comprise three large RNA molecules and 55 different
proteins, these macromolecules have also been shown to remain
intact in the gas phase (8). Spectra recorded of the 70S particle
in the presence of Mg2+ showed that ions from the intact ribosome
have m/z values in excess of 20,000 (8). Through controlled dissociation
of this particle in the gas phase it was possible to remove
subsets of proteins both individually and as complexes of
up to six proteins. Further dissociation into smaller macromolecular
complexes and then individual proteins could be induced by
subjecting the particles to increasingly energetic gas-phase
collisions. The ease with which proteins dissociated from
the intact species was found to be related to their known
interactions in the ribosome particle. The fact that the 2.3
MDa particle can traverse a mass spectrometer remaining intact
until mass measurement enables the sensitivity of the ribosome
to a number of external conditions to be examined. For example,
lowering the Mg2+ concentration in solution led to dissociation
into its component 30S and 50S subunits. The dynamic properties
of individual proteins (L10 and L11) within the whole particle
have also been addressed. The results suggest that these two
proteins are tightly packed within the ribosome structure (2).
The foundations are now in place to gain insight into the
structure of the ribosome during various stages of its dynamic
function as well as in response to the many therapeutic agents
that are known to target the ribosome.
1) International Human Genome Sequencing Consortium. Nature 2001,
409, 860-921.
2) Benjamin, D. R.; Robinson, C. V.; Hendrick, J. P.; Hartl,
F. U.; Dobson, C. M. Proc. Natl. Acad. Sci. USA 1998, 93,
3) Tito, P.; Nettleton, E. J.; Robinson, C. V. J. Mol. Biol.
2000, 303, 267-278.
4) Vis, H.; Heinemann, U.; Dobson, C. M.; Robinson, C. V. J.
Am. Chem. Soc. 1998, 120, 6427-6428.
5) Rostom, A. A.; Robinson, C. V. J. Am. Chem. Soc. 1999, 121,
6) Rostom, A. A.; Sunde, M.; Richardson, S. J.; Schreiber,
G.; Jarvis, S.; Bateman, R.; Dobson, C. M.; Robinson, C. V.
Proteins Struct. Func. and Genetics 1998, Suppl. 2, 3-11.
7) Tito, M. A.; Tars, K.; Valegard, K.; Hadju, J.; Robinson,
C. V. J. Am. Chem. Soc. 2000, 122, 350-351.
8) Rostom, A. A.; Fucini, P.; Benjamin, D. R.; Juenemann, R.;
Nierhaus, K. H.; Hartl, F. U.; Dobson, C. M.; Robinson, C.
V. Proc. Natl. Acad. Sci. USA 2000, 97, 5185-5190.
9) Leroux, M. R.; Fändrich, M.; Klunker, D.; Siegers,
K.; Lupas, A. N.; Brown, J. R.; Schiebel, E.; Dobson, C. M.;
Hartl, F. U. EMBO J. 1999, 18, 6730-6743.
10) Fändrich, M.; Tito, M. A.; Leroux, M. R.; Rostom,
A. A.; Hartl, F. U.; Dobson, C. M.; Robinson, C. V. Proc.
Natl. Acad. Sci. USA 2000, 97, 14151-14155.
iSPOT and MINT: a method and a database dedicated to molecular
We are interested in the basic principles of molecular recognition.
We have developed the SPOT method for the inference of protein
domain specificity and a new database of Molecular INTeractions (MINT).
iSPOT (iSpecificity Prediction Of Target) is a web tool developed
to infer the protein-protein interactions between families
of peptide recognition modules. The SPOT procedure (Brannetti
et al., 2000) utilizes information extracted, for each protein
domain family, from position-specific contacts derived from
all the available domain/peptide complexes of known structure.
The framework of domain/peptide contacts defined on the structure
of the complexes is used to build a residue/residue interaction
database derived from ligands obtained by panning peptide
libraries displayed on filamentous phage.
The method is being optimised with a genetic algorithm and
will soon be available on the web. It has been applied to
SH3 and PDZ domains and to MHC class I molecules. iSPOT will
offer the possibility to answer the following questions: which
protein (or peptide) is a possible ligand for a given SH3
(or PDZ or MHC class I molecule)? Which is the best possible
SH3 (or PDZ or MHC class I) interacting domain for a given
protein/peptide sequence? What residues should one mutate
in a domain to lower/increase its affinity for a given peptide?
MINT is a relational database built to collect and integrate
protein interaction data in a single database accessible via
a user-friendly web interface. MINT now contains experimentally
determined protein-protein interaction data. In the near future,
MINT will be enriched with protein-DNA and protein-RNA interaction
data. It will also allow the collection of peptide lists selected
from a molecular repertoire like those resulting from phage
display experiments. We plan to add information about interactions
inferred by computational predictive methods.
Curators manually submit the interactions. MINT is an SQL
database and the web server is written in an HTML-embedded
language named PHP (hypertext preprocessor, derived from PERL).
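To make the idea of a relational interaction store concrete, the following sketch builds a two-table schema in an in-memory SQLite database and retrieves the partners of a protein. The table and column names, the accession strings and the use of SQLite are assumptions chosen for illustration; they are not the actual MINT schema.

```python
import sqlite3

# Hypothetical schema: one table of proteins, one of pairwise interactions.
SCHEMA = """
CREATE TABLE protein (
    id        INTEGER PRIMARY KEY,
    accession TEXT UNIQUE NOT NULL
);
CREATE TABLE interaction (
    id        INTEGER PRIMARY KEY,
    protein_a INTEGER NOT NULL REFERENCES protein(id),
    protein_b INTEGER NOT NULL REFERENCES protein(id),
    method    TEXT NOT NULL,   -- experimental method, e.g. two-hybrid
    reference TEXT             -- literature source entered by the curator
);
"""


def create_database():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_interaction(conn, acc_a, acc_b, method, reference=None):
    """Insert both proteins if new, then record the interaction."""
    ids = []
    for acc in (acc_a, acc_b):
        conn.execute("INSERT OR IGNORE INTO protein (accession) VALUES (?)", (acc,))
        row = conn.execute(
            "SELECT id FROM protein WHERE accession = ?", (acc,)
        ).fetchone()
        ids.append(row[0])
    conn.execute(
        "INSERT INTO interaction (protein_a, protein_b, method, reference) "
        "VALUES (?, ?, ?, ?)",
        (ids[0], ids[1], method, reference),
    )


def partners(conn, accession):
    """Partners of a protein, whichever side of the pair it was stored on."""
    query = """
        SELECT p2.accession, i.method FROM interaction i
        JOIN protein p1 ON p1.id = i.protein_a
        JOIN protein p2 ON p2.id = i.protein_b
        WHERE p1.accession = ?
        UNION
        SELECT p1.accession, i.method FROM interaction i
        JOIN protein p1 ON p1.id = i.protein_a
        JOIN protein p2 ON p2.id = i.protein_b
        WHERE p2.accession = ?
        ORDER BY 1
    """
    return conn.execute(query, (accession, accession)).fetchall()
```

Storing the experimental method alongside each pair is what would later allow interactions to be filtered by the type of evidence supporting them.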
Mapping protein-protein interaction structurally and globally
In the postgenomic era, one of the most interesting and important
challenges is to understand protein interactions on a large
scale. The physical interactions between protein domains are
fundamental to the workings of a cell: in multidomain polypeptide
chains, in multisubunit proteins and in transient complexes
between proteins that also exist independently. To study the
large-scale patterns and evolution of interactions between
protein domains, we view interactions between protein domains
in terms of the interactions between structural families of
evolutionarily related domains. This allows us to classify
8151 interactions between individual domains in the Protein
Data Bank and the yeast Saccharomyces cerevisiae in terms
of 664 types of interactions between protein families. At
least 51 interactions do not occur in the Protein Data Bank
and can be derived only from the yeast data. The map of interactions
between protein families has the form of a scale-free network,
meaning that most protein families interact with only one
or two other families, while a few families are extremely
versatile in their interactions and are connected to many
families. We observe that almost half of all known families
engage in interactions with domains from their own family.
We also see that the repertoires of interactions of domains
within and between polypeptide chains overlap mostly for two
specific types of protein families: enzymes and same-family
interactions. This suggests that different types of protein
interaction repertoires exist for structural, functional and
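The shape of such a family interaction map can be inspected with a short sketch that counts, for each family, the number of distinct families it interacts with, and then tabulates how many families have each degree. In a scale-free map most families have degree one or two and a few have very high degree. The function name and the toy input are assumptions; self-interactions are counted as one partner here.

```python
from collections import Counter


def degree_distribution(interactions):
    """Return (degree per family, number of families per degree).

    interactions is an iterable of (family_a, family_b) pairs.
    """
    neighbours = {}
    for fam_a, fam_b in interactions:
        neighbours.setdefault(fam_a, set()).add(fam_b)
        neighbours.setdefault(fam_b, set()).add(fam_a)
    degrees = {fam: len(partners) for fam, partners in neighbours.items()}
    return degrees, Counter(degrees.values())
```

Plotting the second return value on log-log axes is one common way to check whether the distribution is approximately a power law.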
Prediction of protein interactions from sequence information
A considerable number of computational methods have recently been
developed for the prediction of protein interaction partners
based on different aspects of genomic information. Two new
methods also address the problem of predicting interaction
partners. These are based on the study of corresponding multiple
sequence alignments, without the explicit requirement of full
The first method (1) is based on the study of correlated mutations (2)
between possible interaction partners. We have previously
demonstrated that correlated mutations can be used for the
detection of regions of interaction (3). Predictions generated
with this type of method have been successfully tested in
two different experimental systems (4,5).
The second method is based on the study of the relation between
the phylogenetic trees of the corresponding protein families (6).
The complementarity of the phylogenetic trees is probably
a consequence of the co-evolution of the proteins. Some of
the best known cases are the interactions between hormones
and their receptors. The proposed method uses the degree of
correlation between protein family trees as an indicator of
the relation between protein pairs.
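One simple way to quantify the correlation between two family trees, sketched here under stated assumptions, is to compare the pairwise distance matrices of the two families: the matrices must cover the same species in the same order, and the similarity is taken as the Pearson correlation of their upper triangles. The function names are illustrative and the published method may differ in detail.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def tree_similarity(dist_a, dist_b):
    """Correlation between two distance matrices over the same ordered species."""
    if len(dist_a) != len(dist_b):
        raise ValueError("matrices must cover the same set of species")
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))
```

A value close to 1 indicates that the two families have diverged in parallel across species, which is the signal taken to suggest co-evolution and hence a possible interaction.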
1.- Pazos & Valencia (2001) Submitted
2.- Gobel et al., (1994) Proteins.
3.- Pazos et al., (1997) J. Mol. Biol.
4.- Gdssler et al (1998) Proc. Natl. Acad. Sci. USA
5.- Azuma et al (1999) J. of Mol. Biol
6.- Pazos & Valencia (2001) Prot. Eng.
Prediction of protein-protein interaction sites in heterocomplexes
with neural networks
We study the problem of extracting from the three-dimensional
structures of protein complexes features relevant for predicting
protein-protein interaction sites. Our approach is based on
information about properties of evolutionary conservation
and surface disposition. The predictor we implement is a neural
network system, which uses a cross-validation procedure and
allows the correct detection of 73% of the residues involved
in protein interactions. Our analysis confirms that the physical
properties of interacting surfaces are difficult to distinguish
from those of the whole protein surface. However, neural networks
trained with a reduced representation of the interacting patch
and sequence profile are sufficient to generalise over the
different features of the contact patches and to predict whether
a residue in the protein surface is or is not in contact.
The predictor can significantly complement results from functional
genomics and proteomics.
List of participants
of Molecular Genetics, Berlin, Germany
PROT@GEN AG, Bochum, Germany
of Rome, Tor Vergata, Italy
Bioinformatics Institute, Cambridge, United Kingdom
of Wisconsin, USA
of Rome, La Sapienza, Italy
Cambridge, United Kingdom
Institut, Universitaet Zurich, Switzerland
Centre for Molecular Sciences, Oxford, UK
of Biochemistry and Molecular Biology, University of Southern
University, the Netherlands
College, London, UK
GmbH, Heidelberg, Germany
Institute, Cambridge, UK
of Turin, Italy
Human Nutrition Unit, UK
di Roma "Tor Vergata", Italy
of Trieste, Italy
degli Studi di Roma "Tor Vergata", Italy
Acidi Nucleici,CNR, Italy
Del Sal Giannino
di Fisiologia, University of Pisa, Italy
di Roma "La Sapienza" , Italy
College Cork, Ireland
Negri Institute, Italy
degli Studi "La Sapienza", Italy
degli Studi Tor Vergata , Italy
Research International, the Netherlands
Biocomputing Unit, Germany
Wiley & Sons, London, UK
of Rome "Tor Vergata", Italy
Monteiro Marques da Silva
of Aveiro, Portugal
Organica e Biochimica-University of Napoly Federico II,
degli Studi di Tor Vergata, Italy
- Interuniversitary Computing Centre, Italy
degli Studi di Tor Vergata, Italy
degli Studi di Udine, Italy
Center for Biotechnology, Spain
Acidi Nucleici CNR, Italy
, Münster, Germany
degli Studi di Tor Vergata, Italy
of Helsinki, Finland
Bioscience AG, Germany
di Roma "La Sapienza" , Italy
of Trieste, Italy
Nazionale Tumori, Italy
CNR Roma, Italy
Hematologikum, Munich, Germany
of Rome "Tor Vergata" , Italy
degli Studi Tor Vergata, Italy