Proteomics:
Protein identification, characterisation, expression and interactions
Just as a genome describes the genetic content of an organism, a proteome
defines the protein complement of the genome. Proteomics includes
the identification of proteins in biological tissues, the
characterisation of their physicochemical properties (complete
sequence, post-translational modifications), and the description
of their behaviour (function, expression level). After processing
and modifications, a single gene may express between one and
a few dozen different protein products; by extrapolation,
the ~50,000 human genes could produce over 500,000 different
proteins. A combination of technologies is required to characterise
a proteome fully. A standard procedure is two-dimensional
gel electrophoresis (2-DE) as the separation method, followed
by mass spectrometry (MS) analysis of the separated and enzymatically
digested proteins. The peptide mass fingerprints typically
obtained by MALDI-TOF MS are matched against sequence databases
using dedicated bioinformatics tools. The whole procedure
can be automated and robotised for high-throughput purposes.
One aim of the programme will be to promote both the development
of new methods and application-driven projects that use the
new technologies as tools. For example, it is important to
develop further the high-throughput techniques that separate
efficiently and identify quickly the majority of proteins.
More specific technologies will have to be used to identify
proteins that yielded no positive hits with the main approach
and to characterise individual post-translational modifications
which, while typically not deducible from the gene sequence
alone, carry important functional implications. The huge amount
of data generated by these experiments has to be stored in
dedicated databases, which can then be searched for pattern
matching, characterisation or functional relevance studies.
It is important that common, well-designed database formats
and analysis tools allow easy access to the data and
straightforward data interpretation.
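The matching of peptide mass fingerprints against a sequence database can be illustrated with a minimal Python sketch. It is not the algorithm of any particular search tool: it assumes tryptic digestion without missed cleavages, singly protonated peptides, no modifications and a fixed mass tolerance, and the function names, the toy sequences and the tolerance value are all hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # singly charged ion, as typically seen in MALDI-TOF


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and following != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Theoretical monoisotopic mass of the singly protonated peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def score(sequence, peaks, tol=0.2):
    """Count observed peaks lying within tol Da of any theoretical peptide mass."""
    theoretical = [mh_plus(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(peak - t) <= tol for t in theoretical) for peak in peaks)


def best_match(database, peaks, tol=0.2):
    """Return the name of the database entry explaining the most peaks."""
    return max(database, key=lambda name: score(database[name], peaks, tol))


# Hypothetical toy database and a peak list simulated from one of its entries.
toy_db = {"prot_a": "AGKRPLKGG", "prot_b": "MSTKWYRVVH"}
toy_peaks = [mh_plus(p) for p in tryptic_peptides(toy_db["prot_a"])]
print(best_match(toy_db, toy_peaks))
```

Real search tools additionally handle missed cleavages, modifications, calibration errors and statistical scoring; the simple peak count used here only conveys the principle.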
In the cell, proteins do not act in isolation, but usually form
transitory or stable complexes in order to participate in
pathways and act in networks. Protein-protein interactions
thus constitute an essential aspect of the normal workings
of the living cell and unravelling the various interactions
in which individual proteins are involved constitutes an invaluable
way of understanding protein function. Recently, Fromont-Racine
et al. (Institut Pasteur, Paris) developed a high-throughput,
genome-wide version of the yeast two-hybrid system to create
Protein Interaction Maps (PIMs) for whole cells. The automated
generic version of Fromont-Racine's procedure is rapidly becoming
the method of choice for mapping whole proteomes. With a yeast
cell mating procedure that increases screening efficiency,
Fromont-Racine et al. used their complex yeast genomic library
of 5 million clones to test 700 million interactions
against 15 proteins. They identified and classified 170 potential
interactors, including approximately 70 proteins of previously
unknown function. More than 25% of the interactors are probably
biologically relevant. The achievements of this group have
opened the way to the systematic analysis of the protein interaction
networks of the 6,000 open reading frames of the yeast proteome.
Another European team (Hybrigenics, Paris) has adapted the
Fromont-Racine procedure to analyse a bacterial genome and
has linked half of the 1,600 proteins of the ulcer-causing
bacterium Helicobacter pylori into partial PIMs. Hybrigenics
has developed the 'PIM Rider' tool to score interactions and
visualise PIMs; it also identifies Selected Interacting Domains
(SIDs) involved in the various protein-protein interactions
listed in the database. Other proteomics technologies, such
as phage and ribosome display libraries and the purification
of complexes, will also be included in the programme.
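A protein interaction map can be thought of as an undirected graph whose nodes are proteins and whose edges are observed bait-prey interactions. The following Python sketch, which is an illustration only and not the data model of any tool named above, builds such a graph, groups the proteins into connected partial maps and reports the fraction of a proteome that has been linked. The protein identifiers and the proteome size are hypothetical.

```python
from collections import defaultdict


def build_map(pairs):
    """Build an undirected adjacency map from (bait, prey) pairs."""
    graph = defaultdict(set)
    for bait, prey in pairs:
        graph[bait].add(prey)
        graph[prey].add(bait)
    return dict(graph)


def partial_maps(graph):
    """Return the connected components of the graph, each as a set of proteins."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        components.append(group)
    return components


def coverage(graph, proteome_size):
    """Fraction of the proteome with at least one recorded interaction."""
    return len(graph) / proteome_size


# Hypothetical interactions in a hypothetical proteome of 10 proteins.
toy_pairs = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
toy_map = build_map(toy_pairs)
print(partial_maps(toy_map), coverage(toy_map, proteome_size=10))
```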
With the automation of the procedure to establish PIMs, it becomes
clear that the data have to be presented in dedicated, user-friendly
databases. Several protein sequence databases provide annotations
for describing protein-protein interactions. A European group
(Eilbeck et al., Manchester, UK) has developed INTERACT, an
object-oriented database that aims at providing the appropriate
architecture to store, query and analyse protein interaction
data. However, INTERACT cannot answer all the questions that
biologists would like to ask about their favourite molecules.
In addition, no quality control measures were proposed to
weed out the false positive results that arise with
the classical two-hybrid matrix approach, in which collections
of baits are tested against collections of preys and selective
conditions are not adapted to each bait as they are in a screening
approach. It would be of great interest for the scientific
community if this type of database could be matched and combined
with others dealing with structural domains, metabolic pathways
or protein families.
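As an illustration of what combining an interaction database with a protein family resource could look like, the sketch below uses Python's built-in sqlite3 module. The two-table schema, the column names and the numerical confidence score are assumptions made for this example; they do not describe the actual design of INTERACT or of any other existing database.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE interaction (bait TEXT, prey TEXT, score REAL);
        CREATE TABLE family (protein TEXT PRIMARY KEY, family TEXT);
        """
    )
    return con


def annotated_interactions(con, min_score):
    """List interactions at or above min_score with the family of each partner.

    LEFT JOIN keeps interactions whose partners have no family annotation;
    the missing family is then returned as None.
    """
    query = """
        SELECT i.bait, fb.family, i.prey, fp.family, i.score
        FROM interaction AS i
        LEFT JOIN family AS fb ON fb.protein = i.bait
        LEFT JOIN family AS fp ON fp.protein = i.prey
        WHERE i.score >= ?
        ORDER BY i.score DESC
    """
    return con.execute(query, (min_score,)).fetchall()


# Hypothetical records, for illustration only.
con = make_db()
con.executemany(
    "INSERT INTO interaction VALUES (?, ?, ?)",
    [("P1", "P2", 0.9), ("P2", "P3", 0.2)],
)
con.executemany(
    "INSERT INTO family VALUES (?, ?)",
    [("P1", "kinase"), ("P2", "phosphatase")],
)
print(annotated_interactions(con, min_score=0.5))
```

Filtering on a score is one simple way of expressing a quality threshold in a query; how such a score should be derived from the experimental data is a separate question that this sketch does not address.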
Contacts within the programme
Gerco Angenent
Roz Banks
Gerhard Behre
Thomas Benzing
Pierre-Alain Binz
Helian Boucherie
Ken Bradbury
Andrea Cabibbo
Giovanni Cesareni
Brian F.C. Clark
Jeremy Clarke
Graham Cook
Edgar F. da Cruz e Silva
Miguel A. De la Rosa
AJS Hawkins
Denis Hochstrasser
Yiguo Hong
Allan Karlsen
Roland Kellner
Hanna Kovarova
Patricia Kumar
Ozgur Kutur
Riitta Lahesmaa
Ivan Lefkovits
Pierre Legrain
Rune Linding
Colm J Lowery
Jan Maly
Antoni Matilla
Thomas Meyer
Chun-Ming Liu
Serge Muyldermans
Alfredo Nicosia
Jong Park
Gabriella Pocsfalvi
Ansgar Poetsch
Hans Prydz
Juan Recio
Jutta Reinhard-Rupp
Manuel Santos
Peter Schellenberg
Michaela Scigelova
Robert Sim
Sheo Mohan Singh
Susan Southon
Veronika Stoka
Mike Taussig
Joël Vanderkerckhove
Rajani Kanth Vangala
Frans Van Roy
Andras Varadi
Venkateshwar
Miguel Vicente
Juergen Wendland
Erwin Witters
Lode Wyns
Isik Yulug
Marketa Zvelebil