Proteomics

In silico methods for the description of cellular systems by data and literature mining, predictions and simulations

Co-ordinators:
Alfonso Valencia, CNB-CSIC, Campus U. Autónoma de Madrid, Spain
François Rechenmann, Unité de Recherche INRIA Rhône-Alpes, Montbonnot-Saint-Martin, France

This area focuses on the development of new approaches and computational tools for functional genomics, a field that opens up new possibilities and raises new requirements. We see the direct contacts and exchanges with experimental biologists that will take place within the framework of this programme as an excellent opportunity for the bioinformatics community to access information obtained with state-of-the-art experimental technologies, and to collect requirements and feedback on its own work and projects. The current scope of in silico methods is very broad, covering many different topics that pave the 'virtual path' leading from sequence to global function. As a starting point for our activities, we propose to focus on four emerging domains, with the possibility of incorporating new methodologies as they appear.

1. Prediction of protein-protein interactions based on the analysis of multiple sequence alignments. These methods build on earlier developments in sequence analysis and protein structure prediction within bioinformatics. Recent advances in molecular biology have provided a vast amount of genetic information for many different organisms. One of the most challenging current issues is to establish the possible interactions between different protein components at different levels, in what has been called 'neighborhood relationships'. Rather than focusing on direct physical interactions, a number of computational efforts have recently addressed the problem of predicting proteins with general functional relationships. Functional interactions have been predicted by comparing the species distributions of gene pairs: two genes that are consistently present together or absent together across genomes are taken to be functionally linked. These methods assume that a genome encoding one member of an interacting pair will necessarily also encode its partner. Marcotte et al. and Enright et al. predicted protein interactions for multidomain proteins that present a variety of domain arrangements in different organisms. Although all these approaches have promising features, they are still unable to cope with the complexity and extent of protein interaction networks in real systems. Much therefore remains to be done, both in developing new approaches and in integrating existing ones.

2. Prediction of protein-protein interactions based on the study of regulatory and other genomic signals, using data provided by genome analysis and genome comparison applications. Dandekar et al. identified a relationship between genes that are contiguous in bacterial chromosomes and proteins that form part of protein complexes. A different approach compared the frequencies of neighbouring genes in different genomes and related them to the cellular function of the encoded proteins. Other studies have addressed the prediction of protein function by studying the distribution and conservation of genomic structures in different systems. Nevertheless, our knowledge of the evolutionary forces and processes that shape the organization of genomes is far from complete, and general approaches able to capture the relationship between genomic and functional organization still have to be developed.

3. Extraction of information on protein-protein interactions by systematic analysis of text sources, based on data mining and text analysis techniques. Very recently, new approaches have appeared for the extraction of information on protein-protein interactions. These initial systems build on previous experience in detecting significant, characteristic keywords in sets of Medline abstracts referring to protein families, where statistical methods alone were sufficient to generate meaningful results, without the need for syntactic analysis. The challenge ahead is to combine more refined statistical methods with other new computational techniques in order to improve the coverage and accuracy of the detected interaction networks. Current approaches could also be extended beyond protein interactions to related biological issues, such as DNA-protein interactions, drug-protein binding, tissue distribution and disease-associated characteristics. Furthermore, problems in molecular biology will connect with medical informatics, where access to clinical records and medical information is currently a demanding issue.

4. Simulation of the behaviour of metabolic and signalling pathways with techniques that include numerical and logical descriptions of interactions. It is reasonable to expect that in the near future the amount of genomic and functional information available will be sufficient to define most cellular functions and interactions. Once all this information has been integrated, the molecular biology and bioinformatics communities will be in a position to take a new step towards the reconstruction of interaction networks and the simulation of their behaviour. Although little practical work has yet been done in this direction, we would like to stimulate the introduction of new ideas by increasing the cross-talk between the different disciplines mentioned in this programme.
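
As an illustration of what a purely logical description of interactions in the fourth domain might look like, the following minimal Python sketch simulates a Boolean network with synchronous updates. The four-component cascade, its component names (signal, kinase, inhibitor, target) and its update rules are hypothetical and chosen only for illustration; they do not describe any pathway studied in this programme.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]

# Hypothetical toy signalling cascade; names and rules are illustrative only.
RULES: Dict[str, Callable[[State], bool]] = {
    # External input, held at its initial value.
    "signal": lambda s: s["signal"],
    # Kinase is active when the signal is present and the inhibitor is absent.
    "kinase": lambda s: s["signal"] and not s["inhibitor"],
    # Inhibitor is induced by the target (negative feedback).
    "inhibitor": lambda s: s["target"],
    # Target is activated by the kinase.
    "target": lambda s: s["kinase"],
}


def step(state: State) -> State:
    """Synchronous update: every component is recomputed from the previous state."""
    return {name: rule(state) for name, rule in RULES.items()}


def simulate(initial: State, n_steps: int) -> List[State]:
    """Return the trajectory: the initial state followed by n_steps updates."""
    trajectory = [dict(initial)]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1]))
    return trajectory


if __name__ == "__main__":
    start = {"signal": True, "kinase": False, "inhibitor": False, "target": False}
    for t, state in enumerate(simulate(start, 7)):
        active = [name for name, on in state.items() if on]
        print(t, active)
```

With the signal switched on, this toy network does not settle into a steady state but cycles with period six, the kind of qualitative behaviour (oscillation driven by negative feedback) that a logical model can expose before any kinetic parameters are known. A numerical description would replace the Boolean rules with rate equations.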

Contacts within the programme
Yaqoub Ashhab
Francisco Azuaje
Gerhard Behre
Soren Brunak
Raffaele A. Calogero
Rita Casadio
W J Coadwell
Werner Dubitzky
Franca Fraternali
Robert Glen
Alessandro Guffanti
Roderic Guigo
John Hancock
Des Higgins
Turgay Ibrikci
Juha Kere
Colm J Lowery
John Mitchell
Vaclav Paces
Syed Asad Rahman
François Rechenmann
Mischa Reinhardt
Alfonso Valencia
Paul Van der Vet
H. J. van der Wijk
Rajani Kanth Vangala