Computational methods for RNA analysis
26 July - 8 August 2009
Benasque, Spain

Organisers:

Eric Westhof: University Louis Pasteur of Strasbourg and Institut Universitaire de France, France
Elena Rivas: Janelia Farm Research Campus, Ashburn, Virginia, USA

Draft Report

Scientific Content

Every day, several seminars were held on RNA structure. They spanned the whole range of RNA knowledge and also covered some practical and experimental aspects of RNA research. It is the only school in the world where the major scientists in RNA bioinformatics assemble and discuss their research freely. The gap between theoretical scientists and experimentalists is especially large in this field because of the complexity of the theoretical approaches and the sophistication of the experimental techniques. At the same time, the difficulties in communicating the real needs of practical scientists to computer scientists are real.

On the first day, we had talks by experimentalists explaining what they were looking for and their despair at communicating, in precise terms, their needs for more computer science. An amazing talk was given on the discovery, on the basis of metagenomes, of numerous novel RNAs in the sea and in extreme environments. The computer tools for such searches are far from trivial. The second day was dedicated to the de novo searching of RNAs.

The semantics of family grammars were all reviewed in depth, emphasising the power and the limits of each of them and making sure the audience grasped all points. These grammars are central to RNA research. Later, all programmes dedicated to searching for RNAs were reviewed and assessed: How well do they perform? What are the limitations? What goes wrong? Why is there so little overlap between the results of the various programmes for the same sets of experimental data?

On the third day, more classical approaches were tackled, especially partition functions and the underlying problems of the algorithms and combinatorics of RNA sampling. The state of the art of 2D structure prediction was reviewed, along with the tools available for computations that include pseudoknots.
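As an illustration of the partition-function approach mentioned above, here is a minimal sketch in Python. It rests on strong simplifying assumptions that are not taken from the meeting: every base pair contributes the same fixed energy (`pair_energy`), loops carry no energy, pseudoknots are excluded, and a hairpin must enclose at least `min_loop` unpaired bases. The function name and all parameter values are hypothetical; real tools use far more detailed nearest-neighbour energy models.

```python
import math

# Watson-Crick pairs plus the G-U wobble pair.
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def partition_function(seq, pair_energy=-1.0, kT=0.6, min_loop=3):
    """Sum of Boltzmann weights over all pseudoknot-free secondary structures.

    Toy model (assumption): each base pair contributes `pair_energy`
    (same units as `kT`), and loops contribute nothing.
    """
    n = len(seq)
    weight = math.exp(-pair_energy / kT)  # Boltzmann weight of one pair
    # z[i][j] is the partition function of the half-open slice seq[i:j].
    # Empty slices keep the initial value 1.0 (only the empty structure).
    z = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            # Case 1: base i is unpaired.
            total = z[i + 1][j]
            # Case 2: base i pairs with base k, splitting the slice into the
            # part enclosed by the pair and the part to the right of k.
            for k in range(i + min_loop + 1, j):
                if (seq[i], seq[k]) in CAN_PAIR:
                    total += weight * z[i + 1][k] * z[k + 1][j]
            z[i][j] = total
    return z[0][n]


# With pair_energy=0.0 every structure has weight 1, so the result is simply
# the number of admissible structures.
print(partition_function("GGAAACC", pair_energy=0.0))  # 6.0
```

Because each structure is counted exactly once (base i is either unpaired or paired with exactly one partner k), the same recursion can in principle serve as the basis for sampling structures according to their Boltzmann weights.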

On the fourth day, the central role of databases was discussed. What are the practical tools? Why are today's databases insufficient? The famous Rfam database was much discussed. How can it be improved? The available tools are not reliable for automatic classification; manual intervention is necessary. Can we promote it, in a Wikipedia style, with everyone improving the annotations and alignments? Because databases contain sequences extracted in part by homology searches (and annotated this way too), databases cannot be better than the tools used for the searches and the alignments. A recurrent question on this issue is "what is the meaning of homology?" Although the theoretical understanding, based on Darwinian evolution, is well appreciated, its manifestation at the sequence level (and especially at the 3D structure level) is much less so.

The last days of the week were devoted to a continuation of the preceding discussion, with emphasis on local alignments and on alignments with respect to a given 2D structure. The integration of substructures is, as with the combinatorics, a cumbersome problem.

Further, we had discussions on how to assess the validity of 3D structural models. There now exist several programmes that automatically produce 3D models. What is their validity? How close are they to reality? A clear assessment of the proximity between prediction and reality (again, the definitions of what is meant by reality, or of what the prediction is compared to, have to be settled first) is absolutely necessary in order to improve the modelling methods and our common understanding of RNA structure and function.

The following week started with presentations and discussions on new technologies for fast and deep sequencing. The impact of those technologies on RNA biology is enormous, but we need to cope with the production of data and their interpretation. How should short reads of RNAs be treated? Can we produce RNA structures in a high-throughput way? Those were some of the questions treated. In this framework, visualisation tools are critical, since our brains cannot process raw data of such magnitude. A whole afternoon was dedicated to visualisation tools. Clearly, although many such tools are similar and redundant, several others that are needed to reach the experimentalists are still missing.

The next day was dedicated to RNA ontologies and to explaining why they are needed. Can we use ontologies to improve alignments? What is the relative reliability of sequence and structure alignments? Which kinds of benchmarks should be offered?

Following this discussion, instead of analysing isolated RNAs, we started to look closely at RNA-RNA interactions. Such interactions are key intramolecularly for the folding of the RNA architecture, and intermolecularly for the binding of microRNAs to their targets or of other non-coding RNAs to other RNAs.

The accessibilities of the RNAs are important parameters. Such accessibilities can in principle be calculated on the basis of the secondary structure but, without knowledge of the types and numbers of proteins bound to the RNAs, the calculated values are not particularly valuable.

Next, we had to consider the folding kinetics of RNAs, and especially the steps occurring during the synthesis of the RNA itself on the polymerase. In bacteria, it is now experimentally proven that kinetics is exploited by biology; i.e. the RNA adopts different conformations depending on the number of nucleotides produced and the speed of synthesis. In other words, one can no longer consider thermodynamic stability as the sole criterion for selecting the secondary structure of an RNA. Experimentally, some structures are produced under one condition and do not change when transferred to another condition in which a second structure is stabilised. Kinetics adds a huge difficulty to an already extremely complex and subtle field.

On the last day, we discussed RNA systems biology, a new buzzword. This was kept for the last day since it encompasses all of the other points discussed during the meeting. Can we integrate all the data gathered in a coherent and useful fashion? Are we stuck with collecting butterflies without a deep understanding of the underlying biology?