4th Integrative Bioinformatics Conference
10-12 September 2007
Ghent, Belgium
Organisers:
Jacob Köhler, Rothamsted Research, UK
Martin Kuiper, University of Ghent, Belgium
Ralf Hofestaedt, Bielefeld University, Germany
Draft Report
Summary
The fourth Integrative Bioinformatics workshop (IB07) was held as a three-day event. An international scientific committee selected the papers to be presented at the meeting, and the accepted papers were also published in a special issue of the Journal of Integrative Bioinformatics.
Top scientists in the field of Database Integration and Systems Biology were invited and gave keynote talks: David Searls, Carole Goble, Søren Brunak and Luis Serrano (see programme below). The small workshop format provided a good environment for scientists who are actively involved in the area, and for those wishing to enter it, to get together, examine the problems, discuss the challenges and forge new collaborations. For this reason, the workshop included an evening dinner and an informal reception, which kept all participants together and fostered interactions that would otherwise have been more limited.
IB07 was held in Ghent, Belgium, with Martin Kuiper as the local chairperson, supported by Eva Sugajska. Paul Verrier (Rothamsted Research, UK) provided and maintained the web-based submission, review and registration system, which was developed from the one used for the IB06 event. Paul Verrier, assisted by Karen Morris and Jan Taubert, also dealt with the financial aspects of the workshop. All talks were published in the Journal of Integrative Bioinformatics and the full papers were reproduced in the proceedings. Prof Dr. Ralf Hofestaedt, as the Editor in Chief, and Thoralf Toepel, the editorial manager of the Journal, ensured that the quality of the papers met international standards. Jacob Koehler (Rothamsted Research, UK) and Ralf Hofestaedt (University of Bielefeld, Germany) carried the overall responsibility for the workshop and the scientific content.
The workshop attracted 125 participants, made up of 45% PhD students, 45% academics and 10% from industry. More than 50% of participants came from Germany, Belgium, The Netherlands and the United Kingdom. The remainder came from other European countries, with a few from outside Europe. The workshop was mainly funded by the ESF, which made it possible to keep the registration costs low (standard registration: €120, students: €60). Eli Lilly contributed €250 for presenting a stand. Eli Lilly also covered the travel and accommodation costs of Dr Jacob Koehler (who had, at the time of the workshop, accepted a position with Eli Lilly in the USA). Besides covering some of the conference costs (conference dinner, invited speakers etc.), the ESF funding was used to partly fund a bursary scheme, which covered accommodation and registration costs for 20 European PhD students and travel bursaries for two PhD students who attended the workshop.
ESF funding for the workshop was acknowledged in the introductory remarks of Jacob Koehler. ESF funding was also clearly visible on the workshop webpage, in the call for papers and participants, the workshop proceedings and at the workshop venue.
Scientific Content
Format of the workshop: The workshop format enabled scientists in the field of Life Science data integration to present and discuss their work. To this end, each of the 25 accepted speakers gave a 20-minute talk, followed by a 5-minute discussion in which participants were given the opportunity to ask questions and to discuss the presented work. The four invited keynote speakers had 40 minutes to present and discuss their work. Another important component of the workshop was the two formal poster sessions, where 41 posters were presented. The poster sessions were very well attended, and the participants also used the coffee breaks to discuss their work and the posters, which remained on the poster boards throughout the workshop.
The second goal of the workshop was to provide time and space for informal discussions and networking opportunities. This was achieved by the social events, coffee breaks etc. Due to the informal nature of the results of the networking events, it is difficult to summarise them. One important outcome and demonstration of the success of the informal discussions that took place was the establishment of a Life Science Data Integration Forum. Further details of this are provided in the last section of the report (Assessment of the results and impact of the event on the future direction of the field).
Venue of the workshop: The venue turned out to be perfectly suited for the size of the workshop. The meeting was held in the cultural and conference centre of Ghent University, known as ‘Het Pand’. This facility used to be a Dominican monastery (founded in the 13th century), and is now one of the oldest historical buildings of the university. The building was recently completely restored to its old grandeur. At present the University uses the complete building. It houses a university (staff) restaurant, conference facilities, museums (e.g. medicine and holography), a magnificent ancient library (with numerous manuscripts) and a number of university administrative departments, e.g. the International Relations Office. It is located close to Ghent's most important historic (mainly medieval) buildings. This area of Ghent is magnificent, especially by night. Several low or medium budget hotels are present in the direct vicinity (50–80 Euro per night for a single room), all within walking distance. The facility has several meeting rooms available. The event made use of the ‘Rector Vermeylen’ room, which was almost completely filled (it can accommodate 130 people). The facility also provided the catering services, which were operated by a special unit of Ghent University.
Summary and motivation of the scientific content: In the current post-genomic era, the functional characterisation of genes is more challenging than the actual sequencing. A combination of many new high throughput techniques is used to integrate and extract information from large quantities of data, shifting the research focus of bioinformatics from sequence analysis techniques to Integrative Bioinformatics and Computational Biology, thereby enabling Systems Biology.
Biological data are scattered across hundreds of biological databases and thousands of scientific journals. Current high throughput genomics technologies generate large quantities of high dimensional data. Microarray, NMR, mass spectrometry, protein chips, gel electrophoresis data, Yeast-Two-Hybrid, QTL mapping, gene silencing and knockout experiments are all examples of technologies that capture thousands of data points, often in single experiments. The challenge for Integrative Bioinformatics is to capture, model, integrate and analyse these data, often alongside the sequence data, in a consistent way to provide new and deeper insights into complex biological systems. There is a continuing need to bring together scientists interested in this developing area to discuss and identify new challenges, and to disseminate novel tools and approaches. The fourth workshop on Integrative Bioinformatics served this purpose and was of interest to (systems biology) Life Scientists, Bioinformaticians and Computer Scientists.
Detailed summary: This section summarises the scientific content of the talks. Further detail can be found in the workshop programme below and on the IB07 web pages at http://www.rothamsted.bbsrc.ac.uk/bab/conf/ib07/prog.php
The accepted papers were published in the Journal of Integrative Bioinformatics and are available online free of charge at http://journal.imbio.de/.
The workshop was organised in five sessions covering four topics:
Database Integration and Integrative Databases
Integrative Systems Biology (2 sessions)
Data Analysis and Interpretation
Data Classification
The session topics were established bottom-up: rather than directing the content of the workshop from the outset, the scientific committee selected talks purely on the quality of the submitted papers. The workshop thus presented a relatively unbiased selection of talks that represents the direction in which the field is moving reasonably well.
Jacob Köhler and Martin Kuiper opened the workshop and welcomed the participants. They also gave an overview of the previous IB workshops, discussed future directions of the field and the IB workshop series and acknowledged ESF funding.
Prof Dr. Søren Brunak started the scientific part of the workshop with his invited talk entitled “Understanding interactomes by data integration”. The talk described a functional classification approach that predicts functional role categories in the “feature” space of the proteome, rather than using the “sequence” space of the genome. One important result from the work is that many proteins seem to display conservation in feature space rather than in sequence space, and the method is therefore able to transfer functional information from one species to another in new ways. This type of prediction can be integrated with experimental data, such as gene expression data and protein-protein interaction data, and interaction networks can be extracted and characterized. The talk focussed on cell cycle regulated proteins, including a comparative analysis across eukaryotic organisms.
This was then followed by the first session which was titled “Database Integration and Integrative Databases”. The papers in this session presented new technologies and approaches which are applicable to a range of biological problems.
The second day started with the keynote talk of Prof Carole Goble, which was titled “myExperiment: A MySpace for the Self-serving Bioinformatician?”. myExperiment is a new initiative from the myGrid project to create a Virtual Research Environment which makes it easier for workflow workers to share and discuss workflows and their related scientific artefacts; enables e-Scientists to share, re-use and repurpose workflows; and hopefully reduces time-to-experiment, shares expertise and avoids unnecessary reinvention. myExperiment draws upon social networking websites such as MySpace and YouTube, which are immediately familiar to the new generation of scientists. The idea is that scientists should be able to “shop” for workflows like they shop on Amazon. The talk gave an overview of the myExperiment initiative and of its technical, political and social setting, implications and challenges. In particular, it asked how to acknowledge the inherent self-interest of the scientist in order to gain participation, while still creating a market-place for workflows and a “water-cooler” experience for workflow gossip.
The two “Integrative Systems Biology” sessions were arguably the highlight of the workshop. Systems Biology is a scientific field that is currently benefiting from significant funding at national and international levels. These sessions presented research on modelling and simulation of biological systems as well as research on the analysis of complex regulatory networks. The talks showed that data integration is in many cases an important technology that underpins work on many Systems Biology problems. However, even though the technical and scientific quality of the research presented in the Systems Biology sessions was extremely high, it also showed that the field of Systems Biology is not yet very mature.
Luis Serrano was the third invited speaker. He talked about “Structures in Systems Biology”. He argued that structural biology should play a very important role in systems biology: although structures might be dispensable at the final stage of understanding a signal transduction pathway, a cell, an organ or a living system, they are needed to reach that stage. Structures of macromolecules, especially molecular machines, could provide quantitative parameters, help to elucidate functional networks or enable rationally designed perturbation experiments for reverse engineering.
When seen from a biological point of view, Serrano argued, systems biologists are still struggling to address open-ended biological problems, and it is evident that Systems Biology is still rather immature and full of promises and visions. In his view, it remains to be seen whether these promises can be realised. The final session of the day was on “Data Analysis and Interpretation”. Although some may have considered its content less exciting, the work presented in these talks successfully addressed a range of realistic biological problems using data integration technology.
The last day was opened by David Searls, who is the Senior Vice President, Informatics at GlaxoSmithKline. Based on the title of his talk, “Integrative Drug Discovery”, many participants probably expected a practical exposition of the routine problems of data integration in pharmaceutical R&D. However, David Searls took a step back from the prosaic issues faced in companies such as GSK and argued that Systems Biology and complex diseases appear to challenge traditional reductionist approaches to scientific investigation. Many of these challenges arise out of the study of biological pathways and in particular networks, which are often said to exhibit emergent properties, that is, behaviours of the whole system that are not predictable from those of its component parts. It is important to be precise about what is meant by reduction and emergence in this context, because of the potential consequences for therapeutic approaches oriented to targets “in isolation”. The talk touched on network topologies, dynamic behaviours, and such phenomena as pleiotropy, functional redundancy, crosstalk, etc. as they relate to computational analysis and therapeutic intervention.
The last session of the workshop presented talks on “Data Classification”. Like the earlier session on “Data Analysis and Interpretation”, the talks in this session also showed that integrative bioinformatics is a discipline which successfully addresses realistic biological problems.
Assessment of the results & impact of the event
The 4th Integrative Bioinformatics workshop was very successful. It attracted more paper submissions and participants than any of the previous workshops in this series.
Data integration as a scientific discipline is almost as old as sequence analysis. However, whereas sequence analysis as a discipline has established standard tools and algorithms as famous as BLAST and the Smith-Waterman algorithm, data integration is just starting to deliver practical results. This, together with the urgent need to integrate and understand the ever increasing amount of life science data, will turn data integration into one of the key disciplines in the field of Bioinformatics and Medical informatics. Although this field of research does not receive the same attention as Systems Biology, the organisers of the workshop are convinced that data integration will have a strong and sustained impact on the Life Sciences, potentially more important and long-lived than Systems Biology.
The workshop was a good training ground with nearly half the participants being PhD students. Students were able to get a wide view of the breadth of the science relevant to the conference.

In addition, the participants formed a wide international mix, which gave immense opportunities to forge new collaborations. This will certainly influence the development of the science and is a major impact of the workshop in this area of work.

Together with the DILS workshop series, which attracts a similar number of submissions and participants, IB has established itself as a leading workshop in data integration. It thus plays an important role for the Life Science data integration community, where new ideas are presented and where scientists who work in this field meet and initiate collaborations.
One specific result of the workshop is the establishment of a Life Science Data Integration Forum:
The aim of the forum is to provide a platform for exchanging experience and ideas and for initiating collaborations in the field of data integration. This will bring together leading international experts from academia and industry. The forum will consist of a series of bi-annual meetings, from which initiatives and sub-teams may spin off. The forum will include, but is not limited to, the following topics:
Joint pilot studies and pre-competitive research
Exchange of experience from technology evaluation
Initiation of joint projects
Organization of the annual Integrative Bioinformatics workshop
- IB ’08 – Lutherstadt Wittenberg, Germany
- IB ’09 – Indianapolis, USA
- IB ’10 – UK
Organization of scientific/technical meetings on specific data integration topics
Coordination among large data integration projects to materialize potential synergies. Some current projects which expressed an interest in coordinating their activities are:
- ComparaGRID http://bioinf.ncl.ac.uk/comparagrid/
- eSysBio/FUGE http://www.bioinfo.no/
- ONDEX SABR http://ondex.sourceforge.net/
Joint fund raising efforts
Knowledge transfer
Promotion of data and software standards and sharing best practices
Prof Ralf Hofestaedt and Dr Thoralf Toepel have submitted an ESF networking proposal to financially underpin this activity.
In the future, the IB organisers plan to restrict the growth of the meetings in terms of number of participants to retain the workshop style. The size of the IB meetings (100 – 130 scientists) is big enough to attract leading scientists, but it is small enough to allow people to establish new contacts and collaborations. The organisers do, however, plan to further improve the quality of the presented talks.
We expect that several topics will become increasingly important. At a technical level, aspects of semantic data integration will remain an important field of research and in the near future it is expected that more applications of new standards such as RDF/OWL will emerge. In terms of applications, it is expected that the support and contributions to Systems Biology will remain an important topic of data integration. However, as already seen in this year's meeting, data classification in combination with data integration is a new area of research in which data integration technologies will play an increasingly important role. This is not surprising, since such techniques can be applied to the prioritisation of genetic and molecular targets, to disease diagnostics and to the development of tailored therapeutics. All these applications will benefit if data integration and data classification techniques come together and make use of data from a wide range of sources.
Programme
Monday 10th September 2007

11:30  Registration desk opens
12:30  Lunch (buffet style)
13:10  Welcome followed by some administrivia
       Jacob Köhler and Martin Kuiper

Chairman: Ralf Hofestädt

13:20  Keynote: Understanding interactomes by data integration
       Søren Brunak (Technical University of Denmark)

Session 1: Database Integration and Integrative Databases

14:00  A Methodology for Comparative Functional Genomics
       Alam, I; Cornell, M; Soanes, D M; Hedeler, C; Wong, H M; Rattray, M; Hubbard, S J; Talbot, N J; Oliver, S J; Paton, N W
14:25  Community-based Linking of Biological Network Resources: Databases, Formats and Tools
       Telgkamp, M.; Koschützki, D.; Schwöbbermeyer, H.; Schreiber, F.
14:50  Defining Mapping Mashups with BioXMash
       Hunt, E; Jakubowska, J; Boesinger, C; Norrie, M
15:15  Data Linkage Graph: computation, querying and knowledge discovery of life science database networks
       Lange, M.; Himmelbach, A.; Schweizer, P.; Scholz, U.
15:40  Coffee break

Chairman: Falk Schreiber

16:00  Exploring PSI-MI XML Collections Using DescribeX
       Samavi, R.; Consens, M.P.; Khatchadourian, C.; Topaloglou, T.
16:25  VINEdb: a data warehouse for integration and interactive exploration of life science data
       Hariharaputran, S; Töpel, T; Brockschmidt, B; Hofestädt, R
16:50  Analysis of integrated biomolecular networks using a generic network analysis suite
       Oesterheld, M; Mewes, HW; Stümpflen, V
17:15  The OXL format for the exchange of integrated datasets
       Taubert, J; Sieren, KP; Hindle, M; Hoekman, B; Winnenburg, R; Philippi, S; Rawlings, C; Köhler, J.
17:40  Poster Session
18:30  Belgian Beer Festival

Tuesday 11th September 2007

Chairman: Martin Kuiper

09:30  Keynote: myExperiment: A MySpace for the Self-serving Bioinformatician?
       Carole Goble (University of Manchester, UK)

Session 2: Integrative Systems Biology I

10:10  Integration of Heterogeneous Cis-Antisense Gene Pair Data Sets Mapping onto the Human Genome
       Orlov, YL; Zhou, J; Kuznetsov, VA
10:35  Integration of constraints documented in SBML, SBO, and the SBML Manual facilitates validation of biological models
       Lister, AL; Pocock, M; Wipat, A
11:00  Coffee break

Chairman: Anil Wipat

11:20  Stochastic effects in a compartmental model for mitotic checkpoint regulation
       Ibrahim, B; Dittrich, P; Diekmann, S; Schmitt, E
11:45  Reactome: An integrated expert model of human molecular processes and access toolkit
       de Bono, B; Vastrik, I; D'Eustachio, P; Schmidt, E; Gopinath, G; Croft, D; Gillespie, M; Jassal, B; Lewis, S; Matthews, L; Wu, G; Birney, E; Stein, L
12:10  A High-Level Petri Net Framework for Genetic Regulatory Networks
       Banks, R; Steggles, LJ
12:35  Lunch (buffet style)

Chairman: Jacob Köhler

13:30  Keynote: Structures in systems biology
       Pedro Beltrao, Christina Kiel and Luis Serrano (CRG-EMBL Systems Biology Unit, Spain)

Session 3: Integrative Systems Biology II

14:10  From MIN model to ordinary differential equations
       Yartseva, A; Devillers, R; Klaudel, H; Kepes, F
14:35  GeneBrowser: an approach for integration and functional classification of genomics data
       Arrais, J.; Santos, B.; Fernandes, J.; Carreto, L.
14:50  Functional and Transcriptional Coherency of Modules in the Human Protein Interaction Network
       Futschik, ME; Chaurasia, G; Tschaut, A; Russ, J; Babu, MM; Herzel, H
15:15  Coffee break

Chairman: Thodoros Topaloglou

Session 4: Data Analysis and Interpretation

15:45  Mining for Single Nucleotide Polymorphisms in Expressed Sequence Tags
       Souche, E.L.; Hellemans, B.; Van Houdt, J.K.J.; Canario, A.; Klages, S.; Reinhardt, R.; Volckaert, F.A.M.
16:10  Mapping protein information to disease terminologies
       Mottaz, A.; Yip, Y.L.; Ruch, P.; Veuthey, A.-L.
16:35  Developmental Anatomy Ontology of Zebrafish - An Integrative semantic framework
       Belmamoune, M.; Verbeek, F.J.
17:00  CIDA: An integrated software for the design, characterisation and global comparison of microarrays
       Khalid, S; Khan, M; Symonds, A; Fraser, K; Wang, P; Liu, X; Li, S
17:25  Poster session
19:30  Conference dinner

Wednesday 12th September 2007

Chairman: Paul Verrier

09:30  Keynote: Integrative Drug Discovery
       David Searls (GlaxoSmithKline Pharmaceuticals, USA)

Session 5: Data Classification

10:10  Supervised classification of combined copy number and gene expression data
       Riccadonna, S; Jurman, G; Merler, S; Paoli, S; Quattrone, A; Furlanello, C
10:35  IMS2 -- An integrated medical software system for early lung cancer detection using ion mobility spectrometry data of human breath
       Baumbach, J; Bunkowski, A; Lange, S; Oberwahrenbrock, T; Kleinboelting, N; Rahmann, S; Baumbach, JI
11:00  Prediction of protein-protein interactions using one-class classification methods and integrating diverse biological data
       Reyes, J. A.; Gilbert, D. R.
11:25  Coffee break

Chairman: Thoralf Töpel

11:45  Monophyletic clustering and characterization of protein families
       Zhang, J.; Zhao, Z.; Evershed, J.; Li, G.
12:10  A Tool for Evaluating Strategies for Grouping of Biological Data
       Jakoniene, V; Lambrix, P
12:35  Closing remarks
12:40  Lunch (buffet style) and Depart