Keynotes

Not just stamp collections: Molecular Biology Databases in the 21st century.

Alex Bateman, EMBL-European Bioinformatics Institute.

"Head of Protein Sequence Resources at European Bioinformatics Institute. Pfam/Rfam generalissimo and Wikipedia evangelist" [from his Twitter profile].
Apart from being the leader of some of the most important databases and data curation efforts, Dr. Bateman is actively involved in bridging the gap between scientists and the public, particularly through the use of tools such as Wikipedia. His address will surely be of great interest and value to students and young researchers.

There are now thousands of biological databases available, covering almost every aspect of molecular and computational biology. Some of these have become so important and all-pervasive that we scarcely even notice them any more, except when they go wrong. I have watched this growth for the past twenty years and have been involved in producing some popular databases such as Pfam and miRBase, and I am now also leading the UniProt database. None of these would have been possible without the hard work and dedication of Biocurators. To me, Biocurators are the unsung heroes of biology who order and make sense of the immense primary literature for us. The International Society of Biocuration has given this disparate group of researchers a forum to exchange the latest developments and a stronger voice on the issues they care about.

Cancer Gene Network Analysis with Supercomputer

Satoru Miyano, Institute of Medical Science of the University of Tokyo

Satoru Miyano is a Professor at the Human Genome Center of the Institute of Medical Science of the University of Tokyo. His mission is to create computational strategies for systems biology and medicine towards translational bioinformatics, based on the recent advances in biomedical research that have been producing large-scale, ultra-high dimensional, ultra-heterogeneous data.

Cancer is a very complex disease that arises from the accumulation of multiple genetic and epigenetic changes in individuals who carry different genetic backgrounds and have suffered from distinct carcinogen exposures. These changes affect various pathways that are necessary for normal biological activities, and at the center, gene networks in disorder are driving these pathways. We present our computational methods and their analyses in Cancer Systems Biology that use the supercomputer system at the Human Genome Center of The University of Tokyo (225 TFLOPS, 4.4 PB storage). A big challenge is the development of a systematic methodology for unraveling gene networks and their diversity lying over genetic variations, mutations, environments and diseases. We present our efforts to uncover systems in cancer from gene expression profiles using the supercomputer. NetworkProfiler is a method that shows how gene networks vary from patient to patient according to a modulator, which is any score representing characteristics of cells, e.g. survival. First, we defined an EMT (epithelial-mesenchymal transition) modulator and analyzed gene expression profiles of 762 cancer cell lines. Network analysis unraveled global changes in networks of 13,508 genes across different EMT levels. By focusing on E-cadherin, 24 genes were predicted as its regulators, of which 12 have been reported in the literature. A novel EMT regulator, KLF5, was also discovered in this study. We also analyzed Erlotinib resistance networks using 160 NSCLCs with GI50 as a modulator. Hubness analysis showed that NKX2-1/TTF-1 is the key gene for Erlotinib resistance in NSCLCs. Our microRNA/mRNA gene network analysis with a Bayesian network method called SiGN-BN also revealed subnetworks with hub genes (including NKX2-1/TTF-1) that may switch cancer survival.
For dynamic system modeling, we devised a state space model (SSM) with a dimension reduction method for reverse-engineering gene networks from time-course data, with which we can view their dynamic changes over time by simulation. We succeeded in computing a gene network with prediction ability focused on 1500 genes from data of about 20 time-points. We applied this SSM to normal human lung cells treated with (case) or without (control) Gefitinib, and we identified genes under differential regulation between case and control. This gene signature was used to predict prognosis for lung cancer patients and showed good performance for survival prediction of stage 1 patients, which has been considered very difficult. Ongoing cancer research using the K computer (a 10 PFLOPS supercomputer in Kobe, Japan) is also introduced.

Computational Biology and Human Genetics

Gonçalo Abecasis, University of Michigan

Gonçalo Abecasis is a Professor of Biostatistics. He received his D.Phil. in Human Genetics from the University of Oxford in 2001 and joined the faculty at the University of Michigan in the same year. Dr. Abecasis' research focuses on the development of statistical tools for the identification and study of genetic variants important in human disease. Software developed by Dr. Abecasis at the University of Michigan is used in several hundred gene-mapping projects around the world.

During the past 10 years, human genetics has progressed at a rapid pace, aided not just by improvements in laboratory technology and the availability of large population samples for study, but also by the creative and key contributions of computational biologists. I will review challenges and opportunities in human genetic research, with an emphasis on the important contributions that young computational biologists can make.