The CCB focuses specifically on predictive cell biology, a research area that uses computational methods to understand, explain, and predict complex function from basic molecular building blocks. This work emerges from several simultaneous streams of research. The project’s bedrock is genome sequence, informed by evolutionary understanding and inference. At the same time, we aim to explore the ‘parts list’ of a cell: starting from the genome, we aim to identify and understand the functions of the genomic regions, proteins, and other biomolecules that act in cellular processes. While this task is in progress, work is also under way to describe quantitatively the dynamic interactions of these cellular components. All of these enterprises proceed from the computational and statistical analysis of vast quantities of molecular data.
Predictive cell biology will offer challenges for many decades, and our program will be responsive to new directions. The field will lay the groundwork for a new understanding of the differences between cells, and yield insight into how differences in genome sequence, composition, and organization across the kingdoms of life manifest themselves as differences in cellular behavior. The resulting models will thus touch upon evolutionary, developmental, and comparative biology. Ultimately, predictive cell biology will inform the development of medical treatments and our understanding of how we interact with the environment.
The following examples, while by no means exhaustive, illustrate the scope of our agenda.
Phylogeny and systematics
Database design and management
Devise methods for specifying, constructing and accessing distributed databases for heterogeneous biological data such as images, protein and DNA sequences, gene expression data, and published articles – as well as the relations amongst these data.
Applied Statistics and Statistical Computing
Devise multivariate statistical learning methods and software for cluster analysis, prediction, computational inference, causal inference, multiple testing, and model/feature selection. Although biological questions can be highly specific, statistical and computational methodologies are general and can be applied to an extraordinary variety of biological questions, such as phylogeny, expression analysis, transcription regulation, molecular process modeling, and protein structure and function prediction.