What is CSBi?

The MIT Computational and Systems Biology Initiative (CSBi) is a campus-wide education and research program that links biologists, computer scientists and engineers in a multi-disciplinary approach to the systematic analysis of complex biological phenomena. CSBi places equal emphasis on computational and experimental methods and on molecular and systems views of biological function. Multi-investigator research in CSBi is supported through a sophisticated research infrastructure, the CSBi Technology Platform.

CSBi includes about eighty faculty members from over ten academic units across MIT's Schools of Science and Engineering, the Sloan School of Management, and the Whitehead Institute for Biomedical Research.


Research and Technology Goals

Overview

The overall goal of CSBi is to foster links among biology, engineering, and computer science and to create interdisciplinary, multi-investigator teams to undertake the systematic analysis of complex biological phenomena. CSBi places equal emphasis on computational and experimental research and on molecular and systems-level views of biological function. CSBi retains a fundamental commitment to an academic tradition placing graduate students and postdoctoral fellows at the forefront of scientific inquiry. At the same time, CSBi recognizes the increasing dependence of biological research on multidisciplinary teams and sophisticated technologies.

One of CSBi's primary research objectives is the development of methods and devices that can measure, in a systematic and precise manner, the biochemical properties of biomolecules in cells, tissues, and whole organisms. We expect many of these measurement devices to incorporate novel technology and micro-fabricated components. A second CSBi objective is building mathematical models of biological systems that link mechanistic information on molecular function to systems-wide understanding of networks and interactions. Like models in mature fields of engineering, systems biology models will be able to capture empirical information as it accumulates and will have the ability to predict experimental outcomes. It is models, not databases, that represent the most effective way to store and propagate knowledge.


Biological Complexity

The chemical and physical processes in living cells are extremely complex. The great strength of molecular genetics, a research paradigm that has dominated life sciences for over 50 years, is its ability to unravel biological problems one gene (or protein) at a time. Paul Ehrlich noted, in his 1908 Nobel lecture, that this method "does not solve the secret of life itself, which may be compared with the complicated organism of a mechanical work of art, but nevertheless the possibility of taking out individual wheels and studying them exactly signifies an advance compared with the old method of breaking into pieces the whole work and then trying to deduce something from the mixture of broken pieces." It is becoming increasingly clear, however, that component-by-component analysis will not suffice in the study of signal transduction, oncogenic transformation, neurobiology, and other processes in which many genes interact.

Biological systems are characterized by distinct types of complexity that define a multi-dimensional landscape. On one axis, the complexity of the system increases as the number of molecular species under investigation rises from one to a complete genome's worth. On a second axis, mechanistic complexity increases as the type of data changes from sequence to structure, to subcellular localization, and then to time-dependent changes in protein activity in cells. On a third axis, the complexity of the biology increases from cells to tissues, to organisms, and then to populations.

The early phases of systems biology have been dominated by an emphasis on studying ever more genes in simple settings and using simple types of data. This focus has been necessary because a trade-off exists between greater complexity and our ability to extract meaningful insight. A major goal of systems biology research is to develop tools to tackle multiple sources of biological complexity at the same time in an effective and rigorous fashion.


A Central Role for Modeling

A basic premise of CSBi is that understanding complex biological processes will require the development of models that combine quantitative rigor with molecular detail. In a physicochemical model, for example, elementary reactions such as physical association, biochemical transformation, and changes in cellular compartment are expressed as equations (often ordinary differential equations, or ODEs) that are then linked into a large system. The resulting numerical model is based on literature data whenever possible (a process that often uncovers major inconsistencies in the literature) and is trained with systematically acquired experimental data. The model can then be explored in silico with the goal of making predictions to be tested experimentally. Our thesis is that these models can be refined and expanded over time by capturing experimental data and interpretive knowledge as they accumulate – there is no initial requirement that the entire genome be captured. Effective models are expected to have significant predictive power and to increase the effectiveness of experimental analysis. The models are also likely to be hybrids incorporating highly detailed representations of critical reactions and more granular and flexible views of the system as a whole. CSBi is therefore promoting the development of a wide range of modeling methods, from data mining to ODE networks. A belief in the primacy of these formal (or numerical) models contrasts with the relatively informal approach to modeling in molecular biology.
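
As a rough illustration of how an elementary reaction becomes an ODE, the sketch below models a single reversible association, A + B <-> C, with mass-action kinetics and integrates it with a hand-written fourth-order Runge-Kutta step. The reaction, the rate constants k_on and k_off, the initial concentrations, and all function names are hypothetical choices made for this example; they are not taken from any CSBi model. A real model would link many such equations and fit the parameters to data.

```python
# Minimal sketch: one reversible binding reaction, A + B <-> C, as an ODE system.
# All names and numbers here are illustrative assumptions, not measured values.

def binding_rhs(state, k_on, k_off):
    """Return (d[A]/dt, d[B]/dt, d[C]/dt) under mass-action kinetics."""
    a, b, c = state
    flux = k_on * a * b - k_off * c  # net rate of the forward reaction
    return (-flux, -flux, flux)


def rk4_step(rhs, state, dt, *params):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(x + scale * s for x, s in zip(base, slope))

    k1 = rhs(state, *params)
    k2 = rhs(shifted(state, k1, dt / 2), *params)
    k3 = rhs(shifted(state, k2, dt / 2), *params)
    k4 = rhs(shifted(state, k3, dt), *params)
    return tuple(
        x + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
        for x, s1, s2, s3, s4 in zip(state, k1, k2, k3, k4)
    )


def simulate(state, k_on, k_off, dt=0.01, steps=5000):
    """Integrate the reaction for steps * dt time units and return the final state."""
    for _ in range(steps):
        state = rk4_step(binding_rhs, state, dt, k_on, k_off)
    return state


if __name__ == "__main__":
    a, b, c = simulate((1.0, 1.0, 0.0), k_on=1.0, k_off=0.5)
    print(f"A={a:.4f} B={b:.4f} C={c:.4f}")
```

With these illustrative values the system relaxes toward A = B = C = 0.5, the point at which k_on * A * B equals k_off * C. Larger models are usually handed to a library integrator instead of a hand-written step; the explicit version is used here only to keep the example self-contained.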

To imagine how formal models of biological systems will be developed and applied, we can turn to historical experience with combustion engineering, semiconductor fabrication, and metabolic engineering. In each of these cases, large numbers of experimental observations on chemical reactions were organized into systematic models that represent complex time-dependent processes. The models have been refined over many years of study and now have the ability to capture interpretive and experimental results. As the models mature, they are adopted commercially as central components of industrial design. By analogy, we anticipate that biological models will be extremely valuable in pharmaceutical R&D.


The Data Challenge

We often read that biologists are being overwhelmed with data and that the field is data rich. However, we believe that biology is actually data poor relative to the complexity of the problems being tackled. Developing accurate systems-wide models of biological processes will require a body of self-consistent experimental data covering a substantial number of biochemical processes specific to the problem under study (a particular cell type, for example). We refer to this as a systematic data set. We believe that the creation of systematic data sets for all but the simplest types of data (protein sequence, RNA expression levels, etc.) will be a substantial challenge. We also anticipate that the construction of systematic data sets will involve innovation in three areas: (1) the development of new experimental methods to monitor key biological reactions in cells, tissues, and organisms; (2) mathematical modeling of experimental methods to uncover sources of variation and establish the "degree of belief" associated with individual measurements (values and their probability density functions); and (3) novel informatics methods to gather and fuse measurements into reliable self-consistent data sets suitable for probabilistic analysis.
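
One simple way to picture the "degree of belief" idea and the fusing of measurements is precision-weighted averaging. The sketch below assumes that each measurement of a quantity is independent and Gaussian, described by a mean and a standard deviation; both the assumption and the function name fuse_gaussian are illustrative and are not a description of CSBi's informatics methods.

```python
import math

# Minimal sketch: fuse independent Gaussian measurements of the same quantity.
# The Gaussian and independence assumptions are made only for this example.

def fuse_gaussian(measurements):
    """Combine (mean, std_dev) pairs into one precision-weighted estimate.

    Returns the fused mean and the standard deviation of the fused estimate.
    """
    if not measurements:
        raise ValueError("need at least one measurement")
    if any(sd <= 0 for _, sd in measurements):
        raise ValueError("standard deviations must be positive")

    # Precision is the inverse variance; more certain measurements weigh more.
    precisions = [1.0 / (sd * sd) for _, sd in measurements]
    total = sum(precisions)
    mean = sum(p * m for p, (m, _) in zip(precisions, measurements)) / total
    return mean, math.sqrt(1.0 / total)


if __name__ == "__main__":
    mean, sd = fuse_gaussian([(10.0, 1.0), (13.0, 2.0)])
    print(f"fused mean={mean:.3f} sd={sd:.3f}")
```

The fused estimate sits closer to the more precise measurement, and its standard deviation is smaller than that of either input. That is the sense in which a self-consistent data set can carry more information than its individual measurements.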

To address these challenges, CSBi is active in the integration of existing technologies and in the creation of fundamentally new experimental methods. CSBi is gathering researchers with expertise in a wide range of experimental and numerical methods and forming strategic partnerships with leading companies active in areas of information technology, proteomics, genomics, and bioinformatics. In addition, since advances in systems biology will require radically new experimental methods, we discern a slow evolution in biological analysis from manual methods to automation and microinstrumentation. The special properties of fluid flow on a small scale suggest that microinstruments combining microfluidics, bio-electronics, and MEMS technologies will greatly simplify and accelerate the process of making hitherto time-consuming and difficult measurements. The active participation of microsystems engineers in CSBi is a special feature of the program.


Cultural Changes

The success of systems biology at MIT (and elsewhere) will depend in equal parts on the merits of the underlying intellectual vision and on the effectiveness of the organizational structure. Organizational and cultural issues have long been recognized as the greatest barriers to innovation in industrial research. Life sciences research has traditionally been dominated by a culture of independent laboratories organized around single principal investigators (typically faculty members). Such an organizational structure has a proven history of innovation, creativity, and accountability, and it promotes close links between research and education. However, the need in systems biology for diverse skills and the relative complexity of the experimental technologies require the formation of interdisciplinary research teams. Our goal is to adopt the best features of team-based science while continuing to promote the aspirations and ideas of a wide group of individual investigators.

One means to foster collaboration is a shared research infrastructure. The CSBi Technology Platform is designed with this in mind. Operating as a central resource available to all, the CSBi Technology Platform lowers barriers to entry for researchers entering systems biology and provides independent groups with the technically sophisticated resources of much larger organizations. The platform also provides informatics technologies for the efficient collection and transmission of data and for the propagation of technical know-how. Finally, the physical and computational components of the platform represent water coolers around which students and postdocs with different interests tend to gather.


The Future

It is our belief that the biological sciences are in the midst of a revolution as profound as the revolutions that followed the development of biological chemistry at the end of the 19th century and molecular biology in the middle of the 20th century. The current force for change is the creation of systematic experimental methods and numerical techniques that permit systems-wide analysis of biological processes while retaining molecular and mechanistic rigor. CSBi intends to pioneer the development of such biological models while training a new generation of scientists and engineers in their use and creation.