A key goal of biology is to understand phenotypic characteristics, such as health and disease. Comparative global gene-expression patterns (transcriptomics) of altered biological processes have allowed for the identification of important genes in disease biology. Large-scale studies of proteins (proteomics), the main components of the physiological metabolic pathways of cells, have also revealed protein expression patterns associated with disease. Despite these significant advances, the molecular framework that controls the balance between health and disease is still not fully understood.

This may be partially due to the challenge of correlating gene and protein expression data with functional activity and cellular phenotype. New insights suggest that epigenetic regulation, for instance, determines not only which parts of the genome are expressed but also how they are spliced. Alternative splicing plays critical roles in disease and is a major source of protein diversity in higher eukaryotes. Furthermore, increases in mRNA levels do not always correlate with increases in protein levels. Perhaps more significantly, a protein, once translated, may not be functionally active because of a variety of post-translational modifications.

Overall, the powerful new omics technologies allow researchers to gather enormous amounts of data, but the extent to which these data have been interpreted to their full potential, to generate new scientific hypotheses and to verify or falsify existing ones, has so far been limited. One example is the evident disappointment with proteomics in delivering clinically useful protein biomarkers.
The COS research program therefore aims not merely to accumulate massive amounts of data but to generate data that are meaningful for answering biological questions and that lead toward a true biological understanding.

To this end, the COS research team will primarily use metabolomics, a newly emerging field focused on the profiling of small, naturally occurring (endogenous) molecules collectively known as the “metabolome”. Metabolites represent the most downstream end products of cellular reactions and therefore closely correlate with phenotype. Thus, through comparative untargeted profiling of metabolites, metabolomics provides a powerful strategy for understanding, at the molecular level, the changes associated with a unique phenotype or disease state.
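As a rough illustration of what comparative profiling means in practice, the following minimal sketch ranks metabolite features by how strongly their measured intensities differ between a control group and a disease group. It is not the COS team's pipeline: the feature names, the intensity values, and the helper functions `welch_t` and `compare_features` are all hypothetical, and a real analysis would also need normalization, p-values, and multiple-testing correction.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se if se > 0 else 0.0


def compare_features(control, disease):
    """Return (feature, log2 fold change, t statistic), strongest change first.

    Both arguments map a feature name to a list of intensities, one per sample.
    Only features present in both groups are compared.
    """
    results = []
    for feature in control.keys() & disease.keys():
        c, d = control[feature], disease[feature]
        log2_fc = math.log2(mean(d) / mean(c))  # > 0 means higher in disease
        results.append((feature, log2_fc, welch_t(c, d)))
    results.sort(key=lambda r: abs(r[2]), reverse=True)
    return results


# Hypothetical intensities, invented purely for illustration.
control = {
    "feature_A": [100.0, 110.0, 90.0, 100.0],
    "feature_B": [50.0, 55.0, 45.0, 50.0],
}
disease = {
    "feature_A": [400.0, 420.0, 380.0, 400.0],
    "feature_B": [51.0, 54.0, 46.0, 49.0],
}

ranked = compare_features(control, disease)
for feature, log2_fc, t in ranked:
    print(f"{feature}: log2 fold change = {log2_fc:.2f}, t = {t:.2f}")
```

In this invented data set, `feature_A` is about four times more abundant in the disease group and ranks first, while `feature_B` is essentially unchanged. Features that rank highly in such a comparison are the candidates that would then be identified and interpreted biologically.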

Extraction of metabolites from tissues, biofluids, or cell cultures and subsequent analysis by liquid and gas chromatography/mass spectrometry (LC/MS and GC/MS) and nuclear magnetic resonance (NMR) enables the simultaneous analysis of a large number of low-molecular-weight biochemicals. This process, termed untargeted metabolomics, has the capacity to implicate unanticipated metabolites or pathways in a unique phenotype and thereby provides insight into cellular mechanisms and disease pathology. However, extraction removes the metabolites from their naturally occurring chemical environments and therefore often complicates biological interpretation.

To complement LC/MS, GC/MS and NMR profiling in solution, the COS has therefore incorporated two additional components into its research: i) MALDI imaging to localize the dysregulated metabolites to tissue regions or cell types with micrometer resolution, and ii) intact-tissue NMR using high-resolution magic angle spinning (HR-MAS) to characterize the dysregulated metabolites.
This combination of complementary technologies offers a powerful approach to interrogating the biochemical basis of disease and relies on the team's unique breadth of training and experience.

The ultimate goal of COS research is to achieve a hypothesis-driven omics integration, particularly between metabolomics and the other omics. COS will interrogate the biochemical basis of disease starting with a non-hypothesis-driven approach based on the metabolomics technologies. The comprehensive characterization of the lowest level of the cellular information flow, that is, the metabolites altered by disease, can facilitate biochemical interpretation and, therefore, the subsequent generation of novel hypotheses.

Such hypotheses are then investigated mainly by interrogating higher levels of cellular organization (i.e., proteins, mRNA, genes) using targeted omics approaches.