Beyond the data deluge: data integration and bio-ontologies

Judith A Blake; Carol J Bult

doi:10.1016/j.jbi.2006.01.003

J Biomed Inform. 2006 Jun;39(3):314-20. doi: 10.1016/j.jbi.2006.01.003. Epub 2006 Feb 21.

Authors

Judith A Blake¹, Carol J Bult

Affiliation

¹ The Jackson Laboratory, Bar Harbor, ME, USA.

PMID: 16564748

Abstract

Biomedical research is increasingly a data-driven science. New technologies support the generation of genome-scale data sets of sequences, sequence variants, transcripts, and proteins, the genetic elements that underpin our understanding of biomedicine and disease. Information systems designed to manage these data, and the functional insights (biological knowledge) that come from their analysis, are critical to mining large, heterogeneous data sets for new biologically relevant patterns, to generating hypotheses for experimental validation, and, ultimately, to building models of how biological systems work. Bio-ontologies have an essential role in supporting two key approaches to effective interpretation of genome-scale data sets: data integration and comparative genomics. To date, bio-ontologies such as the Gene Ontology (GO) have been used primarily in community genome databases as structured controlled terminologies and as data aggregators. In this paper we use GO and the Mouse Genome Informatics (MGI) database as use cases to illustrate the impact of bio-ontologies on data integration and comparative genomics. Despite the profound impact ontologies are having on the digital categorization of biological knowledge, new biomedical research and the expanding and changing nature of biological information have limited the development of bio-ontologies to support dynamic reasoning for knowledge discovery.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Animals
Biology / methods
Computational Biology / methods*
Data Interpretation, Statistical
Databases, Factual
Genome
Genomics
Humans
Medical Informatics
Mice
Terminology as Topic
