YeastHub: a semantic web use case for integrating data in the life sciences domain

Bioinformatics. 2005 Jun;21 Suppl 1:i85-96. doi: 10.1093/bioinformatics/bti1026.

Abstract

Motivation: As semantic web technology matures and the need for life sciences data integration over the web grows, it is important to explore how data integration needs can be addressed by the semantic web. The main problem that we face in data integration is a lack of widely accepted standards for expressing the syntax and semantics of the data. We address this problem by exploring the use of semantic web technologies, including the Resource Description Framework (RDF), RDF Site Summary (RSS), relational-database-to-RDF mapping (D2RQ) and a native RDF data repository, to represent, store and query both metadata and data across life sciences datasets.

Results: Because many biological datasets are currently available in tabular format, we introduce an RDF structure into which they can be converted. We also develop a prototype web-based application, YeastHub, that demonstrates how a life sciences data warehouse can be built using a native RDF data store (Sesame). This data warehouse integrates different types of yeast genome data provided by different resources in different formats, including tabular and RDF. Once the data are loaded into the data warehouse, RDF-based queries can be formulated to retrieve the data in an integrated fashion.
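
To illustrate the general idea of converting a tabular dataset into RDF and then querying it, here is a minimal sketch in Python. It is not the RDF structure defined in the paper or the YeastHub implementation. The namespace `http://example.org/yeast/`, the column names, the sample rows and all function names are hypothetical. Each row becomes one resource, each remaining column becomes a property, and a simple triple-pattern match stands in for an RDF query language.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace, chosen for illustration only (not from the paper).
BASE = "http://example.org/yeast/"

# Made-up placeholder rows; the identifiers and values carry no biological meaning.
SAMPLE = "orf\tname\tlocalization\nORF0001\tgeneA\tnucleus\nORF0002\tgeneB\tcytoplasm\n"


def table_to_triples(text, id_column, delimiter="\t"):
    """Map each row to one subject and each other non-empty cell to a triple."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    triples = []
    for row in reader:
        subject = BASE + "gene/" + quote(row[id_column], safe="")
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the identifier column and empty cells
            predicate = BASE + "property/" + quote(column, safe="")
            triples.append((subject, predicate, value))
    return triples


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def to_ntriples(triples):
    """Serialize (subject, predicate, literal) tuples as N-Triples lines."""
    return "\n".join(f'<{s}> <{p}> "{escape_literal(o)}" .' for s, p, o in triples)


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


if __name__ == "__main__":
    triples = table_to_triples(SAMPLE, "orf")
    print(to_ntriples(triples))
    # Which resources have localization "nucleus"?
    for subject, _, _ in match(triples, p=BASE + "property/localization", o="nucleus"):
        print(subject)
```

In a real deployment the serialized triples would be loaded into an RDF store and queried with that store's query language; the in-memory `match` function only shows the shape of such a query.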

Availability: YeastHub is accessible at http://yeasthub.gersteinlab.org.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Biology / methods*
  • Computational Biology / methods*
  • Database Management Systems
  • Databases as Topic
  • Databases, Protein
  • Fungal Proteins / chemistry*
  • Fungal Proteins / metabolism
  • Information Storage and Retrieval
  • Information Systems
  • Internet
  • Programming Languages
  • Software
  • Statistics as Topic
  • User-Computer Interface

Substances

  • Fungal Proteins