A case study: semantic integration of gene-disease associations for type 2 diabetes mellitus from literature and biomedical data resources

Drug Discov Today. 2014 Jul;19(7):882-9. doi: 10.1016/j.drudis.2013.10.024. Epub 2013 Nov 4.

Abstract

In the Semantic Enrichment of the Scientific Literature (SESL) project, researchers from academia and from life science and publishing companies collaborated pre-competitively to integrate and share information on type 2 diabetes mellitus (T2DM) in adults. This case study demonstrates the benefits of semantic interoperability gained by integrating the scientific literature with biomedical data resources, such as the UniProt Knowledgebase (UniProtKB) and the Gene Expression Atlas (GXA). We annotated scientific documents in a standardized way by applying public terminological resources for diseases and proteins, together with other text-mining approaches. Finally, we compared the genetic causes of T2DM across the data resources to demonstrate the benefits of the SESL triple store. Our solution enables publishers to distribute their content with little overhead into remote data infrastructures, such as any Virtual Knowledge Broker.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Biomedical Research / methods*
  • Data Mining / methods*
  • Diabetes Mellitus, Type 2 / diagnosis
  • Diabetes Mellitus, Type 2 / genetics*
  • Humans
  • Knowledge Bases
  • Semantics*
  • Systems Integration*