Using conceptual modeling to improve genome data management

Brief Bioinform. 2021 Jan 18;22(1):45-54. doi: 10.1093/bib/bbaa100.

Abstract

With advances in genomic sequencing technology, a large amount of data is publicly available for the research community to extract meaningful and reliable associations among risk genes and the mechanisms of disease. However, this exponentially growing body of data is spread across more than a thousand heterogeneous repositories, represented in multiple formats and with different levels of quality, which hinders the differentiation of clinically valid relationships from those that are less well supported and that could lead to misdiagnosis. This paper presents how conceptual models can play a key role in managing genomic data efficiently. These data must be accessible, informative and reliable enough to extract valuable knowledge in the context of identifying evidence that supports the relationship between DNA variants and disease. The approach presented in this paper provides a solution that helps researchers organize, store and process information, focusing only on the data that are relevant and minimizing the impact that information overload has in clinical and research contexts. A case study on epilepsy is also presented to demonstrate the application of the approach in a real context.

Keywords: CSHG; case study; framework; genomic data; information systems.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Data Management / methods*
  • Data Systems
  • Epilepsy / genetics
  • Genetic Predisposition to Disease
  • Genomics / methods*
  • Humans