Electronic health records: the next wave of complex disease genetics

Hum Mol Genet. 2018 May 1;27(R1):R14-R21. doi: 10.1093/hmg/ddy081.

Abstract

The combination of electronic health records (EHRs) with genetic data has ushered in the next wave of complex disease genetics. Population-based biobanks and other large cohorts provide sufficient sample sizes to identify novel genetic associations across the hundreds to thousands of phenotypes gleaned from EHRs. In this review, we summarize the current state of these EHR-linked biobanks, explore ongoing methods development in the field and highlight recent discoveries of genetic associations. We enumerate existing biobanks with EHRs linked to genetic data, many of which are available to researchers via application and have sample sizes exceeding 50 000. We also discuss the computational and statistical considerations for analysis of such large datasets, including mixed models, phenotype curation and cloud computing. Finally, we demonstrate how genome-wide association studies and phenome-wide association studies have identified novel genetic findings for complex diseases, specifically cardiometabolic traits. As more researchers employ innovative hypotheses and analysis approaches to study EHR-linked biobanks, we anticipate a richer understanding of the genetic etiology of complex diseases.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Review

MeSH terms

  • Cardiovascular Diseases / genetics*
  • Cardiovascular Diseases / pathology
  • Cloud Computing
  • Databases, Genetic / trends
  • Electronic Health Records*
  • Genetic Diseases, Inborn / genetics*
  • Genetic Diseases, Inborn / pathology
  • Genetics, Population / trends
  • Genome-Wide Association Study / trends*
  • Genotype
  • Humans
  • Polymorphism, Single Nucleotide / genetics
  • Quantitative Trait Loci / genetics