Exploration of haplotype research consortium imputation for genome-wide association studies in 20,032 Generation Scotland participants

Genome Med. 2017 Mar 7;9(1):23. doi: 10.1186/s13073-017-0414-4.

Abstract

Background: The Generation Scotland: Scottish Family Health Study (GS:SFHS) is a family-based population cohort with DNA, biological samples, socio-demographic, psychological and clinical data from approximately 24,000 adult volunteers across Scotland. Although data collection was cross-sectional, GS:SFHS became a prospective cohort because of the ability to link to routine Electronic Health Record (EHR) data. Over 20,000 participants were selected for genotyping using a large genome-wide array.

Methods: GS:SFHS was analysed using genome-wide association studies (GWAS) to test the effects of a large spectrum of variants, imputed using the Haplotype Research Consortium (HRC) dataset, on medically relevant traits measured directly or obtained from EHRs. The HRC dataset is the largest available haplotype reference panel for imputation of variants in populations of European ancestry and allows investigation of variants with low minor allele frequencies within the entire GS:SFHS genotyped cohort.
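
As an illustration of what testing an imputed variant against a quantitative trait involves, the following is a minimal sketch, not the study's pipeline. It assumes each imputed genotype is stored as an alternate-allele dosage between 0 and 2, and it fits an ordinary least-squares regression that treats participants as unrelated; a family-based cohort such as GS:SFHS would in practice need a model that accounts for relatedness. The function names `minor_allele_frequency` and `single_variant_test` and the toy data are hypothetical.

```python
import math


def minor_allele_frequency(dosages):
    """Minor allele frequency from per-person alternate-allele dosages in [0, 2]."""
    alt_freq = sum(dosages) / (2 * len(dosages))
    return min(alt_freq, 1 - alt_freq)


def single_variant_test(dosages, trait):
    """Ordinary least-squares regression of a trait on dosage.

    Returns (beta, standard_error, t_statistic). Assumes unrelated
    individuals and no covariates, which is a simplification.
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, trait))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    # Residual sum of squares with n - 2 degrees of freedom
    rss = sum((y - intercept - beta * x) ** 2 for x, y in zip(dosages, trait))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se, beta / se


# Toy data (hypothetical): six people, dosages 0, 1 or 2, and a trait value each
toy_dosages = [0, 1, 2, 0, 1, 2]
toy_trait = [1.0, 1.6, 2.0, 1.1, 1.4, 2.1]
beta, se, t_stat = single_variant_test(toy_dosages, toy_trait)
print(beta, se, t_stat)
```

The same arithmetic shows why sample size matters for low-frequency variants: at a minor allele frequency of 0.08%, a cohort of 20,032 people carries only about 32 copies of the minor allele (2 × 20,032 × 0.0008).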

Results: Genome-wide associations were run on 20,032 individuals using both genotyped and HRC imputed data. We present results for a range of well-studied quantitative traits obtained from clinic visits and for serum urate measures obtained from data linkage to EHRs collected by the Scottish National Health Service. Results replicated known associations and additionally revealed novel findings, mainly with rare variants, validating the use of the HRC imputation panel. For example, we identified two new associations with fasting glucose at variants near Y_RNA and WDR4 and four new associations with heart rate at SNPs within CSMD1 and ASPH, upstream of HTR1F and between PROKR2 and GPCPD1. All were driven by rare variants (minor allele frequencies in the range of 0.08-1%). As proof of principle for the use of EHRs, we verified the highly significant association of urate levels with the well-established urate transporter SLC2A9.

Conclusions: GS:SFHS provides genetic data on over 20,000 participants alongside a range of phenotypes as well as linkage to National Health Service laboratory and clinical records. We have shown that the combination of deeper genotype imputation and extended phenotype availability makes GS:SFHS an attractive resource for association studies that aim to gain insight into the genetic architecture of complex traits.

Keywords: Electronic health records; Genetics; Genome-wide association studies (GWAS); Glucose; Haplotype Research Consortium (HRC); Heart rate; Imputation; Quantitative trait; Urate.

MeSH terms

  • Blood Glucose / genetics
  • Cross-Sectional Studies
  • Electronic Health Records
  • Fasting
  • Female
  • Genes
  • Genome-Wide Association Study
  • Haplotypes*
  • Heart Rate / genetics
  • Humans
  • Male
  • Polymorphism, Single Nucleotide*
  • Prospective Studies
  • Quantitative Trait, Heritable*
  • Scotland
  • Uric Acid / blood
  • White People / genetics

Substances

  • Blood Glucose
  • Uric Acid