Contemporary Considerations for Constructing a Genetic Risk Score: An Empirical Approach

Genet Epidemiol. 2015 Sep;39(6):439-45. doi: 10.1002/gepi.21912. Epub 2015 Jul 22.

Abstract

Genetic risk scores are an increasingly popular tool for summarizing the cumulative risk of a set of Single Nucleotide Polymorphisms (SNPs) with disease. Typically, only SNPs that have reached genome-wide significance are included in these scores. However, recent work suggests that including additional SNPs may aid risk assessment. In this paper, we used the Atherosclerosis Risk in Communities (ARIC) Study cohort to illustrate how one can choose the optimal set of SNPs for a genetic risk score (GRS). In addition to P-value threshold, we examined linkage disequilibrium, imputation quality, and imputation type, and we provide a variety of evaluation metrics. Results suggest that P-value threshold had the greatest impact on GRS quality for the outcome of coronary heart disease, with an optimal threshold around 0.001. By contrast, GRSs were relatively robust to both linkage disequilibrium and imputation quality. We also show that the optimal GRS partially depends on the evaluation metric and, consequently, on the way one intends to use the GRS. Overall, the implications highlight both the robustness of GRSs and a means to empirically choose the best set of GRSs.
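
To make the thresholding idea concrete, the following is a minimal sketch of how a weighted GRS might be computed after filtering SNPs by P-value and imputation quality. It is not the authors' implementation. The function name, the argument names, the use of per-SNP effect sizes as weights, and the toy values are all assumptions made for illustration, since the abstract does not specify a weighting scheme. Linkage disequilibrium pruning is not shown.

```python
import numpy as np

def genetic_risk_score(dosages, betas, p_values, info,
                       p_threshold=1e-3, min_info=0.0):
    """Hypothetical weighted GRS: sum of (effect size x risk-allele dosage)
    over the SNPs that pass the P-value and imputation-quality filters."""
    dosages = np.asarray(dosages, dtype=float)   # shape (n_individuals, n_snps)
    betas = np.asarray(betas, dtype=float)       # assumed per-SNP weights
    p_values = np.asarray(p_values, dtype=float)
    info = np.asarray(info, dtype=float)         # assumed imputation-quality scores

    # Keep SNPs at or below the P-value threshold and at or above the quality cutoff.
    keep = (p_values <= p_threshold) & (info >= min_info)

    # One score per individual: weighted sum over the retained SNPs.
    scores = dosages[:, keep] @ betas[keep]
    return scores, keep

# Toy data (invented for illustration): 2 individuals, 3 SNPs.
dosages = [[0, 1, 2],
           [2, 0, 1]]
betas = [0.1, 0.2, 0.3]
p_values = [1e-4, 0.05, 5e-4]
info = [0.9, 0.9, 0.5]

scores, keep = genetic_risk_score(dosages, betas, p_values, info)
strict_scores, strict_keep = genetic_risk_score(
    dosages, betas, p_values, info, min_info=0.8)
```

With the default threshold of 0.001, the second toy SNP is dropped. Raising `min_info` to 0.8 also drops the third. Repeating the calculation over a grid of thresholds and comparing the resulting scores on a chosen evaluation metric is one way to approach the empirical selection the abstract describes.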

Keywords: coronary heart disease; risk assessment; risk score.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Cohort Studies
  • Coronary Artery Disease / diagnosis
  • Coronary Artery Disease / genetics
  • Female
  • Genome-Wide Association Study*
  • Genotype
  • Humans
  • Linkage Disequilibrium
  • Male
  • Middle Aged
  • Polymorphism, Single Nucleotide
  • Risk Factors