Genetic prediction of quantitative lipid traits: comparing shrinkage models to gene scores

Genet Epidemiol. 2014 Jan;38(1):72-83. doi: 10.1002/gepi.21777. Epub 2013 Nov 23.

Abstract

Accurate genetic prediction of quantitative traits related to complex disease risk would have potential clinical impact, so investigation of statistical methodology to improve predictive performance is important. We compare a simple approach of polygenic scores using top-ranking single nucleotide polymorphisms (SNPs) to a set of shrinkage models, namely Ridge Regression, Lasso and Hyper-Lasso. These penalised regression methods analyse all genotyped SNPs simultaneously, potentially including much larger sets of SNPs in the models, not only those with the smallest P values. We compare the accuracy of these models for predicting low-density lipoprotein (LDL) and high-density lipoprotein (HDL) cholesterol, two lipid traits of clinical relevance, in the Whitehall II and British Women's Health and Heart Study cohorts, using SNPs from the HumanCVD BeadChip. For gene scores, the most accurate predictions arise from multivariate weighted scores and include only a small number of SNPs, identified as top hits by the HumanCVD BeadChip. Furthermore, there was little benefit from including external results from published sets of SNPs. We found that shrinkage approaches rarely improved significantly on gene score results. Genetic predictive performance is trait specific, depending on the heritability and genetic architecture of the trait, and is limited by the training data sample size. Our results for lipid traits suggest no current benefit of more complex methods over existing gene score methods. Instead, the most important choice for the prediction model is the number of SNPs and selection of the most predictive SNPs to include. However, further comparisons, in larger samples and for other phenotypes, would still be of interest.

Keywords: SNP selection; lipids; penalised regression; polygenic score; prediction.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Cholesterol, HDL / genetics*
  • Cholesterol, LDL / genetics*
  • Cohort Studies
  • Female
  • Genes*
  • Genotype
  • Health Surveys
  • Heart
  • Humans
  • Models, Genetic*
  • Multifactorial Inheritance / genetics
  • Phenotype
  • Polymorphism, Single Nucleotide / genetics*
  • Quantitative Trait Loci / genetics*
  • Regression Analysis
  • Sample Size
  • United Kingdom

Substances

  • Cholesterol, HDL
  • Cholesterol, LDL