Polygenic Risk Scores

Curr Protoc. 2021 May;1(5):e126. doi: 10.1002/cpz1.126.

Abstract

As genome-wide association studies have continued to identify loci associated with complex traits, the need to use these findings properly, including for prediction of disease risk, has become apparent. Many complex diseases have numerous associated loci with detectable effects that confer risk for or protection from disease. A common contemporary approach to using this information for disease prediction is the genetic risk score. These scores estimate an individual's liability for a specific outcome by aggregating the effects of associated loci into a single measure, as described in the previous version of this article. Although genetic risk scores have traditionally included only variants that meet criteria for genome-wide significance, an extension known as the polygenic risk score has been developed to include the effects of more variants across the entire genome. Here, we describe common methods and software packages for calculating and interpreting polygenic risk scores. In this revised version of the article, we detail the information needed to perform a polygenic risk score analysis, considerations for planning the analysis and interpreting results, and the limitations that follow from the choices made. We also provide simulated sample data and a walkthrough for four different polygenic risk score software packages. © 2021 Wiley Periodicals LLC.
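
The aggregation described above can be illustrated with a minimal sketch, which is not taken from the article or from any of the software packages it covers. It assumes the score is a weighted sum of effect-allele counts, with weights taken from association summary statistics. The function name `risk_score`, the variant IDs, and all numeric values are hypothetical; the 5e-8 cutoff is the conventional genome-wide significance threshold. The sketch ignores linkage disequilibrium and allele alignment, which real analyses need to handle.

```python
# Conventional genome-wide significance threshold for GWAS p-values.
GENOME_WIDE_SIGNIFICANCE = 5e-8


def risk_score(dosages, variants, p_threshold=1.0):
    """Sum dosage * effect size over variants with p-value <= p_threshold.

    dosages:  dict mapping variant ID -> effect-allele count (0, 1, or 2)
              for a single individual.
    variants: list of (variant_id, effect_size, p_value) tuples, as might
              be read from GWAS summary statistics (hypothetical layout).
    """
    score = 0.0
    for variant_id, effect_size, p_value in variants:
        # Skip variants that fail the threshold or were not genotyped.
        if p_value <= p_threshold and variant_id in dosages:
            score += dosages[variant_id] * effect_size
    return score


# Hypothetical toy data: three variants and one individual.
variants = [
    ("rs_a", 0.10, 1e-9),   # genome-wide significant
    ("rs_b", -0.05, 1e-3),  # protective effect, sub-threshold
    ("rs_c", 0.20, 0.04),   # nominally associated
]
individual = {"rs_a": 1, "rs_b": 1, "rs_c": 2}

# Traditional genetic risk score: genome-wide significant variants only.
grs = risk_score(individual, variants, GENOME_WIDE_SIGNIFICANCE)

# Polygenic risk score: a more permissive p-value threshold admits more variants.
prs = risk_score(individual, variants, p_threshold=0.05)
```

In this toy case the stricter threshold keeps only `rs_a`, whereas the permissive threshold includes all three variants, which is the distinction between the two kinds of score drawn in the abstract.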

Keywords: area under the curve; complex traits and diseases; disease prediction; genetic risk score; polygenic risk score.

MeSH terms

  • Genome-Wide Association Study*
  • Multifactorial Inheritance*
  • Risk Factors
  • Software