Comparing Ancestry Standardization Approaches for a Transancestry Colorectal Cancer Polygenic Risk Score

Genet Epidemiol. 2025 Jan;49(1):e22590. doi: 10.1002/gepi.22590. Epub 2024 Sep 24.

Abstract

Colorectal cancer (CRC) is a complex disease with monogenic, polygenic, and environmental risk factors. Polygenic risk scores (PRSs) aim to identify individuals at high polygenic risk. Because genetic background differs between populations, PRS distributions vary by ancestry, which necessitates standardization. We compared four post-hoc standardization methods for a transancestry CRC PRS using whole-genome sequence data from the All of Us Research Program. The methods were linear models that differed in (A) the training data, either the entire data set or an ancestrally diverse subset, and (B) the ancestry covariates, either principal components of ancestry or admixture. Standardization with the training subset also adjusted the variance. All methods performed similarly within ancestry, with the following odds ratios (OR, 95% CI) per s.d. change in PRS: African 1.5 (1.02, 2.08), Admixed American 2.2 (1.27, 3.85), European 1.6 (1.43, 1.89), and Middle Eastern 1.1 (0.71, 1.63). Using admixture and an ancestrally diverse training set provided distributions closest to standard Normal. Training a model on ancestrally diverse participants and adjusting both the mean and variance with admixture as covariates produced standard Normal z-scores, which can be used to identify patients at high polygenic risk. These scores can be incorporated into a comprehensive risk calculation that includes other known risk factors, allowing for more precise risk estimates.
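
The mean-and-variance standardization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one common recipe, in which a linear model of the raw PRS on ancestry covariates gives the expected mean, a second linear model of the squared residuals on the same covariates gives the expected variance, and the z-score is the residual divided by the predicted standard deviation. The function names (fit_mean_variance, standardize), the three-component admixture, the training subset, and all simulated values are hypothetical and chosen only for illustration.

```python
import numpy as np


def _design(covariates):
    """Prepend an intercept column to the ancestry covariates."""
    covariates = np.asarray(covariates, dtype=float)
    return np.column_stack([np.ones(covariates.shape[0]), covariates])


def fit_mean_variance(prs, covariates):
    """Fit two least-squares models: PRS mean and PRS variance vs. covariates."""
    X = _design(covariates)
    beta_mean, *_ = np.linalg.lstsq(X, prs, rcond=None)
    resid = prs - X @ beta_mean
    # Squared residuals serve as a noisy per-person estimate of the variance.
    beta_var, *_ = np.linalg.lstsq(X, resid**2, rcond=None)
    return beta_mean, beta_var


def standardize(prs, covariates, beta_mean, beta_var):
    """Convert raw PRS values to z-scores with the fitted mean/variance models."""
    X = _design(covariates)
    mean = X @ beta_mean
    # Guard against non-positive predicted variances (assumed floor).
    var = np.clip(X @ beta_var, 1e-12, None)
    return (prs - mean) / np.sqrt(var)


# --- Simulated data (hypothetical; not from the All of Us cohort) ---
rng = np.random.default_rng(0)
n = 20_000
admix = rng.dirichlet([1.0, 1.0, 1.0], size=n)   # 3 admixture fractions per person
true_mean = admix @ np.array([0.0, 1.0, 2.0])    # PRS mean shifts with ancestry
true_var = admix @ np.array([1.0, 2.0, 4.0])     # PRS variance shifts with ancestry
prs = true_mean + np.sqrt(true_var) * rng.standard_normal(n)

# Fractions sum to 1, so drop one component to avoid collinearity with the intercept.
covariates = admix[:, 1:]

# Train on a subset (here simply the first half), then apply to everyone.
train = slice(0, n // 2)
beta_mean, beta_var = fit_mean_variance(prs[train], covariates[train])
z = standardize(prs, covariates, beta_mean, beta_var)

print(f"raw PRS: mean={prs.mean():.2f}, sd={prs.std():.2f}")
print(f"z-score: mean={z.mean():.2f}, sd={z.std():.2f}")
```

In this simulation the variance is linear in the admixture fractions by construction, so the z-scores come out close to mean 0 and standard deviation 1; real data need not satisfy that assumption, and the choice of training subset would follow the study design rather than a simple slice.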

Keywords: admixture; all of us; colorectal cancer; polygenic risk score; transancestry.

Publication types

  • Comparative Study

MeSH terms

  • Colorectal Neoplasms* / genetics
  • Ethnicity / genetics
  • Female
  • Genetic Risk Score*
  • Genome-Wide Association Study / standards
  • Humans
  • Male
  • Middle Aged
  • Polymorphism, Single Nucleotide
  • Racial Groups / genetics