Interpreting SNP heritability in admixed populations

Jinguo Huang; Nicole Kleman; Saonli Basu; Mark D Shriver; Arslan A Zaidi

doi:10.1101/2023.08.04.551959

bioRxiv [Preprint]. 2024 Aug 6:2023.08.04.551959. doi: 10.1101/2023.08.04.551959.

Authors

Jinguo Huang¹,², Nicole Kleman³, Saonli Basu⁴, Mark D Shriver², Arslan A Zaidi³,⁵

Affiliations

¹ Bioinformatics and Genomics, Huck Institutes of the Life Sciences, Pennsylvania State University.
² Department of Anthropology, Pennsylvania State University.
³ Department of Genetics, Cell Biology, and Development, University of Minnesota.
⁴ Department of Biostatistics, University of Minnesota.
⁵ Institute of Health Informatics, University of Minnesota.

Abstract

SNP heritability ($h_{snp}^{2}$) is defined as the proportion of phenotypic variance explained by genotyped SNPs and is believed to be a lower bound of heritability ($h^{2}$), being equal to it if all causal variants are known. Despite the simple intuition behind $h_{snp}^{2}$, its interpretation and its equivalence to $h^{2}$ are unclear, particularly in the presence of population structure and assortative mating. It is well known that population structure can inflate $\hat{h}_{snp}^{2}$ estimates because of confounding due to linkage disequilibrium (LD) or shared environment. Here we use analytical theory and simulations to demonstrate that $h_{snp}^{2}$ estimates can be biased in admixed populations, even in the absence of confounding and even if all causal variants are known. This is because admixture generates LD, which contributes to the genetic variance, and therefore to heritability. Genome-wide restricted maximum likelihood (GREML) does not capture this contribution, leading to under- or over-estimates of $h_{snp}^{2}$ relative to $h^{2}$, depending on the genetic architecture. In contrast, Haseman-Elston (HE) regression exaggerates the LD contribution, leading to biases in the opposite direction. For the same reason, GREML and HE estimates of local ancestry heritability ($h_{\gamma}^{2}$) are also biased. We describe this bias in $\hat{h}_{snp}^{2}$ and $\hat{h}_{\gamma}^{2}$ as a function of admixture history and the genetic architecture of the trait and show that it can be recovered under some conditions. We clarify the interpretation of $\hat{h}_{snp}^{2}$ in admixed populations and discuss its implications for genome-wide association studies and polygenic prediction.
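
The claim that admixture-generated LD contributes to the genetic variance, with a sign that depends on the genetic architecture, can be illustrated with a toy simulation. The sketch below is not the authors' simulation or estimator. It assumes a simple two-source model in which each individual has a global ancestry fraction `theta`, loci are independent given `theta`, and all causal effects are known. The function name `variance_components`, the allele frequencies, the sample sizes, and the two effect vectors are hypothetical choices made for illustration. It compares the total genetic variance with the genic variance (the sum of per-locus contributions, ignoring LD); the difference is the LD contribution.

```python
import numpy as np

def variance_components(effects, p1, p2, theta, rng):
    """Return (total genetic variance, genic variance, LD contribution).

    Assumed toy model: genotype counts are Binomial(2, f), where f mixes the
    two source-population allele frequencies by each individual's ancestry.
    """
    # Per-individual, per-locus allele frequency given global ancestry theta.
    freq = theta[:, None] * p1 + (1.0 - theta[:, None]) * p2
    genotypes = rng.binomial(2, freq)          # shape (n, m), values 0/1/2
    genetic_values = genotypes @ effects       # additive genetic value
    total = genetic_values.var()
    # Genic variance: sum over loci of beta_j^2 * Var(x_j), no covariances.
    genic = np.sum(effects**2 * genotypes.var(axis=0))
    return total, genic, total - genic

rng = np.random.default_rng(7)
n, m = 5000, 100                               # hypothetical sizes
theta = rng.uniform(0.0, 1.0, size=n)          # assumed ancestry distribution
p1 = np.full(m, 0.8)                           # assumed source frequencies
p2 = np.full(m, 0.2)

# Architecture A: every trait-increasing allele is more common in source 1.
aligned = np.ones(m)
# Architecture B: effect signs alternate, so ancestry has no net effect.
opposed = np.where(np.arange(m) % 2 == 0, 1.0, -1.0)

total_a, genic_a, ld_aligned = variance_components(aligned, p1, p2, theta, rng)
total_o, genic_o, ld_opposed = variance_components(opposed, p1, p2, theta, rng)

print(f"aligned: total={total_a:.1f} genic={genic_a:.1f} LD={ld_aligned:.1f}")
print(f"opposed: total={total_o:.1f} genic={genic_o:.1f} LD={ld_opposed:.1f}")
```

Under this assumed model the covariance between two loci is $4\,\delta_j \delta_k \mathrm{Var}(\theta)$, where $\delta_j$ is the allele frequency difference between the sources, so the LD contribution is $4\,\mathrm{Var}(\theta) \sum_{j \neq k} \beta_j \beta_k \delta_j \delta_k$. It is positive when effects align with frequency differences and negative when they oppose each other, which is consistent with the direction of bias depending on genetic architecture.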

Publication types

Preprint

Grants and funding

R00 GM137076/GM/NIGMS NIH HHS/United States