Heterogeneity-aware integrative regression for ancestry-specific association studies

Biometrics. 2024 Oct 3;80(4):ujae109. doi: 10.1093/biomtc/ujae109.

Abstract

Ancestry-specific proteome-wide association studies (PWAS) based on genetically predicted protein expression can reveal complex disease etiology specific to certain ancestral groups. These studies require ancestry-specific models for protein expression as a function of SNP genotypes. To improve protein expression prediction in ancestral populations historically underrepresented in genomic studies, we propose a new penalized maximum likelihood estimator for fitting ancestry-specific joint protein quantitative trait loci models. Our estimator borrows information across ancestral groups while allowing for heterogeneous error variances and regression coefficients. We propose an alternative parameterization of our model that makes the objective function convex and the penalty scale invariant. To improve computational efficiency, we propose an approximate version of our method and study its theoretical properties. Our method provides a substantial improvement in protein expression prediction accuracy in individuals of African ancestry and, in a downstream PWAS analysis, leads to the discovery of multiple associations between protein expression and blood lipid traits in the African ancestry population.
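
The abstract does not state the estimator's exact form, so the following is only an illustrative sketch of how a reparameterization can make a heterogeneous-variance Gaussian likelihood convex. It assumes the standard substitution theta_k = beta_k / sigma_k and rho_k = 1 / sigma_k for each ancestry group k, under which the per-group negative log-likelihood is jointly convex in (theta_k, rho_k), and theta_k is unchanged when the response is rescaled. The penalty shown (a lasso term plus an L1 fusion term between groups) and every function name are assumptions made for illustration, not the paper's method.

```python
import math
import random


def group_nll(theta, rho, X, y):
    """Reparameterized Gaussian negative log-likelihood for one group.

    With theta = beta / sigma and rho = 1 / sigma, the loss is
    -log(rho) + (1 / 2n) * sum_i (rho * y_i - x_i' theta)^2,
    which is jointly convex in (theta, rho) for rho > 0.
    """
    n = len(y)
    rss = 0.0
    for xi, yi in zip(X, y):
        fitted = sum(xij * tj for xij, tj in zip(xi, theta))
        rss += (rho * yi - fitted) ** 2
    return -math.log(rho) + rss / (2.0 * n)


def objective(params, data, lam_sparse, lam_fuse):
    """Sum of per-group losses plus an illustrative convex penalty.

    params: list of (theta_k, rho_k), one entry per ancestry group.
    data:   list of (X_k, y_k), one entry per ancestry group.
    ASSUMPTION: the lasso + L1 fusion penalty is a stand-in chosen for
    illustration; the abstract does not specify the actual penalty.
    """
    loss = sum(group_nll(th, rho, X, y)
               for (th, rho), (X, y) in zip(params, data))
    sparse = sum(abs(t) for th, _ in params for t in th)
    fuse = 0.0
    for a in range(len(params)):
        for b in range(a + 1, len(params)):
            fuse += sum(abs(s - t)
                        for s, t in zip(params[a][0], params[b][0]))
    return loss + lam_sparse * sparse + lam_fuse * fuse


def simulate_group(rng, n, beta, sigma):
    """Draw (X, y) from a linear model with group-specific noise level."""
    X = [[rng.gauss(0.0, 1.0) for _ in beta] for _ in range(n)]
    y = [sum(x * b for x, b in zip(xi, beta)) + rng.gauss(0.0, sigma)
         for xi in X]
    return X, y
```

In this sketch, a larger group and a smaller group with different noise levels can be simulated with `simulate_group` and scored jointly with `objective`; because every term is convex in the new parameters, a generic convex solver could in principle be used to minimize it.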

Keywords: integrative analysis; population heterogeneity; protein quantitative trait loci; proteome-wide association study.

MeSH terms

  • Biometry / methods
  • Black People / genetics
  • Black People / statistics & numerical data
  • Computer Simulation
  • Genome-Wide Association Study / statistics & numerical data
  • Humans
  • Likelihood Functions
  • Models, Statistical
  • Polymorphism, Single Nucleotide*
  • Proteome
  • Quantitative Trait Loci*
  • Regression Analysis

Substances

  • Proteome