Hierarchical joint analysis of marginal summary statistics - Part II: High-dimensional instrumental analysis of omics data

Genet Epidemiol. 2024 Oct;48(7):291-309. doi: 10.1002/gepi.22577. Epub 2024 Jun 17.

Abstract

Instrumental variable (IV) analysis has been widely applied in epidemiology to infer causal relationships from observational data. Genetic variants can also be viewed as valid IVs in Mendelian randomization and transcriptome-wide association studies. However, most multivariate IV approaches cannot scale to high-throughput experimental data. Here, we extend our previous work, a hierarchical model that jointly analyzes marginal summary statistics (hJAM), into a scalable framework (SHA-JAM) that can be applied to a large number of intermediates and a large number of correlated genetic variants, situations often encountered in modern experiments leveraging omic technologies. SHA-JAM aims to estimate the conditional effects of high-dimensional risk factors on an outcome by incorporating estimates from association analyses of single-nucleotide polymorphism (SNP)-intermediate or SNP-gene expression as prior information in a hierarchical model. Results from extensive simulation studies demonstrate that SHA-JAM yields a higher area under the receiver operating characteristic curve (AUC), a lower mean-squared error of the estimates, and much faster computation than an existing approach for similar analyses. In two applied examples for prostate cancer, we investigated metabolite and transcriptome associations, respectively, using summary statistics from a GWAS of prostate cancer with more than 140,000 men and high-dimensional, publicly available summary data for metabolites and transcriptomes.
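The hierarchical idea described above can be illustrated with a minimal sketch. It is not the SHA-JAM implementation: the abstract does not specify the estimator, so the code assumes standardized genotypes, an LD correlation matrix R from a reference panel, a matrix A of SNP-intermediate effect estimates, and a plain ridge penalty standing in for whatever regularization SHA-JAM actually uses. All function and variable names are hypothetical. Under these assumptions the marginal SNP-outcome effects satisfy E[b] = R A pi, and the conditional effects pi of the intermediates can be recovered as (A' R A + ridge I)^(-1) A' b.

```python
# Hypothetical sketch (not the authors' code): conditional effects of
# intermediates on an outcome from summary statistics, assuming
# standardized genotypes so that the LD correlation matrix R plays the
# role of G'G / n.

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def solve(M, v):
    """Solve M x = v by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [v[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def conditional_effects(b_marginal, A, R, ridge=0.0):
    """Return pi_hat = (A' R A + ridge * I)^(-1) A' b_marginal.

    b_marginal: marginal SNP-outcome effects (length M)
    A:          SNP-intermediate effects (M rows, K columns)
    R:          SNP correlation (LD) matrix (M x M)
    ridge:      illustrative penalty; an assumption, not SHA-JAM's prior
    """
    At = transpose(A)
    AtRA = matmul(matmul(At, R), A)
    for k in range(len(AtRA)):
        AtRA[k][k] += ridge
    Atb = [sum(a * b for a, b in zip(row, b_marginal)) for row in At]
    return solve(AtRA, Atb)

# Toy data: 3 correlated SNPs, 2 intermediates, noise-free effects.
R = [[1.0, 0.3, 0.0],
     [0.3, 1.0, 0.2],
     [0.0, 0.2, 1.0]]
A = [[0.4, 0.0],
     [0.1, 0.3],
     [0.0, 0.5]]
pi_true = [0.5, -0.2]

# Conditional SNP effects beta = A pi, marginal effects b = R beta.
beta = [sum(a * p for a, p in zip(row, pi_true)) for row in A]
b_marginal = [sum(r * x for r, x in zip(row, beta)) for row in R]

pi_hat = conditional_effects(b_marginal, A, R)
```

With noise-free toy inputs and no penalty, pi_hat reproduces pi_true up to floating-point error. A positive ridge value shrinks the estimates toward zero, which is one generic way to stabilize the fit when the number of intermediates is large relative to the number of SNPs.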

Keywords: Mendelian randomization; hierarchical joint analysis of marginal summary data (hJAM); instrumental variable analysis; omics data; summary statistics; transcriptome‐wide association study (TWAS).

MeSH terms

  • Computer Simulation
  • Genome-Wide Association Study / methods
  • Humans
  • Male
  • Mendelian Randomization Analysis
  • Models, Statistical
  • Polymorphism, Single Nucleotide*
  • Prostatic Neoplasms* / genetics
  • ROC Curve