STAREG: Statistical replicability analysis of high throughput experiments with applications to spatial transcriptomic studies

PLoS Genet. 2024 Oct 3;20(10):e1011423. doi: 10.1371/journal.pgen.1011423. eCollection 2024 Oct.

Abstract

Replicable signals from different yet conceptually related studies provide stronger scientific evidence and more powerful inference. We introduce STAREG, a statistical method for replicability analysis of high-throughput experiments, and apply it to spatial transcriptomic studies. STAREG uses summary statistics from multiple studies of high-throughput experiments and models the joint distribution of p-values, accounting for the heterogeneity of different studies. It effectively controls the false discovery rate (FDR) and gains power by borrowing information across studies. Moreover, it provides different rankings of important genes. By combining the EM algorithm with the pool-adjacent-violator algorithm (PAVA), STAREG scales to datasets with millions of genes without any tuning parameters. Analyzing two pairs of spatially resolved transcriptomic datasets, we make biological discoveries that cannot be obtained with existing methods.
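For readers unfamiliar with PAVA, the following is a minimal generic sketch of the pool-adjacent-violator algorithm for weighted non-decreasing isotonic regression. It is not taken from the STAREG implementation: the function name `pava`, its arguments, and the choice of a non-decreasing least-squares fit are assumptions made for illustration, and the abstract does not specify how PAVA is applied inside the EM iterations.

```python
def pava(y, w=None):
    """Weighted least-squares non-decreasing fit via pool-adjacent-violators.

    Illustrative helper only (hypothetical name); not the STAREG code.
    """
    n = len(y)
    if w is None:
        w = [1.0] * n
    # Each block holds [weighted mean, total weight, number of points pooled].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    # Expand each pooled block back to one fitted value per input point.
    fit = []
    for m, _, c in blocks:
        fit.extend([m] * c)
    return fit


if __name__ == "__main__":
    print(pava([1, 3, 2, 4]))
```

Each input point is visited once and each merge removes a block, so the procedure runs in linear time and has no tuning parameters, which fits the scalability the abstract describes.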

MeSH terms

  • Algorithms*
  • Animals
  • Gene Expression Profiling* / methods
  • Humans
  • Models, Statistical
  • Reproducibility of Results
  • Transcriptome* / genetics

Grants and funding

The author(s) received no specific funding for this work.