STAREG: Statistical replicability analysis of high throughput experiments with applications to spatial transcriptomic studies

Yan Li; Xiang Zhou; Rui Chen; Xianyang Zhang; Hongyuan Cao

doi:10.1371/journal.pgen.1011423

PLoS Genet. 2024 Oct 3;20(10):e1011423. doi: 10.1371/journal.pgen.1011423. eCollection 2024 Oct.

Authors

Yan Li¹ ², Xiang Zhou³, Rui Chen⁴, Xianyang Zhang⁵, Hongyuan Cao⁶

Affiliations

¹ School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, Jilin, China.
² School of Mathematics, Jilin University, Changchun, Jilin, China.
³ Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America.
⁴ Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America.
⁵ Department of Statistics, Texas A&M University, College Station, Texas, United States of America.
⁶ Department of Statistics, Florida State University, Tallahassee, Florida, United States of America.

Abstract

Replicable signals from different yet conceptually related studies provide stronger scientific evidence and more powerful inference. We introduce STAREG, a statistical method for replicability analysis of high throughput experiments, and apply it to analyze spatial transcriptomic studies. STAREG uses summary statistics from multiple studies of high throughput experiments and models the joint distribution of p-values, accounting for the heterogeneity of different studies. It effectively controls the false discovery rate (FDR) and gains power by borrowing information across studies. Moreover, it provides different rankings of important genes. By combining the EM algorithm with the pool-adjacent-violators algorithm (PAVA), STAREG scales to datasets with millions of genes without any tuning parameters. Analyzing two pairs of spatially resolved transcriptomic datasets, we are able to make biological discoveries that cannot be obtained with existing methods.
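
The abstract names PAVA as one building block but does not describe how STAREG uses it inside the EM iterations. As a generic illustration only, the following is a minimal sketch of the textbook pool-adjacent-violators algorithm for a weighted least-squares non-decreasing fit. The function name `pava_nondecreasing` and its arguments are hypothetical and are not taken from the STAREG software.

```python
def pava_nondecreasing(y, w=None):
    """Weighted least-squares non-decreasing fit to y (generic PAVA sketch).

    Illustrative only: this is the textbook algorithm, not STAREG's code.
    """
    n = len(y)
    if w is None:
        w = [1.0] * n  # assumption: equal weights when none are given

    # Each block holds [weighted mean, total weight, number of points pooled].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool adjacent blocks while they violate the monotone constraint.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])

    # Expand each pooled block back to one fitted value per input point.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


if __name__ == "__main__":
    print(pava_nondecreasing([1, 3, 2, 4]))
```

A non-increasing fit can be obtained from the same routine by negating the input and then negating the output. Because each point is pooled at most once, the procedure runs in linear time and needs no tuning parameter, which is consistent with the scalability the abstract describes.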

Copyright: © 2024 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

MeSH terms

Algorithms*
Animals
Gene Expression Profiling* / methods
Humans
Models, Statistical
Reproducibility of Results
Transcriptome* / genetics

Grants and funding

The author(s) received no specific funding for this work.