A maximum-type microbial differential abundance test with application to high-dimensional microbiome data analyses

Front Cell Infect Microbiol. 2022 Oct 28:12:988717. doi: 10.3389/fcimb.2022.988717. eCollection 2022.

Abstract

Background: High-throughput metagenomic sequencing technologies have shown prominent advantages over traditional pathogen detection methods, bringing great potential in clinical pathogen diagnosis and treatment of infectious diseases. Nevertheless, how to accurately detect the difference in microbiome profiles between treatment or disease conditions remains computationally challenging.

Results: In this study, we propose a novel test for identifying the difference between two high-dimensional microbiome abundance data matrices based on the centered log-ratio transformation of the microbiome compositions. The test p-value can be calculated directly with a closed-form solution from the derived asymptotic null distribution. We also investigate the asymptotic statistical power against sparse alternatives that are typically encountered in microbiome studies. The proposed test is maximum-type equal-covariance-assumption-free (MECAF), making it widely applicable to studies that compare microbiome compositions between conditions. Our simulation studies demonstrated that the proposed MECAF test achieves more desirable power than competing methods while having the type I error rate well controlled under various scenarios. The usefulness of the proposed test is further illustrated with two real microbiome data analyses. The source code of the proposed method is freely available at https://github.com/Jiyuan-NYU-Langone/MECAF.

Conclusions: MECAF is a flexible differential abundance test and achieves statistical efficiency in analyzing high-throughput microbiome data. The proposed new method will allow us to efficiently discover shifts in microbiome abundances between disease and treatment conditions, broadening our understanding of the disease and ultimately improving clinical diagnosis and treatment.

Keywords: differential abundance analysis; high-dimensional compositional; microbiome data; relative abundances; sparse alternatives.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computer Simulation
  • Data Analysis*
  • High-Throughput Nucleotide Sequencing
  • Microbiota*