PERMANOVA-S: association test for microbial community composition that accommodates confounders and multiple distances

Bioinformatics. 2016 Sep 1;32(17):2618-25. doi: 10.1093/bioinformatics/btw311. Epub 2016 May 19.

Abstract

Motivation: Recent advances in sequencing technology have made it possible to obtain high-throughput data on the composition of microbial communities and to study the effects of dysbiosis on the human host. Analysis of pairwise intersample distances quantifies the association between microbiome diversity and covariates of interest (e.g. environmental factors, clinical outcomes, treatment groups). Many distance metrics are available when designing these analyses. Most distance-based methods, however, use a single distance and are underpowered if the distance is poorly chosen. In addition, existing distance-based tests cannot flexibly handle confounding variables, which can result in excessive false-positive findings.

Results: We derive presence-weighted UniFrac to complement the existing UniFrac distances for more powerful detection of variation in species richness. We develop PERMANOVA-S, a new distance-based method that tests the association of microbiome composition with any covariates of interest. PERMANOVA-S improves on the commonly used Permutation Multivariate Analysis of Variance (PERMANOVA) test by allowing flexible confounder adjustment and by ensembling multiple distances. We conducted extensive simulation studies to evaluate the performance of different distances under various patterns of association. These simulations demonstrate that the power of the test depends on how well the selected distance captures the nature of the association. The PERMANOVA-S unified test combines multiple distances and achieves good power regardless of the pattern of the underlying association. We demonstrate the usefulness of our approach by reanalyzing several real microbiome datasets.
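
To make the idea of ensembling multiple distances concrete, the following is a minimal sketch, not the miProfile implementation. It computes the standard PERMANOVA pseudo-F statistic for a grouping variable under each distance, then combines the distances with a minimum-p-value rule whose significance is assessed with the same set of label permutations. The combining rule, the function names (pseudo_f, unified_permanova), and the toy Euclidean and Manhattan distances are assumptions made for illustration; the abstract does not specify these details, and confounder adjustment is omitted here.

```python
import numpy as np


def pseudo_f(D, groups):
    """PERMANOVA pseudo-F for a one-way grouping, from a distance matrix D."""
    n = len(groups)
    labels = np.unique(groups)
    a = len(labels)
    D2 = D ** 2
    # Total sum of squares from all pairwise squared distances.
    ss_total = D2[np.triu_indices(n, k=1)].sum() / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        sub = D2[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def unified_permanova(distances, groups, n_perm=199, seed=0):
    """Per-distance permutation p-values plus a min-p combined p-value.

    Assumption: distances are combined by taking the smallest per-distance
    p-value and calibrating it against the same permutations.
    """
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    # Row 0 holds the observed labels; the remaining rows are permutations.
    labelings = [groups] + [rng.permutation(groups) for _ in range(n_perm)]
    stats = np.array([[pseudo_f(D, g) for D in distances] for g in labelings])
    # pvals[i, k]: fraction of labelings whose statistic under distance k
    # is at least as large as that of labeling i.
    pvals = (stats[None, :, :] >= stats[:, None, :]).mean(axis=1)
    min_p = pvals.min(axis=1)
    unified_p = float((min_p <= min_p[0]).mean())
    return pvals[0], unified_p


# Toy data: two groups of 10 samples with shifted means (hypothetical).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (10, 5)), rng.normal(3, 1, (10, 5))])
groups = np.repeat([0, 1], 10)
diff = X[:, None, :] - X[None, :, :]
distances = [np.sqrt((diff ** 2).sum(-1)), np.abs(diff).sum(-1)]

per_distance_p, unified_p = unified_permanova(distances, groups)
print(per_distance_p, unified_p)
```

Reusing one set of permutations for every distance keeps the per-distance statistics correlated in the same way under the null as in the observed data, which is what lets the minimum p-value be calibrated without a separate multiple-testing correction. In a real analysis the distance matrices would be ecological distances such as the UniFrac variants rather than the toy distances used here.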

Availability and implementation: miProfile software is freely available at https://medschool.vanderbilt.edu/tang-lab/software/miProfile


Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Analysis of Variance*
  • Computer Simulation
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Microbiota*
  • Software