A New Alignment-Free Whole Metagenome Comparison Tool and Its Application on Gut Microbiomes of Wild Giant Pandas

Front Microbiol. 2020 Jun 16:11:1061. doi: 10.3389/fmicb.2020.01061. eCollection 2020.

Abstract

Comparing metagenomes is crucial for studying the relationship between microbial communities and environmental factors. Libra, a recently published alignment-free whole metagenome comparison method based on k-mer frequencies, showed higher resolution than Mash, currently the fastest method, on whole metagenomic sequencing reads, but it did not perform as well on assembled contigs. Here, we developed a new alignment-free tool, KmerFreqCalc, for comparing whole metagenomic data. Like Mash, it first calculates the frequencies of k-mers from both the forward and reverse complementary sequences; like Libra, it then computes the cosine distance between samples from their k-mer frequency vectors. We applied KmerFreqCalc to the assembled contigs of the gut microbiomes of wild giant pandas and compared the results with those of Libra and Mash. The results indicated that KmerFreqCalc detected the subtle differences between giant panda samples caused by seasonal diet change and clustered the samples better than Libra and Mash. KmerFreqCalc therefore offers high resolution and accuracy on assembled contigs and is well suited to comparing samples with low dissimilarity.
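
The two steps named in the abstract, counting k-mers on both strands and then taking the cosine distance between frequency vectors, can be illustrated with a minimal Python sketch. This is not the KmerFreqCalc implementation: the function names, the choice to skip k-mers containing bases other than A, C, G, and T, and the use of raw counts without normalization are all assumptions made for illustration.

```python
from collections import Counter
import math

# Illustrative sketch only; names and details are assumptions, not KmerFreqCalc code.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_BASES = set("ACGT")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_frequencies(contigs, k):
    """Count k-mers on both the forward and reverse complementary strands."""
    counts = Counter()
    for contig in contigs:
        contig = contig.upper()
        for strand in (contig, reverse_complement(contig)):
            for i in range(len(strand) - k + 1):
                kmer = strand[i:i + k]
                # Assumption: k-mers with ambiguous bases (e.g. N) are skipped.
                if set(kmer) <= _BASES:
                    counts[kmer] += 1
    return counts


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two k-mer count vectors."""
    dot = sum(a[kmer] * b[kmer] for kmer in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot compare a sample that has no valid k-mers")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy sequences standing in for assembled contigs of two samples.
    sample_a = kmer_frequencies(["ACGTTGCAACGT", "GGGTTTAAAC"], k=4)
    sample_b = kmer_frequencies(["ACGTTGCATCGT", "GGGTTTAAAG"], k=4)
    print(cosine_distance(sample_a, sample_b))
```

Because both strands are counted, a contig and its reverse complement yield the same frequency vector, so the distance does not depend on the orientation in which a contig happened to be assembled.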

Keywords: cosine distance; gut microbiomes; k-mer frequencies; reverse complementary sequence; whole metagenome comparison; wild giant pandas.

Associated data

  • figshare/10.6084/m9.figshare.6303713