Motivation: High-resolution detection of target pathogens from metagenomic sequencing data is a major challenge because target pathogens are present at low concentrations in samples. We introduced mStrain, a novel Yersinia pestis strain/lineage-level identification tool that uses metagenomic data. mStrain successfully identified Y. pestis at the strain/lineage level by extracting sufficient single-nucleotide polymorphism (SNP) information, and can therefore serve as an effective tool for identifying and source-tracking Y. pestis from metagenomic data during a plague outbreak.
Definitions:
Strain-level identification: Assigning the reads in the metagenomic sequencing data to an exactly matching known Y. pestis strain or, when no exact match exists, to the most closely related representative strain.
Lineage-level identification: Assigning the reads in the metagenomic sequencing data to a specific lineage on the phylogenetic tree.
canoSNPs: The unique and typical SNPs present in all representative strains.
Ancestor/derived state: An SNP is defined as the ancestor state when consistent with the allele of Yersinia pseudotuberculosis strain IP32953; otherwise, the SNP is defined as the derived state.
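The ancestor/derived definition above can be illustrated with a minimal sketch. This is not mStrain's implementation; the function name `snp_state`, the genome positions, and the alleles are hypothetical and chosen only for illustration. The sketch assumes that the outgroup allele (Y. pseudotuberculosis IP32953) at each SNP position is already known.

```python
def snp_state(observed_allele: str, outgroup_allele: str) -> str:
    """Classify an observed allele relative to the outgroup allele.

    Returns "ancestor" when the observed allele matches the outgroup
    allele (case-insensitive), and "derived" otherwise.
    """
    if observed_allele.upper() == outgroup_allele.upper():
        return "ancestor"
    return "derived"


# Hypothetical outgroup alleles keyed by genome position (illustrative only).
outgroup = {1001: "A", 2050: "G", 3999: "T"}

# Hypothetical alleles called from metagenomic reads at the same positions.
sample = {1001: "A", 2050: "T", 3999: "t"}

states = {pos: snp_state(sample[pos], outgroup[pos]) for pos in sample}
print(states)
# {1001: 'ancestor', 2050: 'derived', 3999: 'ancestor'}
```

In this toy example, only position 2050 differs from the outgroup allele, so it is the only SNP in the derived state.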
Availability and implementation: The code for running mStrain, the test dataset, and instructions for running the code can be found at the following GitHub repository: https://github.com/xwqian1123/mStrain.
© The Author(s) 2023. Published by Oxford University Press.