Because the human microbiota plays a critical role in health and disease, accurate, high-resolution taxonomic classification is essential for meaningful microbiome analysis. In this study, we developed an automated system, the MultiTax pipeline, for generating de novo taxonomy from full-length 16S rRNA sequences using the Genome Taxonomy Database and other existing reference databases. We first constructed the MultiTax-human database, a high-resolution resource specifically designed for human microbiome research and clinical applications. The database includes 842,649 high-quality full-length 16S rRNA sequences, extracted from multiple public repositories and human-related studies, offering a comprehensive and accurate portrayal of the human microbiome. To validate the MultiTax-human database, we profiled the human microbiome across various body sites, identified core microbial taxa, and tested its performance using an independent data set. Additionally, the database is equipped with a user-friendly web interface for easy querying and data exploration. The MultiTax-human database is poised to serve as a valuable tool for researchers, enhancing the precision of human microbiome studies and advancing our understanding of the microbiome's impact on human health and disease.

IMPORTANCE: Understanding the human microbiome, the collection of microorganisms in and on our bodies, is essential for advancing health research. Current methods often lack precision and consistency, hindering our ability to study these microorganisms effectively. Our study presents the MultiTax-human database, a high-resolution reference tool specifically designed for human microbiome research. By integrating data from multiple sources and employing advanced classification techniques, this database offers an accurate and detailed map of the human microbiome. This resource enhances the ability of researchers and clinicians to explore the roles of microorganisms in health and disease, potentially leading to improved diagnostics, treatments, and insights into various medical conditions.
Keywords: 16S rRNA; GTDB; human microbiome; reference database; taxonomy.