Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach

Sandra Dérozier; Robert Bossy; Louise Deléger; Mouhamadou Ba; Estelle Chaix; Olivier Harlé; Valentin Loux; Hélène Falentin; Claire Nédellec

doi:10.1371/journal.pone.0272473

PLoS One. 2023 Jan 20;18(1):e0272473. doi: 10.1371/journal.pone.0272473. eCollection 2023.

Affiliations

¹ Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France.
² Université Paris-Saclay, INRAE, BioinfOmics, MIGALE Bioinformatics Facility, Jouy-en-Josas, France.
³ INRAE, STLO, Rennes, France.

Abstract

The dramatic increase in the number of microbe descriptions in databases, reports, and papers presents a two-fold challenge for accessing the information: integration of heterogeneous data in a standard ontology-based representation and normalization of the textual descriptions by semantic analysis. Recent text mining methods offer powerful ways to extract textual information and generate ontology-based representation. This paper describes the design of the Omnicrobe application that gathers comprehensive information on habitats, phenotypes, and usages of microbes from scientific sources of high interest to the microbiology community. The Omnicrobe database contains around 1 million descriptions of microbe properties. These descriptions are created by analyzing and combining six information sources of various kinds, i.e. biological resource catalogs, sequence databases and scientific literature. The microbe properties are indexed by the Ontobiotope ontology and their taxa are indexed by an extended version of the taxonomy maintained by the National Center for Biotechnology Information. The Omnicrobe application covers all domains of microbiology. With simple or rich ontology-based queries, it provides easy-to-use support in the resolution of scientific questions related to the habitats, phenotypes, and uses of microbes. We illustrate the potential of Omnicrobe with a use case from the food innovation domain.

Copyright: © 2023 Dérozier et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

MeSH terms

Data Mining* / methods
Databases, Factual
Ecosystem*
Phenotype
Publications

Grants and funding

The author(s) received no specific funding for this work.