@MInter: automated text-mining of microbial interactions

Bioinformatics. 2016 Oct 1;32(19):2981-7. doi: 10.1093/bioinformatics/btw357. Epub 2016 Jun 16.

Abstract

Motivation: Microbial consortia are frequently defined by numerous interactions within the community that are key to understanding their function. Although microbial interactions have been studied extensively in experiments, information about them is dispersed across the scientific literature. Because manual collation is infeasible, automated data processing tools are needed to make this information easily accessible.

Results: We present @MInter, an automated information extraction system based on Support Vector Machines that analyzes paper abstracts and infers microbial interactions. @MInter was trained and tested on a manually curated gold standard dataset of 735 species interactions and 3917 annotated abstracts, constructed as part of this study. Cross-validation analysis showed that @MInter was able to detect abstracts pertaining to one or more microbial interactions with high specificity (specificity = 95%, AUC = 0.97). Despite challenges in identifying specific microbial interactions in an abstract (interaction level recall = 95%, precision = 25%), @MInter was shown to reduce annotator workload 13-fold compared to alternative approaches. Applying @MInter to 175 bacterial species abundant on human skin, we identified a network of 357 literature-reported microbial interactions, demonstrating its utility for the study of microbial communities.
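
To make the reported metrics concrete, the following minimal sketch shows how specificity, recall and precision are derived from confusion counts. It is not taken from the @MInter code base: the `Confusion` class and all counts are hypothetical, chosen only so that the ratios match the percentages quoted above.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion counts for a binary classifier (hypothetical helper)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    @staticmethod
    def _ratio(num: int, den: int) -> float:
        # Return 0.0 instead of raising when a class is empty.
        return num / den if den else 0.0

    def recall(self) -> float:
        # Fraction of real positives that were found.
        return self._ratio(self.tp, self.tp + self.fn)

    def precision(self) -> float:
        # Fraction of predicted positives that are real.
        return self._ratio(self.tp, self.tp + self.fp)

    def specificity(self) -> float:
        # Fraction of real negatives that were correctly rejected.
        return self._ratio(self.tn, self.tn + self.fp)


# Illustrative counts only; these are NOT the study's data.
abstract_level = Confusion(tp=80, fp=5, tn=95, fn=20)
interaction_level = Confusion(tp=95, fp=285, tn=0, fn=5)

print(f"abstract-level specificity: {abstract_level.specificity():.2f}")
print(f"interaction-level recall:    {interaction_level.recall():.2f}")
print(f"interaction-level precision: {interaction_level.precision():.2f}")
```

With these illustrative counts, a precision of 25% means that roughly three of every four predicted interactions would be rejected on manual review, while a recall of 95% means few true interactions are missed. This is the sense in which such a system filters candidates for a curator rather than replacing curation.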

Availability and implementation: @MInter is freely available at https://github.com/CSB5/atminter

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Data Mining*
  • Electronic Data Processing
  • Microbial Interactions*
  • Support Vector Machine