ThioFinder: a web-based tool for the identification of thiopeptide gene clusters in DNA sequences

PLoS One. 2012;7(9):e45878. doi: 10.1371/journal.pone.0045878. Epub 2012 Sep 24.

Abstract

Thiopeptides are a growing class of sulfur-rich, highly modified heterocyclic peptides that are mainly active against Gram-positive bacteria, including various drug-resistant pathogens. Recent studies also reveal that many thiopeptides inhibit the proliferation of human cancer cells, further expanding their potential for clinical use. Thiopeptide biosynthesis follows a common paradigm, in which a ribosomally synthesized precursor peptide undergoes conserved posttranslational modifications to afford a characteristic core system, but differs in the tailoring steps that furnish individual members. Identifying new thiopeptide gene clusters in the growing body of bacterial DNA sequence data may facilitate the discovery of new thiopeptides and enrich the set of unique biosynthetic elements available for producing novel drug leads by combinatorial biosynthesis. In this study, we developed a web-based tool, ThioFinder, to rapidly identify thiopeptide biosynthetic gene clusters in DNA sequences using a profile Hidden Markov Model approach. Fifty-four new putative thiopeptide biosynthetic gene clusters were found in the sequenced genomes of bacteria not previously known to produce thiopeptides. ThioFinder is fully supported by an open-access database, ThioBase, which contains information on the 99 known thiopeptides, covering chemical structure, biological activity, producing organism, and biosynthetic gene (cluster), along with the associated genome if available. The ThioFinder website offers researchers a unique resource and great flexibility for sequence analysis of thiopeptide biosynthetic gene clusters. ThioFinder is freely available at http://db-mml.sjtu.edu.cn/ThioFinder/.
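The abstract does not describe how the profile Hidden Markov Model is built or scored, so the following is only a rough illustration of the underlying idea, not ThioFinder's implementation. It is a deliberately simplified sketch that keeps only the match states of a profile (which reduces it to a position-specific scoring matrix, with no insert or delete states) and slides it along a protein sequence. The function names `build_profile` and `best_window`, the uniform background, the pseudocount, and the three toy training sequences are all assumptions made up for this example; they are not real thiopeptide core peptides.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids
BACKGROUND = 1.0 / len(ALPHABET)    # assumed uniform background frequency


def build_profile(aligned_seqs, pseudocount=1.0):
    """Build per-column log-odds scores from equal-length aligned sequences.

    Simplification: match-state emissions only, no insert/delete states.
    """
    length = len(aligned_seqs[0])
    profile = []
    for col in range(length):
        # Start every residue at the pseudocount so unseen residues
        # get a small, non-zero probability.
        counts = {aa: pseudocount for aa in ALPHABET}
        for seq in aligned_seqs:
            counts[seq[col]] += 1
        total = sum(counts.values())
        profile.append(
            {aa: math.log2((n / total) / BACKGROUND) for aa, n in counts.items()}
        )
    return profile


def best_window(profile, protein):
    """Return (score, start) of the best-scoring ungapped window.

    Returns (-inf, -1) if the protein is shorter than the profile.
    Residues outside ALPHABET contribute a neutral score of 0.
    """
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(protein) - width + 1):
        score = sum(
            profile[i].get(protein[start + i], 0.0) for i in range(width)
        )
        if score > best[0]:
            best = (score, start)
    return best


# Invented toy alignment, for illustration only.
toy_alignment = ["SCTTCECS", "SCTTCVCS", "SCNTCECS"]
toy_profile = build_profile(toy_alignment)
score, start = best_window(toy_profile, "MKLAAASCTTCECSGG")
print(f"best window starts at index {start} with log-odds score {score:.2f}")
```

A full profile HMM additionally models insertions and deletions relative to the alignment and is normally built and searched with dedicated software such as HMMER rather than hand-written code.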

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteriocins / genetics*
  • Databases, Genetic
  • Drug Discovery
  • Genes, Bacterial
  • Markov Chains
  • Multigene Family*
  • Peptides, Cyclic / genetics*
  • Sequence Analysis, DNA / methods
  • Software*

Substances

  • Bacteriocins
  • Peptides, Cyclic

Grants and funding

This study was supported in part by grants from the National Natural Science Foundation of China; the Ministry of Science and Technology of China (973 and 863 Programs); the Program for New Century Excellent Talents in University, Ministry of Education of China [NCET-10-0572]; the Chen Xing Young Scholars Program, Shanghai Jiaotong University; and Shanghai Municipality. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.