Transcriptome dynamics-based operon prediction in prokaryotes

BMC Bioinformatics. 2014 May 16:15:145. doi: 10.1186/1471-2105-15-145.

Abstract

Background: Inferring operon maps is crucial to understanding the regulatory networks of prokaryotic genomes. Recently, RNA-seq based transcriptome studies revealed that in many bacterial species the operon structure vary with the change of environmental conditions. Therefore, new computational solutions that use both static and dynamic data are necessary to create condition specific operon predictions.

Results: In this work, we propose a novel classification method that integrates RNA-seq based transcriptome profiles with genomic sequence features to accurately identify the operons that are expressed under a measured condition. The classifiers are trained on a small set of confirmed operons and then used to classify the remaining gene pairs of the organism studied. Finally, by linking consecutive gene pairs classified as operons, our computational approach produces condition-dependent operon maps. We evaluated our approach on various RNA-seq expression profiles of the bacteria Haemophilus somni, Porphyromonas gingivalis, Escherichia coli and Salmonella enterica. Our results demonstrate that, using features depending on both transcriptome dynamics and genome sequence characteristics, we can identify operon pairs with high accuracy. Moreover, the combination of DNA sequence and expression data results in more accurate predictions than each one alone.

Conclusion: We present a computational strategy for the comprehensive analysis of condition-dependent operon maps in prokaryotes. Our method can be used to generate condition specific operon maps of many bacterial organisms for which high-resolution transcriptome data is available.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Expression Profiling / methods*
  • Genome, Bacterial*
  • Genomics / methods
  • Molecular Sequence Annotation
  • Operon*
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, RNA / methods*