Proteogenomic analysis of Bradyrhizobium japonicum USDA110 using GenoSuite, an automated multi-algorithmic pipeline

Mol Cell Proteomics. 2013 Nov;12(11):3388-97. doi: 10.1074/mcp.M112.027169. Epub 2013 Jul 23.

Abstract

We present GenoSuite, an integrated proteogenomic pipeline to validate, refine and discover protein-coding genes using high-throughput mass spectrometry (MS) data from prokaryotes. To demonstrate the effectiveness of GenoSuite, we analyzed proteomics data of Bradyrhizobium japonicum (USDA110), a model organism for studying the agriculturally important rhizobium-legume symbiosis. Our analysis confirmed 31% of known genes, refined the translation initiation site (TIS) of 49 gene models and discovered 59 novel protein-coding genes. Notably, we discovered a novel protein that redefined the boundary of a crucial cytochrome P450 system-related operon known to be highly expressed in anaerobic symbiotic bacteroids. A focused analysis of N-terminally acetylated peptides indicated a downstream TIS for gene blr0594. Finally, ortho-proteogenomic analysis revealed three novel genes in the recently sequenced B. japonicum USDA6(T) genome. The discovery of a large number of missing genes and the correction of gene models have expanded the proteomic landscape of B. japonicum and demonstrate the unparalleled utility of proteogenomic analyses and the versatility of GenoSuite for annotating prokaryotic genomes, including those of pathogens.
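
The three outcomes named in the abstract (confirming a known gene, refining a TIS, and discovering a novel gene) can be illustrated with a minimal sketch of how a peptide mapped to genome coordinates might be sorted relative to the existing annotation. This is not GenoSuite's implementation, which the abstract does not describe. The names Gene, PeptideHit, in_frame and classify_hit, the coordinate convention (1-based, inclusive) and the simplified rules (same strand, same reading frame, no stop-codon check) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical annotated gene model; coordinates are 1-based, inclusive."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


@dataclass(frozen=True)
class PeptideHit:
    """Hypothetical genomic footprint of an MS-identified peptide."""
    start: int
    end: int
    strand: str


def in_frame(hit: PeptideHit, gene: Gene) -> bool:
    """True if the peptide is read in the same codon frame as the gene."""
    if gene.strand == "+":
        return (hit.start - gene.start) % 3 == 0
    # On the minus strand, translation starts at the gene's highest coordinate.
    return (gene.end - hit.end) % 3 == 0


def classify_hit(hit: PeptideHit, genes: List[Gene]) -> Tuple[str, Optional[str]]:
    """Assumed, simplified rules:
    - "confirmed": in-frame peptide lying entirely inside an annotated gene
    - "refine_tis": in-frame peptide spanning the annotated start, which
      suggests the true start lies further upstream
    - "novel": no same-strand, in-frame gene explains the peptide
    """
    for gene in genes:
        if gene.strand != hit.strand or not in_frame(hit, gene):
            continue
        if gene.start <= hit.start and hit.end <= gene.end:
            return ("confirmed", gene.name)
        if gene.strand == "+":
            crosses_start = hit.start < gene.start <= hit.end
        else:
            crosses_start = hit.start <= gene.end < hit.end
        if crosses_start:
            return ("refine_tis", gene.name)
    return ("novel", None)


# Toy annotation with made-up coordinates (not taken from the USDA110 genome).
genes = [Gene("geneA", 100, 400, "+"), Gene("geneB", 900, 1200, "-")]
for hit in (PeptideHit(130, 159, "+"), PeptideHit(88, 117, "+"), PeptideHit(500, 529, "+")):
    print(hit, classify_hit(hit, genes))
```

A real pipeline would also need to check for intervening stop codons, combine evidence from several peptides, and control the false discovery rate before accepting any call; the sketch leaves all of that out.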

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism
  • Bradyrhizobium / genetics*
  • Bradyrhizobium / metabolism*
  • Genome, Bacterial
  • Genomics / methods*
  • Genomics / statistics & numerical data
  • Glycine max / microbiology
  • Mass Spectrometry
  • Operon
  • Proteomics / methods*
  • Proteomics / statistics & numerical data
  • Software*
  • Symbiosis / genetics

Substances

  • Bacterial Proteins