Ab initio structure prediction for Escherichia coli: towards genome-wide protein structure modeling and fold assignment

Sci Rep. 2013:3:1895. doi: 10.1038/srep01895.

Abstract

Genome-wide protein structure prediction and structure-based function annotation have been long-term goals in molecular biology but have not yet become possible because of difficulties in modeling distant-homology targets. We developed a hybrid pipeline that combines ab initio folding and template-based modeling for genome-wide structure prediction and applied it to the Escherichia coli genome. The pipeline was tested on 43 sequences with known structures, where QUARK-based ab initio folding simulation generated models with TM-scores 17% higher than those from traditional comparative modeling methods. Of the 495 hard sequences with unknown structures, 72 are predicted to have a correct fold (TM-score > 0.5) and 321 to have a substantial portion of the structure correctly modeled (TM-score > 0.35). In addition, 317 sequences can be reliably assigned to a SCOP fold family based on structural analogy to existing proteins in the PDB. The presented results, as a case study of E. coli, represent promising progress towards genome-wide structure modeling and fold family assignment using state-of-the-art ab initio folding algorithms.
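The TM-score thresholds quoted in the abstract can be illustrated with a minimal Python sketch. It uses the standard TM-score definition (each aligned residue pair contributes 1 / (1 + (d / d0)^2), normalized by the target length, with d0 = 1.24 * (L - 15)^(1/3) - 1.8) evaluated for a single, already superimposed alignment. The full TM-score also maximizes over superpositions, which is omitted here. The function names, the 0.5 Å floor on d0 for short chains, and the mutually exclusive labels are assumptions made for illustration and are not taken from the paper's pipeline. Note also that for sequences without a known structure the paper reports predicted TM-scores, whereas this sketch computes a score from known distances.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in Angstroms) of the TM-score."""
    # Assumption: floor d0 at 0.5 A for short chains, where the formula
    # would otherwise become very small or undefined.
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_fixed_superposition(distances, target_length):
    """TM-score for ONE given superposition.

    distances: per-residue distances (Angstroms) between aligned
    C-alpha pairs after superposition. The search for the optimal
    superposition is deliberately left out of this sketch.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def classify_by_abstract_thresholds(tm):
    """Label a TM-score using the two cutoffs quoted in the abstract.

    The labels are exclusive here for convenience; in the abstract the
    TM-score > 0.35 count may include models above 0.5 as well.
    """
    if tm > 0.5:
        return "correct fold"
    if tm > 0.35:
        return "substantial portion correctly modeled"
    return "below both thresholds"


if __name__ == "__main__":
    # Hypothetical distances for a 100-residue target, for illustration only.
    example = [1.0] * 60 + [6.0] * 40
    score = tm_score_fixed_superposition(example, 100)
    print(round(score, 3), classify_by_abstract_thresholds(score))
```

Because the score is normalized by the target length rather than by the number of aligned residues, a model that covers only part of the target is penalized even when the covered part is accurate.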

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Bacterial Proteins / chemistry*
  • Base Sequence
  • Computational Biology
  • Escherichia coli* / genetics
  • Escherichia coli* / metabolism
  • Genome, Bacterial
  • Internet
  • Models, Molecular*
  • Molecular Sequence Data
  • Protein Conformation*
  • Protein Folding*
  • Software*

Substances

  • Bacterial Proteins