Efficiency improvement of peptide identification for an organism without complete genome sequence, using expressed sequence tag database and tandem mass spectral data

Proteomics. 2003 Dec;3(12):2305-9. doi: 10.1002/pmic.200300620.

Abstract

We compared peptide identification by database (DB) searching with de novo sequencing results for a proteomics study of an organism without genome sequence information. DB searches were run against the Expressed Sequence Tag (EST) DB of the sample organism or the NCBI nonredundant (nr) protein DB of green plants, using either the MASCOT or SEQUEST software program, and were confirmed to be as accurate as de novo sequencing. Twice as many peptides were identified from the EST DB as from the nr protein DB, even though the EST DB contains fewer entries (26 222 ESTs) than the NCBI nr protein DB (224 238). This study demonstrates that an EST DB combined with tandem mass spectra can be used reliably for high-throughput proteomics studies of an organism without genome information.
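As an illustration of the kind of comparison the abstract describes, the sketch below scores how often a DB-search peptide assignment agrees with a de novo sequence for the same spectrum. It is not the authors' procedure, which the abstract does not specify. The function names, the spectrum IDs, and the example peptides are all hypothetical, and treating isoleucine and leucine as equivalent is an assumption made here because the two residues have the same mass.

```python
def normalize(peptide):
    """Uppercase the sequence and map I to L (the two residues are isobaric)."""
    return peptide.strip().upper().replace("I", "L")


def agreement(db_hits, de_novo):
    """Fraction of shared spectra whose DB-search and de novo peptides match.

    Both arguments map a spectrum ID to a peptide sequence (hypothetical layout).
    Spectra present in only one of the two mappings are ignored.
    """
    shared = db_hits.keys() & de_novo.keys()
    if not shared:
        return 0.0
    matches = sum(
        normalize(db_hits[s]) == normalize(de_novo[s]) for s in shared
    )
    return matches / len(shared)


# Made-up example data, for illustration only.
db_hits = {"s1": "AIK", "s2": "GGR"}
de_novo = {"s1": "ALK", "s2": "GAR", "s3": "MPK"}
score = agreement(db_hits, de_novo)
```

With the made-up data above, one of the two shared spectra agrees after I/L normalization. A real comparison would also need to handle partial de novo sequences and score thresholds, which this sketch leaves out.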

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Computational Biology
  • Databases, Protein*
  • Expressed Sequence Tags*
  • Genome*
  • Molecular Sequence Data
  • Peptides / genetics
  • Plants / chemistry
  • Plants / genetics*
  • Proteomics*
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization

Substances

  • Peptides