The ability to sequence DNA rapidly, inexpensively and in a high-throughput fashion provides a unique opportunity to sequence the whole genomes of a large number of species. Cataloging the protein-coding genes of these species, however, remains a non-trivial task, with the majority of initial genome annotation dependent on gene prediction algorithms. Recent advances in mass spectrometry-based proteomics now enable the generation of accurate and comprehensive protein sequence data from tissues and organisms. Proteogenomics allows us to harness the wealth of information available at the proteome level and apply it to the genomic information available for these organisms. Applications include identifying novel genes and splice isoforms, assigning correct start sites and validating predicted exons and genes. Proteogenomics can also be used to identify protein variants that could cause disease, to identify protein biomarkers and to study genome variation. We anticipate that proteogenomics will become a powerful approach, routinely employed by the 'Genome and Proteome Centers' of the future.
Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.