Rp3: Ribosome profiling-assisted proteogenomics improves coverage and confidence during microprotein discovery

Nat Commun. 2024 Aug 9;15(1):6839. doi: 10.1038/s41467-024-50301-4.

Abstract

There has been a dramatic increase in the identification of non-canonical translation and a significant expansion of the protein-coding genome. Among the strategies used to identify unannotated small open reading frames (smORFs) that encode microproteins, ribosome profiling (Ribo-Seq) is the gold standard for annotating novel coding sequences because it reports on smORF translation. In Ribo-Seq, ribosome-protected footprints (RPFs) that map to multiple genomic sites are removed because they cannot be unambiguously assigned to a specific genomic location. Furthermore, RPFs are necessarily short reads (25-34 nucleotides), which increases the chance of multi-mapping alignments, so smORFs residing in such multi-mapping regions cannot be identified by Ribo-Seq. Moreover, it has been challenging to obtain protein-level evidence for smORFs identified by Ribo-Seq. To address these limitations, we developed Rp3, a pipeline that integrates proteogenomics and ribosome profiling to provide unambiguous evidence for a subset of microproteins missed by current Ribo-Seq pipelines. Here, we show that Rp3 maximizes proteomics detection of, and confidence in, microprotein-encoding smORFs.
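
The multi-mapping problem described in the abstract can be illustrated with a toy example. The sketch below is not the Rp3 implementation; the function names, read identifiers, and locus labels are all hypothetical, and real pipelines work on aligner output rather than hand-written tuples. It only shows why a smORF in a repeated region ends up with no usable footprints once multi-mapped RPFs are discarded.

```python
from collections import defaultdict


def split_by_mappability(alignments):
    """Split reads into uniquely mapped and multi-mapped sets.

    `alignments` is an iterable of (read_id, locus) pairs, one pair per
    reported alignment (hypothetical toy format, not a real aligner output).
    """
    loci_by_read = defaultdict(set)
    for read_id, locus in alignments:
        loci_by_read[read_id].add(locus)
    unique = {r: next(iter(loci)) for r, loci in loci_by_read.items() if len(loci) == 1}
    multi = {r: sorted(loci) for r, loci in loci_by_read.items() if len(loci) > 1}
    return unique, multi


def unique_coverage(unique_reads, smorf_loci):
    """Count uniquely mapped reads per smORF (multi-mappers are ignored)."""
    counts = {name: 0 for name in smorf_loci}
    for locus in unique_reads.values():
        for name, smorf_locus in smorf_loci.items():
            if locus == smorf_locus:
                counts[name] += 1
    return counts


# Toy data (assumed): r3 and r4 align equally well to two paralogous loci.
toy_alignments = [
    ("r1", "locusA"),
    ("r2", "locusA"),
    ("r3", "locusB"), ("r3", "locusC"),
    ("r4", "locusB"), ("r4", "locusC"),
]
toy_smorfs = {"smORF_1": "locusA", "smORF_2": "locusB"}

unique_reads, multi_reads = split_by_mappability(toy_alignments)
coverage = unique_coverage(unique_reads, toy_smorfs)
print(coverage)
```

In this toy case smORF_2 keeps no footprints after filtering even though two reads overlap it, which is the kind of gap that orthogonal evidence such as peptide identification is meant to fill.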

MeSH terms

  • Humans
  • Open Reading Frames* / genetics
  • Protein Biosynthesis
  • Proteins / genetics
  • Proteins / metabolism
  • Proteogenomics* / methods
  • Proteomics / methods
  • Ribosome Profiling
  • Ribosomes* / genetics
  • Ribosomes* / metabolism

Substances

  • Proteins