Motivation: Sequence repositories contain few well-annotated mature peptide sequences for viruses. Post-translational proteolytic processing of polyproteins into mature peptides (MPs) was therefore performed in silico, with a new computational method, for over 200 species in five pathogenic virus families (Caliciviridae, Coronaviridae, Flaviviridae, Picornaviridae and Togaviridae).
Results: MPs were annotated by pairwise alignment with reference sequences, and their sequences were made available for search, analysis and download. At publication, the method had produced 156 216 sequences, a large portion of the protein sequences now available at https://www.viprbrc.org. Together, these sequences represent a new and comprehensive mature peptide collection.
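As a rough illustration of annotation by pairwise alignment with a reference, the sketch below aligns a target polyprotein to an annotated reference and carries the reference MP boundaries over to the target. It is a minimal sketch under stated assumptions, not the vipr_mat_peptide implementation: the function names (`global_align`, `transfer_peptides`), the simple Needleman-Wunsch scoring, the toy sequences and the peptide coordinates are all hypothetical, and no cleavage-site validation is attempted.

```python
def global_align(ref, tgt, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings.

    The scoring values are illustrative assumptions, not tuned parameters.
    """
    n, m = len(ref), len(tgt)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == tgt[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and ref[i - 1] == tgt[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            a.append(ref[i - 1]); b.append(tgt[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            a.append(ref[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(tgt[j - 1]); j -= 1
    return "".join(reversed(a)), "".join(reversed(b))


def transfer_peptides(ref, tgt, ref_peptides):
    """Map reference MP ranges (0-based, half-open) onto the target sequence.

    Returns a list of (name, target_subsequence) pairs. Peptides whose
    reference range aligns entirely to gaps in the target are skipped.
    """
    aligned_ref, aligned_tgt = global_align(ref, tgt)
    # For each reference residue, record the aligned target index (None if gap).
    ref_to_tgt = []
    t = 0
    for r_char, t_char in zip(aligned_ref, aligned_tgt):
        if r_char != "-":
            ref_to_tgt.append(t if t_char != "-" else None)
        if t_char != "-":
            t += 1
    annotated = []
    for name, start, end in ref_peptides:
        mapped = [p for p in ref_to_tgt[start:end] if p is not None]
        if mapped:
            annotated.append((name, tgt[mapped[0]:mapped[-1] + 1]))
    return annotated


# Toy data (hypothetical): a 16-residue "polyprotein" with two annotated MPs.
reference = "MKTAYIAKQRGSLLEV"
reference_mps = [("pepA", 0, 8), ("pepB", 8, 16)]
target = "MKTAYLAKQRGSLLEV"  # one substitution relative to the reference

print(transfer_peptides(reference, target, reference_mps))
# [('pepA', 'MKTAYLAK'), ('pepB', 'QRGSLLEV')]
```

In practice an established aligner and a substitution matrix would replace the toy scoring used here, and the transferred boundaries would be checked against the expected cleavage sites before the MPs are accepted.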
Availability and implementation: The data are available at the Virus Pathogen Resource https://www.viprbrc.org, and the software at https://github.com/VirusBRC/vipr_mat_peptide.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: [email protected].