Background: The genome annotations of rhesus (Macaca mulatta) and cynomolgus (Macaca fascicularis) macaques, two of the most common non-human primate animal models, are limited.
Methods: We analyzed large-scale macaque RNA-based next-generation sequencing (RNAseq) data to identify un-annotated macaque transcripts.
Results: For both macaque species, we uncovered thousands of novel isoforms for annotated genes and thousands of un-annotated intergenic transcripts enriched with non-coding RNAs. We also identified thousands of transcript sequences which are partially or completely 'missing' from current macaque genome assemblies. We showed that many newly identified transcripts were differentially expressed during SIV infection of rhesus macaques or during Ebola virus infection of cynomolgus macaques.
Conclusions: For two important macaque species, we uncovered thousands of novel isoforms and un-annotated intergenic transcripts including coding and non-coding RNAs, polyadenylated and non-polyadenylated transcripts. This resource will greatly improve future macaque studies, as demonstrated by their applications in infectious disease studies.
Keywords: RNAseq; intergenic transcripts; non-coding RNA; non-polyadenylated RNA.
© 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.