RNA sequencing (RNA-Seq) allows for the identification of novel exon-exon junctions and the quantification of gene expression levels. We show that RNA-Seq data can also be used to detect alternative polyadenylation (APA) in 3' untranslated regions (3' UTRs), which are known to play a critical role in regulating mRNA stability, cellular localization and translation efficiency. Given the dynamic nature of APA, it is desirable to examine APA on a sample-by-sample basis. We used a Poisson hidden Markov model (PHMM) of RNA-Seq data to identify potential APA events leading to shortened 3' UTRs in human liver and brain cortex tissues. Over three hundred transcripts with shortened 3' UTRs were detected with sensitivity >75% and specificity >60%. Tissue-specific 3' UTR shortening was observed for 32 genes with a q-value ≤ 0.1. When compared with alternative isoforms detected by Cufflinks or MISO, our PHMM method agreed on over 100 transcripts with shortened 3' UTRs. Given the increasing use of RNA-Seq for gene expression profiling, using a PHMM to investigate sample-specific 3' UTR shortening could be an added benefit of this emerging technology.
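To illustrate the general idea behind a PHMM on read coverage, the following is a minimal sketch, not the implementation used in this work. It assumes a two-state model in which per-position read counts along a 3' UTR are Poisson distributed, with a high rate upstream of a proximal polyadenylation site and a low rate downstream of it, and it decodes the most likely state path with the Viterbi algorithm. The function names, the rates, the transition probabilities and the toy counts are all hypothetical; in practice the parameters would be estimated from the data (for example by EM) rather than fixed by hand.

```python
import math


def poisson_logpmf(k, lam):
    """Log probability of observing count k under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi_poisson(counts, rates, start_probs, trans_probs):
    """Most likely hidden-state path for a Poisson-emission HMM.

    counts      : read counts per position (or per bin) along the 3' UTR
    rates       : Poisson mean for each hidden state (assumed known here)
    start_probs : initial state probabilities (all > 0)
    trans_probs : trans_probs[r][s] = P(next state s | current state r), all > 0
    """
    n_states = len(rates)
    score = [
        math.log(start_probs[s]) + poisson_logpmf(counts[0], rates[s])
        for s in range(n_states)
    ]
    backpointers = []
    for k in counts[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at this position.
            best = max(
                range(n_states),
                key=lambda r: prev[r] + math.log(trans_probs[r][s]),
            )
            pointers.append(best)
            score.append(
                prev[best]
                + math.log(trans_probs[best][s])
                + poisson_logpmf(k, rates[s])
            )
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy coverage profile (hypothetical): coverage drops after position 5,
# as might be expected if a proximal poly(A) site is used.
counts = [20, 22, 19, 21, 18, 20, 5, 4, 6, 5, 3, 4]
rates = [20.0, 5.0]                        # state 0 = high, state 1 = low
start_probs = [0.5, 0.5]
trans_probs = [[0.9, 0.1], [0.01, 0.99]]   # assumed, not fitted

path = viterbi_poisson(counts, rates, start_probs, trans_probs)
change_point = path.index(1) if 1 in path else None
print(path, change_point)
```

In this sketch, the first position decoded as the low-coverage state serves as a candidate boundary of the shortened 3' UTR; a profile decoded entirely as one state would indicate no detectable change in coverage.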
Keywords: 3′ UTRs; 3′ untranslated regions; APA; Alternative polyadenylation; BIC; Bayesian information criterion; CDFs; EM; EST; Expectation and Maximization; GEO; Gene Expression Omnibus; IVT; MicroRNA; Microarray; PAS-Seq; PHMM; Poisson hidden Markov model; RACE; RNA-Seq; RNA-sequencing; SRA; SVM; Sequence Read Archive; Untranslated region; alternative polyadenylation; base-pairs; bp; chip design files; expressed sequence tag; in vitro transcription; miRNA; polyadenylation site sequencing; rapid amplification of cDNA ends; support vector machine.
© 2013.