Mutations that affect mRNA splicing often produce multiple mRNA isoforms, resulting in complex molecular phenotypes. Definition of an exon and its inclusion in mature mRNA rely on joint recognition of both acceptor and donor splice sites. This study predicts cryptic and exon-skipping isoforms in mRNA produced by splicing mutations from the combined information contents (R(i), which measures binding-site strength, in bits) and distribution of the splice sites defining these exons. The total information content of an exon (R(i,total)) is the sum of the R(i) values of its acceptor and donor splice sites, adjusted for the self-information of the distance separating these sites, that is, the gap surprisal. Differences between the total information contents of exons (ΔR(i,total)) are predictive of the relative abundance of these exons in distinct processed mRNAs. Constraints on splice site and exon selection are used to eliminate nonconforming and poorly expressed isoforms. Molecular phenotypes are computed by the Automated Splice Site and Exon Definition Analysis (http://splice.uwo.ca) server. Predictions for splicing mutations were highly concordant (85.2%; n = 61) with published expression data. In silico exon definition analysis will contribute to streamlining assessment of abnormal and normal splice isoforms resulting from mutations.
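The exon-level calculation described above can be illustrated with a minimal Python sketch. It is not the server's implementation. It assumes that the adjustment is a subtraction of the gap surprisal and that the gap surprisal is the negative base-2 logarithm of the exon length's frequency. The function names, the `length_freq` table, and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def gap_surprisal(exon_length, length_freq):
    """Self-information (bits) of an exon length.

    length_freq is a hypothetical table mapping exon length (nt) to its
    relative frequency; a real analysis would derive it from annotated exons.
    """
    p = length_freq.get(exon_length, 0.0)
    if p <= 0.0:
        raise ValueError(f"no frequency available for exon length {exon_length}")
    return -math.log2(p)


def exon_total_information(ri_acceptor, ri_donor, exon_length, length_freq):
    """R(i,total): acceptor R(i) plus donor R(i), minus the gap surprisal.

    Treating the adjustment as a subtraction is an assumption of this sketch.
    """
    return ri_acceptor + ri_donor - gap_surprisal(exon_length, length_freq)


# Hypothetical inputs: two candidate exons sharing one acceptor site.
length_freq = {120: 0.25, 95: 0.125}

natural = exon_total_information(8.0, 7.5, 120, length_freq)  # natural donor
cryptic = exon_total_information(8.0, 5.0, 95, length_freq)   # cryptic donor

# A positive difference favors the natural exon over the cryptic one.
delta_ri_total = natural - cryptic

print(f"R(i,total) natural: {natural:.1f} bits")
print(f"R(i,total) cryptic: {cryptic:.1f} bits")
print(f"delta R(i,total):   {delta_ri_total:.1f} bits")
```

With these made-up inputs the natural exon scores 3.5 bits higher than the cryptic exon, so the sketch would rank the natural isoform as the more abundant of the two. How a difference in bits maps to a quantitative abundance ratio is not stated in the abstract and is left out here.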
© 2013 Wiley Periodicals, Inc.