Although single base-pair substitutions in splice junctions constitute at least 10% of all mutations causing human inherited disease, the factors that determine their phenotypic consequences at the RNA level remain to be fully elucidated. Employing a neural network for splice-site recognition, we performed a meta-analysis of 478 disease-associated splicing mutations, in 38 different genes, for which detailed laboratory-based mRNA phenotype assessment had been performed. Inspection of the ±50-bp DNA sequence context of the mutations revealed that exon skipping was the preferred phenotype when the immediate vicinity of the affected exon-intron junctions was devoid of alternative splice-sites. By contrast, in the presence of at least one such motif, cryptic splice-site utilization became more prevalent. This association was, however, confined to donor splice-sites. Outside the obligate dinucleotide, the spatial distribution of pathological mutations was found to differ significantly from that of SNPs. Whereas disease-associated lesions clustered at positions -1 and +3 to +6 for donor sites and -3 for acceptor sites, SNPs were found to be almost evenly distributed over all sequence positions considered. When all putative missense mutations in the vicinity of splice-sites were extracted from the Human Gene Mutation Database for the 38 studied genes, a significantly higher proportion of changes at donor sites (37/152; 24.3%) than at acceptor splice-sites (1/142; 0.7%) was found to reduce the neural network signal emitted by the respective splice-site. Based upon these findings, we estimate that some 1.6% of disease-causing missense substitutions in human genes are likely to affect the mRNA splicing phenotype. Taken together, our results are consistent with correct donor splice-site recognition being a key step in exon recognition.
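The donor versus acceptor comparison reported above (37/152 versus 1/142) can be checked with a short calculation. The sketch below recomputes the two percentages and then applies a two-sided Fisher's exact test to the corresponding 2x2 table. The abstract does not state which test was used, so the choice of Fisher's exact test, the function name `fisher_exact_two_sided`, and the variable names are assumptions made purely for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Counts taken from the abstract: missense changes near splice-sites that
# reduced the neural network signal, out of all such changes examined.
donor_hits, donor_total = 37, 152
acceptor_hits, acceptor_total = 1, 142

donor_pct = round(100 * donor_hits / donor_total, 1)           # 24.3
acceptor_pct = round(100 * acceptor_hits / acceptor_total, 1)  # 0.7

p_value = fisher_exact_two_sided(
    donor_hits, donor_total - donor_hits,
    acceptor_hits, acceptor_total - acceptor_hits,
)

print(f"donor: {donor_pct}%  acceptor: {acceptor_pct}%  p = {p_value:.3g}")
```

An exact test is used here rather than a chi-square approximation because one cell of the table contains a single observation.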
(c) 2006 Wiley-Liss, Inc.