Fine-scale variation and genetic determinants of alternative splicing across individuals

PLoS Genet. 2009 Dec;5(12):e1000766. doi: 10.1371/journal.pgen.1000766. Epub 2009 Dec 11.

Abstract

Recently, thanks to the increasing throughput of new technologies, we have begun to explore the full extent of alternative pre-mRNA splicing (AS) in the human transcriptome. This is unveiling a vast layer of complexity in isoform-level expression differences between individuals. We used previously published splicing-sensitive microarray data from lymphoblastoid cell lines to conduct an in-depth analysis of the splicing efficiency of known and predicted exons. By combining publicly available AS annotation with a novel algorithm designed to search for AS, we show that many real AS events can be detected within the usually unexploited, speculative majority of the array and at significance levels well below standard multiple-testing thresholds, demonstrating that the extent of cis-regulated differential splicing between individuals is potentially far greater than previously reported. Specifically, many genes show subtle but significant genetically controlled differences in splice-site usage. PCR validation shows that 42 out of 58 (72%) candidate gene regions undergo detectable AS, amounting to the largest-scale validation of isoform eQTLs to date. Targeted sequencing revealed a likely causative SNP in most validated cases. In all 17 instances where a SNP affected a splice-site region, in silico splice-site strength modeling correctly predicted the direction of the microarray and PCR results. In 13 other cases, we identified likely causative SNPs disrupting predicted splicing enhancers. Using Fst and REHH analysis, we uncovered significant evidence that two putative causative SNPs have undergone recent positive selection. We verified the effect of five SNPs using in vivo minigene assays. This study shows that splicing differences between individuals, including quantitative differences in isoform ratios, are frequent in human populations and that causative SNPs can be identified using in silico predictions. Several cases affected disease-relevant genes, and it is likely that some of these differences are involved in phenotypic diversity and susceptibility to complex diseases.
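
For readers unfamiliar with the Fst statistic mentioned above, the following is a minimal sketch of Wright's fixation index for a single biallelic SNP in two populations. The abstract does not state which Fst estimator or population weighting the study used, so the equal-weight formula, the function name `wright_fst`, and the allele frequencies in the usage lines are illustrative assumptions only, not the authors' method.

```python
def wright_fst(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic SNP in two equally weighted populations.

    p1, p2 are the frequencies of the same allele in each population.
    This is an illustrative textbook formula, not the estimator from the study.
    """
    # Expected heterozygosity in the pooled (total) population.
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        # The allele is fixed or absent in both populations: no differentiation.
        return 0.0
    # Mean expected heterozygosity within the two subpopulations.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies, chosen only to show the range of the statistic.
same = wright_fst(0.5, 0.5)        # identical frequencies
divergent = wright_fst(0.2, 0.8)   # strongly differing frequencies
fixed = wright_fst(1.0, 0.0)       # alternative alleles fixed in each population
```

Values near 0 indicate similar allele frequencies across populations, while values approaching 1 indicate strong differentiation, which is one of the signals used when testing for recent positive selection.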

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Alternative Splicing*
  • Genetic Predisposition to Disease
  • Humans
  • Oligonucleotide Array Sequence Analysis
  • Polymerase Chain Reaction
  • Polymorphism, Single Nucleotide
  • RNA, Messenger / genetics

Substances

  • RNA, Messenger