Short ultraconserved promoter regions delineate a class of preferentially expressed alternatively spliced transcripts

Genomics. 2009 Nov;94(5):308-16. doi: 10.1016/j.ygeno.2009.07.005. Epub 2009 Aug 4.

Abstract

Ultraconservation has been variously defined to describe sequences that have remained identical or nearly so over long periods of evolution to a degree that is higher than expected for sequences under typical constraints associated with protein-coding sequences, splice sites, or transcription factor binding sites. Most intergenic ultraconserved elements (UCEs) appear to be tissue-specific enhancers, whereas another class of intragenic UCEs is involved in regulation of gene expression by means of alternative splicing. In this study we define a set of 2827 short ultraconserved promoter regions (SUPRs) in the 5 kb upstream regions of 1268 human protein-coding genes using a definition of 98% identity for at least 30 bp in 7 mammalian species. Our analysis shows that SUPRs are enriched in genes playing a role in regulation and development. Many of the genes having a SUPR-containing promoter have additional alternative promoters that do not contain SUPRs. Comparison of such promoters by CAGE tag, EST, and Solexa read analysis revealed that SUPR-associated transcripts show a significantly higher mean expression than transcripts associated with non-SUPR-containing promoters. The same was true for the comparison between all SUPR-associated and non-SUPR-associated transcripts on a genome-wide basis. SUPR-associated genes show a highly significant tendency to occur in regions that are also enriched for intergenic short ultraconserved elements (SUEs) in the vicinity of developmental genes. A number of predicted transcription factor binding sites (TFBSs) are overrepresented in SUPRs and SUEs, including those for transcription factors of the homeodomain family, but in contrast to SUEs, SUPRs are also enriched in core-promoter motifs. These observations suggest that SUPRs delineate a distinct class of ultraconserved sequences.
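The threshold used in the abstract (98% identity for at least 30 bp in 7 mammalian species) can be illustrated with a minimal sketch. The abstract does not say how identity was computed, so the code below makes its own assumptions: the input is a gapped multiple alignment given as equal-length strings, one per species, and a column counts as identical only if every species carries the same unambiguous base. The function names `conserved_columns`, `qualifies`, and `scan_windows` are hypothetical and are not taken from the study.

```python
from typing import List


def conserved_columns(alignment: List[str]) -> List[bool]:
    """Flag each alignment column that holds one identical A/C/G/T base in all rows.

    Assumption (not from the abstract): gaps and ambiguous bases break identity.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    flags = []
    for column in zip(*alignment):
        bases = {base.upper() for base in column}
        flags.append(len(bases) == 1 and bases <= set("ACGT"))
    return flags


def qualifies(alignment: List[str], min_len: int = 30, min_identity_pct: int = 98) -> bool:
    """Check whether a whole aligned block meets the length and identity thresholds."""
    flags = conserved_columns(alignment)
    # Integer arithmetic avoids floating-point edge cases at exactly 98%.
    return len(flags) >= min_len and sum(flags) * 100 >= min_identity_pct * len(flags)


def scan_windows(alignment: List[str], window: int = 30, min_identity_pct: int = 98) -> List[int]:
    """Return 0-based start positions of fixed-size windows that meet the identity threshold."""
    flags = conserved_columns(alignment)
    hits: List[int] = []
    if len(flags) < window:
        return hits
    count = sum(flags[:window])  # conserved columns in the first window
    for start in range(len(flags) - window + 1):
        if start > 0:
            # Slide the window: add the entering column, drop the leaving one.
            count += flags[start + window - 1] - flags[start - 1]
        if count * 100 >= min_identity_pct * window:
            hits.append(start)
    return hits
```

Under these assumptions a 30 bp block tolerates no mismatching column (29/30 is about 96.7%), whereas a 50 bp block tolerates exactly one (49/50 is 98%). A fixed-window scan is only one way to apply the threshold; merging overlapping hits into maximal regions would be a further step that this sketch leaves out.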

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Alternative Splicing*
  • Animals
  • Cattle
  • Conserved Sequence / genetics*
  • Dogs
  • Exons / genetics
  • Female
  • Gene Expression Regulation, Developmental*
  • Genomics*
  • Humans
  • Infant
  • Infant, Newborn
  • Introns / genetics
  • Male
  • Mice
  • Organ Specificity
  • Promoter Regions, Genetic / genetics*
  • Proteins / chemistry
  • Proteins / genetics
  • Proteins / metabolism*
  • Rabbits

Substances

  • Proteins