A structural split in the human genome

PLoS One. 2007 Jul 11;2(7):e603. doi: 10.1371/journal.pone.0000603.

Abstract

Background: Promoter-associated CpG islands (PCIs) mediate methylation-dependent gene silencing, yet tend to co-localize with transcriptionally active genes. To address this paradox, we used data mining to assess the behavior of PCI-positive (PCI+) genes in the human genome.

Results: PCI+ genes exhibit a bimodal distribution: (1) a 'housekeeping-like' subset characterized by higher GC content and shorter, fewer introns, and (2) a 'pseudogene paralog' subset characterized by lower GC content and longer, more numerous introns (p<0.001). These subsets are functionally distinguishable: the housekeeping-like group shows higher expression levels and a lower evolutionary rate (p<0.001). PCI-negative (PCI-) genes exhibit a higher evolutionary rate and narrower expression breadth than PCI+ genes (p<0.001), consistent with more frequent tissue-specific inactivation.
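
The sequence-level quantities behind these results (GC content and CpG island status) can be illustrated with a minimal Python sketch. The abstract does not say how PCIs were identified, so the thresholds below are an assumption: they follow the commonly cited Gardiner-Garden and Frommer criteria (length of at least 200 bp, GC fraction of at least 0.5, observed/expected CpG ratio of at least 0.6). The function names are hypothetical and are not taken from the paper.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the exact dinucleotide count.
    cpg_count = seq.count("CG")
    return cpg_count * len(seq) / (c_count * g_count)


def is_cpg_island(seq: str, min_len: int = 200,
                  min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply the assumed thresholds (not the paper's stated method) to one window."""
    return (len(seq) >= min_len
            and gc_content(seq) >= min_gc
            and cpg_obs_exp(seq) >= min_obs_exp)
```

Under these assumptions, a gene would be labeled PCI+ if a window overlapping its promoter passes `is_cpg_island`; comparing `gc_content` between gene groups is then the kind of per-gene measurement the bimodal split refers to.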

Conclusions: Adaptive evolution of the human genome appears to be driven in part by declining transcription of a subset of PCI+ genes, predisposing to both CpG→TpA mutation and intron insertion. We propose a model of evolving biological complexity in which environmentally selected gains or losses of PCI methylation respectively favor positive or negative selection, thus polarizing PCI+ gene structures around a genomic core of ancestral PCI- genes.

MeSH terms

  • Animals
  • Base Composition
  • CpG Islands / genetics
  • DNA Transposable Elements / genetics
  • Evolution, Molecular
  • Genome, Human*
  • Humans
  • Introns / genetics
  • Multigene Family / genetics
  • Mutation
  • Promoter Regions, Genetic / genetics
  • Pseudogenes / genetics
  • Species Specificity
  • Transcription, Genetic

Substances

  • DNA Transposable Elements