Genetic variation in mRNA coding sequences of highly conserved genes

Physiol Genomics. 2001 Apr 2;5(3):113-8. doi: 10.1152/physiolgenomics.2001.5.3.113.

Abstract

The frequency and distribution of genetic polymorphism in the human genome are questions of major importance. We have studied them in highly conserved genes, which encode crucial functions such as DNA replication, mRNA transcription, and translation. Evolutionary comparisons suggest that these genes are under particularly strong selective pressure, so their frequency of nucleotide sequence polymorphism would be expected to represent a minimum estimate for sequence variation throughout the genome. We have analyzed the complete coding sequence and the 3'-untranslated region (3'-UTR) of 22 human genes, most of which have homologs in all cellular organisms and all of which share at least 25% amino acid identity with homologs in yeast. Comparisons with similar studies of less conserved human disease genes indicate that 1) evolutionarily conserved genes are, on average, less polymorphic than disease-related genes; 2) the difference in polymorphism levels is attributable almost entirely to reduced levels of variation in protein-coding sequences, whereas noncoding sequences have similar levels of polymorphism; and 3) the character of polymorphism, in terms of the spectrum and frequency of mutational changes, is similar in both groups of genes.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Line
  • Conserved Sequence
  • Evolution, Molecular
  • Fungal Proteins / genetics
  • Genes*
  • Genetic Variation
  • Humans
  • Polymorphism, Single Nucleotide*
  • RNA, Messenger / genetics*
  • Sequence Homology, Amino Acid
  • Yeasts / genetics

Substances

  • Fungal Proteins
  • RNA, Messenger