Atypical structure and phylogenomic evolution of the new eutherian oocyte- and embryo-expressed KHDC1/DPPA5/ECAT1/OOEP gene family

Genomics. 2007 Nov;90(5):583-94. doi: 10.1016/j.ygeno.2007.06.003. Epub 2007 Oct 3.

Abstract

Several recent studies have shown that genes specifically expressed by the oocyte are subject to rapid evolution, in particular via gene duplication mechanisms. In the present work, we have focused our attention on a family of genes, specific to eutherian mammals, that are located in unstable genomic regions. We have identified two genes specifically expressed in the mouse oocyte: Khdc1a (KH homology domain containing 1a, also named Ndg1 for Nur77 downstream gene 1, a target gene of the Nur77 orphan receptor), and another gene structurally related to Khdc1a that we have renamed Khdc1b. In this paper, we show that Khdc1a and Khdc1b belong to a family of several members including the so-called developmental pluripotency A5 (Dppa5) genes, the cat/dog oocyte expressed protein (cat OOEP and dog OOEP) genes, and the ES cell-associated transcript 1 (Ecat1) genes. These genes encode structurally related proteins that are characterized by an atypical RNA-binding KH domain and are specifically expressed in oocytes and/or embryonic stem cells. They are absent in fish, bird, and marsupial genomes and thus seem to have first appeared in eutherian mammals, in which they have evolved rapidly. They are located in a single syntenic region in all mammalian genomes studied, except in rodents, in which a synteny rupture due to a paracentric inversion has separated this gene family into two genomic regions and seems to be associated with increased instability in these regions. Overall, we have identified and characterized a novel family of oocyte and/or embryonic stem cell-specific genes encoding proteins that share an atypical KH RNA-binding domain and that have evolved rapidly since their emergence in eutherian mammalian genomes.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Computational Biology
  • DNA-Binding Proteins / genetics*
  • DNA-Binding Proteins / metabolism
  • Embryonic Stem Cells / metabolism*
  • Evolution, Molecular
  • Female
  • Genome*
  • Homeodomain Proteins / genetics*
  • Homeodomain Proteins / metabolism
  • In Situ Hybridization
  • Mice
  • Molecular Sequence Data
  • Multigene Family*
  • Nanog Homeobox Protein
  • Oocytes / metabolism*
  • Phylogeny*
  • Proteins / genetics*
  • Proteins / metabolism
  • Reverse Transcriptase Polymerase Chain Reaction
  • Sequence Homology, Amino Acid

Substances

  • DNA-Binding Proteins
  • Dppa5 protein, mouse
  • Homeodomain Proteins
  • Nanog Homeobox Protein
  • Nanog protein, mouse
  • Proteins