Identification and characterization of the human long form of Sox5 (L-SOX5) gene

Gene. 2002 Sep 18;298(1):59-68. doi: 10.1016/s0378-1119(02)00927-7.

Abstract

The Sox (Sry-type HMG box) group of transcription factors, which is defined by a high-mobility group (HMG) DNA-binding domain, is categorized into six subfamilies. Sox5 and Sox6 belong to the group D subfamily, which is characterized by conserved N-terminal domains including a leucine-zipper, a coiled-coil domain and a Q-box. Group D Sox genes are expressed as long and short transcripts that exhibit differential expression patterns. In mouse, the long form of Sox5, L-Sox5, is co-expressed and interacts with Sox6; together, these two proteins appear to play a key role in chondrogenesis and myogenesis. In humans, however, only the short form of Sox5 has previously been identified. To gain insight into Sox5 function, we have identified and characterized human L-SOX5. The human L-SOX5 cDNA encodes a 763-amino-acid protein that is 416 residues longer than the short form and contains all of the characteristic motifs of group D Sox proteins. The predicted L-SOX5 protein shares 97% amino acid identity with its mouse counterpart and 59% identity with human SOX6. The L-SOX5 gene contains 18 exons and shows similar genomic structure to SOX6. We have identified two transcription start sites in L-SOX5 and multiple alternatively spliced mRNA variants that are distinct from the short form. Unlike the short form, which shows testis-specific expression, L-SOX5 is expressed in multiple tissues. Like SOX6, L-SOX5 shows strong expression in chondrocytes and striated muscles, indicating a likely role in human cartilage and muscle development.
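The identity percentages and length comparison quoted above can be illustrated with a minimal sketch. The abstract does not say how identity was calculated. This sketch assumes one common convention: identical residues divided by the aligned columns in which neither sequence has a gap. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the SOX5 or SOX6 records.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention (not stated in the abstract): identical residues
    divided by the number of columns where neither sequence has a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Made-up aligned fragments for illustration only (not real SOX5 sequence).
toy_a = "MKRPA-LSDE"
toy_b = "MKRPGQLSDE"
identity = percent_identity(toy_a, toy_b)

# Length arithmetic implied by the abstract: L-SOX5 has 763 residues and is
# 416 residues longer than the short form.
l_sox5_length = 763
extra_residues = 416
short_form_length = l_sox5_length - extra_residues
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give slightly different values, so a reported figure such as 97% is only comparable to others calculated the same way.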

MeSH terms

  • Alternative Splicing
  • Amino Acid Sequence
  • Base Sequence
  • DNA / chemistry
  • DNA / genetics
  • DNA, Complementary / chemistry
  • DNA, Complementary / genetics
  • DNA-Binding Proteins / genetics*
  • Exons
  • Female
  • Gene Expression
  • High Mobility Group Proteins / genetics*
  • Humans
  • Introns
  • Male
  • Molecular Sequence Data
  • Nuclear Proteins / genetics*
  • Polymorphism, Single Nucleotide / genetics
  • Protein Isoforms / genetics
  • SOXD Transcription Factors
  • Sequence Analysis, DNA
  • Sequence Homology, Amino Acid
  • Transcription Factors
  • Tumor Cells, Cultured

Substances

  • DNA, Complementary
  • DNA-Binding Proteins
  • High Mobility Group Proteins
  • Nuclear Proteins
  • Protein Isoforms
  • SOX5 protein, human
  • SOXD Transcription Factors
  • Transcription Factors
  • DNA

Associated data

  • GENBANK/AB081588
  • GENBANK/AB081589
  • GENBANK/AB081590
  • GENBANK/AB081591