Computational and biological analysis of 680 kb of DNA sequence from the human 5q31 cytokine gene cluster region

Genome Res. 1997 May;7(5):495-512. doi: 10.1101/gr.7.5.495.

Abstract

With the human genome project advancing into what will be a 7- to 10-year DNA sequencing phase, we are presented with the challenge of developing strategies to convert genomic sequence data, as they become available, into biologically meaningful information. We have analyzed 680 kb of noncontiguous DNA sequence from a 1-Mb region of human chromosome 5q31, coupling computational analysis with gene expression studies of tissues isolated from humans as well as from mice containing human YAC transgenes. This genomic interval has been noted previously for containing the cytokine gene cluster and a quantitative trait locus associated with inflammatory diseases. Our analysis identified and verified expression of 16 new genes, as well as 7 previously known genes. Of the total of 23 genes in this region, 78% had similarity matches to sequences in protein databases and 83% had exact expressed sequence tag (EST) database matches. Comparative mapping studies of eight of the new human genes discovered in the 5q31 region revealed that all are located in the syntenic region of mouse chromosome 11q. Our analysis demonstrates an approach for examining human sequence as it is made available from large sequencing programs and has resulted in the discovery of several biomedically important genes, including a cyclin, a transcription factor that is homologous to an oncogene, a protein involved in DNA repair, and several new members of a family of transporter proteins.
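The abstract gives the database-match results only as percentages of the 23 genes. As a rough check, the short sketch below recovers the whole-number gene counts consistent with those rounded figures. The helper name `counts_matching_percent` is hypothetical, and the resulting counts are inferred from the stated percentages rather than reported in the abstract.

```python
# Gene counts stated in the abstract.
NEW_GENES = 16
KNOWN_GENES = 7
TOTAL_GENES = NEW_GENES + KNOWN_GENES  # 23 genes in the region


def counts_matching_percent(percent, total):
    """Return every whole-number count k (0..total) whose share of
    `total`, rounded to the nearest percent, equals `percent`.

    Hypothetical helper for illustration; it assumes the abstract's
    percentages were rounded to the nearest whole percent.
    """
    return [k for k in range(total + 1) if round(100 * k / total) == percent]


# 78% of genes had protein-database similarity matches.
protein_hits = counts_matching_percent(78, TOTAL_GENES)

# 83% of genes had exact EST database matches.
est_hits = counts_matching_percent(83, TOTAL_GENES)

print(TOTAL_GENES, protein_hits, est_hits)
```

Under that rounding assumption, each percentage corresponds to a single count: 18 of 23 genes for the protein-database matches and 19 of 23 for the EST matches.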

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Blotting, Northern
  • Chromosome Mapping / methods
  • Chromosomes, Artificial, Yeast
  • Chromosomes, Human, Pair 5*
  • Computational Biology / methods*
  • Cytokines / genetics*
  • Humans
  • Interleukins / genetics
  • Mice
  • Mice, Transgenic
  • Molecular Sequence Data
  • Multigene Family*
  • Polymerase Chain Reaction
  • Protein Biosynthesis
  • Proteins / genetics
  • RNA / genetics
  • Sequence Analysis, DNA / methods
  • Sequence Homology, Nucleic Acid
  • Sequence Tagged Sites
  • Software
  • Transcription, Genetic

Substances

  • Cytokines
  • Interleukins
  • Proteins
  • RNA

Associated data

  • GENBANK/D10041
  • GENBANK/D12645
  • GENBANK/D25728
  • GENBANK/D63072
  • GENBANK/H46577
  • GENBANK/L13773
  • GENBANK/L27651
  • GENBANK/M79022
  • GENBANK/N44937
  • GENBANK/N48057
  • GENBANK/N50853
  • GENBANK/N59157
  • GENBANK/N78208
  • GENBANK/R53169
  • GENBANK/R66031
  • GENBANK/U00055
  • GENBANK/U16163
  • GENBANK/U28966
  • GENBANK/U34360
  • GENBANK/W22101
  • GENBANK/W26686
  • GENBANK/X14814
  • GENBANK/X16609
  • GENBANK/X69063
  • GENBANK/X76454
  • GENBANK/X82498
  • GENBANK/X83543
  • GENBANK/X93510
  • GENBANK/Z14997
  • GENBANK/Z37110