Identifying gene regulatory elements by genome-wide recovery of DNase hypersensitive sites

Proc Natl Acad Sci U S A. 2004 Jan 27;101(4):992-7. doi: 10.1073/pnas.0307540100. Epub 2004 Jan 19.

Abstract

Analysis of the human genome sequence has identified approximately 25,000-30,000 protein-coding genes, but little is known about how most of these are regulated. Mapping DNase I hypersensitive (HS) sites has traditionally represented the gold-standard experimental method for identifying regulatory elements, but the labor-intensive nature of this technique has limited its application to only a small number of human genes. We have developed a protocol to generate a genome-wide library of gene regulatory sequences by cloning DNase HS sites. We generated a library of DNase HS sites from quiescent primary human CD4+ T cells and analyzed approximately 5,600 of the resulting clones. Compared to sequences from randomly generated in silico libraries, sequences from these clones were found to map more frequently to regions of the genome known to contain regulatory elements, such as regions upstream of genes, within CpG islands, and in sequences that align between mouse and human. These cloned sites also tend to map near genes that have detectable transcripts in CD4+ T cells, demonstrating that transcriptionally active regions of the genome are being selected. Validation of putative regulatory elements was achieved by repeated recovery of the same sequence and real-time PCR. This cloning strategy, which can be scaled up and applied to any cell line or tissue, will be useful in identifying regulatory elements controlling global expression differences that delineate tissue types, stages of development, and disease susceptibility.
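
The comparison against randomly generated in silico libraries can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how the random libraries were built or which statistic was used. The sketch assumes a single linear coordinate system, uniform random sampling of positions, and non-overlapping half-open annotation intervals (for example, CpG islands). The function names (`build_index`, `in_region`, `fraction_in_regions`, `enrichment`) and the toy coordinates are hypothetical.

```python
import random
from bisect import bisect_right


def build_index(regions):
    """Sort non-overlapping half-open (start, end) intervals into parallel lists."""
    ordered = sorted(regions)
    starts = [s for s, _ in ordered]
    ends = [e for _, e in ordered]
    return starts, ends


def in_region(pos, index):
    """Return True if pos falls inside any annotated interval."""
    starts, ends = index
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and pos < ends[i]


def fraction_in_regions(positions, index):
    """Fraction of positions that land inside annotated intervals."""
    if not positions:
        return 0.0
    return sum(in_region(p, index) for p in positions) / len(positions)


def enrichment(positions, regions, genome_length, n_libraries=1000, seed=0):
    """Compare observed clone positions with random libraries of the same size.

    Assumption: random libraries are drawn uniformly over [0, genome_length).
    Returns (observed fraction, mean random fraction, empirical p-value).
    """
    rng = random.Random(seed)
    index = build_index(regions)
    observed = fraction_in_regions(positions, index)
    null = []
    for _ in range(n_libraries):
        library = [rng.randrange(genome_length) for _ in positions]
        null.append(fraction_in_regions(library, index))
    # One-sided empirical p-value with the usual +1 correction.
    exceed = sum(f >= observed for f in null)
    p_value = (exceed + 1) / (n_libraries + 1)
    return observed, sum(null) / len(null), p_value


if __name__ == "__main__":
    # Toy data only: two annotated intervals covering 2% of a 10 kb "genome".
    toy_regions = [(100, 200), (5000, 5100)]
    toy_clones = [150, 160, 5050, 7000]
    print(enrichment(toy_clones, toy_regions, genome_length=10_000))
```

In practice, a random library would probably also need to match the clone library's length distribution and mappability, which this sketch ignores.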

MeSH terms

  • CD4-Positive T-Lymphocytes / enzymology
  • Cells, Cultured
  • Cloning, Molecular
  • Computational Biology
  • Deoxyribonucleases / metabolism*
  • Genome*
  • Humans
  • Polymerase Chain Reaction
  • Regulatory Sequences, Nucleic Acid*

Substances

  • Deoxyribonucleases