Genome-Wide TSS Identification in Maize

Methods Mol Biol. 2018:1830:239-256. doi: 10.1007/978-1-4939-8657-6_14.

Abstract

Regulation of gene expression is a fundamental biological process that relies on transcription factors (TFs) recognizing specific cis motifs in the regulatory regions of the genes that they control. In most eukaryotic organisms, cis-regulatory elements are significantly enriched around the transcription start site (TSS). However, unlike other genic features, TSSs must be determined experimentally, which makes them important components of genome annotations. One method for experimentally determining TSSs genome-wide is CAGE (cap analysis of gene expression). This chapter describes how to prepare a CAGE library for sequencing, from RNA extraction through library construction and quality controls, before proceeding to sequencing on the Illumina platform. We then describe how to use a computational pipeline to determine, from the alignment of CAGE tags, the genome-wide location of TSSs, followed by the statistical approaches required to cluster TSSs that operate as transcriptional units and to determine core promoter properties such as shape. The analyses described here focus on maize, since its large and still incompletely annotated genome creates some unique challenges, but with some modifications they can be readily adapted to other organisms as well.
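
The abstract does not name the pipeline's specific algorithms, so the following is only a minimal illustrative sketch of the two computational ideas it mentions: grouping nearby CAGE tag start sites into clusters, and summarizing promoter shape as the width of the region holding the bulk of the tags. The record type `CTSS`, the function names, the 20 bp gap, and the 10%-90% quantile bounds are all assumptions made for illustration, not values taken from the chapter.

```python
from collections import namedtuple

# Hypothetical record: one CAGE tag start site with its supporting tag count.
CTSS = namedtuple("CTSS", ["chrom", "strand", "pos", "count"])


def cluster_ctss(sites, max_gap=20):
    """Group start sites on the same chromosome and strand into one cluster
    whenever neighbouring positions are at most max_gap bp apart.
    The 20 bp default is an assumed, illustrative threshold."""
    clusters = []
    ordered = sorted(sites, key=lambda s: (s.chrom, s.strand, s.pos))
    for site in ordered:
        last = clusters[-1][-1] if clusters else None
        if (last is not None
                and last.chrom == site.chrom
                and last.strand == site.strand
                and site.pos - last.pos <= max_gap):
            clusters[-1].append(site)
        else:
            clusters.append([site])
    return clusters


def interquantile_width(cluster, lower=0.1, upper=0.9):
    """Width (bp) between the positions where the cumulative tag fraction
    first reaches `lower` and `upper`. Small widths suggest a sharp
    promoter, large widths a broad one. Assumes the cluster is sorted by
    position and every count is positive."""
    total = sum(s.count for s in cluster)
    running = 0
    start = end = None
    for s in cluster:
        running += s.count
        frac = running / total
        if start is None and frac >= lower:
            start = s.pos
        if frac >= upper:
            end = s.pos
            break
    return end - start + 1
```

Under these assumptions, a cluster in which one position carries nearly all tags yields a width of 1 bp, while tags spread over several positions yield a larger width; any real cutoff between "sharp" and "broad" would need to be chosen for the data set at hand.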

Keywords: CAGE; Cap analysis of gene expression; Maize; Promoter shape; Transcription factor; Transcription start site.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • DNA, Complementary / genetics
  • Gene Expression Regulation, Plant
  • Genome, Plant*
  • Molecular Biology / methods*
  • RNA, Plant / genetics
  • RNA, Plant / isolation & purification
  • Transcription Initiation Site*
  • Zea mays / genetics*

Substances

  • DNA, Complementary
  • RNA, Plant