Computational dissection of Arabidopsis smRNAome leads to discovery of novel microRNAs and short interfering RNAs associated with transcription start sites

Genomics. 2011 Apr;97(4):235-43. doi: 10.1016/j.ygeno.2011.01.006. Epub 2011 Feb 2.

Abstract

The profiling of small RNAs by high-throughput sequencing (smRNA-Seq) has revealed the complexity of the RNA world. Here, we describe a computational scheme for dissecting the plant smRNAome by integrating smRNA-Seq datasets in Arabidopsis thaliana. Our analytical approach first defines ab initio the genomic loci that produce smRNAs as basic units, then uses principal component analysis (PCA) to predict novel miRNAs. Secondary structure prediction of the candidates' putative precursors revealed a group of long hairpin double-stranded RNAs (lh-dsRNAs) formed by inverted duplications of decayed coding genes. These gene remnants produce miRNA-like small RNAs that are predominantly 21 and 22 nt long, dependent on DCL1 but independent of RDR2 and DCL2/3/4, and associated with AGO1. Additionally, we found two classes of transcription start site-associated (TSSa) RNAs, located on the sense (+) and antisense (-) strands approximately 100-200 bp downstream of TSSs, that are differentially incorporated into AGO1 and AGO4, respectively.
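
The two-step scheme described in the abstract (define smRNA-producing loci ab initio, then apply PCA to per-locus features) could be sketched roughly as follows. This is an illustration only, not the authors' pipeline: the abstract does not give the clustering rule, the feature set, or the PCA implementation, so the 100-bp merge gap, the three features, the read dictionary layout, and all function names here are assumptions. The first principal component is obtained by simple power iteration to keep the sketch dependency-free.

```python
from collections import Counter

def define_loci(reads, max_gap=100):
    """Merge mapped reads into loci (assumed rule): a read joins the current
    locus if it is on the same chromosome and starts within max_gap of its end."""
    loci = []
    for read in sorted(reads, key=lambda r: (r["chrom"], r["start"])):
        last = loci[-1] if loci else None
        if (last and last["chrom"] == read["chrom"]
                and read["start"] - last["end"] <= max_gap):
            last["end"] = max(last["end"], read["end"])
            last["reads"].append(read)
        else:
            loci.append({"chrom": read["chrom"], "start": read["start"],
                         "end": read["end"], "reads": [read]})
    return loci

def locus_features(locus):
    """Hypothetical features: fraction of 21-22 nt reads, fraction of
    plus-strand reads, and share of the most abundant sequence."""
    total = sum(r["count"] for r in locus["reads"])
    short = sum(r["count"] for r in locus["reads"] if len(r["seq"]) in (21, 22))
    plus = sum(r["count"] for r in locus["reads"] if r["strand"] == "+")
    by_seq = Counter()
    for r in locus["reads"]:
        by_seq[r["seq"]] += r["count"]
    return [short / total, plus / total, max(by_seq.values()) / total]

def first_principal_component(rows, iterations=200):
    """Power iteration on the covariance matrix of mean-centered rows.
    Needs at least two rows; returns (unit direction, per-row scores)."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    vec = [1.0] * d  # arbitrary start; assumed not orthogonal to the top axis
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in nxt) ** 0.5
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, scores

# Toy input: made-up reads, not data from the study.
reads = [
    {"chrom": "chr1", "start": 100, "end": 120, "strand": "+", "seq": "A" * 21, "count": 10},
    {"chrom": "chr1", "start": 150, "end": 171, "strand": "-", "seq": "C" * 22, "count": 5},
    {"chrom": "chr1", "start": 1000, "end": 1023, "strand": "+", "seq": "G" * 24, "count": 4},
    {"chrom": "chr2", "start": 50, "end": 73, "strand": "-", "seq": "T" * 24, "count": 8},
]
loci = define_loci(reads)
features = [locus_features(locus) for locus in loci]
direction, scores = first_principal_component(features)
```

In a sketch like this, loci whose scores sit close to those of known miRNA loci along the leading components would be the candidates passed on to secondary structure prediction; how the study actually separated candidates in PCA space is not stated in the abstract.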

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics*
  • Arabidopsis Proteins / genetics
  • Argonaute Proteins
  • Computational Biology
  • MicroRNAs / genetics*
  • Principal Component Analysis / methods
  • RNA, Double-Stranded / genetics
  • RNA, Small Interfering / genetics*
  • RNA-Dependent RNA Polymerase / genetics
  • Sequence Analysis, DNA
  • Transcription Initiation Site*

Substances

  • AGO1 protein, Arabidopsis
  • AGO4 protein, Arabidopsis
  • Arabidopsis Proteins
  • Argonaute Proteins
  • MicroRNAs
  • RNA, Double-Stranded
  • RNA, Small Interfering
  • RDR2 protein, Arabidopsis
  • RNA-Dependent RNA Polymerase