Large-scale integration of microRNA and gene expression data for identification of enriched microRNA-mRNA associations in biological systems

Methods Mol Biol. 2010:667:297-315. doi: 10.1007/978-1-60761-811-9_20.

Abstract

The discovery of microRNAs (miRNAs) revealed a hidden layer of gene regulation that is able to integrate multiple genes into biologically meaningful networks. A number of computational prediction programs have been developed to identify putative miRNA targets. Collectively, the miRNAs that have been discovered so far have the potential to target over 60% of genes in our genome. A minimum of six consecutive nucleotides in the 5'-seed (nucleotides 2-8) of the miRNA must bind through complementary base pairing to the 3'-untranslated regions (3'-UTRs) of target genes. Given the small sequence match required, a given miRNA has the potential to target hundreds of genes and a given mRNA can have 0-50 miRNA binding sites. The major limitations of these prediction programs remain the low-throughput nature of the query design (gene by gene or miRNA by miRNA) and a fairly high rate of false positives and false negatives, as uncovered by the limited number of functional studies. Programs that integrate genome-wide gene and miRNA expression data, determined by microarray and/or next-generation sequencing (NGS) technologies, with the publicly available target prediction algorithms are extremely valuable on two fronts. First, they allow the investigator to fully capitalize on all the data generated to reveal new genes and pathways underlying the biological process under study. Second, these programs allow the investigator to lift a small network of genes they are currently following into a larger network through the integrative properties of miRNAs. In this chapter, we discuss the latest methodologies for determining genome-wide miRNA and gene expression changes and three programs (Sigterms, CORNA, and MMIA) that allow the investigator to generate short lists of enriched miRNA:target mRNA candidates for large-scale miRNA:target mRNA validation. These efforts are essential for determining the false positive and false negative rates of existing algorithms and for refining our knowledge of the rules governing miRNA-mRNA relationships.
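
To make the seed-pairing rule above concrete, the following minimal Python sketch scans a 3'-UTR for perfect matches to a miRNA seed. It is only an illustration of the six-nucleotide rule stated in the abstract, not the method used by Sigterms, CORNA, MMIA, or any target prediction program; the function names `seed_site` and `find_seed_sites` are hypothetical, the 3'-UTR string is made up, and only Watson-Crick pairing is assumed (no G:U wobble, conservation, or site-context scoring).

```python
# Watson-Crick complement table for RNA (assumption: no G:U wobble pairing).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str, k: int = 6, start: int = 2) -> str:
    """Return the mRNA sequence (5'->3') that pairs perfectly with
    miRNA nucleotides start..start+k-1 (1-based, counted from the 5' end)."""
    mirna = mirna.upper().replace("T", "U")
    seed = mirna[start - 1 : start - 1 + k]
    # Pairing is antiparallel, so the target site is the reverse complement.
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_sites(mirna: str, utr: str, k: int = 6) -> list[int]:
    """Return 0-based start positions of every perfect k-mer seed match in utr."""
    utr = utr.upper().replace("T", "U")
    site = seed_site(mirna, k)
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)  # allow overlapping matches
    return positions


# hsa-let-7a-5p mature sequence; the 3'-UTR below is a made-up toy example.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
TOY_UTR = "AAACUACCUCAGGGUACCUCUU"

six_mer_hits = find_seed_sites(LET7A, TOY_UTR, k=6)    # nucleotides 2-7
seven_mer_hits = find_seed_sites(LET7A, TOY_UTR, k=7)  # nucleotides 2-8
```

In this toy input the 6-mer rule finds two candidate sites while the stricter 7-mer rule (nucleotides 2-8) keeps only one, which illustrates why such a short match can yield many putative targets and why predictions of this kind need integration with expression data and functional validation.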

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Animals
  • Base Sequence
  • Computational Biology / methods*
  • Gene Expression Profiling / instrumentation
  • Gene Expression Profiling / methods
  • Gene Expression*
  • Humans
  • MicroRNAs* / genetics
  • MicroRNAs* / metabolism
  • Microarray Analysis / instrumentation
  • Microarray Analysis / methods
  • RNA, Messenger* / genetics
  • RNA, Messenger* / metabolism
  • Software

Substances

  • MicroRNAs
  • RNA, Messenger