PICARA, an analytical pipeline providing probabilistic inference about a priori candidate genes underlying genome-wide association QTL in plants

PLoS One. 2012;7(11):e46596. doi: 10.1371/journal.pone.0046596. Epub 2012 Nov 7.

Abstract

PICARA is an analytical pipeline designed to systematically summarize observed SNP/trait associations identified by genome-wide association studies (GWAS) and to identify candidate genes involved in the regulation of complex trait variation. The pipeline provides probabilistic inference about a priori candidate genes using integrated information derived from genome-wide association signals, gene homology, and curated gene sets embedded in pathway descriptions. In this paper, we demonstrate the performance of PICARA using data for flowering time variation in maize, a key trait for the geographical and seasonal adaptation of plants. Among 406 curated flowering time-related genes from Arabidopsis, we identify 61 orthologs in maize that are significantly enriched for GWAS SNP signals, including key regulators such as FT (Flowering Locus T) and GI (GIGANTEA), and genes central to the Arabidopsis circadian pathway, including TOC1 (Timing of CAB Expression 1) and LHY (Late Elongated Hypocotyl). In addition, we discover a regulatory feature that is characteristic of these a priori flowering time candidates in maize. This new probabilistic analytical pipeline helps researchers infer the functional significance of candidate genes associated with complex traits and helps guide future experiments by providing statistical support for gene candidates based on the integration of heterogeneous biological information.
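
As a rough illustration of what "enriched for GWAS SNP signals" can mean for a candidate gene set, the sketch below computes a one-sided hypergeometric enrichment p-value. This is not PICARA's own model, which the abstract describes only as probabilistic inference integrating GWAS signals, gene homology, and pathway gene sets; the hypergeometric test is a generic stand-in. The function name `enrichment_p_value` and all counts in the example are hypothetical and are not taken from the paper.

```python
from math import comb


def enrichment_p_value(hits_in_set, set_size, hits_total, genes_total):
    """One-sided hypergeometric p-value, P(X >= hits_in_set).

    genes_total : number of genes in the background (hypothetical input)
    hits_total  : background genes carrying at least one GWAS signal
    set_size    : number of genes in the a priori candidate set
    hits_in_set : candidate genes carrying at least one GWAS signal
    """
    lower = max(hits_in_set, 0)
    upper = min(set_size, hits_total)
    denominator = comb(genes_total, set_size)
    # Sum the probability mass over all outcomes at least as extreme as observed.
    numerator = sum(
        comb(hits_total, i) * comb(genes_total - hits_total, set_size - i)
        for i in range(lower, upper + 1)
    )
    return numerator / denominator


# Toy counts chosen only for illustration (not data from the study):
# 1000 background genes, 100 with a GWAS signal, 50 candidates, 12 with a signal.
p = enrichment_p_value(hits_in_set=12, set_size=50, hits_total=100, genes_total=1000)
print(f"enrichment p-value: {p:.3g}")
```

A small p-value here would indicate that the candidate set carries more GWAS signals than expected from a random gene set of the same size. A real analysis would also need to account for gene length, SNP density, and linkage disequilibrium, which this sketch ignores.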

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Arabidopsis / genetics*
  • Arabidopsis Proteins / genetics*
  • Gene Expression Regulation, Plant
  • Genes, Plant
  • Models, Genetic
  • Polymorphism, Single Nucleotide
  • Probability
  • Quantitative Trait Loci*

Substances

  • Arabidopsis Proteins

Grants and funding

This research is supported by the National Science Foundation (NSF) Plant Genome Research Grant 0703908. SM is also funded by NSF Plant Genome Research Grant 1026555. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.