Interpreting transcriptional changes using causal graphs: new methods and their practical utility on public networks

BMC Bioinformatics. 2016 Aug 24;17(1):318. doi: 10.1186/s12859-016-1181-8.

Abstract

Background: Inference of active regulatory cascades under specific molecular and environmental perturbations is a recurring task in transcriptional data analysis. Commercial tools that offer this functionality, built on large, manually curated networks of causal relationships, have been used in thousands of articles in the biomedical literature. The adoption and extension of such methods in the academic community have been hampered by the lack of freely available, efficient algorithms and of an accompanying demonstration of their applicability using current public networks.

Results: In this article, we propose a new statistical method that infers likely upstream regulators based on observed patterns of up- and down-regulated transcripts. The method is suitable for use with public interaction networks that contain a mix of signed and unsigned causal edges. It subsumes and extends two previously published approaches, and we provide a novel algorithmic method for efficient statistical inference. Notably, we demonstrate the feasibility of using the approach to generate biological insights given current public networks in the context of controlled in vitro overexpression experiments, stem-cell differentiation data and animal disease models. We also provide an efficient implementation of our method in the R package QuaternaryProd, available for download from Bioconductor.
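
As a rough illustration of the kind of scoring described in the Results, the following Python sketch tallies a candidate regulator's predicted effects ('+', '-', or 'r' for an unsigned edge) against observed transcript changes and estimates a p-value by naive label permutation. The function names, the toy data, and the exact scoring rule (agreements plus regulated unsigned targets minus disagreements) are assumptions made for this example. The permutation loop is a simple stand-in; it is neither the efficient inference algorithm the article proposes nor the QuaternaryProd interface.

```python
import random
from collections import Counter


def quaternary_score(predicted, observed):
    """Score one candidate regulator (illustrative scoring rule, an assumption).

    predicted: dict gene -> '+', '-', or 'r' (unsigned edge); genes not
               listed are treated as unconnected to the regulator.
    observed:  dict gene -> +1 (up), -1 (down), or 0 (unchanged).
    """
    counts = Counter()
    for gene, change in observed.items():
        if change == 0:
            continue  # unchanged transcripts do not contribute to the score
        counts[(predicted.get(gene, "0"), change)] += 1
    agree = counts[("+", 1)] + counts[("-", -1)]
    disagree = counts[("+", -1)] + counts[("-", 1)]
    unsigned = counts[("r", 1)] + counts[("r", -1)]
    return agree + unsigned - disagree


def permutation_p_value(predicted, observed, n_perm=2000, seed=0):
    """One-sided Monte Carlo p-value obtained by shuffling observed labels."""
    rng = random.Random(seed)
    genes = list(observed)
    labels = [observed[g] for g in genes]
    actual = quaternary_score(predicted, observed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if quaternary_score(predicted, dict(zip(genes, labels))) >= actual:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical regulator with two activating, one repressing, one unsigned edge.
predicted = {"A": "+", "B": "+", "C": "-", "D": "r"}
# Hypothetical differential-expression calls for twelve transcripts.
observed = {
    "A": 1, "B": 1, "C": -1, "D": -1, "E": 0, "F": 0,
    "G": 1, "H": 0, "I": 0, "J": -1, "K": 0, "L": 0,
}

score = quaternary_score(predicted, observed)
p_value = permutation_p_value(predicted, observed)
print(score, p_value)
```

In this toy case all four targets are consistent with the regulator being active, so the score is 4. Permutation is used here only because it is easy to read; on networks with thousands of regulators an analytical or otherwise optimized null distribution, such as the one the article describes, would be the practical choice.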

Conclusions: In this work, we have closed an important gap in utilizing causal networks to analyze differentially expressed genes. Our proposed Quaternary test statistic incorporates all available evidence on the potential relevance of an upstream regulator. The new approach not only broadens the use of these types of statistics for highly curated signed networks in which ambiguities arise, but also enables the use of networks with unsigned edges. We design and implement a novel computational method that can efficiently estimate p-values for upstream regulators in current biological settings. We demonstrate the ready applicability of the implemented method to the analysis of differentially expressed genes using publicly available networks.

Keywords: Causal reasoning on biological networks; Gene set enrichment analysis; Inference on gene regulatory networks.

MeSH terms

  • Algorithms*
  • Animals
  • Cell Differentiation / genetics
  • Data Interpretation, Statistical
  • Gene Expression Regulation
  • Gene Regulatory Networks*
  • Humans
  • Stem Cells / cytology
  • Stem Cells / metabolism
  • Transcription, Genetic