A Poisson Log-Normal Model for Constructing Gene Covariation Network Using RNA-seq Data

J Comput Biol. 2017 Jul;24(7):721-731. doi: 10.1089/cmb.2017.0053. Epub 2017 May 30.

Abstract

Constructing expression networks using transcriptomic data is an effective approach for studying gene regulation. A popular approach for constructing such a network is based on the Gaussian graphical model (GGM), in which an edge between a pair of genes indicates that the expression levels of these two genes are conditionally dependent, given the expression levels of all other genes. However, GGMs are not appropriate for non-Gaussian data, such as the count data generated in RNA-seq experiments. We propose a novel statistical framework that maximizes a penalized likelihood, in which the observed count data follow a Poisson log-normal distribution. To overcome the computational challenges, we use Laplace's method to approximate the likelihood and its gradients, and apply the alternating directions method of multipliers to find the penalized maximum likelihood estimates. The proposed method is evaluated and compared with GGMs using both simulated and real RNA-seq data. The proposed method shows improved performance in detecting edges that represent covarying pairs of genes, particularly for edges connecting low-abundance genes and edges around regulatory hubs.
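
To illustrate the role Laplace's method plays here, the following is a minimal sketch for a single gene, not the authors' implementation. It assumes a univariate Poisson log-normal model, y | z ~ Poisson(exp(z)) with z ~ N(mu, s2), whereas the paper works with a multivariate latent vector and a penalized likelihood. The function names, the parameter values, and the brute-force quadrature used as a reference are all hypothetical choices made for this example.

```python
import math


def log_joint(z, y, mu, s2):
    """log p(y | z) + log p(z) for y | z ~ Poisson(exp(z)), z ~ N(mu, s2)."""
    log_pois = y * z - math.exp(z) - math.lgamma(y + 1)
    log_norm = -0.5 * math.log(2 * math.pi * s2) - (z - mu) ** 2 / (2 * s2)
    return log_pois + log_norm


def find_mode(y, mu, s2, tol=1e-12, max_iter=200):
    """Newton's method for the maximizer of log_joint in z.

    The objective is strictly concave in z. Starting at
    max(mu, log(y + 1)) places the iterate to the right of the mode,
    from where the Newton iterates decrease monotonically toward it.
    """
    z = max(mu, math.log(y + 1.0))
    for _ in range(max_iter):
        grad = y - math.exp(z) - (z - mu) / s2   # first derivative in z
        hess = -math.exp(z) - 1.0 / s2           # second derivative, always < 0
        step = grad / hess
        z -= step
        if abs(step) < tol:
            break
    return z


def laplace_log_marginal(y, mu, s2):
    """Laplace approximation to log p(y), the integral of exp(log_joint) over z."""
    z_hat = find_mode(y, mu, s2)
    neg_hess = math.exp(z_hat) + 1.0 / s2
    # Replace the integrand by a Gaussian centered at the mode.
    return log_joint(z_hat, y, mu, s2) + 0.5 * math.log(2 * math.pi / neg_hess)


def quadrature_log_marginal(y, mu, s2, half_width=12.0, n=4001):
    """Reference value of log p(y) from the trapezoidal rule on a wide grid."""
    s = math.sqrt(s2)
    lo = mu - half_width * s
    hi = mu + half_width * s
    dz = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(log_joint(lo + i * dz, y, mu, s2))
    return math.log(total * dz)


if __name__ == "__main__":
    mu, s2 = 1.0, 0.5  # illustrative values, not taken from the paper
    for y in (0, 3, 10):
        lap = laplace_log_marginal(y, mu, s2)
        ref = quadrature_log_marginal(y, mu, s2)
        print(f"y={y}: Laplace={lap:.5f}, quadrature={ref:.5f}")
```

In one dimension the quadrature is cheap, so the approximation buys little. With many genes the latent integral is high-dimensional and grid-based integration is infeasible, which is the computational challenge the abstract says Laplace's method is used to overcome.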

Keywords: Gaussian graphical model; Poisson log-normal distribution; RNA-seq; alternating directions method of multipliers; penalized likelihood.

MeSH terms

  • Algorithms
  • Computer Simulation
  • Ethnicity / genetics*
  • Gene Expression Regulation
  • Gene Regulatory Networks*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Likelihood Functions*
  • Lymphocytes / metabolism
  • Metagenomics
  • Models, Statistical*
  • Poisson Distribution*
  • Sequence Analysis, RNA / methods*