A Combined PLS and Negative Binomial Regression Model for Inferring Association Networks from Next-Generation Sequencing Count Data

IEEE/ACM Trans Comput Biol Bioinform. 2018 May-Jun;15(3):760-773. doi: 10.1109/TCBB.2017.2665495. Epub 2017 Feb 7.

Abstract

A major challenge in genomics is to detect interactions that reflect functional associations from large-scale observations. In this study, we propose a new algorithm, cPLS, that combines a partial least squares approach with negative binomial regression to reconstruct a genomic association network from high-dimensional next-generation sequencing count data. The approach is applicable to raw count data and requires no further pre-processing steps. In the settings investigated, the cPLS algorithm outperformed two widely used comparison methods, graphical lasso and weighted correlation network analysis. In addition, cPLS can estimate the full network for thousands of genes without a major computational load. Finally, we demonstrate that cPLS is capable of finding biologically meaningful associations by analyzing an example data set from a previously published study of the molecular anatomy of craniofacial development.
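The abstract does not spell out the steps of cPLS, so the following is only a rough sketch of how a partial least squares step and a negative binomial regression could be combined to score gene-gene associations from raw counts. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the function names `nb_irls` and `association_scores`, the single PLS component, the log1p scaling of the predictors, the fixed dispersion `alpha`, and the symmetrization of the scores.

```python
import numpy as np


def nb_irls(design, y, alpha=0.5, n_iter=25):
    """Log-link negative binomial GLM with fixed dispersion alpha, fitted by IRLS.

    Variance is assumed to be mu + alpha * mu**2 (an illustrative choice).
    """
    beta = np.zeros(design.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)  # start from the intercept-only fit
    ridge = 1e-8 * np.eye(design.shape[1])  # tiny ridge keeps the solve stable
    for _ in range(n_iter):
        eta = np.clip(design @ beta, -20.0, 20.0)
        mu = np.exp(eta)
        w = mu / (1.0 + alpha * mu)      # IRLS weights for the log link
        z = eta + (y - mu) / mu          # working response
        xtw = design.T * w
        beta = np.linalg.solve(xtw @ design + ridge, xtw @ z)
    return beta


def association_scores(counts, alpha=0.5):
    """Score each gene pair with a one-component PLS step plus an NB fit.

    counts: (samples, genes) array of raw, non-negative integer counts.
    Returns a symmetric (genes, genes) score matrix with a zero diagonal.
    """
    n, p = counts.shape
    logx = np.log1p(counts)
    x = (logx - logx.mean(axis=0)) / (logx.std(axis=0) + 1e-8)
    scores = np.zeros((p, p))
    for j in range(p):
        others = np.delete(np.arange(p), j)
        xo = x[:, others]
        # PLS1-style weight vector: covariance of predictors with the response
        # (the standardized log-scale response is used here, an assumption).
        w = xo.T @ x[:, j]
        w = w / (np.linalg.norm(w) + 1e-12)
        t = xo @ w                        # first latent component
        design = np.column_stack([np.ones(n), t])
        beta = nb_irls(design, counts[:, j].astype(float), alpha=alpha)
        # Map the component coefficient back onto the individual genes.
        scores[j, others] = beta[1] * w
    return (scores + scores.T) / 2.0      # symmetrize the two directions


# Usage on simulated counts (hypothetical data, not from the study).
rng = np.random.default_rng(0)
demo_counts = rng.negative_binomial(5, 0.3, size=(40, 6))
demo_scores = association_scores(demo_counts)
print(demo_scores.shape)
```

In a sketch like this, the raw counts enter the negative binomial fit directly as the response, which mirrors the abstract's point that no separate pre-processing of the counts is needed. Larger absolute scores would be read as stronger candidate associations, and choosing a threshold for them is left open here.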

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Databases, Genetic
  • Gene Expression Profiling
  • Gene Regulatory Networks* / genetics
  • Gene Regulatory Networks* / physiology
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Least-Squares Analysis
  • Maxillofacial Development / genetics
  • Maxillofacial Development / physiology
  • Models, Biological
  • Models, Statistical*
  • Oligonucleotide Array Sequence Analysis