Unsupervised Extraction of Stable Expression Signatures from Public Compendia with an Ensemble of Neural Networks

Cell Syst. 2017 Jul 26;5(1):63-71.e6. doi: 10.1016/j.cels.2017.06.003. Epub 2017 Jul 12.

Abstract

Cross-experiment comparisons in public data compendia are challenged by unmatched conditions and technical noise. The ADAGE method, which performs unsupervised integration with denoising autoencoder neural networks, can identify biological patterns, but because ADAGE models, like many neural networks, are over-parameterized, different ADAGE models perform equally well. To enhance model robustness and better build signatures consistent with biological pathways, we developed an ensemble ADAGE (eADAGE) that integrated stable signatures across models. We applied eADAGE to a compendium of Pseudomonas aeruginosa gene expression profiling experiments performed in 78 media. eADAGE revealed a phosphate starvation response controlled by PhoB in media with moderate phosphate and predicted that a second stimulus provided by the sensor kinase, KinB, is required for this PhoB activation. We validated this relationship using both targeted and unbiased genetic approaches. eADAGE, which captures stable biological patterns, enables cross-experiment comparisons that can highlight measured but undiscovered relationships.
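
The ensemble idea in the abstract, combining signatures that recur across independently trained models, can be illustrated with a small sketch. This is not the authors' eADAGE procedure, which the abstract does not detail. It assumes each model is summarized by a gene-by-node weight matrix, simulates those matrices instead of training autoencoders, and uses a simplified matching rule (nodes of the first model act as anchors). The function name `pool_stable_nodes` and all parameters are hypothetical.

```python
import numpy as np


def pool_stable_nodes(weight_matrices):
    """Average matching nodes across models (illustrative simplification).

    weight_matrices: list of (n_genes, n_nodes) arrays, one per model.
    Nodes of the first model serve as anchors; every node of every other
    model is assigned to its most correlated anchor and averaged into it.
    """
    anchors = np.asarray(weight_matrices[0], dtype=float)
    n_nodes = anchors.shape[1]
    sums = anchors.copy()
    counts = np.ones(n_nodes)
    for W in weight_matrices[1:]:
        # Pearson correlation between each anchor node and each node of W.
        corr = np.corrcoef(anchors.T, W.T)[:n_nodes, n_nodes:]
        best = np.argmax(corr, axis=0)  # anchor index for each node of W
        for j, a in enumerate(best):
            sums[:, a] += W[:, j]
            counts[a] += 1
    return sums / counts


# Simulated stand-in for trained models (assumption): every model sees the
# same underlying signatures, but in a different node order and with noise.
rng = np.random.default_rng(0)
true_W = rng.normal(size=(200, 5))  # 200 genes, 5 underlying signatures
models = []
for _ in range(10):
    perm = rng.permutation(5)
    models.append(true_W[:, perm] + 0.3 * rng.normal(size=(200, 5)))

ensemble_W = pool_stable_nodes(models)
```

In this toy setting, averaging matched nodes reduces the model-specific noise, so the pooled weight vectors track the underlying signatures more closely than any single model's nodes do. A fuller treatment would cluster nodes without privileging one model as the anchor.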

Keywords: Pseudomonas aeruginosa; crosstalk; denoising autoencoders; ensemble modeling; gene expression; neural networks; phosphate starvation.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins / metabolism*
  • Gene Expression Profiling
  • Gene Expression Regulation
  • Health Knowledge, Attitudes, Practice
  • Humans
  • Information Storage and Retrieval / trends
  • Neural Networks, Computer*
  • Pseudomonas aeruginosa / physiology*
  • Public Sector
  • Starvation
  • Systems Integration
  • Transcriptome

Substances

  • Bacterial Proteins
  • PhoB protein, Bacteria