Discovering functional gene expression patterns in the metabolic network of Escherichia coli with wavelets transforms

BMC Bioinformatics. 2006 Mar 8;7:119. doi: 10.1186/1471-2105-7-119.

Abstract

Background: Microarray technology produces gene expression data on a genomic scale for a wide variety of organisms and conditions. However, this vast amount of information needs to be extracted systematically and condensed into manageable and functionally meaningful patterns. Genes may be reasonably grouped using knowledge about their interaction behaviour. On a proteomic level, biochemical research has elucidated an increasingly complete picture of the metabolic architecture, especially for less complex organisms such as the well-studied bacterium Escherichia coli.

Results: We sought to discover central components of the metabolic network that are regulated by the expression of associated genes under changing conditions. We mapped gene expression data from E. coli under aerobic and anaerobic conditions onto the enzymatic reaction nodes of its metabolic network. An adjacency matrix of the metabolites was created from this graph. A consecutive ones clustering method was used to obtain network clusters in the matrix. Wavelet transforms were applied to the adjacency matrices of these clusters to collect features for the classifier. A feature extraction method was then used to select the most discriminating features. From these top-ranking features we obtained network sub-graphs representing formate fermentation, in good agreement with the anaerobic response of hetero-fermentative bacteria. Furthermore, we found a switch in the starting point for NAD biosynthesis, and an adaptation of the L-aspartate metabolism, in accordance with its higher abundance under anaerobic conditions.
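
The idea of turning an expression-weighted adjacency matrix into wavelet features can be illustrated with a minimal sketch. It assumes a single-level, unnormalised Haar transform and a toy four-metabolite chain. The function names (`weighted_adjacency`, `haar_2d`, `rank_features`), the gene labels, and the expression values are all hypothetical. The absolute-difference ranking is only a simple stand-in; it is not the classifier or feature selection method used in the study.

```python
def haar_1d(values):
    """One level of an unnormalised Haar transform of an even-length list.

    Returns pairwise averages followed by pairwise half-differences.
    """
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages + details


def haar_2d(matrix):
    """Apply one Haar level to every row, then to every column."""
    rows = [haar_1d(row) for row in matrix]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def weighted_adjacency(metabolites, reactions, expression):
    """Build a symmetric metabolite adjacency matrix.

    Each reaction is (gene, substrate, product); the matrix entry linking
    the two metabolites holds the expression value of that gene.
    """
    index = {name: i for i, name in enumerate(metabolites)}
    size = len(metabolites)
    matrix = [[0.0] * size for _ in range(size)]
    for gene, substrate, product in reactions:
        i, j = index[substrate], index[product]
        matrix[i][j] = matrix[j][i] = expression[gene]
    return matrix


def wavelet_features(matrix):
    """Flatten the 2D Haar coefficients into one feature vector."""
    return [value for row in haar_2d(matrix) for value in row]


def rank_features(features_a, features_b):
    """Order feature indices by absolute difference between two conditions.

    Assumption: a crude stand-in for a real feature selection method.
    """
    diffs = [abs(a - b) for a, b in zip(features_a, features_b)]
    return sorted(range(len(diffs)), key=lambda k: diffs[k], reverse=True)


# Hypothetical toy cluster: a linear chain A - B - C - D.
metabolites = ["A", "B", "C", "D"]
reactions = [("g1", "A", "B"), ("g2", "B", "C"), ("g3", "C", "D")]

# Hypothetical expression values for the two conditions.
aerobic = {"g1": 1.0, "g2": 1.0, "g3": 1.0}
anaerobic = {"g1": 1.0, "g2": 1.0, "g3": 3.0}

aerobic_features = wavelet_features(
    weighted_adjacency(metabolites, reactions, aerobic))
anaerobic_features = wavelet_features(
    weighted_adjacency(metabolites, reactions, anaerobic))

ranking = rank_features(aerobic_features, anaerobic_features)
print("Most discriminating coefficient index:", ranking[0])
```

Because the Haar transform is linear, a change in a single gene's expression spreads over only a few coefficients, which is what makes individual coefficients usable as localised features of a network cluster.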

Conclusion: We developed and tested a novel method, based on a combination of rationally chosen machine learning methods, to analyse gene expression data on the basis of interaction data, using a metabolic network of enzymes. As a case study, we applied our method to E. coli under oxygen-deprived conditions and extracted physiologically relevant patterns that represent an adaptation of the cells to changing environmental conditions. In general, our concept may be transferred to network analyses of other biological interaction data whenever data for two comparable states of the associated nodes are available.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Anaerobiosis / physiology
  • Computer Simulation
  • Energy Metabolism / physiology
  • Escherichia coli / metabolism*
  • Escherichia coli Proteins / metabolism*
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation, Bacterial / physiology*
  • Models, Biological*
  • Oxygen / metabolism
  • Protein Interaction Mapping / methods
  • Signal Transduction / physiology*

Substances

  • Escherichia coli Proteins
  • Oxygen