Ranking genes by their co-expression to subsets of pathway members

Ann N Y Acad Sci. 2009 Mar:1158:1-13. doi: 10.1111/j.1749-6632.2008.03747.x.

Abstract

Cellular processes are often carried out by intricate systems of interacting genes and proteins. Some of these systems are well studied and described in pathway databases, whereas the roles and functions of most genes remain poorly understood. A large compendium of public microarray data is available that covers a variety of conditions, samples, and tissues and provides a rich source of genome-scale information. We focus our study on the analysis of 35 curated biological pathways in the context of gene co-expression over a large variety of biological conditions. By defining a global co-expression similarity rank for each gene and pathway, we perform exhaustive leave-one-out computations to describe existing pathway memberships using other members of the corresponding pathway as reference. We demonstrate that, while successful in recovering basic biological processes such as metabolism and translation, the global correlation measure fails to detect gene memberships in signaling pathways, where co-expression is less evident. Our results also show that pathway membership detection is more effective when only a subset of the corresponding pathway members is used as reference, supporting the existence of more tightly co-expressed subsets of genes within pathways. Our study assesses the predictive power of global gene expression correlation measures in reconstructing biological systems of various functions and specificity. The developed computational network has immediate applications in detecting dubious pathway members and predicting novel member candidates.
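The leave-one-out ranking described above can be illustrated with a minimal sketch. The abstract does not specify how the similarity rank is computed, so everything here is an assumption made for illustration: Pearson correlation as the co-expression measure, the mean correlation to the reference genes as the score, and the function names `coexpression_rank` and `leave_one_out_ranks`. The toy expression matrix is invented and is not data from the study.

```python
import numpy as np


def coexpression_rank(corr, reference, target):
    """Rank of `target` among all non-reference genes (1 = most co-expressed).

    Assumption: a gene's score is its mean Pearson correlation to the
    reference genes; the study's actual rank definition may differ.
    """
    reference = np.asarray(reference)
    scores = corr[:, reference].mean(axis=1)
    candidates = np.setdiff1d(np.arange(corr.shape[0]), reference)
    # One plus the number of candidates scoring strictly higher than the target.
    return 1 + int(np.sum(scores[candidates] > scores[target]))


def leave_one_out_ranks(expr, pathway):
    """Hold out each pathway member in turn and rank it against the rest."""
    corr = np.corrcoef(expr)  # rows are genes, columns are conditions
    ranks = {}
    for gene in pathway:
        reference = [g for g in pathway if g != gene]
        ranks[gene] = coexpression_rank(corr, reference, gene)
    return ranks


# Toy data (hypothetical): genes 0-2 follow a shared trend, genes 3-5 do not.
signal = np.arange(1.0, 9.0)
alternating = np.array([1.0, -1.0] * 4)
expr = np.array([
    signal,
    signal + 0.1 * alternating,
    2 * signal + np.array([0, 0.2, 0, -0.2, 0, 0.2, 0, -0.2]),
    alternating,
    signal[::-1],  # anti-correlated with the shared trend
    np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]),
])
pathway = [0, 1, 2]
ranks = leave_one_out_ranks(expr, pathway)
print(ranks)
```

Because `coexpression_rank` accepts any list of reference genes, the same function can score a candidate against a subset of pathway members rather than the full membership, which is the comparison the abstract reports as more effective.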

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Computer Simulation
  • Databases, Genetic
  • Gene Expression Profiling / methods
  • Gene Expression*
  • Gene Regulatory Networks*
  • Humans
  • Metabolic Networks and Pathways / genetics*
  • Oligonucleotide Array Sequence Analysis / methods
  • ROC Curve
  • Signal Transduction