Biomarker discovery from high-throughput biological data is one of the major topics in translational biomedicine. Traditional methods focus on differentially expressed genes (node-biomarkers) and ignore genes that are not differentially expressed. However, non-differentially expressed genes also play important roles in biological processes, and rewired interactions (edges) among them may reveal fundamental differences between conditions. It is therefore necessary to identify the relevant interactions or gene pairs in order to elucidate the molecular mechanisms of complex biological phenomena, e.g. to distinguish different phenotypes. To address this issue, we propose a new method, EdgeMarker, based on a new vector representation of an edge, to (1) identify edge-biomarkers, i.e. differentially correlated molecular pairs (e.g. gene pairs) with optimal classification ability, and (2) transform the 'node expression' data in node space into 'edge expression' data in edge space and classify the phenotype of each single sample in edge space, which traditional methods generally cannot achieve. Unlike traditional methods, which analyze the node space (i.e. molecular expression space) or a higher-dimensional space obtained with arbitrary kernel methods, this study provides a mathematical model for exploring the edge space (i.e. correlation space) to classify a single sample. We show that the identified edge-biomarkers can distinguish normal from disease samples well even when none of the involved genes is significantly differentially expressed. Analysis of a human cholangiocarcinoma dataset and a diabetes dataset also suggests that the identified edge-biomarkers may provide new biological insights into the pathogenesis of complex human diseases.
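The node-to-edge transformation mentioned above can be illustrated with a minimal sketch. It is not the EdgeMarker formulation itself, which the abstract does not spell out; it assumes one simple per-sample edge value, the product of two genes' z-scores computed against a reference group, whose average over the reference samples equals their Pearson correlation. The function name `edge_expression` and its parameters are hypothetical.

```python
import numpy as np


def edge_expression(X, ref, pairs):
    """Map node expression to a per-sample edge representation.

    ASSUMPTION: illustrative construction only, not the paper's definition.
    X     : (n_samples, n_genes) expression matrix to transform
    ref   : (n_ref, n_genes) reference samples (e.g. one phenotype group)
    pairs : list of (i, j) gene-index pairs, i.e. the candidate edges
    Returns an (n_samples, n_pairs) matrix of edge values.
    """
    mu = ref.mean(axis=0)
    sd = ref.std(axis=0)  # population std (ddof=0)
    Z = (X - mu) / sd     # z-score each gene against the reference group
    i, j = np.asarray(pairs).T
    # Product of z-scores: one value per sample and per edge.
    return Z[:, i] * Z[:, j]


# Usage on synthetic data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
reference = rng.normal(size=(50, 4))
new_sample = rng.normal(size=(1, 4))
edges = [(0, 1), (2, 3)]
single_sample_edges = edge_expression(new_sample, reference, edges)
```

Under this assumption every sample, including a single new one, receives a feature vector indexed by gene pairs, so any standard classifier can then be trained in edge space rather than node space.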
Keywords: Differential network; Differentially correlated molecular pair; Edge-biomarker; Network biomarker; Single sample.
Copyright © 2014 Elsevier Ltd. All rights reserved.