Mining coherent dense subgraphs across massive biological networks for functional discovery

Haiyan Hu; Xifeng Yan; Yu Huang; Jiawei Han; Xianghong Jasmine Zhou

doi:10.1093/bioinformatics/bti1049

Bioinformatics. 2005 Jun;21 Suppl 1:i213-21. doi: 10.1093/bioinformatics/bti1049.

Authors

Haiyan Hu¹, Xifeng Yan, Yu Huang, Jiawei Han, Xianghong Jasmine Zhou

Affiliation

¹ Program in Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA.

PMID: 15961460

Abstract

Motivation: The rapid accumulation of biological network data translates into an urgent need for computational methods for graph pattern mining. One important problem is to identify recurrent patterns across multiple networks to discover biological modules. However, existing algorithms for frequent pattern mining become very costly in time and space as the pattern sizes and network numbers increase. Currently, no efficient algorithm is available for mining recurrent patterns across large collections of genome-wide networks.

Results: We developed a novel algorithm, CODENSE, to efficiently mine frequent coherent dense subgraphs across large numbers of massive graphs. Compared with previous methods, our approach is scalable in the number and size of the input graphs and adjustable in terms of exact or approximate pattern mining. Applying CODENSE to 39 co-expression networks derived from microarray datasets, we discovered a large number of functionally homogeneous clusters and made functional predictions for 169 uncharacterized yeast genes.

Availability: http://zhoulab.usc.edu/CODENSE/

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms
Cluster Analysis
Computational Biology / methods*
Computer Graphics
Fungal Proteins / chemistry
Genes, Fungal
Genome
Genomics / methods*
Oligonucleotide Array Sequence Analysis
Pattern Recognition, Automated
Software
Statistics as Topic

Substances

Fungal Proteins
