Feature Subset Selection for Cancer Classification Using Weight Local Modularity

Sci Rep. 2016 Oct 5;6:34759. doi: 10.1038/srep34759.

Abstract

Microarrays have recently become an important tool for profiling the global gene expression patterns of tissues. Gene selection is a popular technique in cancer classification: from the thousands of genes that may contribute to the occurrence of cancer, it aims to identify a small number of informative genes that yield high predictive accuracy. This technique has been extensively studied in recent years. This study develops a novel feature selection (FS) method for gene subset selection, called WLMGS, that uses the Weight Local Modularity (WLM) of a complex network. In the proposed method, the discriminative power of a gene subset is evaluated by the weight local modularity of a weighted sample graph built on that subset, which favors subsets in which the intra-class distance is small and the inter-class distance is large. A higher local modularity corresponds to greater discriminative power of the gene subset. Combined with a forward search strategy, this criterion selects genes that are informative as a group for the classification process. Computational experiments show that the proposed algorithm can select a small subset of predictive genes as a group while preserving classification accuracy.
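The abstract does not give the WLM formula or the graph construction, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a k-nearest-neighbour sample graph with inverse-distance edge weights, and it scores a gene subset by the average, over classes, of the share of edge weight incident to a class that stays inside that class. The function names (`sample_graph`, `local_modularity`, `forward_select`), the parameter `k`, and the stopping tolerance are all hypothetical choices made for illustration.

```python
import numpy as np

def sample_graph(X, k=3):
    """Weighted k-nearest-neighbour graph over samples (rows of X).
    Assumption: inverse-distance weights; the paper's weighting may differ."""
    n = X.shape[0]
    # Pairwise Euclidean distances between samples in the chosen gene subspace.
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = [j for j in np.argsort(d[i]) if j != i][:k]
        W[i, nbrs] = 1.0 / (1.0 + d[i, nbrs])
    return np.maximum(W, W.T)  # symmetrise the graph

def local_modularity(W, y):
    """Assumed score: mean over classes of (weight inside the class) /
    (weight incident to the class's samples). Ranges from 0 to 1."""
    classes = np.unique(y)
    score = 0.0
    for c in classes:
        inside = (y == c)
        internal = W[np.ix_(inside, inside)].sum()
        total = W[inside, :].sum()
        if total > 0:
            score += internal / total
    return score / len(classes)

def forward_select(X, y, max_genes=5, k=3, tol=1e-12):
    """Greedy forward search: repeatedly add the gene that most increases
    the score of the whole subset; stop when the gain is negligible."""
    selected, best = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_genes:
        scores = [local_modularity(sample_graph(X[:, selected + [g]], k), y)
                  for g in remaining]
        i = int(np.argmax(scores))
        if scores[i] <= best + tol:
            break
        best = scores[i]
        selected.append(remaining.pop(i))
    return selected, best

# Toy data (made up for illustration): gene 0 separates the two classes,
# genes 1..5 are pure noise.
rng = np.random.default_rng(0)
y = np.array([0] * 10 + [1] * 10)
informative = np.concatenate([np.linspace(0, 1, 10), np.linspace(10, 11, 10)])
X = np.column_stack([informative, rng.normal(size=(20, 5))])
selected, best = forward_select(X, y)
```

Because each candidate gene is scored together with the genes already chosen, the search rewards genes that improve class separation jointly rather than individually, which is the "as a group" property the abstract emphasizes.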

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Databases, Genetic
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation, Neoplastic
  • Gene Regulatory Networks
  • Genetic Predisposition to Disease
  • Humans
  • Neoplasms / classification*
  • Neoplasms / genetics
  • Oligonucleotide Array Sequence Analysis / methods*