Classifier ensemble selection based on affinity propagation clustering

J Biomed Inform. 2016 Apr:60:234-42. doi: 10.1016/j.jbi.2016.02.010. Epub 2016 Feb 23.

Abstract

In high-dimensional data, only a small number of features are significantly correlated with the class label. This paper proposes an ensemble feature selection method based on cluster grouping. Classification-related features are first chosen using a ranking aggregation technique. These features are then divided into mutually unrelated groups by an affinity propagation clustering algorithm using the biweight midcorrelation (bicor) coefficient. Diverse and discriminative feature subsets are constructed by randomly selecting one feature from each group, and these subsets are used to train base classifiers. Finally, the base classifiers with better classification performance are selected using the kappa coefficient and combined by majority voting. Experimental results on five gene expression datasets show that the proposed method achieves low classification error rates, stable classification performance and strong scalability, as measured by the sensitivity, specificity, accuracy and G-Mean criteria.
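
The last three steps of the pipeline (subset construction, kappa-based selection, majority voting) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the feature groups have already been produced by the clustering step, that each base classifier's predictions on a labelled validation set are available, and that kappa is computed between those predictions and the true labels; the abstract does not spell out this last point. All function names and the parameters `n_subsets`, `seed` and `k` are hypothetical.

```python
import random
from collections import Counter


def sample_feature_subsets(groups, n_subsets, seed=0):
    """Build each subset by drawing one feature at random from every group.

    `groups` is a list of lists of feature names (assumed to come from the
    clustering step, which is not reproduced here).
    """
    rng = random.Random(seed)
    return [[rng.choice(group) for group in groups] for _ in range(n_subsets)]


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    # Observed agreement between predictions and reference labels.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    p_e = sum(true_counts[lab] * pred_counts[lab] for lab in labels) / (n * n)
    if p_e == 1.0:
        # Kappa is undefined when only one label occurs; return 0.0 by convention.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


def select_top_by_kappa(y_true, predictions, k):
    """Return the indices of the k classifiers with the highest kappa.

    Assumption: kappa is measured against the true validation labels.
    """
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: cohen_kappa(y_true, predictions[i]),
        reverse=True,
    )
    return ranked[:k]


def majority_vote(predictions):
    """Combine per-classifier predictions sample by sample.

    Ties are broken in favour of the label encountered first.
    """
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]
```

In use, each subset returned by `sample_feature_subsets` would train one base classifier; `select_top_by_kappa` then keeps the best `k` of them, and `majority_vote` is applied to the predictions of the kept classifiers only. Choosing an odd `k` avoids ties in two-class problems.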

Keywords: Affinity propagation clustering; Classification; Ensemble feature selection; Kappa correlation; Ranking aggregation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cluster Analysis*
  • Databases, Genetic
  • Gene Expression Regulation*
  • Humans
  • Medical Informatics / methods*
  • Models, Statistical
  • Neoplasms / genetics*
  • Reproducibility of Results
  • Sensitivity and Specificity