Enhancing instance-based classification with local density: a new algorithm for classifying unbalanced biomedical data

Bioinformatics. 2006 Apr 15;22(8):981-8. doi: 10.1093/bioinformatics/btl027. Epub 2006 Jan 27.

Abstract

Motivation: Classification is an important data mining task in biomedicine. In particular, classification of biomedical data often requires separating pathological from healthy samples with the highest possible discriminatory performance for diagnostic purposes. Even more important than overall accuracy is the balance of a classifier, particularly when the classes in a dataset are of unequal size.

Results: We present a novel instance-based classification technique that takes into account both the differing local density of data objects and local cluster structures. Our method, which adopts the basic ideas of density-based outlier detection, determines the local point density in the neighborhood of the object to be classified and of all clusters in the corresponding region. A data object is assigned to the class whose local cluster structure it fits best. Experimental evaluation on biomedical data shows that our approach outperforms most popular classification methods.
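
The general idea of assigning an object to the class whose local density it matches best can be illustrated with a minimal sketch. This is not the LCF algorithm itself, whose details are not given in this abstract. It assumes a simplified outlier-style score: the mean distance from the query to its k nearest neighbors within one class, divided by the mean of the same quantity for those neighbors. The names `local_fit_score`, `classify`, and the parameter `k` are hypothetical, as is the toy dataset.

```python
import numpy as np

def mean_knn_distance(point, others, k):
    """Mean Euclidean distance from `point` to its k nearest rows in `others`."""
    dists = np.linalg.norm(others - point, axis=1)
    return np.sort(dists)[:k].mean()

def local_fit_score(query, class_points, k):
    """Simplified outlier-style score of `query` relative to one class.

    Values near or below 1 mean the query is about as tightly packed as its
    neighbors in that class; large values mean it lies in a locally sparser
    spot than they do. This is an assumed stand-in, not the published LCF score.
    """
    dists = np.linalg.norm(class_points - query, axis=1)
    neighbor_idx = np.argsort(dists)[:k]
    query_spacing = dists[neighbor_idx].mean()
    neighbor_spacings = []
    for i in neighbor_idx:
        # Exclude the neighbor itself when measuring its own neighborhood.
        others = np.delete(class_points, i, axis=0)
        neighbor_spacings.append(mean_knn_distance(class_points[i], others, k))
    eps = 1e-12  # guards against division by zero for duplicate points
    return query_spacing / (np.mean(neighbor_spacings) + eps)

def classify(query, X, y, k=3):
    """Assign `query` to the class with the lowest local fit score."""
    scores = {label: local_fit_score(query, X[y == label], k)
              for label in np.unique(y)}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical unbalanced toy data: a sparse majority class (label 0)
# and a small, dense minority class (label 1).
X = np.array([[4, 4], [4, 6], [6, 4], [6, 6], [5, 5], [8, 8], [2, 8], [8, 2],
              [0, 0], [0.2, 0], [0, 0.2], [0.2, 0.2]], dtype=float)
y = np.array([0] * 8 + [1] * 4)

label, scores = classify(np.array([0.1, 0.1]), X, y, k=3)
print(label, scores)
```

Because the score is computed separately inside each class, a small but dense class is not outvoted merely because the other class has more samples, which is the property the abstract describes as balance on datasets of unequal class size.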

Availability: The LCF algorithm is available for testing at http://biomed.umit.at/upload/lcfx.zip.

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Cluster Analysis
  • Computer Simulation
  • Database Management Systems*
  • Databases, Factual*
  • Information Storage and Retrieval / methods*
  • Models, Biological*
  • Pattern Recognition, Automated / methods*