Efficient learning by combining confidence-rated classifiers to incorporate unlabeled medical data

Med Image Comput Comput Assist Interv. 2005;8(Pt 1):745-52. doi: 10.1007/11566465_92.

Abstract

In this paper, we propose a new dynamic learning framework that starts from a small amount of labeled data, then incrementally identifies informative unlabeled examples to be hand-labeled and adds them to the training set to improve learning performance. This approach has great potential to reduce the training expense in many medical image analysis applications. The main contributions are a new strategy for combining confidence-rated classifiers learned on different feature sets and a robust way to evaluate the "informativeness" of each unlabeled example. We apply the framework to the problem of classifying microscopic cell images. The experimental results show that 1) our strategy is more effective than simply multiplying the predicted probabilities, 2) the error rate of high-confidence predictions is much lower than the average error rate, and 3) hand-labeling informative examples with low-confidence predictions improves performance efficiently, with only a very small performance gap relative to hand-labeling all unlabeled data.
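The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the proposed combination strategy or the informativeness measure, so the code below shows only the baseline it names (multiplying the predicted probabilities from classifiers trained on different feature sets) together with an assumed top-two margin as the confidence score. All function names are hypothetical and are not taken from the paper.

```python
from math import prod


def combine_by_product(prob_rows):
    """Baseline from the abstract: multiply per-classifier class
    probabilities, then renormalize so they sum to 1.

    prob_rows holds one probability vector per classifier (one per feature set).
    """
    n_classes = len(prob_rows[0])
    joint = [prod(row[c] for row in prob_rows) for c in range(n_classes)]
    total = sum(joint)
    if total == 0:
        # Every class was vetoed by some classifier; fall back to uniform.
        return [1.0 / n_classes] * n_classes
    return [p / total for p in joint]


def confidence(probs):
    """Assumed confidence score: margin between the two most probable
    classes. Requires at least two classes."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def select_for_labeling(unlabeled_probs, budget):
    """Return the indices of the `budget` least confident unlabeled examples,
    i.e. the ones a human would be asked to hand-label next."""
    scored = [
        (confidence(combine_by_product(rows)), i)
        for i, rows in enumerate(unlabeled_probs)
    ]
    scored.sort()  # lowest confidence first; ties broken by index
    return [i for _, i in scored[:budget]]


if __name__ == "__main__":
    # Three unlabeled examples, each scored by two classifiers over two classes.
    unlabeled = [
        [[0.9, 0.1], [0.8, 0.2]],  # both classifiers agree strongly
        [[0.5, 0.5], [0.6, 0.4]],  # both classifiers are unsure
        [[0.2, 0.8], [0.1, 0.9]],  # both classifiers agree strongly
    ]
    print(select_for_labeling(unlabeled, budget=1))
```

In a full loop, the selected examples would be hand-labeled, added to the training set, and the classifiers retrained before the next round of selection.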

MeSH terms

  • Algorithms*
  • Animals
  • Artificial Intelligence*
  • Cell Count / methods*
  • Cell Enlargement
  • Cell Physiological Phenomena*
  • Cells, Cultured
  • Confidence Intervals
  • Documentation / methods*
  • Humans
  • Image Enhancement / methods*
  • Image Interpretation, Computer-Assisted / methods*
  • Information Storage and Retrieval / methods
  • Microscopy / methods*
  • Pattern Recognition, Automated / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity