Retrieval evaluation and distance learning from perceived similarity between endomicroscopy videos

Med Image Comput Comput Assist Interv. 2011;14(Pt 3):297-304. doi: 10.1007/978-3-642-23626-6_37.

Abstract

Evaluating content-based retrieval (CBR) is challenging because it requires an adequate ground truth. When the available ground truth is limited to textual metadata such as pathological classes, retrieval results can only be evaluated indirectly, for example in terms of classification performance. In this study we first present a tool to generate a perceived-similarity ground truth that enables direct evaluation of endomicroscopic video retrieval. This tool uses a four-point Likert scale and collects subjective pairwise similarities perceived by multiple expert observers. We then evaluate a previously developed dense bag-of-visual-words method for endomicroscopic video retrieval against the generated ground truth. Confirming the results of a previous indirect evaluation based on classification, our direct evaluation shows that this method significantly outperforms several other state-of-the-art CBR methods. In a second step, we propose to improve the CBR method by learning an adjusted similarity metric from the perceived-similarity ground truth. By minimizing a margin-based cost function that differentiates similar and dissimilar video pairs, we learn a weight vector applied to the visual-word signatures of videos. Using cross-validation, we demonstrate that the learned similarity distance is significantly better correlated with the perceived similarity than the original visual-word-based distance.
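The weight-learning step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance, the cost function, or the optimizer, so the weighted squared-difference distance, the hinge-style cost with a fixed threshold, the projected subgradient descent, the binarization of Likert ratings into similar (+1) and dissimilar (-1) pairs, and all names and toy values below are assumptions made for illustration.

```python
# Sketch only: learn non-negative per-visual-word weights from labeled pairs.
# Assumed (not from the source): distance form, hinge cost, threshold, optimizer.

def weighted_distance(w, x, y):
    """Weighted squared difference between two visual-word signatures."""
    return sum(wk * (xk - yk) ** 2 for wk, xk, yk in zip(w, x, y))

def hinge_cost(w, pairs, threshold=1.0, margin=0.5):
    """Mean hinge cost: similar pairs (label +1) should fall below
    threshold - margin, dissimilar pairs (label -1) above threshold + margin."""
    total = 0.0
    for x, y, label in pairs:
        d = weighted_distance(w, x, y)
        total += max(0.0, margin + label * (d - threshold))
    return total / len(pairs)

def learn_weights(pairs, dim, threshold=1.0, margin=0.5, lr=0.1, epochs=200):
    """Projected subgradient descent on the hinge cost, keeping weights >= 0."""
    w = [1.0] * dim  # start from the unweighted distance
    for _ in range(epochs):
        grad = [0.0] * dim
        for x, y, label in pairs:
            d = weighted_distance(w, x, y)
            if margin + label * (d - threshold) > 0:  # pair violates the margin
                for k in range(dim):
                    grad[k] += label * (x[k] - y[k]) ** 2
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, wk - lr * gk / len(pairs)) for wk, gk in zip(w, grad)]
    return w

# Hypothetical toy data: word 0 separates the classes, word 1 is noise.
pairs = [
    ([0.0, 0.0], [0.1, 2.0], +1),  # perceived as similar
    ([1.0, 2.0], [1.0, 0.0], +1),  # perceived as similar
    ([0.0, 0.0], [2.0, 0.0], -1),  # perceived as dissimilar
    ([0.0, 1.0], [2.0, 1.1], -1),  # perceived as dissimilar
]

initial_w = [1.0, 1.0]
learned_w = learn_weights(pairs, dim=2)

print("cost before:", hinge_cost(initial_w, pairs))
print("cost after: ", hinge_cost(learned_w, pairs))
print("learned weights:", learned_w)
```

In this toy setting the noisy word is down-weighted so that the similar pairs end up closer than the dissimilar ones. A real evaluation would, as the abstract states, use cross-validation and measure the correlation between the learned distance and the perceived similarity.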

MeSH terms

  • Algorithms
  • Colonic Polyps / diagnosis*
  • Databases, Factual
  • Diagnostic Imaging / methods*
  • Education, Distance
  • Endoscopy / methods*
  • Humans
  • Information Storage and Retrieval
  • Microscopy / methods*
  • Models, Statistical
  • Perception
  • Radiography / methods*
  • Video Recording / methods*
  • Videotape Recording