Benefiting from the large-scale archiving of digitized whole-slide images (WSIs), computer-aided diagnosis has been well developed to assist pathologists in decision-making. Content-based WSI retrieval offers a new approach for finding highly correlated WSIs in an archive of previously diagnosed cases, with potential uses in assisted clinical diagnosis, medical research, and trainee education. Because WSIs are gigapixel-sized, it is particularly challenging to encode the semantic content of histopathological images and to measure the similarity between images in a way that yields interpretable results. In this work, we propose a Retrieval with Clustering-guided Contrastive Learning (RetCCL) framework for robust and accurate WSI-level image retrieval, which integrates a novel self-supervised feature learning method with a global ranking and aggregation algorithm. The proposed feature learning method exploits existing large-scale unlabeled histopathological image data to learn universal features that can be used directly for subsequent WSI retrieval tasks without extra fine-tuning. The proposed WSI retrieval method not only returns a set of WSIs similar to a query WSI, but also highlights the patches or sub-regions of each returned WSI that share high similarity with patches of the query WSI, which helps pathologists interpret the search results. Our WSI retrieval framework has been evaluated on anatomical site retrieval and cancer subtype retrieval using over 22,000 slides, and its performance significantly exceeds that of other state-of-the-art methods (by around 10% for anatomical site retrieval in terms of average mMV@10). In addition, patch retrieval using our learned feature representation improves mMV@5 on the TissueNet dataset by 24% compared with ImageNet pre-trained features, which further demonstrates the effectiveness of the proposed CCL feature learning method.
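The abstract reports results as mMV@10 and mMV@5 without defining the metric. As a minimal sketch, assuming mMV@k denotes the mean majority vote over the top-k retrieved items (a query counts as correct when the most frequent label among its top-k results equals the query label), the score could be computed as follows. The function names, the list-based input layout, and the tie-breaking rule are illustrative assumptions, not taken from this work.

```python
from collections import Counter


def majority_vote_at_k(query_label, retrieved_labels, k):
    """Return 1 if the most frequent label among the top-k results
    equals the query label, else 0.

    Assumption: ties are broken in favour of the label that appears
    first in the ranking (Counter.most_common keeps first-seen order
    for equal counts).
    """
    top_k = retrieved_labels[:k]
    if not top_k:
        return 0
    majority_label, _ = Counter(top_k).most_common(1)[0]
    return int(majority_label == query_label)


def mean_majority_vote_at_k(query_labels, ranked_results, k):
    """Average the per-query majority vote over all queries (mMV@k).

    query_labels: one ground-truth label per query.
    ranked_results: for each query, the labels of its retrieved items,
        ordered from most to least similar.
    """
    votes = [
        majority_vote_at_k(label, results, k)
        for label, results in zip(query_labels, ranked_results)
    ]
    return sum(votes) / len(votes) if votes else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: two queries with three ranked results each.
    queries = ["lung", "breast"]
    results = [["lung", "lung", "brain"], ["brain", "brain", "breast"]]
    print(mean_majority_vote_at_k(queries, results, k=3))
```

In this toy example the first query is matched by the majority of its top-3 results and the second is not, so the score is 0.5.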
Keywords: Feature extraction; Histopathology; Image retrieval; Self-supervised learning.
Copyright © 2022 Elsevier B.V. All rights reserved.