In recent years, deep learning, a state-of-the-art machine learning technique, has achieved great success in histopathological image classification. However, most deep learning approaches rely heavily on substantial task-specific annotations, which require manual labelling by experienced pathologists. Annotation is therefore laborious and time-consuming, and many unlabeled pathological images are difficult to use without expert annotations. To reduce the need for data annotation, we propose a self-supervised Deep Adaptive Regularized Clustering (DARC) framework to pre-train a neural network. DARC iteratively clusters the learned representations and uses the cluster assignments as pseudo-labels to learn the parameters of the network. To learn feasible representations and encourage them to become more discriminative, we design an objective function that combines a network loss with a clustering loss through an adaptive regularization function, which is updated throughout the training process. The proposed DARC is evaluated on three public datasets: NCT-CRC-HE-100K, PCam, and LC25000. Compared with training from scratch, fine-tuning from DARC pre-trained weights clearly improves the accuracy of neural networks on histopathological image classification. A network initialized with DARC pre-trained weights and fine-tuned with only 10% of the labeled data achieves accuracy comparable to a network trained from scratch with 100% of the training data. The network using DARC pre-trained weights also achieves the fastest convergence on the downstream classification task. Moreover, visualization through t-distributed stochastic neighbor embedding (t-SNE) shows that the learned representations are generalizable and discriminative.
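The alternating procedure described in the abstract (cluster the current representations, treat the cluster assignments as pseudo-labels, then update the network under a combined loss) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear encoder, the k-means routine, the squared-distance clustering loss, and in particular the linear ramp in `adaptive_weight` are assumptions chosen for illustration, since the abstract does not specify the form of the adaptive regularization function. All function and parameter names are hypothetical.

```python
import numpy as np

def kmeans(z, k, rng, iters=10):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = z[rng.choice(len(z), size=k, replace=False)].copy()
    for _ in range(iters):
        d = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = z[labels == j].mean(0)
    # Final assignment so labels agree with the returned centroids.
    d = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return d.argmin(1), centroids

def softmax(a):
    a = a - a.max(axis=1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def adaptive_weight(epoch, n_epochs, lam_max=0.1):
    """ASSUMPTION: a linear ramp from 0 to lam_max; the paper's
    adaptive regularization function is not given in the abstract."""
    return lam_max * epoch / max(n_epochs - 1, 1)

def pretrain(x, k=3, dim=8, n_epochs=5, steps=20, lr=0.05, seed=0):
    """Alternate clustering and pseudo-label training of a linear encoder."""
    rng = np.random.default_rng(seed)
    n = len(x)
    w = rng.normal(scale=0.1, size=(x.shape[1], dim))  # toy linear encoder
    history = []
    for epoch in range(n_epochs):
        # 1) Cluster the current representations to obtain pseudo-labels.
        labels, centroids = kmeans(x @ w, k, rng)
        onehot = np.eye(k)[labels]
        # Cluster indices are arbitrary across epochs, so the
        # classification head is re-initialised each time.
        v = rng.normal(scale=0.1, size=(dim, k))
        lam = adaptive_weight(epoch, n_epochs)
        # 2) Update the network on: cross-entropy + lam * clustering loss.
        for _ in range(steps):
            z = x @ w
            p = softmax(z @ v)
            diff = z - centroids[labels]  # centroids held fixed here
            ce = -np.log(p[np.arange(n), labels] + 1e-12).mean()
            cl = (diff ** 2).sum(1).mean()
            loss = ce + lam * cl
            dlogits = (p - onehot) / n
            dz = dlogits @ v.T + lam * 2.0 * diff / n
            v -= lr * (z.T @ dlogits)
            w -= lr * (x.T @ dz)
        history.append((lam, loss))
    return w, labels, history
```

Calling `pretrain(x)` on a feature matrix `x` of shape `(n_samples, n_features)` returns the encoder weights, the final pseudo-labels, and the per-epoch `(weight, loss)` pairs. In a realistic setting the linear encoder would be a convolutional network and the learned weights would initialise fine-tuning on the labeled subset.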
Keywords: Adaptive regularization; Clustering; Histopathological image analysis; Representation learning; Self-supervised learning.
Copyright © 2022 Elsevier B.V. All rights reserved.