Contrastive Clustering-Based Patient Normalization to Improve Automated In Vivo Oral Cancer Diagnosis from Multispectral Autofluorescence Lifetime Images

Cancers (Basel). 2024 Dec 9;16(23):4120. doi: 10.3390/cancers16234120.

Abstract

Background: Multispectral autofluorescence lifetime imaging systems have recently been developed to quickly and non-invasively assess tissue properties for applications in oral cancer diagnosis. Because this is a non-traditional imaging modality, the collected autofluorescence signal cannot be assessed visually by a clinician, so a model is needed to generate a diagnosis for each image. However, training a deep learning model from scratch on small multispectral autofluorescence datasets can fail due to inter-patient variability, poor initialization, and overfitting.
Methods: We propose a contrastive-based pre-training approach that teaches the network to perform patient normalization without requiring a direct comparison to a reference sample. We then use the contrastive pre-trained encoder as a favorable initialization for classification. To train the classifiers, we make efficient use of the available data and reduce overfitting through a multitask framework with margin delineation and cancer diagnosis tasks. We evaluate the model on 67 patients using 10-fold cross-validation and assess significance using paired, one-tailed t-tests.
Results: The proposed approach achieves a sensitivity of 82.08% and a specificity of 75.92% on the cancer diagnosis task, with a sensitivity of 91.83% and a specificity of 79.31% for margin delineation as an auxiliary task. Compared with existing approaches, our method significantly outperforms a support vector machine (SVM) implemented with either sequential feature selection (SFS) (p = 0.0261) or L1 loss (p = 0.0452) when considering the average of sensitivity and specificity. Specifically, the proposed approach increases this average by 2.75% compared to the L1 model and 4.87% compared to the SFS model. In addition, there is a significant increase in specificity of 8.34% compared to the baseline autoencoder model (p = 0.0070).
Conclusions: Our method effectively trains deep learning models for small data applications when existing, large pre-trained models are not suitable for fine-tuning. While we designed the network for a specific imaging modality, we report the development process so that the insights gained can be applied to address similar challenges in other non-traditional imaging modalities. A key contribution of this paper is a neural network framework for multispectral fluorescence lifetime-based tissue discrimination that performs patient normalization without requiring a reference (healthy) sample from each patient at test time.
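
The evaluation quantities named above (sensitivity, specificity, their average, and a paired comparison across cross-validation folds) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and all toy labels and per-fold scores are hypothetical and chosen only for illustration, and the sketch computes the paired t statistic without the p-value.

```python
from math import sqrt
from statistics import mean, stdev


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = cancer."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_score(y_true, y_pred):
    """Average of sensitivity and specificity."""
    sens, spec = sensitivity_specificity(y_true, y_pred)
    return (sens + spec) / 2


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic and degrees of freedom for per-fold scores.

    A large positive t supports the one-tailed alternative "a > b".
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical toy labels (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(sensitivity_specificity(y_true, y_pred))
print(balanced_score(y_true, y_pred))

# Hypothetical per-fold balanced scores for two models.
model_a = [0.80, 0.70, 0.90]
model_b = [0.70, 0.65, 0.75]
print(paired_t_statistic(model_a, model_b))
```

A one-tailed p-value then follows from the t distribution with the returned degrees of freedom; for example, SciPy's `scipy.stats.ttest_rel(a, b, alternative="greater")` performs the same paired test and reports the p-value directly.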

Keywords: automated cancer diagnosis; deep learning; margin delineation; multispectral autofluorescence lifetime imaging; patient normalization; regularization.