Cancers of unknown primary (CUP) are metastatic cancers for which the primary tumor is not found despite thorough diagnostic investigations. Multiple molecular assays have been proposed to identify the tissue of origin (TOO) and inform clinical care; however, none has combined accuracy, interpretability, and easy access for routine use. We developed a classifier, TransCUPtomics, that uses a variational autoencoder trained on RNA-sequencing data to predict the TOO. The training data comprised 20,918 samples from 94 categories, including 39 cancer types and 55 normal tissues. TransCUPtomics was applied to 48 CUP patients: a retrospective cohort of 37 and 11 prospective patients. On reference data, TransCUPtomics achieved an overall accuracy of 96% for TOO prediction. The TOO could be identified in 38 (79%) of the 48 CUP patients. Eight of the 11 prospective CUP patients (73%) could receive first-line therapy guided by the TransCUPtomics prediction, with responses observed in most patients. The variational autoencoder added further utility by making predictions interpretable, and diagnostic predictions could be matched to detected gene fusions and expressed variants. TransCUPtomics confidently predicted the TOO for CUP and enabled tailored treatments leading to significant clinical responses. The interpretability of our approach is a powerful addition for improving the management of CUP patients.
Copyright © 2021 Association for Molecular Pathology and American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.