Background: Accurate cancer classification is essential for correct treatment selection and better prognostication. MicroRNAs (miRNAs) are small RNA molecules that negatively regulate gene expression, and their dysregulation is a common disease mechanism in many cancers. A clearer understanding of miRNA dysregulation in cancer can lead to improved mechanistic knowledge and better treatments.
Results: We present a topology-preserving deep learning framework to study miRNA dysregulation in cancer. Our study comprises miRNA expression profiles from 3685 cancer and non-cancer tissue samples, together with hierarchical annotations of organ and neoplasticity status. Using unsupervised learning, a two-dimensional topological map is trained to cluster similar tissue samples. Labelled samples are used after training to assess clustering accuracy in terms of tissue-of-origin and neoplasticity status. In addition, an approach using activation gradients is developed to determine which miRNAs the networks attend to when forming the clusters. Using this deep learning framework, we classify the neoplasticity status of held-out test samples with an accuracy of 91.07%, the tissue-of-origin with an accuracy of 86.36%, and combined neoplasticity status and tissue-of-origin with an accuracy of 84.28%. The topological maps show that miRNA expression profiles distinguish tissue types and neoplasticity status. Importantly, when our approach identifies samples that do not cluster well with their respective classes, activation gradients provide further insight into cancer subtypes or grades.
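Illustrative sketch: the train-then-label workflow described above can be outlined with a classic Kohonen self-organizing map in NumPy. This is a swapped-in, much simpler technique and not the deep learning framework of this study. The function names, grid size, learning-rate schedule, and the synthetic two-class data are all assumptions made for illustration. The map is fitted without labels; labels are used only afterwards to assign each map unit a majority class and to score held-out samples.

```python
from collections import Counter, defaultdict

import numpy as np


def best_matching_unit(weights, x):
    """Return the (row, col) of the map unit whose weight vector is closest to x."""
    d = np.sum((weights - x) ** 2, axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)


def train_som(data, grid=(4, 4), epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Fit a 2-D self-organizing map without using any labels (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Initialise every unit from a randomly drawn training sample.
    idx = rng.integers(0, len(data), size=rows * cols)
    weights = data[idx].reshape(rows, cols, -1).astype(float)
    # Grid coordinates of each unit, needed for the neighbourhood function.
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            x = data[i]
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood width
            bmu = np.array(best_matching_unit(weights, x))
            dist2 = np.sum((coords - bmu) ** 2, axis=-1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood on the grid
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights


def label_units(weights, data, labels):
    """After training, give each unit the majority label of the samples mapped to it."""
    votes = defaultdict(Counter)
    for x, y in zip(data, labels):
        votes[best_matching_unit(weights, x)][y] += 1
    return {unit: c.most_common(1)[0][0] for unit, c in votes.items()}


def predict(weights, unit_labels, x):
    """Classify x by the label of the nearest unit that received a label."""
    units = list(unit_labels)
    d = [np.sum((weights[u] - x) ** 2) for u in units]
    return unit_labels[units[int(np.argmin(d))]]


def demo_accuracy(seed=1):
    """Synthetic stand-in for expression profiles: two well-separated classes."""
    rng = np.random.default_rng(seed)
    normal = rng.normal(0.0, 0.3, size=(100, 8))
    tumour = rng.normal(3.0, 0.3, size=(100, 8))
    x = np.vstack([normal, tumour])
    y = np.array(["normal"] * 100 + ["tumour"] * 100)
    order = rng.permutation(len(x))
    train, test = order[:150], order[150:]
    weights = train_som(x[train])  # unsupervised: labels are not passed in
    unit_labels = label_units(weights, x[train], y[train])
    preds = [predict(weights, unit_labels, s) for s in x[test]]
    return float(np.mean(np.array(preds) == y[test]))


if __name__ == "__main__":
    print(f"held-out accuracy on synthetic data: {demo_accuracy():.2f}")
```

The synthetic classes are deliberately easy to separate, so the accuracy printed here says nothing about real miRNA data; the sketch only shows how labels can be kept out of training and applied afterwards for evaluation.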
Conclusions: An unsupervised deep learning approach is developed for cancer classification and interpretation. This work provides an intuitive approach for understanding molecular properties of cancer and has significant potential for cancer classification and treatment selection.
Keywords: Cancer classification; Deep learning; miRNA.
© 2022. The Author(s).