Mutations in proto-oncogenes (ONGO) and the loss of regulatory function of tumor suppressor genes (TSG) are common underlying mechanisms of uncontrolled tumor growth. Although cancer is a heterogeneous group of distinct diseases, computationally estimating how likely a gene is to function as an ONGO or a TSG can help in developing drugs that target the disease. This paper proposes a classification method that starts with a preprocessing stage, which extracts sets of feature maps from the input 3D protein structural information. The next stage is a deep convolutional neural network (DCNN) that outputs the probability of each functional class for a gene. We explored and tested two approaches: in Approach 1, all filtered and cleaned 3D protein structures (PDB) are pooled together, whereas in Approach 2, the primary structures and their corresponding PDBs are separated according to the genes' primary structural information. Following the DCNN stage, a dynamic programming-based method determines the final prediction of each primary structure's functionality. We validated the proposed method using the COSMIC online database. For the ONGO vs TSG classification problem, the AUROCs of the DCNN stage for Approach 1 and Approach 2 are 0.978 and 0.765, respectively. The AUROCs of the final classification of the genes' primary structure functionality for Approach 1 and Approach 2 are 0.989 and 0.879, respectively. For comparison, the current state-of-the-art reported AUROC is 0.924. Our results warrant further study on applying the deep learning models to human (GRCh38) genes to predict their probabilities of functioning as cancer drivers.
Keywords: 2D-CNN; 3D-protein-structures (PDB); Biochemical property; Cancer Tier-1 and Tier-2; Convolutional neural network; Cα atom; DCNN; Deep convolutional neural network; Fusion; Gene functional classification; ONGO; Protooncogenes; Surface residue; TSG; Tumor suppression genes.
Copyright © 2021 Elsevier Ltd. All rights reserved.