A Support Vector Machine Model Predicting the Risk of Duodenal Cancer in Patients with Familial Adenomatous Polyposis at the Transcript Levels

Biomed Res Int. 2020 Jun 16:2020:5807295. doi: 10.1155/2020/5807295. eCollection 2020.

Abstract

Objective: Familial adenomatous polyposis (FAP) is a major inherited syndrome that predisposes patients to duodenal cancer. Estimating duodenal cancer risk in patients with FAP is critical for selecting the optimal treatment strategy.

Methods: Microarray datasets related to FAP were retrieved from the Gene Expression Omnibus (GEO) database. Differentially expressed genes were identified in two comparisons: FAP vs. normal samples, and FAP and duodenal cancer vs. normal samples. Functional enrichment analyses of these differentially expressed genes were then performed. A support vector machine (SVM) was used to train and validate a cancer risk prediction model.
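The two-comparison screen described in the Methods can be illustrated with a minimal sketch. The abstract does not state which statistical method or cutoffs were used, so everything here is an assumption made for illustration: the Welch t-statistic, the thresholds `min_abs_lfc` and `min_abs_t`, the gene names, and the toy log2 expression values are all hypothetical.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:  # identical values within both groups
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / se


def differential_genes(expr, case_ids, control_ids, min_abs_lfc=1.0, min_abs_t=3.0):
    """Return {gene: "up" | "down"} for genes passing both (assumed) thresholds.

    expr maps gene -> {sample_id: log2 expression}.
    """
    hits = {}
    for gene, values in expr.items():
        case = [values[s] for s in case_ids]
        ctrl = [values[s] for s in control_ids]
        lfc = mean(case) - mean(ctrl)  # log2 fold change on log2-scale data
        if abs(lfc) >= min_abs_lfc and abs(welch_t(case, ctrl)) >= min_abs_t:
            hits[gene] = "up" if lfc > 0 else "down"
    return hits


# Hypothetical toy data: 3 normal (N), 3 FAP (F), 3 duodenal cancer (C) samples.
samples = ["N1", "N2", "N3", "F1", "F2", "F3", "C1", "C2", "C3"]
raw = {
    "GENE_A": [5.0, 5.2, 4.9, 7.1, 7.4, 6.9, 7.8, 8.1, 7.6],  # up in FAP and cancer
    "GENE_B": [8.0, 8.3, 7.9, 6.0, 6.2, 5.8, 5.5, 5.7, 5.3],  # down in FAP and cancer
    "GENE_C": [6.0, 6.1, 5.9, 7.5, 7.6, 7.4, 5.0, 5.1, 4.9],  # up in FAP only
    "GENE_D": [4.0, 4.2, 4.1, 4.1, 4.0, 4.2, 4.2, 4.1, 4.0],  # unchanged
}
expr = {gene: dict(zip(samples, vals)) for gene, vals in raw.items()}
normal, fap, cancer = samples[:3], samples[3:6], samples[6:]

# Comparison 1: FAP vs. normal. Comparison 2: FAP and cancer vs. normal.
fap_hits = differential_genes(expr, fap, normal)
combined_hits = differential_genes(expr, fap + cancer, normal)

# Genes changed in the same direction in both comparisons.
shared = {g: d for g, d in fap_hits.items() if combined_hits.get(g) == d}
print(sorted(shared.items()))
```

In this toy example, the genes kept in `shared` play the role of the genes reported as similarly expressed in FAP and duodenal cancer. A real microarray analysis would also correct for multiple testing, which this sketch omits.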

Results: A total of 196 differentially expressed genes were identified between FAP and normal samples. A total of 177 genes showed similar expression patterns in both FAP and duodenal cancer; these were mainly enriched in pathways in cancer and metabolism-related pathways, indicating that these genes in patients with FAP could contribute to duodenal cancer. Among them, qRT-PCR showed that Cyclin D1, SDF-1, AXIN, and TCF were significantly upregulated in FAP tissues. Based on the 177 genes, an SVM model was constructed to predict the risk of cancer in patients with FAP. After validation, the model could accurately distinguish FAP patients at high risk from those at low risk for duodenal cancer.
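The train-and-validate step behind such a risk classifier can be sketched as follows. The abstract does not specify the kernel, software, or validation scheme, so this is not the authors' model: it is a linear SVM fitted by stochastic subgradient descent on the L2-regularised hinge loss, evaluated on a single hold-out split. The synthetic data, the five-gene feature count `N_GENES`, and the hyperparameters `lam`, `lr`, and `epochs` are assumptions chosen for illustration.

```python
import random


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=50, seed=0):
    """Fit a linear SVM by stochastic subgradient descent on the hinge loss.

    X is a list of feature vectors, y a list of labels in {-1, +1}.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            if y[i] * score < 1:
                # Margin violated: hinge-loss subgradient plus L2 shrinkage.
                w = [wj + lr * (y[i] * xj - lam * wj) for wj, xj in zip(w, X[i])]
                b += lr * y[i]
            else:
                # Margin satisfied: only the regulariser acts on w.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(w, b, x):
    """Return +1 (high risk) or -1 (low risk) from the sign of the decision value."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Synthetic, already-normalised expression profiles (hypothetical data).
N_GENES = 5  # stands in for the 177-gene signature
rng = random.Random(42)
data = []
for _ in range(40):
    data.append(([rng.gauss(1.0, 0.3) for _ in range(N_GENES)], 1))    # high risk
    data.append(([rng.gauss(-1.0, 0.3) for _ in range(N_GENES)], -1))  # low risk
rng.shuffle(data)

# Hold-out validation: 60 training samples, 20 test samples.
train, test = data[:60], data[60:]
w, b = train_linear_svm([x for x, _ in train], [label for _, label in train])

correct = sum(predict(w, b, x) == label for x, label in test)
accuracy = correct / len(test)
print(f"hold-out accuracy: {accuracy:.2f}")
```

With real expression data the features would typically be standardised on the training set only, and performance would be estimated by cross-validation or on an independent dataset rather than a single split.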

Conclusion: This study proposed an SVM-based model for predicting cancer risk at the transcript level.

MeSH terms

  • Adenomatous Polyposis Coli* / epidemiology
  • Adenomatous Polyposis Coli* / genetics
  • Adenomatous Polyposis Coli* / metabolism
  • Computational Biology
  • Databases, Genetic
  • Duodenal Neoplasms* / epidemiology
  • Duodenal Neoplasms* / genetics
  • Duodenal Neoplasms* / metabolism
  • Humans
  • Metabolic Networks and Pathways / genetics
  • Risk Assessment
  • Support Vector Machine*
  • Transcriptome / genetics*