Construction of a Wilms tumor risk model based on machine learning and identification of cuproptosis-related clusters

BMC Med Inform Decis Mak. 2024 Nov 4;24(1):325. doi: 10.1186/s12911-024-02716-8.

Abstract

Background: Cuproptosis is a recently identified form of copper-triggered programmed cell death whose mechanisms in Wilms tumor (WT) are not yet fully understood. This study examines the link between WT and cuproptosis-related genes (CRGs), with the goal of developing a predictive model for WT.

Methods: Four gene expression datasets related to WT were obtained from the GEO database. Expression profiles of CRGs were then extracted for differential expression analysis and immune infiltration analysis. Using 105 WT samples, cuproptosis-related clusters were identified, and their immune cell infiltration and functional enrichment were analyzed. Disease-characteristic genes were identified using weighted gene co-expression network analysis. Finally, a WT risk prediction model was constructed using four machine learning methods: random forest, support vector machine (SVM), generalized linear model, and extreme gradient boosting. The best-performing machine learning model was selected, and a nomogram was created. The predictive model was validated using a calibration curve, decision curve analysis, and application to the TARGET-GTEx dataset.
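
As an illustration of the decision curve analysis step mentioned above, the following is a minimal Python sketch of the standard net-benefit calculation at a given risk threshold. It is not the authors' code: the function names and the toy labels and probabilities are hypothetical, and the abstract does not state which software was used for this step.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on every case whose predicted risk >= threshold.

    Standard decision-curve formula: TP/n - FP/n * (pt / (1 - pt)).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: classify every case as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (threshold / (1.0 - threshold))


# Hypothetical toy data (1 = tumor, 0 = normal); not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

for pt in (0.2, 0.5, 0.8):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"threshold={pt:.1f} model={model_nb:.3f} treat_all={all_nb:.3f}")
```

A model is considered useful at a given threshold when its net benefit exceeds both the "treat all" reference and zero (the "treat none" reference).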

Results: Thirteen differentially expressed cuproptosis-related genes were identified. The infiltration level of CD8+ T cells was lower in children with WT than in normal tissue (NT), whereas the infiltration levels of M0 macrophages and T follicular helper cells were higher than in NT. In addition, two cuproptosis-related WT clusters were identified. Enrichment analysis indicated that genes in cluster 2 were primarily involved in cell division, regulation of nuclear division, DNA biosynthetic process, and ubiquitin-mediated proteolysis. The SVM model, using 5 genes, was judged to be the optimal model. Its accuracy was confirmed through a calibration curve and decision curve analysis, and it demonstrated satisfactory performance on the TARGET-GTEx validation dataset. Additional analysis revealed that these five genes exhibited high expression in both the TARGET-GTEx validation dataset and sequencing data.

Conclusion: This research established a link between WT and cuproptosis, developed a predictive model for assessing the risk of WT, and identified five key genes associated with the disease.

Keywords: Cuproptosis; Immune infiltration; Machine learning; Molecular clusters; Wilms tumor.

MeSH terms

  • Humans
  • Kidney Neoplasms / genetics
  • Machine Learning*
  • Nomograms
  • Risk Assessment
  • Wilms Tumor* / genetics