[Establishment and validation of a prediction model for early-stage epithelial ovarian cancer based on LASSO regression]

Zhonghua Yi Xue Za Zhi. 2024 Jun 18;104(23):2167-2172. doi: 10.3760/cma.j.cn112137-20231019-00823.
[Article in Chinese]

Abstract

Objective: To establish and validate a prediction model for early-stage epithelial ovarian cancer based on least absolute shrinkage and selection operator (LASSO) regression.
Methods: A total of 509 patients with ovarian masses who underwent surgical treatment at Tianjin Medical University General Hospital from January 2018 to March 2023 were retrospectively analyzed. The patients were randomly divided in a 7:3 ratio into a modeling group [n=356; age, M (Q1, Q3): 43 (31, 61) years] and an internal validation group [n=153; age 42 (31, 60) years]. In addition, 86 patients [age 44 (33, 61) years] who underwent surgical treatment for ovarian masses at the same hospital from April to November 2023 were enrolled as an external validation group. Variables were screened by LASSO regression, and a nomogram model was established by multivariate logistic regression and then validated internally and externally. Model performance and clinical applicability were evaluated using the receiver operating characteristic (ROC) curve, calibration curve, and decision curve.
Results: Five variables were identified as risk factors for early-stage epithelial ovarian cancer: age (OR=1.040, 95%CI: 1.000-1.050, P=0.002), carbohydrate antigen 125 (CA125) (OR=1.001, 95%CI: 1.000-1.010, P=0.017), human epididymis protein 4 (HE4) (OR=1.020, 95%CI: 1.000-1.030, P=0.002), carbohydrate antigen 199 (CA199) (OR=1.001, 95%CI: 1.000-1.020, P=0.023), and lactate dehydrogenase (LDH) (OR=1.020, 95%CI: 1.010-1.022, P=0.001). A nomogram model for predicting early-stage epithelial ovarian cancer was constructed from these five risk factors. The area under the ROC curve (AUC) was 0.915 (95%CI: 0.910-0.932) in the modeling group, 0.891 (95%CI: 0.874-0.905) in the internal validation group, and 0.924 (95%CI: 0.907-0.942) in the external validation group. The calibration curves and clinical decision curves showed that the model had good consistency and clinical practicability.
Conclusions: The nomogram model, which includes age, CA125, HE4, CA199, and LDH, can effectively predict early-stage epithelial ovarian cancer and has strong clinical practicability.
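The reported odds ratios are per one-unit change in each predictor, so values close to 1 can still correspond to sizeable effects over clinically realistic ranges. The sketch below illustrates that arithmetic, along with the rank-based definition of AUC. It is not the authors' code: the function names, the patient differences, and the example scores are hypothetical, the measurement units are not stated in the abstract, and the model intercept is not reported, so only relative odds (not absolute risk) can be computed.

```python
import math

# Per-unit odds ratios as reported in the abstract (multivariate logistic regression).
# Units are not stated in the abstract; usual laboratory units are assumed here.
REPORTED_OR = {
    "age": 1.040,
    "CA125": 1.001,
    "HE4": 1.020,
    "CA199": 1.001,
    "LDH": 1.020,
}


def odds_ratio_for_change(or_per_unit, delta):
    """Odds ratio implied by a change of `delta` units, given a per-unit OR."""
    beta = math.log(or_per_unit)  # logistic coefficient on the log-odds scale
    return math.exp(beta * delta)


def combined_odds_ratio(deltas, ors=REPORTED_OR):
    """Odds ratio between two patients whose predictors differ by `deltas`."""
    log_or = sum(math.log(ors[name]) * d for name, d in deltas.items())
    return math.exp(log_or)


def auc_from_scores(scores_pos, scores_neg):
    """AUC as the probability that a random case outscores a random non-case.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical differences between two patients (illustrative values only).
example_deltas = {"age": 10, "HE4": 50}
print(round(odds_ratio_for_change(REPORTED_OR["HE4"], 50), 2))
print(round(combined_odds_ratio(example_deltas), 2))

# Hypothetical predicted risks for three cases and three non-cases.
print(round(auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]), 3))
```

Under these assumptions, an HE4 value 50 units higher multiplies the odds by 1.020^50, roughly 2.7, even though the per-unit OR looks negligible.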

Publication types

  • English Abstract

MeSH terms

  • Adult
  • CA-125 Antigen / blood
  • Carcinoma, Ovarian Epithelial* / pathology
  • Female
  • Humans
  • Logistic Models
  • Middle Aged
  • Neoplasm Staging
  • Nomograms*
  • Ovarian Neoplasms* / pathology
  • ROC Curve
  • Retrospective Studies
  • Risk Factors
  • WAP Four-Disulfide Core Domain Protein 2 / analysis

Substances

  • CA-125 Antigen
  • WAP Four-Disulfide Core Domain Protein 2