Comparison of the diagnostic efficacy of mathematical models in distinguishing ultrasound imaging of breast nodules

Sci Rep. 2023 Sep 25;13(1):16047. doi: 10.1038/s41598-023-42937-x.

Abstract

This study compared how well several machine-learning models, applied to ultrasonographic characteristics, distinguish benign from malignant breast nodules. The models were logistic regression (Logistics), partial least squares discriminant analysis (PLS-DA), linear support vector machine (Linear SVM), linear discriminant analysis (LDA), K-nearest neighbor (KNN), artificial neural network (ANN) and random forest (RF). The clinical information and ultrasonographic characteristics of 926 female patients undergoing breast nodule surgery were collected, and the relationships among them were analyzed using Pearson's correlation. Stepwise regression was used for variable selection, and Monte Carlo cross-validation was used to randomly divide the nodule cases into training and prediction sets. Six independent variables were retained for model building: age, background echotexture, shape, calcification, resistance index, and axillary lymph node. In the prediction set, Linear SVM had the highest diagnosis rate for benign nodules (0.881), and Logistics, ANN and LDA had the highest diagnosis rates for malignant nodules (0.910 to 0.912). The area under the ROC curve (AUC) was highest for Linear SVM (0.890), followed by ANN (0.883), LDA (0.880), Logistics (0.878), RF (0.874), PLS-DA (0.866), and KNN (0.855); every model had a higher AUC than any individual variable. Overall, the diagnostic efficacy of Linear SVM was better than that of the other methods.
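
As a rough illustration of the evaluation scheme described in the abstract, the sketch below repeats random training/prediction splits (Monte Carlo cross-validation) and averages the AUC on the held-out cases. It is not the study's code: the synthetic data, the 30% prediction fraction, the 50 repeats, and the toy `fit_mean_difference` scorer are all assumptions made for the example, standing in for the clinical variables and for the seven models compared in the study.

```python
import random


def monte_carlo_splits(n_samples, n_repeats, test_fraction, seed=0):
    """Yield (train_idx, test_idx) pairs from repeated random splits."""
    rng = random.Random(seed)
    n_test = max(1, int(round(n_samples * test_fraction)))
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_mean_difference(X, y):
    """Hypothetical stand-in model: score = projection onto the
    difference of the two class means (not one of the study's models)."""
    n_features = len(X[0])

    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]

    mu1 = mean([x for x, t in zip(X, y) if t == 1])
    mu0 = mean([x for x, t in zip(X, y) if t == 0])
    w = [a - b for a, b in zip(mu1, mu0)]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


def mean_test_auc(fit, X, y, n_repeats=50, test_fraction=0.3, seed=0):
    """Average held-out AUC over repeated random splits."""
    aucs = []
    for train, test in monte_carlo_splits(len(X), n_repeats, test_fraction, seed):
        y_train = [y[i] for i in train]
        y_test = [y[i] for i in test]
        # Skip degenerate splits that lack one of the two classes.
        if len(set(y_train)) < 2 or len(set(y_test)) < 2:
            continue
        scorer = fit([X[i] for i in train], y_train)
        aucs.append(auc([scorer(X[i]) for i in test], y_test))
    if not aucs:
        raise ValueError("no usable splits")
    return sum(aucs) / len(aucs)


# Synthetic two-feature data (assumed): label 1 shifts the first feature.
data_rng = random.Random(1)
y = [i % 2 for i in range(200)]
X = [[data_rng.gauss(1.5 * t, 1.0), data_rng.gauss(0.0, 1.0)] for t in y]

print(round(mean_test_auc(fit_mean_difference, X, y), 3))
```

Swapping `fit_mean_difference` for another function that returns a scorer gives a like-for-like comparison on the same splits, which is the kind of ranking by AUC the abstract reports.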

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Area Under Curve
  • Calcification, Physiologic*
  • Calcinosis*
  • Cluster Analysis
  • Female
  • Humans
  • Models, Theoretical