Machine learning applied to near-infrared spectra for clinical pleural effusion classification

Sci Rep. 2021 May 3;11(1):9411. doi: 10.1038/s41598-021-87736-4.

Abstract

Lung cancer patients with malignant pleural effusions (MPE) have a particularly poor prognosis, so it is crucial to distinguish MPE from benign pleural effusion (BPE). The present study aims to develop a rapid, convenient, and economical diagnostic method for clinical pleural effusion classification, based on Fourier-transform near-infrared spectroscopy (NIRS) combined with machine learning. NIRS spectra were recorded for 47 MPE samples and 35 BPE samples. The sample data were randomly divided into a training set (n = 62) and a test set (n = 20). Partial least squares, random forest, support vector machine (SVM), and gradient boosting machine models were trained, and their predictive performance was then evaluated on the test set. In addition to models built on the whole spectra, models built on features selected by the SVM recursive feature elimination algorithm were also investigated. Among these models, NIRS combined with SVM showed the best predictive performance (accuracy: 1.0, kappa: 1.0, AUCROC: 1.0). SVM with the top 50 feature wavenumbers also displayed high predictive performance (accuracy: 0.95, kappa: 0.89, AUCROC: 0.99). Our study shows that the combination of NIRS and machine learning is an innovative, rapid, and convenient method for clinical pleural effusion classification, and is worth further evaluation.
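The abstract reports Cohen's kappa alongside accuracy. As a minimal sketch of how the two metrics relate on a binary test set, the Python snippet below computes both from confusion-matrix counts. The function name `accuracy_and_kappa` and the example counts are hypothetical: the abstract does not give the class split of the 20 test samples or the confusion matrix, so the numbers are for illustration only and are not the study's data.

```python
def accuracy_and_kappa(tp, fn, fp, tn):
    """Return (accuracy, Cohen's kappa) for a 2x2 confusion matrix.

    tp/fn: positive-class samples predicted positive/negative.
    fp/tn: negative-class samples predicted positive/negative.
    """
    n = tp + fn + fp + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement between predictions and true labels.
    observed = (tp + tn) / n

    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)

    if expected == 1.0:
        # Only one class present in both labels and predictions:
        # kappa is undefined.
        return observed, float("nan")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical test set of 20 samples (assumed split: 12 MPE, 8 BPE)
# with a single BPE sample misclassified as MPE.
acc, kappa = accuracy_and_kappa(tp=12, fn=0, fp=1, tn=7)
print(f"accuracy={acc:.2f}, kappa={kappa:.2f}")
```

With 20 test samples, an accuracy of 0.95 corresponds to one misclassified sample. Kappa additionally corrects for the agreement expected by chance, so its value depends on the class balance and on which class the error falls in.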

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenocarcinoma of Lung / pathology*
  • Female
  • Humans
  • Lung Neoplasms / pathology*
  • Male
  • Middle Aged
  • Pleural Cavity / pathology
  • Pleural Effusion, Malignant / classification*
  • Pleural Effusion, Malignant / diagnosis*
  • Principal Component Analysis
  • Prognosis
  • Spectroscopy, Fourier Transform Infrared
  • Spectroscopy, Near-Infrared
  • Support Vector Machine*
  • Tuberculosis, Pleural / pathology*