Differentiating Pulmonary Nodule Malignancy Using Exhaled Volatile Organic Compounds: A Prospective Observational Study

Cancer Med. 2025 Jan;14(1):e70545. doi: 10.1002/cam4.70545.

Abstract

Background: Advances in imaging technology have enhanced the detection of pulmonary nodules. However, determining malignancy often requires invasive procedures or repeated radiation exposure, underscoring the need for safer, noninvasive diagnostic alternatives. Analyzing exhaled volatile organic compounds (VOCs) shows promise, yet its effectiveness in assessing the malignancy of pulmonary nodules remains underexplored.

Methods: We conducted a prospective study from June 2023 to January 2024 at the Affiliated Hospital of Yangzhou University. We assessed the malignancy risk of pulmonary nodules using the Mayo Clinic model and collected exhaled breath samples alongside lifestyle and health examination data. We applied five machine learning (ML) algorithms to develop predictive models, which were evaluated using area under the curve (AUC), sensitivity, specificity, and other relevant metrics.
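The evaluation metrics named in the Methods can be illustrated with a minimal sketch. It assumes binary labels (1 = moderate-risk, 0 = low-risk) and model scores in [0, 1]. The toy data, the 0.5 cutoff, and the function names are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data, for illustration only (not study data).
labels = [0, 0, 0, 0, 1, 1, 1, 0]
scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 0.4, 0.05]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

The pairwise AUC is quadratic in sample size, which is acceptable for a cohort of a few hundred participants. A rank-based implementation would be preferable for larger datasets.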

Results: A total of 267 participants were enrolled, including 210 with low-risk and 57 with moderate-risk pulmonary nodules. Univariate analysis identified 11 exhaled VOCs associated with nodule malignancy, alongside two lifestyle factors (smoking index and sites of tobacco smoke inhalation) and one clinical metric (nodule diameter) as independent predictors for moderate-risk nodules. The logistic regression model integrating lifestyle and health data achieved an AUC of 0.91 (95% CI: 0.8611-0.9658), while the random forest model incorporating exhaled VOCs achieved an AUC of 0.99 (95% CI: 0.974-1.00). Calibration curves indicated strong concordance between predicted and observed risks. Decision curve analysis confirmed the net benefit of these models over traditional methods. A nomogram was developed to aid clinicians in assessing nodule malignancy based on VOCs, lifestyle, and health data.
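For readers unfamiliar with decision curve analysis, the sketch below shows the standard net benefit calculation at a single threshold probability: true positives per participant minus false positives per participant, weighted by the odds of the threshold. The abstract does not report the threshold range or comparators used, so the toy data, the 0.5 threshold, and the function names here are hypothetical.

```python
from typing import Sequence


def net_benefit(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> float:
    """Net benefit of acting on cases whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels: Sequence[int], threshold: float) -> float:
    """Reference strategy: act on every participant regardless of the model."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data, for illustration only (not study data).
dca_labels = [0, 0, 0, 0, 1, 1, 1, 0]
dca_scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 0.4, 0.05]

nb_model = net_benefit(dca_labels, dca_scores, threshold=0.5)
nb_all = net_benefit_treat_all(dca_labels, threshold=0.5)
print(f"model={nb_model:.3f} treat_all={nb_all:.3f} treat_none=0.000")
```

A decision curve repeats this calculation over a range of thresholds and plots the model against the treat-all and treat-none (net benefit of zero) strategies. A model is useful at a given threshold when its curve lies above both.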

Conclusions: The integration of ML algorithms with exhaled biomarkers and clinical data provides a robust framework for noninvasive assessment of pulmonary nodules. These models offer a safer alternative to traditional methods and may enhance early detection and management of pulmonary nodules. Further validation through larger, multicenter studies is necessary to establish their generalizability.

Trial registration: ChiCTR2400081283.

Keywords: breath biomarkers; malignancy risk; pulmonary nodules; volatile organic compounds.

Publication types

  • Observational Study

MeSH terms

  • Adult
  • Aged
  • Biomarkers, Tumor / analysis
  • Biomarkers, Tumor / metabolism
  • Breath Tests* / methods
  • Diagnosis, Differential
  • Exhalation
  • Female
  • Humans
  • Lung Neoplasms* / diagnosis
  • Lung Neoplasms* / metabolism
  • Machine Learning
  • Male
  • Middle Aged
  • Multiple Pulmonary Nodules / diagnosis
  • Multiple Pulmonary Nodules / metabolism
  • Prospective Studies
  • Solitary Pulmonary Nodule / diagnosis
  • Solitary Pulmonary Nodule / metabolism
  • Volatile Organic Compounds* / analysis

Substances

  • Volatile Organic Compounds
  • Biomarkers, Tumor