Predicting the T790M mutation in non-small cell lung cancer (NSCLC) using brain metastasis MR radiomics: a study with an imbalanced dataset

Discov Oncol. 2024 Sep 14;15(1):447. doi: 10.1007/s12672-024-01333-1.

Abstract

Background: Early detection of the T790M mutation in exon 20 of the epidermal growth factor receptor (EGFR) gene in non-small cell lung cancer (NSCLC) patients with brain metastasis is crucial for optimizing treatment strategies. In this study, we developed radiomics models to distinguish T790M-positive from T790M-negative NSCLC patients using multisequence MR images of brain metastases, despite an imbalanced dataset. Various resampling techniques and classifiers were evaluated to identify the most effective combination.

Methods: Radiomic analyses were conducted on a dataset of 125 patients: 18 EGFR T790M-positive and 107 T790M-negative. Seventeen first- and second-order statistical features were selected from contrast-enhanced T1-weighted (CET1WI), T2-weighted (T2WI), T2 fluid-attenuated inversion recovery (T2FLAIR), and diffusion-weighted (DWI) images. Four classifiers (logistic regression, support vector machine [SVM], random forest [RF], and extreme gradient boosting [XGBoost]) were evaluated under 13 different resampling conditions.
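To illustrate what oversampling does to a class split like the one described above, the following is a minimal sketch of basic SMOTE-style interpolation in plain Python. It is not the authors' pipeline: the function name `smote_oversample`, the neighbour count `k=5`, and the random placeholder feature values are all assumptions made for illustration, and the variant shown is plain SMOTE rather than SVM-SMOTE or any other specific method from the study. Only the class sizes (18 and 107) and the feature count (17) are taken from the text.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (basic SMOTE idea).

    Each synthetic sample lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Class sizes and feature count from the abstract; values are random placeholders.
data_rng = random.Random(42)
n_features = 17
positives = [[data_rng.gauss(1.0, 1.0) for _ in range(n_features)] for _ in range(18)]
negatives = [[data_rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(107)]

# Generate enough synthetic positives to match the majority class size.
new_positives = smote_oversample(positives, n_new=len(negatives) - len(positives))
balanced_positives = positives + new_positives
print(len(balanced_positives), len(negatives))  # 107 107
```

In a real evaluation, resampling of this kind is normally applied only to the training portion of each split, so that synthetic samples never leak into the data used for scoring.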

Results: An area under the curve (AUC) of 0.89 was achieved using the SVM-SMOTE oversampling method in combination with the XGBoost classifier. This performance was compared with AUC values reported in the literature, which served as an upper-bound reference. Comparable results were also observed with other oversampling methods paired with the RF or XGBoost classifiers.
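For readers less familiar with the metric, the sketch below shows one common way to compute an AUC, as the fraction of positive-negative pairs in which the positive case receives the higher score (ties count as half). The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from the study. The last lines use the class sizes from the dataset to show why plain accuracy can be misleading here: a model that labels every patient T790M-negative is right 107 times out of 125, yet its AUC is only 0.5.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores): 7 of the 8 positive-negative pairs are ordered correctly.
toy_labels = [1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)  # 0.875

# Class sizes from the abstract: an "always negative" model gives every case the same score.
labels = [1] * 18 + [0] * 107
constant_scores = [0.0] * 125
baseline_accuracy = 107 / 125
baseline_auc = roc_auc(labels, constant_scores)
print(baseline_accuracy, baseline_auc)
```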

Conclusions: Our study demonstrates that, even with an imbalanced EGFR T790M dataset, reasonable predictive performance can be achieved by employing an appropriate combination of resampling technique and classifier. This approach has significant potential for enhancing T790M mutation detection in NSCLC patients with brain metastasis.

Keywords: Brain metastases; EGFR; Imbalanced data; Machine learning; Magnetic resonance imaging; Non-small cell lung cancer; Radiomics; T790M.