Glioblastoma (GBM) is a fatal brain tumor with limited treatment options. O6-methylguanine-DNA-methyltransferase (MGMT) promoter methylation status is the central molecular biomarker linked both to the response to temozolomide, the standard chemotherapy drug for GBM, and to patient survival. However, MGMT status is determined from tumor tissue, which is difficult to acquire; this limits the use of this molecular feature for treatment monitoring and drives the need for non-invasive methods to predict MGMT status. MGMT protein expression levels may offer additional mechanistic insight into MGMT, but they currently correlate poorly with promoter methylation. Feature selection aims to identify the most informative features for building accurate and interpretable prediction models. This study explores a novel application of combined feature selection (LASSO and mRMR) and a rank-based weighting method (MGMT ProFWise) to non-invasively link MGMT promoter methylation status to serum protein expression in patients with GBM. Across all our analyses, the method reduced dimensionality by more than 95% when applied to two large-scale proteomic datasets (the 7k SomaScan® panel and CPTAC). The computational results indicate 14 shared serum biomarkers that, given further validation, may be useful for diagnostic, prognostic, and/or predictive purposes in GBM.
Keywords: MGMT; feature selection; glioblastoma; machine learning; pattern recognition; protein; proteomic.
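
To make the feature selection idea in the abstract concrete, the following is a minimal sketch of a greedy mRMR-style (minimum redundancy, maximum relevance) selection. It is not the MGMT ProFWise pipeline: the LASSO step and the rank-based weighting are omitted, absolute Pearson correlation stands in for mutual information as a simplification, and the function names (`pearson`, `mrmr_select`), the feature names, and the toy data are all hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no information
    return cov / sqrt(var_a * var_b)


def mrmr_select(features, target, k):
    """Greedily pick up to k features by relevance minus mean redundancy.

    features: dict mapping a feature name to its list of values.
    target:   list of outcome labels (e.g. 0/1 methylation status).
    Assumption: |Pearson r| is used as a proxy for mutual information.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    remaining = sorted(features)  # sorted for deterministic tie-breaking

    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]  # first pick: relevance only
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): binarised protein levels for 8 samples.
features = {
    "P1": [0, 0, 1, 1, 0, 0, 1, 1],
    "P1_dup": [0, 0, 1, 1, 0, 0, 1, 0],  # near-copy of P1, hence redundant
    "P2": [0, 1, 0, 1, 0, 1, 0, 1],      # weaker signal, uncorrelated with P1
}
status = [0, 0, 1, 1, 0, 1, 1, 1]  # hypothetical methylation labels

selected = mrmr_select(features, status, k=2)
print(selected)
```

On this toy input the most relevant feature, `P1`, is chosen first. The near-duplicate `P1_dup` is then passed over in favour of `P2`, because the redundancy penalty outweighs its higher individual relevance. That trade-off is what lets a criterion of this kind shrink a panel of thousands of proteins to a small, less redundant subset.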