Classification and regression problems can be challenging when the relevant input features are diluted in noisy datasets, particularly when the sample size is limited. Traditional Feature Selection (FS) methods address this issue by relying on assumptions such as linear or additive relationships between features. Recently, a wide range of Deep Learning (DL) models has emerged to tackle FS and prediction jointly, allowing non-linear modeling of the selected features. In this study, we systematically assess the performance of DL-based feature selection methods on synthetic datasets of varying complexity and benchmark their efficacy in uncovering non-linear relationships between features. In the same settings, we also benchmark the reliability of gradient-based feature attribution techniques for Neural Networks (NNs), such as Saliency Maps (SM), for which a quantitative evaluation of reliability has so far been missing. Our analysis indicates that even simple synthetic datasets can significantly challenge most DL-based FS and SM methods, whereas Random Forests, TreeShap, mRMR and LassoNet are the best-performing FS methods. We conclude that, when quantifying the relevance of a few non-linearly entangled predictive features diluted in a large number of irrelevant noisy variables, DL-based FS and SM interpretation methods are still far from reliable.