The utility of data-driven feature selection: re: Chu et al. 2012

Wesley T Kerr; Pamela K Douglas; Ariana Anderson; Mark S Cohen

doi:10.1016/j.neuroimage.2013.07.050

Neuroimage. 2014 Jan 1:84:1107-10. doi: 10.1016/j.neuroimage.2013.07.050. Epub 2013 Jul 25.

Authors

Wesley T Kerr¹, Pamela K Douglas, Ariana Anderson, Mark S Cohen

Affiliation

¹ David Geffen School of Medicine at UCLA, USA.

Abstract

The recent Chu et al. (2012) manuscript discusses two key findings regarding feature selection (FS): (1) data-driven FS was no better than using whole brain voxel data and (2) a priori biological knowledge was effective in guiding FS. Use of FS is highly relevant in neuroimaging-based machine learning, as the number of attributes can greatly exceed the number of exemplars. We strongly endorse their demonstration of both of these findings, and we provide additional important practical and theoretical arguments as to why, in their case, the data-driven FS methods they implemented did not result in improved accuracy. Further, we emphasize that the data-driven FS methods they tested performed approximately as well as the all-voxel case. We discuss why a sparse model may be favored over a complex one with similar performance. We caution readers that the findings in the Chu et al. report should not be generalized to all data-driven FS methods.

Keywords: Feature selection; Machine learning; Neuroimaging.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Comment

MeSH terms

Alzheimer Disease / classification*
Alzheimer Disease / pathology*
Cognitive Dysfunction / classification*
Cognitive Dysfunction / pathology*
Female
Humans
Magnetic Resonance Imaging*
Male
Neuroimaging*
