A genetic algorithm to find optimal reading test word subsets for estimating full-scale IQ

Ian van der Linde; Peter Bright

doi:10.1371/journal.pone.0205754

PLoS One. 2018 Oct 18;13(10):e0205754. doi: 10.1371/journal.pone.0205754. eCollection 2018.

Authors

Ian van der Linde¹ ², Peter Bright² ³

Affiliations

¹ Department of Computing & Technology, Anglia Ruskin University, Cambridge, United Kingdom.
² Vision & Eye Research Unit (VERU), School of Medicine, Anglia Ruskin University, Cambridge, United Kingdom.
³ Department of Psychology, Anglia Ruskin University, Cambridge, United Kingdom.

Abstract

In clinical neuropsychology, the cognitive abilities of neurological patients are commonly estimated using well-established paper-based tests. Typically, scores on some tests remain relatively well preserved, whilst others exhibit a significant and disproportionate decline. Scores on those tests that measure preserved cognitive functions (so-called 'hold' tests) may be used to estimate premorbid abilities, including scores in non-hold tests that would have been expected prior to the onset of cognitive impairment. Many hold tests entail word reading, with each word being graded as correctly or incorrectly pronounced. Inevitably, such tests are likely to contain words that provide little or no diagnostic power (i.e., words that can be eliminated without negatively affecting prediction accuracy). In this paper, a genetic algorithm is developed and demonstrated, using n = 92 neurologically healthy participants, to identify optimal word subsets from the National Adult Reading Test that minimize the mean error in predicting the most widely used clinical measure of IQ and cognitive ability, the Wechsler Adult Intelligence Scale Fourth Edition IQ. In addition to requiring only 17-20 of the original 50 words (suggesting that this test could be revised to be up to 66% shorter) and minimizing mean prediction error, the algorithm increases the proportion of the variance in the predicted variable explained in comparison to using all words (from r² = 0.46 to r² = 0.61). In a clinical setting this would improve estimates of premorbid cognitive function and, if an abbreviated revision to this test were to be adopted, reduce the arduousness of the test for patients. The proposed method is evaluated with jackknifing and leave-one-out cross-validation.
The general approach may be used to optimize the relationship between any two psychological tests: the genetic algorithm is trained on data collected from participants to whom both tests have been administered, and finds the question subset in one test that minimizes the prediction error in the second. This approach may also be used to develop new predictive tests, since it provides a method to identify an optimal subset of a set of candidate questions (for which empirical data have been collected) that maximizes prediction accuracy and the proportion of variance in the predicted variable that can be explained.
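The subset search described above can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify the encoding, the fitness function, or the genetic operators, so everything below is an assumption made for illustration. Each candidate subset is encoded as a boolean mask over the 50 words, fitness is taken to be the mean absolute error of a least-squares line predicting IQ from the number of selected words read correctly, and the population size, mutation rate, uniform crossover, and elitist selection are arbitrary choices. The data are synthetic and merely match the study's dimensions (92 participants, 50 words).

```python
import numpy as np

def fitness(mask, X, y):
    """Mean absolute error of a straight-line fit of y on the count of
    correctly read words within the selected subset (assumed fitness)."""
    if not mask.any():
        return np.inf  # an empty subset cannot predict anything
    score = X[:, mask].sum(axis=1)
    A = np.column_stack([score, np.ones(len(score))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean(np.abs(A @ coef - y)))

def evolve(X, y, pop_size=60, generations=100, p_mut=0.02, seed=0):
    """Search for a word subset (boolean mask) with low prediction error."""
    rng = np.random.default_rng(seed)
    n_words = X.shape[1]
    pop = rng.random((pop_size, n_words)) < 0.5
    pop[0] = True  # seed the population with the full test as a baseline
    for _ in range(generations):
        errs = np.array([fitness(m, X, y) for m in pop])
        elite = pop[np.argsort(errs)][: pop_size // 2]  # keep the best half
        n_children = pop_size - len(elite)
        pa = elite[rng.integers(0, len(elite), n_children)]
        pb = elite[rng.integers(0, len(elite), n_children)]
        # Uniform crossover: each word is inherited from either parent.
        children = np.where(rng.random(pa.shape) < 0.5, pa, pb)
        # Mutation: flip each bit with a small probability.
        children ^= rng.random(children.shape) < p_mut
        pop = np.vstack([elite, children])
    errs = np.array([fitness(m, X, y) for m in pop])
    return pop[np.argmin(errs)], float(errs.min())

# Synthetic stand-in data (NOT the study's data): only the first 20 words
# are made to depend on a latent ability that also drives the IQ score.
rng = np.random.default_rng(1)
ability = rng.normal(size=92)
difficulty = rng.normal(size=50)
informative = np.arange(50) < 20
logits = np.where(informative, 2.0 * ability[:, None], 0.0) - difficulty
X = rng.random((92, 50)) < 1.0 / (1.0 + np.exp(-logits))  # True = read correctly
y = 100 + 15 * ability + rng.normal(scale=5, size=92)

full_err = fitness(np.ones(50, dtype=bool), X, y)
best, best_err = evolve(X, y)
print(f"all 50 words: MAE = {full_err:.2f}")
print(f"{best.sum()} selected words: MAE = {best_err:.2f}")
```

Because the fitness here is measured on the same participants used for the search, the reported error is optimistic. Any evaluation in the spirit of the jackknifing and leave-one-out cross-validation mentioned in the abstract would rerun the search with each participant held out and score the prediction on that participant only.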

MeSH terms

Adult
Aged
Algorithms
Cognition
Cognitive Dysfunction / diagnosis*
Female
Healthy Volunteers
Humans
Intelligence
Logistic Models
Male
Middle Aged
Models, Biological*
Neuropsychology / methods*
Predictive Value of Tests
Reading*
Wechsler Scales*
Young Adult

Grants and funding

The authors received no specific funding for this work.