Concurrent Validity and Feasibility of Short Tests Currently Used to Measure Early Childhood Development in Large Scale Studies

Marta Rubio-Codina; M Caridad Araujo; Orazio Attanasio; Pablo Muñoz; Sally Grantham-McGregor

doi:10.1371/journal.pone.0160962

PLoS One. 2016 Aug 22;11(8):e0160962. doi: 10.1371/journal.pone.0160962. eCollection 2016.

Authors

Marta Rubio-Codina¹ ², M Caridad Araujo¹, Orazio Attanasio² ³, Pablo Muñoz⁴, Sally Grantham-McGregor⁵

Affiliations

¹ Social Protection and Health Division, Inter-American Development Bank, Washington, D.C., United States of America.
² Centre for the Evaluation of Development Policies, Institute for Fiscal Studies, London, United Kingdom.
³ Department of Economics, University College London, London, United Kingdom.
⁴ École de Psychologie, Université Laval, Quebec, Canada.
⁵ Faculty of Population Health Sciences, Institute of Child Health, University College London, London, United Kingdom.

Abstract

In low- and middle-income countries (LMICs), measuring early childhood development (ECD) with standard tests in large scale surveys and evaluations of interventions is difficult and expensive. Multi-dimensional screeners and single-domain tests ('short tests') are frequently used as alternatives. However, their validity in these circumstances is unknown. We examined the feasibility, reliability, and concurrent validity of three multi-dimensional screeners (Ages and Stages Questionnaires (ASQ-3), Denver Developmental Screening Test (Denver-II), Battelle Developmental Inventory screener (BDI-2)) and two single-domain tests (MacArthur-Bates Short-Forms (SFI and SFII), WHO Motor Milestones (WHO-Motor)) in 1,311 children aged 6-42 months in Bogota, Colombia. The scores were compared with those on the Bayley Scales of Infant and Toddler Development (Bayley-III), taken as the 'gold standard'. The Bayley-III was given at a center by psychologists, whereas the short tests were administered in the home by interviewers, as in a survey setting. Findings indicated good internal validity of all short tests except the ASQ-3. The BDI-2 took a long time to administer and was expensive, while the single-domain tests were quickest and cheapest and the Denver-II and ASQ-3 were intermediate. Concurrent validity of the multi-dimensional tests' cognitive, language, and fine motor scales with the corresponding Bayley-III scale was low below 19 months. However, it increased with age, becoming moderate-to-high over 30 months. In contrast, gross motor scales' concurrence was high under 19 months and then decreased. Of the single-domain tests, the WHO-Motor had high validity with gross motor under 16 months, and the SFI and SFII expressive scales showed moderate correlations with language under 30 months. Overall, the Denver-II was the most feasible and valid multi-dimensional test and the ASQ-3 performed poorly under 31 months. By domain, gross motor development had the highest concurrence below 19 months, and language above that age. Predictive validity investigation is needed to further guide the choice of instruments for large scale studies.

MeSH terms

Child Development / physiology*
Child, Preschool
Colombia
Developing Countries
Female
Humans
Infant
Language Tests / standards*
Male
Motor Skills / physiology*
Neuropsychological Tests / standards*
Poverty
Psychometrics / methods*
Psychomotor Performance / physiology*

Grants and funding

Data collection was funded by Fund RG-T1907 from the Inter-American Development Bank (IDB). Rubio-Codina’s research time was partly financed by the Leverhulme Trust Early Career Fellowship ECF/2008/0170. Attanasio’s research time was partly financed by the European Research Council (ERC) Advanced Grant 249612 and the Economic and Social Research Council (ESRC) Professorial Fellowship ES/K010700/1. The funder (IDB) provided support in the form of salaries for authors (MRC, MCA), but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the IDB, its Board of Directors, or the countries they represent.