Longitudinal multiple imputation approaches for body mass index or other variables with very low individual-level variability: the mibmi command in Stata

Evangelos Kontopantelis; Rosa Parisi; David A Springate; David Reeves

doi:10.1186/s13104-016-2365-z

BMC Res Notes. 2017 Jan 13;10(1):41. doi: 10.1186/s13104-016-2365-z.

Authors

Evangelos Kontopantelis¹ ², Rosa Parisi³, David A Springate⁴ ⁵, David Reeves⁴ ⁵

Affiliations

¹ NIHR School for Primary Care Research, University of Manchester, Williamson Building, Oxford Road, Manchester, M13 9PL, UK.
² Farr Institute for Health Informatics Research, University of Manchester, Vaughan House, Portsmouth Street, Manchester, M13 9GB, UK.
³ Centre for Pharmacoepidemiology & Drug Safety, University of Manchester, Stopford Building, Oxford Road, Manchester, M13 9PL, UK.
⁴ NIHR School for Primary Care Research, University of Manchester, Williamson Building, Oxford Road, Manchester, M13 9PL, UK.
⁵ Centre for Biostatistics, University of Manchester, JMF Building, Oxford Road, Manchester, M13 9PL, UK.

Abstract

Background: In modern health care systems, the computerization of all aspects of clinical care has led to the development of large data repositories. For example, in the UK, large primary care databases hold millions of electronic medical records, with detailed information on diagnoses, treatments, outcomes and consultations. Careful analyses of these observational datasets of routinely collected data can complement evidence from clinical trials or even answer research questions that cannot be addressed in an experimental setting. However, 'missingness' is a common problem for routinely collected data, especially for biological parameters measured over time. Absence of complete data for the whole of an individual's study period is a potential source of bias, and standard complete-case approaches may lead to biased estimates. Moreover, the structure of the data values makes standard cross-sectional multiple-imputation approaches unsuitable. In this paper we propose and evaluate mibmi, a new command for cleaning and imputing longitudinal body mass index data.

Results: The regression-based data cleaning aspects of the algorithm can be useful when researchers analyze messy longitudinal data. Although the multiple imputation algorithm is computationally expensive, it performed similarly to, or even better than, existing alternatives when interpolating observations.

Conclusion: The mibmi algorithm can be a useful tool for analyzing longitudinal body mass index data, or other longitudinal data with very low individual-level variability.

Keywords: Body mass index; Cleaning; Longitudinal data; Multiple imputation.

MeSH terms

Algorithms
Body Mass Index*
Humans
Longitudinal Studies
United Kingdom
