Objective: Clinical prediction models require updating as their performance deteriorates over time. We developed a testing procedure for selecting an updating method that minimizes overfitting, incorporates the uncertainty associated with updating sample sizes, and is applicable to both parametric and nonparametric models.
Materials and methods: We describe a procedure to select an updating method for dichotomous outcome models by balancing simplicity against accuracy. We illustrate the test's properties in simulated scenarios of population shift and in a case study of 2 models based on Department of Veterans Affairs inpatient admissions.
Results: In simulations, the test generally recommended no update under no population shift, no update or modest recalibration under case mix shifts, intercept correction under changing outcome rates, and refitting under shifted predictor-outcome associations. The recommended updates provided calibration superior or similar to that achieved with more complex updating. In the case study, however, small update sets led the test to recommend simpler updates than may have been ideal based on subsequent performance.
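The candidate updates named above (intercept correction, recalibration, refitting) follow the standard hierarchy for dichotomous outcome models. As a minimal sketch, assuming a logistic regression base model whose linear predictor is available on the updating sample, these options could be expressed as below; the function names and the statsmodels-based fitting are illustrative assumptions, not the authors' implementation, and the test's selection logic itself is not reproduced here.

    # Illustrative sketch only -- not the authors' test. Shows the standard
    # hierarchy of updating methods for a dichotomous-outcome model, assuming
    # a logistic base model whose linear predictor lp_old is available on the
    # updating sample (X_new, y_new).
    import numpy as np
    import statsmodels.api as sm

    def intercept_update(lp_old, y_new):
        """Correct the intercept only: slope on lp_old fixed at 1 via an offset."""
        n = len(y_new)
        fit = sm.GLM(y_new, np.ones((n, 1)),
                     family=sm.families.Binomial(), offset=lp_old).fit()
        delta = fit.params[0]  # estimated intercept correction
        return lambda lp: 1.0 / (1.0 + np.exp(-(lp + delta)))

    def logistic_recalibration(lp_old, y_new):
        """Re-estimate intercept and calibration slope on the linear predictor."""
        fit = sm.GLM(y_new, sm.add_constant(lp_old),
                     family=sm.families.Binomial()).fit()
        a, b = fit.params  # new intercept and slope applied to lp_old
        return lambda lp: 1.0 / (1.0 + np.exp(-(a + b * lp)))

    def refit(X_new, y_new):
        """Refit all coefficients on the updating sample (most complex option)."""
        fit = sm.GLM(y_new, sm.add_constant(X_new),
                     family=sm.families.Binomial()).fit()
        return lambda X: fit.predict(sm.add_constant(X))

A selection procedure such as the authors' would compare these candidates on the updating sample while penalizing unnecessary complexity.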
Discussion: Our test's recommendations highlighted the benefits of simple updating, as opposed to systematic refitting, in response to performance drift. The complexity of the recommended updating methods reflected the updating sample size and the magnitude of performance drift, as anticipated. The case study highlights the conservative nature of our test.
Conclusions: This new test supports data-driven updating of models developed with both biostatistical and machine learning approaches, promoting the transportability and maintenance of a wide array of clinical prediction models and, in turn, a variety of applications relying on modern prediction tools.
Keywords: calibration; model updating; predictive analytics.