Objective: To assess whether the outcome generation true model can be identified from other candidate models for clinical practice using current conventional model performance measures, across various simulation scenarios and with a CVD risk prediction case study as exemplar.
Study design and setting: Thousands of scenarios of true models were used to simulate clinical data; the true models and various candidate models were trained on training datasets and then compared on testing datasets using 25 conventionally used model performance measures. The study comprised a univariate simulation (179.2k simulated datasets and over 1.792 million models), a multivariate simulation (728k simulated datasets and over 8.736 million models) and a CVD risk prediction case analysis.
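The general workflow can be illustrated with a minimal sketch (not the study's actual code): simulate a binary outcome from a known true model, fit the true model and several candidate models on a training split, and compare them on a test split with a performance measure such as the C statistic. All specifics below (a logistic true model, the chosen coefficients, scikit-learn, and ROC AUC as the sole measure) are illustrative assumptions, not the 25 measures or thousands of scenarios used in the study.

```python
# Minimal sketch of the simulate / train / compare workflow (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_train, n_test = 5000, 5000
n = n_train + n_test

# Simulate clinical data from an assumed "true" outcome-generating model.
x1 = rng.normal(size=n)                      # causal predictor 1
x2 = rng.normal(size=n)                      # causal predictor 2
noise = rng.normal(size=n)                   # non-causal noise predictor
logit = -1.0 + 0.8 * x1 + 0.6 * x2           # assumed true linear predictor
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Candidate predictor sets: the true model, a model with an extra noise
# predictor, and a proxy model missing one causal predictor.
designs = {
    "true model":        np.column_stack([x1, x2]),
    "extra noise":       np.column_stack([x1, x2, noise]),
    "missing predictor": np.column_stack([x1]),
}

train, test = slice(0, n_train), slice(n_train, n)
for name, X in designs.items():
    model = LogisticRegression().fit(X[train], y[train])
    pred = model.predict_proba(X[test])[:, 1]
    print(f"{name:18s} C statistic = {roc_auc_score(y[test], pred):.3f}")

# Flip-coin baseline: random predictions, expected C statistic around 0.5.
coin = rng.uniform(size=n_test)
print(f"{'flip-coin':18s} C statistic = {roc_auc_score(y[test], coin):.3f}")
```

In a sketch like this, the candidate model with an extra noise predictor will typically attain nearly the same C statistic as the true model, which is the kind of indistinguishability the study quantifies across many scenarios and measures.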
Results: True models had an overall C statistic (95% range) of 0.67 (0.51, 0.96) across all scenarios in the univariate simulation, 0.81 (0.54, 0.98) in the multivariate simulation, 0.85 (0.82, 0.88) in the univariate case analysis and 0.85 (0.82, 0.88) in the multivariate case analysis. The measures showed very clear differences between the true model and the flip-coin model, little or no difference between the true model and candidate models with extra noise predictors, and relatively small differences between the true model and proxy models missing causal predictors.
Conclusion: The study found that the true model is not always identified as the best-performing model by current conventional measures for a binary outcome, even though the true model is present in the clinical data. New statistical approaches or measures should be established to identify the causal true model from proxy models, especially proxy models with extra noise predictors and/or missing causal predictors.
Keywords: Cardiovascular disease; Clinical risk prediction model; Model performance measures; Outcome generation true model.