Objective: To compare different prediction models for assessing the outcome of patients undergoing non-invasive risk stratification for suspected or known coronary artery disease.
Methods: Six statistical classifiers and data mining models were applied to the prospective data banks of two different institutions. One institution provided the training set (n = 2777) and the other the test set (n = 2679); each set consisted of routine clinical and stress echo information on patients followed up for the combined endpoint of all-cause mortality and non-fatal acute coronary syndromes. The following models were used: Logistic Regression, Generalized Additive Model, Projection Pursuit Regression, Linear Discriminant Analysis, Quadratic Discriminant Analysis and Artificial Neural Networks. Models were selected using the Akaike Information Criterion and compared in terms of accuracy, Negative Predictive Value, overall Misclassification Rate and ROC Area Under Curve.
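As an illustration of the comparison metrics named above, the following minimal sketch computes accuracy, Negative Predictive Value, overall Misclassification Rate and ROC Area Under Curve from predicted event probabilities. The function names, the 0.5 decision threshold and the toy data are assumptions made for this example only; the study does not report its cut-off or implementation.

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Confusion-matrix metrics for a binary outcome (1 = event, 0 = no event).

    The threshold is a hypothetical choice for this sketch.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = len(y_true)
    return {
        "accuracy": (tp + tn) / n,
        # NPV: fraction of patients predicted event-free who had no event
        "npv": tn / (tn + fn) if (tn + fn) else float("nan"),
        "misclassification_rate": (fp + fn) / n,
    }


def roc_auc(y_true, y_score):
    """AUC as the probability that a randomly chosen event case receives a
    higher score than a randomly chosen non-event case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, not taken from the study
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
metrics = classification_metrics(y_true, y_score)
auc = roc_auc(y_true, y_score)
```

Applying the same two functions to each model's predictions on the training and test sets would give a like-for-like comparison across the six models.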
Results: During a median follow-up of 31 months, 573 events occurred: 271 in the training set and 302 in the test set. All models selected the same subset of covariates as significantly associated with the outcome. The comparison of model performance showed that: (1) Quadratic Discriminant Analysis and Artificial Neural Networks predicted outcome less well than models more closely bound to the hypothesis of linearity of the covariate effects; (2) the overall predictive capability of the best-performing models was excellent (>90% and >85% for the training and test sets, respectively); and (3) there was a substantial lack of agreement among the models' indications for individual patients.
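Agreement between two models at the individual-patient level can be quantified in several ways. The sketch below uses Cohen's kappa on binary risk classifications; this is an assumed choice for illustration, since the measure of agreement used in the study is not stated here, and the function name and toy labels are hypothetical.

```python
def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two binary classifications
    (1 = classified high risk, 0 = classified low risk)."""
    n = len(labels_a)
    # Observed proportion of patients on whom the two models agree
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    # Agreement expected by chance, from each model's marginal rates
    rate_a = sum(labels_a) / n
    rate_b = sum(labels_b) / n
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if expected == 1.0:
        # Both models assign every patient to the same single class
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy classifications from two hypothetical models, not study data
model_a = [1, 1, 0, 0]
model_b = [1, 0, 0, 0]
kappa = cohen_kappa(model_a, model_b)
```

A kappa near 1 indicates that two models classify the same patients in the same way, while values near 0 indicate agreement no better than chance, even when the models' overall accuracy is similar.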
Conclusions: The selection of variables and the choice of predictive model are not independent processes, and both may affect the performance of risk scoring systems or algorithms designed to transfer general prognostic rules into clinical practice. Thus, caution must be used in translating model predictions into strict clinical indications.