Background and purpose: It has been hypothesized that algorithms predicting the final outcome in acute ischemic stroke may provide future tools for identifying salvageable tissue and hence guide individualized therapy. We developed means of quantifying predictive model performance to identify model training strategies that optimize performance and reduce bias in predicted lesion volumes.
Methods: We optimized the predictive performance of logistic regression models based on the area under the receiver operating characteristic curve and used simulated data to illustrate the effect of an unbalanced training set (unequal numbers of infarcting and surviving voxels) on predicted infarct risk. We then tested the performance and optimality of models based on perfusion-weighted, diffusion-weighted, and structural MRI modalities by varying the proportion of mismatch voxels in balanced training sets.
Results: Predictive performance (area under the receiver operating characteristic curve) based on all brain voxels is excessively optimistic and is insensitive to performance in mismatch tissue. The ratio of infarcting to noninfarcting voxels used for training predictive algorithms significantly biases tissue infarct risk estimates. The optimal training strategy is obtained using a balanced training set. For the patient data presented, we show that 60% of the noninfarcted voxels in an optimal balanced training set are mismatch voxels.
Conclusions: An equal number of infarcting and noninfarcting voxels should be used when training predictive models. The choice of test and training sets critically affects predictive model performance and should be closely evaluated before comparisons across patient cohorts.
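The effect of training-set balance on predicted infarct risk can be illustrated with a minimal simulation. The sketch below is not the authors' code or data: it assumes a single hypothetical scalar feature drawn from unit-variance Gaussians (mean 1 for infarcting voxels, mean 0 for surviving voxels), a 1:9 unbalanced ratio chosen only for illustration, and a small Newton-Raphson logistic regression fit written in NumPy. The names `fit_logistic`, `simulate`, and `risk` are illustrative.

```python
import numpy as np


def fit_logistic(x, y, n_iter=25):
    """Fit p(y=1|x) = sigmoid(b0 + b1*x) by Newton-Raphson iterations."""
    X = np.column_stack([np.ones_like(x), x])  # intercept column + feature
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)                      # per-sample variance weights
        grad = X.T @ (y - p)                   # gradient of the log-likelihood
        hess = X.T @ (X * w[:, None])          # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def simulate(n_infarct, n_survive, rng):
    """Simulated voxels (assumption): feature ~ N(1,1) if infarcting, N(0,1) if surviving."""
    x = np.concatenate([rng.normal(1.0, 1.0, n_infarct),
                        rng.normal(0.0, 1.0, n_survive)])
    y = np.concatenate([np.ones(n_infarct), np.zeros(n_survive)])
    return x, y


def risk(beta, x):
    """Predicted infarct risk for feature value x."""
    return 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x)))


rng = np.random.default_rng(0)
beta_balanced = fit_logistic(*simulate(5000, 5000, rng))    # 1:1 training set
beta_unbalanced = fit_logistic(*simulate(1000, 9000, rng))  # 1:9 training set

# Same voxel feature value, two different risk estimates.
x_test = 0.5
risk_balanced = risk(beta_balanced, x_test)
risk_unbalanced = risk(beta_unbalanced, x_test)
print(f"balanced: {risk_balanced:.2f}, unbalanced: {risk_unbalanced:.2f}")
```

Under these assumptions the two fits have nearly the same slope, but the intercept of the unbalanced fit is shifted by approximately the log of the class ratio (here log 9). A voxel midway between the two class means therefore receives a risk near 0.5 from the balanced model and near 0.1 from the unbalanced one, which is the kind of bias in predicted risk that an unbalanced training set introduces.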