Introduction: Accurate prediction of short-term offending in young people exhibiting antisocial behaviour could support targeted interventions. Here we develop a set of machine learning (ML) models that predict offending status with good accuracy; furthermore, we show that interpretable ML analyses can complement these models to inform clinical decision-making.
Methods: This study included 679 individuals aged 11-17 years who displayed moderate-to-severe antisocial behaviour, drawn from a controlled trial of multisystemic therapy in England. The outcome was any criminal offence in the 18 months after study baseline. Four types of ML algorithm were trained: logistic regression, elastic net regression, random forest, and gradient boosting machine (GBM). Prediction models were developed (1) using predictors readily available to clinicians (e.g. sociodemographics, previous convictions), and (2) with additional information (e.g. parenting). Model-agnostic feature importance values were calculated and the most important predictors identified. Nested cross-validation was employed, with 100 iterations of random data splits and 10-fold cross-validation within each iteration, and the average predictive performance was reported.
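The resampling scheme described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the stratified 80/20 outer split, the toy nearest-neighbour model (`fit_knn`), the candidate hyperparameter values, and the synthetic data are all hypothetical choices made only so the example runs with the Python standard library. The abstract does not state how splits were drawn, which hyperparameters were tuned, or how confidence intervals were computed, so none of that is reproduced here.

```python
import random
import statistics

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case from each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(indices, y, k, rng):
    """Shuffle each class separately and deal its members round-robin into k folds."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i in indices if y[i] == cls]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds

def fit_knn(X_train, y_train, k):
    """Toy stand-in model (assumption): share of positives among the k nearest points."""
    def predict(x):
        nearest = sorted(zip(X_train, y_train), key=lambda pair: abs(pair[0] - x))[:k]
        return sum(label for _, label in nearest) / k
    return predict

def nested_cv(X, y, fit, candidates, n_iter=100, k=10, seed=0):
    """Repeat a random outer split n_iter times; tune on inner k-fold CV each time."""
    rng = random.Random(seed)
    outer_scores = []
    for _ in range(n_iter):
        # Outer split (assumed 80/20): hold out one of five stratified parts for testing.
        parts = stratified_folds(list(range(len(y))), y, 5, rng)
        test = parts[0]
        train = [i for part in parts[1:] for i in part]
        folds = stratified_folds(train, y, k, rng)

        def inner_auc(param):
            # Mean validation AUC over the k inner folds, using training data only.
            scores = []
            for held_out in range(k):
                fit_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
                model = fit([X[i] for i in fit_idx], [y[i] for i in fit_idx], param)
                val = folds[held_out]
                scores.append(auc([y[i] for i in val], [model(X[i]) for i in val]))
            return statistics.mean(scores)

        best = max(candidates, key=inner_auc)
        # Refit on the full training part with the selected value, then score the test part.
        model = fit([X[i] for i in train], [y[i] for i in train], best)
        outer_scores.append(auc([y[i] for i in test], [model(X[i]) for i in test]))
    return outer_scores

# Synthetic one-feature data, for illustration only (not study data).
data_rng = random.Random(1)
y_demo = [i % 2 for i in range(120)]
X_demo = [data_rng.gauss(2.0 * label, 1.0) for label in y_demo]
demo_scores = nested_cv(X_demo, y_demo, fit_knn, candidates=[5, 15, 25], n_iter=5)
print("mean test AUC over iterations:", round(statistics.mean(demo_scores), 3))
```

The key design point is that hyperparameter selection happens only inside the training part of each outer split, so the held-out test AUC is not contaminated by tuning; averaging over many outer iterations then stabilises the estimate.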
Results: Among the ML models using readily available predictors, the GBM was the strongest model (AUC 0.85, 95% CI 0.85-0.86); the other models had average AUCs of 0.82. This performance was better than that of a model using only the total number of previous offences as the predictor (0.67, 0.66-0.68) and that of a model simply taking past offending status as the prediction (0.81, 0.80-0.81). Additional predictors slightly increased the performance of the logistic regression and random forest models but decreased the performance of the elastic net regression and GBM models.
Conclusion: These findings demonstrate the potential utility of ML approaches for accurately predicting criminal offences in high-risk youth. Interpretable ML-based predictive models could be used in youth services or research to help develop and deliver effective interventions.
Keywords: Criminal offending; Machine learning; Prediction modelling; Recidivism; Youth crime.
© 2024. The Author(s).