Patterns of high-risk drinking among medical students: A web-based survey with machine learning

Comput Biol Med. 2021 Sep:136:104747. doi: 10.1016/j.compbiomed.2021.104747. Epub 2021 Aug 16.

Abstract

Background: Prior studies have found increased rates of alcohol consumption among physicians and medical students. The present study aims to build machine learning (ML) models to identify patterns of high-risk drinking (HRD), including alcohol use disorder, within this population.

Methods: We analyzed data collected through a web-based survey among Brazilian medical students. Variables included sociodemographic data, personal information, university status, and mental health. Stratification for HRD was carried out based on AUDIT-C scores. Three ML algorithms were used to build classifiers to predict HRD among medical students: elastic net regularization, random forest, and artificial neural networks. Model interpretation techniques were adopted to identify the most influential predictors of the models' decisions, which represent potential factors associated with HRD.
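
The AUDIT-C stratification step can be illustrated with a minimal sketch. The AUDIT-C has three items, each scored 0 to 4, for a total of 0 to 12. The abstract does not state which cutoff was used, so the sex-specific thresholds below (4 or more for men, 3 or more for women) are an assumption based on commonly cited screening cutoffs, and the names AuditC, ASSUMED_CUTOFFS, and is_high_risk are hypothetical.

```python
from dataclasses import dataclass

# ASSUMPTION: commonly cited AUDIT-C screening cutoffs, not taken from the study.
ASSUMED_CUTOFFS = {"male": 4, "female": 3}


@dataclass
class AuditC:
    """Responses to the three AUDIT-C items, each scored 0-4."""
    frequency: int          # how often alcohol is consumed
    typical_quantity: int   # drinks on a typical drinking day
    binge_frequency: int    # how often six or more drinks on one occasion

    def total(self) -> int:
        """Return the summed AUDIT-C score (0-12), validating each item."""
        items = (self.frequency, self.typical_quantity, self.binge_frequency)
        for value in items:
            if not 0 <= value <= 4:
                raise ValueError(f"AUDIT-C item score out of range: {value}")
        return sum(items)


def is_high_risk(responses: AuditC, sex: str) -> bool:
    """Label a respondent as HRD under the assumed sex-specific cutoff."""
    return responses.total() >= ASSUMED_CUTOFFS[sex]


if __name__ == "__main__":
    example = AuditC(frequency=2, typical_quantity=1, binge_frequency=1)
    print(example.total(), is_high_risk(example, "male"))
```

Under these assumed cutoffs, the binary label produced by is_high_risk would serve as the classification target (HRD versus LRD) for the models.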

Results: A total of 4840 medical students were included in the study. The prevalence of HRD was 53.03%. The three ML models were able to distinguish individuals with HRD from those with low-risk drinking (LRD), with very similar performance. The average AUC scores in the cross-validation procedure were around 0.72, and this performance was replicated in the test set. The most important features for the ML models were the use of tobacco and cannabis, monthly family income, marital status, sexual orientation, and physical activity.
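
For readers unfamiliar with the metric, the AUC reported above can be read as the probability that a randomly chosen HRD respondent receives a higher predicted score than a randomly chosen LRD respondent. The sketch below computes it from that pairwise definition and averages it over folds. It is an illustration only, not the study's evaluation code; the function names and the toy data are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_cv_auc(folds: Sequence[Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Average the AUC over held-out folds, each given as (labels, scores)."""
    return mean(roc_auc(labels, scores) for labels, scores in folds)


if __name__ == "__main__":
    # Toy data: 1 = HRD, 0 = LRD; scores are made-up model outputs.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a value around 0.72 indicates moderate discrimination.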

Conclusions: This study proposes that ML models may serve as tools for the initial screening of students for susceptibility to at-risk drinking or alcohol use disorder. In addition, we identified several key factors associated with HRD that could be investigated further to inform preventive and support measures.

Keywords: Classification models; High-risk drinking; Machine learning; Medical students.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Female
  • Humans
  • Internet
  • Machine Learning
  • Male
  • Neural Networks, Computer
  • Students, Medical*