Text mining to improve screening for trauma-related symptoms in a global sample

D Marengo; C M Hoeboer; B P Veldkamp; GPS-txt consortium; M Olff

doi:10.1016/j.psychres.2022.114753

Psychiatry Res. 2022 Oct:316:114753. doi: 10.1016/j.psychres.2022.114753. Epub 2022 Jul 28.

Authors

D Marengo¹, C M Hoeboer², B P Veldkamp³; GPS-txt consortium; M Olff⁴

Affiliations

¹ Department of Psychology, University of Turin, Via Verdi 10, Turin 10124, Italy.
² Department of Psychiatry, Amsterdam Public Health, University of Amsterdam, Amsterdam UMC location, Meibergdreef 9, Amsterdam 1105 AZ, the Netherlands.
³ Department of Learning, Data-Analytics, and Technology, Faculty of Behavioral Management and Social Sciences, University of Twente, the Netherlands.
⁴ Department of Psychiatry, Amsterdam Public Health, University of Amsterdam, Amsterdam UMC location, Meibergdreef 9, Amsterdam 1105 AZ, the Netherlands; ARQ National Psychotrauma Centre, Diemen, the Netherlands.

PMID: 35940089
DOI: 10.1016/j.psychres.2022.114753

Abstract

Previous studies showed that textual information could be used to screen respondents for posttraumatic stress disorder (PTSD). In this study, we explored the feasibility of using language features extracted from short text descriptions respondents provided of stressful events to predict trauma-related symptoms assessed using the Global Psychotrauma Screen. Texts were analyzed with both closed- and open-vocabulary methods to extract language features representing the occurrence of words, phrases, or specific topics in the description of stressful events. We also evaluated whether combining language features with self-report information, including respondents' demographics, event characteristics, and risk factors for trauma-related disorders, would improve the prediction performance. Data were collected using an online survey on a cross-national sample of 5048 respondents. Results showed that language data achieved the highest predictive power when both closed- and open-vocabulary features were included as predictors. Combining language data and self-report information resulted in a significant increase in performance and in a model which achieved good accuracy as a screener for probable PTSD diagnosis (.7 < AUC ≤ .8), with similar results regardless of the length of the text description of the event. Overall, results indicated that short texts add to the detection of trauma-related symptoms and probable PTSD diagnosis.

Keywords: PTSD; Screening; Text mining; Trauma-related symptoms.
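
The two kinds of language features named in the abstract, and the AUC used to judge screening accuracy, can be illustrated with a minimal sketch. This is not the authors' pipeline: the mini-lexicon, the function names, and the example sentence and scores below are all hypothetical and chosen only for illustration. Closed-vocabulary features are taken here as the share of tokens falling into predefined word categories, open-vocabulary features as counts of the words and two-word phrases that occur in the text itself, and AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon; the dictionaries used in the study are not
# specified in the abstract.
TOY_LEXICON = {
    "threat": {"attack", "accident", "died", "death"},
    "negative_emotion": {"afraid", "scared", "helpless"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def closed_vocab_features(tokens, lexicon=TOY_LEXICON):
    """Closed vocabulary: share of tokens in each predefined category."""
    total = len(tokens) or 1  # avoid division by zero on empty texts
    return {
        category: sum(token in words for token in tokens) / total
        for category, words in lexicon.items()
    }


def open_vocab_features(tokens, max_n=2):
    """Open vocabulary: counts of word n-grams (n = 1..max_n) found in the text."""
    features = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            features[" ".join(tokens[i:i + n])] += 1
    return features


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Invented example sentence and scores, for illustration only.
tokens = tokenize("I was scared after the accident")
closed = closed_vocab_features(tokens)
opened = open_vocab_features(tokens)
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In a full screening model, both feature sets would be concatenated with the self-report variables and passed to a classifier, whose predicted scores would then be evaluated with `auc` on held-out respondents.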

MeSH terms

Data Mining* / methods
Humans
Mass Screening
Risk Factors
Self Report
Stress Disorders, Post-Traumatic* / diagnosis
Stress Disorders, Post-Traumatic* / epidemiology