Detecting and tracking depression through temporal topic modeling of tweets: insights from a 180-day study

Ranganathan Chandrasekaran; Suhas Kotaki; Abhilash Hosaagrahaara Nagaraja

doi:10.1038/s44184-024-00107-5

Npj Ment Health Res. 2024 Dec 6;3(1):62. doi: 10.1038/s44184-024-00107-5.

Authors

Ranganathan Chandrasekaran¹ ², Suhas Kotaki³, Abhilash Hosaagrahaara Nagaraja³

Affiliations

¹ Department of Information & Decision Sciences, University of Illinois at Chicago, Chicago, IL, USA.
² Department of Biomedical and Health Information Sciences, University of Illinois at Chicago, Chicago, IL, USA.
³ Department of Information & Decision Sciences, University of Illinois at Chicago, Chicago, IL, USA.

Abstract

Depression affects over 280 million people globally, yet many cases remain undiagnosed or untreated due to stigma and lack of awareness. Social media platforms like X (formerly Twitter) offer a way to monitor and analyze depression markers. This study analyzes Twitter data from the 90 days before and the 90 days after a self-disclosed clinical diagnosis. We gathered 246,637 tweets from 229 diagnosed users. CorEx topic modeling identified seven themes: causes, physical symptoms, mental symptoms, swear words, treatment, coping/support mechanisms, and lifestyle. Conditional logistic regression then assessed the odds of these themes occurring post-diagnosis. A control group of healthy users (284,772 tweets) was used to develop and evaluate machine learning classifiers (support vector machines, naive Bayes, and logistic regression) to distinguish between depressed and non-depressed users. Logistic regression and SVM performed best. These findings show the potential of Twitter data for tracking depression and changes in symptoms, coping mechanisms, and treatment use.
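
The pre/post design described in the abstract can be illustrated with a small Python sketch. It is not the authors' pipeline. It assumes each tweet is a (date, text) pair, that the 90-day windows are half-open around the diagnosis date, and that a theme is flagged by simple keyword matching rather than by CorEx topics. It also relies on a standard simplification: when each user contributes one pre window and one post window and the only covariate is a binary theme indicator, the conditional logistic regression estimate of the odds ratio reduces to the ratio of discordant users. All function names here are hypothetical.

```python
from datetime import date

WINDOW_DAYS = 90  # window length taken from the abstract


def split_window(tweets, diagnosis_date):
    """Split (date, text) tweets into pre- and post-diagnosis texts.

    Assumption: pre covers days [-90, 0) and post covers days [0, 90)
    relative to the diagnosis date; tweets outside both are dropped.
    """
    pre, post = [], []
    for day, text in tweets:
        delta = (day - diagnosis_date).days
        if -WINDOW_DAYS <= delta < 0:
            pre.append(text)
        elif 0 <= delta < WINDOW_DAYS:
            post.append(text)
    return pre, post


def mentions(texts, keywords):
    """True if any text contains any keyword (case-insensitive substring)."""
    return any(k in t.lower() for t in texts for k in keywords)


def matched_pair_odds_ratio(users, keywords):
    """Odds ratio of a theme appearing post- versus pre-diagnosis.

    `users` is a list of (tweets, diagnosis_date) pairs. Each user acts
    as their own control, so only discordant users (theme in exactly one
    window) contribute. Returns None when no user is pre-only.
    """
    post_only = pre_only = 0
    for tweets, diagnosis_date in users:
        pre, post = split_window(tweets, diagnosis_date)
        in_pre, in_post = mentions(pre, keywords), mentions(post, keywords)
        if in_post and not in_pre:
            post_only += 1
        elif in_pre and not in_post:
            pre_only += 1
    if pre_only == 0:
        return None
    return post_only / pre_only
```

A value above 1 from `matched_pair_odds_ratio` would suggest the theme is more likely to appear after diagnosis than before. A full analysis would fit a conditional logistic regression with several themes at once and report confidence intervals, which this sketch omits.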