Depression affects over 280 million people globally, yet many cases remain undiagnosed or untreated because of stigma and lack of awareness. Social media platforms such as X (formerly Twitter) offer a way to monitor and analyze markers of depression. This study analyzes tweets posted in the 90 days before and the 90 days after a self-disclosed clinical diagnosis of depression. We gathered 246,637 tweets from 229 diagnosed users. CorEx topic modeling identified seven themes: causes, physical symptoms, mental symptoms, swear words, treatment, coping/support mechanisms, and lifestyle. Conditional logistic regression then assessed the odds of each theme occurring after diagnosis. A control group of healthy users (284,772 tweets) was used to develop and evaluate machine learning classifiers (support vector machines, naive Bayes, and logistic regression) that distinguish depressed from non-depressed users. Logistic regression and support vector machines performed best. These findings show the potential of Twitter data for tracking depression and changes in symptoms, coping mechanisms, and treatment use.
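
As a rough illustration of the pre/post comparison, and not the analysis actually performed in this study, suppose each user is reduced to two binary indicators: whether a theme appeared in the 90 days before diagnosis and whether it appeared in the 90 days after. Under that simplifying assumption, the conditional maximum-likelihood odds ratio for matched pairs reduces to the ratio of discordant pairs (users showing the theme only after diagnosis divided by users showing it only before). The function name and the toy data below are hypothetical.

```python
from math import exp, log, sqrt


def matched_pair_odds_ratio(pairs):
    """Odds of a theme appearing post-diagnosis vs. pre-diagnosis.

    pairs: one (pre, post) tuple of booleans per user, where each flag
    says whether the theme appeared in that user's tweets in the period.
    Returns (odds_ratio, (ci_low, ci_high)) with a 95% Wald interval.
    """
    pairs = list(pairs)
    # Only discordant users (theme present in exactly one period) carry
    # information; concordant users cancel out of the conditional likelihood.
    post_only = sum(1 for pre, post in pairs if post and not pre)
    pre_only = sum(1 for pre, post in pairs if pre and not post)
    if post_only == 0 or pre_only == 0:
        raise ValueError("need at least one discordant pair in each direction")

    odds_ratio = post_only / pre_only
    # Standard error of the log odds ratio for matched pairs.
    se = sqrt(1 / post_only + 1 / pre_only)
    ci_low = exp(log(odds_ratio) - 1.96 * se)
    ci_high = exp(log(odds_ratio) + 1.96 * se)
    return odds_ratio, (ci_low, ci_high)


if __name__ == "__main__":
    # Hypothetical toy data for one theme: 6 users mention it only after
    # diagnosis, 3 only before, 4 in both periods, 2 in neither.
    toy = (
        [(False, True)] * 6
        + [(True, False)] * 3
        + [(True, True)] * 4
        + [(False, False)] * 2
    )
    odds_ratio, (ci_low, ci_high) = matched_pair_odds_ratio(toy)
    # The point estimate is 6 / 3 = 2.0.
    print(f"OR = {odds_ratio:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An odds ratio above 1 would indicate that the theme is more likely to appear after diagnosis than before. A full conditional logistic regression can also include covariates, which this sketch omits.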
© 2024. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply.