Piecing together the narrative of #longcovid: an unsupervised deep learning of 1,354,889 X (formerly Twitter) posts from 2020 to 2023

Front Public Health. 2024 Dec 16:12:1491087. doi: 10.3389/fpubh.2024.1491087. eCollection 2024.

Abstract

Objective: To characterize the public conversations around long COVID, as expressed through X (formerly Twitter) posts from May 2020 to April 2023.

Methods: Using X as the data source, we extracted tweets containing #long-covid, #long_covid, or "long covid," posted from May 2020 to April 2023. We then conducted an unsupervised deep learning analysis using Bidirectional Encoder Representations from Transformers (BERT), which allowed us to process and analyze large-scale textual data, focusing on tweets from individual users. We employed BERT-based topic modeling, followed by reflexive thematic analysis, to categorize and refine the tweets into coherent themes and to interpret the overarching narratives within the long COVID discourse. In contrast to prior studies, the constructs framing our analyses were data driven and informed by the tenets of social constructivism.
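The extraction step can be illustrated with a minimal sketch. It assumes case-insensitive matching of the three search terms named above; the abstract does not state the exact matching rules or the collection tool, so the pattern and the function name `matches_search_terms` are illustrative assumptions, not the authors' code.

```python
import re

# Assumption: the three search terms from the Methods, matched case-insensitively.
# The study's actual query syntax is not given in the abstract.
SEARCH_PATTERN = re.compile(r"#long-covid|#long_covid|long covid", re.IGNORECASE)


def matches_search_terms(text: str) -> bool:
    """Return True if the post text contains any of the three search terms."""
    return SEARCH_PATTERN.search(text) is not None


# Hypothetical example posts, not taken from the study's dataset.
sample_posts = [
    "Month six of Long COVID and still exhausted",
    "New clinic opening for #long_covid patients",
    "Just got my booster today",
]
selected = [post for post in sample_posts if matches_search_terms(post)]
```

In this sketch the first two sample posts are selected and the third is not, because it contains none of the search terms.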

Results: Out of an initial dataset of 2,905,906 tweets, a total of 1,354,889 unique, English-language tweets from individual users were included in the final dataset for analysis. Three main themes were generated: (1) General discussions of long COVID, (2) Skepticism about long COVID, and (3) Adverse effects of long COVID on individuals. These themes covered public awareness, community support, misinformation, and personal experiences with long COVID. The analysis also revealed a stable temporal trend in long COVID discussions from 2020 to 2023, indicating sustained public interest in the topic.
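The reduction from the initial to the final dataset can be sketched as follows. The record layout (dicts with `text` and `lang` keys), the whitespace and case normalization used for deduplication, and the function names are assumptions made for illustration; the abstract does not describe how uniqueness or language was determined. Only the two counts come from the Results.

```python
def keep_unique_english(posts):
    """Keep English posts, dropping repeats of the same normalized text.

    `posts` is an iterable of dicts with hypothetical 'text' and 'lang' keys.
    """
    seen = set()
    kept = []
    for post in posts:
        if post.get("lang") != "en":
            continue
        # Assumed normalization: lowercase and collapse runs of whitespace.
        key = " ".join(post["text"].lower().split())
        if key in seen:
            continue
        seen.add(key)
        kept.append(post)
    return kept


def retained_share(final_count: int, initial_count: int) -> float:
    """Fraction of the initial dataset that remains after filtering."""
    return final_count / initial_count


# Counts reported in the Results.
share = retained_share(1_354_889, 2_905_906)
print(f"{share:.1%}")  # prints 46.6%
```

With the reported counts, a little under half of the initially collected tweets were retained for analysis.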

Conclusion: Social media, specifically X, helped shape public awareness and perception of long COVID, and the posts demonstrate a collective effort in community building and information sharing.

Keywords: BERTopic; Twitter; X; long COVID; machine learning; social media; topic modeling.

MeSH terms

  • COVID-19* / epidemiology
  • Deep Learning*
  • Humans
  • Narration
  • SARS-CoV-2
  • Social Media*
  • Unsupervised Machine Learning

Grants and funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.