Given the severe and rapid impact of COVID-19, the pace of information sharing has accelerated. However, traditional methods of disseminating and digesting medical information can be time-consuming and cumbersome. In a pilot study, the authors used social listening to quickly extract information from social media channels and explore what people with COVID-19 were saying about symptoms and disease progression. The goal was to determine whether amplifying patient voices could surface new information that other sources might have missed. Two data sets from social media groups of people who had, or were presumed to have, COVID-19 were analyzed: a Facebook group poll, and conversation data from a Reddit group that included detailed disease natural history-like posts. Content analysis and a customized analytics engine that incorporates machine learning and natural language processing were used to quickly identify the symptoms mentioned. Key findings include more than 20 symptoms in the data sets that did not appear in the online symptom lists of 4 respected medical information sources. The disease natural history-like posts revealed that people can experience symptoms for many weeks and that some symptoms change over time. This study demonstrates that social media, as a source of real-world data, can offer novel insights into patient experiences. This inductive research approach can quickly generate descriptive information that can be used to develop hypotheses and new research questions. The method also allows rapid assessment of large numbers of social media conversations and could be applied to monitor public health for emerging and rapidly spreading diseases such as COVID-19.
Keywords: COVID-19; content analysis; data mining; disease natural histories; social listening; social media.
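
The symptom-identification and comparison steps summarized above can be illustrated with a minimal sketch. This is not the authors' analytics engine, which is described only as combining machine learning and natural language processing. It substitutes a plain keyword lexicon for that engine, and every name in it (`SYMPTOM_LEXICON`, `find_symptoms`, `count_mentions`, `novel_symptoms`), along with the example posts and reference lists, is a hypothetical placeholder and not data or findings from the study.

```python
import re
from collections import Counter

# Hypothetical lexicon (assumption): canonical symptom -> phrases that signal it.
# A real pipeline would likely use a much larger vocabulary or a trained model.
SYMPTOM_LEXICON = {
    "fever": ["fever", "high temperature"],
    "cough": ["cough", "coughing"],
    "loss of smell": ["loss of smell", "cannot smell"],
    "tingling": ["tingling", "pins and needles"],
}


def find_symptoms(post, lexicon=SYMPTOM_LEXICON):
    """Return the set of canonical symptoms whose phrases occur in one post."""
    text = post.lower()
    found = set()
    for symptom, phrases in lexicon.items():
        for phrase in phrases:
            # Whole-word match so that "fever" does not match "feverish".
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                found.add(symptom)
                break
    return found


def count_mentions(posts, lexicon=SYMPTOM_LEXICON):
    """Count how many posts mention each symptom (at most once per post)."""
    counts = Counter()
    for post in posts:
        counts.update(find_symptoms(post, lexicon))
    return counts


def novel_symptoms(counts, reference_lists):
    """Symptoms mentioned in posts that appear in none of the reference lists."""
    known = set().union(*reference_lists) if reference_lists else set()
    return sorted(s for s in counts if s not in known)


# Invented example data (assumption), for illustration only.
EXAMPLE_POSTS = [
    "Week 2: fever is gone but the coughing continues.",
    "I cannot smell anything and still have a cough.",
    "Anyone else with tingling in their hands?",
]
EXAMPLE_REFERENCES = [{"fever", "cough"}, {"fever", "loss of smell"}]

if __name__ == "__main__":
    mention_counts = count_mentions(EXAMPLE_POSTS)
    print(mention_counts.most_common())
    print(novel_symptoms(mention_counts, EXAMPLE_REFERENCES))
```

Counting each symptom at most once per post keeps a single long post from dominating the tallies, and the set difference against the reference lists mirrors the comparison with published symptom lists. A keyword lexicon will miss misspellings, negations ("no fever"), and paraphrases, which is presumably part of what a machine learning and natural language processing engine is meant to handle.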