Many web-based pharmaceutical e-commerce platforms allow consumers to post open-ended textual reviews of their purchase experiences. Analyzing this large volume of user-generated content to understand the true voice of consumers is of great significance to pharmaceutical manufacturers and e-commerce websites. The aim of this paper is to automatically extract hidden topics from web-based drug reviews using the structural topic model (STM) in order to examine consumers' concerns when they buy drugs online. The STM is a probabilistic extension of Latent Dirichlet Allocation (LDA) that allows the incorporation of document-level covariates. This feature allows us to capture consumer dissatisfaction along with its dynamics over time. We extract 12 topics, five of which are negative topics representing consumer dissatisfaction; their prevalence in negative reviews is substantially higher than in positive reviews. We also find that the prevalence of these five negative topics has not decreased over time. Furthermore, our results reveal that the prevalence of price-related topics has decreased significantly in positive reviews, which indicates that low-price strategies are becoming less attractive to customers. To the best of our knowledge, this is the first study to use the STM to analyze the unstructured textual data of drug reviews; it enhances the understanding of drug consumers' concerns and contributes to the pharmaceutical e-commerce literature.
Keywords: consumer concerns; online drug review; structural topic model; text mining.