From theory to practice: insights and hurdles in collecting social media data for social science research

Front Big Data. 2024 May 30:7:1379921. doi: 10.3389/fdata.2024.1379921. eCollection 2024.

Abstract

Social media has profoundly changed our modes of self-expression, communication, and participation in public discourse, generating volumes of conversations and content that cover every aspect of our social lives. Social media platforms have thus become increasingly important as data sources to identify social trends and phenomena. In recent years, academics have steadily lost ground on access to social media data as technology companies have set more restrictions on Application Programming Interfaces (APIs) or entirely closed public APIs. This circumstance halts the work of many social scientists who have used such data to study issues of public good. We considered the viability of eight approaches for image-based social media data collection: data philanthropy organizations, data repositories, data donation, third-party data companies, homegrown tools, and various web scraping tools and scripts. This paper discusses the advantages and challenges of these approaches, drawing on the literature and the authors' experience. We conclude the paper by discussing mechanisms for improving social media data collection that will enable this future frontier of social science research.

Keywords: Instagram; application programming interfaces; data collection; data ethics; landscape research; secondary data; social media; visual methods.

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. Funding for YC was provided by the Social Sciences and Humanities Research Council of Canada through Insight Grant 435-2018-1018, 2018-2022 (MS as PI, KS as CI) and Insight Grant 435-2021-0221, 2021-2025 (KS as PI), and by the Nova Scotia Research and Innovation Graduate Scholarship (2018-2023).