A Unique Experiment That Could Make Social Media Better

Academic researchers weren’t getting anywhere by criticizing Big Tech platforms, so we decided to try collaborating instead.

Social media, news, music, shopping, and other sites all rely on recommender systems: algorithms that personalize what each individual user sees. These systems are largely driven by predictions of what each person will click, like, share, buy, and so on, usually shorthanded as “engagement.” These reactions can contain useful information about what’s important to us, but—as the existence of clickbait proves—just because we click on it doesn’t mean it’s good. 
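To make that concrete, here is a rough, hypothetical sketch of what purely engagement-driven ranking looks like. The reaction weights, probabilities, and post names are invented for illustration; they are not any platform's actual formula.

```python
# Minimal sketch of engagement-driven ranking: each candidate post gets
# model-predicted probabilities of a click, like, or share, and the feed
# simply shows the highest-scoring items first. All values are illustrative.

from dataclasses import dataclass

@dataclass
class Candidate:
    post_id: str
    p_click: float   # predicted probability of a click
    p_like: float    # predicted probability of a like
    p_share: float   # predicted probability of a share

def engagement_score(c: Candidate) -> float:
    # A typical engagement objective: a weighted sum of predicted reactions.
    return 1.0 * c.p_click + 2.0 * c.p_like + 3.0 * c.p_share

candidates = [
    Candidate("post_a", p_click=0.30, p_like=0.05, p_share=0.01),  # clickbait can win on clicks alone
    Candidate("post_b", p_click=0.10, p_like=0.20, p_share=0.05),
]

ranked = sorted(candidates, key=engagement_score, reverse=True)
print([c.post_id for c in ranked])
```

Nothing in this objective asks whether the click was worth it to the person who made it, which is exactly the gap the critics point to.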

Many critics argue that platforms should not try to maximize engagement, but instead optimize for some measure of long-term value for users. Some of the people who work for these platforms agree: Meta and other social media platforms, for example, have for some time been working on incorporating more direct feedback into recommender systems. 

For the past two years, we have been collaborating with Meta employees—as well as researchers from the University of Toronto, UC Berkeley, MIT, Harvard, Stanford, and KAIST, plus representatives from nonprofits and advocacy organizations—to do research that advances these efforts. This involves an experimental change to Facebook’s feed ranking—for users who choose to participate in our study—in order to make it respond to their feedback over a period of several months. 

Here’s how our study, which launches later this year, will work: Over three months, we will repeatedly ask participants about their experiences on the Facebook feed using a survey that aims to measure positive experiences, including spending time online with friends and getting good advice. (Our survey is a modified version of the previously validated Online Social Support Scale.) Then we’ll try to model the relationship between what was in a participant’s feed—for example, which sources and topics they saw—and their answers over time. Using this predictive model, we’ll then run the experiment again, this time trying to select the content that we think will lead to the best outcomes over time, as measured by the recurring surveys.
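The shape of that two-stage pipeline can be sketched in a few lines of code. Everything below is an illustrative stand-in: the feed-composition features, the ridge regression, and the survey numbers are assumptions made for this example, not the study's actual model or data.

```python
# Hypothetical sketch: (1) fit a model linking what appeared in a
# participant's feed to their recurring survey answers, then (2) use that
# model to choose the feed mix with the best predicted survey outcome.

import numpy as np
from sklearn.linear_model import Ridge

# Stage 1: each row summarizes one participant-period of exposure
# (e.g., share of posts from friends, from pages, about news); y is the
# survey score reported for the same period (higher = better experience).
X = np.array([
    [0.6, 0.3, 0.1],   # mostly friends
    [0.2, 0.6, 0.2],   # mostly pages
    [0.1, 0.2, 0.7],   # mostly news
    [0.5, 0.4, 0.1],
])
y = np.array([4.2, 3.1, 2.5, 3.9])

model = Ridge(alpha=1.0).fit(X, y)

# Stage 2: score candidate feed mixes by predicted survey outcome and pick
# the one the model expects will lead to the best-reported experience.
candidate_mixes = np.array([
    [0.7, 0.2, 0.1],
    [0.3, 0.3, 0.4],
])
predicted = model.predict(candidate_mixes)
best = candidate_mixes[int(np.argmax(predicted))]
print("predicted survey scores:", predicted, "chosen mix:", best)
```

The real experiment does this responsively over months, with consenting participants and far richer signals, but the loop is the same: survey, model, rerank, survey again.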

Our goal is to show that it’s technically possible to drive content selection algorithms by asking users about their experiences over a sustained period of time, rather than relying primarily on their immediate online reactions. 

We’re not suggesting that Meta, or any other company, should prioritize the specific survey questions we’re using. There are many ways to assess the long-term impact and value of recommendations, and there isn’t yet any consensus on which metrics to use or how to balance competing goals. Rather, the goal of this collaboration is to show how, potentially, any survey measure could be used to drive content recommendations toward chosen long-term outcomes. This might be applied to any recommender system on any platform. While engagement will always be a key signal, this work will establish both the principle and the technique for incorporating other information, including longer-term consequences. If this works, it might help the entire industry build products that lead to better user experiences.
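One plausible way to fold a survey-derived signal into ranking without discarding engagement is simply to blend the two predictions. The weighting below is a made-up illustration of the principle, not a recommendation for any particular platform.

```python
# Hypothetical blended objective: weight = 0 reproduces pure engagement
# ranking; weight = 1 ranks purely on a predicted long-term outcome
# (for example, a survey-derived measure of user experience).

def combined_score(p_engagement: float, predicted_long_term_value: float,
                   weight: float = 0.5) -> float:
    return (1 - weight) * p_engagement + weight * predicted_long_term_value

print(combined_score(0.8, 0.2))  # high engagement, low predicted value
print(combined_score(0.3, 0.9))  # the reverse
```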

A study like ours has never been done before, in part due to serious distrust between the researchers studying how to improve recommender systems and the platforms that operate them. Our experience shows just how difficult it is to arrange such an experiment, and how important it is to do so.

The project came out of informal conversations between an independent researcher and a Meta product manager more than two years ago. We then assembled the academic team, as well as researchers from nonprofits and advocacy groups, to help keep the focus on public benefit. Perhaps we were naive, but we were taken aback by rejections from people who nevertheless agreed that we were asking valuable questions. Some organizations passed because of the communications risk, or because some of their staff argued that collaborations with Big Tech are PR efforts at best, if not outright unethical.

Some of the pushback comes from the fact that Meta is putting money toward the project. Although no external researchers are being paid, the University of Toronto has contracted with Meta to manage the university-based parts of the collaboration. This project has significant administrative and engineering costs, in part because we decided to ensure research integrity by externally writing key parts of the code that Meta will run. This funding might have been more trouble than it was worth, but there’s also no reason researchers should have to scrape together pennies or spend taxpayer money when working with the largest companies in the world to develop socially beneficial technology. In the future, third-party funders could support the academic and civil society end of platform research collaborations, as they have sometimes done.

The problem with instinctive distrust of platforms is not that platforms are above criticism, but that blanket distrust blocks some of the most valuable work that can be done to make these systems less harmful, more beneficial, and more open. Many observers are placing their hopes in transparency, especially transparency required by law. The recently passed EU Digital Services Act requires platforms to make data available to qualified researchers, and a number of similar policy proposals have been introduced to the US Congress. Yet our work necessarily goes far beyond “data access.”

In our view, only an experiment that involves intervening on a live platform can test the hypothesis that recommender systems can be oriented to long-term positive outcomes, and develop sharable technology to do so. More than that, it’s unlikely that law alone can compel a company to engage in good faith on a complex project like this one; designing the core experiment took over a year and wouldn’t have been possible without the expertise of the Meta engineers who work with the platform’s technology daily. In any case, attempts to pass American laws ensuring researcher access to data have, so far, gone nowhere.

Yet collaborative experiments with public results are disincentivized. The answer isn’t to do technosocial research in secret—or worse, not at all—but to do it to higher ethical standards. Our experiment is being overseen by the University of Toronto’s human subjects review board (IRB), which is recognized by all the other universities involved as meeting their ethics requirements. All of the users in our study will have given informed consent to participate, and will be paid for their time. We were happy to find champions within Meta who believe in open research.

This level of cooperation requires navigating complex expectations about what information can, should, and won’t be shared. We designed a novel approach to resolving disagreements about confidentiality. We received contractual guarantees that our research will result in a scientific publication meeting peer review standards, and can’t be altered or held up for any reason other than legitimate privacy and confidentiality concerns. We also negotiated the freedom to talk publicly about our collaboration, and in the event the project is halted, the freedom to disclose the reasons why. We’re pretty sure nobody has seen an agreement like this before in an academic-industry collaboration. It took time to design and negotiate this new way of doing research.  

Finally, we insisted that the results be in the public domain, including any resulting intellectual property. We are trying to shift industry norms of secrecy, because virtually every platform faces similar challenges. Everyone would benefit from routine sharing of research.

When we started two years ago, the first reaction to this project was skepticism: “Meta will never do this, and I wouldn’t work with them even if they did.” Today the reaction is more often, “How can we do this too?” It now seems obvious that open research is the only way of addressing the intricate challenges of society-scale algorithms in a democratically legitimate way.

The risks haven’t gone away; we haven’t actually run the experiment yet. Collaborative science moves slower than industry, and Meta’s business priorities and regulatory environment can change quickly. Nor have we yet had to resolve any significant disagreements about what can and cannot be shared publicly. Either party could still derail this project, and set back societally important platform research by years. But we think there’s no substitute for taking such gambles, as researchers cannot run platform experiments alone and platforms cannot achieve legitimacy without openness. There is a crucial place for criticism and accountability, but something more optimistic is also needed to advance the field. We’re all better off when this sort of work happens.