-
Cura: Curation at Social Media Scale
Authors:
Wanrong He,
Mitchell L. Gordon,
Lindsay Popowski,
Michael S. Bernstein
Abstract:
How can online communities execute a focused vision for their space? Curation offers one approach, where community leaders manually select content to share with the community. Curation enables leaders to shape a space that matches their taste, norms, and values, but the practice is often intractable at social media scale: curators cannot realistically sift through hundreds or thousands of submissi…
▽ More
How can online communities execute a focused vision for their space? Curation offers one approach, where community leaders manually select content to share with the community. Curation enables leaders to shape a space that matches their taste, norms, and values, but the practice is often intractable at social media scale: curators cannot realistically sift through hundreds or thousands of submissions daily. In this paper, we contribute algorithmic and interface foundations enabling curation at scale, and manifest these foundations in a system called Cura. Our approach draws on the observation that, while curators' attention is limited, other community members' upvotes are plentiful and informative of curators' likely opinions. We thus contribute a transformer-based curation model that predicts whether each curator will upvote a post based on previous community upvotes. Cura applies this curation model to create a feed of content that it predicts the curator would want in the community. Evaluations demonstrate that the curation model accurately estimates opinions of diverse curators, that changing curators for a community results in clearly recognizable shifts in the community's content, and that, consequently, curation can reduce anti-social behavior by half without extra moderation effort. By sampling different types of curators, Cura lowers the threshold to genres of curated social media ranging from editorial groups to stakeholder roundtables to democracies.
△ Less
Submitted 26 August, 2023;
originally announced August 2023.
-
Jury Learning: Integrating Dissenting Voices into Machine Learning Models
Authors:
Mitchell L. Gordon,
Michelle S. Lam,
Joon Sung Park,
Kayur Patel,
Jeffrey T. Hancock,
Tatsunori Hashimoto,
Michael S. Bernstein
Abstract:
Whose labels should a machine learning (ML) algorithm learn to emulate? For ML tasks ranging from online comment toxicity to misinformation detection to medical diagnosis, different groups in society may have irreconcilable disagreements about ground truth labels. Supervised ML today resolves these label disagreements implicitly using majority vote, which overrides minority groups' labels. We intr…
▽ More
Whose labels should a machine learning (ML) algorithm learn to emulate? For ML tasks ranging from online comment toxicity to misinformation detection to medical diagnosis, different groups in society may have irreconcilable disagreements about ground truth labels. Supervised ML today resolves these label disagreements implicitly using majority vote, which overrides minority groups' labels. We introduce jury learning, a supervised ML approach that resolves these disagreements explicitly through the metaphor of a jury: defining which people or groups, in what proportion, determine the classifier's prediction. For example, a jury learning model for online toxicity might centrally feature women and Black jurors, who are commonly targets of online harassment. To enable jury learning, we contribute a deep learning architecture that models every annotator in a dataset, samples from annotators' models to populate the jury, then runs inference to classify. Our architecture enables juries that dynamically adapt their composition, explore counterfactuals, and visualize dissent.
△ Less
Submitted 7 February, 2022;
originally announced February 2022.
-
Goal-setting And Achievement In Activity Tracking Apps: A Case Study Of MyFitnessPal
Authors:
Mitchell L. Gordon,
Tim Althoff,
Jure Leskovec
Abstract:
Activity tracking apps often make use of goals as one of their core motivational tools. There are two critical components to this tool: setting a goal, and subsequently achieving that goal. Despite its crucial role in how a number of prominent self-tracking apps function, there has been relatively little investigation of the goal-setting and achievement aspects of self-tracking apps.
Here we exp…
▽ More
Activity tracking apps often make use of goals as one of their core motivational tools. There are two critical components to this tool: setting a goal, and subsequently achieving that goal. Despite its crucial role in how a number of prominent self-tracking apps function, there has been relatively little investigation of the goal-setting and achievement aspects of self-tracking apps.
Here we explore this issue, investigating a particular goal setting and achievement process that is extensive, recorded, and crucial for both the app and its users' success: weight loss goals in MyFitnessPal. We present a large-scale study of 1.4 million users and weight loss goals, allowing for an unprecedented detailed view of how people set and achieve their goals. We find that, even for difficult long-term goals, behavior within the first 7 days predicts those who ultimately achieve their goals, that is, those who lose at least as much weight as they set out to, and those who do not. For instance, high amounts of early weight loss, which some researchers have classified as unsustainable, leads to higher goal achievement rates. We also show that early food intake, self-monitoring motivation, and attitude towards the goal are important factors. We then show that we can use our findings to predict goal achievement with an accuracy of 79% ROC AUC just 7 days after a goal is set. Finally, we discuss how our findings could inform steps to improve goal achievement in self-tracking apps.
△ Less
Submitted 4 April, 2019;
originally announced April 2019.
-
HYPE: A Benchmark for Human eYe Perceptual Evaluation of Generative Models
Authors:
Sharon Zhou,
Mitchell L. Gordon,
Ranjay Krishna,
Austin Narcomey,
Li Fei-Fei,
Michael S. Bernstein
Abstract:
Generative models often use human evaluations to measure the perceived quality of their outputs. Automated metrics are noisy indirect proxies, because they rely on heuristics or pretrained embeddings. However, up until now, direct human evaluation strategies have been ad-hoc, neither standardized nor validated. Our work establishes a gold standard human benchmark for generative realism. We constru…
▽ More
Generative models often use human evaluations to measure the perceived quality of their outputs. Automated metrics are noisy indirect proxies, because they rely on heuristics or pretrained embeddings. However, up until now, direct human evaluation strategies have been ad-hoc, neither standardized nor validated. Our work establishes a gold standard human benchmark for generative realism. We construct Human eYe Perceptual Evaluation (HYPE) a human benchmark that is (1) grounded in psychophysics research in perception, (2) reliable across different sets of randomly sampled outputs from a model, (3) able to produce separable model performances, and (4) efficient in cost and time. We introduce two variants: one that measures visual perception under adaptive time constraints to determine the threshold at which a model's outputs appear real (e.g. 250ms), and the other a less expensive variant that measures human error rate on fake and real images sans time constraints. We test HYPE across six state-of-the-art generative adversarial networks and two sampling techniques on conditional and unconditional image generation using four datasets: CelebA, FFHQ, CIFAR-10, and ImageNet. We find that HYPE can track model improvements across training epochs, and we confirm via bootstrap sampling that HYPE rankings are consistent and replicable.
△ Less
Submitted 31 October, 2019; v1 submitted 1 April, 2019;
originally announced April 2019.