-
Implicit degree bias in the link prediction task
Authors:
Rachith Aiyappa,
Xin Wang,
Munjung Kim,
Ozgur Can Seckin,
Jisung Yoon,
Yong-Yeol Ahn,
Sadamori Kojaku
Abstract:
Link prediction -- a task of distinguishing actual hidden edges from random unconnected node pairs -- is one of the quintessential tasks in graph machine learning. Despite being widely accepted as a universal benchmark and a downstream task for representation learning, the validity of the link prediction benchmark itself has been rarely questioned. Here, we show that the common edge sampling procedure in the link prediction task has an implicit bias toward high-degree nodes and produces a highly skewed evaluation that favors methods overly dependent on node degree, to the extent that a "null" link prediction method based solely on node degree can yield nearly optimal performance. We propose a degree-corrected link prediction task that offers a more reasonable assessment that aligns better with the performance in the recommendation task. Finally, we demonstrate that the degree-corrected benchmark can more effectively train graph machine-learning models by reducing overfitting to node degrees and facilitating the learning of relevant structures in graphs.
Submitted 29 May, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
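The degree bias described in the abstract above can be illustrated with a minimal sketch. It is not the authors' code or benchmark: the toy graph, the function names, and the choice of the degree product as the "null" score are all assumptions made for illustration, and degrees are computed on the full graph rather than on a training graph with hidden edges.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy graph: one hub (node 0) plus a short chain.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5)]

def degrees(edge_list):
    """Count how many edges touch each node."""
    deg = Counter()
    for u, v in edge_list:
        deg[u] += 1
        deg[v] += 1
    return deg

deg = degrees(edges)
nodes = sorted(deg)

# Average degree of a node picked uniformly at random.
mean_degree_node = sum(deg.values()) / len(nodes)

# Average degree of an endpoint of an edge picked uniformly at random.
# A node with degree d appears d times among edge endpoints, so
# high-degree nodes are over-represented when edges are sampled.
mean_degree_endpoint = sum(d * d for d in deg.values()) / sum(deg.values())

def degree_product_score(u, v):
    """Assumed 'null' predictor: uses node degrees only."""
    return deg[u] * deg[v]

def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

edge_set = {frozenset(e) for e in edges}
non_edges = [p for p in combinations(nodes, 2) if frozenset(p) not in edge_set]

pos = [degree_product_score(u, v) for u, v in edges]
neg = [degree_product_score(u, v) for u, v in non_edges]
null_auc = auc(pos, neg)

print(mean_degree_node, mean_degree_endpoint, null_auc)
```

On this toy graph the endpoints of sampled edges have a higher average degree than nodes drawn at random, and the degree-only score already separates edges from non-edges well, which is the kind of skew the abstract attributes to the standard benchmark.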
-
Benchmarking zero-shot stance detection with FlanT5-XXL: Insights from training data, prompting, and decoding strategies into its near-SoTA performance
Authors:
Rachith Aiyappa,
Shruthi Senthilmani,
Jisun An,
Haewoon Kwak,
Yong-Yeol Ahn
Abstract:
We investigate the performance of LLM-based zero-shot stance detection on tweets. Using FlanT5-XXL, an instruction-tuned open-source LLM, with the SemEval 2016 Tasks 6A, 6B, and P-Stance datasets, we study the performance and its variations under different prompts and decoding strategies, as well as the potential biases of the model. We show that the zero-shot approach can match or outperform state-of-the-art benchmarks, including fine-tuned models. We provide various insights into its performance including the sensitivity to instructions and prompts, the decoding strategies, the perplexity of the prompts, and to negations and oppositions present in prompts. Finally, we ensure that the LLM has not been trained on test datasets, and identify a positivity bias which may partially explain the performance differences across decoding strategies.
Submitted 29 February, 2024;
originally announced March 2024.
-
Can we trust the evaluation on ChatGPT?
Authors:
Rachith Aiyappa,
Jisun An,
Haewoon Kwak,
Yong-Yeol Ahn
Abstract:
ChatGPT, the first large language model (LLM) with mass adoption, has demonstrated remarkable performance in numerous natural language tasks. Despite its evident usefulness, evaluating ChatGPT's performance in diverse problem domains remains challenging due to the closed nature of the model and its continuous updates via Reinforcement Learning from Human Feedback (RLHF). We highlight the issue of data contamination in ChatGPT evaluations, with a case study of the task of stance detection. We discuss the challenge of preventing data contamination and ensuring fair model evaluation in the age of closed and continuously trained models.
Submitted 22 March, 2023;
originally announced March 2023.
-
A Multi-Platform Collection of Social Media Posts about the 2022 U.S. Midterm Elections
Authors:
Rachith Aiyappa,
Matthew R. DeVerna,
Manita Pote,
Bao Tran Truong,
Wanying Zhao,
David Axelrod,
Aria Pessianzadeh,
Zoher Kachwala,
Munjung Kim,
Ozgur Can Seckin,
Minsuk Kim,
Sunny Gandhi,
Amrutha Manikonda,
Francesco Pierri,
Filippo Menczer,
Kai-Cheng Yang
Abstract:
Social media are utilized by millions of citizens to discuss important political issues. Politicians use these platforms to connect with the public and broadcast policy positions. Therefore, data from social media has enabled many studies of political discussion. While most analyses are limited to data from individual platforms, people are embedded in a larger information ecosystem spanning multiple social networks. Here we describe and provide access to the Indiana University 2022 U.S. Midterms Multi-Platform Social Media Dataset (MEIU22), a collection of social media posts from Twitter, Facebook, Instagram, Reddit, and 4chan. MEIU22 links to posts about the midterm elections based on a comprehensive list of keywords and tracks the social media accounts of 1,011 candidates from October 1 to December 25, 2022. We also publish the source code of our pipeline to enable similar multi-platform research projects.
Submitted 26 March, 2023; v1 submitted 16 January, 2023;
originally announced January 2023.
-
Emergence of simple and complex contagion dynamics from weighted belief networks
Authors:
Rachith Aiyappa,
Alessandro Flammini,
Yong-Yeol Ahn
Abstract:
Social contagion is a ubiquitous and fundamental process that drives individual and social changes. Although social contagion arises as a result of cognitive processes and biases, the integration of cognitive mechanisms with the theory of social contagion remains an open challenge. In particular, studies on social phenomena usually assume contagion dynamics to be either simple or complex, rather than allowing it to emerge from cognitive mechanisms, despite empirical evidence indicating that a social system can exhibit a spectrum of contagion dynamics -- from simple to complex -- simultaneously. Here, we propose a model of interacting beliefs, from which both simple and complex contagion dynamics can organically arise. Our model also elucidates how a fundamental mechanism of complex contagion -- resistance -- can come about from cognitive mechanisms.
Submitted 29 April, 2024; v1 submitted 5 January, 2023;
originally announced January 2023.
-
Identifying and characterizing superspreaders of low-credibility content on Twitter
Authors:
Matthew R. DeVerna,
Rachith Aiyappa,
Diogo Pacheco,
John Bryden,
Filippo Menczer
Abstract:
The world's digital information ecosystem continues to struggle with the spread of misinformation. Prior work has suggested that users who consistently disseminate a disproportionate amount of low-credibility content -- so-called superspreaders -- are at the center of this problem. We quantitatively confirm this hypothesis and introduce simple metrics to predict the top superspreaders several months into the future. We then conduct a qualitative review to characterize the most prolific superspreaders and analyze their sharing behaviors. Superspreaders include pundits with large followings, low-credibility media outlets, personal accounts affiliated with those media outlets, and a range of influencers. They are primarily political in nature and use more toxic language than the typical user sharing misinformation. We also find concerning evidence that suggests Twitter may be overlooking prominent superspreaders. We hope this work will further public understanding of bad actors and promote steps to mitigate their negative impacts on healthy digital discourse.
Submitted 30 January, 2024; v1 submitted 19 July, 2022;
originally announced July 2022.