-
Solving for X and Beyond: Can Large Language Models Solve Complex Math Problems with More-Than-Two Unknowns?
Authors:
Kuei-Chun Kao,
Ruochen Wang,
Cho-Jui Hsieh
Abstract:
Large Language Models (LLMs) have demonstrated remarkable performance in solving math problems, a hallmark of human intelligence. Despite high success rates on current benchmarks; however, these often feature simple problems with only one or two unknowns, which do not sufficiently challenge their reasoning capacities. This paper introduces a novel benchmark, BeyondX, designed to address these limi…
▽ More
Large Language Models (LLMs) have demonstrated remarkable performance in solving math problems, a hallmark of human intelligence. Despite high success rates on current benchmarks; however, these often feature simple problems with only one or two unknowns, which do not sufficiently challenge their reasoning capacities. This paper introduces a novel benchmark, BeyondX, designed to address these limitations by incorporating problems with multiple unknowns. Recognizing the challenges in proposing multi-unknown problems from scratch, we developed BeyondX using an innovative automated pipeline that progressively increases complexity by expanding the number of unknowns in simpler problems. Empirical study on BeyondX reveals that the performance of existing LLMs, even those fine-tuned specifically on math tasks, significantly decreases as the number of unknowns increases - with a performance drop of up to 70\% observed in GPT-4. To tackle these challenges, we propose the Formulate-and-Solve strategy, a generalized prompting approach that effectively handles problems with an arbitrary number of unknowns. Our findings reveal that this strategy not only enhances LLM performance on the BeyondX benchmark but also provides deeper insights into the computational limits of LLMs when faced with more complex mathematical challenges.
△ Less
Submitted 6 July, 2024;
originally announced July 2024.
-
THEANINE: Revisiting Memory Management in Long-term Conversations with Timeline-augmented Response Generation
Authors:
Seo Hyun Kim,
Kai Tzu-iunn Ong,
Taeyoon Kwon,
Namyoung Kim,
Keummin Ka,
SeongHyeon Bae,
Yohan Jo,
Seung-won Hwang,
Dongha Lee,
Jinyoung Yeo
Abstract:
Large language models (LLMs) are capable of processing lengthy dialogue histories during prolonged interaction with users without additional memory modules; however, their responses tend to overlook or incorrectly recall information from the past. In this paper, we revisit memory-augmented response generation in the era of LLMs. While prior work focuses on getting rid of outdated memories, we argu…
▽ More
Large language models (LLMs) are capable of processing lengthy dialogue histories during prolonged interaction with users without additional memory modules; however, their responses tend to overlook or incorrectly recall information from the past. In this paper, we revisit memory-augmented response generation in the era of LLMs. While prior work focuses on getting rid of outdated memories, we argue that such memories can provide contextual cues that help dialogue systems understand the development of past events and, therefore, benefit response generation. We present Theanine, a framework that augments LLMs' response generation with memory timelines -- series of memories that demonstrate the development and causality of relevant past events. Along with Theanine, we introduce TeaFarm, a counterfactual-driven question-answering pipeline addressing the limitation of G-Eval in long-term conversations. Supplementary videos of our methods and the TeaBag dataset for TeaFarm evaluation are in https://theanine-693b0.web.app/.
△ Less
Submitted 16 June, 2024;
originally announced June 2024.
-
Ever-Evolving Memory by Blending and Refining the Past
Authors:
Seo Hyun Kim,
Keummin Ka,
Yohan Jo,
Seung-won Hwang,
Dongha Lee,
Jinyoung Yeo
Abstract:
For a human-like chatbot, constructing a long-term memory is crucial. However, current large language models often lack this capability, leading to instances of missing important user information or redundantly asking for the same information, thereby diminishing conversation quality. To effectively construct memory, it is crucial to seamlessly connect past and present information, while also poss…
▽ More
For a human-like chatbot, constructing a long-term memory is crucial. However, current large language models often lack this capability, leading to instances of missing important user information or redundantly asking for the same information, thereby diminishing conversation quality. To effectively construct memory, it is crucial to seamlessly connect past and present information, while also possessing the ability to forget obstructive information. To address these challenges, we propose CREEM, a novel memory system for long-term conversation. Improving upon existing approaches that construct memory based solely on current sessions, CREEM blends past memories during memory formation. Additionally, we introduce a refining process to handle redundant or outdated information. Unlike traditional paradigms, we view responding and memory construction as inseparable tasks. The blending process, which creates new memories, also serves as a reasoning step for response generation by informing the connection between past and present. Through evaluation, we demonstrate that CREEM enhances both memory and response qualities in multi-session personalized dialogues.
△ Less
Submitted 7 April, 2024; v1 submitted 3 March, 2024;
originally announced March 2024.
-
Exposing the Obscured Influence of State-Controlled Media: A Causal Estimation of Influence Between Media Outlets Via Quotation Propagation
Authors:
Joseph Schlessinger,
Richard Bennet,
Jacob Coakwell,
Steven T. Smith,
Edward K. Kao
Abstract:
This study quantifies influence between media outlets by applying a novel methodology that uses causal effect estimation on networks and transformer language models. We demonstrate the obscured influence of state-controlled outlets over other outlets, regardless of orientation, by analyzing a large dataset of quotations from over 100 thousand articles published by the most prominent European and R…
▽ More
This study quantifies influence between media outlets by applying a novel methodology that uses causal effect estimation on networks and transformer language models. We demonstrate the obscured influence of state-controlled outlets over other outlets, regardless of orientation, by analyzing a large dataset of quotations from over 100 thousand articles published by the most prominent European and Russian traditional media outlets, appearing between May 2018 and October 2019. The analysis maps out the network structure of influence with news wire services serving as prominent bridges that connect outlets in different geo-political spheres. Overall, this approach demonstrates capabilities to identify and quantify the channels of influence in intermedia agenda setting over specific topics.
△ Less
Submitted 16 January, 2022;
originally announced January 2022.
-
Automatic Detection of Influential Actors in Disinformation Networks
Authors:
Steven T. Smith,
Edward K. Kao,
Erika D. Mackin,
Danelle C. Shah,
Olga Simek,
Donald B. Rubin
Abstract:
The weaponization of digital communications and social media to conduct disinformation campaigns at immense scale, speed, and reach presents new challenges to identify and counter hostile influence operations (IOs). This paper presents an end-to-end framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing,…
▽ More
The weaponization of digital communications and social media to conduct disinformation campaigns at immense scale, speed, and reach presents new challenges to identify and counter hostile influence operations (IOs). This paper presents an end-to-end framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing, machine learning, graph analytics, and a novel network causal inference approach to quantify the impact of individual actors in spreading IO narratives. We demonstrate its capability on real-world hostile IO campaigns with Twitter datasets collected during the 2017 French presidential elections, and known IO accounts disclosed by Twitter over a broad range of IO campaigns (May 2007 to February 2020), over 50,000 accounts, 17 countries, and different account types including both trolls and bots. Our system detects IO accounts with 96% precision, 79% recall, and 96% area-under-the-PR-curve, maps out salient network communities, and discovers high-impact accounts that escape the lens of traditional impact statistics based on activity counts and network centrality. Results are corroborated with independent sources of known IO accounts from U.S. Congressional reports, investigative journalism, and IO datasets provided by Twitter.
△ Less
Submitted 7 January, 2021; v1 submitted 21 May, 2020;
originally announced May 2020.
-
Deploying Image Deblurring across Mobile Devices: A Perspective of Quality and Latency
Authors:
Cheng-Ming Chiang,
Yu Tseng,
Yu-Syuan Xu,
Hsien-Kai Kuo,
Yi-Min Tsai,
Guan-Yu Chen,
Koan-Sin Tan,
Wei-Ting Wang,
Yu-Chieh Lin,
Shou-Yao Roy Tseng,
Wei-Shiang Lin,
Chia-Lin Yu,
BY Shen,
Kloze Kao,
Chia-Ming Cheng,
Hung-Jen Chen
Abstract:
Recently, image enhancement and restoration have become important applications on mobile devices, such as super-resolution and image deblurring. However, most state-of-the-art networks present extremely high computational complexity. This makes them difficult to be deployed on mobile devices with acceptable latency. Moreover, when deploying to different mobile devices, there is a large latency var…
▽ More
Recently, image enhancement and restoration have become important applications on mobile devices, such as super-resolution and image deblurring. However, most state-of-the-art networks present extremely high computational complexity. This makes them difficult to be deployed on mobile devices with acceptable latency. Moreover, when deploying to different mobile devices, there is a large latency variation due to the difference and limitation of deep learning accelerators on mobile devices. In this paper, we conduct a search of portable network architectures for better quality-latency trade-off across mobile devices. We further present the effectiveness of widely used network optimizations for image deblurring task. This paper provides comprehensive experiments and comparisons to uncover the in-depth analysis for both latency and image quality. Through all the above works, we demonstrate the successful deployment of image deblurring application on mobile devices with the acceleration of deep learning accelerators. To the best of our knowledge, this is the first paper that addresses all the deployment issues of image deblurring task across mobile devices. This paper provides practical deployment-guidelines, and is adopted by the championship-winning team in NTIRE 2020 Image Deblurring Challenge on Smartphone Track.
△ Less
Submitted 27 April, 2020;
originally announced April 2020.
-
Influence Estimation on Social Media Networks Using Causal Inference
Authors:
Steven T. Smith,
Edward K. Kao,
Danelle C. Shah,
Olga Simek,
Donald B. Rubin
Abstract:
Estimating influence on social media networks is an important practical and theoretical problem, especially because this new medium is widely exploited as a platform for disinformation and propaganda. This paper introduces a novel approach to influence estimation on social media networks and applies it to the real-world problem of characterizing active influence operations on Twitter during the 20…
▽ More
Estimating influence on social media networks is an important practical and theoretical problem, especially because this new medium is widely exploited as a platform for disinformation and propaganda. This paper introduces a novel approach to influence estimation on social media networks and applies it to the real-world problem of characterizing active influence operations on Twitter during the 2017 French presidential elections. The new influence estimation approach attributes impact by accounting for narrative propagation over the network using a network causal inference framework applied to data arising from graph sampling and filtering. This causal framework infers the difference in outcome as a function of exposure, in contrast to existing approaches that attribute impact to activity volume or topological features, which do not explicitly measure nor necessarily indicate actual network influence. Cramér-Rao estimation bounds are derived for parameter estimation as a step in the causal analysis, and used to achieve geometrical insight on the causal inference problem. The ability to infer high causal influence is demonstrated on real-world social media accounts that are later independently confirmed to be either directly affiliated or correlated with foreign influence operations using evidence supplied by the U.S. Congress and journalistic reports.
△ Less
Submitted 11 April, 2018;
originally announced April 2018.
-
Causal Inference Under Network Interference: A Framework for Experiments on Social Networks
Authors:
Edward K. Kao
Abstract:
No man is an island, as individuals interact and influence one another daily in our society. When social influence takes place in experiments on a population of interconnected individuals, the treatment on a unit may affect the outcomes of other units, a phenomenon known as interference. This thesis develops a causal framework and inference methodology for experiments where interference takes plac…
▽ More
No man is an island, as individuals interact and influence one another daily in our society. When social influence takes place in experiments on a population of interconnected individuals, the treatment on a unit may affect the outcomes of other units, a phenomenon known as interference. This thesis develops a causal framework and inference methodology for experiments where interference takes place on a network of influence (i.e. network interference). In this framework, the network potential outcomes serve as the key quantity and flexible building blocks for causal estimands that represent a variety of primary, peer, and total treatment effects. These causal estimands are estimated via principled Bayesian imputation of missing outcomes. The theory on the unconfoundedness assumptions leading to simplified imputation highlights the importance of including relevant network covariates in the potential outcome model. Additionally, experimental designs that result in balanced covariates and sizes across treatment exposure groups further improve the causal estimate, especially by mitigating potential outcome model mis-specification. The true potential outcome model is not typically known in real-world experiments, so the best practice is to account for interference and confounding network covariates through both balanced designs and model-based imputation. A full factorial simulated experiment is formulated to demonstrate this principle by comparing performance across different randomization schemes during the design phase and estimators during the analysis phase, under varying network topology and true potential outcome models. Overall, this thesis asserts that interference is not just a nuisance for analysis but rather an opportunity for quantifying and leveraging peer effects in real-world experiments.
△ Less
Submitted 28 August, 2017;
originally announced August 2017.
-
Bayesian Discovery of Threat Networks
Authors:
Steven T. Smith,
Edward K. Kao,
Kenneth D. Senne,
Garrett Bernstein,
Scott Philips
Abstract:
A novel unified Bayesian framework for network detection is developed, under which a detection algorithm is derived based on random walks on graphs. The algorithm detects threat networks using partial observations of their activity, and is proved to be optimum in the Neyman-Pearson sense. The algorithm is defined by a graph, at least one observation, and a diffusion model for threat. A link to wel…
▽ More
A novel unified Bayesian framework for network detection is developed, under which a detection algorithm is derived based on random walks on graphs. The algorithm detects threat networks using partial observations of their activity, and is proved to be optimum in the Neyman-Pearson sense. The algorithm is defined by a graph, at least one observation, and a diffusion model for threat. A link to well-known spectral detection methods is provided, and the equivalence of the random walk and harmonic solutions to the Bayesian formulation is proven. A general diffusion model is introduced that utilizes spatio-temporal relationships between vertices, and is used for a specific space-time formulation that leads to significant performance improvements on coordinated covert networks. This performance is demonstrated using a new hybrid mixed-membership blockmodel introduced to simulate random covert networks with realistic properties.
△ Less
Submitted 8 September, 2014; v1 submitted 21 November, 2013;
originally announced November 2013.
-
Network Detection Theory and Performance
Authors:
Steven T. Smith,
Kenneth D. Senne,
Scott Philips,
Edward K. Kao,
Garrett Bernstein
Abstract:
Network detection is an important capability in many areas of applied research in which data can be represented as a graph of entities and relationships. Oftentimes the object of interest is a relatively small subgraph in an enormous, potentially uninteresting background. This aspect characterizes network detection as a "big data" problem. Graph partitioning and network discovery have been major r…
▽ More
Network detection is an important capability in many areas of applied research in which data can be represented as a graph of entities and relationships. Oftentimes the object of interest is a relatively small subgraph in an enormous, potentially uninteresting background. This aspect characterizes network detection as a "big data" problem. Graph partitioning and network discovery have been major research areas over the last ten years, driven by interest in internet search, cyber security, social networks, and criminal or terrorist activities. The specific problem of network discovery is addressed as a special case of graph partitioning in which membership in a small subgraph of interest must be determined. Algebraic graph theory is used as the basis to analyze and compare different network detection methods. A new Bayesian network detection framework is introduced that partitions the graph based on prior information and direct observations. The new approach, called space-time threat propagation, is proved to maximize the probability of detection and is therefore optimum in the Neyman-Pearson sense. This optimality criterion is compared to spectral community detection approaches which divide the global graph into subsets or communities with optimal connectivity properties. We also explore a new generative stochastic model for covert networks and analyze using receiver operating characteristics the detection performance of both classes of optimal detection techniques.
△ Less
Submitted 22 March, 2013;
originally announced March 2013.