Search | arXiv e-print repository

Multi-LogiEval: Towards Evaluating Multi-Step Logical Reasoning Ability of Large Language Models

Authors: Nisarg Patel, Mohith Kulkarni, Mihir Parmar, Aashna Budhiraja, Mutsumi Nakamura, Neeraj Varshney, Chitta Baral

Abstract: As Large Language Models (LLMs) continue to exhibit remarkable performance in natural language understanding tasks, there is a crucial need to measure their ability for human-like multi-step logical reasoning. Existing logical reasoning evaluation benchmarks often focus primarily on simplistic single-step or multi-step reasoning with a limited set of inference rules. Furthermore, the lack of datasets for evaluating non-monotonic reasoning represents a crucial gap since it aligns more closely with human-like reasoning. To address these limitations, we propose Multi-LogiEval, a comprehensive evaluation dataset encompassing multi-step logical reasoning with various inference rules and depths. Multi-LogiEval covers three logic types--propositional, first-order, and non-monotonic--consisting of more than 30 inference rules and more than 60 of their combinations with various depths. Leveraging this dataset, we conduct evaluations on a range of LLMs including GPT-4, ChatGPT, Gemini-Pro, Yi, Orca, and Mistral, employing a zero-shot chain-of-thought. Experimental results show that there is a significant drop in the performance of LLMs as the reasoning steps/depth increases (average accuracy of ~68% at depth-1 to ~43% at depth-5). We further conduct a thorough investigation of reasoning chains generated by LLMs which reveals several important findings. We believe that Multi-LogiEval facilitates future research for evaluating and enhancing the logical reasoning ability of LLMs. Data is available at https://github.com/Mihir3009/Multi-LogiEval.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 23 Pages

arXiv:2110.15547

Does Momentum Help? A Sample Complexity Analysis

Authors: Swetha Ganesh, Rohan Deb, Gugan Thoppe, Amarjit Budhiraja

Abstract: Stochastic Heavy Ball (SHB) and Nesterov's Accelerated Stochastic Gradient (ASG) are popular momentum methods in stochastic optimization. While the benefits of such acceleration ideas in deterministic settings are well understood, their advantages in stochastic optimization are still unclear. In fact, in some specific instances, it is known that momentum does not help in the sample complexity sense. Our work shows that a similar outcome actually holds for the whole of quadratic optimization. Specifically, we obtain a lower bound on the sample complexity of SHB and ASG for this family and show that the same bound can be achieved by vanilla SGD. We note that there exist results claiming the superiority of momentum-based methods in quadratic optimization, but these are based on one-sided or flawed analyses.

Submitted 11 July, 2022; v1 submitted 29 October, 2021; originally announced October 2021.
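
For readers unfamiliar with the methods named in the abstract above, the following is a minimal sketch of the textbook SGD and Stochastic Heavy Ball updates on a one-dimensional quadratic. It is not taken from the paper: the objective f(x) = 0.5 * curvature * x**2, the Gaussian gradient noise, and all function names and default parameters are assumptions chosen purely for illustration, and the sketch makes no claim about the paper's sample complexity bounds.

```python
import random


def noisy_grad(x, curvature, noise_std, rng):
    """Gradient of f(x) = 0.5 * curvature * x**2 plus zero-mean Gaussian noise.

    The quadratic objective and the noise model are illustrative assumptions.
    """
    return curvature * x + rng.gauss(0.0, noise_std)


def run_sgd(x0, step, steps, curvature=1.0, noise_std=0.1, seed=0):
    """Vanilla SGD: x_{k+1} = x_k - step * g_k."""
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        x = x - step * noisy_grad(x, curvature, noise_std, rng)
    return x


def run_shb(x0, step, momentum, steps, curvature=1.0, noise_std=0.1, seed=0):
    """Stochastic Heavy Ball:
    x_{k+1} = x_k - step * g_k + momentum * (x_k - x_{k-1}).
    """
    rng = random.Random(seed)
    x_prev, x = x0, x0
    for _ in range(steps):
        g = noisy_grad(x, curvature, noise_std, rng)
        x_next = x - step * g + momentum * (x - x_prev)
        x_prev, x = x, x_next
    return x


if __name__ == "__main__":
    # Compare the final iterates under the same noise seed (hypothetical settings).
    print("SGD:", run_sgd(1.0, 0.1, 500))
    print("SHB:", run_shb(1.0, 0.1, 0.5, 500))
```

With `momentum=0.0` the heavy-ball update reduces to the SGD update, which makes the two routines easy to compare under an identical noise sequence.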

arXiv:2110.03665

Revisiting SVD to generate powerful Node Embeddings for Recommendation Systems

Authors: Amar Budhiraja

Abstract: Graph Representation Learning (GRL) is an upcoming and promising area in recommendation systems. In this paper, we revisit the Singular Value Decomposition (SVD) of the adjacency matrix for embedding generation of users and items and use a two-layer neural network on top of these embeddings to learn relevance between user-item pairs. Inspired by the success of higher-order learning in GRL, we further propose an extension of this method to include two-hop neighbors for SVD through the second order of the adjacency matrix and demonstrate improved performance compared with the simple SVD method, which only uses one-hop neighbors. Empirical validation on three publicly available recommendation-system datasets demonstrates that the proposed methods, despite being simple, beat many state-of-the-art methods and, for two of the three datasets, beat all of them by a margin of up to 10%. Through our research, we want to shed light on the effectiveness of matrix factorization approaches, specifically SVD, in the deep learning era and show that these methods still contribute as important baselines in recommendation systems.

Submitted 5 October, 2021; originally announced October 2021.

Comments: 7 pages, 3 figures, and 4 tables

arXiv:2008.03646

doi 10.1109/ICASSP40776.2020.9053425

Augmenting Molecular Images with Vector Representations as a Featurization Technique for Drug Classification

Authors: Daniel de Marchi, Amarjit Budhiraja

Abstract: One of the key steps in building deep learning systems for drug classification and generation is the choice of featurization for the molecules. Previous featurization methods have included molecular images, binary strings, graphs, and SMILES strings. This paper proposes the creation of molecular images captioned with binary vectors that encode information not contained in or easily understood from a molecular image alone. Specifically, we use Morgan fingerprints, which encode higher level structural information, and MACCS keys, which encode yes or no questions about a molecule's properties and structure. We tested our method on the HIV dataset published by the Pande lab, which consists of 41,127 molecules labeled by whether they inhibit the HIV virus. Our final model achieved a state-of-the-art AUC ROC on the HIV dataset, outperforming all other methods. Moreover, the model converged significantly faster than most other methods, requiring dramatically less computational power than unaugmented images.

Submitted 9 August, 2020; originally announced August 2020.

Journal ref: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

arXiv:2004.09095

The State and Fate of Linguistic Diversity and Inclusion in the NLP World

Authors: Pratik Joshi, Sebastin Santy, Amar Budhiraja, Kalika Bali, Monojit Choudhury

Abstract: Language technologies contribute to promoting multilingualism and linguistic diversity around the world. However, only a very small number of the over 7000 languages of the world are represented in the rapidly evolving language technologies and applications. In this paper we look at the relation between the types of languages, resources, and their representation in NLP conferences to understand the trajectory that different languages have followed over time. Our quantitative investigation underlines the disparity between languages, especially in terms of their resources, and calls into question the "language agnostic" status of current models and systems. Through this paper, we attempt to convince the ACL community to prioritise the resolution of the predicaments highlighted here, so that no language is left behind.

Submitted 26 January, 2021; v1 submitted 20 April, 2020; originally announced April 2020.

Comments: Accepted at ACL 2020 (10 pages + 2 pages Appendix). P.J., S.S. and A.B. contributed equally

arXiv:2001.10495

Rich-Item Recommendations for Rich-Users: Exploiting Dynamic and Static Side Information

Authors: Amar Budhiraja, Gaurush Hiranandani, Darshak Chhatbar, Aditya Sinha, Navya Yarrabelly, Ayush Choure, Oluwasanmi Koyejo, Prateek Jain

Abstract: In this paper, we study the recommendation problem where the users and the items to be recommended are rich data structures with multiple entity types and with multiple sources of side-information in the form of graphs. We provide a general formulation for the problem that captures the complexities of modern real-world recommendations and generalizes many existing formulations. In our formulation, each user/document that requires a recommendation and each item or tag that is to be recommended are both modeled by a set of static entities and a dynamic component. The relationships between entities are captured by several weighted bipartite graphs. To effectively exploit these complex interactions and learn the recommendation model, we propose MEDRES, a novel deep-learning architecture based on multiple graph-CNNs. MEDRES uses AL-GCN, a novel graph convolution network block that harnesses strong representative features from the underlying graphs. Moreover, in order to capture the highly heterogeneous engagement of different users with the system and constraints on the number of items to be recommended, we propose a novel ranking metric, pAp@k, along with a method to optimize the metric directly. We demonstrate the effectiveness of our method on two benchmarks: a) citation data, b) Flickr data. In addition, we present two real-world case studies of our formulation and the MEDRES architecture. We show how our technique can be used to naturally model the message recommendation problem and the teams recommendation problem in the Microsoft Teams (MSTeams) product and demonstrate that it is 5-6 percentage points more accurate than the production-grade models.

Submitted 26 July, 2020; v1 submitted 28 January, 2020; originally announced January 2020.

Comments: The first two authors contributed equally. 21 pages, 8 figures and 6 tables

arXiv:1812.11172

doi 10.1007/s10514-019-09856-1

Distributed Assignment with Limited Communication for Multi-Robot Multi-Target Tracking

Authors: Yoonchang Sung, Ashish Kumar Budhiraja, Ryan K. Williams, Pratap Tokekar

Abstract: We study the problem of tracking multiple moving targets using a team of mobile robots. Each robot has a set of motion primitives to choose from in order to collectively maximize the number of targets tracked or the total quality of tracking. Our focus is on scenarios where communication is limited and the robots have limited time to share information with their neighbors. As a result, we seek distributed algorithms that can find solutions in a bounded amount of time. We present two algorithms: (1) a greedy algorithm that is guaranteed to find a $2$-approximation to the optimal (centralized) solution, albeit requiring $|R|$ communication rounds in the worst case, where $|R|$ denotes the number of robots; and (2) a local algorithm that finds a $\mathcal{O}\left((1+ε)(1+1/h)\right)$-approximation in $\mathcal{O}(h\log 1/ε)$ communication rounds. Here, $h$ and $ε$ are parameters that allow the user to trade off the solution quality against communication time. In addition to theoretical results, we present an empirical evaluation including comparisons with centralized optimal solutions.

Submitted 28 May, 2019; v1 submitted 22 December, 2018; originally announced December 2018.

Comments: 18 pages, 17 figures, Published in Autonomous Robots. arXiv admin note: text overlap with arXiv:1706.02245

arXiv:1706.02245

Distributed Simultaneous Action and Target Assignment for Multi-Robot Multi-Target Tracking

Authors: Yoonchang Sung, Ashish Kumar Budhiraja, Ryan K. Williams, Pratap Tokekar

Abstract: We study a multi-robot assignment problem for multi-target tracking. The proposed problem can be viewed as a mixed packing and covering problem. To deal with limitations on both sensing and communication ranges, we take a distributed approach. A local algorithm gives theoretical bounds on both the running time and the approximation ratio relative to an optimal solution. We employ a local algorithm for max-min linear programs to solve the proposed task. Simulation results show that a local algorithm is an effective solution to multi-robot task allocation.

Submitted 6 November, 2018; v1 submitted 7 June, 2017; originally announced June 2017.

Comments: 6 pages, 9 figures, Published in IEEE International Conference on Robotics and Automation (ICRA), 2018

arXiv:1704.00079

Algorithms for Routing of Unmanned Aerial Vehicles with Mobile Recharging Stations

Authors: Kevin Yu, Ashish Kumar Budhiraja, Pratap Tokekar

Abstract: We study the problem of planning a tour for an energy-limited Unmanned Aerial Vehicle (UAV) to visit a set of sites in the least amount of time. We envision scenarios where the UAV can be recharged along the way either by landing on stationary recharging stations or on Unmanned Ground Vehicles (UGVs) acting as mobile recharging stations. This leads to a new variant of the Traveling Salesperson Problem (TSP) with mobile recharging stations. We present an algorithm that finds not only the order in which to visit the sites but also when and where to land on the charging stations to recharge. Our algorithm plans tours for the UGVs and also determines the best locations to place stationary charging stations. While the problems we study are NP-Hard, we present a practical solution using Generalized TSP that finds the optimal solution. If the UGVs are slower, the algorithm also finds the minimum number of UGVs required to support the UAV mission such that the UAV is not required to wait for the UGV. Our simulation results show that the running time is acceptable for reasonably sized instances in practice.

Submitted 18 September, 2017; v1 submitted 31 March, 2017; originally announced April 2017.

Comments: 7 pages, 14 figures, ICRA2018 under review

arXiv:1612.03246

Algorithms for Visibility-Based Monitoring with Robot Teams

Authors: Pratap Tokekar, Ashish Kumar Budhiraja, Vijay Kumar

Abstract: We study the problem of planning paths for a team of robots for visually monitoring an environment. Our work is motivated by surveillance and persistent monitoring applications. We are given a set of target points in a polygonal environment that must be monitored using robots with cameras. The goal is to compute paths for all robots such that every target is visible from at least one path. In its general form, this problem is NP-hard as it generalizes the Art Gallery Problem and the Watchman Route Problem. We study two versions: (i) a geometric version in street polygons for which we give a polynomial time $4$-approximation algorithm; and (ii) a general version for which we present a practical solution that finds the optimal solution in possibly exponential time. In addition to theoretical proofs, we also present results from simulation studies.

Submitted 9 December, 2016; originally announced December 2016.

Comments: 12 pages, 10 figures, submitted to IEEE Transactions on Robotics (TRO); a preliminary version of the paper was published in Intelligent Robots and Systems (IROS), 2015

arXiv:1510.05193

Source detection algorithms for dynamic contaminants based on the analysis of a hydrodynamic limit

Authors: Sergio A. Almada Monter, Amarjit Budhiraja, Jan Hannig

Abstract: In this work we propose and numerically analyze an algorithm for detection of a contaminant source using a dynamic sensor network. The algorithm is motivated using a global probabilistic optimization problem and is based on the analysis of the hydrodynamic limit of a discrete time evolution equation on the lattice under a suitable scaling of time and space. Numerical results illustrating the effectiveness of the algorithm are presented.

Submitted 19 October, 2015; v1 submitted 17 October, 2015; originally announced October 2015.

MSC Class: 60G35; 65C35; 86A22; 93E10

Showing 1–11 of 11 results for author: Budhiraja, A