Showing 1–50 of 60 results for author: Upadhyay, S

Searching in archive cs.
  1. arXiv:2407.09823  [pdf, other]

    cs.CL cs.AI

    NativQA: Multilingual Culturally-Aligned Natural Query for LLMs

    Authors: Md. Arid Hasan, Maram Hasanain, Fatema Ahmad, Sahinur Rahman Laskar, Sunaya Upadhyay, Vrunda N Sukhadia, Mucahid Kutlu, Shammur Absar Chowdhury, Firoj Alam

    Abstract: Natural Question Answering (QA) datasets play a crucial role in developing and evaluating the capabilities of large language models (LLMs), ensuring their effective usage in real-world applications. Despite the numerous QA datasets that have been developed, there is a notable lack of region-specific datasets generated by native users in their own languages. This gap hinders the effective benchmark… ▽ More

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: LLMs, Native, Multilingual, Language Diversity, Contextual Understanding, Minority Languages, Culturally Informed, Foundation Models, Large Language Models

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  2. arXiv:2407.04966  [pdf, other]

    cs.SD cs.LG eess.AS

    A Layer-Anchoring Strategy for Enhancing Cross-Lingual Speech Emotion Recognition

    Authors: Shreya G. Upadhyay, Carlos Busso, Chi-Chun Lee

    Abstract: Cross-lingual speech emotion recognition (SER) is important for a wide range of everyday applications. While recent SER research relies heavily on large pretrained models for emotion training, existing studies often concentrate solely on the final transformer layer of these models. However, given the task-specific nature and hierarchical architecture of these models, each transformer layer encapsu… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

  3. arXiv:2406.06519  [pdf, other]

    cs.IR

    UMBRELA: UMbrela is the (Open-Source Reproduction of the) Bing RELevance Assessor

    Authors: Shivani Upadhyay, Ronak Pradeep, Nandan Thakur, Nick Craswell, Jimmy Lin

    Abstract: Copious amounts of relevance judgments are necessary for the effective training and accurate evaluation of retrieval systems. Conventionally, these judgments are made by human assessors, rendering this process expensive and laborious. A recent study by Thomas et al. from Microsoft Bing suggested that large language models (LLMs) can accurately perform the relevance assessment task and provide huma… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 5 pages, 3 figures

  4. arXiv:2405.10311  [pdf, other]

    cs.IR

    UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models

    Authors: Sahel Sharifymoghaddam, Shivani Upadhyay, Wenhu Chen, Jimmy Lin

    Abstract: Recently, Multi-Modal (MM) Large Language Models (LLMs) have unlocked many complex use-cases that require MM understanding (e.g., image captioning or visual question answering) and MM generation (e.g., text-guided image generation or editing) capabilities. To further improve the output fidelity of MM-LLMs, we introduce the model-agnostic UniRAG technique that adds relevant retrieved information to pr… ▽ More

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 11 pages, 7 figures

  5. arXiv:2405.04727  [pdf, other]

    cs.IR

    LLMs Can Patch Up Missing Relevance Judgments in Evaluation

    Authors: Shivani Upadhyay, Ehsan Kamalloo, Jimmy Lin

    Abstract: Unjudged documents or holes in information retrieval benchmarks are considered non-relevant in evaluation, yielding no gains in measuring effectiveness. However, these missing judgments may inadvertently introduce biases into the evaluation as their prevalence for a retrieval model is heavily contingent on the pooling process. Thus, filling holes becomes crucial in ensuring reliable and accurate e… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 5 pages, 4 figures

  6. arXiv:2403.12031  [pdf, other]

    cs.LG cs.AI

    RouterBench: A Benchmark for Multi-LLM Routing System

    Authors: Qitian Jason Hu, Jacob Bieker, Xiuyu Li, Nan Jiang, Benjamin Keigwin, Gaurav Ranganath, Kurt Keutzer, Shriyash Kaustubh Upadhyay

    Abstract: As the range of applications for Large Language Models (LLMs) continues to grow, the demand for effective serving solutions becomes increasingly critical. Despite the versatility of LLMs, no single model can optimally address all tasks and applications, particularly when balancing performance with cost. This limitation has led to the development of LLM routing systems, which combine the strengths… ▽ More

    Submitted 28 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

  7. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  8. arXiv:2403.00470  [pdf, other]

    astro-ph.IM astro-ph.EP cs.LG cs.RO

    Autonomous Robotic Arm Manipulation for Planetary Missions using Causal Machine Learning

    Authors: C. McDonnell, M. Arana-Catania, S. Upadhyay

    Abstract: Autonomous robotic arm manipulators have the potential to make planetary exploration and in-situ resource utilization missions more time efficient and productive, as the manipulator can handle the objects itself and perform goal-specific actions. We train a manipulator to autonomously study objects of which it has no prior knowledge, such as planetary rocks. This is achieved using causal machine l… ▽ More

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 8 pages, ASTRA 2023: 17th Symposium on Advanced Space Technologies in Robotics and Automation, 18-20 October 2023, Leiden, The Netherlands

  9. arXiv:2402.17937  [pdf, other]

    cs.RO

    Can an LLM-Powered Socially Assistive Robot Effectively and Safely Deliver Cognitive Behavioral Therapy? A Study With University Students

    Authors: Mina J. Kian, Mingyu Zong, Katrin Fischer, Abhyuday Singh, Anna-Maria Velentza, Pau Sang, Shriya Upadhyay, Anika Gupta, Misha A. Faruki, Wallace Browning, Sebastien M. R. Arnold, Bhaskar Krishnamachari, Maja J. Mataric

    Abstract: Cognitive behavioral therapy (CBT) is a widely used therapeutic method for guiding individuals toward restructuring their thinking patterns as a means of addressing anxiety, depression, and other challenges. We developed a large language model (LLM)-powered prompt-engineered socially assistive robot (SAR) that guides participants through interactive CBT at-home exercises. We evaluated the performa… ▽ More

    Submitted 27 February, 2024; originally announced February 2024.

  10. Epistral Network: Revolutionizing Media Curation and Consumption through Decentralization

    Authors: Dipankar Sarkar, Shubham Upadhyay

    Abstract: Blockchain technology has revolutionized media consumption and distribution in the digital age, allowing creators, consumers, and regulators to participate in a decentralized, fair, and engaging media environment. Epistral, an innovative media network that leverages blockchain technology, aims to be the world's first anti-mimetic media curation and consumption network, addressing the core challeng… ▽ More

    Submitted 10 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  11. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr… ▽ More

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  12. arXiv:2310.12963  [pdf, other]

    cs.CL cs.AI

    AutoMix: Automatically Mixing Language Models

    Authors: Pranjal Aggarwal, Aman Madaan, Ankit Anand, Srividya Pranavi Potharaju, Swaroop Mishra, Pei Zhou, Aditya Gupta, Dheeraj Rajagopal, Karthik Kappaganthu, Yiming Yang, Shyam Upadhyay, Manaal Faruqui, Mausam

    Abstract: Large language models (LLMs) are now available from cloud API providers in various sizes and configurations. While this diversity offers a broad spectrum of choices, effectively leveraging the options to optimize computational cost and performance remains challenging. In this work, we present Automix, an approach that strategically routes queries to larger LMs, based on the approximate correctness… ▽ More

    Submitted 28 June, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: The first two authors contributed equally. Work started and partly done during Aman's internship at Google. This version adds results on additional models and datasets

  13. arXiv:2310.03051  [pdf, other]

    cs.CL cs.AI

    How FaR Are Large Language Models From Agents with Theory-of-Mind?

    Authors: Pei Zhou, Aman Madaan, Srividya Pranavi Potharaju, Aditya Gupta, Kevin R. McKee, Ari Holtzman, Jay Pujara, Xiang Ren, Swaroop Mishra, Aida Nematzadeh, Shyam Upadhyay, Manaal Faruqui

    Abstract: "Thinking is for Doing." Humans can infer other people's mental states from observations--an ability called Theory-of-Mind (ToM)--and subsequently act pragmatically on those inferences. Existing question answering benchmarks such as ToMi ask models questions to make inferences about beliefs of characters in a story, but do not test whether models can then use these inferences to guide their action… ▽ More

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: Preprint, 18 pages, 6 figures, 6 tables

  14. arXiv:2307.10633  [pdf, other]

    cs.CL cs.LG

    Multi-Method Self-Training: Improving Code Generation With Text, And Vice Versa

    Authors: Shriyash K. Upadhyay, Etan J. Ginsberg

    Abstract: Large Language Models have many methods for solving the same problem. This introduces novel strengths (different methods may work well for different problems) and weaknesses (it may be difficult for users to know which method to use). In this paper, we introduce Multi-Method Self-Training (MMST), where one method is trained on the filtered outputs of another, allowing us to augment the strengths a… ▽ More

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: 23 pages, 3 figures

  15. arXiv:2307.08970  [pdf, other]

    cs.LG cs.CR

    A Unifying Framework for Differentially Private Sums under Continual Observation

    Authors: Monika Henzinger, Jalaj Upadhyay, Sarvagya Upadhyay

    Abstract: We study the problem of maintaining a differentially private decaying sum under continual observation. We give a unifying framework and an efficient algorithm for this problem for any sufficiently smooth function. Our algorithm is the first differentially private algorithm that does not have a multiplicative error for polynomially-decaying weights. Our algorithm improves on all prior works… ▽ More

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: 32 pages

  16. arXiv:2303.05432  [pdf, other]

    physics.soc-ph cs.SI q-fin.CP

    Describing the effect of influential spreaders on the different sectors of Indian market: a complex networks perspective

    Authors: Anwesha Sengupta, Shashankaditya Upadhyay, Indranil Mukherjee, Prasanta K. Panigrahi

    Abstract: Market competition has a role which is directly or indirectly associated with influential effects of individual sectors on other sectors of the economy. The present work studies the relative position of a product in the market through the identification of influential spreaders and its corresponding effect on the other sectors of the market using complex network analysis during the pre-, in-, and… ▽ More

    Submitted 17 October, 2022; originally announced March 2023.

    Comments: 17 pages, 21 figures

    Journal ref: J Comput Soc Sc (2023) 1-41

  17. arXiv:2301.09244  [pdf, other]

    cs.CL cs.AI

    Efficient Encoders for Streaming Sequence Tagging

    Authors: Ayush Kaushal, Aditya Gupta, Shyam Upadhyay, Manaal Faruqui

    Abstract: A naive application of state-of-the-art bidirectional encoders for streaming sequence tagging would require encoding each token from scratch for each new token in an incremental streaming input (like transcribed speech). The lack of re-usability of previous computation leads to a higher number of Floating Point Operations (or FLOPs) and higher number of unnecessary label flips. Increased FLOPs con… ▽ More

    Submitted 16 March, 2023; v1 submitted 22 January, 2023; originally announced January 2023.

    Comments: EACL 2023 Camera-ready

  18. arXiv:2212.03495  [pdf, other]

    stat.ML cs.HC cs.LG

    Metric Elicitation; Moving from Theory to Practice

    Authors: Safinah Ali, Sohini Upadhyay, Gaurush Hiranandani, Elena L. Glassman, Oluwasanmi Koyejo

    Abstract: Metric Elicitation (ME) is a framework for eliciting classification metrics that better align with implicit user preferences based on the task and context. The existing ME strategy so far is based on the assumption that users can most easily provide preference feedback over classifier statistics such as confusion matrices. This work examines ME, by providing a first ever implementation of the ME s… ▽ More

    Submitted 7 December, 2022; originally announced December 2022.

    Comments: The paper to appear at Human-Centered AI workshop at NeurIPS, 2022. arXiv admin note: text overlap with arXiv:2208.09142

  19. arXiv:2211.07514  [pdf, other]

    cs.CL

    CST5: Data Augmentation for Code-Switched Semantic Parsing

    Authors: Anmol Agarwal, Jigar Gupta, Rahul Goel, Shyam Upadhyay, Pankaj Joshi, Rengarajan Aravamudhan

    Abstract: Extending semantic parsers to code-switched input has been a challenging problem, primarily due to a lack of supervised training data. In this work, we introduce CST5, a new data augmentation technique that finetunes a T5 model using a small seed set ($\approx$100 utterances) to generate code-switched utterances from English utterances. We show that CST5 generates high quality code-switched data,… ▽ More

    Submitted 14 November, 2022; originally announced November 2022.

  20. arXiv:2211.05006  [pdf, other]

    cs.LG cs.CR cs.DS

    Almost Tight Error Bounds on Differentially Private Continual Counting

    Authors: Monika Henzinger, Jalaj Upadhyay, Sarvagya Upadhyay

    Abstract: The first large-scale deployment of private federated learning uses differentially private counting in the continual release model as a subroutine (Google AI blog titled "Federated Learning with Formal Differential Privacy Guarantees"). In this case, a concrete bound on the error is very relevant to reduce the privacy parameter. The standard mechanism for continual counting is the binary mechanism… ▽ More

    Submitted 5 February, 2024; v1 submitted 9 November, 2022; originally announced November 2022.

    Comments: Updated the citations to include two papers we learned about since version 01

  21. arXiv:2208.13322  [pdf, other]

    cs.CL cs.SD eess.AS

    Streaming Intended Query Detection using E2E Modeling for Continued Conversation

    Authors: Shuo-yiin Chang, Guru Prakash, Zelin Wu, Qiao Liang, Tara N. Sainath, Bo Li, Adam Stambler, Shyam Upadhyay, Manaal Faruqui, Trevor Strohman

    Abstract: In voice-enabled applications, a predetermined hotword is usually used to activate a device in order to attend to the query. However, speaking queries followed by a hotword each time introduces a cognitive burden in continued conversations. To avoid repeating a hotword, we propose a streaming end-to-end (E2E) intended query detector that identifies the utterances directed towards the device and filters… ▽ More

    Submitted 28 August, 2022; originally announced August 2022.

    Comments: 5 pages, Interspeech 2022

  22. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur… ▽ More

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  23. arXiv:2205.12816  [pdf, other]

    cs.NI

    P4Filter: A two level defensive mechanism against attacks in SDN using P4

    Authors: Ananya Saxena, Ritvik Muttreja, Shivam Upadhyay, K. Shiv Kumar, Venkanna U

    Abstract: The advancements in networking technologies have led to a new paradigm of controlling networks, with data plane programmability as a basis. This facility opens up many advantages, such as flexibility in packet processing and better network management, which leads to better security in the network. However, the current literature lacks network security solutions concerning authentication and preven… ▽ More

    Submitted 6 June, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

  24. Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations

    Authors: Jessica Dai, Sohini Upadhyay, Ulrich Aivodji, Stephen H. Bach, Himabindu Lakkaraju

    Abstract: As post hoc explanation methods are increasingly being leveraged to explain complex models in high-stakes settings, it becomes critical to ensure that the quality of the resulting explanations is consistently high across various population subgroups including the minority groups. For instance, it should not be the case that explanations associated with instances belonging to a particular gender su… ▽ More

    Submitted 1 July, 2022; v1 submitted 15 May, 2022; originally announced May 2022.

    Comments: As presented at AIES 2022

  25. arXiv:2203.15272  [pdf, other]

    cs.RO cs.CV

    Sparse Image based Navigation Architecture to Mitigate the need of precise Localization in Mobile Robots

    Authors: Pranay Mathur, Rajesh Kumar, Sarthak Upadhyay

    Abstract: Traditional simultaneous localization and mapping (SLAM) methods focus on improvement in the robot's localization under environment and sensor uncertainty. This paper, however, focuses on mitigating the need for exact localization of a mobile robot to pursue autonomous navigation using a sparse set of images. The proposed method consists of a model architecture - RoomNet, for unsupervised learning… ▽ More

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: 7 Pages, 4 figures

  26. arXiv:2203.08685  [pdf, other]

    cs.CL cs.AI cs.HC

    A Feasibility Study of Answer-Agnostic Question Generation for Education

    Authors: Liam Dugan, Eleni Miltsakaki, Shriyash Upadhyay, Etan Ginsberg, Hannah Gonzalez, Dayheon Choi, Chuning Yuan, Chris Callison-Burch

    Abstract: We conduct a feasibility study into the applicability of answer-agnostic question generation models to textbook passages. We show that a significant portion of errors in such systems arise from asking irrelevant or uninterpretable questions and that such errors can be ameliorated by providing summarized input. We find that giving these models human-written summaries instead of the original text re… ▽ More

    Submitted 29 March, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: To be published in 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022)

    ACM Class: I.2.7

  27. arXiv:2203.05931  [pdf, other]

    stat.ML cs.LG

    FedSyn: Synthetic Data Generation using Federated Learning

    Authors: Monik Raj Behera, Sudhir Upadhyay, Suresh Shetty, Sudha Priyadarshini, Palka Patel, Ker Farn Lee

    Abstract: As Deep Learning algorithms continue to evolve and become more sophisticated, they require massive datasets for model training and efficacy of models. Some of those data requirements can be met with the help of existing datasets within the organizations. Current Machine Learning practices can be leveraged to generate synthetic data from an existing dataset. Further, it is well established that div… ▽ More

    Submitted 5 April, 2022; v1 submitted 11 March, 2022; originally announced March 2022.

  28. arXiv:2203.00274  [pdf, other]

    cs.CL

    TableFormer: Robust Transformer Modeling for Table-Text Encoding

    Authors: Jingfeng Yang, Aditya Gupta, Shyam Upadhyay, Luheng He, Rahul Goel, Shachi Paul

    Abstract: Understanding tables is an important aspect of natural language understanding. Existing models for table understanding require linearization of the table structure, where row or column order is encoded as an unwanted bias. Such spurious biases make the model vulnerable to row and column order perturbations. Additionally, prior work has not thoroughly modeled the table structures or table-text alig… ▽ More

    Submitted 3 May, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: ACL 2022, 10 pages

  29. arXiv:2202.07764  [pdf, other]

    quant-ph cs.CR cs.NI physics.optics

    Paving the Way towards 800 Gbps Quantum-Secured Optical Channel Deployment in Mission-Critical Environments

    Authors: Marco Pistoia, Omar Amer, Monik R. Behera, Joseph A. Dolphin, James F. Dynes, Benny John, Paul A. Haigh, Yasushi Kawakura, David H. Kramer, Jeffrey Lyon, Navid Moazzami, Tulasi D. Movva, Antigoni Polychroniadou, Suresh Shetty, Greg Sysak, Farzam Toudeh-Fallah, Sudhir Upadhyay, Robert I. Woodward, Andrew J. Shields

    Abstract: This article describes experimental research studies conducted towards understanding the implementation aspects of high-capacity quantum-secured optical channels in mission-critical metro-scale operational environments using Quantum Key Distribution (QKD) technology. To the best of our knowledge, this is the first time that an 800 Gbps quantum-secured optical channel -- along with several other De… ▽ More

    Submitted 2 March, 2023; v1 submitted 15 February, 2022; originally announced February 2022.

    Comments: 11 pages, 9 figures, 2 tables

    Journal ref: Quantum Science and Technology, Institute of Physics, May 2023

  30. arXiv:2108.04371  [pdf, other]

    cs.AI

    Extending LIME for Business Process Automation

    Authors: Sohini Upadhyay, Vatche Isahagian, Vinod Muthusamy, Yara Rizk

    Abstract: AI business process applications automate high-stakes business decisions where there is an increasing demand to justify or explain the rationale behind algorithmic decisions. Business process applications have ordering or constraints on tasks and feature values that cause lightweight, model-agnostic, existing explanation methods like LIME to fail. In response, we propose a local explanation framew… ▽ More

    Submitted 9 August, 2021; originally announced August 2021.

  31. arXiv:2107.10243  [pdf, other]

    cs.CR cs.LG

    Federated Learning using Smart Contracts on Blockchains, based on Reward Driven Approach

    Authors: Monik Raj Behera, Sudhir Upadhyay, Suresh Shetty

    Abstract: Over the recent years, Federated machine learning continues to gain interest and momentum where there is a need to draw insights from data while preserving the data provider's privacy. However, one among other existing challenges in the adoption of federated learning has been the lack of fair, transparent and universally agreed incentivization schemes for rewarding the federated learning contribut… ▽ More

    Submitted 25 March, 2022; v1 submitted 19 July, 2021; originally announced July 2021.

    Comments: 9 pages, 7 figures and 1 table

  32. arXiv:2106.13346  [pdf, other]

    cs.LG cs.AI cs.CY

    What will it take to generate fairness-preserving explanations?

    Authors: Jessica Dai, Sohini Upadhyay, Stephen H. Bach, Himabindu Lakkaraju

    Abstract: In situations where explanations of black-box models may be useful, the fairness of the black-box is also often a relevant concern. However, the link between the fairness of the black-box model and the behavior of explanations for the black-box is unclear. We focus on explanations applied to tabular datasets, suggesting that explanations do not necessarily preserve the fairness properties of the b… ▽ More

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: Presented at ICML 2021 Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI

  33. arXiv:2106.09992  [pdf, other]

    cs.LG cs.AI

    Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis

    Authors: Martin Pawelczyk, Chirag Agarwal, Shalmali Joshi, Sohini Upadhyay, Himabindu Lakkaraju

    Abstract: As machine learning (ML) models become more widely deployed in high-stakes applications, counterfactual explanations have emerged as key tools for providing actionable model explanations in practice. Despite the growing popularity of counterfactual explanations, a deeper understanding of these explanations is still lacking. In this work, we systematically analyze counterfactual explanations throug… ▽ More

    Submitted 19 October, 2021; v1 submitted 18 June, 2021; originally announced June 2021.

    Journal ref: International Conference on Artificial Intelligence and Statistics (AISTATS), 28-30 March 2022

  34. arXiv:2106.04571  [pdf, other]

    cs.CL

    TIMEDIAL: Temporal Commonsense Reasoning in Dialog

    Authors: Lianhui Qin, Aditya Gupta, Shyam Upadhyay, Luheng He, Yejin Choi, Manaal Faruqui

    Abstract: Everyday conversations require understanding everyday events, which in turn, requires understanding temporal commonsense concepts interwoven with those events. Despite recent progress with massive pre-trained language models (LMs) such as T5 and GPT-3, their capability of temporal reasoning in dialogs remains largely under-explored. In this paper, we present the first study to investigate pre-trai… ▽ More

    Submitted 8 June, 2021; originally announced June 2021.

  35. arXiv:2106.04016  [pdf, other]

    cs.CL

    Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering

    Authors: Aditya Gupta, Jiacheng Xu, Shyam Upadhyay, Diyi Yang, Manaal Faruqui

    Abstract: Disfluency is an under-studied topic in NLP, even though it is ubiquitous in human conversation. This is largely due to the lack of datasets containing disfluencies. In this paper, we present a new challenge question answering dataset, Disfl-QA, a derivative of SQuAD, where humans introduce contextual disfluencies in previously fluent questions. Disfl-QA contains a variety of challenging disflue… ▽ More

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: Findings of ACL 2021

  36. arXiv:2102.13620  [pdf, other]

    cs.LG cs.AI

    Towards Robust and Reliable Algorithmic Recourse

    Authors: Sohini Upadhyay, Shalmali Joshi, Himabindu Lakkaraju

    Abstract: As predictive models are increasingly being deployed in high-stakes decision making (e.g., loan approvals), there has been growing interest in post hoc techniques which provide recourse to affected individuals. These techniques generate recourses under the assumption that the underlying predictive model does not change. However, in practice, models are often regularly updated for a variety of reas… ▽ More

    Submitted 13 July, 2021; v1 submitted 26 February, 2021; originally announced February 2021.

  37. arXiv:2102.12082  [pdf, other]

    cs.CL

    Hopeful_Men@LT-EDI-EACL2021: Hope Speech Detection Using Indic Transliteration and Transformers

    Authors: Ishan Sanjeev Upadhyay, Nikhil E, Anshul Wadhawan, Radhika Mamidi

    Abstract: This paper aims to describe the approach we used to detect hope speech in the HopeEDI dataset. We experimented with two approaches. In the first approach, we used contextual embeddings to train classifiers using logistic regression, random forest, SVM, and LSTM-based models. The second approach involved using a majority voting ensemble of 11 models which were obtained by fine-tuning pre-trained tra… ▽ More

    Submitted 24 February, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

  38. arXiv:2102.10618  [pdf, other]

    cs.LG

    Towards the Unification and Robustness of Perturbation and Gradient Based Explanations

    Authors: Sushant Agarwal, Shahin Jabbari, Chirag Agarwal, Sohini Upadhyay, Zhiwei Steven Wu, Himabindu Lakkaraju

    Abstract: As machine learning black boxes are increasingly being deployed in critical domains such as healthcare and criminal justice, there has been a growing emphasis on developing techniques for explaining these black boxes in a post hoc manner. In this work, we analyze two popular post hoc interpretation techniques: SmoothGrad which is a gradient based method, and a variant of LIME which is a perturbati… ▽ More

    Submitted 19 July, 2021; v1 submitted 21 February, 2021; originally announced February 2021.

    Comments: The short version of this paper appears in the proceedings of ICML-21

  39. arXiv:2010.12574  [pdf, other]

    cs.LG stat.ML

    Online Semi-Supervised Learning with Bandit Feedback

    Authors: Sohini Upadhyay, Mikhail Yurochkin, Mayank Agarwal, Yasaman Khazaeni, Djallel Bouneffouf

    Abstract: We formulate a new problem at the intersection of semi-supervised learning and contextual bandits, motivated by several applications including clinical trials and ad recommendations. We demonstrate how Graph Convolutional Network (GCN), a semi-supervised learning approach, can be adjusted to the new problem formulation. We also propose a variant of the linear contextual bandit with semi-supervised mis… ▽ More

    Submitted 23 October, 2020; originally announced October 2020.

  40. arXiv:2010.09473  [pdf, other

    cs.LG cs.AI

    Double-Linear Thompson Sampling for Context-Attentive Bandits

    Authors: Djallel Bouneffouf, Raphaël Féraud, Sohini Upadhyay, Yasaman Khazaeni, Irina Rish

    Abstract: In this paper, we analyze and extend an online learning framework known as Context-Attentive Bandit, motivated by various practical applications, from medical diagnosis to dialog systems, where due to observation costs only a small subset of a potentially large number of context variables can be observed at each iteration; however, the agent has a freedom to choose which variables to observe. We de… ▽ More

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: arXiv admin note: text overlap with arXiv:1906.09384

  41. arXiv:2009.02668  [pdf, ps, other

    cs.LG cs.CR cs.DS stat.ML

    A Framework for Private Matrix Analysis

    Authors: Jalaj Upadhyay, Sarvagya Upadhyay

    Abstract: We study private matrix analysis in the sliding window model where only the last $W$ updates to matrices are considered useful for analysis. We give first efficient $o(W)$ space differentially private algorithms for spectral approximation, principal component analysis, and linear regression. We also initiate and show efficient differentially private algorithms for two important variants of princip… ▽ More

    Submitted 6 September, 2020; originally announced September 2020.

    Comments: 41 pages

  42. arXiv:2007.06368  [pdf, other

    cs.LG cs.AI stat.ML

    Contextual Bandit with Missing Rewards

    Authors: Djallel Bouneffouf, Sohini Upadhyay, Yasaman Khazaeni

    Abstract: We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side-information, or context, available to a decision-maker) where the reward associated with each context-based decision may not always be observed ("missing rewards"). This new problem is motivated by certain online settings including clinical trial and ad recommendation applications. In order to addre… ▽ More

    Submitted 18 July, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

  43. arXiv:2001.00658  [pdf, ps, other

    quant-ph cs.DM

    Compressed Quadratization of Higher Order Binary Optimization Problems

    Authors: Avradip Mandal, Arnab Roy, Sarvagya Upadhyay, Hayato Ushijima-Mwesigwa

    Abstract: Recent hardware advances in quantum and quantum-inspired annealers promise substantial speedup for solving NP-hard combinatorial optimization problems compared to general-purpose computers. These special-purpose hardware are built for solving hard instances of Quadratic Unconstrained Binary Optimization (QUBO) problems. In terms of number of variables and precision of these hardware are usually re… ▽ More

    Submitted 2 January, 2020; originally announced January 2020.

  44. arXiv:1911.09810  [pdf, other

    cs.DS

    Leveraging Special-Purpose Hardware for Local Search Heuristics

    Authors: Xiaoyuan Liu, Hayato Ushijima-Mwesigwa, Avradip Mandal, Sarvagya Upadhyay, Ilya Safro, Arnab Roy

    Abstract: As we approach the physical limits predicted by Moore's law, a variety of specialized hardware is emerging to tackle specialized tasks in different domains. Within combinatorial optimization, adiabatic quantum computers, CMOS annealers, and optical parametric oscillators are few of the emerging specialized hardware technology aimed at solving optimization problems. In terms of mathematical framewo… ▽ More

    Submitted 28 November, 2020; v1 submitted 21 November, 2019; originally announced November 2019.

    MSC Class: 68R05 90C27 90C59

  45. arXiv:1909.11218  [pdf, other

    cs.CL cs.LG

    Attention Interpretability Across NLP Tasks

    Authors: Shikhar Vashishth, Shyam Upadhyay, Gaurav Singh Tomar, Manaal Faruqui

    Abstract: The attention layer in a neural network model provides insights into the model's reasoning behind its prediction, which are usually criticized for being opaque. Recently, seemingly contradictory viewpoints have emerged about the interpretability of attention weights (Jain & Wallace, 2019; Vig & Belinkov, 2019). Amid such confusion arises the need to understand attention mechanism more systematical… ▽ More

    Submitted 24 September, 2019; originally announced September 2019.

    Report number: 2019

  46. arXiv:1908.09453  [pdf, other

    cs.LG cs.AI cs.GT cs.MA

    OpenSpiel: A Framework for Reinforcement Learning in Games

    Authors: Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinicius Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas Anthony, Edward Hughes, et al. (2 additional authors not shown)

    Abstract: OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. OpenSpiel supports n-player (single- and multi- agent) zero-sum, cooperative and general-sum, one-shot and sequential, strictly turn-taking and simultaneous-move, perfect and imperfect information games, as well as traditional multiagent environments such as (partia… ▽ More

    Submitted 26 September, 2020; v1 submitted 25 August, 2019; originally announced August 2019.

  47. arXiv:1906.09384  [pdf, other

    cs.AI cs.CL cs.LG

    A Bandit Approach to Posterior Dialog Orchestration Under a Budget

    Authors: Sohini Upadhyay, Mayank Agarwal, Djallel Bouneffouf, Yasaman Khazaeni

    Abstract: Building multi-domain AI agents is a challenging task and an open problem in the area of AI. Within the domain of dialog, the ability to orchestrate multiple independently trained dialog agents, or skills, to create a unified system is of particular significance. In this work, we study the task of online posterior dialog orchestration, where we define posterior orchestration as the task of selecti… ▽ More

    Submitted 22 June, 2019; originally announced June 2019.

    Comments: 2nd Conversational AI Workshop, NeurIPS 2018

  48. arXiv:1809.07807  [pdf

    cs.CL

    Bootstrapping Transliteration with Constrained Discovery for Low-Resource Languages

    Authors: Shyam Upadhyay, Jordan Kodner, Dan Roth

    Abstract: Generating the English transliteration of a name written in a foreign script is an important and challenging step in multilingual knowledge acquisition and information extraction. Existing approaches to transliteration generation require a large (>5000) number of training examples. This difficulty contrasts with transliteration discovery, a somewhat easier task that involves picking a plausible tr… ▽ More

    Submitted 20 September, 2018; originally announced September 2018.

    Comments: EMNLP 2018

  49. arXiv:1809.07657  [pdf, other

    cs.CL

    Joint Multilingual Supervision for Cross-lingual Entity Linking

    Authors: Shyam Upadhyay, Nitish Gupta, Dan Roth

    Abstract: Cross-lingual Entity Linking (XEL) aims to ground entity mentions written in any language to an English Knowledge Base (KB), such as Wikipedia. XEL for most languages is challenging, owing to limited availability of resources as supervision. We address this challenge by developing the first XEL approach that combines supervision from multiple languages jointly. This enables our approach to: (a) au… ▽ More

    Submitted 20 September, 2018; originally announced September 2018.

    Comments: EMNLP 2018

  50. A Deep Structure of Person Re-Identification using Multi-Level Gaussian Models

    Authors: Dinesh Kumar Vishwakarma, Sakshi Upadhyay

    Abstract: Person re-identification is being widely used in the forensic, and security and surveillance system, but person re-identification is a challenging task in real life scenario. Hence, in this work, a new feature descriptor model has been proposed using a multilayer framework of Gaussian distribution model on pixel features, which include color moments, color space values and Schmid filter responses.… ▽ More

    Submitted 20 May, 2018; originally announced May 2018.

    Comments: 9 pages

    Report number: 8469037

    Journal ref: IEEE Transactions on Multi-Scale Computing Systems 4 (2018) 513 - 521