Showing 1–50 of 135 results for author: Gupta, G

Searching in archive cs.
  1. arXiv:2408.11699  [pdf, other]

    cs.LO cs.SE

    Automating Semantic Analysis of System Assurance Cases using Goal-directed ASP

    Authors: Anitha Murugesan, Isaac Wong, Joaquín Arias, Robert Stroud, Srivatsan Varadarajan, Elmer Salazar, Gopal Gupta, Robin Bloomfield, John Rushby

    Abstract: Assurance cases offer a structured way to present arguments and evidence for certification of systems where safety and security are critical. However, creating and evaluating these assurance cases can be complex and challenging, even for systems of moderate complexity. Therefore, there is a growing need to develop new automation methods for these tasks. While most existing assurance case tools foc…

    Submitted 29 August, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

  2. arXiv:2408.09909  [pdf, ps, other]

    cs.LO cs.SE

    Early Validation of High-level System Requirements with Event Calculus and Answer Set Programming

    Authors: Ondřej Vašíček, Joaquin Arias, Jan Fiedor, Gopal Gupta, Brendan Hall, Bohuslav Křena, Brian Larson, Sarat Chandra Varanasi, Tomáš Vojnar

    Abstract: This paper proposes a new methodology for early validation of high-level requirements on cyber-physical systems with the aim of improving their quality and, thus, lowering the chances of specification errors propagating into later stages of development where it is much more expensive to fix them. The paper presents a transformation of a real-world requirements specification of a medical device, a PCA…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: Accepted for ICLP 2024

  3. arXiv:2407.18498  [pdf, other]

    cs.CL cs.AI cs.LO

    A Reliable Common-Sense Reasoning Socialbot Built Using LLMs and Goal-Directed ASP

    Authors: Yankai Zeng, Abhiramon Rajashekharan, Kinjal Basu, Huaduo Wang, Joaquín Arias, Gopal Gupta

    Abstract: The development of large language models (LLMs), such as GPT, has enabled the construction of several socialbots, like ChatGPT, that are receiving a lot of attention for their ability to simulate a human conversation. However, the conversation is not guided by a goal and is hard to control. In addition, because LLMs rely more on pattern recognition than deductive reasoning, they can give confusing…

    Submitted 26 July, 2024; originally announced July 2024.

  4. arXiv:2407.15022  [pdf]

    cs.CY cs.AI

    Encouraging Responsible Use of Generative AI in Education: A Reward-Based Learning Approach

    Authors: Aditi Singh, Abul Ehtesham, Saket Kumar, Gaurav Kumar Gupta, Tala Talaei Khoei

    Abstract: This research introduces an innovative mathematical learning approach that integrates generative AI to cultivate structured learning rather than quick solutions. Our method combines chatbot capabilities and generative AI to offer interactive problem-solving exercises, enhancing learning through a step-by-step approach for varied problems, advocating for the responsible use of AI in education. Our…

    Submitted 26 June, 2024; originally announced July 2024.

    Comments: 9 pages, 4 figures

  5. arXiv:2407.14967  [pdf, other]

    cs.CV cs.LG

    Base and Exponent Prediction in Mathematical Expressions using Multi-Output CNN

    Authors: Md Laraib Salam, Akash S Balsaraf, Gaurav Gupta

    Abstract: The use of neural networks and deep learning techniques in image processing has significantly advanced the field, enabling highly accurate recognition results. However, achieving high recognition rates often necessitates complex network models, which can be challenging to train and require substantial computational resources. This research presents a simplified yet effective approach to predicting…

    Submitted 20 July, 2024; originally announced July 2024.

    Comments: 4 pages, 9 figures

  6. arXiv:2407.14129  [pdf, other]

    cs.LG

    Comparing and Contrasting Deep Learning Weather Prediction Backbones on Navier-Stokes and Atmospheric Dynamics

    Authors: Matthias Karlbauer, Danielle C. Maddix, Abdul Fatir Ansari, Boran Han, Gaurav Gupta, Yuyang Wang, Andrew Stuart, Michael W. Mahoney

    Abstract: Remarkable progress in the development of Deep Learning Weather Prediction (DLWP) models positions them to become competitive with traditional numerical weather prediction (NWP) models. Indeed, a wide number of DLWP architectures -- based on various backbones, including U-Net, Transformer, Graph Neural Network (GNN), and Fourier Neural Operator (FNO) -- have demonstrated their potential at forecas…

    Submitted 19 July, 2024; originally announced July 2024.

  7. arXiv:2407.08179  [pdf, other]

    cs.AI cs.LG cs.LO

    CoGS: Causality Constrained Counterfactual Explanations using goal-directed ASP

    Authors: Sopam Dasgupta, Joaquín Arias, Elmer Salazar, Gopal Gupta

    Abstract: Machine learning models are increasingly used in areas such as loan approvals and hiring, yet they often function as black boxes, obscuring their decision-making processes. Transparency is crucial, and individuals need explanations to understand decisions, especially for the ones not desired by the user. Ethical and legal considerations require informing individuals of changes in input attribute v…

    Submitted 11 July, 2024; originally announced July 2024.

  8. arXiv:2406.14901  [pdf, other]

    cs.IR

    IDentity with Locality: An ideal hash for gene sequence search

    Authors: Aditya Desai, Gaurav Gupta, Tianyi Zhang, Anshumali Shrivastava

    Abstract: Gene sequence search is a fundamental operation in computational genomics. Due to the petabyte scale of genome archives, most gene search systems now use hashing-based data structures such as Bloom Filters (BF). The state-of-the-art systems such as Compact bit-slicing signature index (COBS) and Repeated And Merged Bloom filters (RAMBO) use BF with Random Hash (RH) functions for gene representation…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 13 pages

  9. arXiv:2405.20622  [pdf, other]

    cs.LG

    Superfast Selection for Decision Tree Algorithms

    Authors: Huaduo Wang, Gopal Gupta

    Abstract: We present a novel and systematic method, called Superfast Selection, for selecting the "optimal split" for decision tree and feature selection algorithms over tabular data. The method speeds up split selection on a single feature by lowering the time complexity, from O(MN) (using the standard selection methods) to O(M), where M represents the number of input examples and N the number of unique va…

    Submitted 3 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

  10. arXiv:2405.15956  [pdf, other]

    cs.AI cs.LG cs.LO

    CFGs: Causality Constrained Counterfactual Explanations using goal-directed ASP

    Authors: Sopam Dasgupta, Joaquín Arias, Elmer Salazar, Gopal Gupta

    Abstract: Machine learning models that automate decision-making are increasingly used in consequential areas such as loan approvals, pretrial bail approval, and hiring. Unfortunately, most of these models are black boxes, i.e., they are unable to reveal how they reach these prediction decisions. A need for transparency demands justification for such predictions. An affected individual might also desire expl…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2402.04382

  11. arXiv:2405.15886  [pdf, other]

    cs.CV

    A Neurosymbolic Framework for Bias Correction in Convolutional Neural Networks

    Authors: Parth Padalkar, Natalia Ślusarz, Ekaterina Komendantskaya, Gopal Gupta

    Abstract: Recent efforts in interpreting Convolutional Neural Networks (CNNs) focus on translating the activation of CNN filters into stratified Answer Set Program (ASP) rule-sets. The CNN filters are known to capture high-level image concepts, thus the predicates in the rule-set are mapped to the concept that their corresponding filter represents. Hence, the rule-set exemplifies the decision-making proce…

    Submitted 22 August, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  12. arXiv:2405.06712  [pdf, other]

    cs.CL cs.AI

    Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses

    Authors: Gaurav Kumar Gupta, Aditi Singh, Sijo Valayakkad Manikandan, Abul Ehtesham

    Abstract: The recent swift development of LLMs like GPT-4, Gemini, and GPT-3.5 offers a transformative opportunity in medicine and healthcare, especially in digital diagnostics. This study evaluates each model's diagnostic abilities by interpreting a user's symptoms and determining diagnoses that fit well with common illnesses, and it demonstrates how each of these models could significantly increase diagnostic…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 14 pages, 4 figures

  13. arXiv:2405.05852  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.RO stat.ML

    Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

    Authors: Gunshi Gupta, Karmesh Yadav, Yarin Gal, Dhruv Batra, Zsolt Kira, Cong Lu, Tim G. J. Rudner

    Abstract: Embodied AI agents require a fine-grained understanding of the physical world mediated through visual and language inputs. Such capabilities are difficult to learn solely from task-specific data. This has led to the emergence of pre-trained vision-language models as a tool for transferring representations learned from internet-scale data to downstream tasks and new domains. However, commonly used…

    Submitted 9 May, 2024; originally announced May 2024.

  14. arXiv:2405.03637  [pdf, other]

    cs.LG

    Collage: Light-Weight Low-Precision Strategy for LLM Training

    Authors: Tao Yu, Gaurav Gupta, Karthick Gopalswamy, Amith Mamidala, Hao Zhou, Jeffrey Huynh, Youngsuk Park, Ron Diamant, Anoop Deoras, Luke Huan

    Abstract: Large-model training is plagued by intense compute cost and limited hardware memory. A practical solution is low-precision representation, but it is troubled by loss in numerical accuracy and unstable training, rendering the model less useful. We argue that low-precision floating points can perform well provided the error is properly compensated at the critical locations in the training process. W…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  15. arXiv:2404.10630  [pdf, other]

    cs.CL cs.LG

    HLAT: High-quality Large Language Model Pre-trained on AWS Trainium

    Authors: Haozheng Fan, Hao Zhou, Guangtai Huang, Parameswaran Raman, Xinwei Fu, Gaurav Gupta, Dhananjay Ram, Yida Wang, Jun Huan

    Abstract: Getting large language models (LLMs) to perform well on the downstream tasks requires pre-training over trillions of tokens. This typically demands a large number of powerful computational devices in addition to a stable distributed training framework to accelerate the training. The growing number of applications leveraging AI/ML has led to a scarcity of the expensive conventional accelerators (su…

    Submitted 16 April, 2024; originally announced April 2024.

  16. arXiv:2404.09403  [pdf, other]

    cs.LG

    Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning

    Authors: Xiongye Xiao, Gengshuo Liu, Gaurav Gupta, Defu Cao, Shixuan Li, Yaxing Li, Tianqing Fang, Mingxi Cheng, Paul Bogdan

    Abstract: Integrating and processing information from various sources or modalities are critical for obtaining a comprehensive and accurate perception of the real world in autonomous systems and cyber-physical systems. Drawing inspiration from neuroscience, we develop the Information-Theoretic Hierarchical Perception (ITHP) model, which utilizes the concept of information bottleneck. Different from most tra…

    Submitted 22 April, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: Accepted by ICLR 2024. Camera Ready Version

  17. arXiv:2403.10642  [pdf, other]

    cs.LG math.NA

    Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs

    Authors: S. Chandra Mouli, Danielle C. Maddix, Shima Alizadeh, Gaurav Gupta, Andrew Stuart, Michael W. Mahoney, Yuyang Wang

    Abstract: Existing work in scientific machine learning (SciML) has shown that data-driven learning of solution operators can provide a fast approximate alternative to classical numerical partial differential equation (PDE) solvers. Of these, Neural Operators (NOs) have emerged as particularly promising. We observe that several uncertainty quantification (UQ) methods for NOs fail for test inputs that are eve…

    Submitted 12 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: ICML 2024

  18. arXiv:2403.05882  [pdf, other]

    cs.LG

    DiffRed: Dimensionality Reduction guided by stable rank

    Authors: Prarabdh Shukla, Gagan Raj Gupta, Kunal Dutta

    Abstract: In this work, we propose a novel dimensionality reduction technique, DiffRed, which first projects the data matrix, A, along first $k_1$ principal components and the residual matrix $A^{*}$ (left after subtracting its $k_1$-rank approximation) along $k_2$ Gaussian random vectors. We evaluate M1, the distortion of mean-squared pair-wise distance, and Stress, the normalized value of RMS of distortio…

    Submitted 9 March, 2024; originally announced March 2024.

  19. arXiv:2402.15968  [pdf, other]

    cs.LG cs.AI

    CoDream: Exchanging dreams instead of models for federated aggregation with heterogeneous models

    Authors: Abhishek Singh, Gauri Gupta, Ritvik Kapila, Yichuan Shi, Alex Dang, Sheshank Shankar, Mohammed Ehab, Ramesh Raskar

    Abstract: Federated Learning (FL) enables collaborative optimization of machine learning models across decentralized data by aggregating model parameters. Our approach extends this concept by aggregating "knowledge" derived from models, instead of model parameters. We present a novel framework called CoDream, where clients collaboratively optimize randomly initialized data using federated optimization in th…

    Submitted 27 February, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

    Comments: 16 pages, 12 figures, 5 tables

  20. arXiv:2402.04382  [pdf, other]

    cs.AI

    Counterfactual Generation with Answer Set Programming

    Authors: Sopam Dasgupta, Farhad Shakerin, Joaquín Arias, Elmer Salazar, Gopal Gupta

    Abstract: Machine learning models that automate decision-making are increasingly being used in consequential areas such as loan approvals, pretrial bail approval, hiring, and many more. Unfortunately, most of these models are black-boxes, i.e., they are unable to reveal how they reach these prediction decisions. A need for transparency demands justification for such predictions. An affected individual might…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 16 Pages

  21. arXiv:2401.08588  [pdf]

    cs.CV

    Improved Pothole Detection Using YOLOv7 and ESRGAN

    Authors: Nirmal Kumar Rout, Gyanateet Dutta, Varun Sinha, Arghadeep Dey, Subhrangshu Mukherjee, Gopal Gupta

    Abstract: Potholes are common road hazards that cause damage to vehicles and pose a safety risk to drivers. Convolutional Neural Networks (CNNs) are widely used in the industry for object detection based on Deep Learning methods and have achieved significant progress in hardware improvement and software implementations. In this paper, a unique, improved algorithm is proposed to warrant…

    Submitted 10 November, 2023; originally announced January 2024.

  22. arXiv:2401.04795  [pdf, other]

    cs.MA cs.LG cs.SI physics.soc-ph

    First 100 days of pandemic; an interplay of pharmaceutical, behavioral and digital interventions -- A study using agent based modeling

    Authors: Gauri Gupta, Ritvik Kapila, Ayush Chopra, Ramesh Raskar

    Abstract: Pandemics, notably the recent COVID-19 outbreak, have impacted both public health and the global economy. A profound understanding of disease progression and efficient response strategies is thus needed to prepare for potential future outbreaks. In this paper, we emphasize the potential of Agent-Based Models (ABM) in capturing complex infection dynamics and understanding the impact of intervention…

    Submitted 5 February, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: 12 pages, 12 figures, In Proc. of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024), Auckland, New Zealand, 2024

  23. arXiv:2312.17168  [pdf, other]

    cs.LG cs.AI

    Can Active Sampling Reduce Causal Confusion in Offline Reinforcement Learning?

    Authors: Gunshi Gupta, Tim G. J. Rudner, Rowan Thomas McAllister, Adrien Gaidon, Yarin Gal

    Abstract: Causal confusion is a phenomenon where an agent learns a policy that reflects imperfect spurious correlations in the data. Such a policy may falsely appear to be optimal during training if most of the training data contain such spurious correlations. This phenomenon is particularly pronounced in domains such as robotics, with potentially large gaps between the open- and closed-loop performance of…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Proceedings of the 2nd Conference on Causal Learning and Reasoning (CLeaR 2021)

  24. arXiv:2312.02337  [pdf, other]

    cs.CL

    Measuring Distributional Shifts in Text: The Advantage of Language Model-Based Embeddings

    Authors: Gyandev Gupta, Bashir Rastegarpanah, Amalendu Iyer, Joshua Rubin, Krishnaram Kenthapadi

    Abstract: An essential part of monitoring machine learning models in production is measuring input and output data drift. In this paper, we present a system for measuring distributional shifts in natural language data and highlight and investigate the potential advantage of using large language models (LLMs) for this problem. Recent advancements in LLMs and their successful adoption in different domains ind…

    Submitted 4 December, 2023; originally announced December 2023.

  25. arXiv:2311.02399  [pdf, ps, other]

    cs.LG cs.DC

    Entropy Aware Training for Fast and Accurate Distributed GNN

    Authors: Dhruv Deshmukh, Gagan Raj Gupta, Manisha Chawla, Vishwesh Jatala, Anirban Haldar

    Abstract: Several distributed frameworks have been developed to scale Graph Neural Networks (GNNs) on billion-size graphs. On several benchmarks, we observe that the graph partitions generated by these frameworks have heterogeneous data distributions and class imbalance, affecting convergence, and resulting in lower performance than centralized implementations. We holistically address these challenges and d…

    Submitted 4 November, 2023; originally announced November 2023.

    Comments: 8 pages, 3 figures, 5 tables, accepted at ICDM'23

    ACM Class: I.5.1; I.5.2

  26. arXiv:2311.00429  [pdf, other]

    eess.IV cs.LG

    Crop Disease Classification using Support Vector Machines with Green Chromatic Coordinate (GCC) and Attention based feature extraction for IoT based Smart Agricultural Applications

    Authors: Shashwat Jha, Vishvaditya Luhach, Gauri Shanker Gupta, Beependra Singh

    Abstract: Crops hold paramount significance as they serve as the primary provider of energy, nutrition, and medicinal benefits for the human population. Plant diseases, however, can negatively affect leaves during agricultural cultivation, resulting in significant losses in crop output and economic value. Therefore, it is crucial for farmers to identify crop diseases. However, this method frequently necessi…

    Submitted 6 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

  27. arXiv:2310.14497  [pdf, other]

    cs.AI

    Counterfactual Explanation Generation with s(CASP)

    Authors: Sopam Dasgupta, Farhad Shakerin, Joaquín Arias, Elmer Salazar, Gopal Gupta

    Abstract: Machine learning models that automate decision-making are increasingly being used in consequential areas such as loan approvals, pretrial bail, hiring, and many more. Unfortunately, most of these models are black-boxes, i.e., they are unable to reveal how they reach these prediction decisions. A need for transparency demands justification for such predictions. An affected individual might desire e…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: 18 Pages

  28. arXiv:2310.13073  [pdf, other]

    cs.LG cs.CV

    Using Logic Programming and Kernel-Grouping for Improving Interpretability of Convolutional Neural Networks

    Authors: Parth Padalkar, Gopal Gupta

    Abstract: Within the realm of deep learning, the interpretability of Convolutional Neural Networks (CNNs), particularly in the context of image classification tasks, remains a formidable challenge. To this end we present a neurosymbolic framework, NeSyFOLD-G that generates a symbolic rule-set using the last layer kernels of the CNN to make its underlying knowledge interpretable. What makes NeSyFOLD-G differ…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: text overlap with arXiv:2301.12667

  29. arXiv:2309.16202  [pdf]

    cs.CL

    Marathi-English Code-mixed Text Generation

    Authors: Dhiraj Amin, Sharvari Govilkar, Sagar Kulkarni, Yash Shashikant Lalit, Arshi Ajaz Khwaja, Daries Xavier, Sahil Girijashankar Gupta

    Abstract: Code-mixing, the blending of linguistic elements from distinct languages to form meaningful sentences, is common in multilingual settings, yielding hybrid languages like Hinglish and Minglish. Marathi, India's third most spoken language, often integrates English for precision and formality. Developing code-mixed language systems, like Marathi-English (Minglish), faces resource constraints. This re…

    Submitted 28 September, 2023; originally announced September 2023.

  30. arXiv:2309.15877   

    cs.LG cs.AI

    Neuro-Inspired Hierarchical Multimodal Learning

    Authors: Xiongye Xiao, Gengshuo Liu, Gaurav Gupta, Defu Cao, Shixuan Li, Yaxing Li, Tianqing Fang, Mingxi Cheng, Paul Bogdan

    Abstract: Integrating and processing information from various sources or modalities are critical for obtaining a comprehensive and accurate perception of the real world. Drawing inspiration from neuroscience, we develop the Information-Theoretic Hierarchical Perception (ITHP) model, which utilizes the concept of information bottleneck. Distinct from most traditional fusion models that aim to incorporate all…

    Submitted 23 April, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: I am requesting the withdrawal of this submission due to an inadvertent duplication. The paper was submitted twice under different IDs, which was not intentional. The other submission (arXiv:2404.09403) contains the most updated and comprehensive version of the paper, and I would like to retain that as the sole version on the platform

  31. arXiv:2308.15014  [pdf, other]

    cs.IR

    CAPS: A Practical Partition Index for Filtered Similarity Search

    Authors: Gaurav Gupta, Jonah Yi, Benjamin Coleman, Chen Luo, Vihan Lakshman, Anshumali Shrivastava

    Abstract: With the surging popularity of approximate near-neighbor search (ANNS), driven by advances in neural representation learning, the ability to serve queries accompanied by a set of constraints has become an area of intense interest. While the community has recently proposed several algorithms for constrained ANNS, almost all of these methods focus on integration with graph-based indexes, the predomi…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: 14 pages

  32. arXiv:2307.03898   

    cs.CV eess.IV

    StyleGAN3: Generative Networks for Improving the Equivariance of Translation and Rotation

    Authors: Tianlei Zhu, Junqi Chen, Renzhe Zhu, Gaurav Gupta

    Abstract: StyleGAN can use style to affect facial posture and identity features, and noise to affect hair, wrinkles, skin color and other details. Among these, the outcomes of the picture processing will vary slightly between different versions of styleGAN. As a result, the comparison of performance differences between styleGAN2 and the two modified versions of styleGAN3 will be the main focus of this study…

    Submitted 5 February, 2024; v1 submitted 8 July, 2023; originally announced July 2023.

    Comments: But now we feel we haven't fully studied our work and have found some new great results. So after careful consideration, we're going to rework this manuscript and try to give a more accurate model

  33. arXiv:2306.01460  [pdf, other]

    cs.LG

    ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages

    Authors: Andrew Jesson, Chris Lu, Gunshi Gupta, Angelos Filos, Jakob Nicolaus Foerster, Yarin Gal

    Abstract: This paper introduces an effective and practical step toward approximate Bayesian inference in on-policy actor-critic deep reinforcement learning. This step manifests as three simple modifications to the Asynchronous Advantage Actor-Critic (A3C) algorithm: (1) applying a ReLU function to advantage estimates, (2) spectral normalization of actor-critic weights, and (3) incorporating dropout as a Bay…

    Submitted 24 November, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

  34. arXiv:2305.18404  [pdf, ps, other]

    cs.CL cs.LG stat.ML

    Conformal Prediction with Large Language Models for Multi-Choice Question Answering

    Authors: Bhawesh Kumar, Charlie Lu, Gauri Gupta, Anil Palepu, David Bellamy, Ramesh Raskar, Andrew Beam

    Abstract: As large language models continue to be widely developed, robust uncertainty quantification techniques will become crucial for their safe deployment in high-stakes scenarios. In this work, we explore how conformal prediction can be used to provide uncertainty quantification in language models for the specific task of multiple-choice question-answering. We find that the uncertainty estimates from c…

    Submitted 7 July, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: Updated sections on prompt engineering. Expanded sections 4.1 and 4.2 and appendix. Included additional references. Work published at the ICML 2023 (Neural Conversational AI TEACH) workshop

  35. arXiv:2305.18225  [pdf, other]

    cs.DC cs.AI

    Locksynth: Deriving Synchronization Code for Concurrent Data Structures with ASP

    Authors: Sarat Chandra Varanasi, Neeraj Mittal, Gopal Gupta

    Abstract: We present Locksynth, a tool that automatically derives synchronization needed for destructive updates to concurrent data structures that involve a constant number of shared heap memory write operations. Locksynth serves as the implementation of our prior work on deriving abstract synchronization code. Designing concurrent data structures involves inferring correct synchronization code starting wi…

    Submitted 20 May, 2023; originally announced May 2023.

  36. arXiv:2305.15786  [pdf, other]

    cs.LG math.ST stat.ML

    Theoretical Guarantees of Learning Ensembling Strategies with Applications to Time Series Forecasting

    Authors: Hilaf Hasson, Danielle C. Maddix, Yuyang Wang, Gaurav Gupta, Youngsuk Park

    Abstract: Ensembling is among the most popular tools in machine learning (ML) due to its effectiveness in minimizing variance and thus improving generalization. Most ensembling methods for black-box base learners fall under the umbrella of "stacked generalization," namely training an ML algorithm that takes the inferences from the base learners as input. While stacking has been widely applied in practice, i…

    Submitted 28 August, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: ICML 2023

  37. arXiv:2304.03431  [pdf, other]

    cs.LG cs.AI

    Domain Generalization In Robust Invariant Representation

    Authors: Gauri Gupta, Ritvik Kapila, Keshav Gupta, Ramesh Raskar

    Abstract: Unsupervised approaches for learning representations invariant to common transformations are used quite often for object recognition. Learning invariances makes models more robust and practical to use in real-world scenarios. Since data transformations that do not change the intrinsic properties of the object cause the majority of the complexity in recognition tasks, models that are invariant to t…

    Submitted 24 February, 2024; v1 submitted 6 April, 2023; originally announced April 2023.

    Comments: 7 pages, 5 figures, ICLR 2023 workshop

  38. arXiv:2303.10624  [pdf, other]

    cs.LG cs.DC

    PFSL: Personalized & Fair Split Learning with Data & Label Privacy for thin clients

    Authors: Manas Wadhwa, Gagan Raj Gupta, Ashutosh Sahu, Rahul Saini, Vidhi Mittal

    Abstract: The traditional framework of federated learning (FL) requires each client to re-train their models in every iteration, making it infeasible for resource-constrained mobile devices to train deep-learning (DL) models. Split learning (SL) provides an alternative by using a centralized server to offload the computation of activations and gradients for a subset of the model but suffers from problems of…

    Submitted 19 March, 2023; originally announced March 2023.

    Comments: To be published in : THE 23RD IEEE/ACM INTERNATIONAL SYMPOSIUM ON Cluster, Cloud and Internet Computing. Granted: Open Research Objects (ORO) and Research Objects Reviewed (ROR) badges. See https://www.niso.org/publications/rp-31-2021-badging for definitions of the badges. Code available at: https://github.com/mnswdhw/PFSL

  39. arXiv:2303.08941  [pdf, other]

    cs.AI cs.LO

    Automated Interactive Domain-Specific Conversational Agents that Understand Human Dialogs

    Authors: Yankai Zeng, Abhiramon Rajasekharan, Parth Padalkar, Kinjal Basu, Joaquín Arias, Gopal Gupta

    Abstract: Achieving human-like communication with machines remains a classic, challenging topic in the field of Knowledge Representation and Reasoning and Natural Language Processing. Large Language Models (LLMs) rely on pattern-matching rather than a true understanding of the semantic meaning of a sentence. As a result, they may generate incorrect responses. To generate an assuredly correct response,…

    Submitted 17 March, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

  40. arXiv:2303.07537  [pdf, other]

    cs.LG q-bio.QM

    Fractional dynamics foster deep learning of COPD stage prediction

    Authors: Chenzhong Yin, Mihai Udrescu, Gaurav Gupta, Mingxi Cheng, Andrei Lihu, Lucretia Udrescu, Paul Bogdan, David M Mannino, Stefan Mihaicuta

    Abstract: Chronic obstructive pulmonary disease (COPD) is one of the leading causes of death worldwide. Current COPD diagnosis (i.e., spirometry) could be unreliable because the test depends on an adequate effort from the tester and testee. Moreover, the early diagnosis of COPD is challenging. We address COPD detection by constructing two novel physiological signals datasets (4432 records from 54 patients i…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: Published on Advanced Science

  41. arXiv:2303.02304  [pdf, other]

    cs.LG

    Coupled Multiwavelet Neural Operator Learning for Coupled Partial Differential Equations

    Authors: Xiongye Xiao, Defu Cao, Ruochen Yang, Gaurav Gupta, Gengshuo Liu, Chenzhong Yin, Radu Balan, Paul Bogdan

    Abstract: Coupled partial differential equations (PDEs) are key tasks in modeling the complex dynamics of many physical processes. Recently, neural operators have shown the ability to solve PDEs by learning the integral kernel directly in Fourier/Wavelet space, so the difficulty for solving the coupled PDEs depends on dealing with the coupled mappings between the functions. Towards this end, we propose a \t…

    Submitted 8 December, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: Accepted to ICLR 2023

  42. arXiv:2302.11002  [pdf, other]

    cs.LG math.AP math.NA

    Learning Physical Models that Can Respect Conservation Laws

    Authors: Derek Hansen, Danielle C. Maddix, Shima Alizadeh, Gaurav Gupta, Michael W. Mahoney

    Abstract: Recent work in scientific machine learning (SciML) has focused on incorporating partial differential equation (PDE) information into the learning process. Much of this work has focused on relatively "easy" PDE operators (e.g., elliptic and parabolic), with less emphasis on relatively "hard" PDE operators (e.g., hyperbolic). Within numerical PDEs, the latter problem class requires control of a type…

    Submitted 10 October, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: ICML 2023, Physica D: Nonlinear Phenomena, Accepted

    Journal ref: Physica D: Nonlinear Phenomena, 457 (2024) 133952

  43. Reliable Natural Language Understanding with Large Language Models and Answer Set Programming

    Authors: Abhiramon Rajasekharan, Yankai Zeng, Parth Padalkar, Gopal Gupta

    Abstract: Humans understand language by extracting information (meaning) from sentences, combining it with existing commonsense knowledge, and then performing reasoning to draw conclusions. While large language models (LLMs) such as GPT-3 and ChatGPT are able to leverage patterns in the text to solve a variety of NLP tasks, they fall short in problems that require reasoning. They also cannot reliably explai…

    Submitted 30 August, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: In Proceedings ICLP 2023, arXiv:2308.14898

    Journal ref: EPTCS 385, 2023, pp. 274-287

  44. arXiv:2301.12667  [pdf, other]

    cs.LG cs.AI cs.CV

    NeSyFOLD: Neurosymbolic Framework for Interpretable Image Classification

    Authors: Parth Padalkar, Huaduo Wang, Gopal Gupta

    Abstract: Deep learning models such as CNNs have surpassed human performance in computer vision tasks such as image classification. However, despite their sophistication, these models lack interpretability which can lead to biased outcomes reflecting existing prejudices in the data. We aim to make predictions made by a CNN interpretable. Hence, we present a novel framework called NeSyFOLD to create a neuros…

    Submitted 20 August, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  45. arXiv:2212.10772  [pdf, other]

    cs.CV

    Low-Light Image and Video Enhancement: A Comprehensive Survey and Beyond

    Authors: Shen Zheng, Yiling Ma, Jinqian Pan, Changjie Lu, Gaurav Gupta

    Abstract: This paper presents a comprehensive survey of low-light image and video enhancement, addressing two primary challenges in the field. The first challenge is the prevalence of mixed over-/under-exposed images, which are not adequately addressed by existing methods. In response, this work introduces two enhanced variants of the SICE dataset: SICE_Grad and SICE_Mix, designed to better represent these…

    Submitted 1 January, 2024; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: 21 pages, 10 tables, and 17 figures

  46. arXiv:2212.08151  [pdf, other]

    cs.LG

    First De-Trend then Attend: Rethinking Attention for Time-Series Forecasting

    Authors: Xiyuan Zhang, Xiaoyong Jin, Karthick Gopalswamy, Gaurav Gupta, Youngsuk Park, Xingjian Shi, Hao Wang, Danielle C. Maddix, Yuyang Wang

    Abstract: Transformer-based models have gained large popularity and demonstrated promising results in long-term time-series forecasting in recent years. In addition to learning attention in time domain, recent works also explore learning attention in frequency domains (e.g., Fourier domain, wavelet domain), given that seasonal patterns can be better captured in these domains. In this work, we seek to unders…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: NeurIPS 2022 All Things Attention Workshop

  47. arXiv:2212.07477  [pdf, other]

    cs.LG math.AP math.OA

    Guiding continuous operator learning through Physics-based boundary constraints

    Authors: Nadim Saad, Gaurav Gupta, Shima Alizadeh, Danielle C. Maddix

    Abstract: Boundary conditions (BCs) are important groups of physics-enforced constraints that are necessary for solutions of Partial Differential Equations (PDEs) to satisfy at specific spatial locations. These constraints carry important physical meaning, and guarantee the existence and the uniqueness of the PDE solution. Current neural-network based approaches that aim to solve PDEs rely only on training…

    Submitted 2 March, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: Nadim and Gaurav contributed equally to this work. 31 pages, 7 figures, 16 tables

    Journal ref: ICLR 2023

  48. arXiv:2211.09855  [pdf, other]

    cs.CL

    ProtSi: Prototypical Siamese Network with Data Augmentation for Few-Shot Subjective Answer Evaluation

    Authors: Yining Lu, Jingxi Qiu, Gaurav Gupta

    Abstract: Subjective answer evaluation is a time-consuming and tedious task, and the quality of the evaluation is heavily influenced by a variety of subjective personal characteristics. Instead, machine evaluation can effectively assist educators in saving time while also ensuring that evaluations are fair and realistic. However, most existing methods using regular machine learning and natural language proc…

    Submitted 17 November, 2022; originally announced November 2022.

  49. arXiv:2209.09408  [pdf, other]

    cs.LG eess.IV

    Deep learning at the edge enables real-time streaming ptychographic imaging

    Authors: Anakha V Babu, Tao Zhou, Saugat Kandel, Tekin Bicer, Zhengchun Liu, William Judge, Daniel J. Ching, Yi Jiang, Sinisa Veseli, Steven Henke, Ryan Chard, Yudong Yao, Ekaterina Sirazitdinova, Geetika Gupta, Martin V. Holt, Ian T. Foster, Antonino Miceli, Mathew J. Cherukara

    Abstract: Coherent microscopy techniques provide an unparalleled multi-scale view of materials across scientific and technological fields, from structural materials to quantum devices, from integrated circuits to biological cells. Driven by the construction of brighter sources and high-rate detectors, coherent X-ray microscopy methods like ptychography are poised to revolutionize nanoscale materials charact…

    Submitted 19 September, 2022; originally announced September 2022.

  50. arXiv:2208.10488  [pdf]

    cs.HC cs.IR cs.LG

    Friendliness Of Stack Overflow Towards Newbies

    Authors: Aneesh Tickoo, Shweta Chauhan, Gagan Raj Gupta

    Abstract: In today's modern digital world, we have a number of online Question and Answer platforms like Stack Exchange, Quora, and GFG that serve as a medium for people to communicate and help each other. In this paper, we analyzed the effectiveness of Stack Overflow in helping newbies to programming. Every user on this platform goes through a journey. For the first 12 months, we consider them to be a newb…

    Submitted 21 August, 2022; originally announced August 2022.

    Comments: 12 pages, International Conference on Sustainable Future: Innovations in Education