Showing 1–50 of 100 results for author: Jain, N

Searching in archive cs.
  1. arXiv:2408.15256  [pdf, other]

    cs.HC cs.AI

    Improving Ontology Requirements Engineering with OntoChat and Participatory Prompting

    Authors: Yihang Zhao, Bohui Zhang, Xi Hu, Shuyin Ouyang, Jongmo Kim, Nitisha Jain, Jacopo de Berardinis, Albert Meroño-Peñuela, Elena Simperl

    Abstract: Past ontology requirements engineering (ORE) has primarily relied on manual methods, such as interviews and collaborative forums, to gather user requirements from domain experts, especially in large projects. Current OntoChat offers a framework for ORE that utilises large language models (LLMs) to streamline the process through four key functions: user story creation, competency question (CQ) extr…

    Submitted 29 August, 2024; v1 submitted 9 August, 2024; originally announced August 2024.

  2. arXiv:2407.18278  [pdf, other]

    cs.SI cs.HC

    Talking Wikidata: Communication patterns and their impact on community engagement in collaborative knowledge graphs

    Authors: Elisavet Koutsiana, Ioannis Reklos, Kholoud Saad Alghamdi, Nitisha Jain, Albert Meroño-Peñuela, Elena Simperl

    Abstract: We study collaboration patterns of Wikidata, one of the world's largest collaborative knowledge graph communities. Wikidata lacks long-term engagement with a small group of priceless members, 0.8%, to be responsible for 80% of contributions. Therefore, it is essential to investigate their behavioural patterns and find ways to enhance their contributions and participation. Previous studies have hig…

    Submitted 24 July, 2024; originally announced July 2024.

  3. arXiv:2407.16883  [pdf, other]

    cs.IR cs.AI cs.CY cs.DB cs.LG

    A Standardized Machine-readable Dataset Documentation Format for Responsible AI

    Authors: Nitisha Jain, Mubashara Akhtar, Joan Giner-Miguelez, Rajat Shinde, Joaquin Vanschoren, Steffen Vogler, Sujata Goswami, Yuhan Rao, Tim Santos, Luis Oala, Michalis Karamousadakis, Manil Maskey, Pierre Marcenac, Costanza Conforti, Michael Kuchnik, Lora Aroyo, Omar Benjelloun, Elena Simperl

    Abstract: Data is critical to advancing AI technologies, yet its quality and documentation remain significant challenges, leading to adverse downstream effects (e.g., potential biases) in AI applications. This paper addresses these issues by introducing Croissant-RAI, a machine-readable metadata format designed to enhance the discoverability, interoperability, and trustworthiness of AI datasets. Croissant-R…

    Submitted 4 June, 2024; originally announced July 2024.

    Comments: 10 pages, appendix

  4. arXiv:2407.09726  [pdf, other]

    cs.CL cs.AI cs.LG

    On Mitigating Code LLM Hallucinations with API Documentation

    Authors: Nihal Jain, Robert Kwiatkowski, Baishakhi Ray, Murali Krishna Ramanathan, Varun Kumar

    Abstract: In this study, we address the issue of API hallucinations in various software engineering contexts. We introduce CloudAPIBench, a new benchmark designed to measure API hallucination occurrences. CloudAPIBench also provides annotations for frequencies of API occurrences in the public domain, allowing us to study API hallucinations at various frequency levels. Our findings reveal that Code LLMs stru…

    Submitted 12 July, 2024; originally announced July 2024.

  5. arXiv:2406.19314  [pdf, other]

    cs.CL cs.AI cs.LG

    LiveBench: A Challenging, Contamination-Free LLM Benchmark

    Authors: Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Ben Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, Micah Goldblum

    Abstract: Test set contamination, wherein test data from a benchmark ends up in a newer model's training set, is a well-documented obstacle for fair LLM evaluation and can quickly render benchmarks obsolete. To mitigate this, many recent benchmarks crowdsource new prompts and evaluations from human or LLM judges; however, these can introduce significant biases, and break down when scoring hard questions. In…

    Submitted 27 June, 2024; originally announced June 2024.

  6. arXiv:2406.15877  [pdf, other]

    cs.SE cs.AI cs.CL

    BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

    Authors: Terry Yue Zhuo, Minh Chien Vu, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul, Simon Brunner, Chen Gong, Thong Hoang, Armel Randy Zebaze, Xiaoheng Hong, Wen-Ding Li, Jean Kaddour, Ming Xu, Zhihan Zhang, Prateek Yadav, Naman Jain, Alex Gu, Zhoujun Cheng, Jiawei Liu, Qian Liu, et al. (8 additional authors not shown)

    Abstract: Automated software engineering has been greatly empowered by the recent advances in Large Language Models (LLMs) for programming. While current benchmarks have shown that LLMs can perform various software engineering tasks like human developers, the majority of their evaluations are limited to short and self-contained algorithmic tasks. Solving challenging and practical programming tasks requires…

    Submitted 26 June, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: 44 pages, 14 figures, 7 tables, built with love by the BigCode community :)

  7. arXiv:2406.10323  [pdf, other]

    cs.CL

    GenQA: Generating Millions of Instructions from a Handful of Prompts

    Authors: Jiuhai Chen, Rifaa Qadri, Yuxin Wen, Neel Jain, John Kirchenbauer, Tianyi Zhou, Tom Goldstein

    Abstract: Most public instruction finetuning datasets are relatively small compared to the closed source datasets used to train industry models. To study questions about finetuning at scale, such as curricula and learning rate cooldown schedules, there is a need for industrial-scale datasets. However, this scale necessitates a data generation process that is almost entirely automated. In this work, we study…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9.5 pages, 6 Figures, and 3 tables in the main body. Dataset available at https://huggingface.co/datasets/tomg-group-umd/GenQA

  8. arXiv:2406.10209  [pdf, other]

    cs.CL

    Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

    Authors: Abhimanyu Hans, Yuxin Wen, Neel Jain, John Kirchenbauer, Hamid Kazemi, Prajwal Singhania, Siddharth Singh, Gowthami Somepalli, Jonas Geiping, Abhinav Bhatele, Tom Goldstein

    Abstract: Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, a randomly sampled subset of tokens are excluded from the loss computation. These dropped tokens are not memorized by the model, which prevents verba…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9.5 pages, 8 figures, and 1 table in the main body. Code available at https://github.com/ahans30/goldfish-loss
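
    Illustration (not from the paper's code): the abstract above says that a randomly sampled subset of tokens is excluded from the loss. A minimal sketch of that idea follows, in plain Python. The function name `goldfish_style_loss`, the `drop_prob` parameter, and the independent per-token coin flip are all assumptions made for illustration; the authors' actual masking scheme may differ, and their repository linked above is the authoritative source.

```python
import random


def goldfish_style_loss(token_losses, drop_prob=0.25, seed=0):
    """Average per-token losses over a randomly kept subset of tokens.

    token_losses: per-token next-token losses (e.g. negative log-likelihoods).
    drop_prob:    hypothetical probability of excluding a token from the loss.
    Returns (mean loss over kept tokens, keep mask).
    """
    rng = random.Random(seed)
    # True means the token contributes to the loss; False means it is dropped.
    keep = [rng.random() >= drop_prob for _ in token_losses]
    kept = [loss for loss, k in zip(token_losses, keep) if k]
    if not kept:
        # Every token was dropped; report zero loss for this sequence.
        return 0.0, keep
    return sum(kept) / len(kept), keep


if __name__ == "__main__":
    loss, mask = goldfish_style_loss([2.3, 0.7, 1.1, 4.0, 0.2], drop_prob=0.25)
    print(loss, mask)
```

    Dropped tokens still appear in the model's input; in this sketch they simply contribute nothing to the averaged loss.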

  9. arXiv:2405.20237  [pdf, other]

    quant-ph cs.AI cs.LG

    Training-efficient density quantum machine learning

    Authors: Brian Coyle, El Amine Cherrat, Nishant Jain, Natansh Mathur, Snehal Raj, Skander Kazdaghli, Iordanis Kerenidis

    Abstract: Quantum machine learning requires powerful, flexible and efficiently trainable models to be successful in solving challenging problems. In this work, we present density quantum neural networks, a learning model incorporating randomisation over a set of trainable unitaries. These models generalise quantum neural networks using parameterised quantum circuits, and allow a trade-off between expressibi…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 17 pages main text, 9 pages appendices. 9 figures

  10. arXiv:2405.17399  [pdf, other]

    cs.LG cs.AI

    Transformers Can Do Arithmetic with the Right Embeddings

    Authors: Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein

    Abstract: The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside of a large span of digits. We mend this problem by adding an embedding to each digit that encodes its position relative to the start of the number. In addition to the boost these embeddings provide on their own, we show that this fix ena…

    Submitted 27 May, 2024; originally announced May 2024.
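
    Illustration (not from the paper): the abstract above describes an embedding that encodes each digit's position relative to the start of its number. The sketch below only computes the position indices that such an embedding table could be looked up with. The helper name `digit_positions`, the character-level tokenisation, and the use of index 0 for non-digit tokens are assumptions made for illustration.

```python
def digit_positions(tokens):
    """Return, for each token, its 1-based offset within the current run of digits.

    Non-digit tokens get index 0 and reset the counter, so every number
    restarts at position 1 regardless of where it sits in the sequence.
    """
    positions = []
    run = 0
    for tok in tokens:
        if len(tok) == 1 and tok in "0123456789":
            run += 1
            positions.append(run)
        else:
            run = 0
            positions.append(0)
    return positions


if __name__ == "__main__":
    print(digit_positions(list("12+345=")))  # [1, 2, 0, 1, 2, 3, 0]
```

    In a model, these indices would select rows of a learned embedding matrix that is added to the usual token embeddings; that step is omitted here.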

  11. arXiv:2405.07828  [pdf, other]

    cs.SI cs.CY

    Can LLMs Help Predict Elections? (Counter)Evidence from the World's Largest Democracy

    Authors: Pratik Gujral, Kshitij Awaldhi, Navya Jain, Bhavuk Bhandula, Abhijnan Chakraborty

    Abstract: The study of how social media affects the formation of public opinion and its influence on political results has been a popular field of inquiry. However, current approaches frequently offer a limited comprehension of the complex political phenomena, yielding inconsistent outcomes. In this work, we introduce a new method: harnessing the capabilities of Large Language Models (LLMs) to examine socia…

    Submitted 13 May, 2024; originally announced May 2024.

  12. arXiv:2404.10934  [pdf, other]

    cs.LG cs.AI cs.CL

    Shears: Unstructured Sparsity with Neural Low-rank Adapter Search

    Authors: J. Pablo Muñoz, Jinjie Yuan, Nilesh Jain

    Abstract: Recently, several approaches successfully demonstrated that weight-sharing Neural Architecture Search (NAS) can effectively explore a search space of elastic low-rank adapters (LoRA), allowing the parameter-efficient fine-tuning (PEFT) and compression of large language models. In this paper, we introduce a novel approach called Shears, demonstrating how the integration of cost-effective sparsity a…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (Industry Track)

  13. arXiv:2403.19546  [pdf, other]

    cs.LG cs.AI cs.DB cs.IR

    Croissant: A Metadata Format for ML-Ready Datasets

    Authors: Mubashara Akhtar, Omar Benjelloun, Costanza Conforti, Pieter Gijsbers, Joan Giner-Miguelez, Nitisha Jain, Michael Kuchnik, Quentin Lhoest, Pierre Marcenac, Manil Maskey, Peter Mattson, Luis Oala, Pierre Ruyssen, Rajat Shinde, Elena Simperl, Goeffry Thomas, Slava Tykhonov, Joaquin Vanschoren, Jos van der Velde, Steffen Vogler, Carole-Jean Wu

    Abstract: Data is a critical resource for Machine Learning (ML), yet working with data remains a key friction point. This paper introduces Croissant, a metadata format for datasets that simplifies how data is used by ML tools and frameworks. Croissant makes datasets more discoverable, portable and interoperable, thereby addressing significant challenges in ML data management and responsible AI. Croissant is…

    Submitted 30 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Published in Proceedings of ACM SIGMOD/PODS'24 Data Management for End-to-End Machine Learning (DEEM) Workshop https://dl.acm.org/doi/10.1145/3650203.3663326

  14. arXiv:2403.13190  [pdf, other]

    cs.CV

    3D Semantic MapNet: Building Maps for Multi-Object Re-Identification in 3D

    Authors: Vincent Cartillier, Neha Jain, Irfan Essa

    Abstract: We study the task of 3D multi-object re-identification from embodied tours. Specifically, an agent is given two tours of an environment (e.g. an apartment) under two different layouts (e.g. arrangements of furniture). Its task is to detect and re-identify objects in 3D - e.g. a "sofa" moved from location A to B, a new "chair" in the second layout at location C, or a "lamp" from location D in the f…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 8 pages

  15. arXiv:2403.12236  [pdf, other]

    cs.LG cs.CV

    Improving Generalization via Meta-Learning on Hard Samples

    Authors: Nishant Jain, Arun S. Suggala, Pradeep Shenoy

    Abstract: Learned reweighting (LRW) approaches to supervised learning use an optimization criterion to assign weights for training instances, in order to maximize performance on a representative validation dataset. We pose and formalize the problem of optimized selection of the validation set used in LRW training, to improve classifier generalization. In particular, we show that using hard-to-classify insta…

    Submitted 29 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted at CVPR 2024

  16. arXiv:2403.10131  [pdf, other]

    cs.CL cs.AI

    RAFT: Adapting Language Model to Domain Specific RAG

    Authors: Tianjun Zhang, Shishir G. Patil, Naman Jain, Sheng Shen, Matei Zaharia, Ion Stoica, Joseph E. Gonzalez

    Abstract: Pretraining Large Language Models (LLMs) on large corpora of textual data is now a standard paradigm. When using these LLMs for many downstream applications, it is common to additionally bake in new knowledge (e.g., time-critical news, or private domain knowledge) into the pretrained model either through RAG-based-prompting, or fine-tuning. However, the optimal methodology for the model to gain su…

    Submitted 5 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  17. arXiv:2403.07974  [pdf, other]

    cs.SE cs.CL cs.LG

    LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

    Authors: Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, Ion Stoica

    Abstract: Large Language Models (LLMs) applied to code-related applications have emerged as a prominent field, attracting significant interest from both academia and industry. However, as new and improved LLMs are developed, existing evaluation benchmarks (e.g., HumanEval, MBPP) are no longer sufficient for assessing their capabilities. In this work, we propose LiveCodeBench, a comprehensive and contaminati…

    Submitted 6 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Website - https://livecodebench.github.io/

  18. arXiv:2402.19475  [pdf, other]

    cs.SE cs.AI cs.LG

    The Counterfeit Conundrum: Can Code Language Models Grasp the Nuances of Their Incorrect Generations?

    Authors: Alex Gu, Wen-Ding Li, Naman Jain, Theo X. Olausson, Celine Lee, Koushik Sen, Armando Solar-Lezama

    Abstract: While language models are increasingly more proficient at code generation, they still frequently generate incorrect programs. Many of these programs are obviously wrong, but others are more subtle and pass weaker correctness checks such as being able to compile. In this work, we focus on these counterfeit samples: programs sampled from a language model that 1) have a high enough log-probability to…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 54 pages, 25 figures

  19. arXiv:2402.19173  [pdf, other]

    cs.SE cs.AI

    StarCoder 2 and The Stack v2: The Next Generation

    Authors: Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, et al. (41 additional authors not shown)

    Abstract: The BigCode project, an open-scientific collaboration focused on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In partnership with Software Heritage (SWH), we build The Stack v2 on top of the digital commons of their source code archive. Alongside the SWH repositories spanning 619 programming languages, we carefully select other high-quality data…

    Submitted 29 February, 2024; originally announced February 2024.

  20. arXiv:2402.03671  [pdf, other]

    cs.DC

    ARGO: An Auto-Tuning Runtime System for Scalable GNN Training on Multi-Core Processor

    Authors: Yi-Chien Lin, Yuyang Chen, Sameh Gobriel, Nilesh Jain, Gopi Krishna Jha, Viktor Prasanna

    Abstract: As Graph Neural Networks (GNNs) become popular, libraries like PyTorch-Geometric (PyG) and Deep Graph Library (DGL) are proposed; these libraries have emerged as the de facto standard for implementing GNNs because they provide graph-oriented APIs and are purposefully designed to manage the inherent sparsity and irregularity in graph structures. However, these libraries show poor scalability on mul…

    Submitted 27 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: To appear in IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2024

  21. arXiv:2311.14904  [pdf, other]

    cs.LG cs.SE

    LLM-Assisted Code Cleaning For Training Accurate Code Generators

    Authors: Naman Jain, Tianjun Zhang, Wei-Lin Chiang, Joseph E. Gonzalez, Koushik Sen, Ion Stoica

    Abstract: Natural language to code generation is an important application area of LLMs and has received wide attention from the community. The majority of relevant studies have exclusively concentrated on increasing the quantity and functional correctness of training sets while disregarding other stylistic elements of programs. More recently, data quality has garnered a lot of interest and multiple works ha…

    Submitted 24 November, 2023; originally announced November 2023.

  22. arXiv:2311.04521  [pdf, other]

    cs.CV

    Learning Robust Multi-Scale Representation for Neural Radiance Fields from Unposed Images

    Authors: Nishant Jain, Suryansh Kumar, Luc Van Gool

    Abstract: We introduce an improved solution to the neural image-based rendering problem in computer vision. Given a set of images taken from a freely moving camera at train time, the proposed approach could synthesize a realistic image of the scene from a novel viewpoint at test time. The key ideas presented in this paper are (i) Recovering accurate camera parameters via a robust pipeline from unposed day-t…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Accepted for publication at International Journal of Computer Vision (IJCV). Draft info: 22 pages, 12 figures and 14 tables

  23. arXiv:2311.00737  [pdf]

    cs.LG physics.ins-det physics.med-ph

    Real-Time Magnetic Tracking and Diagnosis of COVID-19 via Machine Learning

    Authors: Dang Nguyen, Phat K. Huynh, Vinh Duc An Bui, Kee Young Hwang, Nityanand Jain, Chau Nguyen, Le Huu Nhat Minh, Le Van Truong, Xuan Thanh Nguyen, Dinh Hoang Nguyen, Le Tien Dung, Trung Q. Le, Manh-Huong Phan

    Abstract: The COVID-19 pandemic underscored the importance of reliable, noninvasive diagnostic tools for robust public health interventions. In this work, we fused magnetic respiratory sensing technology (MRST) with machine learning (ML) to create a diagnostic platform for real-time tracking and diagnosis of COVID-19 and other respiratory diseases. The MRST precisely captures breathing patterns through thre…

    Submitted 1 November, 2023; originally announced November 2023.

  24. arXiv:2310.11248  [pdf, other]

    cs.LG cs.CL cs.SE

    CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion

    Authors: Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Hantian Ding, Ming Tan, Nihal Jain, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang

    Abstract: Code completion models have made significant progress in recent years, yet current popular evaluation datasets, such as HumanEval and MBPP, predominantly focus on code completion tasks within a single file. This over-simplified setting falls short of representing the real-world software development scenario where repositories span multiple files with numerous cross-file dependencies, and accessing…

    Submitted 16 November, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: To appear at NeurIPS 2023 (Datasets and Benchmarks Track)

  25. arXiv:2310.05914  [pdf, other]

    cs.CL cs.LG

    NEFTune: Noisy Embeddings Improve Instruction Finetuning

    Authors: Neel Jain, Ping-yeh Chiang, Yuxin Wen, John Kirchenbauer, Hong-Min Chu, Gowthami Somepalli, Brian R. Bartoldson, Bhavya Kailkhura, Avi Schwarzschild, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

    Abstract: We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation. NEFTune adds noise to the embedding vectors during training. Standard finetuning of LLaMA-2-7B using Alpaca achieves 29.79% on AlpacaEval, which rises to 64.69% using noisy embeddings. NEFTune also improves over strong baselines on modern instruction datasets. Models trained with Evol-Instru…

    Submitted 10 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: 25 pages, Code is available on Github: https://github.com/neelsjain/NEFTune
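
    Illustration (not from the paper's code): the abstract above states that noise is added to the embedding vectors during training. The sketch below shows one way to do that in plain Python. The function name `add_embedding_noise`, the uniform noise distribution, the default `alpha`, and the `alpha / sqrt(seq_len * dim)` scaling are assumptions made for illustration; the authors' repository linked above is the authoritative source for the actual recipe.

```python
import math
import random


def add_embedding_noise(embeddings, alpha=5.0, seed=0, training=True):
    """Return a copy of `embeddings` with small uniform noise added.

    embeddings: list of token embedding vectors (list of lists of floats).
    alpha:      hypothetical noise-strength hyperparameter.
    training:   noise is applied only when True; at evaluation time the
                embeddings are returned unchanged.
    """
    if not training:
        return [row[:] for row in embeddings]
    rng = random.Random(seed)
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    # Assumed scaling: noise magnitude shrinks with sequence length and width.
    scale = alpha / math.sqrt(seq_len * dim)
    return [[x + scale * rng.uniform(-1.0, 1.0) for x in row] for row in embeddings]


if __name__ == "__main__":
    clean = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.4]]
    print(add_embedding_noise(clean, alpha=5.0))
```

    The noise is applied to the embedding output only; the rest of the forward pass and the loss are unchanged in this sketch.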

  26. arXiv:2309.08491  [pdf, other]

    cs.CL cs.AI

    Using Large Language Models for Knowledge Engineering (LLMKE): A Case Study on Wikidata

    Authors: Bohui Zhang, Ioannis Reklos, Nitisha Jain, Albert Meroño Peñuela, Elena Simperl

    Abstract: In this work, we explore the use of Large Language Models (LLMs) for knowledge engineering tasks in the context of the ISWC 2023 LM-KBC Challenge. For this task, given subject and relation pairs sourced from Wikidata, we utilize pre-trained LLMs to produce the relevant objects in string format and link them to their respective Wikidata QIDs. We developed a pipeline using LLMs for Knowledge Enginee…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: Knowledge Base Construction from Pre-trained Language Models (LM-KBC) Challenge @ ISWC 2023

  27. arXiv:2309.07499  [pdf, other]

    cs.CV

    Efficiently Robustify Pre-trained Models

    Authors: Nishant Jain, Harkirat Behl, Yogesh Singh Rawat, Vibhav Vineet

    Abstract: A recent trend in deep learning algorithms has been towards training large scale models, having high parameter count and trained on big dataset. However, robustness of such large scale models towards real-world settings is still a less-explored topic. In this work, we first benchmark the performance of these models under different perturbations and datasets thereby representing real-world shifts,…

    Submitted 14 September, 2023; originally announced September 2023.

  28. arXiv:2309.00614  [pdf, other]

    cs.LG cs.CL cs.CR

    Baseline Defenses for Adversarial Attacks Against Aligned Language Models

    Authors: Neel Jain, Avi Schwarzschild, Yuxin Wen, Gowthami Somepalli, John Kirchenbauer, Ping-yeh Chiang, Micah Goldblum, Aniruddha Saha, Jonas Geiping, Tom Goldstein

    Abstract: As Large Language Models quickly become ubiquitous, it becomes critical to understand their security vulnerabilities. Recent work shows that text optimizers can produce jailbreaking prompts that bypass moderation and alignment. Drawing from the rich body of work on adversarial machine learning, we approach these attacks with three questions: What threat models are practically useful in this domain…

    Submitted 4 September, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: 12 pages

  29. arXiv:2308.03854  [pdf, ps, other]

    cs.DB cs.AI cs.HC cs.LG

    Revisiting Prompt Engineering via Declarative Crowdsourcing

    Authors: Aditya G. Parameswaran, Shreya Shankar, Parth Asawa, Naman Jain, Yujie Wang

    Abstract: Large language models (LLMs) are incredibly powerful at comprehending and generating data in the form of text, but are brittle and error-prone. There has been an advent of toolkits and recipes centered around so-called prompt engineering-the process of asking an LLM to do something via a series of prompts. However, for LLM-powered data processing workflows, in particular, optimizing for quality, w…

    Submitted 7 August, 2023; originally announced August 2023.

  30. arXiv:2307.13098  [pdf, other]

    quant-ph cs.CR

    Quantum key distribution for data center security -- a feasibility study

    Authors: Nitin Jain, Ulrich Hoff, Marco Gambetta, Jesper Rodenberg, Tobias Gehring

    Abstract: Data centers are nowadays referred to as the digital world's cornerstone. Quantum key distribution (QKD) is a method that solves the problem of distributing cryptographic keys between two entities, with the security rooted in the laws of quantum physics. This document provides an assessment of the need and opportunity for ushering QKD in data centers. Together with technical examples and inputs on…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 23 pages, 7 figures, study initiated and supported by Copenhagen Fintech (see https://www.copenhagenfintech.dk/projects/using-qkd-for-data-center-security)

  31. arXiv:2307.12465  [pdf, other]

    cs.SE

    StaticFixer: From Static Analysis to Static Repair

    Authors: Naman Jain, Shubham Gandhi, Atharv Sonwane, Aditya Kanade, Nagarajan Natarajan, Suresh Parthasarathy, Sriram Rajamani, Rahul Sharma

    Abstract: Static analysis tools are traditionally used to detect and flag programs that violate properties. We show that static analysis tools can also be used to perturb programs that satisfy a property to construct variants that violate the property. Using this insight we can construct paired data sets of unsafe-safe program pairs, and learn strategies to automatically repair property violations. We prese…

    Submitted 23 July, 2023; originally announced July 2023.

  32. arXiv:2306.13651  [pdf, other]

    cs.CL cs.LG

    Bring Your Own Data! Self-Supervised Evaluation for Large Language Models

    Authors: Neel Jain, Khalid Saifullah, Yuxin Wen, John Kirchenbauer, Manli Shu, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

    Abstract: With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is imperative. For example, a company deploying a client-facing chatbot must ensure that the model will not respond to client requests with profanity. Current evaluations approach this problem using small, domain-specific datasets with human-curated…

    Submitted 29 June, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

    Comments: Code is available at https://github.com/neelsjain/BYOD. First two authors contributed equally. 21 pages, 22 figures

  33. arXiv:2306.11766  [pdf, other]

    cs.HC

    Agreeing and Disagreeing in Collaborative Knowledge Graph Construction: An Analysis of Wikidata

    Authors: Elisavet Koutsiana, Tushita Yadav, Nitisha Jain, Albert Meroño-Peñuela, Elena Simperl

    Abstract: In this work, we study disagreement in discussions around Wikidata, an online knowledge community that builds the data backend of Wikipedia. Discussions are important in collaborative work as they can increase contributor performance and encourage the emergence of shared norms and practices. While disagreements can play a productive role in discussions, they can also lead to conflicts and controve…

    Submitted 23 July, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

  34. arXiv:2305.12528  [pdf, other]

    cs.IR

    IR Models and the COVID-19 Pandemic: A Comparative Study of Performance and Challenges

    Authors: Moksh Shukla, Nitik Jain, Shubham Gupta

    Abstract: This research study investigates the efficiency of different information retrieval (IR) systems in accessing relevant information from the scientific literature during the COVID-19 pandemic. The study applies the TREC framework to the COVID-19 Open Research Dataset (CORD-19) and evaluates BM25, Contriever, and Bag of Embeddings IR frameworks. The objective is to build a test collection for search…

    Submitted 21 May, 2023; originally announced May 2023.

    Comments: 7 pages, 2 figures

  35. arXiv:2305.07205  [pdf, other]

    cs.IR cs.AI cs.LG

    Mem-Rec: Memory Efficient Recommendation System using Alternative Representation

    Authors: Gopi Krishna Jha, Anthony Thomas, Nilesh Jain, Sameh Gobriel, Tajana Rosing, Ravi Iyer

    Abstract: Deep learning-based recommendation systems (e.g., DLRMs) are widely used AI models to provide high-quality personalized recommendations. Training data used for modern recommendation systems commonly includes categorical features taking on tens-of-millions of possible distinct values. These categorical tokens are typically assigned learned vector representations, that are stored in large embedding…

    Submitted 14 May, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

  36. arXiv:2305.05904  [pdf]

    cs.CR

    Revisiting Fully Homomorphic Encryption Schemes

    Authors: Nimish Jain, Aswani Kumar Cherukuri

    Abstract: Homomorphic encryption is a sophisticated encryption technique that allows computations on encrypted data to be done without the requirement for decryption. This trait makes homomorphic encryption appropriate for safe computation in sensitive data scenarios, such as cloud computing, medical data exchange, and financial transactions. The data is encrypted using a public key in homomorphic encryptio…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: A quick summary of Fully Homomorphic Encryption Schemes along with their background, concepts, applications and open-source libraries

  37. arXiv:2304.02913  [pdf, ps, other]

    cs.IT

    Reversible and Reversible Complement Cyclic codes over a class of non-chain rings

    Authors: Nikita Jain, Sucheta Dutt, Ranjeet Sehmi

    Abstract: In this paper, necessary and sufficient conditions for a cyclic code of arbitrary length over the non-chain rings $Z_{4}+νZ_{4}$ for $ν^{2} \in \{0,1,ν,2ν,3ν,2+ν,2+3ν,3+2ν\}$ to be a reversible cyclic code have been established. Also, conditions for a cyclic code over these non-chain rings to be a reversible complement cyclic code which are necessary as well as sufficient have been determined. Som…

    Submitted 6 April, 2023; originally announced April 2023.

  38. arXiv:2303.17094  [pdf, other]

    cs.CV cs.GR

    Enhanced Stable View Synthesis

    Authors: Nishant Jain, Suryansh Kumar, Luc Van Gool

    Abstract: We introduce an approach to enhance the novel view synthesis from images taken from a freely moving camera. The introduced approach focuses on outdoor scenes where recovering accurate geometric scaffold and camera pose is challenging, leading to inferior results using the state-of-the-art stable view synthesis (SVS) method. SVS and related methods fail for outdoor scenes primarily due to (i) over-…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: Accepted to IEEE/CVF CVPR 2023. Draft info: 13 pages, 6 Figures, 7 Tables

  39. arXiv:2303.09711  [pdf, other]

    cs.HC cs.RO

    Trust in Shared Automated Vehicles: Study on Two Mobility Platforms

    Authors: Shashank Mehrotra, Jacob G Hunter, Matthew Konishi, Kumar Akash, Zhaobo Zheng, Teruhisa Misu, Anil Kumar, Tahira Reid, Neera Jain

    Abstract: The ever-increasing adoption of shared transportation modalities across the United States has the potential to fundamentally change the preferences and usage of different mobilities. It also raises several challenges with respect to the design and development of automated mobilities that can enable a large population to take advantage of this emergent technology. One such challenge is the lack of…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: https://trid.trb.org/view/2117834

    Journal ref: Transportation Research Board 102nd Annual Meeting, Washington DC, United States, 1-12 Jan 2023, No. TRBAM-23-04456. 2023

  40. arXiv:2303.07800  [pdf, ps, other]

    cs.IT

    Structure and Rank of Cyclic codes over a class of non-chain rings

    Authors: Nikita Jain, Sucheta Dutt, Ranjeet Sehmi

    Abstract: The rings $Z_{4}+νZ_{4}$ have been classified into chain rings and non-chain rings on the basis of the values of $ν^{2} \in Z_{4}+νZ_{4}.$ In this paper, the structure of cyclic codes of arbitrary length over the rings $Z_{4}+νZ_{4}$ for those values of $ν^{2}$ for which these are non-chain rings has been established. A unique form of generators of these codes has also been obtained. Further, rank…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: 11 pages

    MSC Class: 94B15; 20M05; 15A03; 54A25; 13C12 (Primary)

    ACM Class: F.2.2

  41. arXiv:2302.03668  [pdf, other]

    cs.LG cs.CL

    Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery

    Authors: Yuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Geiping, Tom Goldstein

    Abstract: The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be hand-crafted by humans. There are also "soft" prompts, which consist of continuous feature vectors. These can be discovered using powerful optimization methods, but they cannot be easily interpreted, re-used acr…

    Submitted 1 June, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: 15 pages, 12 figures, Code is available at https://github.com/YuxinWenRick/hard-prompts-made-easy

  42. arXiv:2302.02249  [pdf, other]

    cs.CV cs.AI

    Self-supervised Multi-view Disentanglement for Expansion of Visual Collections

    Authors: Nihal Jain, Praneetha Vaddamanu, Paridhi Maheshwari, Vishwa Vinay, Kuldeep Kulkarni

    Abstract: Image search engines enable the retrieval of images relevant to a query image. In this work, we consider the setting where a query for similar images is derived from a collection of images. For visual search, the similarity measurements may be made along multiple axes, or views, such as style and color. We assume access to a set of feature extractors, each of which computes representations for a s…

    Submitted 4 February, 2023; originally announced February 2023.

    Comments: A version of this paper has been accepted at WSDM 2023

  43. arXiv:2212.05987  [pdf, other]

    cs.LG

    Selective classification using a robust meta-learning approach

    Authors: Nishant Jain, Karthikeyan Shanmugam, Pradeep Shenoy

    Abstract: Predictive uncertainty, a model's self-awareness regarding its accuracy on an input, is key for both building robust models via training interventions and for test-time applications such as selective classification. We propose a novel instance-conditioned reweighting approach that captures predictive uncertainty using an auxiliary network and unifies these train- and test-time applications. The auxi…

    Submitted 2 January, 2024; v1 submitted 12 December, 2022; originally announced December 2022.

  44. arXiv:2212.05908  [pdf, other]

    cs.LG

    Instance-Conditional Timescales of Decay for Non-Stationary Learning

    Authors: Nishant Jain, Pradeep Shenoy

    Abstract: Slow concept drift is a ubiquitous, yet under-studied problem in practical machine learning systems. In such settings, although recent data is more indicative of future data, naively prioritizing recent instances runs the risk of losing valuable information from the past. We propose an optimization-driven approach towards balancing instance importance over large training windows. First, we model i…

    Submitted 20 December, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

    Comments: Accepted at AAAI 2024

  45. Towards Realistic Underwater Dataset Generation and Color Restoration

    Authors: Neham Jain, Gopi Matta, Kaushik Mitra

    Abstract: Recovery of true color from underwater images is an ill-posed problem. This is because the wide-band attenuation coefficients for the RGB color channels depend on object range, reflectance, etc. which are difficult to model. Also, there is backscattering due to suspended particles in water. Thus, most existing deep-learning based color restoration methods, which are trained on synthetic underwater…

    Submitted 16 December, 2022; v1 submitted 27 November, 2022; originally announced November 2022.

    Comments: Published at The Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP) 2022

  46. arXiv:2210.04233  [pdf, other]

    cs.CV

    Robustifying the Multi-Scale Representation of Neural Radiance Fields

    Authors: Nishant Jain, Suryansh Kumar, Luc Van Gool

    Abstract: Neural Radiance Fields (NeRF) recently emerged as a new paradigm for object representation from multi-view (MV) images. Yet, it cannot handle multi-scale (MS) images and camera pose estimation errors, which generally is the case with multi-view images captured from a day-to-day commodity camera. Although recently proposed Mip-NeRF could handle multi-scale imaging problems with NeRF, it cannot hand…

    Submitted 9 October, 2022; originally announced October 2022.

    Comments: Accepted for publication at British Machine Vision Conference (BMVC) 2022. Draft info: 13 pages, 3 Figures, and 4 Tables

  47. arXiv:2210.01185  [pdf, other]

    cs.CL

    ContraCLM: Contrastive Learning For Causal Language Model

    Authors: Nihal Jain, Dejiao Zhang, Wasi Uddin Ahmad, Zijian Wang, Feng Nan, Xiaopeng Li, Ming Tan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Xiaofei Ma, Bing Xiang

    Abstract: Despite exciting progress in causal language models, the expressiveness of the representations is largely limited due to poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework at both token-level and sequence-level. We assess ContraCLM on a variety of downstream tasks. We show that ContraCLM enhances discrimination of the representations and…

    Submitted 2 May, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: 10 pages

    Journal ref: ACL 2023

  48. The Interaction Gap: A Step Toward Understanding Trust in Autonomous Vehicles Between Encounters

    Authors: Jacob G. Hunter, Matthew Konishi, Neera Jain, Kumar Akash, Xingwei Wu, Teruhisa Misu, Tahira Reid

    Abstract: Shared autonomous vehicles (SAVs) will be introduced in greater numbers over the coming decade. Due to rapid advances in shared mobility and the slower development of fully autonomous vehicles (AVs), SAVs will likely be deployed before privately-owned AVs. Moreover, existing shared mobility services are transitioning their vehicle fleets toward those with increasingly higher levels of driving auto…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: 5 pages, 3 figures

    Journal ref: Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 2022, 66(1), 147-151

  49. arXiv:2209.09868  [pdf, other]

    cs.LG cs.NE

    Streaming Encoding Algorithms for Scalable Hyperdimensional Computing

    Authors: Anthony Thomas, Behnam Khaleghi, Gopi Krishna Jha, Sanjoy Dasgupta, Nageen Himayat, Ravi Iyer, Nilesh Jain, Tajana Rosing

    Abstract: Hyperdimensional computing (HDC) is a paradigm for data representation and learning originating in computational neuroscience. HDC represents data as high-dimensional, low-precision vectors which can be used for a variety of information processing tasks like learning or recall. The mapping to high-dimensional space is a fundamental problem in HDC, and existing methods encounter scalability issues…

    Submitted 8 February, 2023; v1 submitted 20 September, 2022; originally announced September 2022.

  50. arXiv:2209.07413  [pdf, other]

    cs.LG cs.CV cs.NE

    EZNAS: Evolving Zero Cost Proxies For Neural Architecture Scoring

    Authors: Yash Akhauri, J. Pablo Munoz, Nilesh Jain, Ravi Iyer

    Abstract: Neural Architecture Search (NAS) has significantly improved productivity in the design and deployment of neural networks (NN). As NAS typically evaluates multiple models by training them partially or completely, the improved productivity comes at the cost of significant carbon footprint. To alleviate this expensive training routine, zero-shot/cost proxies analyze an NN at initialization to generat…

    Submitted 21 December, 2022; v1 submitted 15 September, 2022; originally announced September 2022.