Showing 1–50 of 76 results for author: Goyal, S

Searching in archive cs.
  1. arXiv:2404.07177  [pdf, other]

    cs.LG

    Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic

    Authors: Sachin Goyal, Pratyush Maini, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter

    Abstract: Vision-language models (VLMs) are trained for thousands of GPU hours on carefully curated web datasets. In recent times, data curation has gained prominence with several works developing strategies to retain 'high-quality' subsets of 'raw' scraped data. For instance, the LAION public dataset retained only 10% of the total crawled data. However, these strategies are typically developed agnostic of…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Published at CVPR 2024

  2. arXiv:2403.09057  [pdf, other]

    cs.CL cs.AI

    A Continued Pretrained LLM Approach for Automatic Medical Note Generation

    Authors: Dong Yuan, Eti Rastogi, Gautam Naik, Sree Prasanna Rajagopal, Sagar Goyal, Fen Zhao, Bharath Chintagunta, Jeff Ward

    Abstract: LLMs are revolutionizing NLP tasks. However, the use of the most advanced LLMs, such as GPT-4, is often prohibitively expensive for most specialized fields. We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing. Our results demonstrate that HEAL outperforms GPT-4 and PMC-LLaMA in PubMedQA, with an a…

    Submitted 3 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted to NAACL 2024

  3. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  4. arXiv:2403.00826  [pdf, other]

    cs.CL cs.CR cs.LG

    LLMGuard: Guarding Against Unsafe LLM Behavior

    Authors: Shubh Goyal, Medha Hira, Shubham Mishra, Sukriti Goyal, Arnav Goel, Niharika Dadu, Kirushikesh DB, Sameep Mehta, Nishtha Madaan

    Abstract: Although the rise of Large Language Models (LLMs) in enterprise settings brings new opportunities and capabilities, it also brings challenges, such as the risk of generating inappropriate, biased, or misleading content that violates regulations and can have legal concerns. To alleviate this, we present "LLMGuard", a tool that monitors user interactions with an LLM application and flags content aga…

    Submitted 27 February, 2024; originally announced March 2024.

    Comments: accepted in demonstration track of AAAI-24

  5. arXiv:2402.10567  [pdf, other]

    cs.CL cs.AI

    InSaAF: Incorporating Safety through Accuracy and Fairness | Are LLMs ready for the Indian Legal Domain?

    Authors: Yogesh Tripathi, Raghav Donakanti, Sahil Girhepuje, Ishan Kavathekar, Bhaskara Hanuma Vedula, Gokul S Krishnan, Shreya Goyal, Anmol Goel, Balaraman Ravindran, Ponnurangam Kumaraguru

    Abstract: Recent advancements in language technology and Artificial Intelligence have resulted in numerous Language Models being proposed to perform various tasks in the legal domain ranging from predicting judgments to generating summaries. Despite their immense potential, these models have been proven to learn and exhibit societal biases and make unfair predictions. In this study, we explore the ability o…

    Submitted 17 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  6. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2311.09784  [pdf, other]

    cs.LO cs.AI cs.SE

    Automatic Generation of Scenarios for System-level Simulation-based Verification of Autonomous Driving Systems

    Authors: Srajan Goyal, Alberto Griggio, Jacob Kimblad, Stefano Tonetta

    Abstract: With the increasing complexity of Automated Driving Systems (ADS), ensuring their safety and reliability has become a critical challenge. The Verification and Validation (V&V) of these systems are particularly demanding when AI components are employed to implement perception and/or control functions. In the ESA-funded project VIVAS, we developed a generic framework for system-level simulation-based V&V of…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: In Proceedings FMAS 2023, arXiv:2311.08987

    Journal ref: EPTCS 395, 2023, pp. 113-129

  8. arXiv:2310.10294  [pdf, other]

    cs.CL cs.AI

    Key-phrase boosted unsupervised summary generation for FinTech organization

    Authors: Aadit Deshpande, Shreya Goyal, Prateek Nagwanshi, Avinash Tripathy

    Abstract: With the recent advances in social media, the use of NLP techniques in social media data analysis has become an emerging research direction. Business organizations can particularly benefit from such an analysis of social media discourse, providing an external perspective on consumer behavior. Some of the NLP applications such as intent detection, sentiment classification, text summarization can he…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 8 pages, 4 figures

  9. arXiv:2310.02372  [pdf, other]

    cs.CL cs.AI

    ProtoNER: Few shot Incremental Learning for Named Entity Recognition using Prototypical Networks

    Authors: Ritesh Kumar, Saurabh Goyal, Ashish Verma, Vatche Isahagian

    Abstract: Key value pair (KVP) extraction or Named Entity Recognition (NER) from visually rich documents has been an active area of research in the document understanding and data extraction domain. Several transformer-based models such as LayoutLMv2, LayoutLMv3, and LiLT have emerged, achieving state-of-the-art results. However, addition of even a single new class to the existing model requires (a) re-annotation…

    Submitted 3 October, 2023; originally announced October 2023.

  10. arXiv:2310.02226  [pdf, other]

    cs.CL cs.AI cs.LG

    Think before you speak: Training Language Models With Pause Tokens

    Authors: Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, Vaishnavh Nagarajan

    Abstract: Language models generate responses by producing a series of tokens in immediate succession: the $(K+1)^{th}$ token is an outcome of manipulating $K$ hidden vectors per layer, one vector per preceding token. What if instead we were to let the model manipulate say, $K+10$ hidden vectors, before it outputs the $(K+1)^{th}$ token? We operationalize this idea by performing training and inference on lan…

    Submitted 20 April, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: Published at ICLR 2024

  11. arXiv:2308.13204  [pdf, other]

    cs.CV

    Self-supervised learning for hotspot detection and isolation from thermal images

    Authors: Shreyas Goyal, Jagath C. Rajapakse

    Abstract: Hotspot detection using thermal imaging has recently become essential in several industrial applications, such as security applications, health applications, and equipment monitoring applications. Hotspot detection is of utmost importance in industrial safety where equipment can develop anomalies. Hotspots are early indicators of such anomalies. We address the problem of hotspot detection in therm…

    Submitted 25 August, 2023; originally announced August 2023.

  12. arXiv:2307.03132  [pdf, other]

    cs.CV cs.CL cs.LG

    T-MARS: Improving Visual Representations by Circumventing Text Feature Learning

    Authors: Pratyush Maini, Sachin Goyal, Zachary C. Lipton, J. Zico Kolter, Aditi Raghunathan

    Abstract: Large web-sourced multimodal datasets have powered a slew of new methods for learning general-purpose visual representations, advancing the state of the art in computer vision and revolutionizing zero- and few-shot recognition. One crucial decision facing practitioners is how, if at all, to curate these ever-larger datasets. For example, the creators of the LAION-5B dataset chose to retain only im…

    Submitted 18 March, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: Accepted to ICLR 2024. Oral at ICCV Datacomp 2023

  13. arXiv:2306.14939  [pdf, other]

    cs.CL cs.LG

    The Art of Embedding Fusion: Optimizing Hate Speech Detection

    Authors: Mohammad Aflah Khan, Neemesh Yadav, Mohit Jain, Sanyam Goyal

    Abstract: Hate speech detection is a challenging natural language processing task that requires capturing linguistic and contextual nuances. Pre-trained language models (PLMs) offer rich semantic representations of text that can improve this task. However, there is still limited knowledge about ways to effectively combine representations across PLMs and leverage their complementary strengths. In this work, w…

    Submitted 8 October, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

    Comments: Published as a Tiny Paper at ICLR 2023, 12 Pages

  14. arXiv:2306.13841  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Is Pre-training Truly Better Than Meta-Learning?

    Authors: Brando Miranda, Patrick Yu, Saumya Goyal, Yu-Xiong Wang, Sanmi Koyejo

    Abstract: In the context of few-shot learning, it is currently believed that a fixed pre-trained (PT) model, along with fine-tuning the final layer during evaluation, outperforms standard meta-learning algorithms. We re-evaluate these claims under an in-depth empirical examination of an extensive set of formally diverse datasets and compare PT to Model Agnostic Meta-Learning (MAML). Unlike previous work, we…

    Submitted 23 June, 2023; originally announced June 2023.

    Journal ref: Proceedings of the 40th International Conference on Machine Learning 2023 DMLR Workshop

  15. arXiv:2303.11761  [pdf, other]

    cs.LG cs.DC cs.SE

    Reasonable Scale Machine Learning with Open-Source Metaflow

    Authors: Jacopo Tagliabue, Hugo Bowne-Anderson, Ville Tuulos, Savin Goyal, Romain Cledat, David Berg

    Abstract: As Machine Learning (ML) gains adoption across industries and new use cases, practitioners increasingly realize the challenges around effectively developing and iterating on ML systems: reproducibility, debugging, scalability, and documentation are elusive goals for real-world pipelines outside tech-first companies. In this paper, we review the nature of ML-oriented workloads and argue that re-pur…

    Submitted 21 March, 2023; originally announced March 2023.

  16. arXiv:2303.11548  [pdf, other]

    cs.CV

    Emotionally Enhanced Talking Face Generation

    Authors: Sahil Goyal, Shagun Uppal, Sarthak Bhagat, Yi Yu, Yifang Yin, Rajiv Ratn Shah

    Abstract: Several works have developed end-to-end pipelines for generating lip-synced talking faces with various real-world applications, such as teaching and language translation in videos. However, these prior works fail to create realistic-looking videos since they focus little on people's expressions and emotions. Moreover, these methods' effectiveness largely depends on the faces in the training datase…

    Submitted 26 March, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

  17. arXiv:2303.07247  [pdf]

    cs.CL cs.CY

    Are Models Trained on Indian Legal Data Fair?

    Authors: Sahil Girhepuje, Anmol Goel, Gokul S Krishnan, Shreya Goyal, Satyendra Pandey, Ponnurangam Kumaraguru, Balaraman Ravindran

    Abstract: Recent advances and applications of language technology and artificial intelligence have enabled much success across multiple domains like law, medical and mental health. AI-based Language Models, like Judgement Prediction, have recently been proposed for the legal sector. However, these models are rife with encoded social biases picked up from the training data. While bias and fairness have bee…

    Submitted 14 May, 2024; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: Presented at the Symposium on AI and Law (SAIL) 2023

  18. arXiv:2302.08624  [pdf, other]

    cs.CL cs.LG

    InstructABSA: Instruction Learning for Aspect Based Sentiment Analysis

    Authors: Kevin Scaria, Himanshu Gupta, Siddharth Goyal, Saurabh Arjun Sawant, Swaroop Mishra, Chitta Baral

    Abstract: We introduce InstructABSA, an instruction learning paradigm for Aspect-Based Sentiment Analysis (ABSA) subtasks. Our method introduces positive, negative, and neutral examples to each training sample, and instruction-tunes the model (Tk-Instruct) for ABSA subtasks, yielding significant performance improvements. Experimental results on the SemEval 2014, 15, and 16 datasets demonstrate that Instruct…

    Submitted 13 November, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: 4 pages, 3 figures, 9 tables, 9 appendix pages

  19. arXiv:2301.05150  [pdf, other]

    cs.CL

    Unsupervised Question Duplicate and Related Questions Detection in e-learning platforms

    Authors: Maksimjeet Chowdhary, Sanyam Goyal, Venktesh V, Mukesh Mohania, Vikram Goyal

    Abstract: Online learning platforms provide diverse questions to gauge the learners' understanding of different concepts. The repository of questions has to be constantly updated to ensure a diverse pool of questions to conduct assessments for learners. However, it is impossible for the academician to manually skim through the large repository of questions to check for duplicates when onboarding new questio…

    Submitted 20 December, 2022; originally announced January 2023.

    Comments: 4 pages accepted as demo paper at WSDM 2023

  20. arXiv:2212.05409  [pdf, other]

    cs.CL

    Towards Leaving No Indic Language Behind: Building Monolingual Corpora, Benchmark and Models for Indic Languages

    Authors: Sumanth Doddapaneni, Rahul Aralikatte, Gowtham Ramesh, Shreya Goyal, Mitesh M. Khapra, Anoop Kunchukuttan, Pratyush Kumar

    Abstract: Building Natural Language Understanding (NLU) capabilities for Indic languages, which have a collective speaker base of more than one billion speakers, is absolutely crucial. In this work, we aim to improve the NLU capabilities of Indic languages by making contributions along 3 important axes: (i) monolingual corpora, (ii) NLU testsets, (iii) multilingual LLMs focusing on Indic languages. Specifically…

    Submitted 24 May, 2023; v1 submitted 10 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  21. arXiv:2212.00638  [pdf, other]

    cs.CV cs.LG

    Finetune like you pretrain: Improved finetuning of zero-shot vision models

    Authors: Sachin Goyal, Ananya Kumar, Sankalp Garg, Zico Kolter, Aditi Raghunathan

    Abstract: Finetuning image-text models such as CLIP achieves state-of-the-art accuracies on a variety of benchmarks. However, recent works like WiseFT (Wortsman et al., 2021) and LP-FT (Kumar et al., 2022) have shown that even subtle differences in the finetuning process can lead to surprisingly large differences in the final performance, both for in-distribution (ID) and out-of-distribution (OOD) data. In…

    Submitted 1 December, 2022; originally announced December 2022.

    Comments: 20 Pages, 7 Tables, 5 Figures

  22. arXiv:2211.09174  [pdf, other]

    cs.LG cs.AI

    CASPR: Customer Activity Sequence-based Prediction and Representation

    Authors: Pin-Jung Chen, Sahil Bhatnagar, Sagar Goyal, Damian Konrad Kowalczyk, Mayank Shrivastava

    Abstract: Tasks critical to enterprise profitability, such as customer churn prediction, fraudulent account detection or customer lifetime value estimation, are often tackled by models trained on features engineered from customer data in tabular format. Application-specific feature engineering adds development, operationalization and maintenance costs over time. Recent advances in representation learning pr…

    Submitted 28 November, 2022; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: Presented at the Table Representation Learning Workshop, NeurIPS 2022, New Orleans. Authors listed in random order

  23. arXiv:2210.07471  [pdf, other]

    cs.CL

    "John is 50 years old, can his son be 65?" Evaluating NLP Models' Understanding of Feasibility

    Authors: Himanshu Gupta, Neeraj Varshney, Swaroop Mishra, Kuntal Kumar Pal, Saurabh Arjun Sawant, Kevin Scaria, Siddharth Goyal, Chitta Baral

    Abstract: In current NLP research, large-scale language models and their abilities are widely being discussed. Some recent works have also found notable failures of these models. Often these failure examples involve complex reasoning abilities. This work focuses on a simple commonsense ability, reasoning about when an action (or its effect) is feasible. To this end, we introduce FeasibilityQA, a question-an…

    Submitted 2 February, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: EACL 2023

  24. arXiv:2207.09640  [pdf, other]

    cs.LG

    Test-Time Adaptation via Conjugate Pseudo-labels

    Authors: Sachin Goyal, Mingjie Sun, Aditi Raghunathan, Zico Kolter

    Abstract: Test-time adaptation (TTA) refers to adapting neural networks to distribution shifts, with access to only the unlabeled test samples from the new domain at test-time. Prior TTA methods optimize over unsupervised objectives such as the entropy of model predictions in TENT [Wang et al., 2021], but it is unclear what exactly makes a good TTA loss. In this paper, we start by presenting a surprising ph…

    Submitted 22 November, 2022; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: Published in Neural Information Processing Systems (NeurIPS) 2022

  25. arXiv:2206.08917  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.comp-ph

    The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysts

    Authors: Richard Tran, Janice Lan, Muhammed Shuaibi, Brandon M. Wood, Siddharth Goyal, Abhishek Das, Javier Heras-Domingo, Adeesh Kolluru, Ammar Rizvi, Nima Shoghi, Anuroop Sriram, Felix Therrien, Jehad Abed, Oleksandr Voznyy, Edward H. Sargent, Zachary Ulissi, C. Lawrence Zitnick

    Abstract: The development of machine learning models for electrocatalysts requires a broad set of training data to enable their use across a wide variety of materials. One class of materials that currently lacks sufficient training data is oxides, which are critical for the development of OER catalysts. To address this, we developed the OC22 dataset, consisting of 62,331 DFT relaxations (~9,854,504 single p…

    Submitted 7 March, 2023; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: 50 pages, 14 figures

  26. arXiv:2206.08564  [pdf, other]

    cs.LG stat.ML

    MET: Masked Encoding for Tabular Data

    Authors: Kushal Majmundar, Sachin Goyal, Praneeth Netrapalli, Prateek Jain

    Abstract: We consider the task of self-supervised representation learning (SSL) for tabular data: tabular-SSL. Typical contrastive learning based SSL methods require instance-wise data augmentations which are difficult to design for unstructured tabular data. Existing tabular-SSL methods design such augmentations in a relatively ad-hoc fashion and can fail to capture the underlying data manifold. Instead of…

    Submitted 17 June, 2022; originally announced June 2022.

    Comments: Under Review, 18 pages, 6 Tables, 4 Figures

  27. arXiv:2203.09697  [pdf, other]

    cs.LG physics.comp-ph stat.ML

    Towards Training Billion Parameter Graph Neural Networks for Atomic Simulations

    Authors: Anuroop Sriram, Abhishek Das, Brandon M. Wood, Siddharth Goyal, C. Lawrence Zitnick

    Abstract: Recent progress in Graph Neural Networks (GNNs) for modeling atomic simulations has the potential to revolutionize catalyst discovery, which is a key step in making progress towards the energy breakthroughs needed to combat climate change. However, the GNNs that have proven most effective for this task are memory intensive as they model higher-order interactions in the graphs such as those between…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: ICLR 2022

  28. arXiv:2203.06414  [pdf, other]

    cs.CL

    A Survey of Adversarial Defences and Robustness in NLP

    Authors: Shreya Goyal, Sumanth Doddapaneni, Mitesh M. Khapra, Balaraman Ravindran

    Abstract: In the past few years, it has become increasingly evident that deep neural networks are not resilient enough to withstand adversarial perturbations in input data, leaving them vulnerable to attack. Various authors have proposed strong adversarial attacks for computer vision and Natural Language Processing (NLP) tasks. As a response, many defense mechanisms have also been proposed to prevent these…

    Submitted 18 April, 2023; v1 submitted 12 March, 2022; originally announced March 2022.

    Comments: Accepted for publication at ACM Computing Surveys

  29. arXiv:2109.14970  [pdf]

    cs.DB cs.LG

    A Friend Recommendation System using Semantic Based KNN Algorithm

    Authors: Srikantaiah K C, Salony Mewara, Sneha Goyal, Subhiksha S

    Abstract: Social networking has become a major part of all our lives and we depend on it for day-to-day purposes. It is a medium that is used by people all around the world, even in the smallest of towns. Its main purpose is to promote and aid communication between people. Social networks, such as Facebook, Twitter etc. were created for the sole purpose of helping individuals communicate about anything with…

    Submitted 30 September, 2021; originally announced September 2021.

    Journal ref: Journal of Seybold Report, Volume 15, Issue 9, 2020, pages 1201-1209

  30. arXiv:2104.07378  [pdf, other]

    cs.CL cs.AI cs.IR

    Tracking entities in technical procedures -- a new dataset and baselines

    Authors: Saransh Goyal, Pratyush Pandey, Garima Gaur, Subhalingam D, Srikanta Bedathur, Maya Ramanath

    Abstract: We introduce TechTrack, a new dataset for tracking entities in technical procedures. The dataset, prepared by annotating open domain articles from WikiHow, consists of 1351 procedures, e.g., "How to connect a printer", identifies more than 1200 unique entities with an average of 4.7 entities per procedure. We evaluate the performance of state-of-the-art models on the entity-tracking task and find…

    Submitted 15 April, 2021; originally announced April 2021.

  31. Evaluation of deep learning models for multi-step ahead time series prediction

    Authors: Rohitash Chandra, Shaurya Goyal, Rishabh Gupta

    Abstract: Time series prediction with neural networks has been the focus of much research in the past few decades. Given the recent deep learning revolution, there has been much attention in using deep learning models for time series prediction, and hence it is important to evaluate their strengths and weaknesses. In this paper, we present an evaluation study that compares the performance of deep learning m…

    Submitted 7 June, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

    Journal ref: IEEE Access, 2021

  32. arXiv:2103.08298  [pdf, other]

    cs.CV

    Knowledge driven Description Synthesis for Floor Plan Interpretation

    Authors: Shreya Goyal, Chiranjoy Chattopadhyay, Gaurav Bhatnagar

    Abstract: Image captioning is a widely known problem in the area of AI. Caption generation from floor plan images has applications in indoor path planning, real estate, and providing architectural solutions. Several methods have been explored in the literature for generating captions or semi-structured descriptions from floor plan images. Since the caption alone is insufficient to capture fine-grained details, r…

    Submitted 15 March, 2021; originally announced March 2021.

    Comments: 19 pages, 18 Figures

  33. arXiv:2103.08297  [pdf, other]

    cs.CV

    GRIHA: Synthesizing 2-Dimensional Building Layouts from Images Captured using a Smart Phone

    Authors: Shreya Goyal, Naimul Khan, Chiranjoy Chattopadhyay, Gaurav Bhatnagar

    Abstract: Reconstructing an indoor scene and generating a layout/floor plan in 3D or 2D is a widely known problem. Quite a few algorithms have been proposed in the literature recently. However, most existing methods either use RGB-D images, thus requiring a depth camera, or depend on panoramic photos, assuming that there is little to no occlusion in the rooms. In this work, we proposed GRIHA (Generating…

    Submitted 15 March, 2021; originally announced March 2021.

    Comments: 19 pages, 22 Figures, 4 Tables

  34. arXiv:2103.03426  [pdf, other]

    cs.IT

    Target Localization using Bistatic and Multistatic Radar with 5G NR Waveform

    Authors: O. Kanhere, S. Goyal, M. Beluri, T. S. Rappaport

    Abstract: Joint communication and sensing allows the utilization of common spectral resources for communication and localization, reducing the cost of deployment. By using fifth generation (5G) New Radio (NR) (i.e., the 3rd Generation Partnership Project Radio Access Network for 5G) reference signals, conventionally used for communication, this paper shows sub-meter precision localization is possible at mil…

    Submitted 4 March, 2021; originally announced March 2021.

    Comments: IEEE 93rd Vehicular Technology Conference (VTC-Spring)

  35. arXiv:2103.01436  [pdf, other]

    cs.LG

    ForceNet: A Graph Neural Network for Large-Scale Quantum Calculations

    Authors: Weihua Hu, Muhammed Shuaibi, Abhishek Das, Siddharth Goyal, Anuroop Sriram, Jure Leskovec, Devi Parikh, C. Lawrence Zitnick

    Abstract: With massive amounts of atomic simulation data available, there is a huge opportunity to develop fast and accurate machine learning models to approximate expensive physics-based calculations. The key quantity to estimate is atomic forces, where the state-of-the-art Graph Neural Networks (GNNs) explicitly enforce basic physical constraints such as rotation-covariance. However, to strictly satisfy t…

    Submitted 1 March, 2021; originally announced March 2021.

  36. arXiv:2102.13309  [pdf, other]

    econ.TH cs.GT cs.SI

    Discord and Harmony in Networks

    Authors: Andrea Galeotti, Benjamin Golub, Sanjeev Goyal, Rithvik Rao

    Abstract: Consider a coordination game played on a network, where agents prefer taking actions closer to those of their neighbors and to their own ideal points in action space. We explore how the welfare outcomes of a coordination game depend on network structure and the distribution of ideal points throughout the network. To this end, we imagine a benevolent or adversarial planner who intervenes, at a cost…

    Submitted 26 February, 2021; originally announced February 2021.

  37. arXiv:2102.11849  [pdf]

    cs.CR

    Usability and Security of Different Authentication Methods for an Electronic Health Records System

    Authors: Saptarshi Purkayastha, Shreya Goyal, Bolu Oluwalade, Tyler Phillips, Huanmei Wu, Xukai Zou

    Abstract: We conducted a survey of 67 graduate students enrolled in the Privacy and Security in Healthcare course at Indiana University Purdue University Indianapolis. This was done to measure user preference and their understanding of usability and security of three different Electronic Health Records authentication methods: single authentication method (username and password), Single sign-on with Central…

    Submitted 23 February, 2021; originally announced February 2021.

    Comments: HEALTHINF21 at the 14th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2021)

  38. arXiv:2101.01048  [pdf, other]

    cs.NI

    Reducing the Paging Overhead in Highly Directional Systems

    Authors: Sanjay Goyal, Hussain Elkotby, Ravikumar Pragada, Tanbir Haque

    Abstract: New Radio (NR) supports operations at high-frequency bands (e.g., millimeter-wave frequencies) by using narrow-beam-based directional transmissions to compensate for high propagation losses at such frequencies. Due to the limited spatial coverage with each beam, the broadcast transmission of paging in NR is performed using beam sweeping, which takes multiple time slots. Thus, the paging procedure used…

    Submitted 5 January, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

    Comments: 7 pages, 5 figures, accepted at IEEE VTC Spring 2021, April 2021, Helsinki, Finland

  39. Enabling Secure and Effective Biomedical Data Sharing through Cyberinfrastructure Gateways

    Authors: Shreya Goyal, Saptarshi Purkayastha, Tyler Phillips, Rob Quick, Alexis Britt

    Abstract: The Dynaswap project reports on developing a coherently integrated and trustworthy holistic secure workflow protection architecture for cyberinfrastructures which can be used on virtual machines deployed through cyberinfrastructure (CI) services such as JetStream. This service creates a user-friendly cloud environment designed to give researchers access to interactive computing and data analysis resou…

    Submitted 23 December, 2020; originally announced December 2020.

    Comments: Presented at Gateways 2020, Online, USA, October 2020, see https://osf.io/meetings/gateways2020/

  40. arXiv:2010.15947  [pdf, other]

    cs.CV cs.LG

    PAL : Pretext-based Active Learning

    Authors: Shubhang Bhatnagar, Sachin Goyal, Darshan Tank, Amit Sethi

    Abstract: The goal of pool-based active learning is to judiciously select a fixed-sized subset of unlabeled samples from a pool to query an oracle for their labels, in order to maximize the accuracy of a supervised learner. However, the unsaid requirement that the oracle should always assign correct labels is unreasonable for most situations. We propose an active learning technique for deep neural networks…

    Submitted 28 March, 2021; v1 submitted 29 October, 2020; originally announced October 2020.

  41. arXiv:2010.11125  [pdf, other]

    cs.CL cs.LG

    Beyond English-Centric Multilingual Machine Translation

    Authors: Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin

    Abstract: Existing work in translation demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages. However, much of this work is English-Centric by training only on data which was translated from or to English. While this is supported by large sources of training data, it does not reflect translation needs worldwide. In…

    Submitted 21 October, 2020; originally announced October 2020.

  42. arXiv:2010.09990  [pdf, other]

    cond-mat.mtrl-sci cs.LG

    The Open Catalyst 2020 (OC20) Dataset and Community Challenges

    Authors: Lowik Chanussot, Abhishek Das, Siddharth Goyal, Thibaut Lavril, Muhammed Shuaibi, Morgane Riviere, Kevin Tran, Javier Heras-Domingo, Caleb Ho, Weihua Hu, Aini Palizhati, Anuroop Sriram, Brandon Wood, Junwoong Yoon, Devi Parikh, C. Lawrence Zitnick, Zachary Ulissi

    Abstract: Catalyst discovery and optimization is key to solving many societal and energy challenges including solar fuels synthesis, long-term energy storage, and renewable fertilizer production. Despite considerable effort by the catalysis community to apply machine learning models to the computational catalyst discovery process, it remains an open challenge to build models that can generalize across both…

    Submitted 24 September, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: 37 pages, 11 figures, submitted to ACS Catalysis

  43. arXiv:2010.09435  [pdf, other]

    cond-mat.mtrl-sci cs.CE cs.LG

    An Introduction to Electrocatalyst Design using Machine Learning for Renewable Energy Storage

    Authors: C. Lawrence Zitnick, Lowik Chanussot, Abhishek Das, Siddharth Goyal, Javier Heras-Domingo, Caleb Ho, Weihua Hu, Thibaut Lavril, Aini Palizhati, Morgane Riviere, Muhammed Shuaibi, Anuroop Sriram, Kevin Tran, Brandon Wood, Junwoong Yoon, Devi Parikh, Zachary Ulissi

    Abstract: Scalable and cost-effective solutions to renewable energy storage are essential to addressing the world's rising energy needs while reducing climate change. As we increase our reliance on renewable energy sources such as wind and solar, which produce intermittent power, storage is needed to transfer power from times of peak generation to peak demand. This may require the storage of power for hours…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: 27 pages

    ACM Class: I.2.6; J.2

  44. Indoor Distance Estimation using LSTMs over WLAN Network

    Authors: Pranav Sankhe, Saqib Azim, Sachin Goyal, Tanya Choudhary, Kumar Appaiah, Sukumar Srikant

    Abstract: The Global Navigation Satellite Systems (GNSS) like GPS suffer from accuracy degradation and are almost unavailable in indoor environments. Indoor positioning systems (IPS) based on WiFi signals have been gaining popularity. However, owing to the strong spatial and temporal variations of wireless communication channels in the indoor environment, the achieved accuracy of existing IPS is around seve…

    Submitted 31 March, 2020; originally announced March 2020.

    Comments: Published in IEEE 16th Workshop on Positioning, Navigation and Communications (WPNC 2019, Germany)

    Journal ref: 2019 16th IEEE Workshop on Positioning, Navigation and Communications

  45. arXiv:2003.11170  [pdf, other]

    cs.MA cs.CR

    Norms and Sanctions as a Basis for Promoting Cybersecurity Practices

    Authors: Nirav Ajmeri, Shubham Goyal, Munindar P. Singh

    Abstract: Many cybersecurity breaches occur due to users not following good cybersecurity practices, chief among them being regulations for applying software patches to operating systems, updating applications, and maintaining strong passwords. We capture cybersecurity expectations on users as norms. We empirically investigate sanctioning mechanisms in promoting compliance with those norms as well as the…

    Submitted 24 March, 2020; originally announced March 2020.

    Comments: 10 pages, 4 figures

  46. arXiv:2002.12718  [pdf, other]

    cs.LG stat.ML

    DROCC: Deep Robust One-Class Classification

    Authors: Sachin Goyal, Aditi Raghunathan, Moksh Jain, Harsha Vardhan Simhadri, Prateek Jain

    Abstract: Classical approaches for one-class problems such as one-class SVM and isolation forest require careful feature engineering when applied to structured domains like images. State-of-the-art methods aim to leverage deep learning to learn appropriate features via two main approaches. The first approach based on predicting transformations (Golan & El-Yaniv, 2018; Hendrycks et al., 2019a) while successf…

    Submitted 15 August, 2020; v1 submitted 28 February, 2020; originally announced February 2020.

    Comments: 16 pages, 9 figures, Published at International Conference on Machine Learning (ICML) 2020

  47. arXiv:2001.10309  [pdf, other]

    cs.IT eess.SP

    New Radio Physical Layer Abstraction for System-Level Simulations of 5G Networks

    Authors: Sandra Lagen, Kevin Wanuga, Hussain Elkotby, Sanjay Goyal, Natale Patriciello, Lorenza Giupponi

    Abstract: A physical layer (PHY) abstraction model estimates the PHY performance in system-level simulators to speed up the simulations. This paper presents a PHY abstraction model for 5G New Radio (NR) and its integration into an open-source ns-3 based NR system-level simulator. The model capitalizes on the exponential effective signal-to-interference-plus-noise ratio (SINR) mapping (EESM) and considers th…

    Submitted 19 April, 2021; v1 submitted 28 January, 2020; originally announced January 2020.

    Comments: published in IEEE ICC

  48. arXiv:2001.08950  [pdf, other]

    cs.LG cs.CL stat.ML

    PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination

    Authors: Saurabh Goyal, Anamitra R. Choudhury, Saurabh M. Raje, Venkatesan T. Chakaravarthy, Yogish Sabharwal, Ashish Verma

    Abstract: We develop a novel method, called PoWER-BERT, for improving the inference time of the popular BERT model, while maintaining the accuracy. It works by: a) exploiting redundancy pertaining to word-vectors (intermediate encoder outputs) and eliminating the redundant vectors. b) determining which word-vectors to eliminate by developing a strategy for measuring their significance, based on the self-att…

    Submitted 8 September, 2020; v1 submitted 24 January, 2020; originally announced January 2020.

    Comments: Accepted at ICML 2020

  49. arXiv:2001.04779  [pdf, other]

    cs.IT cs.NI

    NR-U and WiGig Coexistence in 60 GHz Bands

    Authors: Natale Patriciello, Sanjay Goyal, Sandra Lagen, Lorenza Giupponi, Biljana Bojovic, Alpaslan Demir, Mihaela Beluri

    Abstract: In December 2019, the 3GPP defined the road-map for Release-17, which includes new features on the operation of New Radio (NR) in millimeter-wave bands with highly directional communications systems, i.e., up to 52.6 GHz. In this paper, a system-level simulation based study on the coexistence of NR-based access to unlicensed spectrum (NR-U) and an IEEE technology, i.e., 802.11ad Wireless Gigabit (…

    Submitted 21 January, 2020; v1 submitted 14 January, 2020; originally announced January 2020.

  50. arXiv:1907.09273  [pdf, other]

    cs.AI cs.CL

    Why Build an Assistant in Minecraft?

    Authors: Arthur Szlam, Jonathan Gray, Kavya Srinet, Yacine Jernite, Armand Joulin, Gabriel Synnaeve, Douwe Kiela, Haonan Yu, Zhuoyuan Chen, Siddharth Goyal, Demi Guo, Danielle Rothermel, C. Lawrence Zitnick, Jason Weston

    Abstract: In this document we describe a rationale for a research program aimed at building an open "assistant" in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue.

    Submitted 25 July, 2019; v1 submitted 22 July, 2019; originally announced July 2019.