Showing 1–50 of 83 results for author: Hegde, C

  1. arXiv:2407.04180  [pdf, other]

    cs.CV

    Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing

    Authors: Anushrut Jignasu, Kelly O. Marshall, Ankush Kumar Mishra, Lucas Nerone Rillo, Baskar Ganapathysubramanian, Aditya Balu, Chinmay Hegde, Adarsh Krishnamurthy

    Abstract: G-code (Geometric code) or RS-274 is the most widely used computer numerical control (CNC) and 3D printing programming language. G-code provides machine instructions for the movement of the 3D printer, especially for the nozzle, stage, and extrusion of material for extrusion-based additive manufacturing. Currently there does not exist a large repository of curated CAD models along with their corre…

    Submitted 11 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Replaced "SLICE-100K" with "Slice-100K", added acknowledgements, and updated main figure to better capture shadows

  2. arXiv:2406.19314  [pdf, other]

    cs.CL cs.AI cs.LG

    LiveBench: A Challenging, Contamination-Free LLM Benchmark

    Authors: Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Ben Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, Micah Goldblum

    Abstract: Test set contamination, wherein test data from a benchmark ends up in a newer model's training set, is a well-documented obstacle for fair LLM evaluation and can quickly render benchmarks obsolete. To mitigate this, many recent benchmarks crowdsource new prompts and evaluations from human or LLM judges; however, these can introduce significant biases, and break down when scoring hard questions. In…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.17720  [pdf, other]

    cs.CV

    Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity

    Authors: Chih-Hsuan Yang, Benjamin Feuer, Zaki Jubery, Zi K. Deng, Andre Nakkab, Md Zahid Hasan, Shivani Chiranjeevi, Kelly Marshall, Nirmal Baishnab, Asheesh K Singh, Arti Singh, Soumik Sarkar, Nirav Merchant, Chinmay Hegde, Baskar Ganapathysubramanian

    Abstract: We introduce Arboretum, the largest publicly accessible dataset designed to advance AI for biodiversity applications. This dataset, curated from the iNaturalist community science platform and vetted by domain experts to ensure accuracy, includes 134.6 million images, surpassing existing datasets in scale by an order of magnitude. The dataset encompasses image-language paired data for a diverse set…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Preprint under review

  4. arXiv:2405.09312  [pdf, ps, other]

    cs.LG

    Agnostic Active Learning of Single Index Models with Linear Sample Complexity

    Authors: Aarshvi Gajjar, Wai Ming Tai, Xingyu Xu, Chinmay Hegde, Yi Li, Christopher Musco

    Abstract: We study active learning methods for single index models of the form $F({\mathbf x}) = f(\langle {\mathbf w}, {\mathbf x}\rangle)$, where $f:\mathbb{R} \to \mathbb{R}$ and ${\mathbf x,\mathbf w} \in \mathbb{R}^d$. In addition to their theoretical interest as simple examples of non-linear neural networks, single index models have received significant recent attention due to applications in scientif…

    Submitted 9 July, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  5. arXiv:2404.08079  [pdf, other]

    cs.LG cs.CV math.OC

    DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models

    Authors: Nastaran Saadati, Minh Pham, Nasla Saleem, Joshua R. Waite, Aditya Balu, Zhanhong Jiang, Chinmay Hegde, Soumik Sarkar

    Abstract: Recent advances in decentralized deep learning algorithms have demonstrated cutting-edge performance on various tasks with large pre-trained models. However, a pivotal prerequisite for achieving this level of competitiveness is the significant communication and computation overheads when updating these models, which prohibits the applications of them to real-world scenarios. To address this issue,…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 accepted paper, 22 pages, 12 figures

  6. arXiv:2404.03631  [pdf, other]

    cs.CV

    Robust Concept Erasure Using Task Vectors

    Authors: Minh Pham, Kelly O. Marshall, Chinmay Hegde, Niv Cohen

    Abstract: With the rapid growth of text-to-image models, a variety of techniques have been suggested to prevent undesirable image generations. Yet, these methods often only protect against specific user prompts and have been shown to allow unsafe generations with other inputs. Here we focus on unconditionally erasing a concept from a text-to-image model rather than conditioning the erasure on the user's pro…

    Submitted 4 April, 2024; originally announced April 2024.

  7. arXiv:2403.08092  [pdf, other]

    cs.CV

    Mitigating the Impact of Attribute Editing on Face Recognition

    Authors: Sudipta Banerjee, Sai Pranaswi Mullangi, Shruti Wagle, Chinmay Hegde, Nasir Memon

    Abstract: Through a large-scale study over diverse face images, we show that facial attribute editing using modern generative AI models can severely degrade automated face recognition systems. This degradation persists even with identity-preserving generative models. To mitigate this issue, we propose two novel techniques for local and global attribute editing. We empirically ablate twenty-six facial semant…

    Submitted 9 April, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Under review

  8. arXiv:2402.18085  [pdf, other]

    cs.SD cs.CR eess.AS

    AI-assisted Tagging of Deepfake Audio Calls using Challenge-Response

    Authors: Govind Mittal, Arthur Jakobsson, Kelly O. Marshall, Chinmay Hegde, Nasir Memon

    Abstract: Scammers are aggressively leveraging AI voice-cloning technology for social engineering attacks, a situation significantly worsened by the advent of audio Real-time Deepfakes (RTDFs). RTDFs can clone a target's voice in real-time over phone calls, making these interactions highly interactive and thus far more convincing. Our research confidently addresses the gap in the existing literature on deep…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: Dataset will be made public by end of March 2024

  9. arXiv:2402.11137  [pdf, other]

    cs.LG

    TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks

    Authors: Benjamin Feuer, Robin Tibor Schirrmeister, Valeriia Cherepanova, Chinmay Hegde, Frank Hutter, Micah Goldblum, Niv Cohen, Colin White

    Abstract: While tabular classification has traditionally relied on from-scratch training, a recent breakthrough called prior-data fitted networks (PFNs) challenges this approach. Similar to large language models, PFNs make use of pretraining and in-context learning to achieve strong performance on new tasks in a single forward pass. However, current PFNs have limitations that prohibit their widespread adopt…

    Submitted 18 March, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  10. arXiv:2311.10609  [pdf, other]

    cs.LG cs.DB

    Scaling TabPFN: Sketching and Feature Selection for Tabular Prior-Data Fitted Networks

    Authors: Benjamin Feuer, Chinmay Hegde, Niv Cohen

    Abstract: Tabular classification has traditionally relied on supervised algorithms, which estimate the parameters of a prediction model using its training data. Recently, Prior-Data Fitted Networks (PFNs) such as TabPFN have successfully learned to classify tabular data in-context: the model parameters are designed to classify new samples based on labelled training samples given after the model training. Wh…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: 2nd Table Representation Learning Workshop: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  11. arXiv:2311.09024  [pdf, other]

    cs.CV

    Fast Certification of Vision-Language Models Using Incremental Randomized Smoothing

    Authors: A K Nirala, A Joshi, C Hegde, S Sarkar

    Abstract: A key benefit of deep vision-language models such as CLIP is that they enable zero-shot open vocabulary classification; the user has the ability to define novel class labels via natural language prompts at inference time. However, while CLIP-based zero-shot classifiers have demonstrated competitive performance across a range of domain shifts, they remain highly vulnerable to adversarial attacks. T…

    Submitted 4 January, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

  12. arXiv:2311.04016  [pdf, ps, other]

    cs.CV cs.LG

    Exploring Dataset-Scale Indicators of Data Quality

    Authors: Benjamin Feuer, Chinmay Hegde

    Abstract: Modern computer vision foundation models are trained on massive amounts of data, incurring large economic and environmental costs. Recent research has suggested that improving data quality can significantly reduce the need for data quantity. But what constitutes data quality in computer vision? We posit that the quality of a given dataset can be decomposed into distinct sample-level and dataset-le…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: 1st Workshop on Attributing Model Behavior at Scale: 37th Conference on Neural Information Processing Systems (NeurIPS 2023). 7 pages, 1 figure

  13. arXiv:2310.18208  [pdf, other]

    cs.CL cs.LG

    ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models

    Authors: Benjamin Feuer, Yurong Liu, Chinmay Hegde, Juliana Freire

    Abstract: Existing deep-learning approaches to semantic column type annotation (CTA) have important shortcomings: they rely on semantic types which are fixed at training time; require a large number of training samples per type and incur large run-time inference costs; and their performance can degrade when evaluated on novel datasets, even when types remain constant. Large language models have exhibited st…

    Submitted 6 November, 2023; v1 submitted 27 October, 2023; originally announced October 2023.

    Comments: 17 pages, 8 figures

  14. arXiv:2310.04604  [pdf, other]

    cs.CR cs.LG

    PriViT: Vision Transformers for Fast Private Inference

    Authors: Naren Dhyani, Jianqiao Mo, Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde

    Abstract: The Vision Transformer (ViT) architecture has emerged as the backbone of choice for state-of-the-art deep models for computer vision applications. However, ViTs are ill-suited for private inference using secure multi-party computation (MPC) protocols, due to the large number of non-polynomial operations (self-attention, feed-forward rectifiers, layer normalization). We propose PriViT, a gradient b…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: 18 pages, 14 figures

  15. arXiv:2309.05795  [pdf, ps, other]

    stat.ML cs.CC cs.LG

    On the Fine-Grained Hardness of Inverting Generative Models

    Authors: Feyza Duman Keles, Chinmay Hegde

    Abstract: The objective of generative model inversion is to identify a size-$n$ latent vector that produces a generative model output that closely matches a given target. This operation is a core computational primitive in numerous modern applications involving computer vision and NLP. However, the problem is known to be computationally challenging and NP-hard in the worst case. This paper aims to provide a…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: 19 pages

  16. arXiv:2309.02465  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    Towards Foundational AI Models for Additive Manufacturing: Language Models for G-Code Debugging, Manipulation, and Comprehension

    Authors: Anushrut Jignasu, Kelly Marshall, Baskar Ganapathysubramanian, Aditya Balu, Chinmay Hegde, Adarsh Krishnamurthy

    Abstract: 3D printing or additive manufacturing is a revolutionary technology that enables the creation of physical objects from digital models. However, the quality and accuracy of 3D printing depend on the correctness and efficiency of the G-code, a low-level numerical control programming language that instructs 3D printers how to move and extrude material. Debugging G-code is a challenging task that requ…

    Submitted 4 September, 2023; originally announced September 2023.

  17. arXiv:2308.03821  [pdf, other]

    cs.CV cs.LG

    Distributionally Robust Classification on a Data Budget

    Authors: Benjamin Feuer, Ameya Joshi, Minh Pham, Chinmay Hegde

    Abstract: Real world uses of deep learning require predictable model behavior under distribution shifts. Models such as CLIP show emergent natural distributional robustness comparable to humans, but may require hundreds of millions of training samples. Can we train robust learners in a domain where data is limited? To rigorously address this question, we introduce JANuS (Joint Annotations and Names Set), a…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: TMLR 2023; openreview link: https://openreview.net/forum?id=D5Z2E8CNsD

  18. arXiv:2308.01508  [pdf, other]

    cs.LG cs.CR cs.CV

    Circumventing Concept Erasure Methods For Text-to-Image Generative Models

    Authors: Minh Pham, Kelly O. Marshall, Niv Cohen, Govind Mittal, Chinmay Hegde

    Abstract: Text-to-image generative models can produce photo-realistic images for an extremely broad range of concepts, and their usage has proliferated widely among the general public. On the flip side, these models have numerous drawbacks, including their potential to generate images featuring sexually explicit content, mirror artistic styles without permission, or even hallucinate (or deepfake) the likene…

    Submitted 8 October, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

  19. arXiv:2307.08585  [pdf, other]

    cs.CV

    Identity-Preserving Aging of Face Images via Latent Diffusion Models

    Authors: Sudipta Banerjee, Govind Mittal, Ameya Joshi, Chinmay Hegde, Nasir Memon

    Abstract: The performance of automated face recognition systems is inevitably impacted by the facial aging process. However, high quality datasets of individuals collected over several years are typically small in scale. In this work, we propose, train, and validate the use of latent text-to-image diffusion models for synthetically aging and de-aging face images. Our models succeed with few-shot training, a…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted to appear in International Joint Conference on Biometrics (IJCB) 2023

  20. arXiv:2306.10159  [pdf, other]

    cs.CV

    Vision-Language Models can Identify Distracted Driver Behavior from Naturalistic Videos

    Authors: Md Zahid Hasan, Jiajing Chen, Jiyang Wang, Mohammed Shaiqur Rahman, Ameya Joshi, Senem Velipasalar, Chinmay Hegde, Anuj Sharma, Soumik Sarkar

    Abstract: Recognizing the activities causing distraction in real-world driving scenarios is critical for ensuring the safety and reliability of both drivers and pedestrians on the roadways. Conventional computer vision techniques are typically data-intensive and require a large volume of annotated training data to detect and classify various distracted driving behaviors, thereby limiting their efficiency an…

    Submitted 21 March, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: 15 pages, 7 figures

  21. arXiv:2306.08183  [pdf, other]

    cs.CV

    ZeroForge: Feedforward Text-to-Shape Without 3D Supervision

    Authors: Kelly O. Marshall, Minh Pham, Ameya Joshi, Anushrut Jignasu, Aditya Balu, Adarsh Krishnamurthy, Chinmay Hegde

    Abstract: Current state-of-the-art methods for text-to-shape generation either require supervised training using a labeled dataset of pre-defined 3D shapes, or perform expensive inference-time optimization of implicit neural representations. In this work, we present ZeroForge, an approach for zero-shot text-to-shape generation that avoids both pitfalls. To achieve open-vocabulary shape generation, we requir…

    Submitted 15 June, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: 19 pages, High resolution figures needed to demonstrate 3D results

  22. A Feasibility Study on Indoor Localization and Multi-person Tracking Using Sparsely Distributed Camera Network with Edge Computing

    Authors: Hyeokhyen Kwon, Chaitra Hegde, Yashar Kiarashi, Venkata Siva Krishna Madala, Ratan Singh, ArjunSinh Nakum, Robert Tweedy, Leandro Miletto Tonetto, Craig M. Zimring, Matthew Doiron, Amy D. Rodriguez, Allan I. Levey, Gari D. Clifford

    Abstract: Camera-based activity monitoring systems are becoming an attractive solution for smart building applications with the advances in computer vision and edge computing technologies. In this paper, we present a feasibility study and systematic analysis of a camera-based indoor localization and multi-person tracking system implemented on edge computing devices within a large indoor space. To this end,…

    Submitted 29 November, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

  23. arXiv:2305.02997  [pdf, other]

    cs.LG cs.AI stat.ML

    When Do Neural Nets Outperform Boosted Trees on Tabular Data?

    Authors: Duncan McElfresh, Sujay Khandagale, Jonathan Valverde, Vishak Prasad C, Benjamin Feuer, Chinmay Hegde, Ganesh Ramakrishnan, Micah Goldblum, Colin White

    Abstract: Tabular data is one of the most commonly used types of data in machine learning. Despite recent advances in neural nets (NNs) for tabular data, there is still an active discussion on whether or not NNs generally outperform gradient-boosted decision trees (GBDTs) on tabular data, with several recent works arguing either that GBDTs consistently outperform NNs on tabular data, or vice versa. In this…

    Submitted 15 July, 2024; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: NeurIPS Datasets and Benchmarks Track 2023

  24. arXiv:2302.10281  [pdf, other]

    cs.CV cs.AI cs.CL

    LiT Tuned Models for Efficient Species Detection

    Authors: Andre Nakkab, Benjamin Feuer, Chinmay Hegde

    Abstract: Recent advances in training vision-language models have demonstrated unprecedented robustness and transfer learning effectiveness; however, standard computer vision datasets are image-only, and therefore not well adapted to such training methods. Our paper introduces a simple methodology for adapting any fine-grained image classification dataset for distributed vision-language pretraining. We impl…

    Submitted 12 February, 2023; originally announced February 2023.

    Comments: 5 pages, 5 figures, 1 table, presented at AAAI 2023 conference for the AIAFS workshop

  25. arXiv:2301.12540  [pdf, other]

    stat.ML cs.LG

    Implicit Regularization for Group Sparsity

    Authors: Jiangyuan Li, Thanh V. Nguyen, Chinmay Hegde, Raymond K. W. Wong

    Abstract: We study the implicit regularization of gradient descent towards structured sparsity via a novel neural reparameterization, which we call a diagonally grouped linear neural network. We show the following intriguing property of our reparameterization: gradient descent over the squared regression loss, without any explicit regularization, biases towards solutions with a group sparsity structure. In…

    Submitted 29 January, 2023; originally announced January 2023.

    Comments: accepted by ICLR 2023

  26. arXiv:2301.06820  [pdf, other]

    cs.LG cs.AI

    Pathfinding Neural Cellular Automata

    Authors: Sam Earle, Ozlem Yildiz, Julian Togelius, Chinmay Hegde

    Abstract: Pathfinding makes up an important sub-component of a broad range of complex tasks in AI, such as robot path planning, transport routing, and game playing. While classical algorithms can efficiently compute shortest paths, neural networks could be better suited to adapting these sub-routines to more complex and intractable tasks. As a step toward developing such networks, we hand-code and learn mod…

    Submitted 17 January, 2023; originally announced January 2023.

  27. arXiv:2211.03241  [pdf, other]

    cs.LG math.NA

    Neural PDE Solvers for Irregular Domains

    Authors: Biswajit Khara, Ethan Herron, Zhanhong Jiang, Aditya Balu, Chih-Hsuan Yang, Kumar Saurabh, Anushrut Jignasu, Soumik Sarkar, Chinmay Hegde, Adarsh Krishnamurthy, Baskar Ganapathysubramanian

    Abstract: Neural network-based approaches for solving partial differential equations (PDEs) have recently received special attention. However, the large majority of neural PDE solvers only apply to rectilinear domains, and do not systematically address the imposition of Dirichlet/Neumann boundary conditions over irregular domain boundaries. In this paper, we present a framework to neurally solve partial dif…

    Submitted 6 November, 2022; originally announced November 2022.

  28. arXiv:2210.13601  [pdf, other]

    cs.LG

    Active Learning for Single Neuron Models with Lipschitz Non-Linearities

    Authors: Aarshvi Gajjar, Chinmay Hegde, Christopher Musco

    Abstract: We consider the problem of active learning for single neuron models, also sometimes called "ridge functions", in the agnostic setting (under adversarial label noise). Such models have been shown to be broadly effective in modeling physical phenomena, and for constructing surrogate data-driven models for partial differential equations. Surprisingly, we show that for a single neuron model with a…

    Submitted 18 July, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: Inadvertently submitting an incorrect writeup that does not align with the intended content

  29. arXiv:2210.07396  [pdf, other]

    cs.CV

    Caption supervision enables robust learners

    Authors: Benjamin Feuer, Ameya Joshi, Chinmay Hegde

    Abstract: Vision language (VL) models like CLIP are robust to natural distribution shifts, in part because CLIP learns on unstructured data using a technique called caption supervision; the model interprets image-linked texts as ground-truth labels. In a carefully controlled comparison study, we show that caption-supervised CNNs trained on a standard cross-entropy loss (with image labels assigned by scanning…

    Submitted 8 December, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

    ACM Class: I.4.9

  30. arXiv:2210.06186  [pdf, other]

    cs.CR cs.AI cs.CV

    GOTCHA: Real-Time Video Deepfake Detection via Challenge-Response

    Authors: Govind Mittal, Chinmay Hegde, Nasir Memon

    Abstract: With the rise of AI-enabled Real-Time Deepfakes (RTDFs), the integrity of online video interactions has become a growing concern. RTDFs have now made it feasible to replace an imposter's face with their victim in live video interactions. Such advancement in deepfakes also coaxes detection to rise to the same standard. However, existing deepfake detection techniques are asynchronous and hence ill-s…

    Submitted 23 May, 2024; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE Euro S&P 2024

  31. arXiv:2209.10105  [pdf, ps, other]

    cs.LG cs.DC stat.ML

    Distributed Online Non-convex Optimization with Composite Regret

    Authors: Zhanhong Jiang, Aditya Balu, Xian Yeow Lee, Young M. Lee, Chinmay Hegde, Soumik Sarkar

    Abstract: Regret has been widely adopted as the metric of choice for evaluating the performance of online optimization algorithms for distributed, multi-agent systems. However, data/model variations associated with agents can significantly impact decisions and require consensus among agents. Moreover, most existing works have focused on developing approaches for (either strongly or non-strongly) convex los…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: 41 pages, presented at the Allerton Conference 2022

  32. arXiv:2209.04881  [pdf, ps, other]

    cs.LG cs.CC

    On The Computational Complexity of Self-Attention

    Authors: Feyza Duman Keles, Pruthuvi Mahesakya Wijewardena, Chinmay Hegde

    Abstract: Transformer architectures have led to remarkable progress in many state-of-art applications. However, despite their successes, modern transformers rely on the self-attention mechanism, whose time- and space-complexity is quadratic in the length of the input. Several approaches have been proposed to speed up self-attention mechanisms to achieve sub-quadratic running time; however, the large majorit…

    Submitted 11 September, 2022; originally announced September 2022.

  33. arXiv:2206.08491  [pdf, other]

    cs.LG

    Revisiting Self-Distillation

    Authors: Minh Pham, Minsu Cho, Ameya Joshi, Chinmay Hegde

    Abstract: Knowledge distillation is the procedure of transferring "knowledge" from a large model (the teacher) to a more compact one (the student), often being used in the context of model compression. When both models have the same architecture, this procedure is called self-distillation. Several works have anecdotally shown that a self-distilled student can outperform the teacher on held-out data. In this…

    Submitted 16 June, 2022; originally announced June 2022.

  34. arXiv:2206.07565  [pdf, other]

    cs.CV cs.LG

    A Meta-Analysis of Distributionally-Robust Models

    Authors: Benjamin Feuer, Ameya Joshi, Chinmay Hegde

    Abstract: State-of-the-art image classifiers trained on massive datasets (such as ImageNet) have been shown to be vulnerable to a range of both intentional and incidental distribution shifts. On the other hand, several recent classifiers with favorable out-of-distribution (OOD) robustness properties have emerged, achieving high accuracy on their target tasks while maintaining their in-distribution accuracy…

    Submitted 15 June, 2022; originally announced June 2022.

    Comments: To be presented at ICML Workshop on Principles of Distribution Shift 2022. Copyright 2022 by the author(s)

  35. arXiv:2205.06154  [pdf, other]

    cs.LG cs.CV

    Smooth-Reduce: Leveraging Patches for Improved Certified Robustness

    Authors: Ameya Joshi, Minh Pham, Minsu Cho, Leonid Boytsov, Filipe Condessa, J. Zico Kolter, Chinmay Hegde

    Abstract: Randomized smoothing (RS) has been shown to be a fast, scalable technique for certifying the robustness of deep neural network classifiers. However, methods based on RS require augmenting data with large amounts of noise, which leads to significant drops in accuracy. We propose a training-free, modified smoothing approach, Smooth-Reduce, that leverages patching and aggregation to provide improved…

    Submitted 12 May, 2022; originally announced May 2022.

  36. arXiv:2202.02340  [pdf, other]

    cs.CR cs.LG

    Selective Network Linearization for Efficient Private Inference

    Authors: Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde

    Abstract: Private inference (PI) enables inference directly on cryptographically secure data. While promising to address many privacy issues, it has seen limited use due to extreme runtimes. Unlike plaintext inference, where latency is dominated by FLOPs, in PI non-linear functions (namely ReLU) are the bottleneck. Thus, practical PI demands novel ReLU-aware optimizations. To reduce PI latency we propose a g…

    Submitted 7 June, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

    Comments: Published in ICML 2022

  37. arXiv:2112.02813  [pdf, other]

    cs.LG cs.AI

    MDPGT: Momentum-based Decentralized Policy Gradient Tracking

    Authors: Zhanhong Jiang, Xian Yeow Lee, Sin Yong Tan, Kai Liang Tan, Aditya Balu, Young M. Lee, Chinmay Hegde, Soumik Sarkar

    Abstract: We propose a novel policy gradient method for multi-agent reinforcement learning, which leverages two different variance-reduction techniques and does not require large batches over iterations. Specifically, we propose a momentum-based decentralized policy gradient tracking (MDPGT) where a new momentum-based variance reduction technique is used to approximate the local policy gradient surrogate wi…

    Submitted 6 December, 2021; originally announced December 2021.

  38. arXiv:2110.04337  [pdf, other]

    cs.CV cs.CR cs.LG

    Adversarial Token Attacks on Vision Transformers

    Authors: Ameya Joshi, Gauri Jagatap, Chinmay Hegde

    Abstract: Vision transformers rely on a patch token based self attention mechanism, in contrast to convolutional networks. We investigate fundamental differences between these two families of models, by designing a block sparsity based adversarial token attack. We probe and analyze transformer as well as convolutional models with token attacks of varying patch sizes. We infer that transformer models are mor…

    Submitted 8 October, 2021; originally announced October 2021.

  39. arXiv:2110.01601  [pdf, other]

    cs.LG math.NA

    NeuFENet: Neural Finite Element Solutions with Theoretical Bounds for Parametric PDEs

    Authors: Biswajit Khara, Aditya Balu, Ameya Joshi, Soumik Sarkar, Chinmay Hegde, Adarsh Krishnamurthy, Baskar Ganapathysubramanian

    Abstract: We consider a mesh-based approach for training a neural network to produce field predictions of solutions to parametric partial differential equations (PDEs). This approach contrasts current approaches for "neural PDE solvers" that employ collocation-based methods to make point-wise predictions of solutions to PDEs. This approach has the advantage of naturally enforcing different boundary conditio…

    Submitted 13 October, 2021; v1 submitted 4 October, 2021; originally announced October 2021.

  40. arXiv:2110.01532  [pdf, other]

    cs.LG stat.ML

    Differentiable Spline Approximations

    Authors: Minsu Cho, Aditya Balu, Ameya Joshi, Anjana Deva Prasad, Biswajit Khara, Soumik Sarkar, Baskar Ganapathysubramanian, Adarsh Krishnamurthy, Chinmay Hegde

    Abstract: The paradigm of differentiable programming has significantly enhanced the scope of machine learning via the judicious use of gradient-based optimization. However, standard differentiable programming methods (such as autodiff) typically require that the machine learning models be differentiable, limiting their applicability. Our goal in this paper is to use a new, principled approach to extend grad…

    Submitted 4 October, 2021; originally announced October 2021.

    Comments: 9 pages, accepted at NeurIPS 2021

  41. arXiv:2108.05574  [pdf, other]

    stat.ML cs.LG

    Implicit Sparse Regularization: The Impact of Depth and Early Stopping

    Authors: Jiangyuan Li, Thanh V. Nguyen, Chinmay Hegde, Raymond K. W. Wong

    Abstract: In this paper, we study the implicit bias of gradient descent for sparse regression. We extend results on regression with quadratic parametrization, which amounts to depth-2 diagonal linear networks, to more general depth-N networks, under more realistic settings of noise and correlated designs. We show that early stopping is crucial for gradient descent to converge to a sparse model, a phenomenon…

    Submitted 26 October, 2021; v1 submitted 12 August, 2021; originally announced August 2021.

    Comments: 32 pages, accepted by NeurIPS 2021. arXiv admin note: text overlap with arXiv:1909.05122 by other authors

  42. Sphynx: ReLU-Efficient Network Design for Private Inference

    Authors: Minsu Cho, Zahra Ghodsi, Brandon Reagen, Siddharth Garg, Chinmay Hegde

    Abstract: The emergence of deep learning has been accompanied by privacy concerns surrounding users' data and service providers' models. We focus on private inference (PI), where the goal is to perform inference on a user's data sample using a service provider's model. Existing PI methods for deep networks enable cryptographically secure inference with little drop in functionality; however, they incur sever…

    Submitted 17 June, 2021; originally announced June 2021.

    Journal ref: IEEE Security & Privacy, vol. 20, no. 05, pp. 22-34, 2022

  43. arXiv:2105.06371  [pdf, other]

    cs.LG stat.ML

    Provably Convergent Algorithms for Solving Inverse Problems Using Generative Models

    Authors: Viraj Shah, Rakib Hyder, M. Salman Asif, Chinmay Hegde

    Abstract: The traditional approach of hand-crafting priors (such as sparsity) for solving inverse problems is slowly being replaced by the use of richer learned priors (such as those modeled by deep generative networks). In this work, we study the algorithmic aspects of such a learning-based approach from a theoretical perspective. For certain generative network architectures, we establish a simple non-conv…

    Submitted 13 May, 2021; originally announced May 2021.

    Comments: arXiv admin note: text overlap with arXiv:1810.03587, arXiv:1802.08406

  44. arXiv:2104.14547  [pdf, other]

    cs.LG cs.CV

    NURBS-Diff: A Differentiable Programming Module for NURBS

    Authors: Anjana Deva Prasad, Aditya Balu, Harshil Shah, Soumik Sarkar, Chinmay Hegde, Adarsh Krishnamurthy

    Abstract: Boundary representations (B-reps) using Non-Uniform Rational B-splines (NURBS) are the de facto standard used in CAD, but their utility in deep learning-based approaches is not well researched. We propose a differentiable NURBS module to integrate NURBS representations of CAD models with deep learning methods. We mathematically define the derivatives of the NURBS curves or surfaces with respect to…

    Submitted 13 January, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

  45. arXiv:2104.14538  [pdf, other]

    cs.LG stat.ML

    Distributed Multigrid Neural Solvers on Megavoxel Domains

    Authors: Aditya Balu, Sergio Botelho, Biswajit Khara, Vinay Rao, Chinmay Hegde, Soumik Sarkar, Santi Adavani, Adarsh Krishnamurthy, Baskar Ganapathysubramanian

    Abstract: We consider the distributed training of large-scale neural networks that serve as PDE solvers producing full field outputs. We specifically consider neural solvers for the generalized 3D Poisson equation over megavoxel domains. A scalable framework is presented that integrates two distinct advances. First, we accelerate training a large model via a method analogous to the multigrid technique used…

    Submitted 29 April, 2021; originally announced April 2021.

  46. arXiv:2103.02051  [pdf, other]

    cs.LG cs.DC

    Cross-Gradient Aggregation for Decentralized Learning from Non-IID data

    Authors: Yasaman Esfandiari, Sin Yong Tan, Zhanhong Jiang, Aditya Balu, Ethan Herron, Chinmay Hegde, Soumik Sarkar

    Abstract: Decentralized learning enables a group of collaborative agents to learn models using a distributed dataset without the need for a central parameter server. Recently, decentralized learning algorithms have demonstrated state-of-the-art results on benchmark data sets, comparable with centralized algorithms. However, the key assumption to achieve competitive performance is that the data is independen…

    Submitted 28 June, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

    Comments: ICML 2021 accepted paper

  47. arXiv:2102.12643  [pdf, other]

    stat.ML cs.LG

    Provable Compressed Sensing with Generative Priors via Langevin Dynamics

    Authors: Thanh V. Nguyen, Gauri Jagatap, Chinmay Hegde

    Abstract: Deep generative models have emerged as a powerful class of priors for signals in various inverse problems such as compressed sensing, phase retrieval and super-resolution. Here, we assume an unknown signal to lie in the range of some pre-trained generative model. A popular approach for signal recovery is via gradient descent in the low-dimensional latent space. While gradient descent has achieved…

    Submitted 24 February, 2021; originally announced February 2021.

  48. arXiv:2008.12338  [pdf, other]

    cs.LG cs.CV stat.ML

    Adversarially Robust Learning via Entropic Regularization

    Authors: Gauri Jagatap, Ameya Joshi, Animesh Basak Chowdhury, Siddharth Garg, Chinmay Hegde

    Abstract: In this paper we propose a new family of algorithms, ATENT, for training adversarially robust deep neural networks. We formulate a new loss function that is equipped with an additional entropic regularization. Our loss function considers the contribution of adversarial samples that are drawn from a specially designed distribution in the data space that assigns high probability to points with high…

    Submitted 19 February, 2021; v1 submitted 27 August, 2020; originally announced August 2020.

  49. arXiv:2007.12792  [pdf, other]

    cs.LG stat.ML

    Deep Generative Models that Solve PDEs: Distributed Computing for Training Large Data-Free Models

    Authors: Sergio Botelho, Ameya Joshi, Biswajit Khara, Soumik Sarkar, Chinmay Hegde, Santi Adavani, Baskar Ganapathysubramanian

    Abstract: Recent progress in scientific machine learning (SciML) has opened up the possibility of training novel neural network architectures that solve complex partial differential equations (PDEs). Several (nearly data free) approaches have been recently reported that successfully solve PDEs, with examples including deep feed forward networks, generative networks, and deep encoder-decoder networks. Howeve…

    Submitted 24 July, 2020; originally announced July 2020.

    Comments: 10 pages, 18 figures

  50. arXiv:2007.04087  [pdf, other]

    cs.LG stat.ML

    Hyperparameter Optimization in Neural Networks via Structured Sparse Recovery

    Authors: Minsu Cho, Mohammadreza Soltani, Chinmay Hegde

    Abstract: In this paper, we study two important problems in the automated design of neural networks -- Hyper-parameter Optimization (HPO), and Neural Architecture Search (NAS) -- through the lens of sparse recovery methods. In the first part of this paper, we establish a novel connection between HPO and structured sparse recovery. In particular, we show that a special encoding of the hyperparameter space en…

    Submitted 6 July, 2020; originally announced July 2020.

    Comments: arXiv admin note: text overlap with arXiv:1906.02869