Showing 1–18 of 18 results for author: Maini, P

  1. arXiv:2406.09358

    cs.LG

    Understanding Hallucinations in Diffusion Models through Mode Interpolation

    Authors: Sumukh K Aithal, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter

    Abstract: Colloquially speaking, image generation models based upon diffusion processes are frequently said to exhibit "hallucinations," samples that could never occur in the training data. But where do such hallucinations come from? In this paper, we study a particular failure mode in diffusion models, which we term mode interpolation. Specifically, we find that diffusion models smoothly "interpolate" betw…

    Submitted 25 August, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Additional results on real datasets

  2. arXiv:2406.06443

    cs.LG cs.CL cs.CR

    LLM Dataset Inference: Did you train on my dataset?

    Authors: Pratyush Maini, Hengrui Jia, Nicolas Papernot, Adam Dziedzic

    Abstract: The proliferation of large language models (LLMs) in the real world has come with a rise in copyright cases against companies for training their models on unlicensed data from the internet. Recent works have presented methods to identify if individual text sequences were members of the model's training data, known as membership inference attacks (MIAs). We demonstrate that the apparent success of…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Code is available at https://github.com/pratyushmaini/llm_dataset_inference/

  3. arXiv:2404.15146

    cs.LG cs.CL

    Rethinking LLM Memorization through the Lens of Adversarial Compression

    Authors: Avi Schwarzschild, Zhili Feng, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter

    Abstract: Large language models (LLMs) trained on web-scale datasets raise substantial concerns regarding permissible data usage. One major question is whether these models "memorize" all their training data or they integrate many data sources in some way more akin to how a human would learn and synthesize information. The answer hinges, to a large degree, on how we define memorization. In this work, we pro…

    Submitted 1 July, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: https://locuslab.github.io/acr-memorization

  4. arXiv:2404.07177

    cs.LG

    Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic

    Authors: Sachin Goyal, Pratyush Maini, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter

    Abstract: Vision-language models (VLMs) are trained for thousands of GPU hours on carefully curated web datasets. In recent times, data curation has gained prominence with several works developing strategies to retain 'high-quality' subsets of 'raw' scraped data. For instance, the LAION public dataset retained only 10% of the total crawled data. However, these strategies are typically developed agnostic of…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Published at CVPR 2024

  5. arXiv:2401.16380

    cs.CL

    Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

    Authors: Pratyush Maini, Skyler Seto, He Bai, David Grangier, Yizhe Zhang, Navdeep Jaitly

    Abstract: Large language models are trained on massive scrapes of the web, which are often unstructured, noisy, and poorly phrased. Current scaling laws show that learning from such data requires an abundance of both compute and data, which grows with the size of the model being trained. This is infeasible both because of the large compute costs and duration associated with pre-training, and the impending s…

    Submitted 29 January, 2024; originally announced January 2024.

  6. arXiv:2401.06121

    cs.LG cs.CL

    TOFU: A Task of Fictitious Unlearning for LLMs

    Authors: Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, J. Zico Kolter

    Abstract: Large language models trained on massive corpora of data from the web can memorize and reproduce sensitive or private data raising both legal and ethical concerns. Unlearning, or tuning models to forget information present in their training data, provides us with a way to protect private data after training. Although several methods exist for such unlearning, it is unclear to what extent they resu…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: https://locuslab.github.io/tofu/

  7. arXiv:2307.09542

    cs.LG cs.CV

    Can Neural Network Memorization Be Localized?

    Authors: Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang

    Abstract: Recent efforts at explaining the interplay of memorization and generalization in deep overparametrized networks have posited that neural networks $\textit{memorize}$ "hard" examples in the final few layers of the model. Memorization refers to the ability to correctly predict on $\textit{atypical}$ examples of the training set. In this work, we show that rather than being confined to individual lay…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: Accepted at ICML 2023

  8. arXiv:2307.03132

    cs.CV cs.CL cs.LG

    T-MARS: Improving Visual Representations by Circumventing Text Feature Learning

    Authors: Pratyush Maini, Sachin Goyal, Zachary C. Lipton, J. Zico Kolter, Aditi Raghunathan

    Abstract: Large web-sourced multimodal datasets have powered a slew of new methods for learning general-purpose visual representations, advancing the state of the art in computer vision and revolutionizing zero- and few-shot recognition. One crucial decision facing practitioners is how, if at all, to curate these ever-larger datasets. For example, the creators of the LAION-5B dataset chose to retain only im…

    Submitted 18 March, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: Accepted to ICLR 2024. Oral at ICCV Datacomp 2023

  9. arXiv:2303.07320

    cs.CL cs.LG

    Model-tuning Via Prompts Makes NLP Models Adversarially Robust

    Authors: Mrigank Raman, Pratyush Maini, J. Zico Kolter, Zachary C. Lipton, Danish Pruthi

    Abstract: In recent years, NLP practitioners have converged on the following practice: (i) import an off-the-shelf pretrained (masked) language model; (ii) append a multilayer perceptron atop the CLS token's hidden representation (with randomly initialized weights); and (iii) fine-tune the entire model on a downstream task (MLP-FT). This procedure has produced massive gains on standard NLP benchmarks, but t…

    Submitted 5 December, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: Accepted to the EMNLP 2023 Conference

  10. arXiv:2210.15031

    cs.LG

    Characterizing Datapoints via Second-Split Forgetting

    Authors: Pratyush Maini, Saurabh Garg, Zachary C. Lipton, J. Zico Kolter

    Abstract: Researchers investigating example hardness have increasingly focused on the dynamics by which neural networks learn and forget examples throughout training. Popular metrics derived from these dynamics include (i) the epoch at which examples are first correctly classified; (ii) the number of times their predictions flip during training; and (iii) whether their prediction flips if they are held out.…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted at NeurIPS 2022

  11. arXiv:2111.10462

    cs.RO

    Online Coverage Planning for an Autonomous Weed Mowing Robot with Curvature Constraints

    Authors: Parikshit Maini, Burak M. Gonultas, Volkan Isler

    Abstract: The land used for grazing cattle takes up about one-third of the land in the United States. These areas can be highly rugged. Yet, they need to be maintained to prevent weeds from taking over the nutritious grassland. This can be a daunting task especially in the case of organic farming since herbicides cannot be used. In this paper, we present the design of Cowbot, an autonomous weed mowing robot…

    Submitted 19 November, 2021; originally announced November 2021.

  12. arXiv:2104.10706

    stat.ML cs.CR cs.LG

    Dataset Inference: Ownership Resolution in Machine Learning

    Authors: Pratyush Maini, Mohammad Yaghini, Nicolas Papernot

    Abstract: With increasingly more data and computation involved in their training, machine learning models constitute valuable intellectual property. This has spurred interest in model stealing, which is made more practical by advances in learning with partial, little, or no supervision. Existing defenses focus on inserting unique watermarks in a model's decision surface, but this is insufficient: the waterm…

    Submitted 21 April, 2021; originally announced April 2021.

    Comments: Published as a conference paper at ICLR 2021 (Spotlight Presentation)

  13. arXiv:2011.14779

    cs.LG

    Data-Free Model Extraction

    Authors: Jean-Baptiste Truong, Pratyush Maini, Robert J. Walls, Nicolas Papernot

    Abstract: Current model extraction attacks assume that the adversary has access to a surrogate dataset with characteristics similar to the proprietary data used to train the victim model. This requirement precludes the use of existing model extraction techniques on valuable models, such as those trained on rare or hard to acquire datasets. In contrast, we propose data-free model extraction methods that do n…

    Submitted 31 March, 2021; v1 submitted 30 November, 2020; originally announced November 2020.

    Comments: Published in the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition

  14. arXiv:2005.00159

    cs.CL cs.LG

    Why and when should you pool? Analyzing Pooling in Recurrent Architectures

    Authors: Pratyush Maini, Keshav Kolluru, Danish Pruthi, Mausam

    Abstract: Pooling-based recurrent neural architectures consistently outperform their counterparts without pooling. However, the reasons for their enhanced performance are largely unexamined. In this work, we examine three commonly used pooling techniques (mean-pooling, max-pooling, and attention), and propose max-attention, a novel variant that effectively captures interactions among predictive tokens in a…

    Submitted 27 October, 2020; v1 submitted 30 April, 2020; originally announced May 2020.

    Comments: Accepted to Findings of EMNLP 2020, to be presented at BlackBoxNLP. Updated Version

  15. arXiv:1909.04068

    cs.LG cs.AI stat.ML

    Adversarial Robustness Against the Union of Multiple Perturbation Models

    Authors: Pratyush Maini, Eric Wong, J. Zico Kolter

    Abstract: Owing to the susceptibility of deep learning systems to adversarial attacks, there has been a great deal of work in developing (both empirically and certifiably) robust classifiers. While most work has defended against a single type of attack, recent work has looked at defending against multiple perturbation models using simple aggregations of multiple attacks. However, these methods can be diffic…

    Submitted 28 July, 2020; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: ICML 2020 Final Version

  16. arXiv:1903.07363

    cs.RO

    Visual Monitoring for Multiple Points of Interest on a 2.5D Terrain using a UAV with Limited Field-of-View Constraint

    Authors: Parikshit Maini, Sujit PB, Pratap Tokekar

    Abstract: Varying terrain conditions and limited field-of-view restrict the visibility of aerial robots while performing visual monitoring operations. In this paper, we study the multi-point monitoring problem on a 2.5D terrain using an unmanned aerial vehicle (UAV) with limited camera field-of-view. This problem is NP-Hard and hence we develop a two phase strategy to compute an approximate tour for the UA…

    Submitted 18 March, 2019; originally announced March 2019.

  17. arXiv:1805.11007

    cs.CE cs.MS q-bio.QM

    Particle-based simulations of reaction-diffusion processes with Aboria

    Authors: Maria Bruna, Philip K. Maini, Martin Robinson

    Abstract: Mathematical models of transport and reactions in biological systems have been traditionally written in terms of partial differential equations (PDEs) that describe the time evolution of population-level variables. In recent years, the use of stochastic particle-based models, which keep track of the evolution of each organism in the system, has become widespread. These models provide a lot more de…

    Submitted 28 May, 2018; originally announced May 2018.

  18. arXiv:1805.04417

    cs.RO

    Cooperative Planning for Fuel-constrained Aerial Vehicles and Ground-based Refueling Vehicles for Large-Scale Coverage

    Authors: Parikshit Maini, Kaarthik Sundar, Sivakumar Rathinam, PB Sujit

    Abstract: Low cost Unmanned Aerial Vehicles (UAVs) need multiple refuels to accomplish large area coverage. The number of refueling stations and their placement play a vital role in determining coverage efficiency. In this paper, we propose the use of a ground-based refueling vehicle (RV) to increase the operational range of a UAV in both spatial and temporal domains. Determining optimal routes for the UAV…

    Submitted 11 May, 2018; originally announced May 2018.