Showing 1–17 of 17 results for author: Golatkar, A

  1. arXiv:2407.06324

    cs.LG cs.CL cs.NE

    B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory

    Authors: Luca Zancato, Arjun Seshadri, Yonatan Dukler, Aditya Golatkar, Yantao Shen, Benjamin Bowman, Matthew Trager, Alessandro Achille, Stefano Soatto

    Abstract: We describe a family of architectures to support transductive inference by allowing memory to grow to a finite but a-priori unknown bound while making efficient use of finite resources for inference. Current architectures use such resources to represent data either eidetically over a finite span ("context" in Transformers), or fading over an infinite span (in State Space Models, or SSMs). Recent h…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2406.08431

    cs.CV cs.AI cs.CR cs.LG

    Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

    Authors: Benjamin Biggs, Arjun Seshadri, Yang Zou, Achin Jain, Aditya Golatkar, Yusheng Xie, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto

    Abstract: We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data. By construction, our approach enables training-free continual learning and unlearning with no additional memory or inference costs, since models corresponding to data shards can be added or removed by re-averaging. We show that Diffusion Soup…

    Submitted 12 June, 2024; originally announced June 2024.

  3. arXiv:2403.18920

    cs.CR cs.AI cs.CV

    CPR: Retrieval Augmented Generation for Copyright Protection

    Authors: Aditya Golatkar, Alessandro Achille, Luca Zancato, Yu-Xiang Wang, Ashwin Swaminathan, Stefano Soatto

    Abstract: Retrieval Augmented Generation (RAG) is emerging as a flexible and robust technique to adapt models to private user data without training, to handle credit attribution, and to allow efficient machine unlearning at scale. However, RAG techniques for image generation may lead to parts of the retrieved samples being copied in the model's output. To reduce risks of leaking private information contain…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  4. arXiv:2308.01937

    cs.LG cs.AI cs.CR cs.CV

    Training Data Protection with Compositional Diffusion Models

    Authors: Aditya Golatkar, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto

    Abstract: We introduce Compartmentalized Diffusion Models (CDM), a method to train different diffusion models (or prompts) on distinct data sources and arbitrarily compose them at inference time. The individual models can be trained in isolation, at different times, and on different distributions and domains and can be later composed to achieve performance comparable to a paragon model trained on all data s…

    Submitted 13 February, 2024; v1 submitted 2 August, 2023; originally announced August 2023.

  5. arXiv:2307.08122

    cs.LG

    Tangent Transformers for Composition, Privacy and Removal

    Authors: Tian Yu Liu, Aditya Golatkar, Stefano Soatto

    Abstract: We introduce Tangent Attention Fine-Tuning (TAFT), a method for fine-tuning linearized transformers obtained by computing a First-order Taylor Expansion around a pre-trained initialization. We show that the Jacobian-Vector Product resulting from linearization can be computed efficiently in a single forward pass, reducing training and inference cost to the same order of magnitude as its original no…

    Submitted 14 May, 2024; v1 submitted 16 July, 2023; originally announced July 2023.

    Comments: Published at the International Conference on Learning Representations (ICLR) 2024. Code available at: https://github.com/tianyu139/tangent-model-composition

  6. arXiv:2304.13169

    cs.LG

    SAFE: Machine Unlearning With Shard Graphs

    Authors: Yonatan Dukler, Benjamin Bowman, Alessandro Achille, Aditya Golatkar, Ashwin Swaminathan, Stefano Soatto

    Abstract: We present Synergy Aware Forgetting Ensemble (SAFE), a method to adapt large models on a diverse collection of data while minimizing the expected cost to remove the influence of training samples from the trained model. This process, also known as selective forgetting or unlearning, is often conducted by partitioning a dataset into shards, training fully independent models on each, then ensembling…

    Submitted 22 August, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

    Comments: Accepted at ICCV 2023

  7. arXiv:2211.13108

    cs.LG

    Integral Continual Learning Along the Tangent Vector Field of Tasks

    Authors: Tian Yu Liu, Aditya Golatkar, Stefano Soatto, Alessandro Achille

    Abstract: We propose a lightweight continual learning method which incorporates information from specialized datasets incrementally, by integrating it along the vector field of "generalist" models. The tangent plane to the specialist model acts as a generalist guide and avoids the kind of over-fitting that leads to catastrophic forgetting, while exploiting the convexity of the optimization landscape in the…

    Submitted 11 December, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

  8. arXiv:2207.00581

    cs.LG

    On Leave-One-Out Conditional Mutual Information For Generalization

    Authors: Mohamad Rida Rammal, Alessandro Achille, Aditya Golatkar, Suhas Diggavi, Stefano Soatto

    Abstract: We derive information theoretic generalization bounds for supervised learning algorithms based on a new measure of leave-one-out conditional mutual information (loo-CMI). Contrary to other CMI bounds, which are black-box bounds that do not exploit the structure of the problem and may be hard to evaluate in practice, our loo-CMI bounds can be computed easily and can be interpreted in connection to…

    Submitted 1 July, 2022; originally announced July 2022.

  9. arXiv:2203.11481

    cs.CV cs.CR

    Mixed Differential Privacy in Computer Vision

    Authors: Aditya Golatkar, Alessandro Achille, Yu-Xiang Wang, Aaron Roth, Michael Kearns, Stefano Soatto

    Abstract: We introduce AdaMix, an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data. While pre-training language models on large public datasets has enabled strong differential privacy (DP) guarantees with minor loss of accuracy, a similar practice yields punishing trade-offs in vision tasks. A few-shot or even zero-shot learning…

    Submitted 28 March, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

    Comments: Accepted at CVPR 2022

  10. arXiv:2106.13870

    cs.CV cs.LG stat.ML

    Scene Uncertainty and the Wellington Posterior of Deterministic Image Classifiers

    Authors: Stephanie Tsuei, Aditya Golatkar, Stefano Soatto

    Abstract: We propose a method to estimate the uncertainty of the outcome of an image classifier on a given input datum. Deep neural networks commonly used for image classification are deterministic maps from an input image to an output class. As such, their outcome on a given datum involves no uncertainty, so we must specify what variability we are referring to when defining, measuring and interpreting unce…

    Submitted 24 March, 2023; v1 submitted 25 June, 2021; originally announced June 2021.

    Report number: UCLA CS Report #210001

  11. arXiv:2012.13431

    cs.LG cs.AI cs.CV

    Mixed-Privacy Forgetting in Deep Networks

    Authors: Aditya Golatkar, Alessandro Achille, Avinash Ravichandran, Marzia Polito, Stefano Soatto

    Abstract: We show that the influence of a subset of the training samples can be removed -- or "forgotten" -- from the weights of a network trained on large-scale image classification tasks, and we provide strong computable bounds on the amount of remaining information after forgetting. Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in mixed-privacy se…

    Submitted 20 June, 2021; v1 submitted 24 December, 2020; originally announced December 2020.

    Comments: CVPR 2021

  12. arXiv:2012.11140

    cs.LG cs.CV stat.ML

    LQF: Linear Quadratic Fine-Tuning

    Authors: Alessandro Achille, Aditya Golatkar, Avinash Ravichandran, Marzia Polito, Stefano Soatto

    Abstract: Classifiers that are linear in their parameters, and trained by optimizing a convex loss function, have predictable behavior with respect to changes in the training data, initial conditions, and optimization. Such desirable properties are absent in deep neural networks (DNNs), typically trained by non-linear fine-tuning of a pre-trained model. Previous attempts to linearize DNNs have led to intere…

    Submitted 21 December, 2020; originally announced December 2020.

  13. arXiv:2003.02960

    cs.LG cs.CV cs.IT stat.ML

    Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations

    Authors: Aditya Golatkar, Alessandro Achille, Stefano Soatto

    Abstract: We describe a procedure for removing dependency on a cohort of training data from a trained deep network that improves upon and generalizes previous methods to different readout functions and can be extended to ensure forgetting in the activations of the network. We introduce a new bound on how much information can be extracted per query about the forgotten cohort from a black-box network for whic…

    Submitted 28 October, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

    Comments: ECCV 2020

  14. arXiv:1911.04933

    cs.LG stat.ML

    Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks

    Authors: Aditya Golatkar, Alessandro Achille, Stefano Soatto

    Abstract: We explore the problem of selectively forgetting a particular subset of the data used for training a deep neural network. While the effects of the data to be forgotten can be hidden from the output of the network, insights may still be gleaned by probing deep into its weights. We propose a method for "scrubbing" the weights clean of information about a particular set of training data. The method…

    Submitted 31 March, 2020; v1 submitted 12 November, 2019; originally announced November 2019.

    Comments: Accepted at CVPR 2020

  15. arXiv:1905.13277

    cs.LG cs.AI stat.ML

    Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence

    Authors: Aditya Golatkar, Alessandro Achille, Stefano Soatto

    Abstract: Regularization is typically understood as improving generalization by altering the landscape of local extrema to which the model eventually converges. Deep neural networks (DNNs), however, challenge this view: We show that removing regularization after an initial transient period has little effect on generalization, even if the final loss landscape is the same as if there had been no regularizatio…

    Submitted 30 May, 2019; originally announced May 2019.

  16. arXiv:1809.02497

    cs.LG stat.ML

    Sparse Kernel PCA for Outlier Detection

    Authors: Rudrajit Das, Aditya Golatkar, Suyash P. Awate

    Abstract: In this paper, we propose a new method to perform Sparse Kernel Principal Component Analysis (SKPCA) and also mathematically analyze the validity of SKPCA. We formulate SKPCA as a constrained optimization problem with elastic net regularization (Hastie et al.) in kernel feature space and solve it. We consider outlier detection (where KPCA is employed) as an application for SKPCA, using the RBF ker…

    Submitted 13 September, 2018; v1 submitted 7 September, 2018; originally announced September 2018.

    Comments: Accepted at IEEE ICMLA 2018 for Oral Presentation

  17. arXiv:1802.08080

    cs.CV

    Classification of Breast Cancer Histology using Deep Learning

    Authors: Aditya Golatkar, Deepak Anand, Amit Sethi

    Abstract: Breast cancer is a major cause of death worldwide among women. Hematoxylin and Eosin (H&E) stained breast tissue samples from biopsies are observed under microscopes for the primary diagnosis of breast cancer. In this paper, we propose a deep learning-based method for classification of H&E stained breast tissue images released for the BACH challenge 2018 by fine-tuning Inception-v3 convolutional neura…

    Submitted 25 July, 2018; v1 submitted 22 February, 2018; originally announced February 2018.

    Comments: 8 pages. Published at ICIAR 2018, Portugal