Showing 1–39 of 39 results for author: Sanjabi, M

Searching in archive cs.

  1. arXiv:2403.14421  [pdf, other]

    cs.LG cs.CR cs.CV

    DP-RDM: Adapting Diffusion Models to Private Domains Without Fine-Tuning

    Authors: Jonathan Lebensold, Maziar Sanjabi, Pietro Astolfi, Adriana Romero-Soriano, Kamalika Chaudhuri, Mike Rabbat, Chuan Guo

    Abstract: Text-to-image diffusion models have been shown to suffer from sample-level memorization, possibly reproducing near-perfect replica of images that they are trained on, which may be undesirable. To remedy this issue, we develop the first differentially private (DP) retrieval-augmented generation algorithm that is capable of generating high-quality image samples while providing provable privacy guara…

    Submitted 13 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

  2. arXiv:2403.02506  [pdf, other]

    cs.CV cs.LG

    Differentially Private Representation Learning via Image Captioning

    Authors: Tom Sander, Yaodong Yu, Maziar Sanjabi, Alain Durmus, Yi Ma, Kamalika Chaudhuri, Chuan Guo

    Abstract: Differentially private (DP) machine learning is considered the gold-standard solution for training a model from sensitive data while still preserving privacy. However, a major barrier to achieving this ideal is its sub-optimal privacy-accuracy trade-off, which is particularly visible in DP representation learning. Specifically, it has been shown that under modest privacy budgets, most models learn…

    Submitted 4 March, 2024; originally announced March 2024.

  3. arXiv:2311.10794  [pdf, other]

    cs.CV

    Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression

    Authors: Animesh Sinha, Bo Sun, Anmol Kalia, Arantxa Casanova, Elliot Blanchard, David Yan, Winnie Zhang, Tony Nelli, Jiahui Chen, Hardik Shah, Licheng Yu, Mitesh Kumar Singh, Ankit Ramchandani, Maziar Sanjabi, Sonal Gupta, Amy Bearman, Dhruv Mahajan

    Abstract: We introduce Style Tailoring, a recipe to finetune Latent Diffusion Models (LDMs) in a distinct domain with high visual quality, prompt alignment and scene diversity. We choose sticker image generation as the target domain, as the images significantly differ from photorealistic samples typically generated by large-scale LDMs. We start with a competent text-to-image model, like Emu, and show that r…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 10 pages, 5 figures

  4. arXiv:2310.04743  [pdf, other]

    cs.CL

    Resprompt: Residual Connection Prompting Advances Multi-Step Reasoning in Large Language Models

    Authors: Song Jiang, Zahra Shakeri, Aaron Chan, Maziar Sanjabi, Hamed Firooz, Yinglong Xia, Bugra Akyildiz, Yizhou Sun, Jinchao Li, Qifan Wang, Asli Celikyilmaz

    Abstract: Chain-of-thought (CoT) prompting, which offers step-by-step problem-solving rationales, has impressively unlocked the reasoning potential of large language models (LLMs). Yet, the standard CoT is less effective in problems demanding multiple reasoning steps. This limitation arises from the complex reasoning process in multi-step problems: later stages often depend on the results of several steps e…

    Submitted 8 May, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: 29 pages

  5. arXiv:2310.02426  [pdf, other]

    cs.CV

    EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods

    Authors: Samyadeep Basu, Mehrdad Saberi, Shweta Bhardwaj, Atoosa Malemir Chegini, Daniela Massiceti, Maziar Sanjabi, Shell Xu Hu, Soheil Feizi

    Abstract: A plethora of text-guided image editing methods have recently been developed by leveraging the impressive capabilities of large-scale diffusion-based generative models such as Imagen and Stable Diffusion. A standardized evaluation protocol, however, does not exist to compare methods across different types of fine-grained edits. To address this gap, we introduce EditVal, a standardized benchmark fo…

    Submitted 3 October, 2023; originally announced October 2023.

  6. arXiv:2307.10504  [pdf, other]

    cs.CV cs.LG

    Identifying Interpretable Subspaces in Image Representations

    Authors: Neha Kalibhat, Shweta Bhardwaj, Bayan Bruss, Hamed Firooz, Maziar Sanjabi, Soheil Feizi

    Abstract: We propose Automatic Feature Explanation using Contrasting Concepts (FALCON), an interpretability framework to explain features of image representations. For a target feature, FALCON captions its highly activating cropped images using a large captioning dataset (like LAION-400m) and a pre-trained vision-language model like CLIP. Each word among the captions is scored and ranked leading to a small…

    Submitted 7 September, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: Published at ICML 2023 Code: https://github.com/NehaKalibhat/falcon-explain

  7. arXiv:2307.09233  [pdf, other]

    cs.CV

    Distilling Knowledge from Text-to-Image Generative Models Improves Visio-Linguistic Reasoning in CLIP

    Authors: Samyadeep Basu, Shell Xu Hu, Maziar Sanjabi, Daniela Massiceti, Soheil Feizi

    Abstract: Image-text contrastive models like CLIP have wide applications in zero-shot classification, image-text retrieval, and transfer learning. However, they often struggle on compositional visio-linguistic tasks (e.g., attribute-binding or object-relationships) where their performance is no better than random chance. To address this, we introduce SDS-CLIP, a lightweight and sample-efficient distillation…

    Submitted 1 July, 2024; v1 submitted 18 July, 2023; originally announced July 2023.

    Comments: Short paper

  8. arXiv:2306.08842  [pdf, other]

    cs.CV cs.CR cs.LG

    ViP: A Differentially Private Foundation Model for Computer Vision

    Authors: Yaodong Yu, Maziar Sanjabi, Yi Ma, Kamalika Chaudhuri, Chuan Guo

    Abstract: Artificial intelligence (AI) has seen a tremendous surge in capabilities thanks to the use of foundation models trained on internet-scale data. On the flip side, the uncurated nature of internet-scale data also poses significant privacy and legal risks, as they often contain personal information or copyrighted material that should not be trained on without permission. In this work, we propose as a…

    Submitted 28 June, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: Code: https://github.com/facebookresearch/ViP-MAE. V2 adds a GitHub link to the code

  9. arXiv:2305.06386  [pdf, other]

    cs.CV cs.AI cs.HC cs.LG

    Text-To-Concept (and Back) via Cross-Model Alignment

    Authors: Mazda Moayeri, Keivan Rezaei, Maziar Sanjabi, Soheil Feizi

    Abstract: We observe that the mapping between an image's representation in one model to its representation in another can be learned surprisingly well with just a linear layer, even across diverse models. Building on this observation, we propose $\textit{text-to-concept}$, where features from a fixed pretrained model are aligned linearly to the CLIP space, so that text embeddings from CLIP's text encoder be…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: Accepted to ICML 2023 and CVPR4XAI workshop 2023

  10. arXiv:2304.01482  [pdf, other]

    cs.CV

    Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning

    Authors: Ajinkya Tejankar, Maziar Sanjabi, Qifan Wang, Sinong Wang, Hamed Firooz, Hamed Pirsiavash, Liang Tan

    Abstract: Recently, self-supervised learning (SSL) was shown to be vulnerable to patch-based data poisoning backdoor attacks. It was shown that an adversary can poison a small part of the unlabeled data so that when a victim trains an SSL model on it, the final model will have a backdoor that the adversary can exploit. This work aims to defend self-supervised learning against such attacks. We use a three-st…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2023

  11. arXiv:2210.15500  [pdf, other]

    cs.CL cs.CY cs.IR cs.LG

    COFFEE: Counterfactual Fairness for Personalized Text Generation in Explainable Recommendation

    Authors: Nan Wang, Qifan Wang, Yi-Chia Wang, Maziar Sanjabi, Jingzhou Liu, Hamed Firooz, Hongning Wang, Shaoliang Nie

    Abstract: As language models become increasingly integrated into our digital lives, Personalized Text Generation (PTG) has emerged as a pivotal component with a wide range of applications. However, the bias inherent in user written text, often used for PTG model training, can inadvertently associate different levels of linguistic quality with users' protected attributes. The model can inherit the bias and p…

    Submitted 22 October, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: This is a long paper accepted by the Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)

  12. arXiv:2210.13662  [pdf, other]

    cs.LG cs.CR cs.IT

    Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano

    Authors: Chuan Guo, Alexandre Sablayrolles, Maziar Sanjabi

    Abstract: Differential privacy (DP) is by far the most widely accepted framework for mitigating privacy risks in machine learning. However, exactly how small the privacy parameter $ε$ needs to be to protect against certain privacy risks in practice is still not well-understood. In this work, we study data reconstruction attacks for discrete data and analyze it under the framework of multiple hypothesis test…

    Submitted 9 August, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

  13. arXiv:2210.08090  [pdf, other]

    cs.LG cs.AI

    Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning

    Authors: John Nguyen, Jianyu Wang, Kshitiz Malik, Maziar Sanjabi, Michael Rabbat

    Abstract: An oft-cited challenge of federated learning is the presence of heterogeneity. \emph{Data heterogeneity} refers to the fact that data from different clients may follow very different distributions. \emph{System heterogeneity} refers to the fact that client devices have different system capabilities. A considerable number of federated optimization methods address this challenge. In the literature,…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: v2. arXiv admin note: substantial text overlap with arXiv:2206.15387

  14. arXiv:2207.00779  [pdf, other]

    cs.CL cs.AI cs.LG

    FRAME: Evaluating Rationale-Label Consistency Metrics for Free-Text Rationales

    Authors: Aaron Chan, Shaoliang Nie, Liang Tan, Xiaochang Peng, Hamed Firooz, Maziar Sanjabi, Xiang Ren

    Abstract: Following how humans communicate, free-text rationales aim to use natural language to explain neural language model (LM) behavior. However, free-text rationales' unconstrained nature makes them prone to hallucination, so it is important to have metrics for free-text rationale quality. Existing free-text rationale metrics measure how consistent the rationale is with the LM's predicted label, but th…

    Submitted 2 December, 2022; v1 submitted 2 July, 2022; originally announced July 2022.

    Comments: BlackboxNLP Workshop at EMNLP 2022

  15. arXiv:2206.15387  [pdf, other]

    cs.LG cs.AI

    Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning

    Authors: John Nguyen, Jianyu Wang, Kshitiz Malik, Maziar Sanjabi, Michael Rabbat

    Abstract: An oft-cited challenge of federated learning is the presence of heterogeneity. \emph{Data heterogeneity} refers to the fact that data from different clients may follow very different distributions. \emph{System heterogeneity} refers to client devices having different system capabilities. A considerable number of federated optimization methods address this challenge. In the literature, empirical ev…

    Submitted 24 March, 2023; v1 submitted 30 June, 2022; originally announced June 2022.

    Comments: Accepted at ICLR

    Journal ref: International Conference on Learning Representations 2023

  16. arXiv:2205.12542  [pdf, other]

    cs.CL

    ER-Test: Evaluating Explanation Regularization Methods for Language Models

    Authors: Brihi Joshi, Aaron Chan, Ziyi Liu, Shaoliang Nie, Maziar Sanjabi, Hamed Firooz, Xiang Ren

    Abstract: By explaining how humans would solve a given task, human rationales can provide strong learning signal for neural language models (LMs). Explanation regularization (ER) aims to improve LM generalization by pushing the LM's machine rationales (Which input tokens did the LM focus on?) to align with human rationales (Which input tokens would humans focus on?). Though prior works primarily study ER vi…

    Submitted 27 February, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Findings of EMNLP 2022

  17. arXiv:2204.13169  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    FedShuffle: Recipes for Better Use of Local Work in Federated Learning

    Authors: Samuel Horváth, Maziar Sanjabi, Lin Xiao, Peter Richtárik, Michael Rabbat

    Abstract: The practice of applying several local updates before aggregation across clients has been empirically shown to be a successful approach to overcoming the communication bottleneck in Federated Learning (FL). Such methods are usually implemented by having clients perform one or more epochs of local training per round while randomly reshuffling their finite dataset in each epoch. Data imbalance, wher…

    Submitted 27 September, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

    Comments: Published in Transactions on Machine Learning Research (09/2022)

  18. arXiv:2204.05990  [pdf, other]

    cs.CL

    Detection, Disambiguation, Re-ranking: Autoregressive Entity Linking as a Multi-Task Problem

    Authors: Khalil Mrini, Shaoliang Nie, Jiatao Gu, Sinong Wang, Maziar Sanjabi, Hamed Firooz

    Abstract: We propose an autoregressive entity linking model, that is trained with two auxiliary tasks, and learns to re-rank generated samples at inference time. Our proposed novelties address two weaknesses in the literature. First, a recent method proposes to learn mention detection and then entity candidate selection, but relies on predefined sets of candidates. We use encoder-decoder autoregressive enti…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: Long paper accepted to ACL 2022 Findings

  19. arXiv:2204.03809  [pdf, other]

    cs.LG cs.DC math.OC

    Federated Learning with Partial Model Personalization

    Authors: Krishna Pillutla, Kshitiz Malik, Abdelrahman Mohamed, Michael Rabbat, Maziar Sanjabi, Lin Xiao

    Abstract: We consider two federated learning algorithms for training partially personalized models, where the shared and personal parameters are updated either simultaneously or alternately on the devices. Both algorithms have been proposed in the literature, but their convergence properties are not fully understood, especially for the alternating variant. We provide convergence analyses of both algorithms…

    Submitted 15 August, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Journal ref: ICML 2022: 17716-17758

  20. arXiv:2203.01881  [pdf, other]

    cs.LG cs.AI cs.CV

    Measuring Self-Supervised Representation Quality for Downstream Classification using Discriminative Features

    Authors: Neha Kalibhat, Kanika Narang, Hamed Firooz, Maziar Sanjabi, Soheil Feizi

    Abstract: Self-supervised learning (SSL) has shown impressive results in downstream classification tasks. However, there is limited work in understanding their failure modes and interpreting their learned representations. In this paper, we study the representation space of state-of-the-art self-supervised models including SimCLR, SwaV, MoCo, BYOL, DINO, SimSiam, VICReg and Barlow Twins. Without the use of c…

    Submitted 12 December, 2023; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: Published at AAAI 2024

  21. arXiv:2201.00072  [pdf, other]

    cs.LG

    BARACK: Partially Supervised Group Robustness With Guarantees

    Authors: Nimit S. Sohoni, Maziar Sanjabi, Nicolas Ballas, Aditya Grover, Shaoliang Nie, Hamed Firooz, Christopher Ré

    Abstract: While neural networks have shown remarkable success on classification tasks in terms of average-case performance, they often fail to perform well on certain groups of the data. Such group information may be expensive to obtain; thus, recent works in robustness and fairness have proposed ways to improve worst-group performance even when group labels are unavailable for the training data. However, t…

    Submitted 10 April, 2022; v1 submitted 31 December, 2021; originally announced January 2022.

    Comments: 26 pages

  22. arXiv:2112.13884  [pdf, other]

    cs.CV

    A Fistful of Words: Learning Transferable Visual Models from Bag-of-Words Supervision

    Authors: Ajinkya Tejankar, Maziar Sanjabi, Bichen Wu, Saining Xie, Madian Khabsa, Hamed Pirsiavash, Hamed Firooz

    Abstract: Using natural language as a supervision for training visual recognition models holds great promise. Recent works have shown that if such supervision is used in the form of alignment between images and captions in large training datasets, then the resulting aligned models perform well on zero-shot classification as downstream tasks. In this paper, we focus on teasing out what parts of the language…

    Submitted 5 January, 2022; v1 submitted 27 December, 2021; originally announced December 2021.

  23. arXiv:2112.08802  [pdf, other]

    cs.CL cs.AI cs.LG

    UNIREX: A Unified Learning Framework for Language Model Rationale Extraction

    Authors: Aaron Chan, Maziar Sanjabi, Lambert Mathias, Liang Tan, Shaoliang Nie, Xiaochang Peng, Xiang Ren, Hamed Firooz

    Abstract: An extractive rationale explains a language model's (LM's) prediction on a given task instance by highlighting the text inputs that most influenced the prediction. Ideally, rationale extraction should be faithful (reflective of LM's actual behavior) and plausible (convincing to humans), without compromising the LM's (i.e., task model's) task performance. Although attribution algorithms and select-…

    Submitted 26 February, 2023; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: ICML 2022

  24. arXiv:2109.06141  [pdf, other]

    cs.LG cs.IT math.OC stat.ML

    On Tilted Losses in Machine Learning: Theory and Applications

    Authors: Tian Li, Ahmad Beirami, Maziar Sanjabi, Virginia Smith

    Abstract: Exponential tilting is a technique commonly used in fields such as statistics, probability, information theory, and optimization to create parametric distribution shifts. Despite its prevalence in related fields, tilting has not seen widespread use in machine learning. In this work, we aim to bridge this gap by exploring the use of tilting in risk minimization. We study a simple extension to ERM -…

    Submitted 1 June, 2023; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2007.01162

  25. arXiv:2101.01881  [pdf, other]

    cs.CV cs.LG

    MSD: Saliency-aware Knowledge Distillation for Multimodal Understanding

    Authors: Woojeong Jin, Maziar Sanjabi, Shaoliang Nie, Liang Tan, Xiang Ren, Hamed Firooz

    Abstract: To reduce a model size but retain performance, we often rely on knowledge distillation (KD) which transfers knowledge from a large "teacher" model to a smaller "student" model. However, KD on multimodal datasets such as vision-language tasks is relatively unexplored, and digesting multimodal information is challenging since different modalities present different types of information. In this paper…

    Submitted 21 October, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

    Comments: Accepted to EMNLP 2021 Findings

  26. arXiv:2009.03482  [pdf, ps, other]

    math.OC cs.LG

    Alternating Direction Method of Multipliers for Quantization

    Authors: Tianjian Huang, Prajwal Singhania, Maziar Sanjabi, Pabitra Mitra, Meisam Razaviyayn

    Abstract: Quantization of the parameters of machine learning models, such as deep neural networks, requires solving constrained optimization problems, where the constraint set is formed by the Cartesian product of many simple discrete sets. For such optimization problems, we study the performance of the Alternating Direction Method of Multipliers for Quantization ($\texttt{ADMM-Q}$) algorithm, which is a va…

    Submitted 1 March, 2021; v1 submitted 7 September, 2020; originally announced September 2020.

  27. arXiv:2007.01162  [pdf, other]

    cs.LG cs.IT stat.ML

    Tilted Empirical Risk Minimization

    Authors: Tian Li, Ahmad Beirami, Maziar Sanjabi, Virginia Smith

    Abstract: Empirical risk minimization (ERM) is typically designed to perform well on the average loss, which can result in estimators that are sensitive to outliers, generalize poorly, or treat subgroups unfairly. While many methods aim to address these problems individually, in this work, we explore them through a unified framework -- tilted empirical risk minimization (TERM). In particular, we show that i…

    Submitted 17 March, 2021; v1 submitted 2 July, 2020; originally announced July 2020.

    Comments: Accepted by ICLR 2021

  28. arXiv:2006.08141  [pdf, other]

    math.OC cs.LG stat.ML

    Non-convex Min-Max Optimization: Applications, Challenges, and Recent Theoretical Advances

    Authors: Meisam Razaviyayn, Tianjian Huang, Songtao Lu, Maher Nouiehed, Maziar Sanjabi, Mingyi Hong

    Abstract: The min-max optimization problem, also known as the saddle point problem, is a classical optimization problem which is also studied in the context of zero-sum games. Given a class of objective functions, the goal is to find a value for the argument which leads to a small objective value even for the worst case function in the given class. Min-max optimization problems have recently become very pop…

    Submitted 18 August, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Journal ref: IEEE Signal Processing Magazine (Volume: 37, Issue: 5, Sept. 2020)

  29. arXiv:2001.01920  [pdf, other]

    cs.LG stat.ML

    FedDANE: A Federated Newton-Type Method

    Authors: Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith

    Abstract: Federated learning aims to jointly learn statistical models over massively distributed remote devices. In this work, we propose FedDANE, an optimization method that we adapt from DANE, a method for classical distributed optimization, to handle the practical constraints of federated learning. We provide convergence guarantees for this method when learning over both convex and non-convex functions.…

    Submitted 7 January, 2020; originally announced January 2020.

    Comments: Asilomar Conference on Signals, Systems, and Computers 2019

  30. arXiv:1911.09815  [pdf, ps, other]

    cs.LG

    When Does Non-Orthogonal Tensor Decomposition Have No Spurious Local Minima?

    Authors: Maziar Sanjabi, Sina Baharlouei, Meisam Razaviyayn, Jason D. Lee

    Abstract: We study the optimization problem for decomposing $d$ dimensional fourth-order Tensors with $k$ non-orthogonal components. We derive \textit{deterministic} conditions under which such a problem does not have spurious local minima. In particular, we show that if $κ= \frac{λ_{max}}{λ_{min}} < \frac{5}{4}$, and incoherence coefficient is of the order $O(\frac{1}{\sqrt{d}})$, then all the local minima…

    Submitted 21 November, 2019; originally announced November 2019.

  31. arXiv:1905.10497  [pdf, other]

    cs.LG stat.ML

    Fair Resource Allocation in Federated Learning

    Authors: Tian Li, Maziar Sanjabi, Ahmad Beirami, Virginia Smith

    Abstract: Federated learning involves training statistical models in massive, heterogeneous networks. Naively minimizing an aggregate loss function in such a network may disproportionately advantage or disadvantage some of the devices. In this work, we propose q-Fair Federated Learning (q-FFL), a novel optimization objective inspired by fair resource allocation in wireless networks that encourages a more fa…

    Submitted 14 February, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

    Comments: ICLR 2020

  32. arXiv:1904.09775  [pdf, other]

    cs.LG cs.AI stat.ML

    Training generative networks using random discriminators

    Authors: Babak Barazandeh, Meisam Razaviyayn, Maziar Sanjabi

    Abstract: In recent years, Generative Adversarial Networks (GANs) have drawn a lot of attention for learning the underlying distribution of data in various applications. Despite their wide applicability, training GANs is notoriously difficult. This difficulty is due to the min-max nature of the resulting optimization problem and the lack of proper tools of solving general (non-convex, non-concave) min-max…

    Submitted 22 April, 2019; originally announced April 2019.

  33. arXiv:1902.08297  [pdf, other]

    math.OC cs.LG stat.ML

    Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods

    Authors: Maher Nouiehed, Maziar Sanjabi, Tianjian Huang, Jason D. Lee, Meisam Razaviyayn

    Abstract: Recent applications that arise in machine learning have surged significant interest in solving min-max saddle point games. This problem has been extensively studied in the convex-concave regime for which a global equilibrium solution can be computed efficiently. In this paper, we study the problem in the non-convex regime and show that an $\varepsilon$-first order stationary point of the game can b…

    Submitted 30 October, 2019; v1 submitted 21 February, 2019; originally announced February 2019.

  34. arXiv:1812.06127  [pdf, other]

    cs.LG stat.ML

    Federated Optimization in Heterogeneous Networks

    Authors: Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, Virginia Smith

    Abstract: Federated Learning is a distributed learning paradigm with two key challenges that differentiate it from traditional distributed optimization: (1) significant variability in terms of the systems characteristics on each device in the network (systems heterogeneity), and (2) non-identically distributed data across the network (statistical heterogeneity). In this work, we introduce a framework, FedPr…

    Submitted 21 April, 2020; v1 submitted 14 December, 2018; originally announced December 2018.

    Comments: MLSys 2020

  35. arXiv:1812.02878  [pdf, ps, other]

    math.OC cs.GT cs.LG

    Solving Non-Convex Non-Concave Min-Max Games Under Polyak-Łojasiewicz Condition

    Authors: Maziar Sanjabi, Meisam Razaviyayn, Jason D. Lee

    Abstract: In this short note, we consider the problem of solving a min-max zero-sum game. This problem has been extensively studied in the convex-concave regime where the global solution can be computed efficiently. Recently, there have also been developments for finding the first order stationary points of the game when one of the player's objective is concave or (weakly) concave. This work focuses on the…

    Submitted 6 December, 2018; originally announced December 2018.

  36. arXiv:1802.08249  [pdf, other]

    cs.LG math.OC stat.ML

    On the Convergence and Robustness of Training GANs with Regularized Optimal Transport

    Authors: Maziar Sanjabi, Jimmy Ba, Meisam Razaviyayn, Jason D. Lee

    Abstract: Generative Adversarial Networks (GANs) are one of the most practical methods for learning data distributions. A popular GAN formulation is based on the use of Wasserstein distance as a metric between probability distributions. Unfortunately, minimizing the Wasserstein distance between the data distribution and the generative model distribution is a computationally challenging problem as its object…

    Submitted 22 May, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

  37. arXiv:1705.10467  [pdf, other]

    cs.LG stat.ML

    Federated Multi-Task Learning

    Authors: Virginia Smith, Chao-Kai Chiang, Maziar Sanjabi, Ameet Talwalkar

    Abstract: Federated learning poses new statistical and systems challenges in training machine learning models over distributed networks of devices. In this work, we show that multi-task learning is naturally suited to handle the statistical challenges of this setting, and propose a novel systems-aware optimization method, MOCHA, that is robust to practical systems issues. Our method and theory for the first…

    Submitted 27 February, 2018; v1 submitted 30 May, 2017; originally announced May 2017.

  38. arXiv:1407.1424  [pdf, ps, other]

    cs.IT

    Cross Layer Provision of Future Cellular Networks

    Authors: H. Baligh, M. Hong, W. -C. Liao, Z. -Q. Luo, M. Razaviyayn, M. Sanjabi, R. Sun

    Abstract: To cope with the growing demand for wireless data and to extend service coverage, future 5G networks will increasingly rely on the use of low powered nodes to support massive connectivity in diverse set of applications and services [1]. To this end, virtualized and mass-scale cloud architectures are proposed as promising technologies for 5G in which all the nodes are connected via a backhaul netwo…

    Submitted 5 July, 2014; originally announced July 2014.

  39. arXiv:1009.3481  [pdf, ps, other]

    cs.IT

    Linear Transceiver Design for Interference Alignment: Complexity and Computation

    Authors: Meisam Razaviyayn, Maziar Sanjabi, Zhi-Quan Luo

    Abstract: Consider a MIMO interference channel whereby each transmitter and receiver are equipped with multiple antennas. The basic problem is to design optimal linear transceivers (or beamformers) that can maximize system throughput. The recent work [1] suggests that optimal beamformers should maximize the total degrees of freedom and achieve interference alignment in high SNR. In this paper we first consi…

    Submitted 17 September, 2010; originally announced September 2010.