
Showing 1–50 of 90 results for author: Takáč, M

  1. arXiv:2406.09250  [pdf, other]

    cs.CV cs.AI cs.LG

    MirrorCheck: Efficient Adversarial Defense for Vision-Language Models

    Authors: Samar Fares, Klea Ziu, Toluwani Aremu, Nikita Durasov, Martin Takáč, Pascal Fua, Karthik Nandakumar, Ivan Laptev

    Abstract: Vision-Language Models (VLMs) are becoming increasingly vulnerable to adversarial attacks as various novel attack strategies are being proposed against these models. While existing defenses excel in unimodal contexts, they currently fall short in safeguarding VLMs against adversarial threats. To mitigate this vulnerability, we propose a novel, yet elegantly simple approach for detecting adversaria…

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2406.04443  [pdf, other]

    cs.LG math.OC

    Gradient Clipping Improves AdaGrad when the Noise Is Heavy-Tailed

    Authors: Savelii Chezhegov, Yaroslav Klyukin, Andrei Semenov, Aleksandr Beznosikov, Alexander Gasnikov, Samuel Horváth, Martin Takáč, Eduard Gorbunov

    Abstract: Methods with adaptive stepsizes, such as AdaGrad and Adam, are essential for training modern Deep Learning models, especially Large Language Models. Typically, the noise in the stochastic gradients is heavy-tailed for the later ones. Gradient clipping provably helps to achieve good high-probability convergence for such noises. However, despite the similarity between AdaGrad/Adam and Clip-SGD, the…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 37 pages, 8 figures

  3. arXiv:2406.00846  [pdf, other]

    cs.LG cs.DC math.OC

    Local Methods with Adaptivity via Scaling

    Authors: Savelii Chezhegov, Sergey Skorik, Nikolas Khachaturov, Danil Shalagin, Aram Avetisyan, Aleksandr Beznosikov, Martin Takáč, Yaroslav Kholodov, Alexander Gasnikov

    Abstract: The rapid development of machine learning and deep learning has introduced increasingly complex optimization challenges that must be addressed. Indeed, training modern, advanced models has become difficult to implement without leveraging multiple computing nodes in a distributed environment. Distributed optimization is also fundamental to emerging fields such as federated learning. Specifically, t…

    Submitted 12 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

    Comments: 41 pages, 2 algorithms, 6 figures, 1 table

  4. arXiv:2405.17950  [pdf, other]

    cs.AI

    Self-Guiding Exploration for Combinatorial Problems

    Authors: Zangir Iklassov, Yali Du, Farkhad Akimov, Martin Takac

    Abstract: Large Language Models (LLMs) have become pivotal in addressing reasoning tasks across diverse domains, including arithmetic, commonsense, and symbolic reasoning. They utilize prompting techniques such as Exploration-of-Thought, Decomposition, and Refinement to effectively navigate and solve intricate tasks. Despite these advancements, the application of LLMs to Combinatorial Problems (CPs), known…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 22 pages

  5. arXiv:2404.07525  [pdf, other]

    cs.LG

    Enhancing Policy Gradient with the Polyak Step-Size Adaption

    Authors: Yunxiang Li, Rui Yuan, Chen Fan, Mark Schmidt, Samuel Horváth, Robert M. Gower, Martin Takáč

    Abstract: Policy gradient is a widely utilized and foundational algorithm in the field of reinforcement learning (RL). Renowned for its convergence guarantees and stability compared to other RL algorithms, its practical application is often hindered by sensitivity to hyper-parameters, particularly the step-size. In this paper, we introduce the integration of the Polyak step-size in RL, which automatically a…

    Submitted 11 April, 2024; originally announced April 2024.

  6. arXiv:2403.18444  [pdf, other]

    cs.LG

    FRESCO: Federated Reinforcement Energy System for Cooperative Optimization

    Authors: Nicolas Mauricio Cuadrado, Roberto Alejandro Gutierrez, Martin Takáč

    Abstract: The rise in renewable energy is creating new dynamics in the energy grid that promise to create a cleaner and more participative energy grid, where technology plays a crucial part in making the required flexibility to achieve the vision of the next-generation grid. This work presents FRESCO, a framework that aims to ease the implementation of energy markets using a hierarchical control architectur…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Tiny Paper at ICLR 2023

  7. arXiv:2403.18439  [pdf, other]

    cs.LG

    Generalized Policy Learning for Smart Grids: FL TRPO Approach

    Authors: Yunxiang Li, Nicolas Mauricio Cuadrado, Samuel Horváth, Martin Takáč

    Abstract: The smart grid domain requires bolstering the capabilities of existing energy management systems; Federated Learning (FL) aligns with this goal as it demonstrates a remarkable ability to train models on heterogeneous datasets while maintaining data privacy, making it suitable for smart grid applications, which often involve disparate data distributions and interdependencies among features that hin…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: ICLR 2024 Workshop: Tackling Climate Change with Machine Learning

  8. arXiv:2403.02648  [pdf, other]

    cs.LG cs.AI math.OC

    Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad

    Authors: Sayantan Choudhury, Nazarii Tupitsa, Nicolas Loizou, Samuel Horvath, Martin Takac, Eduard Gorbunov

    Abstract: Adaptive methods are extremely popular in machine learning as they make learning rate tuning less expensive. This paper introduces a novel optimization algorithm named KATE, which presents a scale-invariant adaptation of the well-known AdaGrad algorithm. We prove the scale-invariance of KATE for the case of Generalized Linear Models. Moreover, for general smooth non-convex problems, we establish a…

    Submitted 5 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: 27 pages, 12 figures

  9. arXiv:2402.09765  [pdf, other]

    cs.AI

    Reinforcement Learning for Solving Stochastic Vehicle Routing Problem with Time Windows

    Authors: Zangir Iklassov, Ikboljon Sobirov, Ruben Solozabal, Martin Takac

    Abstract: This paper introduces a reinforcement learning approach to optimize the Stochastic Vehicle Routing Problem with Time Windows (SVRP), focusing on reducing travel costs in goods delivery. We develop a novel SVRP formulation that accounts for uncertain travel costs and demands, alongside specific customer time windows. An attention-based neural network trained through reinforcement learning is employ…

    Submitted 15 February, 2024; originally announced February 2024.

  10. arXiv:2402.05264  [pdf, other]

    cs.LG math.OC

    AdaBatchGrad: Combining Adaptive Batch Size and Adaptive Step Size

    Authors: Petr Ostroukhov, Aigerim Zhumabayeva, Chulu Xiang, Alexander Gasnikov, Martin Takáč, Dmitry Kamzolov

    Abstract: This paper presents a novel adaptation of the Stochastic Gradient Descent (SGD), termed AdaBatchGrad. This modification seamlessly integrates an adaptive step size with an adjustable batch size. An increase in batch size and a decrease in step size are well-known techniques to tighten the area of convergence of SGD and decrease its variance. A range of studies by R. Byrd and J. Nocedal introduced…

    Submitted 7 February, 2024; originally announced February 2024.

  11. arXiv:2402.05050  [pdf, other]

    cs.LG math.OC

    Federated Learning Can Find Friends That Are Advantageous

    Authors: Nazarii Tupitsa, Samuel Horváth, Martin Takáč, Eduard Gorbunov

    Abstract: In Federated Learning (FL), the distributed nature and heterogeneity of client data present both opportunities and challenges. While collaboration among clients can significantly enhance the learning process, not all collaborations are beneficial; some may even be detrimental. In this study, we introduce a novel algorithm that assigns adaptive aggregation weights to clients participating in FL tra…

    Submitted 17 July, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  12. arXiv:2312.17369  [pdf, other]

    cs.LG math.OC

    SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms

    Authors: Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Robert Gower, Martin Takáč

    Abstract: Adaptive optimization methods are widely recognized as among the most popular approaches for training Deep Neural Networks (DNNs). Techniques such as Adam, AdaGrad, and AdaHessian utilize a preconditioner that modifies the search direction by incorporating information about the curvature of the objective function. However, despite their adaptive characteristics, these methods still require manual…

    Submitted 28 December, 2023; originally announced December 2023.

  13. arXiv:2312.15799  [pdf, other]

    stat.ML cs.LG

    Efficient Conformal Prediction under Data Heterogeneity

    Authors: Vincent Plassier, Nikita Kotelevskii, Aleksandr Rubashevskii, Fedor Noskov, Maksim Velikanov, Alexander Fishkov, Samuel Horvath, Martin Takac, Eric Moulines, Maxim Panov

    Abstract: Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification, which is crucial for ensuring the reliability of predictions. However, common CP methods heavily rely on data exchangeability, a condition often violated in practice. Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples. This work introduce…

    Submitted 13 July, 2024; v1 submitted 25 December, 2023; originally announced December 2023.

    Comments: 29 pages

  14. arXiv:2312.11230  [pdf, other]

    stat.ML cs.LG

    Dirichlet-based Uncertainty Quantification for Personalized Federated Learning with Improved Posterior Networks

    Authors: Nikita Kotelevskii, Samuel Horváth, Karthik Nandakumar, Martin Takáč, Maxim Panov

    Abstract: In modern federated learning, one of the main challenges is to account for inherent heterogeneity and the diverse nature of data distributions for different clients. This problem is often addressed by introducing personalization of the models towards the data distribution of the particular client. However, a personalized model might be unreliable when applied to the data that is not typical for th…

    Submitted 18 December, 2023; originally announced December 2023.

  15. arXiv:2311.07708  [pdf, other]

    cs.AI cs.CE cs.LG

    Reinforcement Learning for Solving Stochastic Vehicle Routing Problem

    Authors: Zangir Iklassov, Ikboljon Sobirov, Ruben Solozabal, Martin Takac

    Abstract: This study addresses a gap in the utilization of Reinforcement Learning (RL) and Machine Learning (ML) techniques in solving the Stochastic Vehicle Routing Problem (SVRP) that involves the challenging task of optimizing vehicle routes under uncertain conditions. We propose a novel end-to-end framework that comprehensively addresses the key sources of stochasticity in SVRP and utilizes an RL agent…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: 14 pages, accepted to ACML24

  16. arXiv:2311.04611  [pdf, other]

    cs.LG math.OC

    Byzantine-Tolerant Methods for Distributed Variational Inequalities

    Authors: Nazarii Tupitsa, Abdulla Jasem Almansoori, Yanlin Wu, Martin Takáč, Karthik Nandakumar, Samuel Horváth, Eduard Gorbunov

    Abstract: Robustness to Byzantine attacks is a necessity for various distributed training scenarios. When the training reduces to the process of solving a minimization problem, Byzantine robustness is relatively well-understood. However, other problem formulations, such as min-max problems or, more generally, variational inequalities, arise in many modern machine learning and, in particular, distributed lea…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023; 69 pages, 12 figures

  17. arXiv:2310.02093  [pdf, other]

    cs.LG math.OC

    Stochastic Gradient Descent with Preconditioned Polyak Step-size

    Authors: Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Martin Takáč

    Abstract: Stochastic Gradient Descent (SGD) is one of the many iterative optimization methods that are widely used in solving machine learning problems. These methods display valuable properties and attract researchers and industrial machine learning engineers with their simplicity. However, one of the weaknesses of this type of method is the necessity to tune the learning rate (step-size) for every loss funct…

    Submitted 3 October, 2023; originally announced October 2023.

  18. arXiv:2304.04026  [pdf, other]

    cs.CL cs.AI

    WikiGoldSK: Annotated Dataset, Baselines and Few-Shot Learning Experiments for Slovak Named Entity Recognition

    Authors: Dávid Šuba, Marek Šuppa, Jozef Kubík, Endre Hamerlik, Martin Takáč

    Abstract: Named Entity Recognition (NER) is a fundamental NLP task with a wide range of practical applications. The performance of state-of-the-art NER methods depends on high-quality manually annotated datasets, which still do not exist for some languages. In this work we aim to remedy this situation in Slovak by introducing WikiGoldSK, the first sizable human-labelled Slovak NER dataset. We benchmark it by…

    Submitted 8 April, 2023; originally announced April 2023.

    Comments: BSNLP 2023 Workshop at EACL 2023

  19. arXiv:2304.01547  [pdf, other]

    cs.AI

    Regularization of the policy updates for stabilizing Mean Field Games

    Authors: Talal Algumaei, Ruben Solozabal, Reda Alami, Hakim Hacid, Merouane Debbah, Martin Takac

    Abstract: This work studies non-cooperative Multi-Agent Reinforcement Learning (MARL) where multiple agents interact in the same environment and whose goal is to maximize the individual returns. Challenges arise when scaling up the number of agents due to the resultant non-stationarity that the many agents introduce. In order to address this issue, Mean Field Games (MFG) rely on the symmetry and homogeneity…

    Submitted 13 April, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

  20. arXiv:2303.08447  [pdf, other]

    cs.LG cs.MA eess.SY

    MAHTM: A Multi-Agent Framework for Hierarchical Transactive Microgrids

    Authors: Nicolas Cuadrado, Roberto Gutierrez, Yongli Zhu, Martin Takac

    Abstract: Integrating variable renewable energy into the grid has posed challenges to system operators in achieving optimal trade-offs among energy availability, cost affordability, and pollution controllability. This paper proposes a multi-agent reinforcement learning framework for managing energy transactions in microgrids. The framework addresses the challenges above: it seeks to optimize the usage of av…

    Submitted 14 September, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: ICLR 2023 Workshop: Tackling Climate Change with Machine Learning

    ACM Class: I.2.8

  21. arXiv:2302.07615  [pdf, other]

    math.OC cs.DC cs.GT cs.LG stat.ML

    Similarity, Compression and Local Steps: Three Pillars of Efficient Communications for Distributed Variational Inequalities

    Authors: Aleksandr Beznosikov, Martin Takáč, Alexander Gasnikov

    Abstract: Variational inequalities are a broad and flexible class of problems that includes minimization, saddle point, and fixed point problems as special cases. Therefore, variational inequalities are used in various applications ranging from equilibrium search to adversarial learning. With the increasing size of data and models, today's instances demand parallel and distributed computing for real-world m…

    Submitted 30 March, 2024; v1 submitted 15 February, 2023; originally announced February 2023.

    Comments: Appears in: Advances in Neural Information Processing Systems 36 (NeurIPS 2023) (https://proceedings.neurips.cc/paper_files/paper/2023/hash/5b4a459db23e6db9be2a128380953d96-Abstract-Conference.html). 36 pages, 3 algorithms, 1 figure, 1 table

  22. arXiv:2301.00524  [pdf, other]

    cs.CV cs.HC cs.LG

    Learning Confident Classifiers in the Presence of Label Noise

    Authors: Asma Ahmed Hashmi, Aigerim Zhumabayeva, Nikita Kotelevskii, Artem Agafonov, Mohammad Yaqub, Maxim Panov, Martin Takáč

    Abstract: The success of Deep Neural Network (DNN) models significantly depends on the quality of provided annotations. In medical image segmentation, for example, having multiple expert annotations for each data point is common to minimize subjective annotation bias. Then, the goal of estimation is to filter out the label noise and recover the ground-truth masks, which are not explicitly given. This paper…

    Submitted 9 December, 2023; v1 submitted 1 January, 2023; originally announced January 2023.

  23. arXiv:2212.03836  [pdf, other]

    cs.CV cs.LG

    PaDPaF: Partial Disentanglement with Partially-Federated GANs

    Authors: Abdulla Jasem Almansoori, Samuel Horváth, Martin Takáč

    Abstract: Federated learning has become a popular machine learning paradigm with many potential real-life applications, including recommendation systems, the Internet of Things (IoT), healthcare, and self-driving cars. Though most current applications focus on classification-based tasks, learning personalized generative models remains largely unexplored, and their benefits in the heterogeneous setting still…

    Submitted 28 May, 2024; v1 submitted 7 December, 2022; originally announced December 2022.

    Comments: 29 pages, 21 figures. Published at TMLR 04/2024

  24. arXiv:2211.00866  [pdf, other]

    math.OC cs.LG

    Gradient Descent and the Power Method: Exploiting their connection to find the leftmost eigen-pair and escape saddle points

    Authors: Rachael Tappenden, Martin Takáč

    Abstract: This work shows that applying Gradient Descent (GD) with a fixed step size to minimize a (possibly nonconvex) quadratic function is equivalent to running the Power Method (PM) on the gradients. The connection between GD with a fixed step size and the PM, both with and without fixed momentum, is thus established. Consequently, valuable eigen-information is available via GD. Recent examples show t…

    Submitted 2 November, 2022; originally announced November 2022.

  25. arXiv:2210.09626  [pdf, other]

    cs.LG math.OC

    FLECS-CGD: A Federated Learning Second-Order Framework via Compression and Sketching with Compressed Gradient Differences

    Authors: Artem Agafonov, Brahim Erraji, Martin Takáč

    Abstract: In the recent paper FLECS (Agafonov et al., FLECS: A Federated Learning Second-Order Framework via Compression and Sketching), the second-order framework FLECS was proposed for the Federated Learning problem. This method utilizes compression of sketched Hessians to make communication costs low. However, the main bottleneck of FLECS is gradient communication without compression. In this paper, we pro…

    Submitted 18 October, 2022; originally announced October 2022.

  26. arXiv:2207.10804  [pdf, other]

    cs.CR cs.LG math.OC

    Suppressing Poisoning Attacks on Federated Learning for Medical Imaging

    Authors: Naif Alkhunaizi, Dmitry Kamzolov, Martin Takáč, Karthik Nandakumar

    Abstract: Collaboration among multiple data-owning entities (e.g., hospitals) can accelerate the training process and yield better machine learning models due to the availability and diversity of data. However, privacy concerns make it challenging to exchange data while preserving confidentiality. Federated Learning (FL) is a promising solution that enables collaborative training through exchange of model p…

    Submitted 14 July, 2022; originally announced July 2022.

  27. arXiv:2207.08171  [pdf, other]

    cs.LG math.OC

    SP2: A Second Order Stochastic Polyak Method

    Authors: Shuang Li, William J. Swartworth, Martin Takáč, Deanna Needell, Robert M. Gower

    Abstract: Recently the "SP" (Stochastic Polyak step size) method has emerged as a competitive adaptive method for setting the step sizes of SGD. SP can be interpreted as a method specialized to interpolated models, since it solves the interpolation equations. SP solves these equations by using local linearizations of the model. We take a step further and develop a method for solving the interpolation equatio…

    Submitted 17 July, 2022; originally announced July 2022.

  28. arXiv:2206.08303  [pdf, other]

    cs.LG math.OC

    On Scaled Methods for Saddle Point Problems

    Authors: Aleksandr Beznosikov, Aibek Alanov, Dmitry Kovalev, Martin Takáč, Alexander Gasnikov

    Abstract: Methods with adaptive scaling of different features play a key role in solving saddle point problems, primarily due to Adam's popularity for solving adversarial machine learning problems, including GAN training. This paper carries out a theoretical analysis of the following scaling techniques for solving SPPs: the well-known Adam and RmsProp scaling and the newer AdaHessian and OASIS based on Hut…

    Submitted 21 June, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: 54 pages, 2 algorithms with 4 options for each, 12 figures, 5 tables, 2 theorems

  29. arXiv:2206.04423  [pdf, other]

    cs.LG cs.AI

    Learning to generalize Dispatching rules on the Job Shop Scheduling

    Authors: Zangir Iklassov, Dmitrii Medvedev, Ruben Solozabal, Martin Takac

    Abstract: This paper introduces a Reinforcement Learning approach to better generalize heuristic dispatching rules on the Job-shop Scheduling Problem (JSP). Current models on the JSP do not focus on generalization, although, as we show in this work, this is key to learning better heuristics on the problem. A well-known technique to improve generalization is to learn on increasingly complex instances using C…

    Submitted 15 November, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

  30. arXiv:2206.02507  [pdf, other]

    cs.LG eess.SY

    Learning to Control under Time-Varying Environment

    Authors: Yuzhen Han, Ruben Solozabal, Jing Dong, Xingyu Zhou, Martin Takac, Bin Gu

    Abstract: This paper investigates the problem of regret minimization in linear time-varying (LTV) dynamical systems. Due to the simultaneous presence of uncertainty and non-stationarity, designing online control algorithms for unknown LTV systems remains a challenging task. At a cost of NP-hard offline planning, prior works have introduced online convex optimization algorithms, although they suffer from non…

    Submitted 6 June, 2022; originally announced June 2022.

  31. arXiv:2206.01666  [pdf, other]

    math.OC cs.LG

    Algorithm for Constrained Markov Decision Process with Linear Convergence

    Authors: Egor Gladin, Maksim Lavrik-Karmazin, Karina Zainullina, Varvara Rudenko, Alexander Gasnikov, Martin Takáč

    Abstract: The problem of constrained Markov decision process is considered. An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs (the number of constraints is relatively small). A new dual approach is proposed with the integration of two ingredients: entropy regularized policy optimizer and Vaidya's dual optimizer, both of which are critical to ac…

    Submitted 19 October, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: 27 pages, 2 figures, 3 tables. Improved presentation of the material, added a table with results, stated contributions more clearly, changed article template

  32. Stochastic Gradient Methods with Preconditioned Updates

    Authors: Abdurakhmon Sadiev, Aleksandr Beznosikov, Abdulla Jasem Almansoori, Dmitry Kamzolov, Rachael Tappenden, Martin Takáč

    Abstract: This work considers the non-convex finite sum minimization problem. There are several algorithms for such problems, but existing methods often work poorly when the problem is badly scaled and/or ill-conditioned, and a primary goal of this work is to introduce methods that alleviate this issue. Thus, here we include a preconditioner based on Hutchinson's approach to approximating the diagonal of th…

    Submitted 14 January, 2024; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: 40 pages, 2 new algorithms, 20 figures, 4 tables

  33. arXiv:2203.05403  [pdf, other]

    cs.LG

    Robustness Analysis of Classification Using Recurrent Neural Networks with Perturbed Sequential Input

    Authors: Guangyi Liu, Arash Amini, Martin Takac, Nader Motee

    Abstract: For a given stable recurrent neural network (RNN) that is trained to perform a classification task using sequential inputs, we quantify explicit robustness bounds as a function of trainable weight matrices. The sequential inputs can be perturbed in various ways, e.g., streaming images can be deformed due to robot motion or imperfect camera lens. Using the notion of the Voronoi diagram and Lipschit…

    Submitted 10 March, 2022; originally announced March 2022.

  34. arXiv:2202.02491  [pdf, ps, other]

    cs.LG cs.AI

    Distributed Learning With Sparsified Gradient Differences

    Authors: Yicheng Chen, Rick S. Blum, Martin Takac, Brian M. Sadler

    Abstract: A very large number of communications are typically required to solve distributed learning tasks, and this critically limits scalability and convergence speed in wireless communications applications. In this paper, we devise a Gradient Descent method with Sparsification and Error Correction (GD-SEC) to improve the communications efficiency in a general worker-server architecture. Motivated by a va…

    Submitted 4 February, 2022; originally announced February 2022.

  35. Random-reshuffled SARAH does not need a full gradient computations

    Authors: Aleksandr Beznosikov, Martin Takáč

    Abstract: The StochAstic Recursive grAdient algoritHm (SARAH) is a variance reduced variant of the Stochastic Gradient Descent (SGD) algorithm that needs a gradient of the objective function from time to time. In this paper, we remove the necessity of a full gradient computation. This is achieved by using a randomized reshuffling strategy and aggregating stochastic gradients obtained in each epoch…

    Submitted 14 January, 2024; v1 submitted 26 November, 2021; originally announced November 2021.

    Comments: 20 pages, 2 algorithms, 5 figures, 3 tables

  36. arXiv:2109.05198  [pdf, other]

    cs.LG math.OC

    Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information

    Authors: Majid Jahani, Sergey Rusakov, Zheng Shi, Peter Richtárik, Michael W. Mahoney, Martin Takáč

    Abstract: We present a novel adaptive optimization algorithm for large-scale machine learning problems. Equipped with a low-cost estimate of local curvature and Lipschitz smoothness, our method dynamically adapts the search direction and step-size. The search direction contains gradient information preconditioned by a well-scaled diagonal preconditioning matrix that captures the local curvature information.…

    Submitted 11 September, 2021; originally announced September 2021.

  37. arXiv:2107.02423  [pdf, other]

    cs.LG

    Improving Text-to-Image Synthesis Using Contrastive Learning

    Authors: Hui Ye, Xiulong Yang, Martin Takac, Rajshekhar Sunderraman, Shihao Ji

    Abstract: The goal of text-to-image synthesis is to generate a visually realistic image that matches a given text description. In practice, the captions annotated by humans for the same image have large variance in terms of contents and the choice of words. The linguistic discrepancy between the captions of the identical image leads to the synthetic images deviating from the ground truth. To address this is…

    Submitted 27 November, 2021; v1 submitted 6 July, 2021; originally announced July 2021.

    Comments: Accepted to BMVC 2021

  38. arXiv:2106.07289  [pdf, other]

    cs.LG cs.DC math.OC

    Decentralized Personalized Federated Learning for Min-Max Problems

    Authors: Ekaterina Borodich, Aleksandr Beznosikov, Abdurakhmon Sadiev, Vadim Sushko, Nikolay Savelyev, Martin Takáč, Alexander Gasnikov

    Abstract: Personalized Federated Learning (PFL) has witnessed remarkable advancements, enabling the development of innovative machine learning applications that preserve the privacy of training data. However, existing theoretical research in this field has primarily focused on distributed optimization for minimization problems. This paper is the first to study PFL for saddle point problems encompassing a br…

    Submitted 17 April, 2024; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: 33 pages, 3 algorithms, 5 figures, 2 tables

  39. arXiv:2102.09700  [pdf, other]

    cs.LG math.OC

    AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods

    Authors: Zheng Shi, Abdurakhmon Sadiev, Nicolas Loizou, Peter Richtárik, Martin Takáč

    Abstract: We present AI-SARAH, a practical variant of SARAH. As a variant of SARAH, this algorithm employs the stochastic recursive gradient yet adjusts step-size based on local geometry. AI-SARAH implicitly computes step-size and efficiently estimates local Lipschitz smoothness of stochastic functions. It is fully adaptive, tune-free, straightforward to implement, and computationally efficient. We provide…

    Submitted 1 February, 2022; v1 submitted 18 February, 2021; originally announced February 2021.

  40. arXiv:2012.10480  [pdf, other]

    cs.RO cs.LG cs.MA eess.SY

    Distributed Map Classification using Local Observations

    Authors: Guangyi Liu, Arash Amini, Martin Takáč, Héctor Muñoz-Avila, Nader Motee

    Abstract: We consider the problem of classifying a map using a team of communicating robots. It is assumed that all robots have localized visual sensing capabilities and can exchange their information with neighboring robots. Using a graph decomposition technique, we proposed an offline learning structure that makes every robot capable of communicating with and fusing information from its neighbors to plan…

    Submitted 10 March, 2021; v1 submitted 18 December, 2020; originally announced December 2020.

  41. DynNet: Physics-based neural architecture design for linear and nonlinear structural response modeling and prediction

    Authors: Soheil Sadeghi Eshkevari, Martin Takáč, Shamim N. Pakzad, Majid Jahani

    Abstract: Data-driven models for predicting dynamic responses of linear and nonlinear systems are of great importance due to their wide application from probabilistic analysis to inverse problems such as system identification and damage diagnosis. In this study, a physics-based recurrent neural network model is designed that is able to learn the dynamics of linear and nonlinear multiple degrees of freedom s…

    Submitted 3 July, 2020; originally announced July 2020.

    Comments: Submitted to Elsevier

    Report number: 111582

    Journal ref: Journal of Engineering Structures, Volume 229, 15 February 2021

  42. arXiv:2006.11984  [pdf, other]

    cs.LG stat.ML

    Constrained Combinatorial Optimization with Reinforcement Learning

    Authors: Ruben Solozabal, Josu Ceberio, Martin Takáč

    Abstract: This paper presents a framework to tackle constrained combinatorial optimization problems using deep Reinforcement Learning (RL). To this end, we extend the Neural Combinatorial Optimization (NCO) theory in order to deal with constraints in its formulation. Notably, we propose defining constrained combinatorial problems as fully observable Constrained Markov Decision Processes (CMDP). In that co…

    Submitted 21 June, 2020; originally announced June 2020.

  43. arXiv:2006.01892  [pdf, other]

    stat.ML cs.LG math.DS

    Finite Difference Neural Networks: Fast Prediction of Partial Differential Equations

    Authors: Zheng Shi, Nur Sila Gulgec, Albert S. Berahas, Shamim N. Pakzad, Martin Takáč

    Abstract: Discovering the underlying behavior of complex systems is an important topic in many science and engineering disciplines. In this paper, we propose a novel neural network framework, finite difference neural networks (FDNet), to learn partial differential equations from data. Specifically, our proposed finite difference inspired network is designed to learn the underlying governing partial differen…

    Submitted 2 June, 2020; originally announced June 2020.

    Comments: 38 pages, 48 figures

  44. arXiv:1912.09925  [pdf, other]

    cs.LG cs.DC math.NA math.OC

    Distributed Fixed Point Methods with Compressed Iterates

    Authors: Sélim Chraibi, Ahmed Khaled, Dmitry Kovalev, Peter Richtárik, Adil Salim, Martin Takáč

    Abstract: We propose basic and natural assumptions under which iterative optimization methods with compressed iterates can be analyzed. This problem is motivated by the practice of federated learning, where a large model stored in the cloud is compressed before it is sent to a mobile device, which then proceeds with training based on local data. We develop standard and variance reduced methods, and establis…

    Submitted 20 December, 2019; originally announced December 2019.

    Comments: 15 pages, 4 algorithms, 4 Theorems

  45. arXiv:1910.12680  [pdf, other]

    cs.LG stat.ML

    FD-Net with Auxiliary Time Steps: Fast Prediction of PDEs using Hessian-Free Trust-Region Methods

    Authors: Nur Sila Gulgec, Zheng Shi, Neil Deshmukh, Shamim Pakzad, Martin Takáč

    Abstract: Discovering the underlying physical behavior of complex systems is a crucial, but less well-understood topic in many engineering disciplines. This study proposes a finite-difference inspired convolutional neural network framework to learn hidden partial differential equations from given data and iteratively estimate future dynamical behavior. The methodology designs the filter sizes such that they…

    Submitted 28 October, 2019; v1 submitted 28 October, 2019; originally announced October 2019.

    Comments: Paper accepted to NeurIPS workshop

  46. arXiv:1909.09705  [pdf, other]

    cs.LG cs.AI cs.RO eess.SY stat.ML

    A Layered Architecture for Active Perception: Image Classification using Deep Reinforcement Learning

    Authors: Hossein K. Mousavi, Guangyi Liu, Weihang Yuan, Martin Takáč, Héctor Muñoz-Avila, Nader Motee

    Abstract: We propose a planning and perception mechanism for a robot (agent), that can only observe the underlying environment partially, in order to solve an image classification problem. A three-layer architecture is suggested that consists of a meta-layer that decides the intermediate goals, an action-layer that selects local actions as the agent navigates towards a goal, and a classification-layer that…

    Submitted 20 September, 2019; originally announced September 2019.

    Comments: Submitted to ICRA-2020

  47. arXiv:1905.13562  [pdf, other]

    cs.LG cs.AI stat.ML

    Don't Forget Your Teacher: A Corrective Reinforcement Learning Framework

    Authors: Mohammadreza Nazari, Majid Jahani, Lawrence V. Snyder, Martin Takáč

    Abstract: Although reinforcement learning (RL) can provide reliable solutions in many settings, practitioners are often wary of the discrepancies between the RL solution and their status quo procedures. Therefore, they may be reluctant to adapt to the novel way of executing tasks proposed by RL. On the other hand, many real-world problems require relatively small adjustments from the status quo policies to…

    Submitted 29 May, 2019; originally announced May 2019.

  48. arXiv:1905.04835  [pdf, other]

    cs.LG cs.CV cs.MA cs.RO eess.SY stat.ML

    Multi-Agent Image Classification via Reinforcement Learning

    Authors: Hossein K. Mousavi, Mohammadreza Nazari, Martin Takáč, Nader Motee

    Abstract: We investigate a classification problem using multiple mobile agents capable of collecting (partial) pose-dependent observations of an unknown environment. The objective is to classify an image over a finite time horizon. We propose a network architecture on how agents should form a local belief, take local actions, and extract relevant features from their raw partial observations. Agents are allo…

    Submitted 6 August, 2019; v1 submitted 12 May, 2019; originally announced May 2019.

    Comments: Preprint of the paper to be published in IROS'19 proceedings

  49. arXiv:1901.09997  [pdf, other]

    math.OC cs.LG stat.ML

    Quasi-Newton Methods for Machine Learning: Forget the Past, Just Sample

    Authors: Albert S. Berahas, Majid Jahani, Peter Richtárik, Martin Takáč

    Abstract: We present two sampled quasi-Newton methods (sampled LBFGS and sampled LSR1) for solving empirical risk minimization problems that arise in machine learning. Contrary to the classical variants of these methods that sequentially build Hessian or inverse Hessian approximations as the optimization progresses, our proposed methods sample points randomly around the current iterate at every iteration to…

    Submitted 27 July, 2021; v1 submitted 28 January, 2019; originally announced January 2019.

    Comments: 50 pages, 33 figures

  50. arXiv:1901.09269  [pdf, other]

    cs.LG math.OC stat.ML

    Distributed Learning with Compressed Gradient Differences

    Authors: Konstantin Mishchenko, Eduard Gorbunov, Martin Takáč, Peter Richtárik

    Abstract: Training large machine learning models requires a distributed computing approach, with communication of the model updates being the bottleneck. For this reason, several methods based on the compression (e.g., sparsification and/or quantization) of updates were recently proposed, including QSGD (Alistarh et al., 2017), TernGrad (Wen et al., 2017), SignSGD (Bernstein et al., 2018), and DQGD (Khirira…

    Submitted 28 December, 2023; v1 submitted 26 January, 2019; originally announced January 2019.

    Comments: 59 pages; Changes in V3: writing, presentation, and numerical experiments