Showing 1–13 of 13 results for author: Konstantinov, N

Searching in archive cs.
  1. arXiv:2405.17299  [pdf, other]

    stat.ML cs.LG math.OC

    Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data

    Authors: Nikita Tsoy, Nikola Konstantinov

    Abstract: Simplicity bias, the propensity of deep models to over-rely on simple features, has been identified as a potential reason for limited out-of-distribution generalization of neural networks (Shah et al., 2020). Despite the important implications, this phenomenon has been theoretically confirmed and characterized only under strong dataset assumptions, such as linear separability (Lyu et al., 2021). I…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: ICML 2024, camera-ready version

  2. arXiv:2403.06672  [pdf, ps, other]

    stat.ML cs.CR cs.GT cs.LG

    Provable Mutual Benefits from Federated Learning in Privacy-Sensitive Domains

    Authors: Nikita Tsoy, Anna Mihalkova, Teodora Todorova, Nikola Konstantinov

    Abstract: Cross-silo federated learning (FL) allows data owners to train accurate machine learning models by benefiting from each other's private datasets. Unfortunately, the model accuracy benefits of collaboration are often undermined by privacy defenses. Therefore, to incentivize client participation in privacy-sensitive domains, a FL protocol should strike a delicate balance between privacy guarantees an…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: AISTATS 2024; Camera-ready version

  3. arXiv:2305.16272  [pdf, other]

    cs.LG cs.GT stat.ML

    Incentivizing Honesty among Competitors in Collaborative Learning and Optimization

    Authors: Florian E. Dorner, Nikola Konstantinov, Georgi Pashaliev, Martin Vechev

    Abstract: Collaborative learning techniques have the potential to enable training machine learning models that are superior to models trained on a single entity's data. However, in many cases, potential participants in such collaborative schemes are competitors on a downstream task, such as firms that each aim to attract customers by providing the best recommendations. This can incentivize dishonest updates…

    Submitted 30 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023 Camera Ready; 37 pages, 5 figures

  4. arXiv:2305.16052  [pdf, ps, other]

    cs.LG cs.GT

    Strategic Data Sharing between Competitors

    Authors: Nikita Tsoy, Nikola Konstantinov

    Abstract: Collaborative learning techniques have significantly advanced in recent years, enabling private model training across multiple organizations. Despite this opportunity, firms face a dilemma when considering data sharing with competitors -- while collaboration can improve a company's machine learning model, it may also benefit competitors and hence reduce profits. In this work, we introduce a genera…

    Submitted 30 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Accepted to NeurIPS 2023

  5. arXiv:2212.10154  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Human-Guided Fair Classification for Natural Language Processing

    Authors: Florian E. Dorner, Momchil Peychev, Nikola Konstantinov, Naman Goel, Elliott Ash, Martin Vechev

    Abstract: Text classifiers have promising applications in high-stake tasks such as resume screening and content moderation. These classifiers must be fair and avoid discriminatory decisions by being invariant to perturbations of sensitive attributes such as gender or ethnicity. However, there is a gap between human intuition about these perturbations and the formal similarity specifications capturing them.…

    Submitted 16 March, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Published at ICLR 2023 (notable top 25%). 30 pages, 1 figure

  6. arXiv:2206.12395  [pdf, other]

    cs.LG cs.CR cs.DC

    Data Leakage in Federated Averaging

    Authors: Dimitar I. Dimitrov, Mislav Balunović, Nikola Konstantinov, Martin Vechev

    Abstract: Recent attacks have shown that user data can be recovered from FedSGD updates, thus breaking privacy. However, these attacks are of limited practical relevance as federated learning typically uses the FedAvg algorithm. Compared to FedSGD, recovering data from FedAvg updates is much harder as: (i) the updates are computed at unobserved intermediate network weights, (ii) a large number of batches ar…

    Submitted 1 November, 2022; v1 submitted 24 June, 2022; originally announced June 2022.

    ACM Class: I.2.11

  7. arXiv:2106.11732  [pdf, other]

    cs.LG stat.ML

    FLEA: Provably Robust Fair Multisource Learning from Unreliable Training Data

    Authors: Eugenia Iofinova, Nikola Konstantinov, Christoph H. Lampert

    Abstract: Fairness-aware learning aims at constructing classifiers that not only make accurate predictions, but also do not discriminate against specific groups. It is a fast-growing area of machine learning with far-reaching societal impact. However, existing fair learning methods are vulnerable to accidental or malicious artifacts in the training data, which can cause them to unknowingly produce unfair cl…

    Submitted 11 January, 2023; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: 10 pages in main text; 42 pages including bibliography and appendix. Published in Transactions of Machine Learning Research (TMLR), 2022, https://openreview.net/forum?id=XsPopigZX; project website at https://github.com/ISTAustria-CVML/FLEA

  8. arXiv:2102.06004  [pdf, ps, other]

    cs.LG stat.ML

    Fairness-Aware PAC Learning from Corrupted Data

    Authors: Nikola Konstantinov, Christoph H. Lampert

    Abstract: Addressing fairness concerns about machine learning models is a crucial step towards their long-term adoption in real-world automated systems. While many approaches have been developed for training fair models from data, little is known about the robustness of these methods to data corruption. In this work we consider fairness-aware learning under worst-case data manipulations. We show that an adv…

    Submitted 7 June, 2022; v1 submitted 11 February, 2021; originally announced February 2021.

    Comments: In Journal of Machine Learning Research (JMLR): http://jmlr.org/papers/v23/21-1189.html

  9. arXiv:2102.05996  [pdf, other]

    cs.LG cs.IR stat.ML

    Fairness Through Regularization for Learning to Rank

    Authors: Nikola Konstantinov, Christoph H. Lampert

    Abstract: Given the abundance of applications of ranking in recent years, addressing fairness concerns around automated ranking systems becomes necessary for increasing the trust among end-users. Previous work on fair ranking has mostly focused on application-specific fairness notions, often tailored to online advertising, and it rarely considers learning as part of the process. In this work, we show how to…

    Submitted 7 June, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

    Comments: 34 pages

  10. arXiv:2002.10384  [pdf, other]

    cs.LG stat.ML

    On the Sample Complexity of Adversarial Multi-Source PAC Learning

    Authors: Nikola Konstantinov, Elias Frantar, Dan Alistarh, Christoph H. Lampert

    Abstract: We study the problem of learning from multiple untrusted data sources, a scenario of increasing practical relevance given the recent emergence of crowdsourcing and collaborative learning paradigms. Specifically, we analyze the situation in which a learning system obtains datasets from multiple sources, some of which might be biased or even adversarially perturbed. It is known that in the single-so…

    Submitted 30 June, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: International Conference on Machine Learning (ICML) 2020: Camera-ready. Strengthened the definition of adversarial PAC-learnability, added explicit bounds on sample complexity

  11. arXiv:1901.10310  [pdf, other]

    cs.LG stat.ML

    Robust Learning from Untrusted Sources

    Authors: Nikola Konstantinov, Christoph Lampert

    Abstract: Modern machine learning methods often require more data for training than a single expert can provide. Therefore, it has become a standard procedure to collect data from external sources, e.g. via crowdsourcing. Unfortunately, the quality of these sources is not always guaranteed. As additional complications, the data might be stored in a distributed way, or might even have to remain private. In t…

    Submitted 17 May, 2019; v1 submitted 29 January, 2019; originally announced January 2019.

    Comments: Accepted to International Conference on Machine Learning (ICML), 2019; Camera-ready version

  12. arXiv:1809.10505  [pdf, other]

    cs.LG cs.DC stat.ML

    The Convergence of Sparsified Gradient Methods

    Authors: Dan Alistarh, Torsten Hoefler, Mikael Johansson, Sarit Khirirat, Nikola Konstantinov, Cédric Renggli

    Abstract: Distributed training of massive machine learning models, in particular deep neural networks, via Stochastic Gradient Descent (SGD) is becoming commonplace. Several families of communication-reduction methods, such as quantization, large-batch methods, and gradient sparsification, have been proposed. To date, gradient sparsification methods - where each node sorts gradients by magnitude, and only c…

    Submitted 27 September, 2018; originally announced September 2018.

    Comments: NIPS 2018 - Advances in Neural Information Processing Systems; Authors in alphabetic order

  13. arXiv:1803.08841  [pdf, other]

    cs.DC cs.LG stat.ML

    The Convergence of Stochastic Gradient Descent in Asynchronous Shared Memory

    Authors: Dan Alistarh, Christopher De Sa, Nikola Konstantinov

    Abstract: Stochastic Gradient Descent (SGD) is a fundamental algorithm in machine learning, representing the optimization backbone for training several classic models, from regression to neural networks. Given the recent practical focus on distributed machine learning, significant work has been dedicated to the convergence properties of this algorithm under the inconsistent and noisy updates arising from ex…

    Submitted 22 June, 2018; v1 submitted 23 March, 2018; originally announced March 2018.

    Comments: To be published in PoDC 2018; 18 pages, 1 figure; Changes: added pseudocode for Algorithm 2, some references and corrected typos