Showing 1–28 of 28 results for author: Kolesnikov, S

  1. arXiv:2406.08973  [pdf, other]

    cs.LG cs.AI

    XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

    Authors: Alexander Nikulin, Ilya Zisman, Alexey Zemtsov, Viacheslav Sinii, Vladislav Kurenkov, Sergey Kolesnikov

    Abstract: Following the success of the in-context learning paradigm in large-scale language and computer vision models, the recently emerging field of in-context reinforcement learning is experiencing a rapid growth. However, its development has been held back by the lack of challenging benchmarks, as all the experiments have been carried out in simple environments and on small-scale datasets. We present \t…

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2312.13327  [pdf, other]

    cs.LG cs.AI

    In-Context Reinforcement Learning for Variable Action Spaces

    Authors: Viacheslav Sinii, Alexander Nikulin, Vladislav Kurenkov, Ilya Zisman, Sergey Kolesnikov

    Abstract: Recently, it has been shown that transformers pre-trained on diverse datasets with multi-episode contexts can generalize to new reinforcement learning tasks in-context. A key limitation of previously proposed models is their reliance on a predefined action space size and structure. The introduction of a new action space often requires data re-collection and model re-training, which can be costly f…

    Submitted 1 July, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: ICML 2024

  3. arXiv:2312.12275  [pdf, other]

    cs.LG

    Emergence of In-Context Reinforcement Learning from Noise Distillation

    Authors: Ilya Zisman, Vladislav Kurenkov, Alexander Nikulin, Viacheslav Sinii, Sergey Kolesnikov

    Abstract: Recently, extensive studies in Reinforcement Learning have been carried out on the ability of transformers to adapt in-context to various environments and tasks. Current in-context RL methods are limited by their strict requirements for data, which needs to be generated by RL agents or labeled with actions from an optimal policy. In order to address this prevalent problem, we propose AD…

    Submitted 12 June, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Proceedings of the 41-st International Conference on Machine Learning (ICML 2024); code: https://github.com/corl-team/ad-eps

  4. arXiv:2312.12044  [pdf, other]

    cs.LG

    XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX

    Authors: Alexander Nikulin, Vladislav Kurenkov, Ilya Zisman, Artem Agarkov, Viacheslav Sinii, Sergey Kolesnikov

    Abstract: Inspired by the diversity and depth of XLand and the simplicity and minimalism of MiniGrid, we present XLand-MiniGrid, a suite of tools and grid-world environments for meta-reinforcement learning research. Written in JAX, XLand-MiniGrid is designed to be highly scalable and can potentially run on GPU or TPU accelerators, democratizing large-scale experimentation with limited resources. Along with…

    Submitted 10 June, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023, Workshop, Source code: https://github.com/corl-team/xland-minigrid

  5. arXiv:2312.10464  [pdf, other]

    cs.LG cs.AI cs.CV

    Unveiling Empirical Pathologies of Laplace Approximation for Uncertainty Estimation

    Authors: Maksim Zhdanov, Stanislav Dereka, Sergey Kolesnikov

    Abstract: In this paper, we critically evaluate Bayesian methods for uncertainty estimation in deep learning, focusing on the widely applied Laplace approximation and its variants. Our findings reveal that the conventional method of fitting the Hessian matrix negatively impacts out-of-distribution (OOD) detection efficiency. We propose a different point of view, asserting that focusing solely on optimizing…

    Submitted 16 December, 2023; originally announced December 2023.

  6. arXiv:2312.01792  [pdf, other]

    cs.LG

    Wild-Tab: A Benchmark For Out-Of-Distribution Generalization In Tabular Regression

    Authors: Sergey Kolesnikov

    Abstract: Out-of-Distribution (OOD) generalization, a cornerstone for building robust machine learning models capable of handling data diverging from the training set's distribution, is an ongoing challenge in deep learning. While significant progress has been observed in computer vision and natural language processing, its exploration in tabular data, ubiquitous in many industrial applications, remains nas…

    Submitted 4 December, 2023; originally announced December 2023.

  7. Time-Aware Item Weighting for the Next Basket Recommendations

    Authors: Aleksey Romanov, Oleg Lashinin, Marina Ananyeva, Sergey Kolesnikov

    Abstract: In this paper we study the next basket recommendation problem. Recent methods use different approaches to achieve better performance. However, many of them do not use information about the time of prediction and time intervals between baskets. To fill this gap, we propose a novel method, Time-Aware Item-based Weighting (TAIW), which takes timestamps and intervals into account. We provide experimen…

    Submitted 30 July, 2023; originally announced July 2023.

  8. arXiv:2306.14292  [pdf, other]

    cs.IR

    RecBaselines2023: a new dataset for choosing baselines for recommender models

    Authors: Veronika Ivanova, Oleg Lashinin, Marina Ananyeva, Sergey Kolesnikov

    Abstract: The number of proposed recommender algorithms continues to grow. The authors propose new approaches and compare them with existing models, called baselines. Due to the large number of recommender models, it is difficult to estimate which algorithms to choose in the article. To solve this problem, we have collected and published a dataset containing information about the recommender models used in…

    Submitted 25 June, 2023; originally announced June 2023.

  9. arXiv:2306.08772  [pdf, other]

    cs.LG cs.AI cs.NE

    Katakomba: Tools and Benchmarks for Data-Driven NetHack

    Authors: Vladislav Kurenkov, Alexander Nikulin, Denis Tarasov, Sergey Kolesnikov

    Abstract: NetHack is known as the frontier of reinforcement learning research where learning-based methods still need to catch up to rule-based solutions. One of the promising directions for a breakthrough is using pre-collected datasets similar to recent developments in robotics, recommender systems, and more under the umbrella of offline reinforcement learning (ORL). Recently, a large-scale NetHack datase…

    Submitted 26 October, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks. Source code at https://github.com/corl-team/katakomba

  10. arXiv:2305.11616  [pdf, other]

    cs.CV cs.AI cs.LG

    Diversifying Deep Ensembles: A Saliency Map Approach for Enhanced OOD Detection, Calibration, and Accuracy

    Authors: Stanislav Dereka, Ivan Karpukhin, Maksim Zhdanov, Sergey Kolesnikov

    Abstract: Deep ensembles are capable of achieving state-of-the-art results in classification and out-of-distribution (OOD) detection. However, their effectiveness is limited due to the homogeneity of learned patterns within ensembles. To overcome this issue, our study introduces Saliency Diversified Deep Ensemble (SDDE), a novel approach that promotes diversity among ensemble members by leveraging saliency…

    Submitted 14 June, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

  11. arXiv:2305.09836  [pdf, other]

    cs.LG cs.AI

    Revisiting the Minimalist Approach to Offline Reinforcement Learning

    Authors: Denis Tarasov, Vladislav Kurenkov, Alexander Nikulin, Sergey Kolesnikov

    Abstract: Recent years have witnessed significant advancements in offline reinforcement learning (RL), resulting in the development of numerous algorithms with varying degrees of complexity. While these algorithms have led to noteworthy improvements, many incorporate seemingly minor design choices that impact their effectiveness beyond core algorithmic advances. However, the effect of these design choices o…

    Submitted 24 October, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: Source code: https://github.com/DT6A/ReBRAC

  12. arXiv:2301.13616  [pdf, other]

    cs.LG cs.AI cs.NE

    Anti-Exploration by Random Network Distillation

    Authors: Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Sergey Kolesnikov

    Abstract: Despite the success of Random Network Distillation (RND) in various domains, it was shown as not discriminative enough to be used as an uncertainty estimator for penalizing out-of-distribution actions in offline reinforcement learning. In this paper, we revisit these results and show that, with a naive choice of conditioning for the RND prior, it becomes infeasible for the actor to effectively min…

    Submitted 17 May, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

    Comments: ICML 2023, Poster, Source code: https://github.com/tinkoff-ai/sac-rnd

  13. arXiv:2211.11096  [pdf, other]

    cs.LG cs.AI cs.NE

    Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows

    Authors: Dmitriy Akimov, Vladislav Kurenkov, Alexander Nikulin, Denis Tarasov, Sergey Kolesnikov

    Abstract: Offline reinforcement learning aims to train a policy on a pre-recorded and fixed dataset without any additional environment interactions. There are two major challenges in this setting: (1) extrapolation error caused by approximating the value of state-action pairs not well-covered by the training data and (2) distributional shift between behavior and inference policies. One way to tackle these p…

    Submitted 30 January, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: Accepted at 3rd Offline Reinforcement Learning Workshop at Neural Information Processing Systems, 2022. Source code: https://github.com/tinkoff-ai/cnf

  14. arXiv:2211.11092  [pdf, other]

    cs.LG cs.AI cs.NE

    Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size

    Authors: Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Dmitry Akimov, Sergey Kolesnikov

    Abstract: Training large neural networks is known to be time-consuming, with the learning duration taking days or even weeks. To address this problem, large-batch optimization was introduced. This approach demonstrated that scaling mini-batch sizes with appropriate learning rate adjustments can speed up the training process by orders of magnitude. While long training time was not typically a major issue for…

    Submitted 30 January, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: Accepted at 3rd Offline Reinforcement Learning Workshop at Neural Information Processing Systems, 2022. Source code: https://github.com/tinkoff-ai/lb-sac

  15. arXiv:2210.07105  [pdf, other]

    cs.LG cs.AI

    CORL: Research-oriented Deep Offline Reinforcement Learning Library

    Authors: Denis Tarasov, Alexander Nikulin, Dmitry Akimov, Vladislav Kurenkov, Sergey Kolesnikov

    Abstract: CORL is an open-source library that provides thoroughly benchmarked single-file implementations of both deep offline and offline-to-online reinforcement learning algorithms. It emphasizes a simple developing experience with a straightforward codebase and a modern analysis tracking tool. In CORL, we isolate methods implementation into separate single files, making performance-relevant details easie…

    Submitted 26 October, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Conference on Neural Information Processing Systems (NeurIPS 2023) Track on Datasets and Benchmarks. Source code at https://github.com/corl-team/CORL

  16. arXiv:2205.11195  [pdf, other]

    cs.CV cs.IR cs.LG

    Deep Image Retrieval is not Robust to Label Noise

    Authors: Stanislav Dereka, Ivan Karpukhin, Sergey Kolesnikov

    Abstract: Large-scale datasets are essential for the success of deep learning in image retrieval. However, manual assessment errors and semi-supervised annotation techniques can lead to label noise even in popular datasets. As previous works primarily studied annotation quality in image classification tasks, it is still unclear how label noise affects deep learning approaches to image retrieval. In this wor…

    Submitted 23 May, 2022; originally announced May 2022.

  17. EXACT: How to Train Your Accuracy

    Authors: Ivan Karpukhin, Stanislav Dereka, Sergey Kolesnikov

    Abstract: Classification tasks are usually evaluated in terms of accuracy. However, accuracy is discontinuous and cannot be directly optimized using gradient ascent. Popular methods minimize cross-entropy, hinge loss, or other surrogate losses, which can lead to suboptimal results. In this paper, we propose a new optimization framework by introducing stochasticity to a model's output and optimizing expected…

    Submitted 24 July, 2024; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: Pattern Recognition Letters (2024)

  18. arXiv:2205.05393  [pdf, other]

    cs.LG

    CVTT: Cross-Validation Through Time

    Authors: Mikhail Andronov, Sergey Kolesnikov

    Abstract: The evaluation of recommender systems from a practical perspective is a topic of ongoing discourse within the research community. While many current evaluation methods reduce performance to a single value metric as an easy way to compare models, it relies on the assumption that the methods' performance remains constant over time. In this study, we examine this assumption and propose the Cross-Vali…

    Submitted 10 February, 2023; v1 submitted 11 May, 2022; originally announced May 2022.

  19. arXiv:2202.06768  [pdf, other]

    cs.CV

    Probabilistic Embeddings Revisited

    Authors: Ivan Karpukhin, Stanislav Dereka, Sergey Kolesnikov

    Abstract: In recent years, deep metric learning and its probabilistic extensions claimed state-of-the-art results in the face verification task. Despite improvements in face verification, probabilistic methods received little attention in the research community and practical applications. In this paper, we, for the first time, perform an in-depth analysis of known probabilistic methods in verification and r…

    Submitted 10 November, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

  20. arXiv:2110.05589  [pdf, other]

    cs.LG

    Next Period Recommendation Reality Check

    Authors: Sergey Kolesnikov, Oleg Lashinin, Michail Pechatov, Alexander Kosov

    Abstract: Over the past decade, tremendous progress has been made in Recommender Systems (RecSys) for well-known tasks such as next-item and next-basket prediction. On the other hand, the recently proposed next-period recommendation (NPR) task is not covered as much. Current works about NPR are mostly based around distinct problem formulations, methods, and proprietary datasets, making solutions difficult t…

    Submitted 20 December, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

  21. arXiv:2110.04156  [pdf, other]

    cs.LG cs.AI

    Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters

    Authors: Vladislav Kurenkov, Sergey Kolesnikov

    Abstract: In this work, we argue for the importance of an online evaluation budget for a reliable comparison of deep offline RL algorithms. First, we delineate that the online evaluation budget is problem-dependent, where some problems allow for less but others for more. And second, we demonstrate that the preference between algorithms is budget-dependent across a diverse range of decision-making domains su…

    Submitted 5 June, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

    Comments: ICML 2022, Spotlight; https://tinkoff-ai.github.io/eop/

  22. arXiv:2109.06692  [pdf, other]

    cs.CV cs.LG

    LRWR: Large-Scale Benchmark for Lip Reading in Russian language

    Authors: Evgeniy Egorov, Vasily Kostyumov, Mikhail Konyk, Sergey Kolesnikov

    Abstract: Lipreading, also known as visual speech recognition, aims to identify the speech content from videos by analyzing the visual deformations of lips and nearby areas. One of the significant obstacles for research in this field is the lack of proper datasets for a wide variety of languages: so far, these methods have been focused only on English or Chinese. In this paper, we introduce a naturally dist…

    Submitted 14 September, 2021; originally announced September 2021.

  23. arXiv:2003.14210  [pdf, other]

    cs.LG cs.AI stat.ML

    Sample Efficient Ensemble Learning with Catalyst.RL

    Authors: Sergey Kolesnikov, Valentin Khrulkov

    Abstract: We present Catalyst.RL, an open-source PyTorch framework for reproducible and sample efficient reinforcement learning (RL) research. Main features of Catalyst.RL include large-scale asynchronous distributed training, efficient implementations of various RL algorithms and auxiliary tricks, such as n-step returns, value distributions, hyperbolic reinforcement learning, etc. To demonstrate the effect…

    Submitted 7 April, 2020; v1 submitted 29 March, 2020; originally announced March 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1903.00027

  24. arXiv:1903.00027  [pdf, other]

    cs.LG stat.ML

    Catalyst.RL: A Distributed Framework for Reproducible RL Research

    Authors: Sergey Kolesnikov, Oleksii Hrinchuk

    Abstract: Despite the recent progress in deep reinforcement learning field (RL), and, arguably because of it, a large body of work remains to be done in reproducing and carefully comparing different RL algorithms. We present catalyst.RL, an open source framework for RL research with a focus on reproducibility and flexibility. Main features of our library include large-scale asynchronous distributed training…

    Submitted 28 February, 2019; originally announced March 2019.

  25. arXiv:1902.02441  [pdf, other]

    cs.LG cs.RO stat.ML

    Artificial Intelligence for Prosthetics - challenge solutions

    Authors: Łukasz Kidziński, Carmichael Ong, Sharada Prasanna Mohanty, Jennifer Hicks, Sean F. Carroll, Bo Zhou, Hongsheng Zeng, Fan Wang, Rongzhong Lian, Hao Tian, Wojciech Jaśkowski, Garrett Andersen, Odd Rune Lykkebø, Nihat Engin Toklu, Pranav Shyam, Rupesh Kumar Srivastava, Sergey Kolesnikov, Oleksii Hrinchuk, Anton Pechenko, Mattias Ljungström, Zhen Wang, Xu Hu, Zehong Hu, Minghui Qiu, Jun Huang, et al. (25 additional authors not shown)

    Abstract: In the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, participants were tasked with building a controller for a musculoskeletal model with a goal of matching a given time-varying velocity vector. Top participants were invited to describe their algorithms. In this work, we describe the challenge and present thirteen solutions that used deep reinforcement learning approaches. Many s…

    Submitted 6 February, 2019; originally announced February 2019.

  26. arXiv:1804.00361  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments

    Authors: Łukasz Kidziński, Sharada Prasanna Mohanty, Carmichael Ong, Zhewei Huang, Shuchang Zhou, Anton Pechenko, Adam Stelmaszczyk, Piotr Jarosik, Mikhail Pavlov, Sergey Kolesnikov, Sergey Plis, Zhibo Chen, Zhizheng Zhang, Jiale Chen, Jun Shi, Zhuobin Zheng, Chun Yuan, Zhihui Lin, Henryk Michalewski, Piotr Miłoś, Błażej Osiński, Andrew Melnik, Malte Schilling, Helge Ritter, Sean Carroll, et al. (4 additional authors not shown)

    Abstract: In the NIPS 2017 Learning to Run challenge, participants were tasked with building a controller for a musculoskeletal model to make it run as fast as possible through an obstacle course. Top participants were invited to describe their algorithms. In this work, we present eight solutions that used deep reinforcement learning approaches, based on algorithms such as Deep Deterministic Policy Gradient…

    Submitted 1 April, 2018; originally announced April 2018.

    Comments: 27 pages, 17 figures

  27. arXiv:1712.07440  [pdf, other]

    cs.SE

    On the Relation of External and Internal Feature Interactions: A Case Study

    Authors: Sergiy Kolesnikov, Norbert Siegmund, Christian Kästner, Sven Apel

    Abstract: Detecting feature interactions is imperative for accurately predicting performance of highly-configurable systems. State-of-the-art performance prediction techniques rely on supervised machine learning for detecting feature interactions, which, in turn, relies on time consuming performance measurements to obtain training data. By providing information about potentially interacting features, we can…

    Submitted 22 January, 2018; v1 submitted 20 December, 2017; originally announced December 2017.

  28. arXiv:1711.06922  [pdf, other]

    cs.AI cs.LG stat.ML

    Run, skeleton, run: skeletal model in a physics-based simulation

    Authors: Mikhail Pavlov, Sergey Kolesnikov, Sergey M. Plis

    Abstract: In this paper, we present our approach to solve a physics-based reinforcement learning challenge "Learning to Run" with objective to train physiologically-based human model to navigate a complex obstacle course as quickly as possible. The environment is computationally expensive, has a high-dimensional continuous action space and is stochastic. We benchmark state of the art policy-gradient methods…

    Submitted 28 January, 2018; v1 submitted 18 November, 2017; originally announced November 2017.

    Comments: Corrected typos and spelling