Showing 1–12 of 12 results for author: Boney, R

Searching in archive cs.

  1. arXiv:2407.02477

    cs.CV cs.CL

    Understanding Alignment in Multimodal LLMs: A Comprehensive Study

    Authors: Elmira Amirloo, Jean-Philippe Fauconnier, Christoph Roesmann, Christian Kerl, Rinu Boney, Yusu Qian, Zirui Wang, Afshin Dehghan, Yinfei Yang, Zhe Gan, Peter Grasch

    Abstract: Preference alignment has become a crucial component in enhancing the performance of Large Language Models (LLMs), yet its impact in Multimodal Large Language Models (MLLMs) remains comparatively underexplored. Similar to language models, MLLMs for image understanding tasks encounter challenges like hallucination. In MLLMs, hallucination can occur not only by stating incorrect facts but also by pro…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2306.09466

    cs.LG cs.RO

    Simplified Temporal Consistency Reinforcement Learning

    Authors: Yi Zhao, Wenshuai Zhao, Rinu Boney, Juho Kannala, Joni Pajarinen

    Abstract: Reinforcement learning is able to solve complex sequential decision-making tasks but is currently limited by sample efficiency and required computation. To improve sample efficiency, recent work focuses on model-based RL which interleaves model learning with planning. Recent methods further utilize policy learning, value estimation, and self-supervised learning as auxiliary objectives. In this pa…

    Submitted 15 June, 2023; originally announced June 2023.

  3. arXiv:2303.11801

    cs.RO

    SACPlanner: Real-World Collision Avoidance with a Soft Actor Critic Local Planner and Polar State Representations

    Authors: Khaled Nakhleh, Minahil Raza, Mack Tang, Matthew Andrews, Rinu Boney, Ilija Hadzic, Jeongran Lee, Atefeh Mohajeri, Karina Palyutina

    Abstract: We study the training performance of ROS local planners based on Reinforcement Learning (RL), and the trajectories they produce on real-world robots. We show that recent enhancements to the Soft Actor Critic (SAC) algorithm such as RAD and DrQ achieve almost perfect training after only 10000 episodes. We also observe that on real-world robots the resulting SACPlanner is more reactive to obstacles…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: Accepted at 2023 IEEE International Conference on Robotics and Automation (ICRA)

  4. arXiv:2210.13846

    cs.LG cs.RO

    Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning

    Authors: Yi Zhao, Rinu Boney, Alexander Ilin, Juho Kannala, Joni Pajarinen

    Abstract: Offline reinforcement learning, by learning from a fixed dataset, makes it possible to learn agent behaviors without interacting with the environment. However, depending on the quality of the offline dataset, such pre-trained agents may have limited performance and would further need to be fine-tuned online by interacting with the environment. During online fine-tuning, the performance of the pre-…

    Submitted 25 October, 2022; originally announced October 2022.

  5. arXiv:2106.07995

    cs.LG cs.CV cs.RO

    Learning of feature points without additional supervision improves reinforcement learning from images

    Authors: Rinu Boney, Alexander Ilin, Juho Kannala

    Abstract: In many control problems that include vision, optimal controls can be inferred from the location of the objects in the scene. This information can be represented using feature points, a list of spatial locations in learned feature maps of an input image. Previous works show that feature points learned using unsupervised pre-training or human supervision can provide good features for contr…

    Submitted 4 June, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

  6. arXiv:2012.12186

    cs.AI

    Learning to Play Imperfect-Information Games by Imitating an Oracle Planner

    Authors: Rinu Boney, Alexander Ilin, Juho Kannala, Jarno Seppänen

    Abstract: We consider learning to play multiplayer imperfect-information games with simultaneous moves and large state-action spaces. Previous attempts to tackle such challenging games have largely focused on model-free learning methods, often requiring hundreds of years of experience to produce competitive agents. Our approach is based on model-based planning. We tackle the problem of partial observability…

    Submitted 22 December, 2020; originally announced December 2020.

  7. arXiv:2011.03085

    cs.RO cs.AI

    RealAnt: An Open-Source Low-Cost Quadruped for Education and Research in Real-World Reinforcement Learning

    Authors: Rinu Boney, Jussi Sainio, Mikko Kaivola, Arno Solin, Juho Kannala

    Abstract: Current robot platforms available for research are either very expensive or unable to handle the abuse of exploratory controls in reinforcement learning. We develop RealAnt, a minimal low-cost physical version of the popular 'Ant' benchmark used in reinforcement learning. RealAnt costs only ~350 EUR ($410) in materials and can be assembled in less than an hour. We validate the platform with…

    Submitted 4 June, 2022; v1 submitted 5 November, 2020; originally announced November 2020.

  8. arXiv:2008.00715

    cs.RO

    Learning to Drive (L2D) as a Low-Cost Benchmark for Real-World Reinforcement Learning

    Authors: Ari Viitala, Rinu Boney, Yi Zhao, Alexander Ilin, Juho Kannala

    Abstract: We present Learning to Drive (L2D), a low-cost benchmark for real-world reinforcement learning (RL). L2D involves a simple and reproducible experimental setup where an RL agent has to learn to drive a Donkey car around three miniature tracks, given only monocular image observations and speed of the car. The agent has to learn to drive from disengagements, which occur when it drives off the track.…

    Submitted 6 November, 2020; v1 submitted 3 August, 2020; originally announced August 2020.

  9. arXiv:1910.05527

    cs.LG cs.RO stat.ML

    Regularizing Model-Based Planning with Energy-Based Models

    Authors: Rinu Boney, Juho Kannala, Alexander Ilin

    Abstract: Model-based reinforcement learning could enable sample-efficient learning by quickly acquiring rich knowledge about the world and using it to improve behaviour without additional data. Learned dynamics models can be directly used for planning actions but this has been challenging because of inaccuracies in the learned models. In this paper, we focus on planning with learned dynamics models and pro…

    Submitted 12 October, 2019; originally announced October 2019.

    Comments: Conference on Robot Learning 2019

  10. arXiv:1903.11981

    cs.LG cs.RO stat.ML

    Regularizing Trajectory Optimization with Denoising Autoencoders

    Authors: Rinu Boney, Norman Di Palo, Mathias Berglund, Alexander Ilin, Juho Kannala, Antti Rasmus, Harri Valpola

    Abstract: Trajectory optimization using a learned model of the environment is one of the core elements of model-based reinforcement learning. This procedure often suffers from exploiting inaccuracies of the learned model. We propose to regularize trajectory optimization by means of a denoising autoencoder that is trained on the same trajectories as the model of the environment. We show that the proposed reg…

    Submitted 25 December, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

    Comments: NeurIPS 2019

  11. arXiv:1711.10856

    cs.LG stat.ML

    Semi-Supervised and Active Few-Shot Learning with Prototypical Networks

    Authors: Rinu Boney, Alexander Ilin

    Abstract: We consider the problem of semi-supervised few-shot classification where a classifier needs to adapt to new tasks using a few labeled examples and (potentially many) unlabeled examples. We propose a clustering approach to the problem. The features extracted with Prototypical Networks are clustered using $K$-means with the few labeled examples guiding the clustering process. We note that in many re…

    Submitted 25 April, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

  12. arXiv:1707.09219

    cs.NE cs.AI cs.LG stat.ML

    Recurrent Ladder Networks

    Authors: Isabeau Prémont-Schwarz, Alexander Ilin, Tele Hotloo Hao, Antti Rasmus, Rinu Boney, Harri Valpola

    Abstract: We propose a recurrent extension of the Ladder networks whose structure is motivated by the inference required in hierarchical latent variable models. We demonstrate that the recurrent Ladder is able to handle a wide variety of complex learning tasks that benefit from iterative inference and temporal modeling. The architecture shows close-to-optimal results on temporal modeling of video data, comp…

    Submitted 18 December, 2017; v1 submitted 28 July, 2017; originally announced July 2017.

    Comments: 9 pages, 9 figures, 7-page appendix, fixed fig 9 (c)