Showing 1–7 of 7 results for author: Kastner, T

Searching in archive cs.
  1. arXiv:2406.17718  [pdf, other]

    cs.LG

    When does Self-Prediction help? Understanding Auxiliary Tasks in Reinforcement Learning

    Authors: Claas Voelcker, Tyler Kastner, Igor Gilitschenski, Amir-massoud Farahmand

    Abstract: We investigate the impact of auxiliary learning tasks such as observation reconstruction and latent self-prediction on the representation learning problem in reinforcement learning. We also study how they interact with distractions and observation functions in the MDP. We provide a theoretical analysis of the learning dynamics of observation reconstruction, latent self-prediction, and TD learning…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2310.19804  [pdf, other]

    cs.LG cs.AI

    A Kernel Perspective on Behavioural Metrics for Markov Decision Processes

    Authors: Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland

    Abstract: Behavioural metrics have been shown to be an effective mechanism for constructing representations in reinforcement learning. We present a novel perspective on behavioural metrics for Markov decision processes via the use of positive definite kernels. We leverage this new perspective to define a new metric that is provably equivalent to the recently introduced MICo distance (Castro et al., 2021). T…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: Published in TMLR

  3. arXiv:2307.01708  [pdf, other]

    cs.LG cs.AI

    Distributional Model Equivalence for Risk-Sensitive Reinforcement Learning

    Authors: Tyler Kastner, Murat A. Erdogdu, Amir-massoud Farahmand

    Abstract: We consider the problem of learning models for risk-sensitive reinforcement learning. We theoretically demonstrate that proper value equivalence, a method of learning models which can be used to plan optimally in the risk-neutral setting, is not sufficient to plan optimally in the risk-sensitive setting. We leverage distributional reinforcement learning to introduce two new notions of model equiva…

    Submitted 3 December, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

  4. Objective Measures of Perceptual Audio Quality Reviewed: An Evaluation of Their Application Domain Dependence

    Authors: Matteo Torcoli, Thorsten Kastner, Jürgen Herre

    Abstract: Over the past few decades, computational methods have been developed to estimate perceptual audio quality. These methods, also referred to as objective quality measures, are usually developed and intended for a specific application domain. Because of their convenience, they are often used outside their original intended domain, even if it is unclear whether they provide reliable quality estimates…

    Submitted 21 October, 2021; originally announced October 2021.

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, 2021

  5. Controlling the Remixing of Separated Dialogue with a Non-Intrusive Quality Estimate

    Authors: Matteo Torcoli, Jouni Paulus, Thorsten Kastner, Christian Uhle

    Abstract: Remixing separated audio sources trades off interferer attenuation against the amount of audible deteriorations. This paper proposes a non-intrusive audio quality estimation method for controlling this trade-off in a signal-adaptive manner. The recently proposed 2f-model is adopted as the underlying quality measure, since it has been shown to correlate strongly with basic audio quality in source s…

    Submitted 21 July, 2021; originally announced July 2021.

    Comments: Manuscript accepted for the 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

  6. arXiv:2106.08229  [pdf, other]

    cs.LG cs.AI

    MICo: Improved representations via sampling-based state similarity for Markov decision processes

    Authors: Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland

    Abstract: We present a new behavioural distance over the state space of a Markov decision process, and demonstrate the use of this distance as an effective means of shaping the learnt representations of deep reinforcement learning agents. While existing notions of state similarity are typically difficult to learn at scale due to high computational cost and lack of sample-based algorithms, our newly-proposed…

    Submitted 21 January, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

    Comments: Published at NeurIPS 2021

  7. arXiv:1708.09706  [pdf]

    cs.HC

    Seminar Innovation Management - Winter Term 2017

    Authors: Gerd Häusler, Aleksandra Milczarek, Markus Schreiter, Thomas Kästner, Florian Willomitzer, Andreas Maier, Florian Schiffers, Stefan Steidl, Temitope Paul Onanuga, Mathias Unberath, Florian Dötzer, Maike Stöve, Jonas Hajek, Christian Heidorn, Felix Häußler, Tobias Geimer, Johannes Wendel

    Abstract: This document contains the results obtained by the Innovation Management Seminar in winter term 2017. In total 11 ideas have been developed by the team. In the document all 11 ideas show improvements for future applications in ophthalmology. The 11 ideas are AR/VR Glasses with Medical Applications, Augmented Reality Eye Surgery, Game Diagnosis, Intelligent Adapting Glasses, MD Facebook, Medical Cr…

    Submitted 22 August, 2017; originally announced August 2017.

    ACM Class: K.6.0