Showing 1–5 of 5 results for author: Joppen, T

Searching in archive cs.
  1. arXiv:2101.10670  [pdf, other]

    cs.AI

    Ordinal Monte Carlo Tree Search

    Authors: Tobias Joppen, Johannes Fürnkranz

    Abstract: In many problem settings, most notably in game playing, an agent receives a possibly delayed reward for its actions. Often, those rewards are handcrafted and not naturally given. Even simple terminal-only rewards, like winning equals one and losing equals minus one, can not be seen as an unbiased statement, since these values are chosen arbitrarily, and the behavior of the learner may change with…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: preprint. arXiv admin note: substantial text overlap with arXiv:1901.04274

  2. Ordinal Bucketing for Game Trees using Dynamic Quantile Approximation

    Authors: Tobias Joppen, Tilman Strübig, Johannes Fürnkranz

    Abstract: In this paper, we present a simple and cheap ordinal bucketing algorithm that approximately generates $q$-quantiles from an incremental data stream. The bucketing is done dynamically in the sense that the amount of buckets $q$ increases with the number of seen samples. We show how this can be used in Ordinal Monte Carlo Tree Search (OMCTS) to yield better bounds on time and space complexity, espec…

    Submitted 31 May, 2019; originally announced May 2019.

    Comments: preprint

    Journal ref: Proc. IEEE CoG 2019

  3. arXiv:1905.02005  [pdf, other]

    cs.LG cs.AI stat.ML

    Deep Ordinal Reinforcement Learning

    Authors: Alexander Zap, Tobias Joppen, Johannes Fürnkranz

    Abstract: Reinforcement learning usually makes use of numerical rewards, which have nice properties but also come with drawbacks and difficulties. Using rewards on an ordinal scale (ordinal rewards) is an alternative to numerical rewards that has received more attention in recent years. In this paper, a general approach to adapting reinforcement learning problems to the use of ordinal rewards is presented a…

    Submitted 11 July, 2019; v1 submitted 6 May, 2019; originally announced May 2019.

    Comments: replaced figures for better visibility, added GitHub repository, more details about source of experimental results, updated target value calculation for standard and ordinal Deep Q-Network

    Journal ref: Proc. ECML/PKDD (3) 2019: 3-18

  4. arXiv:1901.04274  [pdf, other]

    cs.AI

    Ordinal Monte Carlo Tree Search

    Authors: Tobias Joppen, Johannes Fürnkranz

    Abstract: In many problem settings, most notably in game playing, an agent receives a possibly delayed reward for its actions. Often, those rewards are handcrafted and not naturally given. Even simple terminal-only rewards, like winning equals 1 and losing equals -1, can not be seen as an unbiased statement, since these values are chosen arbitrarily, and the behavior of the learner may change with different…

    Submitted 14 January, 2019; originally announced January 2019.

    Comments: preview

    Journal ref: IJCAI Workshop on Monte Carlo Tree Search, 2020

  5. Preference-Based Monte Carlo Tree Search

    Authors: Tobias Joppen, Christian Wirth, Johannes Fürnkranz

    Abstract: Monte Carlo tree search (MCTS) is a popular choice for solving sequential anytime problems. However, it depends on a numeric feedback signal, which can be difficult to define. Real-time MCTS is a variant which may only rarely encounter states with an explicit, extrinsic reward. To deal with such cases, the experimenter has to supply an additional numeric feedback signal in the form of a heuristic,…

    Submitted 17 July, 2018; originally announced July 2018.

    Comments: To be published

    Journal ref: Proceedings of the 41st German Conference on Artificial Intelligence (KI-18), 2018