Showing 1–3 of 3 results for author: Tanner, B

Searching in archive cs.
  1. arXiv:2312.03121  [pdf, other]

    cs.AI cs.GT cs.MA

    Evaluating Agents using Social Choice Theory

    Authors: Marc Lanctot, Kate Larson, Yoram Bachrach, Luke Marris, Zun Li, Avishkar Bhoopchand, Thomas Anthony, Brian Tanner, Anna Koop

    Abstract: We argue that many general evaluation problems can be viewed through the lens of voting theory. Each task is interpreted as a separate voter, which requires only ordinal rankings or pairwise comparisons of agents to produce an overall evaluation. By viewing the aggregator as a social welfare function, we are able to leverage centuries of research in social choice theory to derive principled evalua…

    Submitted 6 December, 2023; v1 submitted 5 December, 2023; originally announced December 2023.

  2. Reward-Respecting Subtasks for Model-Based Reinforcement Learning

    Authors: Richard S. Sutton, Marlos C. Machado, G. Zacharias Holland, David Szepesvari, Finbarr Timbers, Brian Tanner, Adam White

    Abstract: To achieve the ambitious goals of artificial intelligence, reinforcement learning must include planning with a model of the world that is abstract in state and time. Deep learning has made progress with state abstraction, but temporal abstraction has rarely been used, despite extensively developed theory based on the options framework. One reason for this is that the space of possible options is i…

    Submitted 16 September, 2023; v1 submitted 7 February, 2022; originally announced February 2022.

    Journal ref: Artificial Intelligence, first published online September 6, 2023

  3. arXiv:1504.05539  [pdf, other]

    cs.LG

    Temporal-Difference Networks

    Authors: Richard S. Sutton, Brian Tanner

    Abstract: We introduce a generalization of temporal-difference (TD) learning to networks of interrelated predictions. Rather than relating a single prediction to itself at a later time, as in conventional TD methods, a TD network relates each prediction in a set of predictions to other predictions in the set at a later time. TD networks can represent and apply TD learning to a much wider class of prediction…

    Submitted 21 April, 2015; originally announced April 2015.

    Comments: 8 pages, 3 figures. Presented at the 2004 Conference on Neural Information Processing Systems. In Advances in Neural Information Processing Systems 17 (proceedings of the 2004 conference), Saul, L. K., Weiss, Y., and Bottou, L. (Eds.)