
Showing 1–7 of 7 results for author: Franzmeyer, T

Searching in archive cs.
  1. arXiv:2406.03428

    cs.LG

    HelloFresh: LLM Evaluations on Streams of Real-World Human Editorial Actions across X Community Notes and Wikipedia edits

    Authors: Tim Franzmeyer, Aleksandar Shtedritski, Samuel Albanie, Philip Torr, João F. Henriques, Jakob N. Foerster

    Abstract: Benchmarks have been essential for driving progress in machine learning. A better understanding of LLM capabilities on real world tasks is vital for safe development. Designing adequate LLM benchmarks is challenging: Data from real-world tasks is hard to collect, public availability of static evaluation data results in test data contamination and benchmark overfitting, and periodically generating…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Findings

  2. arXiv:2405.03735

    cs.LG cs.AI cs.MA

    Select to Perfect: Imitating desired behavior from large multi-agent data

    Authors: Tim Franzmeyer, Edith Elkind, Philip Torr, Jakob Foerster, Joao Henriques

    Abstract: AI agents are commonly trained with large datasets of demonstrations of human behavior. However, not all behaviors are equally safe or desirable. Desired characteristics for an AI agent can be expressed by assigning desirability scores, which we assume are not assigned to individual behaviors but to collective trajectories. For example, in a dataset of vehicle interactions, these scores might rela…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: ICLR 2024

  3. arXiv:2404.07099

    cs.LG cs.AI

    Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection

    Authors: Linas Nasvytis, Kai Sandbrink, Jakob Foerster, Tim Franzmeyer, Christian Schroeder de Witt

    Abstract: While reinforcement learning (RL) algorithms have been successfully applied across numerous sequential decision-making problems, their generalization to unforeseen testing environments remains a significant concern. In this paper, we study the problem of out-of-distribution (OOD) detection in RL, which focuses on identifying situations at test time that RL agents have not encountered in their trai…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted as a full paper to the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024)

  4. arXiv:2306.01804

    cs.LG cs.AI

    Extracting Reward Functions from Diffusion Models

    Authors: Felipe Nuti, Tim Franzmeyer, João F. Henriques

    Abstract: Diffusion models have achieved remarkable results in image generation, and have similarly been used to learn high-performing policies in sequential decision-making tasks. Decision-making diffusion models can be trained on lower-quality data, and then be steered with a reward function to generate near-optimal trajectories. We consider the problem of extracting a reward function by comparing a decis…

    Submitted 9 December, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023

  5. arXiv:2209.12093

    cs.AI

    Learn what matters: cross-domain imitation learning with task-relevant embeddings

    Authors: Tim Franzmeyer, Philip H. S. Torr, João F. Henriques

    Abstract: We study how an autonomous agent learns to perform a task from demonstrations in a different domain, such as a different environment or different agent. Such cross-domain imitation learning is required to, for example, train an artificial agent from demonstrations of a human expert. We propose a scalable framework that enables cross-domain imitation learning without access to additional demonstrat…

    Submitted 24 September, 2022; originally announced September 2022.

    Comments: NeurIPS 2022

  6. arXiv:2207.10170

    cs.AI

    Illusory Attacks: Information-Theoretic Detectability Matters in Adversarial Attacks

    Authors: Tim Franzmeyer, Stephen McAleer, João F. Henriques, Jakob N. Foerster, Philip H. S. Torr, Adel Bibi, Christian Schroeder de Witt

    Abstract: Autonomous agents deployed in the real world need to be robust against adversarial attacks on sensory inputs. Robustifying agent policies requires anticipating the strongest attacks possible. We demonstrate that existing observation-space attacks on reinforcement learning agents have a common weakness: while effective, their lack of information-theoretic detectability constraints makes them detect…

    Submitted 6 May, 2024; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: ICLR 2024 Spotlight (top 5%)

  7. arXiv:2107.09598

    cs.AI cs.LG cs.MA

    Learning Altruistic Behaviours in Reinforcement Learning without External Rewards

    Authors: Tim Franzmeyer, Mateusz Malinowski, João F. Henriques

    Abstract: Can artificial agents learn to assist others in achieving their goals without knowing what those goals are? Generic reinforcement learning agents could be trained to behave altruistically towards others by rewarding them for altruistic behaviour, i.e., rewarding them for benefiting other agents in a given situation. Such an approach assumes that other agents' goals are known so that the altruistic…

    Submitted 21 March, 2022; v1 submitted 20 July, 2021; originally announced July 2021.

    Comments: ICLR 2022 Spotlight Presentation