Showing 1–5 of 5 results for author: Thoma, V

  1. arXiv:2407.10207  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Learning to Steer Markovian Agents under Model Uncertainty

    Authors: Jiawei Huang, Vinzenz Thoma, Zebang Shen, Heinrich H. Nax, Niao He

    Abstract: Designing incentives for an adapting population is a ubiquitous problem in a wide array of economic applications and beyond. In this work, we study how to design additional rewards to steer multi-agent systems towards desired policies \emph{without} prior knowledge of the agents' underlying learning dynamics. We introduce a model-based non-episodic Reinforcement Learning (RL) formulation for our s…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 33 pages

  2. arXiv:2406.01575  [pdf, other]

    math.OC cs.AI cs.LG stat.ML

    Stochastic Bilevel Optimization with Lower-Level Contextual Markov Decision Processes

    Authors: Vinzenz Thoma, Barna Pasztor, Andreas Krause, Giorgia Ramponi, Yifan Hu

    Abstract: In various applications, the optimal policy in a strategic decision-making problem depends both on the environmental configuration and exogenous events. For these settings, we introduce Bilevel Optimization with Contextual Markov Decision Processes (BO-CMDP), a stochastic bilevel decision-making model, where the lower level consists of solving a contextual Markov Decision Process (CMDP). BO-CMDP c…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 54 pages, 18 figures

  3. arXiv:2402.08129  [pdf, ps, other]

    cs.GT

    Automated Design of Affine Maximizer Mechanisms in Dynamic Settings

    Authors: Michael Curry, Vinzenz Thoma, Darshan Chakrabarti, Stephen McAleer, Christian Kroer, Tuomas Sandholm, Niao He, Sven Seuken

    Abstract: Dynamic mechanism design is a challenging extension to ordinary mechanism design in which the mechanism designer must make a sequence of decisions over time in the face of possibly untruthful reports of participating agents. Optimizing dynamic mechanisms for welfare is relatively well understood. However, there has been less work on optimizing for other goals (e.g. revenue), and without restrictiv…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: To be published in the Thirty-Eighth Proceedings of the AAAI Conference on Artificial Intelligence 2024

  4. arXiv:2312.13232  [pdf, other]

    cs.GT

    Learning Best Response Policies in Dynamic Auctions via Deep Reinforcement Learning

    Authors: Vinzenz Thoma, Michael Curry, Niao He, Sven Seuken

    Abstract: Many real-world auctions are dynamic processes, in which bidders interact and report information over multiple rounds with the auctioneer. The sequential decision making aspect paired with imperfect information renders analyzing the incentive properties of such auctions much more challenging than in the static case. It is clear that bidders often have incentives for manipulation, but the full scop…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: 14 pages, 4 figures

  5. arXiv:2312.04516  [pdf, other]

    cs.GT

    Computing Perfect Bayesian Equilibria in Sequential Auctions

    Authors: Vinzenz Thoma, Vitor Bosshard, Sven Seuken

    Abstract: We present a best-response based algorithm for computing verifiable $\varepsilon$-perfect Bayesian equilibria for sequential auctions with combinatorial bidding spaces and incomplete information. Previous work has focused only on computing Bayes-Nash equilibria for static single-round auctions, which our work captures as a special case. Additionally, we prove an upper bound $\varepsilon$ on the ut…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: 12 pages, 8 figures