-
Mean field teams and games with correlated types
Authors:
Deepanshu Vasal
Abstract:
Mean field games have traditionally been defined~[1,2] as a model of large scale interaction of players where each player has a private type that is independent across the players. In this paper, we introduce a new model of mean field teams and games with \emph{correlated types} where there is a large population of homogeneous players sequentially making strategic decisions and each player is affected by other players through an aggregate population state. Each player has a private type that only she observes, and the types of any $N$ players are correlated through a kernel $Q$. All players commonly observe a correlated mean-field population state which represents the empirical distribution of any $N$ players' correlated joint types. We define the Mean-Field Team optimal Strategies (MFTO) as strategies of the players that maximize the total expected joint reward of the players. We also define Mean-Field Equilibrium (MFE) in such games as the solution of a coupled Bellman dynamic programming backward equation and Fokker-Planck forward equation of the correlated mean field state, where a player's strategy in an MFE depends on both her private type and the current correlated mean field population state. We present sufficient conditions for the existence of such equilibria. We also present a backward recursive methodology, equivalent to the master equation, to compute all MFTO and MFEs of the team and game, respectively. Each step in this methodology consists of solving an optimization problem for the team problem and a fixed-point equation for the game. We provide sufficient conditions that guarantee the existence of a solution to this fixed-point equation for the game for each time $t$.
Submitted 20 October, 2022;
originally announced October 2022.
-
Master equation of discrete-time Stackelberg mean field games with multiple leaders
Authors:
Deepanshu Vasal
Abstract:
In this paper, we consider a discrete-time Stackelberg mean field game with a finite number of leaders, a finite number of major followers and an infinite number of minor followers. The leaders and the followers each privately observe types that evolve as conditionally independent controlled Markov processes. The leaders are of the "Stackelberg" kind, which means they commit to a dynamic policy. We consider two types of followers: major and minor, each with a private type. All the followers best respond to the policies of the Stackelberg leaders and each other. Knowing that the followers would play a mean field game (with major players) based on her policy, each (Stackelberg) leader chooses a policy that maximizes her reward. We refer to the resulting outcome as a Stackelberg mean field equilibrium with multiple leaders (SMFE-ML). In this paper, we provide a master equation of this game that allows one to compute all SMFE-ML. We further extend this notion to the case when there is an infinite number of leaders.
Submitted 7 September, 2022;
originally announced September 2022.
-
A dynamic program to achieve capacity of multiple access channel with noiseless feedback
Authors:
Deepanshu Vasal
Abstract:
In this paper, we consider the problem of evaluating the capacity expression of a multiple access channel (MAC) with noiseless feedback. So far, the capacity expression for this channel is known through a multi-letter directed information characterization by Kramer [1]. Recently, it was shown in [2] that one can pose it as a dynamic optimization problem; however, no dynamic program was provided, as the authors claimed there is no notion of state that is observed by both the senders. In this paper, we build upon [2] to show that there indeed exists a state and therefore a dynamic program (DP) that decomposes this dynamic optimization problem, and equivalently a Bellman fixed-point equation to evaluate the capacity of this channel. We do so by defining a common belief on private messages and private beliefs of the two senders, and using this common belief as the state of the system. We further show that this DP can be reduced to a DP whose state is the common belief on just the messages. This provides a single-letter characterization of the capacity of this channel.
Submitted 5 February, 2022;
originally announced February 2022.
-
Master Equation for Discrete-Time Stackelberg Mean Field Games with single leader
Authors:
Deepanshu Vasal,
Randall Berry
Abstract:
In this paper, we consider a discrete-time Stackelberg mean field game with a leader and an infinite number of followers. The leader and the followers each privately observe types that evolve as conditionally independent controlled Markov processes. The leader commits to a dynamic policy and the followers best respond to that policy and each other. Knowing that the followers would play a mean field game based on her policy, the leader chooses a policy that maximizes her reward. We refer to the resulting outcome as a Stackelberg mean field equilibrium (SMFE). In this paper, we provide a master equation of this game that allows one to compute all SMFE. Based on our framework, we consider two numerical examples. First, we consider an epidemic model where the followers get infected based on the mean field population. The leader chooses subsidies for a vaccine to maximize social welfare and minimize vaccination costs. In the second example, we consider a technology adoption game where the followers decide whether to adopt a technology or a product and the leader decides the cost of one product that maximizes her returns, which are proportional to the number of people adopting that technology.
Submitted 15 January, 2022;
originally announced January 2022.
-
A dynamic program for linear sequential coding for Gaussian MAC with noisy feedback
Authors:
Deepanshu Vasal
Abstract:
In this paper, we consider a two-user multiple access channel with noisy feedback. There are two senders with independent messages who transmit symbols across an additive white Gaussian channel to a receiver, who in turn sends back a symbol which is received by the two senders through two independent noisy Gaussian channels. We consider the case when the feedback is active, i.e., the receiver actively encodes the feedback using a linear state process. We pose this as a problem of linear sequential coding at the senders and the receiver to minimize the terminal mean square probability of error at the receiver. This is an instance of decentralized control with no common information at the senders and the receiver. In this paper, we construct two linear controllers at the sender and the receiver. Due to the linearity of the policies and the controllers, all the random variables involved are jointly Gaussian. Moreover, the corresponding covariance matrix at the receiver of the estimation process of the senders' messages is a deterministic process, which is a function of the parameters of the controllers and the strategies of the players, and is thus perfectly observed by the senders. Based on this observation, we use deterministic dynamic programming to find the optimal policies and the optimal linear controllers at both the senders and the receiver. The problem with passive feedback can be considered as a special case.
Submitted 17 December, 2021;
originally announced December 2021.
-
Convergence of Generalized Belief Propagation Algorithm on Graphs with Motifs
Authors:
Yitao Chen,
Deepanshu Vasal
Abstract:
Belief propagation is a fundamental message-passing algorithm for numerous applications in machine learning. It is known that the belief propagation algorithm is exact on tree graphs. However, belief propagation is run on loopy graphs in most applications, so understanding its behavior on loopy graphs has been a major topic for researchers in different areas. In this paper, we study the convergence behavior of the generalized belief propagation algorithm on graphs with motifs (triangles, loops, etc.). We show that, under a certain initialization, generalized belief propagation converges to the global optimum of the Bethe free energy for ferromagnetic Ising models on graphs with motifs.
Submitted 11 December, 2021;
originally announced December 2021.
-
A Phase Transition in Large Network Games
Authors:
Abhishek Shende,
Deepanshu Vasal,
Sriram Vishwanath
Abstract:
In this paper, we use a model of a large random network game, where the agents play selfishly and are affected by their neighbors, to explore the conditions under which the Nash equilibrium (NE) of the game is affected by a perturbation in the network. We use a phase transition phenomenon observed in finite rank deformations of large random matrices to study how the NE changes on crossing critical threshold points. Our main contribution is as follows: when the perturbation strength is greater than a critical point, it impacts the NE of the game, whereas when the perturbation strength is below this critical point, the NE remains independent of the perturbation parameter. This demonstrates a phase transition in the NE, which suggests that perturbations can affect the behavior of the society only if their strength is above a critical threshold. We provide numerical examples for this result and present scenarios under which this phenomenon could potentially occur in real world applications.
Submitted 18 May, 2021;
originally announced May 2021.
-
Linear Coding for AWGN channels with Noisy Output Feedback via Dynamic Programming
Authors:
Rajesh Mishra,
Deepanshu Vasal,
Hyeji Kim
Abstract:
The optimal coding scheme for communicating a Gaussian message over an Additive White Gaussian Noise (AWGN) channel with AWGN output feedback and a limited number of transmissions is unknown. Even if we restrict the scope of the coding scheme to linear schemes, deriving the optimal coding scheme is still a challenging task. The state-of-the-art linear scheme for channels with noisy feedback is by Chance and Love, where the coefficients of the linear scheme are numerically optimized based on unique observations [1]. In this paper, we introduce a new class of sequential linear schemes for this channel by introducing a novel linear state process at the transmitter and derive the optimal sequential scheme within this class of schemes in closed form by formulating a novel Dynamic Program (DP). We empirically show that our scheme outperforms the state-of-the-art linear scheme in [1] for noisy feedback and coincides with the SK scheme for noiseless feedback. We also show that in communicating message bits as opposed to a Gaussian message, a learning-based approach further improves the reliability of sequential linear schemes. This problem is an instance of decentralized control without any common information and, to the best of our knowledge, the first such scenario where we can derive analytical solutions using a DP.
Submitted 23 May, 2022; v1 submitted 17 March, 2021;
originally announced March 2021.
-
Network Design for Social Welfare
Authors:
Abhishek Shende,
Deepanshu Vasal,
Sriram Vishwanath
Abstract:
In this paper, we consider the problem of network design on network games. We study the conditions on the adjacency matrix of the underlying network to design a game such that the Nash equilibrium coincides with the social optimum. We provide examples of linear quadratic games that satisfy this condition. Furthermore, we identify conditions on properties of the adjacency matrix that provide a unique solution using a variational inequality formulation, and verify the robustness and continuity of the social cost under perturbations of the network. Finally, we comment on individual rationality and the extension of our results to large random networked games.
Submitted 10 September, 2022; v1 submitted 23 December, 2020;
originally announced December 2020.
-
Multi-Agent Decentralized Belief Propagation on Graphs
Authors:
Yitao Chen,
Deepanshu Vasal
Abstract:
We consider the problem of interactive partially observable Markov decision processes (I-POMDPs), where the agents are located at the nodes of a communication network. Specifically, we assume a certain message type for all messages. Moreover, each agent makes individual decisions based on the interactive belief states, the information observed locally and the messages received from its neighbors over the network. Within this setting, the collective goal of the agents is to maximize the globally averaged return over the network by exchanging information with their neighbors. We propose a decentralized belief propagation algorithm for the problem and prove its convergence. Finally, we show multiple applications of our framework. Our work appears to be the first study of a decentralized belief propagation algorithm for networked multi-agent I-POMDPs.
Submitted 9 November, 2020; v1 submitted 6 November, 2020;
originally announced November 2020.
-
Model-free Reinforcement Learning for Stochastic Stackelberg Security Games
Authors:
Rajesh K Mishra,
Deepanshu Vasal,
Sriram Vishwanath
Abstract:
In this paper, we consider a sequential stochastic Stackelberg game with two players, a leader and a follower. The follower has access to the state of the system while the leader does not. Assuming that the players act in their respective best interests, the follower's strategy is to play the best response to the leader's strategy. In such a scenario, the leader has the advantage of committing to a policy which maximizes its own returns given the knowledge that the follower is going to play the best response to its policy. Thus, both players converge to a pair of policies that form the Stackelberg equilibrium of the game. Recently,~[1] provided a sequential decomposition algorithm to compute the Stackelberg equilibrium for such games, which allows for the computation of Markovian equilibrium policies in linear time as opposed to double exponential, as before. In this paper, we extend the idea to an MDP whose dynamics are not known to the players, to propose an RL algorithm based on Expected Sarsa that learns the Stackelberg equilibrium policy by simulating a model of the MDP. We use particle filters to estimate the belief update for a common agent which computes the optimal policy based on the information which is common to both the players. We present a security game example to illustrate the policy learned by our algorithm.
Submitted 24 May, 2020;
originally announced May 2020.
-
Dynamic information design
Authors:
Deepanshu Vasal
Abstract:
We consider the problem of dynamic information design with one sender and one receiver where the sender observes a private state of the system and takes an action to send a signal based on its observation to a receiver. Based on this signal, the receiver takes an action that determines rewards for both the sender and the receiver and controls the state of the system. In this technical note, we show that this problem can be considered as a dynamic game of asymmetric information and that its perfect Bayesian equilibrium (PBE) and Stackelberg equilibrium (SE) can be analyzed using the algorithms presented in [1], [2] by the same author (among others). We then extend this model to the case of one sender and multiple receivers and provide algorithms to compute a class of equilibria of this game.
Submitted 13 May, 2020;
originally announced May 2020.
-
Fault Tolerant Equilibria in Anonymous Games: best response correspondences and fixed points
Authors:
Deepanshu Vasal,
Randall Berry
Abstract:
The notion of fault tolerant Nash equilibria has been introduced as a way of studying the robustness of Nash equilibria. Under this notion, a fixed number of players are allowed to exhibit faulty behavior in which they may deviate arbitrarily from an equilibrium strategy. A Nash equilibrium in a game with $N$ players is said to be $α$-tolerant if no non-faulty user wants to deviate from an equilibrium strategy as long as $N-α-1$ other players are playing the equilibrium strategies, i.e., it is robust to deviations from rationality by $α$ faulty players. In prior work, $α$-tolerance has been largely viewed as a property of a given Nash equilibrium. Here, we instead follow Nash's approach for showing the existence of equilibria, namely, through the use of best response correspondences and fixed-point arguments. In this manner, we provide sufficient conditions for the existence of an $α$-tolerant equilibrium. This involves first defining an $α$-tolerant best response correspondence. Given a strategy profile of non-faulty agents, this correspondence contains strategies for a non-faulty player that are a best response given any strategy profile of the faulty players. We prove that if this correspondence is non-empty, then it is upper-hemi-continuous. This enables us to apply Kakutani's fixed-point theorem and argue that if this correspondence is non-empty for every strategy profile of the non-faulty players, then there exists an $α$-tolerant equilibrium. However, we also illustrate by examples that in many games this best response correspondence will be empty for some strategy profiles even though $α$-tolerant equilibria still exist.
Submitted 12 May, 2022; v1 submitted 14 May, 2020;
originally announced May 2020.
-
Existence of structured perfect Bayesian equilibrium in dynamic games of asymmetric information
Authors:
Deepanshu Vasal
Abstract:
In~[1], the authors considered a general finite horizon model of a dynamic game of asymmetric information, where $N$ players have types evolving as independent Markovian processes and each player perfectly observes its own type and the actions of all players. The authors present a sequential decomposition algorithm to find all structured perfect Bayesian equilibria of the game. The algorithm consists of solving a class of fixed-point equations for each time $t,π_t$, the existence of whose solutions was left as an open question. In this paper, we prove existence of solutions to these fixed-point equations for compact metric spaces.
Submitted 29 May, 2020; v1 submitted 12 May, 2020;
originally announced May 2020.
-
Sequential decomposition of stochastic Stackelberg games
Authors:
Deepanshu Vasal
Abstract:
In this paper, we consider a discrete-time stochastic Stackelberg game with a single leader and multiple followers. The followers and the leader have conditionally independent private types, conditioned on action and previous state, that evolve as controlled Markov processes. The objective is to compute the stochastic Stackelberg equilibrium of the game where the leader commits to a dynamic strategy. Each follower's strategy is the best response to the leader's strategy and the other followers' strategies, while the leader's strategy is optimal given that the followers play the best response. In general, computing such an equilibrium involves solving a fixed-point equation for the whole game. In this paper, we present a backward recursive algorithm that computes such strategies by solving smaller fixed-point equations for each time $t$. Based on this algorithm, we compute the stochastic Stackelberg equilibrium of a security example and a dynamic information design example used in~\cite{El17} (beeps).
Submitted 19 September, 2022; v1 submitted 5 May, 2020;
originally announced May 2020.
-
Mechanism Design for Large Scale Network Utility Maximization
Authors:
Meng Zhang,
Deepanshu Vasal
Abstract:
Network utility maximization (NUM) is a general framework for designing distributed optimization algorithms for large-scale networks. An economic challenge arises in the presence of strategic agents' private information. Existing studies proposed (economic) mechanisms but largely neglected the issue of large-scale implementation. Specifically, they require certain modifications to the deployed algorithms, which may incur significant cost. To tackle this challenge, we present the Large-Scale Vickrey-Clarke-Groves (VCG) Mechanism for NUM, with a simpler payment rule characterized by the shadow prices. The Large-Scale VCG Mechanism maximizes the network utility and achieves individual rationality and budget balance. With infinitely many agents, agents' truthful reports of their types are their dominant strategies; for the finite case, each agent's incentive to misreport converges quadratically to zero. For practical implementation, we introduce a modified mechanism that possesses an additional important technical property, superimposability, which allows it to be built upon any (potentially distributed) algorithm that optimally solves the NUM Problem and ensures that all agents obey the algorithm. We then extend this idea to the dynamic case, when agents' types evolve dynamically as a controlled Markov process. In this case, the mechanism leads to incentive compatible actions of agents for each time slot.
Submitted 12 January, 2021; v1 submitted 9 March, 2020;
originally announced March 2020.
-
Sequential decomposition of discrete memoryless channel with noisy feedback
Authors:
Deepanshu Vasal
Abstract:
In this paper, we consider a discrete memoryless point-to-point channel with noisy feedback, where there is a sender with a private message that she wants to communicate to a receiver by sequentially transmitting symbols over a noisy channel. After each transmission, she receives noisy feedback of the symbol received by the receiver. The goal is to design a transmission control strategy for the sender that minimizes the average probability of error. This is an instance of decentralized control of information where the two controllers, the sender and the receiver, have no common information. There exists no methodology in the literature that provides a notion of "state" and a dynamic program to find optimal policies for this problem. In this paper, we introduce a notion of state, based on which we provide a sequential decomposition methodology that finds optimum policies within the class of Markov strategies with respect to this state (which need not be globally optimum). This allows one to decompose the problem across time and reduce the dependence of the complexity on time from double exponential to linear.
Submitted 21 February, 2020;
originally announced February 2020.
-
Master equation of discrete time graphon mean field games and teams
Authors:
Deepanshu Vasal,
Rajesh K Mishra,
Sriram Vishwanath
Abstract:
In this paper, we present a sequential decomposition algorithm, equivalent to the master equation, to compute graphon mean field equilibria (GMFE) of graphon mean field games (GMFGs) and graphon optimal Markovian policies (GOMPs) of graphon mean field teams (GMFTs). We consider a large population of players sequentially making strategic decisions where the actions of each player affect their neighbors, which is captured in a graph generated by a known graphon. Each player observes a private state and also common information in the form of a graphon mean-field population state, which represents the empirical networked distribution of other players' types. We consider non-stationary population state dynamics and present a novel backward recursive algorithm to compute both GMFE and GOMP that depend on both a player's private type and the current (dynamic) population state determined through the graphon. Each step in computing GMFE consists of solving a fixed-point equation, while computing GOMP involves solving an optimization problem. We provide conditions on model parameters for which such a GMFE exists. Using this algorithm, we obtain the GMFE and GOMP for a specific security setup in cyber physical systems for different graphons that capture the interactions between the nodes in the system.
Submitted 7 June, 2022; v1 submitted 15 January, 2020;
originally announced January 2020.
-
Markov perfect equilibria in non-stationary mean-field games
Authors:
Deepanshu Vasal
Abstract:
In this paper, we consider both finite and infinite horizon discounted dynamic mean-field games where there is a large population of homogeneous players sequentially making strategic decisions and each player is affected by other players through an aggregate population state. Each player has a private type that only she observes. Such games have been studied in the literature under the simplifying assumption that population state dynamics are stationary. In this paper, we consider non-stationary population state dynamics and present a novel backward recursive algorithm to compute Markov perfect equilibria (MPE) that depend on both a player's private type and the current (dynamic) population state. Using this algorithm, we study a security problem in a cyber-physical system where infected nodes impose a negative externality on the system, and each node decides whether to get vaccinated. We numerically compute the MPE of the game.
Submitted 21 October, 2019; v1 submitted 10 May, 2019;
originally announced May 2019.
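
The forward half of such a computation, the evolution of the population state under a given type-dependent policy, can be sketched as follows. This is a minimal illustration and not the paper's algorithm: the two-type model, the function names `next_population_state`, `toy_kernel`, and `toy_policy`, and all numerical values are assumptions made for this example, and the threshold policy is chosen by hand rather than computed as an equilibrium.

```python
def next_population_state(z, policy, kernel):
    """One forward step z_t -> z_{t+1} of the population state.

    z:      list, z[x] = fraction of players whose private type is x
    policy: policy(x, z) -> list of action probabilities for type x
    kernel: kernel(x, a, z) -> list of next-type probabilities
    """
    n = len(z)
    z_next = [0.0] * n
    for x in range(n):
        for a, prob_a in enumerate(policy(x, z)):
            row = kernel(x, a, z)
            for y in range(n):
                z_next[y] += z[x] * prob_a * row[y]
    return z_next


# Hypothetical toy model: type 0 = healthy, type 1 = infected.
# Action 0 = do nothing, action 1 = get vaccinated.
def toy_kernel(x, a, z):
    if a == 1:
        return [1.0, 0.0]            # vaccination makes the node healthy
    if x == 0:
        p = 0.5 * z[1]               # infection risk grows with infected share
        return [1.0 - p, p]
    return [0.0, 1.0]                # an untreated infected node stays infected


def toy_policy(x, z):
    # Assumed threshold rule (not an equilibrium): infected nodes vaccinate
    # only when more than 30% of the population is infected.
    if x == 1 and z[1] > 0.3:
        return [0.0, 1.0]
    return [1.0, 0.0]


z = [0.6, 0.4]
trajectory = [z]
for _ in range(5):
    z = next_population_state(z, toy_policy, toy_kernel)
    trajectory.append(z)
```

Because the policy takes the current population state `z` as an argument, the trajectory is non-stationary by construction. A backward recursion would have to supply a policy for every value of `z`, which is the part this sketch leaves out.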
-
Incentive design for learning in user-recommendation systems with time-varying states
Authors:
Deepanshu Vasal,
Vijay Subramanian,
Achilleas Anastasopoulos
Abstract:
We consider the problem of how strategic users with asymmetric information can learn an underlying time-varying state in a user-recommendation system. Users, who observe private signals about the state, sequentially decide whether to buy a product whose value varies with time in an ergodic manner. We formulate the team problem as an instance of a decentralized stochastic control problem and characterize its optimal policies. With strategic users, we design incentives such that users reveal their true private signals, so that the gap between the strategic and team objectives is small and the overall expected incentive payments are also small.
Submitted 13 April, 2018;
originally announced April 2018.
-
Sequential decomposition of repeated games with asymmetric information and dependent states
Authors:
Deepanshu Vasal
Abstract:
We consider a finite horizon repeated game with $N$ selfish players who observe their types privately and take actions, which are publicly observed. Their actions and types jointly determine their instantaneous rewards. In each period, players jointly observe actions of each other with delay 1, and private observations of the state of the system, and get an instantaneous reward which is a function of the state and everyone's actions. The players' types are static and are potentially correlated among players.
An appropriate notion of equilibrium for such games is Perfect Bayesian Equilibrium (PBE), which consists of a strategy profile and a belief profile of the players that are coupled across time; as a result, the complexity of finding such equilibria grows double-exponentially in time. We present a sequential decomposition methodology to compute \emph{structured perfect Bayesian equilibria} (SPBE) of this game, introduced in~\cite{VaAn15arxiv}, where the equilibrium policy of a player is a function of a common belief and a private state. This methodology computes SPBE in linear time. In general, the SPBE of the game exhibit \textit{signaling} behavior, i.e., players' actions reveal part of their private information that is payoff-relevant to other players.
Submitted 16 May, 2019; v1 submitted 10 January, 2018;
originally announced January 2018.
-
Decentralized Bayesian learning in dynamic games: A framework for studying informational cascades
Authors:
Deepanshu Vasal,
Achilleas Anastasopoulos
Abstract:
We study the problem of Bayesian learning in a dynamical system involving strategic agents with asymmetric information. In a series of seminal papers in the literature, this problem has been investigated under a simplifying model where myopically selfish players appear sequentially and act once in the game, based on private noisy observations of the system state and public observation of past players' actions. It has been shown that there exist information cascades where users discard their private information and mimic the action of their predecessor. In this paper, we provide a framework for studying Bayesian learning dynamics in a more general setting than the one described above. In particular, our model incorporates cases where players are non-myopic and strategically participate for the whole duration of the game, and cases where an endogenous process selects which subset of players will act at each time instance. The proposed framework hinges on a sequential decomposition methodology for finding structured perfect Bayesian equilibria (PBE) of a general class of dynamic games with asymmetric information, where user-specific states evolve as conditionally independent Markov processes and users make independent noisy observations of their states. Using this methodology, we study a specific dynamic learning model where players make decisions about public investment based on their estimates of everyone's types. We characterize a set of informational cascades for this problem where learning stops for the team as a whole. We show that in such cascades, all players' estimates of other players' types freeze even though each individual player asymptotically learns its own true type.
Submitted 8 April, 2018; v1 submitted 22 July, 2016;
originally announced July 2016.
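
The simplifying model mentioned in this abstract, where myopic players act once in sequence, can be simulated in a few lines. The sketch below uses the textbook binary version of that model, not the general framework of the paper: a binary state, binary private signals of accuracy `q`, and ties broken in favor of a player's own signal are all assumptions of this example, and `simulate_cascade` is a hypothetical name.

```python
import random


def simulate_cascade(theta, q, n_players, seed=0):
    """Sequential myopic learning with binary state theta and signal accuracy q.

    Returns the list of actions and the index of the first player who acts
    inside a cascade (None if no cascade starts).
    """
    rng = random.Random(seed)
    # d = (# publicly inferred 1-signals) - (# publicly inferred 0-signals).
    # The public log-likelihood ratio equals d * log(q / (1 - q)).
    d = 0
    actions = []
    cascade_start = None
    for t in range(n_players):
        signal = theta if rng.random() < q else 1 - theta
        if abs(d) >= 2:
            # Public belief outweighs any single signal: the player ignores
            # her signal, so her action reveals nothing and d stays frozen.
            action = 1 if d > 0 else 0
            if cascade_start is None:
                cascade_start = t
        else:
            # The posterior follows the player's own signal (ties go to the
            # signal), so the action reveals the signal to later players.
            action = signal
            d += 1 if signal == 1 else -1
        actions.append(action)
    return actions, cascade_start


actions, start = simulate_cascade(theta=1, q=0.7, n_players=50)
```

Once `abs(d)` reaches 2 it never changes again, which is the sense in which public learning stops in a cascade, whether or not the herd has settled on the correct action.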
-
Signaling equilibria for dynamic LQG games with asymmetric information
Authors:
Deepanshu Vasal,
Achilleas Anastasopoulos
Abstract:
We consider a finite horizon dynamic game with two players who observe their types privately and take actions, which are publicly observed. Players' types evolve as independent, controlled linear Gaussian processes and players incur quadratic instantaneous costs. This forms a dynamic linear quadratic Gaussian (LQG) game with asymmetric information. We show that under certain conditions, players' strategies that are linear in their private types, together with Gaussian beliefs form a perfect Bayesian equilibrium (PBE) of the game. Furthermore, it is shown that this is a signaling equilibrium due to the fact that future beliefs on players' types are affected by the equilibrium strategies. We provide a backward-forward algorithm to find the PBE. Each step of the backward algorithm reduces to solving an algebraic matrix equation for every possible realization of the state estimate covariance matrix. The forward algorithm consists of Kalman filter recursions, where state estimate covariance matrices depend on equilibrium strategies.
Submitted 15 June, 2016;
originally announced June 2016.
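
The dependence of the state estimate covariance on the strategy can be illustrated with a scalar Kalman covariance recursion. This is a hedged sketch, not the paper's forward algorithm: the scalar model, the observation noise `r` on the public action, the function name `covariance_step`, and the default parameter values are all assumptions introduced for this example.

```python
def covariance_step(sigma, gain, a=1.0, q=0.1, r=0.05):
    """One scalar Kalman covariance recursion.

    Assumed (hypothetical) model:
      type dynamics   x_{t+1} = a * x_t + w_t,   Var(w_t) = q
      public action   u_t = gain * x_t + v_t,    Var(v_t) = r > 0
    sigma is the variance of the public estimate of x_t before u_t is seen.
    """
    # Measurement update: observing the action shrinks the variance.
    innovation_var = gain * gain * sigma + r
    sigma_post = sigma - (gain * sigma) ** 2 / innovation_var
    # Time update: propagate through the linear type dynamics.
    return a * a * sigma_post + q


# The same prior variance leads to different next-period variances
# depending on how strongly the strategy loads on the private type.
uninformative = covariance_step(1.0, gain=0.0)
weak_signal = covariance_step(1.0, gain=0.5)
strong_signal = covariance_step(1.0, gain=2.0)
```

A strategy with zero gain reveals nothing, so the variance only grows through the process noise, while a larger gain makes the action more informative about the type. This is the sense in which beliefs depend on strategies in a signaling equilibrium.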
-
A systematic process for evaluating structured perfect Bayesian equilibria in dynamic games with asymmetric information
Authors:
Deepanshu Vasal,
Abhinav Sinha,
Achilleas Anastasopoulos
Abstract:
We consider finite-horizon and infinite-horizon versions of a dynamic game with $N$ selfish players who observe their types privately and take actions that are publicly observed. Players' types evolve as conditionally independent Markov processes, conditioned on their current actions. Their actions and types jointly determine their instantaneous rewards. In dynamic games with asymmetric information, a widely used concept of equilibrium is perfect Bayesian equilibrium (PBE), which consists of a strategy and belief pair that simultaneously satisfy sequential rationality and belief consistency. In general, there does not exist a universal algorithm that decouples the interdependence of strategies and beliefs over time in calculating PBE. In this paper, for the finite-horizon game with independent types, we develop a two-step backward-forward recursive algorithm that sequentially decomposes the problem (w.r.t. time) to obtain a subset of PBEs, which we refer to as structured perfect Bayesian equilibria (SPBE). In such equilibria, a player's strategy depends on its history only through a common public belief and its current private type. The backward recursive part of this algorithm defines an equilibrium generating function. Each period in the backward recursion involves solving a fixed-point equation on the space of probability simplexes for every possible belief on types. Using this function, equilibrium strategies and beliefs are generated through a forward recursion. We then extend this methodology to the infinite-horizon model, where we propose a time-invariant single-shot fixed-point equation, which, in conjunction with a forward recursive step, generates the SPBE. Sufficient conditions for the existence of SPBE are provided. With our proposed method, we find equilibria that exhibit signaling behavior. This is illustrated with the help of a concrete public goods example.
Submitted 18 March, 2018; v1 submitted 25 August, 2015;
originally announced August 2015.
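
The forward recursion on the common public belief can be sketched for a single player's type as a Bayes update followed by a Markov transition. This is a minimal illustration under assumptions of this example, not the paper's exact update: the function name `update_belief`, the data layout, and the choice to keep the prior after a zero-probability action are all hypothetical.

```python
def update_belief(belief, strategy, action, transition):
    """Update the public belief on one player's type after seeing her action.

    belief[x]            = public probability that the current type is x
    strategy[x][a]       = probability that type x plays action a
    transition[x][a][y]  = probability that the type moves from x to y
                           when action a is played
    """
    n = len(belief)
    weights = [belief[x] * strategy[x][action] for x in range(n)]
    total = sum(weights)
    if total == 0.0:
        # Off-path action: Bayes' rule is undefined. Keeping the prior here
        # is an assumption of this sketch.
        posterior = list(belief)
    else:
        posterior = [w / total for w in weights]
    # Propagate the posterior through the controlled Markov type dynamics.
    return [
        sum(posterior[x] * transition[x][action][y] for x in range(n))
        for y in range(n)
    ]


# Two static types (identity transition) and two actions.
identity = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]]
prior = [0.5, 0.5]

separating = [[1.0, 0.0], [0.0, 1.0]]   # each type plays its own action
pooling = [[1.0, 0.0], [1.0, 0.0]]      # both types play action 0
partial = [[0.9, 0.1], [0.2, 0.8]]      # actions are noisy signals of type

after_separating = update_belief(prior, separating, 1, identity)
after_pooling = update_belief(prior, pooling, 0, identity)
after_partial = update_belief(prior, partial, 1, identity)
```

Under the separating strategy the action fully reveals the type, under the pooling strategy the belief does not move, and the mixed strategy lies in between. Signaling refers to this dependence of the belief update on the strategy.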