
Initial Error Affection and Error Correction in Linear Quadratic Mean Field Games under Erroneous Initial Information

Yuxin Jin, Beihang University
Lu Ren, Beihang University
Wang Yao, Beihang University
Xiao Zhang, Beihang University

Abstract

In this paper, we investigate the effect of initial errors and their correction in linear quadratic mean field games (LQMFGs) under erroneous initial distribution information. First, an LQMFG model is developed in which agents are coupled through both dynamics and cost functions. Next, by studying the evolution of LQMFGs under erroneous initial distribution information, we characterize the effect of the initial error on the game and on agents’ strategies. Furthermore, in the deterministic case, we provide a sufficient condition for agents to correct the initial error and give their optimal strategies when agents are allowed to change their strategies at an intermediate time. In addition, we consider the situation where agents are allowed to predict the mean field (MF) and adjust their strategies in real time. Finally, simulations are performed to verify these conclusions.

keywords:

Mean Field Game, Optimal Control, Error Correction

1 Introduction

Mean Field Games (MFGs), first introduced in 2007 by Huang et al. in a series of works [1]-[3] and Lasry $\&$ Lions in [4]-[6], represent a framework for analyzing strategic interactions among a large number of homogeneous players. Several studies, including [8]-[11], have delved into Linear Quadratic Mean Field Games (LQMFGs) under various settings. In these investigations, the cost functional assumes a quadratic form with respect to all state variables, control variables, and the mean field terms. Concurrently, the controlled dynamics are linear and incorporate mean field terms, reflecting the influence of the aggregate behavior on individual player dynamics. This setup provides a tractable yet rich framework for analyzing complex strategic interactions in large-scale systems.

MFGs have garnered significant attention and found widespread applications across diverse fields, including swarm robotics, industrial engineering, and crowd dynamics [13]-[20]. In some scenarios, agents are required to compute decentralized strategies. However, the massive number of agents makes real-time communication difficult, which prevents agents from accurately observing the group’s state in real time. In [1], decentralized strategies that yield Nash equilibria were provided. For LQMFGs, the equilibrium can usually be described mathematically as a system of equations with a forward-backward structure. When a unique mean field equilibrium exists, an agent that knows the initial mean field state and the system parameters can predict the mean field state (MF-S) and mean field control (MF-C) by solving this system. Hence, at the initial moment, players can predict the MF-S and MF-C and derive optimal feedback control laws that depend on their real-time states.

However, in application, there may be some errors in the initial information about the mean field terms obtained by agents. A series of questions that arise from this problem are: In LQMFGs, when agents have different erroneous observations on the initial mean field states, how does erroneous initial information affect the game? Is it possible to detect and correct errors according to the information of its own state for an agent? How will the game evolve if agents are allowed to modify their strategies?

Motivated by these questions, this paper studies the effect of initial errors, error correction, and strategy modification in LQMFGs under erroneous initial distribution information. We assume that each agent observes its own state and control without error and that the system parameters are accessible. At the initial moment, agents obtain different erroneous initial mean field states, predict the mean field terms (MF), and give their feedback control laws for the whole time horizon based on this information.

In Section 2, we present the LQMFG model in which agents are coupled through cost functions and dynamic equations, and we describe mathematically the mean field equilibrium and agents’ strategies under correct information. In Section 3, we state our assumptions, study LQMFGs under erroneous initial distribution information, and discuss the effect of the initial error. In Section 4, we discuss error correction and strategy modification in the deterministic case, where agents are allowed to modify their strategies at an intermediate moment without observing the mean field term. In Section 5, we consider the situation where agents are allowed to predict the MF and adjust their strategies in real time. Finally, we conduct simulations to verify our conclusions. The main contributions of this paper can be summarized as follows:

• We build a mathematical model of LQMFGs under erroneous initial distribution information and give a mathematical description of the corresponding strategies of agents.

• We study the effect of the initial error on the game. Four linear relationships between the deviations and the initial error, all computable by every agent, are given.

• In the deterministic case, we provide a sufficient condition for agents to correct the initial error and give their optimal strategies when agents are allowed to change their strategies at an intermediate moment. A calculation method for the new strategies and an error analysis for the new equilibrium are given.

• We consider the situation where agents are allowed to predict the MF and adjust their strategies in real time, and we analyze the effect of the estimation error on the results.

2 LQMFG under Correct Information

In this section, we introduce our LQMFG model, in which a large number of agents are coupled through the mean field term in both the dynamics and the cost functions. We consider a stochastic game with $N$ agents, $\mathcal{A}_{i},1\leq i\leq N$ .

2.1 Dynamics and Cost Functions

Let $(\Omega,\mathcal{F},\mathbb{P})$ be a complete probability space and $T>0$ . Suppose that $W_{i},1\leq i\leq N$ are $N$ independent $n$-dimensional standard Wiener processes defined on $(\Omega,\mathcal{F},\mathbb{P})$ , and $x_{0}^{i},1\leq i\leq N$ are $N$ independent, identically distributed $n$-dimensional random vectors. We assume that $x_{0}^{i}$ is independent of $(W_{1},...,W_{N})$ for $1\leq i\leq N$ . Let $(x_{i}(t))_{0\leq t\leq T}$ be the state of $\mathcal{A}_{i}$ , with initial state $x_{0}^{i}$ .

The dynamics of $\mathcal{A}_{i}$ are given by

\begin{split}\left\{\begin{array}{l}dx_{i}(t)=[Ax_{i}(t)+Bu_{i}(t)+Cx^{(N)}(t)+Fu^{(N)}(t)]dt+DdW_{i}(t),\\ x_{i}(0)=x_{0}^{i}.\end{array}\right.\end{split}

(1)

where $A,B,C,F,D$ are matrices of suitable sizes, and $x^{(N)}(t):=\frac{1}{N}\sum_{j=1}^{N}x_{j}(t)$ and $u^{(N)}(t):=\frac{1}{N}\sum_{j=1}^{N}u_{j}(t)$ are the mean field state (MF-S) and mean field control (MF-C), respectively. The control $u_{i}(t)$ is in $L^{2}_{\mathcal{G}}(0,T;\mathbb{R}^{m})$ , the $L^{2}$ -space of stochastic processes adapted to the filtration

\mathcal{G}_{t}:=\sigma((x_{1}(0),...,x_{N}(0),W_{1}(s),...,W_{N}(s)),s\leq t),

with values in $\mathbb{R}^{m}$ .

The cost functional of $\mathcal{A}_{i}$ is given by

\begin{split}&J(u_{i})=\frac{1}{2}\mathbb{E}[\int_{0}^{T}[\|x_{i}(t)-s\|_{Q_{I}}^{2}+\|u_{i}(t)\|_{R}^{2}+\|x_{i}(t)-(\Gamma x^{(N)}(t)+\eta)\|_{Q}^{2}]dt+\|x_{i}(T)-\bar{s}\|_{\bar{Q}_{I}}^{2}\\ &+\|x_{i}(T)-(\bar{\Gamma}x^{(N)}(T)+\bar{\eta})\|_{\bar{Q}}^{2}].\end{split}

(2)

where we define $\|X\|_{Q}^{2}=X^{T}QX$ . $Q_{I},Q,R,\bar{Q}_{I},\bar{Q}$ are positive definite matrices. Let $\Theta$ be the set of parameters of the above systems.

2.2 Optimal Controls

As $N\rightarrow\infty$ , and since the agents are homogeneous and indistinguishable, we can take $\mathcal{A}_{i}$ as a generic agent and define $z(t):=\lim_{N\rightarrow\infty}x^{(N)}(t)$ and $\bar{u}(t):=\lim_{N\rightarrow\infty}u^{(N)}(t)$ .

Problem 2.1 Given continuous deterministic processes $(z(t))_{0\leq t\leq T}$ and $(\bar{u}(t))_{0\leq t\leq T}$ with values in $\mathbb{R}^{n}$ and $\mathbb{R}^{m}$ , respectively, find an optimal control $u_{i}$ in $L^{2}_{\mathcal{F}^{i}}(0,T;\mathbb{R}^{m})$ which minimizes

\begin{split}&J(v)=\frac{1}{2}\mathbb{E}[\int_{0}^{T}[\|x_{i}(t)-s\|_{Q_{I}}^{2}+\|v(t)\|_{R}^{2}+\|x_{i}(t)-(\Gamma z(t)+\eta)\|_{Q}^{2}]dt+\|x_{i}(T)-\bar{s}\|_{\bar{Q}_{I}}^{2}\\ &+\|x_{i}(T)-(\bar{\Gamma}z(T)+\bar{\eta})\|_{\bar{Q}}^{2}],\end{split}

where $\mathcal{F}^{i}_{t}:=\sigma(x_{i}(0),W_{i}(s),s\leq t)$ , $v$ is an admissible control in $L^{2}_{\mathcal{F}^{i}}(0,T;\mathbb{R}^{m})$ , and the dynamics are given by

\begin{split}&dx_{i}(t)=[Ax_{i}(t)+Bv(t)+Cz(t)+F\bar{u}(t)]dt+DdW_{i}(t),\\ &x_{i}(0)=x_{0}^{i}.\end{split}

Theorem 2.1 For given continuous deterministic processes $(z(t))_{0\leq t\leq T}$ and $(\bar{u}(t))_{0\leq t\leq T}$ , $\mathcal{A}_{i}$ has a unique optimal control $(u_{i}(t))_{0\leq t\leq T}$ given by $u_{i}(t)=-R^{-1}B^{T}p_{i}(t)$ , where $(x_{i},y_{i})$ satisfy the stochastic maximum principle relation

\begin{split}&dx_{i}(t)=(Ax_{i}(t)-BR^{-1}B^{T}p_{i}(t)+Cz(t)+F\bar{u}(t))dt+DdW_{i}(t),\\ &x_{i}(0)=x_{0}^{i},\\ &dy_{i}(t)=(-A^{T}y_{i}(t)+Q\Gamma z(t)-(Q_{I}+Q)x_{i}(t)+Q_{I}s+Q\eta)dt,\\ &y_{i}(T)=(\bar{Q}_{I}+\bar{Q})x_{i}(T)-\bar{Q}_{I}\bar{s}-\bar{Q}(\bar{\Gamma}z(T)+\bar{\eta}),\end{split}

(3)

with $p_{i}(t)=\mathbb{E}[y_{i}(t)|\mathcal{F}_{t}^{i}]$ .

Proof: We give a proof of this theorem based on [c1].

Remark 2.1 The optimal control $u_{i}$ has the feedback representation $u_{i}(t)=-R^{-1}B^{T}(P_{1}(t)x_{i}(t)+g(t))$ , where $P_{1}$ satisfies the symmetric Riccati equation

\begin{split}-dP_{1}(t)=&[P_{1}A+A^{T}P_{1}+(Q_{I}+Q)-P_{1}BR^{-1}B^{T}P_{1}]% dt,\\ P_{1}(T)=&\bar{Q}_{I}+\bar{Q}.\end{split}

(4)

and $g(t)$ satisfies

\begin{split}dg(t)=&-[(A^{T}-P_{1}BR^{-1}B^{T})g(t)+(P_{1}C-Q\Gamma)z(t)+P_{1}% F\bar{u}(t)-Q_{I}s-Q\eta]dt,\\ g(T)=&-\bar{Q}_{I}\bar{s}-\bar{Q}(\bar{\Gamma}z(T)+\bar{\eta}).\end{split}

(5)
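To make Remark 2.1 concrete, the following is a minimal numerical sketch for the scalar case $n=m=1$ , in which (4) reduces to $-\dot{P}_{1}=2aP_{1}+(q_{I}+q)-(b^{2}/r)P_{1}^{2}$ with $P_{1}(T)=\bar{q}_{I}+\bar{q}$ . The function names, the RK4 discretization, and all parameter values are illustrative assumptions and are not taken from the paper.

```python
def solve_riccati_p1(a, b, q_total, r, p_terminal, horizon, steps=2000):
    """Integrate the scalar version of (4) backward from t = T to t = 0 with RK4.

    q_total stands for Q_I + Q and p_terminal for bar{Q}_I + bar{Q}.
    Returns a list whose k-th entry approximates P_1(k * horizon / steps).
    """
    h = horizon / steps

    def rhs(p):
        # dP_1/dt for constant scalar coefficients
        return -(2.0 * a * p + q_total - (b * b / r) * p * p)

    values = [0.0] * (steps + 1)
    values[steps] = p_terminal
    for k in range(steps, 0, -1):
        p = values[k]
        # One RK4 step with step size -h, i.e. backward in time
        k1 = rhs(p)
        k2 = rhs(p - 0.5 * h * k1)
        k3 = rhs(p - 0.5 * h * k2)
        k4 = rhs(p - h * k3)
        values[k - 1] = p - (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return values


def feedback_gain(p1, b, r):
    """Scalar state-feedback gain R^{-1} B^T P_1(t) appearing in the feedback law."""
    return b * p1 / r


# Hypothetical parameter values, chosen only for illustration.
P1 = solve_riccati_p1(a=0.5, b=1.0, q_total=2.0, r=1.0, p_terminal=1.0, horizon=10.0)
```

With these assumed values and a long horizon, `P1[0]` is close to the positive root $P=2$ of the algebraic equation $P^{2}-P-2=0$ , which is a convenient sanity check on the backward integration.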

2.3 Mean Field Equilibrium

On the one hand, a Nash equilibrium is reached if and only if each agent’s control is the optimal response to the current mean field term. On the other hand, $z(t)$ is defined as $z(t)=\lim_{N\rightarrow\infty}\frac{1}{N}\sum_{i=1}^{N}x_{i}(t)$ and $\bar{u}(t)$ as $\bar{u}(t)=\lim_{N\rightarrow\infty}\frac{1}{N}\sum_{i=1}^{N}u_{i}(t)$ . Combining these two consistency requirements as $N\rightarrow\infty$ gives the following result.

Theorem 2.2 The equilibrium mean field trajectory $(z(t))_{0\leq t\leq T}$ and mean field control $\bar{u}(t)=-R^{-1}B^{T}p(t)$ satisfy the following equations

\begin{split}&d\begin{pmatrix}z(t)\\ p(t)\end{pmatrix}=\left\{\begin{pmatrix}A+C&-(B+F)R^{-1}B^{T}\\ Q\Gamma-Q_{I}-Q&-A^{T}\end{pmatrix}\begin{pmatrix}z(t)\\ p(t)\end{pmatrix}+\begin{pmatrix}0\\ Q_{I}s+Q\eta\end{pmatrix}\right\}dt,\\ &z(0)=z^{0},\\ &p(T)=(\bar{Q}_{I}+\bar{Q}-\bar{Q}\bar{\Gamma})z(T)-\bar{Q}_{I}\bar{s}-\bar{Q}\bar{\eta}.\end{split}

(6)

Remark 2.2 We notice that $p(t)=P_{0}(t)z(t)+\mathcal{G}(t)$ , where $P_{0}(t)$ satisfies the non-symmetric Riccati equation

\begin{split}-dP_{0}(t)=&\{P_{0}(t)(A+C)+A^{T}P_{0}(t)+(Q_{I}+Q-Q\Gamma)-P_{0}% (t)(B+F)R^{-1}B^{T}P_{0}(t)\}dt,\\ P_{0}(T)=&\bar{Q}_{I}+\bar{Q}-\bar{Q}\bar{\Gamma}.\\ \end{split}

(7)

and $\mathcal{G}(t)$ satisfies the backward ordinary differential equation (BODE)

\begin{split}d\mathcal{G}(t)=&\{-(A^{T}-P_{0}(t)(B+F)R^{-1}B^{T})\mathcal{G}(t)+Q_{I}s+Q\eta\}dt,\\ \mathcal{G}(T)=&-\bar{Q}_{I}\bar{s}-\bar{Q}\bar{\eta}.\end{split}

(8)

According to Remark 2.1, we also have $p(t)=P_{1}(t)z(t)+g(t)$ .

2.4 Closed-Loop Feedback Control

When (7) and (4) have unique solutions, $\mathcal{A}_{i}$ can get its closed-loop equilibrium control law.

At the initial time, $P_{0}$ , $P_{1}$ and $\mathcal{G}$ can be computed by $\mathcal{A}_{i}$ , $\forall i$ . For given $z^{0}$ , agents can compute the mean field state(MF-S) $z(t)$ and mean field control(MF-C) $\bar{u}(t)$ for $0\leq t\leq T$ .

Predict MF-S and MF-C. Substituting $p(t)=P_{0}(t)z(t)+\mathcal{G}(t)$ into (6), $(z(t))_{0\leq t\leq T}$ is obtained as the unique solution of

\begin{split}&dz(t)=[(A+C-(B+F)R^{-1}B^{T}P_{0})z(t)-(B+F)R^{-1}B^{T}\mathcal{% G}]dt\\ &z(0)=z^{0}.\\ \end{split}

(9)

and $\bar{u}(t)$ can be given by $\bar{u}(t)=-R^{-1}B^{T}p(t)=-R^{-1}B^{T}(P_{0}(t)z(t)+\mathcal{G}(t))$ .
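The prediction step can be sketched numerically as follows for the scalar case $n=m=1$ : integrate (7) and (8) backward from $T$ , then (9) forward from $z^{0}$ , and finally evaluate $\bar{u}(t)$ . This is only an illustration under stated assumptions: the dictionary keys, the explicit Euler scheme, and the parameter values are hypothetical and not part of the paper.

```python
def predict_mean_field(par, z0, horizon=1.0, steps=4000):
    """Scalar sketch of (7), (8), (9): returns (z, u_bar, P0, G) on a uniform grid."""
    h = horizon / steps
    a, b, c, f, r = par["a"], par["b"], par["c"], par["f"], par["r"]
    coupling = (b + f) * b / r                      # scalar (B+F) R^{-1} B^T
    q_shift = par["q_i"] + par["q"] - par["q"] * par["gamma"]
    forcing = par["q_i"] * par["s"] + par["q"] * par["eta"]

    P0 = [0.0] * (steps + 1)
    G = [0.0] * (steps + 1)
    P0[steps] = par["qbar_i"] + par["qbar"] - par["qbar"] * par["gamma_bar"]
    G[steps] = -par["qbar_i"] * par["s_bar"] - par["qbar"] * par["eta_bar"]
    for k in range(steps, 0, -1):                   # backward pass for (7) and (8)
        dP0 = -((2.0 * a + c) * P0[k] + q_shift - coupling * P0[k] ** 2)
        dG = -(a - coupling * P0[k]) * G[k] + forcing
        P0[k - 1] = P0[k] - h * dP0
        G[k - 1] = G[k] - h * dG

    z = [0.0] * (steps + 1)
    z[0] = z0
    for k in range(steps):                          # forward pass for (9)
        z[k + 1] = z[k] + h * ((a + c - coupling * P0[k]) * z[k] - coupling * G[k])
    u_bar = [-(b / r) * (P0[k] * z[k] + G[k]) for k in range(steps + 1)]
    return z, u_bar, P0, G


# Hypothetical parameter set, chosen only for illustration.
par = {"a": 0.2, "b": 1.0, "c": 0.1, "f": 0.2, "r": 1.0,
       "q_i": 1.0, "q": 1.0, "gamma": 0.5, "s": 1.0, "eta": 0.2,
       "qbar_i": 1.0, "qbar": 1.0, "gamma_bar": 0.5, "s_bar": 1.0, "eta_bar": 0.2}
z, u_bar, P0, G = predict_mean_field(par, z0=0.5)
```

Because $P_{0}$ and $\mathcal{G}$ do not depend on $z^{0}$ , the backward pass could be cached and shared across different initial mean field states; it is repeated here only to keep the sketch short.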

Feedback Control. Given the computed $(z(t))_{0\leq t\leq T}$ and $(\bar{u}(t))_{0\leq t\leq T}$ , $\mathcal{A}_{i}$ can solve (5) for $(g(t))_{0\leq t\leq T}$ . According to Remark 2.1, the optimal feedback control of $\mathcal{A}_{i}$ is

\begin{split}&u_{i}(t)=\phi(x_{i}(t),t),0\leq t\leq T\\ &\phi(x_{i}(t),t)=-R^{-1}B^{T}(P_{1}(t)x_{i}(t)+g(t)).\\ \end{split}

(10)

We notice that, for given $z$ and $\bar{u}$ , $P_{1}(t)$ and $g(t)$ are deterministic, that is, $\mathbb{E}[P_{1}(t)|\mathcal{F}_{t}^{i}]=\mathbb{E}[P_{1}(t)|\mathcal{F}_{0}^{i}]$ and $\mathbb{E}[g(t)|\mathcal{F}_{t}^{i}]=\mathbb{E}[g(t)|\mathcal{F}_{0}^{i}]$ , so the above optimal feedback control law can be computed at the initial time.

2.5 Existence and Uniqueness of the Equilibrium

In this subsection, we provide a sufficient condition for the existence and uniqueness of the mean field equilibrium, which is equivalent to (6) having a unique solution. In accordance with Theorem 4.1 and Theorem 4.2 in [12], we have the following proposition.

We set $\mathcal{Q}_{p}+\mathcal{S}=Q_{I}+Q-Q\Gamma$ , $\mathcal{\bar{Q}}_{p}+\mathcal{\bar{S}}=\bar{Q}_{I}+\bar{Q}-\bar{Q}\bar{\Gamma}$ , and $\mathcal{C}=(B+F)R^{-1}B^{T}$ , where $\mathcal{Q}_{p},\mathcal{\bar{Q}}_{p}$ are positive definite matrices.

Proposition 2.1 Let $\phi(s,t)$ be the fundamental solution associated with $A$ . Suppose that

\begin{split}(1+\sqrt{T}\|\phi\|_{T}\cdot\|\mathcal{C}\mathcal{Q}_{p}^{-\frac{% 1}{2}}\|)(1+N(S))<2.\end{split}

(11)

where $\|\cdot\|$ stands for the usual Euclidean norm. Then there exists a unique mean field equilibrium. Here,

\begin{split}&\|\phi\|_{T}:=\sup_{0\leq t\leq T}\sqrt{\|\phi^{*}(T,t)\mathcal{\bar{Q}}^{-\frac{1}{2}}\|^{2}+\int_{t}^{T}\|\phi^{*}(s,t)\mathcal{\bar{Q}}^{\frac{1}{2}}\|^{2}ds},\\ &N(S):=\max\{\|\mathcal{\bar{Q}}^{-\frac{1}{2}}\mathcal{\bar{S}}\mathcal{\bar{Q}}^{-\frac{1}{2}}\|,\|\mathcal{Q}^{-\frac{1}{2}}\mathcal{S}\mathcal{Q}^{-\frac{1}{2}}\|\}.\end{split}

2.6 $\epsilon$ -Nash Equilibrium

In this subsection, we shall show that the equilibrium strategy $u_{i}$ of $\mathcal{A}_{i}$ is an $\epsilon$ -Nash equilibrium of the N-player stochastic game. Because of the permutation symmetry, it suffices to consider Player 1.

Theorem 2.3 Feedback control (10) is an $\epsilon$ -Nash equilibrium: for any $\epsilon>0$ , there is a positive integer $N_{0}$ such that when $N\geq N_{0}$ , we have:

\begin{split}&\inf_{v_{1}\in L_{\mathcal{G}}^{2}(0,T;\mathbb{R}^{m})}J(v_{1},u_{2},\ldots,u_{N})\geq J(u_{1},u_{2},\ldots,u_{N})-\epsilon,\\ &J(v,u_{2},\ldots,u_{N}):=\frac{1}{2}\mathbb{E}[\int_{0}^{T}[\|x_{1}(t)-s\|_{Q_{I}}^{2}+\|v(t)\|_{R}^{2}+\|x_{1}(t)-(\Gamma\frac{1}{N}\sum_{j=1}^{N}x_{j}(t)+\eta)\|_{Q}^{2}]dt\\ &+\|x_{1}(T)-\bar{s}\|_{\bar{Q}_{I}}^{2}+\|x_{1}(T)-(\bar{\Gamma}\frac{1}{N}\sum_{j=1}^{N}x_{j}(T)+\bar{\eta})\|_{\bar{Q}}^{2}].\end{split}

(12)

Proof: The proof of this theorem can be seen in the proof of Theorem 3.9 in [7].

3 Initial Error Propagation in LQMFGs

3.1 Assumptions

• The correct $x_{i}(0)$ and $\Theta$ are accessible to $\mathcal{A}_{i},1\leq i\leq N$ .

• $z_{i}(0)=z^{0}+E_{i}$ is the initial mean field state as observed by $\mathcal{A}_{i},1\leq i\leq N$ , with error $E_{i}\in\mathbb{R}^{n}$ , and we define $\bar{E}=\frac{1}{N}\sum_{i=1}^{N}E_{i}$ .

• At $t=0$ , $\mathcal{A}_{i}$ gives its feedback control law $u_{i}(t)=\phi_{i}(x_{i}(t),t)$ and follows this strategy during $0\leq t\leq T$ .

• $(x_{i}(t))_{0\leq t\leq t_{0}}$ and $(u_{i}(t))_{0\leq t\leq t_{0}}$ are accessible to $\mathcal{A}_{i}$ at $t=t_{0}$ .

We define $(z_{i}(t))_{0\leq t\leq T}$ as the MF-S and $(\bar{u}_{i}(t))_{0\leq t\leq T}$ as the MF-C predicted by $\mathcal{A}_{i}$ . As before, $\mathcal{F}_{t}^{i}=\sigma(x_{i}(0),W_{i}(s),s\leq t)$ .

3.2 Feedback Control under Initial Error

The impact of the erroneous initial MF-S information is first manifested in (9).

MF-S and MF-C. $\mathcal{A}_{i}$ predicts $(z_{i}(t))_{0\leq t\leq T}$ as the unique solution of

\begin{split}dz_{i}(t)=&[(A+C-(B+F)R^{-1}B^{T}P_{0})z_{i}(t)-(B+F)R^{-1}B^{T}% \mathcal{G}]dt\\ z_{i}(0)=&z^{0}+E_{i}.\\ \end{split}

(13)

and $\bar{u}_{i}(t)$ can be given by $\bar{u}_{i}(t)=-R^{-1}B^{T}(P_{0}(t)z_{i}(t)+\mathcal{G}(t))$ .

For new $(z_{i}(t))_{0\leq t\leq T}$ and $(\bar{u}_{i}(t))_{0\leq t\leq T}$ , (5) changes to

\begin{split}dg_{i}(t)=&-[(A^{T}-P_{1}BR^{-1}B^{T})g_{i}(t)+(P_{1}C-Q\Gamma)z_{i}(t)+P_{1}F\bar{u}_{i}(t)-Q_{I}s-Q\eta]dt,\\ g_{i}(T)=&-\bar{Q}_{I}\bar{s}-\bar{Q}(\bar{\Gamma}z_{i}(T)+\bar{\eta}).\end{split}

(14)

Feedback Control. Given the computed $(z_{i}(t))_{0\leq t\leq T}$ , $\mathcal{A}_{i}$ can solve (14) for $(g_{i}(t))_{0\leq t\leq T}$ . The actual feedback control of $\mathcal{A}_{i}$ is

\begin{split}u_{i}(t)=&\phi_{i}(x_{i}(t),t),0\leq t\leq T\\ \phi_{i}(x_{i}(t),t)=&-R^{-1}B^{T}(P_{1}(t)x_{i}(t)+g_{i}(t)).\\ \end{split}

(15)

3.3 Effect of Initial Error on Predicted Mean Field

In this subsection, we analyze the effect of the initial error on the mean field equilibrium predicted by $\mathcal{A}_{i}$ . We denote the MF-S and MF-C under correct information by $(z^{c}(t))_{0\leq t\leq T}$ and $(\bar{u}^{c}(t))_{0\leq t\leq T}$ . Setting $\Delta z_{i}(t)=z_{i}(t)-z^{c}(t)$ and $\Delta\bar{u}_{i}(t)=\bar{u}_{i}(t)-\bar{u}^{c}(t)$ , we obtain from (9) and (13)

\begin{split}d\Delta z_{i}(t)=&[(A+C-(B+F)R^{-1}B^{T}P_{0})\Delta z_{i}(t)]dt% \\ \Delta z_{i}(0)=&E_{i}.\\ \end{split}

(16)

Let $\Phi_{1}(t)$ be a fundamental matrix solution of (16), that is, $\dot{\Phi}_{1}(t)=H_{0}(t)\Phi_{1}(t)$ with $H_{0}(t)=A+C-(B+F)R^{-1}B^{T}P_{0}(t)$ and $\Phi_{1}(t)$ invertible. The solution of (16) is given by

\begin{split}\Delta z_{i}(t)=\Phi_{1}(t)\Phi_{1}(0)^{-1}E_{i}.\\ \end{split}

(17)

Since $(H_{0}(t))_{0\leq t\leq T}$ can be calculated without knowing the initial states, the linear map from $E_{i}$ to $(\Delta z_{i}(t))_{0\leq t\leq T}$ is known to all agents. We obtain the following theorem.

Theorem 3.1 $(\Delta z_{i}(t))_{0\leq t\leq T}$ has a linear relationship with $E_{i}$ , and this linear relationship can be computed by all agents without knowing $E_{i}$ , which is

\begin{split}\Delta z_{i}(t)=&\Phi_{1}(t)\Phi_{1}(0)^{-1}E_{i}.\\ \Delta\bar{u}_{i}(t)=&-R^{-1}B^{T}P_{0}(t)\Phi_{1}(t)\Phi_{1}(0)^{-1}E_{i}\\ \end{split}

(18)

This theorem gives the deviation between the equilibrium predicted by $\mathcal{A}_{i}$ and the equilibrium under correct information.
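As an illustration of Theorem 3.1, the sketch below evaluates the two linear maps in (18) for the scalar case $n=m=1$ , where $\Phi_{1}(t)\Phi_{1}(0)^{-1}=\exp(\int_{0}^{t}H_{0}(s)ds)$ . The closed form with a scalar exponential is specific to $n=1$ ; for $n>1$ with time-varying $H_{0}$ one would instead integrate $\dot{\Phi}_{1}=H_{0}\Phi_{1}$ numerically. The function names, the trapezoidal rule, and the sample values of $H_{0}$ and $P_{0}$ are assumptions made for illustration.

```python
import math


def transition_factors(h0_samples, dt):
    """Scalar Phi_1(t_k) Phi_1(0)^{-1} = exp(int_0^{t_k} H_0(s) ds), trapezoidal rule."""
    factors = [1.0]
    integral = 0.0
    for k in range(1, len(h0_samples)):
        integral += 0.5 * dt * (h0_samples[k - 1] + h0_samples[k])
        factors.append(math.exp(integral))
    return factors


def predicted_deviation(h0_samples, p0_samples, b, r, dt, error):
    """Return (delta_z, delta_u_bar) on the grid, following (18) for n = m = 1."""
    factors = transition_factors(h0_samples, dt)
    delta_z = [phi * error for phi in factors]
    delta_u = [-(b / r) * p0 * dz for p0, dz in zip(p0_samples, delta_z)]
    return delta_z, delta_u


# Hypothetical constant coefficients, used only to exercise the functions.
dt = 0.01
h0 = [-0.5] * 101          # H_0(t) = -0.5 on [0, 1]
p0 = [1.3] * 101           # P_0(t) = 1.3 on [0, 1]
dz, du = predicted_deviation(h0, p0, b=1.0, r=1.0, dt=dt, error=0.2)
```

Since the factors depend only on $H_{0}$ and $P_{0}$ , every agent can tabulate them once and then apply them to any candidate value of $E_{i}$ .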

3.4 Effect of Initial Error on Feedback Control

In this subsection, we analyze the effect of the initial error on the feedback control law used by $\mathcal{A}_{i}$ . We denote the feedback control under correct information by $u_{i}^{c}(t)=\phi_{c}(x_{i}^{c}(t),t)=-R^{-1}B^{T}(P_{1}(t)x_{i}^{c}(t)+g^{c}(t))$ , and the actual trajectory of $\mathcal{A}_{i}$ by $x_{i}^{A}(t)$ .

We set $\Delta g_{i}(t)=g_{i}(t)-g^{c}(t)$ , $\Delta\phi_{i}(\Delta x_{i}(t),t)=\phi_{i}(x_{i}^{A}(t),t)-\phi_{c}(x_{i}^{c}(t),t)$ , and $\Delta x_{i}(t)=x_{i}^{A}(t)-x_{i}^{c}(t)$ . Then, according to (5) and (14), we have

\begin{split}d\Delta g_{i}(t)=&-[(A^{T}-P_{1}(B+F)R^{-1}B^{T})\Delta g_{i}(t)+% (P_{1}C-P_{1}FR^{-1}B^{T}P_{1}-Q\Gamma)\Delta z_{i}(t)]dt,\\ \Delta g_{i}(T)=&-\bar{Q}\bar{\Gamma}\Delta z_{i}(T).\end{split}

(19)

The homogeneous linear equation corresponding to (19) is

\begin{split}&dg(t)=-[(A^{T}-P_{1}(t)(B+F)R^{-1}B^{T})g(t)]dt.\\ \end{split}

(20)

Let $\Phi_{g}(t)$ be a fundamental matrix solution of (20), that is, $\dot{\Phi}_{g}(t)=-H_{g}(t)\Phi_{g}(t)$ with $H_{g}(t)=A^{T}-P_{1}(t)(B+F)R^{-1}B^{T}$ . Using the method of variation of parameters together with the terminal condition in (19), the solution of (19) is given by

\begin{split}&\Delta g_{i}(t)=-\Phi_{g}(t)\Phi_{g}^{-1}(T)\bar{Q}\bar{\Gamma}\Delta z_{i}(T)+\Phi_{g}(t)\int_{T}^{t}\Phi_{g}^{-1}(s)f_{g}(s)ds,\end{split}

(21)

where $f_{g}(s)=-(P_{1}(s)C-P_{1}(s)FR^{-1}B^{T}P_{1}(s)-Q\Gamma)\Delta z_{i}(s)$ . Applying the conclusion of Theorem 3.1, we can get the following theorem

Theorem 3.2 $(\Delta g_{i}(t))_{0\leq t\leq T}$ has a linear relationship with $E_{i}$ , and this linear relationship can be computed by all agents without knowing $E_{i}$ , which is

\begin{split}\Delta g_{i}(t)=&\mathcal{M}_{g}(t)E_{i}.\\ \mathcal{M}_{g}(t)=&-\Phi_{g}(t)\Phi_{g}^{-1}(T)\bar{Q}\bar{\Gamma}\Phi_{1}(T)% \Phi_{1}^{-1}(0)-\\ &\Phi_{g}(t)\int_{T}^{t}\Phi_{g}^{-1}(s)(P_{1}(s)C-P_{1}(s)FR^{-1}B^{T}P_{1}(s% )-Q\Gamma)\Phi_{1}(s)\Phi_{1}^{-1}(0)ds\\ \end{split}

(22)

According to (15) and Theorem 3.2, we have $\Delta\phi_{i}(\Delta x_{i}(t),t)=-R^{-1}B^{T}(P_{1}(t)\Delta x_{i}(t)+\Delta g% _{i}(t))$ .

Remark 3.1 The relationship between $E_{i}$ and $\Delta\phi_{i}(\Delta x_{i}(t),t)$ can be represented as

\begin{split}&\Delta\phi_{i}(\Delta x_{i}(t),t)=-R^{-1}B^{T}(P_{1}(t)\Delta x_% {i}(t)+\mathcal{M}_{g}(t)E_{i}).\\ \end{split}

(23)

3.5 Effect of Initial Error on Actual MF

In this subsection, we analyze the effect of the initial error on the actual MF. We denote the trajectory of $\mathcal{A}_{i}$ under correct information by $x_{i}^{c}(t)$ , the actual trajectory of $\mathcal{A}_{i}$ by $x_{i}^{A}(t)$ , the actual MF-S by $z^{A}(t)$ , and the actual MF-C by $\bar{u}^{A}(t)$ . We set $\Delta z^{A}(t)=z^{A}(t)-z^{c}(t)$ and $\Delta\bar{u}^{A}(t)=\bar{u}^{A}(t)-\bar{u}^{c}(t)$ . $W_{i}^{c}$ and $W_{i}^{A}$ are independent $n$-dimensional standard Wiener processes defined on $(\Omega,\mathcal{F},\mathbb{P})$ .

Substituting the feedback controls into the dynamics, we have

\begin{split}&dx_{i}^{c}(t)=[Ax_{i}^{c}(t)-BR^{-1}B^{T}(P_{1}(t)x_{i}^{c}(t)+g% ^{c}(t))+Cz^{c}(t)+F\bar{u}^{c}(t)]dt+DdW_{i}^{c}(t),\\ &x_{i}^{c}(0)=x_{0}^{i}.\\ \end{split}

(24)

and

\begin{split}&dx_{i}^{A}(t)=[Ax_{i}^{A}(t)-BR^{-1}B^{T}(P_{1}(t)x_{i}^{A}(t)+g% _{i}(t))+Cz^{A}(t)+F\bar{u}^{A}(t)]dt+DdW_{i}^{A}(t),\\ &x_{i}^{A}(0)=x_{0}^{i}.\\ \end{split}

(25)

Then we can get

\begin{split}d\Delta x_{i}(t)=&[A\Delta x_{i}(t)-BR^{-1}B^{T}(P_{1}(t)\Delta x% _{i}(t)+\Delta g_{i}(t))+C\Delta z^{A}(t)+F\Delta\bar{u}^{A}(t)]dt+\\ &Dd(W_{i}^{A}(t)-W_{i}^{c}(t)),\\ \Delta x_{i}(0)=&0.\\ \end{split}

(26)

Since $\Delta z^{A}(t)=\frac{1}{N}\Sigma_{i=1}^{N}\Delta x_{i}(t)$ , $\Delta\bar{u}^{A}(t)=\frac{1}{N}\Sigma_{i=1}^{N}-R^{-1}B^{T}(P_{1}(t)\Delta x_% {i}(t)+\Delta g_{i}(t))$ , let $\Delta\bar{g}(t)=\frac{1}{N}\Sigma_{i=1}^{N}\Delta g_{i}(t)$ , when $N\rightarrow\infty$ , we have

\begin{split}d\Delta z^{A}(t)=&[(A+C-(B+F)R^{-1}B^{T}P_{1}(t))\Delta z^{A}(t)-% (B+F)R^{-1}B^{T}\Delta\bar{g}(t)]dt,\\ \Delta z^{A}(0)=&0.\\ \end{split}

(27)

The homogeneous linear equation corresponding to (27) is

\begin{split}&d\Delta z^{A}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{1}(t))\Delta z^{A}(t)]% dt.\\ \end{split}

(28)

Let $\Phi_{z}(t)$ be a fundamental matrix solution of (28), that is, $\dot{\Phi}_{z}(t)=H_{z}(t)\Phi_{z}(t)$ with $H_{z}(t)=A+C-(B+F)R^{-1}B^{T}P_{1}(t)$ . Using the method of variation of parameters, the solution of (27) is given by

\begin{split}&\Delta z^{A}(t)=\Phi_{z}(t)\int_{0}^{t}\Phi_{z}^{-1}(s)f_{z}(s)% ds.\\ \end{split}

(29)

where $f_{z}(s)=-(B+F)R^{-1}B^{T}\Delta\bar{g}(s)$ . Averaging the relation in Theorem 3.2 over all agents gives $\Delta\bar{g}(t)=\mathcal{M}_{g}(t)\bar{E}$ .

Theorem 3.3 $(\Delta z^{A}(t))_{0\leq t\leq T}$ has a linear relationship with $\bar{E}$ , and this linear relationship can be computed by all agents without knowing $\bar{E}$ , which is

\begin{split}\Delta z^{A}(t)=&\mathcal{M}_{z}(t)\bar{E}.\\ \mathcal{M}_{z}(t)=&-\Phi_{z}(t)\int_{0}^{t}\Phi_{z}^{-1}(s)(B+F)R^{-1}B^{T}% \mathcal{M}_{g}(s)ds\\ \end{split}

(30)

Remark 3.2 The relationship between $\bar{E}$ and $\Delta\bar{u}^{A}$ can be represented as

\begin{split}\Delta\bar{u}^{A}(t)=&-R^{-1}B^{T}(P_{1}(t)\Delta z^{A}(t)+\mathcal{M}_{g}(t)\bar{E})\\ =&-[R^{-1}B^{T}(P_{1}(t)\mathcal{M}_{z}(t)+\mathcal{M}_{g}(t))]\bar{E}.\end{split}

(31)

3.6 Effect of Initial Error on Actual Trajectory of $\mathcal{A}_{i}$

In this subsection, we analyze the effect of the initial error on $\mathcal{A}_{i}$ ’s trajectory. Returning to (26), let $x_{i}^{E}(t)=\mathbb{E}[\Delta x_{i}(t)|\mathcal{G}_{0}]$ . Taking conditional expectations on both sides of (26), we have

\begin{split}&dx_{i}^{E}(t)=[Ax_{i}^{E}(t)-BR^{-1}B^{T}(P_{1}(t)x_{i}^{E}(t)+% \Delta g_{i}(t))+C\Delta z^{A}(t)+F\Delta\bar{u}^{A}(t)]dt,\\ &x_{i}^{E}(0)=0.\\ \end{split}

(32)

The homogeneous linear equation corresponding to (32) is

\begin{split}&dx_{i}^{E}(t)=[(A-BR^{-1}B^{T}P_{1}(t))x_{i}^{E}(t)]dt.\\ \end{split}

(33)

Let $\Phi_{x}(t)$ be a fundamental matrix solution of (33), that is, $\dot{\Phi}_{x}(t)=H_{x}(t)\Phi_{x}(t)$ with $H_{x}(t)=A-BR^{-1}B^{T}P_{1}(t)$ . Using the method of variation of parameters, the solution of (32) is given by

\begin{split}&x_{i}^{E}(t)=\Phi_{x}(t)\int_{0}^{t}\Phi_{x}^{-1}(s)f_{x}(s)ds.% \\ \end{split}

(34)

where $f_{x}(s)=-BR^{-1}B^{T}\Delta g_{i}(s)+C\Delta z^{A}(s)+F\Delta\bar{u}^{A}(s)$ . Applying the conclusions of Theorem 3.2 and Theorem 3.3, we obtain the following theorem.

Theorem 3.4 $(x_{i}^{E}(t))_{0\leq t\leq T}$ has a linear relationship with $(E_{i}^{T},\bar{E}^{T})^{T}$ , and this linear relationship can be computed by all agents without knowing $(E_{i}^{T},\bar{E}^{T})^{T}$ , which is

\begin{split}&x_{i}^{E}(t)=\mathcal{M}_{x}^{1}(t)E_{i}+\mathcal{M}_{x}^{2}(t)\bar{E},\\ &\mathcal{M}_{x}^{1}(t)=\Phi_{x}(t)\int_{0}^{t}\Phi_{x}^{-1}(s)\mathcal{L}_{1}(s)ds,\\ &\mathcal{L}_{1}(s)=-BR^{-1}B^{T}\mathcal{M}_{g}(s),\\ &\mathcal{M}_{x}^{2}(t)=\Phi_{x}(t)\int_{0}^{t}\Phi_{x}^{-1}(s)\mathcal{L}_{2}(s)ds,\\ &\mathcal{L}_{2}(s)=(C-FR^{-1}B^{T}P_{1}(s))\mathcal{M}_{z}(s)-FR^{-1}B^{T}\mathcal{M}_{g}(s).\end{split}

(35)

Remark 3.4 (26) can be rewritten as

\begin{split}d\Delta x_{i}(t)=&[(A-BR^{-1}B^{T}P_{1}(t))\Delta x_{i}(t)-BR^{-1% }B^{T}\mathcal{M}_{g}(t)E_{i}+\\ &((C-FR^{-1}B^{T}P_{1}(t))\mathcal{M}_{z}(t)-FR^{-1}B^{T}\mathcal{M}_{g}(t))% \bar{E}]dt+Dd(W_{i}^{A}(t)-W_{i}^{c}(t)),\\ \Delta x_{i}(0)=&0.\\ \end{split}

(36)

4 LQMFG with One-Time Error Correction for MF

During the game, when $\mathcal{A}_{i}$ is allowed to change its strategy, it can use newly acquired information to correct errors and modify its strategy. In deciding on its strategy, $\mathcal{A}_{i}$ needs to consider not only the actual MF but also the strategies of the other agents $\mathcal{A}_{j}$ , $1\leq j\leq N$ ; the strategy of $\mathcal{A}_{j}$ in turn depends on $\mathcal{A}_{j}$ ’s prediction of the MF and of the strategies of the other agents, and so on. This complicates the discussion of error correction and strategy modification in LQMFGs.

In this section, we consider the case $D=0$ , in which agents are allowed to adjust their strategies at time $t_{0}$ and this rule is known to all agents. When the re-game time $t_{0}$ satisfies certain conditions, agents have enough information to calculate the actual MF and can be sure that the other agents also obtain the actual MF information. We give a sufficient condition on $t_{0}$ for $\mathcal{A}_{i}$ to compute $E_{i}$ , $\bar{E}$ , and $z^{A}(t_{0})$ at time $t_{0}$ based only on $(x_{i}(t))_{0\leq t\leq t_{0}}$ and $z_{i}^{0}=z^{0}+E_{i}$ . Then, we give the modified strategy for $\mathcal{A}_{i}$ . Finally, we analyze the effect of the initial error on the new game.

4.1 Information obtained by $\mathcal{A}_{i}$ at $t_{0}$

In this subsection, we discuss what information can be obtained by $\mathcal{A}_{i}$ from the analysis of $(x_{i}(t))_{0\leq t\leq t_{0}}$ . According to (25), we have

Cz^{A}(t)+F\bar{u}^{A}(t)=\dot{x}_{i}(t)-Ax_{i}(t)-Bu_{i}(t)

(37)

As $\bar{u}^{A}(t)=-R^{-1}B^{T}(P_{1}(t)z^{A}(t)+\bar{g}(t))$ , where $\bar{g}(t)=\frac{1}{N}\sum_{i=1}^{N}g_{i}(t)$ , we have

\dot{x}_{i}(t)-Ax_{i}(t)-Bu_{i}(t)=(C-FR^{-1}B^{T}P_{1}(t))z^{A}(t)-FR^{-1}B^{% T}\bar{g}(t).

(38)

where $\dot{x}_{i}(t),x_{i}(t),u_{i}(t)$ are known by $\mathcal{A}_{i}$ for $t\in[0,t_{0}]$ . So $\mathcal{A}_{i}$ can get $(Ob(t))_{0\leq t\leq t_{0}}$ , where

Ob(t)=(C-FR^{-1}B^{T}P_{1}(t))z^{A}(t)-FR^{-1}B^{T}\bar{g}(t).

(39)
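In practice an agent would hold sampled rather than continuous trajectories. The sketch below, for the scalar case $n=m=1$ and $D=0$ , approximates $\dot{x}_{i}$ by central differences and evaluates the left-hand side of (38) at interior grid points. The sampling grid, the finite-difference scheme, and the function name are assumptions made for illustration; the paper itself treats $\dot{x}_{i}(t)$ as directly available.

```python
def observe_coupling_term(x, u, a, b, dt):
    """Approximate Ob(t_k) = xdot(t_k) - a * x(t_k) - b * u(t_k) for k = 1, ..., len(x) - 2.

    x and u are samples of the agent's own state and control on a uniform grid
    with spacing dt; the derivative is estimated by central differences.
    """
    ob = []
    for k in range(1, len(x) - 1):
        x_dot = (x[k + 1] - x[k - 1]) / (2.0 * dt)
        ob.append(x_dot - a * x[k] - b * u[k])
    return ob
```

For example, with samples of $x(t)=t^{2}$ and a constant control, the returned values match $2t-at^{2}-bu$ up to rounding, because central differences are exact for quadratic trajectories.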

4.2 Error Correction

In this subsection, we give a sufficient condition for $\mathcal{A}_{i}$ to compute $E_{i}$ and $\bar{E}$ .

Because $(g_{i}(t))_{0\leq t\leq t_{0}}$ and $(z_{i}(t))_{0\leq t\leq t_{0}}$ can be computed by $\mathcal{A}_{i}$ , $\mathcal{A}_{i}$ can compute $(Ob^{1}(t))_{0\leq t\leq t_{0}}$ , where

\begin{split}Ob^{1}(t)&=Ob(t)-(C-FR^{-1}B^{T}P_{1}(t))z_{i}(t)+FR^{-1}B^{T}g_{i}(t)\\ &=(C-FR^{-1}B^{T}P_{1}(t))(\Delta z^{A}(t)-\Delta z_{i}(t))-FR^{-1}B^{T}(\Delta\bar{g}(t)-\Delta g_{i}(t))\\ &=\mathcal{K}_{1}(t)\bar{E}+\mathcal{K}_{2}(t)E_{i}\\ &=[\mathcal{K}_{1}(t),\mathcal{K}_{2}(t)][\bar{E}^{T},E_{i}^{T}]^{T},\end{split}

(40)

where

\begin{split}&\mathcal{K}_{1}(t)=(C-FR^{-1}B^{T}P_{1}(t))\mathcal{M}_{z}(t)-FR% ^{-1}B^{T}\mathcal{M}_{g}(t)\\ &\mathcal{K}_{2}(t)=FR^{-1}B^{T}\mathcal{M}_{g}(t)-(C-FR^{-1}B^{T}P_{1}(t))% \Phi_{1}(t)\Phi_{1}^{-1}(0).\\ \end{split}

(41)

Since $\mathcal{K}_{1}$ and $\mathcal{K}_{2}$ can be computed by $\mathcal{A}_{i}$ , we have the following sufficient condition for error correction.

Theorem 4.1 If there exist $0\leq t_{1}\leq t_{2}\leq\ldots\leq t_{m}\leq t_{0}$ , $m\in\mathbb{N}$ , such that

\begin{split}rank\begin{pmatrix}\mathcal{K}_{1}(t_{1})&\mathcal{K}_{2}(t_{1})% \\ ...&...\\ \mathcal{K}_{1}(t_{m})&\mathcal{K}_{2}(t_{m})\end{pmatrix}=2n\end{split}

(42)

then $\mathcal{A}_{i}$ can compute $\bar{E}$ and $E_{i}$ from the information in $\mathcal{F}_{t_{0}}^{i}$ .
Proof:

Let

\begin{split}\mathcal{K}=\begin{pmatrix}\mathcal{K}_{1}(t_{1})&\mathcal{K}_{2}% (t_{1})\\ ...&...\\ \mathcal{K}_{1}(t_{m})&\mathcal{K}_{2}(t_{m})\end{pmatrix}\end{split}

(43)

According to (40), we have

\begin{split}\begin{pmatrix}Ob^{1}(t_{1})\\ ...\\ Ob^{1}(t_{m})\end{pmatrix}=\mathcal{K}[\bar{E}^{T},E_{i}^{T}]^{T}\end{split}

(44)

Since $Ob^{1}(t_{j}),1\leq j\leq m$ , and $\mathcal{K}$ are accessible to $\mathcal{A}_{i}$ , and $\mathrm{rank}\,\mathcal{K}=2n=\dim[\bar{E}^{T},E_{i}^{T}]^{T}$ , the linear system (44) has a unique solution $[\bar{E}^{T},E_{i}^{T}]^{T}$ , so $\mathcal{A}_{i}$ can compute $\bar{E}$ and $E_{i}$ . $\Box$
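The recovery step in the proof of Theorem 4.1 can be sketched as a small least-squares problem. The code below treats the scalar case $n=1$ , where $\mathcal{K}$ is an $m\times 2$ matrix, and solves the normal equations of (44) in closed form. The function name, the tolerance used to detect a rank-deficient $\mathcal{K}$ , and the sample numbers are hypothetical; with noisy observations a dedicated least-squares routine would be a more robust choice.

```python
def recover_errors(k1, k2, ob1, tol=1e-12):
    """Solve the stacked system (44) for n = 1 and return (E_bar, E_i).

    k1, k2 hold K_1(t_j), K_2(t_j) and ob1 holds Ob^1(t_j), j = 1, ..., m.
    Raises ValueError when the rank condition (42) is not met.
    """
    s11 = sum(v * v for v in k1)
    s12 = sum(v * w for v, w in zip(k1, k2))
    s22 = sum(w * w for w in k2)
    r1 = sum(v * o for v, o in zip(k1, ob1))
    r2 = sum(w * o for w, o in zip(k2, ob1))
    det = s11 * s22 - s12 * s12          # determinant of K^T K
    if abs(det) < tol:
        raise ValueError("rank condition (42) not met: K does not have rank 2")
    e_bar = (s22 * r1 - s12 * r2) / det
    e_i = (s11 * r2 - s12 * r1) / det
    return e_bar, e_i


# Hypothetical values of K_1(t_j), K_2(t_j) at three observation times.
k1_samples = [1.0, 0.5, 0.2]
k2_samples = [-0.3, 0.4, 1.0]
true_e_bar, true_e_i = 0.1, -0.25
ob1_samples = [a * true_e_bar + b * true_e_i for a, b in zip(k1_samples, k2_samples)]
estimate = recover_errors(k1_samples, k2_samples, ob1_samples)
```

Using more observation times than unknowns does not change the exact solution in the deterministic setting, but it makes the rank condition easier to satisfy.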

4.3 New Mean Field Equilibrium and Modified Feedback Control

At time $t_{0}$ , if every agent $\mathcal{A}_{i}$ obtains the correct $E_{i}$ and $\bar{E}$ , then $\mathcal{A}_{i}$ can compute $z^{A}(t_{0})$ by

\begin{split}&z^{A}(t_{0})=z_{i}(t_{0})+\mathcal{M}_{z}(t_{0})\bar{E}-\Phi_{1}% (t_{0})\Phi_{1}^{-1}(0)E_{i}.\\ \end{split}

(45)

Then $\mathcal{A}_{i}$ obtains the correct $z^{A}(t_{0})$ , so if all agents are allowed to change their feedback controls at the same time $t_{0}$ , the game after $t_{0}$ becomes an LQMFG under correct information.

MF-S and MF-C. $\mathcal{A}_{i}$ predicts $(z^{new}(t))_{t_{0}\leq t\leq T}$ as the unique solution of

\begin{split}&dz^{new}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{0})z^{new}(t)-(B+F)R^{-1}B^% {T}\mathcal{G}]dt\\ &z^{new}(t_{0})=z^{A}(t_{0}).\\ \end{split}

(46)

and $\bar{u}^{new}(t)$ is given by $\bar{u}^{new}(t)=-R^{-1}B^{T}(P_{0}(t)z^{new}(t)+\mathcal{G}(t))$ .

For the new $(z^{new}(t))_{t_{0}\leq t\leq T}$ and $(\bar{u}^{new}(t))_{t_{0}\leq t\leq T}$ , (5) changes to

\begin{split}&dg^{new}(t)=-[(A^{T}-P_{1}(B+F)R^{-1}B^{T})g^{new}(t)+(P_{1}C-P_% {1}FR^{-1}B^{T}P_{1}-Q\Gamma)z^{new}(t)-Q_{I}s-Q\eta]dt,\\ &g^{new}(T)=-\bar{Q}_{I}\bar{s}-\bar{Q}(\bar{\Gamma}z^{new}(T)+\bar{\eta}).% \end{split}

(47)

Feedback Control. Given the computed $(z^{new}(t))_{t_{0}\leq t\leq T}$ , $\mathcal{A}_{i}$ can solve (47) for $(g^{new}(t))_{t_{0}\leq t\leq T}$ . The optimal feedback control of $\mathcal{A}_{i}$ is

\begin{split}&u_{i}^{new}(t)=\phi^{new}(x_{i}^{new}(t),t),t_{0}\leq t\leq T\\ &\phi^{new}(x_{i}^{new}(t),t)=-R^{-1}B^{T}(P_{1}(t)x_{i}^{new}(t)+g^{new}(t)).% \\ \end{split}

(48)

4.4 Effect of Initial Error on New MF

In this subsection, we analyze the effect of the initial error on the new MF. Let $\Delta z^{new}(t)=z^{new}(t)-z^{c}(t)$ and $\Delta\bar{u}^{new}(t)=\bar{u}^{new}(t)-\bar{u}^{c}(t)$ . Then we have

\begin{split}&d\Delta z^{new}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{0})\Delta z^{new}(t)% ]dt\\ &\Delta z^{new}(t_{0})=\Delta z^{A}(t_{0}).\\ \end{split}

(49)

The solution of (49) is given by

\begin{split}\Delta z^{new}(t)&=\Phi_{1}(t)\Phi_{1}(t_{0})^{-1}\Delta z^{A}(t_% {0})\\ &=\Phi_{1}(t)\Phi_{1}(t_{0})^{-1}\mathcal{M}_{z}(t_{0})\bar{E}.\\ \end{split}

(50)

Since $(\Phi_{1}(t))_{0\leq t\leq T}$ and $\mathcal{M}_{z}(t_{0})$ can be calculated without any information on the initial states, $(\Delta z^{new}(t))_{t_{0}\leq t\leq T}$ depends linearly on $\bar{E}$ through a map known to all agents. This gives the following theorem.

Theorem 4.2 $(\Delta z^{new}(t))_{t_{0}\leq t\leq T}$ and $(\Delta\bar{u}^{new}(t))_{t_{0}\leq t\leq T}$ depend linearly on $\bar{E}$, and this linear relationship can be computed by all agents without knowing $\bar{E}$:

\begin{split}\Delta z^{new}(t)&=\Phi_{1}(t)\Phi_{1}(t_{0})^{-1}\mathcal{M}_{z}(t_{0})\bar{E},\\ \Delta\bar{u}^{new}(t)&=-R^{-1}B^{T}P_{0}(t)\Phi_{1}(t)\Phi_{1}(t_{0})^{-1}\mathcal{M}_{z}(t_{0})\bar{E}.\\ \end{split}

(51)
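As a numerical illustration of Theorem 4.2, note that $\Phi_{1}(t)\Phi_{1}(t_{0})^{-1}$ is the state transition matrix of the linear system (49), so it can be obtained by integrating $\dot{\Psi}=H_{0}(t)\Psi$, $\Psi(t_{0})=I$, with $H_{0}(t)=A+C-(B+F)R^{-1}B^{T}P_{0}(t)$. The following is only a minimal sketch under stated assumptions: the function name `transition_matrix`, the RK4 scheme, the constant matrix `H0`, and the vector standing in for $\mathcal{M}_{z}(t_{0})\bar{E}$ are hypothetical choices made for illustration and are not taken from the paper, where $H_{0}(t)$ depends on $P_{0}(t)$.

```python
import numpy as np

def transition_matrix(H, t0, t1, steps=200):
    """Return Psi(t1, t0) solving dPsi/dt = H(t) Psi, Psi(t0) = I (classical RK4)."""
    n = H(t0).shape[0]
    psi = np.eye(n)
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = H(t) @ psi
        k2 = H(t + h / 2) @ (psi + h / 2 * k1)
        k3 = H(t + h / 2) @ (psi + h / 2 * k2)
        k4 = H(t + h) @ (psi + h * k3)
        psi = psi + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return psi

# Hypothetical constant H_0, chosen only so that the result is easy to check.
H0 = lambda t: np.array([[-0.5, 0.0], [0.0, -1.0]])

# Hypothetical deviation at the re-game time; stands in for M_z(t0) @ E_bar.
dz_t0 = np.array([0.4, -0.4])

# Deviation of the new MF-S at the horizon T = 2 for a re-game time t0 = 0.5.
dz_T = transition_matrix(H0, 0.5, 2.0) @ dz_t0
```

Because the map from `dz_t0` to `dz_T` is a fixed matrix, scaling the initial error scales the deviation by the same factor, which is the linear relationship stated in the theorem.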

4.5 Initial Error Affection on Modified Control

In this subsection, we analyze the effect of the initial error on the modified feedback control law used by $\mathcal{A}_{i}$ in the previous subsection. We set $\Delta g^{new}(t)=g^{new}(t)-g^{c}(t)$, $\Delta\phi^{new}(\Delta x_{i}^{new}(t),t)=\phi^{new}(x_{i}^{new}(t),t)-\phi(x_{i}^{c}(t),t)$, and $\Delta x_{i}^{new}(t)=x_{i}^{new}(t)-x_{i}^{c}(t)$. Then, according to (5) and (47), we have

\begin{split}&d\Delta g^{new}(t)=-[(A^{T}-P_{1}(B+F)R^{-1}B^{T})\Delta g^{new}(t)+(P_{1}C-P_{1}FR^{-1}B^{T}P_{1}-Q\Gamma)\Delta z^{new}(t)]dt,\\ &\Delta g^{new}(T)=-\bar{Q}\bar{\Gamma}\Delta z^{new}(T).\end{split}

(52)

The homogeneous linear equation corresponding to (52) is (20). Using the method of variation of parameters, the solution of (52) is given by

\begin{split}&\Delta g^{new}(t)=-\Phi_{g}(t)\Phi_{g}^{-1}(T)\bar{Q}\bar{\Gamma}\Delta z^{new}(T)+\Phi_{g}(t)\int_{T}^{t}\Phi_{g}^{-1}(s)f^{new}_{g}(s)ds,\\ \end{split}

(53)

where $f^{new}_{g}(s)=-(P_{1}(s)C-P_{1}(s)FR^{-1}B^{T}P_{1}(s)-Q\Gamma)\Delta z^{new}(s)$. Applying Theorem 4.2, we obtain the following theorem.

Theorem 4.3 $(\Delta g^{new}(t))_{t_{0}\leq t\leq T}$ depends linearly on $\bar{E}$, and this linear relationship can be computed by all agents without knowing $\bar{E}$:

\begin{split}&\Delta g^{new}(t)=\mathcal{M}_{g}^{new}(t)\bar{E},\\ &\mathcal{M}^{new}_{g}(t)=-\Phi_{g}(t)\Phi_{g}^{-1}(T)\bar{Q}\bar{\Gamma}\Phi_{1}(T)\Phi_{1}^{-1}(t_{0})\mathcal{M}_{z}(t_{0})\\ &\quad-\Phi_{g}(t)\int_{T}^{t}\Phi_{g}^{-1}(s)(P_{1}(s)C-P_{1}(s)FR^{-1}B^{T}P_{1}(s)-Q\Gamma)\Phi_{1}(s)\Phi_{1}^{-1}(t_{0})\mathcal{M}_{z}(t_{0})ds.\\ \end{split}

(54)
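The variation-of-parameters formula for $\Delta g^{new}$ can be checked directly. The sketch below assumes that $\Phi_{g}$ is a fundamental matrix of the homogeneous equation, i.e. $d\Phi_{g}(t)=-H_{g}(t)\Phi_{g}(t)dt$ with $H_{g}=A^{T}-P_{1}(B+F)R^{-1}B^{T}$. The symbol $H_{g}$ is introduced here only for brevity, and this form of (20) is an assumption read off from (52).

```latex
\begin{split}
\Delta g^{new}(t)&=\Phi_{g}(t)\Big[\Phi_{g}^{-1}(T)\Delta g^{new}(T)+\int_{T}^{t}\Phi_{g}^{-1}(s)f^{new}_{g}(s)ds\Big],\\
d\Delta g^{new}(t)&=d\Phi_{g}(t)\Big[\Phi_{g}^{-1}(T)\Delta g^{new}(T)+\int_{T}^{t}\Phi_{g}^{-1}(s)f^{new}_{g}(s)ds\Big]+\Phi_{g}(t)\Phi_{g}^{-1}(t)f^{new}_{g}(t)dt\\
&=[-H_{g}(t)\Delta g^{new}(t)+f^{new}_{g}(t)]dt.
\end{split}
```

At $t=T$ the integral vanishes, so the terminal condition $\Delta g^{new}(T)=-\bar{Q}\bar{\Gamma}\Delta z^{new}(T)$ is recovered, and with $f^{new}_{g}(t)=-(P_{1}C-P_{1}FR^{-1}B^{T}P_{1}-Q\Gamma)\Delta z^{new}(t)$ the differential above coincides with (52).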

According to Theorem 4.3, we have $\Delta\phi^{new}(\Delta x_{i}^{new}(t),t)=-R^{-1}B^{T}(P_{1}(t)\Delta x_{i}^{new}(t)+\Delta g^{new}(t))$.

Remark 4.1 The relationship between $\bar{E}$ and $\Delta\phi^{new}(\Delta x_{i}^{new}(t),t)$ can be represented as

\begin{split}&\Delta\phi^{new}(\Delta x_{i}^{new}(t),t)=-R^{-1}B^{T}(P_{1}(t)\Delta x_{i}^{new}(t)+\mathcal{M}^{new}_{g}(t)\bar{E}).\\ \end{split}

(55)

5 LQMFG with Real-Time Estimation for MF

In this section, we consider the situation in which agents are allowed to predict the MF and adjust their strategies in real time, and we analyze the effect of the estimation error on the results. Because of the random terms, $\mathcal{A}_{i}$'s estimate of the current MF based on its actual trajectory may be incorrect, so we consider the situation where $\mathcal{A}_{i}$'s estimate of the MF and its strategy change over time.

Consider $\mathcal{A}_{i}$'s behavior at any given moment $t_{0}\in[0,T]$. At time $t_{0}$, $\mathcal{A}_{i}$ estimates the current MF-S $z^{A}(t_{0})$ as $\hat{z}_{i,t_{0}}(t_{0})$, predicts the MF-S and MF-C after $t_{0}$ as $(\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ and $(\bar{u}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$, and computes its optimal feedback control $u_{i,t_{0}}(t)=\phi_{i,t_{0}}(x_{i}(t),t)$, $t_{0}\leq t\leq T$, corresponding to $(\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$. Since the estimate is updated at every moment, the control input actually applied by $\mathcal{A}_{i}$ at time $t$ is $u_{i}(t)=u_{i,t}(t)=\phi_{i,t}(x_{i}(t),t)$.

Because of the difficulty of predicting other agents' strategies mentioned at the beginning of the previous section, we introduce some assumptions to reduce the complexity of the problem. An important assumption, A4, is that $\mathcal{A}_{i}$ believes that the average estimate of the MF is correct and consistent across all agents. This avoids an infinite regress in which $\mathcal{A}_{i}$ must estimate other agents' estimates of all agents' estimates.

5.1 Assumptions

A1: $\mathcal{A}_{i}$ estimates the average, over all agents $\mathcal{A}_{j},1\leq j\leq N$, of their estimates of the MF-S at time $t_{0}$ as $\bar{z}_{t_{0}}^{i}(t_{0})$, and takes it as the actual average estimate when giving its strategy.

A2: $\mathcal{A}_{i}$ takes $(\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ as the actual MF-S to give its strategy at time $t_{0}$ , and this criterion is known to all agents.

A3: $\mathcal{A}_{i}$ takes $\phi_{j,t}(x_{j}(t),t)=\phi_{j,t_{0}}(x_{j}(t),t),t_{0}\leq t\leq T,1\leq j\leq N$ to give its strategy at time $t_{0}$ , and this criterion is known to all agents.

A4: $\mathcal{A}_{i}$ believes that $\bar{z}_{t_{0}}^{i}(t_{0})=\bar{z}_{t_{0}}^{j}(t_{0}),1\leq i,j\leq N$ .

5.2 Optimal Control

According to Theorem 2.1, for given $(\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ and $(\bar{u}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$, the corresponding optimal control of $\mathcal{A}_{i}$ is

\phi_{i,t_{0}}(x_{i}(t),t)=-R^{-1}B^{T}(P_{1}(t)x_{i}(t)+g_{i,t_{0}}(t))

(56)

where $P_{1}(t)$ satisfies (4), and $(g_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ satisfies

\begin{split}dg_{i,t_{0}}(t)=&-[(A^{T}-P_{1}BR^{-1}B^{T})g_{i,t_{0}}(t)+(P_{1}C-Q\Gamma)\hat{z}_{i,t_{0}}(t)+P_{1}F\bar{u}_{i,t_{0}}(t)-Q_{I}s-Q\eta]dt,\\ g_{i,t_{0}}(T)=&-\bar{Q}_{I}\bar{s}-\bar{Q}(\bar{\Gamma}\hat{z}_{i,t_{0}}(T)+\bar{\eta}).\end{split}

(57)

5.3 Predicted MF under Augmented Information

In this subsection, we consider the situation where the predicted MF and strategies $(\hat{z}_{j,t_{0}}(t))_{t_{0}\leq t\leq T}$, $(g_{j,t_{0}}(t))_{t_{0}\leq t\leq T}$, $1\leq j\leq N$, of all agents are available to $\mathcal{A}_{i}$ at time $t_{0}$, which means that agents share their predictions of the MF and their strategies with each other. Then $\mathcal{A}_{i}$ gives its prediction of the MF based on the augmented information set $\mathcal{F}^{i}_{t_{0}}\cup\{(\hat{z}_{j,t_{0}}(t))_{t_{0}\leq t\leq T},(g_{j,t_{0}}(t))_{t_{0}\leq t\leq T},1\leq j\leq N\}$.

Substituting the optimal control into the dynamics of $\mathcal{A}_{i}$, we have

\begin{split}&dx_{i}(t)=[Ax_{i}(t)-BR^{-1}B^{T}(P_{1}(t)x_{i}(t)+g_{i,t}(t))+Cz^{A}(t)+F\bar{u}^{A}(t)]dt+DdW_{i}(t),\\ &x_{i}(t_{0})=x_{i}(t_{0}).\\ \end{split}

(58)

Since $\bar{u}^{A}(t)=-R^{-1}B^{T}\sum_{i=1}^{N}(P_{1}(t)x_{i}(t)+g_{i,t}(t))/N=-R^{-1}B^{T}(P_{1}(t)z^{A}(t)+\bar{g}_{t}(t))$, where $\bar{g}_{t}(t)=\sum_{i=1}^{N}g_{i,t}(t)/N$, we have

\begin{split}dx_{i}(t)=&[(A-BR^{-1}B^{T}P_{1}(t))x_{i}(t)-BR^{-1}B^{T}g_{i,t}(t)+(C-FR^{-1}B^{T}P_{1}(t))z^{A}(t)\\ &-FR^{-1}B^{T}\bar{g}_{t}(t)]dt+DdW_{i}(t),\\ x_{i}(t_{0})=&x_{i}(t_{0}).\\ \end{split}

(59)

When $N\rightarrow\infty$ , the actual MF-S satisfies

\begin{split}&dz^{A}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{1}(t))z^{A}(t)-(B+F)R^{-1}B^{T}\bar{g}_{t}(t)]dt,\\ &z^{A}(t_{0})=z^{A}(t_{0}).\\ \end{split}

(60)

where, according to (57), the actual $\bar{g}_{t_{0}}(t)$ satisfies

\begin{split}d\bar{g}_{t_{0}}(t)=&-[(A^{T}-P_{1}BR^{-1}B^{T})\bar{g}_{t_{0}}(t)+(P_{1}C-Q\Gamma)\bar{z}_{t_{0}}(t)+P_{1}F\bar{u}_{t_{0}}(t)-Q_{I}s-Q\eta]dt,\\ \bar{g}_{t_{0}}(T)=&-\bar{Q}_{I}\bar{s}-\bar{Q}(\bar{\Gamma}\bar{z}_{t_{0}}(T)+\bar{\eta}),\end{split}

(61)

where $\bar{z}_{t_{0}}(t)=\sum_{i=1}^{N}\hat{z}_{i,t_{0}}(t)/N$ and $\bar{u}_{t_{0}}(t)=\sum_{i=1}^{N}\bar{u}_{i,t_{0}}(t)/N$.

According to A2 and (60), for a given $\bar{g}_{t}(t)$ at time $t_{0}$, $\mathcal{A}_{i}$ takes $\bar{u}_{i,t_{0}}(t)=-R^{-1}B^{T}(P_{1}(t)\hat{z}_{i,t_{0}}(t)+\bar{g}_{t}(t))$ and predicts $(\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ from

\begin{split}&d\hat{z}_{i,t_{0}}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{1}(t))\hat{z}_{i,t_{0}}(t)-(B+F)R^{-1}B^{T}\bar{g}_{t}(t)]dt,\\ &\hat{z}_{i,t_{0}}(t_{0})=\hat{z}_{i,t_{0}}(t_{0}).\\ \end{split}

(62)

According to A3, at time $t_{0}$, $\mathcal{A}_{i}$ takes $g_{j,t}(t)$ as $g_{j,t_{0}}(t)$ and $\bar{g}_{t}(t)$ as $\bar{g}_{t_{0}}(t)$, so $\bar{u}_{i,t_{0}}(t)=-R^{-1}B^{T}(P_{1}(t)\hat{z}_{i,t_{0}}(t)+\bar{g}_{t_{0}}(t))$ and (62) becomes

\begin{split}&d\hat{z}_{i,t_{0}}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{1}(t))\hat{z}_{i,t_{0}}(t)-(B+F)R^{-1}B^{T}\bar{g}_{t_{0}}(t)]dt,\\ &\hat{z}_{i,t_{0}}(t_{0})=\hat{z}_{i,t_{0}}(t_{0}).\\ \end{split}

(63)

So when agents share their predictions and strategies with each other, under A2 and A3, the agents' average prediction of the MF-S and their average strategy satisfy the following system, whose second row follows from (61) after substituting $\bar{u}_{t_{0}}(t)=-R^{-1}B^{T}(P_{1}(t)\bar{z}_{t_{0}}(t)+\bar{g}_{t_{0}}(t))$:

\begin{split}&d\begin{pmatrix}\bar{z}_{t_{0}}(t)\\ \bar{g}_{t_{0}}(t)\\ \end{pmatrix}=\left\{\begin{pmatrix}A+C-(B+F)R^{-1}B^{T}P_{1}&-(B+F)R^{-1}B^{T}\\ -(P_{1}C-P_{1}FR^{-1}B^{T}P_{1}-Q\Gamma)&-(A^{T}-P_{1}(B+F)R^{-1}B^{T})\end{pmatrix}\begin{pmatrix}\bar{z}_{t_{0}}(t)\\ \bar{g}_{t_{0}}(t)\\ \end{pmatrix}+\begin{pmatrix}0\\ Q_{I}s+Q\eta\\ \end{pmatrix}\right\}dt,\\ &\bar{z}_{t_{0}}(t_{0})=\bar{z}_{t_{0}}(t_{0}),\\ &\bar{g}_{t_{0}}(T)=-\bar{Q}_{I}\bar{s}-\bar{Q}(\bar{\Gamma}\bar{z}_{t_{0}}(T)+\bar{\eta}).\\ \end{split}

(64)

We notice that $\bar{g}_{t_{0}}(t)=P_{2}(t)\bar{z}_{t_{0}}(t)+\mathcal{G}_{1}(t)$, where $P_{2}(t)$ satisfies the matrix Riccati differential equation

\begin{split}-dP_{2}=&\{P_{2}(A+C-(B+F)R^{-1}B^{T}P_{1})+(A^{T}-P_{1}(B+F)R^{-1}B^{T})P_{2}\\ &+(P_{1}C-P_{1}FR^{-1}B^{T}P_{1}-Q\Gamma)-P_{2}(B+F)R^{-1}B^{T}P_{2}\}dt,\\ P_{2}(T)=&-\bar{Q}\bar{\Gamma},\\ \end{split}

(65)

and $\mathcal{G}_{1}(t)$ satisfies the backward ordinary differential equation (BODE)

\begin{split}d\mathcal{G}_{1}(t)=&\{-(A^{T}-(P_{1}(t)+P_{2}(t))(B+F)R^{-1}B^{T})\mathcal{G}_{1}(t)+Q_{I}s+Q\eta\}dt,\\ \mathcal{G}_{1}(T)=&-\bar{Q}_{I}\bar{s}-\bar{Q}\bar{\eta}.\\ \end{split}

(66)
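The Riccati equation (65) has a terminal condition, so numerically it is integrated backward from $T$. The following is a minimal sketch rather than the authors' implementation: the function name `solve_P2`, the RK4 scheme, and the step count are hypothetical choices, and $P_{1}(t)$ is assumed to be supplied by the caller as a function of time (it is defined by (4), which is not reproduced in this section).

```python
import numpy as np

def solve_P2(P1, A, B, C, F, R, Q, Gamma, Qbar, Gammabar, T, steps=400):
    """Integrate the Riccati equation (65) backward from T to 0 with RK4.

    P1 is a callable t -> matrix, assumed to be available from (4).
    Returns (ts, P2s) with ts increasing from 0 to T.
    """
    Rinv_Bt = np.linalg.solve(R, B.T)      # R^{-1} B^T
    S = (B + F) @ Rinv_Bt                  # (B + F) R^{-1} B^T

    def rhs(t, P2):
        # dP2/dt, i.e. minus the bracket on the right-hand side of (65)
        p1 = P1(t)
        H1 = A + C - S @ p1
        Hg = A.T - p1 @ S
        K = p1 @ C - p1 @ F @ Rinv_Bt @ p1 - Q @ Gamma
        return -(P2 @ H1 + Hg @ P2 + K - P2 @ S @ P2)

    h = T / steps
    t, P2 = T, -Qbar @ Gammabar            # terminal condition P2(T)
    ts, P2s = [t], [P2]
    for _ in range(steps):
        # RK4 step of size -h (backward in time)
        k1 = rhs(t, P2)
        k2 = rhs(t - h / 2, P2 - h / 2 * k1)
        k3 = rhs(t - h / 2, P2 - h / 2 * k2)
        k4 = rhs(t - h, P2 - h * k3)
        P2 = P2 - h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t -= h
        ts.append(t)
        P2s.append(P2)
    return np.array(ts[::-1]), np.array(P2s[::-1])
```

The same backward stepping applies to $\mathcal{G}_{1}(t)$ in (66) once $P_{2}(t)$ is available, since (66) is linear in $\mathcal{G}_{1}$.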

Remark 5.1 Let $p_{t_{0}}(t)=P_{1}(t)\bar{z}_{t_{0}}(t)+\bar{g}_{t_{0}}(t)$. Then we have

\begin{split}&d\begin{pmatrix}\bar{z}_{t_{0}}(t)\\ p_{t_{0}}(t)\\ \end{pmatrix}=\left\{\begin{pmatrix}A+C&-(B+F)R^{-1}B^{T}\\ Q\Gamma-Q_{I}-Q&-A^{T}\end{pmatrix}\begin{pmatrix}\bar{z}_{t_{0}}(t)\\ p_{t_{0}}(t)\\ \end{pmatrix}+\begin{pmatrix}0\\ Q_{I}s+Q\eta\\ \end{pmatrix}\right\}dt,\\ &\bar{z}_{t_{0}}(t_{0})=\bar{z}_{t_{0}}(t_{0}),\\ &p_{t_{0}}(T)=(\bar{Q}_{I}+\bar{Q}-\bar{Q}\bar{\Gamma})\bar{z}_{t_{0}}(T)-\bar{Q}_{I}\bar{s}-\bar{Q}\bar{\eta}.\\ \end{split}

(67)

and $p_{t_{0}}(t)=P_{0}(t)\bar{z}_{t_{0}}(t)+\mathcal{G}(t)$ .
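Remark 5.1 also links the two decoupling representations of this subsection. As a short sketch, assuming both representations hold for every value of $\bar{z}_{t_{0}}(t)$, comparing coefficients gives

```latex
\begin{split}
p_{t_{0}}(t)&=P_{1}(t)\bar{z}_{t_{0}}(t)+\bar{g}_{t_{0}}(t)=(P_{1}(t)+P_{2}(t))\bar{z}_{t_{0}}(t)+\mathcal{G}_{1}(t),\\
p_{t_{0}}(t)&=P_{0}(t)\bar{z}_{t_{0}}(t)+\mathcal{G}(t)
\ \Longrightarrow\ P_{2}(t)=P_{0}(t)-P_{1}(t),\quad\mathcal{G}_{1}(t)=\mathcal{G}(t).
\end{split}
```

This is consistent with the two equivalent expressions for $\bar{g}^{i}_{t_{0}}(t)$ used in Section 5.5, one written with $P_{2},\mathcal{G}_{1}$ and the other with $P_{0}-P_{1},\mathcal{G}$.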

5.4 Predicted MF under Restricted Information

In this subsection, we consider $\mathcal{A}_{i}$ ’s strategy at $t_{0}$ based on the restricted information set $\mathcal{F}^{i}_{t_{0}}$ and above assumptions. We show that under A1, A2, A3, A4, $\mathcal{A}_{i}$ only needs to estimate $\bar{z}^{i}_{t_{0}}(t_{0}),\hat{z}_{i,t_{0}}(t_{0})$ to give its prediction on MF-S $(\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ .

We notice that (64) can be solved using only $\bar{z}_{t_{0}}(t_{0})$ and the parameters $\Theta$, so $\mathcal{A}_{i}$ can compute $(\bar{g}_{t_{0}}(t))_{t_{0}\leq t\leq T}$ from $\bar{z}_{t_{0}}(t_{0})$ alone. By substituting $(\bar{g}_{t_{0}}(t))_{t_{0}\leq t\leq T}$ into (63), $\mathcal{A}_{i}$ can compute $(\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$, and can then solve (57) for its control. Thus $\mathcal{A}_{i}$ can calculate its strategy under augmented information using only its own estimate of the MF-S and the agents' average estimate of the MF-S.

Under A1 and A4, $\mathcal{A}_{i}$ believes all agents have the same correct $\bar{z}_{t_{0}}(t_{0})$ , so all agents can compute the same correct $(\bar{z}_{t_{0}}(t))_{t_{0}\leq t\leq T}$ and $(\bar{g}_{t_{0}}(t))_{t_{0}\leq t\leq T}$ under augmented information through (64). Then $\mathcal{A}_{i}$ believes all agents can give their strategies under augmented information by solving (63) and (57), and the game under restricted information is consistent with that under augmented information. $\mathcal{A}_{i}$ predicts $(\bar{z}_{t_{0}}^{i}(t))_{t_{0}\leq t\leq T}$ and $(\bar{g}_{t_{0}}^{i}(t))_{t_{0}\leq t\leq T}$ from

\begin{split}&d\begin{pmatrix}\bar{z}_{t_{0}}^{i}(t)\\ \bar{g}_{t_{0}}^{i}(t)\\ \end{pmatrix}=\left\{\begin{pmatrix}A+C-(B+F)R^{-1}B^{T}P_{1}&-(B+F)R^{-1}B^{T}\\ -(P_{1}C-P_{1}FR^{-1}B^{T}P_{1}-Q\Gamma)&-(A^{T}-P_{1}(B+F)R^{-1}B^{T})\end{pmatrix}\begin{pmatrix}\bar{z}_{t_{0}}^{i}(t)\\ \bar{g}_{t_{0}}^{i}(t)\\ \end{pmatrix}+\begin{pmatrix}0\\ Q_{I}s+Q\eta\\ \end{pmatrix}\right\}dt,\\ &\bar{z}_{t_{0}}^{i}(t_{0})=\bar{z}_{t_{0}}^{i}(t_{0}),\\ &\bar{g}_{t_{0}}^{i}(T)=-\bar{Q}_{I}\bar{s}-\bar{Q}(\bar{\Gamma}\bar{z}_{t_{0}}^{i}(T)+\bar{\eta}).\\ \end{split}

(68)

Then we can give the following theorem

Theorem 5.1 Suppose A1, A2, A3, A4. At time $t_{0}$ , $(\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ predicted by $\mathcal{A}_{i}$ and $(\phi_{i,t_{0}}(\cdot,t))_{t_{0}\leq t\leq T}$ can be computed only based on $\bar{z}^{i}_{t_{0}}(t_{0}),\hat{z}_{i,t_{0}}(t_{0})$ and parameters $\Theta$ .

5.5 Strategies under Restricted Information

Consider $\mathcal{A}_{i}$ ’s strategy under restricted information. The following system gives $\mathcal{A}_{i}$ ’s feedback control at time $t_{0}$ .
MF-S and MF-C

Predict $(\bar{z}^{i}_{t_{0}}(t))_{t_{0}\leq t\leq T}$

$\mathcal{A}_{i}$ predicts $(\bar{z}^{i}_{t_{0}}(t))_{t_{0}\leq t\leq T}$ by solving the uniquely solvable equation

\begin{split}&d\bar{z}_{t_{0}}^{i}(t)=[(A+C-(B+F)R^{-1}B^{T}(P_{1}(t)+P_{2}(t)))\bar{z}_{t_{0}}^{i}(t)-(B+F)R^{-1}B^{T}\mathcal{G}_{1}(t)]dt,\\ &\bar{z}_{t_{0}}^{i}(t_{0})=\bar{z}_{t_{0}}^{i}(t_{0}).\\ \end{split}

(69)

and $\bar{g}_{t_{0}}^{i}(t)$ is given by $\bar{g}^{i}_{t_{0}}(t)=P_{2}(t)\bar{z}^{i}_{t_{0}}(t)+\mathcal{G}_{1}(t)$.

Alternatively, $\mathcal{A}_{i}$ can solve (67) for $(\bar{z}^{i}_{t_{0}}(t))_{t_{0}\leq t\leq T}$.

Predict $(\bar{z}^{i}_{t_{0}}(t))_{t_{0}\leq t\leq T}$ (alternative)

$\mathcal{A}_{i}$ predicts $(\bar{z}^{i}_{t_{0}}(t))_{t_{0}\leq t\leq T}$ by solving the uniquely solvable equation

\begin{split}&d\bar{z}_{t_{0}}^{i}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{0}(t))\bar{z}_{t_{0}}^{i}(t)-(B+F)R^{-1}B^{T}\mathcal{G}(t)]dt,\\ &\bar{z}_{t_{0}}^{i}(t_{0})=\bar{z}_{t_{0}}^{i}(t_{0}).\\ \end{split}

(70)

and $\bar{g}_{t_{0}}^{i}(t)$ is given by $\bar{g}^{i}_{t_{0}}(t)=(P_{0}(t)-P_{1}(t))\bar{z}^{i}_{t_{0}}(t)+\mathcal{G}(t)$.

Predict $(\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$

$\mathcal{A}_{i}$ predicts $(\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ by solving the uniquely solvable equation

\begin{split}&d\hat{z}_{i,t_{0}}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{1}(t))\hat{z}_{i,t_{0}}(t)-(B+F)R^{-1}B^{T}\bar{g}^{i}_{t_{0}}(t)]dt,\\ &\hat{z}_{i,t_{0}}(t_{0})=\hat{z}_{i,t_{0}}(t_{0}).\\ \end{split}

(71)

and $\bar{u}_{t_{0}}^{i}(t)$ is given by $\bar{u}^{i}_{t_{0}}(t)=-R^{-1}B^{T}(P_{1}(t)\hat{z}_{i,t_{0}}(t)+\bar{g}_{t_{0}}^{i}(t))$.
Feedback Control

For the computed $(\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ and $(\bar{u}_{t_{0}}^{i}(t))_{t_{0}\leq t\leq T}$, $\mathcal{A}_{i}$ can solve (57) for $(g_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$. Its optimal feedback control is

\phi_{i,t_{0}}(x_{i}(t),t)=-R^{-1}B^{T}(P_{1}(t)x_{i}(t)+g_{i,t_{0}}(t)),\quad t_{0}\leq t\leq T

(72)

5.6 Estimation Error Affection on Predicted Mean Field

In this subsection, we analyze the effect of the estimation error on the mean field equilibrium predicted by $\mathcal{A}_{i}$. We denote the MF-S and MF-C under correct information by $(z^{c}(t))_{0\leq t\leq T}$ and $(\bar{u}^{c}(t))_{0\leq t\leq T}$. We set $\Delta\bar{z}_{t_{0}}^{i}(t)=\bar{z}_{t_{0}}^{i}(t)-z^{c}(t)$, $\Delta\hat{z}_{i,t_{0}}(t)=\hat{z}_{i,t_{0}}(t)-z^{c}(t)$, $\Delta\bar{u}^{i}_{t_{0}}(t)=\bar{u}^{i}_{t_{0}}(t)-\bar{u}^{c}(t)$, $\Delta\bar{g}^{i}_{t_{0}}(t)=\bar{g}^{i}_{t_{0}}(t)-g^{c}(t)$, and define the estimation errors $\bar{E}^{i}(t):=\Delta\bar{z}_{t}^{i}(t)$ and $E_{i}(t):=\Delta\hat{z}_{i,t}(t)$. Then, according to (6) and (70), we have

\begin{split}&d\Delta\bar{z}_{t_{0}}^{i}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{0}(t))\Delta\bar{z}_{t_{0}}^{i}(t)]dt,\\ &\Delta\bar{z}_{t_{0}}^{i}(t_{0})=\bar{E}^{i}(t_{0}).\\ \end{split}

(73)

We have defined $\Phi_{1}(t)$ as a basis solution of (16), so $\Phi_{1}(t)$ can be computed from $H_{0}(t)=A+C-(B+F)R^{-1}B^{T}P_{0}(t)$. The solution of (73) is given by

\begin{split}\Delta\bar{z}_{t_{0}}^{i}(t)=\Phi_{1}(t)\Phi_{1}^{-1}(t_{0})\bar{E}^{i}(t_{0}).\\ \end{split}

(74)

So we have $\Delta\bar{g}_{t_{0}}^{i}(t)=P_{2}(t)\Phi_{1}(t)\Phi_{1}^{-1}(t_{0})\bar{E}^{i}(t_{0})$. According to (71), we have

\begin{split}&d\Delta\hat{z}_{i,t_{0}}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{1}(t))\Delta\hat{z}_{i,t_{0}}(t)-(B+F)R^{-1}B^{T}\Delta\bar{g}^{i}_{t_{0}}(t)]dt,\\ &\Delta\hat{z}_{i,t_{0}}(t_{0})=E_{i}(t_{0}).\\ \end{split}

(75)

The solution of the above equation is given by

\begin{split}\Delta\hat{z}_{i,t_{0}}(t)=\Phi_{z}(t)\Phi_{z}(t_{0})^{-1}E_{i}(t_{0})+\Phi_{z}(t)\int_{t_{0}}^{t}\Phi_{z}^{-1}(s)f_{z}^{i}(s)ds,\\ \end{split}

(76)

where $f_{z}^{i}(s)=-(B+F)R^{-1}B^{T}\Delta\bar{g}^{i}_{t_{0}}(s)$. Since $(H_{0}(t))_{0\leq t\leq T}$ and $(H_{z}(t))_{0\leq t\leq T}$ can be calculated without any information on the initial states, $(\Delta\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ depends linearly on $[\bar{E}^{i}(t_{0})^{T},E_{i}(t_{0})^{T}]^{T}$ through a map known to all agents. This gives the following theorem.

Theorem 5.2 $(\Delta\hat{z}_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ depends linearly on $[\bar{E}^{i}(t_{0})^{T},E_{i}(t_{0})^{T}]^{T}$, and this linear relationship can be computed by all agents without knowing $[\bar{E}^{i}(t_{0})^{T},E_{i}(t_{0})^{T}]^{T}$:

\begin{split}&\Delta\hat{z}_{i,t_{0}}(t)=\mathcal{M}_{i,z}(t)E_{i}(t_{0})+\mathcal{M}_{0,z}(t)\bar{E}^{i}(t_{0}),\\ &\mathcal{M}_{i,z}(t)=\Phi_{z}(t)\Phi_{z}(t_{0})^{-1},\\ &\mathcal{M}_{0,z}(t)=-\Phi_{z}(t)\int_{t_{0}}^{t}\Phi_{z}^{-1}(s)(B+F)R^{-1}B^{T}P_{2}(s)\Phi_{1}(s)\Phi_{1}^{-1}(t_{0})ds.\\ \end{split}

(77)

This theorem gives the deviation of the MF predicted by $\mathcal{A}_{i}$ from that under correct information.
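The two matrices in Theorem 5.2 follow from (76) in one substitution step. As a sketch, inserting $\Delta\bar{g}^{i}_{t_{0}}(s)=P_{2}(s)\Phi_{1}(s)\Phi_{1}^{-1}(t_{0})\bar{E}^{i}(t_{0})$ into $f_{z}^{i}(s)=-(B+F)R^{-1}B^{T}\Delta\bar{g}^{i}_{t_{0}}(s)$ gives

```latex
\begin{split}
\Delta\hat{z}_{i,t_{0}}(t)=&\ \Phi_{z}(t)\Phi_{z}(t_{0})^{-1}E_{i}(t_{0})\\
&-\Big[\Phi_{z}(t)\int_{t_{0}}^{t}\Phi_{z}^{-1}(s)(B+F)R^{-1}B^{T}P_{2}(s)\Phi_{1}(s)\Phi_{1}^{-1}(t_{0})ds\Big]\bar{E}^{i}(t_{0}).
\end{split}
```

The coefficient of $E_{i}(t_{0})$ is $\mathcal{M}_{i,z}(t)$ and the bracketed term, with its minus sign, is $\mathcal{M}_{0,z}(t)$; the factor $\Phi_{z}^{-1}(s)$ stays inside the integral. In particular $\mathcal{M}_{i,z}(t_{0})=I$ and $\mathcal{M}_{0,z}(t_{0})=0$, which recovers the initial condition $\Delta\hat{z}_{i,t_{0}}(t_{0})=E_{i}(t_{0})$.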

5.7 Estimation Error Affection on Feedback Control

In this subsection, we analyze the effect of the estimation error on the feedback control law used by $\mathcal{A}_{i}$. We denote the feedback control under correct information by $u_{i}^{c}(t)=\phi_{c}(x_{i}^{c}(t),t)=-R^{-1}B^{T}(P_{1}(t)x_{i}^{c}(t)+g^{c}(t))$, and the actual trajectory of $\mathcal{A}_{i}$ by $x_{i}^{A}(t)$.

We set $\Delta g_{i,t_{0}}(t)=g_{i,t_{0}}(t)-g^{c}(t)$, $\Delta\phi_{i,t_{0}}(\Delta x_{i}(t),t)=\phi_{i,t_{0}}(x_{i}^{A}(t),t)-\phi_{c}(x_{i}^{c}(t),t)$, and $\Delta x_{i}(t)=x_{i}^{A}(t)-x_{i}^{c}(t)$. Then, according to (5) and (57), we have

\begin{split}&d\Delta g_{i,t_{0}}(t)=-[(A^{T}-P_{1}BR^{-1}B^{T})\Delta g_{i,t_{0}}(t)+(P_{1}C-P_{1}FR^{-1}B^{T}P_{1}-Q\Gamma)\Delta\hat{z}_{i,t_{0}}(t)-P_{1}FR^{-1}B^{T}\Delta\bar{g}_{t_{0}}^{i}(t)]dt,\\ &\Delta g_{i,t_{0}}(T)=-\bar{Q}\bar{\Gamma}\Delta\hat{z}_{i,t_{0}}(T).\end{split}

(78)

Using the method of variation of parameters, the solution of (78) is given by

\begin{split}&\Delta g_{i,t_{0}}(t)=-\Phi_{g}(t)\Phi_{g}^{-1}(T)\bar{Q}\bar{\Gamma}\Delta\hat{z}_{i,t_{0}}(T)+\Phi_{g}(t)\int_{T}^{t}\Phi_{g}^{-1}(s)f_{g}^{1}(s)ds,\\ \end{split}

(79)

where $f_{g}^{1}(s)=-(P_{1}(s)C-P_{1}(s)FR^{-1}B^{T}P_{1}(s)-Q\Gamma)\Delta\hat{z}_{i,t_{0}}(s)+P_{1}(s)FR^{-1}B^{T}\Delta\bar{g}_{t_{0}}^{i}(s)$. Applying Theorem 5.2, we obtain the following theorem.

Theorem 5.3 $(\Delta g_{i,t_{0}}(t))_{t_{0}\leq t\leq T}$ depends linearly on $[\bar{E}^{i}(t_{0})^{T},E_{i}(t_{0})^{T}]^{T}$, and this linear relationship can be computed by all agents without knowing $[\bar{E}^{i}(t_{0})^{T},E_{i}(t_{0})^{T}]^{T}$:

\begin{split}&\Delta g_{i,t_{0}}(t)=\mathcal{M}_{i,g}(t)E_{i}(t_{0})+\mathcal{M}_{0,g}(t)\bar{E}^{i}(t_{0}),\\ &\mathcal{M}_{i,g}(t)=-\Phi_{g}(t)\Phi_{g}^{-1}(T)\bar{Q}\bar{\Gamma}\mathcal{M}_{i,z}(T)-\Phi_{g}(t)\int_{T}^{t}\Phi_{g}^{-1}(s)(P_{1}(s)C-P_{1}(s)FR^{-1}B^{T}P_{1}(s)-Q\Gamma)\mathcal{M}_{i,z}(s)ds,\\ &\mathcal{M}_{0,g}(t)=-\Phi_{g}(t)\Phi_{g}^{-1}(T)\bar{Q}\bar{\Gamma}\mathcal{M}_{0,z}(T)-\Phi_{g}(t)\int_{T}^{t}\Phi_{g}^{-1}(s)[(P_{1}(s)C-P_{1}(s)FR^{-1}B^{T}P_{1}(s)-Q\Gamma)\mathcal{M}_{0,z}(s)\\ &\quad-P_{1}(s)FR^{-1}B^{T}P_{2}(s)\Phi_{1}(s)\Phi_{1}^{-1}(t_{0})]ds.\\ \end{split}

(80)

According to Theorem 5.3, we have $\Delta\phi_{i,t_{0}}(\Delta x_{i}(t),t)=-R^{-1}B^{T}(P_{1}(t)\Delta x_{i}(t)+\Delta g_{i,t_{0}}(t))$.

Remark 5.2 The relationship between $[\bar{E}^{i}(t_{0})^{T},E_{i}(t_{0})^{T}]^{T}$ and $\Delta\phi_{i,t_{0}}(\Delta x_{i}(t),t)$ can be represented as

\begin{split}&\Delta\phi_{i,t_{0}}(\Delta x_{i}(t),t)=-R^{-1}B^{T}(P_{1}(t)\Delta x_{i}(t)+\mathcal{M}_{i,g}(t)E_{i}(t_{0})+\mathcal{M}_{0,g}(t)\bar{E}^{i}(t_{0})).\\ \end{split}

(81)

5.8 Estimation Error Affection on Actual Mean Field

Let $\Delta\bar{g}_{t_{0}}(t)=\bar{g}_{t_{0}}(t)-g^{c}(t)$, $\Delta z^{A}(t)=z^{A}(t)-z^{c}(t)$, $\bar{E}^{1}(t_{0})=\sum_{i=1}^{N}\bar{E}^{i}(t_{0})/N$, and $\bar{E}(t_{0})=\sum_{i=1}^{N}E_{i}(t_{0})/N$. Averaging the relation of Theorem 5.3 over all agents, when $N\rightarrow\infty$ we have

\begin{split}\Delta\bar{g}_{t_{0}}(t)=\mathcal{M}_{i,g}(t)\bar{E}(t_{0})+\mathcal{M}_{0,g}(t)\bar{E}^{1}(t_{0}).\\ \end{split}

(82)

Then according to (60), we have

\begin{split}&d\Delta z^{A}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{1}(t))\Delta z^{A}(t)-(B+F)R^{-1}B^{T}\Delta\bar{g}_{t}(t)]dt,\\ &\Delta z^{A}(0)=0.\\ \end{split}

(83)

Substituting $\Delta\bar{g}_{t}(t)=\mathcal{M}_{i,g}(t)\bar{E}(t)+\mathcal{M}_{0,g}(t)\bar{E}^{1}(t)$, that is, (82) evaluated with $t_{0}=t$, into the above equation, we have

\begin{split}&d\Delta z^{A}(t)=[(A+C-(B+F)R^{-1}B^{T}P_{1}(t))\Delta z^{A}(t)-(B+F)R^{-1}B^{T}(\mathcal{M}_{i,g}(t)\bar{E}(t)+\mathcal{M}_{0,g}(t)\bar{E}^{1}(t))]dt,\\ &\Delta z^{A}(0)=0.\\ \end{split}

(84)

Using the method of variation of parameters, the solution of (84) is given by

\begin{split}&\Delta z^{A}(t)=\Phi_{z}(t)\int_{0}^{t}\Phi_{z}^{-1}(s)f_{z}^{1}(s)ds,\\ \end{split}

(85)

where $f_{z}^{1}(s)=-(B+F)R^{-1}B^{T}(\mathcal{M}_{i,g}(s)\bar{E}(s)+\mathcal{M}_{0,g}(s)\bar{E}^{1}(s))$. This gives the following theorem.

Theorem 5.4 The relationship between $(\Delta z^{A}(t))_{0\leq t\leq T}$ and $\bar{E}^{1}(t),\bar{E}(t)$ can be represented as

\begin{split}&\Delta z^{A}(t)=-\Phi_{z}(t)\int_{0}^{t}\Phi_{z}^{-1}(s)(B+F)R^{-1}B^{T}(\mathcal{M}_{i,g}(s)\bar{E}(s)+\mathcal{M}_{0,g}(s)\bar{E}^{1}(s))ds,\\ \end{split}

(86)

and $\mathcal{M}_{i,g}(t),\mathcal{M}_{0,g}(t),\Phi_{z}(t),0\leq t\leq T$ can be computed by all agents.

Remark 5.3 Notice that when $\bar{E}^{1}(t)=\bar{E}(t)=0,0\leq t\leq T$ , we have $z^{A}(t)=z^{c}(t)$ .

6 Simulations

6.1 Model Formulations

We set $N=100$. Agent $\mathcal{A}_{i}$ knows $x_{i}(0)$ and observes an erroneous initial mean field state $z_{i}(0)=z^{0}+E_{i}$ at $t=0$. The initial states follow a normal distribution with mean $z^{0}=[0.3,0.5]^{T}$ and covariance matrix $0.001I_{2\times 2}$. The errors $\{E_{i},1\leq i\leq N\}$ follow a normal distribution with mean $\bar{E}$ and covariance matrix $0.1I_{2\times 2}$. The dynamics and cost function of $\mathcal{A}_{i}$ are given by (1), where

\begin{split}&C=0.5I_{2\times 2},A=-I_{2\times 2},B=F=0.5I_{2\times 2},\\ &D=0\cdot I_{2\times 2},R=I_{2\times 2},T=2,\\ &Q_{I}=I_{2\times 2}=\bar{Q}_{I},Q=I_{2\times 2}=\bar{Q},\\ &\Gamma=I_{2\times 2}=\bar{\Gamma},\eta=0=\bar{\eta},s=(0.5,0.3)^{T}.\\ \end{split}

At $t=0$ , agents give their feedback control $(\phi_{i}(x_{i}(t),t))_{0\leq t\leq T},1\leq i\leq N$ .
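The setup of this subsection can be written down in a few lines. The following is a minimal sketch, assuming NumPy; the dictionary layout, the variable names, and the random seed are hypothetical, and $\bar{E}=[0.4,-0.4]^{T}$ is only one of the values used in the figures. The vector $\bar{s}$ is not listed among the parameters above, so it is left out rather than guessed.

```python
import numpy as np

I2 = np.eye(2)

# Parameters as listed in Section 6.1 (s-bar is not specified there, so it is omitted).
params = {
    "A": -I2, "B": 0.5 * I2, "C": 0.5 * I2, "F": 0.5 * I2,
    "D": 0.0 * I2, "R": I2, "T": 2.0,
    "Q_I": I2, "Qbar_I": I2, "Q": I2, "Qbar": I2,
    "Gamma": I2, "Gammabar": I2,
    "eta": np.zeros(2), "etabar": np.zeros(2),
    "s": np.array([0.5, 0.3]),
}

N = 100
z0 = np.array([0.3, 0.5])            # mean of the initial distribution
E_bar = np.array([0.4, -0.4])        # mean observation error (value from the Fig. 1 caption)
rng = np.random.default_rng(0)       # the seed is an arbitrary choice

x0 = rng.multivariate_normal(z0, 0.001 * I2, size=N)   # initial states x_i(0)
E = rng.multivariate_normal(E_bar, 0.1 * I2, size=N)   # observation errors E_i
z_obs = z0 + E                                         # erroneous observations z_i(0)
```

Since $D=0$ here, the trajectories are deterministic once `x0` and `z_obs` are drawn, which matches the deterministic situation studied in Section 4.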

6.2 MF-S in Predictions

The evolution of the agents' trajectories, the predicted MF-S, and the modified trajectories under erroneous initial information is shown in Fig. 1 and Fig. 2. The color ranges from blue to yellow, corresponding to time from $0$ to $T$.

It can be seen that the erroneous initial mean field state information leads to a significant difference among these situations. We compare $z^{A},z^{c},z^{new}$ when $\bar{E}=[-0.4;0.4]$ in Fig. 3.

Figure 1: MF-S in predictions under correct information and that under erroneous information, $\bar{E}=[0.4;-0.4]$ .

6.3 Initial Error Affection

We compare the deviations $\Delta z^{A}(t)$ and $\Delta z^{new}(t)$ for different initial errors in Fig. 4. The deviations depend linearly on $\bar{E}$, which verifies the linear relationship mentioned in Section 3.

6.4 Re-game Time Affection

We simulate the situations in which agents re-game at different times $t_{0}$. Fig. 5 compares the deviations $\Delta z^{new}(t)$ under different $t_{0}$.

References

[1] M. Huang, P. E. Caines, and R. P. Malhamé, Large-population cost coupled LQG problems with nonuniform agents: Individual-mass behavior and decentralized $\epsilon$-Nash equilibria. IEEE Transactions on Automatic Control, vol. 52, no. 9, pp. 1560–1571, 2007
[2] P. E. Caines, M. Huang, and R. P. Malhamé, Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst., vol. 6, pp. 221–252, 2006
[3] T. Li and J.-F. Zhang, Asymptotically optimal decentralized control for large population stochastic multiagent systems, IEEE Transactions on Automatic Control, vol. 53, no. 7, pp. 1643–1660, 2008
[4] J.-M. Lasry and P.-L. Lions, Jeux à champ moyen. I – Le cas stationnaire, Comptes Rendus Mathematique, vol. 343, no. 9, pp. 619–625, 2006
[5] J.-M. Lasry and P.-L. Lions, Jeux à champ moyen. II – Horizon fini et contrôle optimal, Comptes Rendus Mathematique, vol. 343, no. 10, pp. 679–684, 2006
[6] J.-M. Lasry and P.-L. Lions, Mean field games. Japanese Journal of Mathematics, vol. 2, no. 1, pp. 229–260, Mar 2007
[7] L. Ren, Y. Jin, Z. Niu, W. Yao and X. Zhang, Hierarchical Cooperation in LQ Multi-Population Mean Field Game With Its Application to Opinion Evolution, IEEE Transactions on Network Science and Engineering, vol. 11, no. 5, pp. 5008-5022, Sept.-Oct. 2024
[8] Bensoussan, A., Sung, K.C.J., Yam, S.C.P. et al., Linear-Quadratic Mean Field Games. J Optim Theory Appl, vol. 169, pp. 496–529, 2016
[9] Bardi, M., Priuli, F.S., Linear-quadratic N-person and mean-field games with ergodic cost. SIAM J. Control Optim, vol. 52, no. 5, pp. 3022–3052, 2014
[10] Bensoussan, A., Sung, K.C.J., Yam, S.C.P., Linear-quadratic time-inconsistent mean field games. Dyn. Games Appl., vol. 3, no.4, pp. 537–552, 2013
[11] Priuli, F.S., Linear-quadratic N-person and mean-field games: infinite horizon games with discounted cost and singular limits. Dyn. Games Appl., vol. 5, no.3, pp. 397–419, 2014
[12] Bensoussan, A., Chau, M.H.M. and Yam, S.C.P., Mean Field Games with a Dominating Player. Appl Math Optim, vol. 74, pp. 91–128, 2016
[13] Z. Liu, B. Wu, and H. Lin, A mean field game approach to swarming robots control. 2018 Annual American Control Conference (ACC), pp. 4293–4298, 2018
[14] K. Elamvazhuthi and S. Berman, Mean-field models in swarm robotics: a survey. Bioinspiration $\&$ Biomimetics, vol. 15, no. 1, pp. 015001, Nov 2019
[15] R. Lin, Z. Xu, X. Huang, J. Gao, H. Chen, and T. Shen, Optimal scheduling management of the parking lot and decentralized charging of electric vehicles based on mean field game. Applied Energy, vol. 328, pp. 120198, 2022
[16] A. C. Kizilkale, R. Salhab, and R. P. Malhamé, An integral control formulation of mean field game based large scale coordination of loads in smart grids. Automatica, vol. 100, pp. 312–322, 2019
[17] A. De Paola, V. Trovato, D. Angeli, and G. Strbac, A mean field game approach for distributed control of thermostatic loads acting in simultaneous energy-frequency response markets. IEEE Transactions on Smart Grid, vol. 10, no. 6, pp. 5987–5999, 2019
[18] A. Lachapelle and M.-T. Wolfram, On a mean field game approach modeling congestion and aversion in pedestrian crowds. Transportation Research Part B: Methodological, vol. 45, no. 10, pp. 1572–1589, 2011
[19] A. Aurell and B. Djehiche, Mean-field type modeling of nonlocal crowd aversion in pedestrian crowd dynamics. SIAM Journal on Control and Optimization, vol. 56, no. 1, pp. 434–455, 2018
[20] Y. Achdou and J.-M. Lasry, Mean Field Games for Modeling Crowd Motion. Cham: Springer International Publishing, pp. 17–42, 2019
