Synchronization behind Learning in Periodic Zero-Sum Games Triggers Divergence from Nash equilibrium

Authors: Yuma Fujimoto, Kaito Ariu, Kenshi Abe

Abstract: Learning in zero-sum games studies a situation where multiple agents competitively learn their strategy. In such multi-agent learning, we often see that the strategies cycle around their optimum, i.e., Nash equilibrium. When a game periodically varies (called a "periodic" game), however, the Nash equilibrium moves generically. How learning dynamics behave in such periodic games is of interest but still unclear. Interestingly, we discover that the behavior is highly dependent on the relationship between the two speeds at which the game changes and at which players learn. We observe that when these two speeds synchronize, the learning dynamics diverge, and their time-average does not converge. Otherwise, the learning dynamics draw complicated cycles, but their time-average converges. Under some assumptions introduced for the dynamical systems analysis, we prove that this behavior occurs. Furthermore, our experiments observe this behavior even if removing these assumptions. This study discovers a novel phenomenon, i.e., synchronization, and gains insight widely applicable to learning in periodic games.

Submitted 20 August, 2024; originally announced August 2024.

Comments: 8 pages, 5 figures (main); 7 pages, 1 figure (appendix)

arXiv:2405.14546 [pdf, other]

Global Behavior of Learning Dynamics in Zero-Sum Games with Memory Asymmetry

Authors: Yuma Fujimoto, Kaito Ariu, Kenshi Abe

Abstract: This study examines the global behavior of dynamics in learning in games between two players, X and Y. We consider the simplest situation for memory asymmetry between two players: X memorizes the other Y's previous action and uses reactive strategies, while Y has no memory. Although this memory complicates the learning dynamics, we discover two novel quantities that characterize the global behavior of such complex dynamics. One is an extended Kullback-Leibler divergence from the Nash equilibrium, a well-known conserved quantity from previous studies. The other is a family of Lyapunov functions of X's reactive strategy. These two quantities capture the global behavior in which X's strategy becomes more exploitative, and the exploited Y's strategy converges to the Nash equilibrium. Indeed, we theoretically prove that Y's strategy globally converges to the Nash equilibrium in the simplest game equipped with an equilibrium in the interior of strategy spaces. Furthermore, our experiments also suggest that this global convergence is universal for more advanced zero-sum games than the simplest game. This study provides a novel characterization of the global behavior of learning in games through a couple of indicators.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 11 pages, 4 figures (main); 4 pages (appendix)

arXiv:2402.10825 [pdf, other]

Nash Equilibrium and Learning Dynamics in Three-Player Matching $m$-Action Games

Authors: Yuma Fujimoto, Kaito Ariu, Kenshi Abe

Abstract: Learning in games discusses the processes where multiple players learn their optimal strategies through the repetition of game plays. The dynamics of learning between two players in zero-sum games, such as matching pennies, where their benefits are competitive, have already been well analyzed. However, it is still unexplored and challenging to analyze the dynamics of learning among three players. In this study, we formulate a minimalistic game where three players compete to match their actions with one another. Although interaction among three players diversifies and complicates the Nash equilibria, we fully analyze the equilibria. We also discuss the dynamics of learning based on some famous algorithms categorized into Follow the Regularized Leader. From both theoretical and experimental aspects, we characterize the dynamics by categorizing three-player interactions into three forces to synchronize their actions, switch their actions rotationally, and seek competition.

Submitted 20 August, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: 9 pages, 4 figures (main), 9 pages, 1 figure (appendix)

arXiv:2402.01734 [pdf, other]

CFTM: Continuous time fractional topic model

Authors: Kei Nakagawa, Kohei Hayashi, Yugo Fujimoto

Abstract: In this paper, we propose the Continuous Time Fractional Topic Model (cFTM), a new method for dynamic topic modeling. This approach incorporates fractional Brownian motion (fBm) to effectively identify positive or negative correlations in topic and word distribution over time, revealing long-term dependency or roughness. Our theoretical analysis shows that the cFTM can capture these long-term dependency or roughness in both topic and word distributions, mirroring the main characteristics of fBm. Moreover, we prove that the parameter estimation process for the cFTM is on par with that of LDA, traditional topic models. To demonstrate the cFTM's property, we conduct empirical study using economic news articles. The results from these tests support the model's ability to identify and track long-term dependency or roughness in topics over time.

Submitted 6 February, 2024; v1 submitted 29 January, 2024; originally announced February 2024.

arXiv:2401.17714 [pdf, other]

3D-Plotting Algorithm for Insects using YOLOv5

Authors: Daisuke Mori, Hiroki Hayami, Yasufumi Fujimoto, Isao Goto

Abstract: In ecological research, accurately collecting spatiotemporal position data is a fundamental task for understanding the behavior and ecology of insects and other organisms. In recent years, advancements in computer vision techniques have reached a stage of maturity where they can support, and in some cases, replace manual observation. In this study, a simple and inexpensive method for monitoring insects in three dimensions (3D) was developed so that their behavior could be observed automatically in experimental environments. The main achievements of this study have been to create a 3D monitoring algorithm using inexpensive cameras and other equipment to design an adjusting algorithm for depth error, and to validate how our plotting algorithm is quantitatively precise, all of which had not been realized in conventional studies. By offering detailed 3D visualizations of insects, the plotting algorithm aids researchers in more effectively comprehending how insects interact within their environments.

Submitted 31 January, 2024; originally announced January 2024.

arXiv:2310.17877 [pdf, other]

ASPIRO: Any-shot Structured Parsing-error-Induced ReprOmpting for Consistent Data-to-Text Generation

Authors: Martin Vejvar, Yasutaka Fujimoto

Abstract: We present ASPIRO, an approach for structured data verbalisation into short template sentences in zero to few-shot settings. Unlike previous methods, our approach prompts large language models (LLMs) to directly produce entity-agnostic templates, rather than relying on LLMs to faithfully copy the given example entities, or validating/crafting the templates manually. We incorporate LLM re-prompting, triggered by algorithmic parsing checks, as well as the PARENT metric induced consistency validation to identify and rectify template generation problems in real-time. ASPIRO, compared to direct LLM output, averages 66% parsing error rate reduction in generated verbalisations of RDF triples on the DART dataset. Our best 5-shot text-davinci-003 setup, scoring BLEU of 50.62, METEOR of 45.16, BLEURT of 0.82, NUBIA of 0.87, and PARENT of 0.8962 on the Rel2Text dataset, competes effectively with recent fine-tuned pre-trained language models.

Submitted 26 October, 2023; originally announced October 2023.

Comments: Accepted to Findings of EMNLP2023, code available at https://github.com/vejvarm/ASPIRO

arXiv:2310.12581 [pdf, ps, other]

Evolutionary stability of cooperation by the leading eight norms in indirect reciprocity under noisy and private assessment

Authors: Yuma Fujimoto, Hisashi Ohtsuki

Abstract: Indirect reciprocity is a mechanism that explains large-scale cooperation in human societies. In indirect reciprocity, an individual chooses whether or not to cooperate with another based on reputation information, and others evaluate the action as good or bad. Under what evaluation rule (called "social norm") cooperation evolves has long been of central interest in the literature. It has been reported that if individuals can share their evaluations (i.e., public reputation), social norms called "leading eight" can be evolutionarily stable. On the other hand, when they cannot share their evaluations (i.e., private assessment), the evolutionary stability of cooperation is still in question. To tackle this problem, we create a novel method to analyze the reputation structure in the population under private assessment. Specifically, we characterize each individual by two variables, "goodness" (what proportion of the population considers the individual as good) and "self-reputation" (whether an individual thinks of him/herself as good or bad), and analyze the stochastic process of how these two variables change over time. We discuss evolutionary stability of each of the leading eight social norms by studying the robustness against invasions of unconditional cooperators and defectors. We identify key pivots in those social norms for establishing a high level of cooperation or stable cooperation against mutants. Our finding gives an insight into how human cooperation is established in a real-world society.

Submitted 19 October, 2023; originally announced October 2023.

Comments: 12 pages & 5 figures (main), 7 pages & 2 figures (supplement)

arXiv:2305.13619 [pdf, other]

Memory Asymmetry Creates Heteroclinic Orbits to Nash Equilibrium in Learning in Zero-Sum Games

Authors: Yuma Fujimoto, Kaito Ariu, Kenshi Abe

Abstract: Learning in games considers how multiple agents maximize their own rewards through repeated games. Memory, an ability that an agent changes his/her action depending on the history of actions in previous games, is often introduced into learning to explore more clever strategies and discuss the decision-making of real agents like humans. However, such games with memory are hard to analyze because they exhibit complex phenomena like chaotic dynamics or divergence from Nash equilibrium. In particular, how asymmetry in memory capacities between agents affects learning in games is still unclear. In response, this study formulates a gradient ascent algorithm in games with asymmetry memory capacities. To obtain theoretical insights into learning dynamics, we first consider a simple case of zero-sum games. We observe complex behavior, where learning dynamics draw a heteroclinic connection from unstable fixed points to stable ones. Despite this complexity, we analyze learning dynamics and prove local convergence to these stable fixed points, i.e., the Nash equilibria. We identify the mechanism driving this convergence: an agent with a longer memory learns to exploit the other, which in turn endows the other's utility function with strict concavity. We further numerically observe such convergence in various initial strategies, action numbers, and memory lengths. This study reveals a novel phenomenon due to memory asymmetry, providing fundamental strides in learning in games and new insights into computing equilibria.

Submitted 16 February, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

Comments: 9 pages & 5 figures (main), 5 pages & 2 figures (appendix)

arXiv:2302.03265 [pdf, ps, other]

doi 10.1073/pnas.2300544120

Evolutionary stability of cooperation in indirect reciprocity under noisy and private assessment

Authors: Yuma Fujimoto, Hisashi Ohtsuki

Abstract: Indirect reciprocity is a mechanism that explains large-scale cooperation in humans. In indirect reciprocity, individuals use reputations to choose whether or not to cooperate with a partner and update others' reputations. A major question is how the rules to choose their actions and the rules to update reputations evolve. In the public reputation case, where all individuals share the evaluation of others, social norms called Simple Standing (SS) and Stern Judging (SJ) have been known to maintain cooperation. However, in the case of private assessment where individuals independently evaluate others, the mechanism of maintenance of cooperation is still largely unknown. This study theoretically shows for the first time that cooperation by indirect reciprocity can be evolutionarily stable under private assessment. Specifically, we find that SS can be stable, but SJ can never be. This is intuitive because SS can correct interpersonal discrepancies in reputations through its simplicity. On the other hand, SJ is too complicated to avoid an accumulation of errors, which leads to the collapse of cooperation. We conclude that moderate simplicity is a key to success in maintaining cooperation under the private assessment. Our result provides a theoretical basis for evolution of human cooperation.

Submitted 7 February, 2023; originally announced February 2023.

Comments: 12 pages, 3 figures, 1 table (main); 15 pages, 4 figures, 2 tables (supplement)

arXiv:2302.01073 [pdf, other]

Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium

Authors: Yuma Fujimoto, Kaito Ariu, Kenshi Abe

Abstract: Repeated games consider a situation where multiple agents are motivated by their independent rewards throughout learning. In general, the dynamics of their learning become complex. Especially when their rewards compete with each other like zero-sum games, the dynamics often do not converge to their optimum, i.e., the Nash equilibrium. To tackle such complexity, many studies have understood various learning algorithms as dynamical systems and discovered qualitative insights among the algorithms. However, such studies have yet to handle multi-memory games (where agents can memorize actions they played in the past and choose their actions based on their memories), even though memorization plays a pivotal role in artificial intelligence and interpersonal relationship. This study extends two major learning algorithms in games, i.e., replicator dynamics and gradient ascent, into multi-memory games. Then, we prove their dynamics are identical. Furthermore, theoretically and experimentally, we clarify that the learning dynamics diverge from the Nash equilibrium in multi-memory zero-sum games and reach heteroclinic cycles (sojourn longer around the boundary of the strategy space), providing a fundamental advance in learning in games.

Submitted 22 May, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

Comments: 8 pages & 4 figures (main), 6 pages & 1 figure (appendix)

arXiv:2210.17030 [pdf, other]

Uncertainty Aware Trader-Company Method: Interpretable Stock Price Prediction Capturing Uncertainty

Authors: Yugo Fujimoto, Kei Nakagawa, Kentaro Imajo, Kentaro Minami

Abstract: Machine learning is an increasingly popular tool with some success in predicting stock prices. One promising method is the Trader-Company (TC) method, which takes into account the dynamism of the stock market and has both high predictive power and interpretability. Machine learning-based stock prediction methods including the TC method have been concentrating on point prediction. However, point prediction in the absence of uncertainty estimates lacks credibility quantification and raises concerns about safety. The challenge in this paper is to make an investment strategy that combines high predictive power and the ability to quantify uncertainty. We propose a novel approach called Uncertainty Aware Trader-Company Method (UTC) method. The core idea of this approach is to combine the strengths of both frameworks by merging the TC method with the probabilistic modeling, which provides probabilistic predictions and uncertainty estimations. We expect this to retain the predictive power and interpretability of the TC method while capturing the uncertainty. We theoretically prove that the proposed method estimates the posterior variance and does not introduce additional biases from the original TC method. We conduct a comprehensive evaluation of our approach based on the synthetic and real market datasets. We confirm with synthetic data that the UTC method can detect situations where the uncertainty increases and the prediction is difficult. We also confirmed that the UTC method can detect abrupt changes in data generating distributions. We demonstrate with real market data that the UTC method can achieve higher returns and lower risks than baselines.

Submitted 2 November, 2022; v1 submitted 30 October, 2022; originally announced October 2022.

Comments: IEEE BIGDATA 2022 Accepted

arXiv:2203.12898 [pdf, ps, other]

Reputation structure in indirect reciprocity under noisy and private assessment

Authors: Yuma Fujimoto, Hisashi Ohtsuki

Abstract: Evaluation relationships are pivotal for maintaining a cooperative society. A formation of the evaluation relationships has been discussed in terms of indirect reciprocity, by modeling dynamics of good or bad reputations among individuals. Recently, a situation that individuals independently evaluate others with errors (i.e., noisy and private reputation) is considered, where the reputation structure (from what proportion of individuals in the population each receives good reputations, defined as goodness here) becomes complex, and thus has been studied mainly with numerical simulations. The present study gives a theoretical analysis of such complex reputation structure. We formulate the time change of goodness of individuals caused by updates of reputations among individuals. By considering a large population, we derive dynamics of the frequency distribution of goodnesses. An equilibrium state of the dynamics is approximated by a summation of Gaussian functions. We demonstrate that the theoretical solution well fits the numerical calculation. From the theoretical solution, we obtain a new interpretation of the complex reputation structure. This study provides a novel mathematical basis for cutting-edge studies on indirect reciprocity.

Submitted 9 May, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

Comments: 14 pages, 5 figures (main manuscript); 3 pages (supplement)

arXiv:2101.08156 [pdf, other]

doi 10.1007/JHEP03(2021)273

Extensive Studies of the Neutron Star Equation of State from the Deep Learning Inference with the Observational Data Augmentation

Authors: Yuki Fujimoto, Kenji Fukushima, Koichi Murase

Abstract: We discuss deep learning inference for the neutron star equation of state (EoS) using the real observational data of the mass and the radius. We make a quantitative comparison between the conventional polynomial regression and the neural network approach for the EoS parametrization. For our deep learning method to incorporate uncertainties in observation, we augment the training data with noise fluctuations corresponding to observational uncertainties. Deduced EoSs can accommodate a weak first-order phase transition, and we make a histogram for likely first-order regions. We also find that our observational data augmentation has a byproduct to tame the overfitting behavior. To check the performance improved by the data augmentation, we set up a toy model as the simplest inference problem to recover a double-peaked function and monitor the validation loss. We conclude that the data augmentation could be a useful technique to evade the overfitting without tuning the neural network architecture such as inserting the dropout.

Submitted 20 January, 2021; originally announced January 2021.

Comments: 45 pages, 25 figures

Journal ref: JHEP 03 (2021) 273

arXiv:1810.01740 [pdf, ps, other]

doi 10.1088/1367-2630/ab0459

Functional Dynamics by Intention Recognition in Iterated Games

Authors: Yuma Fujimoto, Kunihiko Kaneko

Abstract: Intention recognition is an important characteristic of intelligent agents. In their interactions with others, they try to read others' intentions and make an image of others to choose their actions accordingly. While the way in which players choose their actions depending on such intentions has been investigated in game theory, how dynamic changes in intentions by mutually reading others' intentions are incorporated into game theory has not been explored. We present a novel formulation of game theory in which players read others' intentions and change their own through an iterated game. Here, intention is given as a function of the other's action and the own action to be taken accordingly as the dependent variable, while the mutual recognition of intention is represented as the functional dynamics. It is shown that a player suffers no disadvantage when he/she recognizes the other's intention, whereas the functional dynamics reach equilibria in which both players' intentions are optimized. These cover a classical Nash and Stackelberg equilibria but we extend them in this study: Novel equilibria exist depending on the degree of mutual recognition. Moreover, the degree to which each player recognizes the other can also differ. This formulation is applied to resource competition, duopoly, and prisoner's dilemma games. For example, in the resource competition game with player-dependent capacity on gaining the resource, the superior player's recognition leads to the exploitation of the other, while the inferior player's recognition leads to cooperation through which both players' payoffs increase.

Submitted 26 September, 2018; originally announced October 2018.

Comments: 20 pages, 6 figures, and supplementary material

arXiv:1704.07147 [pdf, ps, other]

doi 10.1007/978-3-319-91253-0_5

A Neural Network model with Bidirectional Whitening

Authors: Yuki Fujimoto, Toru Ohira

Abstract: We present here a new model and algorithm which performs an efficient Natural gradient descent for Multilayer Perceptrons. Natural gradient descent was originally proposed from a point of view of information geometry, and it performs the steepest descent updates on manifolds in a Riemannian space. In particular, we extend an approach taken by the "Whitened neural networks" model. We make the whitening process not only in feed-forward direction as in the original model, but also in the back-propagation phase. Its efficacy is shown by an application of this "Bidirectional whitened neural networks" model to a handwritten character recognition data (MNIST data).

Submitted 24 April, 2017; originally announced April 2017.

Comments: 16 pages

Journal ref: In: Rutkowski L., Scherer R., Korytkowski M., Pedrycz W., Tadeusiewicz R., Zurada J. (eds) Artificial Intelligence and Soft Computing. ICAISC 2018. Lecture Notes in Computer Science, vol 10841. Springer, Cham

Showing 1–15 of 15 results for author: Fujimoto, Y