-
Vaporetto: Efficient Japanese Tokenization Based on Improved Pointwise Linear Classification
Authors:
Koichi Akabe,
Shunsuke Kanda,
Yusuke Oda,
Shinsuke Mori
Abstract:
This paper proposes an approach to improve the runtime efficiency of Japanese tokenization based on the pointwise linear classification (PLC) framework, which formulates the whole tokenization process as a sequence of linear classification problems. Our approach optimizes tokenization by leveraging the characteristics of the PLC framework and the task definition. Our approach involves (1) composing multiple classifications into array-based operations, (2) efficient feature lookup with memory-optimized automata, and (3) three orthogonal pre-processing methods for reducing actual score calculation. Thus, our approach makes the tokenization speed 5.7 times faster than the current approach based on the same model without decreasing tokenization accuracy. Our implementation is available at https://github.com/daac-tools/vaporetto under the MIT or Apache-2.0 license.
Submitted 24 June, 2024;
originally announced June 2024.
-
BioVL-QR: Egocentric Biochemical Video-and-Language Dataset Using Micro QR Codes
Authors:
Taichi Nishimura,
Koki Yamamoto,
Yuto Haneji,
Keiya Kajimura,
Chihiro Nishiwaki,
Eriko Daikoku,
Natsuko Okuda,
Fumihito Ono,
Hirotaka Kameko,
Shinsuke Mori
Abstract:
This paper introduces a biochemical vision-and-language dataset, which consists of 24 egocentric experiment videos, corresponding protocols, and video-and-language alignments. The key challenge in the wet-lab domain is that detecting equipment, reagents, and containers is difficult because the lab table is cluttered with objects and some objects are indistinguishable. Therefore, previous studies assume that objects are manually annotated and given for downstream tasks, but this is costly and time-consuming. To address this issue, this study focuses on Micro QR Codes to detect objects automatically. From our preliminary study, we found that detecting objects using only Micro QR Codes is still difficult because the researchers manipulate objects, frequently causing blur and occlusion. To address this, we also propose a novel object labeling method that combines a Micro QR Code detector and an off-the-shelf hand object detector. As one application of our dataset, we conduct the task of generating protocols from experiment videos and find that our approach can generate accurate protocols.
Submitted 3 April, 2024;
originally announced April 2024.
-
Text-driven Affordance Learning from Egocentric Vision
Authors:
Tomoya Yoshida,
Shuhei Kurita,
Taichi Nishimura,
Shinsuke Mori
Abstract:
Visual affordance learning is a key component for robots to understand how to interact with objects. Conventional approaches in this field rely on pre-defined objects and actions, falling short of capturing diverse interactions in real-world scenarios. The key idea of our approach is employing textual instruction, targeting various affordances for a wide range of objects. This approach covers both hand-object and tool-object interactions. We introduce text-driven affordance learning, aiming to learn contact points and manipulation trajectories from an egocentric view following textual instruction. In our task, contact points are represented as heatmaps, and the manipulation trajectory as sequences of coordinates that incorporate both linear and rotational movements for various manipulations. However, when we gather data for this task, manual annotations of these diverse interactions are costly. To this end, we propose a pseudo dataset creation pipeline and build a large pseudo-training dataset: TextAFF80K, consisting of over 80K instances of the contact points, trajectories, images, and text tuples. We extend existing referring expression comprehension models for our task, and experimental results show that our approach robustly handles multiple affordances, serving as a new standard for affordance learning in real-world scenarios.
Submitted 3 April, 2024;
originally announced April 2024.
-
Towards Algorithmic Fidelity: Mental Health Representation across Demographics in Synthetic vs. Human-generated Data
Authors:
Shinka Mori,
Oana Ignat,
Andrew Lee,
Rada Mihalcea
Abstract:
Synthetic data generation has the potential to impact applications and domains with scarce data. However, before such data is used for sensitive tasks such as mental health, we need an understanding of how different demographics are represented in it. In our paper, we analyze the potential of producing synthetic data using GPT-3 by exploring the various stressors it attributes to different race and gender combinations, to provide insight for future researchers looking into using LLMs for data generation. Using GPT-3, we develop HEADROOM, a synthetic dataset of 3,120 posts about depression-triggering stressors, by controlling for race, gender, and time frame (before and after COVID-19). Using this dataset, we conduct semantic and lexical analyses to (1) identify the predominant stressors for each demographic group; and (2) compare our synthetic data to a human-generated dataset. We present the procedures to generate queries to develop depression data using GPT-3, and conduct analyses to uncover the types of stressors it assigns to demographic groups, which could be used to test the limitations of LLMs for synthetic data generation for depression data. Our findings show that synthetic data mimics some of the human-generated data distribution for the predominant depression stressors across diverse demographics.
Submitted 25 March, 2024;
originally announced March 2024.
-
Automatic Construction of a Large-Scale Corpus for Geoparsing Using Wikipedia Hyperlinks
Authors:
Keyaki Ohno,
Hirotaka Kameko,
Keisuke Shirai,
Taichi Nishimura,
Shinsuke Mori
Abstract:
Geoparsing is the task of estimating the latitude and longitude (coordinates) of location expressions in texts. Geoparsing must deal with the ambiguity of expressions that indicate multiple locations with the same notation. For evaluating geoparsing systems, several corpora have been proposed in previous work. However, these corpora are small-scale and suffer from limited coverage of location expressions in general domains. In this paper, we propose Wikipedia Hyperlink-based Location Linking (WHLL), a novel method to construct a large-scale corpus for geoparsing from Wikipedia articles. WHLL leverages hyperlinks in Wikipedia to annotate multiple location expressions with coordinates. With this method, we constructed the WHLL corpus, a new large-scale corpus for geoparsing. The WHLL corpus consists of 1.3M articles, each containing about 7.8 unique location expressions. Of these location expressions, 45.6% are ambiguous and refer to more than one location with the same notation. In each article, the location expressions in the article title and in hyperlinks to other articles are assigned coordinates. By utilizing hyperlinks, we can accurately assign coordinates to location expressions even when they are ambiguous in the texts. Experimental results show that there remains room for improvement by disambiguating location expressions.
Submitted 25 March, 2024;
originally announced March 2024.
-
DeepDR: Deep Structure-Aware RGB-D Inpainting for Diminished Reality
Authors:
Christina Gsaxner,
Shohei Mori,
Dieter Schmalstieg,
Jan Egger,
Gerhard Paar,
Werner Bailer,
Denis Kalkofen
Abstract:
Diminished reality (DR) refers to the removal of real objects from the environment by virtually replacing them with their background. Modern DR frameworks use inpainting to hallucinate unobserved regions. While recent deep learning-based inpainting is promising, the DR use case is complicated by the need to generate coherent structure and 3D geometry (i.e., depth), in particular for advanced applications, such as 3D scene editing. In this paper, we propose DeepDR, a first RGB-D inpainting framework fulfilling all requirements of DR: Plausible image and geometry inpainting with coherent structure, running at real-time frame rates, with minimal temporal artifacts. Our structure-aware generative network allows us to explicitly condition color and depth outputs on the scene semantics, overcoming the difficulty of reconstructing sharp and consistent boundaries in regions with complex backgrounds. Experimental results show that the proposed framework can outperform related work qualitatively and quantitatively.
Submitted 1 December, 2023;
originally announced December 2023.
-
Vision-Language Interpreter for Robot Task Planning
Authors:
Keisuke Shirai,
Cristian C. Beltran-Hernandez,
Masashi Hamaya,
Atsushi Hashimoto,
Shohei Tanaka,
Kento Kawaharazuka,
Kazutoshi Tanaka,
Yoshitaka Ushiku,
Shinsuke Mori
Abstract:
Large language models (LLMs) are accelerating the development of language-guided robot planners. Meanwhile, symbolic planners offer the advantage of interpretability. This paper proposes a new task that bridges these two trends, namely, multimodal planning problem specification. The aim is to generate a problem description (PD), a machine-readable file used by the planners to find a plan. By generating PDs from language instruction and scene observation, we can drive symbolic planners in a language-guided framework. We propose a Vision-Language Interpreter (ViLaIn), a new framework that generates PDs using state-of-the-art LLM and vision-language models. ViLaIn can refine generated PDs via error message feedback from the symbolic planner. Our aim is to answer the question: How accurately can ViLaIn and the symbolic planner generate valid robot plans? To evaluate ViLaIn, we introduce a novel dataset called the problem description generation (ProDG) dataset. The framework is evaluated with four new evaluation metrics. Experimental results show that ViLaIn can generate syntactically correct problems with more than 99% accuracy and valid plans with more than 58% accuracy. Our code and dataset are available at https://github.com/omron-sinicx/ViLaIn.
Submitted 19 February, 2024; v1 submitted 1 November, 2023;
originally announced November 2023.
-
Towards Flow Graph Prediction of Open-Domain Procedural Texts
Authors:
Keisuke Shirai,
Hirotaka Kameko,
Shinsuke Mori
Abstract:
Machine comprehension of procedural texts is essential for reasoning about the steps and automating the procedures. However, this requires identifying entities within a text and resolving the relationships between the entities. Previous work focused on the cooking domain and proposed a framework to convert a recipe text into a flow graph (FG) representation. In this work, we propose a framework based on the recipe FG for flow graph prediction of open-domain procedural texts. To investigate flow graph prediction performance in non-cooking domains, we introduce the wikiHow-FG corpus from articles on wikiHow, a website of how-to instruction articles. In experiments, we consider using the existing recipe corpus and performing domain adaptation from the cooking to the target domain. Experimental results show that the domain adaptation models achieve higher performance than those trained only on the cooking or target domain data.
Submitted 30 May, 2023;
originally announced May 2023.
-
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
Authors:
Oana Ignat,
Zhijing Jin,
Artem Abzaliev,
Laura Biester,
Santiago Castro,
Naihao Deng,
Xinyi Gao,
Aylin Gunal,
Jacky He,
Ashkan Kazemi,
Muhammad Khalifa,
Namho Koh,
Andrew Lee,
Siyang Liu,
Do June Min,
Shinka Mori,
Joan Nwatu,
Veronica Perez-Rosas,
Siqi Shen,
Zekun Wang,
Winston Wu,
Rada Mihalcea
Abstract:
Recent progress in large language models (LLMs) has enabled the deployment of many generative NLP applications. At the same time, it has also led to a misleading public discourse that "it's all been solved." Not surprisingly, this has, in turn, made many NLP researchers -- especially those at the beginning of their careers -- worry about what NLP research area they should focus on. Has it all been solved, or what remaining questions can we work on regardless of LLMs? To address this question, this paper compiles NLP research directions rich for exploration. We identify fourteen different research areas encompassing 45 research directions that require new research and are not directly solvable by LLMs. While we identify many research areas, many others exist; we do not cover areas that LLMs currently address but where they lag behind in performance, or areas focused on LLM development. We welcome suggestions for other research directions to include: https://bit.ly/nlp-era-llm
Submitted 15 March, 2024; v1 submitted 21 May, 2023;
originally announced May 2023.
-
Recipe Generation from Unsegmented Cooking Videos
Authors:
Taichi Nishimura,
Atsushi Hashimoto,
Yoshitaka Ushiku,
Hirotaka Kameko,
Shinsuke Mori
Abstract:
This paper tackles recipe generation from unsegmented cooking videos, a task that requires agents to (1) extract key events in completing the dish and (2) generate sentences for the extracted events. Our task is similar to dense video captioning (DVC), which aims at detecting events thoroughly and generating sentences for them. However, unlike DVC, in recipe generation, recipe story awareness is crucial, and a model should extract an appropriate number of events in the correct order and generate accurate sentences based on them. We analyze the output of the DVC model and confirm that although (1) several events are adoptable as a recipe story, (2) the generated sentences for such events are not grounded in the visual content. Based on this, we set our goal to obtain correct recipes by selecting oracle events from the output events and re-generating sentences for them. To achieve this, we propose a transformer-based multimodal recurrent approach of training an event selector and sentence generator for selecting oracle events from the DVC's events and generating sentences for them. In addition, we extend the model by including ingredients to generate more accurate recipes. The experimental results show that the proposed method outperforms state-of-the-art DVC models. We also confirm that, by modeling the recipe in a story-aware manner, the proposed model outputs the appropriate number of events in the correct order.
Submitted 18 February, 2024; v1 submitted 21 September, 2022;
originally announced September 2022.
-
Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows
Authors:
Keisuke Shirai,
Atsushi Hashimoto,
Taichi Nishimura,
Hirotaka Kameko,
Shuhei Kurita,
Yoshitaka Ushiku,
Shinsuke Mori
Abstract:
We present a new multimodal dataset called Visual Recipe Flow, which enables us to learn the result of each cooking action in a recipe text. The dataset consists of object state changes and the workflow of the recipe text. The state change is represented as an image pair, while the workflow is represented as a recipe flow graph (r-FG). The image pairs are grounded in the r-FG, which provides the cross-modal relation. With our dataset, one can try a range of applications, from multimodal commonsense reasoning to procedural text generation.
Submitted 13 September, 2022;
originally announced September 2022.
-
Neural Text Generation with Artificial Negative Examples
Authors:
Keisuke Shirai,
Kazuma Hashimoto,
Akiko Eriguchi,
Takashi Ninomiya,
Shinsuke Mori
Abstract:
Neural text generation models conditioning on given input (e.g. machine translation and image captioning) are usually trained by maximum likelihood estimation of target text. However, the trained models suffer from various types of errors at inference time. In this paper, we propose to suppress an arbitrary type of errors by training the text generation model in a reinforcement learning framework, where we use a trainable reward function that is capable of discriminating between references and sentences containing the targeted type of errors. We create such negative examples by artificially injecting the targeted errors to the references. In experiments, we focus on two error types, repeated and dropped tokens in model-generated text. The experimental results show that our method can suppress the generation errors and achieve significant improvements on two machine translation and two image captioning tasks.
Submitted 28 December, 2020;
originally announced December 2020.
-
Content Word-based Sentence Decoding and Evaluating for Open-domain Neural Response Generation
Authors:
Tianyu Zhao,
Shinsuke Mori,
Tatsuya Kawahara
Abstract:
Various encoder-decoder models have been applied to response generation in open-domain dialogs, but a majority of conventional models directly learn a mapping from lexical input to lexical output without explicitly modeling intermediate representations. Utilizing language hierarchy and modeling intermediate information have been shown to benefit many language understanding and generation tasks. Motivated by Broca's aphasia, we propose to use a content word sequence as an intermediate representation for open-domain response generation. Experimental results show that the proposed method improves content relatedness of produced responses, and our models can often choose correct grammar for generated content words. Meanwhile, instead of evaluating complete sentences, we propose to compute conventional metrics on content word sequences, which is a better indicator of content relevance.
Submitted 26 June, 2019; v1 submitted 31 May, 2019;
originally announced May 2019.
-
The Pitman-Yor process and an empirical study of choice behavior
Authors:
Masato Hisakado,
Fumiaki Sano,
Shintaro Mori
Abstract:
This study discusses choice behavior using a voting model in which voters can obtain information from a finite number of previous $r$ voters. Voters vote for a candidate with a probability proportional to the previous vote ratio, which is visible to the voters. We obtain the Pitman sampling formula as the equilibrium distribution of $r$ votes. We present the model as a process of posting on a bulletin board system, 2ch.net, where users can choose one of many threads to create a post. We explore how this choice depends on the last $r$ posts and the distribution of these last $r$ posts across threads. We conclude that the posting process is described by our voting model with analog herders for a small $r$, which might correspond to the time horizon of users' responses.
Submitted 11 December, 2017; v1 submitted 24 July, 2017;
originally announced July 2017.
-
Analysis of the Effect of Dependency Information on Predicate-Argument Structure Analysis and Zero Anaphora Resolution
Authors:
Koichiro Yoshino,
Shinsuke Mori,
Satoshi Nakamura
Abstract:
This paper investigates and analyzes the effect of dependency information on predicate-argument structure analysis (PASA) and zero anaphora resolution (ZAR) for Japanese, and shows that a straightforward approach to PASA and ZAR works effectively even if dependency information is not available. We constructed an analyzer that directly predicts relationships of predicates and arguments with their semantic roles from a POS-tagged corpus. The features of the system are designed to compensate for the absence of syntactic information by using features used in dependency parsing as a reference. We also constructed analyzers that use the oracle dependency and the real dependency parsing results, and compared them with the system that does not use any syntactic information to verify that the improvement provided by dependencies is not crucial.
Submitted 31 May, 2017;
originally announced May 2017.
-
Universal Dependencies for Learner English
Authors:
Yevgeni Berzak,
Jessica Kenney,
Carolyn Spadine,
Jing Xian Wang,
Lucia Lam,
Keiko Sophie Mori,
Sebastian Garza,
Boris Katz
Abstract:
We introduce the Treebank of Learner English (TLE), the first publicly available syntactic treebank for English as a Second Language (ESL). The TLE provides manually annotated POS tags and Universal Dependency (UD) trees for 5,124 sentences from the Cambridge First Certificate in English (FCE) corpus. The UD annotations are tied to a pre-existing error annotation of the FCE, whereby full syntactic analyses are provided for both the original and error corrected versions of each sentence. Further on, we delineate ESL annotation guidelines that allow for consistent syntactic treatment of ungrammatical English. Finally, we benchmark POS tagging and dependency parsing performance on the TLE dataset and measure the effect of grammatical errors on parsing accuracy. We envision the treebank to support a wide range of linguistic and computational research on second language acquisition as well as automatic processing of ungrammatical language. The treebank is available at universaldependencies.org. The annotation manual used in this project and a graphical query engine are available at esltreebank.org.
Submitted 7 June, 2016; v1 submitted 13 May, 2016;
originally announced May 2016.
-
Information cascade on networks
Authors:
Masato Hisakado,
Shintaro Mori
Abstract:
In this paper, we discuss a voting model by considering three different kinds of networks: a random graph, the Barabási-Albert (BA) model, and a fitness model. A voting model represents the way in which public perceptions are conveyed to voters. Our voting model is constructed by using two types of voters--herders and independents--and two candidates. Independents conduct voting based on their fundamental values; on the other hand, herders base their voting on the number of previous votes. Hence, herders vote for the majority candidates and obtain information relating to previous votes from their networks. We discuss the difference between the phases on which the networks depend. Two kinds of phase transitions, an information cascade transition and a super-normal transition, were identified. The first of these is a transition between a state in which most voters make the correct choices and a state in which most of them are wrong. The second is a transition of convergence speed. The information cascade transition prevails when herder effects are stronger than the super-normal transition. In the BA and fitness models, the critical point of the information cascade transition is the same as that of the random network model. However, the critical point of the super-normal transition disappears when these two models are used. In conclusion, the influence of networks is shown to only affect the convergence speed and not the information cascade transition. We are therefore able to conclude that the influence of hubs on voters' perceptions is limited.
Submitted 15 December, 2015; v1 submitted 2 April, 2015;
originally announced April 2015.
-
Interactive Restless Multi-armed Bandit Game and Swarm Intelligence Effect
Authors:
Shunsuke Yoshida,
Masato Hisakado,
Shintaro Mori
Abstract:
We obtain the conditions for the emergence of the swarm intelligence effect in an interactive game of restless multi-armed bandit (rMAB). A player competes with multiple agents. Each bandit has a payoff that changes with a probability $p_{c}$ per round. The agents and player choose one of three options: (1) Exploit (a good bandit), (2) Innovate (asocial learning for a good bandit among $n_{I}$ randomly chosen bandits), and (3) Observe (social learning for a good bandit). Each agent has two parameters $(c,p_{obs})$ to specify the decision: (i) $c$, the threshold value for Exploit, and (ii) $p_{obs}$, the probability for Observe in learning. The parameters $(c,p_{obs})$ are uniformly distributed. We determine the optimal strategies for the player using complete knowledge about the rMAB. We show whether or not social or asocial learning is more optimal in the $(p_{c},n_{I})$ space and define the swarm intelligence effect. We conduct a laboratory experiment (67 subjects) and observe the swarm intelligence effect only if $(p_{c},n_{I})$ are chosen so that social learning is far more optimal than asocial learning.
Submitted 13 March, 2015;
originally announced March 2015.
-
Model-Based Policy Gradients with Parameter-Based Exploration by Least-Squares Conditional Density Estimation
Authors:
Syogo Mori,
Voot Tangkaratt,
Tingting Zhao,
Jun Morimoto,
Masashi Sugiyama
Abstract:
The goal of reinforcement learning (RL) is to let an agent learn an optimal control policy in an unknown environment so that future expected rewards are maximized. The model-free RL approach directly learns the policy based on data samples. Although using many samples tends to improve the accuracy of policy learning, collecting a large number of samples is often expensive in practice. On the other hand, the model-based RL approach first estimates the transition model of the environment and then learns the policy based on the estimated transition model. Thus, if the transition model is accurately learned from a small amount of data, the model-based approach can perform better than the model-free approach. In this paper, we propose a novel model-based RL method by combining a recently proposed model-free policy search method called policy gradients with parameter-based exploration and the state-of-the-art transition model estimator called least-squares conditional density estimation. Through experiments, we demonstrate the practical usefulness of the proposed method.
Submitted 18 July, 2013;
originally announced July 2013.
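The abstract above builds on policy gradients with parameter-based exploration (PGPE). As a rough illustration of that one ingredient only, the sketch below samples a scalar policy parameter from a Gaussian and updates the Gaussian's mean and standard deviation from the sampled returns. The toy return function, the learning rate, the batch-mean baseline, and the name `pgpe_toy` are all assumptions made for this example; the transition-model estimator (least-squares conditional density estimation) that the paper combines with PGPE is not shown.

```python
import random


def pgpe_toy(n_iters=1000, batch=20, alpha=0.01, seed=0):
    """Toy PGPE loop on a one-dimensional parameter (illustrative only)."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # Gaussian over the policy parameter

    def episode_return(theta):
        # Hypothetical stand-in for a rollout: best return at theta = 3.
        return -(theta - 3.0) ** 2

    for _ in range(n_iters):
        # Explore in parameter space: draw one parameter per "episode".
        thetas = [rng.gauss(mu, sigma) for _ in range(batch)]
        returns = [episode_return(th) for th in thetas]
        baseline = sum(returns) / batch  # simple variance-reducing baseline

        # Return-weighted updates for the mean and standard deviation.
        grad_mu = sum(
            (r - baseline) * (th - mu) for r, th in zip(returns, thetas)
        ) / batch
        grad_sigma = sum(
            (r - baseline) * ((th - mu) ** 2 - sigma ** 2) / sigma
            for r, th in zip(returns, thetas)
        ) / batch

        mu += alpha * grad_mu
        sigma = max(0.05, sigma + alpha * grad_sigma)  # keep sigma positive

    return mu, sigma


if __name__ == "__main__":
    print(pgpe_toy())
```

In a model-based variant, `episode_return` would presumably be evaluated on rollouts from a learned transition model rather than on the real environment; that substitution is the part this sketch leaves out.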
-
Collective Adoption of Max-Min Strategy in an Information Cascade Voting Experiment
Authors:
Shintaro Mori,
Masato Hisakado,
Taiki Takahashi
Abstract:
We consider a situation where one has to choose an option with multiplier m. The multiplier is inversely proportional to the number of people who have chosen the option and is proportional to the return if it is correct. If one does not know the correct option, we call that person a herder, and then there is a zero-sum game between the herder and the other people who have set the multiplier. The max-min strategy, in which one divides one's choice in inverse proportion to m, is optimal from the viewpoint of maximizing the expected return. We call the optimal herder an analog herder. The system of analog herders takes the probability of correct choice to one for any value of the ratio of herders, p<1, in the thermodynamic limit if the accuracy q of the informed persons' choices is one. We study how herders choose by a voting experiment in which 50 to 60 subjects sequentially answer a two-choice quiz. We show that the probability of selecting a choice by the herders is inversely proportional to m for 4/3 < m < 4 and that they collectively adopt the max-min strategy in that range.
Submitted 26 June, 2013; v1 submitted 13 November, 2012;
originally announced November 2012.
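The analog-herder system in the abstract above can be illustrated with a small simulation. Because the multiplier m is inversely proportional to the number of people who chose an option, choosing in inverse proportion to m is read here as choosing each option with probability equal to its share of the previous votes. That reading, the random first vote when no history exists, and the function name `simulate_analog_herders` are assumptions of this sketch, not details taken from the paper.

```python
import random


def simulate_analog_herders(n_voters, herder_frac, q, seed=0):
    """Return the fraction of correct votes in a sequential two-choice vote.

    herder_frac: probability p that a voter is an (analog) herder.
    q: probability that an informed voter picks the correct option.
    """
    rng = random.Random(seed)
    n_correct = 0
    for t in range(n_voters):
        if rng.random() < herder_frac:
            # Analog herder: follow the current vote share (coin flip at t = 0).
            p_correct = n_correct / t if t > 0 else 0.5
            vote = rng.random() < p_correct
        else:
            # Informed voter: correct with probability q.
            vote = rng.random() < q
        n_correct += vote
    return n_correct / n_voters


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.8):
        print(p, simulate_analog_herders(20000, p, q=1.0))
```

With q set to one, the printed fractions can be compared against the abstract's claim that the probability of a correct choice tends to one for any herder ratio p<1 as the number of voters grows.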
-
Two kinds of Phase transitions in a Voting model
Authors:
Masato Hisakado,
Shintaro Mori
Abstract:
In this paper, we discuss a voting model with two candidates, C_0 and C_1. We consider two types of voters--herders and independents. The voting of independents is based on their fundamental values; on the other hand, the voting of herders is based on the number of previous votes. We can identify two kinds of phase transitions. One is an information cascade transition similar to a phase transition seen in the Ising model. The other is a transition between super and normal diffusion. These phase transitions coexist. We compared our results to the conclusions of experiments and identified the phase transitions in the upper limit of the time t by analyzing human behavior obtained from experiments.
Submitted 26 July, 2012; v1 submitted 15 March, 2012;
originally announced March 2012.
-
Phase transition to two-peaks phase in an information cascade voting experiment
Authors:
Shintaro Mori,
Masato Hisakado,
Taiki Takahashi
Abstract:
Observational learning is an important information aggregation mechanism. However, it occasionally leads to a state in which an entire population chooses a sub-optimal option. When it occurs and whether it is a phase transition remain unanswered. To address these questions, we performed a voting experiment in which subjects answered a two-choice quiz sequentially with and without information about the prior subjects' choices. The subjects who could copy others are called herders. We obtained a microscopic rule regarding how herders copy others. Varying the ratio of herders led to qualitative changes in the macroscopic behavior in the experiment of about 50 subjects. If the ratio is small, the sequence of choices rapidly converges to the true one. As the ratio approaches 100%, convergence becomes extremely slow and information aggregation almost terminates. A simulation study of a stochastic model for 10^{6} subjects based on the herder's microscopic rule showed a phase transition to the two-peaks phase, where the convergence completely terminates, as the ratio exceeds some critical value.
Submitted 11 November, 2012; v1 submitted 13 December, 2011;
originally announced December 2011.
-
Digital herders and phase transition in a voting model
Authors:
Masato Hisakado,
Shintaro Mori
Abstract:
In this paper, we discuss a voting model with two candidates, C_1 and C_2. We set two types of voters--herders and independents. The voting of independent voters is based on their fundamental values; on the other hand, the voting of herders is based on the number of votes. Herders always select the majority of the previous $r$ votes, which are visible to them. We call them digital herders. We can accurately calculate the distribution of votes for special cases. When $r \geq 3$, we find that a phase transition occurs at the upper limit of $t$, where $t$ is the discrete time (or number of votes). As the fraction of herders increases, the model features a phase transition beyond which a state where most voters make the correct choice coexists with one where most of them are wrong. On the other hand, when $r<3$, there is no phase transition. In this case, the herders' performance is the same as that of the independent voters. Finally, we examine the behavior of human beings by conducting simple experiments.
Submitted 19 May, 2011; v1 submitted 16 January, 2011;
originally announced January 2011.
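The digital-herder rule described in the abstract above (a herder copies the majority of the previous $r$ votes) lends itself to a short simulation. The sketch below is a minimal illustration under stated assumptions: independents pick the correct candidate with a fixed probability `q`, herders fall back to the independents' rule until $r$ votes exist, and $r$ is odd so that no ties occur. None of these details, nor the function name `simulate_digital_herders`, comes from the paper.

```python
import random


def simulate_digital_herders(n_voters, herder_frac, q, r, seed=0):
    """Return the fraction of correct votes (1 = correct, 0 = wrong).

    herder_frac: probability that a voter is a digital herder.
    q: assumed probability that an independent votes correctly.
    r: number of most recent votes a herder can see (assumed odd).
    """
    rng = random.Random(seed)
    votes = []
    for _ in range(n_voters):
        is_herder = rng.random() < herder_frac
        if is_herder and len(votes) >= r:
            # Digital herder: copy the majority of the last r votes.
            recent = votes[-r:]
            vote = 1 if 2 * sum(recent) > r else 0
        else:
            # Independent voter (or herder without enough history).
            vote = 1 if rng.random() < q else 0
        votes.append(vote)
    return sum(votes) / n_voters


if __name__ == "__main__":
    # Different seeds can end in different majorities when herders dominate.
    for seed in range(5):
        print(seed, simulate_digital_herders(10000, 0.9, q=0.7, r=3, seed=seed))
```

Running the loop over several seeds with a high herder fraction gives a feel for the coexistence of mostly-correct and mostly-wrong outcomes that the abstract describes for $r \geq 3$.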