Showing 1–50 of 94 results for author: Inui, K

Searching in archive cs.
  1. arXiv:2406.16078

    cs.CL

    First Heuristic Then Rational: Dynamic Use of Heuristics in Language Model Reasoning

    Authors: Yoichi Aoki, Keito Kudo, Tatsuki Kuribayashi, Shusaku Sone, Masaya Taniguchi, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Multi-step reasoning is widely adopted in the community to explore the better performance of language models (LMs). We report on the systematic strategy that LMs use in this process. Our controlled experiments reveal that LMs rely more heavily on heuristics, such as lexical overlap, in the earlier stages of reasoning when more steps are required to reach an answer. Conversely, as LMs progress clos…

    Submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2406.12402

    cs.CL

    Flee the Flaw: Annotating the Underlying Logic of Fallacious Arguments Through Templates and Slot-filling

    Authors: Irfan Robbani, Paul Reisert, Naoya Inoue, Surawat Pothong, Camélia Guerraoui, Wenzhi Wang, Shoichi Naito, Jungmin Choi, Kentaro Inui

    Abstract: Prior research in computational argumentation has mainly focused on scoring the quality of arguments, with less attention on explicating logical errors. In this work, we introduce four sets of explainable templates for common informal logical fallacies designed to explicate a fallacy's implicit logic. Using our templates, we conduct an annotation study on top of 400 fallacious arguments taken from…

    Submitted 18 June, 2024; originally announced June 2024.

  3. arXiv:2406.06032

    cs.CL

    The Curse of Popularity: Popular Entities have Catastrophic Side Effects when Deleting Knowledge from Language Models

    Authors: Ryosuke Takahashi, Go Kamoda, Benjamin Heinzerling, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Language models (LMs) encode world knowledge in their internal parameters through training. However, LMs may learn personal and confidential information from the training data, leading to privacy concerns such as data leakage. Therefore, research on knowledge deletion from LMs is essential. This study focuses on the knowledge stored in LMs and analyzes the relationship between the side effects of…

    Submitted 10 June, 2024; originally announced June 2024.

  4. arXiv:2405.04818

    cs.CL

    ACORN: Aspect-wise Commonsense Reasoning Explanation Evaluation

    Authors: Ana Brassard, Benjamin Heinzerling, Keito Kudo, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Evaluating free-text explanations is a multifaceted, subjective, and labor-intensive task. Large language models (LLMs) present an appealing alternative due to their potential for consistency, scalability, and cost-efficiency. In this work, we present ACORN, a new dataset of 3,500 free-text explanations and aspect-wise quality ratings, and use it to gain insights into how LLMs evaluate explanation…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 18 pages, 7 figures, under review. Data available here: https://github.com/a-brassard/ACORN

  5. arXiv:2404.11315

    cs.CL

    To Drop or Not to Drop? Predicting Argument Ellipsis Judgments: A Case Study in Japanese

    Authors: Yukiko Ishizuki, Tatsuki Kuribayashi, Yuichiroh Matsubayashi, Ryohei Sasano, Kentaro Inui

    Abstract: Speakers sometimes omit certain arguments of a predicate in a sentence; such omission is especially frequent in pro-drop languages. This study addresses a question about ellipsis -- what can explain the native speakers' ellipsis decisions? -- motivated by the interest in human discourse processing and writing assistance for this choice. To this end, we first collect large-scale human annotations o…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 13 pages; accepted by LREC-COLING 2024

  6. arXiv:2403.12500

    cs.CL

    A Large Collection of Model-generated Contradictory Responses for Consistency-aware Dialogue Systems

    Authors: Shiki Sato, Reina Akama, Jun Suzuki, Kentaro Inui

    Abstract: Mitigating the generation of contradictory responses poses a substantial challenge in dialogue response generation. The quality and quantity of available contradictory response data play a vital role in suppressing these contradictions, offering two significant benefits. First, having access to large contradiction data enables a comprehensive examination of their characteristics. Second, data-driv…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 16 pages

  7. arXiv:2403.10381

    cs.CL

    Monotonic Representation of Numeric Properties in Language Models

    Authors: Benjamin Heinzerling, Kentaro Inui

    Abstract: Language models (LMs) can express factual knowledge involving numeric properties such as Karl Popper was born in 1902. However, how this information is encoded in the model's internal representations is not understood well. Here, we introduce a simple method for finding and editing representations of numeric properties such as an entity's birth year. Empirically, we find low-dimensional subspaces…

    Submitted 15 March, 2024; originally announced March 2024.

  8. arXiv:2403.03396

    cs.CL

    Japanese-English Sentence Translation Exercises Dataset for Automatic Grading

    Authors: Naoki Miura, Hiroaki Funayama, Seiya Kikuchi, Yuichiroh Matsubayashi, Yuya Iwase, Kentaro Inui

    Abstract: This paper proposes the task of automatic assessment of Sentence Translation Exercises (STEs), which have been used in the early stage of L2 language learning. We formalize the task as grading student responses for each rubric criterion pre-specified by the educators. We then create a dataset for STE between Japanese and English including 21 questions, along with a total of 3,498 student responses…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 9 pages

  9. arXiv:2402.14411

    cs.CL

    J-UniMorph: Japanese Morphological Annotation through the Universal Feature Schema

    Authors: Kosuke Matsuzaki, Masaya Taniguchi, Kentaro Inui, Keisuke Sakaguchi

    Abstract: We introduce a Japanese Morphology dataset, J-UniMorph, developed based on the UniMorph feature schema. This dataset addresses the unique and rich verb forms characteristic of the language's agglutinative nature. J-UniMorph distinguishes itself from the existing Japanese subset of UniMorph, which is automatically extracted from Wiktionary. On average, the Wiktionary Edition features around 12 infl…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 14 pages, 4 figures

  10. arXiv:2310.17121

    cs.CL

    Test-time Augmentation for Factual Probing

    Authors: Go Kamoda, Benjamin Heinzerling, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Factual probing is a method that uses prompts to test if a language model "knows" certain world knowledge facts. A problem in factual probing is that small changes to the prompt can lead to large changes in model output. Previous work aimed to alleviate this problem by optimizing prompts via text mining or fine-tuning. However, such approaches are relation-specific and do not generalize to unseen…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 12 pages, 4 figures, accepted to EMNLP 2023 Findings (short paper)

  11. arXiv:2310.15921

    cs.CL

    Contrastive Learning-based Sentence Encoders Implicitly Weight Informative Words

    Authors: Hiroto Kurita, Goro Kobayashi, Sho Yokoi, Kentaro Inui

    Abstract: The performance of sentence encoders can be significantly improved through the simple practice of fine-tuning using contrastive loss. A natural question arises: what characteristics do models acquire during contrastive learning? This paper theoretically and experimentally shows that contrastive-based sentence encoders implicitly weight words based on information-theoretic quantities; that is, more…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 16 pages, 6 figures, accepted to EMNLP 2023 Findings (short paper)

  12. Chat Translation Error Detection for Assisting Cross-lingual Communications

    Authors: Yunmeng Li, Jun Suzuki, Makoto Morishita, Kaori Abe, Ryoko Tokuhisa, Ana Brassard, Kentaro Inui

    Abstract: In this paper, we describe the development of a communication support system that detects erroneous translations to facilitate crosslingual communications due to the limitations of current machine chat translation methods. We trained an error detector as the baseline of the system and constructed a new Japanese-English bilingual chat corpus, BPersona-chat, which comprises multiturn colloquial chat…

    Submitted 2 August, 2023; originally announced August 2023.

    Journal ref: Proceedings of the 3rd Workshop on Evaluation and Comparison of NLP Systems, pages 88-95, November 2022, Online. Association for Computational Linguistics

  13. arXiv:2307.15341

    cs.CL

    Teach Me How to Improve My Argumentation Skills: A Survey on Feedback in Argumentation

    Authors: Camélia Guerraoui, Paul Reisert, Naoya Inoue, Farjana Sultana Mim, Shoichi Naito, Jungmin Choi, Irfan Robbani, Wenzhi Wang, Kentaro Inui

    Abstract: The use of argumentation in education has been shown to improve critical thinking skills for end-users such as students, and computational models for argumentation have been developed to assist in this process. Although these models are useful for evaluating the quality of an argument, they oftentimes cannot explain why a particular argument is considered poor or not, which makes it difficult to p…

    Submitted 28 July, 2023; originally announced July 2023.

    Comments: 14 pages, 4 figures

  14. arXiv:2305.18294

    cs.CL

    Transformer Language Models Handle Word Frequency in Prediction Head

    Authors: Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

    Abstract: Prediction head is a crucial component of Transformer language models. Despite its direct impact on prediction, this component has often been overlooked in analyzing Transformers. In this study, we investigate the inner workings of the prediction head, specifically focusing on bias parameters. Our experiments with BERT and GPT-2 models reveal that the biases in their word prediction heads play a s…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: 11 pages, 12 figures, accepted to ACL 2023 Findings (short paper)

  15. arXiv:2303.14342

    cs.CL

    Analyzing the Performance of GPT-3.5 and GPT-4 in Grammatical Error Correction

    Authors: Steven Coyne, Keisuke Sakaguchi, Diana Galvan-Sosa, Michael Zock, Kentaro Inui

    Abstract: GPT-3 and GPT-4 models are powerful, achieving high performance on a variety of Natural Language Processing tasks. However, there is a relative lack of detailed published analysis of their performance on the task of grammatical error correction (GEC). To address this, we perform experiments testing the capabilities of a GPT-3.5 model (text-davinci-003) and a GPT-4 model (gpt-4-0314) on major GEC b…

    Submitted 30 May, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

  16. arXiv:2302.08148

    cs.AI cs.CL

    Empirical Investigation of Neural Symbolic Reasoning Strategies

    Authors: Yoichi Aoki, Keito Kudo, Tatsuki Kuribayashi, Ana Brassard, Masashi Yoshikawa, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Neural reasoning accuracy improves when generating intermediate reasoning steps. However, the source of this improvement is yet unclear. Here, we investigate and factorize the benefit of generating intermediate steps for symbolic reasoning. Specifically, we decompose the reasoning strategy w.r.t. step granularity and chaining strategy. With a purely symbolic numerical reasoning dataset (e.g., A=1,…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: This paper is accepted as the findings at EACL 2023, and the earlier version (non-archival) of this work got the Best Paper Award in the Student Research Workshop of AACL 2022

  17. arXiv:2302.07866

    cs.CL cs.AI

    Do Deep Neural Networks Capture Compositionality in Arithmetic Reasoning?

    Authors: Keito Kudo, Yoichi Aoki, Tatsuki Kuribayashi, Ana Brassard, Masashi Yoshikawa, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Compositionality is a pivotal property of symbolic reasoning. However, how well recent neural models capture compositionality remains underexplored in the symbolic reasoning tasks. This study empirically addresses this question by systematically examining recently published pre-trained seq2seq models with a carefully controlled dataset of multi-hop arithmetic symbolic reasoning. We introduce a ski…

    Submitted 15 February, 2023; originally announced February 2023.

    Comments: accepted by EACL 2023

  18. arXiv:2302.00456

    cs.CL

    Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Maps

    Authors: Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

    Abstract: Transformers are ubiquitous in wide tasks. Interpreting their internals is a pivotal goal. Nevertheless, their particular components, feed-forward (FF) blocks, have typically been less analyzed despite their substantial parameter amounts. We analyze the input contextualization effects of FF blocks by rendering them in the attention maps as a human-friendly visualization scheme. Our experiments wit…

    Submitted 15 April, 2024; v1 submitted 1 February, 2023; originally announced February 2023.

    Comments: ICLR 2024 Spotlight; 37 pages, 32 figures, 3 tables

  19. arXiv:2301.06758

    cs.LG cs.AI cs.CL

    Tracing and Manipulating Intermediate Values in Neural Math Problem Solvers

    Authors: Yuta Matsumoto, Benjamin Heinzerling, Masashi Yoshikawa, Kentaro Inui

    Abstract: How language models process complex input that requires multiple steps of inference is not well understood. Previous research has shown that information about intermediate values of these inputs can be extracted from the activations of the models, but it is unclear where that information is encoded and whether that information is indeed used during inference. We introduce a method for analyzing ho…

    Submitted 17 January, 2023; originally announced January 2023.

    Comments: 5 pages, 4 figures, MathNLP

  20. arXiv:2211.01432

    cs.CL cs.LG

    Cross-stitching Text and Knowledge Graph Encoders for Distantly Supervised Relation Extraction

    Authors: Qin Dai, Benjamin Heinzerling, Kentaro Inui

    Abstract: Bi-encoder architectures for distantly-supervised relation extraction are designed to make use of the complementary information found in text and knowledge graphs (KG). However, current architectures suffer from two drawbacks. They either do not allow any sharing between the text encoder and the KG encoder at all, or, in case of models with KG-to-text attention, only share information in one direc…

    Submitted 2 November, 2022; originally announced November 2022.

  21. arXiv:2209.09746

    cs.CL

    Target-Guided Open-Domain Conversation Planning

    Authors: Yosuke Kishinami, Reina Akama, Shiki Sato, Ryoko Tokuhisa, Jun Suzuki, Kentaro Inui

    Abstract: Prior studies addressing target-oriented conversational tasks lack a crucial notion that has been intensively studied in the context of goal-oriented artificial intelligence agents, namely, planning. In this study, we propose the task of Target-Guided Open-Domain Conversation Planning (TGCP) task to evaluate whether neural conversational agents have goal-oriented conversation planning abilities. U…

    Submitted 20 September, 2022; originally announced September 2022.

    Comments: 9 pages, Accepted to The 29th International Conference on Computational Linguistics (COLING 2022)

  22. arXiv:2208.02578

    cs.CL

    N-best Response-based Analysis of Contradiction-awareness in Neural Response Generation Models

    Authors: Shiki Sato, Reina Akama, Hiroki Ouchi, Ryoko Tokuhisa, Jun Suzuki, Kentaro Inui

    Abstract: Avoiding the generation of responses that contradict the preceding context is a significant challenge in dialogue response generation. One feasible method is post-processing, such as filtering out contradicting responses from a resulting n-best response list. In this scenario, the quality of the n-best list considerably affects the occurrence of contradictions because the final response is chosen…

    Submitted 4 August, 2022; originally announced August 2022.

    Comments: 8 pages, Accepted to The 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL 2022)

  23. arXiv:2207.13332

    cs.CL

    RealTime QA: What's the Answer Right Now?

    Authors: Jungo Kasai, Keisuke Sakaguchi, Yoichi Takahashi, Ronan Le Bras, Akari Asai, Xinyan Yu, Dragomir Radev, Noah A. Smith, Yejin Choi, Kentaro Inui

    Abstract: We introduce REALTIME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). REALTIME QA inquires about the current world, and QA systems need to answer questions about novel events or information. It therefore challenges static, conventional assumptions in open-domain QA datasets and pursues instantaneous applicat…

    Submitted 28 February, 2024; v1 submitted 27 July, 2022; originally announced July 2022.

    Comments: RealTime QA Website: https://realtimeqa.github.io/

  24. arXiv:2206.08288

    cs.CL

    Balancing Cost and Quality: An Exploration of Human-in-the-loop Frameworks for Automated Short Answer Scoring

    Authors: Hiroaki Funayama, Tasuku Sato, Yuichiroh Matsubayashi, Tomoya Mizumoto, Jun Suzuki, Kentaro Inui

    Abstract: Short answer scoring (SAS) is the task of grading short text written by a learner. In recent years, deep-learning-based approaches have substantially improved the performance of SAS models, but how to guarantee high-quality predictions still remains a critical issue when applying such models to the education field. Towards guaranteeing high-quality predictions, we present the first study of explor…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: 12 pages, to be published in proceedings of AIED2022

  25. arXiv:2205.11833

    cs.LG cs.CL

    Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model

    Authors: Sosuke Kobayashi, Shun Kiyono, Jun Suzuki, Kentaro Inui

    Abstract: Ensembling is a popular method used to improve performance as a last resort. However, ensembling multiple models finetuned from a single pretrained model has been not very effective; this could be due to the lack of diversity among ensemble members. This paper proposes Multi-Ticket Ensemble, which finetunes different subnetworks of a single pretrained model and ensembles them. We empirically demon…

    Submitted 24 May, 2022; originally announced May 2022.

    Comments: Workshop on Challenges & Perspectives in Creating Large Language Models (BigScience) 2022

  26. arXiv:2205.11484

    cs.CL

    Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond

    Authors: Masato Mita, Keisuke Sakaguchi, Masato Hagiwara, Tomoya Mizumoto, Jun Suzuki, Kentaro Inui

    Abstract: Natural language processing technology has rapidly improved automated grammatical error correction tasks, and the community begins to explore document-level revision as one of the next challenges. To go beyond sentence-level automated grammatical error correction to NLP-based document-level revision assistant, there are two major obstacles: (1) there are few public corpora with document-level revi…

    Submitted 23 May, 2022; originally announced May 2022.

    Comments: 14 pages

  27. arXiv:2205.11463

    cs.CL

    Context Limitations Make Neural Language Models More Human-Like

    Authors: Tatsuki Kuribayashi, Yohei Oseki, Ana Brassard, Kentaro Inui

    Abstract: Language models (LMs) have been used in cognitive modeling as well as engineering studies -- they compute information-theoretic complexity metrics that simulate humans' cognitive load during reading. This study highlights a limitation of modern neural LMs as the model of choice for this purpose: there is a discrepancy between their context access capacities and that of humans. Our results showed t…

    Submitted 1 November, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: Accepted by EMNLP2022 (main long)

  28. arXiv:2204.01512

    cs.CL

    LPAttack: A Feasible Annotation Scheme for Capturing Logic Pattern of Attacks in Arguments

    Authors: Farjana Sultana Mim, Naoya Inoue, Shoichi Naito, Keshav Singh, Kentaro Inui

    Abstract: In argumentative discourse, persuasion is often achieved by refuting or attacking others' arguments. Attacking is not always straightforward and often comprises complex rhetorical moves such that arguers might agree with a logic of an argument while attacking another logic. Moreover, an arguer might neither deny nor agree with any logics of an argument, instead ignore them and attack the main stance of…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: 14 pages, 8 figures

  29. arXiv:2201.06777

    cs.CL

    COPA-SSE: Semi-structured Explanations for Commonsense Reasoning

    Authors: Ana Brassard, Benjamin Heinzerling, Pride Kavumba, Kentaro Inui

    Abstract: We present Semi-Structured Explanations for COPA (COPA-SSE), a new crowdsourced dataset of 9,747 semi-structured, English common sense explanations for Choice of Plausible Alternatives (COPA) questions. The explanations are formatted as a set of triple-like common sense statements with ConceptNet relations but freely written concepts. This semi-structured format strikes a balance between the high…

    Submitted 11 May, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: 6 pages, 6 figures, LREC 2022. Data available at https://github.com/a-brassard/copa-sse

  30. arXiv:2201.06674

    cs.CL

    TYPIC: A Corpus of Template-Based Diagnostic Comments on Argumentation

    Authors: Shoichi Naito, Shintaro Sawada, Chihiro Nakagawa, Naoya Inoue, Kenshi Yamaguchi, Iori Shimizu, Farjana Sultana Mim, Keshav Singh, Kentaro Inui

    Abstract: Providing feedback on the argumentation of the learner is essential for developing critical thinking skills, however, it requires a lot of time and effort. To mitigate the overload on teachers, we aim to automate a process of providing feedback, especially giving diagnostic comments which point out the weaknesses inherent in the argumentation. It is recommended to give specific diagnostic comments…

    Submitted 21 June, 2022; v1 submitted 17 January, 2022; originally announced January 2022.

    Comments: LREC2022. The dataset is available at https://github.com/cl-tohoku/TYPIC

  31. arXiv:2110.13692

    cs.CL

    Annotating Implicit Reasoning in Arguments with Causal Links

    Authors: Keshav Singh, Naoya Inoue, Farjana Sultana Mim, Shoichi Naitoh, Kentaro Inui

    Abstract: Most of the existing work that focus on the identification of implicit knowledge in arguments generally represent implicit knowledge in the form of commonsense or factual knowledge. However, such knowledge is not sufficient to understand the implicit reasoning link between individual argumentative components (i.e., claim and premise). In this work, we focus on identifying the implicit knowledge in…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: Accepted to ArgKG:Workshop on Argumentation Knowledge Graphs (AKBC 2021)

  32. arXiv:2109.13497

    cs.CL cs.LG

    Instance-Based Neural Dependency Parsing

    Authors: Hiroki Ouchi, Jun Suzuki, Sosuke Kobayashi, Sho Yokoi, Tatsuki Kuribayashi, Masashi Yoshikawa, Kentaro Inui

    Abstract: Interpretable rationales for model predictions are crucial in practical applications. We develop neural models that possess an interpretable inference process for dependency parsing. Our models adopt instance-based inference, where dependency edges are extracted and labeled by comparing them to edges in a training set. The training edges are explicitly used for the predictions; thus, it is easy to…

    Submitted 28 September, 2021; originally announced September 2021.

    Comments: 15 pages, accepted to TACL 2021

  33. arXiv:2109.07152

    cs.CL

    Incorporating Residual and Normalization Layers into Analysis of Masked Language Models

    Authors: Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui

    Abstract: Transformer architecture has become ubiquitous in the natural language processing field. To interpret the Transformer-based models, their attention patterns have been extensively analyzed. However, the Transformer architecture is not only composed of the multi-head attention; other components can also contribute to Transformers' progressive performance. In this study, we extended the scope of the…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: 22 pages, accepted to EMNLP 2021 main conference

  34. arXiv:2109.07080

    cs.CL

    Transformer-based Lexically Constrained Headline Generation

    Authors: Kosuke Yamada, Yuta Hitomi, Hideaki Tamori, Ryohei Sasano, Naoaki Okazaki, Kentaro Inui, Koichi Takeda

    Abstract: This paper explores a variant of automatic headline generation methods, where a generated headline is required to include a given phrase such as a company or a product name. Previous methods using Transformer-based models generate a headline including a given phrase by providing the encoder with additional information corresponding to the given phrase. However, these methods cannot always include…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021

  35. arXiv:2109.06853

    cs.CL

    Summarize-then-Answer: Generating Concise Explanations for Multi-hop Reading Comprehension

    Authors: Naoya Inoue, Harsh Trivedi, Steven Sinha, Niranjan Balasubramanian, Kentaro Inui

    Abstract: How can we generate concise explanations for multi-hop Reading Comprehension (RC)? The current strategies of identifying supporting sentences can be seen as an extractive question-focused summarization of the input text. However, these extractive explanations are not necessarily concise i.e. not minimally sufficient for answering a question. Instead, we advocate for an abstractive approach, where…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP2021 Long Paper (Main Track)

  36. arXiv:2109.05644

    cs.CL

    SHAPE: Shifted Absolute Position Embedding for Transformers

    Authors: Shun Kiyono, Sosuke Kobayashi, Jun Suzuki, Kentaro Inui

    Abstract: Position representation is crucial for building position-aware representations in Transformers. Existing position representations suffer from a lack of generalization to test data with unseen lengths or high computational cost. We investigate shifted absolute position embedding (SHAPE) to address both issues. The basic idea of SHAPE is to achieve shift invariance, which is a key property of recent…

    Submitted 12 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 (short paper, main conference)

  37. arXiv:2108.08631

    cond-mat.str-el cond-mat.dis-nn cs.LG physics.comp-ph quant-ph

    Determinant-free fermionic wave function using feed-forward neural networks

    Authors: Koji Inui, Yasuyuki Kato, Yukitoshi Motome

    Abstract: We propose a general framework for finding the ground state of many-body fermionic systems by using feed-forward neural networks. The anticommutation relation for fermions is usually implemented to a variational wave function by the Slater determinant (or Pfaffian), which is a computational bottleneck because of the numerical cost of $O(N^3)$ for $N$ particles. We bypass this bottleneck by explici…

    Submitted 22 August, 2021; v1 submitted 19 August, 2021; originally announced August 2021.

    Journal ref: Phys. Rev. Research 3, 043126 (2021)

  38. arXiv:2106.01229

    cs.CL

    Lower Perplexity is Not Always Human-Like

    Authors: Tatsuki Kuribayashi, Yohei Oseki, Takumi Ito, Ryo Yoshida, Masayuki Asahara, Kentaro Inui

    Abstract: In computational psycholinguistics, various language models have been evaluated against human reading behavior (e.g., eye movement) to build human-like computational models. However, most previous efforts have focused almost exclusively on English, despite the recent trend towards linguistic universal within the general community. In order to fill the gap, this paper investigates whether the estab…

    Submitted 1 November, 2022; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: Accepted by ACL 2021

  39. arXiv:2106.01077

    cs.CL

    SyGNS: A Systematic Generalization Testbed Based on Natural Language Semantics

    Authors: Hitomi Yanaka, Koji Mineshima, Kentaro Inui

    Abstract: Recently, deep neural networks (DNNs) have achieved great success in semantically challenging NLP tasks, yet it remains unclear whether DNN models can capture compositional meanings, those aspects of meaning that have been long studied in formal semantics. To investigate this issue, we propose a Systematic Generalization testbed based on Natural language Semantics (SyGNS), whose challenge is to ma…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: Findings (long paper) of ACL-IJCNLP2021

  40. arXiv:2104.11514

    cs.CL

    Learning to Learn to be Right for the Right Reasons

    Authors: Pride Kavumba, Benjamin Heinzerling, Ana Brassard, Kentaro Inui

    Abstract: Improving model generalization on held-out data is one of the core objectives in commonsense reasoning. Recent work has shown that models trained on the dataset with superficial cues tend to perform well on the easy test set with superficial cues but perform poorly on the hard test set without superficial cues. Previous approaches have resorted to manual methods of encouraging models not to overfi…

    Submitted 23 April, 2021; originally announced April 2021.

  41. arXiv:2104.07924

    cs.CL

    A Comparative Study on Collecting High-Quality Implicit Reasonings at a Large-scale

    Authors: Keshav Singh, Paul Reisert, Naoya Inoue, Kentaro Inui

    Abstract: Explicating implicit reasoning (i.e. warrants) in arguments is a long-standing challenge for natural language understanding systems. While recent approaches have focused on explicating warrants via crowdsourcing or expert annotations, the quality of warrants has been questionable due to the extreme complexity and subjectivity of the task. In this paper, we tackle the complex task of warrant explic…

    Submitted 16 April, 2021; originally announced April 2021.

    Comments: 2 figures, 3 tables

  42. arXiv:2104.07425

    cs.CL

    Pseudo Zero Pronoun Resolution Improves Zero Anaphora Resolution

    Authors: Ryuto Konno, Shun Kiyono, Yuichiroh Matsubayashi, Hiroki Ouchi, Kentaro Inui

    Abstract: Masked language models (MLMs) have contributed to drastic performance improvements with regard to zero anaphora resolution (ZAR). To further improve this approach, in this study, we made two proposals. The first is a new pretraining task that trains MLMs on anaphoric relations with explicit supervision, and the second proposal is a new finetuning method that remedies a notorious issue, the pretrai…

    Submitted 10 September, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: Long paper accepted by EMNLP2021 main conference

  43. arXiv:2102.06540  [pdf, other]

    cs.CL cs.LG

    Two Training Strategies for Improving Relation Extraction over Universal Graph

    Authors: Qin Dai, Naoya Inoue, Ryo Takahashi, Kentaro Inui

    Abstract: This paper explores how the Distantly Supervised Relation Extraction (DS-RE) can benefit from the use of a Universal Graph (UG), the combination of a Knowledge Graph (KG) and a large-scale text collection. A straightforward extension of a current state-of-the-art neural model for DS-RE with a UG may lead to degradation in performance. We first report that this degradation is associated with the di…

    Submitted 6 May, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

  44. arXiv:2101.10713  [pdf, other]

    cs.CL

    Exploring Transitivity in Neural NLI Models through Veridicality

    Authors: Hitomi Yanaka, Koji Mineshima, Kentaro Inui

    Abstract: Despite the recent success of deep neural networks in natural language processing, the extent to which they can demonstrate human-like generalization capacities for natural language understanding remains unclear. We explore this issue in the domain of natural language inference (NLI), focusing on the transitivity of inference relations, a fundamental property for systematically drawing inferences.…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: accepted by EACL2021 as a long paper

  45. arXiv:2012.04207  [pdf, other]

    cs.LG cs.CL cs.CV

    Efficient Estimation of Influence of a Training Instance

    Authors: Sosuke Kobayashi, Sho Yokoi, Jun Suzuki, Kentaro Inui

    Abstract: Understanding the influence of a training instance on a neural network model leads to improving interpretability. However, it is difficult and inefficient to evaluate the influence, which shows how a model's prediction would be changed if a training instance were not used. In this paper, we propose an efficient method for estimating the influence. Our method is inspired by dropout, which zero-mask…

    Submitted 19 November, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

    Comments: This is an extended version of the paper presented at SustaiNLP 2020

  46. arXiv:2011.02121  [pdf, other]

    cs.CL

    PheMT: A Phenomenon-wise Dataset for Machine Translation Robustness on User-Generated Contents

    Authors: Ryo Fujii, Masato Mita, Kaori Abe, Kazuaki Hanawa, Makoto Morishita, Jun Suzuki, Kentaro Inui

    Abstract: Neural Machine Translation (NMT) has shown drastic improvement in its quality when translating clean input, such as text from the news domain. However, existing studies suggest that NMT still struggles with certain kinds of input with considerable noise, such as User-Generated Contents (UGC) on the Internet. To make better use of NMT for cross-cultural communication, one of the most promising dire…

    Submitted 3 November, 2020; originally announced November 2020.

    Comments: 15 pages, 4 figures, accepted at COLING 2020

  47. arXiv:2011.01785  [pdf, other]

    cs.CL

    Modeling Event Salience in Narratives via Barthes' Cardinal Functions

    Authors: Takaki Otake, Sho Yokoi, Naoya Inoue, Ryo Takahashi, Tatsuki Kuribayashi, Kentaro Inui

    Abstract: Events in a narrative differ in salience: some are more important to the story than others. Estimating event salience is useful for tasks such as story generation, and as a tool for text analysis in narratology and folkloristics. To compute event salience without any annotations, we adopt Barthes' definition of event salience and propose several unsupervised methods that require only a pre-trained…

    Submitted 3 November, 2020; originally announced November 2020.

    Comments: accepted to COLING 2020

  48. arXiv:2011.00948  [pdf, other]

    cs.CL

    An Empirical Study of Contextual Data Augmentation for Japanese Zero Anaphora Resolution

    Authors: Ryuto Konno, Yuichiroh Matsubayashi, Shun Kiyono, Hiroki Ouchi, Ryo Takahashi, Kentaro Inui

    Abstract: One critical issue of zero anaphora resolution (ZAR) is the scarcity of labeled data. This study explores how effectively this problem can be alleviated by data augmentation. We adopt a state-of-the-art data augmentation method, called the contextual data augmentation (CDA), that generates labeled training instances using a pretrained language model. The CDA has been reported to work well for seve…

    Submitted 4 November, 2020; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: 13 pages, accepted by COLING 2020

  49. arXiv:2010.06137  [pdf, other]

    cs.CL

    Corruption Is Not All Bad: Incorporating Discourse Structure into Pre-training via Corruption for Essay Scoring

    Authors: Farjana Sultana Mim, Naoya Inoue, Paul Reisert, Hiroki Ouchi, Kentaro Inui

    Abstract: Existing approaches for automated essay scoring and document representation learning typically rely on discourse parsers to incorporate discourse structure into text representation. However, the performance of parsers is not always adequate, especially when they are used on noisy texts, such as student essays. In this paper, we propose an unsupervised pre-training approach to capture discourse str…

    Submitted 12 October, 2020; originally announced October 2020.

  50. arXiv:2010.04332  [pdf, other]

    cs.CL

    Langsmith: An Interactive Academic Text Revision System

    Authors: Takumi Ito, Tatsuki Kuribayashi, Masatoshi Hidaka, Jun Suzuki, Kentaro Inui

    Abstract: Despite the current diversity and inclusion initiatives in the academic community, researchers with a non-native command of English still face significant obstacles when writing papers in English. This paper presents the Langsmith editor, which assists inexperienced, non-native researchers to write English papers, especially in the natural language processing (NLP) field. Our system can suggest fl…

    Submitted 8 October, 2020; originally announced October 2020.

    Comments: Accepted at EMNLP 2020 (system demonstrations)