Showing 1–50 of 59 results for author: Aizawa, A

  1. arXiv:2407.03963  [pdf, other]

    cs.CL cs.AI

    LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

    Authors: LLM-jp: Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, Hiroyuki Deguchi, Rintaro Enomoto, Kazuki Fujii, Kensuke Fukumoto, Takuya Fukushima, Namgi Han, Yuto Harada, Chikara Hashimoto, Tatsuya Hiraoka, Shohei Hisada, Sosuke Hosokawa, Lu Jie, Keisuke Kamata, Teruhito Kanazawa, Hiroki Kanezashi, Hiroshi Kataoka, Satoru Katsumata, Daisuke Kawahara, Seiya Kawano, et al. (57 additional authors not shown)

    Abstract: This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2406.13397  [pdf, other]

    cs.CL

    MoreHopQA: More Than Multi-hop Reasoning

    Authors: Julian Schnitzler, Xanh Ho, Jiahao Huang, Florian Boudin, Saku Sugawara, Akiko Aizawa

    Abstract: Most existing multi-hop datasets are extractive answer datasets, where the answers to the questions can be extracted directly from the provided context. This often leads models to use heuristics or shortcuts instead of performing true multi-hop reasoning. In this paper, we propose a new multi-hop dataset, MoreHopQA, which shifts from extractive to generative answers. Our dataset is created by util…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 8 pages, 5 figures. First three authors contributed equally

  3. arXiv:2404.03147  [pdf, other]

    cs.LG cs.AI

    Eigenpruning: an Interpretability-Inspired PEFT Method

    Authors: Tomás Vergara-Browne, Álvaro Soto, Akiko Aizawa

    Abstract: We introduce eigenpruning, a method that removes singular values from weight matrices in an LLM to improve its performance in a particular task. This method is inspired by interpretability methods designed to automatically find subnetworks of a model which solve a specific task. In our tests, the pruned model outperforms the original model by a large margin, while only requiring minimal computatio…

    Submitted 20 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: Extended abstract accepted to LatinX at NAACL 2024

  4. arXiv:2404.00344  [pdf, other]

    cs.CL cs.AI cs.IR

    Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange

    Authors: Ankit Satpute, Noah Giessing, Andre Greiner-Petter, Moritz Schubotz, Olaf Teschke, Akiko Aizawa, Bela Gipp

    Abstract: Large Language Models (LLMs) have demonstrated exceptional capabilities in various natural language tasks, often achieving performances that surpass those of humans. Despite these advancements, the domain of mathematics presents a distinctive challenge, primarily due to its specialized structure and the precision it demands. In this study, we adopted a two-step approach for investigating the profi…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted for publication at the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), July 14–18, 2024, Washington D.C., USA

  5. arXiv:2403.17759  [pdf, other]

    cs.IR

    TWOLAR: a TWO-step LLM-Augmented distillation method for passage Reranking

    Authors: Davide Baldelli, Junfeng Jiang, Akiko Aizawa, Paolo Torroni

    Abstract: In this paper, we present TWOLAR: a two-stage pipeline for passage reranking based on the distillation of knowledge from Large Language Models (LLM). TWOLAR introduces a new scoring strategy and a distillation process consisting in the creation of a novel and diverse training dataset. The dataset consists of 20K queries, each associated with a set of documents retrieved via four distinct retrieval…

    Submitted 26 March, 2024; originally announced March 2024.

  6. arXiv:2403.07910  [pdf, other]

    cs.CY cs.CL

    MAGPIE: Multi-Task Media-Bias Analysis Generalization for Pre-Trained Identification of Expressions

    Authors: Tomáš Horych, Martin Wessel, Jan Philip Wahle, Terry Ruas, Jerome Waßmuth, André Greiner-Petter, Akiko Aizawa, Bela Gipp, Timo Spinde

    Abstract: Media bias detection poses a complex, multifaceted problem traditionally tackled using single-task models and small in-domain datasets, consequently lacking generalizability. To address this, we introduce MAGPIE, the first large-scale multi-task pre-training approach explicitly tailored for media bias detection. To enable pre-training at scale, we present Large Bias Mixture (LBM), a compilation of…

    Submitted 15 March, 2024; v1 submitted 26 February, 2024; originally announced March 2024.

  7. arXiv:2402.17311  [pdf, other]

    cs.CL

    SKT5SciSumm -- A Hybrid Generative Approach for Multi-Document Scientific Summarization

    Authors: Huy Quoc To, Hung-Nghiep Tran, André Greiner-Petter, Felix Beierle, Akiko Aizawa

    Abstract: Summarization for scientific text has shown significant benefits both for the research community and human society. Given the fact that the nature of scientific text is distinctive and the input of the multi-document summarization task is substantially long, the task requires sufficient embedding generation and text truncation without losing important information. To tackle these issues, in this p…

    Submitted 27 February, 2024; originally announced February 2024.

  8. arXiv:2401.17824  [pdf, other]

    cs.CL

    A Survey of Pre-trained Language Models for Processing Scientific Text

    Authors: Xanh Ho, Anh Khoa Duong Nguyen, An Tuan Dao, Junfeng Jiang, Yuki Chida, Kaito Sugimoto, Huy Quoc To, Florian Boudin, Akiko Aizawa

    Abstract: The number of Language Models (LMs) dedicated to processing scientific text is on the rise. Keeping pace with the rapid growth of scientific LMs (SciLMs) has become a daunting task for researchers. To date, no comprehensive surveys on SciLMs have been undertaken, leaving this issue unaddressed. Given the constant stream of new SciLMs, appraising the state-of-the-art and how they compare to each ot…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: Resources are available at https://github.com/Alab-NII/Awesome-SciLM

  9. Taxonomy of Mathematical Plagiarism

    Authors: Ankit Satpute, Andre Greiner-Petter, Noah Gießing, Isabel Beckenbach, Moritz Schubotz, Olaf Teschke, Akiko Aizawa, Bela Gipp

    Abstract: Plagiarism is a pressing concern, even more so with the availability of large language models. Existing plagiarism detection systems reliably find copied and moderately reworded text but fail for idea plagiarism, especially in mathematical science, which heavily uses formal mathematical notation. We make two contributions. First, we establish a taxonomy of mathematical content reuse by annotating…

    Submitted 31 May, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 46th European Conference on Information Retrieval (ECIR)

  10. arXiv:2312.15751  [pdf, other]

    cs.CL

    Solving Label Variation in Scientific Information Extraction via Multi-Task Learning

    Authors: Dong Pham, Xanh Ho, Quang-Thuy Ha, Akiko Aizawa

    Abstract: Scientific Information Extraction (ScientificIE) is a critical task that involves the identification of scientific entities and their relationships. The complexity of this task is compounded by the necessity for domain-specific knowledge and the limited availability of annotated data. Two of the most popular datasets for ScientificIE are SemEval-2018 Task-7 and SciERC. They have overlapping sample…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: 14 pages, 7 figures, PACLIC 37

  11. JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions

    Authors: Detai Xin, Junfeng Jiang, Shinnosuke Takamichi, Yuki Saito, Akiko Aizawa, Hiroshi Saruwatari

    Abstract: We present the JVNV, a Japanese emotional speech corpus with verbal content and nonverbal vocalizations whose scripts are generated by a large-scale language model. Existing emotional speech corpora lack not only proper emotional scripts but also nonverbal vocalizations (NVs) that are essential expressions in spoken language to express emotions. We propose an automatic script generation method to…

    Submitted 9 October, 2023; originally announced October 2023.

  12. arXiv:2306.02258  [pdf, other]

    cs.CL

    Probing Physical Reasoning with Counter-Commonsense Context

    Authors: Kazushi Kondo, Saku Sugawara, Akiko Aizawa

    Abstract: In this study, we create a CConS (Counter-commonsense Contextual Size comparison) dataset to investigate how physical commonsense affects the contextualized size comparison task; the proposed dataset consists of both contexts that fit physical commonsense and those that do not. This dataset tests the ability of language models to predict the size relationship between objects under various contexts…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: Accepted to ACL 2023 (Short Paper)

  13. arXiv:2305.13193  [pdf, other]

    cs.IR

    TEIMMA: The First Content Reuse Annotator for Text, Images, and Math

    Authors: Ankit Satpute, André Greiner-Petter, Moritz Schubotz, Norman Meuschke, Akiko Aizawa, Olaf Teschke, Bela Gipp

    Abstract: This demo paper presents the first tool to annotate the reuse of text, images, and mathematical formulae in a document pair -- TEIMMA. Annotating content reuse is particularly useful to develop plagiarism detection algorithms. Real-world content reuse is often obfuscated, which makes it challenging to identify such cases. TEIMMA allows entering the obfuscation type to enable novel classifications…

    Submitted 13 June, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

  14. arXiv:2305.08371  [pdf, other]

    cs.CL

    SuperDialseg: A Large-scale Dataset for Supervised Dialogue Segmentation

    Authors: Junfeng Jiang, Chengzhang Dong, Sadao Kurohashi, Akiko Aizawa

    Abstract: Dialogue segmentation is a crucial task for dialogue systems allowing a better understanding of conversational texts. Despite recent progress in unsupervised dialogue segmentation methods, their performances are limited by the lack of explicit supervised signals for training. Furthermore, the precise definition of segmentation points in conversations remains a challenging problem, increas…

    Submitted 15 October, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: Accepted as a Long Paper at EMNLP 2023 (main)

  15. Introducing MBIB -- the first Media Bias Identification Benchmark Task and Dataset Collection

    Authors: Martin Wessel, Tomáš Horych, Terry Ruas, Akiko Aizawa, Bela Gipp, Timo Spinde

    Abstract: Although media bias detection is a complex multi-task problem, there is, to date, no unified benchmark grouping these evaluation tasks. We introduce the Media Bias Identification Benchmark (MBIB), a comprehensive benchmark that groups different types of media bias (e.g., linguistic, cognitive, political) under a common framework to test how prospective detection techniques generalize. After review…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: To be published in Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '23)

  16. arXiv:2302.05963  [pdf, other]

    cs.CL

    Analyzing the Effectiveness of the Underlying Reasoning Tasks in Multi-hop Question Answering

    Authors: Xanh Ho, Anh-Khoa Duong Nguyen, Saku Sugawara, Akiko Aizawa

    Abstract: To explain the predicted answers and evaluate the reasoning abilities of models, several studies have utilized underlying reasoning (UR) tasks in multi-hop question answering (QA) datasets. However, it remains an open question as to how effective UR tasks are for the QA task when training models on both tasks in an end-to-end manner. In this study, we address this question by analyzing the effecti…

    Submitted 12 February, 2023; originally announced February 2023.

    Comments: Accepted by EACL 2023 (Findings)

  17. arXiv:2212.07075  [pdf, other]

    cs.CV cs.CL

    Cross-Modal Similarity-Based Curriculum Learning for Image Captioning

    Authors: Hongkuan Zhang, Saku Sugawara, Akiko Aizawa, Lei Zhou, Ryohei Sasano, Koichi Takeda

    Abstract: Image captioning models require the high-level generalization ability to describe the contents of various images in words. Most existing approaches treat the image-caption pairs equally in their training without considering the differences in their learning difficulties. Several image captioning approaches introduce curriculum learning methods that present training data with increasing levels of d…

    Submitted 14 December, 2022; originally announced December 2022.

    Comments: EMNLP 2022

  18. arXiv:2211.16220  [pdf, other]

    cs.CL

    Which Shortcut Solution Do Question Answering Models Prefer to Learn?

    Authors: Kazutoshi Shinoda, Saku Sugawara, Akiko Aizawa

    Abstract: Question answering (QA) models for reading comprehension tend to learn shortcut solutions rather than the solutions intended by QA datasets. QA models that have learned shortcut solutions can achieve human-level performance in shortcut examples where shortcuts are valid, but these same behaviors degrade generalization potential on anti-shortcut examples where shortcuts are invalid. Various methods…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: Accepted to AAAI 2023

  19. arXiv:2211.16093  [pdf, ps, other]

    cs.CL

    Penalizing Confident Predictions on Largely Perturbed Inputs Does Not Improve Out-of-Distribution Generalization in Question Answering

    Authors: Kazutoshi Shinoda, Saku Sugawara, Akiko Aizawa

    Abstract: Question answering (QA) models are shown to be insensitive to large perturbations to inputs; that is, they make correct and confident predictions even when given largely perturbed inputs from which humans cannot correctly derive answers. In addition, QA models fail to generalize to other domains and adversarial test sets, while humans maintain high accuracy. Based on these observations, we assume…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: Accepted to the KnowledgeNLP workshop at AAAI 2023

  20. Caching and Reproducibility: Making Data Science experiments faster and FAIRer

    Authors: Moritz Schubotz, Ankit Satpute, Andre Greiner-Petter, Akiko Aizawa, Bela Gipp

    Abstract: Small to medium-scale data science experiments often rely on research software developed ad-hoc by individual scientists or small teams. Often there is no time to make the research software fast, reusable, and open access. The consequence is twofold. First, subsequent researchers must spend significant work hours building upon the proposed hypotheses or experimental framework. In the worst case, o…

    Submitted 9 November, 2022; v1 submitted 8 November, 2022; originally announced November 2022.

    Comments: 8 pages, 1 table

    Journal ref: Frontiers in Research Metrics and Analytics, volume 7, 2022

  21. Exploiting Transformer-based Multitask Learning for the Detection of Media Bias in News Articles

    Authors: Timo Spinde, Jan-David Krieger, Terry Ruas, Jelena Mitrović, Franz Götz-Hahn, Akiko Aizawa, Bela Gipp

    Abstract: Media has a substantial impact on the public perception of events. A one-sided or polarizing perspective on any topic is usually described as media bias. One of the ways bias can be introduced in news articles is by altering word choice. Biased word choices are not always obvious, nor do they exhibit high context-dependency. Hence, detecting bias is often difficult. We propose a Transformer-ba…

    Submitted 7 November, 2022; originally announced November 2022.

    Journal ref: Proceedings of the iConference 2022

  22. arXiv:2210.16079  [pdf, other]

    cs.CL

    Debiasing Masks: A New Framework for Shortcut Mitigation in NLU

    Authors: Johannes Mario Meissner, Saku Sugawara, Akiko Aizawa

    Abstract: Debiasing language models from unwanted behaviors in Natural Language Understanding tasks is a topic with rapidly increasing interest in the NLP community. Spurious statistical correlations in the data allow models to perform shortcuts and avoid uncovering more advanced and desirable linguistic features. A multitude of effective debiasing approaches has been proposed, but flexibility remains a maj…

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022

  23. arXiv:2210.14541  [pdf, other]

    cs.CL

    Look to the Right: Mitigating Relative Position Bias in Extractive Question Answering

    Authors: Kazutoshi Shinoda, Saku Sugawara, Akiko Aizawa

    Abstract: Extractive question answering (QA) models tend to exploit spurious correlations to make predictions when a training set has unintended biases. This tendency results in models not being generalizable to examples where the correlations do not hold. Determining the spurious correlations QA models can exploit is crucial in building generalizable QA models in real-world applications; moreover, a method…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted to BlackboxNLP 2022

  24. arXiv:2210.05208  [pdf, other]

    cs.CL

    How Well Do Multi-hop Reading Comprehension Models Understand Date Information?

    Authors: Xanh Ho, Saku Sugawara, Akiko Aizawa

    Abstract: Several multi-hop reading comprehension datasets have been proposed to resolve the issue of reasoning shortcuts by which questions can be answered without performing multi-hop reasoning. However, the ability of multi-hop models to perform step-by-step reasoning when finding an answer to a comparison question remains unclear. It is also unclear how questions about the internal reasoning process are…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: 10 pages, 2 figures, and 8 tables; Accepted to AACL-IJCNLP 2022

  25. Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts

    Authors: Timo Spinde, Manuel Plank, Jan-David Krieger, Terry Ruas, Bela Gipp, Akiko Aizawa

    Abstract: Media coverage has a substantial effect on the public perception of events. Nevertheless, media outlets are often biased. One way to bias news articles is by altering the word choice. The automatic identification of bias by word choice is challenging, primarily due to the lack of a gold standard data set and high context dependencies. This paper presents BABE, a robust and diverse data set created…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: substantial text overlap with Ph.D. proposal by same author, part of dissertation arXiv:2112.13352

    Journal ref: Findings of the Association for Computational Linguistics: EMNLP 2021

  26. arXiv:2209.01824  [pdf, other]

    cs.CL

    A Survey on Measuring and Mitigating Reasoning Shortcuts in Machine Reading Comprehension

    Authors: Xanh Ho, Johannes Mario Meissner, Saku Sugawara, Akiko Aizawa

    Abstract: The issue of shortcut learning is widely known in NLP and has been an important research focus in recent years. Unintended correlations in the data enable models to easily solve tasks that were meant to exhibit advanced language understanding and reasoning capabilities. In this survey paper, we focus on the field of machine reading comprehension (MRC), an important task for showcasing high-level l…

    Submitted 6 September, 2023; v1 submitted 5 September, 2022; originally announced September 2022.

    Comments: 18 pages, 2 figures, 4 tables

  27. arXiv:2207.02463  [pdf, other]

    cs.CL

    Gender Biases and Where to Find Them: Exploring Gender Bias in Pre-Trained Transformer-based Language Models Using Movement Pruning

    Authors: Przemyslaw Joniak, Akiko Aizawa

    Abstract: Language model debiasing has emerged as an important field of study in the NLP community. Numerous debiasing techniques were proposed, but bias ablation remains an unaddressed issue. We demonstrate a novel framework for inspecting bias in pre-trained transformer-based language models via movement pruning. Given a model and a debiasing objective, our framework finds a subset of the model containing…

    Submitted 6 July, 2022; originally announced July 2022.

    Comments: Accepted to NAACL 2022, 4th Workshop on Gender Bias in Natural Language Processing

  28. arXiv:2203.07828  [pdf, other]

    cs.CL

    Do BERTs Learn to Use Browser User Interface? Exploring Multi-Step Tasks with Unified Vision-and-Language BERTs

    Authors: Taichi Iki, Akiko Aizawa

    Abstract: Pre-trained Transformers are good foundations for unified multi-task models owing to their task-agnostic representation. Pre-trained Transformers are often combined with text-to-text framework to execute multiple tasks by a single model. Performing a task through a graphical user interface (GUI) is another candidate to accommodate various tasks, including multi-step tasks with vision and language…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: Work in Progress

  29. Comparative Verification of the Digital Library of Mathematical Functions and Computer Algebra Systems

    Authors: André Greiner-Petter, Howard S. Cohl, Abdou Youssef, Moritz Schubotz, Avi Trost, Rajen Dey, Akiko Aizawa, Bela Gipp

    Abstract: Digital mathematical libraries assemble the knowledge of years of mathematical research. Numerous disciplines (e.g., physics, engineering, pure and applied mathematics) rely heavily on compendia gathered findings. Likewise, modern research applications rely more and more on computational solutions, which are often calculated and verified by computer algebra systems. Hence, the correctness, accurac…

    Submitted 31 March, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

    Journal ref: In: TACAS, Apr. 2022, pp. 87-105

  30. arXiv:2109.11256  [pdf, other]

    cs.CL cs.AI

    Can Question Generation Debias Question Answering Models? A Case Study on Question-Context Lexical Overlap

    Authors: Kazutoshi Shinoda, Saku Sugawara, Akiko Aizawa

    Abstract: Question answering (QA) models for reading comprehension have been demonstrated to exploit unintended dataset biases such as question-context lexical overlap. This hinders QA models from generalizing to under-represented samples such as questions with low lexical overlap. Question generation (QG), a method for augmenting QA datasets, can be a solution for such performance degradation if QG can pro…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: MRQA workshop at EMNLP 2021

  31. Keyphrase Generation for Scientific Document Retrieval

    Authors: Florian Boudin, Ygor Gallina, Akiko Aizawa

    Abstract: Sequence-to-sequence models have led to significant progress in keyphrase generation, but it remains unknown whether they are reliable enough to be beneficial for document retrieval. This study provides empirical evidence that such models can significantly improve retrieval performance, and introduces a new extrinsic evaluation framework that allows for a better understanding of the limitations o…

    Submitted 28 June, 2021; originally announced June 2021.

    Comments: Accepted at ACL 2020

  32. arXiv:2106.03020  [pdf, other]

    cs.CL

    Embracing Ambiguity: Shifting the Training Target of NLI Models

    Authors: Johannes Mario Meissner, Napat Thumwanit, Saku Sugawara, Akiko Aizawa

    Abstract: Natural Language Inference (NLI) datasets contain examples with highly ambiguous labels. While many research works do not pay much attention to this fact, several recent efforts have been made to acknowledge and embrace the existence of ambiguity, such as UNLI and ChaosNLI. In this paper, we explore the option of training directly on the estimated label distribution of the annotators in the NLI ta…

    Submitted 5 June, 2021; originally announced June 2021.

    Comments: Accepted to ACL 2021

  33. arXiv:2105.14207  [pdf, other]

    cs.CL cs.AI

    Maintaining Common Ground in Dynamic Environments

    Authors: Takuma Udagawa, Akiko Aizawa

    Abstract: Common grounding is the process of creating and maintaining mutual understandings, which is a critical aspect of sophisticated human communication. While various task settings have been proposed in existing literature, they mostly focus on creating common ground under static context and ignore the aspect of maintaining them over time under dynamic context. In this work, we propose a novel task sett…

    Submitted 29 May, 2021; originally announced May 2021.

    Comments: Accepted at TACL; pre-MIT Press publication version

  34. arXiv:2104.08066  [pdf, other]

    cs.CL cs.AI

    Effect of Visual Extensions on Natural Language Understanding in Vision-and-Language Models

    Authors: Taichi Iki, Akiko Aizawa

    Abstract: A method for creating a vision-and-language (V&L) model is to extend a language model through structural modifications and V&L pre-training. Such an extension aims to make a V&L model inherit the capability of natural language understanding (NLU) from the original language model. To see how well this is achieved, we propose to evaluate V&L models using an NLU benchmark (GLUE). We compare five V&L…

    Submitted 23 September, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: To appear at EMNLP 2021. Camera-ready version

  35. Multi-sense embeddings through a word sense disambiguation process

    Authors: Terry Ruas, William Grosky, Akiko Aizawa

    Abstract: Natural Language Understanding has seen an increasing number of publications in the last few years, especially after robust word embeddings models became prominent, when they proved themselves able to capture and represent semantic relationships from massive amounts of data. Nevertheless, traditional models often fall short in intrinsic issues of linguistics, such as polysemy and homonymy. Any exp…

    Submitted 19 December, 2022; v1 submitted 21 January, 2021; originally announced January 2021.

    Journal ref: Expert Systems with Applications. Volume 136, 1 December 2019, Pages 288-303

  36. arXiv:2011.01060  [pdf, other]

    cs.CL

    Constructing A Multi-hop QA Dataset for Comprehensive Evaluation of Reasoning Steps

    Authors: Xanh Ho, Anh-Khoa Duong Nguyen, Saku Sugawara, Akiko Aizawa

    Abstract: A multi-hop question answering (QA) dataset aims to test reasoning and inference skills by requiring a model to read multiple paragraphs to answer a given question. However, current datasets do not provide a complete explanation for the reasoning process from the question to the answer. Further, previous studies revealed that many examples in existing multi-hop datasets do not require multi-hop re…

    Submitted 12 November, 2020; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: Accepted by COLING 2020

  37. arXiv:2011.00483  [pdf, other]

    cs.CL cs.AI

    Deconstruct to Reconstruct a Configurable Evaluation Metric for Open-Domain Dialogue Systems

    Authors: Vitou Phy, Yang Zhao, Akiko Aizawa

    Abstract: Many automatic evaluation metrics have been proposed to score the overall quality of a response in open-domain dialogue. Generally, the overall quality is comprised of various aspects, such as relevancy, specificity, and empathy, and the importance of each aspect differs according to the task. For instance, specificity is mandatory in a food-ordering dialogue task, whereas fluency is preferred in…

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: 15 pages, 4 figures, 7 tables, Accepted to COLING 2020

  38. arXiv:2010.03127  [pdf, other]

    cs.CL cs.AI

    A Linguistic Analysis of Visually Grounded Dialogues Based on Spatial Expressions

    Authors: Takuma Udagawa, Takato Yamazaki, Akiko Aizawa

    Abstract: Recent models achieve promising results in visually grounded dialogues. However, existing datasets often contain undesirable biases and lack sophisticated linguistic analyses, which make it difficult to understand how well current models recognize their precise linguistic structures. To address this problem, we make two design choices: first, we focus on OneCommon Corpus…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: 16 pages, Findings of EMNLP 2020

  39. arXiv:2008.01523  [pdf, other]

    cs.CL

    A System for Worldwide COVID-19 Information Aggregation

    Authors: Akiko Aizawa, Frederic Bergeron, Junjie Chen, Fei Cheng, Katsuhiko Hayashi, Kentaro Inui, Hiroyoshi Ito, Daisuke Kawahara, Masaru Kitsuregawa, Hirokazu Kiyomaru, Masaki Kobayashi, Takashi Kodama, Sadao Kurohashi, Qianying Liu, Masaki Matsubara, Yusuke Miyao, Atsuyuki Morishima, Yugo Murawaki, Kazumasa Omura, Haiyue Song, Eiichiro Sumita, Shinji Suzuki, Ribeka Tanaka, Yu Tanaka, Masashi Toyoda, et al. (4 additional authors not shown)

    Abstract: The global pandemic of COVID-19 has made the public pay close attention to related news, covering various domains, such as sanitation, treatment, and effects on education. Meanwhile, the COVID-19 condition is very different among the countries (e.g., policies and development of the epidemic), and thus citizens would be interested in news in foreign countries. We build a system for worldwide COVID-…

    Submitted 11 October, 2020; v1 submitted 27 July, 2020; originally announced August 2020.

    Comments: Accepted to EMNLP 2020 Workshop NLP-COVID

  40. arXiv:2006.10334  [pdf, other]

    cs.CL cs.DL

    Extraction and Evaluation of Formulaic Expressions Used in Scholarly Papers

    Authors: Kenichi Iwatsuki, Florian Boudin, Akiko Aizawa

    Abstract: Formulaic expressions, such as 'in this paper we propose', are helpful for authors of scholarly papers because they convey communicative functions; in the above, it is 'showing the aim of this paper'. Thus, resources of formulaic expressions, such as a dictionary, that could be looked up easily would be useful. However, forms of formulaic expressions can often vary to a great extent. For example, '…

    Submitted 18 June, 2020; originally announced June 2020.

    Comments: 21 pages, 11 figures

  41. arXiv:2004.03238  [pdf, other]

    cs.CL cs.AI cs.LG

    Improving the Robustness of QA Models to Challenge Sets with Variational Question-Answer Pair Generation

    Authors: Kazutoshi Shinoda, Saku Sugawara, Akiko Aizawa

    Abstract: Question answering (QA) models for reading comprehension have achieved human-level accuracy on in-distribution test sets. However, they have been demonstrated to lack robustness to challenge sets, whose distribution is different from that of training sets. Existing data augmentation methods mitigate this problem by simply augmenting training sets with synthetic examples sampled from the same distr…

    Submitted 3 June, 2021; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: ACL-IJCNLP 2021 SRW

  42. arXiv:2004.01912  [pdf, other]

    cs.CL

    Benchmarking Machine Reading Comprehension: A Psychological Perspective

    Authors: Saku Sugawara, Pontus Stenetorp, Akiko Aizawa

    Abstract: Machine reading comprehension (MRC) has received considerable attention as a benchmark for natural language understanding. However, the conventional task design of MRC lacks explainability beyond the model interpretation, i.e., reading comprehension by a model cannot be explained in human terms. To this end, this position paper provides a theoretical basis for the design of MRC datasets based on p…

    Submitted 26 January, 2021; v1 submitted 4 April, 2020; originally announced April 2020.

    Comments: 21 pages, EACL 2021

  43. Discovering Mathematical Objects of Interest -- A Study of Mathematical Notations

    Authors: Andre Greiner-Petter, Moritz Schubotz, Fabian Mueller, Corinna Breitinger, Howard S. Cohl, Akiko Aizawa, Bela Gipp

    Abstract: Mathematical notation, i.e., the writing system used to communicate concepts in mathematics, encodes valuable information for a variety of information search and retrieval systems. Yet, mathematical notations remain mostly unutilized by today's systems. In this paper, we present the first in-depth study on the distributions of mathematical notation in two large scientific corpora: the open access…

    Submitted 22 June, 2021; v1 submitted 7 February, 2020; originally announced February 2020.

    Comments: Proceedings of The Web Conference 2020 (WWW'20), April 20–24, 2020, Taipei, Taiwan

  44. arXiv:2001.02462  [pdf, other]

    cs.AI cs.CL

    From Natural Language Instructions to Complex Processes: Issues in Chaining Trigger Action Rules

    Authors: Nobuhiro Ito, Yuya Suzuki, Akiko Aizawa

    Abstract: Automation services for complex business processes usually require a high level of information technology literacy. There is a strong demand for a smartly assisted process automation (IPA: intelligent process automation) service that enables even general users to easily use advanced automation. A natural language interface for such automation is expected as an elemental technology for the IPA real…

    Submitted 8 January, 2020; originally announced January 2020.

  45. arXiv:1911.09241  [pdf, ps, other]

    cs.CL

    Assessing the Benchmarking Capacity of Machine Reading Comprehension Datasets

    Authors: Saku Sugawara, Pontus Stenetorp, Kentaro Inui, Akiko Aizawa

    Abstract: Existing analysis work in machine reading comprehension (MRC) is largely concerned with evaluating the capabilities of systems. However, the capabilities of datasets are not assessed for benchmarking language understanding precisely. We propose a semi-automated, ablation-based methodology for this challenge; by checking whether questions can be solved even after removing features associated with a…

    Submitted 20 November, 2019; originally announced November 2019.

    Comments: 11 pages, AAAI2020, with extra examples, data: https://github.com/Alab-NII/mrc-ablation

  46. arXiv:1911.07588  [pdf, other]

    cs.CL cs.AI

    An Annotated Corpus of Reference Resolution for Interpreting Common Grounding

    Authors: Takuma Udagawa, Akiko Aizawa

    Abstract: Common grounding is the process of creating, repairing and updating mutual understandings, which is a fundamental aspect of natural language conversation. However, interpreting the process of common grounding is a challenging task, especially under continuous and partially-observable context where complex ambiguity, uncertainty, partial understandings and misunderstandings are introduced. Interpre…

    Submitted 18 November, 2019; originally announced November 2019.

    Comments: 9 pages, 7 figures, 6 tables, Accepted by AAAI 2020

  47. arXiv:1907.03399  [pdf, other]

    cs.CL cs.AI

    A Natural Language Corpus of Common Grounding under Continuous and Partially-Observable Context

    Authors: Takuma Udagawa, Akiko Aizawa

    Abstract: Common grounding is the process of creating, repairing and updating mutual understandings, which is a critical aspect of sophisticated human communication. However, traditional dialogue systems have limited capability of establishing common ground, and we also lack task formulations which introduce natural difficulty in terms of common grounding while enabling easy evaluation and analysis of compl…

    Submitted 8 July, 2019; originally announced July 2019.

    Comments: AAAI 2019

  48. arXiv:1905.08359  [pdf, other]

    cs.DL cs.AI cs.IR

    Why Machines Cannot Learn Mathematics, Yet

    Authors: André Greiner-Petter, Terry Ruas, Moritz Schubotz, Akiko Aizawa, William Grosky, Bela Gipp

    Abstract: Nowadays, Machine Learning (ML) is seen as the universal solution to improve the effectiveness of information retrieval (IR) methods. However, while mathematics is a precise and accurate science, it is usually expressed by less accurate and imprecise descriptions, contributing to the relative dearth of machine learning applications for IR in this domain. Generally, mathematical documents communica…

    Submitted 20 May, 2019; originally announced May 2019.

    Comments: Submitted to 4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries colocated at the 42nd International ACM SIGIR Conference

    Journal ref: 2019 http://ceur-ws.org/Vol-2414/paper14.pdf

  49. arXiv:1811.10364  [pdf]

    cs.IR cs.AI cs.DL cs.LG

    The Architecture of Mr. DLib's Scientific Recommender-System API

    Authors: Joeran Beel, Andrew Collins, Akiko Aizawa

    Abstract: Recommender systems in academia are not widely available. This may be in part due to the difficulty and cost of developing and maintaining recommender systems. Many operators of academic products such as digital libraries and reference managers avoid this effort, although a recommender system could provide significant benefits to their users. In this paper, we introduce Mr. DLib's "Recommendations…

    Submitted 26 November, 2018; originally announced November 2018.

  50. arXiv:1809.00665  [pdf, other]

    cs.CV

    Context-Patch Face Hallucination Based on Thresholding Locality-constrained Representation and Reproducing Learning

    Authors: Junjun Jiang, Yi Yu, Suhua Tang, Jiayi Ma, Akiko Aizawa, Kiyoharu Aizawa

    Abstract: Face hallucination is a technique that reconstructs high-resolution (HR) faces from low-resolution (LR) faces by using the prior knowledge learned from HR/LR face pairs. Most state-of-the-art methods leverage position-patch prior knowledge of the human face to estimate the optimal representation coefficients for each image patch. However, they focus only on the position information and usually ignore the contex…

    Submitted 14 September, 2018; v1 submitted 3 September, 2018; originally announced September 2018.

    Comments: 13 pages, 15 figures, Accepted by IEEE TCYB