Showing 1–25 of 25 results for author: Nishida, K

Searching in archive cs.
  1. arXiv:2407.05656  [pdf, other]

    cs.LG cs.CL

    Multi-label Learning with Random Circular Vectors

    Authors: Ken Nishida, Kojiro Machi, Kazuma Onishi, Katsuhiko Hayashi, Hidetaka Kamigaito

    Abstract: The extreme multi-label classification (XMC) task involves learning a classifier that can predict from a large label set the most relevant subset of labels for a data instance. While deep neural networks (DNNs) have demonstrated remarkable success in XMC problems, the task is still challenging because it must deal with a large number of output labels, which make the DNN training computationally ex…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 11 pages, 6 figures, 3 tables; accepted to workshop RepL4NLP held in conjunction with ACL 2024

  2. arXiv:2401.13313  [pdf, other]

    cs.CV cs.CL

    InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions

    Authors: Ryota Tanaka, Taichi Iki, Kyosuke Nishida, Kuniko Saito, Jun Suzuki

    Abstract: We study the problem of completing various visual document understanding (VDU) tasks, e.g., question answering and information extraction, on real-world documents through human-written instructions. To this end, we propose InstructDoc, the first large-scale collection of 30 publicly available VDU datasets, each with diverse instructions in a unified format, which covers a wide range of 12 tasks an…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI2024; project page: https://github.com/nttmdlab-nlp/InstructDoc

  3. arXiv:2306.12820  [pdf, other]

    cs.SD eess.AS

    NoisyILRMA: Diffuse-Noise-Aware Independent Low-Rank Matrix Analysis for Fast Blind Source Extraction

    Authors: Koki Nishida, Norihiro Takamune, Rintaro Ikeshita, Daichi Kitamura, Hiroshi Saruwatari, Tomohiro Nakatani

    Abstract: In this paper, we address the multichannel blind source extraction (BSE) of a single source in diffuse noise environments. To solve this problem even faster than by fast multichannel nonnegative matrix factorization (FastMNMF) and its variant, we propose a BSE method called NoisyILRMA, which is a modification of independent low-rank matrix analysis (ILRMA) to account for diffuse noise. NoisyILRMA…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 5 pages, 3 figures, accepted for European Signal Processing Conference 2023 (EUSIPCO 2023)

  4. arXiv:2304.00964  [pdf, other]

    cs.CV cs.CL

    Robust Text-driven Image Editing Method that Adaptively Explores Directions in Latent Spaces of StyleGAN and CLIP

    Authors: Tsuyoshi Baba, Kosuke Nishida, Kyosuke Nishida

    Abstract: Automatic image editing has great demands because of its numerous applications, and the use of natural language instructions is essential to achieving flexible and intuitive editing as the user imagines. A pioneering work in text-driven image editing, StyleCLIP, finds an edit direction in the CLIP space and then edits the image by mapping the direction to the StyleGAN space. At the same time, it i…

    Submitted 3 April, 2023; originally announced April 2023.

  5. arXiv:2301.04883  [pdf, other]

    cs.CL cs.CV

    SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images

    Authors: Ryota Tanaka, Kyosuke Nishida, Kosuke Nishida, Taku Hasegawa, Itsumi Saito, Kuniko Saito

    Abstract: Visual question answering on document images that contain textual, visual, and layout information, called document VQA, has received much attention recently. Although many datasets have been proposed for developing document VQA systems, most of the existing datasets focus on understanding the content relationships within a single image and not across multiple images. In this study, we propose a ne…

    Submitted 12 January, 2023; originally announced January 2023.

    Comments: Accepted by AAAI2023

  6. arXiv:2210.07523  [pdf, other]

    cs.CL

    Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge

    Authors: Kosuke Nishida, Naoki Yoshinaga, Kyosuke Nishida

    Abstract: Although named entity recognition (NER) helps us to extract domain-specific entities from text (e.g., artists in the music domain), it is costly to create a large amount of training data or a structured knowledge base to perform accurate NER in the target domain. Here, we propose self-adaptive NER, which retrieves external knowledge from unstructured text to learn the usages of entities that have…

    Submitted 6 June, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: EACL2023 (long)

  7. arXiv:2207.03133  [pdf, other]

    cs.CL cs.CV

    Improving Few-Shot Image Classification Using Machine- and User-Generated Natural Language Descriptions

    Authors: Kosuke Nishida, Kyosuke Nishida, Shuichi Nishioka

    Abstract: Humans can obtain the knowledge of novel visual concepts from language descriptions, and we thus use the few-shot image classification task to investigate whether a machine learning model can have this capability. Our proposed model, LIDE (Learning from Image and DEscription), has a text decoder to generate the descriptions and a text encoder to obtain the text representations of machine- or user-…

    Submitted 7 July, 2022; originally announced July 2022.

    Comments: Findings of NAACL2022

  8. arXiv:2204.13361  [pdf, other]

    cs.LG

    It's DONE: Direct ONE-shot learning with quantile weight imprinting

    Authors: Kazufumi Hosoda, Keigo Nishida, Shigeto Seno, Tomohiro Mashita, Hideki Kashioka, Izumi Ohzawa

    Abstract: Learning a new concept from one example is a superior function of the human brain and it is drawing attention in the field of machine learning as a one-shot learning task. In this paper, we propose one of the simplest methods for this task with a nonparametric weight imprinting, named Direct ONE-shot learning (DONE). DONE adds new classes to a pretrained deep neural network (DNN) classifier with n…

    Submitted 2 November, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

    Comments: 12 pages, 5 figures

  9. arXiv:2203.01535  [pdf, ps, other]

    stat.ME cs.NE stat.CO stat.ML

    Kernel Density Estimation by Genetic Algorithm

    Authors: Kiheiji Nishida

    Abstract: This study proposes a data condensation method for multivariate kernel density estimation by genetic algorithm. First, our proposed algorithm generates multiple subsamples of a given size with replacement from the original sample. The subsamples and their constituting data points are regarded as chromosome and gene, respectively, in the terminology of genetic algorithm. Second, each…

    Submitted 3 March, 2022; originally announced March 2022.

  10. Towards Interpretable and Reliable Reading Comprehension: A Pipeline Model with Unanswerability Prediction

    Authors: Kosuke Nishida, Kyosuke Nishida, Itsumi Saito, Sen Yoshida

    Abstract: Multi-hop QA with annotated supporting facts, which is the task of reading comprehension (RC) considering the interpretability of the answer, has been extensively studied. In this study, we define an interpretable reading comprehension (IRC) model as a pipeline model with the capability of predicting unanswerable queries. The IRC model justifies the answer prediction by establishing consistency be…

    Submitted 18 November, 2021; v1 submitted 17 November, 2021; originally announced November 2021.

    Comments: IJCNN 2021 (https://ieeexplore.ieee.org/abstract/document/9534370)

    Journal ref: International Joint Conference on Neural Networks (IJCNN), 2021, pp. 1-8

  11. arXiv:2111.07979  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS eess.SY q-bio.NC

    Metric-based multimodal meta-learning for human movement identification via footstep recognition

    Authors: Muhammad Shakeel, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai

    Abstract: We describe a novel metric-based learning approach that introduces a multimodal framework and uses deep audio and geophone encoders in siamese configuration to design an adaptable and lightweight supervised model. This framework eliminates the need for expensive data labeling procedures and learns general-purpose representations from low multisensory data obtained from omnipresent sensing systems.…

    Submitted 15 November, 2021; originally announced November 2021.

  12. arXiv:2109.08354  [pdf, other]

    cs.CL

    Task-adaptive Pre-training of Language Models with Word Embedding Regularization

    Authors: Kosuke Nishida, Kyosuke Nishida, Sen Yoshida

    Abstract: Pre-trained language models (PTLMs) acquire domain-independent linguistic knowledge through pre-training with massive textual resources. Additional pre-training is effective in adapting PTLMs to domains that are not well covered by the pre-training corpora. Here, we focus on the static word embeddings of PTLMs for domain adaptation to teach PTLMs domain-specific meanings of words. We propose a nov…

    Submitted 17 September, 2021; originally announced September 2021.

    Comments: ACL Findings 2021

  13. arXiv:2107.13430  [pdf, ps, other]

    stat.ML cs.LG stat.CO stat.ME

    Kernel Density Estimation by Stagewise Algorithm with a Simple Dictionary

    Authors: Kiheiji Nishida, Kanta Naito

    Abstract: This study proposes multivariate kernel density estimation by stagewise minimization algorithm based on $U$-divergence and a simple dictionary. The dictionary consists of an appropriate scalar bandwidth matrix and a part of the original data. The resulting estimator brings us data-adaptive weighting parameters and bandwidth matrices, and realizes a sparse representation of kernel density estimatio…

    Submitted 10 August, 2021; v1 submitted 27 July, 2021; originally announced July 2021.

  14. arXiv:2101.11272  [pdf, other]

    cs.CL cs.CV

    VisualMRC: Machine Reading Comprehension on Document Images

    Authors: Ryota Tanaka, Kyosuke Nishida, Sen Yoshida

    Abstract: Recent studies on machine reading comprehension have focused on text-level understanding but have not yet reached the level of human understanding of the visual layout and content of real-world documents. In this study, we introduce a new visual machine reading comprehension dataset, named VisualMRC, wherein given a question and a document image, a machine reads and comprehends texts in the image…

    Submitted 10 May, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

    Comments: Accepted as a full paper at AAAI 2021. The first two authors have equal contribution

  15. arXiv:2007.00222  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    A Transformer-based Audio Captioning Model with Keyword Estimation

    Authors: Yuma Koizumi, Ryo Masumura, Kyosuke Nishida, Masahiro Yasuda, Shoichiro Saito

    Abstract: One of the problems with automated audio captioning (AAC) is the indeterminacy in word selection corresponding to the audio event/scene. Since one acoustic event/scene can be described with several words, it results in a combinatorial explosion of possible captions and difficulty in training. To solve this problem, we propose a Transformer-based audio-captioning model with keyword estimation calle…

    Submitted 8 August, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: Accepted to Interspeech 2020

  16. arXiv:2003.13028  [pdf, other]

    cs.CL

    Abstractive Summarization with Combination of Pre-trained Sequence-to-Sequence and Saliency Models

    Authors: Itsumi Saito, Kyosuke Nishida, Kosuke Nishida, Junji Tomita

    Abstract: Pre-trained sequence-to-sequence (seq-to-seq) models have significantly improved the accuracy of several language generation tasks, including abstractive summarization. Although the fluency of abstractive summarization has been greatly improved by fine-tuning these models, it is not clear whether they can also identify the important parts of the source text to be included in the summary. In this s…

    Submitted 29 March, 2020; originally announced March 2020.

    Comments: Work in progress

  17. arXiv:2001.07331  [pdf, ps, other]

    cs.CL

    Length-controllable Abstractive Summarization by Guiding with Summary Prototype

    Authors: Itsumi Saito, Kyosuke Nishida, Kosuke Nishida, Atsushi Otsuka, Hisako Asano, Junji Tomita, Hiroyuki Shindo, Yuji Matsumoto

    Abstract: We propose a new length-controllable abstractive summarization model. Recent state-of-the-art abstractive summarization models based on encoder-decoder models generate only one summary per source text. However, controllable summarization, especially of the length, is an important aspect for practical applications. Previous studies on length-controllable abstractive summarization incorporate length…

    Submitted 20 January, 2020; originally announced January 2020.

  18. arXiv:1911.10768  [pdf, ps, other]

    cs.CL

    Unsupervised Domain Adaptation of Language Models for Reading Comprehension

    Authors: Kosuke Nishida, Kyosuke Nishida, Itsumi Saito, Hisako Asano, Junji Tomita

    Abstract: This study tackles unsupervised domain adaptation of reading comprehension (UDARC). Reading comprehension (RC) is a task to learn the capability for question answering with textual sources. State-of-the-art models on RC still do not have general linguistic intelligence; i.e., their accuracy worsens for out-domain datasets that are not used in the training. We hypothesize that this discrepancy is c…

    Submitted 21 May, 2020; v1 submitted 25 November, 2019; originally announced November 2019.

    Comments: LREC2020

  19. arXiv:1905.12848  [pdf, other]

    cs.CL

    A Simple but Effective Method to Incorporate Multi-turn Context with BERT for Conversational Machine Comprehension

    Authors: Yasuhito Ohsugi, Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita

    Abstract: Conversational machine comprehension (CMC) requires understanding the context of multi-turn dialogue. Using BERT, a pre-training language model, has been successful for single-turn machine comprehension, while modeling multiple turns of question answering with BERT has not been established because BERT has a limit on the number and the length of input sequences. In this paper, we propose a simple…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: Accepted at ACL 2019 Workshop on NLP for Conversational AI (NLP4ConvAI)

  20. arXiv:1905.08537  [pdf, other]

    cs.LG cs.NE stat.ML

    Adaptive Stochastic Natural Gradient Method for One-Shot Neural Architecture Search

    Authors: Youhei Akimoto, Shinichi Shirakawa, Nozomu Yoshinari, Kento Uchida, Shota Saito, Kouhei Nishida

    Abstract: High sensitivity of neural architecture search (NAS) methods against their input such as step-size (i.e., learning rate) and search space prevents practitioners from applying them out-of-the-box to their own problems, albeit its purpose is to automate a part of tuning process. Aiming at a fast, robust, and widely-applicable NAS, we develop a generic optimization framework for NAS. We turn a couple…

    Submitted 21 May, 2019; originally announced May 2019.

    Comments: Accepted to ICML 2019. Code is available at https://github.com/shirakawas/ASNG-NAS

  21. arXiv:1905.08511  [pdf, ps, other]

    cs.CL

    Answering while Summarizing: Multi-task Learning for Multi-hop QA with Evidence Extraction

    Authors: Kosuke Nishida, Kyosuke Nishida, Masaaki Nagata, Atsushi Otsuka, Itsumi Saito, Hisako Asano, Junji Tomita

    Abstract: Question answering (QA) using textual sources for purposes such as reading comprehension (RC) has attracted much attention. This study focuses on the task of explainable multi-hop QA, which requires the system to return the answer with evidence sentences by reasoning and gathering disjoint pieces of the reference texts. It proposes the Query Focused Extractor (QFE) model for evidence extraction an…

    Submitted 28 May, 2019; v1 submitted 21 May, 2019; originally announced May 2019.

    Comments: Accepted as a long paper at ACL 2019

  22. arXiv:1901.06257  [pdf, other]

    cs.CY cs.IR

    Personalized Visited-POI Assignment to Individual Raw GPS Trajectories

    Authors: Jun Suzuki, Yoshihiko Suhara, Hiroyuki Toda, Kyosuke Nishida

    Abstract: Knowledge discovery from GPS trajectory data is an important topic in several scientific areas, including data mining, human behavior analysis, and user modeling. This paper proposes a task that assigns personalized visited-POIs. Its goal is to estimate fine-grained and pre-defined locations (i.e., points of interest (POI)) that are actually visited by users and assign visited-location information…

    Submitted 11 January, 2019; originally announced January 2019.

    Comments: 31 pages, 10 figures

    Journal ref: ACM Transactions on Spatial Algorithms and Systems (TSAS) Volume 5 Issue 3, September 2019

  23. arXiv:1901.02262  [pdf, ps, other]

    cs.CL

    Multi-style Generative Reading Comprehension

    Authors: Kyosuke Nishida, Itsumi Saito, Kosuke Nishida, Kazutoshi Shinoda, Atsushi Otsuka, Hisako Asano, Junji Tomita

    Abstract: This study tackles generative reading comprehension (RC), which consists of answering questions based on textual evidence and natural language generation (NLG). We propose a multi-style abstractive summarization model for question answering, called Masque. The proposed model has two key characteristics. First, unlike most studies on RC that have focused on extracting an answer span from the provid…

    Submitted 27 May, 2019; v1 submitted 8 January, 2019; originally announced January 2019.

    Comments: Accepted as a long paper at ACL 2019

  24. arXiv:1809.06517  [pdf, other]

    cs.LG math.OC stat.ML

    Parameterless Stochastic Natural Gradient Method for Discrete Optimization and its Application to Hyper-Parameter Optimization for Neural Network

    Authors: Kouhei Nishida, Hernan Aguirre, Shota Saito, Shinichi Shirakawa, Youhei Akimoto

    Abstract: Black box discrete optimization (BBDO) appears in wide range of engineering tasks. Evolutionary or other BBDO approaches have been applied, aiming at automating necessary tuning of system parameters, such as hyper parameter tuning of machine learning based systems when being installed for a specific task. However, automation is often jeopardized by the need of strategy parameter tuning for BBDO al…

    Submitted 17 September, 2018; originally announced September 2018.

  25. Retrieve-and-Read: Multi-task Learning of Information Retrieval and Reading Comprehension

    Authors: Kyosuke Nishida, Itsumi Saito, Atsushi Otsuka, Hisako Asano, Junji Tomita

    Abstract: This study considers the task of machine reading at scale (MRS) wherein, given a question, a system first performs the information retrieval (IR) task of finding relevant passages in a knowledge source and then carries out the reading comprehension (RC) task of extracting an answer span from the passages. Previous MRS studies, in which the IR component was trained without considering answer spans,…

    Submitted 31 August, 2018; originally announced August 2018.

    Comments: 10 pages, 6 figures. Accepted as a full paper at CIKM 2018

    Journal ref: CIKM 2018, October 22-26, 2018, Torino, Italy