
Showing 1–8 of 8 results for author: Kambara, M

Searching in archive cs.
  1. arXiv:2407.13186

    cs.RO

    Nearest Neighbor Future Captioning: Generating Descriptions for Possible Collisions in Object Placement Tasks

    Authors: Takumi Komatsu, Motonari Kambara, Shumpei Hatanaka, Haruka Matsuo, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi, Komei Sugiura

    Abstract: Domestic service robots (DSRs) that support people in everyday environments have been widely investigated. However, their ability to predict and describe future risks resulting from their own actions remains insufficient. In this study, we focus on the linguistic explainability of DSRs. Most existing methods do not explicitly model the region of possible collisions; thus, they do not properly gene…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted for presentation at Advanced Robotics 24

  2. arXiv:2407.00985

    cs.RO cs.CV

    Object Segmentation from Open-Vocabulary Manipulation Instructions Based on Optimal Transport Polygon Matching with Multimodal Foundation Models

    Authors: Takayuki Nishimura, Katsuyuki Kuyo, Motonari Kambara, Komei Sugiura

    Abstract: We consider the task of generating segmentation masks for the target object from an object manipulation instruction, which allows users to give open vocabulary instructions to domestic service robots. Conventional segmentation generation approaches often fail to account for objects outside the camera's field of view and cases in which the order of vertices differs but still represents the same pol…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted for presentation at IROS2024

  3. arXiv:2312.15844

    cs.RO cs.CL cs.CV

    Learning-To-Rank Approach for Identifying Everyday Objects Using a Physical-World Search Engine

    Authors: Kanta Kaneda, Shunya Nagashima, Ryosuke Korekata, Motonari Kambara, Komei Sugiura

    Abstract: Domestic service robots offer a solution to the increasing demand for daily care and support. A human-in-the-loop approach that combines automation and operator intervention is considered to be a realistic approach to their use in society. Therefore, we focus on the task of retrieving target objects from open-vocabulary user instructions in a human-in-the-loop setting, which we define as the learn…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: Accepted for RAL 2023

  4. arXiv:2311.06855

    cs.CV cs.CL cs.RO

    DialMAT: Dialogue-Enabled Transformer with Moment-Based Adversarial Training

    Authors: Kanta Kaneda, Ryosuke Korekata, Yuiga Wada, Shunya Nagashima, Motonari Kambara, Yui Iioka, Haruka Matsuo, Yuto Imai, Takayuki Nishimura, Komei Sugiura

    Abstract: This paper focuses on the DialFRED task, which is the task of embodied instruction following in a setting where an agent can actively ask questions about the task. To address this task, we propose DialMAT. DialMAT introduces Moment-based Adversarial Training, which incorporates adversarial perturbations into the latent space of language, image, and action. Additionally, it introduces a crossmodal…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: Accepted for presentation at Fourth Annual Embodied AI Workshop at CVPR

  5. arXiv:2311.04260

    cs.RO cs.CL cs.CV

    Fully Automated Task Management for Generation, Execution, and Evaluation: A Framework for Fetch-and-Carry Tasks with Natural Language Instructions in Continuous Space

    Authors: Motonari Kambara, Komei Sugiura

    Abstract: This paper aims to develop a framework that enables a robot to execute tasks based on visual information, in response to natural language instructions for Fetch-and-Carry with Object Grounding (FCOG) tasks. Although there have been many frameworks, they usually rely on manually given instruction sentences. Therefore, evaluations have only been conducted with fixed tasks. Furthermore, many multimod…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Accepted for presentation at CVPR 2023 Embodied AI Workshop

  6. arXiv:2307.07166

    cs.RO cs.CL cs.CV

    Switching Head-Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-and-Carry Tasks

    Authors: Ryosuke Korekata, Motonari Kambara, Yu Yoshida, Shintaro Ishikawa, Yosuke Kawasaki, Masaki Takahashi, Komei Sugiura

    Abstract: This paper describes a domestic service robot (DSR) that fetches everyday objects and carries them to specified destinations according to free-form natural language instructions. Given an instruction such as "Move the bottle on the left side of the plate to the empty chair," the DSR is expected to identify the bottle and the chair from multiple candidates in the environment and carry the target ob…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: Accepted for presentation at IROS2023

  7. arXiv:2207.09083

    cs.RO cs.CL cs.CV

    Relational Future Captioning Model for Explaining Likely Collisions in Daily Tasks

    Authors: Motonari Kambara, Komei Sugiura

    Abstract: Domestic service robots that support daily tasks are a promising solution for elderly or disabled people. It is crucial for domestic service robots to explain the collision risk before they perform actions. In this paper, our aim is to generate a caption about a future event. We propose the Relational Future Captioning Model (RFCM), a crossmodal language generation model for the future captioning…

    Submitted 19 July, 2022; originally announced July 2022.

    Comments: Accepted for presentation at ICIP2022

  8. arXiv:2107.00789

    cs.RO cs.CL cs.CV

    Case Relation Transformer: A Crossmodal Language Generation Model for Fetching Instructions

    Authors: Motonari Kambara, Komei Sugiura

    Abstract: There have been many studies in robotics to improve the communication skills of domestic service robots. Most studies, however, have not fully benefited from recent advances in deep neural networks because the training datasets are not large enough. In this paper, our aim is to augment the datasets based on a crossmodal language generation model. We propose the Case Relation Transformer (CRT), whi…

    Submitted 1 July, 2021; originally announced July 2021.

    Comments: Accepted for presentation at IROS2021