Showing 1–6 of 6 results for author: Sone, K

  1. arXiv:2104.09061

    cs.CL

    Improving Faithfulness in Abstractive Summarization with Contrast Candidate Generation and Selection

    Authors: Sihao Chen, Fan Zhang, Kazoo Sone, Dan Roth

    Abstract: Despite significant progress in neural abstractive summarization, recent studies have shown that the current models are prone to generating summaries that are unfaithful to the original context. To address the issue, we study contrast candidate generation and selection as a model-agnostic post-processing technique to correct the extrinsic hallucinations (i.e. information not present in the source…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: NAACL'21

  2. arXiv:2103.16561

    cs.CV cs.AI cs.CL

    Diagnosing Vision-and-Language Navigation: What Really Matters

    Authors: Wanrong Zhu, Yuankai Qi, Pradyumna Narayana, Kazoo Sone, Sugato Basu, Xin Eric Wang, Qi Wu, Miguel Eckstein, William Yang Wang

    Abstract: Vision-and-language navigation (VLN) is a multimodal task where an agent follows natural language instructions and navigates in visual environments. Multiple setups have been proposed, and researchers apply new model architectures or training techniques to boost navigation performance. However, there still exist non-negligible gaps between machines' performance and human benchmarks. Moreover, the…

    Submitted 4 May, 2022; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: NAACL 2022

  3. arXiv:2010.03644

    cs.CL cs.AI cs.CV

    Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations

    Authors: Wanrong Zhu, Xin Eric Wang, Pradyumna Narayana, Kazoo Sone, Sugato Basu, William Yang Wang

    Abstract: A major challenge in visually grounded language generation is to build robust benchmark datasets and models that can generalize well in real-world settings. To do this, it is critical to ensure that our evaluation protocols are correct, and benchmarks are reliable. In this work, we set forth to design a set of experiments to understand an important but often ignored problem in visually grounded la…

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  4. arXiv:2007.00229

    cs.CL cs.AI cs.CV

    Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation

    Authors: Wanrong Zhu, Xin Eric Wang, Tsu-Jui Fu, An Yan, Pradyumna Narayana, Kazoo Sone, Sugato Basu, William Yang Wang

    Abstract: One of the most challenging topics in Natural Language Processing (NLP) is visually-grounded language understanding and reasoning. Outdoor vision-and-language navigation (VLN) is such a task where an agent follows natural language instructions and navigates a real-life urban environment. Due to the lack of human-annotated instructions that illustrate intricate urban scenes, outdoor VLN remains a c…

    Submitted 3 February, 2021; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: EACL 2021

  5. arXiv:2006.08686

    cs.CV cs.LG

    Multi-Image Summarization: Textual Summary from a Set of Cohesive Images

    Authors: Nicholas Trieu, Sebastian Goodman, Pradyumna Narayana, Kazoo Sone, Radu Soricut

    Abstract: Multi-sentence summarization is a well studied problem in NLP, while generating image descriptions for a single image is a well studied problem in Computer Vision. However, for applications such as image cluster labeling or web page summarization, summarizing a set of images is also a useful and challenging task. This paper proposes the new task of multi-image summarization, which aims to generate…

    Submitted 15 June, 2020; originally announced June 2020.

    Comments: 9 pages, 5 figures

  6. arXiv:1911.05978

    cs.CV cs.CL cs.LG

    HUSE: Hierarchical Universal Semantic Embeddings

    Authors: Pradyumna Narayana, Aniket Pednekar, Abishek Krishnamoorthy, Kazoo Sone, Sugato Basu

    Abstract: There is a recent surge of interest in cross-modal representation learning corresponding to images and text. The main challenge lies in mapping images and text to a shared latent space where the embeddings corresponding to a similar semantic concept lie closer to each other than the embeddings corresponding to different semantic concepts, irrespective of the modality. Ranking losses are commonly u…

    Submitted 14 November, 2019; originally announced November 2019.