
Showing 1–33 of 33 results for author: Schwenk, H

Searching in archive cs.
  1. arXiv:2312.05187  [pdf, other]

    cs.CL cs.SD eess.AS

    Seamless: Multilingual Expressive and Streaming Speech Translation

    Authors: Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Coria Meglioli, David Dale, Ning Dong, Mark Duppenthaler, Paul-Ambroise Duquenne, Brian Ellis, Hady Elsahar, Justin Haaheim, John Hoffman, Min-Jae Hwang, Hirofumi Inaguma, Christopher Klaiber, Ilia Kulikov, Pengwei Li, Daniel Licht, Jean Maillard, Ruslan Mavlyutov, Alice Rakotoarison, Kaushik Ram Sadagopan, Abinesh Ramakrishnan, Tuan Tran, Guillaume Wenzek, et al. (40 additional authors not shown)

    Abstract: Large-scale automatic speech translation systems today lack key features that help machine-mediated communication feel seamless when compared to human-to-human dialogue. In this work, we introduce a family of models that enable end-to-end expressive and multilingual translations in a streaming fashion. First, we contribute an improved version of the massively multilingual and multimodal SeamlessM4… ▽ More

    Submitted 8 December, 2023; originally announced December 2023.

  2. Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer

    Authors: Paul-Ambroise Duquenne, Holger Schwenk, Benoît Sagot

    Abstract: Recent research has shown that independently trained encoders and decoders, combined through a shared fixed-size representation, can achieve competitive performance in speech-to-text translation. In this work, we show that this type of approach can be further improved with multilingual training. We observe significant improvements in zero-shot cross-modal speech translation, even outperforming a s… ▽ More

    Submitted 5 October, 2023; originally announced October 2023.

    Journal ref: Proceedings of Interspeech 2023

  3. arXiv:2308.11596  [pdf, other]

    cs.CL

    SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

    Authors: Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Coria Meglioli, David Dale, Ning Dong, Paul-Ambroise Duquenne, Hady Elsahar, Hongyu Gong, Kevin Heffernan, John Hoffman, Christopher Klaiber, Pengwei Li, Daniel Licht, Jean Maillard, Alice Rakotoarison, Kaushik Ram Sadagopan, Guillaume Wenzek, Ethan Ye, Bapi Akula, Peng-Jen Chen, Naji El Hachem, Brian Ellis, Gabriel Mejia Gonzalez, Justin Haaheim, et al. (43 additional authors not shown)

    Abstract: What does it take to create the Babel Fish, a tool that can help individuals translate speech between any two languages? While recent breakthroughs in text-based models have pushed machine translation coverage beyond 200 languages, unified speech-to-speech translation models have yet to achieve similar strides. More specifically, conventional speech-to-speech translation systems rely on cascaded s… ▽ More

    Submitted 24 October, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    ACM Class: I.2.7

  4. arXiv:2308.11466  [pdf, other]

    cs.CL

    SONAR: Sentence-Level Multimodal and Language-Agnostic Representations

    Authors: Paul-Ambroise Duquenne, Holger Schwenk, Benoît Sagot

    Abstract: We introduce SONAR, a new multilingual and multimodal fixed-size sentence embedding space. Our single text encoder, covering 200 languages, substantially outperforms existing sentence embeddings such as LASER3 and LaBSE on the xsim and xsim++ multilingual similarity search tasks. Speech segments can be embedded in the same SONAR embedding space using language-specific speech encoders trained in a… ▽ More

    Submitted 23 August, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

  5. arXiv:2306.12907  [pdf, other]

    cs.CL

    xSIM++: An Improved Proxy to Bitext Mining Performance for Low-Resource Languages

    Authors: Mingda Chen, Kevin Heffernan, Onur Çelebi, Alex Mourachko, Holger Schwenk

    Abstract: We introduce a new proxy score for evaluating bitext mining based on similarity in a multilingual embedding space: xSIM++. In comparison to xSIM, this improved proxy leverages rule-based approaches to extend English sentences in any evaluation set with synthetic, hard-to-distinguish examples which more closely mirror the scenarios we encounter during large-scale mining. We validate this proxy by r… ▽ More

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: The first two authors contributed equally; ACL 2023 short; Code and data are available at https://github.com/facebookresearch/LASER

  6. arXiv:2212.08486  [pdf, other]

    cs.CL

    BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric

    Authors: Mingda Chen, Paul-Ambroise Duquenne, Pierre Andrews, Justine Kao, Alexandre Mourachko, Holger Schwenk, Marta R. Costa-jussà

    Abstract: End-to-End speech-to-speech translation (S2ST) is generally evaluated with text-based metrics. This means that generated speech has to be automatically transcribed, making the evaluation dependent on the availability and quality of automatic speech recognition (ASR) systems. In this paper, we propose a text-free evaluation metric for end-to-end S2ST, named BLASER, to avoid the dependency on ASR sy… ▽ More

    Submitted 16 December, 2022; originally announced December 2022.

    ACM Class: I.2.7

  7. arXiv:2211.06474  [pdf, other]

    cs.CL cs.SD eess.AS

    Speech-to-Speech Translation For A Real-world Unwritten Language

    Authors: Peng-Jen Chen, Kevin Tran, Yilin Yang, Jingfei Du, Justine Kao, Yu-An Chung, Paden Tomasello, Paul-Ambroise Duquenne, Holger Schwenk, Hongyu Gong, Hirofumi Inaguma, Sravya Popuri, Changhan Wang, Juan Pino, Wei-Ning Hsu, Ann Lee

    Abstract: We study speech-to-speech translation (S2ST) that translates speech from one language into another language and focuses on building systems to support languages without standard text writing systems. We use English-Taiwanese Hokkien as a case study, and present an end-to-end solution from training data collection, modeling choices to benchmark dataset release. First, we present efforts on creating… ▽ More

    Submitted 11 November, 2022; originally announced November 2022.

  8. arXiv:2211.04508  [pdf, other]

    cs.CL cs.SD eess.AS

    SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations

    Authors: Paul-Ambroise Duquenne, Hongyu Gong, Ning Dong, Jingfei Du, Ann Lee, Vedanuj Goswami, Changhan Wang, Juan Pino, Benoît Sagot, Holger Schwenk

    Abstract: We present SpeechMatrix, a large-scale multilingual corpus of speech-to-speech translations mined from real speech of European Parliament recordings. It contains speech alignments in 136 language pairs with a total of 418 thousand hours of speech. To evaluate the quality of this parallel speech, we train bilingual speech-to-speech translation models on mined data only and establish extensive basel… ▽ More

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: 18 pages

  9. arXiv:2210.11427  [pdf, other]

    cs.CV

    DiffEdit: Diffusion-based semantic image editing with mask guidance

    Authors: Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord

    Abstract: Image generation has recently seen tremendous advances, with diffusion models allowing to synthesize convincing images for a large variety of text prompts. In this article, we propose DiffEdit, a method to take advantage of text-conditioned diffusion models for the task of semantic image editing, where the goal is to edit an image based on a text query. Semantic image editing is an extension of im… ▽ More

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: Preprint

  10. arXiv:2210.05033  [pdf, other]

    cs.CL

    Multilingual Representation Distillation with Contrastive Learning

    Authors: Weiting Tan, Kevin Heffernan, Holger Schwenk, Philipp Koehn

    Abstract: Multilingual sentence representations from large models encode semantic information from two or more languages and can be used for different cross-lingual information retrieval and matching tasks. In this paper, we integrate contrastive learning into multilingual representation distillation and use it for quality estimation of parallel sentences (i.e., find semantically similar sentences that can… ▽ More

    Submitted 30 April, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: EACL 2023

  11. arXiv:2207.04672  [pdf]

    cs.CL cs.AI

    No Language Left Behind: Scaling Human-Centered Machine Translation

    Authors: NLLB Team, Marta R. Costa-jussà, James Cross, Onur Çelebi, Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Anna Sun, Skyler Wang, Guillaume Wenzek, Al Youngblood, Bapi Akula, Loic Barrault, Gabriel Mejia Gonzalez, Prangthip Hansanti, John Hoffman, Semarley Jarrett, Kaushik Ram Sadagopan, Dirk Rowe, Shannon Spruit, Chau Tran, et al. (14 additional authors not shown)

    Abstract: Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today. However, such efforts have coalesced around a small subset of languages, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200 language barrier while ensuring safe, high quality res… ▽ More

    Submitted 25 August, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: 190 pages

    MSC Class: 68T50 ACM Class: I.2.7

  12. arXiv:2205.12654  [pdf, other]

    cs.CL

    Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages

    Authors: Kevin Heffernan, Onur Çelebi, Holger Schwenk

    Abstract: Scaling multilingual representation learning beyond the hundred most frequent languages is challenging, in particular to cover the long tail of low-resource languages. A promising approach has been to train one-for-all multilingual models capable of cross-lingual transfer, but these models often suffer from insufficient capacity and interference between unrelated languages. Instead, we move away f… ▽ More

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: 12 pages

  13. arXiv:2205.12216  [pdf, other]

    cs.CL

    T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation

    Authors: Paul-Ambroise Duquenne, Hongyu Gong, Benoît Sagot, Holger Schwenk

    Abstract: We present a new approach to perform zero-shot cross-modal transfer between speech and text for translation tasks. Multilingual speech and text are encoded in a joint fixed-size representation space. Then, we compare different approaches to decode these multimodal and multilingual fixed-size representations, enabling zero-shot translation between languages and modalities. All our models are traine… ▽ More

    Submitted 10 November, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

  14. arXiv:2203.04705  [pdf, other]

    cs.CV

    FlexIT: Towards Flexible Semantic Image Translation

    Authors: Guillaume Couairon, Asya Grechka, Jakob Verbeek, Holger Schwenk, Matthieu Cord

    Abstract: Deep generative models, like GANs, have considerably improved the state of the art in image synthesis, and are able to generate near photo-realistic images in structured domains such as human faces. Based on this success, recent work on image editing proceeds by projecting images to the GAN latent space and manipulating the latent vector. However, these approaches are limited in that only images f… ▽ More

    Submitted 9 March, 2022; originally announced March 2022.

    Comments: accepted at CVPR 2022

  15. arXiv:2112.08352  [pdf, other]

    cs.CL cs.AI cs.LG eess.AS

    Textless Speech-to-Speech Translation on Real Data

    Authors: Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Yossi Adi, Juan Pino, Jiatao Gu, Wei-Ning Hsu

    Abstract: We present a textless speech-to-speech translation (S2ST) system that can translate speech from one language into another language and can be built without the need of any text data. Different from existing work in the literature, we tackle the challenge in modeling multi-speaker target speech and train the systems with real-world S2ST data. The key to our approach is a self-supervised unit-based… ▽ More

    Submitted 4 May, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: Accepted to NAACL 2022 (long paper)

  16. arXiv:2112.03162  [pdf, other]

    cs.CV cs.CL

    Embedding Arithmetic of Multimodal Queries for Image Retrieval

    Authors: Guillaume Couairon, Matthieu Cord, Matthijs Douze, Holger Schwenk

    Abstract: Latent text representations exhibit geometric regularities, such as the famous analogy: queen is to king what woman is to man. Such structured semantic relations were not demonstrated on image representations. Recent works aiming at bridging this semantic gap embed images and text into a multimodal space, enabling the transfer of text-defined transformations to the image modality. We introduce the… ▽ More

    Submitted 20 October, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

    Comments: accepted at O-DRUM (CVPR workshop 2022)

  17. arXiv:2107.06959  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task

    Authors: Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger Schwenk, Naman Goyal

    Abstract: In this paper, we describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign on the Multilingual Speech Translation shared task. Our system is built by leveraging transfer learning across modalities, tasks and languages. First, we leverage general-purpose multilingual modules pretrained with large amounts of unlabelled and labelled data. We furth… ▽ More

    Submitted 14 August, 2021; v1 submitted 14 July, 2021; originally announced July 2021.

    Comments: Accepted by IWSLT 2021 as a system paper

  18. arXiv:2010.11125  [pdf, other]

    cs.CL cs.LG

    Beyond English-Centric Multilingual Machine Translation

    Authors: Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin

    Abstract: Existing work in translation demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages. However, much of this work is English-Centric by training only on data which was translated from or to English. While this is supported by large sources of training data, it does not reflect translation needs worldwide. In… ▽ More

    Submitted 21 October, 2020; originally announced October 2020.

  19. arXiv:1911.04944  [pdf, other]

    cs.CL

    CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB

    Authors: Holger Schwenk, Guillaume Wenzek, Sergey Edunov, Edouard Grave, Armand Joulin

    Abstract: We show that margin-based bitext mining in a multilingual sentence space can be applied to monolingual corpora of billions of sentences. We are using ten snapshots of a curated common crawl corpus (Wenzek et al., 2019) totalling 32.7 billion unique sentences. Using one unified approach for 38 languages, we were able to mine 4.5 billion parallel sentences, out of which 661 million are aligned with… ▽ More

    Submitted 1 May, 2020; v1 submitted 10 November, 2019; originally announced November 2019.

    Comments: 13 pages, 4 figures. arXiv admin note: text overlap with arXiv:1907.05791

  20. arXiv:1910.07475  [pdf, other]

    cs.CL cs.AI cs.LG

    MLQA: Evaluating Cross-lingual Extractive Question Answering

    Authors: Patrick Lewis, Barlas Oğuz, Ruty Rinott, Sebastian Riedel, Holger Schwenk

    Abstract: Question answering (QA) models have shown rapid progress enabled by the availability of large, high-quality benchmark datasets. Such annotated datasets are difficult and costly to collect, and rarely exist in languages other than English, making training QA systems in other languages challenging. An alternative to building large monolingual training datasets is to develop cross-lingual systems whi… ▽ More

    Submitted 3 May, 2020; v1 submitted 16 October, 2019; originally announced October 2019.

    Comments: To appear in ACL 2020

  21. arXiv:1907.05791  [pdf, other]

    cs.CL

    WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia

    Authors: Holger Schwenk, Vishrav Chaudhary, Shuo Sun, Hongyu Gong, Francisco Guzmán

    Abstract: We present an approach based on multilingual sentence embeddings to automatically extract parallel sentences from the content of Wikipedia articles in 85 languages, including several dialects or low-resource languages. We do not limit the extraction process to alignments with English, but systematically consider all possible language pairs. In total, we are able to extract 135M parallel senten… ▽ More

    Submitted 15 July, 2019; v1 submitted 10 July, 2019; originally announced July 2019.

    Comments: 13 pages, 3 figures, 6 tables

  22. arXiv:1906.08885  [pdf, other]

    cs.CL

    Low-Resource Corpus Filtering using Multilingual Sentence Embeddings

    Authors: Vishrav Chaudhary, Yuqing Tang, Francisco Guzmán, Holger Schwenk, Philipp Koehn

    Abstract: In this paper, we describe our submission to the WMT19 low-resource parallel corpus filtering shared task. Our main approach is based on the LASER toolkit (Language-Agnostic SEntence Representations), which uses an encoder-decoder architecture trained on a parallel corpus to obtain multilingual sentence representations. We then use the representations directly to score and filter the noisy paralle… ▽ More

    Submitted 20 June, 2019; originally announced June 2019.

    Comments: 6 pages, WMT 2019

    Journal ref: Conference on Machine Translation (WMT) 2019

  23. arXiv:1812.10464  [pdf, other]

    cs.CL cs.AI cs.LG

    Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

    Authors: Mikel Artetxe, Holger Schwenk

    Abstract: We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts. Our system uses a single BiLSTM encoder with a shared BPE vocabulary for all languages, which is coupled with an auxiliary decoder and trained on publicly available parallel corpora. This enables us to learn a classifi… ▽ More

    Submitted 25 September, 2019; v1 submitted 26 December, 2018; originally announced December 2018.

    Comments: TACL

  24. arXiv:1811.01136  [pdf, other]

    cs.CL cs.AI cs.LG

    Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings

    Authors: Mikel Artetxe, Holger Schwenk

    Abstract: Machine translation is highly sensitive to the size and quality of the training data, which has led to an increasing interest in collecting and filtering large parallel corpora. In this paper, we propose a new method for this task based on multilingual sentence embeddings. In contrast to previous approaches, which rely on nearest neighbor retrieval with a hard threshold over cosine similarity, our… ▽ More

    Submitted 7 August, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

    Comments: ACL 2019

  25. arXiv:1809.05053  [pdf, other]

    cs.CL cs.AI cs.LG

    XNLI: Evaluating Cross-lingual Sentence Representations

    Authors: Alexis Conneau, Guillaume Lample, Ruty Rinott, Adina Williams, Samuel R. Bowman, Holger Schwenk, Veselin Stoyanov

    Abstract: State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models. These models are generally trained on data in a single language (usually English), and cannot be directly used beyond that language. Since collecting data in every language is not realistic, there has been a growing interest in cross-lingual language understanding (XLU)… ▽ More

    Submitted 13 September, 2018; originally announced September 2018.

    Comments: EMNLP 2018

  26. arXiv:1805.09822  [pdf, other]

    cs.CL cs.AI

    Filtering and Mining Parallel Data in a Joint Multilingual Space

    Authors: Holger Schwenk

    Abstract: We learn a joint multilingual sentence embedding and use the distance between sentences in different languages to filter noisy parallel data and to mine for parallel data in large news collections. We are able to improve a competitive baseline on the WMT'14 English to German task by 0.3 BLEU by filtering out 25% of the training data. The same approach is used to mine additional bitexts for the WMT… ▽ More

    Submitted 24 May, 2018; originally announced May 2018.

    Comments: 8 pages

    Journal ref: ACL, July 2018, Melbourne

  27. arXiv:1805.09821  [pdf, ps, other]

    cs.CL

    A Corpus for Multilingual Document Classification in Eight Languages

    Authors: Holger Schwenk, Xian Li

    Abstract: Cross-lingual document classification aims at training a document classifier on resources in one language and transferring it to a different language without any additional resources. Several approaches have been proposed in the literature and the current best practice is to evaluate them on a subset of the Reuters Corpus Volume 2. However, this subset covers only few languages (English, German, F… ▽ More

    Submitted 24 May, 2018; originally announced May 2018.

    Comments: 4 pages

    Journal ref: LREC, May 2018, Miyazaki, Japan

  28. arXiv:1705.02364  [pdf, ps, other]

    cs.CL

    Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

    Authors: Alexis Conneau, Douwe Kiela, Holger Schwenk, Loic Barrault, Antoine Bordes

    Abstract: Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features. Efforts to obtain embeddings for larger chunks of text, such as sentences, have however not been so successful. Several attempts at learning unsupervised representations of sentences have not reached satisfactory enough performance to be widely adopted. In this paper, w… ▽ More

    Submitted 8 July, 2018; v1 submitted 5 May, 2017; originally announced May 2017.

    Comments: EMNLP 2017

  29. arXiv:1704.04154  [pdf, other]

    cs.CL

    Learning Joint Multilingual Sentence Representations with Neural Machine Translation

    Authors: Holger Schwenk, Matthijs Douze

    Abstract: In this paper, we use the framework of neural machine translation to learn joint sentence representations across six very different languages. Our aim is that a representation which is independent of the language, is likely to capture the underlying semantics. We define a new cross-lingual similarity measure, compare up to 1.4M sentence representations and study the characteristics of close senten… ▽ More

    Submitted 8 August, 2017; v1 submitted 13 April, 2017; originally announced April 2017.

    Comments: 11 pages, 2 figures, published at ACL workshop RepL4NLP

    MSC Class: 68T50

  30. arXiv:1606.01781  [pdf, ps, other]

    cs.CL cs.LG cs.NE

    Very Deep Convolutional Networks for Text Classification

    Authors: Alexis Conneau, Holger Schwenk, Loïc Barrault, Yann Lecun

    Abstract: The dominant approaches for many NLP tasks are recurrent neural networks, in particular LSTMs, and convolutional neural networks. However, these architectures are rather shallow in comparison to the deep convolutional networks which have pushed the state-of-the-art in computer vision. We present a new architecture (VDCNN) for text processing which operates directly at the character level and uses on… ▽ More

    Submitted 27 January, 2017; v1 submitted 6 June, 2016; originally announced June 2016.

    Comments: 10 pages, EACL 2017, camera-ready

  31. arXiv:1503.03535  [pdf, other]

    cs.CL

    On Using Monolingual Corpora in Neural Machine Translation

    Authors: Caglar Gulcehre, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Loic Barrault, Huei-Chi Lin, Fethi Bougares, Holger Schwenk, Yoshua Bengio

    Abstract: Recent work on end-to-end neural network-based architectures for machine translation has shown promising results for En-Fr and En-De translation. Arguably, one of the major factors behind this success has been the availability of high quality parallel corpora. In this work, we investigate how to leverage abundant monolingual corpora for neural machine translation. Compared to a phrase-based and hi… ▽ More

    Submitted 12 June, 2015; v1 submitted 11 March, 2015; originally announced March 2015.

    Comments: 9 pages, 2 figures

  32. arXiv:1412.6650  [pdf, other]

    cs.NE cs.CL cs.LG

    Incremental Adaptation Strategies for Neural Network Language Models

    Authors: Aram Ter-Sarkisov, Holger Schwenk, Loic Barrault, Fethi Bougares

    Abstract: It is today acknowledged that neural network language models outperform backoff language models in applications like speech recognition or statistical machine translation. However, training these models on large amounts of data can take several days. We present efficient techniques to adapt a neural network language model to new data. Instead of training a completely new model or relying on mixtur… ▽ More

    Submitted 7 July, 2015; v1 submitted 20 December, 2014; originally announced December 2014.

    Comments: accepted as workshop paper at ACL-IJCNLP 2015

  33. arXiv:1406.1078  [pdf, other]

    cs.CL cs.LG cs.NE stat.ML

    Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

    Authors: Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio

    Abstract: In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of… ▽ More

    Submitted 2 September, 2014; v1 submitted 3 June, 2014; originally announced June 2014.

    Comments: EMNLP 2014