-
Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification
Authors:
Pierre Lepagnol,
Thomas Gerald,
Sahar Ghannay,
Christophe Servan,
Sophie Rosset
Abstract:
This study is part of the debate on the efficiency of large versus small language models for text classification by prompting. We assess the performance of small language models in zero-shot text classification, challenging the prevailing dominance of large models. Across 15 datasets, our investigation benchmarks language models from 77M to 40B parameters using different architectures and scoring functions. Our findings reveal that small models can effectively classify texts, getting on par with or surpassing their larger counterparts. We developed and shared a comprehensive open-source repository that encapsulates our methodologies. This research underscores the notion that bigger isn't always better, suggesting that resource-efficient small models may offer viable solutions for specific data classification challenges.
Submitted 17 April, 2024;
originally announced April 2024.
-
New Semantic Task for the French Spoken Language Understanding MEDIA Benchmark
Authors:
Nadège Alavoine,
Gaëlle Laperriere,
Christophe Servan,
Sahar Ghannay,
Sophie Rosset
Abstract:
Intent classification and slot-filling are essential tasks of Spoken Language Understanding (SLU). In most SLU systems, those tasks are realized by independent modules. For about fifteen years, models achieving both of them jointly and exploiting their mutual enhancement have been proposed. A multilingual module using a joint model was envisioned to create a touristic dialogue system for a European project, HumanE-AI-Net. A combination of multiple datasets, including the MEDIA dataset, was suggested for training this joint model. The MEDIA SLU dataset is a French dataset distributed since 2005 by ELRA, mainly used by the French research community and free for academic research since 2020. Unfortunately, it is annotated only in slots but not intents. An enhanced version of MEDIA annotated with intents has been built to extend its use to more tasks and use cases. This paper presents the semi-automatic methodology used to obtain this enhanced version. In addition, we present the first results of SLU experiments on this enhanced dataset using joint models for intent classification and slot-filling.
Submitted 28 March, 2024;
originally announced March 2024.
-
A Benchmark Evaluation of Clinical Named Entity Recognition in French
Authors:
Nesrine Bannour,
Christophe Servan,
Aurélie Névéol,
Xavier Tannier
Abstract:
Background: Transformer-based language models have shown strong performance on many Natural Language Processing (NLP) tasks. Masked Language Models (MLMs) attract sustained interest because they can be adapted to different languages and sub-domains through training or fine-tuning on specific corpora while remaining lighter than modern Large Language Models (LLMs). Recently, several MLMs have been released for the biomedical domain in French, and experiments suggest that they outperform standard French counterparts. However, no systematic evaluation comparing all models on the same corpora is available. Objective: This paper presents an evaluation of masked language models for biomedical French on the task of clinical named entity recognition. Material and methods: We evaluate biomedical models CamemBERT-bio and DrBERT and compare them to standard French models CamemBERT, FlauBERT and FrALBERT as well as multilingual mBERT using three publicly available corpora for clinical named entity recognition in French. The evaluation set-up relies on gold-standard corpora as released by the corpus developers. Results: Results suggest that CamemBERT-bio outperforms DrBERT consistently while FlauBERT offers competitive performance and FrALBERT achieves the lowest carbon footprint. Conclusion: This is the first benchmark evaluation of biomedical masked language models for French clinical entity recognition that compares model performance consistently on nested entity recognition using metrics covering performance and environmental impact.
Submitted 28 March, 2024;
originally announced March 2024.
-
mALBERT: Is a Compact Multilingual BERT Model Still Worth It?
Authors:
Christophe Servan,
Sahar Ghannay,
Sophie Rosset
Abstract:
Within the current trend of Pretrained Language Models (PLMs), more and more criticisms emerge about the ethical and ecological impact of such models. In this article, considering these critical remarks, we propose to focus on smaller models, such as compact models like ALBERT, which are more ecologically virtuous than these PLMs. However, PLMs enable huge breakthroughs in Natural Language Processing tasks, such as Spoken and Natural Language Understanding, classification, and Question-Answering tasks. PLMs also have the advantage of being multilingual, and, as far as we know, a multilingual version of compact ALBERT models does not exist. Considering these facts, we propose the free release of the first version of a multilingual compact ALBERT model, pre-trained using Wikipedia data, which complies with the ethical aspect of such a language model. We also evaluate the model against classical multilingual PLMs in classical NLP tasks. Finally, this paper proposes a rare study on the impact of subword tokenization on language performance.
Submitted 27 March, 2024;
originally announced March 2024.
-
Effects of phylogeny on coexistence in model communities
Authors:
Carlos A. Servan,
Jose A. Capitan,
Zachary R. Miller,
Stefano Allesina
Abstract:
Species' interactions are shaped by their traits. Thus, we expect traits -- in particular, trait (dis)similarity -- to play a central role in determining whether a particular set of species coexists. Traits are, in turn, the outcome of an eco-evolutionary process summarized by a phylogenetic tree. Therefore, the phylogenetic tree associated with a set of species should carry information about the dynamics and assembly properties of the community. Many studies have highlighted the potentially complex ways in which this phylogenetic information is translated into species' ecological properties. However, much less emphasis has been placed on developing clear, quantitative expectations for community properties under a particular hypothesis. To address this gap, we couple a simple model of trait evolution on a phylogenetic tree with Lotka-Volterra community dynamics. This allows us to derive properties of a community of coexisting species as a function of the number of traits, tree topology and the size of the species pool. Our analysis highlights how phylogenies, through traits, affect the coexistence of a set of species. Together, these results provide much-needed baseline expectations for the ways in which evolutionary history, summarized by phylogeny, is reflected in the size and structure of ecological communities.
Submitted 10 August, 2024; v1 submitted 22 October, 2023;
originally announced October 2023.
-
Isometric embeddings of Teichmüller spaces are covering constructions
Authors:
Frederik Benirschke,
Carlos A. Serván
Abstract:
Pulling back complex structures along a branched covering induces a holomorphic isometric embedding of Teichmüller spaces. We show that for dimension at least $2$, all isometric embeddings arise from branched coverings. This generalizes a theorem of Royden. As a consequence we obtain that totally geodesic submanifolds of Teichmüller space, which are isometric to some Teichmüller space, are covering constructions. Another consequence is the classification of locally isometric embeddings of moduli spaces of Riemann surfaces.
Submitted 6 May, 2023;
originally announced May 2023.
-
On the cross-lingual transferability of multilingual prototypical models across NLU tasks
Authors:
Oralie Cattan,
Christophe Servan,
Sophie Rosset
Abstract:
Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven to be effective for limited domain and language applications when a sufficient number of training examples are available. In practice, these approaches suffer from the drawbacks of domain-driven design and under-resourced languages. Domain and language models are supposed to grow and change as the problem space evolves. On one hand, research on transfer learning has demonstrated the cross-lingual ability of multilingual Transformers-based models to learn semantically rich representations. On the other, in addition to the above approaches, meta-learning has enabled the development of task and language learning algorithms capable of far generalization. In this context, this article proposes to investigate the cross-lingual transferability of using synergistically few-shot learning with prototypical neural networks and multilingual Transformers-based models. Experiments in natural language understanding tasks on the MultiATIS++ corpus show that our approach substantially improves the observed transfer learning performances between the low and the high resource languages. More generally, our approach confirms that the meaningful latent space learned in a given language can be generalized to unseen and under-resourced ones using meta-learning.
Submitted 19 July, 2022;
originally announced July 2022.
-
Benchmarking Transformers-based models on French Spoken Language Understanding tasks
Authors:
Oralie Cattan,
Sahar Ghannay,
Christophe Servan,
Sophie Rosset
Abstract:
In the last five years, the rise of the self-attentional Transformer-based architectures led to state-of-the-art performances over many natural language tasks. Although these approaches are increasingly popular, they require large amounts of data and computational resources. There is still a substantial need for benchmarking methodologies ever upwards on under-resourced languages in data-scarce application conditions. Most pre-trained language models were massively studied using the English language and only a few of them were evaluated on French. In this paper, we propose a unified benchmark, focused on evaluating model quality and ecological impact on two well-known French spoken language understanding tasks. In particular, we benchmark thirteen well-established Transformer-based models on the two available spoken language understanding tasks for French: MEDIA and ATIS-FR. Within this framework, we show that compact models can reach comparable results to bigger ones while their ecological impact is considerably lower. However, this assumption is nuanced and depends on the considered compression method.
Submitted 19 July, 2022;
originally announced July 2022.
-
On the Usability of Transformers-based models for a French Question-Answering task
Authors:
Oralie Cattan,
Christophe Servan,
Sophie Rosset
Abstract:
For many tasks, state-of-the-art results have been achieved with Transformer-based architectures, resulting in a paradigmatic shift in practices from the use of task-specific architectures to the fine-tuning of pre-trained language models. The ongoing trend consists in training models with an ever-increasing amount of data and parameters, which requires considerable resources. It leads to a strong search to improve resource efficiency based on algorithmic and hardware improvements evaluated only for English. This raises questions about their usability when applied to small-scale learning problems, for which a limited amount of training data is available, especially for under-resourced language tasks. The lack of appropriately sized corpora is a hindrance to applying data-driven and transfer learning-based approaches with strong instability cases. In this paper, we establish a state-of-the-art of the efforts dedicated to the usability of Transformer-based models and propose to evaluate these improvements on question-answering performance for French, a language which has few resources. We address the instability relating to data scarcity by investigating various training strategies with data augmentation, hyperparameter optimization and cross-lingual transfer. We also introduce a new compact model for French, FrALBERT, which proves to be competitive in low-resource settings.
Submitted 19 July, 2022;
originally announced July 2022.
-
On the uniqueness of the Prym map
Authors:
Carlos A. Serván
Abstract:
The classical Prym construction associates to a smooth, genus $g$ complex curve $X$ equipped with a nonzero cohomology class $θ\in H^1(X,\mathbb{Z}/2\mathbb{Z})$, a principally polarized abelian variety (PPAV) $\mbox{Prym}(X,θ)$. Denote the moduli space of pairs $(X,θ)$ by $\mathcal{R}_g$, and let $\mathcal{A}_h$ be the moduli space of PPAVs of dimension $h$. The Prym construction globalizes to a holomorphic map of complex orbifolds $\mbox{Prym}: \mathcal{R}_g \to \mathcal{A}_{g-1}$. For $g\geq 4$ and $h \leq g-1$, we show that $\mbox{Prym}$ is the unique nonconstant holomorphic map of complex orbifolds $F:\mathcal{R}_g \to \mathcal{A}_h$. This solves a conjecture of Farb. A main component in our proof is a classification of homomorphisms $π_1^{\mbox{orb}}(\mathcal{R}_g) \to \mbox{Sp}(2h,\mathbb{Z})$ for $h \leq g-1$. This is achieved using arguments from geometric group theory and low-dimensional topology.
Submitted 4 July, 2022;
originally announced July 2022.
-
Using Whole Document Context in Neural Machine Translation
Authors:
Valentin Macé,
Christophe Servan
Abstract:
In Machine Translation, considering the document as a whole can help to resolve ambiguities and inconsistencies. In this paper, we propose a simple yet promising approach to add contextual information in Neural Machine Translation. We present a method to add source context that captures the whole document with accurate boundaries, taking every word into account. We provide this additional information to a Transformer model and study the impact of our method on three language pairs. The proposed approach obtains promising results in the English-German, English-French and French-English document-level translation tasks. We observe interesting cross-sentential behaviors where the model learns to use document-level information to improve translation coherence.
Submitted 16 October, 2019;
originally announced October 2019.
-
Qwant Research @DEFT 2019: Document matching and information retrieval using clinical cases
Authors:
Estelle Maudet,
Oralie Cattan,
Maureen de Seyssel,
Christophe Servan
Abstract:
This paper reports on Qwant Research's contribution to tasks 2 and 3 of the DEFT 2019 challenge, focusing on the analysis of French clinical cases. Task 2 is a task on semantic similarity between clinical cases and discussions. For this task, we propose an approach based on language models and evaluate the impact on the results of different preprocessings and matching techniques. For task 3, we have developed an information extraction system yielding very encouraging results accuracy-wise. We have experimented with two different approaches, one based on the exclusive use of neural networks, the other based on a linguistic analysis.
Submitted 6 July, 2019;
originally announced July 2019.
-
Image search using multilingual texts: a cross-modal learning approach between image and text
Authors:
Maxime Portaz,
Hicham Randrianarivo,
Adrien Nivaggioli,
Estelle Maudet,
Christophe Servan,
Sylvain Peyronnet
Abstract:
Multilingual (or cross-lingual) embeddings represent several languages in a unique vector space. Using a common embedding space enables shared semantics between words from different languages. In this paper, we propose to embed images and texts into a unique distributional vector space, making it possible to search images by using text queries expressing information needs related to the (visual) content of images, as well as using image similarity. Our framework forces the representation of an image to be similar to the representation of the text that describes it. Moreover, by using multilingual embeddings we ensure that words from two different languages have close descriptors and thus are attached to similar images. We provide experimental evidence of the efficiency of our approach by evaluating it on two datasets: Common Objects in COntext (COCO) [19] and Multi30K [7].
Submitted 14 May, 2019; v1 submitted 27 March, 2019;
originally announced March 2019.
-
Thresholding normally distributed data creates complex networks
Authors:
George T. Cantwell,
Yanchen Liu,
Benjamin F. Maier,
Alice C. Schwarze,
Carlos A. Serván,
Jordan Snyder,
Guillaume St-Onge
Abstract:
Network data sets are often constructed by some kind of thresholding procedure. The resulting networks frequently possess properties such as heavy-tailed degree distributions, clustering, large connected components and short average shortest path lengths. These properties are considered typical of complex networks and appear in many contexts, prompting consideration of their universality. Here we introduce a simple model for correlated relational data and study the network ensemble obtained by thresholding it. We find that some, but not all, of the properties associated with complex networks can be seen after thresholding the correlated data, even though the underlying data are not "complex". In particular, we observe heavy-tailed degree distributions, large numbers of triangles, and short path lengths, while we do not observe non-vanishing clustering or community structure.
Submitted 29 May, 2020; v1 submitted 21 February, 2019;
originally announced February 2019.
-
SYSTRAN Purely Neural MT Engines for WMT2017
Authors:
Yongchao Deng,
Jungi Kim,
Guillaume Klein,
Catherine Kobus,
Natalia Segal,
Christophe Servan,
Bo Wang,
Dakun Zhang,
Josep Crego,
Jean Senellart
Abstract:
This paper describes SYSTRAN's systems submitted to the WMT 2017 shared news translation task for English-German, in both translation directions. Our systems are built using OpenNMT, an open-source neural machine translation system, implementing sequence-to-sequence models with LSTM encoder/decoders and attention. We experimented using monolingual data automatically back-translated. Our resulting models are further hyper-specialised with an adaptation technique that finely tunes models according to the evaluation test sentences.
Submitted 12 September, 2017;
originally announced September 2017.
-
Domain specialization: a post-training domain adaptation for Neural Machine Translation
Authors:
Christophe Servan,
Josep Crego,
Jean Senellart
Abstract:
Domain adaptation is a key feature in Machine Translation. It generally encompasses terminology, domain and style adaptation, especially for human post-editing workflows in Computer Assisted Translation (CAT). With Neural Machine Translation (NMT), we introduce a new notion of domain adaptation that we call "specialization" and which is showing promising results both in the learning speed and in adaptation accuracy. In this paper, we propose to explore this approach from several perspectives.
Submitted 19 December, 2016;
originally announced December 2016.
-
Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation
Authors:
Alexandre Berard,
Olivier Pietquin,
Christophe Servan,
Laurent Besacier
Abstract:
This paper proposes a first attempt to build an end-to-end speech-to-text translation system, which does not use source language transcription during learning or decoding. We propose a model for direct speech-to-text translation, which gives promising results on a small French-English synthetic corpus. Relaxing the need for source language transcription would drastically change the data collection methodology in speech translation, especially in under-resourced scenarios. For instance, in the former project DARPA TRANSTAC (speech translation from spoken Arabic dialects), a large effort was devoted to the collection of speech transcripts (and a prerequisite to obtain transcripts was often a detailed transcription guide for languages with little standardized spelling). Now, if end-to-end approaches for speech-to-text translation are successful, one might consider collecting data by asking bilingual speakers to directly utter speech in the source language from target language text utterances. Such an approach has the advantage of being applicable to any unwritten (source) language.
Submitted 6 December, 2016;
originally announced December 2016.
-
SYSTRAN's Pure Neural Machine Translation Systems
Authors:
Josep Crego,
Jungi Kim,
Guillaume Klein,
Anabel Rebollo,
Kathy Yang,
Jean Senellart,
Egor Akhanov,
Patrice Brunelle,
Aurelien Coquard,
Yongchao Deng,
Satoshi Enoue,
Chiyo Geiss,
Joshua Johanson,
Ardas Khalsa,
Raoum Khiari,
Byeongil Ko,
Catherine Kobus,
Jean Lorieux,
Leidiana Martins,
Dang-Chuan Nguyen,
Alexandra Priori,
Thomas Riccardi,
Natalia Segal,
Christophe Servan,
Cyril Tiquet,
et al. (5 additional authors not shown)
Abstract:
Since the first online demonstration of Neural Machine Translation (NMT) by LISA, NMT development has recently moved from laboratory to production systems as demonstrated by several entities announcing roll-out of NMT engines to replace their existing technologies. NMT systems have a large number of training configurations and the training process of such systems is usually very long, often a few weeks, so the role of experimentation is critical and important to share. In this work, we present our approach to production-ready systems simultaneously with the release of online demonstrators covering a large variety of languages (12 languages, for 32 language pairs). We explore different practical choices: an efficient and evolutive open-source framework; data preparation; network architecture; additional implemented features; tuning for production; etc. We discuss evaluation methodology, present our first findings and finally outline further work.
Our ultimate goal is to share our expertise to build competitive production systems for "generic" translation. We aim at contributing to set up a collaborative framework to speed up adoption of the technology, foster further research efforts and enable the delivery and adoption to/by industry of use-case specific engines integrated in real production workflows. Mastering of the technology would allow us to build translation engines suited for particular needs, outperforming current simplest/uniform systems.
Submitted 18 October, 2016;
originally announced October 2016.
-
Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources?
Authors:
Christophe Servan,
Alexandre Berard,
Zied Elloumi,
Hervé Blanchon,
Laurent Besacier
Abstract:
This paper presents an approach combining lexico-semantic resources and distributed representations of words applied to the evaluation in machine translation (MT). This study is made through the enrichment of a well-known MT evaluation metric: METEOR. This metric enables an approximate match (synonymy or morphological similarity) between an automatic and a reference translation. Our experiments are made in the framework of the Metrics task of WMT 2014. We show that distributed representations are a good alternative to lexico-semantic resources for MT evaluation and they can even bring interesting additional information. The augmented versions of METEOR, using vector representations, are made available on our GitHub page.
Submitted 5 October, 2016;
originally announced October 2016.