Showing 1–16 of 16 results for author: Dinu, G

  1. RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation

    Authors: Gabriele Sarti, Phu Mon Htut, Xing Niu, Benjamin Hsu, Anna Currey, Georgiana Dinu, Maria Nadejde

    Abstract: Attribute-controlled translation (ACT) is a subtask of machine translation that involves controlling stylistic or linguistic attributes (like formality and gender) of translation outputs. While ACT has garnered attention in recent years due to its usefulness in real-world applications, progress in the task is currently limited by dataset availability, since most prior approaches rely on supervised…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

    Journal ref: Proceedings of ACL (2023) 1476-1490

  2. arXiv:2305.11808  [pdf, other]

    cs.CL

    Pseudo-Label Training and Model Inertia in Neural Machine Translation

    Authors: Benjamin Hsu, Anna Currey, Xing Niu, Maria Nădejde, Georgiana Dinu

    Abstract: Like many other machine learning applications, neural machine translation (NMT) benefits from over-parameterized deep neural models. However, these models have been observed to be brittle: NMT model predictions are sensitive to small input changes and can show significant variation across re-training or incremental model updates. This work studies a frequently used method in NMT, pseudo-label trai…

    Submitted 19 May, 2023; originally announced May 2023.

    Comments: accepted ICLR 2023

  3. arXiv:2211.01355  [pdf, other]

    cs.CL

    MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation

    Authors: Anna Currey, Maria Nădejde, Raghavendra Pappagari, Mia Mayer, Stanislas Lauly, Xing Niu, Benjamin Hsu, Georgiana Dinu

    Abstract: As generic machine translation (MT) quality has improved, the need for targeted benchmarks that explore fine-grained aspects of quality has increased. In particular, gender accuracy in translation can have implications in terms of output fluency, translation accuracy, and ethics. In this paper, we introduce MT-GenEval, a benchmark for evaluating gender accuracy in translation from English into eig…

    Submitted 2 November, 2022; originally announced November 2022.

    Comments: Accepted at EMNLP 2022. Data and code: https://github.com/amazon-research/machine-translation-gender-eval

  4. arXiv:2210.10906  [pdf, other]

    cs.CL cs.LG

    A baseline revisited: Pushing the limits of multi-segment models for context-aware translation

    Authors: Suvodeep Majumder, Stanislas Lauly, Maria Nadejde, Marcello Federico, Georgiana Dinu

    Abstract: This paper addresses the task of contextual translation using multi-segment models. Specifically we show that increasing model capacity further pushes the limits of this approach and that deeper models are more suited to capture context dependencies. Furthermore, improvements observed with larger models can be transferred to smaller models using knowledge distillation. Our experiments show that th…

    Submitted 21 October, 2022; v1 submitted 19 October, 2022; originally announced October 2022.

  5. arXiv:2205.04022  [pdf, other]

    cs.CL cs.AI

    CoCoA-MT: A Dataset and Benchmark for Contrastive Controlled MT with Application to Formality

    Authors: Maria Nădejde, Anna Currey, Benjamin Hsu, Xing Niu, Marcello Federico, Georgiana Dinu

    Abstract: The machine translation (MT) task is typically formulated as that of returning a single translation for an input segment. However, in many cases, multiple different translations are valid and the appropriate translation may depend on the intended target audience, characteristics of the speaker, or even the relationship between speakers. Specific problems arise when dealing with honorifics, particu…

    Submitted 9 May, 2022; originally announced May 2022.

    Comments: NAACL 2022

  6. arXiv:2109.12105  [pdf, other]

    cs.CL

    Faithful Target Attribute Prediction in Neural Machine Translation

    Authors: Xing Niu, Georgiana Dinu, Prashant Mathur, Anna Currey

    Abstract: The training data used in NMT is rarely controlled with respect to specific attributes, such as word casing or gender, which can cause errors in translations. We argue that predicting the target word and attributes simultaneously is an effective way to ensure that translations are more faithful to the training data distribution with respect to these attributes. Experimental results on two tasks, u…

    Submitted 24 September, 2021; originally announced September 2021.

    Comments: Withdrawn from Findings of ACL 2021

  7. arXiv:2104.07695  [pdf, other]

    cs.CL

    Improving Gender Translation Accuracy with Filtered Self-Training

    Authors: Prafulla Kumar Choubey, Anna Currey, Prashant Mathur, Georgiana Dinu

    Abstract: Targeted evaluations have found that machine translation systems often output incorrect gender, even when the gender is clear from context. Furthermore, these incorrectly gendered translations have the potential to reflect or amplify social biases. We propose a gender-filtered self-training technique to improve gender translation accuracy on unambiguously gendered inputs. This approach uses a sour…

    Submitted 15 April, 2021; originally announced April 2021.

  8. arXiv:2005.00580  [pdf, other]

    cs.CL

    Evaluating Robustness to Input Perturbations for Neural Machine Translation

    Authors: Xing Niu, Prashant Mathur, Georgiana Dinu, Yaser Al-Onaizan

    Abstract: Neural Machine Translation (NMT) models are sensitive to small perturbations in the input. Robustness to such perturbations is typically measured using translation quality metrics such as BLEU on the noisy input. This paper proposes additional metrics which measure the relative degradation and changes in translation when small perturbations are added to the input. We focus on a class of models emp…

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: Accepted at ACL 2020

  9. arXiv:2004.05219  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Joint translation and unit conversion for end-to-end localization

    Authors: Georgiana Dinu, Prashant Mathur, Marcello Federico, Stanislas Lauly, Yaser Al-Onaizan

    Abstract: A variety of natural language tasks require processing of textual data which contains a mix of natural language and formal languages such as mathematical expressions. In this paper, we take unit conversions as an example and propose a data augmentation technique which leads to models learning both translation and conversion tasks as well as how to adequately switch between them for end-to-end loca…

    Submitted 10 April, 2020; originally announced April 2020.

  10. arXiv:1906.01105  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Training Neural Machine Translation To Apply Terminology Constraints

    Authors: Georgiana Dinu, Prashant Mathur, Marcello Federico, Yaser Al-Onaizan

    Abstract: This paper proposes a novel method to inject custom terminology into neural machine translation at run time. Previous works have mainly proposed modifications to the decoding algorithm in order to constrain the output to include run-time-provided target terms. While being effective, these constrained decoding methods add, however, significant computational overhead to the inference step, and, as w…

    Submitted 24 June, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: Accepted as a short paper at ACL 2019

  11. Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection

    Authors: Jian Ni, Georgiana Dinu, Radu Florian

    Abstract: The state-of-the-art named entity recognition (NER) systems are supervised machine learning models that require large amounts of manually annotated data to achieve high accuracy. However, annotating NER data by human is expensive and time-consuming, and can be quite difficult for a new language. In this paper, we present two weakly supervised approaches for cross-lingual NER with no human annotati…

    Submitted 8 July, 2017; originally announced July 2017.

    Comments: 11 pages, The 55th Annual Meeting of the Association for Computational Linguistics (ACL), 2017

  12. arXiv:1703.04489  [pdf, ps, other]

    cs.CL cs.AI

    Reinforcement Learning for Transition-Based Mention Detection

    Authors: Georgiana Dinu, Wael Hamza, Radu Florian

    Abstract: This paper describes an application of reinforcement learning to the mention detection task. We define a novel action-based formulation for the mention detection task, in which a model can flexibly revise past labeling decisions by grouping together tokens and assigning partial mention labels. We devise a method to create mention-level episodes and we train a model by rewarding correctly labeled c…

    Submitted 13 March, 2017; originally announced March 2017.

    Comments: Deep Reinforcement Learning Workshop, NIPS 2016

  13. arXiv:1602.07749  [pdf, ps, other]

    cs.CL

    Toward Mention Detection Robustness with Recurrent Neural Networks

    Authors: Thien Huu Nguyen, Avirup Sil, Georgiana Dinu, Radu Florian

    Abstract: One of the key challenges in natural language processing (NLP) is to yield good performance across application domains and languages. In this work, we investigate the robustness of the mention detection systems, one of the fundamental tasks in information extraction, via recurrent neural networks (RNNs). The advantage of RNNs over the traditional approaches is their capacity to capture long ranges…

    Submitted 24 February, 2016; originally announced February 2016.

    Comments: 13 pages, 11 tables, 3 figures

  14. arXiv:1501.02714  [pdf, other]

    cs.CL cs.CV

    From Visual Attributes to Adjectives through Decompositional Distributional Semantics

    Authors: Angeliki Lazaridou, Georgiana Dinu, Adam Liska, Marco Baroni

    Abstract: As automated image analysis progresses, there is increasing interest in richer linguistic annotation of pictures, with attributes of objects (e.g., furry, brown...) attracting most attention. By building on the recent "zero-shot learning" approach, and paying attention to the linguistic nature of attributes as noun modifiers, and specifically adjectives, we show that it is possible to tag images w…

    Submitted 24 March, 2015; v1 submitted 12 January, 2015; originally announced January 2015.

    Comments: accepted at Transactions of the Association for Computational Linguistics (TACL), 3/2015

  15. arXiv:1412.6568  [pdf, other]

    cs.CL cs.LG

    Improving zero-shot learning by mitigating the hubness problem

    Authors: Georgiana Dinu, Angeliki Lazaridou, Marco Baroni

    Abstract: The zero-shot paradigm exploits vector-based word representations extracted from text corpora with unsupervised methods to learn general mapping functions from other feature spaces onto word space, where the words associated to the nearest neighbours of the mapped vectors are used as their linguistic labels. We show that the neighbourhoods of the mapped elements are strongly polluted by hubs, vect…

    Submitted 15 April, 2015; v1 submitted 19 December, 2014; originally announced December 2014.

  16. arXiv:1301.6939  [pdf, other]

    cs.CL cs.LG

    Multi-Step Regression Learning for Compositional Distributional Semantics

    Authors: Edward Grefenstette, Georgiana Dinu, Yao-Zhong Zhang, Mehrnoosh Sadrzadeh, Marco Baroni

    Abstract: We present a model for compositional distributional semantics related to the framework of Coecke et al. (2010), and emulating formal semantics by representing functions as tensors and arguments as vectors. We introduce a new learning method for tensors, generalising the approach of Baroni and Zamparelli (2010). We evaluate it on two benchmark data sets, and find it to outperform existing leading m…

    Submitted 30 January, 2013; v1 submitted 29 January, 2013; originally announced January 2013.

    Comments: 10 pages + 1 page references, to be presented at the 10th International Conference on Computational Semantics (IWCS 2013)

    MSC Class: 68T50

    ACM Class: G.1.3; H.3.1; I.2.7; I.2.6