Showing 1–8 of 8 results for author: Schmidt, R M

Searching in archive cs.
  1. arXiv:2305.02665  [pdf, other]

    cs.CL cs.AI

    Learning Language-Specific Layers for Multilingual Machine Translation

    Authors: Telmo Pessoa Pires, Robin M. Schmidt, Yi-Hsiu Liao, Stephan Peitz

    Abstract: Multilingual Machine Translation promises to improve translation quality between non-English languages. This is advantageous for several reasons, namely lower latency (no need to translate twice), and reduced error cascades (e.g., avoiding losing gender and formality information when translating through English). On the downside, adding more languages reduces model capacity per language, which is…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

  2. arXiv:2304.12776  [pdf, other]

    cs.CL cs.AI

    State Spaces Aren't Enough: Machine Translation Needs Attention

    Authors: Ali Vardasbi, Telmo Pessoa Pires, Robin M. Schmidt, Stephan Peitz

    Abstract: Structured State Spaces for Sequences (S4) is a recently proposed sequence model with successful applications in various tasks, e.g. vision, language modeling, and audio. Thanks to its mathematical formulation, it compresses its input to a single hidden state, and is able to capture long range dependencies while avoiding the need for an attention mechanism. In this work, we apply S4 to Machine Tra…

    Submitted 25 April, 2023; originally announced April 2023.

  3. arXiv:2205.10577  [pdf, other]

    cs.CL cs.LG

    Non-Autoregressive Neural Machine Translation: A Call for Clarity

    Authors: Robin M. Schmidt, Telmo Pires, Stephan Peitz, Jonas Lööf

    Abstract: Non-autoregressive approaches aim to improve the inference speed of translation models by only requiring a single forward pass to generate the output sequence instead of iteratively producing each predicted token. Consequently, their translation quality still tends to be inferior to their autoregressive counterparts due to several issues involving output token interdependence. In this work, we tak…

    Submitted 21 October, 2022; v1 submitted 21 May, 2022; originally announced May 2022.

    Comments: Accepted at EMNLP 2022

  4. arXiv:2112.02721  [pdf, other]

    cs.CL cs.AI cs.LG

    NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

    Authors: Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, et al. (101 additional authors not shown)

    Abstract: Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data split…

    Submitted 11 October, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

    Comments: 39 pages, repository at https://github.com/GEM-benchmark/NL-Augmenter

  5. arXiv:2104.01742  [pdf, other]

    cs.LG cs.CV

    Explainability-aided Domain Generalization for Image Classification

    Authors: Robin M. Schmidt

    Abstract: Traditionally, for most machine learning settings, gaining some degree of explainability that tries to give users more insights into how and why the network arrives at its predictions, restricts the underlying model and hinders performance to a certain degree. For example, decision trees are thought of as being more explainable than deep neural networks but they lack performance on visual tasks. I…

    Submitted 4 April, 2021; originally announced April 2021.

  6. arXiv:2008.10117  [pdf, other]

    cs.LG stat.ML

    Collaborative Filtering under Model Uncertainty

    Authors: Robin M. Schmidt, Moritz Hahn

    Abstract: In their work, Dean, Rich, and Recht create a model to research recourse and availability of items in a recommender system. We used the definition of predictive multiplicity by Marx, Pin Calmon, and Ustun to examine different variations of this model, using different values for two model parameters. Pairwise comparison of their models show, that most of these models produce very similar results in…

    Submitted 24 August, 2020; v1 submitted 23 August, 2020; originally announced August 2020.

    Comments: v2: small display fix in affiliation

  7. arXiv:2007.01547  [pdf, other]

    cs.LG stat.ML

    Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers

    Authors: Robin M. Schmidt, Frank Schneider, Philipp Hennig

    Abstract: Choosing the optimizer is considered to be among the most crucial design decisions in deep learning, and it is not an easy one. The growing literature now lists hundreds of optimization methods. In the absence of clear theoretical guidance and conclusive empirical evidence, the decision is often made based on anecdotes. In this work, we aim to replace these anecdotes, if not with a conclusive rank…

    Submitted 10 August, 2021; v1 submitted 3 July, 2020; originally announced July 2020.

    Comments: Raw results: https://github.com/SirRob1997/Crowded-Valley---Results

  8. arXiv:1912.05911  [pdf, other]

    cs.LG stat.ML

    Recurrent Neural Networks (RNNs): A gentle Introduction and Overview

    Authors: Robin M. Schmidt

    Abstract: State-of-the-art solutions in the areas of "Language Modelling & Generating Text", "Speech Recognition", "Generating Image Descriptions" or "Video Tagging" have been using Recurrent Neural Networks as the foundation for their approaches. Understanding the underlying concepts is therefore of tremendous importance if we want to keep up with recent or upcoming publications in those areas. In this wor…

    Submitted 23 November, 2019; originally announced December 2019.