Showing 1–6 of 6 results for author: Reinauer, R

  1. arXiv:2309.08590

    cs.CL

    Neural Machine Translation Models Can Learn to be Few-shot Learners

    Authors: Raphael Reinauer, Patrick Simianer, Kaden Uhlig, Johannes E. M. Mosig, Joern Wuebker

    Abstract: Large Language Models show an emergent ability to use a small number of examples to learn to perform in novel domains and tasks, also called in-context learning (ICL). In this work, we show that a much smaller model can be trained to perform ICL by fine-tuning towards a specialized training objective, exemplified on the task of domain adaptation for neural machine translation. With this capacity fo…

    Submitted 15 September, 2023; originally announced September 2023.

  2. arXiv:2303.05401

    cs.CL cs.LG cs.SI

    Early Warning Signals of Social Instabilities in Twitter Data

    Authors: Vahid Shamsaddini, Henry Kirveslahti, Raphael Reinauer, Wallyson Lemes de Oliveira, Matteo Caorsi, Etienne Voutaz

    Abstract: The goal of this project is to create and study novel techniques to identify early warning signals for socially disruptive events, like riots, wars, or revolutions using only publicly available data on social media. Such techniques need to be robust enough to work on real-time data: to achieve this goal we propose a topological approach together with more standard BERT models. Indeed, topology-bas…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: 22 pages

    MSC Class: 68

  3. arXiv:2206.15195

    cs.CL cs.LG math.AT

    The Topological BERT: Transforming Attention into Topology for Natural Language Processing

    Authors: Ilan Perez, Raphael Reinauer

    Abstract: In recent years, the introduction of Transformer models sparked a revolution in natural language processing (NLP). BERT was one of the first text encoders using only the attention mechanism without any recurrent parts to achieve state-of-the-art results on many NLP tasks. This paper introduces a text classifier using topological data analysis. We use BERT's attention maps transformed into at…

    Submitted 30 June, 2022; originally announced June 2022.

  4. arXiv:2112.15210

    cs.LG math.AT

    Persformer: A Transformer Architecture for Topological Machine Learning

    Authors: Raphael Reinauer, Matteo Caorsi, Nicolas Berkouk

    Abstract: One of the main challenges of Topological Data Analysis (TDA) is to extract features from persistence diagrams directly usable by machine learning algorithms. Indeed, persistence diagrams are intrinsically (multi-)sets of points in $\mathbb{R}^2$ and cannot be seen in a straightforward manner as vectors. In this article, we introduce $\texttt{Persformer}$, the first Transformer neural network archi…

    Submitted 26 September, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

  5. ALOJA: A Framework for Benchmarking and Predictive Analytics in Big Data Deployments

    Authors: Josep Ll. Berral, Nicolas Poggi, David Carrera, Aaron Call, Rob Reinauer, Daron Green

    Abstract: This article presents the ALOJA project and its analytics tools, which leverage machine learning to interpret Big Data benchmark performance data and tuning. ALOJA is part of a long-term collaboration between BSC and Microsoft to automate the characterization of cost-effectiveness on Big Data deployments, currently focusing on Hadoop. Hadoop presents a complex run-time environment, where costs an…

    Submitted 6 November, 2015; originally announced November 2015.

    Comments: Submitted to IEEE Transactions on Emerging Topics in Computing (TETC). Part of the Aloja Project. Partially funded by European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 639595) - HiEST Project. arXiv admin note: substantial text overlap with arXiv:1511.02030

    ACM Class: C.4; I.2.6

  6. ALOJA-ML: A Framework for Automating Characterization and Knowledge Discovery in Hadoop Deployments

    Authors: Josep Ll. Berral, Nicolas Poggi, David Carrera, Aaron Call, Rob Reinauer, Daron Green

    Abstract: This article presents ALOJA-Machine Learning (ALOJA-ML), an extension to the ALOJA project that uses machine learning techniques to interpret Hadoop benchmark performance data and performance tuning; here we detail the approach, efficacy of the model and initial results. Hadoop presents a complex execution environment, where costs and performance depend on a large number of software (SW) configura…

    Submitted 6 November, 2015; originally announced November 2015.

    Comments: Submitted to KDD'2015. Part of the Aloja Project. Partially funded by European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 639595) - HiEST Project

    ACM Class: C.4; I.2.6

    Journal ref: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Pages 1701-1710. ACM New York, NY, USA. 2015. ISBN: 978-1-4503-3664-2