Showing 1–7 of 7 results for author: Dolev, E

  1. arXiv:2408.12570  [pdf, other]

    cs.CL cs.LG

    Jamba-1.5: Hybrid Transformer-Mamba Models at Scale

    Authors: Jamba Team, Barak Lenz, Alan Arazi, Amir Bergman, Avshalom Manevich, Barak Peleg, Ben Aviram, Chen Almagor, Clara Fridman, Dan Padnos, Daniel Gissin, Daniel Jannai, Dor Muhlgay, Dor Zimberg, Edden M Gerber, Elad Dolev, Eran Krakovsky, Erez Safahi, Erez Schwartz, Gal Cohen, Gal Shachaf, Haim Rozenblum, Hofit Bata, Ido Blass, Inbal Magar, et al. (36 additional authors not shown)

    Abstract: We present Jamba-1.5, new instruction-tuned large language models based on our Jamba architecture. Jamba is a hybrid Transformer-Mamba mixture of experts architecture, providing high throughput and low memory usage across context lengths, while retaining the same or better quality as Transformer models. We release two model sizes: Jamba-1.5-Large, with 94B active parameters, and Jamba-1.5-Mini, wi…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: Webpage: https://www.ai21.com/jamba

  2. arXiv:2404.19310  [pdf, other]

    cs.CL

    Does Whisper understand Swiss German? An automatic, qualitative, and human evaluation

    Authors: Eyal Liron Dolev, Clemens Fidel Lutz, Noëmi Aepli

    Abstract: Whisper is a state-of-the-art automatic speech recognition (ASR) model (Radford et al., 2022). Although Swiss German dialects are allegedly not part of Whisper's training data, preliminary experiments showed that Whisper can transcribe Swiss German quite well, with the output being a speech translation into Standard German. To gain a better understanding of Whisper's performance on Swiss German, w…

    Submitted 9 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: Accepted at VarDial 2024 (the eleventh Workshop on NLP for Similar Languages, Varieties and Dialects 2024), Mexico City

  3. arXiv:2306.08999  [pdf, other]

    cs.CL cs.AI

    Voting Booklet Bias: Stance Detection in Swiss Federal Communication

    Authors: Eric Egli, Noah Mamié, Eyal Liron Dolev, Mathias Müller

    Abstract: In this study, we use recent stance detection methods to study the stance (for, against or neutral) of statements in official information booklets for voters. Our main goal is to answer the fundamental question: are topics to be voted on presented in a neutral way? To this end, we first train and compare several models for stance detection on a large dataset about Swiss politics. We find that fi…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 10 pages (including abstract and appendix), 5 figures, Keywords: stance detection, natural language processing, political analysis

  4. arXiv:2306.08702  [pdf, other]

    cs.CL

    Does mBERT understand Romansh? Evaluating word embeddings using word alignment

    Authors: Eyal Liron Dolev

    Abstract: We test similarity-based word alignment models (SimAlign and awesome-align) in combination with word embeddings from mBERT and XLM-R on parallel sentences in German and Romansh. Since Romansh is an unseen language, we are dealing with a zero-shot setting. Using embeddings from mBERT, both models reach an alignment error rate of 0.22, which outperforms fast_align, a statistical model, and is on par…

    Submitted 17 August, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Journal ref: In Proceedings of the 8th edition of the Swiss Text Analytics Conference, 2023, pages 41-53, Neuchatel, Switzerland. Association for Computational Linguistics

  5. arXiv:2305.13399  [pdf, other]

    cs.CV cs.LG

    Efficient Large-Scale Visual Representation Learning And Evaluation

    Authors: Eden Dolev, Alaa Awad, Denisa Roberts, Zahra Ebrahimzadeh, Marcin Mejran, Vaibhav Malpani, Mahir Yavuz

    Abstract: Efficiently learning visual representations of items is vital for large-scale recommendations. In this article we compare several pretrained efficient backbone architectures, both in the convolutional neural network (CNN) and in the vision transformer (ViT) family. We describe challenges in e-commerce vision applications at scale and highlight methods to efficiently train, evaluate, and serve visu…

    Submitted 1 August, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

  6. arXiv:2302.01255  [pdf, other]

    cs.LG

    adSformers: Personalization from Short-Term Sequences and Diversity of Representations in Etsy Ads

    Authors: Alaa Awad, Denisa Roberts, Eden Dolev, Andrea Heyman, Zahra Ebrahimzadeh, Zoe Weil, Marcin Mejran, Vaibhav Malpani, Mahir Yavuz

    Abstract: In this article, we present a general approach to personalizing ads through encoding and learning from variable-length sequences of recent user actions and diverse representations. To this end we introduce a three-component module called the adSformer diversifiable personalization module (ADPM) that learns a dynamic user representation. We illustrate the module's effectiveness and flexibility by p…

    Submitted 9 June, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

  7. arXiv:2006.03750  [pdf, other]

    cs.LG stat.ML

    Learning to Solve Combinatorial Optimization Problems on Real-World Graphs in Linear Time

    Authors: Iddo Drori, Anant Kharkar, William R. Sickinger, Brandon Kates, Qiang Ma, Suwen Ge, Eden Dolev, Brenda Dietrich, David P. Williamson, Madeleine Udell

    Abstract: Combinatorial optimization algorithms for graph problems are usually designed afresh for each new problem with careful attention by an expert to the problem structure. In this work, we develop a new framework to solve any combinatorial optimization problem over graphs that can be formulated as a single player game defined by states, actions, and rewards, including minimum spanning tree, shortest p…

    Submitted 11 June, 2020; v1 submitted 5 June, 2020; originally announced June 2020.