Zum Hauptinhalt springen

Showing 1–2 of 2 results for author: Gerber, E

Searching in archive cs. Search in all archives.
.
  1. arXiv:2408.12570  [pdf, other

    cs.CL cs.LG

    Jamba-1.5: Hybrid Transformer-Mamba Models at Scale

    Authors: Jamba Team, Barak Lenz, Alan Arazi, Amir Bergman, Avshalom Manevich, Barak Peleg, Ben Aviram, Chen Almagor, Clara Fridman, Dan Padnos, Daniel Gissin, Daniel Jannai, Dor Muhlgay, Dor Zimberg, Edden M Gerber, Elad Dolev, Eran Krakovsky, Erez Safahi, Erez Schwartz, Gal Cohen, Gal Shachaf, Haim Rozenblum, Hofit Bata, Ido Blass, Inbal Magar , et al. (36 additional authors not shown)

    Abstract: We present Jamba-1.5, new instruction-tuned large language models based on our Jamba architecture. Jamba is a hybrid Transformer-Mamba mixture of experts architecture, providing high throughput and low memory usage across context lengths, while retaining the same or better quality as Transformer models. We release two model sizes: Jamba-1.5-Large, with 94B active parameters, and Jamba-1.5-Mini, wi… ▽ More

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: Webpage: https://www.ai21.com/jamba

  2. AspectCSE: Sentence Embeddings for Aspect-based Semantic Textual Similarity Using Contrastive Learning and Structured Knowledge

    Authors: Tim Schopf, Emanuel Gerber, Malte Ostendorff, Florian Matthes

    Abstract: Generic sentence embeddings provide a coarse-grained approximation of semantic textual similarity but ignore specific aspects that make texts similar. Conversely, aspect-based sentence embeddings provide similarities between texts based on certain predefined aspects. Thus, similarity predictions of texts are more targeted to specific requirements and more easily explainable. In this paper, we pres… ▽ More

    Submitted 24 September, 2023; v1 submitted 15 July, 2023; originally announced July 2023.

    Comments: Accepted to the 14th International Conference on Recent Advances in Natural Language Processing (RANLP 2023)

    ACM Class: I.2.7