Showing 1–8 of 8 results for author: Baumel, T

  1. arXiv:2406.13274

    cs.CL

    In-Context Learning on a Budget: A Case Study in Named Entity Recognition

    Authors: Uri Berger, Tal Baumel, Gabriel Stanovsky

    Abstract: Few shot in-context learning (ICL) typically assumes access to large annotated training sets. However, in many real world scenarios, such as domain adaptation, there is only a limited budget to annotate a small number of samples, with the goal of maximizing downstream performance. We study various methods for selecting samples to annotate within a predefined budget, specifically focusing on the na…

    Submitted 19 June, 2024; originally announced June 2024.

  2. Multilingual Large Language Models and Curse of Multilinguality

    Authors: Daniil Gurgurov, Tanja Bäumel, Tatiana Anikina

    Abstract: Multilingual Large Language Models (LLMs) have gained large popularity among Natural Language Processing (NLP) researchers and practitioners. These models, trained on huge datasets, show proficiency across various languages and demonstrate effectiveness in numerous downstream tasks. This paper navigates the landscape of multilingual LLMs, providing an introductory overview of their technical aspec…

    Submitted 15 June, 2024; originally announced June 2024.

  3. arXiv:2312.06514

    cs.CL cs.AI

    Where exactly does contextualization in a PLM happen?

    Authors: Soniya Vijayakumar, Tanja Bäumel, Simon Ostermann, Josef van Genabith

    Abstract: Pre-trained Language Models (PLMs) have shown to be consistently successful in a plethora of NLP tasks due to their ability to learn contextualized representations of words (Ethayarajh, 2019). BERT (Devlin et al., 2018), ELMo (Peters et al., 2018) and other PLMs encode word meaning via textual context, as opposed to static word embeddings, which encode all meanings of a word in a single vector rep…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: EMNLP 2023 BlackboxNLP 2023 Workshop

  4. arXiv:2311.08240

    cs.CL cs.AI

    Investigating the Encoding of Words in BERT's Neurons using Feature Textualization

    Authors: Tanja Baeumel, Soniya Vijayakumar, Josef van Genabith, Guenter Neumann, Simon Ostermann

    Abstract: Pretrained language models (PLMs) form the basis of most state-of-the-art NLP technologies. Nevertheless, they are essentially black boxes: Humans do not have a clear understanding of what knowledge is encoded in different parts of the models, especially in individual neurons. The situation is different in computer vision, where feature visualization provides a decompositional interpretability tec…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: To be published in 'BlackboxNLP 2023: The 6th Workshop on Analysing and Interpreting Neural Networks for NLP'. Camera-ready version

  5. arXiv:2211.09722

    cs.CL cs.LG

    Federated Multilingual Models for Medical Transcript Analysis

    Authors: Andre Manoel, Mirian Hipolito Garcia, Tal Baumel, Shize Su, Jialei Chen, Dan Miller, Danny Karmon, Robert Sim, Dimitrios Dimitriadis

    Abstract: Federated Learning (FL) is a novel machine learning approach that allows the model trainer to access more data samples, by training the model across multiple decentralized data sources, while data access constraints are in place. Such trained models can achieve significantly higher performance beyond what can be done when trained on a single data source. As part of FL's promises, none of the train…

    Submitted 3 November, 2022; originally announced November 2022.

  6. arXiv:1906.00318

    cs.CL cs.IR

    Question Answering as an Automatic Evaluation Metric for News Article Summarization

    Authors: Matan Eyal, Tal Baumel, Michael Elhadad

    Abstract: Recent work in the field of automatic summarization and headline generation focuses on maximizing ROUGE scores for various news datasets. We present an alternative, extrinsic, evaluation metric for this task, Answering Performance for Evaluation of Summaries. APES utilizes recent progress in the field of reading-comprehension to quantify the ability of a summary to answer a set of manually created…

    Submitted 1 June, 2019; originally announced June 2019.

    Comments: Accepted to NAACL2019

  7. arXiv:1801.07704

    cs.CL

    Query Focused Abstractive Summarization: Incorporating Query Relevance, Multi-Document Coverage, and Summary Length Constraints into seq2seq Models

    Authors: Tal Baumel, Matan Eyal, Michael Elhadad

    Abstract: Query Focused Summarization (QFS) has been addressed mostly using extractive methods. Such methods, however, produce text which suffers from low coherence. We investigate how abstractive methods can be applied to QFS, to overcome such limitations. Recent developments in neural-attention based sequence-to-sequence models have led to state-of-the-art results on the task of abstractive generic single…

    Submitted 25 January, 2018; v1 submitted 23 January, 2018; originally announced January 2018.

  8. arXiv:1709.09587

    cs.CL cs.AI

    Multi-Label Classification of Patient Notes a Case Study on ICD Code Assignment

    Authors: Tal Baumel, Jumana Nassour-Kassis, Raphael Cohen, Michael Elhadad, Noémie Elhadad

    Abstract: In the context of the Electronic Health Record, automated diagnosis coding of patient notes is a useful task, but a challenging one due to the large number of codes and the length of patient notes. We investigate four models for assigning multiple ICD codes to discharge summaries taken from both MIMIC II and III. We present Hierarchical Attention-GRU (HA-GRU), a hierarchical approach to tag a docu…

    Submitted 20 November, 2017; v1 submitted 27 September, 2017; originally announced September 2017.