Showing 1–27 of 27 results for author: Feldman, A

Searching in archive cs.
  1. arXiv:2407.13040  [pdf, other]

    cs.CL

    Turkish Delights: a Dataset on Turkish Euphemisms

    Authors: Hasan Can Biyik, Patrick Lee, Anna Feldman

    Abstract: Euphemisms are a form of figurative language relatively understudied in natural language processing. This research extends the current computational work on potentially euphemistic terms (PETs) to Turkish. We introduce the Turkish PET dataset, the first available of its kind in the field. By creating a list of euphemisms in Turkish, collecting example contexts, and annotating them, we provide both…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: In Proceedings of The First SIGTURK workshop co-located with ACL 2024: https://sigturk.github.io/workshop/

  2. arXiv:2403.10516  [pdf, other]

    cs.CV cs.AI cs.IR cs.LG

    FeatUp: A Model-Agnostic Framework for Features at Any Resolution

    Authors: Stephanie Fu, Mark Hamilton, Laura Brandt, Axel Feldman, Zhoutong Zhang, William T. Freeman

    Abstract: Deep features are a cornerstone of computer vision research, capturing image semantics and enabling the community to solve downstream tasks even in the zero- or few-shot regime. However, these features often lack the spatial resolution to directly perform dense prediction tasks like segmentation and depth prediction because models aggressively pool information over large areas. In this work, we in…

    Submitted 1 April, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted to the International Conference on Learning Representations (ICLR) 2024

  3. arXiv:2402.04442  [pdf, other]

    cs.CL

    Evaluating Embeddings for One-Shot Classification of Doctor-AI Consultations

    Authors: Olumide Ebenezer Ojo, Olaronke Oluwayemisi Adebanji, Alexander Gelbukh, Hiram Calvo, Anna Feldman

    Abstract: Effective communication between healthcare providers and patients is crucial to providing high-quality patient care. In this work, we investigate how Doctor-written and AI-generated texts in healthcare consultations can be classified using state-of-the-art embeddings and one-shot classification systems. By analyzing embeddings such as bag-of-words, character n-grams, Word2Vec, GloVe, fastText, and…

    Submitted 6 February, 2024; originally announced February 2024.

  4. arXiv:2401.14526  [pdf, other]

    cs.CL

    MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms

    Authors: Patrick Lee, Alain Chirino Trujillo, Diana Cuevas Plancarte, Olumide Ebenezer Ojo, Xinyi Liu, Iyanuoluwa Shode, Yuan Zhao, Jing Peng, Anna Feldman

    Abstract: This study investigates the computational processing of euphemisms, a universal linguistic phenomenon, across multiple languages. We train a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. In line with current trends, we demonstrate that zero-shot learning across languages takes place. We also show cases…

    Submitted 25 January, 2024; originally announced January 2024.

  5. arXiv:2310.12489  [pdf, other]

    cs.CL

    MedAI Dialog Corpus (MEDIC): Zero-Shot Classification of Doctor and AI Responses in Health Consultations

    Authors: Olumide E. Ojo, Olaronke O. Adebanji, Alexander Gelbukh, Hiram Calvo, Anna Feldman

    Abstract: Zero-shot classification enables text to be classified into classes not seen during training. In this study, we examine the efficacy of zero-shot learning models in classifying healthcare consultation responses from Doctors and AI systems. The models evaluated include BART, BERT, XLM, XLM-R and DistilBERT. The models were tested on three different datasets based on a binary and multi-label analysi…

    Submitted 12 January, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

  6. arXiv:2310.09661  [pdf, other]

    cs.CL

    Legend at ArAIEval Shared Task: Persuasion Technique Detection using a Language-Agnostic Text Representation Model

    Authors: Olumide E. Ojo, Olaronke O. Adebanji, Hiram Calvo, Damian O. Dieke, Olumuyiwa E. Ojo, Seye E. Akinsanya, Tolulope O. Abiola, Anna Feldman

    Abstract: In this paper, we share our best performing submission to the Arabic AI Tasks Evaluation Challenge (ArAIEval) at ArabicNLP 2023. Our focus was on Task 1, which involves identifying persuasion techniques in excerpts from tweets and news articles. The persuasion technique in Arabic texts was detected using a training loop with XLM-RoBERTa, a language-agnostic text representation model. This approach…

    Submitted 14 October, 2023; originally announced October 2023.

  7. arXiv:2309.10874  [pdf, other]

    cs.RO eess.SY

    Guarantees on Robot System Performance Using Stochastic Simulation Rollouts

    Authors: Joseph A. Vincent, Aaron O. Feldman, Mac Schwager

    Abstract: We provide finite-sample performance guarantees for control policies executed on stochastic robotic systems. Given an open- or closed-loop policy and a finite set of trajectory rollouts under the policy, we bound the expected value, value-at-risk, and conditional-value-at-risk of the trajectory cost, and the probability of failure in a sparse cost setting. The bounds hold, with user-specified prob…

    Submitted 13 June, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: Submitted to IEEE-TRO

  8. arXiv:2308.16149  [pdf, other]

    cs.CL cs.AI cs.LG

    Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models

    Authors: Neha Sengupta, Sunil Kumar Sahu, Bokang Jia, Satheesh Katipomu, Haonan Li, Fajri Koto, William Marshall, Gurpreet Gosal, Cynthia Liu, Zhiming Chen, Osama Mohammed Afzal, Samta Kamboj, Onkar Pandit, Rahul Pal, Lalit Pradhan, Zain Muhammad Mujahid, Massa Baali, Xudong Han, Sondos Mahmoud Bsharat, Alham Fikri Aji, Zhiqiang Shen, Zhengzhong Liu, Natalia Vassilieva, Joel Hestness, Andy Hock, et al. (7 additional authors not shown)

    Abstract: We introduce Jais and Jais-chat, new state-of-the-art Arabic-centric foundation and instruction-tuned open generative large language models (LLMs). The models are based on the GPT-3 decoder-only architecture and are pretrained on a mixture of Arabic and English texts, including source code in various programming languages. With 13 billion parameters, they demonstrate better knowledge and reasoning…

    Submitted 29 September, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

    Comments: Arabic-centric, foundation model, large-language model, LLM, generative model, instruction-tuned, Jais, Jais-chat

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  9. arXiv:2306.00217  [pdf, other]

    cs.CL

    FEED PETs: Further Experimentation and Expansion on the Disambiguation of Potentially Euphemistic Terms

    Authors: Patrick Lee, Iyanuoluwa Shode, Alain Chirino Trujillo, Yuan Zhao, Olumide Ebenezer Ojo, Diana Cuevas Plancarte, Anna Feldman, Jing Peng

    Abstract: Transformers have been shown to work well for the task of English euphemism disambiguation, in which a potentially euphemistic term (PET) is classified as euphemistic or non-euphemistic in a particular context. In this study, we expand on the task in two ways. First, we annotate PETs for vagueness, a linguistic property associated with euphemisms, and find that transformers are generally better at…

    Submitted 6 June, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

  10. arXiv:2305.10971  [pdf, other]

    cs.CL

    NollySenti: Leveraging Transfer Learning and Machine Translation for Nigerian Movie Sentiment Classification

    Authors: Iyanuoluwa Shode, David Ifeoluwa Adelani, Jing Peng, Anna Feldman

    Abstract: Africa has over 2000 indigenous languages but they are under-represented in NLP research due to lack of datasets. In recent years, there has been progress in developing labeled corpora for African languages. However, they are often available in a single domain and may not generalize to other domains. In this paper, we focus on the task of sentiment classification for cross domain adaptation. We c…

    Submitted 22 August, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023 (main conference)

  11. arXiv:2211.13327  [pdf, other]

    cs.CL cs.AI

    A Report on the Euphemisms Detection Shared Task

    Authors: Patrick Lee, Anna Feldman, Jing Peng

    Abstract: This paper presents The Shared Task on Euphemism Detection for the Third Workshop on Figurative Language Processing (FigLang 2022) held in conjunction with EMNLP 2022. Participants were invited to investigate the euphemism detection task: given input text, identify whether it contains a euphemism. The input data is a corpus of sentences containing potentially euphemistic terms (PETs) collected fro…

    Submitted 3 December, 2022; v1 submitted 23 November, 2022; originally announced November 2022.

  12. arXiv:2209.05470  [pdf, ps, other]

    cs.AI

    A Quantum Algorithm for Computing All Diagnoses of a Switching Circuit

    Authors: Alexander Feldman, Johan de Kleer, Ion Matei

    Abstract: Faults are stochastic by nature while most man-made systems, and especially computers, work deterministically. This necessitates the linking of probability theory with mathematical logic, automata, and switching circuit theory. This paper provides such a connection via quantum information theory, which is an intuitive approach as quantum physics obeys probability laws. In this paper we provide a n…

    Submitted 8 September, 2022; originally announced September 2022.

  13. arXiv:2205.10451  [pdf, other]

    cs.CL

    Searching for PETs: Using Distributional and Sentiment-Based Methods to Find Potentially Euphemistic Terms

    Authors: Patrick Lee, Martha Gavidia, Anna Feldman, Jing Peng

    Abstract: This paper presents a linguistically driven proof of concept for finding potentially euphemistic terms, or PETs. Acknowledging that PETs tend to be commonly used expressions for a certain range of sensitive topics, we make use of distributional similarities to select and filter phrase candidates from a sentence and rank them using a set of simple sentiment-based metrics. We present the results of…

    Submitted 20 May, 2022; originally announced May 2022.

    Journal ref: Proceedings of UnImplicit: The Second Workshop on Understanding Implicit and Underspecified Language, NAACL 2022, Seattle

  14. arXiv:2205.02728  [pdf, other]

    cs.CL

    CATs are Fuzzy PETs: A Corpus and Analysis of Potentially Euphemistic Terms

    Authors: Martha Gavidia, Patrick Lee, Anna Feldman, Jing Peng

    Abstract: Euphemisms have not received much attention in natural language processing, despite being an important element of polite and figurative language. Euphemisms prove to be a difficult topic, not only because they are subject to language change, but also because humans may not agree on what is a euphemism and what is not. Nevertheless, the first step to tackling the issue is to collect and analyze exa…

    Submitted 5 May, 2022; originally announced May 2022.

    Comments: Proceedings of LREC 2022

  15. arXiv:2204.09711  [pdf, ps, other]

    cs.CL

    yosm: A new yoruba sentiment corpus for movie reviews

    Authors: Iyanuoluwa Shode, David Ifeoluwa Adelani, Anna Feldman

    Abstract: A movie that is thoroughly enjoyed and recommended by an individual might be hated by another. One characteristic of humans is the ability to have feelings which could be positive or negative. To automatically classify and study human feelings, an aspect of natural language processing, sentiment analysis and opinion mining were designed to understand human feelings regarding several issues which c…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: Accepted to AfricaNLP Workshop @ICLR 2022

  16. arXiv:2109.12986  [pdf, other]

    cs.CL cs.IR cs.LG cs.SI

    Findings of the NLP4IF-2021 Shared Tasks on Fighting the COVID-19 Infodemic and Censorship Detection

    Authors: Shaden Shaar, Firoj Alam, Giovanni Da San Martino, Alex Nikolov, Wajdi Zaghouani, Preslav Nakov, Anna Feldman

    Abstract: We present the results and the main findings of the NLP4IF-2021 shared tasks. Task 1 focused on fighting the COVID-19 infodemic in social media, and it was offered in Arabic, Bulgarian, and English. Given a tweet, it asked to predict whether that tweet contains a verifiable claim, and if so, whether it is likely to be false, is of general interest, is likely to be harmful, and is worthy of manual…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: COVID-19, infodemic, harmfulness, check-worthiness, censorship, social media, tweets, Arabic, Bulgarian, English, Chinese

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

    Journal ref: NLP4IF-2021

  17. arXiv:2010.01942  [pdf, other]

    eess.IV cs.CV

    Unsupervised Region-based Anomaly Detection in Brain MRI with Adversarial Image Inpainting

    Authors: Bao Nguyen, Adam Feldman, Sarath Bethapudi, Andrew Jennings, Chris G. Willcocks

    Abstract: Medical segmentation is performed to determine the bounds of regions of interest (ROI) prior to surgery. By allowing the study of growth, structure, and behaviour of the ROI in the planning phase, critical information can be obtained, increasing the likelihood of a successful operation. Usually, segmentations are performed manually or via machine learning methods trained on manual annotations. In…

    Submitted 5 October, 2020; originally announced October 2020.

    Comments: 5 pages, 6 figures

    ACM Class: I.5.0; I.4.0

  18. arXiv:2008.08073  [pdf]

    q-bio.NC cs.NE q-bio.PE

    On the Evolution of Subjective Experience

    Authors: Jerome A. Feldman

    Abstract: Subjective Experience (SE) is part of the ancient mind-body problem, which continues to be one of the deepest mysteries of science. Despite major advances in many fields, there is still no plausible causal link between SE and its realization in the body. The core issue is the incompatibility of objective (3rd person) public science with subjective (1st person) private experience. Any scientific approa…

    Submitted 25 March, 2022; v1 submitted 18 August, 2020; originally announced August 2020.

    Comments: 49 pages 5 figures. This 7/22/2021 version preserves all the content of the previous version and adds additional discussion (in italics). It also includes several new references to connect with current literature. A companion arXiv article has also been updated

  19. arXiv:2003.02671  [pdf, other]

    eess.SP cs.LG math.DS math.OC stat.ML

    Hybrid modeling: Applications in real-time diagnosis

    Authors: Ion Matei, Johan de Kleer, Alexander Feldman, Rahul Rai, Souma Chowdhury

    Abstract: Reduced-order models that accurately abstract high fidelity models and enable faster simulation are vital for real-time, model-based diagnosis applications. In this paper, we outline a novel hybrid modeling approach that combines machine learning inspired models and physics-based models to generate reduced-order models from high fidelity models. We are using such models for real-time diagnosis appl…

    Submitted 3 March, 2020; originally announced March 2020.

  20. arXiv:2001.08845  [pdf, other]

    cs.CL cs.CY

    Linguistic Fingerprints of Internet Censorship: the Case of SinaWeibo

    Authors: Kei Yin Ng, Anna Feldman, Jing Peng

    Abstract: This paper studies how the linguistic components of blogposts collected from Sina Weibo, a Chinese microblogging platform, might affect the blogposts' likelihood of being censored. Our results go along with King et al. (2013)'s Collective Action Potential (CAP) theory, which states that a blogpost's potential of causing riot or assembly in real life is the key determinant of it getting censored. A…

    Submitted 23 January, 2020; originally announced January 2020.

    Comments: AAAI 2020

  21. arXiv:1905.02303  [pdf, ps, other]

    cs.AI cs.DS cs.ET cs.LO

    Design Space Exploration as Quantified Satisfaction

    Authors: Alexander Feldman, Johan de Kleer, Ion Matei

    Abstract: We present novel algorithms for design and design space exploration. The designs discovered by these algorithms are compositions of function types specified in component libraries. Our algorithms reduce the design problem to quantified satisfiability and use advanced solvers to find solutions that represent useful systems. The algorithms we present in this paper are sound and complete and are gu…

    Submitted 31 January, 2021; v1 submitted 6 May, 2019; originally announced May 2019.

  22. arXiv:1904.02575  [pdf]

    cs.CV

    Segmentation of the Prostatic Gland and the Intraprostatic Lesions on Multiparametic MRI Using Mask-RCNN

    Authors: Zhenzhen Dai, Eric Carver, Chang Liu, Joon Lee, Aharon Feldman, Weiwei Zong, Milan Pantelic, Mohamed Elshaikh, Ning Wen

    Abstract: Prostate cancer (PCa) is the most common cancer in men in the United States. Multiparametric magnetic resonance imaging (mp-MRI) has been explored by many researchers to target prostate biopsies and radiation therapy. However, assessment on mp-MRI can be subjective, development of computer-aided diagnosis systems to automatically delineate the prostate gland and the intraprostatic lesions (ILs)…

    Submitted 4 April, 2019; originally announced April 2019.

  23. arXiv:1903.12331  [pdf]

    cs.CV eess.IV q-bio.QM

    A Deep Dive into Understanding Tumor Foci Classification using Multiparametric MRI Based on Convolutional Neural Network

    Authors: Weiwei Zong, Joon Lee, Chang Liu, Eric Carver, Aharon Feldman, Branislava Janic, Mohamed Elshaikh, Milan Pantelic, David Hearshen, Indrin Chetty, Benjamin Movsas, Ning Wen

    Abstract: Deep learning models have had great success in disease classification using large data pools of skin cancer images or lung X-rays. However, data scarcity has been the roadblock to applying deep learning models directly to prostate multiparametric MRI (mpMRI). Although model interpretation has been heavily studied for natural images for the past few years, there has been a lack of interpretation…

    Submitted 14 May, 2020; v1 submitted 28 March, 2019; originally announced March 2019.

  24. arXiv:1807.03654  [pdf, ps, other]

    cs.CL

    Linguistic Characteristics of Censorable Language on SinaWeibo

    Authors: Kei Yin Ng, Anna Feldman, Jing Peng, Chris Leberknight

    Abstract: This paper investigates censorship from a linguistic perspective. We collect a corpus of censored and uncensored posts on a number of topics and build a classifier that predicts censorship decisions independent of discussion topics. Our investigation reveals that the strongest linguistic indicator of censored content in our corpus is its readability.

    Submitted 10 July, 2018; originally announced July 2018.

    Journal ref: 1st Workshop on NLP for Internet Freedom (NLP4IF-2018)

  25. arXiv:1802.09961  [pdf, ps, other]

    cs.CL

    Classifying Idiomatic and Literal Expressions Using Topic Models and Intensity of Emotions

    Authors: Jing Peng, Anna Feldman, Ekaterina Vylomova

    Abstract: We describe an algorithm for automatic classification of idiomatic and literal expressions. Our starting point is that words in a given text segment, such as a paragraph, that are high-ranking representatives of a common topic of discussion are less likely to be a part of an idiomatic expression. Our additional hypothesis is that contexts in which idioms occur, typically, are more affective and the…

    Submitted 27 February, 2018; originally announced February 2018.

    Comments: EMNLP 2014

  26. A Model-Based Active Testing Approach to Sequential Diagnosis

    Authors: Alexander Feldman, Gregory Provan, Arjan van Gemund

    Abstract: Model-based diagnostic reasoning often leads to a large number of diagnostic hypotheses. The set of diagnoses can be reduced by taking into account extra observations (passive monitoring), measuring additional variables (probing) or executing additional tests (sequential diagnosis/test sequencing). In this paper we combine the above approaches with techniques from Automated Test Pattern Generation…

    Submitted 15 January, 2014; originally announced January 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 39, pages 301-334, 2010

  27. Approximate Model-Based Diagnosis Using Greedy Stochastic Search

    Authors: Alexander Feldman, Gregory Provan, Arjan van Gemund

    Abstract: We propose a StochAstic Fault diagnosis AlgoRIthm, called SAFARI, which trades off guarantees of computing minimal diagnoses for computational efficiency. We empirically demonstrate, using the 74XXX and ISCAS-85 suites of benchmark combinatorial circuits, that SAFARI achieves several orders-of-magnitude speedup over two well-known deterministic algorithms, CDA* and HA*, for multiple-fault diagnose…

    Submitted 15 January, 2014; originally announced January 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 38, pages 371-413, 2010