Showing 1–6 of 6 results for author: Razeghi, Y

  1. arXiv:2405.00588

    cs.CL cs.AI cs.CV cs.CY cs.LG

    Are Models Biased on Text without Gender-related Language?

    Authors: Catarina G Belém, Preethi Seshadri, Yasaman Razeghi, Sameer Singh

    Abstract: Gender bias research has been pivotal in revealing undesirable behaviors in large language models, exposing serious gender stereotypes associated with occupations and emotions. A key observation in prior work is that models reinforce stereotypes as a consequence of the gendered correlations that are present in the training data. In this paper, we focus on bias where the effect from training data…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: In International Conference on Learning Representations 2024

  2. arXiv:2309.10687

    cs.CL

    EchoPrompt: Instructing the Model to Rephrase Queries for Improved In-context Learning

    Authors: Rajasekhar Reddy Mekala, Yasaman Razeghi, Sameer Singh

    Abstract: Language models are achieving impressive performance on various tasks by aggressively adopting inference-time prompting techniques, such as zero-shot and few-shot prompting. In this work, we introduce EchoPrompt, a simple yet effective approach that prompts the model to rephrase its queries before answering them. EchoPrompt is adapted for both zero-shot and few-shot in-context learning with standa…

    Submitted 20 February, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

  3. arXiv:2307.11922

    cs.LG cs.AI cs.CL

    Selective Perception: Optimizing State Descriptions with Reinforcement Learning for Language Model Actors

    Authors: Kolby Nottingham, Yasaman Razeghi, Kyungmin Kim, JB Lanier, Pierre Baldi, Roy Fox, Sameer Singh

    Abstract: Large language models (LLMs) are being applied as actors for sequential decision making tasks in domains such as robotics and games, utilizing their general world knowledge and planning abilities. However, previous work does little to explore what environment state information is provided to LLM actors via language. Exhaustively describing high-dimensional states can impair performance and raise i…

    Submitted 21 July, 2023; originally announced July 2023.

  4. arXiv:2203.12184

    cs.CL

    A Theoretically Grounded Benchmark for Evaluating Machine Commonsense

    Authors: Henrique Santos, Ke Shen, Alice M. Mulvehill, Yasaman Razeghi, Deborah L. McGuinness, Mayank Kejriwal

    Abstract: Programming machines with commonsense reasoning (CSR) abilities is a longstanding challenge in the Artificial Intelligence community. Current CSR benchmarks use multiple-choice (and in relatively fewer cases, generative) question-answering instances to evaluate machine commonsense. Recent progress in transformer-based language representation models suggests that considerable progress has been made…

    Submitted 14 July, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

  5. arXiv:2202.07206

    cs.CL cs.LG

    Impact of Pretraining Term Frequencies on Few-Shot Reasoning

    Authors: Yasaman Razeghi, Robert L. Logan IV, Matt Gardner, Sameer Singh

    Abstract: Pretrained Language Models (LMs) have demonstrated ability to perform numerical reasoning by extrapolating from a few examples in few-shot settings. However, the extent to which this extrapolation relies on robust reasoning is unclear. In this paper, we investigate how well these models reason with terms that are less frequent in the pretraining data. In particular, we examine the correlations bet…

    Submitted 23 May, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

  6. arXiv:2010.15980

    cs.CL cs.LG

    AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts

    Authors: Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, Sameer Singh

    Abstract: The remarkable success of pretrained language models has motivated the study of what kinds of knowledge these models learn during pretraining. Reformulating tasks as fill-in-the-blanks problems (e.g., cloze tests) is a natural approach for gauging such knowledge; however, its usage is limited by the manual effort and guesswork required to write suitable prompts. To address this, we develop AutoPro…

    Submitted 7 November, 2020; v1 submitted 29 October, 2020; originally announced October 2020.

    Comments: v2: Fixed error in Figure 2