
Showing 1–8 of 8 results for author: Bean, A

  1. arXiv:2408.07888  [pdf, other]

    cs.CL

    Fine-tuning Large Language Models with Human-inspired Learning Strategies in Medical Question Answering

    Authors: Yushi Yang, Andrew M. Bean, Robert McCraith, Adam Mahdi

    Abstract: Training Large Language Models (LLMs) incurs substantial data-related costs, motivating the development of data-efficient training methods through optimised data ordering and selection. Human-inspired learning strategies, such as curriculum learning, offer possibilities for efficient training by organising data according to common human learning practices. Despite evidence that fine-tuning with cu…

    Submitted 14 August, 2024; originally announced August 2024.

  2. arXiv:2406.06196  [pdf, other]

    cs.CL

    LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages

    Authors: Andrew M. Bean, Simi Hellsten, Harry Mayne, Jabez Magomere, Ethan A. Chi, Ryan Chi, Scott A. Hale, Hannah Rose Kirk

    Abstract: In this paper, we present the LingOly benchmark, a novel benchmark for advanced reasoning abilities in large language models. Using challenging Linguistic Olympiad puzzles, we evaluate (i) capabilities for in-context identification and generalisation of linguistic patterns in very low-resource or extinct languages, and (ii) abilities to follow complex task instructions. The LingOly benchmark cover…

    Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 figures, 16 pages supplemental materials

  3. arXiv:2404.16019  [pdf, other]

    cs.CL

    The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models

    Authors: Hannah Rose Kirk, Alexander Whitefield, Paul Röttger, Andrew Bean, Katerina Margatina, Juan Ciro, Rafael Mosquera, Max Bartolo, Adina Williams, He He, Bertie Vidgen, Scott A. Hale

    Abstract: Human feedback plays a central role in the alignment of Large Language Models (LLMs). However, open questions remain about the methods (how), domains (where), people (who) and objectives (to what end) of human feedback collection. To navigate these questions, we introduce PRISM, a new dataset which maps the sociodemographics and stated preferences of 1,500 diverse participants from 75 countries, t…

    Submitted 24 April, 2024; originally announced April 2024.

  4. arXiv:2310.07629  [pdf, other]

    cs.CL cs.CY

    The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values

    Authors: Hannah Rose Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, Scott A. Hale

    Abstract: Human feedback is increasingly used to steer the behaviours of Large Language Models (LLMs). However, it is unclear how to collect and incorporate feedback in a way that is efficient, effective and unbiased, especially for highly subjective human preferences and values. In this paper, we survey existing approaches for learning from human feedback, drawing on 95 papers primarily from the ACL and ar…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Accepted for the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP, Main)

  5. arXiv:2310.07225  [pdf, ps, other]

    cs.CL

    Exploring the landscape of large language models in medical question answering

    Authors: Andrew M. Bean, Karolina Korgul, Felix Krones, Robert McCraith, Adam Mahdi

    Abstract: With the rapid development of new large language models (LLMs), each claiming to surpass previous models, an overall picture of medical LLM research can be elusive. To address this challenge, we benchmark a range of top LLMs and identify consistent patterns which appear across models. We test 8 well-known LLMs on 874 newly collected questions from Polish medical licensing exams. For each quest…

    Submitted 9 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 11 pages, 8 figures

  6. Indian-BhED: A Dataset for Measuring India-Centric Biases in Large Language Models

    Authors: Khyati Khandelwal, Manuel Tonneau, Andrew M. Bean, Hannah Rose Kirk, Scott A. Hale

    Abstract: Large Language Models (LLMs), now used daily by millions, can encode societal biases, exposing their users to representational harms. A large body of scholarship on LLM bias exists but it predominantly adopts a Western-centric frame and attends comparatively less to bias levels and potential harms in the Global South. In this paper, we quantify stereotypical bias in popular LLMs according to an In…

    Submitted 9 August, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: To be published in GoodIT '24, doi:10.1145/3677525.3678666. 14 pages

  7. arXiv:2307.11242  [pdf, other]

    cs.NE cs.AI cs.LG

    On-Sensor Data Filtering using Neuromorphic Computing for High Energy Physics Experiments

    Authors: Shruti R. Kulkarni, Aaron Young, Prasanna Date, Narasinga Rao Miniskar, Jeffrey S. Vetter, Farah Fahim, Benjamin Parpillon, Jennet Dickinson, Nhan Tran, Jieun Yoo, Corrinne Mills, Morris Swartz, Petar Maksimovic, Catherine D. Schuman, Alice Bean

    Abstract: This work describes the investigation of neuromorphic computing-based spiking neural network (SNN) models used to filter data from sensor electronics in high energy physics experiments conducted at the High Luminosity Large Hadron Collider. We present our approach for developing a compact neuromorphic model that filters out the sensor data based on the particle's transverse momentum with the goal…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: Manuscript accepted at ICONS'23

  8. arXiv:1812.02945  [pdf]

    cs.HC

    An Objective Assessment of the Utility of a Driving Simulator for Low Mu Testing

    Authors: Richard Romano, Gustav Markkula, Erwin Boer, Hamish Jamson, Alex Bean, Andrew Tomlinson, Anthony Horrobin, Ehsan Sadraei

    Abstract: Driving simulators can be used to test vehicle designs earlier, prior to building physical prototypes. One area of particular interest is winter testing since testing is limited to specific times of year and specific regions in the world. To ensure that the simulator is fit for purpose, an objective assessment is required. In this study a simulator and real world comparison was performed with thre…

    Submitted 7 December, 2018; originally announced December 2018.