Showing 1–18 of 18 results for author: Mellor, J

  1. arXiv:2406.11757  [pdf, other]

    cs.AI cs.CL cs.CY cs.HC

    STAR: SocioTechnical Approach to Red Teaming Language Models

    Authors: Laura Weidinger, John Mellor, Bernat Guillen Pegueroles, Nahema Marchal, Ravin Kumar, Kristian Lum, Canfer Akbulut, Mark Diaz, Stevie Bergman, Mikel Rodriguez, Verena Rieser, William Isaac

    Abstract: This research introduces STAR, a sociotechnical framework that improves on current best practices for red teaming safety of large language models. STAR makes two key contributions: it enhances steerability by generating parameterised instructions for human red teamers, leading to improved coverage of the risk surface. Parameterised instructions also provide more detailed insights into model failur…

    Submitted 10 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 8 pages, 5 figures, 5 pages appendix. * denotes equal contribution

  2. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  3. arXiv:2212.08571  [pdf, other]

    cs.SD cs.LG eess.AS stat.AP

    Statistical Design and Analysis for Robust Machine Learning: A Case Study from COVID-19

    Authors: Davide Pigoli, Kieran Baker, Jobie Budd, Lorraine Butler, Harry Coppock, Sabrina Egglestone, Steven G. Gilmour, Chris Holmes, David Hurley, Radka Jersakova, Ivan Kiskin, Vasiliki Koutra, Jonathon Mellor, George Nicholson, Joe Packham, Selina Patel, Richard Payne, Stephen J. Roberts, Björn W. Schuller, Ana Tendero-Cañadas, Tracey Thornley, Alexander Titcomb

    Abstract: Since early in the coronavirus disease 2019 (COVID-19) pandemic, there has been interest in using artificial intelligence methods to predict COVID-19 infection status based on vocal audio signals, for example cough recordings. However, existing studies have limitations in terms of data collection and of the assessment of the performances of the proposed predictive models. This paper rigorously ass…

    Submitted 27 February, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

  4. arXiv:2212.08570  [pdf, other]

    cs.SD cs.LG eess.AS

    Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers

    Authors: Harry Coppock, George Nicholson, Ivan Kiskin, Vasiliki Koutra, Kieran Baker, Jobie Budd, Richard Payne, Emma Karoune, David Hurley, Alexander Titcomb, Sabrina Egglestone, Ana Tendero Cañadas, Lorraine Butler, Radka Jersakova, Jonathon Mellor, Selina Patel, Tracey Thornley, Peter Diggle, Sylvia Richardson, Josef Packham, Björn W. Schuller, Davide Pigoli, Steven Gilmour, Stephen Roberts, Chris Holmes

    Abstract: Recent work has reported that AI classifiers trained on audio recordings can accurately predict severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection status. Here, we undertake a large-scale study of audio-based deep learning classifiers, as part of the UK government's pandemic response. We collect and analyse a dataset of audio recordings from 67,842 individuals with linked metadata…

    Submitted 2 March, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

  5. arXiv:2212.07738  [pdf]

    cs.SD cs.LG eess.AS

    A large-scale and PCR-referenced vocal audio dataset for COVID-19

    Authors: Jobie Budd, Kieran Baker, Emma Karoune, Harry Coppock, Selina Patel, Ana Tendero Cañadas, Alexander Titcomb, Richard Payne, David Hurley, Sabrina Egglestone, Lorraine Butler, Jonathon Mellor, George Nicholson, Ivan Kiskin, Vasiliki Koutra, Radka Jersakova, Rachel A. McKendry, Peter Diggle, Sylvia Richardson, Björn W. Schuller, Steven Gilmour, Davide Pigoli, Stephen Roberts, Josef Packham, Tracey Thornley, et al. (1 additional author not shown)

    Abstract: The UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmi…

    Submitted 3 November, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: 39 pages, 4 figures

  6. arXiv:2209.14375  [pdf, other]

    cs.LG cs.CL

    Improving alignment of dialogue agents via targeted human judgements

    Authors: Amelia Glaese, Nat McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, Timo Ewalds, Maribeth Rauh, Laura Weidinger, Martin Chadwick, Phoebe Thacker, Lucy Campbell-Gillingham, Jonathan Uesato, Po-Sen Huang, Ramona Comanescu, Fan Yang, Abigail See, Sumanth Dathathri, Rory Greig, Charlie Chen, Doug Fritz, Jaume Sanchez Elias, Richard Green, Soňa Mokrá, Nicholas Fernando, Boxi Wu, et al. (9 additional authors not shown)

    Abstract: We present Sparrow, an information-seeking dialogue agent trained to be more helpful, correct, and harmless compared to prompted language model baselines. We use reinforcement learning from human feedback to train our models with two new additions to help human raters judge agent behaviour. First, to make our agent more helpful and harmless, we break down the requirements for good dialogue into na…

    Submitted 28 September, 2022; originally announced September 2022.

  7. arXiv:2206.11769  [pdf, other]

    q-bio.NC cs.LG cs.NE

    Single-phase deep learning in cortico-cortical networks

    Authors: Will Greedy, Heng Wei Zhu, Joseph Pemberton, Jack Mellor, Rui Ponte Costa

    Abstract: The error-backpropagation (backprop) algorithm remains the most common solution to the credit assignment problem in artificial neural networks. In neuroscience, it is unclear whether the brain could adopt a similar strategy to correctly modify its synapses. Recent models have attempted to bridge this gap while being consistent with a range of experimental observations. However, these models are ei…

    Submitted 24 October, 2022; v1 submitted 23 June, 2022; originally announced June 2022.

    Comments: Accepted to 36th Conference on Neural Information Processing Systems (NeurIPS 2022). 22 pages, 9 figures, 5 tables

  8. arXiv:2206.08325  [pdf, ps, other]

    cs.CL cs.AI cs.CY

    Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

    Authors: Maribeth Rauh, John Mellor, Jonathan Uesato, Po-Sen Huang, Johannes Welbl, Laura Weidinger, Sumanth Dathathri, Amelia Glaese, Geoffrey Irving, Iason Gabriel, William Isaac, Lisa Anne Hendricks

    Abstract: Large language models produce human-like text that drives a growing number of applications. However, recent literature and, increasingly, real-world observations have demonstrated that these models can generate language that is toxic, biased, untruthful or otherwise harmful. Though work to evaluate language model harms is under way, translating foresight about which harms may arise into rigorous b…

    Submitted 28 October, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: Accepted to NeurIPS 2022 Datasets and Benchmarks Track; 10 pages plus appendix

  9. arXiv:2112.11446  [pdf, other]

    cs.CL cs.AI

    Scaling Language Models: Methods, Analysis & Insights from Training Gopher

    Authors: Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, et al. (55 additional authors not shown)

    Abstract: Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter model called Gop…

    Submitted 21 January, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: 120 pages

  10. arXiv:2112.04359  [pdf, other]

    cs.CL cs.AI cs.CY

    Ethical and social risks of harm from Language Models

    Authors: Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, Iason Gabriel

    Abstract: This paper aims to help structure the risk landscape associated with large-scale Language Models (LMs). In order to foster advances in responsible innovation, an in-depth understanding of the potential risks posed by these models is needed. A wide range of established and anticipated risks are analysed in detail, drawing on multidisciplinary expertise and literature from computer science, linguist…

    Submitted 8 December, 2021; originally announced December 2021.

  11. arXiv:2109.07445  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Challenges in Detoxifying Language Models

    Authors: Johannes Welbl, Amelia Glaese, Jonathan Uesato, Sumanth Dathathri, John Mellor, Lisa Anne Hendricks, Kirsty Anderson, Pushmeet Kohli, Ben Coppin, Po-Sen Huang

    Abstract: Large language models (LM) generate remarkably fluent text and can be efficiently adapted across NLP tasks. Measuring and guaranteeing the quality of generated text in terms of safety is imperative for deploying LMs in the real world; to this end, prior work often relies on automatic evaluation of LM toxicity. We critically discuss this approach, evaluate several toxicity mitigation strategies wit…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: 23 pages, 6 figures, published in Findings of EMNLP 2021

    ACM Class: I.2.6; I.2.7

  12. arXiv:2102.00529  [pdf, other]

    cs.CL cs.CV

    Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers

    Authors: Lisa Anne Hendricks, John Mellor, Rosalia Schneider, Jean-Baptiste Alayrac, Aida Nematzadeh

    Abstract: Recently, multimodal transformer models have gained popularity because their performance on language and vision tasks suggests they learn rich visual-linguistic representations. Focusing on zero-shot image retrieval tasks, we study three important factors which can impact the quality of learned representations: pretraining data, the attention mechanism, and loss functions. By pretraining models on s…

    Submitted 31 January, 2021; originally announced February 2021.

    Comments: pre-print of MIT Press Publication version

  13. arXiv:2006.04647  [pdf, other]

    cs.LG cs.CV stat.ML

    Neural Architecture Search without Training

    Authors: Joseph Mellor, Jack Turner, Amos Storkey, Elliot J. Crowley

    Abstract: The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be slow and expensive; they need to train vast numbers of candidate networks to inform the search process. This could be alleviated if we could partially predict a network's trained…

    Submitted 11 June, 2021; v1 submitted 8 June, 2020; originally announced June 2020.

    Comments: Accepted at ICML 2021 for a long presentation

  14. arXiv:2001.06105  [pdf, other]

    cs.LG stat.ML

    Better Boosting with Bandits for Online Learning

    Authors: Nikolaos Nikolaou, Joseph Mellor, Nikunj C. Oza, Gavin Brown

    Abstract: Probability estimates generated by boosting ensembles are poorly calibrated because of the margin maximization nature of the algorithm. The outputs of the ensemble need to be properly calibrated before they can be used as probability estimates. In this work, we demonstrate that online boosting is also prone to producing distorted probability estimates. In batch learning, calibration is achieved by…

    Submitted 16 January, 2020; originally announced January 2020.

    Comments: 44 pages, 6 figures

  15. arXiv:1910.01007  [pdf, other]

    cs.CV cs.LG stat.ML

    Unsupervised Doodling and Painting with Improved SPIRAL

    Authors: John F. J. Mellor, Eunbyung Park, Yaroslav Ganin, Igor Babuschkin, Tejas Kulkarni, Dan Rosenbaum, Andy Ballard, Theophane Weber, Oriol Vinyals, S. M. Ali Eslami

    Abstract: We investigate using reinforcement learning agents as generative models of images (extending arXiv:1804.01118). A generative agent controls a simulated painting environment, and is trained with rewards provided by a discriminator network simultaneously trained to assess the realism of the agent's samples, either unconditional or reconstructions. Compared to prior work, we make a number of improvem…

    Submitted 2 October, 2019; originally announced October 2019.

    Comments: See https://learning-to-paint.github.io for an interactive version of this paper, with videos

    ACM Class: I.2; I.4

  16. arXiv:1803.00316  [pdf, other]

    cs.LG stat.ML

    The K-Nearest Neighbour UCB algorithm for multi-armed bandits with covariates

    Authors: Henry WJ Reeve, Joe Mellor, Gavin Brown

    Abstract: In this paper we propose and explore the k-Nearest Neighbour UCB algorithm for multi-armed bandits with covariates. We focus on a setting where the covariates are supported on a metric space of low intrinsic dimension, such as a manifold embedded within a high dimensional ambient feature space. The algorithm is conceptually simple and straightforward to implement. The k-Nearest Neighbour UCB algor…

    Submitted 1 March, 2018; originally announced March 2018.

    Comments: To be presented at ALT 2018

    Journal ref: Algorithmic Learning Theory 2018

  17. arXiv:1302.3721  [pdf, other]

    cs.LG

    Thompson Sampling in Switching Environments with Bayesian Online Change Point Detection

    Authors: Joseph Mellor, Jonathan Shapiro

    Abstract: Thompson Sampling has recently been shown to be optimal in the Bernoulli Multi-Armed Bandit setting [Kaufmann et al., 2012]. This bandit problem assumes stationary distributions for the rewards. It is often unrealistic to model the real world as a stationary distribution. In this paper we derive and evaluate algorithms using Thompson Sampling for a Switching Multi-Armed Bandit Problem. We propose a…

    Submitted 15 February, 2013; originally announced February 2013.

    Comments: A version will appear in the Sixteenth international conference on Artificial Intelligence and Statistics (AIStats 2013)

  18. arXiv:0710.4636  [pdf]

    cs.AR

    Why Systems-on-Chip Needs More UML like a Hole in the Head

    Authors: Stephen J. Mellor, John R. Wolfe, Campbell Mccausland

    Abstract: Let's be clear from the outset: SoC can most certainly make use of UML; SoC just doesn't need more UML, or even all of it. The advent of model mappings, coupled with marks that indicate which mapping rule to apply, enable a major simplification of the use of UML in SoC.

    Submitted 25 October, 2007; originally announced October 2007.

    Comments: Submitted on behalf of EDAA (http://www.edaa.com/)

    Journal ref: In Design, Automation and Test in Europe - DATE'05, Munich, Germany (2005)