Skip to main content

Showing 1–20 of 20 results for author: Welbl, J

Searching in archive cs. Search in all archives.
.
  1. arXiv:2311.18260  [pdf, other

    eess.IV cs.CL cs.CV cs.LG

    Consensus, dissensus and synergy between clinicians and specialist foundation models in radiology report generation

    Authors: Ryutaro Tanno, David G. T. Barrett, Andrew Sellergren, Sumedh Ghaisas, Sumanth Dathathri, Abigail See, Johannes Welbl, Karan Singhal, Shekoofeh Azizi, Tao Tu, Mike Schaekermann, Rhys May, Roy Lee, SiWai Man, Zahra Ahmed, Sara Mahdavi, Yossi Matias, Joelle Barral, Ali Eslami, Danielle Belgrave, Vivek Natarajan, Shravya Shetty, Pushmeet Kohli, Po-Sen Huang, Alan Karthikesalingam , et al. (1 additional authors not shown)

    Abstract: Radiology reports are an instrumental part of modern medicine, informing key clinical decisions such as diagnosis and treatment. The worldwide shortage of radiologists, however, restricts access to expert care and imposes heavy workloads, contributing to avoidable errors and delays in report delivery. While recent progress in automated report generation with vision-language models offer clear pote… ▽ More

    Submitted 20 December, 2023; v1 submitted 30 November, 2023; originally announced November 2023.

  2. arXiv:2206.08325  [pdf, ps, other

    cs.CL cs.AI cs.CY

    Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

    Authors: Maribeth Rauh, John Mellor, Jonathan Uesato, Po-Sen Huang, Johannes Welbl, Laura Weidinger, Sumanth Dathathri, Amelia Glaese, Geoffrey Irving, Iason Gabriel, William Isaac, Lisa Anne Hendricks

    Abstract: Large language models produce human-like text that drive a growing number of applications. However, recent literature and, increasingly, real world observations, have demonstrated that these models can generate language that is toxic, biased, untruthful or otherwise harmful. Though work to evaluate language model harms is under way, translating foresight about which harms may arise into rigorous b… ▽ More

    Submitted 28 October, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: Accepted to NeurIPS 2022 Datasets and Benchmarks Track; 10 pages plus appendix

  3. arXiv:2203.15556  [pdf, other

    cs.CL cs.LG

    Training Compute-Optimal Large Language Models

    Authors: Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, Tom Hennigan, Eric Noland, Katie Millican, George van den Driessche, Bogdan Damoc, Aurelia Guy, Simon Osindero, Karen Simonyan, Erich Elsen, Jack W. Rae, Oriol Vinyals, Laurent Sifre

    Abstract: We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget. We find that current large language models are significantly undertrained, a consequence of the recent focus on scaling language models whilst keeping the amount of training data constant. By training over 400 language models ranging from 70 million to over 16 billion… ▽ More

    Submitted 29 March, 2022; originally announced March 2022.

  4. arXiv:2203.07814  [pdf, other

    cs.PL cs.AI cs.LG

    Competition-Level Code Generation with AlphaCode

    Authors: Yujia Li, David Choi, Junyoung Chung, Nate Kushman, Julian Schrittwieser, Rémi Leblond, Tom Eccles, James Keeling, Felix Gimeno, Agustin Dal Lago, Thomas Hubert, Peter Choy, Cyprien de Masson d'Autume, Igor Babuschkin, Xinyun Chen, Po-Sen Huang, Johannes Welbl, Sven Gowal, Alexey Cherepanov, James Molloy, Daniel J. Mankowitz, Esme Sutherland Robson, Pushmeet Kohli, Nando de Freitas, Koray Kavukcuoglu , et al. (1 additional authors not shown)

    Abstract: Programming is a powerful and ubiquitous problem-solving tool. Developing systems that can assist programmers or even generate programs independently could make programming more productive and accessible, yet so far incorporating innovations in AI has proven challenging. Recent large-scale language models have demonstrated an impressive ability to generate code, and are now able to complete simple… ▽ More

    Submitted 8 February, 2022; originally announced March 2022.

    Comments: 74 pages

  5. arXiv:2112.11446  [pdf, other

    cs.CL cs.AI

    Scaling Language Models: Methods, Analysis & Insights from Training Gopher

    Authors: Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor , et al. (55 additional authors not shown)

    Abstract: Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter model called Gop… ▽ More

    Submitted 21 January, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: 120 pages

  6. arXiv:2109.07445  [pdf, other

    cs.CL cs.AI cs.CY cs.LG

    Challenges in Detoxifying Language Models

    Authors: Johannes Welbl, Amelia Glaese, Jonathan Uesato, Sumanth Dathathri, John Mellor, Lisa Anne Hendricks, Kirsty Anderson, Pushmeet Kohli, Ben Coppin, Po-Sen Huang

    Abstract: Large language models (LM) generate remarkably fluent text and can be efficiently adapted across NLP tasks. Measuring and guaranteeing the quality of generated text in terms of safety is imperative for deploying LMs in the real world; to this end, prior work often relies on automatic evaluation of LM toxicity. We critically discuss this approach, evaluate several toxicity mitigation strategies wit… ▽ More

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: 23 pages, 6 figures, published in Findings of EMNLP 2021

    ACM Class: I.2.6; I.2.7

  7. arXiv:2007.05367  [pdf, other

    cs.AI

    Evaluating the Apperception Engine

    Authors: Richard Evans, Jose Hernandez-Orallo, Johannes Welbl, Pushmeet Kohli, Marek Sergot

    Abstract: The Apperception Engine is an unsupervised learning system. Given a sequence of sensory inputs, it constructs a symbolic causal theory that both explains the sensory sequence and also satisfies a set of unity conditions. The unity conditions insist that the constituents of the theory - objects, properties, and laws - must be integrated into a coherent whole. Once a theory has been constructed, it… ▽ More

    Submitted 9 July, 2020; originally announced July 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1910.02227

  8. arXiv:2003.04808  [pdf, other

    cs.CL cs.LG stat.ML

    Undersensitivity in Neural Reading Comprehension

    Authors: Johannes Welbl, Pasquale Minervini, Max Bartolo, Pontus Stenetorp, Sebastian Riedel

    Abstract: Current reading comprehension models generalise well to in-distribution test sets, yet perform poorly on adversarially selected inputs. Most prior work on adversarial inputs studies oversensitivity: semantically invariant text perturbations that cause a model's prediction to change when it should not. In this work we focus on the complementary problem: excessive prediction undersensitivity, where… ▽ More

    Submitted 15 February, 2020; originally announced March 2020.

    Comments: 15 pages, 4 figures

  9. Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension

    Authors: Max Bartolo, Alastair Roberts, Johannes Welbl, Sebastian Riedel, Pontus Stenetorp

    Abstract: Innovations in annotation methodology have been a catalyst for Reading Comprehension (RC) datasets and models. One recent trend to challenge current RC models is to involve a model in the annotation process: humans create questions adversarially, such that the model fails to answer them correctly. In this work we investigate this annotation methodology and apply it in three different settings, col… ▽ More

    Submitted 22 September, 2020; v1 submitted 1 February, 2020; originally announced February 2020.

    Journal ref: Transactions of the Association for Computational Linguistics, Volume 8, 2020 p.662-678

  10. arXiv:1911.03064  [pdf, other

    cs.CL cs.CY cs.LG

    Reducing Sentiment Bias in Language Models via Counterfactual Evaluation

    Authors: Po-Sen Huang, Huan Zhang, Ray Jiang, Robert Stanforth, Johannes Welbl, Jack Rae, Vishal Maini, Dani Yogatama, Pushmeet Kohli

    Abstract: Advances in language modeling architectures and the availability of large text corpora have driven progress in automatic text generation. While this results in models capable of generating coherent texts, it also prompts models to internalize social biases present in the training corpus. This paper aims to quantify and reduce a particular type of bias exhibited by language models: bias in the sent… ▽ More

    Submitted 8 October, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: Accepted in the Findings of EMNLP, 2020

  11. arXiv:1910.02227  [pdf, other

    cs.AI

    Making sense of sensory input

    Authors: Richard Evans, Jose Hernandez-Orallo, Johannes Welbl, Pushmeet Kohli, Marek Sergot

    Abstract: This paper attempts to answer a central question in unsupervised learning: what does it mean to "make sense" of a sensory sequence? In our formalization, making sense involves constructing a symbolic causal theory that both explains the sensory sequence and also satisfies a set of unity conditions. The unity conditions insist that the constituents of the causal theory -- objects, properties, and l… ▽ More

    Submitted 13 July, 2020; v1 submitted 5 October, 2019; originally announced October 2019.

  12. arXiv:1909.01492  [pdf, other

    cs.CL cs.CR cs.LG stat.ML

    Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation

    Authors: Po-Sen Huang, Robert Stanforth, Johannes Welbl, Chris Dyer, Dani Yogatama, Sven Gowal, Krishnamurthy Dvijotham, Pushmeet Kohli

    Abstract: Neural networks are part of many contemporary NLP systems, yet their empirical successes come at the price of vulnerability to adversarial attacks. Previous work has used adversarial training and data augmentation to partially mitigate such brittleness, but these are unlikely to find worst-case adversaries due to the complexity of the search space arising from discrete text perturbations. In this… ▽ More

    Submitted 20 December, 2019; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  13. arXiv:1806.08727  [pdf, ps, other

    cs.CL cs.LG stat.ML

    Jack the Reader - A Machine Reading Framework

    Authors: Dirk Weissenborn, Pasquale Minervini, Tim Dettmers, Isabelle Augenstein, Johannes Welbl, Tim Rocktäschel, Matko Bošnjak, Jeff Mitchell, Thomas Demeester, Pontus Stenetorp, Sebastian Riedel

    Abstract: Many Machine Reading and Natural Language Understanding tasks require reading supporting text in order to answer questions. For example, in Question Answering, the supporting text can be newswire or Wikipedia articles; in Natural Language Inference, premises can be seen as the supporting text and hypotheses as questions. Providing a set of useful primitives operating in a single framework of relat… ▽ More

    Submitted 19 June, 2018; originally announced June 2018.

    Comments: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2018), System Demonstrations

  14. arXiv:1710.06481  [pdf, other

    cs.CL cs.AI

    Constructing Datasets for Multi-hop Reading Comprehension Across Documents

    Authors: Johannes Welbl, Pontus Stenetorp, Sebastian Riedel

    Abstract: Most Reading Comprehension methods limit themselves to queries which can be answered using a single sentence, paragraph, or document. Enabling models to combine disjoint pieces of textual evidence would extend the scope of machine comprehension methods, but currently there exist no resources to train and test this capability. We propose a novel task to encourage the development of models for text… ▽ More

    Submitted 11 June, 2018; v1 submitted 17 October, 2017; originally announced October 2017.

    Comments: This paper directly corresponds to the TACL version (https://transacl.org/ojs/index.php/tacl/article/view/1325) apart from minor changes in wording, additional footnotes, and appendices

    Journal ref: Transactions of the Association for Computational Linguistics (TACL), Vol 6 (2018), pages 287-302

  15. arXiv:1707.06209  [pdf, other

    cs.HC cs.AI cs.CL stat.ML

    Crowdsourcing Multiple Choice Science Questions

    Authors: Johannes Welbl, Nelson F. Liu, Matt Gardner

    Abstract: We present a novel method for obtaining high-quality, domain-targeted multiple choice questions from crowd workers. Generating these questions can be difficult without trading away originality, relevance or diversity in the answer options. Our method addresses these problems by leveraging a large corpus of domain-specific text and a small set of existing questions. It produces model suggestions fo… ▽ More

    Submitted 19 July, 2017; originally announced July 2017.

    Comments: accepted for the Workshop on Noisy User-generated Text (W-NUT) 2017

  16. arXiv:1702.06879  [pdf, other

    cs.AI cs.LG math.SP stat.ML

    Knowledge Graph Completion via Complex Tensor Factorization

    Authors: Théo Trouillon, Christopher R. Dance, Johannes Welbl, Sebastian Riedel, Éric Gaussier, Guillaume Bouchard

    Abstract: In statistical relational learning, knowledge graph completion deals with automatically understanding the structure of large knowledge graphs---labeled directed graphs---and predicting missing relationships---labeled edges. State-of-the-art embedding models propose different trade-offs between modeling expressiveness, and time and space complexity. We reconcile both expressiveness and complexity t… ▽ More

    Submitted 26 November, 2017; v1 submitted 22 February, 2017; originally announced February 2017.

    Comments: 38 pages, accepted in JMLR. This is an extended version of the article "Complex embeddings for simple link prediction" (ICML 2016)

  17. arXiv:1702.04521  [pdf, other

    cs.CL cs.AI cs.LG cs.NE

    Frustratingly Short Attention Spans in Neural Language Modeling

    Authors: Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel

    Abstract: Neural language models predict the next token using a latent representation of the immediate token history. Recently, various methods for augmenting neural language models with an attention mechanism over a differentiable memory have been proposed. For predicting the next token, these models query information from a memory of the recent history which can facilitate learning mid- and long-range dep… ▽ More

    Submitted 15 February, 2017; originally announced February 2017.

    Comments: Published as a conference paper at ICLR 2017

  18. arXiv:1606.06357  [pdf, other

    cs.AI cs.LG stat.ML

    Complex Embeddings for Simple Link Prediction

    Authors: Théo Trouillon, Johannes Welbl, Sebastian Riedel, Éric Gaussier, Guillaume Bouchard

    Abstract: In statistical relational learning, the link prediction problem is key to automatically understand the structure of large knowledge bases. As in previous studies, we propose to solve this problem through latent factorization. However, here we make use of complex valued embeddings. The composition of complex embeddings can handle a large variety of binary relations, among them symmetric and antisym… ▽ More

    Submitted 20 June, 2016; originally announced June 2016.

    Comments: 10+2 pages, accepted at ICML 2016

  19. arXiv:1604.07143  [pdf, other

    stat.ML cs.LG math.ST

    Neural Random Forests

    Authors: Gérard Biau, Erwan Scornet, Johannes Welbl

    Abstract: Given an ensemble of randomized regression trees, it is possible to restructure them as a collection of multilayered neural networks with particular connection weights. Following this principle, we reformulate the random forest method of Breiman (2001) into a neural network setting, and in turn propose two new hybrid procedures that we call neural random forests. Both predictors exploit prior know… ▽ More

    Submitted 3 April, 2018; v1 submitted 25 April, 2016; originally announced April 2016.

  20. arXiv:1604.05878  [pdf, other

    cs.CL cs.AI cs.NE stat.ML

    A Factorization Machine Framework for Testing Bigram Embeddings in Knowledgebase Completion

    Authors: Johannes Welbl, Guillaume Bouchard, Sebastian Riedel

    Abstract: Embedding-based Knowledge Base Completion models have so far mostly combined distributed representations of individual entities or relations to compute truth scores of missing links. Facts can however also be represented using pairwise embeddings, i.e. embeddings for pairs of entities and relations. In this paper we explore such bigram embeddings with a flexible Factorization Machine model and sev… ▽ More

    Submitted 20 April, 2016; originally announced April 2016.

    Comments: accepted for AKBC 2016 workshop, 6pages