Showing 1–4 of 4 results for author: Villalobos, P

  1. arXiv:2312.07413  [pdf, other]

    cs.AI cs.LG

    AI capabilities can be significantly improved without expensive retraining

    Authors: Tom Davidson, Jean-Stanislas Denain, Pablo Villalobos, Guillem Bas

    Abstract: State-of-the-art AI systems can be significantly improved without expensive retraining via "post-training enhancements": techniques applied after initial training, such as fine-tuning the system to use a web browser. We review recent post-training enhancements, categorizing them into five types: tool-use, prompting methods, scaffolding, solution selection, and data generation. Different enhancements im…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: 30 pages, 24 figures

  2. arXiv:2211.04325  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.CY

    Will we run out of data? Limits of LLM scaling based on human-generated data

    Authors: Pablo Villalobos, Anson Ho, Jaime Sevilla, Tamay Besiroglu, Lennart Heim, Marius Hobbhahn

    Abstract: We investigate the potential constraints on LLM scaling posed by the availability of public human-generated text data. We forecast the growing demand for training data based on current trends and estimate the total stock of public human text data. Our findings indicate that if current LLM development trends continue, models will be trained on datasets roughly equal in size to the available stock o…

    Submitted 4 June, 2024; v1 submitted 25 October, 2022; originally announced November 2022.

  3. arXiv:2207.02852  [pdf, other]

    cs.LG cs.AI cs.CL cs.CY

    Machine Learning Model Sizes and the Parameter Gap

    Authors: Pablo Villalobos, Jaime Sevilla, Tamay Besiroglu, Lennart Heim, Anson Ho, Marius Hobbhahn

    Abstract: We study trends in model size of notable machine learning systems over time using a curated dataset. From 1950 to 2018, model size in language models increased steadily by seven orders of magnitude. The trend then accelerated, with model size increasing by another five orders of magnitude in just 4 years from 2018 to 2022. Vision models grew at a more constant pace, totaling 7 orders of magnitude…

    Submitted 5 July, 2022; originally announced July 2022.

  4. Compute Trends Across Three Eras of Machine Learning

    Authors: Jaime Sevilla, Lennart Heim, Anson Ho, Tamay Besiroglu, Marius Hobbhahn, Pablo Villalobos

    Abstract: Compute, data, and algorithmic advances are the three fundamental factors that guide the progress of modern Machine Learning (ML). In this paper, we study trends in the most readily quantified factor: compute. We show that before 2010 training compute grew in line with Moore's law, doubling roughly every 20 months. Since the advent of Deep Learning in the early 2010s, the scaling of training compu…

    Submitted 9 March, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

    Journal ref: 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy, 2022, pp. 1-8