Zum Hauptinhalt springen

Showing 1–1 of 1 results for author: McKenzie, I R

Searching in archive cs. Search in all archives.
.
  1. arXiv:2306.09479  [pdf, other

    cs.CL cs.AI cs.CY

    Inverse Scaling: When Bigger Isn't Better

    Authors: Ian R. McKenzie, Alexander Lyzhov, Michael Pieler, Alicia Parrish, Aaron Mueller, Ameya Prabhu, Euan McLean, Aaron Kirtland, Alexis Ross, Alisa Liu, Andrew Gritsevskiy, Daniel Wurgaft, Derik Kauffman, Gabriel Recchia, Jiacheng Liu, Joe Cavanagh, Max Weiss, Sicong Huang, The Floating Droid, Tom Tseng, Tomasz Korbak, Xudong Shen, Yuhui Zhang, Zhengping Zhou, Najoung Kim , et al. (2 additional authors not shown)

    Abstract: Work on scaling laws has found that large language models (LMs) show predictable improvements to overall loss with increased scale (model size, training data, and compute). Here, we present evidence for the claim that LMs may show inverse scaling, or worse task performance with increased scale, e.g., due to flaws in the training objective and data. We present empirical evidence of inverse scaling… ▽ More

    Submitted 12 May, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: Published in TMLR (2023), 39 pages

    Journal ref: Transactions on Machine Learning Research (TMLR), 10/2023, https://openreview.net/forum?id=DwgRm72GQF