Draft:Grokking (machine learning)

From Wikipedia, the free encyclopedia

Revision as of 11:36, 4 June 2024

In machine learning, grokking is a neologism describing a delayed transition to generalization: after a model has already fit its training data (the interpolation threshold), test performance remains poor through many further training iterations of seemingly little progress, then abruptly improves.[1][2]

The term derives from the word grok, coined by Robert A. Heinlein in his 1961 novel Stranger in a Strange Land.

Grokking can be understood as a phase transition during the training process.[3] While grokking was initially thought to be largely a phenomenon of relatively shallow models, it has since been observed in deep neural networks and remains the subject of active research.[4]
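The delayed generalization described above can be quantified as the gap between the step at which training accuracy saturates and the much later step at which test accuracy does. The following sketch is illustrative only: the accuracy curves are synthetic stand-ins shaped like those reported in grokking experiments, not data from the cited papers, and the helper name and thresholds are arbitrary choices.

```python
# Illustrative sketch: measuring the "grokking delay" between memorization
# and generalization from (synthetic) accuracy curves.

def first_step_above(curve, threshold):
    """Return the index of the first step where the curve reaches threshold."""
    for step, value in enumerate(curve):
        if value >= threshold:
            return step
    return None

steps = 100
# Hypothetical train accuracy: saturates at 1.0 by step 10 (memorization).
train_acc = [min(1.0, s / 10) for s in range(steps)]
# Hypothetical test accuracy: near chance until a sharp rise around step 80.
test_acc = [0.05 if s < 80 else min(1.0, 0.05 + (s - 80) / 5) for s in range(steps)]

memorize_step = first_step_above(train_acc, 0.99)    # interpolation threshold
generalize_step = first_step_above(test_acc, 0.99)   # delayed generalization
grokking_delay = generalize_step - memorize_step
print(memorize_step, generalize_step, grokking_delay)  # → 10 85 75
```

In actual experiments (e.g., small transformers trained on modular arithmetic), this delay can span orders of magnitude more optimization steps than memorization itself.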

References

  1. ^ Pearce, Adam; Ghandeharioun, Asma; Hussein, Nada; Thain, Nithum; Wattenberg, Martin; Dixon, Lucas (August 2023). "Do Machine Learning Models Memorize or Generalize?". pair.withgoogle.com. Retrieved 2024-06-04.
  2. ^ Power, Alethea; Burda, Yuri; Edwards, Harri; Babuschkin, Igor; Misra, Vedant (2022-01-06), Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets, arXiv:2201.02177, retrieved 2024-06-04
  3. ^ Liu, Ziming; Kitouni, Ouail; Nolte, Niklas; Michaud, Eric J.; Tegmark, Max; Williams, Mike (2022-10-14), Towards Understanding Grokking: An Effective Theory of Representation Learning, arXiv:2205.10343, retrieved 2024-06-04
  4. ^ Fan, Simin; Pascanu, Razvan; Jaggi, Martin (2024-05-29), Deep Grokking: Would Deep Neural Networks Generalize Better?, arXiv:2405.19454, retrieved 2024-06-04
