Grokking (machine learning)
Revision as of 11:36, 4 June 2024
In machine learning, grokking is a neologism describing a transition from memorization to generalization that occurs many training iterations after the interpolation threshold, following a long period of seemingly little progress.[1][2]
The term derives from the word grok, coined by Robert A. Heinlein in his 1961 novel Stranger in a Strange Land.
Grokking can be understood as a phase transition occurring during the training process.[3] Although grokking was initially thought of as a phenomenon of relatively shallow models, it has since been observed in deep models and remains the subject of active research.[4]
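Grokking is typically studied on small algorithmic datasets such as modular addition, where a model first fits the training split perfectly and only much later reaches comparable validation accuracy.[2] The following is a minimal illustrative sketch, not taken from any of the cited papers: it builds a modular-addition dataset of the kind used by Power et al. and measures the gap, in training steps, between fitting the training set and generalizing. The function names and synthetic accuracy curves are assumptions for illustration only.

```python
import numpy as np

def modular_addition_dataset(p=97, train_frac=0.5, seed=0):
    """Build a small algorithmic dataset: inputs (a, b) with a, b < p,
    label (a + b) mod p, randomly split into train/validation halves."""
    pairs = np.array([(a, b) for a in range(p) for b in range(p)])
    labels = (pairs[:, 0] + pairs[:, 1]) % p
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(pairs))
    n_train = int(train_frac * len(pairs))
    train, val = idx[:n_train], idx[n_train:]
    return (pairs[train], labels[train]), (pairs[val], labels[val])

def grokking_delay(train_acc, val_acc, threshold=0.99):
    """Steps between the model fitting the training set (memorization)
    and reaching the same accuracy threshold on validation
    (generalization). A large positive delay is the grokking signature.
    Assumes both curves eventually cross the threshold."""
    t_fit = int(np.argmax(np.asarray(train_acc) >= threshold))
    t_gen = int(np.argmax(np.asarray(val_acc) >= threshold))
    return t_gen - t_fit
```

For example, a run whose training accuracy saturates at step 1 while validation accuracy only saturates at step 3 has a grokking delay of 2; in reported grokking experiments this gap can span many thousands of optimization steps.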
References
- ^ Pearce, Adam; Ghandeharioun, Asma; Hussein, Nada; Thain, Nithum; Wattenberg, Martin; Dixon, Lucas (August 2023). "Do Machine Learning Models Memorize or Generalize?". pair.withgoogle.com. Retrieved 2024-06-04.
- ^ Power, Alethea; Burda, Yuri; Edwards, Harri; Babuschkin, Igor; Misra, Vedant (2022-01-06). Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets. arXiv:2201.02177. Retrieved 2024-06-04.
- ^ Liu, Ziming; Kitouni, Ouail; Nolte, Niklas; Michaud, Eric J.; Tegmark, Max; Williams, Mike (2022-10-14). Towards Understanding Grokking: An Effective Theory of Representation Learning. arXiv:2205.10343. Retrieved 2024-06-04.
- ^ Fan, Simin; Pascanu, Razvan; Jaggi, Martin (2024-05-29). Deep Grokking: Would Deep Neural Networks Generalize Better?. arXiv:2405.19454. Retrieved 2024-06-04.