Grokking (ML)

Edited by Morpheus, last updated 29th Feb 2024

Grokking is a phenomenon in machine learning in which a model generalizes to the test set only long after it has reached near-perfect performance (near-zero loss) on the training set.
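To make the definition concrete, here is a minimal sketch of how one might measure the gap between fitting the training set and generalizing, given logged accuracy curves. The function names, the 0.99 threshold, and the toy curves are illustrative assumptions, not taken from any of the posts listed below.

```python
from typing import Optional, Sequence


def first_step_reaching(values: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first entry >= threshold, or None if never reached."""
    for step, value in enumerate(values):
        if value >= threshold:
            return step
    return None


def grokking_delay(
    train_acc: Sequence[float],
    test_acc: Sequence[float],
    threshold: float = 0.99,  # assumed cutoff for "near-perfect"
) -> Optional[int]:
    """Number of logged steps between train and test accuracy crossing the threshold.

    Returns None if either curve never reaches the threshold.
    """
    train_step = first_step_reaching(train_acc, threshold)
    test_step = first_step_reaching(test_acc, threshold)
    if train_step is None or test_step is None:
        return None
    return test_step - train_step


# Made-up curves: training accuracy saturates early, test accuracy much later.
train_acc = [0.2, 0.6, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
test_acc = [0.1, 0.1, 0.1, 0.1, 0.2, 0.3, 0.9, 1.0]

delay = grokking_delay(train_acc, test_acc)
print(delay)
```

A large positive delay relative to the time taken to fit the training set is the signature described above. In practice the curves are usually plotted on a logarithmic step axis, since the gap can span orders of magnitude.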

Posts tagged Grokking (ML)

A Mechanistic Interpretability Analysis of Grokking
Neel Nanda, Tom Lieberum · 4y · 378 karma · 48 comments

QAPR 5: grokking is maybe not *that* big a deal?
Quintin Pope · 3y · 116 karma · 15 comments

Explaining grokking through circuit efficiency
Vikrant Varma, Rohin Shah · 3y · 102 karma · 11 comments

Ambiguous out-of-distribution generalization on an algorithmic task
Wilson Wu, Louis Jaburi · 1y · 84 karma · 6 comments

Grokking, memorization, and generalization — a discussion
Kaarel, Dmitry Vaintrob · 3y · 75 karma · 11 comments

Paper+Summary: OMNIGROK: GROKKING BEYOND ALGORITHMIC DATA
Marius Hobbhahn · 4y · 46 karma · 11 comments

Mesa-Optimizers via Grokking
orthonormal · 4y · 36 karma · 4 comments

The slingshot helps with learning
Wilson Wu · 2y · 33 karma · 0 comments

An interactive introduction to grokking and mechanistic interpretability
Adam Pearce, Asma Ghandeharioun · 3y · 23 karma · 3 comments

A short project on Mamba: grokking & interpretability
Alejandro Tlaie · 2y · 21 karma · 0 comments

AXRP Episode 29 - Science of Deep Learning with Vikrant Varma
DanielFilan · 2y · 20 karma

1

A Simple Method for Accelerating Grokking
josh :) · 6mo · 14 karma

1

Grokking Beyond Neural Networks
Jack Miller · 3y · 10 karma · 0 comments

Minor interpretability exploration #1: Grokking of modular addition, subtraction, multiplication, for different activation functions
Rareș Baron · 1y · 5 karma · 13 comments