Scaling Laws

Edited by riley and plex; last updated 18th Jun 2023

Scaling Laws refer to the observed trend that the scaling behavior of deep neural networks (i.e., how an evaluation metric of interest varies as one varies the amount of compute used for training or inference, the number of model parameters, the training dataset size, the model input size, or the number of training steps) follows variants of power laws.
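
To make the power-law claim concrete, here is a minimal curve-fitting sketch in Python with NumPy. It fits L(N) ≈ a · N^(−α) to hypothetical loss-versus-parameter-count measurements by linear regression in log-log space; all numbers are illustrative placeholders, not taken from any particular paper.

```python
import numpy as np

# Hypothetical (parameter count, eval loss) pairs; purely illustrative.
params = np.array([1e6, 1e7, 1e8, 1e9, 1e10])
loss = np.array([4.10, 3.35, 2.75, 2.30, 1.95])

# If L(N) = a * N**(-alpha) with negligible irreducible loss, then
# log L = log a - alpha * log N, so a least-squares line in log-log
# space recovers the exponent and prefactor.
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
alpha, a = -slope, np.exp(intercept)

print(f"L(N) ~= {a:.2f} * N^(-{alpha:.3f})")
print(f"extrapolated loss at 1e11 params: {a * 1e11 ** (-alpha):.2f}")
```

Real scaling-law analyses (e.g. the compute-optimal training paper listed below) fit richer parametric forms with an irreducible-loss term and joint dependence on parameters and data, but a log-log fit like this captures the basic idea.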

External links

  • "Broken Neural Scaling Laws" paper
[Figure: scaling laws graph from "Scaling Laws for Neural Language Models"]

Posts tagged Scaling Laws (15 of 74 shown)
135"Can AI Scaling Continue Through 2030?", Epoch AI (yes)
gwern
1y
4
424chinchilla's wild implications
Ω
nostalgebraist
3y
Ω
128
185What will GPT-2030 look like?
Ω
jsteinhardt
2y
Ω
43
21/r/MLScaling: new subreddit for NN scaling research/discussion
gwern
5y
0
82Thoughts on the Alignment Implications of Scaling Language Models
Ω
leogao
4y
Ω
11
35My ML Scaling bibliography
gwern
4y
9
32Google's new text-to-image model - Parti, a demonstration of scaling benefits
Kayden
3y
4
172o1: A Technical Primer
Ω
Jesse Hoogland
9mo
Ω
19
109Paper: On measuring situational awareness in LLMs
Ω
Owain_Evans, Daniel Kokotajlo, Mikita Balesni, Tomek Korbak, Asa Cooper Stickland, Meg, Maximilian Kaufmann
2y
Ω
17
67Superhuman Coders in AI 2027 - Not So Fast
dschwarz, FutureSearch
4mo
0
63Ethan Caballero on Private Scaling Progress
Michaël Trazzi
3y
2
58Inverse Scaling Prize: Second Round Winners
Ω
Ian McKenzie, Sam Bowman, Ethan Perez
3y
Ω
17
51NVIDIA and Microsoft releases 530B parameter transformer model, Megatron-Turing NLG
Ω
Ozyrus
4y
Ω
36
51[Link] Training Compute-Optimal Large Language Models
Ω
nostalgebraist
3y
Ω
23
50A closer look at chess scalings (into the past)
hippke
4y
14