Estimating efficiency improvements in LLM pre-training
TL;DR: GPT-3 was trained on 10,000 NVIDIA V100s for 14.8 days. Based on my - very uncertain - estimates, OpenAI should now be able to train a comparably capable model on an equivalent cluster in under an hour. This implies a drop in hardware requirements of more than 400x. Factoring out...
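As a quick sanity check on the headline numbers (a sketch using only the figures stated above, not OpenAI's actual data), 14.8 days on the same cluster is 355.2 hours, so a speedup of more than 400x would mean training in under roughly 0.9 hours:

```python
# Back-of-the-envelope check of the claimed speedup, using the figures above.
gpt3_days = 14.8                      # reported GPT-3 training time on 10,000 V100s
gpt3_hours = gpt3_days * 24           # 355.2 hours
claimed_speedup = 400                 # the ">400x" figure
implied_hours = gpt3_hours / claimed_speedup

print(f"{gpt3_hours:.1f} h originally; >400x implies under {implied_hours:.2f} h")
# → 355.2 h originally; >400x implies under 0.89 h
```

This is consistent with the "under an hour" claim: any training time below about 53 minutes on an equivalent cluster yields a speedup above 400x.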
Jan 19, 2024