LESSWRONG

Arthur Conmy's Shortform

by Arthur Conmy
1st Nov 2022
AI Alignment Forum

This is a special post for quick takes by Arthur Conmy. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.
Arthur Conmy, 3y


Has anyone reproduced double descent [https://openai.com/blog/deep-double-descent/] in transformers they trained themselves (or, better, in GPT-like transformers)? Since grokking can be partly understood through transformer interpretability [https://openreview.net/forum?id=9XFSbDPmdW], this seems like a possibly tractable direction.
