This is a special post for quick takes by Arthur Conmy. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

Has anyone done any reproduction of double descent [] on the transformers they train (or better, on GPT-like transformers)? Since grokking can be somewhat understood via transformer interpretability [], this seems like a possibly tractable direction.