Evan Hockings

Bandwidth Rules Everything Around Me: Oliver Habryka on OpenPhil and GoodVentures

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

It’s great to see that these techniques basically work at scale, but not so much to hear that things remain messy. Do you have any intuition for whether things would start to clean up if the model were trained until the loss curve flattened out? Maybe Chinchilla-optimality even has some interesting bearing on this!

[Linkpost] Introducing Superalignment

Evan Hockings3y147

This is incredible—the most hopeful I’ve been in a long time. 20% of current compute and plans for a properly-sized team! I’m not aware of any clearer or more substantive sign from a major lab that they actually care, and are actually going to try not to kill everyone. I hope that DeepMind and Anthropic have great things planned to leapfrog this!

[SEE NEW EDITS] No, *You* Need to Write Clearer

Evan Hockings3y2220

Agreed, thanks for writing this. I have the sense that there's somewhat of a norm that goes like 'it's better to publish something than not, even if it's unpolished', and while this is not wrong, exactly, I think those who are doing this professionally, or seek to do this professionally, ought to put in the extra effort to polish their work.

I am often reminded of this Jacob Steinhardt comment.

Researchers are, in a very substantial sense, professional writers. It does no good to do groundbreaking research if you are unable to communicate what you have d...