ML models can perform a range of tasks and subtasks, some of which are more closely related to one another than others. In this post, we set out two very initial starting points. First, we motivate reverse engineering models’ task decompositions. We think this can be helpful for interpretability...
TL;DR: * Generalisation is a capability in its own right, and in our experiments it scales less quickly than specific capabilities. * Our experiments show that on a range of tasks, finetuning Babbage on task A transfers more to task B than finetuning Davinci on task A transfers to task...
TLDR: We’re running a small iteration of MLAB (~10 participants) in Oxford towards the end of September. If you’re interested in participating, apply here by 7 September. If you’re interested in being a TA, please email us directly at oxfordmlab@gmail.com Edit: The dates are now confirmed for the 23 September-...
(Note: John discusses similar ideas here. We drafted this before he published his post, so some of the concepts might jar if you read it with his framing in mind.) Traditionally, focus in Agent Foundations has been on the characterization of ideal agents, often in terms of coherence theorems...
We’re running a workshop (20–24 March) in Oxford with three goals. 1. Go in with only a vague idea; come out with a drafted academic essay that you could eventually publish. 2. Build tacit knowledge about the research/doing-good-things sphere. 3. Meet other people in that sphere, for peers...
Cross-posted from my new blog. Recently, I gave a few people some advice on how to write well. Below is that advice. What is it to write well? 1. Have a thought. 2. Get that thought from your head to someone else’s. What is a thought? In this community, a...