Introduction and summary
This retrospective focuses on the 4-month MATS extension phase (referred to as "MATS 5.1") that ran from April 1 to July 25, 2024, and presents findings gathered from an end-of-extension survey as well as follow-up interviews and surveys ~5 months after the program.
Main changes from the 4.1 to 5.1 extension phase:
- Cohort grew from 26 to 36 scholars, split across London, Berkeley, and remote participants;
- MATS formalized research management for the London cohort and grew the team to 2 FTEs;
- The cohort visited Google DeepMind's London offices;
- The London team organized Tuesday lightning talks from scholars and MATS staff.
Key takeaways from MATS extension impact:
- Research success: 75% of scholars published results in some form (paper, LW/AF post, or codebase), and 57% of those publications were accepted to a conference.
- Career transitions:
- 61% of scholars are
...
I think this was good, but laying out the basic arguments for convergent instrumental goals is a foundational part of introducing the topic to a new audience (which I expect describes most of Lex's listeners), and it wasn't sufficiently explained.
Making it clear that most innocuous goals beget resource acquisition and self-preservation, which is what puts an agentic AI in conflict with humans by default, is what makes the concern concrete for many people. Otherwise there is a tendency to assume that some leap is required for a system to come into conflict, which is much harder to picture and seems intuitively more preventable.