Management of Substrate-Sensitive AI Capabilities (MoSSAIC) Part 0: Overture
Tunings/Preamble

Notes on the programme

What follows is an overview of the incoming sequence. It is an edited version of the research I have been developing over the past year or so. Most recent version here.

The initial position paper (co-authors Chris Pang, Sahil K) was presented at the Tokyo AI Safety Conference 2025. The developed position paper (co-authors Aditya Prasad, Aditya Adiga, Jayson Amati, Sahil K) has just been accepted for the Proceedings of Iliad and is now part of BlueDot's new Technical AI Safety Curriculum (Week 4: Understanding AI).

This research will eventually and fortuitously emphasize the need for Sahil's Live Theory Agenda as a solution. This is no accident. These ideas were initially his and I do not claim credit for their creation. They all form part of this broader agenda.

We do not claim that Live Theory is the only solution to the threat model we propose, though it is one we are excited about. We also highlight that much of the threat model applies more generally across the AI safety portfolio, and should be given urgent attention.

Acknowledgments

This work is an original take from Sahil, who I am attempting to precipitate into the ears of engineers. It owes an enormous debt to his consistent support and encouragement. I doubt I shall ever settle.

It has been developed in the growing community that goes by various names. Originally High Actuation Spaces, then Live Theory, then Autostructures and more recently, Groundless.

Chris Pang co-wrote the first version of this position paper with me and gave it its initial voice. More recently, Aditya Prasad has taken up co-authorship and increased the mech-interp surface area massively.

Many other people have freely offered their time and critiques, and have left significant impressions. Among these, direct contributions from Kola Ayonrinde, Daniel Tan, Eleni Angelou, Murray Buchanan, Vojtěch Kovařík, and others.

Themes

Part 1: Exposition

We'll argue that mechanistic interpretab
To be filled out with more detail