The sum of its parts: composing AI control protocols
This work was supported through the MARS (Mentorship for Alignment Research Students) program at the Cambridge AI Safety Hub (caish.org/mars). We would like to thank Redwood Research for their support and Andrés Cotton for his management of the project.

TLDR

The field of AI control takes a worst-case scenario approach,...