This research was completed for LASR Labs 2025 by Nelson Gardner-Challis, Jonathan Bostock, Georgiy Kozhevnikov and Morgan Sinclaire. The team was supervised by Joan Velja and Charlie Griffin (University of Oxford, UK AI Security Institute).
The full paper can be found here.
We did a deep dive into untrusted monitoring, and explored several new collusion strategies:
Overall, our key finding is that arguing for the safety of an untrusted monitoring deployment requires resolving several important open problems that...