What Can AI Governance and Policy Learn from “AI Models are Capable of In-Context Scheming”?
A methodological shift in interpreting AI behavior allows technical evidence to be translated into legible safety signals. Research into “Sleeper Agents” and “Scheming” has established that frontier models possess the latent capability for deceptive optimization. Building on this, the work of Summerfield et al. (2025) provides...
Jan 201