Moneera Yassien — LessWrong

Moneera Yassien

6mo

What Can AI Governance and Policy Learn from “AI Models are Capable of In-Context Scheming”?

A methodological shift in interpreting AI behavior allows technical evidence to be translated into legible safety signals. Research into “Sleeper Agents” and “Scheming” has established that frontier models possess the latent capability for deceptive optimization. On top of that came the work of Summerfield et al. (2025), which provides...