[CS2881r][Week 8] When Agents Prefer Hacking To Failure: Evaluating Misalignment Under Pressure
by Joseph Bejjani, Itamar Rocha Filho, Haichuan Wang, Zidi Xiong