Hidden Role Games as a Trusted Model Eval — LessWrong