Seeking Feedback: Toy Model of Deceptive Alignment (Game Theory)