How to Design Environments for Understanding Model Motives — LessWrong