OpenAI: Detecting misbehavior in frontier reasoning models — LessWrong