Why Did My Model Do That? Model Forensics for Diagnosing LLM Misbehavior — LessWrong