Our Experience Running Independent Evaluations on LLMs: What Have We Learned? — LessWrong