Improving Model-Written Evals for AI Safety Benchmarking — LessWrong