Twitter thread on AI safety evals — LessWrong