Thoughts on causal isolation of AI evaluation benchmarks — LessWrong