List of commonly used benchmarks for LLMs — LessWrong