Benchmarking LLM Agents on Kaggle Competitions — LessWrong