A Benchmark for Decision Theories — LessWrong