Two flaws in the Machiavelli Benchmark — LessWrong