Introducing MASK: A Benchmark for Measuring Honesty in AI Systems — LessWrong