Validator models: A simple approach to detecting goodharting — LessWrong