Trustworthy and untrustworthy models — LessWrong