AI Safety via Generalization and Caution: A Research Agenda
This post is a condensed version of the full paper. You can also watch this talk for an overview of the conceptual arguments (although the talk focuses on only one of the five projects).
Suggested reading options:
1. Just the summary
2. Summary + "A generalization-based framing of AI safety" +...
Feb 17