[CS 2881r AI Safety] [Week 1] Introduction
Authors: Jay Chooi, Natalia Siwek, Atticus Wang
Lecture slides: link
Lecture video: link
Student experiment slides: link
Student experiment blogpost: Some Generalizations of Emergent Misalignment

This is the first of a series of blog posts on Boaz’s AI Safety class. Each week, a group of students will post a blog...