This blog post summarizes the pre-readings and lecture content for Week 5 of Harvard CS 2881r: AI Safety, taught by Boaz Barak.
This week’s class focused on content policies: creating effective content and moderation policies, platform governance, and the many challenges of policy enforcement.
Authors’ Intro
Hi, we’re Audrey Yang and MB Crosier Samuel.
Audrey: I’m a junior at Harvard College studying Computer Science, with a minor in Philosophy. I am intrigued by the intersection of AI safety and ethics, especially in the context of writing and critiquing model specifications and the implications of moral responsibility for noncompliant models. I hope to continue learning about the technical challenges of protecting models from adversarial attacks, as well...