This work was done as an experiment for Boaz Barak’s “CS 2881r: AI Safety and Alignment” at Harvard. The lecture where this work was presented can be viewed on YouTube here, and its corresponding blogpost can be found here.
Prompt engineering has become a central idea in working with large language models (LLMs) — the way a question or instruction is phrased can make a big difference in the quality of responses. In this toy experiment, we set out to see whether reinforcement learning (RL) could be a principled way of discovering prompt prefixes that condition models toward more accurate, aligned, or stylistically appropriate outputs.
Assume that we have a fixed set of prompt prefixes to choose from. Our idea was to model the problem of choosing the best prefix...
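With a fixed, finite set of prefixes, this selection problem can be framed as a multi-armed bandit. As a minimal sketch (the prefixes, the epsilon value, and the simulated binary reward below are all hypothetical stand-ins, not the actual experimental setup), an epsilon-greedy bandit over a prefix set might look like:

```python
import random

# Hypothetical fixed set of candidate prompt prefixes (illustrative only).
PREFIXES = [
    "Think step by step.",
    "Answer concisely.",
    "You are a careful assistant.",
]

def choose_arm(q_values, epsilon=0.1):
    """Epsilon-greedy: explore a random prefix with probability epsilon,
    otherwise pick the prefix with the highest estimated reward."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])

def update(q_values, counts, arm, reward):
    """Incremental running-mean update of the chosen arm's value estimate."""
    counts[arm] += 1
    q_values[arm] += (reward - q_values[arm]) / counts[arm]

random.seed(0)
q = [0.0] * len(PREFIXES)   # estimated reward per prefix
n = [0] * len(PREFIXES)     # pull counts per prefix
true_means = [0.3, 0.7, 0.5]  # simulated stand-in for graded response quality

for _ in range(2000):
    arm = choose_arm(q, epsilon=0.1)
    # In the real setting the reward would come from grading the model's
    # response under the chosen prefix; here it is a simulated coin flip.
    reward = 1.0 if random.random() < true_means[arm] else 0.0
    update(q, n, arm, reward)
```

Over many rounds the estimates in `q` concentrate on the prefix that yields the best graded responses, which is the basic intuition behind using RL for prefix discovery.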
Authors: Jay Chooi, Natalia Siwek, Atticus Wang
Lecture slides: link
Lecture video: link
Student experiment slides: link
Student experiment blogpost: Some Generalizations of Emergent Misalignment
This is the first in a series of blog posts on Boaz's AI Safety class. Each week, a group of students will write a blog post on the material covered that week.
Hi! We are a senior and a junior at Harvard (Jay, Natalia) and a senior at MIT (Atticus). Here's a bit about why we are taking Boaz's CS 2881r (AI Safety):
Jay: I’m a CS+Math major and I help run the AI Safety Student Team at Harvard. I chose to take this class because Boaz is an industry expert in AI safety and I like his refreshing takes on AI safety (see Machines of Faithful Obedience and Six Thoughts on AI Safety)....
What is the formal statement of the fact that logits add for perfectly calibrated and maximally independent predictions?
In figure 3, given that 64-shot Haiku does a lot worse than 64-shot Llama 405B-base, should I conclude that base models (without the assistant persona) are way better at generating realistic user prompts?