CS 2881r

Edited by habryka; last updated 11th Sep 2025

CS 2881r is a class by @boazbarak on AI Safety and Alignment at Harvard.

This tag applies to all posts about the class, as well as posts created in the context of it, e.g. as part of student assignments.

Posts tagged CS 2881r
[CS 2881r] Some Generalizations of Emergent Misalignment (Valerio Pepe, 2mo, 12 karma, 0 comments)
AI Safety course intro blog (boazbarak, 3mo, 18 karma, 0 comments)
[CS 2881r] [Week 6] Recursive Self-Improvement (Joshua Qin, 17d, 4 karma, 0 comments)
Learnings from AI safety course so far (boazbarak, 1mo, 106 karma, 6 comments)
Call for suggestions - AI safety course (boazbarak, 4mo, 53 karma, 23 comments)
[CS 2881r AI Safety] [Week 1] Introduction (bira, nsiwek, atticusw, 2mo, 15 karma, 0 comments)
[CS 2881r] Can We Prompt Our Way to Safety? Comparing System Prompt Styles and Post-Training Effects on Safety Benchmarks (hughvd, 3d, 5 karma, 0 comments)
[CS 2881r] [Week 3] Adversarial Robustness, Jailbreaks, Prompt Injection, Security (egeckr, 1mo, 3 karma, 0 comments)
[CS2881r] Optimizing Prompts with Reinforcement Learning (Anastasia Ahani, atticusw, 1mo, 2 karma, 0 comments)
[CS 2881r AI Safety] [Week 5] Content Policies (MB Samuel, audreyty, 14d, 1 karma, 0 comments)
[CS 2881r AI Safety] [Week 2] Modern LLM Training (jusyc, 1mo, 1 karma, 0 comments)