How might we solve the alignment problem?
Oct 30, 2024 by Joe Carlsmith

This is a four-part series of posts about how we might solve the alignment problem. It builds on my previous post, here, about what it would even be to solve the alignment problem, and to some extent on this post, which outlines a framework for thinking about the incentives at stake in AI power-seeking.

1. How might we solve the alignment problem? (Part 1: Intro, summary, ontology) — Joe Carlsmith
2. Motivation control — Joe Carlsmith
3. Option control — Joe Carlsmith
4. Incentive design and capability elicitation — Joe Carlsmith