How might we solve the alignment problem?

Oct 30, 2024 by Joe Carlsmith

This is a four-part series of posts about how we might solve the alignment problem. It builds on my previous post, here, about what it would even be to solve the alignment problem; and, to some extent, on this post outlining a framework for thinking about the incentives at stake in AI power-seeking.

1. How might we solve the alignment problem? (Part 1: Intro, summary, ontology)
2. Motivation control
3. Option control
4. Incentive design and capability elicitation