A Rocket–Interpretability Analogy
joetey · 1y · 20

I personally don't think that working on better capabilities and working on the alignment problem are two distinct problems. For example, if you're able to create better, more precise control mechanisms and intervention techniques, you can better align human & AI intent. Alignment feels as much like a technical, control-interface problem as merely a question of "what should we align to?"
