Outer Alignment is the Necessary Complement to AI 2027's Best Case Scenario
To the extent we believe more advanced training and control techniques will lead to aligned agents capable enough to strategically build successor agents -- and to solve inner alignment for those successors as a convergent instrumental goal -- we must also consider that inner alignment for successor systems can be...
Indeed. I am tempted to edit it to say "every guess" (instead of "every answer") to resolve this ambiguity, but I suspect it's well understood as-is?