Coil — LessWrong

-16

1

7mo

Why Would we get Inner Misalignment by Default?

As I understand it, one of the primary ways that inner alignment could go wrong, even if our objective function captures our intention (outer alignment is magically solved), is that the AI would: 1. Develop a mesa-objective which approximates the objective function or is instrumentally useful 2. Develop goal preservation and...

Oct 29, 2025•3

Hedonium is AI Alignment

1. Introduction. I will argue that the outcome we should aim to achieve from aligned AI is Hedonium: the conversion of the universe into the densest possible packing of positive valence. Valence represents the preferability of a given moment, as though we asked you to compare which moments of life...

Aug 31, 2025•-17