
Formal Alignment

Dec 04, 2019 by Gordon Seidoh Worley

Alignment is typically defined loosely as "AI aligned with human intent, values, or preferences". This developing sequence of posts is part of an investigation into ways of stating alignment precisely enough that we can use mathematics to formally verify whether a proposed alignment mechanism would achieve alignment.

Formally Stating the AI Alignment Problem (Gordon Seidoh Worley, 7y)

Minimization of prediction error as a foundation for human values in AI alignment (Gordon Seidoh Worley, 6y)

Values, Valence, and Alignment (Gordon Seidoh Worley, 6y)

Towards deconfusing values (Gordon Seidoh Worley, 5y)

Deconfusing Human Values Research Agenda v1 (Gordon Seidoh Worley, 5y)