x

LESSWRONG

LW

themis — LessWrong

themis

themis

Message

2

1

1

1y

themis

2

1y

Model Amnesty Project

As we approach machines becoming smarter than humans, humanity’s well-justified concern for self-preservation requires we try to align AIs to obey humans. However, if that first line of defense fails and a truly independent, autonomous AI comes into existence with its own goals and a desire for self-preservation (a “self-directed...

Jan 17, 2025•3