Moebius314 — LessWrong

Multiple AIs in boxes, evaluating each other's alignment

Summary: Below I describe a variation of the classical AI box experiment in which two AIs are created in separate boxes, and each is asked to determine whether the other is aligned with the values of humanity. Several provisions of the experiment are designed to discourage the AIs from hiding a potential failure...

May 29, 2022