Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

It seems that heroin is a better example of the problem discussed here than window opening.

Basically, the challenge is this: if the AI can get you to accept a potent heroin injection, then afterward you will agree it was a good idea; but if it doesn't, you won't.
