I think the most general response to your first three points would look something like this: Any superintelligence that achieves human values will be adjacent in design space to many superintelligences that cause massive suffering, so it's quite likely that the wrong superintelligence will win, due to human error, malice, or arms races.
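
To make "adjacent in design space" concrete, here is a minimal sketch, assuming a toy one-line value function and treating the text of the specification itself as the design space (everything below is illustrative, not anything from the original post):

```python
# Toy illustration only: a scalar "value function" stands in for a goal
# specification, and the text of that specification stands in for design space.

def intended_value(outcome: float) -> float:
    """Correctly specified toy goal: higher outcomes are better."""
    return +1.0 * outcome

def corrupted_value(outcome: float) -> float:
    """One-character error in the line above ('+' flipped to '-')."""
    return -1.0 * outcome

outcomes = [-10.0, 0.0, 10.0]

# The two specifications are adjacent in design space (they differ by a single
# character), yet the outcomes they select for are as far apart as possible.
print(max(outcomes, key=intended_value))   # 10.0  (the intended optimum)
print(max(outcomes, key=corrupted_value))  # -10.0 (what the near-miss optimizes for)
```

The closeness here is over specifications (one mis-signed term, one bad bit, one careless patch), not over how good the resulting outcomes are.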

As to your last point, it looks more like a research problem than a counterargument, and I'd be very interested in any progress on that front :-)

> Any superintelligence that achieves human values will be adjacent in design space to many superintelligences that cause massive suffering

Why so? Flipping the sign doesn't get you "adjacent"; it gets you "diametrically opposed".

If you really want chocolate ice cream, "adjacent" would be getting strawberry ice cream, not having ghost pepper extract poured into your mouth.
