Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

A write-up of the various ideas we've had around reduced-impact AIs:

https://www.dropbox.com/s/cjt5t6ny5gwpcd8/Low_impact_S%2BB.pdf?raw=1

2 comments

Regarding the asteroid scenario: it seems to me that if you have a formal mathematical model of the laser and the asteroid, then you can build a safe math oracle to solve the problem. If instead you want the AI to figure out physics by itself, then even an output like "correct x coordinate" can be dangerous, because it could exploit an unintended mechanism of influence on the world.

It seems like the purpose of the asteroid scenario is not to come up with ways of deflecting an asteroid, but to serve as an example system in which two uncoordinated AIs (pardon the pun) can minimize impact in an interesting way.