Some ideas on how to set up AGI training such that it wouldn't pose an existential threat to humanity, even if its ability to manipulate us and (indirectly) our surroundings were superior to ours. If these ideas have already been elaborated and/or debunked elsewhere, please let me know!
Different ways to lie about the substrate
When the AGI is trained (think: black-box evolution) to adapt to its training environment, that environment should be idealized in ways that are unrealistic. This way, the resulting intelligence comes to depend on features that are difficult to implement efficiently in our substrate, the laws of physics.
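To make the dependence concrete, here is a minimal toy sketch (my own illustration, not from the original proposal; all class and function names are hypothetical). The training environment offers an unrealistically perfect, noise-free sensor; a policy selected under that idealization trusts single readings exactly, and its performance degrades once the real substrate adds noise:

```python
import random

class IdealizedEnv:
    """Hypothetical training environment that 'lies about the substrate':
    the agent can query the hidden state with zero noise, a perfection
    no physical deployment substrate can supply."""
    def __init__(self):
        self.hidden = random.uniform(-1.0, 1.0)

    def sense(self):
        return self.hidden  # exact, noise-free reading

    def reward(self, action):
        return -abs(action - self.hidden)  # 0.0 is optimal


class PhysicalEnv(IdealizedEnv):
    """The same task on a realistic substrate: every sensor reading
    carries noise, so the 'trust one exact reading' strategy that the
    idealized environment selected for no longer pays off."""
    def sense(self):
        return self.hidden + random.gauss(0.0, 0.5)


def trained_policy(env):
    # The policy the idealized training selects: act on a single
    # reading, with no averaging or error correction.
    return env.sense()


def evaluate(env_cls, episodes=10_000):
    total = 0.0
    for _ in range(episodes):
        env = env_cls()
        total += env.reward(trained_policy(env))
    return total / episodes


if __name__ == "__main__":
    print("idealized substrate:", evaluate(IdealizedEnv))  # ~0.0 (optimal)
    print("physical substrate: ", evaluate(PhysicalEnv))   # ~-0.4 (degraded)
```

The toy case only shows the selection pressure; the post's proposal is about idealizations strong enough that the resulting intelligence cannot cheaply re-derive a physics-compatible strategy after deployment.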
The core idea here is: whatever the AGI is, when finished and certified "human-exceeding",...