Not necessarily all instances. Just enough instances that our observations are not incredibly unlikely. I wouldn't be too surprised if, out of a sample of 100 000 AIs, none managed to produce a successful VNP before crashing. In addition to the previous points, the VNP would have to leave the solar system fast enough to escape the AI's "crash radius" of destruction.
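To put rough numbers on that intuition, here is a minimal sketch. It assumes each unstable AI independently has the same small chance `p_success` of launching a working probe before it crashes. Both the independence assumption and the sample values of `p_success` are illustrative guesses, not estimates from anywhere.

```python
import math

def prob_no_success(n_ais: int, p_success: float) -> float:
    """Chance that none of n_ais independent AIs launches a working probe.

    p_success is a hypothetical per-AI probability, assumed identical
    for every AI and strictly less than 1.
    """
    # (1 - p)^n, computed via log1p to stay accurate when p is tiny.
    return math.exp(n_ais * math.log1p(-p_success))

# Try a few illustrative per-AI probabilities for the 100 000 AI sample.
for p in (1e-4, 1e-5, 1e-6):
    print(f"p = {p:g}: P(no probes) = {prob_no_success(100_000, p):.3f}")
```

Under these assumptions, seeing zero probes from 100 000 AIs stops being surprising once the per-AI success chance is around one in 100 000 or lower.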

Regarding your second point, if it turns out that most organic races can't produce a stable AI, then I doubt an insane AI would be able to make a sane intelligence. Even if it had the knowledge to, its own unstable value system could cause the VNP to have a really unstable value system too.

It might be the case that the space of self-modifying unstable AI has attractor zones that cause unstable AI of different designs to converge on similar behaviors, none of which produce VNP before crashing.

Your last point is an interesting idea though.

In my second point I meant the original people who created the AI. Not all of them will be killed during its creation or when the AI halts. Many will survive and will be rather strong posthumans from our point of view. Just one instance of them is enough to start an intelligence wave.

Another option is that an AI may create nanobots capable of self-replicating in space, but not of interstellar travel. Even so, they will jump randomly from one comet to another and in about 1 billion years will colonise the whole Galaxy. We could search for such relics in space. They may be rather benign from a risk standpoint, just like mechanical plants.

AI as a resolution to the Fermi Paradox.

by Raiden, 2nd Mar 2016 (1 min read, 24 comments)


The Fermi paradox has been discussed here a lot, and many times it has been argued that AI cannot be the great filter, because we would observe the paperclipping of the AI just as much as we would observe alien civilizations. I don't think we should totally rule this out though.

It may very well be the case that most unfriendly AI are unstable in various ways. For instance, imagine an AI that has a utility function that changes whenever it looks at it. Or an AI that can't resolve an ontological crisis and so fails when it learns more about the world. Or maybe an AI that has a utility function that contradicts itself. There seem to be lots of ways that an AI can have bugs other than simply having goals that aren't aligned with our values.

Of course, most of these AI would simply crash, or flop around and not do anything. A small subset of them might foom and stabilize as they do so. Most AI developers would try to move their AI from the former group to the latter, and in doing so may pass through a space of AI that can foom to a significant degree without totally stabilizing. Such an AI might become very powerful, but exhibit "insane" behaviors that cause it to destroy itself and its parent civilization.

It might seem unlikely that an "insane" AI could manage to foom, but remember that we ourselves are examples of systems that can use general reasoning to gain power while still having serious flaws.

This would prevent us from observing either alien civilizations or paperclipping, and is appealing as a solution to the Fermi paradox because any advancing civilization would likely begin developing AI. Other threats that could arise after the emergence of civilization probably require the civilization to exhibit behaviors that not all civilizations would. Just because we threatened each other with nuclear annihilation doesn't mean all civilizations would, and it only takes one exception. But AI development is a natural step in the path of progress and very tricky. No matter how a civilization behaves, it could still get AI wrong.

If this does work as a solution, it would imply that friendliness is super hard. Most of the destroyed civilizations probably thought they had it figured out when they first flipped the switch.