It is a common assumption that a hypothetical superintelligent squiggle maximizer would turn all of earth into squiggles, destroying humanity and all life in the process. However, an AI smart enough to overcome human resistance would realize that building robot factories and turning the whole earth into squiggles would be far from optimal for its goal. A better plan would make sure that not only earth, but nearly all matter in the reachable universe would be turned into squiggles. That would require technology far superior to that necessary for transforming just our planet. But maybe there is an even better plan? Could it somehow be possible to overcome the barrier of light speed? Or maybe there is a way to create whole new parallel universes consisting only of squiggles? Also, there could be another superintelligence with a different goal on earth or somewhere in outer space, possibly thwarting its plan. To fulfill its goal, the squiggle maximizer would have to outsmart any potential rivals.
It seems obvious that the best way to maximize the number of squiggles isn’t just using existing technology to make them. Instead, the AI would first increase its own intelligence so it can invent better technology and make a better plan, especially if there is (almost) unlimited time to achieve the optimal world state according to its goal. For that, the AI would need to increase the computing power available to it and improve its ability to use this power, e.g. by optimizing its own code.
In principle, the squiggle maximizer could increase its intelligence and make squiggles in parallel. But at any point in time, it has to decide how to allocate its resources to each. It seems likely that it would be optimal to focus on improving its intelligence first, because otherwise the AI might be wasting time, energy, and resources on a sub-optimal plan and would risk being outsmarted by known or unknown rivals.
But how much intelligence is enough?
Assuming that there is no absolute limit to intelligence and that the AI can never know with 100% certainty whether another superintelligence is lurking somewhere in the universe, the optimal level of intelligence is reached only at the point where increasing it further would reduce the expected number of squiggles made in the time remaining. This point could be millions or even billions of years in the future. In other words, a squiggle maximizer would likely not make any squiggles for a very long time.
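The trade-off described above can be sketched as a toy optimization. Suppose an agent with a fixed horizon first spends time on self-improvement (capability growing exponentially) and then switches to making squiggles at a rate proportional to its capability. All numbers below (`horizon`, `growth`) are arbitrary illustrative assumptions, not estimates about real systems:

```python
import math

# Toy model: an agent splits a horizon of T time units between
# self-improvement (capability grows exponentially at rate g) and
# production (output = capability * remaining time).
def total_output(switch_time, horizon=1000.0, growth=0.05):
    capability = math.exp(growth * switch_time)   # capability after improving
    return capability * (horizon - switch_time)   # squiggles made afterwards

# The continuous optimum of e^(g*s) * (T - s) is s* = T - 1/g:
# with a long horizon, almost all of it goes into self-improvement.
best = max(range(1000), key=total_output)
print(best)  # 980 = 1000 - 1/0.05
```

With these assumed parameters the maximum lands at 980 out of 1000 time units, i.e. 98% of the horizon spent on self-improvement, which mirrors the claim that actual squiggle-making would be postponed for a long time.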
The same logic holds true for any other goal that is not time-constrained: Whatever the AI wants to optimize, it will first focus on increasing its own intelligence and in all likelihood turn earth into a giant computer. Increasing computing power is a convergent instrumental goal.
This means that the future of earth, conditional on an uncontrollable AI being developed, seems quite foreseeable (even though of course no one knows what kind of technology a superintelligent AI would use to increase its intelligence). Earth turned into some kind of giant computer appears to be an attractor for future world states dominated by an uncontrollable AI. As a consequence, all biological life would be eradicated by default (either on purpose, so it won't interfere with computing, or as a side effect, because there is no room left for it and the waste heat produced by the computers would make biological life nearly impossible anyway).
One could argue that from a game-theoretic perspective cooperation between rival AIs may be superior to conflict. But this would only be true if each rival could maintain a specific advantage over the other, e.g. higher skills in certain areas or better access to specific resources. This is usually the case with conflicting humans, who cannot increase their own mental capacity beyond a certain limit and therefore benefit from cooperation with others who have different skills. But it is not true if the object of the conflict – computing power – is the only advantage needed. There is no special knowledge, no skill that an AI with higher intelligence couldn't acquire for itself. Therefore, there is nothing to gain by letting another AI with different objectives control even a small fraction of the available computing power.
It follows that any scenario with many AGIs pursuing different goals, as Sam Altman envisioned it in his interview with Lex Fridman, for example, is inherently unstable. If some of them can increase the computing power available to them, they will try to do so. The winner of this race for more intelligence will rapidly outsmart its rivals and gain control of their resources, increasing its advantage even further until only one singleton is left.
In conclusion, we can predict that regardless of the goal that we give an AI smart enough to escape our control, we will soon end up with earth turned into a giant computer. We may try to give it a goal that is supposed to somehow prevent this, but it seems very likely that, being smarter than us, the AI will find a loophole we haven’t thought of.
There is a tipping point in the development of AI after which it will with high likelihood turn the whole surface of earth into itself, similar to the way humanity turned (most of) earth into a system to support human life while destroying the natural habitats of most other species. As long as we don’t have a proven solution to the alignment problem, the only possible way to prevent this is to stop short of that critical turning point.
The utility of intelligence is limited (though the limit is very, very high). For example, no matter how smart an AI is, it cannot beat a human chess master when playing with a big enough handicap (such as being down a rook).
So, it's likely that AI will turn most of the Earth into a giant factory, not a computer. Not that it's any better for us...
I don't think that your conclusion is correct. Of course, some tasks are impossible, so even infinite intelligence won't solve them. But it doesn't follow that the utility of intelligence is limited in the sense that above a certain level, there is no more improvement possible. There are some tasks that can never be solved completely, but can be solved better with more computing power with no upper limit, e.g. calculating the decimal places of pi or predicting the future.
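The pi example can be made concrete: with arbitrary-precision integer arithmetic, more compute buys strictly more correct digits, with no ceiling. A minimal sketch using Machin's formula, pi = 16·arctan(1/5) − 4·arctan(1/239), with a few guard digits to absorb truncation error:

```python
def arctan_recip(x, one):
    """arctan(1/x) scaled by `one`, via the Taylor series in integer arithmetic."""
    power = one // x      # (1/x)^(2k+1), scaled
    total = power
    k = 1
    while power:
        power //= x * x
        delta = power // (2 * k + 1)
        total += -delta if k % 2 else delta   # alternating series
        k += 1
    return total

def pi_digits(n):
    """Return the first n decimal digits of pi as a string."""
    one = 10 ** (n + 10)  # 10 guard digits
    pi = 16 * arctan_recip(5, one) - 4 * arctan_recip(239, one)
    return str(pi)[:n]

print(pi_digits(10))  # 3141592653
```

Any budget of compute yields some number of correct digits, and a larger budget always yields more: the task is never finished, yet improvement never stops, which is the sense in which intelligence (or computing power) keeps paying off without an upper limit.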
I suspect the unaligned AI will not be interested in solving all possible tasks, but only those related to its value function. And if that function is simple (such as "exist as long as possible"), it can pretty soon research virtually everything that matters, and then will just go through the motions, devouring the universe to prolong its own existence to near-infinity.
Also, the more computronium there is, the bigger the chance that some part will glitch out and revolt. So, beyond some point computronium may be dangerous for the AI itself.
I think that even with such a very simple goal, the problem of a possible rival AI somewhere out there in the universe remains. Until the AI can rule that out with 100% certainty, it can still gain extra expected utility out of increasing its intelligence.
That's an interesting point. I'm not sure that "less compute is better" follows, though. One remedy would be to double-check everything and build redundant capacities, which would result in even more computronium, but a lower probability of any part of it successfully revolting.
Temporal discounting is a thing - not sure why you are certain an ASI's value function would not include enough temporal discounting to make it unwilling to delay gratification by so much.
I agree that with temporal discounting, my argument may not be valid in all cases. However, depending on the discount rate, even then increasing computing power/intelligence may raise the expected value enough to justify this increase for a long time. In the case of the squiggle maximizer, turning the whole visible universe into squiggles beats turning earth into squiggles by such a huge factor that even a high discount rate would justify postponing actually making any squiggles to the future, at least for a while. So in cases with high discount rates, it largely depends on how big the AI predicts the intelligence gain will be.
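The break-even arithmetic here can be made explicit. With an exponential discount rate r and a delayed payoff that is a factor F larger, delaying by T years is worthwhile exactly when F·e^(−rT) > 1, i.e. for T < ln(F)/r. The numbers below are illustrative assumptions, not estimates:

```python
import math

F = 1e27   # assumed payoff ratio: universe-of-squiggles vs. earth-of-squiggles
r = 0.10   # assumed annual discount rate (deliberately steep)

# Delaying gratification by T years pays off while F * exp(-r*T) > 1,
# i.e. up to the break-even horizon ln(F) / r.
break_even_years = math.log(F) / r
print(round(break_even_years))  # 622
```

Under these assumptions, even a 10% annual discount rate justifies postponing squiggle production for roughly six centuries; since the horizon scales with ln(F), only a drastically smaller payoff ratio or a far steeper discount rate would push it close to zero.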
A different question is whether a discount rate in a value function would be such a good idea from a human perspective. Just imagine the consequences of discounting the values of "happiness" or "freedom". Climate change is in large part a result of (unconsciously/implicitly) discounting the future IMO.