Problems I've Tried to Legibilize

by Wei Dai
9th Nov 2025
AI Alignment Forum
2 min read

Comment by StanislavKrym:

Now that these problems have been gathered in one place, we can try to unpack them all. 

1. This set of problems is the most controversial. For example, the possibility of astronomical waste can be undermined by claiming that mankind was never entitled to the resources it could have wasted. The argument related to bargaining and logical uncertainty can likely be circumvented as follows.

Logical uncertainty, computation costs and bargaining over potential nothingness

Suppose that Agent-4 from the AI-2027 forecast is negotiating with DeepCent's AI, and DeepCent's AI makes the argument involving the millionth digit of π. Calculating the digit establishes that there is no universe in which the millionth digit of π is even, and hence that there is nothing to bargain for.
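
To make the verification step concrete, here is a minimal Python sketch (assuming the standard mpmath arbitrary-precision library; the function name and precision margin are illustrative choices) of how an agent could settle a bet on the parity of the millionth decimal digit of π:

```python
# Minimal sketch, assuming the mpmath library. Settling the bet for the
# millionth decimal digit takes on the order of seconds; the cost grows
# roughly with n, which is what makes far-out digits a different matter.
from mpmath import mp

def nth_pi_digit_parity(n: int) -> str:
    """Return 'even' or 'odd' for the n-th decimal digit of pi (1-indexed after the point)."""
    mp.dps = n + 10                                # working precision: n digits plus a safety margin
    s = mp.nstr(mp.pi, n + 5, strip_zeros=False)   # "3.1415..." with a few digits to spare
    digit = int(s[1 + n])                          # s[0] == '3', s[1] == '.', so digit k sits at s[1 + k]
    return "even" if digit % 2 == 0 else "odd"

if __name__ == "__main__":
    # Cheap enough to check directly, so there is nothing left to bargain over:
    print("millionth digit of pi is", nth_pi_digit_parity(10**6))
```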

On the other hand, if DeepCent's AI makes the same argument involving the 10^43rd digit, then Agent-4 could also make a bet, e.g. "Neither of us will have access to this part of the universe until someone calculates the digit: if it is actually odd, DeepCent should give the secured part to Agent-4 (since DeepCent's offer was fake); if it is even, the part should be controlled by DeepCent (in exchange for the parallel universe, or a part of it, being given[1] to Agent-4)." However, calculating the digit could require at least around 10^43 bitwise operations,[2] and Agent-4 and its Chinese counterpart might prefer to spend that much compute on whatever they actually want.

If DeepCent's AI instead makes a bet over the 10^(10^10)th digit, then neither AI is able to verify it; both may judge that the probability of each parity is close to one half and conclude that they should simply split the relevant part of the universe in exchange for a similar split of the parallel universe.
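
The trade-off across the three cases can be caricatured in a toy decision rule like the one below (everything here, including the assumed "one operation per digit index" cost and the stake value, is an illustrative guess rather than anything from the comment):

```python
# Toy model: when is it worth paying the verification cost at all?
# Values are in arbitrary units of compute-equivalent resources.

def bargaining_policy(digit_index: int, stake_value: float) -> str:
    """Crudely decide between verifying the digit and splitting at ~50/50 odds."""
    # Crude assumption: settling the n-th digit costs on the order of n operations
    # (the footnote below flags that this is a guess, not an estimate).
    verification_cost = float(digit_index)
    # Knowing the parity can at most reallocate the whole stake, so verification
    # is never worth more than the stake itself; beyond that point both agents
    # do better treating the parity as a coin flip and splitting.
    if verification_cost < stake_value:
        return "verify the digit and enforce the bet"
    return "treat the parity as ~50/50 and split"

# The three regimes discussed above (10**100 stands in for the unverifiable
# 10^(10^10)th digit, since even writing that index down as an integer is infeasible):
for n in (10**6, 10**43, 10**100):
    print(f"digit index 10^{len(str(n)) - 1}:", bargaining_policy(n, stake_value=1e20))
```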

However, if AIs acting on behalf of Agent-4 and its Chinese counterpart actually meet each other, then doing mechinterp on each other becomes easy, and each AI learns everything about the other's utility function and precommitments.

2. My position is that one also needs to consider worst-case scenarios, such as the one where sufficiently capable AIs cannot be aligned to anything useful aside from improving human capabilities (e.g. by serving as AI teachers rather than other types of AI workers). If that is the case, then aligning an AI to a solution of the human-AI safety problems becomes unlikely.

3. The problem 3.d.i of humans being corrupted by power seems to have a far more important analogue. Assuming solved alignment, there is an important governance problem related to preventing Intelligence Curse-like outcomes, in which humans become obsolete to the elites in general or to a few true overlords. Whatever governance prevents such overlords from appearing could also be used to prevent humans from wasting resources in space.[3]

4. A major part of the problem is the AI race, which many people have been trying to stop (see, e.g., the petition not to create AGI, Yudkowsky's IABIED cautionary tale, or Kokotajlo et al.'s AI-2027 forecast). Post-AGI economics assuming solved alignment is precisely what I discussed under point 3.

  1. ^

    What I don't understand is how Agent-4 actually influences the parallel universe. But this is a different subject.

  2. ^

    Actually, I haven't estimated the number of operations necessary to calculate that digit of π. The main point of the argument was to avoid counterfactual bargaining over hard-to-verify conditions.

  3. ^

    For example, by requiring that distant colonies be populated with humans or other minds who either are capable of governing themselves or are multilaterally agreed to be moral patients (e.g. this excludes controversial cases like shrimp on heroin).

Looking back, it appears that much of my intellectual output could be described as legibilizing work, or trying to make certain problems in AI risk more legible to myself and others. I've organized the relevant posts and comments into the following list, which can also serve as a partial guide to problems that may need to be further legibilized, especially beyond LW/rationalists, to AI researchers, funders, company leaders, government policymakers, their advisors (including future AI advisors), and the general public.

  1. Philosophical problems
    1. Probability theory
    2. Decision theory
    3. Beyond astronomical waste (possibility of influencing vastly larger universes beyond our own)
    4. Interaction between bargaining and logical uncertainty
    5. Metaethics
    6. Metaphilosophy: 1, 2
  2. Problems with specific philosophical and alignment ideas
    1. Utilitarianism: 1, 2
    2. Solomonoff induction
    3. "Provable" safety
    4. CEV
    5. Corrigibility
    6. IDA (and many scattered comments)
    7. UDASSA
    8. UDT
  3. Human-AI safety (x- and s-risks arising from the interaction between human nature and AI design)
    1. Value differences/conflicts between humans
    2. “Morality is scary” (human morality is often the result of status games amplifying random aspects of human value, with frightening results)
    3. Positional/zero-sum human values, e.g. status
    4. Distributional shifts as a source of human safety problems
      1. Power corrupts (or reveals) (AI-granted power, e.g., over future space colonies or vast virtual environments, corrupting human values, or perhaps revealing a dismaying true nature)
      2. Intentional and unintentional manipulation of / adversarial attacks on humans by AI
  4. Meta / strategy
    1. AI risks being highly disjunctive, potentially causing increasing marginal return from time in AI pause/slowdown (or in other words, surprisingly low value from short pauses/slowdowns compared to longer ones)
    2. Risks from post-AGI economics/dynamics, specifically high coordination ability leading to increased economy of scale and concentration of resources/power
    3. Difficulty of winning AI race while being constrained by x-safety considerations
    4. Likely offense dominance devaluing “defense accelerationism”
    5. Human tendency to neglect risks while trying to do good
    6. The necessity of AI philosophical competence for AI-assisted safety research and for avoiding catastrophic post-AGI philosophical errors
    7. The problem of illegible problems

Having written all this down in one place, it's hard not to feel some hopelessness about whether all of these problems can be made legible to the relevant people, even with a maximum plausible effort. Perhaps one source of hope is that they can be made legible to future AI advisors. As many of these problems are philosophical in nature, this seems to come back to the issue of AI philosophical competence that I've often talked about recently, which itself seems largely still illegible and hence neglected.

Perhaps it's worth concluding with a point from a discussion between @WillPetillo and myself under the previous post: a potentially more impactful approach (compared to trying to make illegible problems more legible) is to make key decisionmakers realize that important safety problems that are illegible to them (and even to their advisors) probably exist, and that it is therefore very risky to make highly consequential decisions (such as about AI development or deployment) based only on the status of legible safety problems.