Now that these problems have been gathered in one place, we can try to unpack them all.
1. Logical uncertainty, computation costs, and bargaining over potential nothingness
Suppose that Agent-4 from the AI-2027 forecast is trying to negotiate with DeepCent's AI, and DeepCent's AI makes the bargaining argument involving the millionth digit of π (i.e. offers to trade resources with a counterpart in a parallel universe where that digit is even). Calculating the digit establishes that there is no universe in which the millionth digit of π is even, and hence that there is nothing to bargain for.
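For concreteness, here is a minimal sketch (assuming Python with the mpmath library; the digit position is just the one from the example above) of how cheaply a bet at this scale resolves: an ordinary machine can compute the millionth decimal digit directly, leaving no logical uncertainty to bargain over.

```python
# Minimal sketch: settling a bet about the millionth decimal digit of pi.
# Assumes the mpmath library; takes a little while in pure Python and is
# much faster if the gmpy2 backend is installed.
from mpmath import mp, nstr

N = 1_000_000                  # digit position after the decimal point
mp.dps = N + 20                # working precision, with guard digits

pi_str = nstr(+mp.pi, N + 10, strip_zeros=False)   # "3.1415926535..."
nth_digit = int(pi_str[1 + N])                     # pi_str[2] is the 1st decimal digit

print(f"Decimal digit #{N} of pi is {nth_digit} "
      f"({'even' if nth_digit % 2 == 0 else 'odd'})")
```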
On the other hand, if DeepCent's AI makes the same argument involving the N-th digit for some astronomically large N, then Agent-4 could also make a bet, e.g. "Neither of us will have access to a given part of the universe until someone calculates the digit: either it is actually odd, and DeepCent should give the secured part to Agent-4 (since DeepCent's offer was fake), or it is even, and the part should be controlled by DeepCent (in exchange for the parallel universe, or a part of it, being given[1] to Agent-4)." However, calculating the digit could require at least on the order of N bitwise operations,[2] and Agent-4 and its Chinese counterpart might prefer to spend that much compute on whatever they actually want.
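To illustrate the scaling (not the exact operation count, which footnote [2] leaves open), here is a hedged sketch using the standard Bailey-Borwein-Plouffe digit-extraction formula. It produces hexadecimal rather than decimal digits, and in plain floating point it is only reliable up to positions of roughly 10^7, but it shows that the work grows roughly linearly (up to log factors) with the digit's position, which is why a bet over an astronomically remote digit ties up compute the AIs would rather spend on other things.

```python
# Sketch of BBP-style digit extraction for pi (hexadecimal digits). The work per
# digit grows roughly linearly with the digit's position, so pushing the position
# to an astronomically large value makes the bet expensive to settle.
# Note: this plain floating-point version is only accurate up to roughly 10^7.

def _bbp_series(j: int, d: int) -> float:
    """Fractional part of sum_k 16^(d-k) / (8k + j), split into head and tail."""
    head = 0.0
    for k in range(d + 1):                      # ~d modular exponentiations
        head = (head + pow(16, d - k, 8 * k + j) / (8 * k + j)) % 1.0
    tail, k = 0.0, d + 1
    while True:
        term = 16.0 ** (d - k) / (8 * k + j)    # rapidly shrinking tail terms
        if term < 1e-17:
            break
        tail += term
        k += 1
    return head + tail

def pi_hex_digit(n: int) -> str:
    """Hexadecimal digit of pi at position n (1-indexed) after the point."""
    d = n - 1
    x = (4 * _bbp_series(1, d) - 2 * _bbp_series(4, d)
         - _bbp_series(5, d) - _bbp_series(6, d)) % 1.0
    return "%x" % int(x * 16)

# The first hex digits of pi after the point are 243F6A88...
assert "".join(pi_hex_digit(i) for i in range(1, 9)) == "243f6a88"
print(pi_hex_digit(1_000_000))   # still feasible; an astronomically large index would not be
```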
If DeepCent instead makes a bet over a digit at a far more remote position, one that neither AI can feasibly compute, then neither AI is able to verify the bet, and both may treat the probability as close to one half and simply split the relevant part of the universe in exchange for a similar split of the parallel universe.
However, if AIs acting on behalf of Agent-4 and its Chinese counterpart actually meet each other, then doing mechinterp on each other becomes easy, and each AI learns everything about the other's utility function and precommitments.
2. My position is that one also needs to consider worst-case scenarios, like the one where sufficiently capable AIs cannot be aligned to anything useful aside from improving human capabilities (e.g. by working as AI teachers but not as other types of AI workers). If that is the case, then aligning an AI to a solution of the human-AI safety problems becomes unlikely.
3. Problem 3di, humans being corrupted by power, seems to have a far more important analogue. Assuming solved alignment, there is still an important governance problem of preventing Intelligence Curse-like outcomes in which humans become obsolete to the elites in general or to a few true overlords. Whatever governance prevents the overlords from appearing could also be used to prevent humans from wasting resources in space.[3]
4. A major part of the problem is the AI race, which many people have been trying to stop (see, e.g., the petition not to create AGI, Yudkowsky's IABIED cautionary tale, or Kokotajlo et al.'s AI-2027 forecast). Post-AGI economics under solved alignment is precisely what I discussed in point 3.
[1] What I don't understand is how Agent-4 actually influences the parallel universe. But that is a different subject.

[2] Actually, I haven't estimated the number of operations necessary to calculate the N-th digit of π. But the main point of the argument was to avoid counterfactual bargaining over hard-to-verify conditions.

[3] For example, by requiring that distant colonies are populated with humans or other minds who are capable of either governing themselves or being multilaterally agreed to be moral patients (e.g. this excludes controversial stuff like shrimps on heroin).
Looking back, it appears that much of my intellectual output could be described as legibilizing work, or trying to make certain problems in AI risk more legible to myself and others. I've organized the relevant posts and comments into the following list, which can also serve as a partial guide to problems that may need to be further legibilized, especially beyond LW/rationalists, to AI researchers, funders, company leaders, government policymakers, their advisors (including future AI advisors), and the general public.
Having written all this down in one place, I find it hard not to feel some hopelessness about whether all of these problems can be made legible to the relevant people, even with maximum plausible effort. Perhaps one source of hope is that they can be made legible to future AI advisors. As many of these problems are philosophical in nature, this seems to come back to the issue of AI philosophical competence that I've often talked about recently, which itself seems largely still illegible and hence neglected.
Perhaps it's worth concluding with a point from a discussion between @WillPetillo and myself under the previous post: a potentially more impactful approach (compared to trying to make illegible problems more legible) is to make key decisionmakers realize that important safety problems illegible to themselves (and even to their advisors) probably exist, and that it is therefore very risky to make highly consequential decisions (such as about AI development or deployment) based only on the status of the legible safety problems.