“The risk from AI-enabled coups in particular is detailed at length here. To reduce this risk, we can try to introduce constraints on coup-assisting uses of AI, diversify military AI suppliers, slow autocracies via export controls, and promote credible benefit-sharing.”
Why are AI-enabled coups worse than the alternative in expectation? I'm personally quite uncertain about this. I looked around a bit and found an explanation from you in this podcast, which I quote below and respond to. Please let me know if you have a more detailed writeup somewhere.
“People who successfully become global dictators would likely have worse values than is ideal. The process selects for Machiavellianism, sadism, and other Dark Triad traits. So that's one thing.”
I agree this has historically been the case, but I think the selection pressure could be somewhat different with AI-enabled global takeover, which perhaps requires less political maneuvering and will to dominate other humans, and more technical foresight/understanding and appreciation of astronomical stakes. It seems unclear to me that, in expectation, the kind of person who becomes an AI-enabled global dictator would have worse values than average. (Of course "worse than ideal" is much more likely, but that seems like the wrong standard to use here? Morality is Scary gives some explanations for my relatively low opinion of the average human's values.)
“A second is that if you have an intense concentration of power, some of the mechanisms by which we get moral progress—like people being able to argue with each other and having to defend their views in the public sphere—go away.”
I'm afraid that such mechanisms would probably go away even without intense concentration of power due to considerations like these. In short, AI-optimized persuasion would likely replace human debate and break whatever poorly understood mechanism has been responsible for humans making moral/philosophical progress in the past.
Whoever does take over the world could, if they were sufficiently wise (i.e., concerned about their own moral progress), deliberately institute new mechanisms to support moral progress, with AI help. I think that, historically, decentralization of power and competition have (apparently) been better for moral progress, but this may not hold in the future.
“And third, you don't get the potential huge benefits from trade. I've argued that it's quite unlikely that most people will converge on the best moral views. So, while there's some small chance that a dictator ends up having good moral views and acting on them, it's probably not the case. And then there's no one else to trade with.”
This seems to ignore acausal trade (which I see you've acknowledged as a possibility elsewhere). Compared to the kind of trade you talk about, I think I'm more worried that in a more decentralized/competitive world, philosophical progress might get derailed more easily (due to the concerns above), causing acausal trade or other opportunities that are potentially vastly more beneficial (see Beyond Astronomical Waste) to be lost.
What projects today could most improve a post-AGI world?
In “How to make the future better”, I lay out some areas I see as high-priority, beyond reducing risks from AI takeover and engineered pandemics.
These areas include: preventing post-AGI autocracy; governance of ASI projects; deep space governance; AI value-alignment; AI rights; and deliberative AI. Here’s an overview of each.
First, preventing post-AGI autocracy. Superintelligence structurally leads to concentration of power: post-AGI, human labour soon becomes worthless; those who can spend the most on inference-time compute have access to greater cognitive abilities than anyone else; and the military (and whole economy) can in principle be aligned to a single person.
The risk from AI-enabled coups in particular is detailed at length here. To reduce this risk, we can try to introduce constraints on coup-assisting uses of AI, diversify military AI suppliers, slow autocracies via export controls, and promote credible benefit-sharing.
Second, governance of ASI projects. If there’s a successful national project to build superintelligence, it will wield world-shaping power. We therefore need governance structures—ideally multilateral, or at least widely distributed—that can be trusted to reflect global interests, embed checks and balances, and resist drift toward monopoly or dictatorship. Rose Hadshar and I give a potential model here: Intelsat, a successful US-led multilateral project to build the world’s first global communications satellite network.
What’s more, for any major new institutions like this, I think we should make their governance explicitly temporary: they should come with reauthorization clauses, stating that the law or institution must be reauthorized after some period of time.
Intelsat gives an illustration: it was created under “interim agreements”; after five years, negotiations began for “definitive agreements”, which came into force four years after that. The fact that the initial agreements were only temporary helped get non-US countries on board.
Third, deep space governance. This is crucial for two reasons: (i) the acquisition of resources within our solar system is a way in which one country or company could get more power than the rest of the world combined, and (ii) almost all the resources that can ever be used are outside of our solar system, so decisions about who owns these resources are decisions about almost everything that will ever happen.
Here, we could try to prevent lock-in by pushing for an international understanding of the Outer Space Treaty under which de facto grabs of space resources (“seizers keepers”) are clearly illegal.
Or, assuming the current “commons” regime breaks down given how valuable space resources will become, we could try to figure out in advance what a good alternative regime for allocating space resources might look like.
Fourth, working on AI value-alignment. Though corrigibility and control are important for reducing takeover risk, we also want to focus on ensuring that the AI we create positively influences society in the worlds where it doesn’t take over. That is, we need to figure out the “model spec” for superintelligence (what character it should have) and how to ensure it actually has that character.
I think we want AI advisors that aren’t sycophants, and that aren’t merely trying to fulfill their users’ narrow self-interest, at least in the highest-stakes situations, like AI for political advice. Instead, we should at least want them to nudge us to act in accordance with the better angels of our nature.
(And, though it might be more difficult to achieve, we can also try to ensure that, even if superintelligent AI does take over, it (i) treats humans well, and (ii) creates a more flourishing AI civilisation than it would have done otherwise.)
Fifth, AI rights. Even just for the mundane reason that it will be economically useful to give AIs rights to make contracts (etc.), as we do with corporations, I think it’s likely we’ll soon start giving AIs at least some rights.
But what rights are appropriate? An AI rights regime will affect many things: the risk of AI takeover; the extent to which AI decision-making guides society; and the wellbeing of AIs themselves, if and when they become conscious.
In the future, it’s very likely that almost all beings will be digital. The first legal decisions we make here could set precedent for how they’re treated. But there are huge unresolved questions about what a good society involving both human beings and superintelligent AIs would look like. We’re currently stumbling blind into one of the most momentous decisions that will ever be made.
Finally, deliberative AI. AI has the potential to be enormously beneficial for our ability to think clearly and make good decisions, both individually and collectively. (And, yes, it has the potential to be enormously destructive here, too.)
We could try to build and widely deploy AI tools for fact-checking, forecasting, policy advice, macrostrategy research and coordination; this could help ensure that the most crucial decisions are made as wisely as possible.
I’m aware that there are a lot of different ideas here, and that these are just potential ideas: more proofs of concept than fully fleshed-out proposals. But my hope is that work on these areas, taking them from inchoate to tractable, could help society keep its options open, steer any potential lock-in events in better directions, and equip decision-makers with the clarity and incentives needed to build a flourishing, rather than a merely surviving, future.