In this paper, we make recommendations for how middle powers may band together through a binding international agreement to prevent the development of artificial superintelligence (ASI), without assuming initial cooperation from superpowers.
You can read the paper here: asi-prevention.com
In our previous work Modelling the Geopolitics of AI, we pointed out that middle powers face a precarious predicament in a race to ASI. Lacking the means to seriously compete in the race or unilaterally influence superpowers to halt development, they may need to resort to a strategy we dub “Vassal’s Wager”: allying themselves with a superpower and hoping that their sovereignty is respected after the superpower attains a decisive strategic advantage (DSA).
Of course, this strategy requires superpowers to avert the extinction risks posed by powerful AI systems, something over which middle powers have little or no control. Thus, we argue that it is in the interest of most middle powers to collectively deter and prevent the development of ASI by any actor, including superpowers.
In this paper, we design an international agreement that could enable middle powers to form a coalition capable of achieving this goal. The agreement we propose is complementary to a “verification framework” that can prevent the development of ASI if it achieves widespread adoption, such as articles IV to IX of MIRI’s latest proposal.
Our proposal tries to answer the following question: how may a coalition of actors pressure others to join such a verification framework, without assuming widespread initial participation?
Trade restrictions. The agreement imposes comprehensive export controls on AI-relevant hardware and software, and import restrictions on AI services from non-members, with precedents including the Chemical Weapons Convention and the Nuclear Non-Proliferation Treaty.
Reactive deterrence. Escalating penalties, from strengthened export controls to targeted sanctions, broad embargoes, and ultimately full economic isolation, are triggered as actors pursue increasingly dangerous AI R&D outside the verification framework.
Preemptive self-defense rights. The coalition recognizes that egregiously dangerous AI R&D constitutes an imminent threat tantamount to an armed attack, permitting members to claim self-defense rights in extreme cases.
Escalation in unison. The agreement would establish AI R&D redlines as well as countermeasures tied to each breach. These are meant to ensure that deterrence measures are triggered predictably and in unison by all participants of the agreement. This makes it clear to actors outside the agreement which thresholds must not be crossed, while ensuring that the costs of any retaliation by penalized actors are distributed among all members of the coalition.
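To make the escalation-in-unison mechanism concrete, here is a minimal sketch of how a shared redline-to-countermeasure schedule could be represented. The countermeasure tiers mirror the penalties listed above; the specific redlines and all names in the code are illustrative assumptions of ours, not terms from the agreement.

```python
from enum import IntEnum

class Countermeasure(IntEnum):
    """Illustrative escalation ladder, mirroring the penalty tiers named above."""
    STRENGTHENED_EXPORT_CONTROLS = 1
    TARGETED_SANCTIONS = 2
    BROAD_EMBARGO = 3
    FULL_ECONOMIC_ISOLATION = 4

# Hypothetical redlines, each tied to the countermeasure that every member
# of the coalition triggers in unison when that redline is crossed.
REDLINE_SCHEDULE = {
    "unverified_large_training_run": Countermeasure.STRENGTHENED_EXPORT_CONTROLS,
    "covert_compute_buildout": Countermeasure.TARGETED_SANCTIONS,
    "refusal_of_verification_inspections": Countermeasure.BROAD_EMBARGO,
    "automated_ai_rnd_at_scale": Countermeasure.FULL_ECONOMIC_ISOLATION,
}

def coalition_response(breaches: set[str]) -> Countermeasure | None:
    """All members apply the highest tier warranted by any observed breach."""
    tiers = [REDLINE_SCHEDULE[b] for b in breaches if b in REDLINE_SCHEDULE]
    return max(tiers) if tiers else None
```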
Though these measures represent significant departures from established customs, they are justified by AI’s unique characteristics. Unlike nuclear weapons, which permit a stable equilibrium through mutually assured destruction (MAD), AI R&D may lead to winner-take-all outcomes. Any actor that automates all the key bottlenecks in AI R&D secures an unassailable advantage in AI capabilities: its lead over other actors can only grow over time, eventually culminating in a decisive strategic advantage.
We recommend that the agreement enter into force once signatories represent at least 20% of the world’s GDP and at least 20% of the world’s population. This threshold is high enough to exert meaningful pressure on superpowers; at the same time, it is reachable without assuming that any superpower champions the initiative in its early stages.
This threshold enables middle powers to build common knowledge of their willingness to participate in the arrangement without immediately antagonizing actors in violation of the redlines, and without paying outsized costs at a stage when the coalition commands insufficient leverage.
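As a toy illustration of the activation condition, the sketch below checks whether a set of signatories jointly crosses both 20% thresholds; the function and its inputs are hypothetical, and real figures would come from standard GDP and population statistics.

```python
def coalition_active(signatories, gdp_share, population_share, threshold=0.20):
    """Return True once signatories jointly account for at least `threshold`
    of world GDP and at least `threshold` of world population.

    `gdp_share` and `population_share` map each country to its fraction of
    the world total (fractions summing to 1.0 across all countries).
    """
    gdp = sum(gdp_share.get(c, 0.0) for c in signatories)
    pop = sum(population_share.get(c, 0.0) for c in signatories)
    return gdp >= threshold and pop >= threshold
```

Note that under this rule both conditions must hold simultaneously: a coalition of wealthy but small states, or of populous but poor ones, would not by itself bring the agreement into force.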
As the coalition grows, network effects may accelerate adoption. Trade restrictions make membership increasingly attractive while non-membership becomes increasingly costly.
Eventually, the equilibrium between competing superpowers may flip from racing to cooperation: each superpower could severely undermine the others by joining the coalition, leaving the final holdouts facing utter economic and strategic isolation from the rest of the world. If this is achieved early enough, all other relevant actors are likely to follow suit and join the verification framework.
The agreement's effectiveness depends critically on timing. Early on, adoption may be achievable through diplomatic and economic pressure alone. But as AI R&D becomes increasingly automated, superpowers may grow confident that they can achieve a decisive strategic advantage through it.
Once superpowers believe ASI is within reach and are willing to absorb staggering temporary costs in exchange for a chance at total victory, even comprehensive economic isolation may prove insufficient, and more extreme measures may be necessary to dissuade them.
The stakes, encompassing potential human extinction, permanent global dominance by a single actor, or devastating major-power war, justify treating this challenge with the urgency historically reserved for nuclear proliferation. We must recognize that AI R&D may demand even more comprehensive international coordination than humanity has previously achieved.