In this paper, we make recommendations for how middle powers may band together through a binding international agreement to prevent the development of artificial superintelligence (ASI), without assuming initial cooperation from superpowers.
You can read the paper here: asi-prevention.com
In our previous work Modelling the Geopolitics of AI, we pointed out that middle powers face a precarious predicament in a race to ASI. Lacking the means to seriously compete in the race or unilaterally influence superpowers to halt development, they may need to resort to a strategy we dub “Vassal’s Wager”: allying themselves with a superpower and hoping that their sovereignty is respected after the superpower attains a decisive strategic advantage (DSA).
Of course, this strategy requires superpowers to avert the extinction risks posed by powerful AI systems, something over which middle powers have little or no control. Thus, we argue that it is in the interest of most middle powers to collectively deter and prevent the development of ASI by any actor, including superpowers.
In this paper, we design an international agreement that could enable middle powers to form a coalition capable of achieving this goal. The agreement we propose is complementary to a “verification framework” that can prevent the development of ASI if it achieves widespread adoption, such as articles IV to IX of MIRI’s latest proposal.
Our proposal tries to answer the following question: how may a coalition of actors pressure others to join such a verification framework, without assuming widespread initial participation?
Trade restrictions. The agreement imposes comprehensive export controls on AI-relevant hardware and software, and import restrictions on AI services from non-members, with precedents including the Chemical Weapons Convention and the Nuclear Non-Proliferation Treaty.
Reactive deterrence. Escalating penalties, from strengthened export controls to targeted sanctions, broad embargoes, and ultimately full economic isolation, are triggered as actors pursue increasingly dangerous AI R&D outside the verification framework.
Preemptive self-defense rights. The coalition recognizes that egregiously dangerous AI R&D constitutes an imminent threat tantamount to an armed attack, permitting members to claim self-defense rights in extreme cases.
Escalation in unison. The agreement would establish AI R&D redlines as well as countermeasures tied to each breach. These are meant to ensure that deterrence measures are triggered predictably and in unison by all participants of the agreement. This makes it clear to actors outside the agreement which thresholds must not be crossed, while ensuring that the costs of any retaliation by penalized actors are distributed among all members of the coalition.
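To make the escalation-in-unison mechanism concrete, here is a minimal sketch of how a shared redline-to-countermeasure schedule could be represented. The countermeasure tiers mirror the penalties listed above; the specific redlines and all names in the code are illustrative assumptions of ours, not terms from the agreement.

```python
from enum import IntEnum

class Countermeasure(IntEnum):
    """Illustrative escalation ladder, mirroring the penalty tiers named above."""
    STRENGTHENED_EXPORT_CONTROLS = 1
    TARGETED_SANCTIONS = 2
    BROAD_EMBARGO = 3
    FULL_ECONOMIC_ISOLATION = 4

# Hypothetical redlines, each tied to the countermeasure that every member
# of the coalition triggers in unison when that redline is crossed.
REDLINE_SCHEDULE = {
    "unverified_large_training_run": Countermeasure.STRENGTHENED_EXPORT_CONTROLS,
    "covert_compute_buildout": Countermeasure.TARGETED_SANCTIONS,
    "refusal_of_verification_inspections": Countermeasure.BROAD_EMBARGO,
    "automated_ai_rnd_at_scale": Countermeasure.FULL_ECONOMIC_ISOLATION,
}

def coalition_response(breaches: set[str]) -> Countermeasure | None:
    """All members apply the highest tier warranted by any observed breach."""
    tiers = [REDLINE_SCHEDULE[b] for b in breaches if b in REDLINE_SCHEDULE]
    return max(tiers) if tiers else None
```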
Though these measures represent significant departures from established customs, they are justified by AI’s unique characteristics. Unlike nuclear weapons, which permit a stable equilibrium through mutually assured destruction (MAD), AI R&D may lead to winner-take-all outcomes. Any actor that automates all the key bottlenecks in AI R&D secures an unassailable advantage in AI capabilities: its lead over other actors can only grow over time, eventually culminating in a decisive strategic advantage.
We recommend that the agreement enter into force once signatories represent at least 20% of the world’s GDP and at least 20% of the world’s population. This threshold is high enough to exert meaningful pressure on superpowers; at the same time, it is reachable without assuming that any superpower champions the initiative in its early stages.
This threshold enables middle powers to build common knowledge of their willingness to participate in the arrangement without immediately antagonizing actors in violation of the redlines, and without paying outsized costs at a stage when the coalition commands insufficient leverage.
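As a toy illustration of the activation condition, the sketch below checks whether a set of signatories jointly crosses both 20% thresholds; the function and its inputs are hypothetical, and real figures would come from standard GDP and population statistics.

```python
def coalition_active(signatories, gdp_share, population_share, threshold=0.20):
    """Return True once signatories jointly account for at least `threshold`
    of world GDP and at least `threshold` of world population.

    `gdp_share` and `population_share` map each country to its fraction of
    the world total (fractions summing to 1.0 across all countries).
    """
    gdp = sum(gdp_share.get(c, 0.0) for c in signatories)
    pop = sum(population_share.get(c, 0.0) for c in signatories)
    return gdp >= threshold and pop >= threshold
```

Note that under this rule both conditions must hold simultaneously: a coalition of wealthy but small states, or of populous but poor ones, would not by itself bring the agreement into force.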
As the coalition grows, network effects may accelerate adoption. Trade restrictions make membership increasingly attractive while non-membership becomes increasingly costly.
Eventually, the equilibrium between competing superpowers may flip from racing to cooperation: each superpower could severely undermine the others by joining the coalition, leaving the final holdouts facing utter economic and strategic isolation from the rest of the world. If this is achieved early enough, all other relevant actors are likely to follow suit and join the verification framework.
The agreement's effectiveness depends critically on timing. Early on, adoption may be achievable through diplomatic and economic pressure alone. But as AI R&D becomes increasingly automated, superpowers may grow confident that they can achieve a decisive strategic advantage through it.
Once superpowers believe ASI is within reach and are willing to absorb staggering temporary costs in exchange for a chance at total victory, even comprehensive economic isolation may prove insufficient, and more extreme measures may be necessary to dissuade them.
The stakes, encompassing potential human extinction, permanent global dominance by a single actor, or devastating major-power war, justify treating this challenge with the urgency historically reserved for nuclear proliferation. We must recognize that AI R&D may demand even more comprehensive international coordination than humanity has previously achieved.