How Powerful AIs Get Cheap

In the previous article in this series, I described how AI could contribute to the development of cheap weapons of mass destruction, the proliferation of which would be strategically destabilizing. This article will take a look at how the cost to build the AI systems themselves might fall.

Key Points

Even though the costs to build frontier models is increasing, the cost to reach a fixed level of capability is falling. While making GPT-4 was initially expensive, the cost to build a GPT-4 equivalent keeps tumbling down.
This is likely to be as true of weapons-capable AI systems as any other.
A decline in the price of building an AI model is not the only way that the cost to acquire one might decrease. If it's possible to buy or steal frontier models, the high costs of development can be circumvented.
Because of these factors, powerful AI systems (and their associated weapons capabilities) will eventually become widely accessible without preemptive measures.
Fortunately, the high cost to develop frontier models means that the strongest capabilities will be temporarily monopolized at their inception, giving us a window to evaluate and limit the distribution of models when needed.

Lessons From Cryptography

Although the offensive capabilities of future AI systems usually invite comparisons to nuclear weapons (for their offense-dominance, the analogy of compute to enriched uranium, or their strategic importance) I often find that a better point of comparison is to cryptography---another software based technology with huge strategic value.

While cryptography might feel benign today, a great part of its historical heritage is as an instrument of war: a way for Caesar to pass secret messages to his generals, or for the Spartans to disguise field campaigns. The origins of modern cryptography were similarly militaristic: Nazi commanders using cipher devices to hide their communications in plain sight while Allied codebreakers raced to decrypt Enigma and put an end to the war.

Even in the decades following the collapse of the Axis powers, the impression of cryptography as a military-first technology remained. Little research on cryptographic algorithms happened publicly. What did took place under the purview of the NSA, a new organization which had been created with the express purpose of protecting the U.S's intelligence interests during the Cold War. The only people that got to read America's defense plans were going to be the DoD and God, and only if He could be bothered to factor the product of arbitrarily large primes.

It wasn't until the late 70s that the government's hold over the discipline began to crack, as institutional researchers developed new techniques like public key exchange and RSA encryption. The governments of the U.S and Britain were not pleased. The very same algorithms that they had secretly developed just a few years prior had been rediscovered by a handful of stubborn researchers out of MIT---and it was all the worse that those researchers were committed to publishing their ideas for anyone to use. So began two decades' worth of increasingly inventive lawfare between the U.S and independent cryptography researchers, whose commitment to open-sourcing their ideas continually frustrated the government's attempts to monopolize the technology.

A quick look at any piece of modern software will tell you who won that fight. Cryptography underlies almost every legal application you can imagine, and just as many illegal ones---the modern internet, financial system, and drug market would be unrecognizable without it. The more compelling question is why the government lost. After all, they'd been able to maintain a near-monopoly on encryption for over thirty years prior to the late 70s. What made controlling the use and development of cryptographic technology so much more challenging in the 80s and 90s that the government was forced to give up on the prospect?

The simple answer is that it got much cheaper to do cryptographic research and run personal encryption. Before the 1970s, cryptography required either specialized hardware (the US Navy paid $50,000 per Bombe in 1943, or about $1 million today) or general-purpose mainframes costing millions of dollars, barriers which allowed the government to enforce a monopoly over distribution. As one of the few institutional actors capable of creating, testing, and running encryption techniques, organizations like the NSA could control the level of information security major companies and individuals had access to. As the personal computer revolution took off, however, so too did the ability of smaller research teams to develop new algorithms and of individuals to test them personally.

Bombe | Code Breaking, History, Design, & Facts | Britannica — A mechanical Bombe, prototyped by Alan Turing's team at Bletchley park to help decrypt the German Enigma cipher.

Despite the algorithm behind RSA being open sourced in the late 70s, for instance, it wasn't until the early 90s that consumers had access to enough personal computing power to actually run the algorithm---a fact which almost bankrupted the company developing it commercially, RSA Security. But as the power of computer hardware kept doubling, it became cheap, and then trivial, for computers to quickly perform the necessary calculations. As new algorithms like PGP and AES were created to take advantage of this windfall of processing power, and as the internet allowed algorithmic secrets to easily evade military export controls, the government's ability to enforce non-proliferation crumbled completely by the turn of the millenium.

This is remembered as a victory for proponents of freedom and personal privacy. And it was, but only because cryptography proved to be a broadly defense-dominant technology: one that secured institutions and citizens from attack rather than enabling new forms of aggression. The government monopoly over the technology was unjustified because it was withholding protection for the sake of increasing its own influence.

Had cryptography been an offense-dominant technology, however, this would be a story of an incredible national security failure instead of a libertarian triumph. Imagine an alternative world where as personal computing power kept growing, the ability to break encryption began to outpace efforts to make it stronger. The financial system, government secrets, and personal privacy would be under constant threat of attack, with cryptographic protections becoming more and more vulnerable every year. In this world, the government would be entirely justified in trying to control the distribution of the algorithmic secrets behind cryptanalysis, and would have been tragically, not heroically, undermined by researchers recklessly open sourcing their insights and the growth in personal computing power.

This is an essay about AI, not cryptography. But the technologies are remarkably similar. Like cryptography, AI systems are software based technologies with huge strategic implications. Like cryptography, AI systems are expensive to design but trivial to copy. Like cryptography, AI capabilities that were once gated by price become more accessible as computing power got cheaper. And like cryptography, the combination of AI's commercial value, ideological proponents of open-sourcing, and the borderless nature of the internet makes export controls and government monopolies difficult to maintain over time. The only difference is one of outcome: cryptography, a technology used to enhance our collective security and privacy, and AI, a dual-use tool that has as many applications for the design of weapons of mass destruction as it does for medical, economic, and scientific progress.

Just as the Japanese population collapse provided the world with an early warning of the developed demographic crisis decades before it happened, the proliferation of cryptography gives us a glimpse into the future challenges of trying to control the spread of offensive AI technology. Whether AI follows the same path depends on whether its cost of development will continue to fall, and whether we have the foresight to preempt the proliferation of the most dangerous dual-use models before it becomes irreversible.

The previous article in this series described how AI systems could become strategically relevant by enabling the production of cheap yet powerful weapons. An AI model capable of expertly assisting with gain of function research, for example, could make it much easier for non-state actors to develop lethal bioweapons, while a general artificial superintelligence (ASI) could provide the state that controls it with a scalable army of digital workers and unmatched strategic dominance over its non ASI competitors.

One hope for controlling the distribution of these offensive capabilities is that the AI systems that enable them will remain extremely expensive to produce. Just as the high cost of nuclear enrichment has allowed a handful of nuclear states to (mostly) monopolize production, the high cost of AI development could be used to restrict proliferation through means like export controls on compute.

Unfortunately, the cost to acquire a powerful AI system will probably not remain high. In practice, algorithmic improvements and the ease of transferring software will put pressure on enforcement controls, expanding the range of actors that become capable of building or acquiring AI models.

Specifically, there are two major problems:

First, the cost to build an AI system is falling. Once a frontier benchmark it met, it gets cheaper and cheaper for each successive generation to reach that same level of performance. As a result, formerly expensive offensive capabilities will quickly become cheaper to acquire.
Second, the cost to take an AI system is extraordinarily low compared to other weapons technologies. AI models are ultimately just software files, which makes them uniquely vulnerable to theft.

Because of these dynamics, proliferation of strategically relevant AI systems is the default outcome. The goal of this article is to look at how these costs are falling, in order to lay the groundwork for future work on the strategic implications of distributed AI systems and policy solutions to avoid proliferation of offensive capabilities.

Cost of Fixed Capabilities

The first concern, and the most detrimental for long-term global stability, is that the cost to build a powerful AI system will collapse in price. As these systems become widely available for any actor with the modest compute budget required to train them, their associated weapons capabilities will follow, leading to an explosion of weapons proliferation. These offensive capabilities would diffuse strategic power into the hands of rogue governments and non-state actors, empowering them to, at best, raise the stakes of mutually assured destruction, and at worst, end the world through the intentional or accidental release of powerful superweapons like mirror life or misaligned artificial superintelligences.

Empirically, we can already see a similar price dynamic in the performance and development of contemporary AI models.^[1] While it's becoming increasingly expensive to build new frontier models (as a consequence of scaling hardware for training runs), the cost to train an AI capable of a given, or "fixed" level of capability is steadily decreasing.

Image credit to Scharre (2024). Even as the cost to train new frontier model goes up, the cost to match what used to be the frontier quickly goes down. GPT-4 cost an estimated $100 million to train when it was released in March of 2023: 8 months later, Inflection-2 had it matched at just $12 million. By January 2025, you could fine-tune a model better than GPT-4 for less than $500.

The primary driver of this effect is improvements to algorithmic efficiency, which reduce the amount of computation (or compute) that AI models need during training. This has two distinct but complementary effects on AI development.

First, all of your existing compute becomes more valuable. Because your training process is now more compute efficient, any leftover compute can be reinvested into increasing the size or the duration of the model's training run, which naturally pushes up performance.^[2] Despite having the same number of GPUs to start with, you have more "effective" compute relative to the previous training runs, which lets you acquire new capabilities that were previously bounded by scale.
1. The transition from LSTMs to transformer architectures, for instance, made it massively more efficient to train large models. LSTMs process text sequentially, moving through sentences one word at a time, with each step depending on the previous one.You might own thousands of powerful processors, but the sequential nature of the architecture meant that most of them wait around underutilized while the algorithm processes each word in order.^[3]
  Transformers changed this by introducing attention mechanisms that could process all positions in a sequence simultaneously. Instead of reading "The cat sat on the mat" one word at a time, transformers could analyze relationships between all six words in parallel.^[4] This meant that research labs with fixed GPU budgets could suddenly train much larger and more capable models than before, simply because they were no longer bottlenecked by sequential processing. Even at their unoptimized introduction in 2017, transformers were so much more efficient that likely increased effective compute by more than sevenfold compared to the previous best architectures.
Second, it becomes cheaper for anyone to train a model to a previously available, or fixed, level of performance. Since the price floor for performance is lower, capabilities that were previously only accessible with large compute investments become widely distributed.
1. At the time GPT-4o was released (May 2024), the prevailing sentiment of American policymakers and tech writers was that the U.S was comfortably ahead of AI competition with China, given the U.S's massive lead in compute and export controls on high-end chips. By December, however, Chinese competitor Deepseek had leapt to match the performance of OpenAIs newest reasoning models with an infamously small training run of $5.6 million. Most of these cost-savings are the result of techniques like mixture of experts, which allow models to mimic the performance of expensive generalist AI systems with a much smaller amount of computation per token. This lets you get the benefits of a very large model (such as generality and context) while only using a small amount of your overall compute for any given problem.^[5]

The upshot of this dynamic for weapons proliferation is that dangerous capabilities will initially be concentrated among actors with the largest compute budgets. From there, however, formerly frontier capabilities will quickly collapse in price, allowing rogue actors to cheaply access them.

One of the most salient capability concerns for future AI systems is their ability to contribute to the development of biological weapons. As I pointed out in a previous piece, rogue actors who sought to acquire biological weapons in the past have often been frustrated not by a lack of resources, but by a lack of understanding of the weapons they were working with. Aum Shinrikyo may have invested millions of dollars into mass production of anthrax, but were foiled by simply failing to realize that anthrax cultivated from a vaccine strain would be harmless to humans.^[6]

The production of future bioweapons, especially a virulent pandemic, is likewise constrained by the limited supply of expert advice. Virologists already know how to make bioweapons: which organisms are best to weaponize, which abilities would be most dangerous, how to engineer new abilities, optimal strategies for dispersal, or which regulatory gaps could be exploited. But because so few of these experts have the motive to contribute to their development, non-state actors are forced to stumble over otherwise obvious technical barriers.

To help model the cost dynamics we described earlier, a good place to start would be with an AI that can substitute for this intellectual labor. How much might it cost to train an AI capable of giving expert level scientific advice, and how long would it take before it starts to become widely accessible for non-frontier actors to do the same?

While I provide a much more detailed account of how these costs can be calculated below, the basic principle is that expanding the amount of compute used to train a model can (inefficiently) increase the model's final performance. By using scaling laws to predict how much compute is required for a given level of performance, you can set a soft ceiling on the amount of investment it would require for a given capability. Barnett and Besiroglu (2023), for example, estimate that you could train an AI that would be capable of matching human scientific reasoning with 10^35 FLOPs of compute, or the equivalent of training a version of ChatGPT-4 at roughly ten billion times the size.^[7] The result of this training process would be an AI that can provide professional human advice across all scientific disciplines, a subset of which are the skills relevant to the development of biological weapons.

Concretely, we can imagine these skills being tailored to the cultivation of an infectious disease like avian influenza (bird flu). For example, the AI's advice might include circumventing screening for suspicious orders by self-synthesizing the disease. Just as polio was recreated using only its publicly available genome in the early 2000s, influenza could be acquired without ever needing an original sample. From there, the virus could be bootstrapped with gain of function techniques, making it dramatically more infectious and lethal.^[8] With some basic strategy in spreading the resulting disease over a wide geographic area, it would be possible to trigger an uncontrollable pandemic. Depending on the level of lethality and rate of spread (both of which could be engineered to be optimally high), normal response systems like quarantines and vaccine production could be completely overwhelmed.^[9]

At ten billion times the size of GPT-4, such an AI would be prohibitively expensive to train today. But with even conservative increases in algorithmic efficiency and AI hardware improvements, the cost of getting enough effective compute will rapidly decline. When compared to the growing financial resources of the major AI companies, a frontier lab could afford the single-run budget to train our AI scientist by the early 2030s.^[10] By the end of the next decade, the cost to train a comparable system will likely collapse into the single digit millions.

Calculation Context

This graph was built using the median approaches and assumptions outlined in The Direct Approach. All I did was normalize the results to the price of a fixed amount of effective compute, in order to better illustrate how accessible the tech might become. The longer explanation below is just to give more context to some of the important assumptions, as well as to highlight some of the ways in which those assumptions might be conservative. Details on the specific formulas used can be found in the appendix here.

To begin with: how do you measure how much compute it would take to train an AI to give reliable scientific advice when no one's done it before?

One way is to measure distinguishability: if your AI produces outputs that aren't discernably different from those of an expert human, then for all intents and purposes, it's just as good at completing the relevant tasks (even if the internal reasoning is very different).

For example, you might compare a scientific paper written by a human biologist with one written by an AI. The worse the AI is at biology than its human counterpart, the easier it would be for an external verifier to tell which paper it's responsible for: maybe its writing is unprofessional, it makes factual errors, or its data gets presented deceptively. Conversely, the closer the AI is in performance, the harder it gets to tell them apart---the verifier needs to examine the papers more and more deeply for evidence to distinguish them. Once the outputs are equal, no amount of evidence can tell them apart, so you can conclude that the skills of both are equal.

In other words, the verifier needing lots of evidence -> higher likelihood of equal skill.

Currently, AIs like ChatGPT 5 cannot reliably pass this test. However, there are still two potential paths to making them smart enough to do so in the future.

The first path would be for an AI architecture that learns more efficiently than transformers. One major difference between "training" the performance of a human compared to an AI is that people are vastly more sample efficient. While it might take reading hundreds of articles for a human to begin writing ones of comparable quality, it might take the AI millions of examples before it even produces something legible. This inefficiency is very taxing on the AI's training budget, because it requires you to find orders of magnitude more data and parameters to sort through all of that extra information. If we had an algorithm that better mimicked the human learning process, we could train the AI in the same hundred or so examples it takes a normal researcher and save on all that compute. But this is hard! It's much easier (intellectually) to make small optimizations to existing architectures and training setups than it is to find entirely new architectures that are fundamentally more efficient.
The alternative is simple but expensive. We know from model scaling laws that you can logarithmically improve performance by putting in more compute with each iteration. Although each new order of magnitude of compute is subject to diminishing returns, it's possible to scale the underlying transformer algorithm to the point that it can predict tokens at a very high level of precision over large contexts. These relationships are described in scaling laws like the ones below, where N and D are reflections of the amount of compute you add.

{\displaystyle L={\frac {A}{N^{\alpha }}}+{\frac {B}{D^{\beta }}}+L_{0}}

Because this second method is so straightforward, it lets us approximate how much "evidence" the judge needs to decide between the human and the AI. You can calculate a "loss" (L in the equation above) for a given amount of compute, and measuring how close it is to the theoretical irreducible loss of the true distribution. By finding a model size where the amount of evidence you need to be confident that they are different begins to explode, we can assert that a model that big is very likely to be indistinguishable from human performance. By then graphing how this final number changes as you input more compute, you can plot the distribution of the results and estimate that 10^35 FLOPs is the most likely amount of compute you'd need to train an indistinguishable model.

For a concrete analogy, imagine that you were handed a biased coin with 90% odds of landing heads. How many times would you need to flip this coin to be 90% sure that it wasn't actually a regular old 50/50 coin? The answer is 9 times: if more than 7 come up heads, you can be pretty confident your coin is weighted. But what if it's a smaller amount of bias, like 60% heads? All of a sudden, you need to flip the coin 168 times just to be 90% sure that you're being cheated. What about a bias of 55%? At that point you'd need to sit there and flip it over 650 times. 51% heads? By then you'd need to spend days on end flipping it, tracking the results for over 16,000 attempts before you can be confident in your guess.

The pattern here is straightforward: the closer the biased coin is to a real coin, the more flips you have to do. But the reverse is also true: the more flips you have to do to check, the more likely it is that the bias of your coin is small. At an absurdly high number of flips, the bias is so minimal that you can, for all practical uses, substitute it for a real one.

All this model does is fancier coin flipping: figuring out how close your AI is to a human scientist by the amount of tokens it takes to tell their two papers apart.

From there, it's just a question of estimating how much it would cost to train an AI using that much compute, and then plotting the decline of that cost over time.

Specifically, we're interested in the price performance of a GPU (the number of FLOPs/$), so that we can get a dollar value for the amount of hardware it takes to get 10^35 FLOPs at any given point in time. This price performance has two components: the efficiency of the hardware itself, and the amount of "effective" compute that is being added by algorithmic improvements.

In order to calculate the amount of compute a given GPU provides for you over time, you start with the FLOPs/GPU in 2023, and then scale this figure up by applying trends in hardware performance over time (basically, how many extra FLOPs a given GPU produces per year). You then multiply this number by how long the GPU can realistically run for (how many total FLOPs you'll get out of each one), and divide by the cost of the GPU you used for the baseline 2023 figure (in this case, $5000).

This hardware performance is further supplemented by improvements to algorithmic efficiency. These efficiency gains have roughly tripled every year since 2014, meaning that the same amount of money is effectively buying three times the compute (hence "effective" compute). This number is penalized by a domain transfer multiplier (of about 2/3rds), to compensate for the fact that investments in some areas of AI research do not generalize into others. For instance, improvements to AI image generation don't necessarily help the efficiency of language models (although most of the current investment is in optimizing LLMs, so the penalty is pretty small).

The effect of all these considerations is that a dollar buys you about 3 times as much effective compute each year, although this begins to slow down as you run into physical limits on hardware and the low hanging fruit of algorithmic improvements dry up. This is why the graph starts to taper off in 2040, because you've run into atomic limitations on the size of GPU internals and diminishing returns for algorithmic improvements (gains to price performance that respectively cap out at about 250x and 10,000x each).

This example was highlighted because it presents a dangerous AI capability that is both plausibly near term and simple to achieve---a powerful weapon that can be cheaply created by just scaling up the size of existing language models. If language models alone have the potential to make the production of biological or cyber WMDs cheap in a matter of years, then we should begin taking the idea of AI development as being the domain of national security seriously.

It's important to note, however, that cheap bioweapon assistants are not an exceptional case. Because compute scaling is such a fundamental part of how all AI models are trained, any advancements in efficiency will make all past capabilities retroactively more accessible, whether those capabilities involve spreadsheet logistics or the engineering of lethal autonomous weapons systems, biosphere-destroying mirror life bacteria, or artificial superintelligences.

Even with concerted efforts towards making sure that frontier labs behave responsibly, the natural consequence of AIs becoming more efficient to train are that increasingly dangerous capabilities will become more widely distributed.

Model Theft

In the previous section, we discussed the issue of building powerful models---namely, that it continues to get cheaper to do. Although governments may rightly want to stop rogue actors from training their own bioweapons expert or misaligned ASI, the constant decrease in cost will make it increasingly difficult to detect and deter the development of strategically relevant AI systems.

Part of what makes the proliferation problem so difficult is the way that lower prices invite theft. As more and more actors become capable of building powerful AI systems, the number of actors that are vulnerable to theft, reckless, or ideologically committed to open-sourcing their models grows in turn. After all, building your own AI model is only necessary if it's impossible to use the one someone else built for you. Once the development of offensive AI capablities shifts from being something only a major government can afford to a company-level project, the number of possible targets and the difficulty of defending them will explode.

We can subdivide this problem into four major challenges. Models, being software products, are easy to steal and hard to secure. Because of their economic and military value, many competent actors will be motivated to steal them. Once a model is stolen, there will be no way to recover the original or deny the attacker from making copies. Finally, the more independent actors there are training dual-use AI systems, the more potential targets will exist.

AI models are expensive to produce but cheap to copy - Almost all of the expense required to use an AI model comes from the process of developing, not distributing it. The output of billions of dollars in AI hardware, electrical infrastructure, and technical talent is a file that you can fit on a high-end thumb drive. Since this file can be endlessly copied and remain exactly as effective, all that an attacker needs to do to succeed is to copy those weights and transfer them to an external server. This problem is further exacerbated by the fact that many of your employees need to have access to the model weights for legitimate research, that the weights are necessarily decrypted during use (such as when they are loaded onto the GPU during inference), and that much of your software infrastructure is connected to the internet.
These realities give the AI development process a very large attack surface, or number of ways a model could be stolen. You can take a direct approach by attacking the software stack directly, looking for vulnerabilities that let you run unauthorized code or bypass the access system. You can steal the credentials of someone who has legitimate access through social engineering or by cracking their passwords. You can attack the supply chain, stealing information from or compromising the software of third party vendors. At higher levels of sophistication, you can start employing human agents by bribing/extorting employees or getting a spy hired into an position with legitimate access. These agents can be used to spy directly or to covertly smuggle hardware, such as by plugging in drives loaded with malware or installing surveillance equipment.
AI also has some unique vulnerabilities. The first is that the AI stack is both new and highly concentrated: many of the software tools involved are untested against serious efforts to compromise them and have many dependencies.^[11] The second challenge is that the AIs themselves are agents with permissions, who can be tricked or manipulated into helping access their weights. As their intelligence and control over internal software development grows, so too does the value of compromising these AI developers.^[12]
There are strong economic and strategic incentives to steal models - Most powerful AI systems will have skills with both civilian and military applications. Our example human-level biologist is extremely commercially valuable, since its expertise can be used to help automate the discovery of life saving drugs and push the frontier of medicine. But on the flip side, many of the same skills that make it an effective researcher (a deep understanding of diseases, the immune system, genetic engineering, etc) make it well suited to help design and engineer biological weapons. Since these systems have both economic and strategic value, model developers will have to be secure against a wide range of potential threats, including their competitors, criminal groups, and nation-state actors.
The economic motivations for theft are the most straightforward: as AI becomes increasingly good at substituting for human labor, it will become increasingly financially valuable. The first company to develop the tools to fully automate software engineering, for instance, will be sitting on an AI model worth hundreds of billions of dollars in labor savings alone.^[13] Their competitors are in a rough position: Although the price to acquire those same capabilities will eventually come down, you might have to wait years before your lab can afford enough compute to train an equivalent model, at which point the leading player may have already locked in their market share. To avoid having to either match their frontier spending or absorb a multi-year penalty, it might be worth stealing a competitor's model.^[14] The high value of these projects, combined with the relative ease of extraction, also makes them attractive to ordinary criminal groups. As we've seen with crypto exchanges in the recent past, the combination of an incredibly valuable software asset and a lack of institutional security can prove irresistible to thieves.
The largest challenge, however, involves securing future AI projects against nation-state actors. Because access to powerful AI systems will likely be pivotal for future strategic relevance (given their ability to design powerful weapons), states will likely go to great effort to sabotage and steal the leading AI projects from their competitors.^[15] These cyber operations would be on an entirely different level of sophistication compared to ordinary cyberattacks, given the advantages states enjoy in resources, access to intelligence services, and effective international legal immunity. Even from the little that has been revealed publicly, nation-states have proved themselves capable of exploits as advanced as taking control of an iOS device with just a phone number, gaining full system access to every computer on the same network with a single compromised machine, or remotely destroying power plants by repeatedly activating circuit breakers. If these resources were concentrated on an AI project with only commercial security, it's almost certain that they could be easily compromised.
There is no way to reverse a model leak - It is incredibly hard to take information off the internet, even with the resources of a major government. We know this because there have been decades long efforts to monitor and enforce bans on illegal online activity---most notably, the online drug market, the sale of computer exploits/malware, and CSAM---that have repeatedly proved themselves unsucessful.
The issue is mostly architectural. Because internet services are widely distributed across many jurisdictions and protected by encryption, there are too many communication channels to monitor and limited ways to identify the end users. These features have helped make the modern internet commercially resilient and promoted intellectual freedom, even in countries where the internet is actively censored by the state or private interests. Those same characteristics, however, also make it extremely difficult for the government to exercise legitimate control over illegal content. Because that content can be quickly copied and distributed across foreign servers faster than the government can react, the primary strategy for dealing with illegal markets involves targeting major hubs for distribution and attempting to arrest ringleaders. While these strategies might serve as effective scare tactics, they don't have the ability to actually get rid of the illegal content itself. How could they? All of the actual products are stored locally across the globe, safe behind layers of encryption, anonymity, and jurisdiction.^[16]
Even the most sophisticated actors have no means of recovery. When the NSA's zero-day for Microsoft Windows was stolen by hackers in March of 2017, the group responsible quickly attempted to sell it online, and later open-sourced the vulnerability to the public. Even with a month of advance warning to assist Microsoft with developing a patch, there was nothing the NSA could do to stop state and criminal groups from operationalizing the exploit themselves in the aftermath of the leak. The largest of these cyberattacks came just four months later, when Russian hacking groups used the exploit to indiscriminately target Ukranian internet infrastructure, causing over $10 billion worth of damage.^[17] If a powerful AI model gets stolen, it's likely to follow a similar pattern: first sold online through illegal markets, eventually spreading to the public once it passes through one too many hands, and then finally getting deployed maliciously on a large scale.
All of these problems become more difficult to solve the cheaper models are to train - These challenges are severe enough as they are. Variations of them plague organizations as diverse as startups and government hacking groups today, leaving commercially or nat-sec critical software at constant risk of theft. Even if the development of powerful AI systems were concentrated into a single airgapped and government-secured project, there would still be substantial challenges in securing them, particularly against highly competent state operations in countries like North Korea, Russia, and China (the SL5 standard for model security).
Even that enormous effort, however, will be undermined by the consistent decrease in the training costs for powerful models. The more distributed training becomes, and the more people have access to models capable of designing cheap weapons of mass destruction, the easier it will be for rogue actors to steal natsec relevant capabilities. "Move fast and break things" is not a security conscious approach, and we should be wary about allowing unsecured private actors to train models with strong dual-use capabilities. And though many of these companies might want to set stringent security standards (even if only to protect their IP), they simply don't have the relevant expertise or resources to adequately protect themselves. What experience does OpenAI have in airgapping its datacenters? How can their leadership prepare for the cyber capabilities of foreign states when they don't have the intelligence services to predict them? Could they be privately motivated to trade speed for security, when a lead of a few months might end up deciding who wins the market?
The answer is that OpenAI would not be capable of reaching this standard on its own, even if it had the best possible intentions. Security of this scale is a state level problem, and there's only so much state capacity to go around for the growing number of actors capable of training powerful models.

Given these vulnerabilities, we can easily imagine how an AI company could be compromised by a lack of government attention and recklessness.

Suppose it's 2035, and a startup has just raised $110 million in VC funding to train a general AI biologist, per our earlier example. They plan to use it to help with biological research for drug discovery. Even granting that the federal government has passed laws requiring high-end infosecurity for powerful dual-use models by now, there are simply too many of these startups to audit them consistently. Although our hypothetical startup is law abiding, it has neither the same resources or infosec expertise as a professional government project. Perhaps it sets aside a budget to hire security consultants, assigns mandatory IT training to its employees, and leans on the federal government to help screen their backgrounds. The company's leadership, however, still sees itself as an economic effort instead of a strategic one, and doesn't want to delay it's research agenda for too long: more secure plans like switching over from a cloud provider to a personal, airgapped datacenter could take months, and would be huge investment for nebulous returns. Feeling pressured to keep up with its competitors, the startup decides to train the model anyways, hoping its existing security is good enough.

The security is not good enough. The combination of an extremely valuable product and the ease of stealing a software file attracts the attention of many foreign hacking groups, who begin probing the company's defenses. After a few weeks, an executive's security permissions get stolen through a spearphishing campaign, giving the thieves access to the model weights.^[18] The AI model is covertly sent abroad to a foreign server, after which the group responsible promptly sells it off. The government quickly becomes aware of the theft, but there's little they can do to actually take the weights back---legal action and policework are simply too slow to stop backups from being copied and transferred. The state ramps up their takedowns of darknet malware markets, but the model continues to circulate through peer-to-peer connections despite the government's best efforts. Over the next few months the model repeatedly exchanges hands online, finding new customers each time. Eventually, one of the increasingly large number of customers decides to leak the weights publicly, making it accessible to run locally for anyone with a few high-end consumer GPUs.^[19]

Although the government tries furiously to scrub public mention of the weights off the internet, too many people have gotten access to ever fully eliminate it. Some of these people spread it further because they're absolutists about technological freedom, others share it precisely because it's the government trying to regulate it, and some just want to impress their colleagues with their access to a dangerous and illicit toy.^[20] The world teeters constantly on the brink of disaster, waiting for the model to finally fall into the hands of someone who intends to use it maliciously.

Opportunities for Control

Taken together, the dynamics we've sketched out so far seem to make model proliferation impossible to stop. Any attempt to secure model weights or to regulate frontier developers will be constantly undercut by the decline in training costs, which both creates new opportunities for theft and enables rogue actors to train powerful models directly. Even if frontier labs can be coerced into behaving responsibly, the government won't be able to control or deter every new actor that becomes able to develop dangerous capabilities.

There are, however, still opportunities for control. Because performance improvements will be mostly concentrated in leading developers, and because those same developers are the main recipients of efficiency improvements, there will be a window in time where dangerous capabilities are apparent but gated by price. This window can be further extended limiting the distribution of efficiency gains outside of these large players. Depending on the severity of those restrictions, the window can become arbitrarily large.

Fortunately, the same process that allows for the decline in training costs also leaves room for intervention. As we mentioned in the first section on fixed capabilities, improvements to algorithmic efficiency have two contrasting effects. The first is the one this report has spent most of its time focusing on: the fact that algormithic improvements make it cheaper to train models. If it used to take 1000 high-end GPUs to train an AI with some dangerous capability X, but a new algorithm comes along that lets you do it with just 100, then many actors who were previously priced out can now train a model that does X themselves.

The second effect, however, is that those same algorithmic efficiency improvements make existing GPUs more valuable. If a new algorithm is 10x as efficient as the previous state of the art, any actors with extra compute can reinvest their assets into training more powerful models. Our actor with a 1000 GPUs now has a sudden surplus of 900, which can either be used directly for the same training run (such as by training a new model 10x the size) or for compute-intensive experiments. Although a smaller actor might benefit from more access to existing capabilities, bigger investors instead get to access new capabilities by using their existing capacity to improve performance.

Figure credit to Pilz et al. (2024). Even though every actor benefits from the increasing effective power of their hardware over time, the effect is largest for the actors who already have the most physical compute.

The main implication of this fact is that the actors with the most physical compute are the likeliest to discover powerful dual-use capabilities before anyone else. As a result, frontier labs are likely to have (temporary) natural monopolies on the first strategically relevant AI models, during which they will be the only actors well-resourced enough to train them. This leaves a window where it's possible to understand whether frontier capabilities are offense dominant, and how severe government restrictions might need to be if they are.

The high level decision making for proliferation dual-use technology is straightfoward. Cheap superweapons like mirror life and misaligned artifical superintelligences are the clearest examples of scalable harm, and must be at least partially restricted. Credit to Hendrycks et al. (2025)

How long this natural monopoly ends up lasting (and how wide the associated period for governance is) is a function of how fast the price-performance of AI training continues to improve. If the price declines quickly enough, nothing the state does to regulate the frontier actors will matter in the long term: another small actor will eventually develop the same capability, and then potentially deploy it maliciously. Our earlier bioweapons-assistant, for example, was estimated to cost $6 billion to train in 2031. Since this investment is so massive, state capacity can focus entirely on the handful of actors that can absorb that cost.^[21] By 2040, however, the cost of a similar project ends up at a measly $7 million, well past the point where the government can effectively secure or deter it.

This is still an improvement over the default situation of no oversight. If the first frontier lab can at least be secured against theft, for example, the high costs of model development will still give us a few years of nonproliferation before similar models start being widely developed. But that's clearly not a complete victory: ideally, we'd both be securing the first actors to develop new capabilities and slowing down, then halting, the decline in price for dangerous capabilities.

With permanent intervention, the cost of accessing a dangerous capability is prevented from ever declining enough that a rogue actor could afford to access it.

Thankfully, the decline in AI training costs is not an automatic process. Its main enablers---the constant improvements in algorithmic and hardware efficiency---are the result of localized research that then gets distributed across the AI ecosystem. Your hardware price performance will not improve unless you can actually buy the next generation of Nvidia GPUs. Improvements in algorithmic efficiency only happen when companies like Google research and publish optimizations like transformers and GQA for others to use. By concentrating where these improvements are allowed to spread, you can limit the pace at which AI models become cheaper to train across the industry and abroad.

Where these improvements are located, how large they are, and how they get distributed is an important subject for future research (and will receive a more detailed look in an upcoming article in this series on policy recommendations). Even without these details, however, there are still some clear high-level options to extend the size of the intervention window: some of which are already being implemented today. The diffusion of algorithmic innovations out of frontier labs, for instance, has slowed to crawl---gone are the days where companies like OpenAI will even publish a parameter count for their new models, let alone a major architectural insight like the transformer.^[22] Outside of these economic incentives, we've also seen regulation used to directly slow unwanted AI progress. China's struggles with matching the scale and quality of western AI hardware, for instance, can be largely attributed to the increasingly strict export controls the PRC has been placed under since 2022.

While partially effective, the measures so far are necessarily temporary. Preventing China from buying GPUs from Western allies is only going to make a difference in the time it takes for China to develop its own domestic AI supply chain; likewise, preventing frontier companies from sharing their ideas only works up until the point that researchers in other labs come up with parallel solutions.^[23] Any permanent solution to the problem of declining costs for offensive capabilities can't just be about withholding your own technology: it has to involve some kind of active enforcement against the other actors.^[24]

This feature makes permanent interventions much harder to design---by nature, they need to be large in scope and to have ways to intervene when an actor doesn't cooperate. The nuclear nonproliferation treaty only functions because its members are willing and capable of bombing the enrichment facilities of those who don't want to play by the rules. The challenge of designing these permanent solutions is about making sure that there are incentives for powerful actors to cooperate, as well as which enforcement mechanisms have the fewest tradeoffs with things we ideally want to keep, like personal privacy and the beneficial applications of dual-use AI systems.

A major component of the next two articles in this series will be figuring out which of these permanent solutions fit within the bounds of those restrictions. For instance, any proposal which involves the U.S unilaterally agreeing not to build superintelligent models is probably off the table. Proposals that allow the U.S to enforce restrictions on other countries, however, might be more promising. A Sino-U.S coalition on the nonproliferation of superintelligence to non-members, for example, could a) be practically implemented through measures like monopolizing the AI hardware supply chain and b) would be incentive compatible for both countries, on the grounds that no one wants terrorists to have WMDS and that the spread of ASI systems would threaten their mutual hegemony.

Closing Thoughts

Future AI systems are going to allow for the cheap development of powerful superweapons. Because of the potential for easy pandemics, autonomous drone swarms, cheap misaligned superintelligences, and other massively impactful weapons, the proliferation of powerful enough AI models threatens to enable rogue actors to threaten whole countries, or in some cases, the world itself. Likewise, the same AI systems capable of developing those superweapons will, without our intervention, eventually become widely accessible through either a decline in training costs or plain theft. Considering the history of similar technologies like cryptography, it's apparent that controlling the spread of dual-use AI systems will be significantly harder than with nuclear weapons, even though those same AI models may end up having just as much, if not more, of a strategic impact.

On the other hand, history should inspire us as well: humanity did actually rise to the challenge of nuclear weapons. In the 80 years since the U.S first used them to intimidate imperial Japan, not a single nuke has ever been deployed in anger. Even when those same governance mechanisms got tested by their cheaper and more destructive cousins, genetically engineered bioweapons, our institutions prevailed regardless. That success wasn't without luck, and definitely not without effort, but it was success all the same. AI-derived superweapons will just be another challenge in the same line of technology, if leaner and meaner than the rest of their family.

Perhaps even more importantly, our history with dual-use technologies has shown us that nonproliferation doesn't mean we have to curtail the good, even when we secure against the bad. The applications of nuclear fission never ended at the bomb: it took just a year to start making radioactive isotopes for cancer treatment after Hiroshima was turned to ash, and only five more to open the first nuclear power plant. Would it be a better world if we had thrown nuclear restrictions to the wind? If we'd said let anyone build a bomb, if it meant the power plants would arrive in 1948 instead of 1951? Even the U.S and the Soviets were able to agree on the answers to those questions.

Dual-use AI technology will have incredible potential for uplifting humanity in every aspect of life. Cures for the worst diseases, a redefinition of work, and massive material abundance are well within reach, if only we can restrain ourselves from using the most dangerous tools it will offer us. All we have to do to capitalize on that potential is to make the same sensible choice we've always made: to first make sure that the state can enforce the nonproliferation of offense dominant technology, and then hand free rein to the public to make of its benefits as they please.

The next article in this series will look at the strategic implications of powerful AI systems. In particular, it will discuss why AI-derived superweapons are likely to be offense-dominant even with defensive innovation, the limits of states and their ability to defend themselves, and what this might mean for the relative standing of the U.S and China, both to each other and the rest of the world.

^{^}
A fact that can be observed, for instance, in how open source models routinely trail the performance of frontier models within a year. This trend has even accelerated recently, with open-source models now just barely three months behind their closed competitors. Because models are becoming cheaper to train to a fixed level of performance over time (ie, making a model just as good as GPT-4 at math gets cheaper to do), it's possible for companies with substantially less compute investment to stay close to the state of the art.

If this were not the case, then we'd expect to see performance mostly monopolized by the richest companies. If it still took 100 million dollars to get GPT-4 peformance, you'd see the market dominated by the handful of companies with the resources to spend 9 figures on a single training run. In reality, we saw an explosion of comparable models over the course of 2024 once training costs declined.
^{^}
For example, imagine that you have a compute budget of 1 billion FLOPs. With Algorithm A, training a model to achieve 70% accuracy on some task costs your full budget---1 billion FLOPs. But then your researchers develop Algorithm B, which achieves that same 70% accuracy using only 100 million FLOPs. Now you can take your original 1 billion FLOP budget and train a model that's 10x larger, or train for 10x longer, or explore 10x more architectural variations. Functionally, you have 10x as much compute as you started with, even without any investment into additional hardware. This extra compute then lets you explore a larger search space of possible weights, making it more likely that the final performance of your AI model is higher.
^{^}
Imagine you have 100 workers assembling a product, but your assembly process requires each step to be completed before the next can begin. Even though you have 100 workers available, 99 of them stand idle at any given moment while one person completes their task. Using an LSTM to process language similarly forced most of your GPUs to idle while it worked on the original sequence.
^{^}
Going back to the factory analogy, you suddenly have an assembly process where all 100 workers are able to work their own lines in parallel, rather than wait on the output of everyone else.
^{^}
In practice, this looks like DeepSeek v3 using just 5% of its full parameters to analyze each token. If it notices the data it's reading through looks a lot like math, for example, it'll delegate that processing to a math "expert", which is the small fraction of its overall parameters that focused on math during training. Since the model is so large, you can fit lots of these individual experts and call them in whenever they're needed.
Your brain does something very similar with muscle memory: rather than have you constantly conciously check whether your hands are safe, your brain will automatically use your muscle memory when it notices "HOT!" and you need to pull your hands away from the stovetop.
^{^}
Technical analysis of Aum's misstep available here. The primary issue was that they had used anthrax from a veterinary vaccine, which is intentionally handicapped by removing a crucial gene that allows it to multiply.
^{^}
The assumption being that if a model can write a scientific manuscript that's indistinguishable from those of a human expert, then it is just as good as a human at the relevant scientific skills needed to write one (in this case, human-expert level bioweapons assistance).

In practice, you can probably achieve human-expert performance in scientific research well before this number. 10^35 is an upper bound estimate generated by predicting how distinguishable the model's outputs of a certain length (like a scientific paper) would be from papers written by humans, given only increases in the amount of compute that a transformer is trained on. In reality, however, there are going to be algorithms that can learn to write a high quality scientific paper without needing to be shown billions of examples. After all, human scientists don't need to read billions of papers in order to write one---our brain's learning "algorithm" is clearly many orders of magnitude more data efficient.
^{^}
In fact, the early 2010s saw multiple research teams do exactly this: edit bird flu in order to make it airborne, able to be transmitted through just a cough or sneeze. Although these researchers took great care to make sure that the disease would not spread by weakening the virus preemptively, there's little reason to expect terrorists or other rogue actors to show the same restraint.

The controversial history of these projects and the government reaction to them is chronicled here.
^{^}
After all, if even Covid-19 (a virus with a sub-1% fatality rate, which was spread largely by accident) managed to almost collapse healthcare services and required a year of intensive investment to begin producing, let alone distribute, a working vaccine, it's clear that an intentional bioweapon would be existentially dangerous.
^{^}
Or even earlier, if government investment is poured into the project.
^{^}
For example, most AI training runs involve the use of Nvidia GPUs and a proprietary software, CUDA, that allows the GPUs to be used efficiently for training. If you can compromise the CUDA driver, you could effectively take control of the GPUs it's interfacing with, using them to write arbitrary code, disable monitoring software, and getting direct access to model weights as they get loaded into memory.
Unfortunately, there's no easy replacement for this, because there's no easy replacement for Nvidia and their level of vertical integration. The only solutions are to make CUDA more airtight, and to add additional layers of deterrence around it.
^{^}
Today, AI systems are not smart or reliable enough to be entrusted with such permissions. But they're still vulnerable to unusual attacks like prompt injections and model distillation, which manipulate the model's outputs to either write executable code or to infer internal information about its weights.
^{^}
An expectation of value which can be observed in the valuation of the major AI companies and their suppliers like Nvidia, which appear increasingly predicated on the ability to automate major parts of the economy. Automating software engineering would reduce direct labor costs by over $168 billion in the U.S alone, which doesn't even account for its international value or the potential to increase productivity in non-tech sectors. It also undercounts the potentially astronomical value of accelerating the pace of AI research and developing superintelligent models before your competitors: tools which would not only replace humans, but qualitatively surpass them in every domain.
^{^}
While labs haven't (yet) been caught outright stealing a competitor's weights, we've still seen examples of "soft" theft between the AI labs. One particularly prominent case was the training of Deepseek's V3 and R1 models, which were trained by distilling synthetic data from ChatGPT-4. This method allowed Deepseek to rapidly catch up to OpenAI's performance, without investing in the same technical research. Although legal, OpenAI has since moved to block its competitors from using its model to train their own, placing limits on API use.
^{^}
Similar cyber operations have already played an important role in nuclear non proliferation efforts, most notably in the sabotage of Iran's nuclear enrichment program through Stuxnet, a program designed to subtly destroy centrifuge equipment. This virus used multiple zero-days for Microsoft Windows, was covertly installed on local hardware using human agents, and partially routed through the centrifuge supply chain, all so covertly that it took over five years for the bug to get discovered. While the U.S and Israel never officially took credit for the program, no ordinary criminal group has the motive or resources to carry out such a complicated attack.
^{^}
As seen, for example, in the FBI takedown of the Silk Road in 2013 and the arrest of its founder, Ross Ulbricht. But while the government might have been able to punish him in particular, it did little to disrupt the actual flow of online drug sales, which merely shifted to new marketplaces like Agora. Because there's no easy way to capture every individual supplier, the same content and products will quickly resurface as sellers look for new customers.
^{^}
For perspective, Ukraine's GDP was $112 billion at the time. Some of the most damaging targets included disabling the radiation monitoring system at Chernobyl, attacking major state banks, and corrupting air traffic controls.
^{^}
Even major defense contractors like Boeing and Lockheed Martin get subjected to opportunistic cyberattacks---companies which are, by law, required to have strong info-security measures in place. And these companies are a best case scenario: veteran institutions with a history of practicing information security, with direct support from the government's military and intelligence services. Our hypothetical AI startup on the other hand, might end up about as well defended as organizations like crypto exchanges, which are infamously rife with cybersecurity challenges and theft.
^{^}
In fact, we've already seen AI models themselves get leaked in a similar way. Back in 2023, Meta's plan for their LLaMA model was to hand a license to verified researchers, making sure that while academics could have the model to run experiments on, it wouldn't be open-sourced to the public until they decided it was safe. Within a week, it was put up for anyone to download on 4Chan.
^{^}
While it's tempting to think that no one is really like this, some people are willing to leak military secrets on Discord to win an argument over whether a mobile game's tank rounds are realistic enough. Some people are dumb. Some are easy to bribe. And some are just convinced that no matter how threatening a piece of technology might be to national security, government restrictions of any sort are an even greater risk.
^{^}
When IBM developed new cryptographic tools in the 60s and 70s, for instance, the government was able to limit their distribution to important sectors like the military and commercial banking. As one of the only organizations with enough computing infrastructure to test and implement new models, they could get the brunt of the government's national security attention.
^{^}
While it's difficult to estimate how much of an effect this is having on non-frontier progress today, it's likely to have an enormous impact in the future. Once frontier AI capabilities reach the point that they can semi- or fully-autonomously conduct AI R&D research, we're likely to see the frontier labs experience an explosion of algorithmic efficiency gains. In comparison, the non-frontier labs that are behind this breakpoint will still be relying on humans to do most of the work, leaving them subjective years behind.
^{^}
Analogously, we can think about how it would have been impossible to keep the mechanics behind nuclear bombs secret for very long, even if the U.S had never pursued the project (and subsequently gotten the idea stolen by Soviet spies during the Manhattan Project). While it might've been Leo Szilard who first came up with the idea of a fission chain reaction, the key insight was obvious enough that someone else would inevitably stumble upon it.
Szilard himself was humble enough to realize that "someone else" probably included scientists in Nazi Germany: hence why he advocated that President Roosevelt begin a national project to build the bomb first, before the U.S could lose its strategic advantage.
^{^}
This is the main limitation of centering your nonproliferation approach around infosecurity and export controls. What use is it to stop people from stealing your model if they can just build their own instead? Sure, it buys you time---but that time is meaningless unless you use it to actually implement a long term solution.

LESSWRONG
LW