AI will make biological extinction risks worse before it makes them better

MichaelDickens

An argument goes: If we don't build aligned artificial superintelligence, we risk driving ourselves extinct for some other reason. We should rush to build ASI quickly, in spite of the risks—the longer we wait, the more vulnerable we are to extinction from a different cause.

Other than ASI, the biggest extinction risk is synthetic biology. Some lab could (accidentally or on purpose) develop a highly transmissible, 100% fatal super-plague that wipes out humanity.

An aligned ASI could stop that from happening by shutting down dangerous biological research, or by developing advanced countermeasures that stop the spread of deadly infections. So the argument goes: We need to build ASI to save us from non-AI extinction risks.

However, that argument doesn't work. In the near term, AI will make biological risks worse, not better. AI will accelerate scientific research, which will bring us closer to the level of knowledge necessary to build extinction-level pathogens. And in the long term, the way ASI eliminates biological x-risk is by taking control of the world.

Cross-posted from my website.

In the near term, AI makes biorisk worse

Some people imagine that AI models would accelerate defensive research while refusing to assist with developing bioweapons. This plan has two minor issues and one fatal one.

The first minor issue: Current AI model refusals are not robust, and people who want to get information out of them can find workarounds. It's very hard for AI developers to patch every hole, whereas jailbreakers only need to find one.

The second minor issue: Even if the leading AI developer makes their model safe and un-jailbreakable, at least one of their competitors will probably fail at that task.

The fatal issue: It's not just about what AI assistants can do for humans. It's that AI accelerates the rate of scientific progress. As the state of knowledge improves for humanity in general, it becomes possible for humanity to develop existentially risky pathogens, even if AI does not assist directly. It seems impossible to advance biological science while surgically preserving ignorance on just those bits of knowledge that are required to engineer pathogens.

AI might refuse to participate in gain-of-function research, and that would be better than not refusing. But suppose I'm an evil scientist and I want to develop a 100% lethal airborne pathogen. Here in the year 2026, I can't do it. Even if I'm on the cutting edge of medicine and biology, I still won't be able to create the "extinction pathogen", because that would require a level of scientific understanding that humanity simply hasn't achieved. If AI advances science in general, it will push me closer to my evil goal of killing everyone with bioweapons.

There is the question of "offense-defense balance": is it easier to develop deadly pathogens, or easier to protect people against pathogens? That question matters in many contexts, but it's not relevant here. At our current level of scientific understanding, we have ~zero ability to develop extinction-level bioweapons. If our understanding becomes sufficiently advanced, then that ability will move from zero to nonzero, regardless of the offense-defense balance.

Leaving AI out of the picture, humanity will probably have the knowledge necessary to make extinction-level pathogens within the next hundred years. If AI causes a hundred years of progress in the next decade, then the evil scientist will be able to engineer their extinction pathogen by 2036, thanks to AI—even if the AI itself doesn't directly participate in the creation of the pathogen.

By 2036, assuming AI hasn't killed us yet, biorisk will be higher than in the alternative 2036 where AI capabilities stopped improving. Would 2036-biorisk-with-AI be higher than 2126-biorisk-without-AI? Maybe not—maybe AI scientists would be safer than human scientists per unit of research effort. But at minimum, AI-accelerated science is more dangerous per unit of time. AI acceleration means the high-risk period starts sooner, and it means we have less time. Less time to identify risks, less time for policy-makers to respond, less time to consider what direction we should go in. Speedrunning through a century of progress in a decade makes it much harder to manage the risks as they come.

AI can't control scientific progress unless it controls everything

The only way to accelerate scientific progress in biology without increasing x-risk is for AI to have complete control over scientific capabilities—basically, it has to be impossible for any humans to use their increasingly-advanced knowledge of biology to develop bioweapons. I don't see how to do that unless all science is being done by AI, with humans not participating anymore.

Many people have a vision of the future in which humans will coexist with advanced AI, and we will remain in control of the steering wheel. But if humanity is in control, how can AI prevent us from developing powerful bioweapons? We can't have it both ways.

One might say, "Governments will have to prevent terrorists and mad scientists from developing bioweapons." To which I say, indeed they should do that. But AI makes governments' jobs harder on that front, not easier, unless AI has a totalitarian grip on society, at which point we're back to the scenario where humans lose control over the future.

Another attempt at escaping the dilemma: Let the government control AI, and AI control everyone else. Even in the world where the government is democratically elected, that world is starting to sound like an extreme version of Bad Definitions Of "Democracy" Shade Into Totalitarianism, in which your life is fully controlled by AI, and the only time when you get any say in the matter is at the voting booth. ^[1] I can imagine much worse outcomes than that, but it's not what I would describe as a happy ending.

Low biorisk trades off against high AI takeover risk

AI increases biorisk until it's powerful enough to completely shut down any danger. Therefore, the way to minimize AI-driven biological x-risk is to have a very short window of time between "AI is smart enough to accelerate biological research" and "superintelligent AI controls everything". But if that window is short, then we have little time to solve the alignment problem, and little time to steer AI while we are still in control of the future. AI-enhanced biorisk is lowest in the worlds where AI takeover risk is highest.

People with relatively low credence in AI takeover risk tend to expect a slow takeoff. But in a slow takeoff, AI makes biorisk worse well before it's smart enough to robustly prevent extinction-level pandemics.

Accelerating AI development is not a good way to reduce biorisk

We don't currently know how to build bioweapons that kill everyone, and eventually we will know how to do that. ^[2] Much like how, in 1900, there was no risk of nuclear winter because we didn't yet know how to build nuclear weapons.

Scientific progress brings prosperity, but it can also enable dangerous new technologies. General biology research might even be harmful on balance due to increasing extinction risk—I don't have a well-informed view on whether that's true. What I can say is that the following argument does not hold up:

We need to accelerate AI progress so that it can save us from biological extinction risks.

Consider the neighboring argument, "we need to accelerate AI progress to create medical advancements." That argument is failing to do basic cost-benefit analysis (the risk of extinction is not outweighed by short-term improvements in medicine), but at least it's true that AI could, indeed, improve the state of medicine. "We should accelerate AI to reduce biological x-risk" isn't even clearly correct about the upside. ^[3]

This is yet another illustration of the fact that we don't know what "aligned AI" means

In the (possibly brief) window where AI is smart enough to do scientific research but doesn't yet control the whole world, AI increases biological x-risk by improving humanity's knowledge of how to develop powerful bioweapons. After that window, what happens? If we're in a world where ASI is powerful enough to reduce extinction risk to zero, what does that world look like, and what should it look like? I find it difficult to imagine what sort of radical transformations to civilization would be necessary to achieve a total elimination of x-risk.

Some people imagine a future where everyone owns their own galaxy. How can we make meaningful claims about x-risk when the future looks that weird? If I can own a galaxy (whatever that means), maybe some other person can deconstruct a handful of planets to build an army of 100% deadly super-nanoviruses and send them throughout the universe at 99.9999% the speed of light so that they kill everyone before anyone even sees them coming. Or something.

Many people have an intuition that aligned ASI will fix everything and the world will be great. But if we succeed at figuring out how to get ASI to do what we want, how do we then specify its behavior such that we get a good outcome? Some people hand-wave the problem away by saying "the ASI will be smart, it will help us figure out what to tell it to do." Much like alignment bootstrapping, this answer has a chicken-and-egg problem: how can the ASI figure out what you should tell it to do if you haven't yet told it how to determine what you should tell it to do?

(If an "assistant ASI" comes to you with some answer, and it's far smarter than you, how can you judge whether its answer is correct?)

The biorisk case is an example of the general problem that we don't know how to specify how an ASI should behave. Others have discussed this problem in more general terms, including:

A Conflict Between AI Alignment and Philosophical Competence by Wei Dai (2025)
Intent alignment seems incoherent by Joe Rogero (2025)
Many individual CEVs are probably quite bad by Viliam (2025)

The concerns with biological x-risk are a specific illustration of the general problem. How, exactly, do you build an AI that prevents humans from killing each other with bioweapons, but without making things horrible as a side effect?

^[1] To be clear, I do not believe this scenario is at all likely. I'm using it as a hypothetical way of escaping the dilemma, to illustrate that even this "solution" still isn't something we want.
^[2] Unless AI kills us first.
^[3] This brings to mind an important (but off-topic) question: if scientific advancement increases existential risk, but it's also essential to improve standards of living, how should we proceed? We don't have an answer for that question yet, but whatever we come up with, I imagine it would be fair to summarize as: "We proceed carefully." As we learn more about what sorts of advancements are dangerous, we can implement mitigations.

If AI rapidly accelerates progress—even assuming AI itself doesn't kill everyone—then it will be difficult to implement mitigations as we go, because the time gap between "top scientists foresee a dangerous technology on the horizon" and "anyone can develop this technology in their garage" will become much shorter.

(Another possibility is that humanity doesn't solve the problem of how to advance science without introducing new x-risks. Instead, we solve AI alignment, and then AI solves every other problem.)

At our current level of scientific understanding, we have ~zero ability to develop extinction-level bioweapons.

This isn't true, but publishing a detailed breakdown of why it isn't true makes the FBI jumpy. Men with guns, skillfully deployed, are a pretty successful countermeasure.

AI acceleration means the high-risk period starts sooner, and it means we have less time. Less time to identify risks, less time for policy-makers to respond, less time to consider what direction we should go in.

This assumes that AI accelerates bio research but not any of the machinery for managing bio research.

On some level all xrisk arguments have this form:

The advancement rate of bad_thing must be faster than the rate of progress in good_thing.
Therefore, bad_thing will kill us all unless we globally prohibit every _thing.

This is an extraordinary claim and we should require evidence for it.

Let's take cybersecurity as an example. Mythos is, by good attestation from many parties, a world-changing cyberweapon capable of either exploiting or fixing many latent security problems. However, with even a small lead time you can simply patch every security problem that Mythos can find. Provided we continue to see a responsibly phased rollout, good_thing has, in fact, happened sooner than bad_thing. Wonderful success.

So to carry this case you have to argue that this can't be true of unknown future events, which seems essentially impossible. Among other things: Everyone involved in model release processes assesses for these sorts of risks before release, by agreements that have been in place for years now. What makes you think these processes must break down and this goes completely wrong?

Men with guns, skillfully deployed, are a pretty successful countermeasure.

Yes. I think publicly available gain-of-function research implies we ("humanity") have substantially nonzero capability to engineer a world-ending pathogen. If I had to guess, this capability already exists for states with a relatively modest (e.g. 100 million dollar) budget and has for decades, and the reason it hasn't happened is that there is no state with sufficient capital that wants the world to end in an indiscriminate way. The pitch for bioweapons has always been killing your enemies cheaply in a plausibly deniable way, not destroying human civilization. This means that the biggest concern is the amount of uplift that an LLM provides to nonstate actors like terrorist groups, who may feel it is actually in their interest to end the world.

So to carry this case you have to argue that this can’t be true of unknown future events, which seems essentially impossible. Among other things: Everyone involved in model release processes assesses for these sorts of risks before release, by agreements that have been in place for years now.

I don't think this is an impossible argument to make. I argued in Varieties of Doom that we can more or less know from first principles that defense wins cybersecurity (at least the parts that aren't about tricking humans, such as phishing), but that we don't know this for biosecurity, and in fact what we do know isn't encouraging. I think the exclusive focus on pathogens is a little myopic, and that the proper threat model takes into account both pathogens and various forms of ecological terrorism like the release of mirror life. This implies that the attack surface we have to defend is basically the entire ecosystem of life on earth, and that the countermeasures, if they exist, will probably look a lot like Drexler's active shield.

But, speaking about the situation right at this moment, I think our biosecurity posture is a lot worse than our cybersecurity posture. COVID-19 should have been a reality check for the whole world that even a moderately lethal bioengineered pandemic could be a huge risk to our way of life, and that we need to make major investments in defense and consider radical new public health measures like active air filtering and active monitoring of sewage for new pathogens. For the most part this hasn't happened. Instead, the American electorate thoroughly rejected the pandemic as "overblown" and forced Trump, who spearheaded the fast-track approval of the COVID-19 vaccines, to repudiate his own actions even as they saved thousands of lives. America is now clearly in a worse position to defend itself from a pandemic than it was before COVID-19, with trust in public health authorities shredded and budgets correspondingly slashed. If the maximally lethal and maximally viral pathogen appeared right now, with a long incubation that made it basically unstoppable, it's not clear to me our civilization would have any effective response to that.

This having been said, my biggest problem with this entire genre of argument is that it strikes me as an isolated demand for rigor. The pitch for this threat model is almost never "I am a biosecurity expert and I need more time to develop countermeasures A, B, and C, at which point we'll be safe"; it's almost always "I am an ex-MIRI agent foundations guy desperately clinging to anything I can to get AI banned now that my preferred threat model is considered publicly disgraced", and this at the same time that Mythos is doing out-of-distribution vulnerability research of the type we were assured models could not be kept corrigible while doing under current alignment methods. I am supposed to politely pretend that this idea, that we need to shut down AI over its externality of accelerating science, which has the externality of increasing biorisk, is about a principled concern over biorisk, as opposed to being one of the author's soldiers in their war of attrition against anyone willing to point out that the paperclipper is not a particularly likely outcome.

A world-ending pathogen by its very definition doesn't have any use beyond, well, destroying the world when your favorite state or terrorist group is on the verge of collapse or due to some other circumstances spiritually similar to a state's nuclear doctrine. In order to become actually useful for a group of humans in other ways, the bioweapon would have to either be easy to treat or let people create a vaccine and spread it.

As for the isolated demand for rigor, I doubt that this is an undeserved demand for rigor.

when your favorite state or terrorist group is on the verge of collapse

Some terrorist groups such as Aum Shinrikyo explicitly want(ed) to end the world. There don't have to be other circumstances involved; a lot of the reason that synthetic bio capabilities diffusing to smaller and smaller nonstate actors is concerning is that some of them already want to end the world.

I doubt that this is an undeserved demand for rigor.

Define "deserve". The Internet has already done a lot to make it easier for someone to make a potential world-ending pathogen. Imagining the pipeline from start to finish:

It's made it easier to be radicalized into wanting to do such a thing in the first place. The number of people being exposed to ideas like negative utilitarianism has almost certainly gone way up compared to their profile in the print universe. This may even be the Internet's largest effect on the risk.
It's made it easier to go down an obsessive rabbit hole about this subject specifically. Back in the print and TV era, if you saw something on TV, say a documentary on biological weapons programs, the TV would generally shift to another subject after the show was over. There wasn't a button you could instantly press to be supplied with a more advanced documentary about biological weapons production, and then another one about selective breeding, and then another one about lab procedure. Everything had a lot more friction. You might see the documentary and go "Huh, I wonder if I could do that?" and get distracted before you could even follow up with more detailed information. Libraries and booksellers didn't always have what you wanted, especially about niche subjects like this, so you'd probably have to go down to a university library (which you, as a relatively non-academic person, might not even know to do) and figure out how to phrase your questions to a human librarian so as not to draw suspicion. You would have to interact with multiple people during the process, which might remind you that, hey, you're currently trying to kill this nice old lady who's helping you with your books; are you sure you want to do that?
It makes it vastly easier to find a second person who shares your interests. Someone is a lot more likely to stick with an antisocial interest like this if they can find other people in their environment to validate their desires and help them. Humans are generally not built to do long solitary endeavors, and it's a lot easier to give up when you get frustrated if the only one who has to know you've failed is you. Not to mention that you might have underground forums or chatrooms where people can share tips and help each other get unstuck on problems they encounter. In almost any other scientific context this would be a benefit of the Internet, but here it's a negative because it helps people do something extremely antisocial.
Only after we have already massively increased the risk of encountering a motivated, intellectually encouraged, and socially connected actor do we get to the actual assistance that the Internet might provide with the lab work. Being able to order relevant equipment over the Internet from sites like eBay or the Silk Road is already a huge help, since it reduces the number of people the wannabe terrorist has to interact with who might report them to the authorities. The Internet has made it possible to access information about all kinds of lab processes that would have previously been very difficult to come by without a university education. It further provides help in the form of potential peers and mentors who can be convinced to help by not telling them the full truth about what the actor is doing. Because these peers and mentors don't live close by to the actor, they're less likely to present complications in the form of wanting to see their lab setup or walk around in their workspace. The time and amount of travel necessary to get access to relevant documents like published gain of function research is also vastly reduced by the Internet.

It's difficult to quantify exactly how much easier the Internet makes malicious gain of function research (especially without going into details and considerations I think would probably be unwise on a public forum), but I think it's at least a quadrupling of the relevant risks? The Internet is of course also many other things and has many other benefits, so a post in 1995 arguing that we have to shut the Internet down right now, because it will accelerate science, gain of function research is part of science, and so we're all going to die if we don't, is... well, I can imagine an internally coherent longtermist troll case for it. But I think making this argument for AI commits you to a lot of corollary arguments that are never really explored or grappled with seriously, which lets me infer that the generator is "concerned about AI paperclipper and this argument makes a good soldier" rather than "obsessively worried about biosecurity and AI is a casualty of that worry".

The paradox that AI Safety has to somehow deal with is that "aligned to who?" extends to pretty much every policy decision nowadays. If ASI is free and open source, then a lot depends on whether that world is offense- or defense-dominant. If it's not, and it's controlled by a government, then how can we be sure that that government wouldn't do horrific things with that power^[1]?

Unfortunately, even the idea of pausing or regulating AI has hit this blocker. The intertwining of the "AI shouldn't offend me" and "ASI is dangerous" factions at various points has caused people to fear that a 'pause' looks more like regulatory capture than good-faith protection of the commons, and would ensure a less aligned AI. Whether or not this is a correct belief, it is a belief that has steadily grown over the past few years, and which will have to be addressed to make pausing politically workable.

^[1] Rotherham has hopefully put to rest the idea that "Western" governments are above facilitating awful crimes against the populations they govern.