Random things...
16 is (roughly) 5 percent of 300, not 0.5 percent.
Bioweapons are lousy weapons, because they aren't effectively targetable. That may be a big part of your Great Filter.
Something just being infectious (or deadly) doesn't make it a good bioweapon. In fact, even if you have smallpox, you don't have a bioweapon yet. You still have to package and deliver it. And if you want full-on apocalypse cred, you also have to make it a lot worse than the background of naturally occurring pathogens.
This is why none of the people smart enough to build and deploy one in 2025 have a good reason to, including nation states. If the number of people who can do it goes up by 3 OOMs, you might get someone crazy enough to build one despite these facts.
Your cost and "employer" filters haven't changed. Although it's probably not hard to create a good enough "business" to pass the employer one.
That's true, but as they say in the army: three is two, two is one, and one is none! I'm not moving to New Zealand yet, but one of the three major reasons I wasn't worried may no longer apply.
What happens when you trigger the 20 to 40 percent detection threshold? Do you just get told no and get to try again, with no other consequence? What if you trigger it 10 times?
If the filter assigns a 40% chance that it's viral, you (probably) just don't trigger any filter. This is the Bayesian probability assigned by a filtering model, not the probability that a binary filter gets triggered.
I was slightly worried about sociohazards of posting this here, but the cat's already out of the bag on this one: the authors of the paper put it up on bioRxiv and got an article written about them in Nature, as well as getting into a bunch of newspapers and all over Twitter. I expect that the marginal audience of LessWrong is much more heavily enriched in people who'll use this information to do good than to do evil.
TL;DR:
If anyone reading this works in Biosecurity: nucleotide synthesis companies should start using protein structure annotation as part of their screening protocol for genes, if they aren't already.
The Fermi Paradox is the question of why we don't see any evidence of aliens spreading out across the galaxy, despite there being loads of planets on which intelligent life could evolve.
The Germy Paradox is the question of why we don't see any bioweapons spreading through our population, despite there being eight billion people with internet access who could (as its proponents argue) just download the sequence to smallpox and get the genome printed.
For a while, I had an easy answer to the latter: the great filter of biotech being really fucking annoying to work in.
My background is in bio/nanotech. At one point in my PhD, I was trying to order one single gene for one (kinda) toxic protein. I had a series of conversations like this:
Me: Can you sell me a gene?
Gene Vendor (imagine them as a mysterious traveller standing in front of a glowing flask): Why of course!
Me: OK, I would like [Gene Sequence]
GV: Ohohohoh! I smell S. aureus sequence! This appears to be a pathogenic gene! I cannot sell this to you!
Me: Right, but it's missing the receptor-binding domain, so it can't actually do anything.
GV: That doesn't matter.
Me: And it's not sufficient to turn a harmless E. coli strain into a pathogenic one.
GV: Doesn't matter.
Me: And they make it in Oxford all the time.
GV: Doesn't matter!
Me: And you can buy the protein that it produces off the internet.
GV: Doesn't. matter.
Me: ...
This repeated with a few different vendors until I just emailed the guy in Oxford who makes the gene, and he sent me the whole plasmid (gene + scaffolding) in the post.
On a separate occasion, I had to order an (again, not very harmful) gene in two parts and attempt to stitch them together. Despite working in what was supposed to be a competent biolab, we never managed to stitch the parts together and get them into a plasmid.
From this, I decided that AIxBiorisks were basically a non-threat. Pathogen genes are really very difficult to get hold of; the difficulties in assembling them would make it totally impossible for some random schmuck to build a bioweapon based on ChatGPT's instructions; anyone capable of building such a weapon would be capable of getting a job in one of the insane laboratories where they make bioweapons for fun (normally called gain-of-function research) and smuggling the virus out. This filters down the number of potential attackers from ~8 billion to maybe several thousand.[1]
It's also quite expensive. Even the cheapest synthesis will run around $0.10/bp, which means for a 30 kb genome you'd need about $3k, potentially more. And most places don't just send DNA to random schmucks, so you'd need to steal $3k from your employer.
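Spelled out, the back-of-envelope version (taking that ~$0.10/bp figure at face value):

30,000 bp × $0.10/bp ≈ $3,000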
And remember those automatic screening programs? You're going to trip a lot of those if you try and order smallpox sequences online.
So I felt like the Germy Paradox had a reasonable explanation, built from the great filters of competence, convenience, and cost. Most would-be terrorists just aren't willing to spend 4 years in university, and 4 more on a PhD, only to order some sequences and get immediately flagged by an automated filter, or by their university's requisitions department.
Then this paper came out:
This is the kind of thing which reminds me that, as a species, we're not even really trying to survive AI.
So what did they do?
They started with two genome language models. These are autoregressive models in the style of GPT, but for DNA sequences. They fine-tuned them on a set of bacteriophage genomes, and got them to generate full sequences. These bacteriophages have small genomes, about 4-6 kb, which probably made the task relatively easy compared to generating a human pathogen. They synthesized 300 of these genomes, and found that 16 of them (around 5%!) were functional phages, capable of infecting E. coli.
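To make "a GPT for DNA" concrete, here's a deliberately toy sketch of the same generation loop: an autoregressive sampler over the nucleotide alphabet. The bigram "model" and the fake training string are stand-ins I made up for illustration; the real genome language models learn far longer-range structure with deep networks, but the sample-the-next-token loop has the same shape.

```python
import random
from collections import Counter, defaultdict

# Fake "training data": a short made-up sequence, repeated. A real model
# would be trained on millions of bases of genuine phage genomes.
training_seq = "ATGGCGTACGTTAGCATGCCGATCGATTACGGCTAA" * 50

# "Train" a bigram model: count which base tends to follow which.
counts = defaultdict(Counter)
for a, b in zip(training_seq, training_seq[1:]):
    counts[a][b] += 1

def sample_next(prev: str) -> str:
    """Sample the next base in proportion to how often it followed `prev`."""
    options = counts[prev]
    r = random.uniform(0, sum(options.values()))
    for base, c in options.items():
        r -= c
        if r <= 0:
            return base
    return base  # guard against floating-point edge cases

def generate(length: int, start: str = "A") -> str:
    """Autoregressive generation: each base is conditioned on the previous one."""
    seq = [start]
    for _ in range(length - 1):
        seq.append(sample_next(seq[-1]))
    return "".join(seq)

print(generate(60))  # e.g. "ATGCCGATCGATTACGGC..."
```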
So are their models just acting as """stochastic parrots""", repeating back or slightly remixing existing genomes? There are two pieces of evidence against this.
Point 1 sounds scary, but point 2 is actually much worse. Remember those automated genome classifiers from my escapades earlier? The ones that stopped me from ordering pathogen sequences online? Those wouldn't have worked here.[2]
If this method was used to generate human pathogens, it would allow a bad actor to order functional viral sequences without tripping any safeguards. This is really bad.
This would be incredibly cool research if it had been done by a shadowy secretive corporation, under lock and key, with security clearance for everyone involved. Or they could have at least had the decency to publish it in one of those slimy academic journals which charges a $200 fee to read it. As it is, they put it on bioRxiv, for anyone to read for free. Great.
So why did I get this wrong, and what can we do next?
To my credit, I wasn't that wrong about this. But perhaps that makes it less excusable that I stumbled at the final hurdle, rather than being tripped up by total ignorance.
Previous discussions about AIxBiorisks usually went like this:
Interlocutor: [New AI] can meaningfully help people build a bioweapon!
Me: How?
I: It can give useful instructions on setting up a lab and carrying out experiments.
Me: But how would they get the genes?
I: Order them online!
Me: I think that's impossible, since (despite what you hear) it's actually really quite hard to order pathogenic genomes online.
I: Hmmm
Me: Sure, if an AI can design a totally new pathogen from scratch, it would easily get around every filter, but this capability is at the level of Powerful AI where the concern is no longer misuse. Humans are nowhere close to building whole pathogens ex nihilo, so this is clearly in the "10+ years ahead of human tech" level where the AI just kills you. Nobody gets to "use" an AI that smart.
And uhhh, I guess I was wrong on that one. It is possible to build a narrow AI which can generate a whole novel genome, without it killing you. What should humanity do, and what will I do?
Narrow AI like this is well within the abilities of humans to defend against, if we're not stupid. Ironically, the same models which are capable of generating these genomes are the ideal technology for filtering genomes.
The Evo models are capable of "functional annotation", which means looking at a genome and figuring out what each part does. Instead of looking for pathogenic DNA sequences, synthesis companies could set up filters which annotate the sequence, then classify those annotations. If several sections are annotated as "coronavirus spike protein", "coronavirus replication protein", and "coronavirus coat protein", then you've probably got something harmful on your hands, and you should check more thoroughly. This is all reported in the paper as a method. So we get to the update for the rest of the world:
If anyone reading this works in Biosecurity: nucleotide synthesis companies should start using protein structure annotation as part of their screening protocol for genes, if they aren't already.[3]
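To gesture at what that could look like, here's a minimal sketch of an annotation-based screen. This is my own illustration, not the paper's pipeline: the `Annotation` record, the `flag_order` helper, and the tiny watchlist are all hypothetical stand-ins for a real annotation model and a curated ontology of pathogen-associated functions.

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    """One predicted gene function, as an annotation model might emit."""
    start: int         # position of the ORF in the submitted sequence
    end: int
    label: str         # predicted function, e.g. "coronavirus spike protein"
    confidence: float  # model confidence, in [0, 1]

# Illustrative watchlist; a production screen would use a curated,
# far larger ontology of pathogen-associated functions.
PATHOGEN_LABELS = {
    "coronavirus spike protein",
    "coronavirus replication protein",
    "coronavirus coat protein",
}

def flag_order(annotations: list[Annotation],
               min_hits: int = 2,
               min_confidence: float = 0.5) -> bool:
    """Escalate an order for human review if several ORFs are annotated
    with pathogen-associated functions."""
    hits = [a for a in annotations
            if a.label in PATHOGEN_LABELS and a.confidence >= min_confidence]
    return len(hits) >= min_hits

# Example order: two confident pathogen-associated hits -> escalate.
order = [
    Annotation(0, 3800, "coronavirus spike protein", 0.91),
    Annotation(3900, 6700, "coronavirus replication protein", 0.84),
    Annotation(6800, 7900, "ribosomal protein", 0.97),
]
assert flag_order(order)
```

The thresholds are obviously placeholders; the design point is that the trigger is co-occurrence of predicted functions, not nucleotide-level similarity, so a genome that's divergent at the sequence level can still be caught.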
What am I going to do with this information? It's a bit scary on the front of "AI is much scarier for biorisks than I thought" but in some ways it's positive. It seems like the frontier of what's accessible to narrow, bio-focused AI is bigger than I previously thought.[4] This makes me quite a bit more positive on the potential value of using AI for tasks like human intelligence augmentation.
[1] GPT-5 estimates approximately 1-10k GoF researchers worldwide. Claude refused to answer.
[2] In practice, there aren't restrictions on ordering bacteriophage genes online, because phages aren't harmful to humans.
[3] GPT-5 told me they generally don't. I didn't even try asking Claude for this one. This really is the sort of thing that AIs shouldn't be telling random members of the public, but whatever I guess.
[4] This might generalize to other domains, but I'm not sure. In an ideal world, I'd keep being surprised at what narrow, non-killing-everyone AIs can do, until alignment is solved by a narrow philosopher-mathematician-programmer AI, which I currently consider to be extremely unlikely.