It Is Untenable That Near-Future AI Scenario Models Like “AI 2027” Don't Include Open Source AI

[-]Daniel Kokotajlo7mo170

owerful guardrails in place) as miracle disease cures are arriving and robot factories are rampant and yet there is still no mention of what is happening with the open models of that day. How do we get through 2025, 2026 and 2027 with no super viruses? Or high-profile drone assassinations of political leaders?

The AI 2027 scenario predicts that no super viruses will happen in 2025-2027. This is because the open-weights AIs aren't good enough to do it all on their own during this period and while they could provide some uplift to humans there aren't that many human groups interested in building super viruses anyway.

A crux for me is that latter variable. If you could convince me that e.g. there are 10 human groups that would love to build super viruses but lack the technical know-how that LLMs could provide, (and that expertise was indeed the bottleneck -- that there haven't been any human groups over the last decade or two who had both the expertise and the motive) I'd become a lot more concerned.

As for drone assassinations: This has nothing to do with AI, especially with open-weights AI. The way to do a drone assassination is to fly the drone yourself like they do in Ukraine. Maybe if the security is good they have EW jammers but even then just do a fiber optic cable. Or maaaaybe you want to go for AI at that point -- but you won't be using LLMs, you'll be using tiny models that can fit on a drone and primarily recognize images.

[-]Charbel-Raphaël7mo60

biorisks is not the only risk.

Full ARA, might not be existential, but I might be a pain in the ass once we have full adaptation and superhuman cyber/persuasion abilities.

[-]Andrew Dickson7mo30

@Daniel Kokotajlo - thanks for taking the time to read this and for your thoughtful replies.

So to make sure I understand your perspective, it sounds you believe that open models will continue to be widely available and will continue to lag about a year behind the very best frontier models for the forseeable future. But that they will simply be so underwhelming compared to the very best closed models that nothing significant on the world stage will come from it by 2030 (the year your scenario model runs to), even with (presumably) millions of developers building on open models by that point? And that you have such a high confidence in this underwhelmingness that open models are simply not worth mentioning at all. Is that all correct?

The AI 2027 scenario predicts that no super viruses will happen in 2025-2027.

Okay. I don't buy this based on the model capability projections in your scenario. But even if we set aside 2025-2027 what about the years 2028 - 2030, which are by far the most exciting parts of your scenario? For example in Februrary 2028 of AI 2027, we have "Preliminary tests on Safer-3 find that it has terrifying capabilities. When asked to respond honestly with the most dangerous thing it could do, it offers plans for synthesizing and releasing a mirror life organism which would probably destroy the biosphere."

... which, based on a one year lag for open models, would mean that by February 2029 we have open models that are capable of offering plans for synthesizing and releasing mirror life to basically anyone in the world. (And presumably also able to allow almost anyone in the world to make a super virus with ease, since this is a much lower lift than creating mirror life).

Even setting aside synbio risks and other blackball risks and considering only loss of control (which you seem to take much more seriously than other AI risks), things still seem problematic for your account of things. Because even in 2026 and 2027, developers at OpenBrain and DeepCenter seem seriously concerned about a loss of control of their models in those years. But if we jump ahead just a year, then a loss of control of models with those same capabilities (from the year before) will be essentially guaranteed in open models, based on developers willing to run with few or no safeguards, or even AI owners intentionally giving over autonomy to to the AI. Can you please explain how a rogue AI with 2027 frontier capabilities is incredibly scary in 2027 and not even worthy of mention in 2028, or a rogue AI with 2028 frontier capabilities might be species-ending (per your "Race" branch) in 2028 and not scary at all in 2029 and 2030?

[-]Daniel Kokotajlo7mo62

We didn't talk about this much, but we did think about it a little bit. I'm not confident. But my take is that yeah, maybe in 2028 some minor lab somewhere releases an open-weights equivalent of the Feb 2027 model (this is not at all guaranteed btw, given what else is going on at the time, and given the obvious risks of doing so!) but at that point things are just moving very quickly. There's an army of superintelligences being deployed aggressively into the economy and military. Any terrorist group building a bioweapon using this open-weights model would probably be discovered and shut down, as the surveillance abilities of the army of superintelligences (especially once they get access to US intelligence community infrastructure and data) would be unprecedented. And even if some terrorist group did scrape together some mirror life stuff midway through 2028... it wouldn't even matter that much I think, because mirror life is no longer so threatening at that point. The army of superintelligences would know just what to do to stop it, and if somehow it's impossible to stop, they would know just what to do to minimize the damage and keep people safe as the biosphere gets wrecked.

Again, not confident in this. I encourage you to write a counter-scenario laying out your vision.

[-]Daniel Kokotajlo7mo169

This prediction is not holding up well so far and if anything the gap appears to be closing with Epoch AI estimating that open models are only behind the best closed models by about one year.

One year is already a long time in AI, but during the intelligence explosion it is so long that it means irrelevance.

[-]Andrew Dickson7mo10

One year is already a long time in AI, but during the intelligence explosion it is so long that it means irrelevance.

This may seem obvious to you, but it is not at all obvious to many AI researchers, myself included. Can you share references to any surveys of AI researchers or research papers formally arguing this claim? I have heard some researchers make this claim, but despite having read pretty widely in AI research I have not seen anything like a serious empirical or theoretical attempt to justify or defend it.

[-]Andrew Dickson7mo30

As a follow-on I would ask you: what is the mechanism for this "irrelevance" and why does it not appear in your scenario? In your scenario we are meant to be terrified of early 2028-frontier models going rogue, but by early 2029 (based on a one-year lag) models with those same capabilities would be in the hands of the general public and widely deployed (presumably many with no guardrails at all, or even overtly dangerous goals). And yet in your scenario there is no military first strike on the owners of these open models by OpenBrain, or again, even mention of these open models at all.

[-]Daniel Kokotajlo7mo41

We had very limited space. I think realistically in the Race ending they would be doing first strikes on various rivals, both terrorist groups and other companies, insofar as those groups seemed to be a real threat, which most of them wouldn't be and probably all of them wouldn't be. It didn't seem worth talking about.

[-]Daniel Kokotajlo7mo20

I think it depends on takeoff speeds? It seems a fairly natural consequence of the takeoff speed we describe in AI 2027, so I guess my citation would be the Research page of AI-2027.com. I don't have a survey of takeoff speeds opinions, sorry, but I wouldn't trust such a survey anyway since hardly anyone has thought seriously about the topic.

[-]Daniel Kokotajlo7mo152

I encourage you to make your own mini-scenario (a couple pages long) that is basically an alternate version of AI 2027, but with more realism-according-to-you. Like, pretend you wrote a 100-page scenario branching off from AI 2027 at some point, and then actually write the 3-page summary of it. (I'm suggesting 3 pages to make it low-effort for you, the more pages you are willing to write the better)

[-]kellysubscript7mo41

In reading AI2027 I couldn't help but to imagine an adjunct, dystopian storyline based on the scenario you provided. I'm a practicing psychotherapist and have already heard rumblings of my job being replaced by virtual therapists in the future. This is a terrifying prospect given the lesson in the swiftness and effectiveness of mind-control in citizens through politics over the course of the last 10 years. There is great good to come of AI in the future but we are on the cusp of releasing technology, akin to the nuclear weaponry of the 40's, on a planet that is already on the precipice of collapse. So much more to say but I think it can be summed up by saying, "garbage in, garbage out" and we have yet to sort out the differences.

[-]StanislavKrym7mo10

Talking about alternate realistic^[1] scenarios, Zvi mentioned in his post that "Nvidia even outright advocates that it should be allowed to sell to China openly, and no one in Washington seems to hold them accountable for this."

Were Washington to let NVIDIA sell chips to China, the latter would receive far more compute, which would likely end up in DeepCent's hands. Then the slowdown might cause the aligned AI created in OpenBrain to be weaker than the misaligned AI created in DeepCent. What would the two AIs do?

^{^}
I think that unrealistic scenarios like destruction of Taiwan and South Korea due to the nuclear war between India and Pakistan in May 2025 can also provide useful insights. For example, if we make the erroneous assumption that the total compute in the USA stops increasing and the compute in China increases linearly, while the AI takeoff potential per compute stays the same, then by May 2030 OpenBrain and DeepCent will have created misaligned AGIs and be unable to slow down and reassess.

[-]Random Developer7mo61

I fear your concerns are very real. I've spent a lot of time running experiments on the mid-sized Qwen3 models (32B, 30B A3B), and they are strongly competitive with frontier models up through gpt-4o-1120. The latter writes better and has more personality, but the former are more likely to pass your high school exams.

What happened here? Well, two things. First, the Alibaba Group is competent and knows what it's doing. But more importantly, it turned out that "reasoning" was surprisingly easy, and everyone cloned it within a few months, sometimes on budgets of less than $5,000. And a well-built reasoning model can be much stronger than GPT 4o on complex tasks.

As long as we relied on the Chinchilla scaling laws to improve frontier models, every frontier model cost far more than the last. This made AI possible to control, at least in theory. But Chinchilla scaling finally seems to be slowing, and further advancements will likely come from unexpected directions.

And some of those further advancements may turn out to be like reasoning, something that can be trained into models for $5,000. Or perhaps it will require fresh base models, but the underlying technique will be obvious enough that any serious lab can replicate it.

In other words: We need to consider the scenario where one good paper might make it obvious how to train a 200B parameter model into a weak AGI.

I think the only way we survive this is a global halt with teeth. Maybe Eliezer's book will convince some people. Maybe we'll get a nasty public scare that makes politicians freak out. I strongly suspect we will not be able to align an ASI any more than we can fly to the moon by flapping our arms.

[-]Mo Putera7mo30

Chinchilla scaling finally seems to be slowing

Interesting, any pointers to further reading?

[-]Random Developer7mo103

The idea that Chinchilla scaling might be slowing comes from the fact that we've seen a bunch of delays and disappointments in the next generation of frontier models.

GPT 4.5 was expensive and it got yanked. We're not hearing rumors about how amazing GPT 5 is. Grok 3 scaled up and saw some improvement, but nothing that gave it an overwhelming advantage. Gemini 2.5 is solid but not transformative.

Nearly all the gains we've seen recently come from reasoning, which is comparatively easy to train into models. For example, DeepScaleR is a 1.8B parameter local model that is hilariously awful at everything but high school math. But a $4,500 fine tune was enough to make it competitive with frontier models in that one area. Qwen3's small reasoning models are surprisingly strong. (Try feeding 32B or 30B A3B high school homework problems. Use Gemma3 to OCR worksheets and Qwen3 to solve them. You could just about take a scanner, a Python control script, and a printer, and build a 100% local automated homework machine.)

I've heard different kinds of speculation why Chinchilla scaling might be struggling:

Maybe we're running low on good training data?
Maybe the resulting models are too large to be affordable?
Maybe the training runs are so expensive that it's getting hard to run enough experiments to debug problems?
Maybe this stuff is just an S-curve, and it's finally starting to flatten? Most technological S-curves outside of machine learning do eventually slow.

LLM control is frequently analogized to nuclear non-proliferation. But from what various experts and semi-experts have told me, building fission weapons is actually pretty easy. In fact, most good university engineering departments could apparently do it. Simplified, low-yield designs are even easier. But what's harder to get in any quantity is enriched U-235 (or a substitute?). Most of the routes to enrichment are supposedly hard to hide. Because fissile material is somewhat easier to control, nuclear non-proliferation is possible.

Chinchilla scaling is similarly hard to hide. You need a big building full of a lot of expensive GPUs. If governments cared enough, they could find anyone relying on scaling laws to train the equivalent of GPT-5 or GPT-6. If you somehow got the US, China and Europe scared enough, you could shut down further scaling. If smaller countries defected, you could physically destroy data centers or their supporting power generation (just like countries sometimes threaten to do to uranium enrichment operations).

This is why "reasoning" models were such a nasty shock for me. They showed that relatively inexpensive RL could upgrade existing models with very real new capabilities and the ability to handle multi-step tasks more robustly.

Some estimates claim that training Grok 3 cost $3 billion or more. If AI non-proliferation means preventing $30 billion or $300 billion training runs, that's probably theoretically feasible (at least in a world where powerful people fear AGI badly enough). But if AI non-proliferation involves preventing $4,500 fine tunes by random researchers (like primitive "reasoning" apparently does), that's a much stickier situation.

So, if like Yudkowsky, you have a nasty suspicion that "If anyone builds this, everyone dies" (seriously, go preorder his book[1]), then we need to consider that AGI might arrive via another route than Chinchilla scaling. And in that case, non-proliferation might be much harder than joint US/China treaties. I don't have any good answers for this case. But I agree with OP that we need to include it as a branch in planning scenarios. And in those scenarios, mid-tier open weight models like Qwen are potentially significant, either as a base for fine-tuning in dangerous directions, or as evidence that some non-US labs making 32B parameter models are highly capable.

[1] https://www.lesswrong.com/posts/iNsy7MsbodCyNTwKs/eliezer-and-i-wrote-a-book-if-anyone-builds-it-everyone-dies

[-]Mitchell_Porter7mo30

You could say there are two conflicting scenarios here: superintelligent AI taking over the world, and open-source AI taking over daily life. In the works that you mention, superintelligence comes so quickly that AI mostly remains a service offered by a few big companies, and open-source AI is just somewhere in the background. In an extreme opposite scenario, superintelligence might take so long to arrive, that the human race gets completely replaced by human-level AI before superintelligent AI ever exists.

It would be healthy to have all kinds of combinations of these scenarios being explored. For example, you focus a bit on open-source AI as a bioterror risk. I don't think a supervirus is going to wipe out the human race or even end civilization, because (as Covid experience shows), we are capable of extreme measures in order to contain truly deadly disease. But a supervirus could certainly bring the world to a halt again, and if it was known to have been designed with open-source AI, that would surely have a huge impact on AI's trajectory. (I suspect that in such a scenario, AI for civilian purposes would suffer, but deep states worldwide would insist on pressing forward, and that there would also be a lobby arguing for AI as a defense against superviruses. Also, it's very plausible that a supervirus might be designed by AI, but that there would be no proof of it, in which case there wouldn't even be a backlash.)

Another area where futurology about open-source AI might be good, is in the area of gradual disempowerment and replacement of humanity. We have societies with a division of roles, humans presently fill those roles but AI and robots will be capable of filling more and more of them; eventually every role in the economic, cultural, and political structure could be filled by AIs rather than by humans. The story of how that could happen, certainly deserves to be explored.

Still another area when open-source AI scenarios deserve to be studied, is in the highly concrete realm of near-future economics and culture. What does an AI economy look like if o4-level models are just freely available? This really is an urgent question for anyone concerned with concrete questions like, who will lead the AI industry and how will it be structured, because there seem to be factions in both China and America who are thinking in this direction. One should want to understand what they envision, and what kind of competitive landscape they are likely to create in the short term.

My own belief is that this would be such an upheaval, that it would inevitably end up invalidating many conventional political and economic premises. The current world order of billionaires and venture capitalists, stock markets and human democracies, I just don't see it surviving such a transition, even without superintelligence appearing. There are just too many explosive possibilities, too many new symbioses of AI with human mind, for the map of the world and the solar system to not be redrawn.

However, in the end I believe in short timelines to superintelligence, and that makes all the above something of a secondary concern, because something is going to emerge that will overshadow humans and human-level AI equally. It's a little monotonous to keep referring back to Iain Banks's Culture universe, but it really is the outstanding depiction of a humanly tolerable world in which superintelligence has emerged. His starfaring society is really run by the "Minds", which are superintelligent AIs characteristically inhabiting giant spaceships or whole artificial worlds, and the societies over which they invisibly preside, include both biological intelligences (such as humans) and human-level AIs (e.g. the drones). The Culture is a highly permissive anarchy which mostly regulates itself via culture, i.e. shared values among human-level intelligences, but it has its own deep state, in the form of special agencies and the Minds behind them, who step in when there's a crisis that has escaped the Minds' preemptive strategic foresight.

This is one model of what relations between superintelligence and lesser intelligences might be like. There are others. You could have an outcome in which there are no human-level intelligences at all, just one or more superintelligences. You could have superintelligences that have a far more utilitarian attitude to lower intelligences, creating them for temporary purposes and then retiring them when they are no longer needed. I'm sure there are other possibilities.

The point is that from the perspective of a governing superintelligence, open-source AIs are just another form of lower intelligence, that may be useful or destabilizing depending on circumstance, and I would expect a superintelligence to decide how things should be on this front, and then to make it so, just as it would with every other aspect of the world that it cared about. The period in which open-source AI was governed only by corporate decisions, user communities, and human law would only be transitory.

So if you're focused on superintelligence, the real question is whether open-source AI matters in the development of superintelligence. I think potentially it does - for example, open source is both a world of resources that Big Tech can tap into, as well as a source of destabilizing advances that Big Tech has to keep up with. But in the end, superintelligence - not just reasoning models, but models that reason and solve problems with strongly superhuman effectiveness - looks like something that is going to emerge in a context that is well-resourced and very focused on algorithmic progress. And by definition, it's not something that emerges incrementally and gets passed back and forth and perfected by the work of many independent hands. At best, that would describe a precursor of superintelligence.

Superintelligence is necessarily based on some kind of incredibly powerful algorithm or architecture, that gets maximum leverage out of minimum information, and bootstraps its way to overwhelming advantage in all domains at high speed. To me, that doesn't sound like something invented by hobbyists or tinkerers or user communities. It's something that is created by highly focused teams of genius, using the most advanced tools, who are also a bit lucky in their initial assumptions and strategies. That is something you're going to find in an AI think tank, or a startup like Ilya Sutskever's, or a rich Big Tech company that has set aside serious resources for the creation of superintelligence.

I recently posted that superintelligence is likely to emerge from the work of an "AI hive mind" or "research swarm" of reasoning models. Those could be open-source models, or they could be proprietary. What matters is that the human administrators of the research swarm (and ultimately, the AIs in the swarm itself) have access to their source code and their own specs and weights, so that they can engaged in informed self-modification. From a perspective that cares most about superintelligence, this is the main application of open source that matters.

[-]StanislavKrym7mo10

The problem with non-open-weight models is that they need to be exfiltrated before wrecking havoc, while open-weight models cannot avoid being evaluated. Suppose that the USG decides that all open-weight models are to be tested by OpenBrain for being aligned or misaligned. Then even a misaligned Agent-x has no reason to blow its cover by failing to report an open-weight rival.

LESSWRONG
LW

LESSWRONG
LW

37

It Is Untenable That Near-Future AI Scenario Models Like “AI 2027” Don't Include Open Source AI

37

Ω 14

37

Ω 14