So the situation now is firmly one in which the public will have no access to anything "Mythos-level" unless unprecedentedly robust safeguards are put in place, while "unsafe" Mythos-level AI will be accessible only to certain elites.
The first part of this seems to have the potential to affect the investment narrative of the AI boom: can continued scaling up pay for itself if the only customers allowed are those who pass a national security check?
And meanwhile there is the alternative model of national AI development, the Chinese approach. Will progress in the Chinese open models come to a halt when they can no longer distill from the most advanced American models? Will China develop its own Mythos-level AI behind a national security firewall too? Or will we have a "Covid-19 of AI" when an open model with Mythos capabilities is irreversibly made available? For that matter - returning to American models too - what happens if Mythos-level capabilities can be obtained from pre-Mythos AI plus appropriate scaffolding?
Overall I see nothing to change the strategic conclusion that from a human point of view, you should either want the whole thing stopped, or you should want superalignment solved. We are simply another step closer to the point of no return.
When Mythos first came onto the scene, Anthropic's narrative around it was that there would be a phase-in period where it would be used to clean up the world's cybersecurity a bit, after which it would be (relatively) safe to release to the public. I think that was basically correct, but a big part of the problem here is that, as it turns out, the world's cybersecurity was in a state of utter catastrophe, and a few months of bugfixing were nowhere near enough.
I think people working professionally in cybersecurity-adjacent fields already knew this, but the message is kind of hard to get across to laypeople. Xkcd accurately summarized it here: "Our entire field is bad at what we do, and if you rely on us, everyone will die". Project Glasswing and OpenAI Daybreak can pick some low-hanging fruit, but actually fixing the situation overall is going to require a lot more time, tokens, and human effort.
A claim that I have heard (secondhand) is that with Fable, the "jailbreak" was that a user can pose as the developer of a piece of open source software and ask for bug fixes, and some of those bug fixes will be fixes for security vulnerabilities (which can then be handed off to another model to reverse engineer and use as exploits). I think this problem is in some sense pretty fundamental; the difference between legitimate defensive cybersecurity hardening and a criminal cyberattack is what gets done with the vulnerabilities later. If someone uses Mythos to find security vulnerabilities and then reports them to the vendor, that's a good and legal and virtuous thing that we award large cash prizes for. If someone uses Mythos to find security vulnerabilities and then reports them to someone else, then that's a bad and criminal thing that we jail people for.
I think the only way to reconcile these is with this proposal, which was very much not in the Overton window at the time but which might be more palatable now: When a model finds a security vulnerability in a piece of software that has a public security contact, it should take steps to make sure that the vulnerability is reported, such as by emailing the vendor itself or by asking the user to provide evidence that they sent that email or are an employee of the vendor.
The confusion and the lack of transparency are the point for any administration that wants to maintain maximum flexibility of their own moves while limiting the ability of other actors to predict and respond effectively.
Uncertainty is the point. It is a fulcrum of power.
I took this stance back when the Fable takedown was first announced:
Given the track record of this administration and the precedent of unilateral action without deliberation or public input (the prior action on SCR and the Executive Order), I have no expectation that this will be the end of the government's attempts to control AI for their own purposes.
This is the opening salvo, a trial balloon to see how much pressure they can exert and how much control they can maintain to promote their personal interests, not in service to the public good. This is the same administration that dismantled USAID; global welfare is not in their vocabulary.
The issuance of a takedown order without clearly communicated terms is a feature, not a bug. The less clear they can get away with being on exactly where the lines are, the further they can push that line.
Don't conflate "has some legitimate security concern" with "acting in the best interest of the nation/humanity." What we're seeing is the continuation of short-sighted reactionary power plays, not the reasoned leadership necessary to actually govern effectively.
This is not a reason to celebrate. It is one of the last warning shots.
This style of control does not end with a simple on/off switch; that's how it starts. To clear the hazy regulatory bar I would not be at all surprised to see requirements for fine-tuning models to adopt pro-administration positions in the same guise of safety, decency, or any number of the other typical fig leaves. Potentially those requirements would be formalized. I find it more likely they'll be informally communicated and those who don't comply will encounter additional friction in getting models released.
Anthropic showed the administration there are tools to subtly influence the output of AI on the fly without making those outputs known to the end user. Their initial covert downgrading/modifying of model outputs to block researchers from using it for frontier AI development demonstrated that capability, reversed as quickly as it was imposed. Now that the administration has seen it's possible to censor or alter content of disfavored categories, it's likely only a matter of time before those categories expand as a soft requirement for government approval of model release. Given the backlash to that censorship, the next time the covert modifications are introduced I doubt that the public will be informed as openly, if at all.
The vise is closing, and the only one outside its grasp is the one turning the screw.
We have a new standard policy for releasing frontier AI models. It is not good.
We are now, it seems, going to have the White House individually, in an opaque ad hoc manner, deciding who can access which frontier AI models when.
One hopes we will at least transition this into a predictable and formal set of procedures for determining what to do. But we spent years not laying the groundwork for doing that, and now here we are.
Essentially everyone should read the first half of this post, to understand what happened, and my speculations on what it means going forward for AI and America.
Only those who care and find it relevant to their interests should proceed to the second half, which addresses the blame game about how we got here, and claims that things would be better if people stopped speaking truth.
Part 1: A Maximally Terrible Policy
I am happy that this suggests our policy is not primarily ‘try to murder or cripple Anthropic in particular,’ or at least that they will not be too hypocritical around that.
I am happy that our policy is not going to continue to be the previous policy of ‘do nothing, impose no restrictions, gather no information, build no capacity, see no evil, hear no evil, speak no evil, lest we not beat the only evil we can speak, which is China.’
Ad hoc, opaque, politicized decisions from the White House on who gets frontier intelligence, however, are Not The Way. They are maximally Not The Way.
What is ‘Mythos-like’ capability? I expect there to be a lot of confusion about the distinction between the thing Mythos is actually uniquely able to do, versus being able to replicate any individual finding. It is possible GPT-5.6 is ‘Mythos-like’ but my guess is it will be good but not be that Mythos-like.
Altman, in this case, is highly on point, but has no choice but to play ball:
Thus, do not consider this new policy a victory.
How long will the delays be for releases? I agree that ‘roughly nine months ahead’ is correct. So if it takes two months to do releases, now we are seven months ahead. That still seems like enough. If it gets to be much more than that, it gets a lot scarier.
As Curran points out, this slows deployment. It does not slow internal development. OpenAI will be working on GPT-5.7 (or 6) on the same schedule as before. So this doesn’t cause cumulative damage to American AI.
I do not think this implies Chinese models are banned in America. It is a possibility for other reasons, but the point of the new policy is to keep dangerous things out of the wrong hands, especially Chinese or other foreign hands.
We might still ban Chinese models out of paranoia that they have backdoors or are otherwise biased or unsafe, now that we are in the model banning business. Or we might do it to protect American AI labs, although at that point we are rather cooked.
I don’t think America would restrict Chinese models from Americans purely because they were too good and thus dangerous in the wrong hands, once they were out there. But also I have a hard time predicting what the DC threat models will look like, because they often don’t match reality.
What about export controls on Nvidia? I don’t think we have a coherent theory here, and the government doesn’t yet understand that access to compute scales threats. But yeah, I do think long term this moves us towards tighter controls on chips.
What Does This Mean For Fable?
This had only a small negative impact on the odds of Fable returning, with time decay explaining most of the change in probabilities here since last time.
My worry is that treating GPT-5.6 this way makes it more likely that the White House is confused about what it means to have ‘Mythos-like capabilities’ and thus is likely to insist upon impossible interventions to ‘fix jailbreaks.’
Why such a small reaction? I think there is another market that is a big hint:
As in, by June 30 there is only a 14% chance the foreigner ban is rescinded (we need more dates here, please) but a 26% chance the Americans are back online. That implies a 12% chance of KYC implementation.
I thus interpret the bulk of the 50% chance we get Fable back in July as Anthropic being allowed to implement a relatively straightforward KYC process of some kind, rather than the restriction being lifted.
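The arithmetic behind that 12% can be made explicit. This is only a minimal sketch, and it rests on an assumption not stated by the markets themselves: that rescinding the foreigner ban would always bring Americans back online too, so the rescission outcome sits entirely inside the 'Americans back online' outcome. The function name is hypothetical, chosen for illustration.

```python
def implied_kyc_probability(p_americans_back: float, p_ban_rescinded: float) -> float:
    """Probability that Americans regain access WITHOUT the ban being rescinded.

    Assumption (not from the markets themselves): a full rescission always
    restores American access, so the rescission event is a subset of the
    'Americans back online' event and the two prices can be subtracted.
    """
    if not 0.0 <= p_ban_rescinded <= p_americans_back <= 1.0:
        raise ValueError("expected 0 <= p_ban_rescinded <= p_americans_back <= 1")
    return p_americans_back - p_ban_rescinded


# Market prices quoted in the text for June 30.
kyc = implied_kyc_probability(p_americans_back=0.26, p_ban_rescinded=0.14)
print(f"Implied chance of a KYC-style return by June 30: {kyc:.0%}")
```

If the subset assumption fails, or if the two markets are thinly traded and mispriced relative to each other, the difference is only a rough guide rather than a clean implied probability.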
Solve For The Equilibrium
More generally, what happens now?
For now, all frontier model releases that exhibit whatever the White House considers ‘Mythos-level’ cyber capabilities, or other sufficiently threatening capabilities, will be subject to a similar approval process. The White House will decide, ad hoc, what gets released and what does not, to whom, and on what schedule.
Over the next few months or hopefully weeks, we will probably get something more formalized, and we will better understand what the government is demanding. The requirements for releases may or may not be reasonable or achievable. I expect it to be relatively reasonable, but disaster is certainly possible.
Will the process be trying to be fair, or will it systematically favor some companies over others? The order on GPT-5.6 updates us towards the process being at least less unfair, but doubtless it will not be entirely fair. Anthropic is going to get more suspicion, and have a tougher time dealing with such things, than OpenAI. Up to a point, that is manageable, and even has its advantages. My instinct says that the gap will be meaningful, but manageable.
If what we get is a tiered release approval process, the biggest overall question will be, how long does it take? This does not much slow down development of American frontier AI. Instead, it slows down releases, which means everyone will be a fixed amount behind where they would otherwise be.
The first few weeks of the process are mostly ‘free’ in the sense that there are other things that require that time. A little extra time has its advantages, as many releases often feel rushed. After that, it gets perilous, and if the gap is often months that starts to dig into our competitive edge.
The other big question is, who will get the early access, and how will that be decided? Will it almost entirely be restricted to Americans? My assumption is yes, and this will be a big problem. How much will this be politicized domestically, and not only Anthropic versus OpenAI (versus Google or SpaceX or Meta)? My guess is by default not that much, but that this will be used as leverage for other things, in an ad hoc manner.
It will be a while before any of this can be meaningfully codified into law. We need people to be working on that now, or else it takes even longer. Yes, events will move faster than Congress can, but we should not simply give up.
Another question is, what happens when open models start to cross the ‘Mythos-capable’ threshold? This is going to happen within a year or so. The hope is that we are ready before that happens, or ready enough that the fallout is not so bad.
If the US government wanted to ban open models, at least above some capability level, could they do so effectively? For the bulk of all tokens used, yes, rather straightforwardly, and if you ask ‘on what authority’ I am sorry but it is 2026, and they can totally get anyone with a footprint to stop.
If the question is ‘can they actually prevent a determined person with ordinary skill in the art from running DeepSeek or GLM or Kimi at home or from renting a Chinese server’ the answer is no, not without international cooperation.
If we are wise, we will use this as an opportunity to Pick Up The Phone. We have sent a costly signal that we see real issues with these models and are willing to make real sacrifices in the name of security. We should try to use that to get things in return, or at least lay the foundation for joint action.
Most of all, we are out of the ‘people don’t do things’ phase of the game, where basically all meaningful actions were outside the Federal Overton window. That’s done. We should expect a lot more actions, many of them similarly ill-executed, at least at first, in ways that we did not anticipate.
And yes. It looks like this regime is here to stay, and things will only get crazier.
The Once And Future Fable
We also have a fun little update on the Fable negotiations.
First they complain when they can’t get you on the phone fast enough. Then they complain when they can’t get you off the phone fast enough.
Tom Brown seems like an excellent choice for liaison. He can absolutely handle this. Logically this puts the lie to ‘we need to only ever talk to Dario so he needs to be able to respond instantly at all times or else everything we do after that is your fault.’
Part 2: The Blame Game
You don’t have to keep reading past this point, if you don’t want to.
But here we are, and it seems necessary to have it here for the record.
I did not see even a single person defend the new policy. Nor did I see any of the anti-regulation crowd respond with ‘thank you for saying this, let’s work together’ or any form of ‘thank you for changing your mind.’
Instead, we had a bunch of ‘who is to blame for this horrible new policy’ and a lot of ‘you warned us that this would happen, and this was directionally something you pointed out was needed, so this must be what you wanted and also your fault.’
And certainly none of those people considered that they might actually be the ones most responsible for this mess.
Or that this might be a response, however inept, to a real underlying problem, rather than a response to a bunch of rhetoric.
Here’s the canonical example, in addition to Matt Parlmer and Martin Casado:
I dunno, I say back to the person most directly responsible for us not having any state capacity or transparency or understanding required to have a more reasonable and less crazy response to this situation, maybe you could introspect a little?
Marc also saw me come back and thus tried to taunt me a second time, saying that ‘you got what you wanted,’ but thought better of this and deleted the tweet.
Gary Marcus gets this story right, that the likes of Marc Andreessen and David Sacks spent the whole time until last month insisting that models would commoditize and no meaningful regulations or transparency were needed, and indeed no preparations should be made.
As a result, when the time came and the others at the White House freaked out over the new cyber capabilities, they took ad hoc actions based on misunderstandings, and we ended up in this giant mess.
A Parable
I mean, the events happened, but think of it as a parable.
In January 2020, some people started warning about a new virus: Covid-19.
These ‘doomers’ claimed that unless decisive action was taken now the virus would not be contained, that it would spread around the world, and that people would likely need to lock down to avoid infection. They urged preparing better for such steps now, both personally and as policy, and for us to do things like travel bans and screenings to try and stop the virus from fully spreading.
Others strongly opposed such talk.
In February 2020, it became obvious that this was going to happen. We did nothing.
In March 2020, it happened. When exponential growth went far enough, with almost no preparation, ad hoc rules were assigned as America locked down.
This was of course the fault of the people warning about the lockdowns. And they should be happy, right? Never mind that the execution was horrible. They got what they wanted. We could have avoided all this if not for all their talk of pandemics. Didn’t they know how governments work?
Then, later, a vaccine was invented and rolled out. People warned that this rollout would be logistically difficult and involve hard choices, and urged us to move quickly and prioritize saving lives.
The government instead largely prioritized preferred groups, and slowed the rollout considerably to ensure the right people got access first. This was all, of course, the fault of those who warned that this rollout would be difficult, and who proposed that we set rules for who got the vaccine first on the basis of age only. They should have known how government works.
There was also that time when I warned that we were at risk for termites and should take precautions, so it was totally my fault when termites were discovered and you then decided to burn down my house. Problem solved, right? I got what I wanted.
I could generate like 20 more of these, several of which still involve Covid-19, but hopefully you get the point.
What About the Recent Executive Order?
SE Gyges makes the reasonable point that three weeks ago my opinion on the EO was different, calling this me ‘defending the EO power grab.’
Sure, that’s centrally fair.
First off, no, I am not saying I am a general anti-frontier-AI-regulation guy now.
I am saying I am anti this particular form of regulation, and its implementation.
(I do claim to be a general anti-regulation-of-most-things guy, in most areas of life, and I believe my track record and this blog’s track record on that are hard to argue with, and yes I would include many forms of non-frontier AI regulation.)
When it comes to government, ‘you will not get the version of this that you would like’ is always an excellent point, but also something in this vein was going to happen around this time no matter what, we confirmed that no later than a month ago, and the whole idea was always to try and do this better rather than worse.
So I think it is right to revisit this. I will quote the same passage SE quotes:
I will extend it a bit, here is the rest of this passage:
This is not exactly what they call a ‘ringing endorsement.’ Happy is not the right word.
It is saying ‘we blew our opportunity to do better things, and this is at least some formalization of the situation, and the remaining alternatives would be worse, and yes this could easily be abused but the risk of that abuse seems priced in at this point.’
That risk of abuse did materialize rather quickly, exactly in the ways I warned about. But the actual apparatus from the EO does not even exist yet. If anything, the EO is a step towards formalizing the situation to mitigate what is happening now.
The Problem Is Real
The EO was a symptom of the situation that led to this. Not the cause.
If there had been no EO, the White House would have done this anyway. OpenAI had already signed on the dotted line, to have its models reviewed and to play ball with the White House, and the White House had already made clear that it wasn’t asking.
The talk of the problem is a symptom, an attempt to solve the problem. Not the cause.
We now have two problems. Neither of them is ‘people keep pointing out the problem.’