I basically agree.
Shutting down would make people say, holy shit, they are serious about this extinction risk thing.
I don't know. I think a lot of people would say something like, 'They're making excuses because they knew they couldn't win the race.'
If one company decides not to plow ahead recklessly, and actually stops building existentially dangerous technology, that sends a hard-to-ignore message that coordination might be possible.
I don't know. People might wonder, 'If coordination is possible, why did they choose to shut down unilaterally?'
Shutting down may have a critical positive effect ~only when there are other visceral signs of AI danger. I second these points.
I'm a little worried Anthropic has missed the window for this option, since now it might look like the White House was out to destroy them and they were just throwing in the towel.
I think most of this effect disappears if they just wait a couple months for the local news cycle to pass (assuming the public spats die down, even if the actual relationship remains similarly adversarial in less visible ways).
Yeah but I kinda do put moderate odds on "The White House continues to actively try to destroy Anthropic and eventually either succeeds or at least it's pretty visibly in-question."
I have no business experience, but I am wondering about the practicalities of shutting down such a company. If it has already gone public, its directors may not even own it. The shareholders would have to agree, which they'll hardly be willing to do or they wouldn't have invested. Even if the directors technically own it, what stops the employees banding together to take it over? Or other companies mounting a takeover, or poaching the employees and IP? Or the current regime or the next nationalising it, which there has been mention of from both political sides?
Are these companies already too big to shut down?
A related thesis you could have is: "if you're a frontier AI company and you're IPO'ing soon, you should put safeguards in place to give yourself the option to shut down the company if things look too dangerous."
There was news recently about how apparently Elon Musk organized the SpaceX governance in such a way that shareholders aren't allowed to sue him. If he can pull that off, I bet AI companies can find a way to create similar safety protections, if they're sufficiently motivated.
It seems plausible that Anthropic could shut itself down and win the legal battle due to how they're structured, but I doubt Google could. Even if they won the legal battle, I'd expect the US government to demand the weights, GPUs, and all research for national security reasons, so all they'd really achieve is giving everything to whoever is best politically connected (either an internal government program or X).
Most of the value would be from the researchers themselves not pushing capabilities, and it might still be meaningful. My read of the situation is that surprisingly few people really "get it", so Anthropic's researchers refusing to work on capabilities would be a meaningful slowdown by itself.
I'd expect the US government to demand the weights, GPUs, and all research for national security reasons
I think there's a pretty good chance this wouldn't happen (maybe 50/50 but I haven't really thought about it). The US government is mostly not AGI-pilled. Things might be different a year from now.
Also, even if they wanted to, I think the government would have a hard time putting together an org that's as good at AI development as OpenAI, Anthropic, or Google.
I don't think it matters if the US government is AGI-pilled. It's clearly a militarily-relevant technology and the DoW seems to think it's important, so I don't think they'll let this capability go away rather than just giving it to another military contractor like OpenAI.
I'm not even sure they would have to entirely shut down. Plausibly they could just completely halt frontier development: stop training bigger and better models but keep serving the models they already have?
If you replace the words 'frontier AI' with 'fossil fuel', switch the companies to Aramco and Total, and post this in a climate change community, the post will still make sense.
It would also be great if countries stopped going to war, governments followed sound economic policy, and we started a Manhattan project to solve aging, but none of those seem very likely.
How concretely do you plan to persuade any of the frontier labs to shut down?
This is the latest of several occasions in which I mentally predicted that a post would be well-received on the EA Forum and poorly received on LessWrong, and then it ended up being the other way around.
Huh, curious for your model of why you predicted the other-which-way? This seemed like a classic "does better on LW" kinda post (for good or for ill). Although I wouldn't have predicted so much disparity.
Yeah the obvious reason to predict more success on LW is that the post is implicitly pessimistic about AI companies' ability to solve alignment. The reason I expected less success is that the post is making a simplified, arguably naive* argument where I could've made many caveats, or said a lot more about the real-world complexities of what I'm proposing, but I left those bits out because ultimately I didn't think they were important enough. LW users (including me) tend to write comments pointing out those things I didn't talk about.
In retrospect I suppose it's unsurprising that this post was well-received on LW. It might just be risk-aversion combined with the fact that I'm not very good at predicting which posts LW users will like.
*in the colloquial sense of "not understanding how the world works", not the mathematician's sense
And given that your argument is good and no one is doing that, we can infer they're just lying, and they personally want to plow ahead aggressively of their own volition. I just wish you stated this directly. We shouldn't be in the business of saving the appearances for their commitment to x-risk reduction. And we really shouldn't assume they're anti-s-risk when they think they'll be the godkings through this transformation.
But perhaps instead of shutting down, an AI company could reallocate 100% of its budget to some combination of safety research + global coordination to make AI development safer, and do just those things until it runs out of money. Think of how much more safety work they could do if they dedicated all their resources to the problem!
A lot of people think we should pause at AGI, but one problem is that AI capabilities are spiky, so we will probably get powerful and dangerous capabilities even before it is human level at all tasks. So I was thinking what might actually be politically/economically feasible is pausing further development once they get an automated AI researcher, because all those AI capability researchers are not going to be happy about being laid off. So we could turn them into AI safety researchers!
OpenAI's board tried to shut it down around 2024, I think in reaction to the early reasoning models. Microsoft was then in the middle of acqui-hiring the whole company and the shutdown was canceled, and the board members trying to shut it down were ousted. So we tried this already; the result is that the shutdown attempt just fails completely.
I originally suggested none of them just sincerely want to shut it down, but we pretty much have a stronger result already -- we tried, it failed.
Cross-posted from my website.
Prior discussion: niplav's shortform (2025); Planning for Extreme AI Risks (2025) by Joshua Clymer
A frontier AI company (any one, I don't care which) should close shop and make an announcement along the lines of:
A common refrain among safety-conscious AI developers: "it doesn't matter if we stop building dangerous AI, because someone else will just build it instead." Is that really true, though? If a multi-hundred-billion-dollar company comes out and says "We've concluded that our product is horribly dangerous, nobody knows how to make it safe, and there's too high a risk that it leads to human extinction", this won't raise any eyebrows? This has no chance of spurring policy-makers into action?
Shutting down would make people say, holy shit, they are serious about this extinction risk thing. Shutting down sends a strong signal to governments that they should pay serious attention to AI x-risk.
It also encourages other companies to take safety more seriously. Right now, at least three AI companies have said something like, "maybe we'd prefer to slow down and pay more attention to safety, but then the other companies will plow ahead recklessly." If one company decides not to plow ahead recklessly, and actually stops building existentially dangerous technology, that sends a hard-to-ignore message that coordination might be possible.
If a frontier AI company shuts down, will that work? Will companies work together to slow down? Will we get sane AI regulations as a direct result of the shutdown? Probably not. It won't singlehandedly solve all the coordination problems. But it's still a better idea than the current strategy of "race ahead while doing a dash of safety research on the side", which is even less likely to work. By AI companies' own admission, competitive pressures don't allow them to slow down. Why would things change in the future? How are they going to align AI if they have to move at maximum speed? Even if they slow down somewhat, what if alignment is hard [1], and they can't slow down by enough to properly solve the problem?
Counterpoint: If the most safety-conscious company shuts down, then it can't do any more safety research.
I expect shutting down would be worth the tradeoff: companies' safety research isn't doing much to reduce AI takeover risk. But perhaps instead of shutting down, an AI company could reallocate 100% of its budget to some combination of safety research + global coordination to make AI development safer, and do just those things until it runs out of money. Think of how much more safety work they could do if they dedicated all their resources to the problem!
(Some might argue that AI companies need to build frontier models so they have something on which to do safety research. That argument doesn't make much sense when you think about it. There are a lot of kinds of research that don't require frontier models [2], they can do plenty of research on the models that already exist, and they can make deals with other companies to get access to their latest models.)
What if investors sue the company?
It is my understanding that a self-induced shutdown would be legal for Anthropic (which is a public benefit corporation). I'm not sure about OpenAI—it's a for-profit now, but it's still owned in large part by a nonprofit that's allegedly obligated to put the benefit of humanity first.
More importantly, "we have to risk killing everyone because otherwise our investors might sue us" is not a serious position. I almost can't think of a worse excuse.
Some people might believe that a safety-minded AI company should shut down under some circumstances, but not now. My question then is: Under what conditions should they shut down? How will we know when those conditions are met? And how do we know that they'll follow through?
[1] It probably is.
[2] Safety-minded AI companies treat alignment as an engineering problem, or treat philosophical problems as easy. There are critical aspects of the problem that can't be solved by engineering (or that aren't legible). You can work on those other aspects even if you don't have frontier models.