I basically agree.
Shutting down would make people say, holy shit, they are serious about this extinction risk thing.
I don't know. I think a lot of people would say something like, 'They're making excuses because they knew they couldn't win the race.'
If one company decides not to plow ahead recklessly, and actually stops building existentially dangerous technology, that sends a hard-to-ignore message that coordination might be possible.
I don't know. People might wonder, 'If coordination is possible, why did they choose to shut down unilaterally?'
Shutting down may have a critical positive effect ~only when there are other visceral signs of AI danger, I second these points.
I'm a little worried Anthropic has missed the window for this option, since now it might look like the White House was out to destroy them and they were just throwing in the towel.
I think most of this effect disappears if they just wait a couple months for the local news cycle to pass (assuming the public spats die down, even if the actual relationship remains similarly adversarial in less visible ways).
Yeah but I kinda do put moderate odds on "The White House continues to actively try to destroy Anthropic and eventually either succeeds or at least it's pretty visibly in-question."
Would be nice. Any idea how to make it more likely?
One thing to do would be to develop an airtight (partly technical?) argument for ~doom on the current trajectory or whatever, and then publish it in a format that is understandable (if partly through deference) by the relevant crowd at the companies, as well as the relevant crowd outside of the companies (investors?), so that clearly the only socially acceptable move is to shut down or something. (Some lobbying at the margin will plausibly be required.)
I'm also interested in historical examples of companies shutting down for vaguely analogous reasons. Has any tobacco company shut down after it became common knowledge that smoking is bad?
In my conversations with LLMs, they could not come up with a single example of this happening. The closest example they could find is apparently Patagonia, which in 2022 transferred all of its nonvoting stock (98% of the company) to a nonprofit for climate philanthropy. But that's kind of dissimilar.
Self-immolation would be basically unprecedented, especially at the scale of current AI companies. But extreme times require extreme measures.
I have no business experience, but I am wondering about the practicalities of shutting down such a company. If it has already gone public, its directors may not even own it. The shareholders would have to agree, which they'll hardly be willing to do or they wouldn't have invested. Even if the directors technically own it, what stops the employees banding together to take it over? Or other companies mounting a takeover, or poaching the employees and IP? Or the current regime or the next nationalising it, which there has been mention of from both political sides?
Are these companies already too big to shut down?
A related thesis you could have is: "if you're a frontier AI company and you're IPO'ing soon, you should put safeguards in place to give yourself the option to shut down the company if things look too dangerous."
There was news recently about how apparently Elon Musk organized the SpaceX governance in such a way that shareholders aren't allowed to sue him. If he can pull that off, I bet AI companies can find a way to create similar safety protections, if they're sufficiently motivated.
It seems plausible that Anthropic could shut itself down and win the legal battle due to how they're structured, but I doubt Google could. Even if they won the legal battle, I'd expect the US government to demand the weights, GPUs, and all research for national security reasons, so all they'd really achieve is giving everything to whoever is best politically connected (either an internal government program or X).
Most of the value would be from the researchers themselves not pushing capabilities, and it might still be meaningful. My read of the situation is that surprisingly few people really "get it", so Anthropic's researchers refusing to work on capabilities would be a meaningful slowdown by itself.
I'd expect the US government to demand the weights, GPUs, and all research for national security reasons
I think there's a pretty good chance this wouldn't happen (maybe 50/50 but I haven't really thought about it). The US government is mostly not AGI-pilled. Things might be different a year from now.
Also, even if they wanted to, I think the government would have a hard time putting together an org that's as good at AI development as OpenAI, Anthropic, or Google.
I don't think it matters if the US government is AGI-pilled. It's clearly a militarily-relevant technology and the DoW seems to think it's important, so I don't think they'll let this capability go away rather than just giving it to another military contractor like OpenAI.
I'm not even sure they would have to entirely shut down, plausibly they could just completely halt frontier development? Stop training bigger and better models but keep serving the models they already have?
If you replace the words 'frontier AI' with 'fossil fuel', switch the companies to Aramco and Total, and post this in a climate change community, the post will still make sense.
This is the latest of several occasions in which I mentally predicted that a post would be well-received on the EA Forum and poorly received on LessWrong, and then it ended up being the other way around.
Huh, curious for your model of why you predicted the other-which-way? This seemed like a classic "does better on LW" kinda post (for good or for ill). Although I wouldn't have predicted so much disparity.
Yeah the obvious reason to predict more success on LW is that the post is implicitly pessimistic about AI companies' ability to solve alignment. The reason I expected less success is that the post is making a simplified, arguably naive* argument where I could've made many caveats, or said a lot more about the real-world complexities of what I'm proposing, but I left those bits out because ultimately I didn't think they were important enough. LW users (including me) tend to write comments pointing out those things I didn't talk about.
In retrospect I suppose it's unsurprising that this post was well-received on LW. It might just be risk-aversion combined with the fact that I'm not very good at predicting which posts LW users will like.
*in the colloquial sense of "not understanding how the world works", not the mathematician's sense
It would also be great if countries stopped going to war, governments followed sound economic policy, and we started a Manhattan Project to solve aging, but none of those seem very likely.
How concretely do you plan to persuade any of the frontier labs to shut down?
Minimal chance that any AI company shuts down (at least under current circumstances). But I figured it was worth publicly acknowledging that an AI company should shut down, even though I expect they won't.
And given that your argument is good and no one is doing that, we can infer they're just lying, and they personally want to plow ahead aggressively of their own volition. I just wish you stated this directly. We shouldn't be in the business of saving appearances for their commitment to x-risk reduction. And we really shouldn't assume they're anti-s-risk when they think they'll be the godkings through this transformation.
it's a strong signal but there are enough people who think that this is just delusion to start a dozen more frontier ai companies if there's a market gap
OpenAI's board tried to shut it down in late 2023, I think in reaction to the early reasoning models. Microsoft was then in the middle of acqui-hiring the whole company and the shutdown was canceled, and the board members trying to shut it down were ousted. So we tried this already, and the result was that the shutdown attempt failed completely.
I originally suggested none of them just sincerely want to shut it down, but we pretty much have a stronger result already -- we tried, it failed.
But perhaps instead of shutting down, an AI company could reallocate 100% of its budget to some combination of safety research + global coordination to make AI development safer, and do just those things until it runs out of money. Think of how much more safety work they could do if they dedicated all their resources to the problem!
A lot of people think we should pause at AGI, but one problem is that AI capabilities are spiky, so we will probably get powerful and dangerous capabilities even before it is human level at all tasks. So I was thinking what might actually be politically/economically feasible is pausing further development once they get an automated AI researcher, because all those AI capability researchers are not going to be happy about being laid off. So we could turn them into AI safety researchers!
pausing further development once they get an automated AI researcher
I'm not sure I get this... are you talking about a scenario where AI companies develop a viable automated AI researcher and then nobody uses that automated researcher? The period after developing an autonomous AI researcher seems like a difficult time to pause, because a pause is way harder to enforce at that point.
I don't know how employment would play out. If there were an automated Junior AI researcher, then senior AI researchers could plausibly manage 10 of them each (like now) and they may lay off the Junior AI researchers if they can't manage the automated Junior AI researchers. That would save the AI companies money, but would not really get them more research done. The total potential is millions of automated researchers, so maybe it would be possible for one human to manage 1000 automated researchers? Anyway, it seems to me that at some point, there would be mass layoffs, in which case the workers could argue that they could pivot to safety.
As for whether it is much harder to enforce if there are automated AI researchers, I don't know. It seems like regardless, they could prohibit training of new more capable models, but I guess the labs might argue they need to train a new safer model? But that blurriness would be the case regardless of whether the researchers are human or AI. Are you saying that it's harder to enforce a pause if progress is faster or cheaper?
That would save the AI companies money, but would not really get them more research done.
If the supply curve shifts right, quantity supplied increases.
As for whether it is much harder to enforce if there are automated AI researchers, I don't know.
If we already have an autonomous AI researcher, there's a good chance the researcher takes off to ASI while policy-makers are still trying to figure out what's going on. Originally I said it was harder to enforce, but really I think the concern is that it makes AI development go a lot faster.
I agree this would obviously work, but none of the incentives at the top would make this remotely plausible today. Look at the boards of OAI / Anthropic, probably the only 2 where this makes sense.
Maybe some employees inside these companies may be open to this, but I think the chance that the current board of either of these labs could get a majority vote today to shut it down is basically <5%.
There is virtue in saying what should be done, even if it has minimal chance of actually happening.
If I was the CEO of a different AI company, and this happened on Monday, then on Tuesday I would say, "AI has gotten so incredibly powerful that our competitor just shut themselves down! But don't worry, we have a new safety technique at EvilAI that will allow us to do this more safely than that company could. Invest now in our incredibly powerful, safe AI!"
And then the AI race would be in the hands of me, an unscrupulous liar, rather than the CEO who was ethical enough to try your suggested maneuver.
I really wish that this was true. I just don’t have the same faith in the general public.
AI safety engineering is partly done in a spirit of service. Service to people who don’t necessarily know what they are being protected from. Also, it is invisible by design and a thankless job, because if it is done well it is seamless.
The closest example to this idea that I can think of is when the US federal government cyclically furloughs its employees during policy negotiations between the House, Senate, and President. The federal US government is made up of people who have to take an oath of service. That service is mostly silent and unknown. The public doesn’t have an initial backlash of “look at all the vital work that stopped!” The backlash doesn’t come until the problem becomes theirs. This was seen in the last year when SNAP benefits started to be impacted.
Here’s where the parallel ends. If a frontier AI company shuts down, the consumers would just go to a competitor that may be less safe. The competitor model may be slightly more sycophantic or easier to jailbreak. But that wouldn’t impact a basic human need like SNAP benefits do. There would be no need for a forced fix and the company that absorbed the market may look at safety as a low grade feature request.
What I want to see, which I predict is maybe more realistic (EDIT: Thinking on it more, I've changed my prediction, I think convincing one frontier company to shut down is actually more likely!) --
US and China buying equal stakes in each other's companies. We sell half of Anthropic and OpenAI and a spun-off DeepMind. They sell half of DeepSeek and spin-offs of Alibaba and Baidu.
We work together for a slowdown, gesture towards great partnership and profit sharing and economic growth between the two countries yadda yadda, and make a good example for the rest of the world so India et al will also join in.
Cross-posted from my website.
Prior discussion: niplav's shortform (2025); Planning for Extreme AI Risks (2025) by Joshua Clymer
A frontier AI company (any one, I don't care which) should close shop and make an announcement along the lines of:
A common refrain among safety-conscious AI developers: "it doesn't matter if we stop building dangerous AI, because someone else will just build it instead." Is that really true, though? If a multi-hundred-billion-dollar company comes out and says "We've concluded that our product is horribly dangerous, nobody knows how to make it safe, and there's too high a risk that it leads to human extinction", this won't raise any eyebrows? This has no chance of spurring policy-makers into action?
Shutting down would make people say, holy shit, they are serious about this extinction risk thing. Shutting down sends a strong signal to governments that they should pay serious attention to AI x-risk.
It also encourages other companies to take safety more seriously. Right now, at least three AI companies have said something like, "maybe we'd prefer to slow down and pay more attention to safety, but then the other companies will plow ahead recklessly." If one company decides not to plow ahead recklessly, and actually stops building existentially dangerous technology, that sends a hard-to-ignore message that coordination might be possible.
If a frontier AI company shuts down, will that work? Will companies work together to slow down? Will we get sane AI regulations as a direct result of the shutdown? Probably not. It won't singlehandedly solve all the coordination problems. But it's still a better idea than the current strategy of "race ahead while doing a dash of safety research on the side", which is even less likely to work. By AI companies' own admission, competitive pressures don't allow them to slow down. Why would things change in the future? How are they going to align AI if they have to move at maximum speed? Even if they slow down somewhat, what if alignment is hard[1], and they can't slow down by enough to properly solve the problem?
Counterpoint: If the most safety-conscious company shuts down, then it can't do any more safety research.
I expect shutting down would be worth the tradeoff—companies' safety research isn't doing much to reduce AI takeover risk. But perhaps instead of shutting down, an AI company could reallocate 100% of its budget to some combination of safety research + global coordination to make AI development safer, and do just those things until it runs out of money. Think of how much more safety work they could do if they dedicated all their resources to the problem!
(Some might argue that AI companies need to build frontier models so they have something on which to do safety research. That argument doesn't make much sense when you think about it. There are a lot of kinds of research that don't require frontier models,[2] they can do plenty of research on the models that already exist, and they can make deals with other companies to get access to their latest models.)
What if investors sue the company?
It is my understanding that a self-induced shutdown would be legal for Anthropic (which is a public benefit corporation). I'm not sure about OpenAI—it's a for-profit now, but it's still owned in large part by a nonprofit that's allegedly obligated to put the benefit of humanity first.
More importantly, "we have to risk killing everyone because otherwise our investors might sue us" is not a serious position. I almost can't think of a worse excuse.
Some people might believe that a safety-minded AI company should shut down under some circumstances, but not now. My question then is: Under what conditions should they shut down? How will we know when those conditions are met? And how do we know that they'll follow through?
[1] It probably is.
[2] Safety-minded AI companies treat alignment as an engineering problem, or treat philosophical problems as easy. There are critical aspects of the problem that can't be solved by engineering (or that aren't legible). You can work on those other aspects even if you don't have frontier models.