Let's look at the two horns of the dilemma, as you put it:
Well, here are some reasons someone who wants to pause AI might not want to support the organization PauseAI:
So, if you think the specific measures they propose would restrict AI that even many pessimists would consider totally ok and almost risk-free, then you might not want to push for these proposals but for more lenient ones that, precisely because they are more lenient, might actually get implemented: to stop asking for the sky and actually get something concrete.
So, this is why people who want to pause AI might not want to support PauseAI.
And, well, why wouldn't PauseAI want to change?
Well -- I'm gonna speak broadly -- if you look at the history of PauseAI, they are marked by the belief that the measures proposed by others are insufficient for Actually Stopping AI -- for instance, that the kinds of policy measures proposed by people working at AI companies aren't enough; that the kinds of measures proposed by people funded by OpenPhil are often not enough; and so on. Similarly, they often believe that people who question these claims are nitpicking, and so on. (Citation needed.)
I don't think this dynamic is rare. Many movements have "radical wings" that more moderate organizations in the movement would characterize as having impracticable maximalist policy goals and careless epistemics. And the radical wings would of course criticize back that the "moderate wings" have insufficient or cowardly policy goals and epistemics optimized for respectability rather than truth. And the conflicts between them are intractable because people cannot move away from these prior beliefs about their interlocutors; in this respect the discourse around PauseAI seems unexceptional and rather predictable.
Well -- I'm gonna speak broadly -- if you look at the history of PauseAI, they are marked by the belief that the measures proposed by others are insufficient for Actually Stopping AI -- for instance, that the kinds of policy measures proposed by people working at AI companies aren't enough; that the kinds of measures proposed by people funded by OpenPhil are often not enough; and so on.
They are correct as far as I can tell. Can you identify a policy measure proposed by an AI company or an OpenPhil-funded org that you think would be sufficient to stop unsafe AI devel...
If you look at the kind of claims that PauseAI makes on their risks page, you might believe that some of them seem exaggerated, or that PauseAI is simply throwing every negative thing they can find about AI into a big list to make it seem bad. If you think that credibility is important to the effort to pause AI, then PauseAI might seem very careless about truth in a way that could backfire.
A couple notes on this:
When you visit the website for PauseAI, you might find some very steep proposals for Pausing AI [...] (I could train one)
Their website is probably outdated. I read their proposals as “keep the current level of AI, regulate stronger AI”. Banning current LLaMA models seems silly from an x-risk perspective, in hindsight. I think PauseAI is perfectly fine with pausing “too early”, which I personally don't object to.
If you look at the kind of claims that PauseAI makes on their risks page
PauseAI is clearly focused on x-risk. The risks page seems like an at...
I feel kind of silly about supporting PauseAI. Doing ML research, or writing long fancy policy reports feels high status. Public protests feel low status. I would rather not be seen publicly advocating for doing something low-status. I suspect a good number of other people feel the same way.
(I do in fact support PauseAI US, and I have defended it publicly because I think it's important to do so, but it makes me feel silly whenever I do.)
That's not the only reason why people don't endorse PauseAI, but I think it's an important reason that should be mentioned.
I notice they have a "Why do you protest" section in their FAQ. I hadn't heard of these studies before:
- Protests can and often will positively influence public opinion, voting behavior, corporate behavior and policy.
- There is no evidence for a “backfire” effect unless the protest is violent. Our protests are peaceful and non-violent.
- Check out this amazing article for more insights on why protesting works
Regardless, I still think there's room to make protests cooler and more fun and less alienating, and when I mentioned this to them they seemed very open to it.
Personally, because I don't believe the policy in the organization's name is viable or helpful.
As to why I don't think it's viable, it would require the Trump-Vance administration to organise a strong global treaty to stop developing a technology that is currently the US's only clear economic lead over the rest of the world.
If you attempted a pause, I think it wouldn't work very well, and it would rupture and leave the world in a worse place: some AI research is already happening in a defence context. This is easy to ignore while defence isn't the frontier. The current apparent absence of frontier AI research in a military context is miraculous, strange, and fragile. If you pause in the private context (which is probably all anyone could do), defence AI will become the frontier in about three years, and after that I don't think any further pause is possible, because it would require a treaty against secret military technology R&D. Military secrecy is pretty strong right now. Hundreds of billions of dollars are known to be spent yearly on mostly secret military R&D, and probably more is actually spent.
(to be interested in a real pause, you have to be interested in secret military R&D. So I am interested in that, and my position right now is that it's got hands you can't imagine)
To put it another way, after thinking about what pausing would mean, it dawned on me that pausing means moving AI underground, and from what I can tell that would make it much harder to do safety research or to approach the development of AI with a humanitarian perspective. It seems to me like the movement has already ossified a slogan that makes no sense in light of the complex and profane reality that we live in, which is par for the course when it comes to protest activism movements.
pausing means moving AI underground, and from what I can tell that would make it much harder to do safety research
I would be overjoyed if all AI research were driven underground! The main source of danger is the fact that there are thousands of AI researchers, most of whom are free to communicate and collaborate with each other. Lone researchers or small underground cells of researchers who cannot publish their results would be vastly less dangerous than the current AI research community even if there are many lone researchers and many small underground ...
For the US to undertake such a shift, it would help if you could convince them they'd do better in a secret race than an open one. There are indications that this may be possible, and there are indications that it may be impossible.
I'm listening to an Ecosystemics Futures podcast episode, which, to characterize it... it's a podcast where the host has to keep asking guests whether the things they're saying are classified or not, just in case she has to scrub it. At one point, while talking with a couple of other people who know a lot about government secrets, and in the context of situations where excessive secrecy may be doing a lot of harm, Lue Elizondo asserts, quoting Chris Mellon: "We won the cold war against the Soviet Union not because we were better at keeping secrets; we won the cold war because we knew how to move information and secrets more efficiently across the government than the Russians." I can believe the same thing could potentially be said about China too; censorship cultures don't seem to be good for ensuring availability of information, so that might be a useful claim if you ever want to convince the US to undertake this.
Right now, though...
The Trump-Vance administration's support base is suspicious of academia and has been willing to defund scientific research on the grounds of it being too left-wing. There is a schism emerging between multiple factions of the right wing: the right-wingers that are more tech-oriented and the ones that are nation/race-oriented (the H1B visa argument being an example). This could lead to a decrease in support for AI in the future.
Another possibility is that the United States could lose global relevance due to economic and social pressures from the outside world, and to organizational mismanagement and unrest from within. The AI industry could then move to the UK/EU, making the main players in AI the UK/EU and China.
A relevant FAQ entry: AI development might go underground
I think I disagree here:
By tracking GPU sales, we can detect large-scale AI development. Since frontier model GPU clusters require immense amounts of energy and custom buildings, the physical infrastructure required to train a large model is hard to hide.
This will change, and is only the case for frontier development. I also think we're probably in a hardware overhang. I don't think there is anything inherently difficult to hide about AI; that's likely just a fact about the present iteration of AI.
But I...
I think the concept of Pausing AI just feels unrealistic at this point.
PauseAI could gain substantial support if there's a major AI-caused disaster, so it's good that some people are keeping the torch lit for that possibility, but supporting it now means burning political capital for little reason. We'd get enough credit for "being right all along" just by having pointed out the risks ahead of time, and we want to influence regulation/industry now, so we shouldn't make Pause demands that get us thrown out of the room. In an ideal world we'd spend more time understanding current models, though.
supporting it now means burning political capital for little reason
I think this is wrong - the cost in political capital for saying that it's the best solution seems relatively low, especially if coupled with an admission that it's not politically viable. What I see instead is people dismissing it as a useful idea even in theory, saying it would be bad if it were taken seriously by anyone, and moving on from there. And if nothing else, that's acting as a way to narrow the Overton window for other proposals!
A. Many AI safety people don't support relatively responsible companies unilaterally pausing, which PauseAI advocates. (Many do support governments slowing AI progress, or preparing to do so at a critical point in the future. And many of those don't see that as tractable for them to work on.)
B. "Pausing AI" is indeed more popular than PauseAI, but it's not clearly possible to make a more popular version of PauseAI that actually does anything; any such organization will have strategy/priorities/asks/comms that alienate many of the people who think "yeah I support pausing AI."
C.
There does not seem to be a legible path to prevent possible existential risks from AI without slowing down its current progress.
This seems confused. Obviously P(doom | no slowdown) < 1. Many people's work reduces risk in both slowdown and no-slowdown worlds, and it seems pretty clear to me that most of them shouldn't switch to working on increasing P(slowdown).
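To make that concrete, here is a rough sketch in the thread's own notation (the decomposition is mine, not the commenter's): overall risk can be written as

P(doom) = P(slowdown) × P(doom | slowdown) + (1 − P(slowdown)) × P(doom | no slowdown).

Work that lowers P(doom | slowdown) and P(doom | no slowdown) pays off in both branches, whereas switching to pushing up P(slowdown) only pays off to the extent that P(doom | slowdown) is much lower than P(doom | no slowdown) and P(slowdown) is actually movable; this is roughly why "most of them shouldn't switch" doesn't require P(doom | no slowdown) to be small.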
B. "Pausing AI" is indeed more popular than PauseAI, but it's not clearly possible to make a more popular version of PauseAI that actually does anything; any such organization will have strategy/priorities/asks/comms that alienate many of the people who think "yeah I support pausing AI."
This strikes me as a very strange claim. You're essentially saying that even if a general policy is widely supported, it's practically impossible to implement any specific version of that policy? Why would that be true?
For example I think a better alternative to "nobody fund...
Obviously P(doom | no slowdown) < 1.
You think it's obviously materially less? Because there is a faction, including Eliezer and many others, that thinks the gap below 1 is epsilon, and claims that the reduction in risk from any technical work is less than the acceleration it causes. (I think you're probably right about some of that work, but I think it's not at all obviously true!)
Obviously P(doom | no slowdown) < 1.
This is not obvious. My P(doom|no slowdown) is like 0.95-0.97, the difference from 1 being essentially "maybe I am crazy or am missing something vital when making the following argument".
Instrumental convergence suggests that the vast majority of possible AGI will be hostile. No slowdown means that neural-net ASI will be instantiated. To get ~doom from this, you need some way to solve the problem of "what does this code do when run" with extreme accuracy in order to only instantiate non-hostile neural-net ASI (you nee...
Thank you for responding!
A: Yeah. I'm mostly positive about their goal to work towards "building the Pause button". I think protesting against "relatively responsible companies" makes a lot of sense when these companies seem to use their lobbying power more against AI-Safety-aligned Governance than in favor of it. You're obviously very aware of the details here.
B: I asked my question because I'm frustrated with that. Is there a way for AI Safety to coordinate a better reaction?
C:
...There does not seem to be a legible path to prevent possible existential risks
Some quick takes:
I think it would probably be bad for the US to unilaterally force all US AI developers to pause if they didn't simultaneously somehow slow down non-US development.
It seems to me that to believe this, you have to believe all of these four things are true:
I think we basically agree, but I think the Overton window needs to be expanded, and Pause is (unfortunately) already outside that window. So I differentiate between the overall direction, which I support strongly, and the concrete proposals and the organizations involved.
How much of a consensus is there on pausing AI?
Not much, compared to the push to get the stuff that already exists out to full deployment (for various institutions this has a meaningful impact on profit margins).
People don't want to fight that, even if they think that further capabilities are a bad price/risk/benefit tradeoff.
There is a coordination problem where, if you ask to pause and people say no, you can't make other asks.
3rd. They might just not mesh with, or trust, that particular movement and the consolidation of platform it represents, and so want to make points on their own instead of joining a bigger organization's demands.
One particular reason I haven't seen addressed much for why I don't support/endorse PauseAI, beyond the usual objections, is that there probably aren't going to be that many warning shots that can actually affect policy, at least conditional on misalignment being a serious problem (which doesn't translate to >50% probability of doom). The most likely takeover plan (at least assuming no foom/software intelligence explosion) fundamentally relies not on killing people, but on launching internal rogue deployments to sabotage alignment work and to find a way to control the AI company's compute, since causing a catastrophe/existential risk is much harder than launching an internal rogue deployment (without defenses).
So PauseAI's theory of change fundamentally requires that we live in a world where both alignment is hard and effective warning shots exist, and these conditions are jointly quite unlikely to hold, especially given that pausing is likely not the most effective action you could be taking from a comparative-advantage perspective.
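Spelled out as a conjunction, in the same notation the thread already uses (the numbers below are purely illustrative, not estimates from this comment):

P(pause theory of change applies) = P(alignment hard) × P(effective warning shots | alignment hard).

For example, 0.5 × 0.3 = 0.15: even if each condition is individually plausible, their conjunction can still be fairly unlikely, which is the shape of the argument above.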
I'm not going to say that PauseAI is net-negative; it has positive expected value, but IMO it's far less than a lot of pause advocates say:
Important part of the comment:
I think most of the effective strategies for AIs seeking power don't involve escalating to something which is much more likely to trigger a strong response than "the AI company caught the AI trying to escape". I think the best strategies are things like:
- Launch a rogue internal deployment.
- Sabotage a bunch of work done at the AI company. Or possibly some work done externally. This includes stuff like sabotaging alignment work, backdooring robot armies, backdooring future training runs, etc.
- Escape and then directly try to take over once your chances are sufficiently good that this is better than biding your time.
- Generally try to manipulate and persuade such that AI takeover is easier and more likely.
Of these, I think only escape could trigger a much stronger response if we catch it after it escalates some rather than before. I don't see how "we caught the AI trying to launch an unmonitored version of itself" is going to play that differently from "we caught that the AI did launch an unmonitored version of itself". Most of these don't escalate in a way that would trigger a stronger response, so catching them after the fact is similar to catching an attempt. (In some cases where reversion is possible, like work sabotage, there might be no meaningful distinction.) Further, without some effort on control, we might be much less likely to catch either! And, in some cases, control measures I'm interested in focus on after-the-fact detection.
I don't think survivable worlds, at our point in time, involve something like PauseAI. I don't condemn them, and welcome people to try. But it's feeling more and more like Hiroo Onoda, continuing to fight guerrilla warfare in the Philippines for decades, refusing to believe the war was over.
I think AI safety has very limited political capital at the moment. Pausing AI just isn’t going to happen, so advocating for it makes you sound unreasonable and allows people to comfortably ignore your other opinions. I prefer trying to push for interventions which make a difference with much less political capital, like convincing frontier labs to work on and implement control measures.
Quick list of reasons for me:
One frustration I have about people on LessWrong and elsewhere is that they love criticizing every piece of advice or strategy while never truly supporting any alternatives.
Most upvoted comments here argue against PauseAI, or even claim that asking for a pause overall is a waste of political capital...!
Yet I remember when I proposed an open letter arguing for government funding for AI alignment, the Statement on AI Inconsistency. After writing emails and private messages, the only reply was "sorry, this strategy isn't good, because we should just focus on pausing AI."
I feel my open letter is more likely to succeed than pausing AI (I'm demanding that the AI alignment budget be "belief-consistent" with the military budget).
When politicians reject pausing AI, they just need the easy belief of "China must not win," or "if we don't do it someone else will." But for politicians to reject my open letter, they need the difficult belief of being 99.999% sure of no AI catastrophe, thus 99.95% sure most experts are wrong.
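A sketch of the arithmetic behind those figures (the 2% expert estimate and the budget ratio below are my illustrative assumptions, not numbers from the letter): if "belief-consistent" funding means the alignment budget should stand to the military budget roughly as the probability of AI catastrophe stands to the risks the military budget addresses, then rejecting any meaningful alignment budget (say, anything above 1/100,000 of the military budget) commits you to P(AI catastrophe) ≤ 10^-5, i.e. being ~99.999% sure there will be no catastrophe. And if experts put the risk at around 2%, then writing your own estimate as P = w × 0 + (1 − w) × 0.02, where w is your credence that the experts are simply wrong, P ≤ 10^-5 forces 1 − w ≤ 5 × 10^-4, i.e. w ≥ 99.95%.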
Regardless, where are the people who favour the middle ground? Who neither argue that "asking for a pause is a waste of political capital because it's hopeless," nor argue that "asking for government funding is a waste of time, because we should just focus on pausing AI?"
It's like the status game of criticism, that Wei Dai pointed out.
I think it's plausible that a system which is smarter than humans/humanity (and distinct and separate from humans/humanity) should just never be created, and I'm inside-view almost certain it'd be profoundly bad if such a system were created any time soon. But I think I'll disagree with like basically anyone on a lot of important stuff around this matter, so it just seems really difficult for anyone to be such that I'd feel like really endorsing them on this matter?[1] That said, my guess is that PauseAI is net positive, tho I haven't thought about this that much :)
Supporting PauseAI makes sense only if you think it might succeed; if you think the chances are roughly 0, then it might carry some cost (reputation, etc.) without any real benefit.
tl;dr:
From my current understanding, one of the following two things should be happening, and I would like to understand why it isn't:
Either
Everyone in AI Safety who thinks slowing down AI is currently broadly a good idea should publicly support PauseAI.
Or
There does not seem to be a legible path to prevent possible existential risks from AI without slowing down its current progress.
I am aware that many people interested in AI Safety do not want to prevent AGI from being built EVER, mostly based on transhumanist or longtermist reasoning.
Many people in AI Safety seem to be on board with the goal of “pausing AI”, including, for example, Eliezer Yudkowsky and the Future of Life Institute. Neither of them is saying “support PauseAI!”. Why is that?
One possibility I could imagine: Could it be advantageous to hide “maybe we should slow down on AI” in the depths of your writing instead of shouting “Pause AI! Refer to [organization] to learn more!”?
Another possibility is that the majority opinion is actually something like “AI progress shouldn’t be slowed down” or “we can do better than lobbying for a pause” or something else I am missing. This would explain why people neither support PauseAI nor see this as a problem to be addressed.
Even if you believe there is a better, more complicated way out of AI existential risk, the pausing AI approach is still a useful baseline: Whatever your plan is, it should be better than pausing AI and it should not have bigger downsides than pausing AI has. There should be legible arguments and a broad consensus that your plan is better than pausing AI. Developing the ability to pause AI is also an important fallback option in case other approaches fail. PauseAI calls this “Building the Pause Button”:
Some argue that it’s too early to press the Pause Button (we don’t), but most experts seem to agree that it may be good to pause if developments go too fast. But as of now we do not have a Pause Button. So we should start thinking about how this would work, and how we can implement it.
Some info about myself: I'm a computer science student and am familiar with the main arguments of AI Safety. I have read a lot of Eliezer Yudkowsky, did the AISF course reading and exercises, and have watched Robert Miles' videos.
My conclusion is that either
Everyone in AI Safety who thinks slowing down AI is currently broadly a good idea should publicly support PauseAI.
Or
Why is (1) not happening and (2) not being worked on?
How much of a consensus is there on pausing AI?