Holly is an independent AI Pause organizer, which includes organizing protests (like this upcoming one). Rob is an AI Safety YouTuber. I (jacobjacob) brought them together for this dialogue, because I've been trying to figure out what I should think of AI safety protests, which seem like a possibly quite important intervention; and Rob and Holly seemed like they'd have thoughtful and perhaps disagreeing perspectives.

Quick clarification: At one point they discuss a particular protest, which is the anti-irreversible proliferation protest at the Meta building in San Francisco on September 29th, 2023 that both Holly and Rob attended.

Also, the dialogue is quite long, and I think it doesn't have to be read in order. You should feel free to skip to the section title that sounds most interesting to you.  


Let's jump right into it. Rob, when I suggested discussing protests, you said you were confused about: "How do we get to be confident enough in anything to bring it the energy that activism seems to require?" I very much agree with this confusion! 

Robert Miles

This is a pretty standard joke/stereotype, the XKCD protest where the signs are giant and full of tiny text specifying exactly what we mean. My protest sign just said "Careful Now", in part because I'm not sure what else you can fit on a sign that I would fully endorse.


There's a big difference in communication style between something like science or LessWrong and advocacy. You have less bandwidth in advocacy. It's much closer to normal speech, where we don't exactly qualify every statement-- and that's in many ways more accurate as a practice considering how short of a window you have for that kind of communication. "Pause AI" does the job where an exact description of how you implement the policy does not-- it would be too confusing and hard to take in. And similarly the resolution of policy you have to discuss in advocacy is a lot lower (or the concepts higher level), so you can endorse a very broad policy aim that you do support while having a lot of genuine uncertainty about what mechanism or exact approach to use.


(A related old, good post: You Get About Five Words. That post also suggests that you have 3 words left. Could spend them on "PauseAI: but not sure" )


(Okay this may get us to a crux: that's a terrible idea)


(Lol, pretty excited for that thread to be hashed out if it feels interesting to Rob)

Robert Miles

(I think I agree that throwing generic uncertainty into your slogan is a waste of words, when you could instead add concrete nuance. Like "Pause Frontier AI", "Pause AGI", "Pause God-like AI", whatever) 


(Agree. I just wouldn't use a word that's too technical. I like "Pause God-like AI" except maybe some people think an AI can never be like a god categorically or something. More qualifiers can increase opportunities for confusion when people basically have the right idea to start.)

(Well, "basically having the right idea" might be what's at issue.) 

Robert Miles

Replying to the previous point: the other side of this is just that policy seems so complicated and uncertain, I really honestly do not know what we should do. It feels almost dishonest to yell that we should do policy X when I'm like at least 20% that it would make things worse.


Hmm, yeah, I hear that.  The reason I'm still in favor of advocating a policy in that situation is that the nature of the update the audience is making is often a level or two above that. So we say "Pause AI", but what they take away is "AI is dangerous, there is a solution that doesn't involve making AI, it's feasible to support this solution..."

(I also happen to think Pause AI is a really good advocacy message in that it's broad, points in a meaningful direction, and is not easily misinterpreted to mean something dangerous. There are lots of conceivable versions of pausing and I think most of them would be good.)

Robert Miles

I think it could be misinterpreted to mean "pause all AI development and deployment", which results in a delayed deployment of "sponge safe" narrow AI systems that would improve or save a large number of people's lives. There's a real cost to slowing things down.


I do mean pause development and deployment, I think. What I really want to accomplish is getting us out of this bizarre situation where we have to prove that AI is dangerous instead of AI developers having to prove that it is safe for them to proceed. I'm not even sure there is a safe way to learn that high level AI is safe! I want society to see how ludicrous it is to allow AI development or deployment to continue given this massive risk and uncertainty.

That feels like the more important message to me overall than acknowledging that AI would also have benefits, though I'm happy to acknowledge that in the proper setting-- where the emphasis can remain correct.

Robert Miles

This feels closer to a crux for me. I basically think that most AI is broadly good, has been broadly good and will stay broadly good for at least a little while. There are serious safety concerns but they're not different in kind from any other technology, and our society's normal way of developing technology and responding to new tech is probably up to the task. Like, you allow companies to develop AI and deploy it, and then learn that it's biased, and people yell at the companies, and governments do things, and they fix it, and overall you're doing better than you would be doing if every company had been required to legibly demonstrate that the system didn't have any problems at all before deploying it, because then you'd slow everything down so much that a lot of problems would stay unsolved for longer, which has a higher cost.


You don't think there are more day 0 issues with AI than other technologies? What if the early mistakes of a powerful AI are just too big to allow for iterative correction?

It's not different from other technology in kind but it is different in the magnitude of capabilities and the potential consequences of unforeseen problems.


Besides the object-level question about whether AI is too dangerous for iterative correction, there's the issue of whether advocates need to be fair to the category of AI or just focus on the worst dangers. I'm not that concerned if people come away with a picture of AI that's biased toward danger because I'm talking about the issue I consider most serious. I want people to have accurate ideas, but there's only so much I can communicate, and I put the warning first.

(Wrote this about how characterizing whether technology as a whole is good or bad is beside the point to kind of get at the same thing: https://hollyelmore.substack.com/p/the-technology-bucket-error )

Robert Miles

Yeah, I guess this kind of thing is roughly proportional to the power of the tech and the rate at which it's being developed. And of course, people protesting is part of society's existing mechanisms for dealing with such things, so it's not an argument not to do it. But I think the AGI case really is fundamentally different from all previous tech, because

  1. Potentially a real discontinuity in power. If AGI automates your research and you get the next 50 years of new tech over the course of a year, that feels like a difference in kind
  2. Normal tech (including narrow AI) basically allows people to get more of what they want, and most people want basically good things, so normal tech ends up mostly good. AGI creates the possibility for a shift from tech broadly getting people what they want, to the tech itself getting what it wants, which could be almost anything, and completely separate from human values

I feel like the argument for responding strongly to AGI risk is weakened by being associated with arguments that should also apply to like, the internet

Robert Miles

Like, there's a whole argument which has been going on forever, about new technology, and "Is this tech or that tech good or bad", and most of the discussion about current AI fits into that tradition fine, and I want to say "No, AGI is its own, distinct and much more serious thing"


"I feel like the argument for responding strongly to AGI risk is weakened by being associated with arguments that should also apply to like, the internet"

I'm not sure that it is, from an advocacy perspective. It is in argument, but I genuinely don't know if the will to deal with x-risk is weakened by bringing up other arguments/issues in an advocacy context. There happen to be a lot of reasons to Pause, ranging from fairly old concerns about job displacement from automation to x-risk concerns about other agents usurping us and not caring about what's good for us. 

Robert Miles

I think if you say one strong thing, people who disagree can ignore you or respond to your strong thing. If you say three things, people who disagree can respond just to whichever thing is the weakest, and appear to win



"I think if you say one strong thing, people who disagree can ignore you or respond to your strong thing. If you say three things, people who disagree can respond just to whichever thing is the weakest, and appear to win"

This reminds me of the coining of the term "AI-notkilleveryoneism".

(Meta-note: At this point in the dialogue, there was a brief pause, after which Rob and Holly switched to having a spoken conversation, that was then transcribed.)

Techno-optimism about everything... except AGI

Robert Miles

Ok where were we? I think I was describing a frustration I sometimes have with these arguments? I'll try to recap that.

So I am still mostly a techno-optimist, to the extent that that's a coherent thing to be (which I think it broadly isn't… to be "an optimist" is probably a bucket error.) But yeah, technology, broadly speaking, gets people more of what they want. And this can sometimes be a problem when accidents happen, or when the thing people want isn't actually what's good for them. 

But usually, people getting what they want is good. It's actually a reasonable basis for a definition of what ‘good’ is, as far as I'm concerned. And so most technology just works out good overall. If you only had one dial that was, like, “more” or “less”  - and I know that you don't, but if you did - technology comes out as being good, very good in fact. And then, AGI is this weird other thing, where you have a technology which gets more of what it wants, rather than more of what people want. And this makes it unique in the class of technology and therefore needs a different approach … so when talking about AI risk, it’s fraught to use arguments which also would seem to apply to technology in general, since a bunch of other tech that people were worried about turns out to be fine. It feels like it makes it easy for people to address that part, or pull it into this existing debate and ignore the part of it that I think is really important -- which is the AGI side of things.


I think this gets at another general theme that was intended for this discussion: advocacy communication is different than communicating on LessWrong. It's less precise, you have less room. 

I can agree that it's a pitfall for somebody to be able to pattern match arguments about AGI danger to arguments they’ve already heard about technology in general that almost all resolved in favor of technology being good. That does seem like it could weaken the argument. I also think, for people who are really educated on the topic and for a lot of people in our circles who know a lot about tech and are generally broadly pro-tech, that might be a really important argument for them psychologically. It might be really important to them that technology has always turned out well in the past. 

But of course I should note that a lot of people in the world don't think that technology has always turned out well in the past, and a lot of people are really concerned about it this time, even if it's happened a million times before. 

Robert Miles

Yeah. This actually ties into another thing which frustrates me about this whole situation: the conversation seems... too small and too narrow for the topic? In the sense that I should expect there to be an advocacy group going “Yeah, we're Luddites, and proud! We don't like where this technology is going, we don't like where technology is going in general, and we want to stop it.”

And I want to yell at that group, and have that be part of the conversation. I don't want that to be the same group that is also making sensible claims that I agree with. I want both of those to exist separately.

I want the Overton window to have people in it all across this range, because these are real positions that people hold, and that people who aren't crazy can hold. I think they're wrong: but they're not bad people and they're not crazy. It's frustrating to have so few people that you need the individual advocacy groups to cover a broad range of the spectrum of opinion, within which there's actually a lot of conflict and disagreement. Currently there's just so few people that I have to be like “They're right about some things, and overall it's good for their voice to be there, so I don't want to yell at them too much”, or something like that.

If you're THE Pause AI advocacy group, it puts you in a difficult situation. I'm now sympathising a bit more with it, in that you're trying to get support from a wider range of people, and maybe you have to sacrifice some amount of being-correct to do that. But it would be cool to have everybody be advocating for the thing they actually think is correct - for there to be enough people with enough range of opinions for that to be a sensible spectrum within which you can have real discussions.


I would argue that what's being sacrificed isn't correctness, but rather specificity. Less specific statements can still be true or clear. 

The message of PauseAI is broad; it's not saying "I want AI paused this specific way," but rather "I want AI paused indefinitely." "Pause AI" doesn't specify how long, like how many years-- I understand that various factors could influence the duration it's actually paused for even if we get our way. In advocacy, you often get to push in a direction rather than to precise coordinates. 

A significant issue with alignment-based messages is that, if misunderstood, they can seem to advocate the opposite of their intended purpose. PauseAI doesn't suffer from this ambiguity. It's aiming to shift the burden of proof from "we need to prove AI is dangerous" to "we have reasons to believe AI is different, and you have to show that building it is safe before you do it." While there are compelling reasons to hasten AI's development due to its potential benefits, they're not overwhelmingly compelling. For the majority of the world, waiting a bit longer for AGI in exchange for increased safety is a fair trade-off, especially if they understood the full implications.

Robert Miles

It’s a difficult one. I believe the reason people aren't eager for AGI to arrive two years sooner is because they aren't aware.... either they don't realise it's possible, or they don't understand the full implications.


I believe there are many other issues where people don't appreciate the urgency. For instance, inventing new vaccines a year sooner can mean thousands of lives saved. Usually, I'm the one emphasizing that. It's true that many don't feel the urgency then, and similarly, they don't feel the urgency now. But in this case, their usual inclination might lead them in the right direction... AI risk could be mitigated with more time. While I'd be sad if AGI couldn't be safely built, I think we should be pushing a policy that accommodates the possibility that we can't make it safely.

Robert Miles

Well it'll never be truly safe, insofar as "safe" is defined as having zero risk, which is clearly not achievable. So the question is: at what point does the benefit of doing something outweigh its risk? We're definitely not there currently.


I actually think advocacy offers a way to sidestep that question. Precisely how much risk should we tolerate for what benefit seems like a subjective question. However, most people are in fact fairly risk-averse on matters like this. If you ask most stakeholders, meaning the global population [NB: the stakeholders are the population of Earth but the polls I was thinking of are of Americans], they generally favor a risk-averse approach. That seems correct in this context. It feels different from the kind of risk/reward we're typically set up to think about. We're not particularly equipped to think well about the risk of something as significant as a global nuclear war, and that's arguably more survivable than getting AI wrong.

I feel I am more risk-averse than a lot of our friends. My instinct is, let's just not go down that path right now. I feel like I could get the right answer to any hypothetical if you presented me with various fake utilities on the different sides.  Yet the real question is, do we take the leap now or later? I assume that any pause on AI advancement would eventually be lifted unless our society collapses for other reasons. I think that, in the long run, if it's possible, we'll achieve safe AGI and capture a lot of that value because we paused.

Cluelessness and robustly good plans

Robert Miles

Hmm... I think I'm convinced that pausing is better than not pausing. The thing I'm not convinced of is that the world if you advocate pausing is better than the world if you don't advocate pausing.


This is interesting, because I was talking to one of the co-organizers for the October 21st protest, and he said that actually he wasn't sure a pause is good, but he was more sure that advocating for a pause is good. He came away more convinced that advocacy was solidly good than that pause was the right policy decision.

Robert Miles

This comes back to the very first thing... When it comes to the technical stuff, you can sit down and take it apart. And it's complicated, but it's the kind of complicated that's tractable. But for advocacy... it feels so hard to even know the sign of anything. It makes me want to just back away from the whole thing in despair.


I have felt that people I know in AI safety think that government/politics stuff is just harder than technical stuff. Is that kind of what you're thinking?

Robert Miles

Yeah. It's so chaotic. The systems you're working with are just extremely... not well behaved. The political situation is unprecedented, and has been various different kinds of unprecedented at every point in the last century at least. I'm sure there are people who've read a ton of history and philosophy and believe they can discern the grand arcs of all these events and understand what levers to pull and when, but I'm not convinced that those people actually know what they think they know - they just get so few feedback loops to test their models against reality. It seems very common for unexpected factors to emerge and cause actions to have the opposite of their intended effect.

That's not a reason not to try. But it gives me a feeling of cluelessness that's not compatible with the fervour that I think advocacy needs.


I do just feel pretty good about the pause message. It’s one I consistently resonate with. And not every message resonates with me. For instance, the message of the protest you attended at Meta turned out to be a more complex message to convey than I expected.

[The idea that advocacy is intractable but the technical problem isn't] might be a double crux?

Sounds like you're more concerned about hindering technological progress, and threading that needle just right… but I feel like what we're talking about is just a loss of some decades, where we have to sort of artificially keep AGI from coming, because if you allowed everything to just develop at its pace then it would be here sooner but it wouldn't be safe.

That's what I imagine when I say “pause”.  And I think that's clearly a worthwhile trade.  And so I feel much happier to just say “pause AI until it's safe”, than I ever did about any policy on alignment.

Robert Miles

The only thing I've always felt is obviously the thing to do is “speed up and improve alignment research”. Just direct resources and attention at the problem, and encourage the relevant people to understand the situation better. This feels fairly robust. 

(Also, on some kind of aesthetic level, I feel like if you make things worse by saying what you honestly believe, then you’re in a much better position morally than if you make things worse by trying to be machiavellian.)


I am no expert on AI, but I come away feeling that we're not on track to solve this problem. I'm not sure if promoting alignment research alone can address the issue, when we maybe only have like 5 years to solve the problem.  

Robert Miles

Yeah... and I don't know, maybe encouraging labs to understand the situation is not even robustly good. It's difficult to get across “Hey, this is a thing to be taken seriously” without also getting across “Hey, AGI is possible and very powerful” - which is a message which... if not everyone realises that, you're maybe in a better situation.

Taking beliefs seriously, and technological progress as enabler of advocacy 


I believe that merely providing people with information isn't sufficient; they often don't know how to interpret or act on it. One reason I turned to protests is that they send a clear message. When people see a protest, they instantly grasp that the protestors find something unacceptable.  This understanding is often more profound than what one might derive from a blog post weighing the pros and cons.

People tend to assume that if they were privy to crucial information that the majority were unaware of, especially if the stakes were world-altering, they would be out on the streets advocating for change.  There was confusion around why this wasn't the case for AI safety. I personally felt confused. 

(The most extreme case would be when Eliezer suggested that no amount of violence could be justified to halt AGI, even if it was going to kill everyone on earth… I also think that you should have a deontological side constraint against violence because it's always so tempting to think that way, and it just wouldn't be helpful; it would just end up backfiring. But I was confused, and I really appreciated the TIME article where he mentions enforcing treaties using the same state violence as for other international treaties.) 

Still, for a long time, I was unsure about the AI safety community's commitment due to their seeming reluctance to adopt more visible and legible actions, like being out in the streets protesting… even though I myself am not typically one to attend protests.

Robert Miles

I think it’s maybe a dispositional thing. I feel like I am just not a person who goes to protests. It’s not how I engage with the world.

And I think this is also part of where I stand on technology in general. Technology is such a big lever, that it kind of almost... ends up being the most important thing, right? It’s like…

Okay, well now I'm gonna say something controversial. People, I think, focus on the advocates more than the technologists, in a way that's maybe a mistake?


I already know I agree.

Robert Miles

You can advocate for women's rights, but you can also invent effective, reliable birth control. Which one will have the bigger impact? I think it's the technology. Activists need to push for change, but it's technology that makes the change politically doable. I think people are like... the foam on the top of the enormous waves of the grand economic shifts that change what's politically and socially viable. For instance, I think our ability to get through the climate crisis mostly depends on renewable energy tech (if AGI somehow doesn't happen first) - make clean energy cheaper than fossil fuels, people will switch. I think all the vegan activism might end up being outweighed by superior substitute meats. When there's meat products that are indistinguishable from farmed meat but a tiny bit cheaper, the argument for veganism will mysteriously become massively more compelling to many. In both those cases, one big thing advocates did was create support for funding the research. I even think I remember reading someone suggest that the abolitionist movement really got traction in places that had maple syrup, because those people had access to a relatively inexpensive alternative supply of sugar, so being moral was cheaper for them. Obviously, activists are essential to push the arguments, but often these arguments don't gain traction until the technology/economics/pragmatics make them feasible. People are amazingly able to believe falsehoods when it's convenient, and technology can change what's convenient.


We could call this the “efficient market hypothesis of history” or something. Because I hear this a lot, and it's also part of my worldview. Just not enough to make me not consider activism or advocacy. I think it's all generally true and I pretty much agree with it, but at the same time you can still find $20 on the ground.

Robert Miles

I actually did last week :)

[Added later: That makes me think, maybe that's the way to do it: rather than looking at what changes would be best, you look for the places where the technology and economics is already shifting in a good direction and all that's needed is a little activism to catalyse the change, and focus there. But probably there are already enough activists in most places that you're not doing much on the margin.]


[Added later, in response: there are definitely not enough activists for AI Safety at this moment in time! It makes sense there would be "market failure" here because understanding the issue up until now has required a lot of technical know-how and running in a pretty advocacy-averse social circle.]

Perhaps it's a form of modest epistemology to defer so heavily to this, to override what we're actually witnessing. I even think there are some individuals who kind of know what you can do with advocacy.  They know about policy and understand that influencing even a small number of people can shift the Overton window.  And still, even those people with an inside view that advocacy can sometimes work, will struggle with whether it's the right approach or feel like somehow it shouldn’t work, or that they should be focusing on what they deem more generally important (or identify with more).

However, I think what we're discussing is a unique moment in time. It's not about whether advocacy throughout history can outpace technology. It's about whether, right now, we can employ advocacy to buy more time, ensuring that AGI is not developed before sufficient safety measures are in place. 

Robert Miles

There's definitely a perspective where, if you have the ability to do the technical work, you should do it. But that’s actually quite a small number of people. A larger number have the ability to advocate, so they bring different skills to the table.

Actually, I think I’ve pinned down a vibe that many who can do technical AI safety work might resonate with. It feels akin to being assigned a group project at school/university. You've got two choices: either somehow wrangle everybody in this group to work together to do the project properly, or just do the whole thing yourself. Often the second one seems a lot easier.


Though if you don't understand that the exercise was about learning to work in a group (instead of doing the assignment) you can miss out on a lot of education…

Robert Miles

Yeah. Hmm... maybe this is actually a mis-learned lesson from everyone's education! We learned the hard way that "solving the technical problem is way easier than solving the political one". But group project technical problems are artificially easy.


I am kind of annoyed at myself that I didn't question that vibe earlier in the community, because I had the knowledge… and I think I just deferred too much. I thought, “oh, well, I don't know that much about AI or ML-- I must be missing something”. But I had everything I needed to question the sociopolitical approaches. 

My intuitions about how solvable the technical problem is are influenced by my background in biology. I think that biology gives you a pretty different insight into how theory and real world stuff work together. It’s one of the more Wild West sciences where stuff surprises you all the time. You can do the same protocol and it fails the same way 60 times and then it suddenly works. You think you know how something works, but then it suddenly turns out you don't.

So all my instincts are strongly that it's hard to solve technical problems in the wild -- it's possible, but it's really hard.

The aesthetics of advocacy, and showing up not because you're best suited, but because no one else will

Robert Miles

I think there’s an aesthetic clash here somewhere. I have an intuition or like... an aesthetic impulse, telling me basically “advocacy is dumb”. Whenever I see anybody Doing An Activism, they're usually… saying a bunch of... obviously false things? They're holding a sign with a slogan that's too simple to possibly be the truth, and yelling this obviously oversimplified thing as loudly as they possibly can? It feels like the archetype of overconfidence.

It really clashes with my ideal, which is to say things that are as close to the truth as I can get, with the level of confidence that's warranted by the evidence and reasoning that I have. It's a very different vibe.

Then there's the counter-argument that claims that empirically - strategically, or at least tactically - the blatant messaging is what actually works. Probably that activist person actually has much more nuanced views, but that's just not how the game is played. But then part of me feels like “well if that is how the game is played, I don't want to play that game”. I don't know to what extent I endorse this, but I feel it quite strongly. And I would guess that a lot of the LessWrong-y type people feel something similar.


I think people haven't expressed it as clearly as you, but I've gotten sort of that same vibe from a lot of people.

And… it does kind of hurt my feelings. Because I also want to be accurate. I'm also one of those people. 

I think I just have enough pragmatism that I was able to get past the aesthetic and see how activism works? I was also exposed to advocacy more because I have been involved with animal welfare stuff my whole life.

I had this experience of running the last protest, where I had a message I thought was so simple and clear... I made sure there were places where people could delve deeper and get the full statement. However, the amount of confusion was astonishing. People bring their own preconceptions to it; some felt strongly about the issue and didn't like the activist tone, while others saw the tone as too technical. People misunderstood a message I thought was straightforward. The core message was, "Meta, don't share your model weights." It was fairly simple, but then I was even implied to be racist on Twitter because I was allegedly claiming that foreign companies couldn’t build models that were as powerful as Meta’s?? While I don't believe that was a good faith interpretation, it’s just crazy that if you leave any hooks for somebody to misunderstand you in a charged arena like that, they will. 

I think in these cases you’re dealing with the media in an almost adversarial manner.  You're constantly trying to ensure your message can't be misconstrued or misquoted, and this means you avoid nuance because it could meander into something that you weren't prepared to say, and then somebody could misinterpret it as bad.

This experience was eye-opening, especially since I don't consider myself very politically minded. I had to learn these intricacies the hard way. I believe there's room for more understanding here. There's a sentiment I've sensed from many: "I'm smart and what you're doing seems unintelligent, so I won't engage in that." Perhaps if people had more experiences with the different forms of communication, there would be more understanding of the environment the different kinds of messages are optimized for? It feels like there's a significant inferential gap.

Robert Miles

Yeah, that sounds frustrating. I guess there’s something about… “rationality”. So, the task of trying to get the right answer, on a technical or philosophical question where most people get the wrong answer, feels like... semiconductor manufacturing, or something like that. Like "In this discipline I’ve tried my best to master, I have to make sure to wear a mask, and cover my hair so that no particles get out into the air. This entire lab is positive air pressure, high end filtration systems. I go through an airlock before I think about this topic, I remove all contamination I can, remove myself from the equation where possible, I never touch anything with my hands. Because I'm working with delicate and sensitive instruments, and I'm trying to be accurate and precise…". It sounds pretty wanky but I think there's a real thing there.

And then in this metaphor the other thing that you also have to do in order to succeed in the world is like... mud wrestling, or something. It may not be an easier or less valuable skill, but it really doesn't feel like you can do both of these things at once.


What if it were phrased more like “There's this tech. It's pretty low tech. It's a kludge. But it works”? 

I remember the exact first moment I cried about AI safety: it’s when I saw the polls that came out after the FLI letter, saying -- and I forget the exact phrasing, but something like -- most people were in favor of a pause. “Oh my god”, I thought, “we're not alone. We don't have to toil in obscurity anymore, trying to come up with a path that can work without everybody else's help.” I was overjoyed. I felt as proud and happy discovering that as I would've been if I found a crucial hack for my really important semiconductor.

Robert Miles

Hmm... Yeah, it clearly is important and can work, but my gut reaction, when I'm inhabiting that clean-room mindset, is I still feel like I want to keep it at arm's length? I don't trust this low tech kludge not to spring a leak and mess up the lab.

When you start arguing with... (by you I mean 'one' not you personally) - When one starts arguing with people using the discussion norms of political debate, and you're trying to win, I think it introduces a ton of distortions and influences that interfere. Maybe you phrase things or frame things some way because it's more persuasive, and then you forget why you did that and just believe your distorted version. You create incentives for yourself to say things more confidently than you believe them, and to believe things as confidently as you said them. Maybe you tie your identity to object level questions more than necessary, and/or you cut off your own line of retreat, so it's extra painful to notice mistakes or change your mind, that kind of thing. I notice this kind of thing in myself sometimes, and I expect I fail to notice it even more, and I'm not even really trying to fight people with my work, just explain things in public.

I feel like there’s kind of a trilemma here. You've got to choose between either 1) truth/'rationality'/honesty, 2) pragmatism, for getting things done in the messy world of politics, or 3) a combination of the two different modes where you try to switch between them.
Choosing 1 or 2 each gives up something important from the other, and choosing 3 isn't really viable because the compartmentalisation it requires goes against 1.

I have this feeling about the truth, where you don't get to make exceptions, to sometimes believe what's pragmatically useful. You don't get to say, “in this situation I’ll just operate on different rules”. I feel like it’s not safe to do that, like human hardware can't be trusted to maintain that separation.


I guess I can't bring myself to be that concerned about contamination... I do think it's possible to switch modes and I find rationalists are often quite paranoid about this, like it's one toe out of line and they can never sincerely value and seek truth again. I have struggled with scrupulosity (OCD) and it reminds me of that. 

Maybe it’s because I wasn't into Rationality™ until later in my life, so I already had my own whole way of relating to the truth and how important the truth was. It wasn't so much through “speech codes” or things advocacy seems to violate? 

Robert Miles

Yeah. Maybe that 'clean room' mindset was just what was needed in order to recognise and address this problem seriously 15+ years ago… a mindset strongly focused on figuring out what's true, regardless of external opinions or how it would be perceived. At that time, societal factors might have prevented people from seeing the true scope of the situation. Now, the reality is starting to smack everyone in the face, so, perhaps that intense focus isn't as essential anymore. But by now, the most prominent individuals in the field have been immersed for a long time in this mindset that may not be compatible with an advocacy mindset? (This is just a hypothesis I randomly came up with just now.)


I'm not sure I possess an advocacy mindset, nor am I convinced it's mandatory. I do, however, see value in advocacy, perhaps more clearly than many on LessWrong. Discussing the virtues you mentioned as necessary 15 years ago to perceive this issue, I may not embody those exact virtues, but I identify with many of them.

I feel like they were necessary for me to be able to pursue advocacy because AI Safety/LessWrong were so kneejerk against it. Even though Eliezer had voiced support for international politics on this matter, I have felt like an outsider. I wouldn't have ventured into this realm if I was constantly worried about public opinion. I was driven by the belief that this was really the biggest opportunity I saw that other people weren't taking.  And I felt their reluctance was tied to the factors you highlighted (and some other things).

It's not a big crowd pleaser. Nobody really likes it.

Early on, some conveyed concerns about protesting, suggesting it might be corrupting because it was something like “too fun”. They had this image of protesters delighting in their distinct identity.  And sorry, but like, none of us are really having fun. I'm trying to make it fun. I think it could and should be fun, to get some relief from the doom. I would love it if in this activity you didn't have to be miserable.  We can just be together and try to do something against the doom.

Maybe it could be fun. But it hasn't been fun.  And that’s mainly because of the resistance of our community, and because of the crowds that just, like… man, the open source ML community was pretty nasty. 

So I was hearing from people what seemed like you were saying: “oh it would be so easy to let your clean room get contaminated; you could just slide down this incentive gradient”: and that’s not what's happening. I'm not just sliding down a gradient to do what's easier, because it's actually quite difficult. 

Finally: everybody thinks they know how to run a protest, and they keep giving me advice. I'm getting a ton of messages telling me what I should have done differently. So much more so than when I did more identity-congruent technical work.  But nobody else will actually do it.


Before Rob replies, I want to express some empathy hearing that. It sure sounds like a slog. I haven't made up my mind on what I think of protesting for AI safety, but I really do have deep sympathy for the motion of "doing what seems like the most important thing, when it seems no one else will". I recognise that trying that was hard and kind of thankless for you. Thank you for nonetheless trying. 

Robert Miles

Yeah, I appreciate it too. I think your activism is pretty different from the stereotype. I also feel the empathy for "nobody else will actually do it", because I too have felt that way for a while about YouTube, and I guess broader outreach and communication stuff. 

Because, it's very weird that this is my job! And for a long time I was the only one doing it. I was like, "Am I really the best qualified person to do this? Really? Nobody else is doing it at all, so I guess I'll do it, I guess I'll be the best person in the world at this thing, simply because there's literally nobody else trying?". For years and years I felt like "Hey, AI Safety is obviously the most interesting thing in the world, and also the most important thing in the world… and there's only one YouTube channel about it? Like… what!?"

And all these researchers write papers, and people read the papers but don't really understand them, and I'm like "Why are you investing so little in learning how to write?", "Why is there so little effort to get this research in front of people?".


No, then you'd get real criticism :)

Robert Miles

It's just confusing! If you’re a researcher, are you banking on being in the room when the AGI happens? At the keyboard? Probably not. So whatever impact your work has, it routes through other people understanding it. So this is not an optional part of the work to do a good job at. It's a straightforward multiplier on the effect of your work. 

It seems you had some similar feelings: "this seems critical, why is it not being done by anyone else?"


Yeah. I looked around thinking I would love to volunteer for an advocacy campaign, if people are going to be doing more advocacy. There were other people early on in other cities, and there were people interested in a more political approach in the Bay Area, but no one else was willing to do public advocacy around here. And I could feel the ick you're describing behind it.

Robert Miles

Man. It sounds like diving hard into advocacy is kind of like coming to school in a clown suit.



There was also another thing that made me feel compelled to pursue it. It was similar to when, as a kid vegetarian, I always heard such terrible arguments from adults about eating meat. There was just no pressure to have a good argument. They already knew how it was gonna turn out: society had ruled in their favor, and I was a kid. Yet it was so clearly wrong, and it bothered me that they didn't care. 

Then, after Eliezer's Time article came out, I once again saw a wall of that from people not wanting to move in a political direction. People I really thought had the situation handled were just saying honestly the dumbest things and just mean, bad argumentation; making fun of people or otherwise resorting to whatever covert way they could to pull rank and avoid engaging with the arguments. 

I wouldn't have felt qualified to do any of this except for witnessing that.  And I just thought “okay, it just so happens I’ve accumulated some more knowledge about advocacy than any of you, and I really feel that you're wrong about why this isn't going to work”. I felt somebody really needs to be doing this. How would I feel if I got a glimpse of “this is it, we're getting killed, it's happening” and I hadn’t even tried? It was a total “inadequate equilibrium” feeling. 

I guess I'm a generalist in skillset, and was able to adapt that here… but my training is in evolutionary biology. I worked at a think tank before. I really feel like there's so much I don't know.  But I guarantee you there was no one else willing to do it who was better qualified.

Robert Miles


None of this makes me want to actually do it. But it does make me want somebody to do it.


I wouldn't want you to change what you're doing. It seems to be working. Neither would it be good for you to stop caring about accuracy in your outreach, lol. (And just to be clear, I don't think our protest signs are inaccurate; it's just a different level of resolution of communication. I’m working on a post about this. [tweet thread on this topic])

Closing thoughts

Robert Miles

To end, I’d be interested to share some deltas / updates.

I've updated toward believing that the reluctance, of people who consider themselves rationalists, to seriously consider advocacy, is itself a failure of rationality on its own terms. I'm not completely sold, but it seems worth looking into in a lot more detail.


If people would just get out of the way of advocacy, that would be really helpful.

Robert Miles

I’m curious about that... what's in the way of advocacy right now?


Probably the biggest thing is people's perception that their job prospects would be affected. I don't know how true that is. Honestly I have no idea. I suspect that people have reason to overestimate that or just conveniently believe that, because they don't want to update for other reasons. But I think they're worried labs won't like the protests.

Robert Miles

Oh, I have no sympathy for that, that sounds like straightforward cowardice... I mean, I guess there's a case to be made that you need to be in the room, you need to maintain access, so you should bide your time or whatever, not make waves... but that feels pretty post hoc to me. You're not going to get fired for that; say what you believe!


Holly, do you want to share more of your updates from the conversation?


Well, Rob definitely gave me a +10 to a general sense I had picked up on, of people around here having an aversion to advocacy or stuff that looks like advocacy, based on rationalist principles, instincts, and aesthetics. 

I also certainly felt I understood Rob’s thinking a lot better. 

Robert Miles

Yeah, it was interesting to work through just what my attitude is towards that stuff, and where it comes from. I feel somewhat clearer on that now.


I think a lot of people have similar inner reactions, but won't say them to me. So I suspected something like that was frequently going on, but I wasn't able to have a discussion with it.

Robert Miles

I'm glad it was helpful!

Ok I guess that's it then? Thanks for reading and uh... don't forget to Strong Upvote, Comment and Subscribe?


If you found yourself interested in advocacy, the largest AI Safety protest ever is happening Saturday, October 21st! 


I think there’s an aesthetic clash here somewhere. I have an intuition or like... an aesthetic impulse, telling me basically… “advocacy is dumb”. Whenever I see anybody Doing An Activism, they're usually… saying a bunch of... obviously false things? They're holding a sign with a slogan that's too simple to possibly be the truth, and yelling this obviously oversimplified thing as loudly as they possibly can? It feels like the archetype of overconfidence.

This is exactly the same thing that I have felt in the past. Extremely well said. It is worth pointing out explicitly that this is not a rational thought - it's an Ugh Field around advocacy, and even if the thought is true, that doesn't mean all advocacy has to be this way.

Sometimes such feelings are your system 1 tracking real/important things that your system 2 hasn’t figured out yet.


I just want to say that I thought this was an excellent dialogue. 

It is very rare for two people with different views/perspectives to come together and genuinely just try to understand each other, ask thoughtful questions, and track their updates. This dialogue felt like a visceral, emotional reminder that this kind of thing is actually still possible. Even on a topic as "hot" as AI pause advocacy. 

Thank you to Holly, Rob, and Jacob. 

I'll also note that I've been proud of Holly for her activism. I remember speaking with her a bit when she was just getting involved. I was like: "she sure does have spirit and persistence– but I wonder if she'll really be able to make this work." But so far, I think she has. I'm impressed with how far she's come. 

I think she's been doing an excellent and thoughtful job so far. And this is despite navigating various tradeoffs, dealing with hostile reactions, being a pioneer in this space, and battling the lonely dissent that Rob mentioned.

I don't know what the future of AI pause advocacy will look like, and I'm not sure what the movement will become, but I'm very glad that Holly has emerged as one of its leaders.



I think "whether to engage in advocacy for AI, and how to go about it" is a pretty important topic. I do get the sense of LessWrong-type folk being really selected for finding it aesthetically ughy, and it seems like a lot of this is a bias. 

I think there are separately real arguments about the risks and costs of advocacy, or how to figure out how to have an "advocacy arm" of the AI safety movement without warping the epistemic culture we've built here. I'd be interested in a followup dialogue that somehow approaches that with an "okay, how can we make this work?" attitude.

Update: Will curate this in 2-3 days instead. Looks like curation emails currently look a bit broken for dialogues, and we should fix that before we send out an email to 10k+ people.

I’m down for a followup!

Fwiw, since we decided to delay a couple days in curating, a thing I think would be cool for this one is to have either a "highlights" section at the beginning, or maybe a somewhat gearsier "takeaways" at the end. 

Maybe this is more useful for someone else to do since it may be harder for you guys to know what felt particularly valuable for other people.


Just in case you missed that link at the top:

The global Pause AI protest is TOMORROW (Saturday Oct 21)!

This is a historic event, the first time hundreds of people are coming out in 8 countries to protest AI.

I'm helping with logistics for the San Francisco one which you can join here. Feel free to contact me or Holly on DM/email for any reason.

I think it could be misinterpreted to mean "pause all AI development and deployment", which results in a delayed deployment of "sponge safe" narrow AI systems that would improve or save a large number of people's lives. There's a real cost to slowing things down.

This cost is trivial compared to the cost of AGI Ruin. It's like getting on a plane to see your family when the engineers say they think there's a >10% chance of catastrophic failure. Seeing your family is cool, but ~nobody would think it's reasonable to go on such a plane. There are other ways to visit your family, they just take longer. 

The analogy breaks down when it comes to trying to fix the plane. We understand how airplanes work; we do not understand how AI works. It makes sense to ground the plane until we have such understanding, despite the benefits of transportation.

I would love to have all the cool AI stuff too, but I don't think we're capable of toeing the line between safe and ruinous AI at acceptable risk levels.


I think in this analogy the narrow AI models would be like texting your parents instead of flying to see them. Obviously not as good as visiting them in person, but you avoid the 10% x-risk. I think Rob is saying let's make sure we don't stop the development of texting/calling as collateral.

So yes, don't get on the plane, but let's be very specific about what we're trying to avoid.

There seems to be a trade-off in policy-space between attainability and nuance (part of what this whole dialogue seems to be about). The point I was trying to make here is that the good of narrow AI is such a marginal gain relative to the catastrophic ruin of superintelligent AI that it's not worth being "very specific" at the cost of potentially weaker messaging for such a benefit.

Policy has adversarial pressure on it, so it makes sense to minimize the surface area if the consequence of a breach (e.g. "this is our really cool and big ai that's technically a narrow ai and which just happens to be really smart at lots of things...") is catastrophic.

One thing that makes me more comfortable with making statements that are less nuanced in some circumstances is Wittgenstein's idea of language games. Rationalists have a tendency of taking words literally, whilst Wittgenstein views statements as moves in a language game, where there are a host of different language games for different situations and people can generally figure it out. Specifically, there seem to be some distinct language games associated with protests where people understand that your sign or slogan doesn't cover everything in complete nuance. At the same time, I think we should be trying to raise the bar in terms of the epistemics/openness of our advocacy work and I do see risks in people taking this reasoning too far.

There is a massive tradeoff between nuance/high epistemic integrity and reach. The general population is not going to engage in complex nuanced arguments about this, and prestigious or high-power people who are able to understand the discussion and potentially steer government policy in a meaningful way won't engage in this type of protest for many reasons, so the movement should be ready for dumbing-down or at least simplifying the message in order to increase reach, or risk remaining a niche group (I think "Pause AI" is already a good slogan in that sense). 

I agree that there is a trade-off here, however:

a) Dumbing down the message will cost us support from ML engineers and researchers.
b) If the message is dumbed down too much, then the public is unlikely to create pressure towards the kinds of actions that will actually help as opposed to pressuring politicians to engage in shallow, signaling-driven responses.

I think the idea we're going to be able to precisely steer government policy to achieve nuanced outcomes is dead on arrival - we've been failing at that forever. What's in our favor this time is that there are many more ways to cripple advances than to accelerate them, so it may be enough for the push to be simply directionally right for things to slow down (with a lot of collateral damage).

Our inner game policy efforts are already bearing fruit. We can't precisely define exactly what will happen, but we certainly can push for more nuance via this route than we would be able to through the public outreach route.

I can see why you would be a lot more positive on advocacy if you thought that crippling advances is a way out of our current crisis. Unfortunately, I fear that will just result in AI being built by whichever country/actor cares the least about safety. So I think we need more nuance than this.

We learned the hard way that "solving the technical problem is way easier than solving the political one". But group project technical problems are artificially easy.

I think we've mostly learned this not from group projects, but rather from most of history. E.g. just look at "covid as technical problem" vs "covid as political problem".

Covid was a big learning experience for me, but I'd like to think about more than one example. Covid is interesting because, compared to my examples of birth control and animal-free meat, it seems like with covid humanity smashed the technical problem out of the park, but still overall failed by my lights because of the political situation.

How likely does it seem that we could get full marks on solving alignment but still fail due to politics? I tend to think of building a properly aligned AGI as a straightforward win condition, but that's not a very deeply considered view. I guess we could solve it on a whiteboard somewhere but for political reasons it doesn't get implemented in time?

I think this is a potential scenario, and if we remove existential risk from the equation, it is somewhat probable as a scenario, where we basically have solved alignment, and yet AI governance craps out in different ways.

I think this way primarily because I tend to think that value alignment is really easy, much easier than LWers generally think, and I think this because most of the complexity of value learning is offloadable to the general learning process, with only very weak priors being required.

Putting it another way, I basically disagree with the implicit premise on LW that being capable of learning is easier than being aligned to values; at most, alignment is comparably or a little more difficult.

More generally, I think it's way easier to be aligned with say, not killing humans, than to actually have non-trivial capabilities, at least for a given level of compute, especially at the lower end of compute.

In essence, I believe there's simple tricks to aligning AIs, while I see no reason to expect a simple trick to make governments be competent at regulating AI.

it's one toe out of line and they can never sincerely value and seek truth again.

That's almost my position, but not quite: mistakes, born of passion (anger, fear, etc.) aren't fatal. But deliberately deciding to sacrifice truth to achieve something … yeah, there's no coming back from that.

Does that really seem true to you? Do you have no memories of sacrificing truth for something else you wanted when you were a child, say? I'm not saying it's just fine to sacrifice truth but it seems false to me to say that people never return to seeking the truth after deceiving themselves, much less after trying on different communication styles or norms. If that were true I feel like no one could ever be rational at all. 

I think that's a misunderstanding of what I mean by "sacrificing truth." Of course I have lied: I told my mom I didn't steal from the cookie jar. I have clicked checkboxes saying "I am over 18" when I wasn't. I enjoy a game of Mafia as much as the next guy. Contra Kant, I wholeheartedly endorse lying to your enemies to protect your friends.

No, sacrificing truth is fundamentally an act of self-deception. It is making yourself a man who believes a falsehood, or has a disregard for the truth. It is Gandhi taking the murder-pill. That is what I consider irreversible. It's not so easy that I worry I might do it to myself by accident, so I'm not paranoid about it or anything.

(One way to go about doing this would be to manipulate your language, redefining words as convenient: "The sky is 'green.' My definition of the word 'green' includes that color. It has always included that color. Quid est veritas?" Doing such things for a while until it becomes habitual should do it.)

In this sense, no, I don't think I have ever done this. By the time I conceived of the possibility, I was old enough to resolve never to do it.

Of course, the obvious counter is that if you had scifi/magic brain surgery tech, you could erase and rewrite your mind and memories as you wished, and set it to a state where you still sincerely valued truth, so it's not technically irreversible. My response to that is that a man willing to rewrite his own brain to deceive himself is certainly not one who values truth, and the resultant amnesiac is essentially a different person. But okay, fair enough, if this tech existed, I would reconsider my position on the irreversibility of sacrificing truth via self-deception.

No, sacrificing truth is fundamentally an act of self-deception. It is making yourself a man who believes a falsehood, or has a disregard for the truth. It is Gandhi taking the murder-pill. That is what I consider irreversible.

This is what I was talking about, or the general thing I had in mind, and I think it is reversible. Not a good idea, but I think people who have ever self-deceived or wanted to believe something convenient have come back around to wanting to know the truth. I also think people can be truthseeking in some domains while self-deceiving in others. Perhaps if this weren’t the case, it would be easier to draw lines for acceptable behavior, but I think that unfortunately it isn’t.

Very beside my original point about being willing to speak more plainly, but I think you get that.



Why no discussion of the world view? IMHO AI cannot be paused because, if the US & EU pause AI development to pursue AI safety research then other state actors such as Russia & China will just see this as an opportunity to get ahead with the added benefit that the rest of the world will freely give away their safety research. It's a political no-brainer unless the leaders of those countries have extreme AI safety concerns themselves. Does anyone really believe that the US would either go to war or impose serious economic sanctions on countries that did not pause?

This is a pretty standard joke/stereotype, the XKCD protest where the signs are giant and full of tiny text specifying exactly what we mean.

Can somebody link to this comic? My attempts to Google different combinations of "XKCD", "protests", and "giant sign" were unsuccessful.

I'm an xkcd expert and I don't remember such a comic.

Reflecting on what I found while Googling, I think the phrase “XKCD protest” refers not to a protest portrayed in an XKCD comic but rather to an XKCD comic that was used in a protest: https://www.reddit.com/r/xkcd/comments/6uu5wj/xkcd_1357_in_the_boston_protest_xpost_rpics/

I couldn't find any xkcd even remotely similar. What this does remind me of though is the "left wing memes vs. right wing memes" (meta)meme.