(edit 7/24/2023: Certain sections of this post I no longer endorse, but the central dilemma of the Eye remains) 

The decision to reach out to the broad public isn't - or shouldn't be - one that comes lightly. However, once you are actively vying for the Eye of Sauron - writing in TIME, appearing on highly visible/viral podcasts, getting mentioned in White House press briefings, spending time answering questions from Twitter randos, and admitting you have no promising research directions by way of partially explaining why all this public-facing work is happening - you are no longer catering exclusively to a select subset of the population, and your actions should reflect that.

You are, whether you like it or not, engaged in memetic warfare - and recent events/information make me think this battle isn't being given proper thought. 

Perhaps this wasn't super intentional, and after now having poked the bear MIRI may realize this isn't in their best interest. But surely it's better to either be (1) completely avoiding the Eye of Sauron and not concerned with public facing memetics at all, or (2) committed to thorough, strategic, and effective memetic warfare. Instead, we are wandering around in this weird middle ground where, for example, Eliezer feels like X hours are well spent arguing with randos. 

If we are to engage in memetics, low-hanging fruit abound and are being ignored:

  • Refusing to engage billionaires on twitter - especially ones that are sufficiently open to being convinced that they will drop $44 billion for something as pedestrian as a social media company.  
  • Not even attempting to convince other high leverage targets 
  • Relying on old blogposts and 1-1 textual arguments instead of much more viral (and scalable!) mediums like video 
  • Not updating the high-visibility instances of our arguments that do exist (video, aggregated text, etc.) to meet the AI skeptics where they are. (I'm not saying your actual model of the fundamental threat necessarily needs updating.)
  • Not attempting to generate or operationalize large bounties that would catch the attention of every smart person on the planet. Every seriously smart high schooler knows about the Millennium Prize Problems, and their reward is just $1 million. A pittance! You also don't have to convince these masses about our entire alignment worldview; split up and operationalize the problems appropriately and people will want to solve them even if they disagree with doom! (non-seriously-thought-through example: either solve alignment problem X or convince committee Y that X isn't a problem)      
  • Relying on a "bottom-up" media strategy whereby the community is responsible for organizing and creating said media 
  • Not attempting aggregation 
  • Not aiming for the very attainable goal of getting just the relatively small idea of ainotkilleveryoneism (need a better name; more memetics) into the general population (not the entire corpus of arguments!) to the same degree that global warming is. You are effectively running a PR campaign, but the vast majority of people do not know that there is this tiny fervent subset of serious people who think literally every single person will die within the next 0-50 years, in an inescapable way, which is distinct from other commonly known apocalypses such as nuclear war or global warming. The AI that kills even a billion people is not the hypothesis under consideration, and that detail is something that can and should fit within the transmissibly general ainotkilleveryoneism meme.
  • During podcasts/interviews, abdicating responsibility for directing the conversation - and covering the foundations of the doom world model - onto the interviewers. (The instances I have in mind being the recent Bankless and Lex Fridman podcasts - I may provide timestamped links later. But to paraphrase, Eliezer basically says at the end of both: "Oh well we didn't cover more than 10% of what we probably should have. ¯\_(ツ)_/¯ " )  
  • Assuming we want more dignity, answers to the question "what can we do" should not be met with (effectively; not a quote) "there is no hope". If that's actually a 4D chess move whereby the intended response is something like "oh shit he sounds serious, let me look into this", surely you can just short-circuit that rhetoric straight into an answer like "take this seriously and go into research, donate, etc" - even if you don't think that is going to work. (We are doomers after all - but come to think of it maybe it's not good memetically for us to self identify like that). Even if you draw analogies to other problems that would require unprecedented mass coordinated efforts to solve - how is giving up dying with dignity?
  • Stepping right into a well known stereotype by wearing a fedora. Yes this has nothing to do with the arguments, but when your goal is effective memetics it does in fact matter. Reality doesn't care that we justifiably feel affronted about this. 

I want to be clear, Eliezer is one person who has already done more than I think could be expected of most people. But I feel like he may need a RIGBY of his own here, since many of the most powerful memetic actions to take would best be performed by him.  

In any case, the entirety of our future should not be left to his actions and what he decides to do. Where are the other adults in the room who could connect these simple dots and pluck these low hanging fruit? You don't need to export ALL of the corpus of reasons or even the core foundations of why we are concerned about AI - the meme of ainotkilleveryoneism is the bare minimum that needs to be in everyone's heads as a serious possibility considered by serious people.

Further, completely dropping the ball on memetics like this makes me concerned that what we non-insiders see being done... is all that is being done. That there aren't truly weird, off-the-wall, secret-by-necessity things being tried. 4 hours ago, I would have bet everything I own that Eliezer was at least attempting extensive conversations with the heads of AI labs, but given that apparently isn't happening, what else might not be? 

(Edit: I meant to hit this point more directly: Eliezer in his podcast with Dwarkesh Patel also said that he has tried "very, very hard" to find a replacement for himself - or just more high quality alignment researchers in general. I'm not questioning the effort/labor involved in writing the sequences, fiction, arbital, or research in general - and everything but maybe the fiction should still have been done in all timelines in which we win - but to think producing the sequences and advanced research is the best way to cast your net seems insane. There are millions of smart kids entering various fields, with perhaps thousands of them potentially smart enough for alignment. How many people - of near-high-enough IQ/capability - do you think have read the sequences? Less than 100? )

(Edit 2: Another benefit of casting the net wider/more-effectively: even if you don't find the other 100-1000 Eliezers out there, think about what is currently happening in alignment/AI discourse: we alignment-pilled semi-lurkers who can argue the core points - if not contribute to research - are outnumbered and not taken seriously. What if we 10-1000x our number? And by cultural suffusion we reach a point where ainotkilleveryoneism no longer sounds like a crazy idea coming out of nowhere? For example, high visibility researchers like Yann LeCun are actively avoiding any conversation with us, comparing the task to debating creationists. But he'll talk to Stuart Russell :\ ) 


Refusing to engage billionaires on twitter - especially ones that are sufficiently open to being convinced that they will drop $44 billion for something as pedestrian as a social media company.

What on earth are you talking about? In the hyperlink you mention he is engaging him; he's just stating the brute fact that it would be really hard to deploy the money, which is something true and necessary for someone to understand if they were to try this. Doing the standard political thing of just grifting and assuring him that all he needs to do is pour money into the field would be unhelpful.

Notice the last time that someone "engaged" Elon Musk we got OpenAI. It is empirically extraordinarily easier to pour gasoline on the problem with barrels of money by accelerating race dynamics than it is to get meaningful work done.

Related: The question "Where is a comprehensive, well-argued explanation of Eliezer's arguments for AI risk, explaining all lingo, spelling out each step, referencing each claim, and open to open peer review" apparently has no answer? Cause a lot of people have been asking. They see the podcast, and are concerned, but unconvinced, and want to read a solid paper or book chapter, or the equivalent in website form. Something you can read over the course of a few hours, and actually have the questions answered. Something you can cite and criticise, and have Eliezer accept that this is a good version to attack. No handwaving, or vaguely referencing online texts without links, or concepts that are mentioned everywhere and never properly explained with hindsight, no pretending a step is trivial or obvious when it simply is not. But all the arguments in it. All the data in it.

I think the closest thing to an explanation of Eliezer's arguments formulated in a way that could plausibly pass standard ML peer review is my paper The alignment problem from a deep learning perspective (Richard Ngo, Lawrence Chan, Sören Mindermann)

Linking the post version which some people may find easier to read:
The Alignment Problem from a Deep Learning Perspective  (major rewrite) 

Thanks for posting, it's well written and concise but I fear it suffers the same flaw that all such explanations share:

Weapons development: AGIs could design novel weapons which are more powerful than those under human control, gain access to facilities for manufacturing these weapons (e.g. via hacking or persuasion techniques), and deploy them to threaten or attack humans. An early example of AI weapons development capabilities comes from an AI used for drug development, which was repurposed to design chemical weapons [Urbina et al., 2022].

The most critical part, the "gain access to facilities for manufacturing these weapons (e.g. via hacking or persuasion techniques), and deploy them to threaten or attack humans.", is simply never explained in detail. I get there are many info-hazards in this line of inquiry, but in this case it's such a contrast to the well elaborated prior 2/3 of the paper that it really stands out how hand-wavy this part of the argument is.

I'm working on a follow-up exploring threat models specifically, stay tuned.

The most critical part, the "gain access to facilities for manufacturing these weapons (e.g. via hacking or persuasion techniques), and deploy them to threaten or attack humans.", is simply never explained in detail. 

Generally, you can't explain in detail the steps that something smarter than you will take because it's smarter and will be able to think up better steps. 

If we take an intelligence that's much smarter than humans, it could make a lot of money on the stock market and buy shares of the companies that produce the weapons and then let those companies update the software of the factories with AI software that's sold as increasing the efficiency of the factory to all the involved human decision makers. 

Thinking that you can prevent smart AGI from accessing factories is like thinking you could box it a decade ago. The economic pressures make it so that boxing an AI reduces the economic opportunities a lot and thus companies like OpenAI don't box their AI. 

Giving a very smart AI power is a way to win economic competitions, because the AI can outcompete competitors. Just as people aren't boxing their AIs, they are also not keeping them at a distance from power.

If we take an intelligence that's much smarter than humans, it could make a lot of money on the stock market and buy shares of the companies that produce the weapons and then let those companies update the software of the factories with AI software that's sold as increasing the efficiency of the factory to all the involved human decision makers. 

Even the first part of this scenario doesn't make sense. It's not possible to earn a lot of money on any major stock market anonymously because of KYC rules, and the very motivated groups that enforce those rules, which every country with a major stock market has.

It might be possible to use intermediary agents but really competent people, who will definitely get involved by the dozens and cross check each other if it's a significant amount of money, can tell if someone is genuine or just a patsy.

Plus beyond a certain threshold, there's only a few thousand people on this planet who actually are part of the final decision making process authorizing moving around that much money, and they all can visit each other in person to verify.

The only real way would be to subvert several dozen core decision makers in this group near simultaneously and have them vouch for each other and 'check' each other, assuming everything else goes smoothly.

But then the actually interesting part would be how this could be accomplished in the first place.

None of these are what you describe, but here are some places people can be pointed to:

People have been trying to write this for years, but it's genuinely hard. Eliezer wrote a lot of it on Arbital, but it is too technical for this purpose. Richard Ngo has been writing distillations for a while, and I think they are pretty decent, but IMO currently fail to really actually get the intuitions across and connect things on a more intuitive and emotional level. Many people have written books, but all of them had a spin on them that didn't work for a lot of people. 

There are really a ton of things you can send people if they ask for something like this. Tons of people have tried to make this. I don't think we have anything perfect, but I really don't think it's for lack of trying.

Curious which intuitions you think most fail to come across?

I don't have all the cognitive context booted up of what exact essays are part of AI Safety Fundamentals, so do please forgive me if something here does end up being covered and I just forgot about an important essay, but as a quick list of things that I vaguely remember missing: 

  • Having good intuitions for how smart a superintelligence could really be. Arguments for the lack of upper limit of intelligence. 
  • Having good intuitions for complexity of value. That even if you get an AI aligned with your urges and local desires, this doesn't clearly get you that far towards an AGI you would feel comfortable optimizing things on their own. 
  • Somehow communicating the counterintuitiveness of optimization. Classical examples that have helped me are the cannibal bug examples from the sequences, and the genetic algorithm that developed an antenna (the specification gaming DeepMind post never really got this across for me).
  • Security mindset stuff
  • Something about the set of central intuitions I took away from Paul's work. I.e. something in the space of "try to punt as much of the problem to systems smarter than you".
  • Eternity in six hours style stuff. Trying to understand the scale of the future. This has been very influential on my models of what kinds of goals an AI might have.
  • Civilizational inadequacy stuff. A huge component of people's differing views on what to do about AI Risk seems to be sourced in disagreements on the degree to which humanity at large does crazy things when presented with challenges. I think that's currently completely not covered in AGISF.

There are probably more things, and some things on this list are probably wrong since I only skimmed the curriculum again, but hopefully it gives a taste.

I totally agree that question should have an answer.

On a tangent: During my talks with numerous people, I have noticed that even agreeing on fundamentals like "what is AGI" and "current systems are not AGI" is furiously hard.

The best primer that I have found so far is Basics of AI Wiping Out All Value in the Universe by Zvi.  It's certainly not going to pass peer review, but it's very accessible, compact, covers the breadth of the topics, and links to several other useful references.  It has the downside of being buried in a very long article, though the link above should take you to the correct section.

It’s certainly not going to pass peer review,

What does that mean? I notice that it doesn't actually prove that AI will definitely kill us all. I've never seen anything else that does, either. You can't distill what never existed.

I feel like Robert Miles' series of YouTube videos is the most accessible yet on-point explanation of this that is to be found right now. They're good, accessible, clear, easy to get. That said, they're videos, which for some people might be a barrier (I myself prefer reading my heady stuff).

Honestly, would it be such a challenge to put together something? We could work on it, then put it up on a dedicated domain as a standalone web page. We could even include different levels of explanation (e.g. "basic" to "advanced" depending on how deep you want to delve into the issues). Maybe gathering the references is the most challenging thing, but I'm sure someone must have them already piled up in a folder or Mendeley group somewhere.

Related: The question "Where is a comprehensive, well-argued explanation of Eliezer's arguments for AI risk, explaining all lingo, spelling out each step, referencing each claim, and open to open peer review" apparently has no answer?

There is no such logically consistent argument, even scattered across dozens of hyperlinks. At least none I've seen.

Let's not bury this comment. Here is someone we have failed: there are comprehensive, well-argued explanations for all of this, and this person couldn't find them. Even the responses to the parent comment don't conclusively answer this - let's make sure that everyone can find excellent arguments with little effort.

Is this written for a different comment and accidentally posted here?

I think he was referring to the enormous corpus of writing of Eliezer and others on LessWrong, which together do, as far as I can tell, fulfill all of your requirements, though there is a lot of sifting to do. My guess is you don't think this applies, but laserfiche thought the problem was likely one of ignorance about the existing writing, not your confident belief in its absence.

Why would a user who's only made 8 comments assume my ignorance about the most read LW writer, when I clearly have engaged with several hundred posts, that anyone can see within 10 seconds of clicking my profile?

If they're genuinely confused it's bizarre that they didn't bother checking, so much so that I didn't even consider it a possibility.

I'm curious if there are specific parts to the usual arguments that you find logically inconsistent.

Yup. I commented on how outreach pieces are generally too short on their own and should always be leading to something else here.

I will say that I thought Connor Leahy's talk on ML Street Talk was amazing and that we should if possible make Connor go on Lex Fridman?

The dude looks like a tech wizard and is smart, funny, charming and a short timeline doomer. What else do you want?

Anyway we should create a council of charming doomers or something and send them at the media, it would be very epic. (I am in full agreement with this post btw)

I hard disagree with your point about the fedora

My true rejection is that we shouldn't be obsessing over people's appearances and self-expression, we shouldn't be asking people to be less than themselves. This is not truth-seeking. It gives off the vibe of fakeness and cults and of your mom telling you not to go out dressed like that.

My more principled rejections
- It overly focuses signalling on what some people on the internet care about over what the average american / person in political power / AI researcher cares about.
- I suspect Yudkowsky is aware of the stereotypes of the fedora, and wears it anyway, perhaps to reclaim it as a celebration of smartness, perhaps to countersignal that he doesn't care what the sneer-clubbers think, perhaps to 'teach in a clown suit'.
- You don't win the meme war by doing the conventional media stuff good enough. You have to do something out of the distribution, and perhaps "low status" (to some). The Kardashians pioneered a new form of tv. Mr Beast studied a lot and pioneered a new form of media business. I should write a longer post on this. In any case, let's keep EA and rationality weird, it's one of the few edges we have.

I think "we should prevent AGI doom" and "we should normalize the notion that a fedora is just a hat, not a sign that you're some alt-right incel nutjob who wants women in the kitchen and black people in camps" are both worthwhile goals, but also completely orthogonal, and the former is a tad bit more important; so for all that it makes my high-decoupling heart weep, I say if it makes you more likely to achieve your goal of saving the world in the short term, lose your pride and ditch the fedora.

I think you overstate the badness of the fedora stereotype (multiplied by how many people have that association, like the integral of vibes over all audience). I would disapprove of a notable Rationalist carelessly going onto a podcast wearing a flag of the Soviet Union, or a shirt that says "all lives matter".

And I think you understate the memetic benefits of playing into the fedora meme. Culture is a subtle, complicated thing, where "Liquid Death" is a popular fizzy water company valued at $700 million, because it signals something bad and is therefore socially acceptable to drink at rock concerts and bars. And when it comes to personal clothing, it's also a matter of individual taste - being cool does partly come from optimizing for what everyone else likes, but also from being unique and genuine and signalling that you don't care what everyone else thinks.

But also I think it doesn't matter that much? Should 80,000 hours write an article on being a makeup artist or costume designer? Is personal visual aesthetics the constraint on winning at outreach/policy? That world sounds kind of bizarre and fun, and I think even in that world we should try to seem real. But we aren't there (yet?) so we can simply be real instead of trying to be real. Let people be their full selves and make their own fashion choices.

You're possibly right. Honestly the "fedora" thing strikes me as a Very Online thing, so odds are it doesn't matter that much. However, I wouldn't really want to draw in people who think "fedora good" over "fedora bad" either. When the wise man points at the looming world-ending superintelligent AI, an idiot looks at his hat. Realistically, odds are most regular people don't much care. But it might be a teeny teensy bit safer to drop possible blatant signals of that sort, to avoid triggering both groups. It risks being a distraction.

Let people be their full selves and make their own fashion choices.

Fair, but also, fashion choices when going for an interview are definitely something most media-savvy people would be very conscious of.

I was wrong. On twitter Eliezer says he wears the fedora because he likes how it looks.
He also says he doesn't "represent his followers and their interests" because that way of thinking fails.

He's open to alternative hat suggestions.


My background is extremely relevant here and if anybody in the alignment community would like help thinking through strategy, I'd love to be helpful.

^ can confirm! I volunteered for Spartz's Nonlinear org a couple years ago, and he has a long history of getting big numbers on social media.

We need some expertise on this topic. I actually just wrote a post on exactly that point, yesterday. I also include some strategy ideas, and I'm curious if you agree with them. I think a post from you would probably be useful if you have relevant expertise; few people here probably do. I'd be happy to talk.

Eliezer, or somebody better at talking to humans than him, needs to go on conservative talk shows - like, Fox News kind of stuff - use conservative styles of language, and explain AI safety there. Conservatives are intrinsically more likely to care about this stuff, and to get the arguments why inviting alien immigrants from other realms of mindspace into our reality - which will also take all of our jobs - is a bad idea. Talk up the fact that the AGI arms race is basically as bad as a second cold war only this time the "bombs" could destroy all of human civilization forever, and that anyone who develops it should be seen internationally as a terrorist who is risking the national security of every country on the planet.

To avoid the topic becoming polarized, someone else should at the same time go on liberal talk shows and explain how unaligned AGI, or even AGI aligned to some particular ideological group, is the greatest risk of fascism in human history and could be used to permanently lock the planet into the worst excesses of capitalism, fundamentalism, oppression, and other evils, besides the fact that (and this is one of the better arguments for abortion) it is immoral to bring a child (including AGI) into the world without being very sure that it will have a good life (and most people will not think that being addicted to paperclips is a good life).

I guess I feel at the moment that winning over the left is likely more important and it could make sense to go on conservative talk shows, but mainly if it seems like the debate might start to get polarised.

Conservatives are already suspicious of AI, based on ChatGPT3's political bias. AI skeptics should target the left (which has less political reason to be suspicious) and not target the right (because if they succeed, the left will reject AI skepticism as a right-wing delusion).

This, especially because right now the left is on a dangerous route to "AI safety is a ruse by billionaires to make us think that AI is powerful and thus make us buy into the hype by reverse psychology and distract us from the real problems... somehow".

Talk about AGI doom in the language of social justice, also because it's far from inappropriate. Some rich dude in Silicon Valley tries to become God, lots of already plenty exploited and impoverished people in formerly colonised countries fucking die for his deranged dream. If that's not a social justice issue...

is the greatest risk of fascism in human history and could be used to permanently lock the planet into the worst excesses of capitalism, fundamentalism, oppression, and other evils

Seems like a fairly weak argument; you're treating it like a logical reason-exchange, but it's a political game, if that's what you're after. In the political game you're supposed to talk about how the techbros have gone crazy because of Silicon Valley techbro culture and are destroying the world to satisfy their male ego.

It might be almost literally impossible for any issue at all to not get politicized right down the middle when it gets big, but if any issue could avoid that fate one would expect it to be the imminent extinction of life. If it's not possible, I tend to think the left side would be preferable since they pretty much get everything they ever want. I tentatively lean towards just focusing on getting the left and letting the right be reactionary, but this is a question that deserves a ton of discussion.

I think avoiding polarization is a fool's game. Polarization gets half the population in your favor, and might well set up a win upon next year's news. And we've seen how many days are a year, these days.

Having half of the population in our favor would be dangerously bad - it would be enough to make alignment researchers feel important, but not enough to actually accomplish the policy goals we need to accomplish. And it would cause the same sort of dysfunctional social dynamics that occur in environmentalist movements, where people are unwilling to solve their own problem or advocate for less protean political platforms because success would reduce their relevance.

If one wants to avoid polarization, what are examples of a few truly transversal issues to use as a model? I almost can't think of any. Environmentalism would be one that makes sense, either side can appreciate a nice untouched natural landscape, but alas, it's not.

They're hard to think of because if everyone genuinely agrees, then society goes their way and they become non-issues that nobody talks about anymore.  For example, "murder should be illegal" is an issue that pretty much everyone agrees on. 

Something like "the state should have the right to collect at least some taxes" also has strong enough support that there's very little real debate over it, even if there are some people who disagree.

I suppose I meant more issues where there is no established norm yet because they're new (which would be a good analogue to AI) or issues where the consensus has shifted across the spectrum so that change is likely to be imminent and well accepted even though it goes against inertia. Drug legalisation may be a good candidate for that, but there are still big holdouts of resistance on the conservative side.

You mean, it would flood us with sheep instead of tigers?

Environmentalism has people unwilling to solve environmental issues because their livelihood depends on them? Would you expect the same to happen with a movement to prevent a nuclear exchange?

Would you expect the same to happen with a movement to prevent a nuclear exchange?


I think you're right about the problem, but wrong about the solution. Doing all of the additional things you mention, but with the same communication strategy, is going to produce polarization. As it happens, I just wrote a post about this yesterday.

If we get half the experts on board but cause the other half to dig in their heels and talk nonsense out of sheer irritation, the public and public policy will be gridlocked, like they are on climate change and pretty much every other red vs. blue issue. We need faster action, and the case is strong enough to get it.

Eliezer's approach on those podcasts was so bad that it's doing much more harm than good. Every reaction I've heard from outside the rationalist community has been somewhere from bad to actively irritated with the whole topic and the whole community.

Yudkowsky needs to step aside as the public face of AGI alignment, or he needs to get much better at it, quickly. I love Eliezer and value his work tremendously, but this is not his area, and he's using an approach that is basically known to produce polarization. And I can't imagine where he'd find the emotional energy to get good at this, given his current state of despair.

On the other hand, somebody needs to do it, and he's being given opportunities. I don't know if he can hand those interview requests off to someone else.

I would not have come to trust a socially-able-Eliezer. He's pure. Let him be that.

Existing rockstar researchers 

Eliezer explains in the following link (and in AGI ruin) why he thinks this is unlikely to work. It would certainly increase the "prestige" of AI alignment to have well known faces here, but that does not necessarily lead to good things. 

I agree the goal here should be effective memetics. Does Yud currently have a PR team? I am assuming the answer is no.

Refusing to engage billionaires on twitter - especially ones that are sufficiently open to being convinced that they will drop $44 billion for something as pedestrian as a social media company. 

This one isn't obvious to me - having billionaires take radical action can be very positive or very negative (and last time this was tried on Elon he founded OpenAI!)

I'm surprised this has this many upvotes. You're taking the person who contributed the most to warning humanity about AI x-risks, and saying what you think they could have done better, in a way that comes across as blamy to me. If you're blaming zir, you should probably blame everyone. I'd much rather you wrote about what people could have done in general, rather than targeting one of the best contributors.

I can't agree more with the post but I would like to note that even the current implementation is working. It definitely grabbed people's attention. 

My friend who never read LW writes in his blog about why we are going to die. My wife who is not a tech person and was never particularly interested in AI gets TikToks where people say that we are going to die.

So far it looks like a definitely positive impact overall. But it's too early to say; I am expecting some kind of shitshow soon. But even a shitshow is probably better than nothing.

"You are, whether you like it or not, engaged in memetic warfare - and recent events/information make me think this battle isn't being given proper thought"

I'd like to chime in on the psychological aspects of such warfare. I suggest that a heroic mindset will be helpful in mustering courage, hope and tenacity in fighting this cultural battle. In the following I will sketch out both a closer view of the heroic mindset and a methodology for achieving it.

A) On the heroic mindset
Heroism is arguably "altruistic risk, resilience, and fundamental respect for humanity". One could see it as part of manliness, but one could also have gender-inclusive versions of heroism, for example the rationalist hero, as per Eliezer Yudkowsky's reference to the World of Null-A science fiction series. The key aspect for me is that the hero is willing to take initiative, to take a stand, is courageous, fights the good cause, doesn't become cynical, isn't a coward - and many other good qualities.


B) On achieving the heroic mindset
The Oxford Character project focuses on developing virtue, and has summarized some key findings here. One of the strategies for character development mentioned is "engagement with virtuous exemplars". In other words: imitating or taking on other people as role models - perhaps even identifying with the role model. Linda Zagzebski - who has written on virtue epistemology - has also written a book on this type of virtue development called "Exemplarist Moral Theory". Recommended. 

I believe that if we find some good role models, for example the Null-A heroes, or others like Nelson Mandela, Winston Churchill, or Eleanor Roosevelt, we can identify with them, and thereby access our own resources. One practical way of going about this is through Todd Herman's The Alter Ego Effect: The Power of Secret Identities to Transform Your Life (2019). We basically take on a heroic identity, and go through our day from that frame of reference. This can be majorly empowering.

In summary, the key to taking on a heroic mindset lies in shifting one's identity to a heroic identity. And we can do that in practice through Zagzebski's work or, perhaps even more hands-on, through Todd Herman's.

By doing this, we might gain more fighting spirit and willingness to try to be actors and movers, not just "understanders" of this world and this situation.

I agree on the need for some of those things, though I wouldn't say that e.g. Eliezer not giving any proposals to Musk in that Tweet is a sign of "refusing to engage". Last time Elon Musk tried to do something about AI safety, OpenAI was born. I think it makes sense to hold back from giving more ideas to the one man whose sole attempt at solving the problem was also the one thing that most of all made the problem a lot worse in the last year.

The discussion in the comments is extremely useful and we've sorely needed much more of it. I think we need a separate place purely for sharing and debating our thoughts about strategies like this, and ideally also working on actual praxis based on these strategies. The ideal solution for me would be a separate "strategy" section on LessWrong or at least a tag, with much weaker moderation to encourage out-of-the-box ideas. So as not to pass the buck I'm in the process of building my own forum in the absence of anything better.

Some ideas for praxis I had, to add to the ones in this post and the comments: gather a database of experiences people have had of actually convincing different types of people of AI risk, and then try to quantitatively distill the most convincing arguments for each segment; proofreading for content expected to be mass consumed - this could have prevented the Time nukes gaffe; I strongly believe a mass-appeal documentary could go a long way to alignment-pilling a critical mass of the public. It's possible these are terrible ideas, but I lack a useful place to even discuss them.

The easiest point to make here is Yud's horrible performance on Lex's pod. It felt like no prep and brought no notes/outlines/quotes??? Literally why?

Millions of educated viewers and he doesn't prepare..... doesn't seem very rational to me. Doesn't seem like systematically winning to me.

Yud saw the risk of AGI way earlier than almost everyone and has thought a lot about it since then. He has some great takes and some mediocre takes, but all of that doesn't automatically make him a great public spokesperson!!!

He did not come off as convincing, helpful, kind, interesting, well-reasoned, humble, very smart, etc.

To me, he came off as somewhat out of touch, arrogant, weird, anxious, scared, etc. (to the average person that has never heard of Yud before the Lex pod)

The top reactions on Reddit all seem pretty positive to me (Reddit being less filtered for positive comments than Youtube): https://www.reddit.com/r/lexfridman/comments/126q8jj/eliezer_yudkowsky_dangers_of_ai_and_the_end_of/?sort=top 

Indeed, the reaction seems better than the interview with Sam Altman: https://www.reddit.com/r/lexfridman/comments/121u6ml/sam_altman_openai_ceo_on_gpt4_chatgpt_and_the/?sort=top 

Here are the top quotes I can find about the content from the Eliezer Reddit thread: 

This was dark. The part that really got me was the discussion about human time vs AI time. The fact that AI is running 24/7 at gigahertz speeds and the human brain runs about 200 hertz in short bursts is worrisome. If AGI did want to escape it would happen before we knew it.

I also keep thinking about Dune: "Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.” 

Surprised to read so many skeptical comments here about Yudkowsky. I’ve been somewhat occasionally following his writings on rationality and I am absolutely convinced this guy is very brilliant in his niche slice of topics. His AI conversation with Sam Harris a few years ago is my favorite AI podcast where he hits the nail on the head about why we should be worried about AI. I have never hear someone talk so coherently on the topic as there. Really excited for his one.

Personally, I'm glad to hear voices on the extreme and opposing side as a counterweight to all the "AI is totally cool, bro!" AI positivity and optimism. We've been caught with our pants down even as people have tried to sound the alarm on this for years now. If unmitigated disaster is a possibility, we should damn well be hearing from those voices too.

Compare to top comments from the Sam interview with Lex Fridman: 

I was very interested in hearing this interview but goddamn I can't stand 2 hours of that guys vocal fry.

The whole bit about Jordan Peterson and other controversial figures at the beginning was really difficult to listen to. Sam sidestepping the topic by trivializing Lex was hilarious.

Only 1 hour in so far, but is it just me or Sam Altman evading every technical question? It's as if he's too afraid to give out any secrets. I'm pretty sure Lex repeated one of the questions twice too but no bite (I think it was the safety one?).

I guess that's okay but I'm used to the Elon-like "here's every detail I know and I don't care about the competitors". Though maybe the former approach is understandable considering the competitor in this case is Google.

This sampling methodology overall of course isn't great, and I do think Eliezer obviously reads as someone pretty weird, but I think he also reads as quite transparently genuine in a way that other spokespeople do not, and that is quite valuable in-itself. Overall, I feel pretty uncompelled by people's strong takes saying that Eliezer did a bad job on the recent podcasts.

I don't find this argument convincing. I don't think Sam did a great job either but that's also because he has to be super coy about his company/plans/progress/techniques etc.

The Jordan Peterson comment was making fun of Lex and a positive comment for Sam.

Besides, I can think Sam did kind of bad and Eliezer did kind of bad but expect Eliezer to do much better!


I'm curious to know how you'd rate Eliezer's performance compared to what you'd expect is possible with 80 hours of prep time, including the help of close friends/co-workers.

I would rate his episode at around a 4/10

Why didn't he have a prepared, well-thought-out list of convincing arguments, intuition pumps, stories, analogies, etc. that would be easy to engage with for a semi-informed listener? He was clearly grasping for them on the spot.

Why didn't he have quotes from the top respected AI people saying things like "I don't think we have a solution for super intelligence", "AI alignment is a serious problem", etc.?

Why did he not have written notes? Seriously... why did he not prepare notes? (he could have paid someone that knows his arguments really well to prepare notes for him)

How many hours would you guess Eliezer prepared for this particular interview? (maybe you know the true answer, I'm curious)

How many friends/co-workers did Eliezer ask for help in designing great conversation topics, responses, quotes, references, etc.?

This was a 3-hour long episode consumed by millions of people. He had the mind share of ~6 million hours of human cognition and this is what he came up with? Do you rate his performance more than a 4/10?

I expect Rob Miles, Connor Leahy, or Michaël Trazzi would have done enough preparation and had a better approach, and could have done an 8+/10 job. What do you think of those 3? Or even Paul Christiano.

In my opinion, Eliezer should spend whatever points he has with Lex to get one of the four people above on a future episode.

@habryka curious what you think of this comment

I'm talking about doing a good enough job to avoid takes like these: https://twitter.com/AI_effect_/status/1641982295841046528

50k views on the Tweet. This one tweet probably matters more than all of the Reddit comments put together


50k views is actually relatively little for a tweet. The view numbers seem super inflated. I feel like I've seen tweets with 1000+ likes and 100k+ "views" on the topic of the Lex Fridman podcast (I think positive, but I really don't remember). 

I didn't mean to bring up the Reddit comments as consensus, I meant them as a relatively random sample of internet responses.

Fair enough regarding Twitter

Curious what your thoughts are on my comment below


Very well put, and I couldn't agree more with this. I've been reading and thinking more and more about the AI situation over the past year or so, starting when that AI researcher at Google became convinced that he had created a conscious being. Things are now accelerating at a shocking pace, and what once seemed like speculation that wasn't immediately relevant is now crucially urgent. Time is of the essence. Moreover, I'm becoming increasingly convinced that AI containment, if it is achieved, will be done through political solutions rather than technological solutions. Things are just moving way too fast, and I don't see how technical alignment will keep up when the pool of alignment researchers is so tiny compared to the enormous number of AI capabilities researchers.


For those of us deeply worried about AI risk, we're going to have to prepare for a rapid change in the discourse. Public persuasion will be crucial: if we win, it will be through a combination of public persuasion and effective wielding of the levers of power. This means a paradigm shift in how capital-R Rationalists talk about this issue. Rationalists have a very distinctive mode of discourse which, despite having undeniable benefits, is fundamentally incongruent with more typical modes of thinking. We need to be willing to meet people where they are at, empathize with their concerns (including people worried about AI taking their jobs or making life meaningless - this seems to be quite common), and adopt non-Rationalist methods of persuasion and uses of power that are known to be effective. Memetic warfare, one could call it. This will probably feel very dirty to some, and understandably so, but the paradigm has shifted and now is the time.


The methods of Rationality can still be very useful in this - it's an effective way to interrogate one's own assumptions and preexisting biases. But people have to be willing and able to use these methods in service of effective persuasion. Keeping our eyes on the prize will also be crucial - if this new limelight ends up getting used to advance other popular Rationalist causes and viewpoints such as atheism and wild animal suffering, I do not see how this could possibly go well.