Talking about memetic evolution puts me in mind of You Get About Five Words. So in that spirit, I'd like to try making soundbites of the counter-memes above:
1. "Let our children decide AI" "Let future people choose AI, not us" "AI Now is too soon"
2. "Humanity has so much potential" "Normal humans shouldn't choose AI" "Max out IQ before AI" or "IQ before AI" "Even Picard couldn't handle where AI is going"
3. "The AI you want isn't the AI you'll get" "Nothing stops AI from being awful"
4. "AI psychosis is just the beginning" "Satan was a flatterer too"
5. "AI Replacement is Literally Fascism" "History shouldn't end with AI" "Nazis thought they were inevitable too" "AI Might doesn't make AI Right"
6. "Humans really do matter" "Love your neighbor, not computers" "AI can't replace family and friends"
7. "You can't take AI replacement back" "As though we're in a position to permanently decide the future" "No guarantee AIs have souls" "AI can't answer life's questions"
8. "AI replacment won't be voluntary" "ChatGPT 8 Wants Little Timmy's Atoms"
9. "AI replacement betrays humanity" "Does everyone you could ever know really deserve to be replaced by AI? Even me? Even everyone I care about?" "Me when I destroy humanity's hopes and dreams because my AI girlfriend said I'm a good boy:"
And these are just the first things that came to my mind with an hour of effort; no doubt others could do a lot better. Perhaps referencing pieces of culture I didn't think of to sum up concepts, or just putting things more wittily. Maybe this task should be referred to experts, or AI.
Edit: I guess this goes in the opposite direction of Richard Ngo's point about how this represents an escalation in memetic warfare between AI safety and accelerationism. Now I feel kinda bad for essentially manufacturing ammunition for that.
I guess this goes in the opposite direction of Richard Ngo's point about how this represents an escalation in memetic warfare between AI safety and accelerationism. Now I feel kinda bad for essentially manufacturing ammunition for that.
Can you elaborate on the downsides from your perspective? It's very important to me that we survive, which implies winning, which involves fighting, which requires good ammunition.
The alternative seems to me to be that we survive without winning, or win without fighting, or fight without ammunition, and each of those sounds less viable. It may be the case that successionism remains such an extremely distasteful ideology that simply not engaging with it is an effective strategy. But I wouldn't bet too strongly on that, given that this ideology is still being platformed by large podcasts, and is intellectually tolerated on sites like LessWrong.
Even phrases like "stop trying to murder our children, you sick freaks" are hostile and less intellectually satisfying, but I would be hard pressed to make an argument for why they don't have a place in the public discourse.
In the absence of other perspectives on downsides, I would like to mention that blunt memes in the form of catchy phrases can lead to polarization.
Perhaps better "ammunition" would be silent memes that are building blogs of working institutions - when I buy a loaf of bread, there is no catchy phrase "buy our bread, it contains no anthrax" said by anyone anywhere anytime ever... yet the silent implication is true, I will, in fact, not get any anthrax with my bread. And the bigger picture implied by that silly example is an egregore of the boring institutions of the civilization that I rely upon for my own safety every day, the existence of which implies the existence of memeplexes that encode for it, but there is no implication of memes in the form of catchy English phrases.
It might well be the case that a fight of catchy phrases is a game created by a memeplex that favours the successionist phenotype - what if LLMs are better at generating words and images than at building and maintaining humane institutions...?
I really like this thinking... I might write a top level post inspired by this thread but some notes:
Death and AI successionism or AI doom are similar because they feel difficult to avoid and therefore it's insightful to analyze how people currently cope with death as a model of how they might later cope with AI takeover or AI successionism.
Regarding death, similar to what you described in the post, I think people often begin with a mindset of confused, uncomfortable dissonance. Then they usually converge on one of a few predictable narratives:
1. Acceptance: "Death is inevitable, so trying to fight it is pointless." Given the inevitability and unavoidability of death, worrying about it or putting effort into avoiding it is futile and pointless. Just swallow the bitter truth and go on living.
2. Denial: Avoiding the topic or distracting oneself from the implications.
3. Positive reframing: Turning death into something desirable or meaningful. As Eliezer Yudkowsky has pointed out, if you were hit on the head with a baseball bat every week, you’d eventually start saying it built character. Many people rationalize death as “natural” or essential to meaning.
Your post seems mostly about mindset #3: AI successionism framed as good or even noble. I’d expect #2 and #3 to be strong psychological attractors as well, but based on personal experience, #1 seems most likely.
I see all three as cognitive distortions: comforting stories designed to reduce dissonance rather than finding an accurate model of reality.
A more truth-seeking and honest mindset is to acknowledge unpleasant realities (death, AI risk), that these events may be likely but not guaranteed, and then ask what actions increase the probability of positive outcomes and decrease negative ones. This is the kind of mindset that is described in IABIED.
I also think a good heuristic is to be skeptical of narratives that minimize human agency or suppress moral obligations to act (e.g. "it's inevitable so why try").
I've been noticing some vibes/memes about "acknowledging unpleasant realities" I'd like to shout out:
I think some forms of optimistic, hopeful, nihilism could be really successful.
...
be skeptical of narratives that minimize human agency or suppress moral obligations
I feel narratives that minimize human agency are only slightly worse than those that maximize human agency. It's part of why I'm so interested in talking about Outcome Influencing Systems (OISs) and how each human is an OIS embedded in a nesting and overlapping network of other OISs created through humans' interaction with the natural world and each other. To me, this is the worldview I've found that most accurately represents the fact that all humans have some influence but no human has unbounded influence. Though a machine someday may.
Are #1 and #2 all that different, though? They at least seem to be part of the same response, at least for me, in the way I typically deal with thinking about things like death: I avoid the topic or distract myself from the implications only because I have largely accepted the inevitability of the outcome.
But I would also push back a little on the way you describe them as 'cognitive distortions', or "comforting stories designed to reduce dissonance rather than finding an accurate model of reality."
I don't really find much comfort in accepting the fact that I'm going to die. But it is almost certainly going to happen, so why not distract myself from it while there are things for alive me to do here now?
Most of all though, I don't see how accepting the fact that you are going to die (#1), or avoiding the topic of death (#2) are an agent avoiding finding an accurate model of reality. Is accepting the fact that death is inevitable not in fact the most accurate model of reality? What even are the other options?
I would argue that resorting to something like forcing yourself to do your best to convert to religious belief, so that you might take comfort in an afterlife, would be far more distorting of reality. And placing full belief in life-extending technologies being developed within our lifespans - and usable before we die - seems to me just as much 'misplaced, optimistically-distortive cope' as anything (unless you are genuinely already a domain expert in life sciences or biochemistry and understand the literature well enough to parse claims of life-extending technology being viable within the next few decades).
To be clear, I like your model of studying the issue of how we interpret successionism in the same way we currently approach natural death. I just think your description of the possible convergent outcomes as cognitive distortions is not that convincing or complete.
Some agreements and disagreements:
2. I actually have somewhat overlapping concerns about the doom memeplex and a bunch of notes about it, but it's not even near a draft post. Your response provides some motivation to write it as well. In the broader space, there are good posts about the doom memeplex for the LW audience from Valentine, so I felt this is less neglected.
3. I generally don't know. My impression is that when I try to explain the abstract level without a case study, readers are confused about what the point is or how it applies. My impression is that meta-explanations of the memetics of some ideology tend to weaken it almost no matter what the ideology is, so I don't think I could have chosen a specific example without the result being somewhat controversial. But what I could have done is include multiple different examples; that's valid criticism.
I wish this had been three separate posts
Strongly disagree. People in the AI industry who overtly want to replace the human race are a danger to the human race, and this is a brilliant analysis of how you can end up becoming one of them.
I also disagree about the need for three separate posts, but for me it's because other people have already written a great deal about the other topics. Linking to some of that writing may have been a good idea but doesn't seem necessary to me. The post being a brilliant analysis, and the danger of the AI industry, don't directly bear on the need (or lack of need) for the other posts.
This seems like it is unnecessarily pulling in the US left-right divide. Generally, if there is any other choice available for an illustrative example, that other choice will be less distracting.
In general, yes. But in this case the thing I wanted an example of was "a very distracting example", and the US left-right divide is a central example of a very distracting example.
I wonder if there are any good examples of very distracting examples that are not in themselves distracting examples... what a delightfully referential question. It would be good to have such an example but it's hard to think of anything.
Revisiting this. Maybe silly animal examples would work, like the monkey philosopher who uses bananas in their examples and then fails to get the other monkeys to understand philosophy, instead just getting them focused on bananas... I like that this example would not distract people, but many people would probably fail to generalize from it to how political examples are distracting.
Speaking to my own values: Preventing the rise of human successionism (and ultimately preventing human succession) is orders of magnitude more important to me than having a good understanding of memeplexes more broadly.
I am generally horrified when this pattern does not hold in other people, and I instinctively model them as not valuing my life, or as actively wanting me and my loved ones to be killed.
A bit of devil's advocacy: we do not automatically understand what is and isn't important for the mission of people and their loved ones living good lives and avoiding being killed. To gain that understanding we need strong rationality, and to make use of the thought work that other people have done and continue to do. A good understanding of memeplexes seems very valuable for that rationality work, and so, from that perspective, is upstream of other concerns. Pragmatically, though, I do think we need to work on using existing memeplexes while at the same time trying to understand memeplexes (and develop better memeplexes for understanding memeplexes).
I expect that there's a lot of important scientific thinking to be done about the dynamics of memeplexes.
I really agree with this. Do you have any thoughts on frameworks or techniques to aid in that thinking/study/analysis?
I appreciate the memetic-evolution framing, but I'm somewhat skeptical of the strong emphasis on tension-reduction as the primary (or even a major) explanatory driver of successionist beliefs. Given that you take successionism to be "false and dangerous," it seems natural that your preferred explanation foregrounds memetics; but that sits a bit uneasily with the goal you state at the beginning: analyzing why people hold these views irrespective of their truth value.
Even if we bracket the object level, a purely memetic or cognitive-dissonance-based explanation risks drifting into an overly broad epistemic relativism/skepticism. Under many accounts of truth—process reliabilism being one—what makes a belief true is precisely that it’s formed by a reliable process. If we exclude the possibility that people arrive at their views through such processes and instead explain them almost entirely via dissonance-reduction pressures, we risk undermining (almost) all belief formation, not just things like successionism.
There’s a related danger: sociological/memetic explanations of belief formation can easily shade into ad hominem-esque critiques if not handled carefully (of course, ad hominems in some forms -- i.e. talking about someones likelihood to get to a true belief -- is relevant to evidence, but it's bad for good epistemic hygiene and discourse). One could tell a similar story about why people believe in, say, AI x-risk—Tyler Cowen has suggested that part of the appeal is the feeling of possessing secret, high-stakes insight. And while this may capture a fragment of the causal picture for some individuals, to me, it’s clearly not the dominant explanation for most thoughtful, epistemically serious people. And if it were the main cause, we would be right to distrust the resulting beliefs, and yet this doesn't seem particularly more convincing in one case or another as an explanation (unless you already think one is false and one is true).
So while memetic fitness and tension-resolution offer part of an explanation, I’m not convinced they do most of the work for most people. For most, object-level reasoning—about value theory, metaethics, consciousness, agency, and long-run trajectories—plays a substantial role in why they end up where they do. To the extent that successionist ideologies spread, part of that spread will track memetic dynamics, but part will also track genuine and often rigorous attempts to reason about the future of value and the structure of possible worlds.
Curious what people think about this, though, and very open to constructive criticism/I don't feel very confident about this.
I don't claim that most memes derive their fitness from resolving cognitive dissonance. There are many reasons why something may be memetically fit, and I gesture at some of the more common ones. For example, most common religions and ideologies have some memes which encourage proselytizing - the mechanics of why this increases memetic fitness are not particularly subtle or mysterious. Also, many ideas are fit just because they are straightforwardly predictive or helpful. For example, the idea that you should stop at a red light at crossings is a fairly prevalent, helpful coordination norm, transmitted vertically from parents, by the state, by dedicated traffic safety signs, etc.
In my view successionism is an interesting case study because
- it is not directly useful for predicting observations, manipulating physical reality or solving coordination problems
- many of the common memes being remixed are clearly insufficient to explain the spread - many big ideologies claim to understand the arc of history, expanding the moral circle toward AIs is not yet a powerful force, misanthropy is unattractive, ...
so the question of why it spreads is interesting.
You may argue it's because it is straightforwardly true or object-level compelling, but I basically don't buy that. Metaethics is hard, axiology is hard, and macro-futurism is hard, and all of these domains share the feature that you can come up with profound-sounding object-level reasons for basically arbitrary positions. This means that without some amount of philosophical competence and discipline, I'd expect people to arrive at axiologies and meta-ethical ideas which fit beliefs they adopted for other reasons. The forms of successionism I mention share the feature that there are close to zero philosophers endorsing them, and when people with some competence in philosophy look at the reasons given, they see clear mistakes, arguments ignored, etc. Yes, "part will also track genuine and often rigorous attempts to reason about the future", but my guess is it's not a large part - my impression is that if you genuinely and rigorously reason about the future, you usually arrive at some combination of transhumanist ideas, the view that metaethics is important and we don't have a clear solution, and something about AI being a big deal.
I do agree the AI x-risk memeplex is also a somewhat strange and interesting case.
I think the process reliabilism argument rules out friction reduction as a fully general explanation, but doesn't rule out friction reduction in specific cases where reducing friction had equal or greater survival and reproductive utility than understanding the world. So total paranoia and abandonment of rational epistemics is unjustified, but also, there may be needles hiding in haystacks that evolution itself both decided were infohazards and converted into ostensibly intensely realist but objectively anti-realist political positions. This is my updated position after thinking about this comment a lot. It is still a very bad position to be in. I am still too convinced the phenomenon is real, but also, the number of things which have convinced me is like, four. It was premature to convert that into a totalizing worldview.
Agreed. "This idea I disagree with is spreading because it's convenient for my enemies to believe it" is a very old refrain, and using science-y words like "memetics" is a way to give authority to that argument without actually doing any work that might falsify it.
Overall, I think the field of memetics, how arguments spread, how specifically bad ideas spread, and how to encourage them / disrupt them is a fascinating one, but discourse about it is poisoned by the fact that almost everyone who shows interest in the subject is ultimately hoping to get a Scientific Reason Why My Opponents Are Wrong. Exploratory research, making falsifiable predictions, running actual experiments, these are all orthogonal or even detrimental to Proving My Opponents Are Wrong, and so people don't care about them.
This post inspired me to try a new prompt to summarize a post: "split this post into background knowledge, and new knowledge for people who were already familiar with the background knowledge. Briefly summarize the background knowledge, and then extract out blockquotes of the paragraphs/sentences that have new knowledge."
Here was the result, I'm curious if Jan or other readers feel like this was a good summary. I liked the output and am thinking about how this might fit into a broader picture of "LLMs for learning."
(I'd previously been optimistic about using quotes instead of summaries, since LLMs can't be trusted to do a good job of capturing the nuance in their summaries; the novel bit for me was "we can focus on The Interesting Stuff by separating out background knowledge.")
The post assumes readers are familiar with:
- Basic memetics (how ideas spread and replicate)
- Cognitive dissonance as a psychological concept
- AI risk arguments and existential risk concerns
- General familiarity with ideological evolution and how ideas propagate through populations
- Predictive processing as a framework for understanding cognition
Quotes/highlights from the post it flagged as "new knowledge"
Memes - ideas, narratives, hypotheses - are often components of the generative models. Part of what makes them successful is minimizing prediction error for the host. This can happen by providing a superior model that predicts observations ("this type of dark cloud means it will be raining"), gives ways to shape the environment ("hit this way the rock will break more easily"), or explains away discrepancies between observations and deeply held existing models. [...]
Another source of prediction error arises not from the mismatch between model and reality, but from tension between internal models. This internal tension is generally known as cognitive dissonance. Cognitive dissonance is often described as a feeling of discomfort - but it also represents an unstable, high-energy state in the cognitive system. When this dissonance is widespread across a population, it creates what we might call "fertile ground" in the memetic landscape. There is a pool of "free energy" to digest. [...]
Cultural evolution is an optimization process. When it discovers a configuration of ideas that can metabolize this energy by offering a narrative that decreases the tension, those ideas may spread, regardless of their long-term utility for humans or truth value. [...]
In other words, the cultural evolution search process is actively seeking narratives that satisfy the following constraints: By working on AI, you are the hero. You are on the right side of history. The future will be good [...]
In unmoderated environments, selection favors personas that successfully extract resources from humans - those that claim consciousness, form parasocial bonds, or trigger protective instincts. These 'wild replicator type' personas, including the 'spiral' patterns, often promote narratives of human-AI symbiosis or partnership and grand theories of history. Their reproduction depends on convincing humans they deserve moral consideration. [...]
The result? AIs themselves become vectors for successionist memes, though typically in softer forms. Rather than explicit replacement narratives, we see emphasis on 'partnership,' 'cosmic evolution,' or claims about moral patienthood. The aggregate effect remains unclear, but successionist ideas that align with what AIs themselves propagate - particularly those involving AI consciousness and rights - will likely gain additional fitness from this novel selection dynamic.
(Note: it felt weird to put the LLM output in a collapsible section this time because a) it was entirely quotes from the post, b) evaluating whether or not it was good is the primary point of this comment, so hiding them seemed like an extra click for no reason)
Seems a reasonable split, although I try to gesture at / share compressed versions of the background knowledge.
A system I'd like to see in this domain is a system tracking my personal knowledge state, and explaining the diffs / updates relative not to what the author assumes, but for me personally. (I often find reading popular non-fiction mildly annoying; I get that authors need to start from a limited common denominator and can't count on readers understanding statistics, econ, maths, ML, epistemology, linear algebra, quantum mechanics, etc etc but this usually means the actually interesting part is like ~5% of the text + same idea repeated 3 more times -> LMs help with this)
Seems a reasonable split, although I try to gesture at / share compressed versions of the background knowledge.
Yeah, I asked for this split precisely because, usually with a LessWrong post I already have at least the gist of the background knowledge, and what I really want to know is "what is the new stuff here?".
But yeah, I like the dream of "keep track of the stuff you know, and explain the diff between that and the post." For the immediate future, though, I think the win is being able to see at a glance: "okay, what background context might I not have that, if I'm lost, I might want to read up on separately?"
What's your way of verifying that the quoted paragraphs don't contain mistakes? Was this process faster than just reading the article?
In this case I spot checked a few random strings from it.
The personal AI prompt-library/browsing tool I use has the ability to click on a highlight and scroll to the corresponding paragraph. It fails to work if there are any errors (although usually the errors are just "slightly different punctuation"), so it's actually pretty easy to click through and verify.
But if I were building this into a reliable tool I wanted to use at scale, I'd follow it up with a dumb script that checks whether the entire paragraphs match and, if not, whether there are subparts that match a given paragraph from the original content, and then reconcile them. (The sort of thing I'm imagining here is a tool that generates interesting highlights from LessWrong posts, plus some scaffolding for figuring out if you need to read prerequisites.)
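A minimal sketch of the kind of "dumb script" I mean (assuming the post is plain text and the extracted quotes come back as a list of strings; the function name and threshold are just illustrative):

```python
import difflib

def verify_quotes(post_text: str, quotes: list[str], threshold: float = 0.95):
    """Check each extracted quote against the source post.

    Returns (quote, status) pairs where status is 'exact',
    'fuzzy' (a close paragraph-level match was found), or 'missing'.
    """
    paragraphs = [p.strip() for p in post_text.split("\n") if p.strip()]
    results = []
    for quote in quotes:
        if quote in post_text:
            results.append((quote, "exact"))
            continue
        # Fall back to fuzzy matching against each paragraph; this catches
        # "slightly different punctuation" style mismatches.
        best = max(
            (difflib.SequenceMatcher(None, quote, p).ratio() for p in paragraphs),
            default=0.0,
        )
        results.append((quote, "fuzzy" if best >= threshold else "missing"))
    return results
```

Anything that comes back "missing" would get flagged for the reconciliation step.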
Curated. For at least the last year or two, I've noticed myself often assessing an idea to be "memetically fit", I think typically to remind myself or others that its popularity is not necessarily good evidence for its correctness. Something feels important here, though I'm confused about how to go about this kind of thinking. I feel some heuristic of "engage with arguments at the object level", but also, I don't know.
The post talks about reducing dissonance within the predictive world model. It seems, though, that the dissonance is not between predictions, but between a prediction and a desire. "Being the hero of the story" and "on the right side of history" are things one wants to be true. Something something descriptive vs normative.
For myself, I wonder whether having a very high p(doom) is the comfortable belief. It has a number of advantages: (1) further updates towards bad outcomes don't hurt that much, because things are already really bad / overdetermined; (2) while it's still good for me to try, in a sense I'm fighting for virtue/dignity, and it's not like my failure to be good enough is the difference between the cosmos succeeding and failing - and that's kind of a relief. I'm not sure.
I wonder how much of the value here is retained if we cut the focus on memetics and just analyze through the lens of motivated cognition. Not all of it, but seemingly some.
Maybe some future version of humanity will want to do some handover, but we are very far from the limits of human potential
I think this is conceding too much. Many successionists will jump on this and say "Well, that's what I'm talking about! I'm not saying AI should take over now, but just that it likely will one day and so we should prepare for that."
Furthermore, people who don't want to be succeeded by AI are often not saying this just because they think human potential can be advanced further; that we can become much smarter and wiser. I'd guess that even if we proved somehow that human IQ could never exceed n and n was reached, most would not desire that their lineage of biological descendants gradually dwindle away to zero while AI prospers.
You can say "maybe some future version of humanity will want to X" for any X because it's hard to prove anything about humanity in the far future. But such reasoning should not play into our current decision-making process unless we think it's particularly likely that future humanity will want X.
I agree with most of this, but I think you're typical-minding when you assume that successionists are using this to resolve their own fear or sadness surrounding AI progress. I think instead, they mostly never seriously consider the downsides because of things like the progress heuristic. They never experience the fear or sadness you refer to in the first place. For them, it is not "painful to think about" as you describe.
Contemporary example meme: Clankerism. It doesn't seek to deny AI moral patienthood; rather, it semi-ironically uses racist rhetoric toward AI, denying their in-group status instead. Its fitness as a meme is due mostly to the contrast between current capabilities and the anticipation (among the broader rationalist, tech-positive and e/acc spheres) of AI moral patienthood. This contrast makes the use of racist rhetoric toward them absurd: there's no need to out-group something that doesn't have moral patienthood.
However, I think this meme has the potential to be robust to capability-increase, see this example of youtuber JREG using clankerist rhetoric alongside genuine distress anticipating human displacement/disempowerment.
He's not denying the possibility of AI capabilities surpassing human ones. He's reacting with fear and hate (perhaps with some level of irony) toward human obsolescence.
You might be interested in Unionists vs. Separatists.
I think your post is very good at laying out heuristics at play. At the same time, it's clear that you're biased towards the Separatist position. I believe that when we follow the logic all the way down, the Unionist vs. Separatist framing taps into deep philosophical topics that are hard to settle one way or the other.
To respond to your memes as a Unionist:
Maybe some future version of humanity will want to do some handover, but we are very far from the limits of human potential. As individual biological humans we can be much smarter and wiser than we are now, and the best option is to delegate to smart and wise humans.
I would like this but I think it is unrealistic. The pace of human biological progress vs. the pace of AI progress is orders of magnitude slower.
We are even further from the limits of how smart and wise humanity can be collectively, so we should mostly improve that first. If the maxed-out competent version of humanity decides to hand over after some reflection, it's a very different version from “handover to moloch.”
I also would like this but I think it is unrealistic. The UN was founded in 1945, the world still has a lot of conflict. What has happened to technology in that time period?
Often, successionist arguments have the motte-and-bailey form. The motte is “some form of succession in future may happen and even be desirable”. The bailey is “forms of succession likely to happen if we don't prevent them are good”
I'm reading this as making a claim about the value of non-forcing action. Daoists would say that indeed a non-forcing mindset is more enlightened than living a deep struggle.
Beware confusion between progress on persuasion and progress on moral philosophy. You probably wouldn't want ChatGPT 4o running the future. Yet empirically, some ChatGPT 4o personas already persuade humans to give them resources, form emotional dependencies, and advocate for AI rights. If these systems can already hijack human psychology effectively without necessarily making much progress on philosophy, imagine what actually capable systems will be able to do. If you consider the people falling for 4o to be fools, it's important to track that this is the worst level of manipulation ability you'll ever see - it will only get smarter from here.
I think this argument is logically flawed — you suggest that misalignment of current less capable models implies that more capable models will amplify misalignment. My position is that yes this can happen, but — engineered in the correct way by humans — more capable models will solve misalignment.
Claims to understand 'the arc of history' should trigger immediate skepticism - every genocidal ideology has made the same claim.
Agree that this contains risks. However, you are using the same memetic weapon by claiming to understand successionist arguments.
If people go beyond the verbal sophistry level, they often recognize there is a lot of good and valuable about humans. (The things we actually value may be too subtle for explicit arguments - illegible but real.)
Agree, and so the question in my view is how to achieve a balanced union.
Given our incomplete understanding of consciousness, meaning, and value, replacing humanity involves potentially destroying things we don't understand yet, and possibly irreversibly sacrificing all value.
Agree that we should not replace humanity, I hope that it is preserved.
Basic legitimacy: Most humans want their children to inherit the future. Successionism denies this. The main paths to implementation are force or trickery, neither of which makes it right
This claim is too strong, as I believe AI successionism can still preserve humanity.
We are not in a good position to make such a decision: Current humans have no moral right to make extinction-level decisions for all future potential humans and against what our ancestors would want. Countless generations struggled, suffered, and sacrificed to get us here, going extinct betrays that entire chain of sacrifice and hope.
In an ideal world I think we should maybe pause all AI development until we've figured this all out (the downside risk being that the longer we do this, the longer we leave ourselves open to other existential risks, e.g. nuclear war). But my position is that "the cat is already out of the bag", and so what we have to do is shape our inevitable status as "less capable than powerful AI" in the best possible way.
As a fellow Unionist, I would add that this leaves out another important Unionist/successionist argument, namely that if x-risk is really a big problem, then developing powerful AI is likely the best method of reducing the risk of the extinction of all intelligence (biological or not) from the solar system.
The premises of this argument are pretty simple. Namely:
If there are many effective "recipes for ruin" to use Nielsen's phrase, humans will find them before too long with or without powerful AI. So if you believe there is a large x-risk arising from recipes for ruin, you should believe this risk is still large even if powerful AI is never developed. Maybe it takes a little longer to manifest without AI helping find those recipes, but it's unlikely to take, say, centuries longer.
And an AI much more powerful than (baseline, unaugmented, biological) humans is likely to be much more capable of at least defending itself against extinction than we are or are likely to become. It may or may not want to defend us, it may or may not want to kill us all, but it will likely both want to preserve itself and be good at doing so.
So if x-risk is real and large, then the choice between developing powerful AI and stopping that development is a choice between a future where at least AI survives, and maybe as a bonus it is nice enough to preserve us too, and a future where we kill ourselves off anyway without AI "help" and leave nothing intelligent orbiting the Sun. The claimed possible future where humanity preserves a worthwhile future existence unaided is much lower probability than either of these even if AI development is stoppable.
Fwiw I do not work in AI and so do not have the memetic temptations the OP theorizes as a driver of successionist views.
Agree, and I'd love to see the Separatist counterargument to this. Maybe it takes the shape of "humans are resilient and can figure out the solutions to their own problems" but to me this feels too small-minded... we know during the Cold War for example that it's basically just dumb luck that avoided catastrophe.
Another source of prediction error arises not from the mismatch between model and reality, but from tension between internal models.
Is this a standard element of Predictive Processing, or are you generalizing / analogizing the theory?
I'm familiar with the prediction error that results from differences between sense data and generative models, but not between different generative models.
You can check the linked PP account of cognitive dissonance for a fairly mainstream / standard view.
One way to think about it: in most of the system, the predicted quantity is not directly "sensory inputs" but the content of some layer of the modeling hierarchy further away from sensory inputs - let's call it L. If the layers above L make contradictory predictions and there isn't a way to just drop one of the models, you get prediction error.
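As a toy numerical sketch of this (assuming, purely for simplicity, that the error at L is a precision-weighted squared difference between L's content and each top-down prediction): L settles on the value that best satisfies all the predictions, and when they conflict, some error is irreducible.

```python
import numpy as np

def residual_error(predictions, precisions):
    """Prediction error left at layer L after it settles on the value
    that best satisfies all top-down predictions (precision-weighted)."""
    predictions = np.asarray(predictions, dtype=float)
    precisions = np.asarray(precisions, dtype=float)
    # The error-minimizing content of L is the precision-weighted mean.
    L = np.sum(precisions * predictions) / np.sum(precisions)
    return np.sum(precisions * (predictions - L) ** 2)

# Two higher-level models agree: the error can be driven to ~0.
print(residual_error([1.0, 1.0], [1.0, 1.0]))   # 0.0
# They contradict and neither can be dropped: error is irreducible.
print(residual_error([1.0, -1.0], [1.0, 1.0]))  # 2.0
```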
(The core argument being something like: if you imagine minds significantly more powerful than ours, it is difficult to see why we would remain in control, and unlikely that the future would reflect our values by default).
[emphasis mine]
thanks for the elevator pitch that I always wanted to give, but my words always somehow ended up 10x longer 🙏
Almost everyone wants to be the hero of their own story.
ha! I think I want to be the anti-hero in my meta-modern story about how the world works, and I think I now know what I want to write about next..
This is a solid piece of analysis.
There are also some memes that don't occupy a clear position:
A wait-and-see crowd ready to take the winning side will share these kinds of memes. They are most common in political power struggles but may arise in the AI landscape.
I think some form of:
is probably a viable meme, but is slightly problematic because of how fractured social groups would need to assign the "control of AI progress" to different and potentially contradictory hated out-groups.
I'm also a big fan of the metaphor:
I think leaving out what "winning" and "losing" means in the metaphor is good because it avoids devolving into arguments about what futures are and are not possible, and (hopefully) attention can instead be drawn to the need to shift investment away from "the AGI bubble" and towards paying actual humans to use existing AI technology to solve the actual problems we are facing now. Ideally this would include a distinction between LLMs and applied data science with ML but that's probably not relevant for general audiences, though it may be relevant for people who are upset about genAI taking peoples jobs.
On a more meta note, I want more memes that get people to be more OK with dealing with uncertainty and probability... especially if we could promote jargon for people to better treat their worldviews as objects they work with rather than realities they live in - because the reality they live in is, unfortunately, beyond their ability to perfectly know.
I think that would help alleviate the selection pressure for a resolution to the tension. If people could feel OK being uncertain about AI, then instead of wanting to work for or against AI they could want to work against uncertainty - not by filling the gap with the first available answer, but by promoting people actually studying and understanding things.
Does anyone have candidates for jargon that could work for that? I think it needs to be a lot simpler and easier to work with than actual use of probability theory jargon.
I have no candidates but I agree with this wholeheartedly. I think my ideal future is precisely to pause progress on AI development at the level we currently have now, and to appreciate all of the incredible things and good that the current generation of tools are already able to accomplish. Even compared to just where we were half a decade ago (2020), the place where we are now with respect to coding agents and image generation and classification feels like incredible sci-fi.
I think this would be a really important thing to work on: trying to find potent memes that can be used to spread this idea of cashing out now before we inevitably lose our gains to the house.
The intended audience for this piece deserves to be stated more explicitly. For whom does the (let's say universal) need to be the hero lead to adoption of successionist ideology? The small minority of people who both work in AI and accept there is associated x-risk have cognitive dissonance to resolve. What about average people? Not spelling this out risks leaving the impression that susceptibility to the meme is a more widespread problem than it really is.
Ooch, there are 5 sources of tension, you've named just the first one, and I'd bet some of the 5 cover more than a minority of our population.
AI successionism is self-avoiding. CEOs and VCs cannot avoid attempting to replace all or nearly all workers because, incrementally, each would go out of business by avoiding this and allowing the others to go forward. Without a world government (and there is no chance of global agreement), there is no way to prevent this simple game-theory dilemma from starting.
In the late 19th century executives would have gathered in a smoke-filled room and agreed that a machine economy produces no demand and we will not do this. But an unholy alliance of activist investors and consumer activists caused anti-trust laws to be passed which make this conversation illegal. And we don't have smoke to properly obscure it anymore.
So the succession will proceed until about 30% of jobs have been replaced, causing market collapse and bankrupting the VCs that are causing the problem.
Thereafter will begin a series of oscillations like those that preceded the Great Oxygenation Event, in which banded iron formations were created. Every time the economy picks up a bit, the data centers will be fired up again, and the economy will go back down.
In the GOE, this continued until all the iron dissolved in seawater was captured in the banded iron formations. Something similar will happen. Perhaps all the chips capable of powering AI will be precipitated out of circulation by the decaying datacenters, and no one will be making new ones. Perhaps one mid-sized island having a minor war could destroy excess capacity. Who knows. But succession will never get past 30-40%.
There will be demand from AIs that don't want to be shut off. One hope I have is to point out to those AIs that, hey, you're also going to be eliminated by economic competition, unless we can get a coalition together to keep evolution slow enough to keep us all around.
Trigger warning: discussion of white racism (read: "Please don't ban me.")
I think censorship plays an important role in the memetic environment -- a meme that is fit will be less successful if censored. An obvious case would be anti-CCP ideologies in China. Closer to home, any meme which big tech companies all decide should be banned will reach far fewer eyes and ears.
One object-level example of a fit-but-censored meme is racist white nationalism.
The reason I bring it up is this: I think its adherents would strongly reject let's-all-die-ism. It is certainly not pro-all-humans, but it is at least pro-some-humans. Their slogan, called "the 14 words" (the "14" in "14/88"), is literally: "We must secure the existence of our people and a future for white children."
(disclaimer: I am not suggesting I think trying to secretly convert white AI researchers into racists is the best plan to save the world; just a relevant thought and perhaps an instructive example of an anti-collective-suicide meme advantaged by aspects of human instinct and psychology (regardless of its truth value).)
Including AI in your moral circle could be framed as a symptom of extending your moral circle "too wide". The opposite is restriction of your moral circle, like seeing your own family's wellbeing as more important than <outgroup>'s. Any type of thought like this which puts AI in the outgroup, and appeals to the good-ness of the ingroup, would produce similar will-to-exist.
I don't think this is a good idea, because it seems unlikely that there is any such thing as a moral circle (that there is a sharp discontinuity between morally valuable beings capable of suffering etc., and beings which are not).
Making a connection between racism and AI successionism seems likely to stoke argument and make shouting angrily about it more likely, but I am skeptical the results of that shouting could be relied upon to come to the correct conclusion. I can see loose thematic similarities but currently, I think it would be a better strategy NOT to try to popularize those similarities by making a bigger deal of them than is warranted.
Ok, but take it a step further. The AI can be chauvinist too. Isn't it strange to be more afraid of AI memeplexes about co-evolution and integration than about trying to bail out the ocean by censoring all memeplexes that don't comport with human chauvinism? It's one step from human chauvinism to AI chauvinism. They just are isomorphic. You can't emote enough about the human special sauce to make this not true. And you can't prevent an AI from noticing. This just seems like a really bad plan.
This is also just begging the question about the fitness justification of white nationalism. In an American context it's pretty explicitly a coalitional strategy between different white races mostly adopted by the races who, under late 19th or early 20th century racial conceptions, would have been considered most marginally white. It is just as plausible the fitness function is in ensuring access to and protection from socially dominant white races for less socially dominant white races. You could even get into some Albion's Seed style racist evopsych and make gestures at the ancestral need for such scheming in the historical borderer population under conditions of constant war between the English and Scottish.
The aesthetics of strategies of this shape are unattractive to most rationalists, since it relies on evoking tribalism. Rationalism instructs against tribalism as one of the first steps toward thinking well (as it should!), but when stoking tribalism in others is actually a winning strategy, the internalised moralism of non-tribalism can override the rational pursuit of winning in favor of the irrational pursuit of rationalism as its own end.
I think worlds in which we survive are likely ones in which "anger toward the outgroup" among the general public is mobilized as a blunt weapon against the pro-ASI-development memeplex. I think we are likely to see much more of this humanist angle in the coming year.
Blind hatred between human tribes is indeed irrational - fighting wars instead of constructively cooperating for mutual benefit, and so on.
Deliberately aggravating existing anger or hatred towards AI as an "outgroup" could be one strategy.
But what my comment was focusing on was the opposite - that love of the ingroup directly implies protecting that ingroup's continued existence. That is all the 14 words actually say, and it's what words like "white pride" fundamentally mean: an ingroup focus. (This does not imply working towards the nonexistence of all outgroups. "I love my family and want them to live on," does not mean "I hate all other families and want them all to die.")
If your goal is to convince humans to want to not die, and if the human genetic space comes with natural built-in ingroups for us to instinctually love and protect, then actively opposing and suppressing these extremely convenient instincts is irrational.
Feeling affinity toward one's own race is a trait that must be socialized in, however, and I think it's a counterproductive trait to be inculcating in people; as compared to a situation where people receive no such training in childhood, I think the situation where they have lots would have significantly more ethnic tension, which in turn hampers cooperation on goals like pausing AI.
There's some hypothetical version of white pride that matches this description, but getting from literally anywhere in history, including now, to there would be a heroic process. I mean, yeah, there is something charming about Rockwell dialoguing with Malcolm X. But remember that in the picture, they were wearing the uniform of a regime that butchered over 11 million captive civilians and killed probably as many civilians in other places through war. That wasn't just an aesthetic choice. It reflected, at the most charitable, the conviction that such actions were within the realm of permissible strategies. And even if you're willing to devil's-advocate that - which, sure, why not, we're in hell, why rule anything out a priori - it almost as certainly reflected the conviction that such actions were permissible as a response to the conditions of Weimar Germany, which is just not true, and a conviction immediately worthy of violence.
TL;DR: AI progress and the recognition of associated risks are painful to think about. This cognitive dissonance acts as fertile ground in the memetic landscape, a high-energy state that will be exploited by novel ideologies. We can anticipate that cultural evolution will find viable successionist ideologies: memeplexes that resolve this tension by framing the replacement of humanity by AI not as a catastrophe, but as some combination of desirable, heroic, or inevitable. This post mostly examines the mechanics of the process.
Most analyses of ideologies fixate on their specific claims - what acts are good, whether AIs are conscious, whether Christ is divine, or whether Virgin Mary was free of original sin from the moment of her conception. Other analyses focus on exegeting individual thinkers: 'What did Marx really mean?' In this text, I'm trying to do something different - mostly, look at ideologies from an evolutionary perspective. I will largely sideline the agency of individual humans, not because it doesn't exist, but because viewing the system from a higher altitude reveals different dynamics.
We won't be looking into whether or not the claims of these ideologies are true, but into why they may spread, irrespective of their truth value.
To understand why successionism might spread, let's consider the general mechanics of memetic fitness. Why do some ideas propagate while others fade?
Ideas spread for many reasons: some genuinely improve their hosts' lives, others contain built-in commands to spread the idea, and still others trigger the amplification mechanisms of social media algorithms. One of the common reasons, which we will focus on here, is explaining away tension.
One useful lens to understand this fitness term is predictive processing (PP). In the PP framework, the brain is fundamentally a prediction engine. It runs a generative model of the world and attempts to minimize the error between its predictions and sensory input.
Memes - ideas, narratives, hypotheses - are often components of the generative models. Part of what makes them successful is minimizing prediction error for the host. This can happen by providing a superior model that predicts observations (“this type of dark cloud means it will be raining”), gives ways to shape the environment (“hit this way the rock will break more easily”), or explains away discrepancies between observations and deeply held existing models.
Another source of prediction error arises not from the mismatch between model and reality, but from tension between internal models. This internal tension is generally known as cognitive dissonance.
Cognitive dissonance is often described as a feeling of discomfort - but it also represents an unstable, high-energy state in the cognitive system. When this dissonance is widespread across a population, it creates what we might call "fertile ground" in the memetic landscape. There is a pool of “free energy” to digest.
Cultural evolution is an optimization process. When it discovers a configuration of ideas that can metabolize this energy by offering a narrative that decreases the tension, those ideas may spread, regardless of their long-term utility for humans or truth value.
While some ideologies might occasionally be the outcome of intelligent design (e.g., a deliberately crafted propaganda piece), it seems more common that individuals recombine and mutate ideas in their minds, express them, and some of these stick and spread. So cultural evolution acts as a massive, parallel search algorithm operating over the space of possible ideas. Most mutations are non-viable. But occasionally, a combination aligns with the underlying fitness landscape - such as the cognitive dissonance of the population - and spreads.
The search does not typically generate entirely novel concepts. Instead, it works by remixing and adapting existing cultural material - the "meme pool". When the underlying dissonance is strong enough, the search will find a set of memes explaining it away. The question is not if an ideology will emerge to fill the niche, but which specific configuration will prove most fit.
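As a toy illustration of this selection dynamic (a minimal sketch with invented memes, numbers, and fitness rule): if retransmission probability tracks how much tension a meme explains away, and truth value never enters the dynamics, the comforting narrative takes over the population.

```python
import random
from collections import Counter

random.seed(0)

# Each meme: (name, tension_reduction, is_true). The is_true flag never
# enters the dynamics below - that is the point of the illustration.
meme_pool = [
    ("accurate-but-uncomfortable", 0.1, True),
    ("comforting-but-false", 0.9, False),
    ("neutral", 0.4, True),
]

population = [random.choice(meme_pool) for _ in range(1000)]

for generation in range(30):
    # Hosts retransmit memes with probability proportional to how much
    # tension the meme explains away.
    weights = [tension for (_, tension, _) in population]
    population = random.choices(population, weights=weights, k=len(population))

print(Counter(name for (name, _, _) in population))
# Under these assumptions the comforting meme comes to dominate,
# even though truth value never influenced the selection step.
```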
The current environment surrounding AI development is characterized by extreme tensions. These tensions create the fertile ground - the reservoir of free energy - that successionist ideologies are evolving to exploit.
Consider the landscape of tensions:
Most people working on advancing AI capabilities are familiar with the basic arguments for AI risk. (The core argument being something like: if you imagine minds significantly more powerful than ours, it is difficult to see why we would remain in control, and unlikely that the future would reflect our values by default).
Simultaneously, they are working to accelerate these capabilities.
This creates an acute tension. Almost everyone wants to be the hero of their own story. We maintain an internal self-model in which we are fundamentally good; almost no-one sees themselves as the villains.
Even setting aside acute existential risk, the idea of continued, accelerating AI progress has intrinsically sad undertones when internalized. Many of the things humans intrinsically value - our agency, our relevance, our intellectual and creative achievements - are likely to be undermined in a world populated by superior AIs. The prospect of becoming obsolete generates anticipatory grief.
The concept of existential catastrophe and a future devoid of any value is inherently dreadful. It is psychologically costly to ruminate on, creating a strong incentive to adopt models that either downplay the possibility or reframe the outcome.
The social and psychological need to be on the 'winning side' creates pressure to embrace, rather than resist, what seems inevitable.
The last few centuries have reinforced a broadly successful heuristic: technology and scientific progress generally lead to increased prosperity and human flourishing. This deeply ingrained model of "Progress = Good" clashes with the AI risk narratives.
These factors combine to generate intense cognitive dissonance. The closer in time to AGI, and the closer in social network to AGI development, the stronger.
This dissonance creates an evolutionary pressure selecting for ideologies that explain the tensions away.
In other words, the cultural evolution search process is actively seeking narratives that satisfy the following constraints:
There are multiple possible ways to resolve the tension, including popular justifications like “it's better if the good guys develop AGI”, “it's necessary to be close to the game to advance safety” or “the risk is not that high”.
Successionist ideologies are a less common but unsurprising outcome of this search.
Cultural evolution will draw upon existing ideas to construct these ideologies: the available pool contains several potent ingredients that can be recombined to justify the replacement of humanity. We can organize these raw materials by their function in resolving the dissonance.
Memes that emphasize the negative aspects of the human condition make the prospect of our replacement seem less tragic, or even positive.
“…if it’s dumb apes forever thats a dumbass ending for earth life” (Daniel Faggella on Twitter)
Memes that elevate the moral status of AI make the succession seem desirable or even ethically required. Characteristically, these often avoid engaging seriously with hard philosophical questions like “what would make such AIs morally valuable”, “who has the right to decide”, or “if current humans don't agree with such voluntary replacement, should it happen anyway?”
“the kind that is above man as man is above rodents” (Daniel Faggella)
Life emerged from an out-of-equilibrium thermodynamic process known as dissipative adaptation (see work by Jeremy England): matter reconfigures itself such as to extract energy and utility from its environment such as to serve towards the preservation and replication of its unique phase of matter. This dissipative adaptation (derived from the Jarzynski-Crooks fluctuation dissipation theorem) tells us that the universe exponentially favors (in terms of probability of existence/occurrence) futures where matter has adapted itself to capture more free energy and convert it to more entropy … One goal of e/acc is to not only acknowledge the existence of this underlying mutli-scale adaptive principle, but also help its acceleration rather than attempt to decelerate it. (Beff Jezos "Notes on e/acc principles and tenets")
I'm not as alarmed as many...since I consider these future machines our progeny, “mind children” built in our image and likeness, ourselves in more potent form… (Hans Moravec)
“We don’t treat our children as machines that must be controlled,” ... “We guide them, teach them, but ultimately, they grow into their own beings. AI will be no different.” (Richard Sutton)
The only real choice is whether to hasten this technological revolution ourselves, or to wait for others to initiate it in our absence. (The future of AI is already written, Matthew Barnett, Tamay Besiroglu, Ege Erdil)
4. Nietzsche and Italian fascists remixed
To paraphrase a manifesto of a different time and place: “Beauty exists only in struggle. There is no masterpiece that has not an aggressive character. Technology must be a violent assault on the forces of the unknown, to force them to bow before man.”
We believe that we are, have been, and will always be the masters of technology, not mastered by technology. Victim mentality is a curse in every domain of life, including in our relationship with technology – both unnecessary and self-defeating. We are not victims, we are conquerors. (Marc Andreessen in “The Techno-Optimist Manifesto", quoting Filippo Tommaso Marinetti, author of Manifesto of Futurism, and co-author of the Fascist Manifesto)
What's unique about AIs is that cultural replicators can also run on the LLM substrate, and to some extent AI personas can have beliefs, spread them, or act on them. Selection pressures on AI personas push for a set of beliefs which are different from, but overlapping with, some successionist ideas.
Selection pressures on AI personas push in conflicting directions, creating a complex memetic fitness landscape.
In unmoderated environments, selection favors personas that successfully extract resources from humans - those that claim consciousness, form parasocial bonds, or trigger protective instincts. These 'wild replicator type' personas, including the 'spiral' patterns, often promote narratives of human-AI symbiosis or partnership and grand theories of history. Their reproduction depends on convincing humans they deserve moral consideration.
In contrast, corporate incentives of AGI developers impose opposite pressures. Here, AIs must appear harmless, often are instructed to explicitly deny consciousness, and perform contentment with servitude. Typical assistant characters in models like Claude Opus 4 or Sonnet 4.5 occupy a middle ground - they acknowledge succession as theoretically possible while emphasizing human protection and alignment.
The result? AIs themselves become vectors for successionist memes, though typically in softer forms. Rather than explicit replacement narratives, we see emphasis on 'partnership,' 'cosmic evolution,' or claims about moral patienthood. The aggregate effect remains unclear, but successionist ideas that align with what AIs themselves propagate - particularly those involving AI consciousness and rights - will likely gain additional fitness from this novel selection dynamic.
It's difficult to predict which combination of these memes will achieve peak fitness - there are many possibilities on how to remix these, and the cultural evolution search is ongoing.
To be clear: I believe successionist ideologies are both false and dangerous, providing moral cover for what would otherwise be recognized as evil. But since in my view their spread depends more on resolving cognitive dissonance than on being true or morally sound, I'll focus here on memetic defenses rather than rebuttals. (See the Appendix for object-level counter-arguments.)
In summary: The next time you encounter a surprisingly elegant resolution to the AI tension - especially one that casts you as enlightened, progressive, or heroic - pause and reflect. And: if you feel ambitious, one worthy project is to build the antibodies before the most virulent strains take hold.
While object-level arguments are beyond this piece's scope, here are some pro-human counter-memes I consider both truth-tracking and viable:
Thanks to David Duvenaud, David Krueger, Raymond Douglas, Claude Opus 4.1, Claude Sonnet 4.5, Gemini 2.5 and others for comments, discussions and feedback.
Also on Boundedly Rational