P(doom) is a Dumb Meme

Max Harms

Look, I'm as much of a Rationalist with a special interest in AI x-risk as anyone. But oh my god do I hate talking about "P(doom)". When it gained widespread usage in the wake of ChatGPT, I assumed that it was floating around variously adjacent circles of faux-intellectuals, but surely everyone in my circles could see how braindead it was... right?

(This post was partially inspired by a recent conversation with Liron about Doom Debates.^[1])

I guess it's time for me to focus on a place where I'm shocked that everyone else is dropping the ball.^[2]

P(doom) is Hopelessly Vague

Let's start with the ambiguity. Does "doom" mean... extinction? A lot of people think so! I have personally encountered people who think catastrophic harms from AI are likely, but the risks of all humans dying are low. They're like "Sure, 99.999% of humans might die from AI, but the AI will obviously want to keep thousands of humans alive for science and potential trade with aliens and stuff, so my P(doom) is approximately 0%."

That might sound crazy. Surely you, dear reader, know exactly what "doom" means. You know, for example, which of these count as doom and which don't:

A young ASI tries to use its first-mover advantage to take over the world and prevent other ASI competitors from emerging. In doing so it sparks a war against humanity where it eventually loses,^[3] but it kills 10% of all humans in the process.
ASI empowers a single person or small group of humans to become tyrants and lock in a permanent authoritarian regime where almost all other humans are subject to the whims of the tyrant(s).
A state actor or terrorist group uses narrow AI to build a bioweapon (eg AlphaFold but for plagues), and that weapon gets out and kills literally everyone.^[4]
A great power conflict starts for reasons mostly unrelated to AI, but advanced AI systems are deployed on the various battlefields, and eventually (in part due to the speed and ruthlessness of AI) it escalates to a thermonuclear war that kills 99% of humanity.
Humans "merge with the machines" in a way that, from the outside, looks an awful lot like all of humanity being subsumed by inhuman machinery.
Humans become incapable of competing with machines in almost all sectors of the economy. While the rule of law persists, the political landscape is also dominated by a variety of ASIs with various goals, and they don't redistribute significant wealth into human hands. Only a small number of humans survive in the long-run, relegated to being a historical curiosity.
ASIs are developed in a broadly corrigible way, leading to extreme abundance, and the offense-defense balance means that the world is basically safe. But everyone (even the Amish) eventually stops having human kids because other things (including AI "kids") are way more fun/satisfying. Even though lifespans are really long, people still gradually die off, and eventually only the machines remain.

Implicit in all of this are questions of timeframe. Many people that I've talked to feel that gradual disempowerment probably doesn't count as AI doom because it's too slow, and they're assuming a short timeframe for "P(doom)" like 5 or 10 years. Others assume that P(doom) means "this century." (But does that mean the next 74 years or 100 years?) Does the heat death of the universe count as doom?

Even if you have a way that you like to think about the subject, are you sure that the person you're talking with is using the same frame? If I've learned one thing from making lots of bets and engaging with prediction markets nearly every day for years, it's that the devil is in the details, and that finding the right way to operationalize a bet is often the hardest part.

(Edit: To be clear, I am not saying that vague statements are always bad. There's only so much communication bandwidth, after all. What I am against is vagueness under the guise of precision, where people don't realize that they have different interpretations, and thus miscommunicate.)

Inside Views, Outside Views, and Likelihood Ratios

"I actually think the risk [from AGI] is more than 50% ... but I don't say that because there's other people think it's less, and I think a sort of plausible thing that takes into account the opinions of everybody I know is sort of 10 to 20%."
— Geoffrey Hinton, 2024

Let's say that you're trying to estimate the number of beans in a jar. A good way to do that is to use the wisdom of crowds. Go around to a bunch of people, ask them to guess, and then take the average of their guesses. Some people will be too high, others will be too low, but in theory their guesses will be correlated with the truth and their errors will be independent, so when you average them, the errors cancel out.

But now suppose that you go up to someone and ask them to write their guess on a piece of paper. They write down 1000. Then, you go to the Nobel Prize winning "godfather" of bean-counting. He's about to write 5000, but then spots the "1000" at the top of the page. "I'm probably too high," he says to himself, and writes 3000 instead, thinking himself very clever. And indeed, he is clever! If all that mattered was the accuracy of his guess, then updating on the evidence is smart.

But if you then repeat this process and get a bunch of guesses, then average them, you'll do way worse. You're double-counting evidence! The errors in the first guess get compounded in the second guess, and at that point who wants to go against the expert consensus? People look around them and assume that a fire alarm can't possibly be going off, because not enough people are acting, not realizing that most other people aren't acting for a similar reason.
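To make the double-counting concrete, here is a toy simulation. It is only a sketch under made-up assumptions: the jar size, the Gaussian noise, the crowd size, and the rule that each person averages their private guess 50/50 with the mean of what is already on the page (the same move that turns 5000 into 3000 after seeing 1000) are all hypothetical choices of mine, not measurements of real guessers.

```python
import random
import statistics


def independent_guesses(truth, n, noise, rng):
    """Everyone writes down a private guess: the truth plus independent error."""
    return [truth + rng.gauss(0, noise) for _ in range(n)]


def anchored_guesses(truth, n, noise, rng, weight=0.5):
    """Everyone forms the same kind of private guess, but then shades it
    toward the average of the guesses already on the page."""
    guesses = []
    for _ in range(n):
        private = truth + rng.gauss(0, noise)
        if guesses:
            seen = statistics.fmean(guesses)
            guesses.append(weight * seen + (1 - weight) * private)
        else:
            guesses.append(private)
    return guesses


def mean_abs_error(make_guesses, truth=5000, n=50, noise=2000, trials=2000, seed=0):
    """Average distance between the crowd's mean guess and the truth."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        crowd_mean = statistics.fmean(make_guesses(truth, n, noise, rng))
        errors.append(abs(crowd_mean - truth))
    return statistics.fmean(errors)


if __name__ == "__main__":
    print("independent:", round(mean_abs_error(independent_guesses)))
    print("anchored:   ", round(mean_abs_error(anchored_guesses)))
```

With these made-up parameters, the anchored crowd's average lands farther from the truth than the independent crowd's, even though both crowds start from the same private information. The early guesses get counted again inside every later guess, so their errors stop cancelling.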

These two forms of probability — gearsy inside-views and all-things-considered outside-views — can disagree wildly! When I try to tell a concrete story of the development of superintelligence going well, not fudging or assuming some mysterious breakthrough, I simply fail. In that sense, my P(doom) is 100%. That's the number that I claim you should take from me if you're calculating an average. But I am also a Bayesian, and thus entirely aware that 100% isn't a valid probability. When adopting a stance of humility that keeps my ignorance in mind, my sense of being doomed goes down a lot.^[5]

But both of these numbers are usually a distraction!

When I'm talking to someone about the risk from AI, I don't really want to average my worldview with theirs. What I want to do is learn their insights. Insight can come in the form of evidence, experienced out in the world, or it can come from reasoning and studying the ideas in question. Insight gives an update in the form of a likelihood ratio.

Imagine a person named Gloomer who has a naturally sad disposition, perhaps because of his genes, or maybe because he had a rotten childhood. Gloomer starts with a prior of 99:1 that humanity will go extinct in his lifetime (inside view). Again, this isn't really based on anything. He's just a pessimist. Now suppose that Gloomer learns of an alignment technique that has promise, according to his understanding of the science, but doesn't solve the whole problem. Being the pessimist that he is, he suspects that the rest of the problem probably won't get solved in time. And even if it did, something else would probably get us.

If you ask Gloomer for his P(doom), even if you're very clear to operationalize it in the right way, defining your terms and asking for his inside view, he might say "9:1." Yikes! That's pretty bleak! But now imagine asking "in what ways have you updated about doom?" Gloomer might tell you about becoming less pessimistic, sharing his 11:1 evidence that things are going to be okay!
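Gloomer's arithmetic is just Bayes' rule in odds form: posterior odds equal prior odds times the likelihood ratio. Here is a minimal sketch of it. The 99:1 prior and the 11:1 evidence come from the story above; the second character, who starts at even odds, is a hypothetical I'm adding for contrast.

```python
from fractions import Fraction


def update_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds of doom:okay into a probability of doom."""
    return odds / (1 + odds)


# The 11:1 evidence that things will be okay, written as odds of doom:okay.
alignment_news = Fraction(1, 11)

gloomer = update_odds(Fraction(99, 1), alignment_news)  # prior from the text
evens = update_odds(Fraction(1, 1), alignment_news)     # hypothetical 1:1 prior

if __name__ == "__main__":
    print("Gloomer:", gloomer, "->", odds_to_probability(gloomer))
    print("Evens:  ", evens, "->", odds_to_probability(evens))
```

Gloomer ends at 9:1 (a 9/10 chance of doom), while the hypothetical even-odds person ends at 1:11 (a 1/12 chance). The bottom lines look like they come from different planets, but the transferable part, the 11:1 update, is identical. That's the thing worth asking about.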

In my general experience, conversations about reason and evidence ("Why do you believe what you believe?") rather than bottom-line conclusions ("What do you believe?") get into productive territory much faster, and devolve less often into performative bafflement around the other person's bottom-line.

P(doom) is Fatalistic

What is the probability that you will say "Zimbabwe," out-loud to yourself, in the next 60 seconds?

Probabilistic reasoning is a (mental) tool. And like all tools, it has places where it fits nicely and helps you do work, and has places where, if you insist on applying it, you'll make a mess. In particular, it carries a flavor of looking at things from the outside, like a detached spectator betting on the way a sports match will go. But as the saying goes: There are two ways to be right. You can be right because you're a wizard, and have thus found the secret truths of reality, or you can be right because you're a king, and have decided that things will be the way you say they are.

Some secret observer might do well to speculate about whether you'll say "Zimbabwe" in terms of probability, but you should instead ask yourself "What do I want to say?"

This is especially important in the context of coordination problems and multi-party interactions. For example, if I said to my wife "I think there's an 8% chance we'll get divorced in the next five years," the very act of sharing that prediction might cause that number to go up! She might reasonably interpret it as a sign that I don't believe in the partnership. Then, she might respond with her own prediction of 14%, leading me to update in the same way, in a back-and-forth pattern that ends up in divorce. This cascade of updates is only irrational in that speaking purely in terms of high-level probabilities is the wrong tool for the job of coordination. We could dodge it by getting specific, asking to talk about likelihood ratios and what sorts of things might doom the marriage. Or we could dodge it by asking "Do you also want to be married?", taking the other person's "Yes" as good enough, and moving on with our lives, having jointly made that decision.

When leaders of companies or nations talk about how they don't expect other companies or nations to slow down in the race towards the AGI cliff, consider that this expression of pessimism might be a self-fulfilling prophecy. Those words are a signal to those who are listening that the speaker doesn't expect (and therefore doesn't intend!) to cooperate.

Perversely, the inverse dynamic can also happen! I've met people who say they have a low P(doom) because "Obviously it would be extremely dangerous to charge ahead, even as superintelligence gets closer. Humanity isn't that stupid. We'll slow down and figure things out, and thereby avoid doom."

But, uhh... do you realize that giving a low P(doom) is communicating the exact opposite of your worldview?! The situation resembles Murphy's Law: if you believe that things will go fine, your carelessness may cause things to fail; if you believe that things will go wrong, you will prepare, and by preparing, things will go fine.^[6] The situation isn't paradoxical. You just have to believe that conditional on being cautious, things will turn out okay. Giving a non-conditional probability hides our power and our ability to act with caution, implying that things will either be fine (in which case why bother taking action) or they won't (in which case why bother taking action) or they are up to chance (in which case why bother taking action).
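Here is a small sketch of how a single unconditional number can hide the conditional structure. It uses the law of total probability over a binary "cautious vs. reckless" choice, which is a big simplification, and every number in it is a hypothetical I picked so the arithmetic comes out clean.

```python
from fractions import Fraction


def unconditional_p_doom(p_doom_if_cautious, p_doom_if_reckless, p_cautious):
    """Law of total probability over the choice between caution and recklessness."""
    return p_doom_if_cautious * p_cautious + p_doom_if_reckless * (1 - p_cautious)


def stakes_of_caution(p_doom_if_cautious, p_doom_if_reckless):
    """How much the probability of doom moves depending on what we choose."""
    return p_doom_if_reckless - p_doom_if_cautious


# Hypothetical person A: "we're only safe because we'll slow down."
a = dict(p_doom_if_cautious=Fraction(0), p_doom_if_reckless=Fraction(1, 2))
# Hypothetical person B: "AI is about equally safe whatever we do."
b = dict(p_doom_if_cautious=Fraction(1, 20), p_doom_if_reckless=Fraction(1, 20))

# Both expect humanity to be cautious with probability 9/10 (also hypothetical).
p_a = unconditional_p_doom(**a, p_cautious=Fraction(9, 10))
p_b = unconditional_p_doom(**b, p_cautious=Fraction(9, 10))

if __name__ == "__main__":
    print("A:", p_a, "stakes:", stakes_of_caution(**a))
    print("B:", p_b, "stakes:", stakes_of_caution(**b))
```

Both people report the same 1/20, a "5% P(doom)". But for A the choice to be cautious swings the outcome by 1/2, and for B it swings it by nothing. The unconditional number throws away exactly the part that tells you whether acting matters.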

Some people feel powerless. I get that. They don't feel like they get to decide whether humanity is cautious or reckless. They don't feel like, by vocalizing hope or fear, they are changing the world. And sometimes that's right. It can be useful to quietly ask yourself whether cooperation is likely, or whether humanity will decide to slow down. Not everyone is in a position of power.

But our decisions are usually entangled in more ways than we appreciate. You are part of a culture, a community, and a country. It matters, in aggregate, what people in those groups say and believe and try to accomplish. This aggregate will isn't some mysterious thing, beyond your power. It comes from you, and people like you. Every time you act, you are acting collectively.

At the level of humanity, at the very least, we are in control of Earth. If we decide to work together, we can work together. If we decide not to build machines that make us obsolete, we can (for now) just stop, and choose not to go down that road. Perhaps there are too many fools for things to work out, but assuming that to be the case (or saying things in public that imply that you think it) is itself a kind of foolishness.

Counterarguments

There are people I respect who disagree with me about P(doom) being a bad meme. Zvi Mowshowitz, for example, argued with me at LessOnline that while I'm not wrong about all the ways that it's vague and unsophisticated, it has the virtue of being memetically catchy enough to get people thinking about existential risk at all, whereas the default is that it gets ignored. He followed up by asking whether political prediction markets (ie "P(Trump)") are also bad.

These are strong counterpoints, and I take them seriously, but I'm ultimately convinced by neither.

On the question of memetic fitness, I agree that some people who might otherwise be oblivious, or swept up in some other memetic fad, probably spend more time thinking about the possibility of AI causing a disaster as a result of the meme. This is good. We need more people thinking about and talking about risks.

But it's a false dichotomy to choose between P(doom) and nothing. Even before P(doom), there was the obnoxious "what are your timelines?" meme^[7] and many others before it. After, we had AI 2027, the METR plot, and of course:

Each of these has pros and cons, being variously sophisticated, memetically fit, and useful for having productive conversations. I'm not really saying that any of them are necessarily better than P(doom) (even though they are), but rather that a dumb meme is still a dumb meme even if it's better than nothing. We can, and should, do better!

One example of doing better is prediction markets, which I am broadly in favor of. Talking about probabilities and making bets seems to me to be a vastly superior way to do political forecasting than the raw punditry I grew up with. So what about "P(Trump)" and other such things? Why don't I have the same animosity for prediction markets as I do for P(doom)?

Election markets aren't vague. Most are extremely well operationalized, with clear criteria and timelines. If someone made a market that was similarly vague, and people kept talking about it ad nauseam, I would have similar complaints.
In my view, people who are savvy enough to be thinking about prediction markets (outside of sports betting) are usually savvy enough to understand that something like "P(Trump)" exists in the context of a marketplace, and that knowing how someone bet on that market gives only a very limited handle on their beliefs. If people started implying that people with a high P(Trump) should be labeled as "Trumpers," I would also yell at them.
When people cite prediction markets (or more typically, polling) as a reason not to vote for the best candidate,^[8] I think this is a dumb and bad mistake that is directly analogous to the fatalism surrounding P(doom)! Thankfully, it seems to me that most people understand that their vote matters for deciding elections, even though the market isn't conditional on one course of action or another.

In short, P(Trump) isn't a meme that's rolling around the discourse, causing miscommunication and sloppy thinking. If it was, I'd probably be against it!

A Sense That More Is (Memetically) Possible

P(doom) probably isn't going away, this side of the singularity. It's reached critical mass, and will get parroted by midwits, if nothing else. But you don't have to be a midwit! You can recognize that P(doom) has the flavor of a scissor statement, rather than a nuanced perspective on reality. If someone brings it up in conversation, use it as an opportunity to get into the details of what sorts of futures they're concretely imagining, what they're basing that perspective on, and what sorts of things can be done to steer our fate. Refuse to just give simple numbers until things have been sufficiently operationalized. Take the road of rationality, and choose your memes wisely.

^[1] I gave a mini-version of this rant when I went on his show late last year. While the P(doom) emphasis rubs me the wrong way, I like Doom Debates overall.
^[2] To be fair, some others, including Eliezer, have also complained about it publicly.
^[3] In the hypothetical, we can imagine the ASI is the weakest/stupidest possible AI that has a real shot at takeover. But thanks to it being right on the cusp, its success isn't guaranteed, and we happen to get lucky.
^[4] Perhaps by causing society to collapse to the point where the few stragglers in bunkers can't recover.
^[5] How much? It depends on what "doom" means, of course! 😛
^[6] This idea, and the difference between wizard correctness and kingly correctness, are ideas I picked up from CFAR, with particular thanks to Duncan Sabien. I'm not sure who the first person to develop the ideas was.
^[7] "What are your timelines?" is bad for similar reasons. It's vague about what we're talking about. It encourages focusing on bottom-lines, rather than reason or evidence. It denies agency and our power to choose. And it, in practice, often resulted in point estimates, rather than distributions. Like with P(doom), there's a sophisticated way to do forecasting, which the meme version turns into a cartoon.
^[8] To be clear, I think that some amount of strategic voting is wise. In particular, as long as one is tracking their decision-theoretic reference class and the higher-order effects, I think it's good to try to estimate how the other voters are leaning and occasionally make compromises that look like voting for the lesser evil. I am railing against the braindead "What can one vote do?" and "I live in a state where my vote doesn't matter!" narratives.

Doom Debates host here. I think this post makes the case against "P(Doom)" well, and I'll grant this:

In a hypothetical world where most people agree — as maybe most of the people reading this post do — that the probability of imminent human extinction from artificial superintelligence is something in the ballpark of 10–90%, I would totally share the view that P(Doom) is just a "dumb meme", and that finessing whether the better answer is 30% or 70% would be a waste of a discussion/debate.

But it turns out that a large fraction of people (e.g. >1/3 of Doom Debates guests) don't agree with that 10–90% ballpark range. They either derail on some Bayesian epistemology 101 aspect of the question, or they claim that the probability is some crazy (IMO) value like 0.1%. Furthermore, I find that their policy arguments are often downstream of whether or not their P(Doom) is in the galaxy of saneness.

One striking example is this debate between Max Tegmark and Dean Ball, which IMO became much clearer when they answered that their P(Doom)s are >90% and <0.1%, respectively. AFAICT, the difference in their policy positions is well explained as being downstream of their P(Doom)s.

Asking people their P(Doom) may only yield about 1 bit of useful information — answering in the 10-90% range, vs. expressing a lot of confidence one way or the other — but it's a high-order bit, a starting point for triangulating guests' positions in the high-dimensional AI x-risk belief space, which is the mission of Doom Debates. It's for that reason (and because of the fitness of the meme) that I've made “What's Your P(Doom)” a recurring segment and tagline of the show.

Seconded. On my end I feel like I see two different kinds of people who frustrate me in AI risk conversations:

People who deny the risk exists at all. Since doom is ~zero probability, there's no reason to do anything at all about it, and we can get upset at anyone who talks about it for focusing on sci-fi nonsense and distracting from [whatever minor AI effect happens to annoy them].
People who claim that the risk is so great, and doom so certain absent drastic action, that the question of whether the stuff they are doing is stupid and counterproductive is irrelevant. Since doom is ~certain, there's no way their genius plan could make things worse - the default state is guaranteed extinction, and literally anything is an improvement!

And I feel like P(doom) does a good job of capturing the thing I care about, even if it misses nuances.

They either derail on some Bayesian epistemology 101 aspect of the question, or they claim that the probability is some crazy (IMO) value like 0.1%. Furthermore, I find that their policy arguments are often downstream of whether or not their P(Doom) is in the galaxy of saneness.

The property of "sanity" is not a property of beliefs, but of belief-formation processes.

P(doom) is Hopelessly Vague

I don't think it's more vague than other terms people routinely use. Sometimes it's helpful to be more precise, but not everything has to be academic philosophy. My P(doom) = 50% might not mean the same thing as your P(doom) = 50%, but that's still an alarming number that means something. "I'm worried about AI risk" is not any less vague than "P(doom) = 50%".

P(doom) is Fatalistic

This section seems to be making two claims, both of which seem wrong to me:

If your actions can change the probability of an outcome, then you can't meaningfully have a belief about the probability.
A pragmatic stance on beliefs, along the lines of, "Believing in P(doom) might increase extinction risk, so you shouldn't believe in it."

RE #1: I do actually have a belief about the probability that AI kills everyone. The fact that I can (slightly) change the outcome via my actions is already accounted for in my belief. Like I can meaningfully say that there's a near-0% chance I will say "Zimbabwe" in the next 60 seconds, because I'm typing on the computer right now and nobody else is in the room, so I have no reason to speak aloud, and I don't plan on speaking.

RE #2: This feels a bit like Pascal's Wager, in the sense that you can't just decide to believe in God as a bet. You can't actually control your beliefs; God knows you're only pretending to believe in Them so you'll get into heaven.

Perhaps I should have been more clear about why vagueness is bad. It's fine for statements to be vague. What is bad is when something is ambiguous and people don't realize it's ambiguous. They treat "P(doom) = 40%" as a clear statement, where they would (hopefully!) recognize that they don't really understand the perspective of someone who says "I'm worried about AI risk."

You definitely can have a probability on an outcome you have significant control over. When you predict, you are choosing. 0% chance of "Zimbabwe" is downstream of making a choice, and once the choice is made, it's fine to reflect on the chances. But note that if you aren't paying attention to your power, making a forecast can mask the fact that you have power.

On a related note, while you can't choose your beliefs directly, you can choose your actions, and thereby choose your beliefs about what you will do, and what will follow. Insofar as you choose your actions based on your beliefs (which you shouldn't do unless you first condition on various choices of action!), then your beliefs about the future will have multiple fixed-points, dependent on your choices. See: https://www.lesswrong.com/posts/SwcyMEgLyd4C3Dern/the-parable-of-predict-o-matic

"I'm worried about AI risk" is not any less vague than "P(doom) = 50%".

I think this misses the point. "I am worried about AI risk" doesn't seem to give itself any undue weight: it doesn't feign being more confident or certain of what the risk is.

If your actions can change the probability of an outcome, then you can't meaningfully have a belief about the probability.

I think the problem the post is getting at is that people's idea of P(doom) is often really P(doom|inaction), not a cumulative estimate of probability over the full set of probable actions and all interactions. It is frequently given implicitly assuming no interactions with people's discussions or reactions to risk.

I think the problem the post is getting at is that people's idea of P(doom) is often really P(doom|inaction), not a cumulative estimate of probability over the full set of probable actions and all interactions.

I don't think it matters for almost anyone? Like the difference between my best-case future choices and worst-case future choices probably changes P(doom) by less than 0.01 percentage points.

I think it matters for a lot of people. A lot of people would say the odds are substantially lower given international action.

Yeah I would also say that, but if I put in the maximum possible effort into causing vs. preventing international action, the change in probability of whether international action happens is pretty small. So my P(doom) given my own actions being maximally good is approximately identical to my P(doom) given my own actions being maximally bad.

P(doom) is Hopelessly Vague

So just define it when you ask the question or answer it. I just say something like 'My P(doom) as in the chance that all of us die within a generation after ASI is X%, but there's a bunch of more complex negative scenarios that are also likely'. You are rarely confined to just blindly giving a number without explanation in real conversations.

All these fights over definitions are so trite and easily dealt with. There's a sequence about it and everything.

I have mixed feelings about this post.

I found it frustrating to argue about a well-defined version of p(doom) at the X-Risk Persuasion Tournament, but it was fairly helpful at informing me that the main source of disagreement was about whether AI would be transformative this century.

>When I try to tell a concrete story of the development of superintelligence going well, not fudging or assuming some mysterious breakthrough, I simply fail. In that sense, my P(doom) is 100%. That's the number that I claim you should take from me if you're calculating an average.

This seems to imply that an outside view is just an aggregate of other people's views. That is fairly wrong for how I create my outside views. I make estimates of which reference classes are most relevant. How much should I weigh the example of humans meeting neanderthals? How much should I weigh the example of the first governments to attempt to train a standing army not to stage a coup? How much damage has been caused in the past by people similar to Altman, Musk and Trump?

Those estimates that go into my outside view may be partly double-counting evidence if others update on them, but partly reflect my independent thought, and my reading of books on intelligence that are not yet reflected in others' opinions.

I agree that many things that "outside view" might mean include things that aren't aggregating the perspectives of others, and I'm being a bit sloppy in the main text by implying that this is the main thing that an outside view involves. But I do think that it's pretty important to distinguish views that stem from different sorts of mental stances so that one can avoid double-counting where possible.

Talking to people in person, I've tried to push back on exchanging p(doom)'s on the basis of "it's not sensible to talk about a probability that depends on one's own actions". And then in reply I got hit with "you really think that you personally can influence the chance that the world ends by more than a couple percent?" And I have to admit that's a pretty good reply. I think the best one can do in response is gesture in the general direction of logical correlations between my actions and those of many other people.

Almost all physically instantiated algorithms aren't logically correlated to any other physically instantiated algorithms. Plausibly logical correlation is also very rare among all algorithms period.

If you think of something like a phylogenetic tree of physical instantiations of algorithms, descending with modification, then surely algorithms on the same tree are somewhat logically correlated, and generally, the closer they are on this tree, the more logically correlated they are. It seems clear to me that a vast majority of physically instantiated algorithms^[1] originate in this way, be it Darwinian-evolved or designed by intelligent creators (who are algorithms themselves^[2]).

You also have the possibility of the analogue of horizontal gene transfer via communication between algorithms, which strengthens the effect.

^[1] At least those that it makes sense to model as algorithms, i.e., modeling them as such allows you to derive some valuable implications running along the seams of reality or something.
^[2] Albeit in this case it's less of a descent with modification thing and more of a being spawned by a very similar process thing.

Interesting! Maybe this changes my mind. For designed algorithms this seems more applicable than for evolved/trained ones, since in my vague current model logical correlation drops off quickly with difference in the causal structure of an algorithm. Perhaps for two non-selected algorithms: with every bit of different source code the logical correlation drops off exponentially quickly. (This doesn't apply to algorithms that try to be logically correlated, but that is still relatively rare.)

And for the effect to matter enough the utility gain has to outweigh the exponential drop-off.

I honestly think it's even worse than this, conditional on the AI doom cases being even remotely real, because they're substantially correlated with AI being a lot less legible and visible to the public because of internal deployment mattering way more, and it's likely that conditional on AI progress accelerating in the next few years as much as certain people think, the number of people who get to control the probabilities of AI doom goes down a lot, even in less extreme cases like no nationalization/heavy handed government involvement.

This is perhaps not an optimistic view, but it is freeing, as it means basically no-one should care about whether we all die from AI if AI doom has a significant probability (because no control/useful actions + negatives of rumination mean it's -EV to care about AI doom.)

Yeah. I think this is a CDT fallacy. We can influence the chance that the world ends quite a lot. "We" is made up of a bunch of "me". Therefore, yes, I do think I can make a difference (as part of my decision group).

I would suggest asking if they vote, but a lot of people vote for non-FDT reasons (eg high expected impact even if you only have a 1/million chance of changing things).

P(doom) is Fatalistic

I want to push back on this section per this semi-related tweet I wrote recently:

I don’t like that “optimistic” conflates an epistemic / forecasting thing with a decision-making thing. A superforecaster will be simultaneously “optimistic” about things that will probably work out well and “pessimistic” about things that will probably not. And good for them, that’s exactly how they should be.

Of course, that’s assuming they’re in the role of a passive observer. Separately, active participants in events should energetically try to make things go well.

I think good leaders have both wolves inside them: given a plan, they are brutally honest and clear-eyed about how likely it is for the plan to work vs not work. But they also don’t give up searching for better plans prematurely. Hence CEOs are praised for “pivoting” away from unpromising plans etc.

…Of course, I have an axe to grind here. I think we should be “pessimistic” about the AI risk situation, in the epistemic / forecasting sense, i.e. in the sense that I see a lot of problems that seem fatal and (based on historical analogies and other arguments) seem very likely to bite us. Thus, I usually describe myself as a “pessimist” about future AI, and I don’t like when people talk as if that’s my character flaw rather than my considered opinion. (Not you in particular, I see it a lot.) Of course, I am energetically trying to make things better.

By analogy, if I’m captain of a high school football team, playing an exhibition game against top professionals, then I can and should energetically try to win, and brainstorm out-of-the-box strategies, etc. But after doing that, if I “optimistically” expect to win, then that would be moronic not praiseworthy.

Anyway, I concede that normal people have a strong tendency to conflate the epistemic / forecasting thing with the decision-making / actions / coordination thing. And therefore, maybe it’s bad public messaging to talk about the epistemic / forecasting thing at all.

But at the same time, if people are conflating those things, that’s dumb, and they shouldn’t do that, and we should criticize them for doing that.

If you only have five words, then yes we should be mindful of areas where people have preexisting confusions, and dance around those confusions. That’s fine. But we don’t want to take that too far into dumbing down our communications in non-“only five words” situations, much less dumbing down our own thinking.

I'm surprised to hear that there are (apparently?) many people exchanging their subjective probabilities of AI doom without that just being the top-level intro to a much more involved discussion. I feel like it's obvious that p(doom) is a flattening of all of one's thinking on AI risk into a single metric, and all the interesting differences of viewpoint are in the details, so of course I wouldn't try to compare numbers in a vacuum. Most people won't answer without first providing at least a timeline and a couple conditionals anyway.

P(doom) might be net negative as a meme -- I'm not sure, so I generally don't use that specific phrase these days, especially with people who don't already know it.

I'm wrestling through the idea now, though, that it may not be important for me to ever consciously consider the value of my all-things-considered p(doom), so long as it's high enough that I find it unacceptable to just let other people handle it. Over time, I have found myself motivated by the dire situation, and I have found myself motivated by rays of hope, but neither change what I ought to do. So why think about it?

That said, I am merely human. Worse yet, I'm a forecaster. I feel compelled to track that target. Watching my p(doom) go from 70% to 80% made me think extra hard about whether I was doing the most helpful things I could. Watching it go from 80% to 60% made me more confident that my chosen actions can have their intended result. Or is it the other way around? Oh well. Number shiny.

I agree, but from a different angle.

There are several ways in which communicating probabilities offers useful information, and I don't see p(doom) in any of them.

For one, there are true probabilities, such as the probability of a radioactive atom decaying by its half-life. This satisfies the frequentist interpretation, where one could re-run the radioactive decay experiment indefinitely and observe the true probabilities over time. This does not apply to p(doom); it is unclear how quantum uncertainty could lead to noticeably different outcomes were we to re-run the AI doom experiment. Instead, we lack the information to correctly predict the future.
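To make the frequentist picture concrete, here is a minimal sketch of "re-running the experiment" for radioactive decay. It assumes ideal exponential decay; the function names and the atom count are illustrative choices, not anything from the original comment.

```python
import random


def decay_probability(elapsed: float, half_life: float) -> float:
    """Probability that a single atom has decayed after `elapsed` time,
    assuming ideal exponential decay."""
    return 1.0 - 0.5 ** (elapsed / half_life)


def simulate_decay_fraction(n_atoms: int, elapsed: float,
                            half_life: float, seed: int = 0) -> float:
    """Run the decay 'experiment' on n_atoms independent atoms and
    return the observed fraction that decayed."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    p = decay_probability(elapsed, half_life)
    decayed = sum(1 for _ in range(n_atoms) if rng.random() < p)
    return decayed / n_atoms


if __name__ == "__main__":
    # After exactly one half-life the true probability is 1/2, and the
    # observed frequency approaches it as the number of trials grows.
    print(decay_probability(1.0, 1.0))
    print(simulate_decay_fraction(100_000, 1.0, 1.0))
```

The observed frequency converges on the true probability only because the trial can be repeated under identical conditions, which is exactly what is missing in the AI case.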

This does not mean probabilities outside of quantum mechanics are useless. We can construct probabilistic models, such as how Nate Silver, an election forecaster, might add uncertainty to polls and simulate election outcomes. The issue is on what basis to construct a probabilistic model of AI doom. Probabilistic forecasters use base rates and update on event-specific information. What are the base rates for an unprecedented, poorly understood, civilization-altering technology causing doom?

Finally, most people do not specify their p(doom) along with a probabilistic model. This is therefore an implicit appeal to authority. The ability to trust authorities depends on the subject. In AI, predictions on timelines and methods have historically been incorrect. It would be unusual if despite experts being unable to predict the how and when of AI, they could predict the impacts.

My suggestion for AI safety advocates is to embrace the unpredictability of the situation. If we really don't understand AI that well, we also don't know what its impacts will be and in particular, if it will be safe. Given that historically the approach with technology has been to push forward and retroactively assess the negative impacts, we run the serious risk of being insufficiently cautious.

I agree with all these points.

To be honest, the biggest reason I don't like p(doom) is that it makes a meme out of a very serious thing. We are talking about humanity (and likely all other animals) being extinguished. If people were talking about p(ethnic group x will be put in gas chambers in the next 3 years) and making a meme out of it, I think it would rub many people the wrong way.

The best argument for it, which keeps me from being confidently and wholly opposed to it, goes like this:

A lot of really smart people, in the AI safety weeds, you talk to them, and they're just very chipper, they never mention how perilous a position we're in. Your inference is they're pretty optimistic. But actually they (sometimes) think we're most likely completely screwed in the you and your children will die sense.

And you can ask them how high they think the risk is, and they say "high". But you ask a climate scientist what the risk of climate change is, and they say "high".

You need to be pretty specific to actually get people to say what they really think, in a way that makes the gravity of the situation clear. It's not information they like volunteering on their own, because they have likely said it at least a dozen times before and don't want to sound like a propagandist, or just don't like repeating themselves, or don't think it's that interesting compared to their more in-the-weeds work.

And asking their p(doom) usually does get that information, even if their answer starts with "I actually don't like people talking about p(doom)..."

I hate talking about "P(doom)". When it first started showing up in the wake of ChatGPT

This got me curious about when the term originated, because I thought I remembered it popping up in the late 2021 MIRI conversations. I found that Rohin Shah indeed used it in one of those, but I'm not sure whether he coined it there and then or whether it was already floating around.

This wiktionary page has an uncited claim that it originated on LessWrong in 2010. I managed to find this -4 karma comment from 2010 by timtyler (on Intellectual Hipsters and Meta-Contrarianism), which says, in the context of arguing that MIRI have an incentive to exaggerate the risks from AI in order to get funding:

Anyway, the basic point is that if you are interested in DOOM, or p(DOOM), consulting a DOOM-mongering organisation, that wants your dollars to help them SAVE THE WORLD may not be your best move.

I couldn't find any uses of the term in between those two, so I'm guessing that timtyler's use was independent of Rohin's, and either Rohin coined it during that dialogue or people had started using it in private discussion shortly before then. I'm in the market for people with better Google-Fu to prove me wrong about this though.

Edit: turns out this is already documented: Tim Tyler claims to have started using it in about 2009. I'm still curious about the path between 09/10 and Rohin in 2021.

Thanks. I'll update my language to be more clear.

If a person summarizes their degree of concern about ASI some other way, like "I'm somewhat concerned about AI", or "I take the risk of catastrophe due to ASI pretty seriously", wouldn't those also implicitly communicate a probability estimate, only with even more confusing ambiguity? In fact, couldn't these arguments against "p(doom)" -- ambiguity, encouraging double-counting of updates, and fatalism -- also apply to a lot of other ordinary statements like "I'm worried about climate change" or "I think China might invade Taiwan"?

Given that very short summaries of how much we expect something to happen are important in ordinary communication -- where an opportunity to express these beliefs will more often be a one-sentence off-topic aside in an unrelated conversation than a chance to give a long, nuanced explanation -- what sort of very short expression of credence in ASI catastrophe would you recommend over "p(doom)"?

Granted, using vague words like "might" or "worried about" rather than giving a number can help avoid accusations of claiming unwarranted precision by people who still think of probabilities in terms of frequentism instead of Bayesian priors and updates. But if you know the person you're talking with isn't likely to be confused in that way, doesn't it make sense to just state our credence plainly rather than retreating to defensive vagueness or refusing to express our beliefs until we have a captive audience for a long explanation?

I think vague statements are fine. There's only so much communication bandwidth, after all. But I don't think "I take the risk of catastrophe due to ASI pretty seriously" is encouraging miscommunication with its vagueness. The danger in "P(doom)" comes from different parties having a different sense of what's being discussed and not realizing that they're talking past each other.

I do think people should be wary about double-counting evidence in general. I try, for example, to distinguish between how things seem from my perspective, from whether I think something is worth taking seriously. For example, building datacenters in space, from my perspective, seems idiotic. But others seem to take it seriously, so I wouldn't bet hard against it actually being smart.

And I think that fatalism is sometimes an issue with how people talk about the future, outside of the meme/ASI space. I do not think "I'm worried about climate change" is at all bad -- seems like a reasonable thing for someone to say! But I do think "there's a 50% chance that the temperature will rise by 2 degrees this century" is problematically fatalist, and it should be amended with a conditional (eg "If we keep going down this path, there's a 50%...").

I don't claim to have a good counter-meme (if I did I would've included it), but questions like "How worried are you that superintelligent AI could wipe out human civilization?" or "Where do you stand on existential risks from AI?" seem pretty good. They don't suggest people give a number, which is part of the point. Numbers are good when there's enough specificity to understand what it is that they're measuring. If you and the other person are clearly on the same page about what a number means, and communicating that number won't be taken as a vote of no-confidence or otherwise damage an effort to coordinate, then by all means, share the number. My issue here is not with numbers, but with numbers poorly used.

I disagree for this reason:

It is very difficult to accurately predict certain types of extremely complex events (particularly those past a singularity point) -- to the point that it's not worth trying to create an extremely accurate prediction using a complex model (factors you underweighted or did not consider could blow out the other factors)
It is still very useful to prepare for and assume that there is some likelihood of these hard to predict events happening

That's the whole argument. I see this in business too: you don't have enough information about a risk -- and it is impossible to gather enough information (traded for time, opportunity cost, etc.)

But you still have to act on that possibility; it's still rational to have a contingency.

Doom Debates host here. I think this post makes the case against "P(Doom)" well, and I'll grant this:

Seconded. On my end I feel like I see two different kinds of people who frustrate me in AI risk conversations:

People who deny the risk exists at all. Since doom is ~zero probability, there's no reason to do anything at all about it, and we can get upset at anyone who talks about it for focusing on sci-fi nonsense and distracting from [whatever minor AI effect happens to annoy them].
People who claim that the risk is so great, and doom so certain absent drastic action, that the question of whether the stuff they are doing is stupid and counterproductive is irrelevant. Since doom is ~certain, there's no way their genius plan could make things worse - the default state is guaranteed extinction, and literally anything is an improvement!

And I feel like P(doom) does a good job of capturing the thing I care about, even if it misses nuances.

They either derail on some Bayesian epistemology 101 aspect of the question, or they claim that the probability is some crazy (IMO) value like 0.1%. Furthermore, I find that their policy arguments are often downstream of whether or not their P(Doom) is in the galaxy of saneness.

The property of "sanity" is not a property of beliefs, but of belief-formation processes.

P(doom) is Hopelessly Vague

P(doom) is Fatalistic

This section seems to be making two claims, both of which seem wrong to me:

If your actions can change the probability of an outcome, then you can't meaningfully have a belief about the probability.
A pragmatic stance on beliefs, along the lines of, "Believing in P(doom) might increase extinction risk, so you shouldn't believe in it."

"I'm worried about AI risk" is not any less vague than "P(doom) = 50%".

I think this misses the point. "I am worried about AI risk" doesn't seem to give itself any undue weight: it doesn't feign being more confident or certain of what the risk is.

If your actions can change the probability of an outcome, then you can't meaningfully have a belief about the probability.

I think the problem being pointed at is that people's idea of P(doom) is often really P(doom|inaction), not a cumulative estimate of probability over the full set of probable actions and all interactions.
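The gap between those two quantities can be sketched with the law of total probability. This is only an illustration: the two-scenario split and every number below are made up for the example and are not anyone's actual estimates.

```python
def marginal_probability(scenarios):
    """Combine conditional estimates into one all-things-considered number.

    `scenarios` is a list of (p_scenario, p_outcome_given_scenario) pairs,
    where the scenario probabilities must sum to 1.
    """
    total_weight = sum(p for p, _ in scenarios)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(p * q for p, q in scenarios)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the arithmetic:
    # a 50% chance of no coordinated response, with P(doom | inaction) = 0.6,
    # and a 50% chance of a coordinated response, with P(doom | action) = 0.2.
    scenarios = [(0.5, 0.6), (0.5, 0.2)]
    print(marginal_probability(scenarios))  # roughly 0.4, well below 0.6
```

Two people who agree on every conditional estimate can still report different headline numbers if one quotes the conditional on inaction and the other quotes the marginal.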

I don't think it matters for almost anyone? Like the difference between my best-case future choices and worst-case future choices probably changes P(doom) by less than 0.01 percentage points.

I think it matters for a lot of people. A lot of people would say the odds are substantially lower given international action.

P(doom) is Hopelessly Vague

I have mixed feelings about this post.

Almost all physically instantiated algorithms aren't logically correlated to any other physically instantiated algorithms. Plausibly logical correlation is also very rare among all algorithms period.

You also have the possibility of the analogue of horizontal gene transfer via communication between algorithms, which strengthens the effect.

At least those which it makes sense to model as algorithms, i.e., modeling them as such allows you to derive some valuable implications running along the seams of reality or something.

Albeit in this case it's less of a descent with modification thing and more of a being spawned by a very similar process thing.

And for the effect to matter enough the utility gain has to outweigh the exponential drop-off.
