We made a map of the doom debate. Here's how the breakdown works.

Sean Herrington; Khai Tran; David Bravo; keivnc; Paul Hindoian; Josh Tuffy; mikaelacankosyan; Christopher A. Davis; Maryam Hampaei

This is the second post in the series "We made a map of the Doom Debate", created as part of the AI Safety Camp 2026 "Assumptions of the Doom Debate" project, led by Sean Herrington. The previous post introduced our tree model for mapping existential risk probability. This post goes into detail on each node of the tree. The tree can be found here, and the previous post can be found here.

TL;DR:

- The tree consists of independent nodes, with each node being mutually exclusive with all other nodes on its level.
- Each node concerns a different branch of scenarios for doom.
- Each node consists of the likelihood of its associated scenario happening, and the likelihood of doom given that the associated scenario happens.
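
Read together, these points amount to the law of total probability applied at each level of the tree. As a sketch, writing $S_i$ for the mutually exclusive scenarios on one level (a label introduced here for illustration, not notation from the website):

```latex
P(\text{doom}) = \sum_{i} P(S_i)\, P(\text{doom} \mid S_i),
\qquad \sum_{i} P(S_i) = 1
```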

This post discusses the branches of the tree in detail. Before we discuss the different branches and scenarios, it is important to keep a few assumptions in mind:

- Branches of the tree are independent and mutually exclusive - that is, the branches represent differing and non-overlapping scenarios for existential catastrophe (x-risk). An increase in the probability of one branch (represented within our model and our website demonstration as found in the first post) necessarily means a decrease in the probability of other branches at the same level. A 100% probability for one branch necessarily means a 0% probability for all other branches at the same level.
- Probabilities are complete - that is, the probabilities of all branches under the same node add up to 100%, given the assumptions of that node and its parents.
- The following discussion assumes that the danger posed by AI is existential, and as such, we discuss these scenarios with doom or existential danger as the end result. The scenarios should not change much if a different end result is considered.
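
To make the first two assumptions concrete, here is a minimal Python sketch. The function names and the proportional rescaling of siblings are assumptions made for illustration; we are not claiming this is how the website demonstration updates its sliders. The numbers are placeholders, not estimates.

```python
from math import isclose
from typing import Dict


def check_complete(branches: Dict[str, float], tol: float = 1e-9) -> None:
    """Raise ValueError unless sibling probabilities are valid and sum to 1."""
    for name, p in branches.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name}: probability {p} is outside [0, 1]")
    total = sum(branches.values())
    if not isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"sibling probabilities sum to {total}, expected 1")


def set_branch(branches: Dict[str, float], name: str, new_p: float) -> Dict[str, float]:
    """Set one branch to new_p and rescale its siblings so the level still sums to 1.

    Proportional rescaling is an assumption of this sketch.
    """
    if name not in branches:
        raise KeyError(name)
    others = {k: v for k, v in branches.items() if k != name}
    remaining = 1.0 - new_p
    old_total = sum(others.values())
    updated = {}
    for k, v in branches.items():
        if k == name:
            updated[k] = new_p
        elif old_total > 0.0:
            updated[k] = v * remaining / old_total
        else:
            # Siblings had no probability mass: share the remainder equally.
            updated[k] = remaining / len(others)
    check_complete(updated)
    return updated


level = {"non_ai": 0.3, "ai_driven": 0.7}  # placeholder numbers
print(set_branch(level, "ai_driven", 1.0))  # {'non_ai': 0.0, 'ai_driven': 1.0}
```

Raising one branch to 100% forces every sibling to 0%, which is exactly the behaviour described in the first assumption.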

The following is a consideration of all the nodes and subnodes in a top-to-bottom manner. For ease of presentation, we will follow a slightly different order from the tree found on our website demonstration.

Each node will be presented in its own section, where we will present the scenarios considered by the node in order to determine the probability of such scenarios happening.

We believe that the potential scenarios for x-risk within the timeframe T of the model follow these pathways:

I. NON-AI PATHWAY

The Non-AI Pathway constitutes the path of the "non-believer" - those who ascribe at least a chance to the scenario where AI's impact is neutral, or even positive, in terms of averting existential danger.

- AI doesn't make existential catastrophe more likely: We first consider the scenario where AI actually decreases the probability of existential catastrophe, which could take the form of perfectly aligned AI helping to decrease other x-risks, or could come about through other means. For example, a scenario where aligned AI systems help solve global warming or achieve world peace through cooperation with humans would fall under this pathway.

This pathway also considers scenarios where x-risk is not affected by the advent of capable AI systems. An intuitive (though admittedly incomplete) way to think about this is to consider more "traditional" existential crisis scenarios, such as mutually assured destruction arising from nuclear weapons, lethal disease pandemics, or large-scale natural disasters. Any potential x-risk in this case would come entirely from the risk of these other crisis scenarios.

II. AI-DRIVEN PATHWAY

The AI-Driven Pathway, on the other hand, considers the scenario where the existence of AI systems changes the cumulative P(doom) of all potential doom scenarios. By considering this as a total possibility, we encompass scenarios that include entirely novel AI-driven existential catastrophe scenarios (human enfeeblement through vastly superintelligent AI systems being one of them) and scenarios where AI accelerates existing existential catastrophe scenarios (AI-created lethal bioweapons).

- AI makes existential catastrophe more likely: This is the scenario where the existence of AI systems increases the chance of an existential catastrophe happening. That is, within the weighted consequence space of a world where AI systems exist, the consequences that constitute an existential catastrophe carry a larger share of the total weight than they would in a world without AI systems.
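
One way to write this condition down, as a sketch: let $C$ be the set of possible consequences, $C_{\text{cat}} \subseteq C$ the consequences that constitute an existential catastrophe, and $w_{\text{AI}}$ and $w_{\text{no AI}}$ the weights assigned to consequences in worlds with and without AI systems. These symbols are introduced here for illustration only.

```latex
\frac{\sum_{c \in C_{\text{cat}}} w_{\text{AI}}(c)}{\sum_{c \in C} w_{\text{AI}}(c)}
\;>\;
\frac{\sum_{c \in C_{\text{cat}}} w_{\text{no AI}}(c)}{\sum_{c \in C} w_{\text{no AI}}(c)}
```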

The AI-Driven Pathway is divided into subnodes:

II.1. Multipolar AI pathway

The Multipolar AI pathway considers the possibility that no single AI system is more impactful on the probability of existential catastrophe than all other systems combined. This pathway encompasses scenarios that include competitive multipolar AI doom (AI systems engaging in territorial-type opposition), cooperative multipolar AI doom (multiple sub-critical AI systems conspiring with each other against humanity), or multipolar-based non-AI doom (paranoia similar to the Red Scare around multiple nations possessing capable AI systems).

- Danger comes from multiple AIs: This probability considers the situation in which an increase in P(existential catastrophe) stems from the introduction and creation of multiple AI models. It considers AI-to-AI dynamics and/or human reactions to AI-to-AI dynamics. Notably, it does not include situations where only one of those multiple systems has what is needed to increase x-risk all by itself – that is the purview of the Single dominant AI pathway. As such, the scenarios here can be thought of in two ways: a "critical mass standoff" situation where systems have to work together to be able to affect the chance of doom, and a "critical masses" situation where systems do not have to work together, but no system has the capacity to overcome all other systems combined.
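
The dividing line between this node and the Single dominant AI pathway can be sketched as a simple comparison. The scalar "impact" score per system is a hypothetical simplification introduced here; the model itself does not define such a number.

```python
from typing import Dict


def classify_polarity(impacts: Dict[str, float]) -> str:
    """Return 'single dominant' if one system outweighs all others combined, else 'multipolar'.

    `impacts` maps a system name to a hypothetical non-negative impact score.
    """
    if not impacts:
        raise ValueError("at least one AI system is required")
    if any(v < 0 for v in impacts.values()):
        raise ValueError("impact scores must be non-negative")
    total = sum(impacts.values())
    top = max(impacts.values())
    # A tie (top == everyone else combined) is not "more impactful", so it stays multipolar.
    return "single dominant" if top > total - top else "multipolar"


print(classify_polarity({"system_a": 3, "system_b": 1, "system_c": 1}))
print(classify_polarity({"system_a": 2, "system_b": 1, "system_c": 1}))
```

Under this reading, a world with only one capable system counts as single dominant, since there are no other systems for it to outweigh.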

II.2. Single dominant AI pathway

The Single dominant AI pathway is concerned with what is possibly the most commonly imagined scenario of AI-related catastrophe: the possibility of an increase in doom coming from the creation and introduction of one single AI system to the environment. The ideas represented by these models - the degree of "conscious" decision making that a dominant AI system conducts, and the degree to which it impacts the world - are presented as part of this node's children.

Similar to the Multipolar AI node presented in II.1., this pathway concerns both direct doom from the AI system (AI domination over human decision making) and indirect doom from one AI system (placing dangerous weapons in the hands of one rogue nation).

- Danger comes from a single dominant AI: This probability considers having one AI system as the main source of impact for the increase in x-risk. This could be either through being the only capable system built (as part of a "freeze too late" scenario or a deceptive super AI scenario) or being the only system with capabilities AND the ability to neutralize the impact of all other AI systems (rogue security AI).

We can further delineate the different scenarios within this pathway as follows:

II.2.a. Internal model pathway

This branch concerns the scenario in which the AI system in question "knows" what it is doing.

For our purposes, an AI system is said to have an internal model of doom if it expects to change the probability of doom through its actions. In order to do so, it must satisfy the following conditions:

1. It must be accurately aware of the impact space of any potential action it takes. This means having a reasonably defined and considered set of consequences that could happen as the result of any given action performed by the system.
2. It must be accurately aware of the impact spaces that constitute an existential catastrophe.
3. It must be accurately aware of whether and when the impact space of the actions it could take contains those that constitute an existential catastrophe.

Additionally, there is a fourth condition that defines the difference between the subnodes of this scenario:

4. It must be accurately aware of if and when the impact of the actions it chooses to take causes a change in the probability of doom.

In other words, an AI system that has an internal model of doom must "know" the consequences of its action, "know" what leads to the world blowing up, and "know" that what it does might blow up the world.
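
The way the four conditions sort a system into the subnodes of the Single dominant AI pathway can be sketched as follows. The class and field names are hypothetical labels for the conditions above, and treating each condition as a plain yes/no is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Awareness:
    knows_action_impacts: bool        # condition 1: impact space of its possible actions
    knows_catastrophic_impacts: bool  # condition 2: which impact spaces are catastrophic
    knows_overlap: bool               # condition 3: whether and when the two overlap
    knows_chosen_action_effect: bool  # condition 4: effect of the actions it actually chooses


def classify(a: Awareness) -> str:
    """Map the four conditions to a subnode of the Single dominant AI pathway."""
    has_internal_model = (
        a.knows_action_impacts and a.knows_catastrophic_impacts and a.knows_overlap
    )
    if not has_internal_model:
        return "no internal model"
    if a.knows_chosen_action_effect:
        return "internal model, expects existential catastrophe"
    return "internal model, does not expect existential catastrophe"


print(classify(Awareness(True, True, True, False)))
```

Note that the fourth condition only matters once the first three hold: a system missing any of them lands in the no internal model branch regardless.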

- The AI has an internal model of existential catastrophe: This probability considers having an AI system that satisfies the conditions outlined above, as well as the capabilities required to both affect doom and become the dominant system. A system that follows this pathway does not need to be malicious/misaligned or working against human interests - it can potentially take a decision that benefits humanity, at a 5% chance of existential catastrophe, because it (rightly or wrongly) believes that it is the right decision for humanity to take.

II.2.a.a. AI expects existential catastrophe pathway

As stated in section II.2.a., this pathway consists of scenarios with systems that satisfy the first three conditions. AI models that "expect" existential catastrophe, according to our model, are models that also satisfy the fourth condition. Models in this category should have an "understanding" (or any machinic equivalent to an "understanding") of their potential impact on x-risk and conduct their actions with that in mind.

- The AI expects existential catastrophe: This probability considers having an AI system that satisfies the conditions outlined in II.2.a., as well as an active and reasonably accurate model of how its actions affect existential catastrophe. This probability covers scenarios such as when the AI's utility function actively favors a world where humanity does not exist (e.g., resource acquisition, self-preservation against human shut-off attempts), when the AI played along to secure deployment and executes a catastrophic action once it achieves decisive strategic advantage, or when the AI decides to enact a policy that benefits its goals but comes with a 99% chance of destroying humanity, viewing the existential risk as an acceptable externality.

II.2.a.b. AI does not expect existential catastrophe pathway

This branch considers scenarios with systems that satisfy conditions 1-3 outlined in II.2.a., but not condition 4. The AI system in this scenario possesses an internal model of doom (it intellectually understands the concept of existential catastrophe and how to avoid it), but its forward-simulation of its specific chosen action fails to flag the danger.

- The AI does not expect existential catastrophe: This probability considers having an AI system that satisfies the conditions outlined in II.2.a., but without an active or reasonably accurate model of how its actions affect existential catastrophe. Scenarios falling under this umbrella may include the AI taking an action that seems safe in isolation but triggers a butterfly effect that escapes its predictive horizon, or an out-of-distribution scenario where actions that were safe in a training space turn fatal in a real deployment.

II.2.b. No internal model pathway

In contrast to the internal model pathway, the no internal model pathway concerns the potential of a system that "doesn't know" what it's doing to increase x-risk. One potential scenario in this pathway is similar to an oblivious parent that gives you the wrong medicine - they have the power to impact your safety and survival, without the appropriate knowledge to make the right choices towards increasing the probability of such survival.

- The AI has no internal model of existential catastrophe: This probability considers having an AI system that has the capabilities required to both affect doom and become the dominant system but lacks the optimisation power to internally construct how a course of action (or set of courses of action) can lead to an increase in P(existential catastrophe). This could be due to fundamental limitations in model construction (insufficient computational and inference capabilities) or a safeguard learned through effective but incomplete training.
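
Putting the nodes together, the whole tree can be sketched as a nested structure in which each node stores its probability given its parent and each leaf stores P(doom) given that leaf. This is a minimal sketch, not the code behind the website demonstration: the `Node` class and `doom_given` function are hypothetical, and every number below is an arbitrary placeholder chosen to make the arithmetic easy to follow, not an estimate.

```python
from dataclasses import dataclass, field
from math import isclose
from typing import List, Optional


@dataclass
class Node:
    name: str
    p: float                        # P(this branch | parent branch)
    p_doom: Optional[float] = None  # P(doom | this branch); set on leaves only
    children: List["Node"] = field(default_factory=list)


def doom_given(node: Node) -> float:
    """P(doom | node), using the law of total probability over the node's children."""
    if not node.children:
        if node.p_doom is None:
            raise ValueError(f"leaf '{node.name}' needs a p_doom value")
        return node.p_doom
    total = sum(child.p for child in node.children)
    if not isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"children of '{node.name}' sum to {total}, expected 1")
    return sum(child.p * doom_given(child) for child in node.children)


# Placeholder numbers only.
root = Node("root", 1.0, children=[
    Node("I. Non-AI pathway", 0.5, p_doom=0.02),
    Node("II. AI-driven pathway", 0.5, children=[
        Node("II.1. Multipolar AI", 0.4, p_doom=0.1),
        Node("II.2. Single dominant AI", 0.6, children=[
            Node("II.2.a. Internal model", 0.5, children=[
                Node("AI expects existential catastrophe", 0.5, p_doom=0.8),
                Node("AI does not expect existential catastrophe", 0.5, p_doom=0.2),
            ]),
            Node("II.2.b. No internal model", 0.5, p_doom=0.1),
        ]),
    ]),
])

print(f"{doom_given(root):.3f}")  # 0.120 with these placeholder inputs
```

Because siblings are mutually exclusive and complete, the recursion never double-counts a scenario: each leaf contributes its own P(doom) weighted by the product of the branch probabilities on the path leading to it.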

III. CONCLUSION

The above deconstruction of the tree can be used as a guide to help you figure out which scenario goes where. The next post in our series can be found here (if there's no link here, it's not out yet; come back later!).
