[ Question ]

What are the two contradictory theories of how to evaluate counterfactuals?

by Said Achmiz, 25th Jul 2025

Tags: Counterfactuals, Rationality, World Modeling

In this comment thread on the 2021 post “A Defense of Functional Decision Theory”, @So8res wrote:

...Also, just to be clear, you’re aware that these are two different internally-consistent but contradictory theories of how to evaluate counterfactuals? Like, we can be pretty confident that there’s no argument a CDT agent can hear that causes them to wish to adopt FDT counterfactuals, and vice versa. Humans come equipped with both intuitions (I can give you other problems that pump the other intuitions, if you’d like), and we have to find some other way to arbitrate the conflict.

Following up on my reply that I didn’t know what he was talking about, he recommended some reading:

The canonical explanatory text is the FDT paper (PDF warning) (that the OP is responding to a critique of, iirc), and there’s a bunch of literature on LW (maybe start at the wiki page on UDT? Hopefully we have one of those) exploring various intuitions. If you’re not familiar with this style of logic, I recommend starting there (ah look we do have a UDT wiki page). I might write up some fresh intuition pumps later, to try to improve the exposition. (We’ve sure got a lot of exposition if you dig through the archives, but I think there are still a bunch of gaps.)

I followed both of those links, and was not enlightened. The linked FDT paper had this bit:

In short, CDT and FDT both construct counterfactuals by performing a surgery on their world-model that breaks some correlations and preserves others, but where CDT agents preserve only causal structure in their hypotheticals, FDT agents preserve all decision-relevant subjunctive dependencies in theirs.

(This follows a rather technical section, which I have little confidence in having understood correctly.)

Has the dichotomy that @So8res refers to been clearly explained anywhere? If not—can anyone explain it now? The relevant questions are:

  1. What are the “two different internally-consistent but contradictory theories of how to evaluate counterfactuals”?
  2. What are the intuitions for each (which humans come equipped with)?
  3. What problems pump those intuitions?

4 Answers, sorted by top scoring

mesaoptimizer
Jul 25, 2025
  1. The first intuition is that the counterfactual involves changing the physical result of your decision making, not the process of your decision making itself. The second intuition is that the counterfactual involves a replacement of the process of your decision making, such that you'd take a different action than you normally would.
  2. I imagine it as the following:
    • Physical intervention: I imagine that I'm possessed by a demon that makes me take the physical actions for an option other than the one I would have chosen voluntarily.
    • Logical intervention: I imagine that I was a different person with a different life history, one that would have led me to choose a different path than the me in physical reality would choose. This doesn't quite communicate how loopy logical intervention can feel, however: I usually imagine logical alternative futures as ones where you effectively have 2+2=3, or something equally clearly illogical, as part of the bedrock of the universe.
  3. I don't think that different problems lead one to develop different intuitions. I think that physical intervention is the more intuitive way people relate to counterfactuals, including for mundane decision theory problems like Newcomb's problem, and that logical intervention is something people need clarifying thought experiments to get used to. I found Counterlogical Mugging (which is Counterfactual Mugging, but involving a statement you have logical uncertainty over) to be a very useful intuition pump for starting to think in terms of logical intervention as a counterfactual (a toy expected-value sketch follows below).
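To make the policy-level arithmetic explicit, here is a toy expected-value sketch of Counterfactual Mugging in Python. The numbers are the standard illustrative ones (a fair coin, a $100 payment on tails, a $10,000 reward on heads iff Omega predicts you would have paid); the model itself is an assumption of this sketch, not anything from the thread.

```python
# Counterfactual Mugging, evaluated over whole policies.
# Illustrative numbers: fair coin, $100 payment, $10,000 reward.

P_HEADS = 0.5
REWARD, PAYMENT = 10_000, 100

def policy_value(pays_on_tails: bool) -> float:
    # On heads, Omega pays out iff it predicts your tails-policy is "pay".
    heads = REWARD if pays_on_tails else 0
    # On tails, you actually hand over the money iff your policy is "pay".
    tails = -PAYMENT if pays_on_tails else 0
    return P_HEADS * heads + (1 - P_HEADS) * tails

print("pay on tails   :", policy_value(True))    # 4950.0
print("refuse on tails:", policy_value(False))   # 0.0

# Evaluated over the whole policy, paying wins. Evaluated only at the
# tails node you actually face (the physical-intervention view), paying
# just loses $100. Counterlogical Mugging replaces the coin with a
# logical claim you are uncertain about (e.g. a far-out digit of pi).
```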

For a more rigorous explanation, here's the relevant section from MacDermott et al., "Characterising Decision Theories with Mechanised Causal Graphs":

But in the Twin Prisoner’s Dilemma, one might interpret the policy node in two different ways, and the interpretation will affect the causal structure. We could interpret intervening on your policy D̃ as changing the physical result of the compilation of your source code, such that an intervention will only affect your decision D, and not that of your twin T. Under this physical notion of causality, we get fig. 3a, where there is a common cause S explaining the correlation between the agent’s policy and its twin’s.

But on the other hand, if we think of intervening on your policy as changing the way your source code compiles in all cases, then intervening on it will affect your opponent’s policy, which is compiled from the same code. In this case, we get the structure shown in fig. 3b, where an intervention on my policy would affect my twin’s policy. We can view this as an intervention on an abstract "logical" variable rather than an ordinary physical variable. We therefore call the resulting model a logical-causal model.

Pearl’s notion of causality is the physical one, but Pearl-style graphs have also been used in the decision theory literature to represent logical causality. One purpose of this paper is to show that mechanism variables are a useful addition to any graphical model being used in decision theory.
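To make the two graph surgeries concrete, here is a minimal Python sketch of the Twin Prisoner's Dilemma. The payoff numbers are the standard illustrative PD values and the variable names follow the quoted passage, but the model itself is an assumption of this sketch, not code from the paper.

```python
# Twin Prisoner's Dilemma: the same source code S compiles to my
# decision D and my twin's decision T. Standard illustrative payoffs.

PAYOFF = {  # (my_action, twin_action) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

SOURCE_CODE = "C"  # assume the shared source code S compiles to "cooperate"

def physical_intervention(my_action):
    # do(D = my_action): override only the physical result of *my*
    # compilation; the twin still acts from the shared source code S.
    twin_action = SOURCE_CODE
    return PAYOFF[(my_action, twin_action)]

def logical_intervention(policy):
    # Intervene on the logical variable (how S compiles), which changes
    # *both* agents' actions, since both run the same code.
    return PAYOFF[(policy, policy)]

for a in ("C", "D"):
    print(f"physical do(D={a}): payoff {physical_intervention(a)}")
    print(f"logical  do(S={a}): payoff {logical_intervention(a)}")

# Physical surgery (fig. 3a) makes defection look better (5 > 3);
# logical surgery (fig. 3b) makes cooperation look better (3 > 1).
```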

philh (2mo)

The first intuition is that the counterfactual involves changing the physical result of your decision making, not the process of your decision making itself. The second intuition is that the counterfactual involves a replacement of the process of your decision making, such that you'd take a different action than you normally would.

Hm, this makes me realize I'm not fully sure what's meant by "counterfactual" here.

I normally think of it as, like: I'm looking at a world history, e.g. with variables A and B and times t=0,1,2 and s... (read more)

Said Achmiz (2mo)

Thank you! That’s definitely more clear than anything I’ve read about this on LW to date!

Follow-up question that immediately occurs to me:

Why are these two ways of evaluating counterfactuals and not, like… “answers to two different questions”? What I mean is: if we want to know what would happen in a “counterfactual” case, it seems like the first thing to do is to say “now, by that do you mean to ask what would happen under physical intervention, or what would happen under logical intervention?” Right? Those would (could?) have different answers, and reall... (read more)

mesaoptimizer (2mo)
Yes. I think that intervening on causality and intervening on logic are the only two ways one could intervene to create an outcome different from the one that actually occurs. I don't work in the decision theory field, though, so I'd want someone else to answer this question.

Richard_Kennaway
Jul 26, 2025

There's a range of interpretations for any counterfactual. One must open up the "suppose" and ask, "What am I actually being asked to suppose? How might the counterfactual circumstance have come to be?" We can accordingly do surgery on the causal graph in different places, depending on how far back from the event of interest we intervene.

To make X counterfactually have some value x, we might, in terms of causal graph surgery, consider do(X=x). Or we might intervene on some predecessors of X, and use do(Y=y) and do(Z=z), choosing values which cause X to take on the value x, but which may have additional effects. Or we could intervene further back than that, and create even more side-effects. We might discover that we are considering a counterfactual that makes no sense — for example, phosphorus matches that do not burn, yet human life continues.
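Here is a toy structural model of this point; the variables and equations are invented purely for illustration, not taken from any particular problem.

```python
# A toy structural model for surgery at different depths:
#   Z -> X -> W, plus Z -> V (a side channel not downstream of X).

def run(z=0, do_x=None, do_z=None):
    Z = do_z if do_z is not None else z
    X = do_x if do_x is not None else Z + 1   # X's mechanism: X = Z + 1
    W = 2 * X                                  # downstream of X
    V = 10 * Z                                 # side effect of Z, not of X
    return {"Z": Z, "X": X, "W": W, "V": V}

print(run())          # factual world: Z=0, X=1, W=2, V=0
print(run(do_x=5))    # do(X=5): W changes, V is untouched
print(run(do_z=4))    # do(Z=4): X becomes 5 too, but V also changes

# Both surgeries make X = 5, but intervening further back (on Z)
# carries the side effect V = 40 that do(X=5) does not.
```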

In Newcomb's Problem, the two-boxing argument intervenes both on the decision of the person faced with the problem and on Omega's decision to fill the other box or not, as if there were hidden mechanisms that could pre-empt both decisions in each of the four possible ways they might be made. (This obviously contradicts one of the hypotheses of the problem, which is that Omega is always right.) The one-boxing argument intervenes on the choice of policy that produces the subject's decision, and does not intervene on Omega.
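A minimal sketch of the two surgeries, assuming a perfect predictor and the usual $1,000,000 and $1,000 payouts:

```python
# Newcomb's Problem under the two surgeries described above.
# $1,000,000 in the opaque box iff Omega predicted one-boxing;
# the transparent box always holds $1,000.

M, K = 1_000_000, 1_000

def payoff(action, prediction):
    opaque = M if prediction == "one-box" else 0
    return opaque if action == "one-box" else opaque + K

# Surgery 1 (the two-boxing argument): intervene on the decision alone,
# holding Omega's prediction fixed at whatever it already is.
for prediction in ("one-box", "two-box"):
    for action in ("one-box", "two-box"):
        print(prediction, action, payoff(action, prediction))
# For either fixed prediction, two-boxing gains an extra $1,000.

# Surgery 2 (the one-boxing argument): intervene on the policy, which
# Omega's prediction tracks by hypothesis.
for policy in ("one-box", "two-box"):
    print(policy, payoff(policy, policy))
# one-box -> $1,000,000; two-box -> $1,000.
```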

I could call these CDT and FDT respectively, except for the tendency of people to modify their preferred decision theory xDT in response to problems that it gets wrong, and claim to be still using xDT, "properly understood". I just described the one-boxer's argument in causal terms. That does not mean that CDT, "properly understood", is FDT.

ETA: While googling something about counterfactuals, I came across Molinism, according to which God knows all counterfactuals, and in particular knows what the creatures that he created would do of their own free will in any hypothetical situation. Omega is probably an angel sent by God to test people's rationality. (Epistemic status: jeu d'esprit.)


Heighn
Aug 21, 2025

Regarding (2), I interpret Soares' point as follows: there's a "CDT intuition" and an "FDT intuition" of how to evaluate counterfactuals. Let's just take the Bomb problem as an example.

CDT intuition

The CDT intuition deals with which action has the best causal consequences at a given point in time. In Bomb, you can either Left-box or Right-box. There's a bomb in Left, so Left-boxing causes you to die painfully. Right-boxing costs only $100, so Right-boxing wins.

FDT intuition

The FDT intuition deals with which decision, as the output of your decision procedure, leads to the best outcome. Your decision procedure is a function, and could be implemented more than once. In Bomb, it's implemented both in your head and in the predictor's head (she executed it to predict what you would do). Your decision to Left-box or to Right-box therefore happens twice - and, since your decision procedure is a function, it's necessarily the same on both occasions - and you have to look at the causal consequences of both events. Left-boxing causes the predictor to not put a bomb in Left and you to not lose $100, while Right-boxing causes the predictor to put a bomb in Left and you to lose $100. Left-boxing wins.

P.S. A natural thing to respond here is something like: "But you already see a Bomb in Left, so the FDT intuition makes no sense!" But note that, since the predictor simulates you in order to make her prediction, you don't actually know whether you are the "real you" or the "simulated you", since the simulated you observes the exact same (relevant) things as the real you. (If not, then the simulation would not be accurate and there would be no subjunctive dependence.) So in this intuition, observing the bomb does not actually mean there is a bomb, since you could be in a simulation. In fact, you are, at different points in time, both in the simulation and in the "real" situation, and you have to make a decision that happens in and makes the best of both these situations.
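To put rough numbers on the two evaluations, here is a sketch with illustrative utilities (dying = -1,000,000 units, paying the fee = -100) and an assumed tiny error rate for the predictor; none of these numbers come from the original problem statement.

```python
# Bomb, per the setup above: the predictor puts a bomb in Left iff she
# predicted Right-boxing. Right always costs $100; Left is free unless
# it holds the bomb.

EPS = 1e-24                     # assumed predictor error rate (near-perfect)
DEATH, FEE = -1_000_000, -100   # illustrative utilities

def bomb_in_left(prediction):
    return prediction == "Right"

def outcome(action, bomb):
    if action == "Left":
        return DEATH if bomb else 0
    return FEE  # Right always costs $100

# CDT-style evaluation at the moment of choice: you see the bomb,
# so its presence is held fixed.
print("CDT  Left :", outcome("Left", bomb=True))    # -1,000,000
print("CDT  Right:", outcome("Right", bomb=True))   # -100 -> Right wins

# FDT-style evaluation of the whole policy: the predictor ran your
# decision function, so her prediction matches your policy except
# with probability EPS.
def fdt_value(policy):
    other = "Right" if policy == "Left" else "Left"
    return ((1 - EPS) * outcome(policy, bomb_in_left(policy))
            + EPS * outcome(policy, bomb_in_left(other)))

print("FDT  Left :", fdt_value("Left"))    # ~ -1e-18 -> Left wins
print("FDT  Right:", fdt_value("Right"))   # -100
```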

Said Achmiz (20d)

Thanks! This seems to match up to what @mesaoptimizer wrote in his comment, I think?

One question I do have is: does anyone actually have the “FDT intuition”…? That is, is it really an intuition, or is it a perspective that people need to be reasoned into taking?

(I also have some serious problems with the FDT view, but this is not the place to discuss them, of course.)

[anonymous] (20d)

That is, is it really an intuition, or is it a perspective that people need to be reasoned into taking?

There's no natural separation between the two. Reasoning and training chisel and change intuition (S1) just as much as they chisel and change deliberate thinking (S2).

Take the example of chess. A grandmaster would destroy me, 10 games out of 10, when playing a classical game. But he would also destroy me 10 games out of 10 when we play a hyperbullet (i.e., 30+0 seconds) game, where the time control is so fast that you simply don't have time to deliberately analyze variations at all and must instead play almost solely on intuition.[1] That's because the grandmaster's intuition is far far better than mine.

But the grandmaster was not born with any chess intuition. He was born not knowing anything about the existence of chess, actually. He had to be trained, and to train himself, into it. And through the process of studying chess (classical chess, where you have hours to think about the game and an increment giving you extra time for every move), he improved and changed his intuitive, snap, aesthetic judgement too.

  1. ^ And that's the case even if the grandmaster very rarely plays hyp... (read more)
Said Achmiz (20d)
This is true to some extent, but I don’t think it’s relevantly true in the given context. Recall the claim/argument which prompted (the discussion that led to) this post: I understood Nate to be saying something other than merely “it is possible for a human to become convinced that FDT is correct, whereupon they will find it intuitive”.
[anonymous] (20d)
Hmm. Yeah, I think you're right. But I suppose I'm a poor advocate for the opposite perspective, since a statement like "Humans come equipped with both intuitions," in this precise context, yields a category error in my ontology as opposed to being a meaningful statement capable of being true or false.
Heighn (19d)
Yeah, it matches what @mesaoptimizer said, I believe. I was reluctant to post my view, but thought it could be helpful anyway :) Great question! I'm wondering the same thing now. I, for one, had to be reasoned into it. It does feel like it "clicked", so to speak, but I doubt whether anyone has this intuition naturally. I would be willing to discuss FDT more, if you'd like (in a separate post, of course).

Zack_M_Davis
Jul 25, 2025

My understanding is that the two contradictory theories are causal decision theory (CDT), which says to choose the action that will cause the best consequences, and evidential decision theory (EDT), which says to choose the action such that the consequences will be best conditional on the action you chose. Newcomb's problem makes causal decision theory look bad but evidential decision theory look good. (CDT two-boxes because your choice seemingly can't cause the prediction, but EDT one-boxes because you have more money conditional on one-boxing.) But the smoking lesion problem makes evidential decision theory look bad and causal decision theory look good. (CDT gets to enjoy smoking because it doesn't cause health problems according to the thought-experiment setup, but EDT doesn't, because according to the setup you have health problems conditional on enjoying smoking.)
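A sketch of how the two theories come apart on the smoking lesion, with made-up numbers (the lesion is the hidden common cause of both the urge to smoke and cancer; smoking itself is causally harmless here). All probabilities and utilities below are illustrative assumptions.

```python
# Smoking lesion: a lesion (prob 0.5) causes cancer (utility -1000)
# and also causes the urge to smoke; smoking is enjoyable (+10) but
# causally harmless in this setup. Assumed population statistics:
# lesion-havers smoke 95% of the time, others 5%.

P_LESION = 0.5
U_CANCER, U_SMOKE = -1000, 10
P_SMOKE_GIVEN_LESION = 0.95
P_SMOKE_GIVEN_NO_LESION = 0.05

def p_lesion_given(smoke: bool) -> float:
    # Bayes' rule over the two lesion states.
    p_s_l = P_SMOKE_GIVEN_LESION if smoke else 1 - P_SMOKE_GIVEN_LESION
    p_s_n = P_SMOKE_GIVEN_NO_LESION if smoke else 1 - P_SMOKE_GIVEN_NO_LESION
    num = p_s_l * P_LESION
    return num / (num + p_s_n * (1 - P_LESION))

def utility(smoke: bool, lesion_prob: float) -> float:
    return lesion_prob * U_CANCER + (U_SMOKE if smoke else 0)

# EDT conditions on the action: smoking is evidence of the lesion.
print("EDT smoke   :", utility(True, p_lesion_given(True)))    # -940
print("EDT abstain :", utility(False, p_lesion_given(False)))  # -50  -> abstain

# CDT intervenes on the action: do(smoke) doesn't touch the lesion,
# so the lesion probability stays at its prior.
print("CDT smoke   :", utility(True, P_LESION))    # -490 -> smoke
print("CDT abstain :", utility(False, P_LESION))   # -500
```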

jessicata (2mo)

EDT doesn't have counterfactuals at all, IMO. It has Bayesian conditionals.

Said Achmiz (2mo)

I am fairly sure that this is not the distinction being made. I think this because the FDT paper first contrasts EDT on the one hand with CDT and FDT on the other hand (saying that CDT and FDT both differ from EDT in the same way), and then goes on to say that CDT and FDT differ in some other way. And the quotes I gave in the OP were also about CDT vs. FDT, with no EDT involved.

Warty (2mo)

I never got that one, because: is deciding to smoke much of an update after you've already detected an urge to smoke? EDT looks simpler, so it should be correct.

1 comment, sorted by top scoring
Vladimir_Nesov (2mo)

A natural way of formulating decision making is to ask how outcomes depend on the agent's behavior. If we try to look at such a dependence pointwise, we might end up with a story for how a particular possible behavior leads to a particular possible world, and in that possible world to a particular outcome (for sufficiently general notions of "possible world" and "outcome").

This framing is soon in trouble when we consider a fully specified deterministic agent of any kind (either physical or algorithmic), because most possible behaviors are not the actual behavior, and in that sense all possible behaviors except the actual behavior are counterfactual. (This gives a widespread misnomer where "counterfactual" starts referring to all possible behaviors or all possible worlds, including the actual one, even though it's not literally counterfactual.) It's not clear how to think about counterfactual things.

Simply asking for the intended meaning (the intended construction of counterfactuals, or more generally the intended dependence of everything on the agent's behavior) doesn't help with the real problem of how to actually make decisions, how the dependence should be constructed. FDT mostly assumes away this issue by giving a way of thinking about the dependence and of formulating it (which in particular translates to its notion of counterfactuals), and asking it to be formulated in this way in the problem statement for decision making.

A possible clue about the issue with counterfactuals is that it's not known from the outset which behaviors/worlds are counterfactual, and there are no relevant conceptual issues with thinking about the behaviors/worlds that are actual. So a priori (before we know which things are actual) any methods applicable to thinking about the actual behaviors/worlds should also be applicable to the counterfactual ones, and the central puzzle is to figure out how to use this to formulate the dependence of various mostly-counterfactual possibilities of interest on the various mostly-counterfactual possible behaviors of the agent.
