(This is the third post in a planned sequence.)

My last post left us with the questions:

- Just what are humans, and other common CSAs, calculating when we imagine what “would” happen “if” we took actions we won’t take?
- Is there more than one natural way to calculate these counterfactual “would”s? If so, what are the alternatives, and which alternative works best?

Today, I’ll take an initial swing at these questions. I’ll review Judea Pearl’s causal Bayes nets; show how Bayes nets offer a general methodology for computing counterfactual “would”s; and note *three* plausible alternatives for how to use Pearl’s Bayes nets to set up a CSA. One of these alternatives will be the “timeless” counterfactuals of Eliezer’s Timeless Decision Theory.

**The problem of counterfactuals** is the problem of what we do, and should, mean when we discuss what “would” have happened, “if” something impossible had happened. In its general form, this problem has proved to be quite gnarly. It has been bothering philosophers of science for over sixty years, since Nelson Goodman posed it in his 1947 essay “The Problem of Counterfactual Conditionals” (later incorporated into his book *Fact, Fiction, and Forecast*):

Let us confine ourselves for the moment to [counterfactual conditionals] in which the antecedent and consequent are inalterably false--as, for example, when I say of a piece of butter that was eaten yesterday, and that had never been heated,

`If that piece of butter had been heated to 150°F, it would have melted.'

Considered as truth-functional compounds, all counterfactuals are of course true, since their antecedents are false. Hence

`If that piece of butter had been heated to 150°F, it would not have melted.'

would also hold. Obviously something different is intended, and the problem is to define the circumstances under which a given counterfactual holds while the opposing conditional with the contradictory consequent fails to hold.

Recall that we seem to need counterfactuals in order to build agents that do useful decision theory -- we need to build agents that can think about the consequences of each of their “possible actions”, and can choose the action with the best expected consequences. So we need to know how to compute those counterfactuals. As Goodman puts it, “[t]he analysis of counterfactual conditionals is no fussy little grammatical exercise.”

**Judea Pearl’s Bayes nets offer a method for computing counterfactuals.** As noted, it is hard to reduce human counterfactuals in general: it is hard to build an algorithm that explains what (humans will say) really “would” have happened, “if” an impossible event had occurred. But it is easier to construct specific formalisms within which counterfactuals have well-specified meanings. Judea Pearl’s causal Bayes nets offer perhaps the best such formalism.

Pearl’s idea is to model the world as based on some set of causal variables, which may be observed or unobserved. In Pearl’s model, each variable is determined by a conditional probability distribution on the state of its parents (or by a simple probability distribution, if it has no parents). For example, in the following Bayes net, the beach’s probability of being “Sunny” depends only on the “Season”, and the probability of each particular “Number of beach-goers” depends only on the “Day of the week” and on the “Sunniness”. Since the “Season” and the “Day of the week” have no parents, they simply have fixed probability distributions.

Once we have a Bayes net set up to model a given domain, computing counterfactuals is easy*. We just:

- Take the usual conditional and unconditional probability distributions that come with the Bayes net;
- Do “surgery” on the Bayes net to plug in the variable values that define the counterfactual situation we’re concerned with, while ignoring the parents of surgically set nodes, and leaving other probability distributions unchanged;
- Compute the resulting probability distribution over outcomes.

For example, suppose I want to evaluate the truth of: “If last Wednesday had been sunny, there would have been more beach-goers”. I leave the “Day of the week” node at “Wednesday”, set the “Sunny?” node to “Sunny”, ignore the “Season” node (since it is the parent of a surgically set node), and compute the probability distribution on beach-goers.

*Okay, not quite easy: I’m sweeping under the carpet the conversion from the English counterfactual to the list of variables to surgically alter, in step 2. Still, Pearl’s Bayes nets do much of the work.
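As a concrete sketch of the three steps above -- with conditional probability tables whose numbers are invented purely for illustration, since the post's diagram isn't shown here:

```python
# Hypothetical conditional probability tables for the beach net;
# all numbers are invented for illustration.
P_sunny_given_season = {"summer": 0.8, "winter": 0.3}  # P(sunny | season)
P_many_given = {                                       # P(many beach-goers | day, sunny?)
    ("weekday", True): 0.4, ("weekday", False): 0.1,
    ("weekend", True): 0.9, ("weekend", False): 0.3,
}

def p_many(day, p_sunny):
    """P(many beach-goers | day), given a probability that it is sunny."""
    return (p_sunny * P_many_given[(day, True)]
            + (1 - p_sunny) * P_many_given[(day, False)])

# Step 1: the factual distribution -- sunniness inherited from its parent
# "Season" (assumed prior: P(summer) = 0.25).
p_sunny_factual = (0.25 * P_sunny_given_season["summer"]
                   + 0.75 * P_sunny_given_season["winter"])
factual = p_many("weekday", p_sunny_factual)

# Step 2: surgery -- set "Sunny?" to sunny, sever the link from its parent
# "Season", and leave "Day of the week" and the beach-goers CPT unchanged.
# Step 3: compute the resulting distribution over outcomes.
counterfactual = p_many("weekday", 1.0)

# The counterfactual "would": more beach-goers if it had been sunny.
assert counterfactual > factual
```

With these made-up numbers, the surgical distribution puts probability 0.4 on “many beach-goers”, versus roughly 0.23 factually -- so the counterfactual “there would have been more beach-goers” comes out true.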

**But, even if we decide to use Pearl’s method, we are left with the choice of how to represent the agent's "possible choices" using a Bayes net.** More specifically, we are left with the choice of what surgeries to execute, when we represent the alternative actions the agent “could” take.

There are at least three plausible alternatives:

*Alternative One: “Actions CSAs”:*

Here, we model the outside world however we like, but have the agent’s own “action” -- its choice of a_1, a_2, or ... , a_n -- be the critical “choice” node in the causal graph. For example, we might show Newcomb’s problem as follows:

The assumption built into this set-up is that the agent’s action is uncorrelated with the other nodes in the network. For example, if we want to program an understanding of Newcomb’s problem into an Actions CSA, we are forced to choose a probability distribution over Omega’s prediction that is independent of the agent’s actual choice.

*How Actions CSAs reckon their coulds and woulds:*

- Each “could” is an alternative state of the “My action” node. Actions CSAs search over each state of the “action” node before determining their action.
- Each “would” is then computed in the usual Bayes net fashion: the “action” node is surgically set, the probability distribution for other nodes is left unchanged, and a probability distribution over outcomes is computed.

So, if causal decision theory is what I think it is, an “Actions CSA” is simply a causal decision theorist. Also, Actions CSAs will two-box on Newcomb’s problem, since, in their network, the contents of box B are independent of their choice to take box A.
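A toy expected-value calculation makes the two-boxing verdict explicit. The numbers here (e.g. a 50% prior that box B is full) are my own illustrative assumptions:

```python
# Actions-CSA view of Newcomb's problem (illustrative numbers).
# Surgery on the "My action" node holds the distribution over Omega's
# prediction FIXED, independent of which action is being considered.
p_full = 0.5  # assumed prior that box B contains $1M, unchanged by surgery

def expected_value(action, p_box_b_full):
    box_b = p_box_b_full * 1_000_000          # expected contents of box B
    return box_b + (1_000 if action == "two-box" else 0)  # box A holds $1,000

# Both "would"s use the same, surgically-unaffected p_full:
ev_one = expected_value("one-box", p_full)
ev_two = expected_value("two-box", p_full)
assert ev_two > ev_one  # two-boxing always wins by the extra $1,000
```

Whatever prior we assume for `p_full`, two-boxing comes out $1,000 ahead, which is exactly the causal decision theorist’s reasoning.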

*Alternative Two: “Innards CSAs”:*

Here, we again model the outside world however we like, but this time we have the agent’s own “innards” -- the physical circuitry that interposes between the agent’s sense-inputs and its action-outputs -- be the critical “choice” node in the causal graph. For example, we might show Newcomb’s problem as follows:

Here, the agent’s innards are allowed to cause both the agent’s actions and outside events -- so, for example, we can represent Omega’s prediction as correlated with the agent’s action.

*How Innards CSAs reckon their coulds and woulds:*

- Each “could” is an alternative state of the “My innards” node. Innards CSAs search over each state of the “innards” node before determining their optimal innards, from which their action follows.
- Each “would” is then computed in the usual Bayes net fashion: the “innards” node is surgically set, the probability distribution for other nodes is left unchanged, and a probability distribution over outcomes is computed.

Innards CSAs will one-box on Newcomb’s problem, because they reason that if their innards were such as to make them one-box, those same innards would cause Omega, after scanning their brain, to put the $1M in box B. And so they “choose” innards of a sort that one-boxes on Newcomb’s problem, and they one-box accordingly.
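A parallel sketch shows why surgery on the innards node favors one-boxing; the 90% Omega accuracy is an assumed, illustrative number:

```python
# Innards-CSA view of Newcomb's problem (illustrative numbers).
# Surgery on the "My innards" node propagates to Omega's prediction,
# since Omega scans the innards before filling box B.
omega_accuracy = 0.9  # assumed: P(Omega predicts correctly | innards)

def expected_value(innards_type):
    # Setting the innards node changes Omega's prediction...
    p_box_b_full = omega_accuracy if innards_type == "one-boxer" else 1 - omega_accuracy
    # ...and also determines the action itself (two-boxers also take box A).
    box_a = 1_000 if innards_type == "two-boxer" else 0
    return p_box_b_full * 1_000_000 + box_a

assert expected_value("one-boxer") > expected_value("two-boxer")
# roughly 0.9 * $1M = $900,000  vs  0.1 * $1M + $1,000 = $101,000
```

Because the surgically-set innards node is upstream of both the action and Omega’s prediction, the $1M swamps the $1,000, and one-boxing innards win.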

*Alternative Three: “Timeless” or “Algorithm-Output” CSAs:*

In this alternative, as Eliezer suggested in Ingredients of Timeless Decision Theory, we have a “Platonic mathematical computation” as one of the nodes in our causal graph, which gives rise at once to our agent’s decision, to the beliefs of accurate predictors about our agent’s decision, and to the decisions of similar agents in similar circumstances. It is the output of this *mathematical* function that our CSA uses as the critical “choice” node in its causal graph. For example:

*How Timeless CSAs reckon their coulds and woulds:*

- Each “could” is an alternative state of the “Output of the Platonic math algorithm that I’m an instance of” node. Timeless CSAs search over each state of the “algorithm-output” node before determining the optimal output of this algorithm, from which their action follows.
- Each “would” is then computed in the usual Bayes net fashion: the “algorithm-output” node is surgically set, the probability distribution for other nodes is left unchanged, and a probability distribution over outcomes is computed.

Like innards CSAs, algorithm-output CSAs will one-box on Newcomb’s problem, because they reason that if the output of their algorithm was such as to make them one-box, that same algorithm-output would also cause Omega, simulating them, to believe they will one-box and so to put $1M in box B. They therefore “choose” to have their algorithm output “one-box on Newcomb’s problem!”, and they one-box accordingly.

Unlike innards CSAs, algorithm-output CSAs will also Cooperate in single-shot prisoner’s dilemmas against Clippy -- in cases where they think it sufficiently likely that Clippy’s actions are output by an instantiation of “their same algorithm” -- even in cases where Clippy cannot at all scan their brain, and where their innards play no physically causal role in Clippy’s decision. (An Innards CSA, by contrast, will Cooperate if having Cooperating-type innards will physically cause Clippy to cooperate, and not otherwise.)
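A sketch of that reasoning, with assumed numbers of my own (80% credence that Clippy instantiates the same algorithm, standard prisoner’s-dilemma payoffs):

```python
# Algorithm-output-CSA view of a one-shot prisoner's dilemma against Clippy
# (illustrative numbers). Surgery on the "output of my algorithm" node also
# sets Clippy's move whenever Clippy runs an instance of the same algorithm.
p_same_algorithm = 0.8         # assumed credence that Clippy runs my algorithm
p_clippy_coop_otherwise = 0.0  # assumed: otherwise Clippy simply defects

PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def expected_value(my_output):
    # My algorithm's output flows to Clippy's move via the shared-algorithm node:
    p_clippy_c = (p_same_algorithm * (1.0 if my_output == "C" else 0.0)
                  + (1 - p_same_algorithm) * p_clippy_coop_otherwise)
    return (p_clippy_c * PAYOFF[(my_output, "C")]
            + (1 - p_clippy_c) * PAYOFF[(my_output, "D")])

assert expected_value("C") > expected_value("D")
# EV(Cooperate) = 0.8*3 = 2.4;  EV(Defect) = 1.0*1 = 1.0
```

Note that no physical causation from agent to Clippy is needed: the correlation runs entirely through the shared “algorithm-output” node, which is exactly what an Innards CSA’s graph lacks.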

Coming up: considerations as to the circumstances under which each of the above types of agents will be useful, under different senses of “useful”.

Thanks again to Z M Davis for the diagrams.

After reading more Girard, "platonic computation" started to sound to me like "phlogiston". Seriously.

Thanks for this sequence. I wasn't really able to follow some of the other decision theory posts, but this is very clear, ties a lot of things together, and finally gives me a good handle on what a lot of people are talking about.

Stupid question time: Why are all the agents only "surgically setting" nodes with no parents? Is that a coincidence, or is it different in a significant way from the sunniness example?

I think a lot of the confusion about these types of decision theory problems comes from not everyone thinking about the same problem, even when it seems like they are.

For example, consider the problem I'll call "pseudo-Newcomb's problem". Omega still gives you the same options, and history has shown a strong correlation between people's choices and its predictions.

The difference is that instead of simulating the relevant part of your decision algorithm to make the prediction, Omega just looks to see whether you have a blue dot or a red dot on your f…

The description of Pearl's counterfactuals in this post isn't entirely correct, in my opinion.

Your "sunny day beach-goers" statement describes an interventional distribution (in other words, something of the form P(y|do(x))).

The key difference between interventions and counterfactuals is that the former are concerned with a single possible world (after an intervention), while the latter are concerned with multiple hypothetical worlds simultaneously.
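In Pearl's notation (a standard gloss; the particular symbols here are my own shorthand), the contrast is:

```latex
% Intervention: a single post-surgery world.
P(Y = y \mid do(X = x))
% Counterfactual: the hypothetical world in which X was set to x,
% evaluated given evidence (X = x', Y = y') from the actual world.
P(Y_{X = x} = y \mid X = x', Y = y')
```

Computing the latter requires Pearl's three-step procedure: abduction (update the background variables on the actual-world evidence), action (surgery), and prediction.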

Example of a statement about an intervention (causal effect): "I am about…

Can anyone explain why Goodman considers this statement to be true:

Hence `If that piece of butter had been heated to 150°F, it would not have melted.' would also hold.

How does one then take into account the fact that one's abstract platonic algorithm is implemented on physical hardware that may occasionally suffer an accidental bitflip or other corruption, and thus not actually compute the answer that the algorithm you think you're running computes?

My INITIAL (sleep-deprived) thought is a hybrid of options 2, 3, and a form of EDT in which one would say "If I output X, then that would be evidence that the abstract platonic computation outputs X, which would then also cause other implementations/models of t…

Agents do *not* need to calculate what would have happened, if something impossible had happened. They need to calculate the consequences of their possible actions.

These are all possible, by definition, from the point of view of the agent - who is genuinely uncertain about the action she is going to take. Thus, from her point of view at the time, these scenarios are not "counterfactual". They do not contradict any facts known to her at the time. Rather they all lie within her cone of uncertainty.

That's the difference between epistemic and metaphysical possibility. Something can be epistemically possible without being metaphysically possible if one doesn't know if it's metaphysically possible or not.

My problem with these formalisms is that they don't apply to real-world problems: in real-world problems you can alter the network as well as the nodes.

For example, you could have an innards-altering program that makes innards that attempt to look at Omega's innards and decide whether to one-box based on that. This would form a loop in the network; what that loop would do to the correct strategy, I don't know.

In the real world this could not be prevented. Sure, you can say that it doesn't apply to the thought experiment, but then you are divorcing the thought experiment from reality.