Throughout this post I will use "related" as an exact synonym for "not independent", as a way of simplifying the language.
For "A is not independent of B" you could say "A is informative about B", inspired by "mutual information".
This sounds better to my ears. "The first coin toss is related to the second coin toss" is kinda true -- of course they are related! They are both about coins! But "the first coin toss is informative about the second coin toss" is obviously false.
Here's a simple heuristic about causal inference: if A causes B, then anything related to A should also be related to B. So to test whether A causes B, look hard for a variable that is related to A but independent of B; failing to find one is evidence of causation.
For example, consider the claim "rain causes me to carry an umbrella": anything related to rain -- dark clouds, the season, the weather forecast -- should also be related to whether I'm carrying an umbrella.
The rule has some exceptions which I discuss in the last section.
Derivation of the rule
Suppose A and B are related (non-independent). We want to distinguish the following three possibilities:

1. A causes B
2. B causes A
3. A and B share a common cause, and neither causes the other
Now suppose we want to distinguish these using purely observational data. In particular, we can look at the joint distribution of {A, B, C} for a variety of additional variables C.
Specifically, we are going to look at the conditional dependency relationships between the three variables. This consists of six questions:

1. Are A and B related?
2. Are A and C related?
3. Are B and C related?
4. Are A and B related, conditional on C?
5. Are A and C related, conditional on B?
6. Are B and C related, conditional on A?
If we require that A and B are related and C is related to at least one of the other two, then it will turn out that there are only 6 possible answers to this set of questions. Either all of the dependencies will exist (1 option) or exactly one will be independent (5 options, since we already assumed A and B are related).[3]
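As a concrete sanity check (my own sketch, not from the original derivation), we can simulate the chain C → A → B with linear-Gaussian variables and estimate all six answers, using correlations for the unconditional questions and partial correlations (correlating regression residuals) for the conditional ones. For this structure we expect exactly one independence: C and B become independent once we condition on A.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Chain: C -> A -> B, all linear-Gaussian (coefficients are arbitrary choices).
C = rng.normal(size=n)
A = 0.8 * C + rng.normal(size=n)
B = 0.8 * A + rng.normal(size=n)

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

def partial_corr(x, y, z):
    # Correlate the residuals of x and y after linearly regressing each on z.
    rx = x - np.polyfit(z, x, 1)[0] * z
    ry = y - np.polyfit(z, y, 1)[0] * z
    return corr(rx, ry)

answers = {
    "A ~ B": corr(A, B),
    "A ~ C": corr(A, C),
    "B ~ C": corr(B, C),
    "A ~ B | C": partial_corr(A, B, C),
    "A ~ C | B": partial_corr(A, C, B),
    "B ~ C | A": partial_corr(B, C, A),
}
for name, r in answers.items():
    print(f"{name}: {r:+.3f}")
```

Running this, five of the six estimates come out clearly nonzero and only "B ~ C | A" is (up to sampling noise) zero, matching the claim that exactly one independence holds.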
Here is the table of which dependency structures are compatible with each of the three causal hypotheses for A and B:
There is no possible value of the conditional dependencies between {A, B, C} that can rule out "common cause only". However, each of the other two hypotheses can be falsified by a single observation.
The only way to gain high confidence in the hypothesis "A causes B" is to look hard for cases where C is related to A but not to B. If A does not cause B, we'd expect to eventually find such a case, so not finding one is usually strong evidence of causation.
The counterexample is always a case where C is either:
In the next section, I'll derive the table by giving a causal diagram for every "allowed" cell. Then I'll talk about two practical exceptions to the central claim.
A visual derivation of the table
Exceptions
Exception 1: "coincidences" and control systems[4]
In general, a causal graph can tell you when two variables must be independent, but it cannot rule out two variables being independent "by coincidence" when the causal graph says that they shouldn't be independent.
If the content of the causal relationships is randomly chosen from a continuous option space, then such coincidences have probability zero. However, in intentionally designed control systems, it is possible for causal paths to cancel out.
For example, consider a thermostat. The output voltage of the thermostat at time t surely has a causal effect on the temperature in the room at time t+1. However, if the thermostat functions perfectly, it might statistically appear as if the output voltage is not causally upstream of the future room temperature. Let's work through the example with a causal graph:
Now let's apply our rule. We want to verify that A (the thermostat's output voltage) is causally upstream of B. Since X is related to A, it should also be related to B. However, if the thermostat works well, X and B will in fact be independent! The temperature at time t+1 ("B") will depend only on the thermostat set point and not on the temperature at time t ("X") -- that's what it means for the thermostat to work well!
The graph says that X has two causal paths to B (the direct blue arrow and the orange path through the thermostat), but these will cancel out by design of the thermostat.
This designed cancellation would be a probability-zero coincidence for a randomly generated system.
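A minimal simulation of this cancellation (my own sketch; the unit control gain and the specific distributions are assumptions): the voltage is proportional to the error between the set point and the current temperature, and with a perfectly tuned gain the next temperature depends only on the set point.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

setpoint = rng.normal(20.0, 2.0, size=n)  # thermostat set point
X = rng.normal(15.0, 5.0, size=n)         # room temperature at time t
A = setpoint - X                          # output voltage: proportional control with gain exactly 1
B = X + A + rng.normal(0.0, 0.1, size=n)  # temperature at t+1: direct effect of X plus the heating

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

print(f"corr(X, A) = {corr(X, A):+.3f}")  # X drives the voltage: strongly related
print(f"corr(X, B) = {corr(X, B):+.3f}")  # the two causal paths cancel: ~0
```

Even though B is computed directly from X, the algebra gives B = setpoint + noise, so the measured correlation between X and B is zero: exactly the "coincidence" the causal graph cannot rule out.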
Exception 2: No measurable causes of A except for common causes with B
These two diagrams cannot be observationally distinguished:
The culprit is that A has no causes except the one shared with B.
If we instead had this diagram, we could rule out "A causes B" by measuring that D is related to A while being independent of B.
If D is not measurable for some reason, we could instead measure variables downstream of D that are unrelated to B and C. But if there is some systematic reason that no such variables are measurable, our rule will incorrectly conclude that A causes B.
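Under an assumed linear-Gaussian setup (my own sketch, not from the post), the test looks like this: in a world where C is the only link between A and B, an extra independent cause D of A is related to A but not to B, which is exactly the observation that rules out "A causes B".

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Common-cause world: C -> A and C -> B, plus an extra cause D -> A.
C = rng.normal(size=n)
D = rng.normal(size=n)
A = C + D + rng.normal(size=n)
B = C + rng.normal(size=n)

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

print(f"corr(D, A) = {corr(D, A):+.3f}")   # D is related to A...
print(f"corr(D, B) = {corr(D, B):+.3f}")   # ...but independent of B: "A causes B" is ruled out

# For contrast: if A really caused B, D's influence would flow through to B.
B2 = A + rng.normal(size=n)
print(f"corr(D, B2) = {corr(D, B2):+.3f}")
```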
Thanks to Thomas Kwa for pointing out the control systems exception, and for encouraging me to write this post. Thanks to Lukas Finnveden for useful feedback.
[1] Throughout this post I will use "related" as an exact synonym for "not independent", as a way of simplifying the language.

[2] I actually mean "not independent" but I think the sentence is easier to read with "correlated".

[3] I'm assuming here that variables are only independent if the structure of the causal graph requires them to be independent. Exceptions to this are discussed in the final section.

[4] Thanks to Thomas Kwa for pointing out the control systems exception, and for encouraging me to write this post :)