Is there a name for, and research about, the heuristic/fallacy that things have exactly one cause? Why do we look for a single cause rather than for the set of conditions that together cause something?

I see this almost as often as the correlation = causation fallacy. When it comes in the form of "risk factor" it is ok if the factor is selective. But when it comes in the form of a general assumption about the world I find it simplistic. A risk factor is only a vague hint that needs to be looked at more closely to establish causation.

There is also the notion that multi-causality is additive, as it would be if the probability of something depended on this OR that happening, rather than on this AND that.
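To make the OR-versus-AND distinction concrete, here is a minimal sketch. The probabilities and function names are made up for illustration. The "interaction contrast" used here is one common way to check whether two factors combine additively on the risk scale: it is zero when the effects simply add up.

```python
from fractions import Fraction

# Hypothetical per-factor trigger probabilities, chosen only for illustration.
P_A = Fraction(1, 5)
P_B = Fraction(3, 10)

def risk_or(a, b):
    """Either factor can trigger the outcome on its own (independent triggers)."""
    miss = (1 - P_A if a else 1) * (1 - P_B if b else 1)
    return 1 - miss

def risk_and(a, b):
    """The outcome occurs only when both factors are present."""
    return Fraction(1) if (a and b) else Fraction(0)

def interaction_contrast(risk):
    """Zero means the two factors' effects simply add up on the risk scale."""
    return risk(1, 1) - risk(1, 0) - risk(0, 1) + risk(0, 0)

print(interaction_contrast(risk_or))   # -3/50: close to additive
print(interaction_contrast(risk_and))  # 1: nothing like additive
```

Under the OR model the combined risk is nearly the sum of the individual risks, so an additive reading is a decent approximation. Under the AND model each factor alone contributes nothing, and an additive reading misses the mechanism entirely.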

A correlation of less than one may be due to randomness, but there might also be a hidden, more selective cause/factor.

In medical news I keep hearing of risk factors for a condition. They find that there is a correlation between A, B and the studied disease. But how do we know that it doesn't take A and B and C to make it almost certain to develop that disease? I would like to know. C might be a common gene that is not even known.

Say it takes A and B. I really enjoy A, but I never do B. Why should I lower my quality of life just because a study that included people who also do B found that A is a risk factor? A risk factor is only a positive correlation. Eating and breathing are positively correlated with all diseases, and the joke is that they come out with news about bad diets every year.

I keep hearing that A is a risk factor, then a follow-up study finds that there is no conclusive data for A being the problem, so A is cool again. But what if A and B is the problem and each alone is not harmful?
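A minimal sketch of that scenario, assuming (purely for illustration) that harm requires both A and B, that the two are independent, and that half the population does A while a tenth does B. All names and numbers are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical prevalences; A and B are assumed independent in the population.
P_A = Fraction(1, 2)
P_B = Fraction(1, 10)

def harmed(a, b):
    """Assumed mechanism: harm needs both A and B."""
    return a and b

def risk(condition):
    """P(harm | condition), by exact enumeration over the four (a, b) groups."""
    total = hit = Fraction(0)
    for a, b in product([False, True], repeat=2):
        if not condition(a, b):
            continue
        weight = (P_A if a else 1 - P_A) * (P_B if b else 1 - P_B)
        total += weight
        if harmed(a, b):
            hit += weight
    return hit / total

print(risk(lambda a, b: a))            # 1/10: A looks like a modest risk factor
print(risk(lambda a, b: not a))        # 0
print(risk(lambda a, b: a and not b))  # 0: A without B is harmless
print(risk(lambda a, b: a and b))      # 1: A with B is certain harm
```

Looking at A alone shows a weak association that a follow-up study could easily fail to confirm. Splitting the A group by B shows that the risk sits entirely in one subgroup.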

In the end this means that you can only find what you are looking for. (Kind of the big problem with science.) If you look only for 1:1 correlations, you will only find the low-hanging fruit and the singular cause.

Whenever we find that some but not all who do/have A get Y, we should look for additional factors, but this is not always done. As soon as A feels restrictive/selective enough, the finding gets blown out of proportion. The reality might be that all who have A and B get Y, which would be a lot more informative. Who cares to know that breathing causes respiratory problems? That might seem silly and far-fetched, but how often have you heard that some common behavior is a risk factor?

Two relevant things.

First, the Epsilon Fallacy: the idea that effects are the result of many tiny causes adding up. In practice, 80/20 is a thing, and most things most of the time do have a small number of "main" root causes which account for most of the effect. So it's not necessarily wrong to look for "exactly one cause" - as in e.g. optimizing runtime of a program, there's often one cause which accounts for most of the effect. In the "logical-and" case you mention, I'd usually expect to see either

  • most of the things in the and-clause don't actually vary much in the population (i.e. most of them are almost always true or almost always false), and just one or two account for most of the variance, OR
  • a bunch of the things in the and-clause are highly correlated due to some underlying cause.

Of course there are exceptions to this, in particular for traits under heavy selection pressure - if we always hammer down the nail that sticks out, then all the nails end up at around the same height. If we repeatedly address bottlenecks/limiting factors in a system, then all limiting factors will end up roughly equally limiting, and 80/20 doesn't happen.
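The "hammer down the nail that sticks out" dynamic can be sketched with a toy simulation. The starting costs, the 10% improvement per step, and the function names are all assumptions made for illustration.

```python
def largest_share(costs):
    """Fraction of the total accounted for by the single biggest factor."""
    return max(costs) / sum(costs)

def hammer_down(costs, steps, factor=0.9):
    """Repeatedly shrink whichever factor is currently the largest."""
    costs = list(costs)
    for _ in range(steps):
        i = costs.index(max(costs))
        costs[i] *= factor
    return costs

start = [80.0, 10.0, 5.0, 5.0]  # hypothetical: one factor dominates
after = hammer_down(start, steps=50)

print(round(largest_share(start), 2))
print(round(largest_share(after), 2))
print([round(c, 2) for c in after])
```

In this toy setup, once the dominant factor has been worked down to the level of the others, every further step just rotates among them, and no single factor accounts for most of the total any more.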

Second: the right "language" in which to think about this sort of thing is not flat boolean logic (i.e. "effect = (A or B) and C and D") but rather causal diagrams. The sort of medical studies you mention - i.e. "saliva is a risk factor for cancer but only if taken orally in small doses over a long period of time" - are indeed pretty dumb, but the fix is not to look for a giant and-clause of conditions which result in the effect. The fix is to build a gears-level model of the system, figure out the whole internal cause-and-effect graph.

Right, one could expand the clause indefinitely, that is kind of what I meant by "can only find what you are looking for". But that only means it is hard, not that it is bad to think that way.

I think of it neither as logic nor as causal diagrams, nor as Bayesian or Markov diagrams, but simply as sets of members of some type, which may have any number of features/properties/attributes that make them members of some subset.

When I wrote "A AND B" I wanted you to understand it as a dual logic clause, but only for simplicity.

The way I really think...

I think of "gears-level model" and "causal DAG" as usually synonymous. There are some arguable exceptions - e.g. some non-DAG Markov models are arguably gears-level - but DAGs are the typical use case.

The obvious objection to this idea is "what about feedback loops?", and the answer is "it's still a causal DAG when you expand over time" - and that's exactly what gears-level understanding of a feedback loop requires. Same with undirected Markov models: they typically arise from DAG models with some of the nodes unobserved; a gears-level model hypothesizes what those hidden factors are.

The hospital example includes both of these: a feedback loop, with some nodes unobserved. But if you expand out the actual gears-level model, distinguishing between different people with different diseases at different times, then it all looks DAG-shaped; the observed data just doesn't include most of those nodes.

This generalizes: the physical world is always DAG-shaped, on a fundamental level. Everything else is an abstraction on top of that, and it can always be grounded in DAGs if needed.

The advantage of using causal DAGs for our model, even when most of the nodes are not observed, is that it tells us which things need to be included in the AND-clauses and which do not. For instance, "gear AND oval-shaped" vs "gear AND needs oil" - the idea that the second can be ignored "because that already follows from gears" is a fact which derives from DAG structure. For a large model, there's an exponential number of logical clauses which we could form; a DAG gives formal rules for which clauses are relevant to our analysis.
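The "expand over time" point can be checked mechanically. This is a minimal sketch with two hypothetical variables, x and y, that influence each other. It uses Python's standard `graphlib` module (3.9+), whose `TopologicalSorter` raises `CycleError` when the graph has a cycle.

```python
from graphlib import CycleError, TopologicalSorter

def is_dag(predecessors):
    """True if the graph (node -> set of direct causes) has no cycle."""
    try:
        list(TopologicalSorter(predecessors).static_order())
        return True
    except CycleError:
        return False

# Collapsed view of a feedback loop: x affects y and y affects x.
collapsed = {"x": {"y"}, "y": {"x"}}

def unroll(steps):
    """Time-expanded view: each variable at time t depends only on time t - 1."""
    graph = {}
    for t in range(1, steps):
        graph[("x", t)] = {("x", t - 1), ("y", t - 1)}
        graph[("y", t)] = {("x", t - 1), ("y", t - 1)}
    return graph

print(is_dag(collapsed))  # False
print(is_dag(unroll(5)))  # True
```

The loop only exists in the collapsed picture. Once each variable is indexed by time, every arrow points forward and the graph is acyclic.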

I made up this story:
In a company there have been head injuries, so they brought in a medical student to investigate/research.
The researcher gathered all employees' blood pressure, gender, age, and eyesight data.
The result was that mostly men were affected, with all other factors being what you would expect given the employees.
The company was forced by the insurance company to make helmets mandatory for all men due to their gender being a risk factor.
Because the engineers were all men they were disproportionately affected and did not like to wear the helmets, so they got together and demanded further research into what caused the injuries and how to remove the cause.
This time the secretary was tasked with the follow-up because she knew Excel. She took her mail scale and measuring tape and went around asking everyone if they drank coffee or tea, and measured the weight of the contents of people's pockets and how tall they were with and without shoes. To be thorough she did this for every weekday separately.
After importing the previous data, she found many correlations between attributes and other attributes' variances, but what stood out were the correlations of injuries with Friday, pocket weight, gender, and height with shoes, in ascending order. A histogram of injuries per "height without shoes" class showed a sharp increase at 6 feet. Being taller than 6' was clearly the cause.
After having presented her findings, one woman stood up and remarked: "But I am not 6', and it happened to me!" Counting the women taller than 6' the secretary found none.
I could go on, but I think you get it and we can save ourselves the time. After more searching they found that their 6-foot door frames were the best thing to change, and that some women had been wearing higher shoes on Fridays.
My point is that gender was not the cause; rather, "too low doors" AND ("over 6' tall" OR ("tall for a woman" AND "high shoes")) was the problem. Neither being a tall woman nor high shoes alone would have been causal in this scenario.
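Written out as code, the condition from the story looks like this. It is only a sketch: the function and parameter names are made up, and each argument is a plain yes/no flag.

```python
def hits_head(low_doors, over_six_feet, tall_for_a_woman, high_shoes):
    """The story's condition: low doors AND (over 6' OR (tall woman AND high shoes))."""
    return low_doors and (over_six_feet or (tall_for_a_woman and high_shoes))

# A man over 6' walking through the low doors.
print(hits_head(True, True, False, False))   # True
# A tall woman in flat shoes: no injury.
print(hits_head(True, False, True, False))   # False
# High shoes on someone who is not tall: no injury.
print(hits_head(True, False, False, True))   # False
# A tall woman in high shoes (the Friday case).
print(hits_head(True, False, True, True))    # True
# Anyone at all once the door frames are raised.
print(hits_head(False, True, True, True))    # False
```

Gender does not appear as an input at all. It only correlated with the outcome because it correlates with height.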
I would have loved to include wheelchairs in this but found it too complicated.