Related to: Embedded Decisions.

Epistemic Status

This is the fruit of me thinking on counterlogical reasoning for several minutes. I expect there's nothing revolutionary here, and not much of import; this post is designed more to help me specify what I intuitively think counterlogical reasoning is like, with the hope of reflecting on those intuitions and refining them as I advance in mathematical maturity.

There's also a part of me that was intrigued by the fact that formalising counterlogical reasoning is difficult even though humans reason counterlogically all the time, which is what motivated me to take a shot in the dark.


I want to thank Diffractor and Intent for taking the time to help me refine my intuitions on counterlogical reasoning. This post would not have materialised were it not for their feedback.


Consider a deterministic agent $\mathcal{A}$ which is faced with a decision problem in which it has two available actions, $a_1$ and $a_2$. For the particular problem $\mathcal{A}$ is facing, the output of its decision algorithm is a mathematical fact (as the algorithm is deterministic). However, in deciding which action to pick, $\mathcal{A}$ may engage in reasoning of the form "If I chose $a_1$, what would the consequences be, and how would they compare to the consequences were I to choose $a_2$?". Such reasoning is termed counterfactual, as we consider scenarios that are contrary to what is actually factual ($\mathcal{A}$ will choose only one of $a_1$ and $a_2$, and will not choose the other option). Furthermore, as the fact of interest in this case is a logical fact (the output of a deterministic algorithm is a fixed logical fact), it's a special kind of counterfactual problem called a counterlogical problem (borrowing Garrabrant and Demski's terminology).

Getting counterlogical reasoning right is very important as humans routinely engage in such reasoning (it seems to be an intuitively desirable way to select actions or policies), yet it seems that it is something that is incredibly difficult to formalise. That humans seem to do sensible counterlogical reasoning makes me believe that sensible counterlogical reasoning is possible, so I reflected on that using myself as the subject of inquiry. In the rest of the post I will describe my findings/conclusions.

Logical Implication is Not Counterlogical Implication

I suppose it would help to define counterfactual implication.

$P \rightsquigarrow Q$ says that if counterfactually $P$ were true, then $Q$ would be true. This is different from logical implication, in which a false proposition implies everything. Suppose I write an exam and don't cheat on the exam. It doesn't make sense to state that if I had counterfactually cheated on the exam then the universe would have been destroyed. However, if $Q$ is a proposition that states "the universe has been destroyed", and if $P$ is a proposition that says "I did not cheat on my test", then given that we know $P$ to be true, $\neg P \to Q$. Yet, this doesn't seem to be an intuitively sensible way to imagine how the world would evolve had I counterfactually cheated on my test. As far as I'm aware, a satisfactory account of counterfactual reasoning is given by Pearl, and my intuitions for how to do counterlogical reasoning are based on his approach.
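The point about logical implication can be checked mechanically. The following is a minimal sketch in Python; the helper `implies` and the variable names are purely illustrative and not part of any formalism proposed here. It only shows that a false antecedent materially implies any consequent, which is why logical implication is a poor stand-in for counterfactual implication.

```python
def implies(p: bool, q: bool) -> bool:
    """Material (logical) implication: false only when p is true and q is false."""
    return (not p) or q


did_not_cheat = True            # what actually happened
cheated = not did_not_cheat     # the false antecedent

# A false antecedent materially implies every consequent, true or false alike.
vacuous = [implies(cheated, universe_destroyed)
           for universe_destroyed in (True, False)]
print(vacuous)
```

Both entries come out true, so material implication alone cannot tell "the universe would have been destroyed" apart from "the universe would have been fine".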

Counterlogical implication is just a special kind of counterfactual implication involving logical statements. Consider the following statements:

$A$: "17 is a prime number".


Logically, $\neg A \to \neg B$. However, if counterfactually $A$ were false, it doesn't follow that $B$ is false (or that $B$ is true; more context/conditions would be needed to determine the counterfactual truth value of $B$ if $A$ were false). At least, this is my intuition regarding the scenario.

Now consider another statement:


$C$ is false, therefore $C \to A$; however, counterfactually, if $C$ were true it seems to me that $\neg A$ would be the consequence. That is, $C \rightsquigarrow \neg A$. That is what my intuitions tell me regarding the counterfactual world where $C$ is true.

The Approach

The reason why I intuitively think that the counterfactual implication $C \rightsquigarrow \neg A$ holds but $\neg A \rightsquigarrow \neg B$ does not is the relation between the propositions $C$ and $A$, and the propositions $A$ and $B$. The primeness of $17$ is derived from the logical fact of it having only two natural number factors (namely $1$ and itself). If we didn't know whether $17$ was prime, and we discovered that it did have a factor other than $1$ and itself, then we would believe that it is not prime (it should be pointed out that, while necessary, the above is not a sufficient condition for derivability). In this case, if we found out that some such number is a factor of $17$, we would believe $17$ to be not prime. If we defined $D$ as "$17$ has a factor other than $1$ and itself", then $\neg A$ is derivable from $D$ (note that the value of $D$ is itself disjunctively dependent on the value of $C$. Alas, I shall not explore conjunctive and disjunctive dependence of propositions as they affect counterlogical reasoning until I actually formalise it).

This notion of derivability (which I shall not specify here) is central to my approach to counterlogical reasoning. When I say $X$ derives $Y$, what I'm saying is that the reason $Y$ is true is that $X$ is true: $Y$ is logically downstream from $X$; $Y$ depends on $X$. If $X$ derives $Y$, and $X$ were made to be counterfactually false, then $Y$ would also have its truth value inverted (but not vice versa).

Properties of Derivability

Derivability is meant to capture the property of logical dependence. Writing $X \triangleright Y$ for "$X$ derives $Y$", the following are some properties that a satisfactory notion of derivability should possess:

1. Irreflexive: $\neg(X \triangleright X)$. Propositions don't derive themselves (one may contend that axioms aren't derived from other propositions, but the leap from that to them deriving themselves would need to be further substantiated. Furthermore, axioms deriving themselves may be undesirable, as it may cripple counterfactual reasoning about the axioms themselves).

2. Asymmetric: $(X \triangleright Y) \to \neg(Y \triangleright X)$. If $X$ derives $Y$, then $Y$ does not derive $X$. This can be thought of as preventing self-reference in the dependence of logical facts.

3. Transitive: $(X \triangleright Y) \land (Y \triangleright Z) \to (X \triangleright Z)$. If $X$ derives $Y$, and $Y$ derives $Z$, then $X$ derives $Z$. This seems to be self-explanatory to me.
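The three properties above can be tested on any finite candidate relation. Here is a rough sketch in Python, assuming the relation is given as a set of (premise, conclusion) pairs over placeholder facts named "X", "Y" and "Z"; the function names are hypothetical, and taking the transitive closure of some "direct" dependencies is my own simplification rather than a definition of derivability.

```python
from itertools import product


def transitive_closure(edges):
    """Smallest transitive relation containing the given (premise, conclusion) pairs."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_irreflexive(rel):
    return all(a != b for a, b in rel)


def is_asymmetric(rel):
    return all((b, a) not in rel for a, b in rel)


def is_transitive(rel):
    return all((a, d) in rel
               for (a, b), (c, d) in product(rel, repeat=2) if b == c)


# Hypothetical direct dependencies between placeholder facts.
derives = transitive_closure({("X", "Y"), ("Y", "Z")})
# A circular dependency, which the properties are meant to rule out.
circular = transitive_closure({("X", "Y"), ("Y", "X")})

print(is_irreflexive(derives), is_asymmetric(derives), is_transitive(derives))
print(is_irreflexive(circular), is_asymmetric(circular))
```

The acyclic example passes all three checks, while the circular one fails both irreflexivity and asymmetry once transitivity is enforced, which is one way to see why the properties fit together with a directed acyclic graph.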

Suppose you represented all logical facts as a directed acyclic graph (with each fact as a node), in which there was an edge from $X$ to $Y$ whenever $X$ derives $Y$. Then if you altered a particular node $X$ to counterfactually invert its truth value, all descendants of $X$ would have their truth value inverted. However, if $Z$ is not a descendant of $X$, then inverting the truth value of $X$ has an undefined effect on $Z$. To define that effect, I specify a principle below.

Minimal Change

For any alteration to the graph, only the descendants of the altered node have their truth value altered. All other nodes retain their truth values prior to the counterlogical surgery. The intuition behind minimal change is that a counterlogical should change only those facts dependent on the counterlogical itself, and not arbitrary unrelated facts (the same reasoning why we don't think me cheating on the test has any effect on whether the universe would be destroyed or not).
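A toy version of this surgery can be sketched in Python. Everything named here (the functions `descendants` and `counterlogical_surgery`, and the facts "X", "Y", "Z" and "W") is hypothetical, and inverting every descendant wholesale follows the simplification used in this post; it does not model conjunctive or disjunctive dependence.

```python
from collections import deque


def descendants(graph, node):
    """All nodes reachable from `node` along derivation edges."""
    seen = set()
    queue = deque(graph.get(node, ()))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(graph.get(current, ()))
    return seen


def counterlogical_surgery(graph, truth, node):
    """Invert `node` and its descendants; every other fact keeps its value."""
    flipped = {node} | descendants(graph, node)
    return {fact: (not value) if fact in flipped else value
            for fact, value in truth.items()}


# Hypothetical facts: X derives Y, Y derives Z, and W is unrelated to all three.
graph = {"X": ["Y"], "Y": ["Z"], "W": []}
truth = {"X": True, "Y": True, "Z": True, "W": True}

after_x = counterlogical_surgery(graph, truth, "X")
after_y = counterlogical_surgery(graph, truth, "Y")
print(after_x)
print(after_y)
```

Altering "Y" leaves its ancestor "X" untouched as well as the unrelated "W", which is the asymmetry ("but not vice versa") that derivability is supposed to provide.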

One may observe that if $X \to Y$ but $X$ does not derive $Y$, then counterlogically altering $X$ may lead to a contradiction. This apparent contradiction is rectified by noting that $X \to Y$ is itself derived from $X$ and $Y$, so counterlogically altering $X$ may lead to an alteration in the truth value of $X \to Y$, thus eliminating the contradiction.


Using the method I specified above, I think agents would be able to sensibly reason about the output of their decision algorithm and what it would mean if their decision algorithm output a certain decision instead of a certain other decision. However, I've not formalised it, so there may be errors/weaknesses in my approach that are not immediately apparent. I do expect that at the very least the above approach (even if flawed) would be helpful towards identifying a more robust and/or more effective theory of counterlogical reasoning.


[1] In particular, I intend to complete First Order Logic, Logic of Provability and Pearl's material on counterfactual reasoning before reviewing this post, so at the earliest I expect to review it sometime next year.

[2] It has been pointed out to me that the concept of entailment ("$A$ entails $B$" is synonymous with "there's a proof of $B$ from $A$") is a strong candidate for what I intuitively mean by "derives". I don't want to commit to any particular candidate concept for now, lest I blind myself to other potential solutions.

