My Intuitions on Counterlogical Reasoning

DragonGod

Epistemic Status

This is the fruit of my thinking about counterlogical reasoning for several minutes. I expect there's nothing revolutionary here, and not much of import; this post is designed more to help me specify what I intuitively think counterlogical reasoning is like, with the hope of reflecting on those intuitions and refining them as I advance in mathematical maturity $^{*}$ .

There's also a part of me that was intrigued by the fact that formalising counterlogical reasoning is difficult even though humans reason counterlogically all the time, which is what motivated me to take a shot in the dark.

Acknowledgement

I want to thank Diffractor and Intent for taking the time to help me refine my intuitions on counterlogical reasoning. This post would not have materialised were it not for their feedback.

Introduction

Consider a deterministic agent $α$ which is faced with a decision problem in which it has two available actions $\{a_{1}, a_{2}\}$. For the particular problem $α$ is facing, the output of its decision algorithm is a mathematical fact (as the algorithm is deterministic). However, in deciding which action to pick, $α$ may engage in reasoning of the form "If I chose $a_{1}$, what would the consequences be, and how would they compare to the consequences were I to choose $a_{2}$?". Such reasoning is termed counterfactual, as we consider scenarios that are contrary to what is actually factual ($α$ will choose only one of $\{a_{1}, a_{2}\}$ and will not choose the other option). Furthermore, as the fact of interest in this case is a logical fact (the output of a deterministic algorithm is a fixed logical fact), it's a special kind of counterfactual problem called a counterlogical problem (borrowing Garrabant and Demski's terminology).

Getting counterlogical reasoning right is very important as humans routinely engage in such reasoning (it seems to be an intuitively desirable way to select actions or policies), yet it seems that it is something that is incredibly difficult to formalise. That humans seem to do sensible counterlogical reasoning makes me believe that sensible counterlogical reasoning is possible, so I reflected on that using myself as the subject of inquiry. In the rest of the post I will describe my findings/conclusions.

Logical Implication is Not Counterlogical Implication

I suppose it would help to define counterfactual implication.

$A$ counterfactually implies $B$ $(A □ \to B)$ says that if, counterfactually, $A$ were true, then $B$ would be true. This differs from logical (material) implication, under which a false proposition implies everything. Suppose I write an exam and don't cheat on it. It doesn't make sense to state that if I had counterfactually cheated on the exam then the universe would have been destroyed. However, if $U$ is a proposition that states "the universe has been destroyed", and if $T$ is a proposition that says "I did not cheat on my test", then given that we know $T$ to be true, $\neg T ⟹ U$ holds. Yet this doesn't seem to be an intuitively sensible way to imagine the world evolving had I counterfactually cheated on my test. As far as I'm aware, a satisfactory account of counterfactual reasoning is given by Pearl, and my intuitions for how to do counterlogical reasoning are based on his approach.

Counterlogical implication is just a special kind of counterfactual implication involving logical statements. Consider the following statements:

$A =$ "17 is a prime number".

$B =$ "2 + 2 = 4".

Logically, $A ⟹ B$ . However, if counterfactually, $A$ were false, it doesn't follow that $B$ is false (or that $B$ is true. More context/conditions would be needed to determine the counterfactual truth value of $B$ were $A$ false). At least, this is my intuition regarding the scenario.

Now consider another statement:

$C =$ "$2 \mid 17$".

$C$ is false, therefore $C ⟹ A$ holds. However, if $C$ were counterfactually true, it seems to me that $\neg A$ would be the consequence. That is, $C □ \to \neg A$. This is what my intuitions tell me regarding the counterfactual world where $C$ is true.
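The gap between material and counterlogical implication in the statements above can be checked mechanically. The following is a minimal sketch, not part of any formalism: the helper names `implies` and `is_prime` are hypothetical, introduced only for illustration. It shows that, because $C$ is false, $C$ materially implies both $A$ and $\neg A$, so material implication alone cannot say which of the two "would" follow.

```python
def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def is_prime(n: int) -> bool:
    """Trial division; adequate for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


A = is_prime(17)      # "17 is a prime number"
B = (2 + 2 == 4)      # "2 + 2 = 4"
C = (17 % 2 == 0)     # "2 | 17"

# A and B are both true, so A materially implies B.
a_implies_b = implies(A, B)

# C is false, so it materially implies A and also not-A (vacuous truth).
c_implies_a = implies(C, A)
c_implies_not_a = implies(C, not A)
```

Both `c_implies_a` and `c_implies_not_a` come out true, which is why something beyond material implication is needed to single out $C □ \to \neg A$.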

The Approach

The reason why I intuitively think that $(C □ \to \neg A)$ but not $(\neg A □ \to \neg B)$ is the relation between the propositions $C$ and $A$, and between the propositions $A$ and $B$. The primeness of $17$ is derived from the logical fact that it has only two natural number factors (namely $1$ and itself). If we didn't know whether $17$ was prime, and we discovered that it had a factor other than $1$ and itself, then we would believe that it is not prime (it should be pointed out that, while necessary, the above is not a sufficient condition for derivability). In this case, if we found out that $2$ is a factor of $17$, we would believe $17$ to be not prime. If we defined $D$ as "$2 \mid 17 \lor 3 \mid 17$", then $\neg A$ is derivable from $D$ $(D △ \to \neg A)$ (note that the value of $D$ is itself disjunctively dependent on the value of $C$. Alas, I shall not explore how conjunctive and disjunctive dependence of propositions affect counterlogical reasoning until I actually formalise it).

This notion of derivability (which I shall not specify here $^{* *}$ ) is central to my approach to counterlogical reasoning. When I say $X$ derives $Y$ $(X △ \to Y)$, what I'm saying is that the reason $Y$ is true is that $X$ is true: $Y$ is logically downstream from $X$; $Y$ depends on $X$. If $X △ \to Y$, and $X$ were made to be counterfactually false, then $Y$ would also have its truth value inverted (but not vice versa).

Properties of Derivability

Derivability is meant to capture the property of logical dependence. The following are some properties that a satisfactory notion of derivability should possess:

1. Irreflexive: $(\forall A, \neg (A △ \to A))$. Propositions don't derive themselves (one may contend that axioms aren't derived from other propositions, but the leap from that to them deriving themselves would need to be further substantiated. Furthermore, axioms deriving themselves may be undesirable, as it may cripple counterfactual reasoning about the axioms themselves).

2. Asymmetric: $(\forall A, B (A △ \to B ⟹ \neg (B △ \to A)))$. If $A$ derives $B$, then $B$ does not derive $A$. This can be thought of as preventing self-reference in the dependence of logical facts.

3. Transitive: $(\forall A, B, C ((A △ \to B \land B △ \to C) ⟹ A △ \to C))$. If $A$ derives $B$, and $B$ derives $C$, then $A$ derives $C$. This seems self-explanatory to me.
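The three properties above can be checked directly on any finite candidate relation. Below is a minimal sketch, assuming derivability is represented as a set of ordered pairs of proposition labels; the function names and the example relation are hypothetical and are not a proposed definition of derivability.

```python
from itertools import product


def is_irreflexive(rel):
    """No proposition derives itself."""
    return all(x != y for x, y in rel)


def is_asymmetric(rel):
    """If x derives y, then y does not derive x."""
    return all((y, x) not in rel for x, y in rel)


def is_transitive(rel):
    """If x derives y and y derives w, then x derives w."""
    return all(
        (x, w) in rel
        for (x, y), (z, w) in product(rel, rel)
        if y == z
    )


# Hypothetical example relation: C derives D, D derives not-A,
# and (by transitivity) C derives not-A.
derives = {("C", "D"), ("D", "not A"), ("C", "not A")}

# Dropping the pair ("C", "not A") breaks transitivity.
not_closed = {("C", "D"), ("D", "not A")}
```

A relation that is irreflexive and transitive is automatically asymmetric, so on finite examples like this one the second check is redundant; it is kept here only to mirror the list.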

Suppose you represented all logical facts as a directed acyclic graph (with each fact as a node), in which there was an edge from $A$ to $B$ iff $A △ \to B$. Then if you altered a particular node $X$ to counterfactually invert its truth value, all descendants of $X$ would have their truth value inverted. However, if $Y$ is not a descendant of $X$, then inverting the truth value of $X$ has an undefined effect on $Y$. To define that effect, I specify a principle below.

Minimal Change

For any alteration to the graph, only the descendants of the altered node have their truth value altered. All other nodes retain their truth values prior to the counterlogical surgery. The intuition behind minimal change is that a counterlogical should change only those facts dependent on the counterlogical itself, and not arbitrary unrelated facts (the same reasoning why we don't think me cheating on the test has any effect on whether the universe would be destroyed or not).
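The surgery described above can be sketched in a few lines. This is only an illustration under stated assumptions: the graph is given as an adjacency mapping, the edges between the example facts are hypothetical (they stand in for an unspecified derivability relation), and the node "17 is prime" is used in place of $\neg A$ since inverting either amounts to the same thing. The function names `descendants` and `counterlogical_surgery` are likewise made up for this sketch.

```python
from collections import deque


def descendants(edges, node):
    """Return every node reachable from `node` by following edges."""
    seen = set()
    queue = deque(edges.get(node, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(edges.get(current, []))
    return seen


def counterlogical_surgery(truth, edges, node):
    """Invert `node` and its descendants; leave every other fact as it was."""
    flipped = {node} | descendants(edges, node)
    return {
        fact: (not value) if fact in flipped else value
        for fact, value in truth.items()
    }


# Actual truth values of the example facts.
truth = {
    "2 | 17": False,
    "2 | 17 or 3 | 17": False,
    "17 is prime": True,
    "2 + 2 = 4": True,
}

# Assumed (hypothetical) derivability edges: C -> D -> A.
edges = {
    "2 | 17": ["2 | 17 or 3 | 17"],
    "2 | 17 or 3 | 17": ["17 is prime"],
}

counterlogical_world = counterlogical_surgery(truth, edges, "2 | 17")
```

In `counterlogical_world`, "2 | 17" and the disjunction become true, "17 is prime" becomes false, and "2 + 2 = 4" keeps its original value because it is not a descendant of the altered node. The blanket inversion of descendants ignores the conjunctive and disjunctive dependence mentioned earlier, so it should be read as the simplest version of the idea rather than a complete rule.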

One may observe that if $A ⟹ C$ but $\neg (A △ \to C)$, then counterlogically altering $A$ may lead to a contradiction. This apparent contradiction is rectified by noting that $A ⟹ C$ is itself derived from $A$ and $C$, so counterlogically altering $A$ may lead to an alteration in the truth value of "$A ⟹ C$", thus eliminating the contradiction.
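To make this concrete for one particular assignment of truth values, here is a small sketch. It assumes that $A$ and $C$ are both actually false (so "$A ⟹ C$" is vacuously true) and that "$A ⟹ C$" is treated as a node downstream of $A$; the names `implies` and `consistent` are hypothetical helpers. Other assignments are not covered, so this illustrates the idea rather than demonstrating that it always works.

```python
def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def consistent(world):
    """Check that the stored value of "A => C" matches A and C."""
    return world["A => C"] == implies(world["A"], world["C"])


# Actual world: A and C are false, so "A => C" holds vacuously.
actual = {"A": False, "C": False, "A => C": True}

# Naive alteration: flip A only. C is not a descendant of A, so it stays false,
# yet "A => C" is still asserted, which clashes with A being true.
naive = {**actual, "A": True}

# Surgery that treats "A => C" as derived from A: flip it together with A.
surgery = {**actual, "A": True, "A => C": False}
```

Here `consistent(actual)` and `consistent(surgery)` hold while `consistent(naive)` fails, matching the resolution described above for this assignment.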

Conclusions

Using the method I specified above, I think agents would be able to sensibly reason about the output of their decision algorithm and what it would mean if their decision algorithm output a certain decision instead of a certain other decision. However, I've not formalised it, so there may be errors/weaknesses in my approach that are not immediately apparent. I do expect that at the very least the above approach (even if flawed) would be helpful towards identifying a more robust and/or more effective theory of counterlogical reasoning.

Notes

$*$ : In particular, I intend to complete First Order Logic, Logic of Provability and Pearl's material on counterfactual reasoning before reviewing this post, so at earliest I expect to review it sometime next year.

$* *$ : It has been pointed out to me that the concept of entailment ("$A$ entails $B$" is synonymous with "there's a proof of $B$ from $A$") is a strong candidate for what I intuitively mean by "derives". I don't want to commit to any particular candidate concept for now, lest I blind myself to other potential solutions.

LESSWRONG