Epistemic Status

Been sitting in my drafts unattended to for a month. I'm publishing now so that I publish at all. I do not currently endorse this post or its follow up.



Discussions with others helped me refine/sharpen these ideas. I'm grateful for them.


(You may skip to "Closing Remarks" to get a high-level summary of the post.)


Counterfactual/counterlogical reasoning seems intuitively sensible as a decision procedure.

We can build deterministic algorithms that consider what "would" happen if they took one action or another. I will refer to such algorithms as "decision algorithms". 

Loosely speaking, a decision algorithm is one that takes in sensory data (observation history) as input and (if it halts) returns ("chooses") an action as its output.

Given a set of actions .

What does it mean for a decision algorithm to consider an action  if it will deterministically choose an action  and can't actually "choose" .

In what sense "could" it have chosen ?

Why is this possible?

What "Couldness" Feels Like From the Inside

I'm guessing that "couldness" flows from a paradox of self-reference?

A decision algorithm cannot (in general) determine ahead of time what its output (choice) is, until it actually returns its output (makes that choice). This can be proven by contradiction. 

Suppose that there's a subroutine  that gave the algorithm its deterministic choice.


 is the algorithm's percept sequence (sensory data/observation history) at the time of invocation.


Then after running D, the algorithm can pick  (or simply not halt).

("Can" here means that the algorithm may be constructed so that after invoking D it acts in a manner inconsistent with the output of D.)

This leads to a contradiction.

Thus, no such D can exist.

Thus, the algorithm cannot "know" its choice ahead of time.

(This is a limitation of decision algorithms in general. Some specific algorithms call a subroutine that determines their choice and then just return that choice.)


This inability of decision algorithms to know their (deterministic) choice ahead of time, is what "couldness" feels like from the inside.

From the perspective of the algorithm, it really can "choose" any 

What "Couldness" Feels Like From the Outside

What about from outside the algorithm? Isn't the deterministic choice of the algorithm already "known" ahead of time? If one knew the initial state of the universe, couldn't they tell what choice the algorithm was going to make ahead of time? If the choice of the algorithm is already determined, then in what sense "could" the algorithm have chosen otherwise?

Consider Chris Leong's example of "the student and the exam":

Suppose a student has a test on Friday. They are considering whether to study or go to the beach. Since the universe deterministic, they conclude that the outcome is already fixed and has been fixed since the start of time. Therefore, they figure out that they may as well not bother to study. Is there anything wrong with this reasoning?

I think that the reasoning is indeed flawed/problematic, and I'll explain why.


Undecidability of Decision Algorithms

The undecidability of the halting problem means that the deterministic output of arbitrary decision algorithms on arbitrary inputs is also undecidable (for if you knew the output of an algorithm, you could determine that it halted).


So, for decision algorithms, the only way to determine their choice for a given input is to run that algorithm with the particular input (if the algorithm halts, it'll halt with its choice, if it doesn't halt, then you're out of luck).

Even if someone knows the current (and all prior) state(s) of the universe, they cannot "know" the choice of arbitrary decision algorithms on arbitrary inputs without running said algorithms (or extensionally equivalent versions thereof), and even then, they can't truly "know" the choice, as the algorithm may simply not halt.


Extensional Equivalence

I mentioned above that it is possible to determine the output of algorithms by running extensionally equivalent algorithms.

Loosely speaking, an algorithm  and another , are extensionally equivalent if for all input .

Alternatively, the algorithms are extensionally equivalent if when represented as functions, they refer to the same function (i.e. they "compute" the same function).


So, for algorithms that do halt — naturally, you cannot know this — can you not just run an extensionally equivalent algorithm to determine the output of the decision algorithm?

Well, whether an arbitrary algorithm is extensionally equivalent to another arbitrary algorithm is itself also undecidable. Via Rice's theorem, any non-trivial semantic property of an algorithm (such as whether it is extensionally equivalent to some other algorithm) is also undecidable.

So, you can't even determine which extensionally equivalent algorithm to run.

Closing Remarks

We have a slew of impossibility results (for the general case) of arbitrary decision algorithms:

  • The choice of a decision algorithm is unknowable to itself
  • The choice of a decision algorithm is unknowable for other algorithms
    • Knowing the "choice" of a decision algorithm violates halting undecidability
    • Finding an extensionally equivalent algorithm violates Rice's theorem


In a sense then, the "choice" of decision algorithms is in principle "unknowable". Both from inside the algorithm itself, and from outside it.

Other agents cannot determine the choice of arbitrary decision algorithms on arbitrary inputs. If they want to know what "choice" an algorithm makes, they just have to run it and hope it halts.

This fundamental "unknowability" of the choice of a decision algorithm, is I think what "couldness" is.


Next: "Free Will in a Computational Universe"

New to LessWrong?

New Comment
1 comment, sorted by Click to highlight new comments since: Today at 12:19 PM

Via Rice's theorem, any non-trivial semantic property of an algorithm (such as whether it is extensionally equivalent to some other algorithm) is also undecidable.

This is about ability to answer such questions about all algorithms at once. There is no way to algorithmically tell if an arbitrary algorithm behaves the same as a given algorithm. But this is false for many specific algorithms, where it's possible to decide all sorts of properties, for example proving that two particular algorithms always behave the same.

And it might be false-in-practice for the algorithm that's running you. It's inconvenient when you can know how this algorithm behaves, since then you need to take that into account in what kinds of thinking can be enacted with it, but it can happen. The chicken rule is a way of making this unhappen, by acting contrary to any fact about your algorithm's behavior you figure out, so that you won't be in a position to figure out such facts, on pain of contradiction.