Possibility and Could-ness




Eliezer_Yudkowsky

This post is part of the Solution to "Free Will".
Followup to: Dissolving the Question, Causality and Moral Responsibility

Planning out upcoming posts, it seems to me that I do, in fact, need to talk about the word could, as in, "But I could have decided not to rescue that toddler from the burning orphanage."

Otherwise, I will set out to talk about Friendly AI, one of these days, and someone will say:  "But it's a machine; it can't make choices, because it couldn't have done anything other than what it did."

So let's talk about this word, "could".  Can you play Rationalist's Taboo against it?  Can you talk about "could" without using synonyms like "can" and "possible"?

Let's talk about this notion of "possibility".  I can tell, to some degree, whether a world is actual or not actual; what does it mean for a world to be "possible"?

I know what it means for there to be "three" apples on a table.  I can verify that experimentally; I know what state of the world corresponds to it.  What does it mean to say that there "could" have been four apples, or "could not" have been four apples?  Can you tell me what state of the world corresponds to that, and how to verify it?  Can you do it without saying "could" or "possible"?

I know what it means for you to rescue a toddler from the orphanage.  What does it mean for you to could-have-not done it?  Can you describe the corresponding state of the world without "could", "possible", "choose", "free", "will", "decide", "can", "able", or "alternative"?

One last chance to take a stab at it, if you want to work out the answer for yourself...

Some of the first Artificial Intelligence systems ever built, were trivially simple planners.  You specify the initial state, and the goal state, and a set of actions that map states onto states; then you search for a series of actions that takes the initial state to the goal state.

Modern AI planners are a hell of a lot more sophisticated than this, but it's amazing how far you can get by understanding the simple math of everything.  There are a number of simple, obvious strategies you can use on a problem like this.  All of the simple strategies will fail on difficult problems; but you can take a course in AI if you want to talk about that part.

There's backward chaining:  Searching back from the goal, to find a tree of states such that you know how to reach the goal from them.  If you happen upon the initial state, you're done.
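To make backward chaining concrete, here is a minimal Python sketch.  It assumes a toy world in which states are strings and each action is a (name, state_before, state_after) triple.  The intermediate states S1 and S2, the action names, and the function backward_chain are all hypothetical, made up for illustration.

```python
from collections import deque

# Hypothetical toy actions: (name, state_before, state_after).
ACTIONS = [
    ("a1", "START", "S1"),
    ("a2", "S1", "GOAL"),
    ("a3", "START", "S2"),
]

def backward_chain(start, goal, actions):
    """Search back from the goal; return a list of action names, or None."""
    # plan_from[s] is a known sequence of actions leading from s to the goal.
    plan_from = {goal: []}
    frontier = deque([goal])
    while frontier:
        state = frontier.popleft()
        if state == start:
            return plan_from[state]  # happened upon the initial state: done
        for name, before, after in actions:
            if after == state and before not in plan_from:
                plan_from[before] = [name] + plan_from[state]
                frontier.append(before)
    return None  # the tree stopped growing without touching the start

plan = backward_chain("START", "GOAL", ACTIONS)
```

Every state in plan_from is one from which the way to the goal is already known, which is exactly the tree described above.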

There's forward chaining:  Searching forward from the start, to grow a tree of states such that you know how to reach them from the initial state.  If you happen upon the goal state, you're done.

Or if you want a slightly less simple algorithm, you can start from both ends and meet in the middle.
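A rough sketch of the meet-in-the-middle idea, under the same kind of toy assumptions: states are strings, actions are (name, state_before, state_after) triples, and every name below (S1, S2, DEAD, meet_in_middle) is hypothetical.  It grows one tree from each end, one layer at a time, and splices the two half-plans together as soon as the trees share a state.

```python
# Hypothetical toy actions: (name, state_before, state_after).
ACTIONS = [
    ("a1", "START", "S1"),
    ("a2", "S1", "S2"),
    ("a3", "S2", "GOAL"),
    ("a4", "START", "DEAD"),
]

def meet_in_middle(start, goal, actions):
    """Alternate forward and backward growth until the two trees overlap."""
    plan_to = {start: []}    # known actions leading from start to each state
    plan_from = {goal: []}   # known actions leading from each state to goal
    fwd, bwd = [start], [goal]
    while fwd or bwd:
        # If some state is in both trees, join the two half-plans.
        for state in plan_to:
            if state in plan_from:
                return plan_to[state] + plan_from[state]
        next_fwd, next_bwd = [], []
        for state in fwd:
            for name, before, after in actions:
                if before == state and after not in plan_to:
                    plan_to[after] = plan_to[state] + [name]
                    next_fwd.append(after)
        for state in bwd:
            for name, before, after in actions:
                if after == state and before not in plan_from:
                    plan_from[before] = [name] + plan_from[state]
                    next_bwd.append(before)
        fwd, bwd = next_fwd, next_bwd
    return None  # both trees stopped growing without meeting

plan = meet_in_middle("START", "GOAL", ACTIONS)
```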

Let's talk about the forward chaining algorithm for a moment.

Here, the strategy is to keep an ever-growing collection of states that you know how to reach from the START state, via some sequence of actions and (chains of) consequences.  Call this collection the "reachable from START" states; or equivalently, label all the states in the collection "reachable from START".  If this collection ever swallows the GOAL state - if the GOAL state is ever labeled "reachable from START" - you have a plan.
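As a minimal sketch of that labeling process, assuming a toy world where states are strings and actions are (name, state_before, state_after) triples (S1, S2, the action names, and forward_chain are hypothetical, not taken from any real planner):

```python
# Hypothetical toy actions: (name, state_before, state_after).
ACTIONS = [
    ("a1", "START", "S1"),
    ("a2", "S1", "GOAL"),
    ("a3", "START", "S2"),
]

def forward_chain(start, goal, actions):
    """Grow the "reachable from START" collection until it swallows the goal."""
    # reachable[s] is a known sequence of actions leading from start to s.
    reachable = {start: []}
    frontier = [start]
    while frontier:
        if goal in reachable:
            return reachable[goal]  # the goal got labeled: we have a plan
        state = frontier.pop()
        for name, before, after in actions:
            if before == state and after not in reachable:
                reachable[after] = reachable[state] + [name]
                frontier.append(after)
    return reachable.get(goal)  # None if the goal was never labeled

plan = forward_chain("START", "GOAL", ACTIONS)
```

Note that the dictionary reachable is a record kept by the planner.  Nothing in the toy world itself carries a "reachable" tag.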

"Reachability" is a transitive property.  If B is reachable from A, and C is reachable from B, then C is reachable from A.  If you know how to drive from San Jose to San Francisco, and from San Francisco to Berkeley, then you know a way to drive from San Jose to Berkeley.  (It may not be the shortest way, but you know a way.)
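The transitivity rule can be spelled out as a short sketch.  Here the known routes are assumed to be stored as (from, to) pairs, and the function name transitive_closure is just an illustrative choice.

```python
def transitive_closure(known_routes):
    """Repeatedly apply: if A->B and B->C are known, then A->C is known."""
    reach = set(known_routes)
    changed = True
    while changed:
        changed = False
        for a, b in list(reach):
            for c, d in list(reach):
                if b == c and (a, d) not in reach:
                    reach.add((a, d))
                    changed = True
    return reach

routes = {
    ("San Jose", "San Francisco"),
    ("San Francisco", "Berkeley"),
}
reach = transitive_closure(routes)
```

The closure contains ("San Jose", "Berkeley") but not ("Berkeley", "San Jose"): transitivity gives you chains, not return trips.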

If you've ever looked over a game-problem and started collecting states you knew how to achieve - looked over a maze, and started collecting points you knew how to reach from START - then you know what "reachability" feels like.  It feels like, "I can get there."  You might or might not be able to get to the GOAL from San Francisco - but at least you know you can get to San Francisco.

You don't actually run out and drive to San Francisco.  You'll wait, and see if you can figure out how to get from San Francisco to GOAL.  But at least you could go to San Francisco any time you wanted to.

(Why would you want to go to San Francisco?  If you figured out how to get from San Francisco to GOAL, of course!)

Human beings cannot search through millions of possibilities one after the other, like an AI algorithm.  But - at least for now - we are often much more clever about which possibilities we do search.

One of the things we do that current planning algorithms don't do (well), is rule out large classes of states using abstract reasoning.  For example, let's say that your goal (or current subgoal) calls for you to cover at least one of these boards using domino 2-tiles.

[Figure: three boards, each with one cell missing]

The black square is a missing cell; this leaves 24 cells to be covered with 12 dominos.

You might just dive into the problem, and start trying to cover the first board using dominos - discovering new classes of reachable states:

[Figure: the first board partially covered with dominos]

However, you will find after a while that you can't seem to reach a goal state.  Should you move on to the second board, and explore the space of what's reachable there?

But I wouldn't bother with the second board either, if I were you.  If you construct this coloring of the boards:

[Figure: the boards with their squares colored grey and yellow]

Then you can see that every domino has to cover one grey and one yellow square.  And only the third board has equal numbers of grey and yellow squares.  So no matter how clever you are with the first and second board, it can't be done.
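The counting argument is easy to check mechanically.  The sketch below does not reproduce the boards in the figures.  It assumes a hypothetical 5x5 grid with one missing cell (24 cells, as above) and a checkerboard coloring in which a cell's color is the parity of row + column.  Since each domino covers one cell of each color, unequal counts rule out a full covering.  Equal counts are only a necessary condition, not a guarantee.

```python
def color_counts(rows, cols, missing):
    """Count cells of each checkerboard color, skipping the missing cell."""
    counts = [0, 0]
    for r in range(rows):
        for c in range(cols):
            if (r, c) != missing:
                counts[(r + c) % 2] += 1
    return tuple(counts)

def passes_parity_check(rows, cols, missing):
    """False means no domino covering exists; True means keep searching."""
    first_color, second_color = color_counts(rows, cols, missing)
    return first_color == second_color

# Hypothetical boards: a 5x5 grid with a different missing cell each time.
corner_missing = passes_parity_check(5, 5, (0, 0))
edge_missing = passes_parity_check(5, 5, (0, 1))
```

On the full 5x5 grid the two colors occur 13 and 12 times, so removing a cell of the majority color balances the counts, and removing a cell of the minority color leaves 13 against 11.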

With one fell swoop of creative abstract reasoning - we constructed the coloring, it was not given to us - we've cut down our search space by a factor of three.  We've reasoned out that the reachable states involving dominos placed on the first and second board, will never include a goal state.

Naturally, one characteristic that rules out whole classes of states in the search space, is if you can prove that the state itself is physically impossible.  If you're looking for a way to power your car without all that expensive gasoline, it might seem like a brilliant idea to have a collection of gears that would turn each other while also turning the car's wheels - a perpetual motion machine of the first kind.  But because it is a theorem that this is impossible in classical mechanics, we know that every clever thing we can do with classical gears will not suffice to build a perpetual motion machine.  It is as impossible as covering the first board with classical dominos.  So it would make more sense to concentrate on new battery technologies instead.

Surely, what is physically impossible cannot be "reachable"... right?  I mean, you would think...

Oh, yeah... about that free will thing.

So your brain has a planning algorithm - not a deliberate algorithm that you learned in school, but an instinctive planning algorithm.  For all the obvious reasons, this algorithm keeps track of which states have known paths from the start point.  I've termed this label "reachable", but the way the algorithm feels from inside, is that it just feels like you can do it.  Like you could go there any time you wanted.

And what about actions?  They're primitively labeled as reachable; all other reachability is transitive from actions by consequences.  You can throw a rock, and if you throw a rock it will break a window, therefore you can break a window.  If you couldn't throw the rock, you wouldn't be able to break the window.

Don't try to understand this in terms of how it feels to "be able to" throw a rock.  Think of it in terms of a simple AI planning algorithm.  Of course the algorithm has to treat the primitive actions as primitively reachable.  Otherwise it will have no planning space in which to search for paths through time.
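In code, the point looks something like the following sketch.  Primitive actions start out labeled reachable, and the label then spreads along known consequences.  The data layout and the name reachable_states are assumptions made for illustration, and the "alarm sounds" consequence is an invented extra link to show a longer chain.

```python
# Hypothetical knowledge: which events follow from which.
CONSEQUENCES = {
    "throw rock": ["window breaks"],
    "window breaks": ["alarm sounds"],  # invented extra link
}

def reachable_states(primitive_actions, consequences):
    """Label primitive actions reachable, then spread the label to effects."""
    reachable = set(primitive_actions)  # primitively reachable, no proof needed
    frontier = list(primitive_actions)
    while frontier:
        cause = frontier.pop()
        for effect in consequences.get(cause, []):
            if effect not in reachable:
                reachable.add(effect)
                frontier.append(effect)
    return reachable

with_rock = reachable_states({"throw rock"}, CONSEQUENCES)
without_rock = reachable_states(set(), CONSEQUENCES)
```

With no primitive actions, the reachable set comes back empty: there is no planning space left to search.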

And similarly, there's an internal algorithmic label for states that have been ruled out:

worldState.possible == 0

So when people hear that the world is deterministic, they translate that into:  "All actions except one are impossible."  This seems to contradict their feeling of being free to choose any action.  The notion of physics following a single line, seems to contradict their perception of a space of possible plans to search through.

The representations in our cognitive algorithms do not feel like representations; they feel like the way the world is.  If your mind constructs a search space of states that would result from the initial state given various actions, it will feel like the search space is out there, like there are certain possibilities.

We've previously discussed how probability is in the mind.  If you are uncertain about whether a classical coin has landed heads or tails, that is a fact about your state of mind, not a property of the coin.  The coin itself is either heads or tails.  But people forget this, and think that coin.probability == 0.5, which is the Mind Projection Fallacy: treating properties of the mind as if they were properties of the external world.

So I doubt it will come as any surprise to my longer-abiding readers, if I say that possibility is also in the mind.

What concrete state of the world - which quarks in which positions - corresponds to "There are three apples on the table, and there could be four apples on the table"?  Having trouble answering that?  Next, say how that world-state is different from "There are three apples on the table, and there couldn't be four apples on the table."  And then it's even more trouble, if you try to describe could-ness in a world in which there are no agents, just apples and tables.  This is a Clue that could-ness and possibility are in your map, not directly in the territory.

What is could-ness, in a state of the world?  What are can-ness and able-ness?  They are what it feels like to have found a chain of actions which, if you output them, would lead from your current state to the could-state.

But do not say, "I could achieve X".  Say rather, "I could reach state X by taking action Y, if I wanted".  The key phrase is "if I wanted".  I could eat that banana, if I wanted.  I could step off that cliff there - if, for some reason, I wanted to.

Where does the wanting come from?  Don't think in terms of what it feels like to want, or decide something; try thinking in terms of algorithms.  For a search algorithm to output some particular action - choose - it must first carry out a process where it assumes many possible actions as having been taken, and extrapolates the consequences of those actions.

Perhaps this algorithm is "deterministic", if you stand outside Time to say it.  But you can't write a decision algorithm that works by just directly outputting the only action it can possibly output.  You can't save on computing power that way.  The algorithm has to assume many different possible actions as having been taken, and extrapolate their consequences, and then choose an action whose consequences match the goal.  (Or choose the action whose probabilistic consequences rank highest in the utility function, etc.  And not all planning processes work by forward chaining, etc.)

You might imagine the decision algorithm as saying:  "Suppose the output of this algorithm were action A, then state X would follow.  Suppose the output of this algorithm were action B, then state Y would follow."  This is the proper cashing-out of could, as in, "I could do either X or Y."  Having computed this, the algorithm can only then conclude:  "Y ranks above X in the Preference Ordering.  The output of this algorithm is therefore B.  Return B."

The algorithm, therefore, cannot produce an output without extrapolating the consequences of itself producing many different outputs.  All but one of the outputs being considered are counterfactual; but which output is the factual one cannot be known to the algorithm until it has finished running.
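Here is a toy version of such a decision algorithm, offered as a sketch rather than as a model of any real planner.  The names decide, predict, and rank are hypothetical; predict stands in for "extrapolate the consequences" and rank for the Preference Ordering.

```python
def decide(actions, predict, rank):
    """Extrapolate every candidate output, then return the best-ranked one."""
    extrapolated = {}
    for action in actions:
        # "Suppose the output of this algorithm were `action`..."
        extrapolated[action] = predict(action)
    # Only after every supposition has been run can the output be known.
    return max(actions, key=lambda action: rank(extrapolated[action]))

# Hypothetical toy world: action A leads to state X, action B to state Y.
CONSEQUENCES = {"A": "X", "B": "Y"}
PREFERENCE = {"X": 1, "Y": 2}  # Y ranks above X

choice = decide(["A", "B"], CONSEQUENCES.get, PREFERENCE.get)
```

The function is perfectly deterministic, and yet it has no way to return B without first computing what would follow from A.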

A bit tangled, eh?  No wonder humans get confused about "free will".

You could eat the banana, if you wanted.  And you could jump off a cliff, if you wanted.  These statements are both true, though you are rather more likely to want one than the other.

You could even flatly say, "I could jump off a cliff" and regard this as true - if you construe could-ness according to reachability, and count actions as primitively reachable.  But this does not challenge deterministic physics; you will either end up wanting to jump, or not wanting to jump.

The statement, "I could jump off the cliff, if I chose to" is entirely compatible with "It is physically impossible that I will jump off that cliff".  It need only be physically impossible for you to choose to jump off a cliff - not physically impossible for any simple reason, perhaps, just a complex fact about what your brain will and will not choose.

Defining things appropriately, you can even endorse both of the statements:

  • "I could jump off the cliff" is true from my point-of-view
  • "It is physically impossible for me to jump off the cliff" is true for all observers, including myself

How can this happen?  It can happen if all of an agent's actions are primitive-reachable from that agent's point-of-view, but the agent's decision algorithm is so constituted as to never choose to jump off a cliff.

You could even say that "could" for an action is always defined relative to the agent who takes that action, in which case I can simultaneously make the following two statements:

  • NonSuicidalGuy could jump off the cliff.
  • It is impossible that NonSuicidalGuy will hit the ground.

If that sounds odd, well, no wonder people get confused about free will!

But you would have to be very careful to use a definition like that one consistently.  "Could" has another closely related meaning in which it refers to the provision of at least a small amount of probability.  This feels similar, because when you're evaluating actions that you haven't yet ruled out taking, then you will assign at least a small probability to actually taking those actions - otherwise you wouldn't be investigating them.  Yet "I could have a heart attack at any time" and "I could have a heart attack any time I wanted to" are not the same usage of could, though they are confusingly similar.

You can only decide by going through an intermediate state where you do not yet know what you will decide.  But the map is not the territory.  It is not required that the laws of physics be random about that which you do not know.  Indeed, if you were to decide randomly, then you could scarcely be said to be in "control".  To determine your decision, you need to be in a lawful world.

It is not required that the lawfulness of reality be disrupted at that point, where there are several things you could do if you wanted to do them; but you do not yet know their consequences, or you have not finished evaluating the consequences; and so you do not yet know which thing you will choose to do.

A blank map does not correspond to a blank territory.  Not even an agonizingly uncertain map corresponds to an agonizingly uncertain territory.

(Next in the free will solution sequence is "The Ultimate Source", dealing with the intuition that we have some chooser-faculty beyond any particular desire or reason.  As always, the interested reader is advised to first consider this question on their own - why would it feel like we are more than the sum of our impulses?)