Abstracted Idealized Dynamics



Eliezer_Yudkowsky

Followup to: Morality as Fixed Computation

I keep trying to describe morality as a "computation", but people don't stand up and say "Aha!"

Pondering the surprising inferential distances that seem to be at work here, it occurs to me that when I say "computation", some of my listeners may not hear the Word of Power that I thought I was emitting; but, rather, may think of some complicated boring unimportant thing like Microsoft Word.

Maybe I should have said that morality is an abstracted idealized dynamic.  This might not have meant anything to start with, but at least it wouldn't sound like I was describing Microsoft Word.

How, oh how, am I to describe the awesome import of this concept, "computation"?

Perhaps I can display the inner nature of computation, in its most general form, by showing how that inner nature manifests in something that seems very unlike Microsoft Word—namely, morality.

Consider certain features we might wish to ascribe to that-which-we-call "morality", or "should" or "right" or "good":

• It seems that we sometimes think about morality in our armchairs, without further peeking at the state of the outside world, and arrive at some previously unknown conclusion.

Someone sees a slave being whipped, and it doesn't occur to them right away that slavery is wrong.  But they go home and think about it, and imagine themselves in the slave's place, and finally think, "No."

Can you think of anywhere else that something like this happens?

Suppose I tell you that I am making a rectangle of pebbles.  You look at the rectangle, and count 19 pebbles on one side and 103 pebbles on the other side.  You don't know right away how many pebbles there are.  But you go home to your living room, and draw the blinds, and sit in your armchair and think; and without further looking at the physical array, you come to the conclusion that the rectangle contains 1957 pebbles.

Now, I'm not going to say the word "computation".  But it seems like that-which-is "morality" should have the property of latent development of answers—that you may not know right away, everything that you have sufficient in-principle information to know.  All the ingredients are present, but it takes additional time to bake the pie.

You can specify a Turing machine of 6 states and 2 symbols that unfolds into a string of 4.6 × 10^1439 1s after 2.5 × 10^2879 steps.  A machine I could describe aloud in ten seconds, runs longer and produces a larger state than the whole observed universe to date.

When you distinguish between the program description and the program's executing state, between the process specification and the final outcome, between the question and the answer, you can see why even certainty about a program description does not imply human certainty about the executing program's outcome.  See also Artificial Addition on the difference between a compact specification versus a flat list of outputs.
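The gap between a program description and its executing state shows up even at toy scale. What follows is a minimal sketch in Python, assuming the standard 2-state, 2-symbol "busy beaver" transition table rather than the 6-state machine mentioned above, which is far beyond direct simulation. The names `RULES` and `run` are illustrative only.

```python
# Transition table: (state, symbol read) -> (symbol to write, head move, next state).
# Assumption: this is the 2-state, 2-symbol busy beaver; "H" is the halting state.
RULES = {
    ("A", 0): (1, +1, "B"),
    ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
    ("B", 1): (1, +1, "H"),
}

def run(rules, max_steps=10_000):
    """Execute the machine on a blank tape; return (steps taken, number of 1s left)."""
    tape = {}  # sparse tape: position -> symbol, unwritten cells read as 0
    position, state, steps = 0, "A", 0
    while state != "H" and steps < max_steps:
        write, move, state = rules[(state, tape.get(position, 0))]
        tape[position] = write
        position += move
        steps += 1
    return steps, sum(tape.values())

steps, ones = run(RULES)
print(steps, ones)  # 6 4
```

The four-line table is the whole description; the step count and the tape contents only become known by unfolding it.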

Morality, likewise, is something that unfolds, through arguments, through discovery, through thinking; from a bounded set of intuitions and beliefs that animate our initial states, to a potentially much larger set of specific moral judgments we may have to make over the course of our lifetimes.

• When two human beings both think about the same moral question, even in a case where they both start out uncertain of the answer, it is not unknown for them to come to the same conclusion.  It seems to happen more often than chance alone would allow—though the biased focus of reporting and memory is on the shouting and the arguments.  And this is so, even if both humans remain in their armchairs and do not peek out the living-room blinds while thinking.

Where else does this happen?  It happens when trying to guess the number of pebbles in a rectangle of sides 19 and 103.  Now this does not prove by Greek analogy that morality is multiplication.  If A has property X and B has property X it does not follow that A is B.  But it seems that morality ought to have the property of expected agreement about unknown latent answers, which, please note, generally implies that similar questions are being asked in different places.

This is part of what is conveyed by the Word of Power, "computation": the notion of similar questions being asked in different places and having similar answers.  Or as we might say in the business, the same computation can have multiple instantiations.

If we know the structure of calculator 1 and calculator 2, we can decide that they are "asking the same question" and that we ought to see the "same result" flashing on the screen of calculator 1 and calculator 2 after pressing the Enter key.  We decide this in advance of seeing the actual results, which is what makes the concept of "computation" predictively useful.

And in fact, we can make this deduction even without knowing the exact circuit diagrams of calculators 1 and 2, so long as we're told that the circuit diagrams are the same.

And then when we see the result "1957" flash on the screen of calculator 1, we know that the same "1957" can be expected to flash on calculator 2, and we even expect to count up 1957 pebbles in the array of 19 by 103.

A hundred calculators, performing the same multiplication in a hundred different ways, can be expected to arrive at the same answer—and this is not a vacuous expectation adduced after seeing similar answers.  We can form the expectation in advance of seeing the actual answer.
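To make "the same computation, multiple instantiations" concrete, here is a small sketch in Python with four deliberately different procedures for the same question. The function names are invented for illustration, and four methods stand in for the hundred calculators.

```python
def multiply_builtin(a, b):
    """Use the language's own multiplication."""
    return a * b

def multiply_repeated_addition(a, b):
    """Add a to a running total, b times."""
    total = 0
    for _ in range(b):
        total += a
    return total

def multiply_shift_and_add(a, b):
    """Binary long multiplication: add shifted copies of a for each set bit of b."""
    total = 0
    while b:
        if b & 1:
            total += a
        a <<= 1
        b >>= 1
    return total

def multiply_count_pebbles(rows, columns):
    """Lay out a rows-by-columns array and count the pebbles one at a time."""
    return sum(1 for _row in range(rows) for _column in range(columns))

METHODS = [
    multiply_builtin,
    multiply_repeated_addition,
    multiply_shift_and_add,
    multiply_count_pebbles,
]

answers = {method.__name__: method(19, 103) for method in METHODS}
print(answers)  # every method reports 1957
```

The expectation that all four agree can be formed by reading the code, before running any of it.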

Now this does not show that morality is in fact a little electronic calculator.  But it highlights the notion of something that factors out of different physical phenomena in different physical places, even phenomena as physically different as a calculator and an array of pebbles—a common answer to a common question.  (Where is this factored-out thing?  Is there an Ideal Multiplication Table written on a stone tablet somewhere outside the universe? But we are not concerned with that for now.)

Seeing that one calculator outputs "1957", we infer that the answer—the abstracted answer—is 1957; and from there we make our predictions of what to see on all the other calculator screens, and what to see in the array of pebbles.

So that-which-we-name-morality seems to have the further properties of agreement about developed latent answers, which we may as well think of in terms of abstract answers; and note that such agreement is unlikely in the absence of similar questions.

• We sometimes look back on our own past moral judgments, and say "Oops!"  E.g., "Oops!  Maybe in retrospect I shouldn't have killed all those guys when I was a teenager."

So by now it seems easy to extend the analogy, and say:  "Well, maybe a cosmic ray hits one of the transistors in the calculator and it says '1959' instead of 1957—that's an error."

But this notion of "error", like the notion of "computation" itself, is more subtle than it appears.

Calculator Q says '1959' and calculator X says '1957'.  Who says that calculator Q is wrong, and calculator X is right?  Why not say that calculator X is wrong and calculator Q is right?  Why not just say, "the results are different"?

"Well," you say, drawing on your store of common sense, "if it was just those two calculators, I wouldn't know for sure which was right.  But here I've got nine other calculators that all say '1957', so it certainly seems probable that 1957 is the correct answer."

What's this business about "correct"?  Why not just say "different"?

"Because if I have to predict the outcome of any other calculators that compute 19 x 103, or the number of pebbles in a 19 x 103 array, I'll predict 1957—or whatever observable outcome corresponds to the abstract number 1957."

So perhaps 19 x 103 = 1957 only most of the time.  Why call the answer 1957 the correct one, rather than the mere fad among calculators, the majority vote?

If I've got a hundred calculators, all of them rather error-prone—say a 10% probability of error—then there is no one calculator I can point to and say, "This is the standard!"  I might pick a calculator that would happen, on this occasion, to vote with ten other calculators rather than ninety other calculators.  This is why I have to idealize the answer, to talk about this ethereal thing that is not associated with any particular physical process known to me—not even arithmetic done in my own head, which can also be "incorrect".

It is this ethereal process, this idealized question, to which we compare the results of any one particular calculator, and say that the result was "right" or "wrong".
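The hundred error-prone calculators can be simulated as a rough sketch. The error model (a display that is off by a few units), the fixed seed, and all names here are assumptions made for illustration. Python's own multiplication stands in for the idealized answer, though it is of course one more physical process.

```python
import random
from collections import Counter

def error_prone_calculator(a, b, rng, error_rate=0.10):
    """Return a * b, except that with probability error_rate the display is off by a few units."""
    result = a * b
    if rng.random() < error_rate:
        result += rng.choice([-3, -2, -1, 1, 2, 3])  # assumed error model
    return result

rng = random.Random(0)    # fixed seed so the run is repeatable
ideal_answer = 19 * 103   # stand-in for the idealized answer
readings = [error_prone_calculator(19, 103, rng) for _ in range(100)]

# Compare each physical reading to the idealized answer, not to any one calculator.
wrong = sum(1 for reading in readings if reading != ideal_answer)
consensus, votes = Counter(readings).most_common(1)[0]
print(f"{wrong} of 100 readings departed from the ideal; most common reading: {consensus}")
```

No single entry in `readings` serves as the standard here; each one is judged against `ideal_answer`, and the tally is only evidence about that answer.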

But how can we obtain information about this perfect and un-physical answer, when all that we can ever observe, are merely physical phenomena?  Even doing "mental" arithmetic just tells you about the result in your own, merely physical brain.

"Well," you say, "the pragmatic answer is that we can obtain extremely strong evidence by looking at the results of a hundred calculators, even if they are only 90% likely to be correct on any one occasion."

But wait:  When do electrons or quarks or magnetic fields ever make an "error"?  If no individual particle can be mistaken, how can any collection of particles be mistaken?  The concept of an "error", though humans may take it for granted, is hardly something that would be mentioned in a fully reductionist view of the universe.

Really, what happens is that we have a certain model in mind of the calculator—the model that we looked over and said, "This implements 19 * 103"—and then other physical events caused the calculator to depart from this model, so that the final outcome, while physically lawful, did not correlate with that mysterious abstract thing, and the other physical calculators, in the way we had in mind.  Given our mistaken beliefs about the physical process of the first calculator, we would look at its output '1959', and make mistaken predictions about the other calculators (which do still hew to the model we have in mind).

So "incorrect" cashes out, naturalistically, as "physically departed from the model that I had of it" or "physically departed from the idealized question that I had in mind".  A calculator struck by a cosmic ray, is not 'wrong' in any physical sense, not an unlawful event in the universe; but the outcome is not the answer to the question you had in mind, the question that you believed empirically-falsely the calculator would correspond to.

The calculator's "incorrect" answer, one might say, is an answer to a different question than the one you had in mind—it is an empirical fact about the calculator that it implements a different computation.
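One way to picture "an answer to a different question" is to write down the function the struck calculator now implements. This is only a sketch: the text says a cosmic ray hits a transistor, and the single flipped output bit is an added assumption, chosen because 1957 and 1959 happen to differ in exactly one binary digit. The function names are hypothetical.

```python
def intended_model(a, b):
    """The question we had in mind: the product of a and b."""
    return a * b

def struck_calculator(a, b, flipped_bit=1):
    """What the damaged device computes under the assumed fault:
    the product with one output bit inverted. Perfectly lawful, just a different function."""
    return (a * b) ^ (1 << flipped_bit)

print(intended_model(19, 103))     # 1957
print(struck_calculator(19, 103))  # 1959
```

Both functions are well-defined; calling the second output an "error" only means that it was compared against `intended_model`.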

• The 'right' act or the 'should' option sometimes seems to depend on the state of the physical world.  For example, should you cut the red wire or the green wire to disarm the bomb?

Suppose I show you a long straight line of pebbles, and ask you, "How many pebbles would I have, if I had a rectangular array of six lines like this one?"  You start to count, but only get up to 8 when I suddenly blindfold you.

Now you are not completely ignorant of the answer to this question.  You know, for example, that the result will be even, and that it will be greater than 48.  But you can't answer the question until you know how many pebbles were in the original line.

But mark this about the question:  It wasn't a question about anything you could directly see in the world, at that instant.  There was not in fact a rectangular array of pebbles, six on a side.  You could perhaps lay out an array of such pebbles and count the results—but then there are more complicated computations that we could run on the unknown length of a line of pebbles.  For example, we could treat the line length as the start of a Goodstein sequence, and ask whether the sequence halts.  To physically play out this sequence would require many more pebbles than exist in the universe.  Does it make sense to ask if the Goodstein sequence which starts with the length of this line of pebbles, "would halt"?  Does it make sense to talk about the answer, in a case like this?

I'd say yes, personally.

But meditate upon the etherealness of the answer—that we talk about idealized abstract processes that never really happen; that we talk about what would happen if the law of the Goodstein sequence came into effect upon this line of pebbles, even though the law of the Goodstein sequence will never physically come into effect.

It is the same sort of etherealness that accompanies the notion of a proposition that 19 * 103 = 1957 which factors out of any particular physical calculator and is not identified with the result of any particular physical calculator.

Only now that etherealness has been mixed with physical things; we talk about the effect of an ethereal operation on a physical thing.  We talk about what would happen if we ran the Goodstein process on the number of pebbles in this line here, which we have not counted—we do not know exactly how many pebbles there are.  There is no tiny little XML tag upon the pebbles that says "Goodstein halts", but we still think—or at least I still think—that it makes sense to say of the pebbles that they have the property of their Goodstein sequence terminating.

So computations can be, as it were, idealized abstract dynamics—idealized abstract applications of idealized abstract laws, iterated over an imaginary causal-time that could go on for quite a number of steps (as Goodstein sequences often do). 
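For readers who have not met Goodstein sequences, here is a short sketch of the rule in Python. Each step writes the current term in hereditary base-b notation, replaces every b with b + 1, and subtracts one, with b starting at 2 and rising by one per step. The names `bump_base` and `goodstein` and the `max_terms` cutoff are illustrative choices.

```python
def bump_base(n, base):
    """Write n in hereditary base-`base` notation, then replace every `base` with base + 1."""
    result = 0
    exponent = 0
    while n > 0:
        n, digit = divmod(n, base)
        if digit:
            # Exponents are themselves rewritten recursively (the "hereditary" part).
            result += digit * (base + 1) ** bump_base(exponent, base)
        exponent += 1
    return result

def goodstein(start, max_terms):
    """Return the first terms of the Goodstein sequence, stopping at 0 or at max_terms."""
    terms = [start]
    base = 2
    while terms[-1] > 0 and len(terms) < max_terms:
        terms.append(bump_base(terms[-1], base) - 1)
        base += 1
    return terms

print(goodstein(3, 10))  # [3, 3, 3, 2, 1, 0]
print(goodstein(4, 6))   # [4, 26, 41, 60, 83, 109]
```

A line of three pebbles reaches zero after five steps. Starting from four, the terms keep growing for far longer than anyone could physically play out, which is why the cutoff is there.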

So when we wonder, "Should I cut the red wire or the green wire?", we are not multiplying or simulating the Goodstein process, in particular.  But we are wondering about something that is not physically immanent in the red wires or the green wires themselves; there is no little XML tag on the green wire, saying, "This is the wire that should be cut."

We may not know which wire defuses the bomb, but say, "Whichever wire does in fact defuse the bomb, that is the wire that should be cut."

Still, there are no little XML tags on the wires, and we may not even have any way to look inside the bomb—we may just have to guess, in real life.

So if we try to cash out this notion of a definite wire that should be cut, it's going to come out as...

...some rule that would tell us which wire to cut, if we knew the exact state of the physical world...

...which is to say, some kind of idealized abstract process into which we feed the state of the world as an input, and get back out, "cut the green wire" or "cut the red wire"...

...which is to say, the output of a computation that would take the world as an input.
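As a deliberately tiny sketch of "a computation that would take the world as an input": the rule below encodes only the conditional stated above, that whichever wire in fact defuses the bomb is the wire that should be cut. The `World` class, the credence numbers, and the function names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class World:
    """Hypothetical stand-in for the relevant state of the physical world."""
    defusing_wire: str  # "red" or "green"

def wire_that_should_be_cut(world):
    """The idealized rule: defined for every world state, whether or not we know it."""
    return world.defusing_wire

def best_guess(credences):
    """What a bounded agent does instead: act on its probabilities over world states."""
    return max(credences, key=credences.get)

actual_world = World(defusing_wire="green")  # not visible to the person holding the cutters
credences = {"red": 0.6, "green": 0.4}       # assumed beliefs, for illustration

print(wire_that_should_be_cut(actual_world))  # green
print(best_guess(credences))                  # red
```

The rule has a definite output on the actual world even when the agent can only guess at that output.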

• And finally I note that from the twin phenomena of moral agreement and moral error, we can construct the notion of moral disagreement.

This adds nothing to our understanding of "computation" as a Word of Power, but it's helpful in putting the pieces together.

Let's say that Bob and Sally are talking about an abstracted idealized dynamic they call "Enamuh".

Bob says "The output of Enamuh is 'Cut the blue wire'," and Sally says "The output of Enamuh is 'Cut the brown wire'."

Now there are several non-exclusive possibilities:

Either Bob or Sally could have committed an error in applying the rules of Enamuh—they could have done the equivalent of mis-multiplying known inputs.

Either Bob or Sally could be mistaken about some empirical state of affairs upon which Enamuh depends—the wiring of the bomb.

Bob and Sally could be talking about different things when they talk about Enamuh, in which case both of them are committing an error when they refer to Enamuh_Bob and Enamuh_Sally by the same name.  (However, if Enamuh_Bob and Enamuh_Sally differ in the sixth decimal place in a fashion that doesn't change the output about which wire gets cut, Bob and Sally can quite legitimately gloss the difference.)

Or if Enamuh itself is defined by some other abstracted idealized dynamic, a Meta-Enamuh whose output is Enamuh, then either Bob or Sally could be mistaken about Meta-Enamuh in any of the same ways they could be mistaken about Enamuh.  (But in the case of morality, we have an abstracted idealized dynamic that includes a specification of how it, itself, changes.  Morality is self-renormalizing—it is not a guess at the product of some different and outside source.)
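The first three possibilities can be told apart mechanically in a toy model. This is a loose sketch: `enamuh` here is a made-up one-line rule, the wiring dictionaries are invented, and comparing functions by identity is a crude proxy for "talking about the same dynamic". The Meta-Enamuh case is left out.

```python
def enamuh(wiring):
    """Hypothetical shared dynamic: cut whichever wire defuses the bomb."""
    return "Cut the " + wiring["defuses"] + " wire"

def sources_of_error(speaker, shared_dynamic, true_wiring):
    """List every way (they are non-exclusive) in which a speaker's claim could have gone wrong."""
    found = []
    if speaker["reported"] != speaker["dynamic"](speaker["believed_wiring"]):
        found.append("misapplied the rules")
    if speaker["believed_wiring"] != true_wiring:
        found.append("mistaken about the wiring")
    if speaker["dynamic"] is not shared_dynamic:
        found.append("means a different dynamic")
    return found

true_wiring = {"defuses": "blue"}

bob = {
    "dynamic": enamuh,
    "believed_wiring": {"defuses": "blue"},
    "reported": "Cut the blue wire",
}
sally = {
    "dynamic": enamuh,
    "believed_wiring": {"defuses": "brown"},  # empirical mistake about the bomb
    "reported": "Cut the brown wire",
}

print(sources_of_error(bob, enamuh, true_wiring))    # []
print(sources_of_error(sally, enamuh, true_wiring))  # ['mistaken about the wiring']
```

In this toy case Bob and Sally agree about the dynamic and disagree only about an input to it.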

To sum up:

  • Morality, like computation, involves latent development of answers;
  • Morality, like computation, permits expected agreement of unknown latent answers;
  • Morality, like computation, reasons about abstract results apart from any particular physical implementation;
  • Morality, like computation, unfolds from bounded initial state into something potentially much larger;
  • Morality, like computation, can be viewed as an idealized dynamic that would operate on the true state of the physical world—permitting us to speak about idealized answers of which we are physically uncertain;
  • Morality, like computation, lets us speak of such un-physical stuff as "error", by comparing a physical outcome to an abstract outcome, presumably in a case where there was previously reason to believe or desire that the physical process was isomorphic to the abstract process, yet this was not actually the case.

And so with all that said, I hope that the word "computation" has come to convey something other than Microsoft Word.

 

Part of The Metaethics Sequence

Next post: "'Arbitrary'"

Previous post: "Moral Error and Moral Disagreement"