Disjunctions, Antipredictions, Etc.




Eliezer_Yudkowsky

Followup to: Underconstrained Abstractions

Previously:

So if it's not as simple as just using the one trick of finding abstractions you can easily verify on available data, what are some other tricks to use?

There are several, as you might expect...

Previously I talked about "permitted possibilities".  There's a trick in debiasing that has mixed benefits, which is to try and visualize several specific possibilities instead of just one.

The reason it has "mixed benefits" is that being specific, at all, can have biasing effects relative to just imagining a typical case.  (And believe me, if I'd seen the outcome of a hundred planets in roughly our situation, I'd be talking about that instead of all this Weak Inside View stuff.)

But if you're going to bother visualizing the future, it does seem to help to visualize more than one way it could go, instead of concentrating all your strength into one prediction.

So I try not to ask myself "What will happen?" but rather "Is this possibility allowed to happen, or is it prohibited?"  There are propositions that seem forced to me, but those should be relatively rare - the first thing to understand about the future is that it is hard to predict, and you shouldn't seem to be getting strong information about most aspects of it.

Of course, if you allow more than one possibility, then you have to discuss more than one possibility, and the total length of your post gets longer.  If you just eyeball the length of the post, it looks like an unsimple theory; and then talking about multiple possibilities makes you sound weak and uncertain.

As Robyn Dawes notes,

"In their summations lawyers avoid arguing from disjunctions in favor of conjunctions.  (There are not many closing arguments that end, "Either the defendant was in severe financial straits and murdered the decedent to prevent his embezzlement from being exposed or he was passionately in love with the same coworker and murdered the decedent in a fit of jealous rage or the decedent had blocked the defendant's promotion at work and the murder was an act of revenge.  The State has given you solid evidence to support each of these alternatives, all of which would lead to the same conclusion: first-degree murder.")  Rationally, of course, disjunctions are much more probable than are conjunctions."
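To put rough numbers on Dawes's point, here is a minimal sketch in Python. The three probabilities are invented for illustration, and the motives are assumed to be independent, which a real case would not guarantee; neither assumption comes from Dawes.

```python
from math import prod

def conjunction(probs):
    """P(every branch holds), ASSUMING the branches are independent."""
    return prod(probs)

def disjunction(probs):
    """P(at least one branch holds), ASSUMING the branches are independent."""
    return 1 - prod(1 - p for p in probs)

# Hypothetical probabilities for the three motives (financial straits,
# jealousy, revenge). These numbers are made up for the example.
motives = [0.3, 0.3, 0.3]

p_all = conjunction(motives)   # the story needs all three to be true
p_any = disjunction(motives)   # the story needs only one to be true

print(f"all three motives: {p_all:.3f}")
print(f"at least one motive: {p_any:.3f}")
```

With these invented numbers the conjunction comes out near 3% and the disjunction near 66%. Independence only affects the exact figures: a disjunction is always at least as probable as its most probable branch, and a conjunction is never more probable than its least probable one.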

Another test I use is simplifiability - after I've analyzed out the idea, can I compress it back into an argument that fits on a T-Shirt, even if it loses something thereby?  Here's an example of some compressions:

  • The whole notion of recursion and feeding object-level improvements back into meta-level improvements:  "If computing power per dollar doubles every eighteen months, what happens if computers are doing the research?"
  • No diminishing returns on complexity in the region of the transition to human intelligence:  "We're so similar to chimps in brain design, and yet so much more powerful; the upward slope must be really steep."
  • Scalability of hardware:  "Humans have only four times the brain volume of chimps - now imagine an AI suddenly acquiring a thousand times as much power."

If the whole argument was that T-Shirt slogan, I wouldn't find it compelling - too simple and surface a metaphor.  So you have to look more closely, and try visualizing some details, and make sure the argument can be consistently realized so far as you know.  But if, after you do that, you can compress the argument back to fit on a T-Shirt again - even if it sounds naive and stupid in that form - then that helps show that the argument doesn't depend on all the details being true simultaneously; the details might be different while fleshing out the same core idea.

Note also that the three statements above are to some extent disjunctive - you can imagine only one of them being true, but a hard takeoff still occurring for just that reason alone.

Another trick I use is the idea of antiprediction.  This is when the narrowness of our human experience distorts our metric on the answer space, and so you can make predictions that actually aren't far from maximum-entropy priors, but sound very startling.

I shall explain:

A news story about an Australian national lottery that was just starting up interviewed a man on the street, asking him if he would play.  He said yes.  Then they asked him what he thought his odds were of winning.  "Fifty-fifty," he said, "either I win or I don't."

To predict your odds of winning the lottery, you should invoke the Principle of Indifference with respect to all possible combinations of lottery balls.  But this man was invoking the Principle of Indifference with respect to the partition "win" and "not win".  To him, they sounded like equally simple descriptions; but the former cell of that partition contains only one combination, and the latter contains the other N million combinations.  (If you don't agree with this analysis I'd like to sell you some lottery tickets.)
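As a sketch of the difference between the two partitions, assume a hypothetical 6-from-45 draw. The story doesn't give the actual format of the lottery; this one is picked only because it lands in the "N million" range.

```python
from math import comb

# Hypothetical lottery format (an assumption, not from the news story):
# pick 6 balls out of 45, order irrelevant, one ticket bought.
BALLS = 45
PICKS = 6

combinations = comb(BALLS, PICKS)

# Indifference over the coarse partition {"win", "not win"}.
p_win_coarse = 1 / 2

# Indifference over every possible combination of balls.
p_win_fine = 1 / combinations

print(f"possible combinations: {combinations:,}")
print(f"man on the street: {p_win_coarse}")
print(f"per-combination indifference: {p_win_fine:.2e}")
```

Under that assumed format there are a bit over eight million combinations, so "not win" covers all but one of them.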

So the antiprediction is just "You won't win the lottery."  And then one may say, "What?  How do you know that?  You have no evidence for that!  You can't prove that I won't win!"  So they are focusing far too much attention on a small volume of the answer space, artificially inflated by the way their attention dwells upon it.

In the same sense, if you look at a television SF show, you see that a remarkable number of aliens seem to have human body plans - two arms, two legs, walking upright, right down to five fingers per hand and the location of eyes in the face.  But this is a very narrow partition in the body-plan space; and if you just said, "They won't look like humans," that would be an antiprediction that just steps outside this artificially inflated tiny volume in the answer space.

Similarly with the true sin of television SF, which is too-human minds, even among aliens not meant to be sympathetic characters.  "If we meet aliens, they won't have a sense of humor," I antipredict; and to a human it sounds like I'm saying something highly specific, because all minds by default have a sense of humor, and I'm predicting the presence of a no-humor attribute tagged on.  But actually, I'm just predicting that a point in mind design volume is outside the narrow hyperplane that contains humor.

An AI might go from infrahuman to transhuman in less than a week?  But a week is 10^49 Planck intervals - if you just look at the exponential scale that stretches from the Planck time to the age of the universe, there's nothing special about the timescale that 200Hz humans happen to live on, any more than there's something special about the numbers on the lottery ticket you bought.
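The arithmetic behind that figure can be checked in a few lines. The constants here are my assumptions rather than part of the argument: a Planck time of roughly 5.39e-44 seconds and an age of the universe of roughly 13.8 billion years.

```python
from math import log10

# Assumed physical constants (approximate values).
PLANCK_TIME_S = 5.39e-44
AGE_OF_UNIVERSE_S = 13.8e9 * 365.25 * 24 * 3600

SECONDS_PER_WEEK = 7 * 24 * 3600
HUMAN_TICK_S = 1 / 200  # one cycle of a "200Hz human"

def log10_planck_intervals(seconds):
    """Order of magnitude of a duration, measured in Planck intervals."""
    return log10(seconds / PLANCK_TIME_S)

week_exponent = log10_planck_intervals(SECONDS_PER_WEEK)
universe_exponent = log10_planck_intervals(AGE_OF_UNIVERSE_S)
human_tick_exponent = log10_planck_intervals(HUMAN_TICK_S)

print(f"one week: 10^{week_exponent:.1f} Planck intervals")
print(f"age of universe: 10^{universe_exponent:.1f} Planck intervals")
print(f"one 200Hz tick: 10^{human_tick_exponent:.1f} Planck intervals")
```

On these assumed values the whole scale runs to about sixty-one orders of magnitude, with a week sitting near the forty-ninth and a single 200Hz tick near the forty-first.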

If we're talking about a starting population of 2GHz processor cores, then any given AI that FOOMs at all, is likely to FOOM in less than 10^15 sequential operations or more than 10^19 sequential operations, because the region between 10^15 and 10^19 isn't all that wide a target.  So less than a week or more than a century, and in the latter case that AI will be trumped by one of a shorter timescale.
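Converting those operation counts to wall-clock time is simple division. The sketch below assumes exactly 2e9 sequential operations per second (one operation per cycle on a 2GHz core, which is a simplification) and a 365.25-day year.

```python
# Assumption: a 2GHz core performs 2e9 sequential operations per second.
OPS_PER_SECOND = 2e9
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

def ops_to_seconds(sequential_ops):
    """Wall-clock seconds needed for a given number of sequential operations."""
    return sequential_ops / OPS_PER_SECOND

days_fast = ops_to_seconds(1e15) / SECONDS_PER_DAY
years_slow = ops_to_seconds(1e19) / SECONDS_PER_YEAR

print(f"10^15 operations: about {days_fast:.1f} days")
print(f"10^19 operations: about {years_slow:.0f} years")
```

So 10^15 operations is a little under six days, and 10^19 operations is a bit over a century and a half, which is where "less than a week or more than a century" comes from.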

This is actually a pretty naive version of the timescale story.  But as an example, it shows how a "prediction" that's close to just stating a maximum-entropy prior, can sound amazing, startling, counterintuitive, and futuristic.

When I make an antiprediction supported by disjunctive arguments that are individually simplifiable, I feel slightly less nervous about departing the rails of vetted abstractions.  (In particular, I regard this as sufficient reason not to trust the results of generalizations over only human experiences.)

Finally, there are three tests I apply to figure out how strong my predictions are.

The first test is to just ask myself the Question, "What do you think you know, and why do you think you know it?"  The future is something I haven't yet observed; if my brain claims to know something about it with any degree of confidence, what are the reasons for that?  The first test tries to align the strength of my predictions with things that I have reasons to believe - a basic step, but one which brains are surprisingly wont to skip.

The second test is to ask myself "How worried do I feel that I'll have to write an excuse explaining why this happened anyway?"  If I don't feel worried about having to write an excuse - if I can stick my neck out and not feel too concerned about ending up with egg on my face - then clearly my brain really does believe this thing quite strongly, not as a point to be professed through enthusiastic argument, but as an ordinary sort of fact.  Why?

And the third test is the "So what?" test - to what degree will I feel indignant if Nature comes back and says "So what?" to my clever analysis?  Would I feel as indignant as if I woke up one morning to read in the newspaper that Mars had started orbiting the Sun in squares instead of ellipses?  Or, to make it somewhat less strong, as if I woke up one morning to find that banks were charging negative interest on loans?  If so, clearly I must possess some kind of extremely strong argument - one that even Nature Itself ought to find compelling, not just humans.  What is it?