The Rhythm of Disagreement




Eliezer_Yudkowsky

Followup to: A Premature Word on AI, The Modesty Argument

Once, during the year I was working with Marcello, I passed by a math book he was reading, left open on the table.  One formula caught my eye (why?); and I thought for a moment and said, "This... doesn't look like it can be right..."

Then we had to prove it couldn't be right.

Why prove it?  It looked wrong; why take the time for proof?

Because it was in a math book.  By presumption, when someone publishes a book, they run it past some editors and double-check their own work; then all the readers get a chance to check it, too.  There might have been something we missed.

But in this case, there wasn't.  It was a misprinted standard formula, off by one.

I once found an error in Judea Pearl's Causality - not just a misprint, but an actual error invalidating a conclusion in the text.  I double and triple-checked, the best I was able, and then sent an email to Pearl describing what I thought the error was, and what I thought was the correct answer.  Pearl confirmed the error, but he said my answer wasn't right either, for reasons I didn't understand and that I would have had to go back and do some rereading and analysis to follow.  I had other stuff to do at the time, unfortunately, and couldn't expend the energy.  And by the time Pearl posted an expanded explanation to the website, I'd forgotten the original details of the problem...  Okay, so my improved answer was wrong.

Why take Pearl's word for it?  He'd gotten the original problem wrong, and I'd caught him on it - why trust his second thought over mine?

Because he was frikkin' Judea Pearl.  I mean, come on!  I might dare to write to Pearl about an error, when I could understand the error well enough that it would have seemed certain, if not for the disagreement.  But it didn't seem likely that Pearl would concentrate his alerted awareness on the problem, warned of the mistake, and get it wrong twice.  If I didn't understand Pearl's answer, that was my problem, not his.  Unless I chose to expend however much work was required to understand it, I had to assume he was right this time.  Not just as a matter of fairness, but of probability - that, in the real world, Pearl's answer really was right.

In IEEE Spectrum's sad little attempt at Singularity coverage, one bright spot is Paul Wallich's "Who's Who In The Singularity", which (a) actually mentions some of the real analysts like Nick Bostrom and myself and (b) correctly identifies me as an advocate of the "intelligence explosion", whereas e.g. Ray Kurzweil is designated as "technotopia - accelerating change".  I.e., Paul Wallich actually did his homework instead of making everything up as he went along.  Sad that it's just a little PDF chart.

Wallich's chart lists Daniel Dennett's position on the Singularity as:

Human-level AI may be inevitable, but don’t expect it anytime soon. "I don’t deny the possibility a priori; I just think it is vanishingly unlikely in the foreseeable future."

That surprised me.  "Vanishingly unlikely"?  Why would Dennett think that?  He has no obvious reason to share any of the standard prejudices.  I would be interested in knowing Dennett's reason for this opinion, and mildly disappointed if it turns out to be the usual, "We haven't succeeded in the last fifty years, therefore we definitely won't succeed in the next hundred years."

Also in IEEE Spectrum, Steven Pinker, author of The Blank Slate - a popular introduction to evolutionary psychology that includes topics like heuristics and biases - is quoted:

When machine consciousness will occur:  "In one sense—information routing—they already have. In the other sense—first-person experience—we'll never know."

Whoa, said I to myself, Steven Pinker is a mysterian?  "We'll never know"?  How bizarre - I just lost some of the respect I had for him.

I disagree with Dennett about Singularity time horizons, and with Pinker about machine consciousness.  Both of these are prestigious researchers whom I started out respecting about equally.  So why am I curious to hear Dennett's reasons; but outright dismissive of Pinker?

I would probably say something like, "There are many potential reasons to disagree about AI time horizons, and no respectable authority to correct you if you mess up.  But if you think consciousness is everlastingly mysterious, you have completely missed the lesson of history; and respectable minds will give you many good reasons to believe so.  Non-reductionism says something much deeper about your outlook on reality than AI timeframe skepticism; someone like Pinker really ought to have known better."

(But all this presumes that Pinker is the one who is wrong, and not me...)

Robert Aumann, Nobel laureate and original inventor of the no-disagreement-among-Bayesians theorem, is a believing Orthodox Jew.  (I know I keep saying this, but it deserves repeating, for the warning it carries.)  By the time I discovered this strange proclivity of Aumann's, I had long ago analyzed the issues.  Discovering that Aumann was Jewish did not cause me to revisit the issues even momentarily.  I did not consider for even a fraction of a second that this Nobel laureate and Bayesian might be right, and myself wrong.  I did draw the lesson, "You can teach people Bayesian math, but even if they're genuinely very good with the math, applying it to real life and real beliefs is a whole different story."

Scott Aaronson calls me a bullet-swallower; I disagree.  I am very choosy about which bullets I dodge, and which bullets I swallow.  Any view of disagreement that implies I should not disagree with Robert Aumann must be wrong.

Then there's the whole recent analysis of Many-Worlds.  I felt very guilty, writing about physics when I am not a physicist; but dammit, there are physicists out there talking complete nonsense about Occam's Razor, and they don't seem to feel guilty for using words like "falsifiable" without being able to do the math.

On the other hand, if, hypothetically, Scott Aaronson should say, "Eliezer, your question about why 'energy' in the Hamiltonian and 'energy' in General Relativity are the same quantity, is complete nonsense, it doesn't even have an answer, I can't explain why because you know too little," I would be like "Okay."

Nearly everyone I meet knows how to solve the problem of Friendly AI.  I don't hesitate to dismiss nearly all of these solutions out of hand; standard wrong patterns I dissected long since.

Nick Bostrom, however, once asked whether it would make sense to build an Oracle AI, one that only answered questions, and ask it our questions about Friendly AI.  I explained some of the theoretical reasons why this would be just as difficult as building a Friendly AI:  The Oracle AI still needs an internal goal system to allocate computing resources efficiently, and it has to have a goal of answering questions and updating your mind, so it's not harmless unless it knows what side effects shouldn't happen.  It also needs to implement or interpret a full meta-ethics before it can answer our questions about Friendly AI.  So the Oracle AI is not necessarily any simpler, theoretically, than a Friendly AI.

Nick didn't seem fully convinced of this.  I knew that Nick knew that I'd been thinking about the problem for years, so I knew he wasn't just disregarding me; his continued disagreement meant something.  And I also remembered that Nick had spotted the problem of Friendly AI itself, at least two years before I had (though I did not realize this until later, when I was going back and reading some of Nick's older work).  So I pondered Nick's idea further.  Maybe, whatever the theoretical arguments, an AI that was supposed to only answer questions, and designed to the full standards of Friendly AI without skipping any of the work, could end up a pragmatically safer starting point.  Every now and then I prod Nick's Oracle AI in my mind, to check the current status of the idea relative to any changes in my knowledge.  I remember Nick has been right on previous occasions where I doubted his rightness; and if I am an expert, so is he.

I was present at a gathering with Sebastian Thrun (leader of the team that won the DARPA Grand Challenge '05 for autonomous vehicles).  Thrun introduced the two-envelopes problem and then asked:  "Can you find an algorithm that, regardless of how the envelope amounts are distributed, always has a higher probability of picking the envelope with more money?"

I thought and said, "No."

"No deterministic algorithm can do it," said Thrun, "but if you use a randomized algorithm, it is possible."

Now I was really skeptical; you cannot extract work from noise.

Thrun gave the solution:  Just pick any function from dollars to probabilities that decreases strictly and continuously from probability 1 at 0 dollars to probability 0 at infinity.  Then if you open the envelope and find some amount of money, evaluate the function at that amount, roll a die, and switch envelopes with that probability.  However much money was in both envelopes originally, and whatever the distribution, you will always be more likely to switch away from the envelope with the lower amount than from the one with the higher amount, so you end up with the larger amount more than half the time.
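
To make the mechanics concrete, here is a minimal sketch of the randomized rule, not Thrun's own code.  The exponential switching function, the `scale` parameter, and all function names are assumptions chosen for illustration; any strictly decreasing function from probability 1 at 0 dollars toward 0 at infinity would serve.  The sketch computes the win probability exactly for a fixed pair of envelope amounts and then checks it by simulation.

```python
import math
import random


def switch_probability(amount, scale=100.0):
    """Hypothetical switching function: 1 at $0, strictly decreasing toward 0."""
    return math.exp(-amount / scale)


def exact_win_probability(low, high, scale=100.0):
    """Chance of ending with the larger amount when the first pick is a fair coin flip."""
    # Opened the low envelope (probability 1/2): we win only if we switch.
    win_from_low = switch_probability(low, scale)
    # Opened the high envelope (probability 1/2): we win only if we stay.
    win_from_high = 1.0 - switch_probability(high, scale)
    return 0.5 * win_from_low + 0.5 * win_from_high


def simulate(low, high, trials=100_000, scale=100.0, seed=0):
    """Monte Carlo estimate of the same win probability."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # Pick one of the two envelopes uniformly at random and open it.
        opened, other = (low, high) if rng.random() < 0.5 else (high, low)
        # Switch with the probability assigned to the amount we actually saw.
        if rng.random() < switch_probability(opened, scale):
            final = other
        else:
            final = opened
        wins += final == high
    return wins / trials


if __name__ == "__main__":
    print("exact:    ", exact_win_probability(50, 100))
    print("simulated:", simulate(50, 100))
```

Because the function is strictly decreasing, the exact probability exceeds one half for any two distinct amounts, but the margin depends on how much the function drops between them; with this particular `scale`, amounts in the thousands of dollars leave an edge too small to measure.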

I said, "That can't possibly work... you can't derive useful work from an arbitrary function and a random number... maybe it involves an improper prior..."

"No it doesn't," said Thrun; and it didn't.

So I went away and thought about it overnight and finally wrote an email in which I argued that the algorithm did make use of prior knowledge about the envelope distribution.  (As the density of the differential of the monotonic function in the vicinity of the actual envelope contents goes to zero, the expected benefit of the algorithm over random chance goes to zero.)  Moreover, once you realized how you were using your prior knowledge, you could see a derandomized version of the algorithm which was superior, even though it didn't make the exact guarantee Thrun had made.
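
The parenthetical claim can be spelled out in a short calculation, offered as a sketch.  The symbols are labels introduced here, not notation from the original exchange: the envelopes hold amounts a < b, the first envelope is chosen with probability 1/2 each, and f is the switching function, assumed differentiable.

```latex
\begin{align*}
P(\text{win}) &= \tfrac{1}{2}\, f(a) + \tfrac{1}{2}\bigl(1 - f(b)\bigr)
               = \tfrac{1}{2} + \tfrac{1}{2}\bigl(f(a) - f(b)\bigr), \\
f(a) - f(b)   &= \int_a^b \bigl(-f'(x)\bigr)\, dx .
\end{align*}
```

Since f is strictly decreasing, the integral is positive and the win probability is always above one half.  But the size of the edge is exactly half the amount by which f falls between a and b, so if -f'(x) is close to zero across that interval, the advantage over random chance is close to zero too.  Placing the steep part of f where the envelope amounts are likely to lie is where the prior knowledge enters.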

But Thrun's solution did do what he said it did.

(In a remarkable coincidence, not too much later, Steve Omohundro presented me with an even more startling paradox.  "That can't work," I said.  "Yes it can," said Steve, and it could.  Later I perceived, after some thought, that the paradox was a more complex analogue of Thrun's algorithm.  "Why, this is analogous to Thrun's algorithm," I said, and explained Thrun's algorithm.  "That's not analogous," said Steve.  "Yes it is," I said, and it was.)

Why disagree with Thrun in the first place?  He was a prestigious AI researcher who had just won the DARPA Grand Challenge, crediting his Bayesian view of probability - a formidable warrior with modern arms and armor.  It wasn't a transhumanist question; I had no special expertise.

Because I had worked out, as a very general principle, that you ought not to be able to extract cognitive work from randomness; and Thrun's algorithm seemed to be defying that principle.

Okay, but what does that have to do with the disagreement?  Why presume that it was his algorithm that was at fault, and not my foolish belief that you couldn't extract cognitive work from randomness?

Well, in point of fact, neither of these was the problem.  The fault was in my notion that there was a conflict between Thrun's algorithm doing what he said it did, and the no-work-from-randomness principle.  So if I'd just assumed I was wrong, I would have been wrong.

Yet surely I could have done better, if I had simply presumed Thrun to be correct, and managed to break down the possibilities for error on my part into "The 'no work from randomness' principle is incorrect" and "My understanding of what Thrun meant is incorrect" and "My understanding of the algorithm is incomplete; there is no conflict between it and 'no work from randomness'."

Well, yes, on that occasion, this would have given me a better probability distribution, if I had assigned probability 0 to a possibility that turned out, in retrospect, to be wrong.

But probability 0 is a strawman; could I have done better by assigning a smaller probability that Thrun had said anything mathematically wrong?

Yes.  And if I meet Thrun again, or anyone who seems similar to Thrun, that's just what I'll do.

Just as I'll assign a slightly higher probability that I might be right, the next time I find what looks like an error in a famous math book.  In fact, one of the reasons why I lingered on what looked like a problem in Pearl's Causality, was that I'd previously found an acknowledged typo in Probability Theory: The Logic of Science.

My rhythm of disagreement is not a fixed rule, it seems.  A fixed rule would be beyond updating by experience.

I tried to explain why I disagreed with Roger Schank, and Robin said, "All else equal a younger person is more likely to be right in a disagreement?"

But all else wasn't equal.  That was the point.  Roger Schank is a partisan of what one might best describe as "old school" AI, i.e.,  suggestively named LISP tokens.

Is it good for the young to disagree with the old?  Sometimes.  Not all the time.  Just some of the time.  When?  Ah, that's the question!  Even in general, if you are disagreeing about the future course of AI with a famous old AI researcher, and the famous old AI researcher is of the school of suggestively named LISP tokens, and you yourself are 21 years old and have taken one undergraduate course taught with "Artificial Intelligence: A Modern Approach" that you thought was great... then I would tell you to go for it.  Probably both of you are wrong.  But if you forced me to bet money on one or the other, without hearing the specific argument, I'd go with the young upstart.  Then again, the young upstart is not me, so how do they know that rule?

It's hard enough to say what the rhythm of disagreement should be in my own case.  I would hesitate to offer general advice to others, save the obvious:  Be less ready to disagree with a supermajority than a mere majority; be less ready to disagree outside than inside your expertise; always pay close attention to the object-level arguments; never let the debate become about tribal status.