# 151

There are two sealed boxes up for auction, box A and box B. One and only one of these boxes contains a valuable diamond. There are all manner of signs and portents indicating whether a box contains a diamond; but I have no sign which I know to be perfectly reliable. There is a blue stamp on one box, for example, and I know that boxes which contain diamonds are more likely than empty boxes to show a blue stamp. Or one box has a shiny surface, and I have a suspicion—I am not sure—that no diamond-containing box is ever shiny.

Now suppose there is a clever arguer, holding a sheet of paper, and they say to the owners of box A and box B: “Bid for my services, and whoever wins my services, I shall argue that their box contains the diamond, so that the box will receive a higher price.” So the box-owners bid, and box B’s owner bids higher, winning the services of the clever arguer.

The clever arguer begins to organize their thoughts. First, they write, “And therefore, box B contains the diamond!” at the bottom of their sheet of paper. Then, at the top of the paper, the clever arguer writes, “Box B shows a blue stamp,” and beneath it, “Box A is shiny,” and then, “Box B is lighter than box A,” and so on through many signs and portents; yet the clever arguer neglects all those signs which might argue in favor of box A. And then the clever arguer comes to me and recites from their sheet of paper: “Box B shows a blue stamp, and box A is shiny,” and so on, until they reach: “and therefore, box B contains the diamond.”

But consider: At the moment when the clever arguer wrote down their conclusion, at the moment they put ink on their sheet of paper, the evidential entanglement of that physical ink with the physical boxes became fixed.

It may help to visualize a collection of worlds—Everett branches or Tegmark duplicates—within which there is some objective frequency at which box A or box B contains a diamond.[1]

The ink on paper is formed into odd shapes and curves, which look like this text: “And therefore, box B contains the diamond.” If you happened to be a literate English speaker, you might become confused, and think that this shaped ink somehow meant that box B contained the diamond. Subjects instructed to say the color of printed pictures and shown the word Green in red ink often say “green” instead of “red.” It helps to be illiterate, so that you are not confused by the shape of the ink.

To us, the true import of a thing is its entanglement with other things. Consider again the collection of worlds, Everett branches or Tegmark duplicates. At the moment when all clever arguers in all worlds put ink to the bottom line of their paper—let us suppose this is a single moment—it fixed the correlation of the ink with the boxes. The clever arguer writes in non-erasable pen; the ink will not change. The boxes will not change. Within the subset of worlds where the ink says “And therefore, box B contains the diamond,” there is already some fixed percentage of worlds where box A contains the diamond. This will not change regardless of what is written on the blank lines above.

So the evidential entanglement of the ink is fixed, and I leave to you to decide what it might be. Perhaps box owners who believe a better case can be made for them are more liable to hire advertisers; perhaps box owners who fear their own deficiencies bid higher. If the box owners do not themselves understand the signs and portents, then the ink will be completely unentangled with the boxes’ contents, though it may tell you something about the owners’ finances and bidding habits.

Now suppose another person present is genuinely curious, and they first write down all the distinguishing signs of both boxes on a sheet of paper, and then apply their knowledge and the laws of probability and write down at the bottom: “Therefore, I estimate an 85% probability that box B contains the diamond.” Of what is this handwriting evidence? Examining the chain of cause and effect leading to this physical ink on physical paper, I find that the chain of causality wends its way through all the signs and portents of the boxes, and is dependent on these signs; for in worlds with different portents, a different probability is written at the bottom.

So the handwriting of the curious inquirer is entangled with the signs and portents and the contents of the boxes, whereas the handwriting of the clever arguer is evidence only of which owner paid the higher bid. There is a great difference in the indications of ink, though one who foolishly read aloud the ink-shapes might think the English words sounded similar.
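The contrast can be made concrete with a small enumeration of worlds. This is only a sketch under assumed numbers: the stamp likelihoods (0.8 and 0.4), the even prior, and the premise that the auction outcome is independent of the boxes' contents are all hypothetical choices made for illustration, and the helper names `world_probability` and `p_diamond_in_b` are invented here.

```python
from itertools import product

# Hypothetical likelihoods, chosen only for illustration.
P_STAMP_IF_DIAMOND = 0.8   # a diamond box shows a blue stamp
P_STAMP_IF_EMPTY = 0.4     # an empty box shows a blue stamp
P_B_WINS_AUCTION = 0.5     # assumption: bids are independent of the contents


def world_probability(diamond_in_b, stamp_a, stamp_b, b_wins):
    """Joint probability of one fully specified world."""
    p = 0.5  # prior: the diamond is equally likely to be in either box
    p_a = P_STAMP_IF_EMPTY if diamond_in_b else P_STAMP_IF_DIAMOND
    p_b = P_STAMP_IF_DIAMOND if diamond_in_b else P_STAMP_IF_EMPTY
    p *= p_a if stamp_a else 1 - p_a
    p *= p_b if stamp_b else 1 - p_b
    p *= P_B_WINS_AUCTION if b_wins else 1 - P_B_WINS_AUCTION
    return p


def p_diamond_in_b(condition):
    """P(diamond in B | condition), by enumerating every world."""
    num = den = 0.0
    for world in product([False, True], repeat=4):
        if condition(*world):
            p = world_probability(*world)
            den += p
            if world[0]:  # world[0] is diamond_in_b
                num += p
    return num / den


# Clever arguer: the ink says "B" exactly when B's owner won the auction.
arguer_says_b = p_diamond_in_b(lambda d, sa, sb, win: win)

# Curious inquirer: the bottom line depends on the observed signs.
inquirer_stamp_on_b = p_diamond_in_b(lambda d, sa, sb, win: sb and not sa)
inquirer_stamp_on_a = p_diamond_in_b(lambda d, sa, sb, win: sa and not sb)

print(f"P(B | arguer's ink says B) = {arguer_says_b:.3f}")
print(f"P(B | stamp on B only)     = {inquirer_stamp_on_b:.3f}")
print(f"P(B | stamp on A only)     = {inquirer_stamp_on_a:.3f}")
```

Under these assumptions the arguer's bottom line leaves the probability at 0.500, while the inquirer's bottom line moves to about 0.857 or 0.143 depending on which box carries the stamp. If the auction outcome were instead correlated with the contents, the arguer's ink would carry some evidence, but still only as much as the bidding process provides.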

Your effectiveness as a rationalist is determined by whichever algorithm actually writes the bottom line of your thoughts. If your car makes metallic squealing noises when you brake, and you aren’t willing to face up to the financial cost of getting your brakes replaced, you can decide to look for reasons why your car might not need fixing. But the actual percentage of you that survive in Everett branches or Tegmark worlds—which we will take to describe your effectiveness as a rationalist—is determined by the algorithm that decided which conclusion you would seek arguments for. In this case, the real algorithm is “Never repair anything expensive.” If this is a good algorithm, fine; if this is a bad algorithm, oh well. The arguments you write afterward, above the bottom line, will not change anything either way.

This is intended as a caution for your own thinking, not a Fully General Counterargument against conclusions you don’t like. For it is indeed a clever argument to say “My opponent is a clever arguer,” if you are paying yourself to retain whatever beliefs you had at the start. The world’s cleverest arguer may point out that the Sun is shining, and yet it is still probably daytime.

[1] Max Tegmark, “Parallel Universes,” in Science and Ultimate Reality: Quantum Theory, Cosmology, and Complexity, ed. John D. Barrow, Paul C. W. Davies, and Charles L. Harper Jr. (New York: Cambridge University Press, 2004), 459–491, http://arxiv.org/abs/astro-ph/0302131.


For the person who reads and evaluates the arguments, the question is: what would count as evidence about whether the author wrote the conclusion down first or at the end of his analysis? It is noteworthy that most media, such as newspapers or academic journals, appear to do little to communicate such evidence. So either this is hard evidence to obtain, or few readers are interested in it.

"What would count as evidence about whether the author wrote the conclusion down first or at the end of his analysis?":

Past history of accuracy/trustworthiness;

Evidence of a lack of incentive for bias;

Spot check results for sampling bias.

The last may be unreliable if a) you're the author, or b) your spot check evidence source may be biased, e.g. by a generally accepted biased paradigm.

In the real world this is complicated by the fact that the bottom line may have only been "pencilled in", biased the argument, then been adjusted as a result of the argument - e.g.

"Pencilled in" bottom line is 65;

Unbiased bottom line would be 45;

Adjusted bottom line is 55: neither correct, nor as incorrect as the original "pencilled in" value.

This "weak bias" algorithm can be recursive, leading eventually (sometimes over many years) to virtual elimination of the original bias, as often happens in scientific and philosophical discourse.
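The recursion can be sketched numerically. The comment gives only the three numbers 65, 45, and 55, so the update rule below (each round moves the bottom line a fixed fraction `pull` of the way toward the unbiased value, with `pull = 0.5` inferred from 65 moving to 55) is an assumption, and the names `adjust` and `iterate` are hypothetical.

```python
def adjust(pencilled, unbiased, pull=0.5):
    """One round of argument: move part of the way toward the unbiased value.

    pull=0.5 is an assumption inferred from the example (65 -> 55, target 45).
    """
    return pencilled + pull * (unbiased - pencilled)


def iterate(pencilled, unbiased, rounds, pull=0.5):
    """Apply the weak-bias adjustment repeatedly and record each bottom line."""
    history = [pencilled]
    for _ in range(rounds):
        history.append(adjust(history[-1], unbiased, pull))
    return history


history = iterate(65.0, 45.0, rounds=6)
for round_number, value in enumerate(history):
    print(f"round {round_number}: bottom line = {value}")
```

With these assumed values the sequence runs 65, 55, 50, 47.5, and so on, so the remaining bias halves each round and approaches zero without ever being exactly eliminated. A smaller `pull` would model a more stubborn initial bias that takes more rounds to wear down.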

If you're reading someone else's article, then it's important to know whether you're dealing with a sampling bias when looking at the arguments (more on this later). But my main point was about the evidence we should derive from our own conclusions, not about a Fully General Counterargument you could use to devalue someone else's arguments. If you are paid to cleverly argue, then it is indeed a clever argument to say, "My opponent is only arguing cleverly, so I will discount it."

However, it is important to try to determine whether someone is a clever arguer or a curious inquirer when they are trying to convince you of something. That is, if you were in the diamond box scenario, you should (all other things being roughly equal) judge the curious inquirer's conclusion more likely to be true than the clever arguer's. It doesn't really matter whether the source is internal or external, as long as you're making the right determination. Basically, if you're going to think about whether someone is being a clever arguer or a curious inquirer, you have to be a curious inquirer about getting that information, rather than cleverly constructing a Fully General Counterargument.

"If you happened to be a literate English speaker, you might become confused, and think that this shaped ink somehow meant that box B contained the diamond."

A sign S "means" something T when S is a reliable indicator of T. In this case, the clever arguer has sabotaged that reliability.

ISTM the parable presupposes (and needs to) that what the clever arguer produces is ordinarily a reliable indicator that box B contains the diamond, i.e. ordinarily means that. It would be pointless otherwise.

Therein lies a question: Is he necessarily able to sabotage it? Posed in the contrary way, are there formats which he can't effectively sabotage but which suffice to express the interesting arguments?

There are formats that he can't sabotage, such as rigorous machine-verifiable proof, but it is a great deal of work to use them even for their natural subject matter. So yes, with difficulty, for math-like topics.

For science-like topics in general, I think the answer is probably that it's theoretically possible. It needs more than verifiable logic, though. Onlookers need to be able to verify experiments, and interpretive frameworks need to be managed, which is very hard.

For squishier topics, I make no answer.

The trick is to counterfeit the blue stamps :)

Can anyone give me the link between Designing Social Inquiry by KKV and this post? I feel that there is one.

"For the person who reads and evaluates the arguments, the question is: what would count as evidence about whether the author wrote the conclusion down first or at the end of his analysis? It is noteworthy that most media, such as newspapers or academic journals, appear to do little to communicate such evidence. So either this is hard evidence to obtain, or few readers are interested in it."

I don't think it's either. Consider the many blog postings and informal essays - often on academic topics - which begin or otherwise include a narrative along the lines of 'so I was working on X and I ran into an interesting problem/a strange thought popped up, and I began looking into it...' They're interesting (at least to me), and common.

So I think the reason we don't see it is that A) it looks biased if your Op-ed on, say, the latest bailout goes 'So I was watching Fox News and I heard what those tax-and-spend liberals were planning this time...', so that's incentive to avoid many origin stories; and B) it's seen as too personal and informal. Academic papers are supposed to be dry, timeless, and rigorous. It would be seen as in bad taste if Newton's Principia had opened with an anecdote about a summer day out in the orchard.

...And your effectiveness as a person is determined by whichever algorithm actually causes your actions.

Define "effectiveness as a person" - in many cases the bias leading to the pre-written conclusion has some form of survival value (e.g. social survival). Due partly to childhood issues resulting in a period of complete? rejection of the value of emotions, I have an unusually high resistance to intellectual bias, yet on a number of measures of "effectiveness as a person" I do not seem to be measuring up well yet (on some others I seem to be doing okay).

Also, as I mentioned in my reply to the first comment, real world algorithms are often an amalgam of the two approaches, so it is not so much which algorithm as what weighting the approaches get. In most (if not all) people this weighting changes with the subject, not just with the person's general level of rationality/intellectual honesty.

As it is almost impossible to detect and neutralize all of one's biases and assumptions, and dangerous to attempt "counter-bias", arriving at a result known to be truly unbiased is rare. NOTE: Playing "Devil's Advocate" sensibly is not "counter-bias" and in a reasonable entity will help to reveal and neutralize bias.

I think bias is irrelevant here. My point was that, whatever your definition of "effectiveness as a person", your actions are determined by the algorithm that caused them, not by the algorithm that you profess to follow.

"Your effectiveness as a rationalist is determined by whichever algorithm actually writes the bottom line of your thoughts."

I guess that this algorithm is called emotions, and we are mostly an emotional dog wagging a rational tail.

You might be tempted to say "Well, this is kinda obvious," but from my experience, LW included, most people are not aware of, and don't spend any time considering, what emotions are really driving their bottom line, and instead get lost discussing superficial arguments ad nauseam.

The idea here has stuck with me as one of the best nuggets of wisdom from the sequences. My current condensation of it is as follows:

If you let reality have the final word, you might not like the bottom line. If instead you keep deliberating until the balance of arguments supports your preferred conclusion, you're almost guaranteed to be satisfied eventually!