I discussed the dilemma of the clever arguer, hired to sell you a box that may or may not contain a diamond. The clever arguer points out to you that the box has a blue stamp, and it is a valid known fact that diamond-containing boxes are more likely than empty boxes to bear a blue stamp. What happens at this point, from a Bayesian perspective? Must you helplessly update your probabilities, as the clever arguer wishes?
If you can look at the box yourself, you can add up all the signs yourself. What if you can’t look? What if the only evidence you have is the word of the clever arguer, who is legally constrained to make only true statements, but does not tell you everything they know? Each statement that the clever arguer makes is valid evidence—how could you not update your probabilities? Has it ceased to be true that, in such-and-such a proportion of Everett branches or Tegmark duplicates in which box B has a blue stamp, box B contains a diamond? According to Jaynes, a Bayesian must always condition on all known evidence, on pain of paradox. But then the clever arguer can make you believe anything they choose, if there is a sufficient variety of signs to selectively report. That doesn’t sound right.
Consider a simpler case, a biased coin, which may be biased to come up 2/3 heads and 1/3 tails, or 1/3 heads and 2/3 tails, both cases being equally likely a priori. Each H observed is 1 bit of evidence for an H-biased coin; each T observed is 1 bit of evidence for a T-biased coin.1 I flip the coin ten times, and then I tell you, “The 4th flip, 6th flip, and 9th flip came up heads.” What is your posterior probability that the coin is H-biased?
And the answer is that it could be almost anything, depending on what chain of cause and effect lay behind my utterance of those words—my selection of which flips to report.
- I might be following the algorithm of reporting the result of the 4th, 6th, and 9th flips, regardless of the result of those and all other flips. If you know that I used this algorithm, the posterior odds are 8:1 in favor of an H-biased coin.
- I could be reporting on all flips, and only flips, that came up heads. In this case, you know that all 7 other flips came up tails, and the posterior odds are 1:16 against the coin being H-biased.
- I could have decided in advance to say the result of the 4th, 6th, and 9th flips only if the probability of the coin being H-biased exceeds 98%. And so on.
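The first two reporting algorithms above can be checked directly. A minimal sketch (the function names and the use of exact fractions are my own choices):

```python
from fractions import Fraction

# Two hypotheses, equal priors: H-biased (P(heads) = 2/3) or T-biased
# (P(heads) = 1/3). Each reported head is a 2:1 likelihood ratio toward
# H-biased; each known tail is a 2:1 ratio toward T-biased.
P_HEADS_IF_HBIAS = Fraction(2, 3)
P_HEADS_IF_TBIAS = Fraction(1, 3)

def posterior_odds(heads_known, tails_known):
    """Odds (H-biased : T-biased) given the flips whose results we know."""
    like_h = P_HEADS_IF_HBIAS**heads_known * (1 - P_HEADS_IF_HBIAS)**tails_known
    like_t = P_HEADS_IF_TBIAS**heads_known * (1 - P_HEADS_IF_TBIAS)**tails_known
    return like_h / like_t

# Algorithm 1: flips 4, 6, 9 reported unconditionally -> 3 heads known.
print(posterior_odds(3, 0))   # 8, i.e. 8:1 in favor of H-biased

# Algorithm 2: all-and-only heads reported -> the other 7 flips were tails.
print(posterior_odds(3, 7))   # 1/16, i.e. 1:16 against H-biased
```

The same three observed heads yield opposite conclusions once you condition on the reporting algorithm rather than on the bare statements.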
Or consider the Monty Hall problem:
On a game show, you are given the choice of three doors leading to three rooms. You know that in one room is $100,000, and the other two are empty. The host asks you to pick a door, and you pick door #1. Then the host opens door #2, revealing an empty room. Do you want to switch to door #3, or stick with door #1?
The answer depends on the host’s algorithm. If the host always opens a door and always picks a door leading to an empty room, then you should switch to door #3. If the host always opens door #2 regardless of what is behind it, #1 and #3 both have 50% probabilities of containing the money. If the host only opens a door, at all, if you initially pick the door with the money, then you should definitely stick with #1.
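All three host algorithms can be compared by simulating many games and discarding the runs inconsistent with the story actually observed (you picked door #1, the host opened #2, it was empty). A rough sketch; the zero-based door indexing, function names, and trial count are arbitrary choices of mine:

```python
import random

def play(host_algorithm, switch, trials=100_000, seed=0):
    """Estimate P(win) conditioned on the observed story: you picked
    door 0, the host opened door 1, and door 1 was empty."""
    rng = random.Random(seed)
    wins = games = 0
    for _ in range(trials):
        prize = rng.randrange(3)          # doors 0, 1, 2; you pick door 0
        opened = host_algorithm(prize, rng)
        if opened != 1 or prize == 1:
            continue                      # story didn't happen; discard
        games += 1
        wins += (2 if switch else 0) == prize
    return wins / games

def standard_host(prize, rng):
    """Always opens an empty door you didn't pick (classic Monty Hall)."""
    return rng.choice([d for d in (1, 2) if d != prize])

def always_door_2(prize, rng):
    """Always opens the second door, whatever is behind it."""
    return 1

def only_if_you_were_right(prize, rng):
    """Opens a door only when your pick (door 0) hides the money."""
    return rng.choice((1, 2)) if prize == 0 else None

# Switching wins ~2/3 against standard_host, ~1/2 against always_door_2,
# and never against only_if_you_were_right (sticking always wins there).
```

The observed door-opening is identical in every run that survives the filter; only the algorithm that produced it differs, and so does the correct answer.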
You shouldn't condition merely on #2 being empty; you should condition on the conjunction of that fact and the fact that the host chose to open door #2. Many people are confused by the standard Monty Hall problem because they update only on #2 being empty, in which case #1 and #3 have equal probabilities of containing the money. This is why Bayesians are commanded to condition on all of their knowledge, on pain of paradox.
When someone says, “The 4th coinflip came up heads,” we are not conditioning on the 4th coinflip having come up heads—we are not taking the subset of all possible worlds where the 4th coinflip came up heads—but rather are conditioning on the subset of all possible worlds where a speaker following some particular algorithm said, “The 4th coinflip came up heads.” The spoken sentence is not the fact itself; don’t be led astray by the mere meanings of words.
Most legal processes work on the theory that every case has exactly two opposed sides and that it is easier to find two biased humans than one unbiased one. Between the prosecution and the defense, someone has a motive to present any given piece of evidence, so the court will see all the evidence; that is the theory. If there are two clever arguers in the box dilemma, it is not quite as good as one curious inquirer, but it is almost as good. But that is with two boxes. Reality often has many-sided problems, and deep problems, and nonobvious answers, which are not readily found by Blues and Greens shouting at each other.
Beware lest you abuse the notion of evidence-filtering as a Fully General Counterargument to exclude all evidence you don’t like: “That argument was filtered, therefore I can ignore it.” If you’re ticked off by a contrary argument, then you are familiar with the case, and care enough to take sides. You probably already know your own side’s strongest arguments. You have no reason to infer, from a contrary argument, the existence of new favorable signs and portents which you have not yet seen. So you are left with the uncomfortable facts themselves; a blue stamp on box B is still evidence.
But if you are hearing an argument for the first time, and you are only hearing one side of the argument, then indeed you should beware! In a way, no one can really trust the theory of natural selection until after they have listened to creationists for five minutes; and then they know it’s solid.
1“Bits” in this context are a measure of how much evidence something provides—they’re the logarithms of probabilities, base 1/2.
Suppose a question has exactly two possible (mutually exclusive) answers, and you initially assign 50% probability to each answer. If I then tell you that the first answer is correct (and you have complete faith in my claim), then you have acquired one bit of evidence. If there are four equally likely options, and I tell you the first one is correct, then I have given you two bits; if there are eight and I tell you the right one, then I have given you three bits; and so on. This is discussed further in “How Much Evidence Does It Take?” (in Map and Territory).
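In code, the bit counts in this footnote are just base-2 logarithms of the number of equally likely options; a trivial sketch:

```python
import math

def bits_gained(n_options):
    """Bits of evidence from being told the right answer out of
    n equally likely, mutually exclusive options: log2(n), which
    equals log base 1/2 of the prior probability 1/n."""
    return math.log2(n_options)

print(bits_gained(2))  # 1.0
print(bits_gained(4))  # 2.0
print(bits_gained(8))  # 3.0
```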
We really want to know: what are the typical filters applied in particular areas of life, and thus what evidence does testimony there give us? Doctors, lawyers, parents, lovers, teachers and so on - what filters do they collectively produce on the evidence they get?
We had a related discussion on my blog a little while ago - your expert input would be most welcome.
What's being overlooked is that your priors before hearing the clever arguer are not the same as your priors if there were no clever arguer.
Consider the case if the clever arguer presents his case and it is obviously inadequate. Perhaps he refers to none of the usual signs of containing a diamond, and the signs he does present seem unusual and inconclusive. (Assume all the usual idealizations, i.e., no question that he knows the facts and presents them in the best light, his motives are known and absolute, he's not attempting reverse psychology, etc.) Wouldn't it seem to you that here is evidence that box B does not contain the diamond as he says? But if no clever arguer were involved, it would be a 50/50 chance.
So the prior that you're updating for each point the clever arguer makes starts out low. It crosses 0.5 at the point where his argument is about as strong as you would expect given a 50/50 chance of A or B.
What lowers it when CA begins speaking? You are predictively compensating for the biased updating you expect to do when you hear a biased but correct argument. (Idealizations are assumed here too. If we let CA begin speaking and then immediately stop him, this shouldn't persuade anybody that the diamond is in box A on the grounds that they're left with the low prior they start with.)
The answer is less clear when CA is not assumed to be clever. When he presents a feeble argument, is it because he can have no good argument, or because he couldn't find it? Ref "What evidence bad arguments".
Any attempt to get information from the Clever Arguer relies on the Clever Arguer being less clever than you (or at least, not clever enough to know how clever you are)
A clever arguer might, perfectly happily, argue that Box A contains a diamond because it has turquoise elephants drawn on it, something with no relation to diamonds. He might argue that Box A contains the diamond because UFO sightings over the past ten years have been higher than the ten previous years.
He might do all these things, because he's been paid by the owner of Box B.
This is colloquially known as "trolling"
Are you allowing these Clever Arguers to freely lie as well as be clever in the way they argue? My understanding is that it is possible to get information from a Clever Arguer when that information helps them with their argument. Consider the case where lying to a court would be punished as obstruction of justice, but there is no law against speaking bullshit that messes with the mind of the jury.
They don't need to lie to troll.
You can get information I suppose, but the information is only in the form of facts about the box, not facts about the facts.
Simply because the clever arguer fails to successfully argue his case doesn't mean the diamond is more likely in the other box, because if it did the clever arguer could be a step ahead.
If you assume that a lack of evidence indicates that you should go for the other box, the Clever Arguer can choose to argue for the wrong box, and provide a very bad supply of evidence, thus misleading you while Not Technically Lying.
If the evidence provided by the clever arguer is sufficient, then it may be useful. But in that case the information is coming from the evidence, not the arguer (whose behaviour is too complicated to serve as meta-evidence)
tl;dr: I was wrong, you can get evidence from the clever arguer IFF the arguer has an overwhelming supply of evidence for his side or you are aware of which side he has been paid by. But you should never adjust your expected level DOWN based on his behaviour unless you have external evidence as to which side he is on.
Am I right in assuming that the above is your intended meaning? It seems to fit and if not I would have to reject the 'exclusive if' claim.
I also note that even if you don't know which side a clever arguer is on, if your probability is not at 0.5 then you will still need to update, regardless of what evidence the clever arguer has. Just by knowing that he is a clever arguer.
Indeed you are correct as regards your correction (am editing it now)
mental note, think through further before posting
On the second bit: I'm not sure if that's accurate. I've got intuitions arguing in both directions. I will have to think on it
oh, btw, how do you do those quote-lines?
Start a new line with a right angle bracket and a space, then the text you want to quote.
> \> Quoted text goes here.
Except, um, without the slash. That's a weird bug. (But if I take it out, it shows a quote line instead of the greater-than sign. That slash is markdown's escape character.)
Ahh, welcome welcome to lesswrong! The syntax is based off markdown.
> If you copy and paste this sentence it would appear as a quote. (It doesn't for me because I escaped the angle bracket like so: \>)
Ahh, welcome welcome to lesswrong! The syntax is based off markdown.
>If you copy and paste this sentence it would appear as a quote. (It doesn't for me because I put the angle bracket in a code span with back ticks: `>`)
The general answer is that if you hit the Help button at the lower left corner of the reply box, you get some formatting information.
IIRC, the help window is a subset of a system called Markdown, but I can't find a link for it.
Lower right corner.
(Well, on my system, at least. "There is at least one user interface element on a Web app that is black on at least one side.")
Sorry-- a slip of the mind on my part-- it's lower right for me, too.
> So the prior that you're updating for each point the clever arguer makes starts out low. It crosses 0.5 at the point where his argument is about as strong as you would expect given a 50/50 chance of A or B.
I don't believe this is exactly correct. After all, when you're just about to start listening to the clever arguer, do you really believe that box B is almost certain not to contain the diamond? Why would you listen to him, then? Rather, when you start out, you have a spectrum of expectations for how long the clever arguer might go on - to the extent you believe box A contains the diamond, you expect box B not to have many positive portents, so you expect the clever arguer to shut up soon; to the extent you believe box B contains the diamond, you expect him to go on for a while.
The key event is when the clever arguer stops talking; until then you have a probability distribution over how long he might go on.
The quantity that slowly goes from 0.1 to 0.9 is the estimate you would have if the clever arguer suddenly stopped talking at that moment; it is not your actual probability that box B contains the diamond.
Your actual probability starts out at 0.5, rises steadily as the clever arguer talks (starting with his very first point, because that excludes the possibility he has 0 points), and then suddenly drops precipitously as soon as he says "Therefore..." (because that excludes the possibility he has more points).
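The rise-then-drop can be illustrated with a made-up generative model: suppose a box's number of favorable signs is Binomial(10, 0.7) if it holds the diamond and Binomial(10, 0.3) if not, and the clever arguer reports every favorable sign he has, then stops. All the specific numbers here are assumptions purely for illustration:

```python
from math import comb

N_SIGNS, P_IF_DIAMOND, P_IF_EMPTY = 10, 0.7, 0.3

def binom_pmf(n, p, k):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def p_diamond_still_talking(k):
    """P(diamond in B | CA has made k points and hasn't stopped yet):
    condition on his total number of points being at least k."""
    like_d = sum(binom_pmf(N_SIGNS, P_IF_DIAMOND, j) for j in range(k, N_SIGNS + 1))
    like_e = sum(binom_pmf(N_SIGNS, P_IF_EMPTY, j) for j in range(k, N_SIGNS + 1))
    return like_d / (like_d + like_e)

def p_diamond_after_stopping(k):
    """P(diamond in B | CA said 'Therefore...' after exactly k points)."""
    like_d = binom_pmf(N_SIGNS, P_IF_DIAMOND, k)
    like_e = binom_pmf(N_SIGNS, P_IF_EMPTY, k)
    return like_d / (like_d + like_e)

# The probability climbs with each point while he is still talking,
# then drops the moment he stops: stopping at 5 points falls back to 0.5.
```

Under this toy model the estimate while he is mid-argument at five points is well above one half, but the instant he stops at five points it collapses to exactly 0.5, matching the description above.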
It is very possible I don't understand this properly, but assuming you have knowledge of what strength of evidence is possible, could you start at 0.5 and consider strong arguments (relative to possible strength) as increasing the possibility and weak arguments as decreasing the possibility instead? With each piece of evidence you could increase the point at which weak arguments are viewed as having a positive effect, so numerous weak arguments could still add up to a decently high probability of the box containing the diamond.
For example, if arguments are rated in strength from 0 to 1, and most arguments would not be stronger than .5, my approach would be as follows for each piece of evidence:
Piece 1: probability += (strength - .25)
Piece 2: probability += (strength - .22)
Piece 3: probability += (strength - .20)
I am of course oversimplifying the math, and looking at how you are approaching stoppage, perhaps this isn't actually effectively much different from your approach. But this approach is more intuitive to me than considering stopping a separate event on its own. If he is struck by lightning, as mentioned several times throughout this discussion, it is hard to view this in the same light as if he had stopped on his own as an independent event, but I am not sure the difference is enough that the probability of the diamond being in the box should be substantially different in the two cases.
Can someone clear up what issues there are with my approach? It makes more sense to me and if it is wrong, I would like to know where.
I mostly concur, but I think you can (and commonly do) get some "negative" information before he stops. If CA comes out with a succession of bad arguments, then even before you know "these are all he has" you know "these are the ones he has chosen to present first".
I know that you know this, because you made a very similar point recently about creationists.
(Of course someone might choose to present their worst arguments first and delay the decent ones until much later. But people usually don't, which suffices.)
I was recently reading a manual on Mercurial, and the author started going on about how you could make multiple clones of a project in different directories, so that you could have different project states, and then push and pull between them. And I thought "if a supposed expert is telling me to do something that baroque and ridiculous this early in the manual, I'm sure glad I'm using Git."
However, when you read the Git manual and get to "Rewriting History", you could come to the conclusion that "this guy is nuts and I have to reevaluate everything I read previously based on that assumption". Also, cloning 2 times and moving commits between those 2 can be a lot easier than rebase/cherry-fu in one copy. I usually do that when I'm called in to fix some messed-up repo.
I would still choose Git over Hg anytime, because this happens seldom enough that the other benefits outweigh it.
Where do you get that A is "almost certain" from? I just said the prior probability of B was "low". I don't think that's a reasonable restatement of what I said.
It doesn't seem to me that excluding the possibility that he has more points should have that effect.
Consider the case where CA is artificially restricted to raising a given number of points. By common sense, for a generous allotment this is nearly equivalent to the original situation, yet you never learn anything new about how many points he has remaining.
You can argue that CA might still stop early when his argument is feeble, and thus you learn something. However, since you've stipulated that every point raises your probability estimate, he won't stop early. To make an argument without that assumption, we can ask about a situation where he is required to raise exactly N points and assume he can easily raise "filler" points.
ISTM at every juncture in the unrestricted and the generously restricted arguments, your probability estimate should be nearly the same, excepting only that you need compensate slightly less in the restricted case.
Now, there is a certain sense of two ways of saying the same thing, raising the probability per point (presumably cogent) but lowering it as a whole in compensation.
But once you begin hearing CA's argument, you know tautologically that you are hearing his argument, barring unusual circumstances that might still cause it not to be fully presented. I see no reason to delay accounting that information.
Tom, if CA's allotment of points is generous enough that the limit makes little difference then it's no longer true that "you never learn anything new about how many points he has remaining" because he'll still stop if he runs out.
If he knows that he's addressing Eliezer and that Eliezer will lower his probability estimate when CA stops, then indeed he'll carry on until reaching the limit (if he can), but in that case what happens is that as he approaches the limit without having made any really strong arguments Eliezer will reason "if the diamond really were in box B then he'd probably be doing better than this" and lower his probability.
Suppose you meet CA, and he says "I think you should think the diamond is in box B, and here's why", and at that instant he's struck by lightning and dies. Ignoring for the sake of argument any belief you might have that liars are more likely to be smitten by the gods, it seems to me that your estimate of the probability that the diamond is in box B should be almost exactly 1/2. (Very slightly higher, perhaps, because you've ruled out the case where there's no evidence for that at all and CA is at least minimally honest.)
Therefore, your suggestion that you lower your probability estimate as soon as you know CA is going to argue his case must be wrong.
What actually happens is: after he's presented evidence A1, A2, ..., Ak, you know not only that A1, ..., Ak are true but also that those are the bits of evidence CA chose to present. And you have some idea of what he'd choose to present if the actually available evidence were of any given strength. If A1, ..., Ak are exactly as good as you'd expect given CA's prowess and perfectly balanced evidence for the diamond's location, then your probability estimate should remain at 1/2. If they're better, it should go up; if they're worse, it should go down.
Note that if you expect a profusion of evidence on each side regardless, k will have to be quite large before good evidence A1 ... Ak increases your estimate much. If that's the case, and if the evidence really does strongly favour box B, then a really clever CA will try to find a way to aggregate the evidence rather than presenting it piecemeal; so in such situations the presentation of piecemeal evidence is itself evidence against CA's claim.
G, you're raising points that I already answered.
Only in the sense that you've said things that contradict one another. You said that knowing that you're listening to CA modifies your prior estimate of P(his preferred conclusion) from the outset, and then you said that actually if you stop him speaking immediately then your prior shouldn't be modified. These can't both be right.
I don't see any way to make the "modified prior" approach work that doesn't amount to doing the same calculations you'd do with the "modified estimation of evidence provided by each point made" approach and then hacking the results back into your prior to get the right answer, and I don't see any reason for preferring the latter.
Of course, as a practical matter, and given the limitations of our reasoning abilities, prior-tweaking may be a useful heuristic even though it sometimes misleads. But, er, "useful heuristics that sometimes mislead" is a pretty good characterization of what's typically just called "bias" around here :-).
"Has it ceased to be true that, in such-and-such a proportion of Everett branches or Tegmark duplicates in which box B has a blue stamp, box B contains a diamond?"
I am baffled as to why a person who calls himself a Bayesian continually resorts to such frequentist thinking. And in this case, it's spectacularly inappropriate frequentist reasoning. Unless someone used a Geiger counter to decide whether or not to put a diamond in the box, quantum-level uncertainty is utterly irrelevant to this problem.
You tread on dangerous ground here. Shouldn't the detail & scope of its predictions (the rent) be the criterion by which we evaluate any theory? Though creationists' poor arguments may be suggestive of the indefensibility of their position, this alone does not prove them wrong, and certainly does not confirm evolution.
Bayesian updating requires competing hypotheses. For E to be evidence for H (H = Darwin's theory), P(H|E) must be greater than P(H), but this is possible only if P(H) < 1, i.e. P(~H) > 0, where ~H is all the competing hypotheses including creationism taken together (i.e., H2, H3, ..., where H = H1). And we are able to update only if we have the value of P(E), because of Bayes' formula. But to know P(E), where P(H) < 1, we must know P(E|~H), which requires examination of ~H. Therefore we must investigate creationism.
Of course, being finite beings, we need to be able to leave some hypotheses unexamined. But in principle we ought to examine all. So the question of whether or not to examine creationism is a practical question concerning how to allocate our finite resources. Different people may come to different conclusions.
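The dependence on ~H can be made concrete via the law of total probability; the probability values below are made up purely for illustration:

```python
def p_evidence(p_h, p_e_given_h, p_e_given_not_h):
    """P(E) = P(E|H)P(H) + P(E|~H)P(~H)  (law of total probability)."""
    return p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """Bayes' formula: P(H|E) = P(E|H)P(H) / P(E).
    Uncomputable without a value for P(E|~H)."""
    return p_e_given_h * p_h / p_evidence(p_h, p_e_given_h, p_e_given_not_h)

# E raises P(H) exactly when P(E|H) > P(E|~H), which you can only
# know by examining the competing hypotheses:
print(posterior(0.9, 0.8, 0.1) > 0.9)                # True: update upward
print(abs(posterior(0.9, 0.1, 0.1) - 0.9) < 1e-9)    # True: no update
```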
Realistically, we often don't have the means to check the theory ourselves.
And in a modern world where any and everything is marketed to death, we distrust the pro-speech.
But pragmatically, I find that quickly checking the con-speech is very effective.
If it has a point, it will make it clear.
If it is flaky, that was probably the best it could do.
(this does require resistance to fallacies and bullshit)
Which means that (when Monty's algorithm isn't given or when there's uncertainty about how accurate the problem statement is) people who don't switch are making a very defensible choice by the laws of decision theory. For what plausible reason would he (open a door and) offer to let you switch unless he stood to gain if you did? (Answer: To trick you on the meta level, of course.)
I'm not sure if I can parse the title correctly:
I understand it to mean: "What evidence are you using to filter the evidence". Is that correct?
That isn't correct at all. It's more like, "How should you treat evidence that you know/suspect is filtered?" while the literal title would be "What evidence is filtered evidence?" which is a less clear way of saying the same thing. It probably stems from Robin Hanson's habit of omitting verbs in his post titles, since Eliezer wrote this in the OB days.
As long as I don't know his motives (or on what level the host is playing, to put it in HPMOR terms) I can't infer anything from what the host does. He might have opened door 2 because the money isn't behind door 1 and I get another chance. Or because it is behind 1 and he wants me to switch so the company can keep the money. Knowing I should integrate his motives into the equation doesn't mean I can.
Or am I missing something essential here?
Since it's a canonical problem (read: most people will have seen it multiple times before), the article doesn't go into quite as much detail as it arguably should. It's a well-known facet of the standard problem (explicitly stated in many formulations and strongly implied by the context in others) that Monty always opens an empty door that you didn't pick.
I'm trying to incorporate this with conservation of expected evidence: http://lesswrong.com/lw/ii/conservation_of_expected_evidence/
For example: "On average, you must expect to be exactly as confident as when you started out. Equivalently, the mere expectation of encountering evidence—before you've actually seen it—should not shift your prior beliefs." -Eliezer_Yudkowsky AND "Your actual probability starts out at 0.5, rises steadily as the clever arguer talks (starting with his very first point, because that excludes the possibility he has 0 points)" -Eliezer_Yudkowsky
Appear to be contradictions, given that each point= a piece of evidence (shininess of box, presence of blue stamp, etc).
The cherry picking problem appears to be similar to the witch trial problem. In the latter any piece of evidence is interpreted to support the conclusion, while in the former evidence is only presented if it supports the conclusion.
You can't expect your probabilities to on average be increased before seeing/hearing the evidence.
I think that only holds if you have a large background of knowledge, with a high probability that you are already aware of any given piece of evidence. But if you hear a piece of evidence you already knew, it simply shouldn't alter your probabilities, rather than lowering them. I'm having a hard time coming up with a way to properly balance the equation.
The only thing I can think of is if you count the entire argument as one piece of evidence, and use a strategy like suggested by g for updating your priors based on the entire sum?
But you don't necessarily listen to the entire argument. Knowing about hypothetical cut-off points below which they won't spend the time to present and explain evidence means that, with enough info, you could still construct probabilities. If time is limited, can you update with each single piece of evidence based on strength relative to expected?
What if you are unfamiliar with the properties of boxes and how they are related to the likelihood of the presence of a diamond? Any guesstimates seem like they'd be well beyond my abilities, at least.
Unless I already know a lot, I have a hard time justifying updating my priors at all based on CA's arguments. If I do know a lot, I still can't think of a way to justifiably not expect the probability to increase, which is a problem. Help, ideas?
PS. Thankfully not everyone is a clever arguer. Ideally, scientists/teachers teaching you about evolution (for example) will not be selective in giving evidence. The evidence will simply be lopsided because of nature being lopsided in how it produces evidence (entangled with truth). I don't think one has to actually listen to a creationist, assuming it is known that the scientists/source material the teacher is drawing from are using good practice.
Also, this is my first post here, so if I am ignorant please let me know and direct me to how I can improve!
Someone claiming that they have evidence for a thing is already evidence for a thing, if you trust them at all, so you can update on that, and then revise that update on how good the evidence turns out to be once you actually get it.
For example, say gwern posts to Discussion that he has a new article on his website about some drug, and he says "tl;dr: It's pretty awesome" but doesn't give any details, and when you follow the link to the site you get an error and can't see the page. gwern's put together a few articles now about drugs, and they're usually well-researched and impressive, so it's pretty safe to assume that if he says a drug is awesome, it is, even if that's the only evidence you have. This is a belief about both the drug (it is particularly effective at what it's supposed to do) and what you'll see when you're able to access the page about it (there will be many citations of research indicating that the drug is particularly effective).
Now, say a couple days later you get the page to load, and what it actually says is "ha ha, April Fools!". This is new information, and as such it changes your beliefs - in particular, your belief that the drug is any good goes down substantially, and any future cases of gwern posting about an 'awesome' drug don't make you believe as strongly that the drug is good - the chance that it's good if there is an actual page about it stays about the same, but now you also have to factor in the chance that it's another prank - or in other words that the evidence you'll be given will be much worse than is being claimed.
It's harder to work out an example of evidence turning out to be much stronger than is claimed, but it works on the same principle - knowing that there's evidence at all means you can update about as much as you would for an average piece of evidence from that source, and then when you learn that the evidence is much better, you update again based on how much better it is.
Not particularly difficult, just posit a person who prior experience has taught you is particularly unreliable about assessing evidence. If they post a link arguing a position you already know they're in favor of, you should assign a relatively low weight of evidence to the knowledge that they've linked to a resource arguing the position, but if you check it out and find that it's actually well researched and reasoned, then you update upwards.
Thanks for the response.
However, I think you misunderstood what I was attempting to say. I see I didn't use the term "filtered evidence", and am wondering if my comment showed up somewhere other than the article "What Evidence Filtered Evidence" (http://lesswrong.com/lw/jt/what_evidence_filtered_evidence/), which would explain how I got a response so quickly when commenting on a 5-year-old article! If so, my mistake, as my comment was then completely misleading!
When the information does not come from a filtered source, I agree with you. If I find out that there is evidence that will be in the up (or down) direction of a belief, this will modify my priors based on the degree of entanglement between the source and the matter of the belief. After seeing the evidence then the probability assessment will on average remain the same; if it was weaker/stronger it will be lower/higher (or higher/lower, if evidence was downward) but it will of course not pass over the initial position before I heard of the news, unless of course it turns out to be evidence for the opposite direction.
Using drugs instead of boxes, if that is an example you prefer: imagine a clever arguer hired by Merck to argue about what a great drug Rofecoxib is. The words "cardiovascular", "stroke", and "heart attack" won't ever come up. With the help of selectively drawing from trials, a CA can paint a truly wonderful picture of the drug that has limited bearing on reality.
Before presenting his evidence, he tells you "Rofecoxib is wonderful!" This shouldn't modify your belief, as it only tells you he is on Merck's payroll. Now how do you appropriately modify your belief about the drug's quality and merits as this clever arguer presents each piece of evidence to you?
Actually, it's my bad: I found your comment via the new-comments list, and didn't look very closely at its context.
As to your actual question: Being told that someone has evidence of something is, if they're trustworthy, not just evidence of the thing, but also evidence of what other evidence exists. For example, in my scenario with gwern's prank, before I've seen gwern's web page, I expect that if I look the mentioned drug up in other places, I'll also see evidence that it's awesome. If I actually go look the drug up and find out that it's no better than placebo in any situation, that's also surprising new information that changes my beliefs - the same change that seeing gwern's "April Fools" message would cause, in fact, so when I do see that message, it doesn't surprise me or change my opinion of the drug.
In your scenario, I trust Merck's spokesperson much less than I trust gwern, so I don't end up with nearly so strong a belief that third parties will agree that the drug is a good one. Looking it up and finding out that it has dangerous side effects wouldn't be surprising, so I should take the chance of that into account to begin with, even if the Merck spokesperson doesn't mention it. This habit of taking into account possible information from third parties (or information that could be discovered in other ways besides talking to third parties, but that the person you're speaking to wouldn't tell you even if they'd discovered it) when talking to untrustworthy people is the intended lesson of the original post.
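To make that lesson concrete with the biased coin from the original post: suppose the arguer flips ten times and reports only the heads. The right move is to condition on the reporting algorithm, which turns each reported head into evidence of an unreported tail as well. A sketch:

```python
# Coin is H-biased (P(heads)=2/3) or T-biased (P(heads)=1/3), 50/50 prior.
# A clever arguer flips n=10 times and reports ONLY the flips that came up heads.

def naive_odds(k):
    """Treat each reported head as 1 bit for H-bias, ignoring the filter."""
    return 2 ** k  # odds in favor of H-biased

def filtered_odds(k, n=10):
    """Condition on the algorithm: k reported heads implies n-k unreported tails.
    P(k heads | H-bias) / P(k heads | T-bias)
      = (2/3)^k (1/3)^(n-k) / ((1/3)^k (2/3)^(n-k)) = 2^k / 2^(n-k)."""
    return 2 ** k / 2 ** (n - k)

k = 3  # e.g. "the 4th, 6th, and 9th flips came up heads" -- and silence otherwise
print(naive_odds(k))     # 8 -> 8:1 FOR H-biased (wrong)
print(filtered_odds(k))  # 0.0625 -> 16:1 AGAINST H-biased (right)
```

Only three heads out of ten is actually strong evidence for the T-biased coin; the naive listener, who ignores what the arguer chose not to say, updates in the wrong direction.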
This post, and The Bottom Line before it, are extremely useful for cases where someone is trying to convince you of something you know very little about. Advertising seems like the most common example, although media coverage of obscure topics may fall into this category if you don't bother to look at the other side of the issue. This only applies where two elements are present: (1) the media is motivated to bias their account (perhaps because people prefer news sources that confirm what they already believe, or because people prefer sensationalized stories over watching grass grow), AND (2) the subject is obscure enough that you only encounter information about it from a single source, or from additional sources with identical biases.
If either of these elements is missing, however, you can't discard all that evidence as the product of selection bias. It may still be subject to some media bias, but you'd have to supply a counterargument to justifiably claim media bias, rather than simply hypothesizing that there exists a large body of data opposing the argument's viewpoint.
Another way of reading this might be: always check at least two sources with substantially different viewpoints if you want to be reasonably sure of a fact. Anything those sources have in common might still bias your information, although in some cases it may just add random noise rather than true selection bias.
No, you don't need to update your assumption. If the clever arguer chooses to argue about which box contains the diamond, rather than betting his own money on it, that is a sure sign that he has absolutely no idea himself, so his speeches can't contain any useful information, only total bullshit. It is like updating your beliefs about a future coin flip: the coin just doesn't contain information about the future, and is therefore useless for prediction. The same goes for the clever arguer.
Let me put it in other words. The arguer is clever. He isn't sure which box contains the diamond; i.e., he believes it's 50/50. Otherwise, he would just have bought the box he thinks contains the diamond. He has more information about the boxes than you do. So how can you be more certain than the arguer about which box contains the diamond, when you have less information than he does?
Also, I wonder: what if somebody hired two clever arguers, one to persuade one person that the diamond is in the left box, and the other to argue to a second person that the diamond is in the right box? And the clever arguers are so good that their victims are almost sure of it... Isn't that almost like creating a new diamond out of thin air?