This is the first article in my Bah-Humbug Sequence, a.k.a. "Everything I Don't Like Is A Defect/Defect Equilibrium". Epistemic status: strong opinion weakly held, somewhat exaggerated for dramatic effect; I'm posting this here so that the ensuing discussion might help me clarify my position. Anyway, the time has now come for me to explain my overbearing attitude of cynicism towards all aspects of life. Why now, of all times? I hope to make that clear by the end.


You are asking me to believe a certain claim. There is a simple and easy thing you can do to prove its trustworthiness, and yet you have not done that. I am therefore entitled to [Weak Adversarial Argument] disregard your claim as of no evidentiary value / [Strong Adversarial Argument] believe the negation of your claim purely out of spite.

What's going on here? Are these valid arguments?

It may help to give some examples:

  1. The Hearsay Objection - In a court of law, if a witness X tries to testify that some other person Y said Z, in order to establish the truth of Z, the opposing side may object. This objection takes the form: "The opposition has brought in X to prove Z by way of the fact that Y said Z. But X is not the most reliable witness they could have called, because they could have summoned Y instead. If they were genuinely seeking the truth as to Z, they would have done so; and yet we see that they did not. Therefore I insist that X's testimony be stricken from the record."

  2. The Cynical Cryptographer - My company's HR department emails me a link to an employee satisfaction survey. The email is quick to say "Your responses are anonymous", and yet I notice that the survey link contains a bunch of gobbledegook like ?id=2815ec7e931410a5fb358588ee70ad8b. I think to myself: If this actually is anonymous, and not a sham to see which employees have attitude problems and should be laid off first, the HR department could have set up a Chaumian blind signature protocol to provably guarantee that my response cannot be linked to my name. But they didn't, and so I conclude that this survey is a sham, and I won't fill it out.
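For concreteness, here is a minimal toy sketch (in Python) of the kind of protocol I have in mind. The tiny key, the token text, and the hash-to-residue encoding are illustrative assumptions, not a real implementation; the point is only that the signer authorizes a token it never sees, so the signed token can later be submitted without being linkable to the employee who obtained it.

```python
# Toy Chaumian (RSA) blind signature -- illustration only: tiny key, no padding.
from math import gcd
import hashlib
import secrets

# Hypothetical signing key held by HR. A real key would be 2048+ bits.
p, q = 61, 53
n = p * q                           # modulus
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent

def to_residue(msg: bytes) -> int:
    """Hash the message into Z_n (toy encoding)."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

# 1. Employee blinds a survey token before sending it to HR.
token = b"my-survey-token"          # made-up token
m = to_residue(token)
while True:
    r = secrets.randbelow(n - 2) + 2
    if gcd(r, n) == 1:
        break
blinded = (m * pow(r, e, n)) % n    # HR sees only this value

# 2. HR signs the blinded value without learning the token.
blind_sig = pow(blinded, d, n)

# 3. Employee unblinds; the result is a valid signature on the original token.
sig = (blind_sig * pow(r, -1, n)) % n

# 4. A survey response submitted with (token, sig) verifies against HR's public
#    key, but HR cannot link it back to the blinded value it signed.
assert pow(sig, e, n) == m
```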

So, again, are these valid arguments? From a Bayesian perspective, not really:

  1. X saying that Y said Z is not literally zero evidence of Z. If there is any chance >0 that X and Y are honest, then I must update at least somewhat towards the truth of Z upon hearing X's testimony.

  2. I'm pretty sure they don't teach cryptography in business school. An honest HR department and a dishonest one have approximately equal likelihood (some small ε) of knowing what a "Chaumian blind signature" is and actually implementing it. Therefore, by Bayes' theorem, etc.
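To spell out the "etc.": the observation "no blind signatures" is evidence only insofar as its likelihood differs between the two hypotheses. A quick odds-form calculation, where the prior and ε below are made-up numbers purely for illustration:

```python
# Odds-form Bayes: posterior odds = prior odds * likelihood ratio.
# The prior and eps are illustrative assumptions, not measurements.
prior_honest = 0.7   # assumed prior that HR is honest
eps = 0.01           # assumed chance either kind of HR implements blind signatures

p_obs_given_honest = 1 - eps      # P(no blind signatures | honest)
p_obs_given_dishonest = 1 - eps   # P(no blind signatures | dishonest)

posterior_odds = (prior_honest / (1 - prior_honest)) * (
    p_obs_given_honest / p_obs_given_dishonest   # likelihood ratio = 1
)
posterior_honest = posterior_odds / (1 + posterior_odds)
print(round(posterior_honest, 3))  # 0.7 -- the observation moves nothing
```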

To steelman the Adversarial Argument, we should understand it not as an ordinary passive attempt to "rationally" form an accurate world-model, but rather as a sort of acausal negotiation tactic, akin to one-boxing on Newcomb's Problem. By adopting it, we hope to "influence" the behavior of adversaries (i.e. people who want to convince us of something but don't share our interests) towards providing stronger evidence, and away from trying to deceive us.

Or, to put it another way, the Adversarial Argument may not be valid in general, but by proclaiming it loudly and often, we can make it valid (at least in certain contexts) and thus make distinguishing truth and falsehood easier. Because the Hearsay Objection is enforced in court, lawyers who want to prove Z will either introduce direct witnesses or drop the claim altogether. And perhaps (we can dream!) if the Cynical Cryptographer argument catches on, honest HR departments will find themselves compelled to add Chaumian blind signatures to their surveys in order to get any responses, making the sham surveys easy to spot.

(Aside: Even under this formulation, we might accept the Weak Adversarial Argument but reject the Strong Adversarial Argument - by adopting a rule that I'll believe the opposite of what an untrustworthy-seeming person says, I'm now setting myself up to be deceived into believing P by a clever adversary who asserts ¬P in a deliberately sleazy way - whereupon I'll congratulate myself for seeing through the trick! Is there any way around this?)

Now, returning to the template above, the premise that "there is a simple and easy thing you can do to prove its trustworthiness" is doing a lot of work. Your adversary will always contend that the thing you want them to do (calling witness Y, adding Chaumian signatures, etc.) is too difficult and costly to reasonably expect of them. This may or may not be true, but someone who's trying to deceive you will claim such regardless of its truth, hoping that they can "blend in" among the honest ones.

At that point, the situation reduces to a contest of wills over who gets to grab how much of the surplus value from our interaction. What is my trust worth to you? How much personal cost will you accept in order to gain it?


We on LessWrong - at least, those who wish to communicate the ideas we discuss here to people who don't already agree - should be aware of this dynamic. There may have been a time in history when charismatic authority or essays full of big words were enough to win people over, but that is far from our present reality. In our time, propaganda and misinformation are well-honed arts. People are "accustomed to a haze of plausible-sounding arguments" and are rightly skeptical of all of them. Why should they trust the ideas on LessWrong, of all things? If we think gaining their trust is important and valuable, how much personal cost are we willing to accept to that end?

Or, backing up further: Why should you trust what you read here?

15 comments

I disagree with the hearsay conclusion that you should update toward the truth of Z.

My first problem is that it's a straw objection. The actual objection is that while X can be further questioned to inspect in detail whether their testimony is credible, Y cannot. This immensely weakens the link between X's testimony and the truth of Z, and admitting such evidence would open up bad incentives outside the courtroom as well.

The next problem is that considering the chance that X and Y are truthful is only part of a Bayesian update procedure. If you have a strong prior that Y's statement is reliable but not that X's testimony is[1], you should update away from the truth of Z. If both X and Y were correct in their statements, then Y would be a much stronger witness and should have been called by the lawyer. Now you have evidence that Y's testimony would have harmed the case for Z. It is straightforward but tedious to work through a Bayesian update for this. For example:

Suppose priors are P(X's testimony is truthful) = 1/2, P(Y made a true statement about Z) = 9/10, and P(Z) = 1/2, with the first two independent of each other and of Z. Let E be the event "the lawyer only calls X to give the testimony that Y said Z". This event is incompatible with XYZ, since Y should have been called. It is also incompatible with XYZ', XY'Z, and X'YZ, since in these cases X would not testify that Y said Z (where primes ' are used to indicate negation). So

P(ZE) = P(X'Y'ZE) = P(E | X'Y'Z) · P(X'Y'Z) < P(X'Y'Z) = 1/40,
P(Z'E) = P(XY'Z'E) + P(X'YZ'E) + P(X'Y'Z'E) > P(E | X'YZ') · P(X'YZ') = P(E | X'YZ') · (9/40).

If P(E|X'YZ') >= 1/9, then P(Z'|E) > P(Z|E) and you should update away from Z being true. That is, suppose Z was actually false, Y correctly said that it was false, and X is not truthful in their testimony. What is the probability that the lawyer only calls X to testify that Y said Z? It seems to me quite a lot greater than 1/9. The lawyer has an incentive to call X, who will testify in support of Z for their client. They will probably attempt to contact Y, but Y will very likely not testify in support of Z. The other things the lawyer could do in the X'YZ' scenario have some probability, but they seem much less likely.

So in this scenario you should update against Z.
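To make the inequalities concrete, here is a small brute-force version of the same update in Python. The priors are the ones given above; the nonzero P(E | ...) values are illustrative assumptions (the argument only needs P(E|X'YZ') >= 1/9).

```python
from itertools import product

# Priors from the setup above: P(X's testimony truthful) = 1/2,
# P(Y made a true statement about Z) = 9/10, P(Z) = 1/2, all independent.
p_x, p_y, p_z = 0.5, 0.9, 0.5

def prior(x, y, z):
    return ((p_x if x else 1 - p_x)
            * (p_y if y else 1 - p_y)
            * (p_z if z else 1 - p_z))

# P(E | X, Y, Z), where E = "the lawyer only calls X to testify that Y said Z".
# Zero for the four states ruled out above; the 0.5 values are assumptions.
likelihood = {
    (True,  True,  True):  0.0,  # XYZ   -- Y would have been called instead
    (True,  True,  False): 0.0,  # XYZ'  -- a truthful X would not report "Y said Z"
    (True,  False, True):  0.0,  # XY'Z  -- same as above
    (False, True,  True):  0.0,  # X'YZ  -- "Y said Z" would be true, contradicting X'
    (False, False, True):  0.5,  # X'Y'Z
    (True,  False, False): 0.5,  # XY'Z'
    (False, True,  False): 0.5,  # X'YZ'
    (False, False, False): 0.5,  # X'Y'Z'
}

joint = {s: prior(*s) * likelihood[s] for s in product([True, False], repeat=3)}
p_e = sum(joint.values())
p_z_given_e = sum(v for (x, y, z), v in joint.items() if z) / p_e
print(round(p_z_given_e, 3))  # ~0.083
```

With these numbers the posterior for Z drops from 1/2 to about 1/12, consistent with the inequalities above.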

  1. ^

    It turns out that this part is not necessary. Almost all of the evidential weight comes from the credibility of Y.

This is interesting, because it seems that you've proved the validity of the "Strong Adversarial Argument", at least in a situation where we can say:

This event is incompatible with XYZ, since Y should have been called.

In other words, we can use the Adversarial Argument (in a normal Bayesian way, not as an acausal negotiation tactic) when we're in a setting where the rule against hearsay is enforced. But what reason could we have had for adopting that rule in the first place? It could not have been because of the reasoning you've laid out here, which presupposes that the rule is already in force! The rule is epistemically self-fulfilling, but its initial justification would have seemed epistemically "irrational".

So, why do we apply it in a courtroom setting but not in ordinary conversation? In short, because the stakes are higher and there's a strong positive incentive to deceive.

This calculation just used the fact that Y would have been able to give stronger testimony than X, and that lawyers have incentives to present a strong case for their client where possible. In this scenario, the fact that Y was not called is evidence that Y's testimony would have weakened the case for Z.

The actual objection against hearsay has nothing to do with this calculation at all, as I mentioned in my comment.

You can apply it in ordinary conversation too (to the extent that you apply Bayesian updates in ordinary conversation at all). It's just that the update is stronger when the equivalent of P(E|XYZ) is more unlikely, and in ordinary conversation it may not be very unlikely, resulting in a weaker update.

I'm intrigued to see where this sequence is going! 

Or, to put it another way, the Adversarial Argument may not be valid in general, but by proclaiming it loudly and often, we can make it valid (at least in certain contexts) and thus make distinguishing truth and falsehood easier.

Noticing people not doing this in contexts where I think it's appropriate is one of the most triggering things for me that I can imagine. So I think the acausal negotiation tactic comes quite naturally to me. 

Similarly, I find it equally triggering when I see people punish others for honesty, for disclosing information they could easily have withheld.
 

If you view people as Machiavellian actors using models to pursue goals, then you will eventually find social interactions to be bewildering and terrifying, because there actually is no way to discern honesty or kindness or good intention if you start from the view that each person is ultimately pursuing some kind of goal in an ends-justify-the-means way.

But neither does it really make sense to say "hey let's give everyone the benefit of the doubt because then such-and-such".

I think in the end you have to find a way to trust something that is not the particular beliefs or goals of a person.

Adversarial epistemology is one of the main themes in the visual novel When the Seagulls Cry, which I have reviewed on my Twitter previously. (Also see Gwern's review.)

It is a corrosive enough idea that, if you take it seriously, you can conclude that entire fields of science are fake because you have no way to verify their claims, so the "experimenters" might as well be making results up.

You mention "Infra-Bayesianism" in that Twitter thread - do you think that's related to what I'm talking about here?

The thing is, no one ever presents the actual strongest version of an argument. Their actions are never the best possible, except briefly, accidentally, and in extremely limited circumstances. I can probably remember how to play the ideal tic-tac-toe strategy that's the reason only children play it, but in any game more complicated than that, my play will be subpar. Games are much smaller and simpler things than arguments. Simply noticing that an argument isn't the best it could be is a you thing, because it is always true. Basically no one is a specialist in whatever the perfect argument turns out to be (and people who are will often be wrong). Saying that a correct argument that significantly changes likelihoods isn't real evidence because it could be stronger allows you to always stick with your current beliefs.

Also, if I were a juror, I would like to hear that the accused was overheard telling his buddy that he was out of town the night before, having taken a trip to the city where the murder he is accused of happened. Even though merely being one person in that city is an incredibly weak piece of evidence, and it is even weaker for being reported by someone uninvolved in the conversation, it is still valuable to include. (And indeed, such admissions are not considered hearsay in court, even though they clearly are.) There are often cases where the strongest argument alone is not enough, but the weight of all arguments clearly is.

That type of reporting of statements is not considered hearsay because it is directly observed evidence about the defendant, made under oath. It is not treated as evidence that the defendant was in that other city, but as evidence that they said they were. It can be used to challenge the trustworthiness of the defendant's later statements saying that they weren't, for example. The witness can be cross-examined to find flaws in their testimony, other witnesses to the conversation can be brought in, and so on.

Hearsay is about things that are reported to the witness. If Alice testifies that Bob said he saw the defendant in the other city, the court could in principle investigate the fact of whether Bob actually said that, but that would be irrelevant. Bob is not on trial, was not under oath, cannot be cross-examined, and so on.

I am aware of the excuses used to define it as not hearsay, even though it is clearly the same as all other cases of such. Society simply believes it is a valuable enough scenario that it should be included, even though it is still weak evidence.

Whether something is hearsay is relative to the proposition in question.

When Charlie testifies that Bob said that he saw Alice at the club, that's hearsay when trying to establish whether Alice was at the club, or was alive at all, or any other facts about Alice. Charlie is not conveying any direct knowledge about Alice.

It is not hearsay in establishing many facts about Bob at the time of the conversation. E.g. where Bob was at the time of the conversation, whether he was acquainted with Alice, or many other such propositions. It also conveys facts about Charlie. Charlie's statement conveys direct knowledge of many facts about the conversation that are not dependent upon the veracity of Bob's statements, and are therefore not hearsay in relation to them.

It depends upon how strong the argument actually is compared with how strong you would expect it to be if the conclusion were true. It doesn't have to be a perfect argument, but if you have a high prior for the person making the argument to be competent at making arguments (as you would for a trial lawyer, for example) then your expected strength may be quite high and a failure to meet it may be valid evidence toward the conclusion being false.

If the person making the argument is a random layperson and you expected them to present a zero-knowledge cryptographic protocol in support, then your model of the world is poorly calibrated. A bad world model can indeed result in wrong conclusions, and that together with bounded rationality (such as failure to apply an update to your world model as well as the target hypothesis) can mean being stuck in bad equilibria. That's not great, and it would be nice to have a model for bounded rationality that does guarantee converging toward truth, but we don't.

The examples used don't really seem to fit with that, though. Blind signatures are something most people haven't heard of, and not how things are usually done; I freely admit I had never heard of them before this example. Your HR department probably shouldn't be expected to be aware of all the various things they could do, as they are ordinary people. Even if they knew what blind signatures were, that doesn't mean it is obvious they should use them, or how to do so even if they thought they should (which you admit). After reading the Wikipedia article, that doesn't seem like an ordinary level of precaution for surveys. (Maybe it should be, but then you need to make that argument, so it isn't a good example for this purpose, in my opinion.)

I also don't blame you for not just trusting the word of the HR department that it is anonymous. But fundamentally speaking, wouldn't you (probably) only have their word that they were using Chaumian blind signatures anyway? You probably wouldn't be implementing the solution personally, so you'd have to trust someone on that score. Even if you did, the others would probably just have to trust you instead. The HR department could be much sneakier about connecting your session to your identity (which they would obviously claim is necessary to prevent multiple voting), but would that be better? It wouldn't make their claim that you will be kept anonymous any more trustworthy.

Technically, you have to trust that the math is as people say it is even if you do it yourself. And the operating system. And the compiler. Even with physical processes, you have to count on things like them not having strategically placed cameras (and that they won't violate the physical integrity of the process).

Math is not a true replacement for trust. There is no method (that's even vaguely worth considering) to avoid having to trust people. You just have to try to pick well, and hopefully sway things to be a bit more trustworthy.

Interestingly, you admit that in your reply, but it doesn't seem to have the effect it seems like it should.

A better example to match your points could be fans of a sports team. They pay a lot of attention to their team, and should be experts in a sense. When asked how good their team is, they will usually say the best (or the worst). When asked why, they usually have arguments that technically should be considered noticeably significant evidence in that direction, but are vastly weaker than they should be able to come up with if it were true. Which is obvious, since there are far more teams said to be the best (or worst) than could actually be the case. In that circumstance, you should be fairly demanding of the evidence.

In other situations, though, it seems like a standard that is really easy to apply much more strongly against positions you don't like than against ones you do, and you likely wouldn't even notice. It is hard to hold arguments you disdain to the same standards as ones you like, even if you are putting in a lot of effort to do so, though in some people the direction is actually reversed, as they worry too much.

If the cryptography example is too distracting, we could instead imagine a non-cryptographic means to the same end, e.g. printing the surveys on leaflets which the employees stuff into envelopes and drop into a raffle tumbler.

The point remains, however, because (just as with the blinded signatures) this method of conducting a survey is very much outside-the-norm, and it would be a drastic world-modeling failure to assume that the HR department actually considered the raffle-tumbler method but decided against it because they secretly do want to deanonymize the surveys. Much more likely is that they simply never considered the option.

But if employees did start adopting the rule "don't trust the anonymity of surveys that aren't conducted via raffle tumbler", even though this is epistemically irrational at first, it would eventually compel HR departments to start using the tumbler method, whereupon the odd surveys that still are being conducted by email will stick out, and it would now be rational to mistrust them. In short, the Adversarial Argument is "irrational" but creates the conditions for its own rationality, which is why I describe it as an "acausal negotiation tactic".

That sort of strategy only works if you can get everyone to coordinate around it, and if you can do that, you could probably just get them to coordinate on doing the right things. I don't know if HR would listen to you if you brought your concerns directly to them, but persuading them on that sort of thing probably isn't harder than convincing the rest of your fellows to defy HR. (Which is just a guess.) In cases where you can't get others to coordinate on it, you are just defecting against the group, to your own personal loss. This doesn't seem like a good strategy.

In more limited settings, you might be able to convince your friends to debate things in your preferred style, though this depends on them in particular. As a boss, you might be able to set up a culture where people are expected to make strong arguments in formal settings. Beyond these, I don't really think it is practical. (They don't generalize - for instance, if you're a parent, your child will be incapable of making strong arguments for an extremely long time.)