Algorithms of Deception!



If true "beliefs" are models that make accurate predictions, then *deception* would presumably be communication that systematically results in *less* accurate predictions (by a listener applying the same inference algorithms that would result in more accurate predictions when applied to direct observations or "honest" reports).

This helped me clarify that these *algorithms of deception* are not just adversarially attempting to deceive, but in fact adversarially crafted for one's belief-forming mechanisms.

This is the kind of issue that being really specific can resolve. If the Reporters had been specific about what they were trying to do in their reporting, the Audience would not have used their reports. If that is infeasible, the Audience could still ask the Reporters to clarify whether they think she will be able to figure out the specific things she is trying to figure out from their reports. If they mislead her here, the Audience can credibly claim that either she was lied to or the Reporters are not competent enough to be relied on for information. Pragmatically, both possibilities imply a course of action that involves not using the Reporters as reporters.

To illustrate in detail:

So, too, imagine each of our possible Reporters as a person: loyal, responsible—and, entirely coincidentally, the supplier of a good that Audience's careful plans call for in proportion to the value of P(X=4).

In this context, the Audience has an explicit goal of estimating how frequently X=4 occurs. Keeping this in mind, the problem the Audience is having can be understood as one of interacting with Reporters whose goals are misaligned with hers.

Sure, I'm not a perfect program free from all bias, but everything I said was true—every outcome I reported corresponded to one of the Xi. You can't call that misleading!"

Reporter 2's goal seems to be "report values that correspond to one of the Xi." This is misaligned with a goal associated with estimating the distribution of X.

I told you the truth, the whole truth, and nothing but the truth: everything I saw, I reported. When I said an outcome was a oneorfour, it actually was a oneorfour. Perhaps you have a different category system, such that what *I* think of as a 'oneorfour', appears to you to be any of several completely different outcomes, which you think my 'oneorfour' concept is conflating.

Reporter 3's goal seems to be "report values but conflate ones and fours into a 'oneorfour' category" (in the actual programming, this is a misrepresentation, of course: reporting a one *or* a four as a one *and* a four isn't the same thing). This is misaligned with the goal of figuring out how frequently fours, specifically, occur.

If Reporter 2 or Reporter 3 had explicitly specified what they were trying to do, the Audience wouldn't have used them as an information source.

If Reporter 2 had told Audience that he believed she could determine probabilities just from his reporting, he would be lying, or too incompetent to trust.

If Reporter 3 had told Audience that he believed she could distinguish the probability of fours from the probability of ones, he would be lying, or too incompetent to trust.

very similar

The same for 4 significant digits.

[a] perfectly sincere.

[b] In a peaceful world where most falsehood was due to random mistakes, there would be little to be gained by studying processes that systematically create erroneous maps.

Systematic error is conflated with conflict (in b), following sections (in the vicinity of a) which claim error is not conscious. Even if I accept b,

[c] In a world of conflict, where there are forces trying to slash your tires, one would do well to study these: algorithms of deception!

why should c follow? Why not tune out what can't be verified, or isn't worth verifying? (I say this as someone who intends to vote on this post only after running the code.)

why should c follow? Why not tune out what can't be verified, or isn't worth verifying? (I say this as someone who intends to vote on this post only after running the code.)

Because of things like selective reporting, the vast majority of information reported to us tends to be misleading, but not without some level of useful information (and sometimes a lot of useful information). Instead of tuning out information or spending exhaustive amounts of time attempting to verify it, a faster solution is often to figure out ways to adjust the (inaccurate) information for deception to get only the useful information and a better understanding of uncertainty within it.

Of course, if there is a 100% un-deceptive source for a given piece of information, there's not much value in trying to use deceptive sources for that same piece of information (unless the former source is much more expensive than the latter).
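As a rough illustration of what "adjusting for deception" could mean, here is a minimal sketch under strong assumptions: the reporter passes along every "4" but only a known fraction `keep` of the other outcomes, and each reported item is a singleton set. The function names, the `keep` parameter, and the hand-built `report` are all hypothetical; in practice the reporting rate would itself be uncertain.

```python
from collections import Counter

OUTCOMES = (1, 2, 3, 4)

def naive_estimate(report):
    """Take the report at face value: plain relative frequencies."""
    counts = Counter(x for sight in report for x in sight)
    total = sum(counts.values())
    return {x: counts[x] / total for x in OUTCOMES}

def adjusted_estimate(report, keep):
    """Undo selective reporting by inverse-probability weighting.

    Assumption: every 4 is reported, every other outcome is reported
    with known probability `keep`, so each reported non-4 stands in
    for 1/keep actual occurrences.
    """
    counts = Counter(x for sight in report for x in sight)
    weights = {x: counts[x] if x == 4 else counts[x] / keep for x in OUTCOMES}
    total = sum(weights.values())
    return {x: weights[x] / total for x in OUTCOMES}

# Hand-built report: what such a reporter would pass along "on average"
# from 32 draws containing sixteen 1s, eight 2s, six 3s and two 4s.
report = [{1}] * 8 + [{2}] * 4 + [{3}] * 3 + [{4}] * 2

print(naive_estimate(report))          # overweights the outcome 4 (2/17)
print(adjusted_estimate(report, 0.5))  # {1: 0.5, 2: 0.25, 3: 0.1875, 4: 0.0625}
```

The adjusted estimate recovers the underlying proportions only because the selection rule is assumed to be known exactly; the unadjusted tally roughly doubles the apparent frequency of "4".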

I want you to imagine a world consisting of a sequence of independent and identically distributed random variables Xi, and two computer programs.

The first program is called Reporter. As input, it accepts a bunch of the random variables Xi. As output, it returns a list of sets whose elements belong to the domain of the Xi.

The second program is called Audience. As input, it accepts the output of Reporter. As output, it returns a probability distribution.

Suppose the Xi are drawn from the following distribution:

$$P(X = x) = \begin{cases} 1/2 & x = 1 \\ 1/4 & x = 2 \\ 3/16 & x = 3 \\ 1/16 & x = 4 \end{cases}$$

We can model drawing a sample from this distribution using this function in the Python programming language:
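A minimal sketch of such a sampling function, assuming the standard-library `random` module. The name `sample_x` and the optional `rng` parameter are assumptions made for illustration, not necessarily the names used in the original listing.

```python
import random

def sample_x(rng=random):
    """Draw one outcome from P(X): 1/2, 1/4, 3/16, 1/16 for 1, 2, 3, 4."""
    r = rng.random()  # uniform on [0, 1)
    if r < 1/2:
        return 1
    elif r < 3/4:       # next 1/4 of the unit interval
        return 2
    elif r < 15/16:     # next 3/16
        return 3
    else:               # remaining 1/16
        return 4

# Example: draw a handful of outcomes (values vary from run to run).
print([sample_x() for _ in range(10)])
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.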

For compatibility, we can imagine that Reporter and Audience are also written in Python. This is just for demonstration in the blog post that I'm writing: the *real* Reporter and Audience (out there in the world I'm asking you to imagine) might be much more complicated programs written for some kind of *alien* computer the likes of which we have not yet dreamt! But I like Python, and for the moment, we can pretend.

So pretend that Audience looks like this (where the dictionary, or hashmap, that gets returned represents a probability distribution, with the keys being random-variable outcomes and the values being probabilities):
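A minimal sketch of what `audience` might look like, pieced together from the description here and from the footnote's remarks (the `a[possibility] += 1/len(sight)` update, and a final step based on the mode of the Dirichlet distribution). The uniform starting hyperparameters and the hard-coded outcomes 1 to 4 are assumptions.

```python
def audience(report):
    """Turn a list of sets of possible outcomes into an estimated distribution.

    Assumes a non-empty report whose sets contain only the outcomes 1-4.
    """
    a = {1: 1, 2: 1, 3: 1, 4: 1}  # Dirichlet hyperparameters (assumed uniform start)
    for sight in report:
        for possibility in sight:
            # Split one observation evenly across everything it might have been.
            a[possibility] += 1 / len(sight)
    d = sum(a.values())
    # Mode of the Dirichlet: (a_i - 1) / (sum(a) - K), with K = 4 outcomes.
    return {outcome: (a[outcome] - 1) / (d - 4) for outcome in a}

print(audience([{1}, {1}, {2}, {1, 4}]))
# {1: 0.625, 2: 0.25, 3: 0.0, 4: 0.125}
```

Note how the ambiguous sighting `{1, 4}` contributes half an observation to each of the outcomes it names.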

Let's consider multiple possibilities for the form that Reporter could take. A particularly simple implementation of Reporter (call it `reporter_0`) might look like this:

The pairing of `audience` and `reporter_0` has a Very Interesting Property! When we call our Audience on the output of this Reporter, the probability distribution that Audience returns is *very similar* to the distribution that our random variables are from! [1] Weird, right?!
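To see the Very Interesting Property concretely, here is a sketch in which `reporter_0` is assumed to report each outcome as its own one-element set. The `audience` function is a reconstruction from the footnote's description (repeated so the block runs on its own), and `xs` is a hand-built sample in exact proportion to P(X), used instead of random draws so the result is reproducible.

```python
def audience(report):
    """Sketch of Audience: fractional counts, then the Dirichlet mode."""
    a = {1: 1, 2: 1, 3: 1, 4: 1}
    for sight in report:
        for possibility in sight:
            a[possibility] += 1 / len(sight)
    d = sum(a.values())
    return {outcome: (a[outcome] - 1) / (d - 4) for outcome in a}

def reporter_0(xs):
    """Report every outcome exactly as observed, one singleton set each."""
    return [{x} for x in xs]

# Sixteen outcomes in exact proportion to P(X): 8 : 4 : 3 : 1.
xs = [1] * 8 + [2] * 4 + [3] * 3 + [4]

print(audience(reporter_0(xs)))
# {1: 0.5, 2: 0.25, 3: 0.1875, 4: 0.0625}
```

With genuinely random draws the match would only be approximate, improving as the number of samples grows.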

Of course, there are *other* possible implementations of Reporter. For example, this choice of Reporter (`reporter_1`) does *not* result in the Very Interesting Property. It instead induces Audience to output a very different (and rather boring) distribution. It doesn't even matter how the Xi turned up; the result will always be the same:
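One Reporter with this behavior, as a sketch: assume `reporter_1` ignores what it saw and reports a "4" for every input. That choice is a guess consistent with the later remark that the other Reporters also put too much mass on "4"; `audience` is again a reconstruction, repeated so the block is self-contained.

```python
def audience(report):
    """Sketch of Audience: fractional counts, then the Dirichlet mode."""
    a = {1: 1, 2: 1, 3: 1, 4: 1}
    for sight in report:
        for possibility in sight:
            a[possibility] += 1 / len(sight)
    d = sum(a.values())
    return {outcome: (a[outcome] - 1) / (d - 4) for outcome in a}

def reporter_1(xs):
    """Assumed behavior: report {4} once per input, whatever the input was."""
    return [{4} for _ in xs]

print(audience(reporter_1([1, 1, 2, 3])))
# {1: 0.0, 2: 0.0, 3: 0.0, 4: 1.0}
print(audience(reporter_1([2, 2, 2, 2, 1, 1])))
# {1: 0.0, 2: 0.0, 3: 0.0, 4: 1.0}
```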

We could go on imagining other versions of Reporter, like this one (`reporter_2`).

While the distribution that `reporter_2` makes Audience output isn't as boring as the one we saw for `reporter_1`, it still doesn't result in the Very Interesting Property of matching the distribution of the Xi. It comes *closer* than `reporter_1` did (notice how the *ratios* of probabilities assigned to the first three outcomes are similar to those of the original distribution), but it's assigning way too much probability-mass to the outcome "4".

So far, all of the Reporters we've imagined are still only putting one element in the inner sets of the list-of-sets that they return. But we could imagine `reporter_3`.

Unlike `reporter_2` (which typically returned a list with *fewer* elements than it received as input), the list returned by `reporter_3` has exactly as many elements as the list it took in. Yet this Reporter still prompts Audience to return a distribution with too many "4"s, and *unlike* `reporter_2`, it doesn't even get the ratio of the other outcomes right, yielding disproportionately fewer "1"s compared to "2"s and "3"s than the original distribution.

Again, I've presented Audience and various possible Reporters as simple Python programs for illustration and simplicity, but the same *input-output relationships* could be embodied as part of a more complicated system, perhaps an entire conscious mind which could talk.

So now imagine our Audience as a *person* with her own hopes and fears and ambitions ... ambitions whose ultimate fulfillment will require dedication, bravery, and meticulously careful planning based on an accurate estimate of P(X), with almost no room for error.

So, too, imagine each of our possible Reporters as a person: loyal, responsible, and, entirely coincidentally, the supplier of a good that Audience's careful plans call for in proportion to the value of P(X=4).
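For reference, the behavior described above for `reporter_2` and `reporter_3` could be sketched as follows. Both are reconstructions from the prose, not the original listings: `reporter_2` is assumed to report every "4" but only a fraction `keep` of the other outcomes (the value 0.5 is arbitrary), and `reporter_3` is assumed to report ones and fours alike as the two-element set `{1, 4}`. The `audience` function is likewise a sketch based on the footnote.

```python
import random

def audience(report):
    """Sketch of Audience: fractional counts, then the Dirichlet mode."""
    a = {1: 1, 2: 1, 3: 1, 4: 1}
    for sight in report:
        for possibility in sight:
            a[possibility] += 1 / len(sight)
    d = sum(a.values())
    return {outcome: (a[outcome] - 1) / (d - 4) for outcome in a}

def reporter_2(xs, keep=0.5, rng=random):
    """Selective reporting: always pass along a 4, other outcomes only sometimes."""
    report = []
    for x in xs:
        if x == 4 or rng.random() < keep:
            report.append({x})
    return report

def reporter_3(xs):
    """Category conflation: a 1 and a 4 are both reported as 'oneorfour'."""
    report = []
    for x in xs:
        if x in (1, 4):
            report.append({1, 4})
        else:
            report.append({x})
    return report

# Sixteen outcomes in exact proportion to P(X): 8 : 4 : 3 : 1.
xs = [1] * 8 + [2] * 4 + [3] * 3 + [4]

print(audience(reporter_2(xs)))  # varies by run; "4" gets at least 1/16
print(audience(reporter_3(xs)))
# {1: 0.28125, 2: 0.25, 3: 0.1875, 4: 0.28125}
```

Every set either Reporter emits does contain the true outcome, yet Audience's estimate of P(X=4) comes out too high in both cases.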

When the expected frequency of "4"s fails to appear, Audience's lifework is in ruins. All of her training, all of her carefully calibrated plans, all the interminable hours of hard labor, were for nothing. She confronts Reporter in a furor of rage and grief.

"You *lied*," she says through tears of betrayal, "I *trusted you* and *you lied to me!*"

The Reporter whose behavior corresponds to `reporter_2` replies, "How *dare* you accuse me of lying?! Sure, I'm not a perfect program free from all bias, but everything I said was true: every outcome I reported corresponded to one of the Xi. You can't call that misleading!"

He is perfectly sincere. Nothing in his *consciousness* reflects *intent* to deceive Audience, any more than an eight-line Python program could be said to have such "intent." (Does a `for` loop "intend" anything? Does a conditional "care"? Of course not!)

The Reporter whose behavior corresponds to `reporter_3` replies, "Lying?! I told you the truth, the whole truth, and nothing but the truth: everything I saw, I reported. When I said an outcome was a oneorfour, it actually was a oneorfour. Perhaps you have a different category system, such that what *I* think of as a 'oneorfour', appears to you to be any of several completely different outcomes, which you think my 'oneorfour' concept is conflating. If those outcomes had wildly different probabilities, if one was much more common than fou... I mean, than the other... then you'd have no way of knowing that from my report. But using language in a way *you* dislike, is not lying. I can define a word any way I want!"

He, too, is perfectly sincere.

## Commentary

Much has been written on this website about reducing mental notions of "truth", "evidence", *&c.* to the nonmental. One need not grapple with tendentious mysteries of "mind" or "consciousness", when so much more can be accomplished by considering systematic cause-and-effect processes that result in the states of one physical system becoming correlated with the states of another: a "map" that reflects a "territory."

The same methodology that was essential for studying truthseeking, is equally essential for studying the propagation of falsehood. If true "beliefs" are models that make accurate predictions, then *deception* would presumably be communication that systematically results in *less* accurate predictions (by a listener applying the same inference algorithms that would result in more accurate predictions when applied to direct observations or "honest" reports).

In a peaceful world where most falsehood was due to random mistakes, there would be little to be gained by studying processes that systematically create erroneous maps. In a world of conflict, where there are forces trying to slash your tires, one would do well to study these: *algorithms of deception!*

[1] But *only* "very" similar: the code for `audience` is *not* the mathematically correct thing to do in this situation; it's just an approximation that ought to be good enough for the point I'm trying to make in this blog post, for which I'm trying to keep the code simple. (Specifically, the last two lines of `audience` are based on the mode of the Dirichlet distribution, but, firstly, that part about increasing the hyperparameters fractionally when you're uncertain about what was observed (`a[possibility] += 1/len(sight)`) is pretty dodgy, and secondly, if you were *actually* going to try to predict an outcome drawn from a categorical distribution like P(X) using the Dirichlet distribution as a conjugate prior, you'd need to integrate over the Dirichlet hyperparameters; you shouldn't just pretend that the mode/peak represents the true parameters of the categorical distribution. But as I said, we *are* just pretending.)
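To make the footnote's distinction concrete, here is a small sketch comparing the Dirichlet mode with the posterior predictive probabilities, which for a Dirichlet-categorical model are simply the normalized hyperparameters (the Dirichlet mean). The function names and the example hyperparameters are assumptions chosen for illustration; the hyperparameters correspond to a uniform prior of 1 per outcome plus sixteen observations split 8 : 4 : 3 : 1.

```python
def dirichlet_mode(alpha):
    """Peak of the Dirichlet density; requires every alpha_i > 1."""
    k = len(alpha)
    total = sum(alpha.values())
    return {x: (a - 1) / (total - k) for x, a in alpha.items()}

def dirichlet_predictive(alpha):
    """Probability of the next outcome after integrating out the parameters."""
    total = sum(alpha.values())
    return {x: a / total for x, a in alpha.items()}

alpha = {1: 9, 2: 5, 3: 4, 4: 2}

print(dirichlet_mode(alpha))        # {1: 0.5, 2: 0.25, 3: 0.1875, 4: 0.0625}
print(dirichlet_predictive(alpha))  # roughly 0.45, 0.25, 0.2, 0.1
```

The predictive probabilities are pulled toward the uniform prior (the rare outcome "4" gets 0.1 rather than 0.0625); the gap shrinks as more observations accumulate.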