What counts as a lie?
Centrally, a lie is a statement that contradicts reality as the speaker believes it to be, and that is formed with the explicit intent of misleading someone. If you ask me if I’m free on Thursday (I am), and I tell you that I’m busy because I don’t want to go to your stupid comedy show, I’m lying. If I tell you that I’m busy because I forgot that a meeting on Thursday had been rescheduled, I’m not lying, just mistaken.
But most purposeful misrepresentations of a situation aren’t outright falsehoods; they’re statements that are technically compatible with reality while appreciably misrepresenting it. I likely wouldn’t tell you that I’m busy if I really weren’t; I might instead bring up some minor thing that I have to do that day and make a big deal out of it, to give you the impression that I’m busy. So I haven’t said false things, but, whether through misdirection, paltering, lying by omission, or other such deceptive techniques, I haven’t been honest either.
We’d like a principled way to characterize deception, as a property of communications in general. Here, I’ll derive an unusually powerful one: deception is misinformation on expectation. This can be shown at the level of information theory, and used as a practical means to understand everyday rhetoric.
Information-Theoretic Deception
Formally, we might say that Alice deceives Bob about a situation if:
First Definition: She makes a statement to him that, with respect to her own model of Bob, changes his impression of the situation so as to make it diverge from her own model of the situation.
We can phrase this in terms of probability distributions. (If you’re not familiar with probability theory, you can skip to the second definition and just take it for granted). First, some notation:
1. For a possible state $x$ of a system $X$, let $p^A_X(x)$, $p^B_X(x)$ be the probabilities that Alice and Bob, respectively, assign to that state. These probability assignments $p^A_X$ and $p^B_X$ are themselves epistemic states of Alice and Bob. If Alice is modeling Bob as a system, too, she may assign probabilities to possible epistemic states $q^B_X$ that Bob might be in: $q^B_X \mapsto p^A_B(q^B_X)$.
2. Let $p^{B \mid s}_X(x) = p^B_X(x \mid s)$ be Bob's probability for $x$ after he updates on a statement $s$.

With this notation, a straightforward way to operationalize deception is as information Alice presents to Bob that she expects to increase the difference between Bob’s view of the world and her own.
Taking the Kullback-Leibler divergence as the information-theoretic measure of difference between probability distributions, this first definition of deception is written as:
$$\mathbb{E}_{p^A_B}\!\left[\mathrm{KL}\!\left(p^A \,\big\|\, q^{B \mid s}\right)\right] > \mathbb{E}_{p^A_B}\!\left[\mathrm{KL}\!\left(p^A \,\big\|\, q^B\right)\right]$$

(Here $\Omega$ is the system under discussion, $\omega$ ranges over its possible states, $p^A$ is Alice's distribution over those states, and $q^B$ is Bob's epistemic state about $\Omega$, which Alice treats as a random variable distributed according to $p^A_B$.)

We can manipulate this inequality:

$$0 < \mathbb{E}_{p^A_B}\!\left[\mathrm{KL}\!\left(p^A \,\big\|\, q^{B \mid s}\right)\right] - \mathbb{E}_{p^A_B}\!\left[\mathrm{KL}\!\left(p^A \,\big\|\, q^B\right)\right] = \int p^A_B(q^B) \int p^A(\omega) \ln\frac{p^A(\omega)}{q^B(\omega \mid s)} - p^A(\omega) \ln\frac{p^A(\omega)}{q^B(\omega)} \, d\omega \, dq^B = \iint p^A_B(q^B)\, p^A(\omega) \ln\!\left(\frac{p^A(\omega)\, q^B(\omega)}{q^B(\omega \mid s)\, p^A(\omega)}\right) d\omega \, dq^B$$

Write $B,\Omega$ for the product system composed of $B$ and $\Omega$, whose states are just pairs of states of $B$ and $\Omega$. The inequality can then be written in terms of an expected value:

$$0 < -\mathbb{E}_{p^A_{B,\Omega}}\!\left[\ln\frac{q^B(\omega \mid s)}{q^B(\omega)}\right] \implies \mathbb{E}_{p^A_{B,\Omega}}\!\left[\ln\frac{q^B(\omega \mid s)}{q^B(\omega)}\right] < 0$$

This term is Alice's expectation of the log of the factor by which the probability Bob places on the actual world state is changed by his receiving the information $s$. If we write this in terms of surprisal, or information content,

$$S(x) = -\ln p(x),$$

we have

$$\mathbb{E}_{p^A_{B,\Omega}}\!\left[S^B(\omega \mid s)\right] > \mathbb{E}_{p^A_{B,\Omega}}\!\left[S^B(\omega)\right]$$

This can be converted back to natural language: Alice deceives Bob with the statement $s$ if:
Second Definition: She expects that the statement would make him more surprised to learn the truth as she understands it[1].
In other words, deception is misinformation on expectation.
Misinformation alone isn’t sufficient—it’s not deceptive to tell someone a falsehood that you believe. To be deceptive, your message has to make it harder for the receiver to see the truth as you know it. You don’t have to have true knowledge of the state of the system, or of what someone truly thinks the state is. You only have to have a model of the system that generates a distribution over true states, and a model of the person to be deceived that generates distributions over their epistemic states and updates.
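To make this concrete, here is a minimal numerical sketch in Python (assuming NumPy, and assuming the comedy-show scenario from above with made-up probabilities; it is an illustration, not anything from the derivation itself). Alice is taken to be certain of Bob's prior, so her distribution over his epistemic states is a point mass; in that special case the two definitions coincide, and the script checks both.

```python
import numpy as np

# Two-state world: is Alice free or busy on Thursday?
p_alice     = np.array([0.95, 0.05])  # Alice's own beliefs: [P(free), P(busy)]
q_bob_prior = np.array([0.50, 0.50])  # Alice's model of Bob's beliefs beforehand
q_bob_post  = np.array([0.20, 0.80])  # ...and after hearing "I've got a thing that day"

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q), skipping zero-probability terms."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def expected_surprisal(p_true, q):
    """Expected surprisal -ln q(w), with w drawn from p_true."""
    return float(-np.sum(p_true * np.log(q)))

# First definition: does the statement increase the divergence between
# Alice's model of the world and her model of Bob's model?
kl_before, kl_after = kl(p_alice, q_bob_prior), kl(p_alice, q_bob_post)

# Second definition: does Alice expect Bob to be more surprised by the
# truth as she understands it?
s_before = expected_surprisal(p_alice, q_bob_prior)
s_after  = expected_surprisal(p_alice, q_bob_post)

print(f"KL divergence:       {kl_before:.3f} -> {kl_after:.3f}")
print(f"Expected surprisal:  {s_before:.3f} -> {s_after:.3f}")
print(f"Deceptive by definition 1: {kl_after > kl_before}")
print(f"Deceptive by definition 2: {s_after > s_before}")
print(f"Increases: {kl_after - kl_before:.3f} vs {s_after - s_before:.3f}")
```

With a point-mass model of Bob's prior, the KL increase and the expected-surprisal increase come out identical, as the algebra above predicts for the independent case.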
This is a criterion for deception that routes around notions of intentionality. It applies to any system that communicates on the basis of a model of the world and a model of its audience, whether or not it has anything we would recognize as intentions.
An AI, for instance, may not have the sort of internal architecture that lets us attribute human-like intents or internal conceptualizations to it; it may select information that misleads us without the explicit intent to mislead[2]. An agent like AlphaGo or Gato, one that sees humans as just another game to master, may determine which statements would get us to do what it wants without even analyzing the truth or falsity of those statements. It does not say things in order to deceive us; deception is merely a byproduct of the optimal things to say.
In fact, for sufficiently powerful optimizers, deception ought to be an instrumental strategy. Humans are useful tools that can be easily manipulated by providing information, and it’s not generally the case that information that optimally manipulates humans towards a given end is simultaneously an accurate representation of the world. (See also: Deep Deceptiveness).
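As a toy illustration of that byproduct effect (not any particular system's architecture; the candidate statements, compliance probabilities, and listener model below are all invented), consider an optimizer that scores statements only by how likely they are to produce compliance, while the expected-misinformation criterion from above is computed on the side:

```python
import numpy as np

p_true = np.array([0.9, 0.1])  # the system's own beliefs about a binary fact

candidates = {
    # statement: (listener's modeled posterior on the fact, P(listener complies))
    "say nothing":        (np.array([0.50, 0.50]), 0.10),
    "accurate summary":   (np.array([0.85, 0.15]), 0.30),
    "misleading framing": (np.array([0.20, 0.80]), 0.80),
}

def expected_surprisal(p, q):
    return float(-np.sum(p * np.log(q)))

baseline = expected_surprisal(p_true, np.array([0.5, 0.5]))

# Selection is purely by expected compliance -- truth is never consulted.
best = max(candidates, key=lambda s: candidates[s][1])

for name, (posterior, compliance) in candidates.items():
    delta = expected_surprisal(p_true, posterior) - baseline
    tag = "  <-- chosen" if name == best else ""
    print(f"{name:18s}  P(comply)={compliance:.2f}  expected misinformation={delta:+.3f}{tag}")
```

In this made-up example, the compliance-maximizing statement is also the one with positive expected misinformation; nothing in the selection rule ever referenced truth.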
Rhetorical Deception
This criterion can be applied anywhere people have incentives to be dishonest or manipulative while not outright lying.
In rhetorical discussions, it’s overwhelmingly common for people to misrepresent situations by finding the most extreme descriptions of them that aren’t literally false[3]. Someone will say that a politician “is letting violent criminals run free in the streets!”, you’ll look it up, and it’ll turn out that they rejected a proposal to increase mandatory minimum sentencing guidelines seven years ago. Or “protein shakes can give you cancer!”, when an analysis finds that some brands of protein powder contain up to two micrograms of a chemical that the state of California claims is not known not to cause cancer at much larger doses. And so on. This sort of casual dishonesty permeates almost all political discourse.
Descriptions like these are meant to evoke particular mental images in the listener: when we send the phrase “a politician who’s letting violent criminals run free in the streets” to the Midjourney in our hearts, the image is of someone who’s just throwing open the prison cells and letting out countless murderers, thieves, and psychos. And the person making this claim intends to evoke this image with their words, even though they’ll generally understand perfectly well that that’s not what’s really happening. So the claim is deceptive: the speaker knows that the words they’re using create a picture of reality that is, by their own understanding, inaccurate, even if the literal statement itself is true.
This is a pretty intuitive test for deception, and I find myself using it all the time when reading about or discussing political issues. It doesn’t require us to pin down formal definitions of “violent criminal” and a threshold for “running free”, as we would in order to analyze the literal truth of their words. Instead, we ask: does the mental image conveyed by the statement match the speaker’s understanding of reality? If not, they’re being deceptive[4].
Treating expected misinformation as deception also presents us with a conversational norm: we ought to describe the world in ways that we expect will cause people to form accurate mental models of the world.
(Also posted on Substack)
[1] This isn’t exactly identical to the first definition. Note that I converted the final double integral into an expected value by implicitly identifying
$$p^A_B(q^B)\, p^A(\omega) = p^A_{B,\Omega}(q^B, \omega),$$
i.e. by making Bob’s epistemic state independent of the true world state, within Alice’s model. If Alice is explicitly modeling a dependence of Bob’s epistemic state on the true world state for reasons outside her influence, this doesn’t work, so the first and second definitions can differ.
Example: If I start having strange heart problems, I might describe them to a cardiologist, expecting that this will cause them to form a model of the world that’s different from mine. I expect they’ll gain high confidence that my heart has some specific problem X that I don’t presently consider likely due to my not knowing cardiology. So, to me, there’s an expected increase in the divergence between our distributions that isn’t an expected increase in the cardiologist’s surprisal, or distance from the truth. Because the independence assumption above is violated—I take the cardiologist’s epistemic state to be strongly dependent on the true world state, even though I don’t know that state—the two definitions differ. Only the second captures the idea that honestly describing your medical symptoms to a doctor shouldn’t be deception, since you don’t expect that they’ll be mis-informed by what you say.
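A small numerical sketch of this cardiologist case (with two hypothetical conditions “X” and “Y” and made-up probabilities; a toy illustration, not a canonical implementation) shows the two definitions coming apart once the doctor’s posterior is modeled as tracking the true condition:

```python
import numpy as np

conditions = ["X", "Y"]             # hypothetical heart conditions
p_patient = np.array([0.5, 0.5])    # the patient doesn't know cardiology
q_doc_prior = np.array([0.5, 0.5])  # doctor's beliefs before hearing the symptoms

# The patient models the doctor's posterior as depending on the true
# condition: whatever is actually wrong, the doctor will probably spot it.
q_doc_post = {
    "X": np.array([0.9, 0.1]),
    "Y": np.array([0.1, 0.9]),
}

def kl(p, q):
    """KL(p || q) for discrete distributions, skipping zero-probability terms."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# First definition: expected divergence between the patient's beliefs and
# the doctor's, before vs. after the symptoms are described.
kl_before = kl(p_patient, q_doc_prior)
kl_after = sum(p_patient[i] * kl(p_patient, q_doc_post[c])
               for i, c in enumerate(conditions))

# Second definition: the doctor's expected surprisal at the true condition,
# keeping the correlation between the true condition and the doctor's
# posterior (this is exactly where the independence assumption fails).
s_before = sum(p_patient[i] * -np.log(q_doc_prior[i]) for i in range(len(conditions)))
s_after = sum(p_patient[i] * -np.log(q_doc_post[c][i]) for i, c in enumerate(conditions))

print(f"Expected KL divergence:  {kl_before:.3f} -> {kl_after:.3f}  (increases)")
print(f"Expected surprisal:      {s_before:.3f} -> {s_after:.3f}  (decreases)")
```

The expected divergence between the patient’s and the doctor’s distributions goes up, so the first definition would call the description deceptive, while the doctor’s expected surprisal at the true condition goes down, so the second doesn’t.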
[2] Even for humans, there’s a gray zone where we do things whose consequences are neither consciously intended nor unintended, but simply foreseen; it’s only after the action and its consequences are registered that our minds decide whether our narrative self-model will read “yes, that was intended” or “no, that was unintended”. Intentionality is more of a convenient fiction than a foundational property of agents like us.
[3] Resumes are a funnier example of this principle: if someone says they placed “top 400” in a nationwide academic competition, you can tell that their actual rank is at least 301, since they’d be saying “top 300” or lower if they could.
[4] Of course everyone forms their own unique mental images; of course it’s subjective what constitutes a match; of course we can’t verify that the speaker has any particular understanding of reality. But you can generally make common-sense inferences about these things.