In classical logic, the operational definition of identity is that whenever 'A=B' is a theorem, you can substitute 'A' for 'B' in any theorem where B appears. For example, if (2 + 2) = 4 is a theorem, and ((2 + 2) + 3) = 7 is a theorem, then (4 + 3) = 7 is a theorem.
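As a minimal sketch of that substitution rule, here it is in Python. The nested-tuple representation of terms and the helper name `substitute` are assumptions made purely for illustration; they do not come from any particular logic system.

```python
# Terms are nested tuples such as ("+", 2, 2), or plain ints.
# This representation and the name `substitute` are illustrative assumptions.

def substitute(term, old, new):
    """Return `term` with every occurrence of the subterm `old` replaced by `new`."""
    if term == old:
        return new
    if isinstance(term, tuple):
        return tuple(substitute(part, old, new) for part in term)
    return term

two_plus_two = ("+", 2, 2)

# Theorem: ((2 + 2) + 3) = 7
theorem = ("=", ("+", two_plus_two, 3), 7)

# Given the theorem (2 + 2) = 4, replace (2 + 2) with 4 wherever it appears.
derived = substitute(theorem, two_plus_two, 4)

print(derived)  # ('=', ('+', 4, 3), 7)
```

Note that the rule is applied blindly to every matching subterm, which is exactly what causes trouble once beliefs enter the picture.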
This leads to a problem which is usually phrased in the following terms: The morning star and the evening star happen to be the same object, the planet Venus. Suppose John knows that the morning star and evening star are the same object. Mary, however, believes that the morning star is the god Lucifer, but the evening star is the god Venus. John believes Mary believes that the morning star is Lucifer. Must John therefore (by substitution) believe that Mary believes that the evening star is Lucifer?
Or here's an even simpler version of the problem. 2 + 2 = 4 is true; it is a theorem that (((2 + 2) = 4) = TRUE). Fermat's Last Theorem is also true. So: I believe 2 + 2 = 4 => I believe TRUE => I believe Fermat's Last Theorem.
Yes, I know this seems obviously wrong. But imagine someone writing a logical reasoning program using the principle "equal terms can always be substituted", and this happening to them. Now imagine them writing a paper about how to prevent it from happening. Now imagine someone else disagreeing with their solution. The argument is still going on.
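Here is a toy sketch of how such a program could go wrong. Everything in it is an assumption for illustration: propositions are plain strings, and `truth_value` is a hypothetical oracle for what is actually the case.

```python
# Illustrative assumption: propositions are strings, and `truth_value`
# is a hypothetical oracle recording what is actually true.
truth_value = {"2 + 2 = 4": True, "Fermat's Last Theorem": True}

# Naive reasoner: replaces each believed proposition by its (equal) truth value,
# so the belief store ends up holding just the value True.
naive_beliefs = {truth_value["2 + 2 = 4"]}
naive_believes_fermat = truth_value["Fermat's Last Theorem"] in naive_beliefs

# Quoting reasoner: stores the proposition itself, unevaluated.
quoted_beliefs = {"2 + 2 = 4"}
quoted_believes_fermat = "Fermat's Last Theorem" in quoted_beliefs

print(naive_believes_fermat, quoted_believes_fermat)  # True False
```

The naive reasoner "believes" Fermat's Last Theorem only because it threw away the content of the belief and kept its truth value.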
P'rsnally, I would say that John is committing a type error, like trying to subtract 5 grams from 20 meters. "The morning star" is not the same type as the morning star, let alone the same thing. Beliefs are not planets.
morning star = evening star
"morning star" ≠ "evening star"
The problem, in my view, stems from the failure to enforce the type distinction between beliefs and things. The original error was writing an AI that stores its beliefs about Mary's beliefs about "the morning star" using the same representation as in its beliefs about the morning star.
If Mary believes the "morning star" is Lucifer, that doesn't mean Mary believes the "evening star" is Lucifer, because "morning star" ≠ "evening star". The whole paradox stems from the failure to use quote marks in appropriate places.
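To make the type distinction concrete, here is a minimal Python sketch. The class names `Planet` and `Quoted` and the dictionaries are hypothetical, chosen only to illustrate the idea that a name and its referent are different kinds of object.

```python
from dataclasses import dataclass

# Hypothetical types for illustration: a referent (the thing) and a
# quotation (a name for the thing) are deliberately different classes.

@dataclass(frozen=True)
class Planet:
    catalog_id: str  # identifies the physical object

@dataclass(frozen=True)
class Quoted:
    text: str  # the phrase itself, not what it refers to

venus = Planet("Venus")

# Reference: both names point at the same object.
referent = {Quoted("morning star"): venus, Quoted("evening star"): venus}

# Mary's beliefs are stored against quoted names, not against planets.
mary_believes = {
    Quoted("morning star"): "the god Lucifer",
    Quoted("evening star"): "the god Venus",
}

# morning star = evening star
same_referent = referent[Quoted("morning star")] == referent[Quoted("evening star")]
# "morning star" != "evening star"
same_quotation = Quoted("morning star") == Quoted("evening star")
# A quotation never compares equal to a planet.
crosses_types = Quoted("Venus") == venus

print(same_referent, same_quotation, crosses_types)  # True False False
```

Because Mary's beliefs are keyed by `Quoted` values, knowing that the two planets are equal gives no license to swap one key for the other.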
You may recall that this is not the first time I've talked about enforcing type discipline—the last time was when I spoke about the error of confusing expected utilities with utilities. It is immensely helpful, when one is first learning physics, to learn to keep track of one's units—it may seem like a bother to keep writing down 'cm' and 'kg' and so on, until you notice that (a) your answer seems to be the wrong order of magnitude and (b) it is expressed in seconds per square gram.
Similarly, beliefs are different things than planets. If we're talking about human beliefs, at least, then: Beliefs live in brains, planets live in space. Beliefs weigh a few micrograms, planets weigh a lot more. Planets are larger than beliefs... but you get the idea.
Merely putting quote marks around "morning star" seems insufficient to prevent people from confusing it with the morning star, due to the visual similarity of the text. So perhaps a better way to enforce type discipline would be with a visibly different encoding:
morning star = evening star
13.15.18.14.9.14.7.19.20.1.18 ≠ 5.22.5.14.9.14.7.19.20.1.18
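Here is a minimal sketch of one such encoding, assuming each letter is replaced by its position in the alphabet (a=1 through z=26) and spaces are dropped. The function name `encode` and the numbering scheme are assumptions for illustration.

```python
import string

def encode(phrase):
    """Encode a phrase as dot-separated alphabet positions (a=1 ... z=26).

    The numbering scheme is an assumption chosen for illustration;
    spaces and any other non-letters are dropped.
    """
    letters = string.ascii_lowercase
    return ".".join(
        str(letters.index(ch) + 1) for ch in phrase.lower() if ch in letters
    )

morning = encode("morning star")
evening = encode("evening star")

print(morning)             # 13.15.18.14.9.14.7.19.20.1.18
print(evening)             # 5.22.5.14.9.14.7.19.20.1.18
print(morning == evening)  # False
```

The encoded strings are still just names. They differ from each other, and neither one looks much like a planet.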
Studying mathematical logic may also help you learn to distinguish the quote and the referent. In mathematical logic, |- P (P is a theorem) and |- 'P' (it is provable that there exists an encoded proof of the encoded sentence P in some encoded proof system) are very distinct propositions. If you drop a level of quotation in mathematical logic, it's like dropping a metric unit in physics—you can derive visibly ridiculous results, like "The speed of light is 299,792,458 meters long."
Alfred Tarski once tried to define the meaning of 'true' using an infinite family of sentences:
("Snow is white" is true) if and only if (snow is white)
("Weasels are green" is true) if and only if (weasels are green)
When sentences like these start seeming meaningful, you'll know that you've started to distinguish between encoded sentences and states of the outside world.
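A toy rendering of this pattern in Python may help. The `world` set, the crude `denotation` parser, and the function names are all assumptions for illustration. The point is only that `is_true` takes a quoted sentence (a string) and checks it against something that is not a string.

```python
# A toy "world": the set of states of affairs that actually hold.
# Its contents and the parser below are illustrative assumptions.
world = {("snow", "white")}

def denotation(sentence):
    """Map a quoted sentence like "Snow is white" to the state of affairs it describes.

    This crude parser just takes the first word as subject and the last as predicate.
    """
    words = sentence.lower().split()
    return (words[0], words[-1])

def is_true(sentence):
    """A quoted sentence is true if and only if what it describes is in the world."""
    return denotation(sentence) in world

print(is_true("Snow is white"))      # True
print(is_true("Weasels are green"))  # False
```

The left side of each "if and only if" is a fact about a string; the right side is a fact about the world.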
Similarly, the notion of truth is quite different from the notion of reality. Saying "true" compares a belief to reality. Reality itself does not need to be compared to any beliefs in order to be real. Remember this the next time someone claims that nothing is true.
Studying programming seems like it should be an even better way; mixing up 'int' and 'int*' will get you problems fast. On the other hand, I gather from your example that a lot of programmers made this mistake.
I like these posts, but let me add a couple of comments. In philosophical circles the "type distinction", as you call it, is known as the use/mention distinction, i.e. the distinction between using a phrase like "evening star" (to talk about the thing itself) and merely talking about the phrase (usually signaled by quotation marks).
But that's not the first problem you mentioned, which is known in philosophical circles as the failure of substitution in intensional (i.e., roughly, mental) contexts. I'm not so sure the use/mention distinction is useful in explaining this failure. For example, the sentence "Lois is looking for Superman" cannot be substituted for "Lois is looking for Clark Kent", because she may not know that Superman and Clark Kent are identical. Obviously we never make that mistake, but if someone were to make it, the reason is surely to do with failing to realise that Lois may have false beliefs. But that's not a category mistake.
What is Lois actually looking for? When we say she's looking for Superman, we mean she's got a search target in her mind, a conceptual representation of Superman, and she's looking for something that matches that target closely enough to satisfy her. (Or, well, we ought to mean that. What we actually mean, I'm less sure of.)
If I introduce the typographical convention [X] to designate a conceptual representation of an object X and the convention m(x) to designate an object that matches a concept x, then Lois is looking for m([Superman]).
Superman is Clark Kent, but [Superman] is decidedly not [Clark Kent]. To expect that, because Superman is Clark Kent, Lois is looking for m([Clark Kent]) sure sounds like a category mistake to me.
One nice thing about this is that if you know the secret, then in your mind [Superman] starts to resemble [Clark Kent] very closely... they aren't identical, but any m([Superman]) is almost undoubtedly also a m([Clark Kent]) and vice versa. Which is exactly what we would expect -- the more I believe Clark Kent and Superman are one and the same, the more likely it is that if I'm looking for one I'll terminate the search upon finding the other.
Yes, like John O, I think this post misdiagnoses the problem.
John and Mary have beliefs about the evening star, not 'the evening star'. Their beliefs are about the world, not about the words. Neither of them believes that 'the evening star' is the god Venus. Who ever thought that the god Venus was a string of three words!?
Further -- though more contentiously -- we might even deny that Mary has any beliefs about the evening star (that very thing, i.e. the planet). She takes the world to be a certain way, such that a god Venus appears in the evening sky, etc. But given that our term 'the evening star' actually denotes a planet, perhaps we misdescribe Mary's belief by employing this term. She might attempt to use the term herself in describing her belief, but this is because she doesn't really know what it means, so she doesn't realize that linguistic error is causing her to misdescribe her belief contents.
For further explanation, see: Belief Content and Linguistic Error.
I believe that is exactly what Eliezer is saying.
"The evening star" refers to two entirely different things to Mary and John. John believes 2+2 = 4, while Mary believes 2+2 = red and 4 = blue. Attempting to substitute 4 for red does not work, because John's 4 is not even the same type of thing as Mary's.
John learns that Mary believes 2+2 = red. When John sees Mary write ((2+2) + 4) = purple, John incorrectly thinks Mary believes (red + red) = purple.
The problem, of course, is that Mary does not believe 2+2=4; to her that would be ridiculous. So John makes an incorrect inference about what Mary does believe, because his beliefs are entirely different from hers.
I know a really bad one which nearly turned my stomach: Some newspaper wrote "Survey uncovers that X's have the property Y!" (I forget the details). I read the article and it turned out that, according to some survey, most people believe that X's have the property Y. Argh!
The problem is that identity has been treated as if it were absolute, as if when two things are identical in one system, they are identical for all purposes.
The way I see it, identity is relative to a given system. I'd define it thus: A=B in system S just if for every equivalence relation R that can be constructed in S, R(A,B) is true. "Equivalence relation" is defined in the usual way: reflexive, symmetrical, transitive.
My formulation quantifies over equivalence relations, so it's not properly a relation in the system itself. It "lives" in any meta-logic about S that supports the definition's modest components: Ability to distinguish equivalence relations from other types, quantification over equivalence relations in S, ability to apply a variable that's known to be an equivalence relation, and ability to conjoin an arbitrary number of conjuncts. The fact that it's not in the system also avoids the potentially paradoxical situation of including '=' among its own conjuncts.
Given my formulation, it's easily seen that identity needs to be relative to some system. If we were to quantify over all equivalence relations everywhere, we would have to include relations like "Begins with the same letter", "Has the same ASCII representation", or "Is printed at the same location on the page". These relations would fail on A=B and on other equivalences that we certainly should allow at least sometimes. In fact, the '=' test would fail on every two arguments, since the relation "is passed to the NNNNth call to '=' as the same argument index" must fail for those arguments. It could only succeed in a purely Platonic sense. So identity needs to be relative to some system.
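The definition can be sketched in Python under some simplifying assumptions: a "system" is modeled as a finite list of the equivalence relations it can construct, expressions are strings containing only '+', and all the names below are hypothetical.

```python
# Hypothetical sketch: a "system" is modeled as the list of equivalence
# relations it can construct; `identical` quantifies over them from outside.

def value(expr):
    """Evaluate a sum written like "2+2" (only '+' is supported in this sketch)."""
    return sum(int(part) for part in expr.split("+"))

def same_value(a, b):
    """Equivalence relation: the two expressions denote the same number."""
    return value(a) == value(b)

def same_spelling(a, b):
    """Equivalence relation: the two expressions are written identically."""
    return a == b

def identical(a, b, system):
    """A = B in `system` iff every equivalence relation in it relates A and B."""
    return all(relation(a, b) for relation in system)

arithmetic = [same_value]                  # only cares about what is denoted
text_editor = [same_value, same_spelling]  # also sensitive to spelling

print(identical("2+2", "4", arithmetic))   # True
print(identical("2+2", "4", text_editor))  # False
```

The same pair of arguments is identical in one system and distinct in the other, depending only on which relations the system admits.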
How can systems differ in what equivalence relations they allow, in ways that are relevant here? For instance, suppose you write a theorem prover in Lisp. In the Lisp code, you definitely want to distinguish symbols that have different names. Their names might even have decomposable meaning, e.g. in a field accessor like 'my-struct-my-field'. So implicitly there is an equivalence relation 'has-same-name' about the Lisp. In the theorem prover itself, there is no such relation as has-same-Lisp-name or even has-same-symbol-in-theorem-prover. (You can of course feed the prover axioms which model this situation. That's different, and doesn't give you real access to these distinctions.)
The text editor in which you write the Lisp code has yet another catalog of equivalence relations. It includes many distinctions that are sensitive to spelling or location. They don't trip us up here; they are just the sort of things that a text editor should distinguish and a theorem prover shouldn't.
The code in which your text editor is written makes yet other distinctions.
So what about the cases at hand? They are both about logic of belief (doxastic logic). Doxastic logic can contain equivalence relations that fail even on de re equivalent objects. For instance, doxastic logic should be able to say "Alice believes A but not B" even when A and B are both true. Given that sort of expressive capability, one can construct the relation "Alice believes either both A and B or neither", which is reflexive, symmetrical, transitive; it's an equivalence relation and it treats A and B differently.
So A and B are not identical here even though de re they are the same.
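As a small check of that construction, here is a Python sketch. The belief set, the third proposition C, and the function name are assumptions for illustration. It verifies over a small finite domain that the relation is reflexive, symmetric, and transitive, and that it still separates A from B.

```python
from itertools import product

# Illustrative assumptions: Alice believes A but not B; C is a third
# proposition she also does not believe.
alice_believes = {"A"}
propositions = ["A", "B", "C"]

def same_belief_status(p, q):
    """Alice believes either both p and q, or neither."""
    return (p in alice_believes) == (q in alice_believes)

reflexive = all(same_belief_status(p, p) for p in propositions)
symmetric = all(
    same_belief_status(p, q) == same_belief_status(q, p)
    for p, q in product(propositions, repeat=2)
)
transitive = all(
    same_belief_status(p, r)
    for p, q, r in product(propositions, repeat=3)
    if same_belief_status(p, q) and same_belief_status(q, r)
)

print(reflexive, symmetric, transitive)  # True True True
print(same_belief_status("A", "B"))      # False
print(same_belief_status("B", "C"))      # True
```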
I think that in physics we would deal with this as a mapping problem. John's and Mary's beliefs about the planet live in different spaces, and we need to pick a basis on which to project them in order to compare them. We use language as the basis. But then when we try to map between concepts, we find that the problem is ill posed: it doesn't have a unique solution because the maps are not all 1:1.
meh. That first section reads like the missing dollar paradox...
M.ms==M.L (mary's morningstar is equal to mary's lucifer); M.es==M.V ; J.ms==J.V==J.es.
classical logic repaired. sortof. And you might not like the joke i'm about to make...
"if and only IF", isnt it?
Broken link to http://yudkowsky.net/bayes/truth.html
This hasn't been fixed