# 5

Summary: When we define the range of possible values for a variable X, we are fixing an ontology, that is, a way of carving up the space of values. The Law of Identity asserts that this ontology respects a given equivalence function.

Wikipedia defines the Law of Identity as follows: "In logic, the law of identity states that each thing is identical with itself". It is often written as X=X.

While this law seems straightforward, it is anything but once we start digging into what it actually means. The challenge is that it's very difficult to say what this law means without stating a tautology.

Take, for example, the definition above. What does it mean for a thing (let's say A, to be concrete) to "be identical with itself"?

Well, in order for this statement to be meaningful, we need a model in which A is *not* identical to itself that we can reject. If we don't have such a model to reject, then the statement is tautological.

We can represent this using set theory as follows:

1. Let A be a set containing a1 and a2 (i.e. A = {a1, a2}). Here A is the "thing" and we're assigning it two separate sub-things that "are it" so that we can talk about them being equal to each other or not. We can think of A as corresponding to a congruence relation in set theory.
2. When checking whether A is identical to itself, we'll be checking whether a1 = a2. If there were more than two elements, then we would do a pairwise comparison. If instead of numbers the elements are something like formulas applied to a specific value, then we'll have to evaluate them before making the comparison.
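As a rough sketch of this set-theoretic picture (the function name and equivalence parameter here are my own, purely illustrative), the pairwise check might look like:

```python
# Illustrative sketch: model a "thing" A as a collection of sub-things and
# check the Law of Identity by pairwise comparison under an equivalence
# function (numerical equality by default).
from itertools import combinations

def is_identical_to_itself(thing, equivalent=lambda x, y: x == y):
    """True if every pair of elements of `thing` is equivalent."""
    return all(equivalent(x, y) for x, y in combinations(thing, 2))

# Two copies that agree can be collapsed into one coarse-grained variable:
print(is_identical_to_itself([100, 100]))  # True
# Two observations that disagree force fine-grained variables:
print(is_identical_to_itself([100, 99]))   # False
```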

We can now consider some concrete examples (apologies if the examples are repetitive):

• Let's suppose a1 = 100 represents a variable that we have in memory. We copy it such that we have another variable a2 in memory, and if the copy operation happened successfully, it should also be 100. However, we can also imagine an unreliable copy operation which often produces a value of 99 or 101. In this context, A being identical to itself can serve as a shorthand for the copy operation being reliable, such that we can treat all copies as a single variable. If operations are regularly unreliable, then we will tend to assign each copy its own variable.
• Let x = 10 and suppose we have two functions f1 and f2. Similar to how we made A, let's define F = {f1(x), f2(x)}. Suppose f represents different attempts to apply the operation x → x², with f1(x) and f2(x) being two separate attempts to apply it to the same value of x. If our calculation is reliable, then f1(x) should equal f2(x); however, if f1 correctly calculates 100 and f2 incorrectly calculates 1,000, similar to how a human might mess up, then they will differ. So here, F being identical to itself can serve as a shorthand for the operation providing consistent answers. If operations are regularly unreliable, then we will tend to assign each run its own variable.
• Let B represent different observations of the number of bananas in front of me and b1, b2 represent the counts from two specific observations. If I look once and see two bananas, then look a second time and see an extra banana has appeared, then I'll need two separate variables to represent the number of bananas. But if we've limited the scope of our consideration such that there are only ever three bananas in front of me, I can use any member of B. Another situation where I might need two separate variables would be if there were only ever three bananas, but I messed up the count. Here, B being identical to itself represents that multiple observations should return the same answer. If the number of bananas were changing, or my observations of how many bananas were in front of me weren't consistent, then I'd need two separate variables.
• We can imagine a similar situation as the last point, but instead of making observations of the same aspect of the world which we believe to be constant, we could be accessing the same value in memory. This is similar to the first point, but we're considering multiple accesses of a variable stored in the same location, rather than different locations.

I could keep going and listing different scenarios, but as we can see the Law of Identity is actually pretty complicated underneath and can represent quite different things in different scenarios.

In each scenario, we had a variable that could potentially be further sub-divided (i.e. by indexing on the copy, computation, observation or retrieval). We discussed the conditions when it would make sense to work with the coarse-grained variable (represented by the set) and when it would make sense to work with the fine-grained variables (represented by individual elements). In this article, we considered numerical equivalence, but it works with other kinds of equivalence as well.

## Consequences

I think understanding the Law of Identity is a pretty good starting point for trying to understand the nature of logic and mathematics. After all, it's pretty much the simplest law out there. And if you don't have a clear explanation of this law, then that might be a hint that you're not ready to tackle the deeper questions yet.

I guess that if we took this understanding of the Law of Identity and tried to extrapolate it out to a theory of logic, the natural way to do this would be to produce a non-Cartesian view of logic, where logic describes an abstraction for thinking about the agent's interactions with the world and/or the agent's understanding of its own cognition.

Let me know if you think I should write something about some of the other basic axioms of logic, but to be honest, I'm not really planning to do so at the moment, as I think extending this kind of reasoning to those axioms should be relatively straightforward.

Justis suggested adding the following example to further clarify when the Principle of Identity doesn't hold: "Suppose you're playing a game and a certain enemy always drops one of three things, but the precise thing varies randomly. It could drop a gold, silver, or bronze coin, for example. Then enemyDrop() != enemyDrop(), so identity in some sense doesn't hold on invocation."

On the other hand, you may have enemyDrop(seed) = enemyDrop(seed). So again, the law only holds when your variable is sufficiently fine-grained.
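A minimal sketch of this (only the name enemyDrop comes from the example above; the drop table and implementation are my own assumptions):

```python
# Hypothetical enemy-drop function: identity fails across bare invocations
# but holds once the variable is fine-grained by a seed.
import random

DROPS = ["gold", "silver", "bronze"]

def enemyDrop(seed=None):
    # With seed=None, Python seeds from OS entropy, so two invocations
    # draw independently and enemyDrop() != enemyDrop() can happen.
    rng = random.Random(seed)
    return rng.choice(DROPS)

# Fixing the seed makes the variable fine-grained enough for identity:
assert enemyDrop(42) == enemyDrop(42)
```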



I made the following observation to Chris on Facebook which he encouraged me to post here.

My point was basically just that, in reply to the statement "If we don't have such a model to reject, the statement will be tautological": relative to the standard semantics for first-order languages with equality, there is in fact no model-combined-with-an-interpretation-of-the-free-variables for which "x=x" comes out false. That is to say, relative to the standard semantics the formula is a "logical truth" in that sense, although we usually reserve "tautology" for formulas that are tautologies in propositional logic (that is, true under every Boolean valuation: a truth-valuation of all subformulas starting with a quantifier and all atomic subformulas, which then gets extended to a truth-valuation of all subformulas using the standard rules for the propositional connectives). So most certainly "x=x" is universally valid, relative to the standard semantics, and in the sense just described, there is no counter-model.

I take it that Chris' project here is in some way to articulate in what sense the Law of Identity could be taken as a statement that "has content" to it. It sounds as though the best approach to this might be to try to take a look at how you would explain the semantics of statements that involve the equality relation. It looks as though it should be in some way possible to defend the idea that the Law of Identity is in some way "true in virtue of its meaning".

> So most certainly "x=x" is universally valid, relative to the standard semantics, and in the sense just described, there is no counter-model.

Indeed. If we want such a counter-model, then we'll need a different formalisation. This is what I provided above.

> It looks as though it should be in some way possible to defend the idea that the Law of Identity is in some way "true in virtue of its meaning".

I would be surprised if this were the case. I guess my argument above doesn't aim to argue for the Law of Identity a priori, but rather treats it as a way of representing that our variables don't need to be more fine-grained given a particular context and a particular equivalence function. In other words, we adopt the Law of Identity because it is part of a formalisation (more properly, a class of formalisations) that is useful in an incredibly wide range of circumstances. At least part of why it is so widely useful is that we can use it to formalise parts of our cognition, and we use our cognition everywhere.
