The determinant is a rather bizarre concept when one first encounters it. Given a square $n \times n$ matrix $A$, it's defined as

$$\det(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} A_{\sigma(i)\, i}$$

where $S_n$ is the symmetric group on $n$ elements - the collection of all permutations of a set of $n$ distinct objects - and $\operatorname{sgn}(\sigma)$ is the sign of the permutation $\sigma$. $A_{ij}$ represents the entry of the matrix $A$ on the $i$th row and $j$th column.

This definition means that the determinant is a sum that has one term for each permutation $\sigma$ of the columns of the matrix $A$, and these terms are the product of entries of the matrix up to a funny sign depending on the specific permutation $\sigma$.

To understand this definition, it's essential that we first understand what the sign of a permutation is.

Signs

The sign corresponds to the parity of the number of pairs of elements whose order is reversed by a permutation: if this number is even then we say the permutation is even and of sign $+1$, and if it's odd we say the permutation is odd and of sign $-1$.

Let's do some simple examples to see how this works, letting $n = 3$:

  • The identity permutation reverses the order of exactly zero pairs, and zero is even. Therefore the identity is even, or its sign is positive.

  • The permutation $(1\,2)$, which swaps $1$ and $2$ but leaves $3$ invariant, changes the ordering of only one pair: the pair $(1, 2)$. So this permutation is odd.

  • The permutation $\sigma$ sends $1 \to 2$, $2 \to 3$, $3 \to 1$. This permutation changes the ordering of $(1, 3)$ and $(2, 3)$ but leaves $(1, 2)$ invariant, so it reorders exactly two pairs. In other words, it's even.

Another way to think about the sign is as follows: any permutation of a finite set can be obtained by repeatedly swapping two elements of the set with each other. For example, we can get the permutation $\sigma$ defined above by first swapping $2, 3$ and then swapping $1, 2$, i.e. $\sigma = (1\,2)(2\,3)$, applying the rightmost swap first. This expression is of course not unique, since we also have $\sigma = (1\,2)(2\,3)(1\,3)(1\,3)$, for example: swapping $1, 3$ two times in a row just gets us back to where we started. However, it turns out that given a specific permutation, the parity of the number of pair swaps (or transpositions) we need to perform to obtain it is well defined. In our case, for instance, while we can produce $\sigma$ out of two or four transpositions, we can't produce it using exactly three.
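
As a quick illustration, here is a minimal Python sketch (my own, not code from the post) that computes the sign by counting reversed pairs directly; representing a permutation as a 0-based tuple is just a convention for the example.

```python
# A minimal sketch: the sign of a permutation via counting reversed pairs (inversions).
# A permutation of {1, ..., n} is represented as a 0-based tuple, so (1, 0, 2) is the
# permutation that swaps the first two elements and fixes the third.
from itertools import combinations

def sign(perm):
    inversions = sum(1 for i, j in combinations(range(len(perm)), 2) if perm[i] > perm[j])
    return 1 if inversions % 2 == 0 else -1

print(sign((0, 1, 2)))  # identity: zero reversed pairs, so +1
print(sign((1, 0, 2)))  # swaps the first two elements: one reversed pair, so -1
print(sign((1, 2, 0)))  # the 3-cycle from the example above: two reversed pairs, so +1
```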

So that's what the sign means. Knowing this, we can work out the determinant in some explicit cases. For instance, the determinant of a two-by-two matrix is $A_{11} A_{22} - A_{12} A_{21}$.
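
To make the formula concrete, here is a short, deliberately naive Python sketch of the permutation-sum definition (my own illustration, not code from the post); it runs in $O(n! \cdot n)$ time, so it is only meant for small matrices.

```python
# The determinant as a sum over permutations, as in the definition at the top of the post.
from itertools import permutations

def sign(perm):
    # parity of the number of reversed pairs, as in the previous sketch
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return 1 if inversions % 2 == 0 else -1

def det(A):
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= A[perm[i]][i]  # the entry in row perm(i) and column i
        total += term
    return total

print(det([[1, 2],
           [3, 4]]))  # 1*4 - 2*3 = -2, matching A11*A22 - A12*A21
```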

This expression still doesn't look like it should mean anything, so we need to further justify it. The key property of the determinant is that it commutes with matrix multiplication: for two $n \times n$ matrices $A, B$, we have $\det(AB) = \det(A) \det(B)$. If we want to multiply two square matrices and then take their determinant, it doesn't matter which order we do the operations in: we can take the determinants first and then multiply or the other way around. We'll get the same answer in the end.
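
As a quick sanity check of this property (a sketch of mine using NumPy, with arbitrary random matrices):

```python
# Numerically checking det(AB) = det(A) * det(B) on random matrices.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

print(np.linalg.det(A @ B))                 # determinant of the product
print(np.linalg.det(A) * np.linalg.det(B))  # product of the determinants; equal up to rounding error
```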

This is a desirable property because matrices are big and complicated objects while numbers are comparatively simpler. The determinant gives us a way to reduce some questions about matrices to questions about numbers, which can be much easier to answer. It does this by throwing away a lot of information about the matrix, but that's not necessarily a problem depending on what we want to do.

So we want to find a map $\det$, taking square matrices to scalars, which has the property we just mentioned: it commutes with matrix multiplication. Furthermore, we should ask it to not be a trivial map: it shouldn't just send all matrices to $0$ or to $1$, since a constant function doesn't give us any information about anything. How might we go about finding such a map?

Finding the determinant

Let's first look at a special class of matrices. We know that a vector space $\mathbb{R}^n$, for example, has an obvious basis consisting of the vectors

$$e_1 = (1, 0, \ldots, 0), \quad e_2 = (0, 1, \ldots, 0), \quad \ldots, \quad e_n = (0, 0, \ldots, 1).$$

We can form a subgroup of matrices closed under multiplication by just looking at matrices where the columns are a permutation of these vectors. Now, we see a connection with the sign of a permutation: it's the only nontrivial way we know (and in fact it's the only way to do it at all!) to assign a scalar value to a permutation in a way that commutes with composition, which in this special case we know the determinant must do. Therefore, we can tentatively define

$$\det(e_{\sigma(1)}, e_{\sigma(2)}, \ldots, e_{\sigma(n)}) = \operatorname{sgn}(\sigma),$$

where the notation $\det(v_1, v_2, \ldots, v_n)$ represents that this is the determinant of a matrix with columns $v_1, v_2, \ldots, v_n$ respectively. This tells us the determinant has something to do with signs of permutations. In fact, combined with the fact that the determinant commutes with matrix multiplication, by multiplying any matrix $A$ with columns $v_1, v_2, \ldots, v_n$ by a permutation matrix $P_\sigma$ (the matrix with columns $e_{\sigma(1)}, \ldots, e_{\sigma(n)}$), we can deduce

$$\det(v_{\sigma(1)}, v_{\sigma(2)}, \ldots, v_{\sigma(n)}) = \det(A P_\sigma) = \det(A) \det(P_\sigma) = \operatorname{sgn}(\sigma)\, \det(v_1, v_2, \ldots, v_n).$$

In other words, if seen as a map defined on the rows (or columns) of a matrix, the determinant is alternating: swapping two rows or columns changes the sign of its value. We can infer from this that if a matrix has two identical rows or columns its determinant will be zero: swapping the two identical columns leaves the matrix unchanged but flips the sign of the determinant, so the determinant must equal its own negative.
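
Here is a small numerical illustration of the alternating property (my own sketch, using an arbitrary random matrix):

```python
# Swapping two columns flips the sign of the determinant, and a repeated column forces it to zero.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))

swapped = A[:, [1, 0, 2]]  # the same matrix with its first two columns swapped
print(np.linalg.det(A), np.linalg.det(swapped))  # same magnitude, opposite sign

B = A.copy()
B[:, 1] = B[:, 0]          # make two columns identical
print(np.linalg.det(B))    # zero (up to floating-point error)
```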

Here we seem to be stuck again: the problem is the assumptions we've made so far are too weak. We can look to make them stronger while making the determinant have more and more nice properties in the process.

The nicest property we could ask for is that $\det$ is linear as a function of matrices, that is, $\det(A + B) = \det(A) + \det(B)$. However, this condition is much too strong. For example, if we take the identity matrix and apply the permutation $\sigma$ to its columns twice to get a total of three matrices, it's easy to see that all of them are of determinant $1$, but their sum is a matrix with all entries equal to $1$ and so must have determinant $0$ rather than $1 + 1 + 1 = 3$. In other words, if we ask $\det$ to commute with addition as well as multiplication, we can't continue our construction.
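
Here is the counterexample written out numerically (a sketch of mine; the matrices are the three column permutations of the identity described above):

```python
# The identity and its two cyclic column permutations each have determinant 1,
# but their sum is the all-ones matrix, whose determinant is 0 rather than 3.
import numpy as np

I = np.eye(3)
P = I[:, [1, 2, 0]]   # identity with columns cyclically permuted once
P2 = I[:, [2, 0, 1]]  # identity with columns cyclically permuted twice

print(np.linalg.det(I), np.linalg.det(P), np.linalg.det(P2))  # 1.0, 1.0, 1.0
print(np.linalg.det(I + P + P2))                              # 0.0: the all-ones matrix is singular
```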

What is weaker than this that we could hope to ask for? Well, we've already seen that the determinant can be seen as a function of the columns of a matrix, so perhaps instead of being linear in the whole matrix it's linear in each column (and row) separately. In other words,

$$\det(\alpha u + \beta w, v_2, \ldots, v_n) = \alpha \det(u, v_2, \ldots, v_n) + \beta \det(w, v_2, \ldots, v_n),$$
and since the determinant is alternating this will generalize to all other columns as well.

It may not seem like it, but we already have enough information now to determine $\det$ uniquely. Indeed, this is because any matrix $A$ can be expressed as

$$A = \left( \sum_{k_1 = 1}^{n} A_{k_1 1}\, e_{k_1}, \;\; \sum_{k_2 = 1}^{n} A_{k_2 2}\, e_{k_2}, \;\; \ldots, \;\; \sum_{k_n = 1}^{n} A_{k_n n}\, e_{k_n} \right).$$

If this is not clear to you, simply imagine we decompose each column of $A$ into a linear combination of the elementary basis vectors $e_1, e_2, \ldots, e_n$.

We know the determinant is linear in each column, and we know what values the determinant takes on permutation matrices, so we can actually evaluate the determinant of this expression directly using the properties we've postulated so far. We simply "expand out" the map $\det$ using linearity to get

$$\det(A) = \sum_{f} \left( \prod_{i=1}^{n} A_{f(i)\, i} \right) \det(e_{f(1)}, e_{f(2)}, \ldots, e_{f(n)}),$$

where the outside sum runs over all functions $f$ from $\{1, 2, \ldots, n\}$ to itself. Now all we have to do is to plug in the values of determinants we've already figured out. If $f$ is not a permutation, in other words if it takes two identical values, then the determinant $\det(e_{f(1)}, \ldots, e_{f(n)})$ will be zero. So we can assume $f$ is actually a permutation, in which case we already know that the determinant must equal the sign of $f$. In other words,

$$\det(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} A_{\sigma(i)\, i}.$$

This is obviously alternating and multilinear, and we can show that any alternating multilinear map taking the value $1$ on the identity matrix must actually commute with matrix multiplication, so we've found the map we were looking for.

Takeaways

The way I went about finding the determinant above might not be intuitive. At first glance, it looks counterproductive to add further assumptions about the map when we only want it to commute with matrix multiplication. However, there are two reasons why this is a good strategy:

  1. If it works, the map we end up with is much more regular and well behaved than what we would've obtained if we picked an arbitrary map just satisfying our original condition. In our context, we clearly want $\det$ to be as well behaved as possible, so this is only beneficial to us.

  2. Narrowing the search space down to maps we understand better is a good start in any search process. Either it succeeds, in which case the additional assumptions will have only made things easier for us; or it fails, in which case we learn a useful fact that a map having some collection of properties is actually impossible. Failure can give us some insight into how we might want to weaken our conditions in order to be successful the next time around.

Comments

I always liked the interpretation of the determinant as measuring the expansion/contraction of n-dimensional volumes induced by a linear map, with the sign being negative if the orientation of space is flipped. This makes various properties intuitively clear such as non-zero determinant being equivalent to invertibility.
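
A tiny numerical illustration of this picture (my own sketch, with an arbitrary example matrix): in 2D the unit square is mapped to the parallelogram spanned by the columns, and the signed area of that parallelogram is exactly the determinant.

```python
# Signed area of the image of the unit square under a 2x2 matrix equals det(A).
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

u, v = A[:, 0], A[:, 1]                  # images of the standard basis vectors
signed_area = u[0] * v[1] - u[1] * v[0]  # 2D cross product of the two column vectors

print(signed_area)       # 6.0
print(np.linalg.det(A))  # 6.0 as well
```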

Yup, determinant is how much the volume stretches. And trace is how much the vectors stay pointing in the same direction (average dot product of v and Av). This explains why trace of 90 degree rotation in 2D space is zero, why trace of projection onto a subspace is the dimension of that subspace, and so on.
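
A quick numerical check of this intuition (a sketch of mine; the rotation and projection matrices below are just the standard examples):

```python
# trace(A) is the sum of e_i . (A e_i) over the standard basis vectors e_i.
import numpy as np

rotation_90 = np.array([[0.0, -1.0],
                        [1.0,  0.0]])   # 90-degree rotation of the plane
projection = np.diag([1.0, 1.0, 0.0])   # orthogonal projection onto a 2-dimensional subspace

print(np.trace(rotation_90))  # 0.0: each rotated basis vector is orthogonal to the original
print(np.trace(projection))   # 2.0: the dimension of the subspace being projected onto

e = np.eye(3)
print(sum(e[i] @ projection @ e[i] for i in range(3)))  # 2.0 again: sum of e_i . (A e_i) equals the trace
```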

Thank you for that intuition into the trace! That also helps make sense of $\det(\exp(A)) = \exp(\operatorname{tr}(A))$.

Interesting, can you give a simple geometric explanation?

My intuition for $\exp(A)$ is that it tells you how an infinitesimal change accumulates over finite time (think compound interest). So the above expression is equivalent to $\det(I + \epsilon A) \approx 1 + \epsilon \operatorname{tr}(A)$ for small $\epsilon$. Thus we should think 'If I perturb the identity matrix, then the amount by which the unit cube grows is proportional to the extent to which each vector is being stretched in the direction it was already pointing'.

Hmm, this seems wrong but fixable. Namely, exp(A) is close to (I+A/n)^n, so raising both sides of det(exp(A))=exp(tr(A)) to the power of 1/n gives something like what we want. Still a bit too algebraic though, I wonder if we can do better.
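
For what it's worth, here is a quick numerical check (my own sketch) of the identity, using exactly the $(I + A/N)^N$ approximation mentioned above:

```python
# Checking det(exp(A)) = exp(tr(A)), approximating exp(A) by (I + A/N)^N for large N.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))

N = 100_000
expA = np.linalg.matrix_power(np.eye(3) + A / N, N)  # approximates the matrix exponential

print(np.linalg.det(expA))  # approximately equal to ...
print(np.exp(np.trace(A)))  # ... exp(tr(A))
```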

Another thing to say is that if $A$ has eigenvalues $\lambda_1, \ldots, \lambda_n$ (counted with multiplicity), then

$$\det(\exp(A)) = \prod_{i=1}^{n} e^{\lambda_i} = e^{\lambda_1 + \cdots + \lambda_n} = \exp(\operatorname{tr}(A)).$$

I think the determinant is more mathematically fundamental than the concept of volume. It just seems the other way around because we use volumes in every day life.

I think the good abstract way to think about the determinant is in terms of induced maps on the top exterior power. If you have an $n$-dimensional vector space $V$ and an endomorphism $f : V \to V$, this induces a map $\Lambda^n f : \Lambda^n V \to \Lambda^n V$, and since $\Lambda^n V$ is always one-dimensional this map must be of the form $x \mapsto c x$ for some scalar $c$ in the ground field. It's this $c$ that is the determinant of $f$.

This is indeed more fundamental than the concept of volume. We can interpret exterior powers as corresponding to volume if we're working over a local field, for example, but actually the concept of exterior power generalizes far beyond this special case. This is why the determinant still preserves its nice properties even if we work over an arbitrary commutative ring, since such rings still have exterior powers behaving in the usual way.

I didn't present it like this in this post because it's actually not too easy to introduce the concept of "exterior power" without the post becoming too abstract.

This is close to one thing I've been thinking about myself. The determinant is well defined for endomorphisms on finitely-generated projective modules over any ring. But the 'top exterior power' definition doesn't work there because such things do not have a dimension. There are two ways I've seen for nevertheless defining the determinant.

  • View the module as a sheaf of modules over the spectrum of the ring. Then the dimension is constant on each connected component, so you can take the top exterior power on each and then glue them back together.
  • Use the fact that finitely-generated projective modules are precisely those which are the direct summands of a free module. So given an endomorphism $f : M \to M$ you can write $M \oplus N \cong R^n$ and then define $\det(f) := \det(f \oplus \operatorname{id}_N)$.

These both give the same answer. However, I don't like the first definition because it feels very piecemeal and nonuniform, and I don't like the second because it is effectively picking a basis. So I've been working on my own definition where instead of defining $\Lambda^n M$ for natural numbers $n$ we instead define $\Lambda^P M$ for finitely-generated projective modules $P$. Then the determinant is defined via $\Lambda^M f : \Lambda^M M \to \Lambda^M M$.

I'm curious about this. I can see a reasonable way to define $\Lambda^P M$ in terms of sheaves of modules over $\operatorname{Spec}(R)$: Over each connected component, $P$ has some constant dimension $d$, so we just let $\Lambda^P M$ be $\Lambda^d M$ over that component. But it sounds like you might not like this definition, and I'd be interested to know if you had a better way of defining $\Lambda^P M$ (which will probably end up being equivalent to this). [Edit: Perhaps something in terms of generators and relations, with the generators being linear maps $P \to M$?]
