Nov 28, 2012
Edit 11/28: Revised the note at the bottom to specify that the random variables should have finite variance, and that the space in question is essentially just L². Also made some formatting changes.
This is something that has been bugging me for a while.
The correlation coefficient between two random variables can be interpreted as the cosine of the angle between them. The higher the correlation, the more "in the same direction" they are. A correlation coefficient of one means they point in exactly the same direction, while -1 means they point in exactly opposite directions. More generally, a positive correlation coefficient means the two random variables make an acute angle, while a negative correlation means they make an obtuse angle. A correlation coefficient of zero means that they are quite literally orthogonal.
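To make the "cosine of the angle" reading concrete, here is a minimal Python sketch for sample data. It assumes the two samples have equal length and that neither is constant; the function name `correlation_and_angle` is made up for illustration. It centers each sample, treats the centered samples as vectors, and reports both the cosine (the usual sample correlation) and the angle in degrees.

```python
import math

def correlation_and_angle(xs, ys):
    """Return (r, angle_in_degrees) for two equal-length, non-constant samples.

    Hypothetical helper for illustration: r is the cosine of the angle
    between the mean-centered sample vectors, which is the sample
    correlation coefficient.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Subtract the means so each sample becomes a zero-mean vector.
    cx = [x - mean_x for x in xs]
    cy = [y - mean_y for y in ys]
    dot = sum(a * b for a, b in zip(cx, cy))
    norm_x = math.sqrt(sum(a * a for a in cx))
    norm_y = math.sqrt(sum(b * b for b in cy))
    r = dot / (norm_x * norm_y)
    # Guard against rounding pushing r slightly outside [-1, 1].
    r = max(-1.0, min(1.0, r))
    return r, math.degrees(math.acos(r))

if __name__ == "__main__":
    print(correlation_and_angle([1, 2, 3, 4], [2, 4, 6, 8]))   # same direction
    print(correlation_and_angle([1, 2, 3, 4], [8, 6, 4, 2]))   # opposite direction
    print(correlation_and_angle([1, 2, 3, 4], [1, -1, -1, 1])) # orthogonal
```

The three calls should give angles of roughly 0°, 180°, and 90° respectively, matching correlations of 1, -1, and 0.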
Everything I have said above is completely standard. So why aren't correlation coefficients commonly expressed as angles instead of as their cosines? It seems to me that this would make them more intuitive to process.
Certainly it would make various statements about them more intuitive. For instance, "Even if A is positively correlated with B and B is positively correlated with C, A might be negatively correlated with C." This sounds counterintuitive until you rephrase it as "Even if A makes an acute angle with B and B makes an acute angle with C, A might make an obtuse angle with C." Similarly, the geometric viewpoint makes it easier to make observations like "If A and B have correlation exceeding 1/√2 and so do B and C, then A and C are positively correlated" -- because this is just the statement that if A and B make an angle of less than 45° and so do B and C, then A and C make an angle of less than 90°.
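Both statements can be checked numerically. The sketch below is only an illustration under assumptions of my own choosing: `U` and `V` are two hand-picked zero-mean, mutually orthogonal sample vectors of equal length, and `at_angle` (a made-up helper) builds a sample lying at a given angle from `U` in the plane they span. Placing A, B, C at 0°, 60°, 120° gives the non-transitive case; placing them at 0°, 40°, 80° gives a case covered by the 1/√2 bound.

```python
import math

def corr(xs, ys):
    """Sample correlation, computed as the cosine between centered vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cx = [x - mx for x in xs]
    cy = [y - my for y in ys]
    dot = sum(a * b for a, b in zip(cx, cy))
    return dot / math.sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))

# Assumed for illustration: two zero-mean, orthogonal, equal-length directions.
U = [1.0, -1.0, 0.0, 0.0]
V = [0.0, 0.0, 1.0, -1.0]

def at_angle(degrees):
    """Hypothetical helper: a sample at the given angle from U, in span(U, V)."""
    t = math.radians(degrees)
    return [math.cos(t) * u + math.sin(t) * v for u, v in zip(U, V)]

if __name__ == "__main__":
    # Acute, acute, obtuse: pairwise positive correlation is not transitive.
    A, B, C = at_angle(0), at_angle(60), at_angle(120)
    print(corr(A, B), corr(B, C), corr(A, C))

    # Both angles below 45 degrees, so A and C stay within 90 degrees.
    A, B, C = at_angle(0), at_angle(40), at_angle(80)
    print(corr(A, B), corr(B, C), corr(A, C))
```

The first line of output should be about 0.5, 0.5, -0.5; in the second, the first two values exceed 1/√2 ≈ 0.707 and the third is still positive (about 0.17).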
Now when further processing is to be done with the correlation coefficients, one wants to leave them as correlation coefficients, rather than take their inverse cosines just to have to take their cosines again later. (I don't know that the angles you get this way are actually useful mathematically, and I suspect they mostly aren't.) My question rather is about when correlation coefficients are expressed to the reader, i.e. when they are considered as an end product. It seems to me that expressing them as angles would give people a better intuitive feel for them.
Or am I just entirely off-base here? Statistics, let alone the communication thereof, is not exactly my specialty, so I'd be interested to hear if there's a good reason people don't do this. (Is it assumed that anyone who knows about correlation has the geometric point of view completely down? But most people can't calculate an inverse cosine in their head...)
Formal mathematical version: If we consider real-valued random variables with finite variance on some fixed probability space Ω -- that is to say, L²(Ω) -- the covariance is a positive-semidefinite symmetric bilinear form, with kernel equal to the set of essentially constant random variables. If we mod out by these, we can consider the result as an inner product space and define angles between nonzero vectors as usual; the angle between two random variables is then the inverse cosine of their correlation coefficient. Alternatively, we could just take L²(Ω) and restrict to those elements with zero mean; this subspace is isomorphic to the quotient (since it is the image of the "subtract off the mean" map, whose kernel is precisely the essentially constant random variables).
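Written out in symbols, the construction above amounts to the following. The notation here (angle brackets for the inner product, σ for standard deviation, ρ for correlation, θ for the angle) is my own choice for this sketch, and the angle is only defined when both variances are nonzero.

```latex
\begin{align*}
\langle X, Y \rangle &= \operatorname{Cov}(X, Y)
  = \mathbb{E}\bigl[(X - \mathbb{E}X)(Y - \mathbb{E}Y)\bigr], \\
\lVert X \rVert &= \sqrt{\langle X, X \rangle}
  = \sqrt{\operatorname{Var}(X)} = \sigma_X, \\
\cos\theta_{X,Y} &= \frac{\langle X, Y \rangle}{\lVert X \rVert \, \lVert Y \rVert}
  = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \rho_{X,Y}, \\
\theta_{X,Y} &= \arccos \rho_{X,Y} \in [0, \pi].
\end{align*}
```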