I too spent a few years with a similar desire to understand probability and statistics at a deeper level, but we might have been stuck on different things. Here's an explanation:

Suppose you have 37 numbers. Purchase a massless ruler and 37 identical weights. For each of your numbers, find the number on the ruler and glue a weight there. You now have a massless ruler with 37 weights glued onto it.

Now try to balance the ruler sideways on a spike sticking out of the ground. The mean of your numbers will be the point on the ruler where it balances.
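A quick numerical sketch of the balancing claim, in case it helps. The data here are made up (the comment only says "37 numbers"), and `net_torque` is a hypothetical helper name; it treats every weight as having unit mass and ignores the constant factor of gravity.

```python
import random

# Hypothetical data: 37 arbitrary positions on the ruler (an assumption;
# the comment does not give actual numbers).
random.seed(0)
positions = [random.uniform(0, 30) for _ in range(37)]


def net_torque(pivot, xs):
    """Sum of signed lever arms of unit weights at xs, taken about pivot."""
    return sum(x - pivot for x in xs)


mean = sum(positions) / len(positions)

# At the mean the signed lever arms cancel, so the ruler balances
# (the result is zero up to floating-point rounding).
print(net_torque(mean, positions))

# Move the spike away from the mean and the torque is no longer zero,
# so the ruler tips.
print(net_torque(mean + 1.0, positions))
```

The point of the sketch is only that "balance point" and "arithmetic mean" are the same condition: the signed distances from the pivot sum to zero.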

How does that answer the question? It's true that the center of gravity is a mean, but the moment of inertia is not a variance. It's one thing to say something is "proportional to a variance" to mean that the constant is 2 or pi, but when the constant is the number of points, I think it's missing the statistical point.
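To spell out the constant being objected to here, under two assumptions not stated in the thread (each weight has unit mass, and "variance" means the population variance that divides by n), the moment of inertia about the balance point works out as:

```latex
% Assumptions: unit masses at positions x_1, ..., x_n; population variance.
\[
  \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i ,
  \qquad
  \sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 ,
\]
\[
  I = \sum_{i=1}^{n} (x_i - \bar{x})^2 = n\,\sigma^2 .
\]
```

So with 37 weights the moment of inertia is 37 times the variance, which is the "constant is the number of points" issue.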

But the bigger problem is that these are not statistical examples! Means and sums of squares occur in many places, but why are they a good choice for the central tendency and the tendency to be central? Are you suggesting that we think of a random variable as a physical rod? Why? Does trying to spin it have any probabilistic or statistical meaning?

IlyaShpitser (6y, 5 points): Moments of mass in physics is a good intro to moments in stats for people who like to visualize or "feel out" concepts concretely. Good post!

solipsist (6y, 4 points): A different-level explanation, which may or may not be helpful:
Read up on affine space [http://en.wikipedia.org/wiki/Affine_space], convex combinations [http://en.wikipedia.org/wiki/Convex_combination], and maybe this article about torsors [http://math.ucr.edu/home/baez/torsors.html].
If you are frustrated with hand-waving in calculus, read a Real Analysis textbook. The magic words that explain how the heck you can have probability distributions over real numbers are "measure theory" [http://en.wikipedia.org/wiki/Measure_(mathematics)].
