"impression that more advanced statistics is technical elaboration that doesn't offer major additional insights"

Why did you have this impression?

Sorry for the off-topic, but I see this a lot in LessWrong (as a casual reader). People seem to focus on textual, deep-sounding, wow-inducing expositions, but often dislike the technicalities, getting hands dirty with actually understanding calculations, equations, formulas, details of algorithms etc (calculations that don't tickle those wow-receptors that we all have). As if these were merely some minor additions over the really important big picture view. As I see it this movement seems to try to build up a new backbone of knowledge from scratch. But doing this they repeat the mistakes of the past philosophers. For example going for the "deep", outlook-transforming texts that often give a delusional feeling of "oh now I understand the whole world". It's easy to have wow-moments without actually having understood something new.

So yes, PCA is useful and most statistics and maths and computer science is useful for understanding stuff. But then you swing to the other extreme and say "ideas from advanced statistics are essential for reasoning about the world, even on a day-to-day level". Tell me how exactly you're planning to use PCA day-to-day? I think you may mean you want to use some "insight" that you gained from it. But I'm not sure what that would be. It seems to be a cartoonish distortion that makes it fit into an ideology.

Anyway, mainstream machine learning is very useful. And it's usually much more intricate and complicated than to be able to produce a deep everyday insight out of it. I think the sooner you lose the need for everything to resonate deeply or have a concise insightful summary, the better.

Showing 3 of 4 replies (Click to show all)

Why did you have this impression?

Probably because of the human tendency to overestimate the importance of any knowledge one happens to have and underestimate the importance of any knowledge one doesn't. (Is there a name for this bias?)

1[anonymous]5yDon't say the p-word, please ;-). [http://lesswrong.com/lw/4zs/philosophy_a_diseased_discipline/] I do agree that more real-life understanding is gained from just obtaining a broad scientific education than from going wow-hunting. But of course, I would say that, since I'm a fanatical textbook purchaser.
3RomeoStevens5yI think having the concept of PCAs prevents some mistakes in reasoning on an intuitive day to day level of reasoning. It nudges me towards fox thinking instead of hedgehog thinking. Normal folk intuition grasps at the most cognitively available and obvious variable to explain causes, and then our System 1 acts as if that variable explains most if not all the variance. Looking at PCAs many times (and being surprised by them) makes me less likely to jump to conclusions about the causal structure of clusters of related events. So maybe I could characterize it as giving a System 1 intuition for not making the post hoc ergo propter hoc fallacy. Maybe part of the problem Jonah is running in to explaining it is that having done many many example problems with System 2 loaded it into his System 1, and the System 1 knowledge is what he really wants to communicate?

Beyond Statistics 101

by JonahS 2 min read26th Jun 2015132 comments


Is statistics beyond introductory statistics important for general reasoning?

Ideas such as regression to the mean, that correlation does not imply causation and base rate fallacy are very important for reasoning about the world in general. One gets these from a deep understanding of statistics 101, and the basics of the Bayesian statistical paradigm. Up until one year ago, I was under the impression that more advanced statistics is technical elaboration that doesn't offer major additional insights  into thinking about the world in general.

Nothing could be further from the truth: ideas from advanced statistics are essential for reasoning about the world, even on a day-to-day level. In hindsight my prior belief seems very naive – as far as I can tell, my only reason for holding it is that I hadn't heard anyone say otherwise. But I hadn't actually looked advanced statistics to see whether or not my impression was justified :D.

Since then, I've learned some advanced statistics and machine learning, and the ideas that I've learned have radically altered my worldview. The "official" prerequisites for this material are calculus, differential multivariable calculus, and linear algebra. But one doesn't actually need to have detailed knowledge of these to understand ideas from advanced statistics well enough to benefit from them. The problem is pedagogical: I need to figure out how how to communicate them in an accessible way.

Advanced statistics enables one to reach nonobvious conclusions

To give a bird's eye view of the perspective that I've arrived at, in practice, the ideas from "basic" statistics are generally useful primarily for disproving hypotheses. This pushes in the direction of a state of radical agnosticism: the idea that one can't really know anything for sure about lots of important questions. More advanced statistics enables one to become justifiably confident in nonobvious conclusions, often even in the absence of formal evidence coming from the standard scientific practice.

IQ research and PCA as a case study

In the early 20th century, the psychologist and statistician Charles Spearman discovered the the g-factor, which is what IQ tests are designed to measure. The g-factor is one of the most powerful constructs that's come out of psychology research. There are many factors that played a role in enabling Bill Gates ability to save perhaps millions of lives, but one of the most salient factors is his IQ being in the top ~1% of his class at Harvard. IQ research helped the Gates Foundation to recognize iodine supplementation as a nutritional intervention that would improve socioeconomic prospects for children in the developing world.

The work of Spearman and his successors on IQ constitute one of the pinnacles of achievement in the social sciences. But while Spearman's discovery of IQ was a great discovery, it wasn't his greatest discovery. His greatest discovery was a discovery about how to do social science research. He pioneered the use of factor analysis, a close relative of principal component analysis (PCA).

The philosophy of dimensionality reduction

PCA is a dimensionality reduction method. Real world data often has the surprising property of "dimensionality reduction":  a small number of latent variables explain a large fraction of the variance in data.

This is related to the effectiveness of Occam's razor: it turns out to be possible to describe a surprisingly large amount of what we see around us in terms of a small number of variables. Only, the variables that explain a lot usually aren't the variables that are immediately visibleinstead they're hidden from us, and in order to model reality, we need to discover them, which is the function that PCA serves. The small number of variables that drive a large fraction of variance in data can be thought of as a sort of "backbone" of the data. That enables one to understand the data at a "macro /  big picture / structural" level.

This is a very long story that will take a long time to flesh out, and doing so is one of my main goals.