## Is statistics beyond introductory statistics important for general reasoning?

Ideas such as regression to the mean, the principle that correlation does not imply causation, and the base rate fallacy are very important for reasoning about the world in general. One gets these from a deep understanding of statistics 101 and the basics of the Bayesian statistical paradigm. Up until one year ago, I was under the impression that more advanced statistics is technical elaboration that doesn't offer major additional insights into thinking about the world *in general*.

Nothing could be further from the truth: **ideas from advanced statistics are essential for reasoning about the world, even on a day-to-day level.** In hindsight my prior belief seems very naive – as far as I can tell, my only reason for holding it was that I hadn't heard anyone say otherwise. But I hadn't actually looked at advanced statistics to see whether or not my impression was justified :D.

Since then, I've learned some advanced statistics and machine learning, and the ideas that I've learned have radically altered my worldview. The "official" prerequisites for this material are calculus, multivariable differential calculus, and linear algebra. But one doesn't actually need to have *detailed* knowledge of these to understand ideas from advanced statistics well enough to benefit from them. The problem is pedagogical: I need to figure out how to communicate them in an accessible way.

## Advanced statistics enables one to reach nonobvious conclusions

To give a bird's eye view of the perspective that I've arrived at, **in practice**, the ideas from "basic" statistics are useful primarily for **disproving** hypotheses. This pushes in the direction of a state of **radical agnosticism**: the idea that one can't really know anything for sure about lots of important questions. More advanced statistics enables one to become **justifiably confident in nonobvious conclusions**, often even in the absence of formal evidence from standard scientific practice.

## IQ research and PCA as a case study

The work of Spearman and his successors on IQ constitutes one of the pinnacles of achievement in the social sciences. But while Spearman's discovery of IQ was a great discovery, it wasn't his *greatest* discovery. His greatest discovery was a discovery about *how to do social science research*. He pioneered the use of **factor analysis**, a close relative of **principal component analysis (PCA).**

## The philosophy of dimensionality reduction

PCA is a *dimensionality reduction* method. Real-world data often has the surprising property of "dimensionality reduction": a *small* number of latent variables explain a large fraction of the variance in data.

This is related to the effectiveness of Occam's razor: it turns out to be possible to describe a surprisingly large amount of what we see around us in terms of a **small** number of variables. Only, the variables that explain a lot usually **aren't the variables that are immediately visible** – instead they're hidden from us, and in order to model reality, we need to discover them, which is the function that PCA serves. The small number of variables that drive a large fraction of variance in data can be thought of as a sort of "backbone" of the data. That enables one to understand the data at a "macro / big picture / structural" level.
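To make the "small number of latent variables" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the six hypothetical test scores, their loadings on a single hidden "ability" variable, the noise level, and the sample size are all made up, and the data is simulated rather than real. It also uses power iteration to find only the *first* principal component, as a stand-in for a full PCA routine.

```python
import random

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric positive semidefinite matrix,
    found by power iteration. For a covariance matrix this is the
    variance along the first principal component."""
    d = len(matrix)
    vec = [1.0] * d
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        eigenvalue = sum(x * x for x in product) ** 0.5
        vec = [x / eigenvalue for x in product]
    return eigenvalue

# Hypothetical setup: one hidden variable drives six visible scores.
rng = random.Random(0)
loadings = [1.0, 0.9, 0.8, 0.8, 0.7, 0.6]  # assumed, not taken from real data
noise_sd = 0.5                             # assumed noise level

rows = []
for _ in range(2000):
    ability = rng.gauss(0.0, 1.0)          # the latent variable, never observed directly
    rows.append([w * ability + noise_sd * rng.gauss(0.0, 1.0) for w in loadings])

cov = covariance_matrix(rows)
total_variance = sum(cov[i][i] for i in range(len(cov)))  # trace = total variance
share = top_eigenvalue(cov) / total_variance

print(f"share of variance along the first principal component: {share:.2f}")
```

With six visible variables, a single direction would carry only about a sixth of the variance if the scores were unrelated. In this simulated setup the first component carries most of it, because one hidden variable is driving all six scores.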

This is a very long story that will take a long time to flesh out, and doing so is one of my main goals.

But there are trivially easy answers to questions like that. Basically you have to ask "Cease to exist for whom?", i.e. it obviously ceases to exist for you. You just have to taboo words like "really" here, as in "does it really cease to exist", because they are meaningless: they don't lead to predictions. What people often consider "really" reality is the perception of a perfect god-like omniscient observer, but there is no such thing.

Essentially there are just two extremes to avoid: the po-mo "nothing is real, everything is mere perception" and the traditional, classical "but how are things really really REALLY?" The middle way here is "reality is the sum of what could be perceived in principle". A perception is right or wrong based on how much it meshes with all the other things that can in principle be perceived. Everything that cannot be perceived even in theory is not part of reality. There is no how things "really" are; the closest we have to that is the sum of all potential, possible perceivables about a thing.

I picked up this approach from Eric S. Raymond. I think he worked it out decades before Eliezer did, possibly with both of them working from Peirce.

This is basically anti-metaphysics.

Does this imply that only things that exist in my past light cone are real for me at any given moment?