[draft] Generalizing from average: a common fallacy?

The average human has one ovary and one testicle.

If your feet are in a bucket of ice and your head's in an oven, on average you're at a comfortable temperature.

The average family has 2.4 children.

And as for correlations, some years ago I wrote this brief note on how little predictive use you get from the typical magnitude of published correlations.

[-]Morendil14y20

You probably know the story of the three statisticians hunting a tiger? The first statistician's shot goes wild, one meter to the left. The second statistician applies a correction, but overcompensates and misses, one meter to the right.

And that's when the third yells "Got 'im!"

[-]Dmytry14y00

Yeah... I need to expand on that in the post - we have that sort of understanding of how averages fail ('average temperature in a hospital'), but we don't seem to apply it well to correlations (which are still just averages).

edit: your brief note is great. You could expand on something popular - e.g. IQ tests - and consider different IQ scores and what they actually tell you about the probability of an individual doing this well or this badly on another IQ test (using the correlation between the two IQ tests). Or, assuming that there is some 'IQ' that IQ tests correlate with, what an IQ test actually tells you about that IQ.
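A rough sketch of that IQ exercise, under assumptions that are not from the thread: both tests are scaled to mean 100 and standard deviation 15, the two scores are jointly normal, and the test-retest correlation `r = 0.8` is a made-up figure. Under those assumptions the second score, given the first, is normal with mean `100 + r * (x - 100)` and standard deviation `15 * sqrt(1 - r^2)`. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

MEAN, SD = 100.0, 15.0  # assumed IQ scaling, same for both tests


def retest_distribution(first_score, r):
    """Distribution of the second score given the first (bivariate normal assumed)."""
    cond_mean = MEAN + r * (first_score - MEAN)  # regression toward the mean
    cond_sd = SD * sqrt(1.0 - r * r)             # spread left over after conditioning
    return NormalDist(cond_mean, cond_sd)


def prob_retest_at_least(first_score, threshold, r):
    """Probability that the second score is at or above the threshold."""
    return 1.0 - retest_distribution(first_score, r).cdf(threshold)


r = 0.8  # hypothetical correlation between the two tests
d = retest_distribution(130, r)
print(round(d.mean, 1), round(d.stdev, 1))            # 124.0 9.0
print(round(prob_retest_at_least(130, 130, r), 2))    # 0.25
```

So even with a correlation that high, someone who scored 130 once would, in this toy model, score 130 or better on the second test only about a quarter of the time.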

[-]LuxAurumque14y00

It's this kind of thought process that gives me issues with statistics being used by people who don't really understand them.

I'm not trying to get on a high horse and exclaim that the common people shouldn't cite studies and stats, but if you are going to cite them, cite them fully. More often than not, by adding a standard deviation and median to an average, you get a picture much closer to what is actually occurring. But even after that, there are other tests which can yield a whole bunch of information that could be more useful towards refining the picture.
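As a small illustration of how much the median and standard deviation add, here is a sketch with two made-up samples (the numbers and the `summarize` helper are hypothetical) that share the same average but describe very different situations.

```python
from statistics import mean, median, pstdev


def summarize(values):
    """Report the average together with the median and population standard deviation."""
    return {
        "mean": mean(values),
        "median": median(values),
        "stdev": pstdev(values),
    }


clustered = [48, 49, 50, 51, 52]  # everyone is close to the average
skewed = [10, 10, 10, 10, 210]    # one outlier drags the average up

print(summarize(clustered))
print(summarize(skewed))
```

Both samples average 50, but the second has a median of 10 and a standard deviation of 80, which is the part of the picture a bare average hides.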

I guess if you are going to cite a study, you should take the time to read through the math people tend to skip over, or at least, read all of the conclusions drawn from the math, and not simply mine reports for the facts that happen to work for your argument.

[-]Morendil14y00

It's not just averaging, it's the problem of making valid inferences in general; reasoning from observations to generalized conclusions.

In fact, the data in my original post on the cost of fixing defects wasn't even much of an "average" to start with - that is, it wasn't really obtained by sampling a population, measuring some variable of interest, and generalizing from the expected value of that variable in the sample to the expected value of that variable in the population.

The "sample" wasn't really a sample but various samples examined at various times, of varying sizes. The "measure" wasn't really a single measure ("cost to fix") but a mix of several operationalizations, some looking at engineers' reports on timesheets, others looking at stopwatch measurements in experimental settings, others looking at dollar costs from accounting data, and so on. The "variable" isn't really a variable - there isn't widespread agreement on what counts as the cost of fixing a defect, as the thread illustrated in a few places. And so on. So it's no wonder that the conclusions are not credible - "averaging" as an operation has little to do with why.

I have a further post on software engineering mostly written - I've been sitting on it for a few weeks now because I haven't found the time to finalize the diagrams - which shows that a lot of writing on software engineering has suffered from egregious mistakes in reasoning about causality.

[-]othercriteria14y00

I'm trying to understand your apparent distaste for averaging.

In the physics context, you're treating it as some empirical lab technique for dealing with imperfect apparatus. Given its indifference to the particular sorts of noise or error model, averaging can appear to be unprincipled or just a tractable approximation to some better scheme for analyzing all the observations. What if there is temporal or spatial correlation in the errors? What if there is some Simpson's Paradox-style structure between groups of observations? What if the least-significant bits of the measurements spell out in ASCII what the true answer is?

However, it is nearly a meta-theorem of statistics that inference is possible only when averaging is (this follows from looking at the properties of exponential families of distributions, the only really tractable class). If some extra structure is present, the answer is not to give up averaging but instead to average ALL the things (corresponding to sufficient statistics in a richer family).
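To make the Simpson's Paradox point concrete, here is a minimal sketch with illustrative counts that are not from the thread. Option A does better within each group, yet option B looks better once the groups are pooled, because the two options face very different mixes of easy and hard cases. Keeping the per-group averages alongside the pooled one is one reading of "average ALL the things".

```python
# Illustrative (successes, trials) counts per option and case difficulty.
results = {
    "A": {"easy": (81, 87), "hard": (192, 263)},
    "B": {"easy": (234, 270), "hard": (55, 80)},
}


def rate(successes, trials):
    """Success rate within a single group."""
    return successes / trials


def pooled_rate(groups):
    """Success rate after throwing all groups into one pool."""
    successes = sum(s for s, _ in groups.values())
    trials = sum(n for _, n in groups.values())
    return successes / trials


for option, groups in results.items():
    per_group = {name: round(rate(*counts), 3) for name, counts in groups.items()}
    print(option, per_group, "pooled:", round(pooled_rate(groups), 3))
```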

[-]Dmytry14y120

The problem is not with averaging. The problem is the misunderstanding of what the result means and where the result is actually coming from.

The average weight of a stable atomic nucleus, averaged over all stable nuclei [of all elements], for instance, is not an important fact of nuclear physics. It is almost entirely useless trivia, so uninteresting that I wouldn't be surprised if not a single nuclear physicist has ever calculated it. Likewise, average human behaviour, when there is huge variance in human behaviour, is more of a demographic fact than a psychological one.

Likewise in the computer science example: there is a great variety of work being performed, with different consequences for mistakes; the average mistake's average cost over time is much more a fact about the average ratio between different types of work than a fact about the software development process or the fate of any particular mistake and correction. I develop software for a living, and I am saying that this factoid is about as relevant to my work as the average atomic weight of a stable nucleus is to nuclear physics (or any physics).
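A toy sketch of that claim, with entirely hypothetical work types, costs, and shop names: the cost of fixing a defect is held fixed within each type of work, and only the ratio between the types changes from one shop to the other.

```python
def average_cost(mix, cost_per_type):
    """Average cost per fix for a given mix of work types (counts of fixes per type)."""
    total_fixes = sum(mix.values())
    total_cost = sum(count * cost_per_type[kind] for kind, count in mix.items())
    return total_cost / total_fixes


# Hypothetical cost of one fix, identical in both shops.
cost_per_type = {"ui_tweak": 1.0, "embedded_firmware": 100.0}

# Hypothetical shops that differ only in what kind of work they do.
shop_a = {"ui_tweak": 90, "embedded_firmware": 10}
shop_b = {"ui_tweak": 10, "embedded_firmware": 90}

print(average_cost(shop_a, cost_per_type))  # 10.9
print(average_cost(shop_b, cost_per_type))  # 90.1
```

The two averages differ by almost an order of magnitude even though nothing about how any individual defect is fixed has changed, so the average mostly reports the mix.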

[-]smk14y40

I found this comment clearer and more engaging than the original post.

[-]Dmytry14y50

The original post is a draft... I intend to rewrite it somewhat to make it a good main post. It is much easier for me to respond to comments than to make arguments out of the blue that would address possible comments.

[-]drethelin14y00

I agree with the grandparent and think those examples should be integrated in the main point.

[-]Giles14y20

Offtopic, but:

there is huge variance in human behaviour

Is this true? My map says that most humans exhibit similar behaviour in most circumstances, but that as social animals we are tuned to pick out the differences more than the similarities, so we just feel that everyone is completely different. If I've got this wrong then I've got some serious updating to do.

On a related note, if I type human behavior or human ethology into Wikipedia I don't seem to get a page explaining how humans behave, but instead get a few observations on how human behaviour is studied. Have I gone completely crazy here?

[-]Richard_Kennaway14y60

Any two things look the same if you look from far enough away. Any two things look different if you look from close enough in. Similarity, like probability, is in the observer, not the observed.

[-]atorm14y10

Missing that point drove my ontology wildly off course in a metaphysics course in undergrad. Seeing the obvious similarity between red things, even if they were reflecting slightly different wavelengths, led me to believe that Universals such as Red and Courage exist. It may be that that point should be pushed harder on this site.

[-]Dmytry14y00

Well, as far as attitude towards savings - or whatever other topic is being studied - is concerned, yes, the behaviour is very diverse. As far as human cognition goes - some people use mental imagery, some people have no mental imagery at all - ditto.

But what does 'very varied' mean? Well, 'too varied for the common methods' would do. As varied as in my atomic weight example.

[-][anonymous]14y00

Uh, if your priors about the characteristics of new people you meet don't come from some generalization about other people you have met, where do they come from?

[This comment is no longer endorsed by its author]

[-][anonymous]14y00

I figured I must not understand what the author is saying. He couldn't really mean that we shouldn't use statistical discrimination when dealing with humans, could he?

[This comment is no longer endorsed by its author]
