It seems to me that there is a great deal of generalization from averages (or from correlations, which are a form of average) in the interpretation of scientific findings.
Consider the Sapir-Whorf hypothesis as an example. The hypothesis is tested by measuring the average behaviour of huge groups of people; at the same time, it may well be that for some people the strong version of the Sapir-Whorf hypothesis does hold, while for others it is grossly invalid, with the rest somewhere in between. We have already determined that there is considerable diversity in modes of thought simply by asking people to describe their thinking. I would rather infer from the diversity of those reports that I cannot generalize about human thought, than generalize from even the most accurate, most scientifically solid, most statistically significant average of some kind and assume that this average tells us how human thought processes work in general.
In this case the average behaviour is nothing more than an indicator of the ratio between those subpopulations; useless demographic trivia of the form "did you know that among North Americans, linguistically-determined people are numerous enough to sway this particular experiment?" (a result I wouldn't care much about). An example of this was posted here.
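The point can be made concrete with a toy sketch (made-up numbers, not data from any real study): when a population is a mixture of two subgroups with opposite effects, the group average just tracks the mixture ratio, and describes neither subgroup.

```python
# Toy illustration with invented numbers: a population split between two
# subgroups showing opposite effects. The population average is a weighted
# sum that tracks only the demographic ratio between the subgroups.

def mixture_average(ratio, effect_a=+1.0, effect_b=-1.0):
    """Average effect when a fraction `ratio` of people show effect_a
    and the remaining fraction shows effect_b."""
    return ratio * effect_a + (1 - ratio) * effect_b

# If 60% of subjects show the effect strongly (+1) and 40% not at all (-1),
# the average is +0.2 -- a value true of no individual; shift the
# demographics to 40/60 and the "finding" reverses sign.
print(round(mixture_average(0.6), 2))
print(round(mixture_average(0.4), 2))
```

The sign of the "result" here is purely a fact about the ratio of subpopulations in the sample, which is the sense in which such an average is demographic trivia.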
This goes for much of science outside physics.
There was another thread about software engineering. Even if the graph had not been inverted and the confounding variables had been accounted for, the result should still have been read as useless trivia of the form "did you know that in such-and-such selection of projects, the kinds of mistakes that become more costly to fix over time outnumber the kinds that do not?" (Mistakes in work that serves as input to future work do snowball over time, and other mistakes do not; anyone who has ever successfully developed and sold a non-trivial product knows that; but you can't stick a 'science' label on this observation, whereas you can stick one onto some average.) Instead, the result is taken as if it literally told us whether mistakes are costlier, or less costly, to fix later. That sort of misrepresentation appears in the abstracts of many published papers.
It seems to me that this fallacy is extremely widespread. A study comes out which generalizes from an average; the elephant in the room is that it is often invalid to generalize from an average; yet instead we argue over whether the average was measured correctly and whether it was taken over enough people. Even if it was, in many cases the result is just demographic trivia, barely relevant to the subject the study purports to be about.
A study of one person's thought may provide some information about how thought processes work in one real human; it shows that a thought process can work in some particular way. A study of the average behaviour of many people yields results that are primarily determined by demographics and ratios. Yet people often treat the latter as more significant than the former, perhaps mistaking statistical significance for significance in the everyday sense, or mistaking a generalization from an average for an actual detailed study of a large number of people. Perhaps this obsession with averaging is a form of cargo cult imitating physics, where you average measurements to, for example, cancel out thermal noise in a sensor.
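The gap between statistical significance and everyday significance can itself be shown with a toy simulation (invented parameters, fixed seed): given a large enough sample, a tiny average shift clears the conventional significance threshold even though it tells you almost nothing about any individual.

```python
# Toy simulation with invented numbers: individual scores vary wildly
# (standard deviation 1.0) around a tiny true mean shift of 0.02.
# With 100,000 subjects, the mean is "statistically significant"
# (z well above 1.96), yet the shift is dwarfed by individual variation.
import random
import statistics

random.seed(0)
n = 100_000
sample = [random.gauss(0.02, 1.0) for _ in range(n)]

mean = statistics.fmean(sample)
stderr = statistics.stdev(sample) / n ** 0.5
z = mean / stderr

print(f"mean = {mean:.3f}, individual sd = 1.0, z = {z:.1f}")
```

The z-score grows with the square root of the sample size, so "significant" here measures mostly how many people were averaged, not how much the finding says about any one of them.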
I want to make a main post about this, with a larger number of examples; it would be very helpful if you could post your own examples of generalization from averages here.