I'm curious if you have an opinion on the relative contributions of different causes, such as:
I'm thinking (as an example) of Newton, who used calculus to get his results but translated the results out of calculus in order to publish. This let other people see the results were right, but not how anyone could have come up with them. Without that known physics payoff communicated through inadequate tools, there wouldn't have been enough impetus (pun intended) to push the relevant community of people to learn calculus.
I'm unsure if that's what you meant, but your comment has made me realize that I didn't neatly separate the emergence of a new mechanism (pseudo or not) from the perpetuation of an existing one. The whole post weaves back and forth between the two.
For the emergence of a new mechanism, this raises a really interesting question: where does it come from? The examples I mentioned, and more that come to mind, clearly point to a focus on some data, some phenomenological compression as a starting point (Galileo's, Kepler's, and others' observations and laws for Newton, say).
But then it also feels like the metaphor being used is never (at least I can't conjure up an instance) created completely out of nothing. People pull it out of existing technology (maybe clockwork for Newton? definitely some example in the quote from The Idea of the Brain at the beginning of the post), out of existing science (say, Bourdieu importing the concept of field into sociology from physics), out of stories (how historical linguistics and Indo-European linguistics were bootstrapped with an analogy to Babel), out of elements of their daily life and culture (as an example, one of my friends has a strong economics background, and so they always tend towards economic explanations; I have a strong theoretical computer science background, and so I always tend towards computational explanations...)
On the other hand, I know of at least one example where the intensity of the pattern gave life to a whole new concept, or at least something that was hardly tied to existing scientific or technological knowledge at the time: Faraday's discovery of lines of force, which prefigures the concept of field in physics.
To go deeper into this (which I haven't done), I would maybe look at the following books:
I will definitely be checking out those books, thanks, and your response clarified the intent a lot for me.
As for where new metaphors/mechanisms come from, and whether they're ever created out of nothing, I think that is very, very rare, probably even rarer than it seems. I have half-joked with many people that at some level there are only a few fundamental thoughts humans are capable of having, and the rest is composition (yes, this is metaphorically coming from the idea of computers with small instruction sets). But more seriously, I think it's mostly metaphors built on other metaphors, all the way down.
I have no idea how Faraday actually came up with the idea of force lines, but it looks like that happened a couple decades after the first known use of isotherms, and a few more decades after the first known use of contour lines, with some similar examples dating back to the 1500s. The early examples I can quickly find were mostly about isobaths, mapping the depth of water for navigation starting in the Age of Exploration. Plus, there's at least one use of isogons, lines of equal magnetic inclination, also for navigation. AFAICT Faraday added the idea of direction to such lines, long before anyone else formalized the idea of vectors. But I can still convince myself, if I want, that it is a metaphor building on a previous well-known metaphor.
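In case it helps to see the step Faraday took, here is one way to phrase the difference in modern notation (which, to be clear, neither Faraday nor his contemporaries had; this is my gloss, not a historical claim):

```latex
% An isotherm, isobath, or contour line is a level set of a scalar field f,
% an undirected curve of constant value:
\{\, (x, y) : f(x, y) = c \,\}
% A line of force instead follows a vector field \mathbf{B}: it is an
% integral curve \gamma whose tangent at every point is the field's
% direction at that point:
\frac{d\gamma}{dt}(t) = \mathbf{B}\big(\gamma(t)\big)
```

The isoline only tells you where a quantity is equal; the line of force additionally tells you which way to move, which is exactly the directional ingredient that vectors would later formalize.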
If I had to guess a metaphor for Newton, yes I think clockwork is part of it, but mathematically I'd say it's partly that the laws of nature are written in the language of geometry. Not just the laws of motion, but also ray optics.
Oh, that's a great response!
I definitely agree with you that there is something like a set of primitives or instructions (as you said, another metaphor) that is used everywhere by humans. We're not made to do advanced maths, create life-like 2D animation, or cure diseases. So we're clearly retargeting processes that were meant for much more prosaic tasks.
The point reminds me of this great quote from Physics Avoidance, a book I'm drawing a lot of inspiration from for my model of methodology: (p.32)
An unavoidable consequence of our restricted reasoning capacities is that we are forever condemned to wobble between seasons of brash inferential extension and epochs of qualified retrenchment later on. These represent intellectual cycles from which we can never escape: we remain wedded to a comparatively inflexible set of computational tools evolved for the sake of our primitive ancestors, rather than for experts in metallurgy. We can lift ourselves by our bootstraps through clever forms of strategic reassignment within our reasonings, but no absolutist guarantees on referential application can be obtained through these adaptive policies.
This is clearly the part of my model of methodology/epistemology that is the weakest. I feel there is something there, and that somehow the mix of computational constraints thinking from Theoretical CS and language design thinking from Programming Language Theory might make sense of it, but it's the more mechanistic and hidden part of methodology, and I don't feel I have enough phenomenological regularities to go in that direction.
Digging more into the Faraday question, this raises another subtlety: how do you differentiate the sort of "direct" reuse/adaptation of a cognitive primitive for a new task from an analogy/metaphor to a previous use in the culture?
Your hypotheses focus more on the latter, considering where Faraday could have seen or heard geometric notions in contexts that would have inspired his lines of force. My intuition is that this might instead be a case of the former, because Faraday was particularly graphic in his note-taking and scientific practice, and so it is quite natural for him to convergently rediscover graphic/visual means of explanation.
Exploratory Experiments, my favoured treatment of Faraday's work on Electromagnetism (though focused on electromagnetic induction rather than the lines of force themselves), emphasizes this point. (p.235,241)
Both the denial of the fundamental character of attraction and repulsion, as well as the displacement of the poles of a bar magnet away from its ends, broke with traditional conceptions. It is important to highlight that these ideas were formed in the context not only of intense experimentation but also of successive attempts to find the most general graphical presentation of the experimental results—attempts that involved a highly versatile use of various visual perspectives on one and the same experimental subject.
[...]
In this development, Faraday’s engagement with graphical representations is again highly remarkable. His laboratory record contains no drawings of the experimental setups themselves, only the occasional sketch of the shape of the wire segment. Of much greater importance are his sketches of the experimental results. As before, these alternate easily between side views and views from above. The side views are less abstract. But even in these drawings Faraday had to add an imaginary post in the center of each described rotation, so as to distinguish front from back and thereby specify the direction of rotation. Again, his sketches served as working media in which he developed stepwise abstractions. They played a constitutive role in the evolution of his view.
(As a side note, Faraday's work in Electromagnetism is probably one of the most intensely studied episodes in the history of science. First because of its key importance for the development of electromagnetism, field theory, and most of modern physics. But also because Faraday provides near-perfect historical material: he religiously kept a detailed experimental journal, fully published, and had no interest in covering up his tracks and reasoning (as opposed to, say, Ampère).)
So in addition to Exploratory Experiments mentioned above, I know of the following few books studying Faraday's work:
Interesting post!
If I had to venture an explanation (the compulsion strikes again!), I would say that we just struggle to keep track of and manipulate patterns of data without an underlying story. So we end up making one up, pulling it out of our memetic climate.
I also feel compelled to expound on this.
I find it noticeably harder to work with a new concept than an old one. To translate a new concept to an old one, I put it into existing terms.
I think what might happen is, during the process of science, we formulate what we're seeing in our existing terms (ie. memetic climate).
The problem is in letting this take over, or thinking that it is generally true, and not just a way for our brains to manipulate the concept/patterns we're observing.
I prefer the Maxwell strategy of "shifting frames" - I find it hard to hold sets of observations in my head & do meaningful things with them.
I find it noticeably harder to work with a new concept than an old one. To translate a new concept to an old one, I put it into existing terms.
I think what might happen is, during the process of science, we formulate what we're seeing in our existing terms (ie. memetic climate).
The problem is in letting this take over, or thinking that it is generally true, and not just a way for our brains to manipulate the concept/patterns we're observing.
Yes, and this leads to another essential point: any new idea is at a fundamental infrastructure disadvantage. For the old idea has not only been etched into the psyche and ontology of its users; it has probably (especially in the case of a technical idea) also grown a significant epistemic infrastructure around it: tools that embed its assumptions, tricks that simplify computations, tacit knowledge of how to tweak it to make it work.
The new idea has nothing of the sort, and so even if it has eventual advantages, it must first survive in a context where it is probably inferior in results. Survival generally comes about through some form of propaganda, a separate community, or a new generation wanting to overturn received wisdom...
Interesting. I mostly agree with the gist.
The following are a few thoughts that occur to me. Presented as potentially useful pointers, rather than well-thought-through arguments/conclusions.
It would be best if we could simply not follow the compulsion, and stay as much as possible at the level of data patterns and phenomenological compressions, at least until we have a good handle there.
Personally, when I try to predict the behaviour of people, I start with their past actions. As in, I look at what they have done in the past, and assume they’ll do more of that.
Similarly here, I think it's asking for trouble to imagine that [Gabe's characterization and extrapolation of [that]] doesn't already rely on a bunch of intent-based expectations and assumptions. (these will usually be more reliable than guesses we'd tend to label "psycho-analysis" - but they're present and important)
For this reason, [be aware of the degree to which you're x-ing, and the implications] seems safer advice than [avoid x-ing], for many x.
I think I have a model for this. I also want to include some observations of my own that constitute Bayesian evidence for the model.
Historically, the social epistemology of mathematicians and physicists diverged a lot from the social epistemology of other scientists; I will use paleoanthropologists as a foil.
Mathematicians use their phenomenal sense of elegance to discover truth all the time, but they enjoy the exclusive luxury of mathematical proof. Nevertheless, conjectures can be true; posing elegant conjectures confers a prestige benefit; it can even be surprising, in a positive way, when a conjecture is proven false; and you don't lose the prestige of having posed an elegant conjecture if it is proven false, or shown independent of its premises. Einstein also made profound physical discoveries by relying on a phenomenal sense of the mathematical elegance of physical theories, as Eliezer thoroughly described.
On the other hand, conjecture is a dirty word in paleoanthropology. This survey of over 1,200 academics who have published work on human evolution shows that paleoanthropologists are the most 'critical' out of all formal disciplines studying this domain, operationalized as having the most diffusely allocated probability mass on hypotheses (causal histories) about the evolution of humans.
Paleoanthropologists have also had to make several hard Bayesian updates historically, at least once, I think, in the wrong direction, i.e. at the Cold Spring Harbor conference, where I think they were deeply misled about the qualitative implications of quantitative evolutionary genetics in the specific domain of human evolutionary biology, and also professionally embarrassed by the ornithologist Ernst Mayr, who wielded his academic prestige as a bludgeon.
In all likelihood, popular human evolutionary hypotheses were also disproportionately discredited by the Williams Revolution of 1966. Group selectionist explanations are very tempting in this domain because, from our perspective, humans are unprecedentedly Nice products of evolution.
Historically, paleoanthropologists have also updated hard on the timeline of human evolution, and its geographic origin.
There have also been a number of epistemic schisms (waterside hypotheses, sociobiology, evolutionary psychology, human behavioral ecology) that bear the distinct signature of proliferation of ingroups via a narcissism of small differences.
All of this has encouraged paleoanthropologists to resort to a kind of cautious epistemic agnosticism in order to avoid the humiliation of strong Bayesian updates or new schisms, even though this strategy, in a sense, only guarantees humiliation, by ensuring that they update too infrequently and too softly.
Another suggestion is to not vastly underestimate Edgar Allan Poe. Poe was given a telescope by his stepfather when he was 16, read Humboldt's Cosmos, and contributed content and translations to a textbook on conchology.
Poe's Eureka: A Prose Poem contains a narrative description of a Big Bang cosmology that qualitatively obeys the math of Newtonian mechanics, the first correct solution to the dark night sky paradox in history, and an early anthropic argument for why the astronomical scale of the universe must be so great.
Interestingly, it seems like Poe didn't actually even have that sophisticated of a mathematical toolkit. He was operating with a sort of 'simple math of everything’-type inference, reasoning qualitatively in ways that obeyed quantitative natural laws. Whatever Poe's actual mathematical toolkit was, it's an upper bound on how much explicit math you need to know and apply in a particular domain to think much faster than average, if you have a minimal amount of data and think under whatever conditions Poe did. I think of this as Tao's 'post-rigorous' stage of mathematical maturity, but outside of the ontological domain of mathematics.
And if the results of this paper are sound, then we must explain how George R.R. Martin can write an intentionally fictional narrative and approximately truth-track the social dynamics of actual humans, by e.g. distributing in-paracosm deaths with respect to Planetos wall clock time according to a power law. The subjective temporal distribution of deaths as experienced by the reader is, however, made geometric via masterful pacing, which is to say that the truth-tracking and compositional devices of A Song of Ice and Fire are separated into principal components. I think this suggests that we should just drop the ‘pseudo-’ prefix and call it a ‘compulsion for mechanisms.’
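To make the power-law versus geometric contrast concrete, here is a toy sketch (my own illustration, not the paper's analysis; the sample size and parameters are invented) of how heavy-tailed wall-clock gaps differ from memoryless reader-side gaps:

```python
import numpy as np

rng = np.random.default_rng(42)
n_deaths = 200  # hypothetical number of in-paracosm deaths

# In-world wall-clock gaps between deaths: heavy-tailed (Pareto),
# i.e. long lulls punctuated by bursts of mortality.
wall_clock_gaps = rng.pareto(a=1.5, size=n_deaths) + 1.0

# Reader-experienced gaps (in chapters, say): geometric, i.e. memoryless,
# a roughly constant chance of a death in any given chapter.
chapter_gaps = rng.geometric(p=0.3, size=n_deaths)

# The tails tell the two apart: the power law's largest gaps dwarf its
# median far more than the geometric's do.
for name, gaps in [("wall-clock (power law)", wall_clock_gaps),
                   ("chapters (geometric)", chapter_gaps)]:
    print(f"{name}: median={np.median(gaps):.1f}, max={gaps.max():.1f}")
```

If the paper is right, Martin's pacing is doing the work of converting the first kind of distribution into the second.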
Eliezer would glance at these exemplars and conclude that Poe and Martin each either violated the second law of thermodynamics, or contained implicit Bayesian structure.
Congregations of clouds contingently satisfying the Peano axioms and violating them when they combine or separate, or human brains and epistemic institutions occasionally satisfying the axioms of probability theory, but most of the time just violating them, are both consistent with Eliezer’s epistemology (metaphysics?), and this is the reason it should be possible to systematically make valid (and, on a good day, sound) qualitative inferences about the causal effects and logical consequences of quantitative natural laws in the style of The Simple Math of Everything.
Norbert Schwarz's feelings-as-information model, particularly the parts of it describing processing fluency and accessible mental content, seems to neatly explain these phenomena, which I described in the context of the availability heuristic and its associated bias in this old essay (sorry if my writing was worse then, or now). Schwarz has also shown that humans use processing fluency to make aesthetic judgments.
On my interpretation of the model, humans can develop a robust relationship between accurate mental models (presumably maintained via very frequent reality testing, at least) that make fluency experiences reliably informative, and a well-calibrated ‘fluency experience generator.’ In a sense, these people develop intuitive ‘elegance priors,’ which should closely correspond to ordinary simplicity priors, and which, along with good existing models and frequent reality testing, seem to create implicit Bayesian structure.
Another angle here is flow theory, also well-explained by processing fluency, specifically in the context of continuous, fluent execution of procedural skills. The 'more that is possible' could look like a continuous flow experience, where the fluently executed procedural skills are all cognitive, and this would correspond to a sustainable variation on the peak insight experiences observed in Poe, Einstein, etc.
My own humble attempt to exploit the wild inadequate equilibrium I believe I have found in paleoanthropology, by assuming the potential implications of your article, and this very comment, to be true, is this qualitative, ostensibly quantifiable and falsifiable, model of early hominin evolution. (Reality testing!)
Lately I’ve been reading more intellectual histories. For those unaware of the genre, it’s a type of (usually scholarly) book that hunts down the origins of the concepts we use, and how they were thought about and framed across time.
Reading a few of these, a curious pattern emerged: there is usually a succession of explanations or metaphors for the concept, tied not so much to the concept itself as to the prevalent and salient memes in the air at the time.
For example, Georges Vigarello identifies four successive Western explanations of tiredness (“fatigue” in the original French):
(Georges Vigarello, Histoire de la fatigue, 2020, p.11-12 , translated by me)
And in his intellectual history of how we think about the brain, Matthew Cobb reveals the following stages in the conceptualization of the brain as a machine:
(Matthew Cobb, The Idea of The Brain, 2020, p.3-4)
What surprised me was the seeming arbitrariness of which metaphor was used to explain the phenomenon. It felt as if they got plucked from the ideas current at the time, without thought for adequacy and fitness.
My first instinct was to search for the value of these weird pseudo-mechanisms. Maybe there was something deeper there, some hidden methodological virtue?
But after some time, I came to the conclusion that the explanation was much simpler: there is a deep-rooted human compulsion for explanation, even when the explanations are ungrounded, useless, and weak.
We Need Explanations
Once you look for it, you see this tendency everywhere. Pseudo-mechanisms were postulated in cooking, medicine, astronomy. From old anthropological gods to vapors, humors, forces and energies, I’ve yet to find many examples where people didn’t jump to a pseudo-mechanism, however impotent it was at handling the phenomena.[1]
I’m guilty of this myself. When articulating issues of emotional management, for example, I grasped at the available mechanism of potential energy (which I was studying) to frame them.
Even aside from the lackluster explanatory power of most of these pseudo-mechanisms (a topic I will explore below), this aching need for an explanation is curious, because it goes against a key methodological tenet I have underscored many times: you want first to find stable phenomenological compressions (patterns in the data) before jumping to mechanistic models, otherwise you have no way to ground the latter.
Without crisp and manageable handles on the system under study, the quest for a useful and grounded mechanism is quixotic, unmoored from any stable foundation in the data.
Indeed, if we look at some of the most impressive cases of scientific progress throughout history, they usually involve people actively avoiding mechanistic models for a time, or at least cleanly separating them from the phenomenological compressions under study.
For example, Darwin used the black-boxing technique mentioned in a previous post to sidestep the mechanism for inheritance and variability, which was beyond the power level of the science of his day.
(Marco J. Nathan, Black Boxes, 2021, p.55)
Similarly, Mendel captured the most fundamental properties of inheritance without any recourse to mechanisms or explanations, focusing exclusively on phenomenological compressions.
(Marco J. Nathan, Black Boxes, 2021, p.56)
Other examples come to mind, like the revolutionary work of Milman Parry (covered in a previous post). Through sheer autistic textual analysis of Homeric epics, Parry revealed an intensely formulaic and procedural style, which eventually led him and his student Albert Lord to an exquisitely complex model of oral composition for this and many other poetic traditions.
Similarly, the establishment of Phenomenological Thermodynamics (what we usually call Thermodynamics), one of the most wide-ranging and stable theories in physics, required the efforts of Clausius to remove the weird pseudo-mechanisms (notably the caloric fluid) from Carnot’s insane insights.[2]
(Rudolf Clausius, The Mechanical Theory of Heat, 1867, p.267-269)
And yet, we keep searching and giving explanations, whatever our lack of grounds. If I had to venture an explanation (the compulsion strikes again!), I would say that we just struggle to keep track of and manipulate patterns of data without an underlying story. So we end up making one up, pulling it out of our memetic climate.
Note that this claim also makes sense of the recurrent overcompensation in various fields, where some practitioners become allergic to any kind of model or explanation. I expect this is a reaction both to this deep-seated compulsion and to a recent history of repeated failures of these arbitrary explanations.[3]
Progress Despite (And Thanks To) This Compulsion
Now, the methodological weaknesses of this compulsion to explain don’t condemn us to never make any progress in modeling and problem solving.
First, these pseudo-mechanisms have one clear benefit: they focus our attention, delineating questions specific enough for us to actually investigate without feeling overwhelmed. This investigation, when done well, then yields new phenomenological compressions, which provide the soil for the growth of future, more grounded explanations.
In that way, pseudo-mechanisms act as randomizers and symmetry-breakers, where the arbitrary actually jumps us out of the analysis paralysis that a completely open field might cause.
Next, the historical record shows an improvement in the quality of explanations over time, in their “mechanisticness” or “gears-levelness”. As more and more phenomenological regularities accumulate, the first actually mechanistic and powerful explanations emerge, coming notably from the most grounded and regular fields (physics, then chemistry). This creates an improved taste for what a good mechanism looks like, leading to better models all around.
This obviously also brings issues. For example, you can see the development of fads about what a “real” model looks like. Most fields of science have gone through a youthful period of desperately looking for models like the physicists’, irrespective of whether these were suited to their subject. Even in physics, especially early on, some shapes of models ended up sacrosanct for a time.
(Friedrich Steinle, Exploratory Experiments, 2016, p.25-26,30)
And this also breeds an adversarial tendency to present whatever you have in the trappings of accepted models, so it passes people’s filters. That’s why so many conspiracy theories and pseudo-sciences adopt pseudo-mechanisms that feel scientific, without actually having the properties that make a mechanism useful.
Last but not least, in exceptional cases, the very compulsion to look for mechanisms can be fruitfully exploited. I know of one particularly salient example: Maxwell’s work on Electromagnetism. While he accepted the core phenomenological compression of lines of force from Faraday, Maxwell experimented wildly with different methods and explanations, before homing in on what is still one of the most impressive theories in all of human history.[4]
(Giora Hon and Bernard R. Goldstein, Reflections on the Practice of Physics, 2021, p.209)
Mitigating The Costs Of Pseudo-Mechanisms
Still, realizing that this compulsion exists in each and every one of us creates new requirements for good epistemic hygiene.
It would be best if we could simply not follow the compulsion, and stay as much as possible at the level of data patterns and phenomenological compressions, at least until we have a good handle there.
This is one way to interpret my friend Gabe’s point about predicting and interpreting people based on their behavior, not the intents we ascribe to them.[5]
(Gabe, Intention-based reasoning, 2025)
Yet this is clearly not sufficient. We can’t just will this fundamental compulsion away: like it or not, we will find ourselves regularly coming up with explanations, with or without grounding. And at some point, this is necessary: never looking for explanations is cowardice, not wisdom.
I see two clear ways of improving our natural explanation-making tendencies:
Finally Clarifying Phlogiston
On a personal note, understanding this compulsion and how it fits in the messy picture of “technical progress” also clarified a long-standing confusion I had.
Since I first read Hasok Chang’s Is Water H2O?, I’ve been convinced by his argument that there was something to the old chemical concept of phlogiston, which was completely replaced and eradicated by the more compositional chemistry of Lavoisier (in what we now call The Chemical Revolution).
I even wrote a blogpost years ago trying to defend it, since both in scientific circles and rationalist ones, phlogiston is the poster child of a fake explanation.
But it never felt exactly right. Phlogiston obviously smells fake, a curiosity-stopper that just ascribes some behaviours of chemical matter to a substance that is taken out and put back through various processes, notably combustion. It’s quite obvious that the precise weighing and combination of elements advocated by Lavoisier is a better mechanism, and it is indeed the basis of an incredibly successful modern chemistry.
And yet Chang’s point still stands: by completely killing anything related to phlogiston, the Lavoisierians also threw away some of the core questions and observations that the study of phlogiston had revealed, which were not well explained by the new mechanisms.
(William Odling, The Revived Theory of Phlogiston, 1876)
(Gilbert Lewis, The Anatomy of Science, 1926, p.167-168)
My mistake was to try to resolve this tension between fake mechanism and real loss fully on one side, mostly by pushing a Whiggish history that phlogiston was a good mechanism from the beginning.
But now the correct approach jumps out at me: the problem with the Lavoisierians’ treatment of phlogiston was not their replacement of a bad mechanism, but their throwing away of the phenomenological compressions that phlogiston had revealed: things we now model as chemical potential energy, electron flow, free electrons in metals…
Where I think the Lavoisierians failed was in their totalizing desires, usually rare in the history of chemistry, which made them try to delete every mention of phlogiston from the history books like a modern damnatio memoriae.
In doing so, they fucked up one of the main sources of progress, namely the accumulation of phenomenological compressions.
There is a pattern that I can’t clearly see yet, where the attachment to the explanations was inversely proportional to the economic benefits and pragmatism of the field. One gets the impression that chemists working on dyes and blacksmiths forging weapons didn’t care that much about the accuracy of their mechanisms, whereas physicists and fundamental scientists and thinkers of all sorts got obsessed with their specific explanations.
Another amazing example in physics is the invention of the potential. I know of only one great source, in French, but it convincingly argues that the potential was nothing more than a mathematical compression for the manipulation of inverse-square-law forces; it’s the incredibly wide applicability and power of the trick that eventually suggested its reification as potential energy, and the Lagrangian.
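As a minimal sketch of the compression (standard textbook notation, not drawn from that source): for a collection of inverse-square sources, the three components of the force field collapse into a single scalar function that simply adds over sources.

```latex
V(\mathbf{r}) = -\sum_{i=1}^{n} \frac{k_i}{\lVert \mathbf{r} - \mathbf{r}_i \rVert},
\qquad
\mathbf{F}(\mathbf{r}) = -\nabla V(\mathbf{r})
```

One scalar does the bookkeeping of n vector fields, which is precisely the kind of manipulative convenience that, per the argument, came first; the reading of V as stored energy came later.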
Off the top of my head, I’m thinking of the self-coherence checks of temperature scales by Victor Regnault described in Chang’s Inventing Temperature, the refusal of internal states in Behaviorism, and the fear of ascribing causation anywhere in many modern statistical analyses.
This is a topic I’ve been wanting to write a blogpost about for years. For the curious, I truly recommend the book from which I’m drawing, Reflections on the Practice of Physics.
Note that the main point of this post is more a decision-theoretic one: relying on intents first leads to worse predictions, susceptibility to self- and external deception, and a generally bad equilibrium of responsibility across society.
Cosmopolitanism in general helps here, because reading and interacting with other cultures reveals your own hidden assumptions.