The Compulsion For (Pseudo-)Mechanisms

by adamShimi
6th Jul 2025
Linkpost from formethods.substack.com

Comments

AnthonyC · 8d

I'm curious if you have an opinion on the relative contributions of different causes, such as:

  1. Inability of individuals to think outside established metaphors, without realizing they're inadequate
  2. Inability of individuals to think outside established metaphors, even while knowing they're inadequate
  3. Inability of individuals to think of better new metaphors
  4. Inability to have public conversations through low-bandwidth channels without relying on established metaphors, whether or not the individuals on either end know they're inadequate

I'm thinking (as an example) of Newton, who used calculus to get his results but translated the results out of calculus in order to publish. This let other people see the results were right, but not how anyone could have come up with them. Without that known physics payoff communicated through inadequate tools, there wouldn't have been enough impetus (pun intended) to push the relevant community of people to learn calculus.

adamShimi · 8d

I'm unsure if that's what you meant, but your comment has made me realize that I didn't neatly separate the emergence of a new mechanism (pseudo or not) from the perpetuation of an existing one. The whole post weaves back and forth between the two.

For the emergence of a new mechanism, this raises a really interesting question: where does it come from? The examples I mentioned, and more that come to mind, clearly point to a focus on some data, some phenomenological compression, as a starting point (Galileo's, Kepler's, and others' observations and laws for Newton, say).

But then it also feels like the metaphor being used is never (at least I can't conjure up an instance) created completely out of nothing. People pull it out of existing technology (maybe clockwork for Newton? definitely some example in the quote from The Idea of the Brain at the beginning of the post), out of existing science (say Bourdieu borrowing the concept of field from physics into sociology), out of stories (how historical linguistics and Indo-European linguistics were bootstrapped with an analogy to Babel), or out of elements of their daily life and culture (as an example, one of my friends has a strong economics background, and so they always tend towards economic explanations; I have a strong theoretical computer science background, and so I always tend towards computational explanations...).

On the other hand, I know of at least one example where the intensity of the pattern gave life to a whole new concept, or at least something that was hardly tied to existing scientific or technological knowledge at the time: Faraday's discovery of lines of force, which prefigured the concept of field in physics.

To go deeper into this (which I haven't done), I would maybe look at the following:

  • The work of Nancy Nersessian in general
  • Forces and Fields by Mary B. Hesse
  • A lot of intellectual histories, especially of concepts that have proven successful.
AnthonyC · 8d

I will definitely be checking out those books, thanks, and your response clarified the intent a lot for me.

As for where new metaphors/mechanisms come from, and whether they're ever created out of nothing, I think that is very, very rare, probably even rarer than it seems. I have half-joked with many people that at some level there are only a few fundamental thoughts humans are capable of having, and the rest is composition (yes, this is metaphorically coming from the idea of computers with small instruction sets). But more seriously, I think it's mostly metaphors built on other metaphors, all the way down.

I have no idea how Faraday actually came up with the idea of force lines, but it looks like that happened a couple decades after the first known use of isotherms, and a few more decades after the first known use of contour lines, with some similar examples dating back to the 1500s. The early examples I can quickly find were mostly about isobaths, mapping the depth of water for navigation starting in the Age of Exploration. Plus, there's at least one use of isogons, lines of equal magnetic inclination, also for navigation. AFAICT Faraday added the idea of direction to such lines, long before anyone else formalized the idea of vectors. But I can still convince myself, if I want, that it is a metaphor building on a previous well-known metaphor.

If I had to guess a metaphor for Newton, yes I think clockwork is part of it, but mathematically I'd say it's partly that the laws of nature are written in the language of geometry. Not just the laws of motion, but also ray optics.

adamShimi · 8d

Oh, that's a great response!

I definitely agree with you that there is something like a set of primitives or instructions (as you said, another metaphor) that is used everywhere by humans. We're not made to do advanced maths, create life-like 2D animation, or cure diseases. So we're clearly retargeting processes that were meant for much more prosaic tasks.

The point reminds me of this great quote from Physics Avoidance, a book I'm taking a lot of inspiration from for my model of methodology: (p.32)

An unavoidable consequence of our restricted reasoning capacities is that we are forever condemned to wobble between seasons of brash inferential extension and epochs of qualified retrenchment later on. These represent intellectual cycles from which we can never escape: we remain wedded to a comparatively inflexible set of computational tools evolved for the sake of our primitive ancestors, rather than for experts in metallurgy. We can lift ourselves by our bootstraps through clever forms of strategic reassignment within our reasonings, but no absolutist guarantees on referential application can be obtained through these adaptive policies.

This is clearly the part of my model of methodology/epistemology that is the weakest. I feel there is something there, and that somehow the mix of computational-constraints thinking from Theoretical CS and language-design thinking from Programming Language Theory might make sense of it, but it's the more mechanistic and hidden part of methodology, and I don't feel I have enough phenomenological regularities to go in that direction.

Digging more into the Faraday question, this raises another subtlety: how do you differentiate the sort of "direct" reuse/adaptation of a cognitive primitive for a new task from an analogy/metaphor to a previous use in the culture?

Your hypotheses focus more on the latter, considering where Faraday could have seen or heard geometric notions in contexts that would have inspired him for his lines of force. My intuition is that this might instead be a case of the former, because Faraday was particularly graphic in his note-taking and scientific practice, so it is quite natural for him to convergently rediscover graphic/visual means of explanation.

Exploratory Experiments, my favoured treatment of Faraday's work on electromagnetism (though focused on electromagnetic induction rather than the lines of force themselves), emphasizes this point. (p.235, 241)

Both the denial of the fundamental character of attraction and repulsion, as well as the displacement of the poles of a bar magnet away from its ends, broke with traditional conceptions. It is important to highlight that these ideas were formed in the context not only of intense experimentation but also of successive attempts to find the most general graphical presentation of the experimental results—attempts that involved a highly versatile use of various visual perspectives on one and the same experimental subject.

[...]

In this development, Faraday’s engagement with graphical representations is again highly remarkable. His laboratory record contains no drawings of the experimental setups themselves, only the occasional sketch of the shape of the wire segment. Of much greater importance are his sketches of the experimental results. As before, these alternate easily between side views and views from above. The side views are less abstract. But even in these drawings Faraday had to add an imaginary post in the center of each described rotation, so as to distinguish front from back and thereby specify the direction of rotation. Again, his sketches served as working media in which he developed stepwise abstractions. They played a constitutive role in the evolution of his view.

(As a side note, Faraday's work in electromagnetism is probably one of the most intensely studied episodes in the history of science. First because of its key importance for the development of electromagnetism, field theory, and most of modern physics. But also because Faraday provides near-perfect historical material: he religiously kept a detailed experimental journal, fully published, and had no interest in covering up his traces and reasoning (as opposed to, say, Ampère).

So in addition to Exploratory Experiments mentioned above, I know of the following books studying Faraday's work:

  • Faraday To Einstein: Constructing Meaning In Scientific Theories
  • Experiment and the Making of Meaning)
danielms · 8d

Interesting post!

If I had to venture an explanation (the compulsion strikes again!), I would say that we just struggle to keep track of and manipulate patterns of data without an underlying story. So we end up making one up, pulling it out of our memetic climate.

I also feel compelled to expound on this. 

I find it noticeably harder to work with a new concept than an old one. To work with a new concept, I translate it into existing terms.

I think what might happen is that, during the process of science, we formulate what we're seeing in our existing terms (i.e. our memetic climate).

The problem is in letting this take over, or thinking that it is generally true, and not just a way for our brains to manipulate the concept/patterns we're observing.

I prefer the Maxwell strategy of "shifting frames"; otherwise I find it hard to hold sets of observations in my head and do meaningful things with them.

adamShimi · 7d

I find it noticeably harder to work with a new concept than an old one. To work with a new concept, I translate it into existing terms.

I think what might happen is that, during the process of science, we formulate what we're seeing in our existing terms (i.e. our memetic climate).

The problem is in letting this take over, or thinking that it is generally true, and not just a way for our brains to manipulate the concept/patterns we're observing.

 

Yes, and this leads to another essential point: any new idea is at a fundamental infrastructure disadvantage. The old idea has not only been etched into the psyche and ontology of its users; it has probably (especially in the case of a technical idea) also grown a significant epistemic infrastructure around it: tools that embed its assumptions, tricks to simplify computations, tacit knowledge of how to tweak it to make it work.

The new idea has nothing of the sort, so even if it has eventual advantages, it must first survive in a context where it is probably inferior in results. That survival generally comes about through some form of propaganda, a separate community, or a new generation wanting to overturn received wisdom...

Gram Stone · 8d

I think I have a model for this. I also want to include some observations of my own that constitute Bayesian evidence for the model.

Historically, the social epistemology of mathematicians and physicists diverged a lot from the social epistemology of other scientists; I will use paleoanthropologists as a foil.

Mathematicians use their phenomenal sense of elegance to discover truth all the time, but they enjoy the exclusive luxury of mathematical proof. Nevertheless, conjectures can be true, posing elegant conjectures confers a prestige benefit, it can be surprising, in a positive way, when conjectures are proven false, and you don't lose the prestige of having posed an elegant conjecture if it's proven false or independent of its premises. Einstein also made profound physical discoveries by relying on a phenomenal sense of the mathematical elegance of physical theories, as Eliezer thoroughly described.

On the other hand, conjecture is a dirty word in paleoanthropology. This survey of over 1,200 academics who have published work on human evolution shows that paleoanthropologists are the most 'critical' out of all formal disciplines studying this domain, operationalized as having the most diffusely allocated probability mass on hypotheses (causal histories) about the evolution of humans.

Paleoanthropologists have also had to make several hard Bayesian updates historically, at least once, I think, in the wrong direction: at the Cold Spring Harbor conference, where I think they were deeply misled about the qualitative implications of quantitative evolutionary genetics in the specific domain of human evolutionary biology, and also professionally embarrassed by the ornithologist Ernst Mayr, who wielded his academic prestige as a bludgeon.

In all likelihood, popular human evolutionary hypotheses were also disproportionately discredited by the Williams Revolution of 1966. Group-selectionist explanations are very tempting in this domain because, from our perspective, humans are unprecedentedly Nice products of evolution.

Historically, paleoanthropologists have also updated hard on the timeline of human evolution, and its geographic origin.

There have also been a number of epistemic schisms (waterside hypotheses, sociobiology, evolutionary psychology, human behavioral ecology) that bear the distinct signature of proliferation of ingroups via a narcissism of small differences.

All of this has encouraged paleoanthropologists to resort to a kind of cautious epistemic agnosticism in order to avoid the humiliation of strong Bayesian updates or new schisms, even though this strategy, in a sense, only guarantees humiliation, by ensuring that they update too infrequently and too softly.

Another suggestion is to not vastly underestimate Edgar Allan Poe. Poe was given a telescope by his stepfather when he was 16, read Humboldt's Cosmos, and contributed content and translations to a textbook on conchology.

Poe's Eureka: A Prose Poem contains a narrative description of a Big Bang cosmology that qualitatively obeys the math of Newtonian mechanics, the first correct solution to the dark night sky paradox in history, and an early anthropic argument for why the astronomical scale of the universe must be so great.
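
(A quick sketch of the paradox's quantitative core, in my notation rather than Poe's: in an infinite, static, uniform universe with stellar number density $n$ and luminosity $L$ per star, every spherical shell of thickness $dr$ contributes the same flux,

$$dF = \left(4\pi r^2\, n\, dr\right)\cdot\frac{L}{4\pi r^2} = nL\, dr,$$

so the total sky brightness $\int_0^\infty nL\,dr$ diverges. Poe's resolution amounts to cutting the integral off at $r \approx ct$: in a universe of finite age, light from sufficiently distant stars simply hasn't reached us yet.)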

Interestingly, it seems like Poe didn't actually even have that sophisticated of a mathematical toolkit. He was operating with a sort of 'simple math of everything’-type inference, reasoning qualitatively in ways that obeyed quantitative natural laws. Whatever Poe's actual mathematical toolkit was, it's an upper bound on how much explicit math you need to know and apply in a particular domain to think much faster than average, if you have a minimal amount of data and think under whatever conditions Poe did. I think of this as Tao's 'post-rigorous' stage of mathematical maturity, but outside of the ontological domain of mathematics.

And if the results of this paper are sound, then we must explain how George R.R. Martin can write an intentionally fictional narrative and approximately truth-track the social dynamics of actual humans, by e.g. distributing in-paracosm deaths with respect to Planetos wall clock time according to a power law. The subjective temporal distribution of deaths as experienced by the reader is, however, made geometric via masterful pacing, which is to say that the truth-tracking and compositional devices of A Song of Ice and Fire are separated into principal components. I think this suggests that we should just drop the ‘pseudo-’ prefix and call it a ‘compulsion for mechanisms.’
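
(Schematically, and without reproducing the paper's actual fit: the contrast is between a heavy-tailed distribution of in-world waiting times between deaths and a memoryless one for the reader,

$$p(t) \propto t^{-\alpha} \quad (\alpha > 1) \qquad \text{versus} \qquad P(K = k) = (1 - q)^{k-1} q,$$

where $t$ is Planetos wall-clock time between deaths and $K$ counts pages or chapters between them. The power law clusters deaths and allows rare long lulls; the geometric law makes the next death feel equally likely at every page.)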

Eliezer would glance at these exemplars and conclude that Poe and Martin each either violated the second law of thermodynamics, or contained implicit Bayesian structure.

Congregations of clouds contingently satisfying the Peano axioms and violating them when they combine or separate, or human brains and epistemic institutions occasionally satisfying the axioms of probability theory, but most of the time just violating them, are both consistent with Eliezer’s epistemology (metaphysics?), and this is the reason it should be possible to systematically make valid (and, on a good day, sound) qualitative inferences about the causal effects and logical consequences of quantitative natural laws in the style of The Simple Math of Everything.

Norbert Schwarz's feelings-as-information model, particularly the parts of it describing processing fluency and accessible mental content, seems to neatly explain these phenomena, which I described in the context of the availability heuristic and its associated bias in this old essay (sorry if my writing was worse then, or now). Schwarz has also shown that humans use processing fluency to make aesthetic judgments.

On my interpretation of the model, humans can develop a robust relationship between accurate mental models (presumably maintained via very frequent reality testing, at least) that make fluency experiences reliably informative, and a well-calibrated ‘fluency experience generator.’ In a sense, these people develop intuitive ‘elegance priors,’ which should closely correspond to ordinary simplicity priors, and which, along with good existing models and frequent reality testing, seem to create implicit Bayesian structure.

Another angle here is flow theory, also well-explained by processing fluency, specifically in the context of continuous, fluent execution of procedural skills. The 'more that is possible' could look like a continuous flow experience, where the fluently executed procedural skills are all cognitive, and this would correspond to a sustainable variation on the peak insight experiences observed in Poe, Einstein, etc.

My own humble attempt to exploit the wild inadequate equilibrium I believe I have found in paleoanthropology, by assuming the potential implications of your article, and this very comment, to be true, is this qualitative, ostensibly quantifiable and falsifiable, model of early hominin evolution. (Reality testing!)

The Compulsion For (Pseudo-)Mechanisms

Lately I’ve been reading more intellectual histories. For those unaware of the genre, it’s a type of (usually scholarly) book that hunts down the origin of the concept we use, and how they were thought about and framed across time.

Reading a few of these, a curious pattern emerged: there is usually a succession of explanations or metaphors for a concept, tied not so much to the concept itself as to the prevalent and salient memes in the air at the time.

For example, Georges Vigarello identifies four successive Western explanations of tiredness (“fatigue” in the original French):

The representations of the body and their renewal similarly guide the perception of tiredness. The oldest image links the “state” of tiredness with the loss of humors. The tired body is the dried-up body. Exhaustion is the escape of substance, the collapse of density. “Simplistic” image, probably, born from Antiquity and made from an obvious inference: the valuable principles of the bodies are the liquids, the very same which escape the body through wounds, burn with fevers, and disappear with death. […]

In the Enlightenment world, it is not humors but fibers, networks, currents which give meaning to tiredness. New symptoms appear, new aspects are considered: the exhaustion from an overwhelming and unmanaged excitement, the weakness born from repeated or maintained tensions. The lack is not linked with the loss of substance anymore, but with missing stimulation.[…] Hence the research of “tonics”, of particular excitants, and not only of liquid compensations anymore, the search for strengthenings.

The image shifts again when the principle turns to energy, when organic combustion becomes work, following the mechanical model of the XIXth century. Now the loss is the one of fire, the loss of potential conceptualized as “productivity”, the feeling of lost strength, and then the added certainty of chemical waste invading the flesh and causing its suffering. Hence the search for “reconstituants”, the pursuit of energy reserves, the quest for calories, the elimination of toxins.

Tiredness today is perceived in the language of computers, focusing on internal messages, sensations, connection and disconnection. Hence the increased appeal to easing and relaxation.

(Georges Vigarello, Histoire de la fatigue, 2020, p.11-12, translated by me)

And in his intellectual history of how we think about the brain, Matthew Cobb reveals the following stages in the conceptualization of the brain as a machine:

A key clue to explaining how we have made such amazing progress and yet have still barely scratched the surface of the astonishing organ in our heads is to be found in Steno’s suggestion that we should treat the brain as a machine. ‘Machine’ has meant very different things over the centuries, and each of those meanings has had consequences for how we view the brain. In Steno’s time the only kinds of machine that existed were based on either hydraulic power or clockwork. The insights these machines could provide about the structure and function of the brain soon proved limited, and no one now looks at the brain this way. With the discovery that nerves respond to electrical stimulation, in the nineteenth century the brain was seen first as some kind of telegraph network and then, following the identification of neurons and synapses, as a telephone exchange, allowing for flexible organisation and output (this metaphor is still occasionally used in research articles).

Since the 1950s our ideas have been dominated by concepts that surged into biology from computing — feedback loops, information, codes and computation. But although many of the functions we have identified in the brain generally involve some kind of computation, there are only a few fully understood examples, and some of the most brilliant and influential theoretical intuitions about how nervous systems might ‘compute’ have turned out to be wrong. Above all, as the mid-twentieth-century scientists who first drew the parallel between brain and computer soon realised, the brain is not digital. Even the simplest animal brain is not a computer like anything we have built, nor one we can yet envisage. The brain is not a computer, but it is more like a computer than it is like a clock, and by thinking about the parallels between a computer and a brain we can gain insight into what is going on inside both our heads and those of animals.

Exploring these ideas about the brain — the kinds of machine we have imagined brains to be — makes it clear that, although we are still far from fully understanding the brain, the ways in which we think about it are much richer than in the past, not simply because of the amazing facts we have discovered, but above all because of how we interpret them.

(Matthew Cobb, The Idea of The Brain, 2020, p.3-4)

What surprised me was the seeming arbitrariness of which metaphor was used to explain the phenomenon. It felt as if they got plucked from the ideas current at the time, without thought for adequacy and fitness.

My first instinct was to search for the value of these weird pseudo-mechanisms. Maybe there was something deeper there, some hidden methodological virtue?

But after some time, I came to the conclusion that the explanation was much simpler: there is a deep-rooted human compulsion for explanation, even when the explanations are ungrounded, useless, and weak.

We Need Explanations

Once you look for it, you see this tendency everywhere. Pseudo-mechanisms were postulated in cooking, medicine, astronomy. From old anthropological gods to vapors, humors, forces and energies, I’ve yet to find many examples where people didn’t jump to a pseudo-mechanism, however impotent it was at handling the phenomena.[1]

I’m guilty of this myself. So when articulating issues of emotional management for example, I grasped at the available mechanism of potential energy (which I was studying) to frame it.

Even aside from the lackluster explanatory power of most of these pseudo-mechanisms (a topic I will explore below), this aching need for an explanation is curious, because it goes against a key methodological tenet I have underscored many times: you want first to find stable phenomenological compressions (patterns in the data) before jumping to mechanistic models, otherwise you have no way to ground the latter.

Without crisp and manageable handles on the system under study, the quest for a useful and grounded mechanism is quixotic, unmoored from any stable foundation in the data.
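
As a concrete illustration of what a phenomenological compression looks like before any mechanism, here is a minimal Python sketch (mine, not from this post; the orbital data are standard textbook values): Kepler's third law recovered as a bare pattern, decades before Newton supplied the mechanism.

```python
# A minimal sketch: recovering Kepler's third law as a pure pattern
# in the data, with no mechanism assumed.
# Semi-major axes (AU) and orbital periods (years) are standard values
# for Mercury, Venus, Earth, Mars, Jupiter, Saturn.
import numpy as np

a = np.array([0.387, 0.723, 1.000, 1.524, 5.203, 9.537])
T = np.array([0.241, 0.615, 1.000, 1.881, 11.862, 29.457])

# Fit T = k * a^n by linear regression in log-log space:
# log T = n * log a + log k.
n, log_k = np.polyfit(np.log(a), np.log(T), 1)
print(f"fitted exponent n ≈ {n:.3f}")  # ≈ 1.500, i.e. T^2 ∝ a^3
```

The fitted exponent of about 1.5 is the compression; universal gravitation is the mechanism that only came later, grounded by exactly this kind of regularity.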

Indeed, if we look at some of the most impressive cases of scientific progress throughout history, they usually involve people actively avoiding mechanistic models for a time, or at least cleanly separating them from the phenomenological compressions under study.

For example, Darwin used the black-boxing technique mentioned in a previous post to sidestep the mechanism for inheritance and variability, which was beyond the power level of the science of his day.

Darwin treats inheritance and variation as black boxes. The variability and heritability of certain traits is undeniable. Darwin was clear and explicit about this. Nonetheless, he deliberately set aside and postponed puzzles concerning their nature and structure until more auspicious times. And, when he did try to answer these issues, in later publications, his wrongheaded speculations left his evolutionary analyses unscathed.

(Marco J. Nathan, Black Boxes, 2021, p.55)

Similarly, Mendel captured the most fundamental properties of inheritance without any recourse to mechanisms or explanations, focusing exclusively on phenomenological compressions.

Our key observation is that Mendel did not provide any description of the mechanisms responsible for the transmission of “factors.” In fairness, Mendel was not seeking a mechanistic characterization and, arguably, he did not even have a sophisticated concept of mechanism in mind. In accordance with his training in physics within the Austro-Hungarian school, he was after a mathematical description of laws and their consequences, which could capture and systematize his meticulous data. Furthermore, Mendel predated many new cytological findings, many of which were accomplished in the 1870s and 1880s, years after the publication of his paper.

(Marco J. Nathan, Black Boxes, 2021, p.56)

Other examples come to mind, like the revolutionary work of Milman Parry (covered in a previous post). Through sheer autistic textual analysis of Homeric epics, Parry revealed an intensely formulaic and procedural style, which eventually led him and his student Albert Lord to an exquisitely complex model of oral composition for this and many other poetic traditions.

Similarly, the establishment of Phenomenological Thermodynamics (what we usually call Thermodynamics), one of the most wide-ranging and stable theories in physics, required the efforts of Clausius to remove the weird pseudo-mechanisms (notably the caloric fluid) from Carnot's insane insights.[2]

1. When I wrote my First Memoir on the Mechanical Theory of Heat, two different views were entertained relative to the deportment of heat in the production of mechanical work. One was based on the old and widely spread notion, that heat is a peculiar substance, which may be present in greater or less quantity in a body, and thus determine the variations of temperature. […]

Upon this view is based the paper published by S. Carnot, in the year 1824, wherein machines driven by heat are subjected to a general theoretical treatment. […]

The other view above referred to is that heat is not invariable in quantity; but that when mechanical work is produced by heat, heat must be consumed, and that, on the contrary, by the expenditure of work a corresponding quantity of heat can be produced. This view stands in immediate connexion with the new theory respecting the nature of heat, according to which heat is not a substance but a motion. […]

On this account it was thought that one of two alternatives must necessarily be accepted; either Carnot's theory must be retained and the modern view rejected, according to which heat is consumed in the production of work, or, on the contrary, Carnot's theory must be rejected and the modern view adopted.

2. When at the same period I entered on the investigation of this subject, I did not hesitate to accept the view that heat must be consumed in order to produce work. Nevertheless I did not think that Carnot's theory, which had found in Clapeyron a very expert analytical expositor, required total rejection; on the contrary, it appeared to me that the theorem established by Carnot, after separating one part and properly formulising the rest, might be brought into accordance with the more modern law of the equivalence of heat and work, and thus be employed together with it for the deduction of important conclusions.

(Rudolf Clausius, The Mechanical Theory of Heat, 1867, p.267-269)
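
In modern notation (mine, not Clausius's), the reconciliation he describes keeps two statements at once: the first law, which encodes the "modern view" that heat is consumed when work is produced,

$$\Delta U = Q - W,$$

and the surviving core of Carnot's theorem, stripped of the caloric fluid: no heat engine operating between a hot reservoir at $T_h$ and a cold one at $T_c$ can beat the efficiency bound

$$\eta \le 1 - \frac{T_c}{T_h}.$$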

And yet, we keep searching and giving explanations, whatever our lack of grounds. If I had to venture an explanation (the compulsion strikes again!), I would say that we just struggle to keep track of and manipulate patterns of data without an underlying story. So we end up making one up, pulling it out of our memetic climate.

Note that this claim also makes sense of the recurrent overcompensation in various fields, where some practitioners become allergic to any kind of model or explanation. I expect this is a reaction to both this deep-seated compulsion and a recent history of repeated failures of these arbitrary explanations.[3]

Progress Despite (And Thanks To) This Compulsion

Now, the methodological weaknesses of this compulsion to explain don’t condemn us to never make any progress in modeling and problem solving.

First, these pseudo-mechanisms have one clear benefit: they focus our attention, delineating specific enough questions for us to actually investigate without feeling overwhelmed. This investigation, when done well, then yields new phenomenological compressions, which provide the soil for the growth of future, more grounded explanations.

In that way, pseudo-mechanisms act as randomizers and symmetry-breakers, where the arbitrary actually jumps us out of the analysis paralysis that a completely open field might cause.

Next, the historical record shows an improvement in the quality of explanations through time, of their “mechanisticness” or “gears-levelness”. As more and more phenomenological regularities accumulate, the first actually mechanistic and powerful explanations emerge, coming notably from the most grounded and regular fields (physics, then chemistry). This creates an improved taste for what a good mechanism looks like, leading to better models all around.

This obviously also brings issues. For example, you can see the development of fads about what a "real" model looks like. Most fields of science have had a youth period when they desperately looked for models like the physicists', irrespective of whether these were suitable. Even in physics, especially early on, some shapes of models ended up sacrosanct for a time.

The reasons why certain formulas and other mathematical formalisms were closer at hand than others can be traced to fundamental, dare I say metaphysical presuppositions regarding the underlying structure of matter. Chief among these was the conviction that all natural phenomena could be reduced to the action at a distance of forces of attraction and repulsion, operating along the lines connecting (microscopic) centers of force. These centers of force were conceived as the smallest particles of matter or of imponderable fluids—traditional notions that were now made rigorously mathematical. […] It also makes clear to what extent the program was stimulated and shaped by the remarkable success of celestial mechanics, embodying what John T. Merz so aptly called the “astronomical view of nature.” On this approach, quantitative methods would not only be applied to previously formalized fields such as optics but also extended to the study of such domains as heat, electricity, and magnetism. The exemplary status of celestial mechanics became apparent even in the construction of specific mathematical tools and constituted the very ideal of precision measurement, for which the phrase “astronomical accuracy” serves as watchword.

[…]

Poisson illustrates both the power of the Laplacean program and its limitations, for within that program, research outlook was shaped not only by the general goals of quantification and mathematical formalization but also by a very specific set of substantive assumptions and its precisely tuned mathematical toolbox. Such tools were suited only to very specific questions, while others were lost to view, and not necessarily because they were deemed uninteresting but simply because there was no clear way to deal with them using existing procedures. Furthermore, the emphasis on mathematical formalization and precise measurement brought with it the disparagement or exclusion of broad-based qualitative experimental research. It is thus no coincidence that the most significant studies of the voltaic pile, which led to equally original and innovative results, took place outside France. The focused quest for mathematical formalization had its price.

(Friedrich Steinle, Exploratory Experiments, 2016, p.25-26,30)

And this also breeds an adversarial tendency to present whatever you have in the trappings of accepted models, so it passes people's filters. That's why so many conspiracy theories and pseudo-sciences adopt pseudo-mechanisms that feel scientific, without actually having the right properties for a mechanism.

Last but not least, in exceptional cases, the very compulsion to look for mechanisms can be fruitfully exploited. I know of one particularly salient example: Maxwell’s work on Electromagnetism. While he accepted the core phenomenological compression of lines of forces from Faraday, Maxwell experimented wildly with different methods and explanations, until homing in on what is still one of the most impressive theories in all of human history.[4]

From the beginning of his study of electromagnetism, Maxwell was committed to Faraday’s conceptual framework of lines of force. The commitment was based on Maxwell’s confidence that the concept of lines of force was the proper way to unify the phenomena in this domain. And indeed Maxwell adhered to this concept—against the dominant concept of action at a distance—but kept changing the discussion by placing it in different methodological contexts from which further consequences were inferred. Thus, in Station 1, the lines of force are imagined to be tubes; in Station 2 they are considered to be physically real, the product of some hypothetical mechanical scheme at the micro-level; in Station 3, they are set in a field; and, finally, in Station 4 they are embedded in a medium which is the seat of energy. The fundamental concept does not change in the course of Maxwell’s journey in electromagnetism, but the settings and the methodologies vary as Maxwell moved from one station to the next.

(Giora Hon and Bernard R. Goldstein, Reflections on the Practice of Physics, 2021, p.209)

Mitigating The Costs Of Pseudo-Mechanisms

Still, realizing that this compulsion exists in each and every one of us creates new requirements for good epistemic hygiene.

It would be best if we could simply not follow the compulsion, and stay as much as possible at the level of data patterns and phenomenological compressions, at least until we have a good handle there.

This is one way to interpret my friend Gabe’s point about predicting and interpreting people based on their behavior, not the intents we ascribe to them.[5]

Personally, when I try to predict the behaviour of people, I start with their past actions. As in, I look at what they have done in the past, and assume they’ll do more of that. When that is not clear enough, I move on to what they have publicly declared.

And finally, begrudgingly, when their past behaviour and public statements are not enough, I move on to analysing their psyche, and try to guess at their intents. This is painful, and the weakest part of my models.

In practice, I often see people doing the opposite. They start with psycho-analysis.

(Gabe, Intention-based reasoning, 2025)

Yet this is clearly not sufficient. We can't just will this fundamental compulsion away — like it or not, we will find ourselves regularly coming up with explanations, with or without grounding. And at some point, this is necessary: never looking for explanations is cowardice, not wisdom.

I see two clear ways of improving our natural explanation-making tendencies:

  • Methodology
    • By developing a deeper and more nuanced understanding of different types of explanations, what they do, and how they work, we can get better at coming up with good explanations, or at least at valuing our explanations only insofar as they are worth it.
  • Contextualization
    • Exploring the intellectual history of the concepts we use, especially if they are still somewhat ungrounded, gives us some context on our explanation biases. By default we think in the "obvious way" inherited from our various cultures, and intellectual histories help us, like the proverbial fish, notice the water we swim in.[6]

Finally Clarifying Phlogiston

On a personal note, understanding this compulsion and how it fits in the messy picture of “technical progress” also clarified a long-standing confusion I had.

Since I first read Hasok Chang’s Is Water H2O?, I’ve been convinced by his argument that there was something to the old chemical concept of phlogiston, which was completely replaced and eradicated by the more compositional chemistry of Lavoisier (in what we now call The Chemical Revolution).

I even wrote a blogpost years ago trying to defend it, since both in scientific circles and rationalist ones, phlogiston is the poster child of a fake explanation.

But it never felt exactly right. Phlogiston obviously smells fake, a curiosity-stopper that just ascribes some behaviours of chemical matter to a substance that is taken out and put back through various processes, notably combustion. It's quite obvious that the precise weighing and combination of elements advocated by Lavoisier is a better mechanism, and it is indeed the basis of an incredibly successful modern chemistry.

And yet Chang’s point still stand: by completely killing anything related to phlogiston, the Lavoisierians also threw away some of the core questions and observations that the study of phlogiston had revealed, which were not well explained by the new mechanisms.

But, while the phlogistians, on the one hand, were unaware that the burnt product differed from the original combustible otherwise than as ice differs from water, by loss of energy, Lavoisier, on the other hand, disregarded the notion of energy, and showed that the burnt product included not only the stuff of the combustible, but also the stuff of the oxygen it had absorbed in the burning. But, as well observed by Dr. Crum-Brown, we now know "that no compound contains the substances from which it was produced, but that it contains them minus something. We now know what this something is, and can give it the more appropriate name of potential energy; but there can be no doubt that this is what the chemists of the seventeenth century meant when they spoke of phlogiston."

(William Odling, The Revived Theory of Phlogiston, 1876)

Indeed, it is curious now to note that not only their new classification but even their mechanism was essentially correct. It is only in the last few years that we have realized that every process that we call reduction or oxidation is the gain or loss of an almost imponderable substance, which we do not call phlogiston but electrons.

(Gilbert Lewis, The Anatomy of Science, 1926, p.167-168)
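
To make Lewis's point concrete (my illustration, not his), consider the combustion of magnesium, $2\,\mathrm{Mg} + \mathrm{O_2} \to 2\,\mathrm{MgO}$, split into half-reactions:

$$\mathrm{Mg} \to \mathrm{Mg}^{2+} + 2e^{-} \quad \text{(oxidation: electrons lost)}$$

$$\mathrm{O_2} + 4e^{-} \to 2\,\mathrm{O}^{2-} \quad \text{(reduction: electrons gained)}$$

The electrons the metal gives up on burning play exactly the role the phlogistonists assigned to phlogiston leaving the combustible.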

My mistake was to try to resolve this tension between fake mechanism and real loss fully on one side, mostly by pushing a Whiggish history that phlogiston was a good mechanism from the beginning.

But now the correct approach jumps out at me: the problem with the treatment of phlogiston by the Lavoisierians was not their replacement of a bad mechanism, but their throwing away of the phenomenological compressions revealed by phlogiston: things we now model as potential chemical energy, electron flow, free electrons in metals…

Where I think the Lavoisierians failed was in their totalizing desire, rare in the history of chemistry, to delete every mention of phlogiston from the history books, like a modern damnatio memoriae.

In doing so, they fucked up one of the main sources of progress, namely the accumulation of phenomenological compressions.

  1. ^

    There is a pattern that I can't clearly see yet, where attachment to explanations seems inversely proportional to the economic benefits and pragmatism of the field. One gets the impression that chemists working on dyes and blacksmiths forging weapons didn't care that much about the accuracy of their mechanisms, whereas physicists, fundamental scientists, and thinkers of all sorts got obsessed with their specific explanations.

  2. ^

    Another amazing example in physics is the invention of the potential. I know of only one great source, in French, but it convincingly argues that the potential was nothing more than a mathematical compression for the manipulation of inverse-square law forces — it's the incredibly wide applicability and power of the trick that eventually suggested its reification as potential energy and the Lagrangian.
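
    A minimal sketch of the compression in question (standard textbook math, not the source's): for an inverse-square force one introduces the scalar potential

    $$V(\mathbf{r}) = -\frac{k}{r}, \qquad \mathbf{F} = -\nabla V = -\frac{k}{r^2}\,\hat{\mathbf{r}},$$

    so three force components collapse into a single scalar function, and combining many sources reduces to scalar addition, $V = -\sum_i k_i / |\mathbf{r} - \mathbf{r}_i|$, instead of vector addition of forces.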

  3. ^

    Off the top of my head, I'm thinking of the self-coherence checks of temperature scales by Victor Regnault described in Chang's Inventing Temperature, the refusal of internal states in Behaviorism, and the fear of ascribing causation anywhere in many modern statistical analyses.

  4. ^

    This is a topic I've been wanting to write a blogpost about for years. For the curious, I truly recommend the book from which I'm drawing, Reflections on the Practice of Physics.

  5. ^

    Note that the main point of this post is more a decision-theoretic one: relying on intents first leads to worse predictions, susceptibility to self- and external deception, and a generally bad equilibrium of responsibility across society.

  6. ^

    Cosmopolitanism in general helps here, because reading and interacting with other cultures reveals your own hidden assumptions.