Crabs

Nature really loves to evolve crabs.

(source)

Some early biologist, equipped with knowledge of evolution but not much else, might see all these crabs and expect a common ancestral lineage. That’s the obvious explanation of the similarity, after all: if the crabs descended from a common ancestor, then of course we’d expect them to be pretty similar.

… but then our hypothetical biologist might start to notice surprisingly deep differences between all these crabs. The smoking gun, of course, would come with genetic sequencing: if the crabs’ physiological similarity is achieved by totally different genetic means, or if functionally-irrelevant mutations differ across crab-species by more than mutational noise would induce over the hypothesized evolutionary timescale, then we’d have to conclude that the crabs had different lineages. (In fact, historically, people apparently figured out that crabs have different lineages long before sequencing came along.)

Now, having accepted that the crabs have very different lineages, the differences are basically explained. If the crabs all descended from very different lineages, then of course we’d expect them to be very different.

… but then our hypothetical biologist returns to the original empirical fact: all these crabs sure are very similar in form. If the crabs all descended from totally different lineages, then the convergent form is a huge empirical surprise! The differences between the crabs have ceased to be an interesting puzzle - they’re explained - but now the similarities are the interesting puzzle. What caused the convergence?

To summarize: if we imagine that the crabs are all closely related, then any deep differences are a surprising empirical fact, and are the main remaining thing our model needs to explain. But once we accept that the crabs are not closely related, then any convergence/similarity is a surprising empirical fact, and is the main remaining thing our model needs to explain.

Agents

A common starting point for thinking about “What are agents?” is Dennett’s intentional stance:

Here is how it works: first you decide to treat the object whose behavior is to be predicted as a rational agent; then you figure out what beliefs that agent ought to have, given its place in the world and its purpose. Then you figure out what desires it ought to have, on the same considerations, and finally you predict that this rational agent will act to further its goals in the light of its beliefs. A little practical reasoning from the chosen set of beliefs and desires will in most instances yield a decision about what the agent ought to do; that is what you predict the agent will do.

— Daniel Dennett, The Intentional Stance, p. 17

One of the main interesting features of the intentional stance is that it hypothesizes subjective agency: I model a system as agentic, and you and I might model different systems as agentic.

Compared to a starting point which treats agency as objective, the intentional stance neatly explains many empirical facts - e.g. different people model different things as agents at different times. Sometimes I model other people as planning to achieve goals in the world, sometimes I model them as following set scripts, and you and I might differ in which way we’re modeling any given person at any given time. If agency is subjective, then the differences are basically explained.

… but then we’re faced with a surprising empirical fact: there’s a remarkable degree of convergence among which things people do-or-don’t model as agentic at which times. Humans yes, rocks no. Even among cases where people disagree, there are certain kinds of arguments/evidence which people generally agree update in a certain direction - e.g. Bottlecaps Aren’t Optimizers seems to make a largely-right kind of argument, even if one found its main claim surprising. If agency is fundamentally subjective, then the convergence is a huge empirical surprise! The differences between which-things-people-treat-as-agents have ceased to be an interesting puzzle - they’re explained - but now the similarities are the interesting puzzle. What caused the convergence?

Metaphorically: if agency is subjective, then why all the crabs? What’s causing so much convergence?

And, to be clear, I don’t mean this as an argument against the intentional stance. The intentional stance seems basically correct, at a fundamental level: “agent” is a pretty high-level abstraction, and high-level abstractions “live in the mind, not in the territory” as Jaynes would say. So the intentional stance is as solid a foundation as one could hope for. But the crabs question indicates that the intentional stance, though correct, is incomplete as an answer to the “What are agents?” question: once we accept subjectivity, the main remaining thing our model needs to explain is the degree of convergence. When and why do people agree about agency? What’s the underlying generator of that agreement?

Probability

We’re all Bayesian now, right?

The big difference between (the most common flavor of) Bayesianism and its main historical competitor, frequentism, is that Bayesianism interprets probabilities as subjective. Different people have different models, and different information, and therefore different (usually implicit) probabilities. There’s not fundamentally an objective “probability that the die comes up 2”. Frequentism, on the other hand, would say that probabilities only make sense as frequencies in experiments which can be independently repeated many times - e.g. one can independently roll a die many times, and track the frequency at which it comes up 2, yielding a probability.
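A minimal sketch of the two readings, assuming a simulated fair die. The frequentist number is just the observed frequency over repeated rolls; the Bayesian numbers are posterior means under two different Dirichlet pseudo-count priors. The function names, the seed, and both priors are made up for illustration and are not from any particular library.

```python
import random
from collections import Counter

def roll_die(n, seed=0):
    """Simulate n independent rolls of a fair six-sided die."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(n)]

def frequentist_estimate(rolls, face):
    """Observed frequency of `face` across repeated rolls (needs at least one roll)."""
    return rolls.count(face) / len(rolls)

def posterior_mean(rolls, face, prior_counts):
    """Dirichlet-multinomial posterior mean:
    (prior pseudo-count + observed count) / (total pseudo-counts + number of rolls)."""
    counts = Counter(rolls)
    total = sum(prior_counts.values()) + len(rolls)
    return (prior_counts[face] + counts[face]) / total

# Two hypothetical agents with different prior information.
uniform_prior = {f: 1.0 for f in range(1, 7)}                         # no face favored
skeptic_prior = {f: (20.0 if f == 2 else 1.0) for f in range(1, 7)}   # suspects the die favors 2

for n in (10, 100_000):
    rolls = roll_die(n)
    print(
        n,
        round(frequentist_estimate(rolls, 2), 4),
        round(posterior_mean(rolls, 2, uniform_prior), 4),
        round(posterior_mean(rolls, 2, skeptic_prior), 4),
    )
```

With only a handful of rolls the two agents assign visibly different probabilities to a 2, because their priors differ; before any rolls at all, the frequentist estimate is not even defined while both Bayesian agents already have an answer.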

Compared to frequentism, Bayesianism does resolve a lot of puzzles. In analogy to the intentional stance, it also plays a lot better with reductionism: we don’t need a kinda-circular notion of “independent experiment”, and we don’t need to worry about what it means that an experiment “could be” repeated many times even if it actually isn’t.

But we’ve resolved those puzzles in large part by introducing subjectivity, so we have to ask: why all the crabs?

Empirically, there is an awful lot of convergence in (usually implicit) probabilities. Six-sided dice sure do come up 2 approximately ⅙ of the time, and people seem to broadly agree about that. Also, there are all sorts of “weird” Bayesian models - e.g. anti-inductive models, which say that the more often something has happened before, the less likely it is to happen again, and vice-versa. (“The sun definitely won’t come up tomorrow, because it’s come up so many times before.” “But you say that every day, and you’re always wrong!” “Exactly. I’ve never been right before, so I’ll near-certainly be right this time.”) Under the rules of subjective Bayesianism alone, those are totally valid models! … Yet somehow, people seem to converge on models which aren’t anti-inductive.
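The anti-inductive joke can be made concrete with a small sketch. The "inductive" predictor below is Laplace's rule of succession; the "anti-inductive" one simply swaps the roles of successes and failures. Both are coherent predictive distributions, so nothing in the probability axioms rules either out. The function names and the 100-day sunrise record are illustrative assumptions.

```python
import math

def inductive(successes, trials):
    """Laplace's rule of succession: more past successes -> more confident in success."""
    return (successes + 1) / (trials + 2)

def anti_inductive(successes, trials):
    """Mirror image: more past successes -> LESS confident in success."""
    return (trials - successes + 1) / (trials + 2)

def cumulative_log_loss(predictor, outcomes):
    """Total surprise (negative log probability) the predictor racks up on a 0/1 sequence."""
    loss = 0.0
    successes = 0
    for trials, outcome in enumerate(outcomes):
        p = predictor(successes, trials)          # predicted probability of a 1
        loss += -math.log(p if outcome else 1.0 - p)
        successes += outcome
    return loss

sunrises = [1] * 100  # the sun came up every day for 100 days

print(cumulative_log_loss(inductive, sunrises))       # log(101), roughly 4.6
print(cumulative_log_loss(anti_inductive, sunrises))  # log(101!), roughly 368
```

Both predictors start at 0.5 on day one. On this record the inductive predictor's total loss telescopes to log(101), while the anti-inductive predictor's grows to log(101!), so it is coherent but badly scored by the world it lives in. That observation restates the puzzle rather than solving it: the axioms alone do not say which predictor to pick.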

Much like the intentional stance, (subjective) Bayesianism seems roughly correct (modulo some hedging about approximation and domain of applicability). But it’s incomplete - it doesn’t explain the empirically-high degree of convergence. When and why do people agree about probabilities (or, at a meta level, the models in which those probabilities live)? What’s the underlying generator of that agreement?

Ontology

It’s a longstanding empirical observation that different cultures sometimes use different ontologies. Eskimos have like a gajillion words for snow, Polynesian navigators had some weird-to-us system for thinking about the stars, etc. So, claim: ontologies are subjective.

(In this case there’s a whole historical/political angle, to a much greater degree than the previous examples. The narrative goes that bad imperial colonizers would see the different ontologies of other cultures, decide that the ontologies of those “primitive” cultures were clearly “irrational”, and then “help” them by imposing the imperial colonizer’s ontology. And therefore today, any claim that ontologies might not be entirely subjective, or especially any claim that one ontology might be better than another, is in certain circles met with cries of alarm and occasionally attempts at modern academia’s equivalent of exorcism.)

Similar to the intentional stance, there’s a sense in which ontologies are obviously subjective. They live in the mind, not the territory. The claim of subjectivity sure does seem correct, in at least that sense.

But why all the crabs?

Even more so than agents or probability, convergence is the central empirical fact about ontologies. We’re in a very high dimensional world, yet humans are able to successfully communicate at all. That requires an extremely high degree of everyday ontological convergence - every time you say something to me, and I’m not just totally confused or misunderstanding you completely, we’ve managed to converge on ontology. (And babies are somehow able to basically figure out what words mean with only like 1-3 examples, so the convergence isn’t by brute force of exchanging examples with other humans; that would require massively more examples. Most of the convergence has to happen via unsupervised learning.)

So modeling ontologies as subjective is correct in a fundamental sense, but it’s incomplete - it doesn’t explain the empirically-high degree of convergence. When and why do people converge on the same ontology? What’s the underlying generator of that convergence?

Generalization

I find this pattern comes up all the time when discussing philosophy-adjacent questions. The pattern goes like this:

  • Question: What is X? (e.g. agency, probability, etc)
  • Response: Well, that’s subjective.
  • Counter-response: Yeah, that seems basically-true in at least a reductive sense; X lives in the mind, not the territory. But then why all the crabs? Why so much convergence in people’s notions of X?

and then usually the answer is “well, there’s a certain structure out in the world which people recognize as X, because recognizing it as X is convergently instrumental for a wide variety of goals”, so we want to characterize that structure. But that’s beyond the scope of this post; this post is about the question, not the answer.

Gotchas

To wrap up, two gotchas to look out for.

First, convergence is largely an empirical question, and it might turn out that there’s surprisingly little convergence. I think “consciousness” is a likely example: there’s a sort of vague thematic convergence (it has to do with the mind, and maybe subjective experience, and morality?), but people in fact mean a whole slew of wildly different things when they talk about “consciousness”. (Critch has some informal empirical results illustrating what I mean.) “Human values” is, unfortunately, another plausible example.

Second, when empirically checking for convergence, bear in mind that when people try to give definitions for what they mean, they are almost always wrong. I don’t mean that people give definitions which disagree with the definitions other people use, or that people give definitions which disagree with some platonic standard; I mean people give definitions which are not very good descriptions of their own use of a word or concept. Humans do not have externally-legible introspective access to what we mean by a word or concept. So, don’t ask people to define their terms as a test of convergence. Instead, ask for examples.

15 comments

The title of this post ("Yes, It's Subjective, But Why All The Crabs?") gave me entirely the wrong idea of what the post was going to be about, and likely would have caused me to skip it if I didn't recognize the author.

(I thought it was going to actually be about crabs, not about crabs-as-a-metaphor.  I was reminded of the article There's No Such Thing As A Tree (Phylogenetically).)

Ann, 9mo

I confess I was hoping for a theory of carcinization.

Even more infuriating, it appears that the picture shows a variety of porcelain crabs, and not in fact different species.

Yeah, I didn't quickly find a good picture with lots of different species. I too was kinda sad about that.

The Hall of Biodiversity at the natural history museum in NYC does have a little section of crustaceans, which is the closest I quickly found, and is very much the sort of thing you'd probably like if you (like me) would really rather have read a blog post explaining carcinization:

The Spectrum of Life: Hall of Biodiversity | AMNH

aysja, 9mo

I am not totally sure that I disagree with you, but I would not say that agency is subjective and I’m going to argue against that here. 

Clarifying “subjectivity.” I’m not sure I disagree because of this sentence “there’s a certain structure out in the world which people recognize as X, because recognizing it as X is convergently instrumental for a wide variety of goals.” I’m guessing that where you’re going with this is that the reason it’s so instrumentally convergent is because there is in fact something “out there” that deserves to be labeled as X, irrespective of the minds looking at it? Like, the fact that we all agree that oranges are things is because oranges basically are things, e.g., they contain the molecules we need for energy, have rinds, and so on, and these are facts about the territory; denying that would be bad for a wide variety of goals because you’d be missing out on something instrumentally useful for many goals, where, importantly, “usefulness” is at least in part a territory property, e.g., whether or not the orange contains molecules that we can metabolize. If this is what you mean, then we don’t disagree. But I also wouldn’t call an orange subjective, in the same way I wouldn’t call agency subjective. More on that later. 

People modeling things differently does not necessarily imply subjectivity. It seems like your main point about agents being subjective is that “different people model different things as agents at different times.” This doesn’t seem sufficient to me. Like, people modeled heat as different things before we knew what it was, e.g., there was a time when people were arguing about whether it was a motion, a gas, or a liquid. But heat turned out to be “objective,” i.e., something which seems to exist irrespective of how we model it. Likewise, before Darwin there was some confusion over what different dog breeds were: many people considered them to be different “varieties” which was basically just a word for “not different species, but still kind of different.” Darwin claimed, and I believe him, that people would give different answers about whether these were different species or merely different varieties based on context and their history (e.g., if a naturalist had never seen dogs, then they’d probably call them different species, if they had, they’d call them different varieties). As it turns out, there’s an underlying “objective” thing here, which is how much their genomes differ from each other (I think? Not an evolutionary biologist :p). In any case, it seems to me that it is often the case that before scientific concepts are totally sussed out there is disagreement over how to model the thing they are pointing at, but that this doesn’t on its own imply that it’s inherently subjective.

A potential crux. There is a further thing you might mean here, what Dennett calls “the indeterminacy of interpretation,” which is that there is just no fact of the matter to what is agentic. Like, people might have disagreed about what heat was for a while, but it turned out that heat is more-or-less objective. The concept “hot,” on the other hand, is more subjective: just a property of how the neurons in a particular body/mind are tuned. In other words, the answer to whether something is hot is basically just “mu”—there is no fact of the matter about it. I am guessing that you think agency is of the latter type; I think it is of the former, i.e., I think we just haven’t pinned down the concepts in agency well enough to all agree on them yet, but that there is something “actually there” which we are pointing at. This might be our crux?  

Abstractions are not all subjective? I am generally pretty confused by the stance that all “high-level abstractions” are subjective (although I don’t know what you mean by “high-level.”) I think (based on citing Jaynes) that you are saying something like “abstractions are reflections of our own ignorance.” E.g., we talk about temperature as some abstract thing because we are uncertain about the particular microstate that underlies it. But it seems to me that if you take this stance then you have to call basically everything subjective, e.g., an orange sitting right in front of me is subjective because I am ignorant of its exact atomic makeup. This seems a little weird to me, like oranges wouldn’t go away if we became fully certain about them? Likewise, I don’t think agency goes away if we become less ignorant of it. 

Agents are more like “oranges” than they are like “hot.” To me, agents seem clearly in the “orange” category, rather than the “hot” category. Sure, we might currently call different things agents at different times, but to me it seems clear that there is something “real” there that exists aside from our perceptual machinery/interpretation layer. Like, the fact that agents consume order (negentropy) from their environment to spend on the order they care about is one such example of something “objective-ish” about agents, i.e., a real regularity happening in the territory, not just relative to our models of it. 

Why do we disagree about what’s agentic, then? On my model, part of the reason that people vary on what they call agentic is because (I suspect) “agency” is not going to be a coherent concept in itself, but rather break out into multiple concepts which all contribute to our sense of it, such that many things we currently consider to be edge cases can be explained by one or a few factors being missing (or diminished). Likewise, I do expect that it is not entirely categorical, but that things can have more or less of it, and have more or less at different times (i.e., that a particular human varies in its ‘agent-ness’ over time). Neither of these seem incongruent to me with the idea that it’s objective-ish, just that we haven’t clarified what we mean by agency yet. 

I’m guessing that where you’re going with this is that the reason it’s so instrumentally convergent is because there is in fact something “out there” that deserves to be labeled as X, irrespective of the minds looking at it? Like, the fact that we all agree that oranges are things is because oranges basically are things, e.g., they contain the molecules we need for energy, have rinds, and so on, and these are facts about the territory; denying that would be bad for a wide variety of goals because you’d be missing out on something instrumentally useful for many goals, where, importantly, “usefulness” is at least in part a territory property, e.g., whether or not the orange contains molecules that we can metabolize. If this is what you mean, then we don’t disagree.

Yup, exactly. And good explanations, this is a great comment all around.

In the final paragraph, I'm uncertain if you are thinking about "agency" being broken into components which make up the whole concept, or thinking about the category being split into different classes of things, some of which may have intersecting examples (or both?). I suspect both would be helpful. Agency can be described in terms of components like measurement/sensory, calculations, modeling, planning, comparisons to setpoints/goals, taking actions. Probably not that exact set, but then examples of agent-like things could naturally be compared on each component, and should fall into different classes. Exploring the classes, I suspect, would inform the set of components and the general notion of "agency".

I guess to get work on that done it would be useful to have a list of prospective agent components, a set of examples of agent shaped things, and then of course to describe each agent in terms of the components. What I'm describing, does it sound useful? Do you know of any projects doing this kind of thing?

On the topic of map-territory correspondence (is there a more concise name for that?), I quite like your analogies. Running with them a bit, it seems like there are maybe 4 categories of map-territory correspondence:

  • Orange-like: It exists as a natural abstraction in the territory and so shows up on many maps.
  • Hot-like: It exists as a natural abstraction of a situation. A fire is hot in contrast to the surrounding cold woods. A sunny day is hot in contrast to the cold rainy days that came before it.
  • Heat-like: A natural abstraction of the natural abstraction of the situation, or alternatively, comparing the temperature of 3, rather than only 2, things. It might be natural to jump straight to the abstraction of a continuum of things being hot or not relative to one another, but it also seems natural to instead not notice homeostasis, and only to categorize the hot and cold in the environment that push you out of homeostasis.
  • Indeterminate: There is no natural abstraction underneath this thing. People either won't consistently converge to it, or if they do, it is because they are interacting with other people (so the location could easily shift, since the convergence is to other maps, not to territory), or because of some other mysterious force like happenstance or unexplained crab shape magic.

It feels like "heat-like" might be the only real category in some kind of similarity clusters kind of way, but also "things which are using a measurement proxy to compare the state of reality against a setpoint and taking different actions based on the difference between the measurement result and the setpoint" seems like a specific enough thing when I think about it that you could divide all parts of the universe into being either definitely in or definitely out of that category, which would make it a strong candidate for being a natural abstraction, and I suspect it's not the only category like that.

I wouldn't be surprised if there were indeterminate things in shared maps, and in individual maps, but I would be very surprised if there were many examples in shared maps that were due to happenstance instead of being due to convergence of individual happenstance indeterminate things converging during map comparison processes. Also, weirdly, the territory containing map making agents which all mark a particular part of their maps may very well be a natural abstraction, that is, the mark existing at a particular point on the maps might be a real thing, but not the corresponding spot in territory. I'm thinking this is related to a Schelling point or Nash Equilibrium, or maybe also related to human biases. Although, those seem to have more to do with brain hardware than agent interactions. A better example might be the sound of words: arbitrary, except that they must match the words other people are using.

Unrelated epistemological game; I have a suspicion that for any example of a thing that objectively exists, I can generate an ontology in which it would not. For the example of an orange, I can imagine an ontology in which "seeing an orange", "picking a fruit", "carrying food", and "eating an orange" all exist, but an orange itself outside of those does not. Then, an orange doesn't contain energy, since an orange doesn't exist, but "having energy" depends on "eating an orange" which depends on "carrying food" and so on, all without the need to be able to think of an orange as an object. To describe an orange you would need to say [[the thing you are eating when you are][eating an orange]], and it would feel in between concepts in the same way that in our common ontology "eating an orange" feels like the idea between "eating" and "orange".

I'm not sure if this kind of ontology:

  • Doesn't exist because separating verbs from nouns is a natural abstraction that any agent modeling any world would converge to.
  • Does exist in some culture with some language I've never heard of.
  • Does exist in some subset of the population in a similar way to how some people have aphantasia.
  • Could theoretically exist, but doesn't by fluke.
  • Doesn't exist because it is not internally consistent in some other way.

I suspect it's the first, but it doesn't seem inescapably true, and now I'm wondering if this is a worthwhile thought experiment, or the sort of thing I'm thinking because I'm too sleepy. Alas :-p

I have a longer draft on this, but my current take is the high level answer to the question is similar for crabs and ontologies (&more).

Convergent evolution usually happens because of similar selection pressures + some deeper contingencies.

Looking at the selection pressures for ontologies and abstractions, there are a bunch of pressures which are fairly universal, and in various ways apply to humans, AIs, animals...

For example: Negentropy is costly => flipping fewer bits and storing fewer bits is selected for; consequences include

  • part of concepts; clustering is compression
  • discretization/quantization/coarse grainings; all is compression
  • ...
 
Intentional stance is to a decent extent ~compression algorithm assuming some systems can be decomposed into "goals" and "executor" (now the cat is chasing a mouse, now some other mouse). Yes this is again not the full explanation because it leads to a question why there are systems in the territory for which this works, but it is a step.

… and then usually the answer is “well, there’s a certain structure out in the world which people recognize as X, because recognizing it as X is convergently instrumental for a wide variety of goals”, so we want to characterize that structure. But that’s beyond the scope of this post; this post is about the question, not the answer.

I fear I'm being uncharitable here, but this all seems self-evident.

I'm thinking about The Cluster Structure of Thingspace. We could define "dog" to mean "vacuum or prime number or bar of soap or cheese", but that wouldn't be a useful definition. It'd be like picking a bunch of random points in thingspace and saying "there: that's a dog". Of course we wouldn't want to do that. Of course we'd want to draw the boundaries in such a way that captures related things.

I guess that only explains why we all pretty much draw the boundaries around similarity clusters though. It doesn't answer the question of why people tend to draw boundaries around the same similarity clusters. On first thought, I'm seeing various reasons:

  • For something like "food", I dunno, I guess it's just useful to draw the boundary around things that provide nutrients, taste reasonably good, are accessible, etc.
  • For something like "beauty", I guess it's a mix of genetic drives and cultural stuff. Like maybe there's something genetic about symmetrical faces, and something cultural about being attracted to fat vs thin people.
  • For something like "awful", I guess it's just convenient to go along with where everybody else is drawing the boundary for the sake of communication. Like, it originally meant "full of awe" and sounds like it means "full of awe", so "something really bad" seems like the wrong way to draw the boundaries, but since everyone else is already drawing them there, I may as well do so as well.

I get a strong sense that there's something deeper here that I'm missing, but I don't know what it is.

Maybe Zack_M_Davis' Where to Draw the Boundaries? points at a potential answer to your question? The reason I think it's relevant is basically this paragraph from that post, which builds upon Eliezer's post you referenced:

A standard technique for understanding why some objects belong in the same "category" is to (pretend that we can) visualize objects as existing in a very-high-dimensional configuration space, but this "Thingspace" isn't particularly well-defined: we want to map every property of an object to a dimension in our abstract space, but it's not clear how one would enumerate all possible "properties." But this isn't a major concern: we can form a space with whatever properties or variables we happen to be interested in. Different choices of properties correspond to different cross sections of the grander Thingspace. Excluding properties from a collection would result in a "thinner", lower-dimensional subspace of the space defined by the original collection of properties, which would in turn be a subspace of grander Thingspace, just as a line is a subspace of a plane, and a plane is a subspace of three-dimensional space.

Ilio, 9mo

then we’re faced with a surprising empirical fact: there’s a remarkable degree of convergence among which things people do-or-don’t model as agentic at which times. Humans yes, rocks no. (…) What caused the convergence?

In my view, because there exists what we call the « biological movement area ». Like color is subjective, but that does not prevent biological and environmental constraints from causing convergence of visions (ah ah).

We’re all Bayesian now, right?

That sounds most probably significant. 😉

Yet somehow, people seem to converge on models which aren’t anti-inductive.

That’s of course true for, say, physics. But what about political questions, like the blue and the red or the pro/anti vax/war/warming/etc? It seems to me that there are domains of knowledge that look as if anti-induction was a thing. And not just in social sciences: interpretation of QM, not even wrong vs strings, foundations of mathematics… and everything about cognition, including AI!

To wrap up, two gotchas to look out for.

Great points, thanks!

[...] but people in fact mean a whole slew of wildly different things when they talk about “consciousness”.

Just because you have a different name for a concept doesn't necessarily mean that the concept isn't instrumentally convergent. It might be that there is a whole set of concepts that different people label with the same word, or the same concept that people have different names for.

In one video Judea Pearl was talking about consciousness as a kind of self-model that an agent could have. This is completely different from defining consciousness as the presence of some subjective experience, i.e. a system is conscious if it's like something to be that system, as Sam Harris would put it.

This means that we can run into a situation where the name "Consciousness" is a pointer to both "A system has a model of itself" and "It's like something to be a system". But hypothetically Sam Harris might have the same concept of "A system has a model of itself" and Judea Pearl might have the concept of "It is like something to be this system", with each labeling them with different names. Or they might actually not have any labels at all.

I have noticed within myself very often that I create new concepts that I do not label, which is bad: later on I realize that the concept already exists and somebody has thought about it. Because I didn't label it, I had just a very vague, fuzzy object in my mind that I couldn't really put my finger on precisely, and therefore it was hard to think about. That kept me from cashing out insights that seem pretty obvious once I read about what other people have thought about that particular concept.

So, with concepts, there might be a lot more convergence than is visible at first glance, because of different labels.

Your proposal seems to be that people's subjectivities are converging because people are converging on some objective structure out in the world.

I agree that this seems important, but to me, this doesn't quite appear to be the whole story.

In particular, I am quite sympathetic to the Kantian notion of categories through which we can't help but perceive our experiences[1].

This seems very important to me for counterfactuals which are circular in the sense that we justify the use of counterfactuals via counterfactuals.

And I guess if you believe this for counterfactuals, then it seems like you probably should believe this for probabilities too because probabilities are built upon either counterfactuals or possibilities.

  1. That said, I am less dogmatic than Kant, in that I believe that it is possible to bootstrap up from our naive conceptions to less naive conceptions much as Einstein bootstrapped up from space and time to spacetime.

It seems to me that this post is about the question "whence the categories?"
