Followup to: Where to Draw the Boundaries?

Without denying the obvious similarities that motivated the initial categorization {salmon, guppies, sharks, dolphins, trout, ...}, there is more structure in the world: to maximize the probability your world-model assigns to your observations of dolphins, you need to take into consideration the many aspects of reality in which the grouping {monkeys, squirrels, dolphins, horses ...} makes more sense.

The old category might have been "good enough" for the purposes of the sailors of yore, but as humanity has learned more, as our model of Thingspace has expanded with more dimensions and more details, we can see the ways in which the original map failed to carve reality at the joints ...

So the one comes to you—a-gain—and says:

Hold on. In what sense did the original map fail to carve reality at the joints? You don't deny the obvious similarities between dolphins and fish—between dolphins and other fish. That's a cluster in configuration space! The observation that dolphins are evolutionarily related to mammals may be an interesting fact that specialized professional evolutionary biologists care about for some inscrutable specialist reason. But I'm not a professional biologist. Choosing to define categories around evolutionary relatedness rather than macroscopic human-relevant features seems like an arbitrary æsthetic whim. Why should I care about phylogenetics, at all?

This one is going to take a few paragraphs.

Focusing on evolutionary relatedness is not an arbitrary æsthetic whim because evolution actually happened. Evolution isn't just a story that our Society's specialists happen to have chosen because they liked it; they chose it because it predicts what we see in the world. You can't choose a substantively different theory and make the same predictions about the real world. (At most, you'd end up with an isomorphic theory with additional epiphenomenal elements, asserting that an allele rose in frequency "because" the angels willed it, without an account of why the angels' will happens to line up with what would have transpired if there were no angels.) Similarly, category definitions represent hidden probabilistic inferences; you can't "redraw" the "boundaries" of the categories your mind actually uses and still make the same predictions about the real world. Accordingly, it shouldn't be surprising that our knowledge of evolution turns out to have implications for how we should categorize organisms—not as an æsthetic choice, but for structural reasons that can be understood mechanistically.

One element of the evolutionary worldview is a "continuity" postulate: all else being equal, creatures that are more closely related are more similar in general. Creationists sometimes try to discredit evolution by ridiculing the absurdity of the idea that a monkey could give birth to a person. But actually, evolutionary biologists agree on the absurdity of that specific scenario. Monkeys don't suddenly give birth to humans in a single generation; if they did, that would utterly falsify our understanding of evolution! Rather, monkeys and humans had a common ancestor forty million years ago, with the separate lines of descent leading to present-day monkeys and present-day humans each accumulating their own differences one mutation at a time.

The fact that evolution persists information in the genome creates a regularity in the world that can be exploited by cognitive algorithms that know about phylogeny. In terms of the formalization of causality with directed acyclic graphs pioneered by Judea Pearl and others, an organism's genome is at the root of the causal graph underlying all other features of an organism:

In the language of causal graphs, conditioning on the "dolphin DNA" node in the diagram d-separates the paths between the "blowhole" and "flippers" nodes that run through the "dolphin DNA" node. That means that—assuming there aren't any other paths between "blowhole" and "flippers" that don't go through "dolphin DNA"—"blowhole" and "flippers" become conditionally independent given "dolphin DNA": when I see a creature with a blowhole, that makes me more likely to think it's a dolphin, which makes me more likely to think it has flippers, but given that I already know something is a dolphin, learning more about its flippers doesn't change my predictions about its blowhole.
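The conditional-independence claim can be checked by brute force on a toy version of the graph. The following is a minimal sketch, assuming a three-node network (blowhole ← dolphin DNA → flippers) with no other paths; the probabilities and the names `joint` and `prob` are made up for illustration and are not measurements of anything.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_DOLPHIN = 0.1                          # prior P(creature has dolphin DNA)
P_BLOWHOLE = {True: 0.99, False: 0.01}   # P(blowhole | dolphin DNA?)
P_FLIPPERS = {True: 0.98, False: 0.05}   # P(flippers | dolphin DNA?)


def joint(dolphin, blowhole, flippers):
    """Joint probability for the graph blowhole <- dolphin DNA -> flippers."""
    p = P_DOLPHIN if dolphin else 1 - P_DOLPHIN
    p *= P_BLOWHOLE[dolphin] if blowhole else 1 - P_BLOWHOLE[dolphin]
    p *= P_FLIPPERS[dolphin] if flippers else 1 - P_FLIPPERS[dolphin]
    return p


def prob(query, given=None):
    """P(query | given) by summing the joint over all eight possible worlds."""
    given = given or {}
    names = ("dolphin", "blowhole", "flippers")
    numerator = denominator = 0.0
    for values in product([True, False], repeat=3):
        world = dict(zip(names, values))
        if any(world[k] != v for k, v in given.items()):
            continue  # world is inconsistent with the evidence
        p = joint(**world)
        denominator += p
        if all(world[k] == v for k, v in query.items()):
            numerator += p
    return numerator / denominator


# Without knowing the species, flippers are evidence about blowholes.
blowhole_prior = prob({"blowhole": True})
blowhole_given_flippers = prob({"blowhole": True}, {"flippers": True})

# Once the DNA node is known, flippers carry no further information.
blowhole_given_dna = prob({"blowhole": True}, {"dolphin": True})
blowhole_given_dna_and_flippers = prob(
    {"blowhole": True}, {"dolphin": True, "flippers": True}
)

print(blowhole_prior, blowhole_given_flippers)
print(blowhole_given_dna, blowhole_given_dna_and_flippers)
```

The first pair of numbers differs and the second pair agrees (up to floating-point rounding), which is all that "d-separated given dolphin DNA" means in this tiny model. A real organism has vastly more nodes, and the independence only holds to the extent that there really are no side paths.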

But conditional independence assertions of this kind are exactly what makes "categorizing" a useful AI technique in the first place. It's often helpful to visualize this by claiming that entities in the same category belong to a cluster in some configuration space, but this handy visual metaphor is lacking in rigor and well-definedness.

What space? What do the dimensions of this space represent? "Features"? But there are no pre-existing "features" in the world. Assuming the existence of a "space" up front is punting on most of the actual AI challenge. "There's conditional independence structure in the causal graph" is a meaningfully deeper explanation than "There's a cluster in configuration space", because conditional independence is what makes it possible to construct a "space" such that there are clusters. (Though this isn't a complete explanation: we still need to figure out where the "variables" in the causal graph come from.)

Going beyond the configuration space metaphor is important because it lets us understand how we can learn new things about dolphins that we don't already know. Dolphins are complicated! Dolphins are complicated in a very specific way. Dolphins are fragile: the shortest computer program that simulates a dolphin requires many bits of initial information, and if you changed some of the bits, you wouldn't have a dolphin anymore. Complex functional adaptations are universal within a species because each beneficial allele has to reach fixation before there can be selection pressure for the next incremental improvement. That's why it's possible to claim that there are 206 bones in "the" human skeleton, even if most humans haven't had their bones counted. I haven't been able to find a citation on how many bones dolphins have, but I'm confident that it's the same number for all or nearly-all members of a particular dolphin species.

But "number of bones" wasn't one of the dimensions of the "space" that we originally noticed the dolphin cluster in! That's what the "carving reality at the joints" metaphor means: genetic relatedness is an underlying generator of similarities, that includes the "finned swimmy animals" properties that dolphins and fish have in common, but also includes many more high-dimensional details: how dolphins are warm-blooded, how dolphins have eyelids, the way female dolphins nurse their live-born young, the way male dolphins sometimes gang-rape female dolphins, the way dolphins sleep with only half their brain at a time, the specific bones in the (the!) dolphin skeleton (however many there turn out to be), the way dolphins swim in a circle to trick fish into jumping and being eaten, &c.

In contrast, "finned swimmy animals" is an intrinsically less cohesive subject matter: there are similarities between them due to convergent evolution to the aquatic habitat, and it probably makes sense to want a short word or phrase (perhaps, "sea creatures") to describe those similarities in contexts where only those similarities are relevant.

But that category "falls apart" very quickly as you consider more and more aspects of the creatures: the finned-swimmy-animals-with-gills are systematically different from the finned-swimmy-animals-with-a-blowhole, in more ways than just the "respiratory organ" feature that I'm using in this sentence to point to the two groups.

A "definition" is just a description that helps someone else pick out "the same" natural abstraction in their own world-model: you can't pack everything there is to know about dolphins into the definition of the word "dolphin", in part because we don't know everything there is to know about dolphins as an empirical regularity in the real world. The "finned swimmy animals" category is less useful to the extent that it fails to compress more information than is contained in its definition. Blood is thicker than water (that is, the similarities induced by shared blood cluster in a "thicker" (higher-dimensional) configuration space than the similarities induced by living in the water).
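One way to make "compressing more information than is contained in the definition" concrete is to ask how many bits a category label tells you about features that were not used to define it. The sketch below is a toy under loud assumptions: eight hand-picked animals, a uniform distribution over them, and a coarse yes/no feature coding that I made up for illustration (real biology has exceptions, such as partially warm-bodied sharks). The names `ANIMALS`, `HELD_OUT`, and `information_about` are hypothetical.

```python
from collections import Counter
from math import log2

# Toy feature table (illustrative simplification, not a dataset).
ANIMALS = {
    "salmon":   {"finned_swimmer": 1, "mammal": 0, "warm_blooded": 0, "nurses_young": 0, "gills": 1},
    "trout":    {"finned_swimmer": 1, "mammal": 0, "warm_blooded": 0, "nurses_young": 0, "gills": 1},
    "shark":    {"finned_swimmer": 1, "mammal": 0, "warm_blooded": 0, "nurses_young": 0, "gills": 1},
    "dolphin":  {"finned_swimmer": 1, "mammal": 1, "warm_blooded": 1, "nurses_young": 1, "gills": 0},
    "whale":    {"finned_swimmer": 1, "mammal": 1, "warm_blooded": 1, "nurses_young": 1, "gills": 0},
    "horse":    {"finned_swimmer": 0, "mammal": 1, "warm_blooded": 1, "nurses_young": 1, "gills": 0},
    "monkey":   {"finned_swimmer": 0, "mammal": 1, "warm_blooded": 1, "nurses_young": 1, "gills": 0},
    "squirrel": {"finned_swimmer": 0, "mammal": 1, "warm_blooded": 1, "nurses_young": 1, "gills": 0},
}

# Features that neither label is defined by.
HELD_OUT = ("warm_blooded", "nurses_young", "gills")


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length sequences."""
    n = len(xs)
    count_xy = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def information_about(label, feature):
    """Bits that knowing `label` gives about `feature` across the toy animals."""
    labels = [traits[label] for traits in ANIMALS.values()]
    values = [traits[feature] for traits in ANIMALS.values()]
    return mutual_information(labels, values)


for feature in HELD_OUT:
    print(
        feature,
        round(information_about("finned_swimmer", feature), 3),
        round(information_about("mammal", feature), 3),
    )
```

In this toy table the ancestry-based label says more about every held-out feature than the "finned swimmer" label does. The numbers depend entirely on which animals and features I chose to include, so the sketch shows what the claim means, not that it is true.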

The one replies:

But what if I don't need to compress any more information than "finned swimmy animals"? If I'm watching a nature documentary, I don't think I'm being done any favors by having word-structures that group lungfish and lamprey while excluding sea turtles. In general, the concepts I find useful respond to my immediate needs. I care more about "would be at home atop a fruit pizza" rather than "everything anatomically analogous to an apple". When a child points at a whale and says "look, a fish", and you're like "haha no, its tail flaps horizontally and its grandma had hair", who's in the wrong here?

In some sense, sure: ignorance isn't better than knowledge if you don't care about knowing things. If you live in human civilization and don't need to carve up the world of aquatic life in much detail—if your use-case for thinking about aquatic animals is watching a nature documentary (for entertainment??) rather than living and working with them every day, then you might think the deeper causal structure isn't buying you anything. And for you and your extremely limited use-case, maybe it isn't. But you would likely change your mind if you were a veterinarian or a zoologist who actually had skin in the game in robustly describing this part of the world.

When people have skin in the game, they care about the underlying mechanisms and want short codewords for them, because the underlying mechanisms sometimes have decision-relevant implications. If you hurt your ankle while running, you would probably be interested to know whether it was a sprain or a stress fracture because that affects your decisions about how to recover. You wouldn't say, "Well, all I know is that my ankle hurts—that's all a child would know—so I'm going to call it a hurtankle; I don't care about anatomy."

You may not be intrinsically curious about anatomy, but even if the only thing you care about is relief from pain and recovering your mobility, you still benefit from living in a Society whose shared ontology distinguishes sprains and stress fractures as being different things in the territory, even if they compress to the same point in your map of how much your ankle hurts right now. And you probably also benefit from living in a Society that can stabilize a shared map of living things based on the facts of evolutionary history, which we can all agree on in the limit of good science, unlike the vagaries of what I personally think tastes good on pizza.

When you think about it, it makes sense that our shared language ends up being optimized for robustly describing reality, rather than catering to the ignorance of people who don't have reasons to care about whether a particular distinction is actually robust. Personally, I confess I don't know the difference between alligators and crocodiles, and I don't particularly need to know: I'm not likely to encounter either outside of a zoo or a nature documentary. But precisely because I don't need to know, you don't see me demanding that the rest of the world redefine one of these words as a hypernym that includes both. The people who write encyclopedias seem to think there's a difference, and since they probably know what they're doing, it makes sense for their opinion to have more weight on English language common usage than mine—at least until I were to start regularly ending up in situations where I need to point to an alligator-or-crocodile in my environment and I still didn't notice any differences.

Some animals that I do see in my local environment sometimes are cats and dogs, because people often keep them as pets. I benefit from having separate words (in my map) for cats and dogs, because I can see that cats and dogs are actually different (in the territory). If my pen pal from a faraway land that had no cats were to visit America and encounter a cat for the first time, he might remark, "What a strange dog!" If I were to reply, "Actually, that's a cat; they're not the same thing as dogs", it would be pretty obnoxious if he were to snap back, "What kind of definitional gymnastics is this? It's a four-legged furry animal with a tail! As far as I'm concerned, it's a dog."

It's true that dogs and cats are both four-legged furry animals with tails. If you had never seen a cat before, or you didn't spend much time around four-legged furry tailed animals at all, it might not be immediately obvious why someone might want to allocate two words for these subcategories, or why anyone might oppose just using dog to refer to the supercategory. And yet there's some sense in which my countrymen who think cats and dogs are different things know what they're doing. My "Actually, that's a cat" claim represented an attempt to convey information about the statistical structure of creatures in the real world, and my foreign friend's insistence that he can define a word any way he wants—to suit his ignorance, to avoid challenges to his current ontology—functions to shut down that transfer of information.

But if you don't know what a better ontology can buy you—if you don't know that there are mathematical laws governing the use of categories in a rational mind—you may not know what you're missing. As part of a review of a book on post-traumatic stress disorder, psychiatrist Scott Alexander casually mentions the American Psychiatric Association's "philosophical commitment to categorizing by symptoms rather than cause": "[w]hen the APA decides not to [recognize developmental trauma disorder], they're not necessarily rejecting the seriousness of child abuse, only saying it's not the kind of thing they build their categories around."

In a sane world, this would be utterly discrediting to the APA. The cognitive function of categories is to group relevantly similar things together in order to make similar predictions and decisions about them. But for the decisions involved in treating a condition, causes are of supreme relevance! Medical doctors understand this: we consider bacterial and viral infections to be different categories of disease even when they cause similar symptoms, because antibiotics can treat the former but not the latter. No matter what words are used to describe it, at some point your decision algorithm needs to categorize by cause in order to compute the correct treatment: for example, to give antibiotics to the patients with bacterial diseases and antivirals to the patients with viral diseases. If the authoritative body of professional psychiatrists has a "philosophical commitment" against this, that means we don't have a science of psychiatry.
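The point that a treatment policy has to condition on cause at some step can be shown with a deliberately tiny model. This is a sketch under stated assumptions, not medicine: four hypothetical patients, two causes that each produce the same symptoms, and an idealized rule that each drug cures exactly one cause. The names `CURES`, `PATIENTS`, `cured`, `by_cause`, and `best_symptom_only_score` are invented for the example.

```python
from itertools import product

# Idealized assumption: each drug works on exactly one cause.
CURES = {"antibiotic": "bacterial", "antiviral": "viral"}

# Hypothetical patients as (underlying cause, presenting symptom).
PATIENTS = [
    ("bacterial", "fever"),
    ("viral", "fever"),
    ("bacterial", "cough"),
    ("viral", "cough"),
]


def cured(policy):
    """Count patients whose assigned drug matches their underlying cause."""
    return sum(CURES[policy(cause, symptom)] == cause for cause, symptom in PATIENTS)


def by_cause(cause, symptom):
    """A policy that is allowed to look at the cause."""
    return "antibiotic" if cause == "bacterial" else "antiviral"


def best_symptom_only_score():
    """Best result over every policy that maps symptom -> drug and ignores cause."""
    symptoms = sorted({symptom for _, symptom in PATIENTS})
    best = 0
    for drugs in product(CURES, repeat=len(symptoms)):
        table = dict(zip(symptoms, drugs))
        best = max(best, cured(lambda cause, symptom: table[symptom]))
    return best


print("by cause:", cured(by_cause), "of", len(PATIENTS))
print("best symptom-only policy:", best_symptom_only_score(), "of", len(PATIENTS))
```

Because symptoms here carry no information about cause, no symptom-indexed lookup table can beat a coin flip, whatever the categories are called. In practice symptoms do carry some information about cause, so the gap would be smaller than in this worst-case toy.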

In short, if you care about making high-quality decisions, mechanisms matter and causality matters, and mechanisms and causality aren't necessarily pinned down by whatever particular high-level surface analogy happens to seem most salient to a particular human.

The one replies:

Okay, you've convinced me that phylogenetics is—potentially—of more than just specialist interest. But "fish" are a paraphyletic category: descended from a common ancestor, but not including all the descendant groups—in this case, excluding the tetrapods (amphibians, reptiles, mammals, birds, &c.). If you've decided that you want to use phylogeny as the basis for your definitions, shouldn't you have the courage of your convictions and only admit monophyletic clades that include all descendants of a common ancestor?

But it's not that we've "decided" that we "want" to define animal words based on phylogeny. Definitions are uninteresting; you can't change reality by choosing a different definition! When we find structure in the distribution of animals in the world, and we want to come up with a "definition" of a category in order to efficiently point to the structure to someone who doesn't already know what the words in our language refer to, we're likely to end up talking about phylogenetics as a convenience, because the creatures that are actually all-around similar are actually related to each other for non-accidental reasons. But there is no principle that it would be hypocritical to betray, that definitions need to be monophyletic clades.

It's true that paraphyletic groups like fish are evolutionary non-events: there's no inherited feature that all fish share, that isn't also shared by the tetrapods. That doesn't mean we somehow can't or shouldn't talk about fish! Paraphyletic categories—descendants of a common ancestor, but excluding one or more monophyletic groups—can make sense when the excluded groups have picked up some salient features not shared by the other "branches" of the family. Tetrapods picked up a lot of adaptations specific to living on land; it's not crazy to want to talk about their cousins that didn't do that, even if that means that some fish are more recently related to some tetrapods than they are to some other fish.

Noticing the relevance of evolutionary relatedness to optimal categorization doesn't mean being slavishly committed to taking "years since last common ancestor" as our only criterion for which creatures are relevantly similar. "Years since last common ancestor" correlates with overall similarity, all other things being equal, but sometimes not all other things are equal, and people who aren't committed to the fallacy that words need to have a simple definition can take the other things into account.

If someone handed you a phylogenetic tree diagram of the development of life on some alien planet, and the diagram was only labeled with years and species names, without any other information about these alien creatures, you wouldn't have enough information to "carve it at the joints". You wouldn't spontaneously invent a paraphyletic grouping—but you also wouldn't know which monophyletic groups are most significant.

In contrast, when classifying life on Earth, we're not in the position of making arbitrary cuts on an unlabeled tree diagram; rather, it's only after thousands of person-years of studying the natural world that people were able to infer things about evolutionary history and discover the correct diagram.

It shouldn't be that surprising that the distinctions we notice in the natural world are tied to evolutionary history, yet don't always correspond to monophyletic clades. The continuity postulate in the evolutionary worldview imposes the desideratum that good categories should at least be a connected set on "phylogenetic space", not that we should never want to talk about "this clade, except for these few sub-clades that picked up a lot of important differences" as a category of interest—especially when talking about present-day creatures. (We talk about "last common ancestors", but no one has seen such creatures that lived millions of years ago; everything but the very leaves of the phylogenetic tree are inferred, not observed.)

The claim that dolphins shouldn't be considered "fish" because the alleged "courage of our convictions" should make us disdain paraphyletic categories only makes sense as an attempted reductio ad absurdum, not as a consistent argument on its own terms: putting dolphins and fish together would be polyphyletic! That's even worse! But as has just been explained, the reductio fails because the alleged principle being allegedly violated was never actually a principle of category formulation.

You know what else are paraphyletic taxa? Monkeys (excludes apes, even though the common ancestor of monkeys and apes was a monkey). Reptiles (excludes birds, even though the common ancestor of birds was a reptile). Protists (excludes animals, plants, and fungi, even though their common ancestor would have been a protist). Prokaryotes (excludes eukaryotes, even though the common ancestor of eukaryotes would have been a prokaryote). These are pretty commonsensical categories that it makes sense to have words for! But because of the continuity of evolution, it's not a coincidence that these commonsensical categories that people want words for ended up being connected sets in phylogenetic space.
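The "connected set in phylogenetic space" criterion can be turned into a small check on a tree. What follows is one possible operationalization, not a standard cladistics tool: the tree is a drastically simplified vertebrate phylogeny, the node names are made up, and a group is treated as a set of nodes that includes whichever inferred ancestors we judge to belong to it (that judgment is an assumption the code cannot make for you).

```python
# Child -> parent links for a toy, heavily pruned vertebrate tree.
# Internal node names are hypothetical labels for inferred ancestors.
PARENT = {
    "vertebrate_ancestor": None,
    "shark": "vertebrate_ancestor",
    "bony_fish_ancestor": "vertebrate_ancestor",
    "trout": "bony_fish_ancestor",
    "lobe_fin_ancestor": "bony_fish_ancestor",
    "lungfish": "lobe_fin_ancestor",
    "tetrapod_ancestor": "lobe_fin_ancestor",
    "lizard": "tetrapod_ancestor",
    "mammal_ancestor": "tetrapod_ancestor",
    "dolphin": "mammal_ancestor",
    "horse": "mammal_ancestor",
}

CHILDREN = {}
for child, parent in PARENT.items():
    if parent is not None:
        CHILDREN.setdefault(parent, []).append(child)


def clade(node):
    """Return `node` together with all of its descendants."""
    nodes = {node}
    for child in CHILDREN.get(node, []):
        nodes |= clade(child)
    return nodes


def classify(group):
    """Label a set of tree nodes as mono-, para-, or polyphyletic.

    A set of nodes in a rooted tree is connected exactly when one member
    has its parent outside the set; that member is the group's root.
    """
    group = set(group)
    roots = [node for node in group if PARENT[node] not in group]
    if len(roots) != 1:
        return "polyphyletic"   # disconnected: several separate origins
    if group == clade(roots[0]):
        return "monophyletic"   # an ancestor plus all of its descendants
    return "paraphyletic"       # connected, but with sub-clades carved out


TETRAPODS = clade("tetrapod_ancestor")
FISH = clade("vertebrate_ancestor") - TETRAPODS
FISH_PLUS_DOLPHIN = FISH | {"dolphin"}

for name, group in [
    ("tetrapods", TETRAPODS),
    ("fish", FISH),
    ("fish + dolphin", FISH_PLUS_DOLPHIN),
]:
    print(name, "->", classify(group))
```

On this toy tree, "fish" comes out connected but incomplete, while adding the dolphin leaf breaks connectedness because none of the dolphin's tetrapod ancestors are in the group. Note also that the lungfish sits closer to the tetrapods than to the shark, which is the "some fish are more recently related to some tetrapods" situation described earlier.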

The one replies:

Not all of them did, though! "Fish" used to just mean the swimmy animals: in the Bible, Jonah was swallowed by a "great fish", thought to be a whale. It was only after we figured out genealogy that some pedants decided that whales didn't count.

But the claim that the distinction between fish and cetaceans (dolphins and whales) was only recognized after their differing evolutionary histories were discovered is just false to historical fact. Aristotle, writing in the fourth century BCE, already distinguished cetaceans from fish ("Very extensive genera of animals, into which other subdivisions fall, are the following: one, of birds; one, of fishes; and another, of cetaceans"). Aristotle was not being a phylogenetics pedant, because Aristotle did not know about evolution! He actually noticed the differences!

The pattern generalizes. Some determined contrarians might be inclined to argue "bats are birds" (flappy flying animals) on the same grounds as "dolphins are fish" (flappy swimmy animals). But did you know the German word for bat is Fledermaus ("flutter mouse"), which dates back to fledarmūs in Old High German? Apparently, people way back in the tenth century or so (also long before evolution was understood) already thought bats were like a mammal-that-happened-to-fly rather than a bird-that-happened-to-be-furry.

Similarly, we recognize ostriches and penguins as birds on the basis of overall similarity, even though they don't fly (although we may sometimes qualify them as "flightless birds", in recognition of the fact that most birds fly). It would seem that "flappy flying animals" is not the common usage meaning of bird.

To be sure, convergent evolution is a thing, such that sometimes we might want short codewords that point to the cluster-structure-produced-by-convergent-evolution rather than the conditional-independence-structure-produced-by-connectedness-in-phylogenetic-space—trees, and possibly crabs, are a case in point. But it's important to notice the difference—to see through to the inferences your concepts are buying you—and what gets lost when you try to reason in a domain where your concept falls apart.

The power to define concepts is the power to delimit thought, to determine what kinds of inferences are easily representable. Finding the right concepts to explain and control the world we see is a fundamentally empirical challenge, a scientific challenge—to see the difference between things that seem similar and to see the similarities between things which seem different.

But although the quest is an empirical one—something that can only be achieved by studying what's out there, not just by writing blog posts about philosophy—it turns out that a little bit of philosophy is necessary to ground the rules of the investigation. Not much. Just the basics. The map–territory distinction. Probability, clustering. Conditional independence.

Maybe someday it could be possible to have a real science of psychiatry that reflects the actual structure of the mind, instead of doing the equivalent of lumping sprains and stress fractures together as hurtankles. Maybe even greater achievements are possible. Personally, I'm not optimistic about humanity's prospects.

I'm sure of one thing, though. If there is a better world out there, a way to unlock the secrets of the universe and wield them in the service of our values, it's only possible if we stop playing nitwit games and admit that dolphins don't belong on the fish list.

(Thanks to Tailcalled for the "root of the causal graph" observation and John S. Wentworth for explaining the importance of conditional independence.)

26 comments

Sometimes other people are going to care about different regularities from what you care about.

If I am in a botany laboratory, and a botanist instructs me to 'put the fruit samples in Refrigerator 1 and the vegetable samples in Refrigerator 2', I will...well, I'll ask for clarification first, but if I can't do that I will put tomatoes in with the fruit samples.  This is because botanists care about the regularities for which tomatoes are similar to other fruits (like being made of flesh and full of seeds).

If I am in a kitchen, and a cook instructs me to 'put the vegetables in the salad bowl and the fruits in the fruit salad bowl', I will put the tomatoes in with the vegetables.  This is because cooks care about the regularities for which tomatoes are similar to other vegetables (like being tasty in a salad and not in a dessert).

Both of these definitions of 'fruit' are valid, depending on context.  

If you show up in a kitchen and demand that the cooks stop putting tomatoes in salads, soups and savory sauces (like you do with vegetables) and instead put tomatoes in the fruit salad, make a tomato crumble, and a tomato ice cream sundae (like you do with fruits), because according to botany tomatoes are fruits, then you are a lunatic.

If you show up in a kitchen and demand that the cooks say 'put the vegetables plus the tomatoes in the salad bowl, and the fruits except for the tomatoes in the fruit salad bowl', you're at least not screwing up dinner, but you're being annoying by denying cooks access to a useful regularity that they care about.

(I actually agree with this. Sorry if it's confusing that the rhetorical emphasis of this post is addressing a different error.)

Would it be fair to say that the error this post is addressing is analogous to if the cook told the botanists, "No, tomatoes aren't fruits."?

I know the tomato is the traditional example for this, but in my own thinking I like to use the word "berry." Botanically, "berry" includes bananas, grapes, cucumbers, pumpkins, melons, and citrus fruits, but excludes raspberries, strawberries, and blackberries. I think it's a much more stark difference.

We have a concept of "carnivore" that is distinct from the order Carnivora.  Animals that eat meat are carnivores even if they're not carnivorans, and there are herbivorous carnivorans. Carnivores are not monophyletic, carnivorans are. 

We have a concept of "flying animals" that is not monophyletic (flight has evolved at least 4 times, gliding more). And as you note, there are flightless members of Aves.

Being nocturnal or diurnal is a trait that has evolved many times that we have a name for because sometimes we care more about when an animal is active than what it descended from.

Having a polyphyletic concept of "swimming animal" doesn't seem unreasonable to me, and "fish" doesn't seem like a better or worse set of phonemes for it than any other. It will still have confusing edge cases (proto-whales? hippos? moose?) because biology is messy and gross, and the most useful definition will depend on the situation, and sometimes swimming will not be the only characteristic you care about, but it's not unreasonable for the joint of reality you care about to sometimes be "100% aquatic with a body well adapted to it".

EDIT: my comment describes polyphyletic groups where the OP calls fish paraphyletic, which I missed because I found OP very hard to read and mistakenly remembered the common definition of fish to be polyphyletic. I'm not sure how this changes the argument, but they plausibly should be treated quite differently, and if I want to argue they shouldn't I need to make that case. 


This is addressed as a side-note in the paragraph

To be sure, convergent evolution is a thing, such that sometimes we might want short codewords that point to the cluster-structure-produced-by-convergent-evolution rather than the conditional-independence-structure-produced-by-connectedness-in-phylogenetic-space—trees, and possibly crabs, are a case in point. But it's important to notice the difference—to see through to the inferences your concepts are buying you—and what gets lost when you try to reason in a domain where your concept falls apart.

Zack, what are you claiming is lost? The straw-anti-phylogenist who headlines your post says:

Choosing to define categories around evolutionary relatedness rather than macroscopic human-relevant features seems like an arbitrary æsthetic whim.

This seems genuinely straw, relative to the simple position: "[Swimming-adapted animals] is a useful, joint-carving concept. [The clade containing all swimmers, minus the clade of landlubbers] is a useful, joint-carving concept. Sometimes one or the other is most suitable. If someone tells me I'm supposed to use one or the other, when the alternative is more suitable, based on some simplistic policy of using words that are joint-carving in one way but not the other (rather than arguing suitability), they're just being silly." Each of these concepts compresses cluster-structure in thingspace. You can't explain why dolphins and fish both propel themselves by undulating in a certain way by reference to phylogeny; you have to talk about adaptation to the vortical behavior of water. You can't explain why dolphins breathe air by reference to swimming-adaptedness; you have to talk about phylogenetic relatedness to air-breathers.

A potential crux:

genetic relatedness is an underlying generator of similarities

I would also say:

niche similarity is an underlying generator of similarities

but I'm not sure you'd admit that, since the niche shows up in the causal graph in at best a cryptic way.

Certainly, there are lots and lots of different aspects in which things in the world can be similar, and we want our language to have lots and lots of concepts to describe those similarities in the contexts that they happen to be relevant in; if someone were to claim that carnivore shouldn't even be a word, that would be crazy, and that's not what I'm trying to do.

Rather, I'm trying to nail down a sense in which some concepts are more "robust" than others—more useful across a wide variety of situations and contexts. If someone asked you, "Do you have any pets? If so, what kind?", and your pet was the animal depicted in the Wikimedia Commons file Felis_silvestris_catus_lying_on_rice_straw.jpg, you might reply, "Yes, I have a cat." But you probably wouldn't say, "Yes, I have a carnivore", or "Yes, I have an endotherm." You wouldn't even say "Yes, I have a furry-four-legged-tailed-mammal"—a concept which includes dogs, but which English doesn't even offer a word for! But all of those statements are true. What's going on here? Is it just an arbitrary cultural convention that people talk about owning "cats" instead of endotherms or furry-four-legged-tailed-mammals—a convention that could have just as easily gone the other way?

I argue that it's mostly not arbitrary. (You could say that the convention has a large basin of attraction.) Cat-ness is causally upstream of carnivory and four-legged-ness and furriness and warm-bloodedness and nocturnalness and whiskers and meowing and lots and lots of other specific details, including details that I don't personally know, and details that I "know" in the sense that my brain can use the information in some contexts, but whose implementation details I don't have fine-grained introspective access to: it's a lot easier to say "That's a cat" when you see one (and be correct), than it is to describe in words exactly what your brain is doing when you identify it as a cat and not a dog or a weasel.

(The relevant Sequences post is "Mutual Information, and Density in Thingspace".)

So while it's true that the category of interest depends on context and the most useful definition can depend on the situation, concepts definable in terms of ancestry (but not even necessarily monophyletic groups) are so robustly useful across so many situations, that it makes sense that English has short words for cat and fish and monkey and prokaryote—if these words didn't already exist, people would quickly reinvent them—and we don't suffer much from "sea animals" being a composition of two words. I think common usage is actually doing something right here, and people like Scott Alexander ("if he wants to define behemah as four-legged-land-dwellers that's his right, and no better or worse than your definition of 'creatures in a certain part of the phylogenetic tree'") or Nate Soares ("The definitional gymnastics required to believe that dolphins aren't fish are staggering") who claim it's arbitrary are doing something wrong.

I don't think Scott is claiming it's arbitrary, I think he's claiming it's subjective, which is to say instrumental. As Eliezer kept pointing out in the morality debates, subjective things are objective if you close over the observer - human (ie. specific humans') morality is subjective, but not arbitrary, and certainly not unknowable.

But also I don't think that phylo categorization is stronger per se than niche categorization in predicting animal behavior, especially when it comes to relatively mutable properties like food consumption. Behavior, body shape etc are downstream of genes, but genes are cyclical with niche. And a lot of animals select their food opportunistically.

Phylo reveals information that niche doesn't. But niche also reveals information that is much harder to predict from phylo. I think Scott's objection goes against the absolutizing claim that "phylo is all you need."

"Swimming animals" vs. "fish" seems like a fairly neutral term choice, so it's somewhat boring as an example. It's a much bigger deal to choose between terms where one option has negative connotations and the other positive (or even just very different connotations!). Cf. various political debates over terms that shall not be named here.

No matter what words are used to describe it, at some point your decision algorithm needs to categorize by cause in order to compute the correct treatment: for example, to give antibiotics to the patients with bacterial diseases and antivirals to the patients with viral diseases. If the authoritative body of professional psychiatrists has a "philosophical commitment" against this, that means we don't have a science of psychiatry.

This is overstating your evidence. Categorizing by cause in order to compute the correct treatment is only helpful (in treatment) if treatments differ by cause. To some extent, that's definitely true. If someone experiences sadness, you want to treat a short term sadness caused by an event (eg. a loved one dying) separately from a long term sadness caused by a hormone imbalance (eg. generic long term depression). The APA does distinguish these - according to the DSM, you don't diagnose sadness as depression if it is short term and caused by a significant event. However, I'm not convinced that it applies for all issues.

Imagine Sisyphus, whose job is to keep a boulder balanced at the top of a flat-topped hill. If a wind comes and pushes the boulder off the peak, he must push the boulder back up. If a god comes and knocks the boulder off, he must push the boulder back up. The treatment is the same no matter the cause. Even in medicine, a broken bone is treated the same whether it is caused by falling out of a tree or getting hit by a rock. If I'm a doctor and I know that you broke your bone, knowing the cause isn't helpful for me to resolve it. Categorizing by cause is only useful to the degree that cause informs treatment more than symptoms do.
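To make that last criterion concrete, here is a minimal sketch. The lookup table and the function name are hypothetical, and "set the bone" is just a placeholder label; the point is only that cause is worth categorizing by when, for a given symptom, different causes map to different treatments.

```python
# Hypothetical lookup table: (symptom, cause) -> treatment.
# The entries are illustrative placeholders, not medical guidance.
TREATMENT = {
    ("infection", "bacterial"): "antibiotics",
    ("infection", "viral"): "antivirals",
    ("broken bone", "fell from tree"): "set the bone",
    ("broken bone", "hit by rock"): "set the bone",
}

def cause_informs_treatment(symptom):
    """True if, for this symptom, different causes call for different treatments."""
    treatments = {t for (s, _cause), t in TREATMENT.items() if s == symptom}
    return len(treatments) > 1

print(cause_informs_treatment("infection"))    # True
print(cause_informs_treatment("broken bone"))  # False
```

Under these assumptions, the disagreement is empirical: which rows a given psychiatric condition actually resembles.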

The key is to be able to distinguish illnesses by the relevant factors. When the APA decided to not recognize developmental trauma disorder, it was because they thought that knowing that a disorder was caused specifically by childhood trauma is not the primary piece of knowledge needed to help a patient. I'm admittedly not a psychiatrist, but that sounds very plausible to me.

One factor it also depends on is what kind of treatment one is attempting to do - curing vs palliative care. If someone has a serious mental health problem, then presumably there is some persistent difference(s) between them and other people, generating that problem.

(Even if the generation only happens in a liability sense, e.g. someone might have a temporary psychotic episode. The episode itself is not a persistent difference, but most people never have psychotic episodes at all. So the fact that someone does sometimes, and that those who do sometimes can get them repeatedly, suggests that they have some underlying thought disorder that makes them vulnerable to it.)

If you want to cure the problems, then your best bet is to treat the underlying difference, or at least have a treatment specialized to it so you're sure you get rid of all of the effects. But if you just want to perform palliative care, reducing the harm of symptoms while being unable to fully get rid of them, then focusing on what the symptoms are is a good bet.

It seems to me that a lot of psychiatry is essentially palliative.

Not sure why I got downvoted? :(

(FWIW your comment seems helpful to me, and in general LW voting seems noisy as a signal of value, like often helpful things are ignored or downvoted and low-value things are heavily upvoted.)

Oh it seems to be back up, when I posted it I was at -5, I thought I had done something very wrong.

I've tried to address your point about psychiatry in particular at

For the whale point, am I fairly interpreting your argument as saying that mammals are more similar, and more fundamentally similar, to each other, than swimmy-things? If so, consider a thought experiment. Swimmy-things are like each other because of convergent evolution. Presumably millions of years ago, the day after the separation of the whale and land-mammal lineages, proto-whales and proto-landmammals were extremely similar, and proto-whales and proto-fish were extremely dissimilar. Let's say in 99% of ways, whales were more like landmammals, and in 1% of ways, they were more like fish. Some convergent evolution takes place, we get to the present, and you're claiming that modern whales are still more like landmammals than fish - I have no interest in disputing that claim, let's say they're more like landmammals in 85% of ways, and fish in 15% of ways. Now fast-forward into the future, after a billion more years of convergent evolution, and imagine that whales have evolved to their new niche so well that they are more like fish in 99% of ways, and more like mammals in only 1% of ways. Are you still going to insist that blood is thicker than water and we need to judge them by their phylogenetic group, even though this gives almost no useful information and it's almost always better to judge them by their environmental affinities?

(I don't think this is an absurd hypothetical - I think "crabs" are in this situation right now)

And if not, at some point in the future, do they go from being obviously-mammals-you-are-not-allowed-to-argue-this to obviously-fish-you-are-not-allowed-to-argue-this in the space of a single day? Or might there be a very long period when they are more like mammals in some way, more like fish in others, and you're allowed to categorize them however you want based on which is more useful for you? If the latter, what makes you think we're not in that period right now?

Are you still going to insist that blood is thicker than water and we need to judge them by their phylogenetic group, even though this gives almost no useful information and it's almost always better to judge them by their environmental affinities?

No, of course not: we want categories that give useful information.

Did I fail as a writer by reaching for the cutesy title? (I guess I can't say I wasn't warned.) The actual text of the post—if you actually read all of the sentences in the post instead of just glancing at the title and skimming—is pretty explicit that I'm not proposing that phylogenetics is of fundamental philosophical importance ("it's not that we've 'decided' that we 'want' to define animal words based on phylogeny" [...] "we're likely to end up talking about phylogenetics as a [mere] convenience"), and I'm not saying no one should ever want to talk about convergent-evolution categories ("trees, and possibly crabs, are a case in point").

Rather, the reason I wrote this post is because I keep running into idiocies of the form "But who cares about evolutionary relatedness; I only care if it swims" (on the "Are dolphins fish?" question) or "But who cares about sex chromosomes; I only care about presentation and preferred pronouns" (on the "Are trans women women?" question). And I'm pointing out that even if genetics isn't immediately visible or salient if you glance at an organism from a distance, genetics being at the root of the causal graph buys you lots and lots of conditional independence assertions. (And if you don't know what conditional independence is and why it matters, then you have no business having positions about the philosophy of language. Read the Sequences.)

In the process of writing this up, I succumbed to the temptation of running with the proverbial title as a catchy concept-handle, but I would have hoped the audience of this website was sophisticated enough to understand that it was just a catchy concept-handle for the "root of the causal graph, means more conditional independence assertions, means the clustering happens in a much higher-dimensional space" thing, and not some kind of deranged assertion that "Blood Is Infinitely Thicker Than Water" or "Blood Is Thicker Than Water In All Possible Worlds, Including Thought Experiments Where Evolution Works Differently."
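As a rough illustration of the conditional-independence point, here is a toy simulation. It is a sketch under made-up assumptions, not anything from the post: a hidden root variable `is_cat` drives two traits, with invented probabilities (0.9 if a cat, 0.1 otherwise) and an arbitrary sample size. Across all animals the traits look strongly correlated; once you condition on the root, the correlation roughly vanishes.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is reproducible

samples = []
for _ in range(20_000):
    is_cat = rng.random() < 0.5           # hypothetical root of the causal graph
    p = 0.9 if is_cat else 0.1            # made-up trait probabilities
    has_whiskers = int(rng.random() < p)  # each trait depends only on the root
    meows = int(rng.random() < p)
    samples.append((is_cat, has_whiskers, meows))

# Marginal correlation: across all animals the two traits move together.
marginal = pearson([w for _, w, _ in samples], [m for _, _, m in samples])

# Conditional correlation: among cats only, one trait tells you almost
# nothing further about the other.
cats = [(w, m) for c, w, m in samples if c]
conditional = pearson([w for w, _ in cats], [m for _, m in cats])

print(f"marginal correlation:     {marginal:.2f}")
print(f"correlation given is_cat: {conditional:.2f}")
```

With many traits hanging off the same root, every pair behaves this way, which is one way to read "the clustering happens in a much higher-dimensional space."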

And if not, at some point in the future, do they go from being obviously-mammals-you-are-not-allowed-to-argue-this to obviously-fish-you-are-not-allowed-to-argue-this in the space of a single day?

No, of course not. Like many Less Wrong readers, I, too, am familiar with the Sorites paradox.

Or might there be a very long period when they are more like mammals in some way, more like fish in others, and you're allowed to categorize them however you want based on which is more useful for you?

You should categorize them whichever way is more useful for making predictions and decisions in whatever context you're operating in.

I don't want to summarize this as "however you want", because the question of which categories are most useful for making decisions and predictions is something people can be wrong about, and moreover, something people can be motivatedly wrong about, and—as we've seen—the general laws governing which categories are useful are something people can be motivatedly wrong about.

Suppose we have three vectors in ten-dimensional space, u = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0], v = [1, 1, 1, 2, 2, 2, 2, 2, 2, 2], and w = [3, 3, 3, 0, 0, 0, 0, 0, 0, 0]. A rich language with lots of vocabulary for describing the world will probably want both a word for a category that groups u and v together (because they match on x₁ through x₃), and a word for a category that groups u and w together (because they match on x₄ through x₁₀).

You'll probably end up talking about {u, w} more often than {u, v} because the former cluster is in a "thicker", higher-dimensional subspace, and it's not always obvious exactly which variables are "relevant." If I own the animal appearing in the photo located at the URL, and you ask if I have a pet, I'm probably going to say "Yes, I have a cat", rather than "Yes, I have a white-animal" or "Yes, I have an endotherm", even though the latter two statements are both true and it's good to have language for them. I think this kind of thing is why English ended up allocating a short codeword bird to a category that includes penguins and excludes bats, even though sparrows and bats have something very salient in common (flying) that penguins don't.
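A quick check of the three vectors makes the "thicker" claim concrete. The helper name `matching_dims` is just an illustrative choice; it lists the 1-based coordinates on which two vectors agree.

```python
u = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
v = [1, 1, 1, 2, 2, 2, 2, 2, 2, 2]
w = [3, 3, 3, 0, 0, 0, 0, 0, 0, 0]

def matching_dims(a, b):
    """Return the 1-based indices of the coordinates where a and b agree."""
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x == y]

print(matching_dims(u, v))  # [1, 2, 3]
print(matching_dims(u, w))  # [4, 5, 6, 7, 8, 9, 10]
```

So {u, v} agree on three coordinates while {u, w} agree on seven: a category word for {u, w} lets a listener infer more coordinates at once.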

Unfortunately, sometimes the same word/symbol ends up getting attached to two different categories—that's why dictionaries have a list of numbered definitions for a word. Dolphins aren't fish(1) (cold-blooded water-dwelling vertebrate with fins and gills), but they are fish(2) (water-dwelling animals more generally). Bananas aren't berries in the culinary sense, but they are berries in the botanical sense. Obviously, it would be retarded to get in a fight over whether bananas "are berries": it depends on what you mean by the word, and which meaning is relevant can usually be deduced from whether we're baking a pie or studying biology, and if it's not clear which meaning is intended, you can say something like "Oh, sorry, I meant the botanical sense" to clarify.

All this is utterly trivial for people who are actually trying to communicate—who want to be clear about which probabilistic model they're trying to point to. But I think it's important to articulate the underlying principles rather than saying "you're allowed to categorize them however you want", because sometimes people don't want to be clear: suppose someone were to say "Dolphins are fish because if they weren't, then I would be very sad and might kill myself. You ought to accept an unexpected aquatic mammal or two deep inside the conceptual boundaries of what would normally be considered fish if it'll save someone's life; there's no rule of rationality saying that you shouldn't, and there are plenty of rules of human decency saying that you should."

This emotional-blackmail maneuver is obviously a very different kind of thing than straightforwardly saying "Dolphins are fish in the second sense, but not the first sense; it's important to disambiguate which sense you mean when it's not clear from context." If some intellectually dishonest coward were to try to pass off the emotional-blackmail attempt as a serious philosophy argument, they would be laughed out of the room. Right?

I've tried to address your point about psychiatry in particular at

This is a good post, but it doesn't seem very consistent with the part of the Body Keeps the Score review that I was picking on? In response to people who say that "depression" probably isn't one thing, you reply that those research programs haven't panned out. That makes sense: if you don't know how to find a more precise model, a simple model that probably conflates some things might be the best you can do.

But that's an empirical point about what we (don't) know about "depression", not a philosophical one. I'm saying the APA's verdict on van der Kolk's "developmental trauma disorder" proposal should follow the same methodology as your reasoning about depression. If they're not persuaded by van der Kolk's evidence as a matter of science, fine; if van der Kolk's evidence doesn't matter because "that's not the kind of thing we build our categories around" as a matter of philosophy, that's insane.

I think "crabs" are in this situation right now

What reading or experience is this based off of? This sounds more like something someone would say if they've only heard the term "carcinisation" as a fun fact on the internet, rather than actually reading the Wikipedia page in detail.

For the "I would be sad for those boundaries" argument, I think there is reason not to naively just obey it, but it's not as hopeless as "being laughed out of the room".

I want to differentiate two different layers of this problem: A difference in what is a natural category to a person and how to manage and choose a shared meaning system.

I previously used an analogy of berries which are chemically different but morphologically very similar, i.e. they have nearly the same color but they are metabolically different. Then say there is a small fraction of humans that have a lot of trouble digesting one variant and a very easy time with the other variant, while most other humans have an equally easy time with all variants.

Now I want to evoke an analogy of nut allergy. If you have a nut allergy, you really, really want packaging to say whether there are nuts in the product or whether there is a possibility of traces. One could also imagine a human that did not in fact have a nut allergy but just disliked the taste of nuts. Both groups would be for lessening the amount of nuts used in cooking, and both groups would be for labeling products that do use nuts so they could avoid consuming them.

Someone could be "allergy sceptical" and think that people who advocate for nut labels are all or mostly people who just really dislike nut flavour. Then the "proof of the existence of allergies" would be to specify how ingesting nuts causes harm beyond mild annoyance, up to symptoms like death. And in a "typical" human this allergy and harm is not present.

A nut-allergic person forced to navigate a world that doesn't support nut detection would be in genuine danger. If they have anxiety or fear about it, that would be justified by the structure of their existence.

A claim of sadness caused by a particular concept could be likened to a report of how an allergic person's immune response has different properties, or to a claim about how racist institutions or behaviours cause elevated cortisol levels. To say that high cortisol levels are a "you problem" suggests that the proper reaction to hostility would be calmness rather than stress. So in those words it could also be read not as blackmail but as just informing of the (likely) consequences.

Where it would go full blackmail is if the primary aim is to have the desired societal policy in place and we fabricate the required story to get it passed. A nut-disliker might want to oversell the unpleasantness that they experience. There is the danger that if they lie that "it would kill me", then people could disbelieve genuine allergic people's reports.

So I think the "but me sad" argument is at its weakest on whether or not there are laws of rationality that are in fact violated. Feelings don't have any specific "free pass" to be freely chosen, and just as fear or anxiety can be either improper paranoia or proper threat detection, the feeling of sadness should have a basis. For dolphins I would need to genuinely ask "what do you mean, it makes you sad?" I would guess that the nut case could be established (and a nut-disliker would need to lie), although the detail level is a bit fuzzy to me as a non-allergic person. For the controversial area it is likely that it is hard to establish what happens and via what mechanism, and there is a lot of trusting the word and understanding of different parties (chemists and biologists don't have much disagreement about the properties of nuts or human guts).

As a non-allergic person I should not be annoyed that my food packaging is filled with annoying and senseless-to-me food information. And senseless-to-me would not be a very good basis for arguing senseless-for-food-packaging. It is a different issue whether we, for example, consider dogs to be food consumers and include them in our circle of care when making labeling decisions ("but it wasn't labeled toxic for dogs!" is not a valid excuse, at least at this point in time). What happens in the "general case" can be very relative.

>"But who cares about sex chromosomes; I only care about presentation and preferred pronouns" (on the "Are trans women women?" question)

So, like, depending on context, this might be that, say, you brought up some question for which chromosome-women is the obviously relevant concept, and then someone else tried to stop you from using "women" to mean chromosome-women. Which would be a fault of theirs. Or it might be that someone was trying to, say, work out how to treat people in ways that are good for them, and then you tried to stop them from using "women" to mean, the cluster of social practices around treating women as opposed to men. Which would perhaps be a fault of yours, or at least, I'd want to know why you were doing that, e.g., because you thought the linguistic territory shouldn't be somehow ceded.

"Dolphins are fish because if they weren't, then I would be very sad and might kill myself. You ought to accept an unexpected aquatic mammal or two deep inside the conceptual boundaries of what would normally be considered fish if it'll save someone's life; there's no rule of rationality saying that you shouldn't, and there are plenty of rules of human decency saying that you should."

So, insofar as the category "fish" was supposed to be about animal biology and not social practice, this is of course antirational. But, say it's the case that if dolphins are fish, then I would be very sad and might kill myself. This isn't purely about animal biology, it also involves my values, and so almost certainly involves other facts, e.g. how I and others will relate to tr... dolphins. Just like convergent evolution somehow "overlays" or "overlaps" phylogenetic descent, as a usefully-separately-conceived causal/explanatory factor, so too might these (additional, non-biological, partly social) concepts about dolphins overlay the (also useful, biological) concepts about dolphins.

Even taking everything else you write here for granted (which I wouldn’t normally, but let’s go with it for now), the question in your last sentence seems easy to answer: we’re not in that period right now, because right now, by construction, whales are more like landmammals in 85% of ways, so if you classify them as mammals, and then use that to make predictions about heretofore-unobserved traits, you will be right 85 / 15 = ~5.67 times more often than if you had classified them as fish.
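The same arithmetic can be run for all three stages of the thought experiment (99%, 85%, and 1% mammal-like). This is a small sketch; the function name `times_more_often_right` is invented here, and it simply takes the ratio used above.

```python
def times_more_often_right(pct_mammal_like):
    """Ratio of correct trait predictions: classify-as-mammal vs classify-as-fish."""
    return pct_mammal_like / (100 - pct_mammal_like)

for pct in (99, 85, 1):
    print(f"{pct}% mammal-like -> ratio {times_more_often_right(pct):.2f}")
```

The ratio crosses 1 exactly at 50%, which is where, on this crude measure, the fish classification would start to predict better.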

Structuring reality will have effects on how you can understand and manipulate it. This post has a tone where we have this very sophisticated and cool understanding, and everything that fails to be it must be dumb and ignorant. Sure, it could be important for a life form to understand and manipulate relatedness-based lines, but does it follow that other proficiencies are forbidden?

Medical science uses the word "infection" and doesn't abolish it for being ambiguous. If distinguishing between bacterial and viral infection is not overwhelming, why not adopt a similar mode for "fish"? Or why doesn't the fish argument apply to the abolishment of "infection"?

Taking a loan and coinage from toki pona, call "generic land animal" someli and "sea animal" kala. If the foreign guest thinks "that's a weird someli" and you insist "that's a cat, not a someli", it's not clear to me why it would be okay to impose the English standard over toki pona standards. Furthermore, if some memer comes along and says "that's not a dog but a doge", shouldn't you accept their technology of hidden additional knowledge and use the more sophisticated term? Why would the detail level of dogs dominate over other detail levels?

For conspiracy and esoteric thinking, one might be tempted to entertain the idea that Jonah was captured by an unidentified submerged object. None of the biology "cluster hitting" would apply particularly well then, but "big fish" in the "aquatic animal" sense would be much closer.

There are also a lot of idioms such as "there are plenty of fish in the sea", "easy like shooting fish in a barrel" and "like fish out of water" which make way more sense in the kala sense than as biological clusters: they operate in the adaptational sense, and only in a limited sense as biological clusters. A movie like "Free Willy" makes sense because a water container is enough to work as a prison for a whale.

When treating a disease, overdiagnosing can mean later-than-necessary treatment and access. For some differential diagnoses it is proper to stop asking questions; the differential doesn't need to be of infinite resolution. For a bacterial infection you don't necessarily need to know the bacterial strain, and prescribing a generic antibacterial might be highly proper.

Also, in a similar way to how bats are flyers, penguins are very close to being kala. I would imagine that for the purposes of tracing poison accumulation and transfer, dolphins and penguins would be very similarly situated. A study on how radioactive materials impact fish would probably include those for reality-carving reasons. Such delineations would survive "sailors of old"-style criticisms of being antiquated.

That the relatedness carvings are powerful doesn't mean one gets to be blind to the other connections or that one must always use relatedness-compatible concepts. Thus demanding overwhelming reverence can get harmful.

I find that understanding the ways in which dolphins are mammalian, is very much an informational challenge; I need to know a lot more biology to be able to use the category of “genetically mammal” than just plain old “fish.” Obviously, knowing more is better, but it is not obvious to me that forcing everybody to learn the more informed categories is societally optimal. I doubt anyone but specialists will ever find the information instrumentally useful, so we are just wasting some limited bandwidth to teach people stuff they don’t need to know. (On the other hand, people mostly waste their time with, e.g., learning about the differences between ten different Robin characters, so perhaps the endeavor is justified after all.)

The idea of "general education" is that it's good for ordinary people to learn lots of things that were discovered by specialists: partially because we value knowledge for its own sake, but also because it's hard to tell in advance what knowledge will end up being useful. In principle, you could reject the idea that general education is generally good, but if you're going to be consistent about what that entails, I don't think anyone who reads this website actually wants to go there. Do I really need to know that matter is made of "atoms", that have a "nucleus" composed of "uncharged" "neutrons" and "positively-charged" "protons", surrounded by "negatively-charged" "electrons"? When have I ever used any of this stuff to make money? Do I really need to know that the world is round? &c.

(Incidentally, Dick Grayson is the best Robin.)

My point is more about prioritization. English, math, programming and computer literacy, economics, basic home skills (cooking, trivial repairs, etc.), and possibly rationality (though the existence of “The Dark Valley of Rationality” makes me a bit hesitant on this one) are much better subjects for a “general info” curriculum.

PS: Knowing about elementary particles (without a mathematical model of them) is trivial. You can fit all such facts into a single year’s science curriculum. The things that take time to learn are calculations, e.g., finding the mass of some reagent after some chemical reaction.