Followup to: Where to Draw the Boundary?

Figuring where to cut reality in order to carve along the joints—figuring which things are similar to each other, which things are clustered together: this is the problem worthy of a rationalist. It is what people should be trying to do, when they set out in search of the floating essence of a word.

Once upon a time it was thought that the word "fish" included dolphins ...

The one comes to you and says:

The list: {salmon, guppies, sharks, dolphins, trout} is just a list—you can't say that a list is wrong. You draw category boundaries in specific ways to capture tradeoffs you care about: sailors in the ancient world wanted a word to describe the swimming finned creatures that they saw in the sea, which included salmon, guppies, sharks—and dolphins. That grouping may not be the one favored by modern evolutionary biologists, but an alternative categorization system is not an error, and borders are not objectively true or false. You're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning. So my definition of fish cannot possibly be 'wrong,' as you claim. I can define a word any way I want—in accordance with my values!

So, there is a legitimate complaint here. It's true that sailors in the ancient world had a legitimate reason to want a word in their language whose extension was {salmon, guppies, sharks, dolphins, ...}. (And modern scholars writing a translation for present-day English speakers might even translate that word as fish, because most members of that category are what we would call fish.) It indeed would not necessarily be helping the sailors to tell them that they need to exclude dolphins from the extension of that word, and instead include dolphins in the extension of their word for {monkeys, squirrels, horses ...}. Likewise, most modern biologists have little use for a word that groups dolphins and guppies together.

When rationalists say that definitions can be wrong, we don't mean that there's a unique category boundary that is the True floating essence of a word, and that all other possible boundaries are wrong. We mean that in order for a proposed category boundary to not be wrong, it needs to capture some statistical structure in reality, even if reality is surprisingly detailed and there can be more than one such structure.

The reason that the sailor's concept of water-dwelling animals isn't necessarily wrong (at least within a particular domain of application) is because dolphins and fish actually do have things in common due to convergent evolution, despite their differing ancestries. If we've been told that "dolphins" are water-dwellers, we can correctly predict that they're likely to have fins and a hydrodynamic shape, even if we've never seen a dolphin ourselves. On the other hand, if we predict that dolphins probably lay eggs because 97% of known fish species are oviparous, we'd get the wrong answer.

A standard technique for understanding why some objects belong in the same "category" is to (pretend that we can) visualize objects as existing in a very-high-dimensional configuration space, but this "Thingspace" isn't particularly well-defined: we want to map every property of an object to a dimension in our abstract space, but it's not clear how one would enumerate all possible "properties." But this isn't a major concern: we can form a space with whatever properties or variables we happen to be interested in. Different choices of properties correspond to different cross sections of the grander Thingspace. Excluding properties from a collection would result in a "thinner", lower-dimensional subspace of the space defined by the original collection of properties, which would in turn be a subspace of grander Thingspace, just as a line is a subspace of a plane, and a plane is a subspace of three-dimensional space.

Concerning dolphins: there would be a cluster of water-dwelling animals in the subspace of dimensions that water-dwelling animals are similar on, and a cluster of mammals in the subspace of dimensions that mammals are similar on, and dolphins would belong to both of them, just as the vector [1.1, 2.1, 9.1, 10.2] in the four-dimensional vector space ℝ⁴ is simultaneously close to [1, 2, 2, 1] in the subspace spanned by x₁ and x₂, and close to [8, 9, 9, 10] in the subspace spanned by x₃ and x₄.
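The vector arithmetic here is easy to check directly. A minimal sketch (plain Python; the points and subspaces are the ones from the example above):

```python
import math

def dist(u, v):
    # Euclidean distance between two equal-length coordinate lists
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

point = [1.1, 2.1, 9.1, 10.2]
a = [1, 2, 2, 1]    # close in the x1-x2 subspace
b = [8, 9, 9, 10]   # close in the x3-x4 subspace

# Project onto the subspace spanned by x1 and x2 (first two coordinates):
print(dist(point[:2], a[:2]))  # ≈ 0.141
print(dist(point[:2], b[:2]))  # ≈ 9.758

# Project onto the subspace spanned by x3 and x4 (last two coordinates):
print(dist(point[2:], b[2:]))  # ≈ 0.224
print(dist(point[2:], a[2:]))  # ≈ 11.622
```

The same point belongs to different clusters depending on which coordinates you project onto, with no contradiction.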

Humans are already functioning intelligences (well, sort of), so the categories that humans propose of their own accord won't be maximally wrong: no one would try to propose a word for "configurations of matter that match any of these 29,122 five-megabyte descriptions but have no other particular properties in common." (Indeed, because we are not-superexponentially-vast minds that evolved to function in a simple, ordered universe, it actually takes some ingenuity to construct a category that wrong.)

This leaves aspiring instructors of rationality in something of a predicament: in order to teach people how categories can be more or (ahem) less wrong, you need some sort of illustrative example, but since the most natural illustrative examples won't be maximally wrong, some people might fail to appreciate the lesson, leaving one of your students to fill in the gap in your lecture series eleven years later.

The pedagogical function of telling people to "stop playing nitwit games and admit that dolphins don't belong on the fish list" is to point out that, without denying the obvious similarities that motivated the initial categorization {salmon, guppies, sharks, dolphins, trout, ...}, there is more structure in the world: to maximize the (logarithm of the) probability your world-model assigns to your observations of dolphins, you need to take into consideration the many aspects of reality in which the grouping {monkeys, squirrels, dolphins, horses ...} makes more sense. To the extent that relying on the initial category guess would result in a worse Bayes-score, we might say that that category is "wrong." It might have been "good enough" for the purposes of the sailors of yore, but as humanity has learned more, as our model of Thingspace has expanded with more dimensions and more details, we can see the ways in which the original map failed to carve reality at the joints.

The one replies:

But reality doesn't come with its joints pre-labeled. Questions about how to draw category boundaries are best understood as questions about values or priorities rather than about the actual content of the actual world. I can call dolphins "fish" and go on to make just as accurate predictions about dolphins as you can. Everything we identify as a joint is only a joint because we care about it.

No. Everything we identify as a joint is a joint not "because we care about it", but because it helps us think about the things we care about.

Which dimensions of Thingspace you bother paying attention to might depend on your values, and the clusters returned by your brain's similarity-detection algorithms might "split" or "collapse" according to which subspace you're looking at. But in order for your map to be useful in the service of your values, it needs to reflect the statistical structure of things in the territory—which depends on the territory, not your values.

There is an important difference between "not including mountains on a map because it's a political map that doesn't show any mountains" and "not including Mt. Everest on a geographic map, because my sister died trying to climb Everest and seeing it on the map would make me feel sad."

There is an important difference between "identifying this pill as not being 'poison' allows me to focus my uncertainty about what I'll observe after administering the pill to a human (even if most possible minds have never seen a 'human' and would never waste cycles imagining administering the pill to one)" and "identifying this pill as not being 'poison', because if I publicly called it 'poison', then the manufacturer of the pill might sue me."

There is an important difference between having a utility function defined over a statistical model's performance against specific real-world data (even if another mind with different values would be interested in different data), and having a utility function defined over features of the model itself.

Remember how appealing to the dictionary is irrational when the actual motivation for an argument is about whether to infer a property on the basis of category-membership? But at least the dictionary has the virtue of documenting typical usage of our shared communication signals: you can at least see how "You're defecting from common usage" might feel like a sensible thing to say, even if one's true rejection lies elsewhere. In contrast, this motion of appealing to personal values (!?!) is so deranged that Yudkowsky apparently didn't even realize in 2008 that he might need to warn us against it!

You can't change the categories your mind actually uses and still perform as well on prediction tasks—although you can change your verbally reported categories, much as how one can verbally report "believing" in an invisible, inaudible, flour-permeable dragon in one's garage without having any false anticipations-of-experience about the garage.

This may be easier to see with a simple numerical example.

Suppose we have some entities that exist in the three-dimensional vector space ℝ³. There's one cluster of entities centered at [1, 2, 3], and we call those entities Foos, and there's another cluster of entities centered at [2, 4, 6], which we call Quuxes.

The one comes and says, "Well, I'm going redefine the meaning of 'Foo' such that it also includes the things near [2, 4, 6] as well as the Foos-with-respect-to-the-old-definition, and you can't say my new definition is wrong, because if I observe [2, _, _] (where the underscores represent yet-unobserved variables), I'm going to categorize that entity as a Foo but still predict that the unobserved variables are 4 and 6, so there."

But if the one were actually using the new concept of Foo internally and not just saying the words "categorize it as a Foo", they wouldn't predict 4 and 6! They'd predict 3 and 4.5, because those are the average values of a generic Foo-with-respect-to-the-new-definition in the 2nd and 3rd coordinates (because (2+4)/2 = 6/2 = 3 and (3+6)/2 = 9/2 = 4.5). (The already-observed 2 in the first coordinate isn't average, but by conditional independence, that only affects our prediction of the other two variables by means of its effect on our "prediction" of category-membership.) The cluster-structure knowledge that "entities for which x₁≈2, also tend to have x₂≈4 and x₃≈6" needs to be represented somewhere in the one's mind in order to get the right answer. And given that that knowledge needs to be represented, it might also be useful to have a word for "the things near [2, 4, 6]" in order to efficiently share that knowledge with others.
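The two predictions can be spelled out in a few lines (Python; cluster centers as in the example, and assuming the two clusters are equally populous):

```python
# Cluster centers from the running example.
old_foo_center = [1, 2, 3]
quux_center = [2, 4, 6]

# "Foo"-with-respect-to-the-new-definition lumps both clusters together,
# so a generic new-Foo's expected coordinates are the mixture mean.
new_foo_mean = [(a + b) / 2 for a, b in zip(old_foo_center, quux_center)]
print(new_foo_mean)  # [1.5, 3.0, 4.5]

# Predicting x2 and x3 from "it's a (new-definition) Foo" alone:
print(new_foo_mean[1], new_foo_mean[2])  # 3.0 4.5

# Whereas actually using the cluster structure (conditioning on the
# observation x1 = 2) predicts the values near the [2, 4, 6] cluster:
print(quux_center[1], quux_center[2])  # 4 6
```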

Of course, there isn't going to be a unique way to encode the knowledge into natural language: there's no reason the word/symbol "Foo" needs to represent "the stuff near [1, 2, 3]" rather than "both the stuff near [1, 2, 3] and also the stuff near [2, 4, 6]". And you might very well indeed want a short word like "Foo" that encompasses both clusters, for example, if you want to contrast them to another cluster much farther away, or if you're mostly interested in x₁ and the difference between x₁≈1 and x₁≈2 doesn't seem large enough to notice.

But if speakers of a particular language were already using "Foo" to specifically talk about the stuff near [1, 2, 3], then you can't swap in a new definition of "Foo" without changing the truth values of sentences involving the word "Foo." Or rather: sentences involving Foo-with-respect-to-the-old-definition are different propositions from sentences involving Foo-with-respect-to-the-new-definition, even if they get written down using the same symbols in the same order.

Naturally, all this becomes much more complicated as we move away from the simplest idealized examples.

For example, if the points are more evenly distributed in configuration space rather than belonging to cleanly-distinguishable clusters, then essentialist "X is a Y" cognitive algorithms perform less well, and we get Sorites paradox-like situations, where we know roughly what we mean by a word, but are confronted with real-world (not merely hypothetical) edge cases that we're not sure how to classify.

Or it might not be obvious which dimensions of Thingspace are most relevant.

Or there might be social or psychological forces anchoring word usages on identifiable Schelling points that are easy for different people to agree upon, even at the cost of some statistical "fit."

We could go on listing more such complications, where we seem to be faced with somewhat arbitrary choices about how to describe the world in language. But the fundamental thing is this: the map is not the territory. Arbitrariness in the map (what color should Texas be?) doesn't correspond to arbitrariness in the territory. Where the structure of human natural language doesn't fit the structure in reality—where we're not sure whether to say that a sufficiently small collection of sand "is a heap", because we don't know how to specify the positions of the individual grains of sand, or compute that the collection has a Standard Heap-ness Coefficient of 0.64—that's just a bug in our human power of vibratory telepathy. You can exploit the bug to confuse humans, but that doesn't change reality.

Sometimes we might wish that something belonged to a category that it doesn't (with respect to the category boundaries that we would ordinarily use), so it's tempting to avert our attention from this painful reality with appeal-to-arbitrariness language-lawyering, selectively applying our philosophy-of-language skills to pretend that we can define a word any way we want with no consequences. ("I'm not late!—well, okay, we agree that I arrived half an hour after the scheduled start time, but whether I was late depends on how you choose to draw the category boundaries of 'late', which is subjective.")

For this reason it is said that knowing about philosophy of language can hurt people. Those who know that words don't have intrinsic definitions, but don't know (or have seemingly forgotten) about the three or six dozen optimality criteria governing the use of words, can easily fashion themselves a Fully General Counterargument against any claim of the form "X is a Y"—

Y doesn't unambiguously refer to the thing you're trying to point at. There's no Platonic essence of Y-ness: once we know any particular fact about X we want to know, there's no question left to ask. Clearly, you don't understand how words work, therefore I don't need to consider whether there are any non-ontologically-confused reasons for someone to say "X is a Y."

Isolated demands for rigor are great for winning arguments against humans who aren't as philosophically sophisticated as you, but the evolved systems of perception and language by which humans process and communicate information about reality predate the Sequences. Every claim that X is a Y is an expression of cognitive work that cannot simply be dismissed just because most claimants don't know how they work. Platonic essences are just the limiting case as the overlap between clusters in Thingspace goes to zero.

You should never say, "The choice of word is arbitrary; therefore I can say whatever I want"—which amounts to, "The choice of category is arbitrary, therefore I can believe whatever I want." If the choice were really arbitrary, you would be satisfied with the choice being made arbitrarily: by flipping a coin, or calling a random number generator. (It doesn't matter which.) Whatever criterion your brain is using to decide which word or belief you want, is your non-arbitrary reason.

If what you want isn't currently true in reality, maybe there's some action you could take to make it become true. To search for that action, you're going to need accurate beliefs about what reality is currently like. To enlist the help of others in your planning, you're going to need precise terminology to communicate accurate beliefs about what reality is currently like. Even when—especially when—the current reality is inconvenient.

Even when it hurts.

(Oh, and if you're actually trying to optimize other people's models of the world, rather than the world itself—you could just lie, rather than playing clever category-gerrymandering mind games. It would be a lot simpler!)

Imagine that you've had a peculiar job in a peculiar factory for a long time. After many mind-numbing years of sorting bleggs and rubes all day and enduring being trolled by Susan the Senior Sorter and her evil sense of humor, you finally work up the courage to ask Bob the Big Boss for a promotion.

"Sure," Bob says. "Starting tomorrow, you're our new Vice President of Sorting!"

"Wow, this is amazing," you say. "I don't know what to ask first! What will my new responsibilities be?"

"Oh, your responsibilities will be the same: sort bleggs and rubes every Monday through Friday from 9 a.m. to 5 p.m."

You frown. "Okay. But Vice Presidents get paid a lot, right? What will my salary be?"

"Still \$9.50 hourly wages, just like now."

You grimace. "O–kay. But Vice Presidents get more authority, right? Will I be someone's boss?"

"No, you'll still report to Susan, just like now."

You snort. "A Vice President, reporting to a mere Senior Sorter?"

"Oh, no," says Bob. "Susan is also getting promoted—to Senior Vice President of Sorting!"

You lose it. "Bob, this is bullshit. When you said I was getting promoted to Vice President, that created a bunch of probabilistic expectations in my mind: you made me anticipate getting new challenges, more money, and more authority, and then you reveal that you're just slapping an inflated title on the same old dead-end job. It's like handing me a blegg, and then saying that it's a rube that just happens to be blue, furry, and egg-shaped ... or telling me you have a dragon in your garage, except that it's an invisible, silent dragon that doesn't breathe. You may think you're being kind to me asking me to believe in an unfalsifiable promotion, but when you replace the symbol with the substance, it's actually just cruel. Stop fucking with my head! ... sir."

Bob looks offended. "This promotion isn't unfalsifiable," he says. "It says, 'Vice President of Sorting' right here on the employee roster. That's a sensory experience that you can make falsifiable predictions about. I'll even get you business cards that say, 'Vice President of Sorting.' That's another falsifiable prediction. Using language in a way you dislike is not lying. The propositions you claim false—about new job tasks, increased pay and authority—are not what the title is meant to convey, and this is known to everyone involved; it is not a secret."

Bob kind of has a point. It's tempting to argue that things like titles and names are part of the map, not the territory. Unless the name is written down. Or spoken aloud (instantiated in sound waves). Or thought about (instantiated in neurons). The map is part of the territory: insisting that the title isn't part of the "job" and therefore violates the maxim that meaningful beliefs must have testable consequences, doesn't quite work. Observing the title on the employee roster indeed tightly constrains your anticipated experience of the title on the business card. So, that's a non-gerrymandered, predictively useful category ... right? What is there for a rationalist to complain about?

To see the problem, we must turn to information theory.

Let's imagine that an abstract Job has four binary properties that can either be high or low—task complexity, pay, authority, and prestige of title—forming a four-dimensional Jobspace. Suppose that two-thirds of Jobs have {complexity: low, pay: low, authority: low, title: low} (which we'll write more briefly as [low, low, low, low]) and the remaining one-third have {complexity: high, pay: high, authority: high, title: high} (which we'll write as [high, high, high, high]).

Task complexity and authority are hard to perceive outside of the company, and pay is only negotiated after an offer is made, so people deciding to seek a Job can only make decisions based on the Job's title: but that's fine, because in the scenario described, you can infer any of the other properties from the title with certainty. Because the properties are either all low or all high, the joint entropy of title and any other property is going to have the same value as either of the individual property entropies, namely ⅔ log₂ 3/2 + ⅓ log₂ 3 ≈ 0.918 bits.

But since H(pay) = H(title) = H(pay, title), then the mutual information I(pay; title) has the same value, because I(pay; title) = H(pay) + H(title) − H(pay, title) by definition.

Then suppose a lot of companies get Bob's bright idea: half of the Jobs that used to occupy the point [low, low, low, low] in Jobspace, get their title coordinate changed to high. So now one-third of the Jobs are at [low, low, low, low], another third are at [low, low, low, high], and the remaining third are at [high, high, high, high]. What happens to the mutual information I(pay; title)?

I(pay; title) = H(pay) + H(title) − H(pay, title)
= (⅔ log₂ 3/2 + ⅓ log₂ 3) + (⅔ log₂ 3/2 + ⅓ log₂ 3) − 3(⅓ log₂ 3)
= 4/3 log₂ 3/2 + 2/3 log₂ 3 − log₂ 3 ≈ 0.2516 bits.
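One can check this arithmetic with a short script (Python; the distributions are exactly those given in the example):

```python
from math import log2

def H(dist):
    # Shannon entropy in bits of a distribution given as a list of probabilities
    return -sum(p * log2(p) for p in dist if p > 0)

# Before Bob's innovation: (pay, title) is (low, low) w.p. 2/3
# and (high, high) w.p. 1/3, so the joint has the same entropy
# as either marginal.
I_before = H([2/3, 1/3]) + H([2/3, 1/3]) - H([2/3, 1/3])
print(round(I_before, 4))  # 0.9183

# After: (low, low), (low, high), and (high, high) each w.p. 1/3.
H_pay = H([2/3, 1/3])          # pay is still low 2/3 of the time
H_title = H([1/3, 2/3])        # title is now high 2/3 of the time
H_joint = H([1/3, 1/3, 1/3])   # three equiprobable joint outcomes
I_after = H_pay + H_title - H_joint
print(round(I_after, 4))       # 0.2516
```

The mutual information between title and pay drops from about 0.918 bits to about 0.252 bits, which is the quantitative sense in which the titles became less informative.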

It went down! Bob and his analogues, having observed that employees and Job-seekers prefer Jobs with high-prestige titles, thought they were being benevolent by making more Jobs have the desired titles. And perhaps they have helped savvy employees who can arbitrage the gap between the new and old worlds by being able to put "Vice President" on their resumés when searching for a new Job.

But from the perspective of people who wanted to use titles as an easily-communicable correlate of the other features of a Job, all that's actually been accomplished is making language less useful.

In view of the preceding discussion, to "37 Ways That Words Can Be Wrong", we might wish to append, "38. Your definition draws a boundary around a cluster in an inappropriately 'thin' subspace of Thingspace that excludes relevant variables, resulting in fallacies of compression."

Miyamoto Musashi is quoted:

The primary thing when you take a sword in your hands is your intention to cut the enemy, whatever the means. Whenever you parry, hit, spring, strike or touch the enemy's cutting sword, you must cut the enemy in the same movement. It is essential to attain this. If you think only of hitting, springing, striking or touching the enemy, you will not be able actually to cut him.

Similarly, the primary thing when you take a word in your lips is your intention to reflect the territory, whatever the means. Whenever you categorize, label, name, define, or draw boundaries, you must cut through to the correct answer in the same movement. If you think only of categorizing, labeling, naming, defining, or drawing boundaries, you will not be able actually to reflect the territory.

Do not ask whether there's a rule of rationality saying that you shouldn't call dolphins fish. Ask whether dolphins are fish.

And if you speak overmuch of the Way you will not attain it.

(Thanks to Alicorn, Sarah Constantin, Ben Hoffman, Zvi Mowshowitz, Jessica Taylor, and Michael Vassar for feedback.)

# Comments

This is an excellent post. It has that rare quality, like much of the Sequences, of the ideas it describes being utterly obvious—in retrospect. (I also appreciate the similarly Sequence-like density of hyperlinks, exploiting the not-nearly-exploited-enough-these-days power of hypertext to increase density of ideas without a concomitant increase in abstruseness.)

… which is why I find it so puzzling to see all these disagreeing comments, which seem to me to contain an unusual, and puzzling, level of reflexive contrarianness and pedantry.

I think my sense of miscommunication with you is that you don't seem to have a sense of the law of equal and opposite advice + meta-contrarianism. Different things seem useful at different stages, and principle of charity means at least trying to see why what people are saying might be useful from their perspective.

Er, sorry, did you mean to post this as a reply here? I’m not quite seeing the relevance…

Considering how much time is spent here on this subject, I'm surprised at how little reference to distributional semantics is made. It's already a half-century long tradition of analyzing word meanings via statistics and vector spaces. It may be worthwhile to reach into that field to bolster and clarify some of these things that come up over and over.

Thanks for the pointer! I've played with word2vec and similar packages before, but had never thought to explore how those algorithms connect with the content of "A Human's Guide to Words".

This is a nice crisp summary of something kind of like pragmatism but capable of more robust intersubjective mapmaking:

Everything we identify as a joint is a joint not "because we care about it", but because it helps us think about the things we care about.

To expand this a bit, when deciding on category boundaries, one should assess the effect on the cost-adjusted expressive power of all statements and compound concepts that depend on it, not just the direct expressive power of the category in question. Otherwise you can't get things like Newtonian physics and are stuck with the Ptolemaic or Copernican systems. (We REALLY don't care about Newton's laws of motion for their own sake.)

As someone who seems to care more about terminology than most (and as a result probably gets into more terminological debates on LW than anyone else (see 1 2 3 4)), I don't really understand what you're suggesting here. Do you think this advice is applicable to any of the above examples of naming / drawing boundaries? If so, what are its implications in those cases? If not, can you give a concrete example that might come up on LW or otherwise have some relevance to us?

Hi, Wei—thanks for commenting! (And sorry for the arguably somewhat delayed reply; it's been a really tough week for me.)

can you give a concrete example that might come up on LW or otherwise have some relevance to us?

Is Slate Star Codex close enough? In his "Anti-Reactionary FAQ", Scott Alexander writes—

Why use this made-up word ["demotism"] so often?

Suppose I wanted to argue that mice were larger than grizzly bears. I note that both mice and elephants are "eargreyish", meaning grey animals with large ears. We note that eargreyish animals such as elephants are known to be extremely large. Therefore, eargreyish animals are larger than noneargreyish animals and mice are larger than grizzly bears.

As long as we can group two unlike things together using a made-up word that traps non-essential characteristics of each, we can prove any old thing.

This post is mostly just a longer, more detailed version (with some trivial math) of the point Scott is making in these three paragraphs: mice and elephants form a cluster if you project into the subspace spanned by "color" and "relative ear size", but using a word to point to a cluster in such a "thin", impoverished subspace is a dishonest rhetorical move when your interlocutors are trying to use language to mostly talk about the many other features of animals which don't covary much with color and relative-ear-size. This is obvious in the case of mice and elephants, but Scott is arguing that a similar mistake is being made by reactionaries who classify Nazi Germany and the Soviet Union as "demotist", and then argue that liberal democracies suffer from the same flaws on account of being "demotist." Scott had previously dubbed this kind of argument the "noncentral fallacy" and analyzed how it motivates people to argue over category boundaries like "murder" or "theft."

My interest in terminological debates is usually not to discover new ideas but to try to prevent confusion (when readers are likely to infer something wrong from a name, e.g., because of different previous usage or because a compound term is defined to mean something that's different from what one would reasonably infer from the combination of individual terms).

I agree that preventing confusion is the main reason to care about terminology; it only takes a moderate amount of good faith and philosophical sophistication for interlocutors to negotiate their way past terminology clashes ("I wouldn't use that word because I think it conflates these-and-such things, but for the purposes of this conversation ..." &c.) and make progress discussing actual ideas. But I wanted to have this post explaining in detail a particular thing that can go wrong when philosophical sophistication is lacking or applied selectively, which was mostly covered by Eliezer's "A Human's Guide to Words", but of which I hadn't seen the "which subspace to pay attention to / do clustering on" problem treated anywhere in such terms.

Thanks, I think I have a better idea of what you're proposing now, but I'm still not sure I understand it correctly, or if it makes sense.

mice and elephants form a cluster if you project into the subspace spanned by “color” and “relative ear size”, but using a word to point to a cluster in such a “thin”, impoverished subspace is a dishonest rhetorical move when your interlocutors are trying to use language to mostly talk about the many other features of animals which don’t covary much with color and relative-ear-size.

But there are times when it's not a dishonest rhetorical move to do this, right? For example suppose an invasive predator species has moved into some new area, and I have an hypothesis that animals with grey skin and big ears might be the only ones in that area who can escape being hunted to extinction (because I think the predator has trouble seeing grey and big ears are useful for hearing the predator and only this combination of traits offers enough advantage for a prey species to survive). While I'm formulating this hypothesis, discussing how plausible it is, applying for funding, doing field research, etc., it seems useful to create a new term like "eargreyish" so I don't have to keep repeating "grey animals with relatively large ears".

Since it doesn't seem to make sense to never use a word to point to a cluster in a "thin" subspace, what is your advice for when it's ok to do this or accept others doing this?

(I continue to regret my slow reply turnaround time.)

But there are times when it's not a dishonest rhetorical move to do this, right?

Right. In Scott's example, the problem was using the "eargrayish" concept to imply (bad) inferences about size, but your example isn't guilty of this.

However, it's also worth emphasizing that the inferential work done by words and categories is often spread across many variables, including things that aren't as easy to observe as the features that were used to perform the categorization. You can infer that "mice" have very similar genomes, even if you never actually sequence their DNA. Or if you lived before DNA had been discovered, you might guess that there exists some sort of molecular mechanism of heredity determining the similarities between members of a "species", and you'd be right (whereas similar such guesses based on concepts like "eargrayishness" would probably be wrong).

Since it doesn't seem to make sense to never use a word to point to a cluster in a "thin" subspace, what is your advice for when it's ok to do this or accept others doing this?

Um, watch out for cases where the data clusters in the "thin" subspace, but doesn't cluster in other dimensions that are actually relevant in the context that you're using the word? (I wish I had a rigorous reduction of what "relevant in the context" means, but I don't.)

As long as we're talking about animal taxonomy (dolphins, mice, elephants, &c.), a concrete example of a mechanism that systematically produces this kind of distribution might be Batesian or Müllerian mimicry (or convergent evolution more generally, as with dolphins' likeness to fish). If you're working as a wildlife photographer and just want some cool snake photos, then a concept of "red-'n'-yellow stripey snake" that you formed from observation (abstractly: you noticed a cluster in the subspace spanned by "snake colors" and "snake stripedness") might be completely adequate for your purposes: as a photographer, you just don't care whether or not there's more structure to the distribution of snakes than what looks good in your pictures. On the other hand, if you actually have to handle the snakes, suddenly the difference between the harmless scarlet kingsnake and the venomous coral snake ("red on yellow, kill a fellow; red on black, venom lack") is very relevant and you want to be modeling them as separate species!
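To make the "thin subspace" picture concrete, here's a toy simulation of the snake situation. All the numbers and feature names ("redness", "stripedness", "venom") are invented for illustration; the point is just that whether the data looks like one cluster or two depends on which subspace you project onto:

```python
import random

random.seed(0)

# Hypothetical data: kingsnakes and coral snakes look almost identical
# in the "appearance" subspace (redness, stripedness), but separate
# cleanly on a dimension the photographer never measures (venom).
def sample(species, n=100):
    venom_mean = 0.0 if species == "kingsnake" else 1.0
    return [{"redness": random.gauss(0.9, 0.05),
             "stripedness": random.gauss(0.8, 0.05),
             "venom": random.gauss(venom_mean, 0.05)}
            for _ in range(n)]

snakes = sample("kingsnake") + sample("coral snake")

# In the appearance subspace, everything is one tight cluster:
redness_spread = (max(s["redness"] for s in snakes)
                  - min(s["redness"] for s in snakes))

# ...but sorting by venom reveals a wide gap splitting it in two.
venom_sorted = sorted(s["venom"] for s in snakes)
venom_gap = max(b - a for a, b in zip(venom_sorted, venom_sorted[1:]))

print(f"appearance spread: {redness_spread:.2f}, venom gap: {venom_gap:.2f}")
```

The photographer's clustering algorithm, run on the appearance subspace, legitimately finds one cluster; the snake handler, who includes the venom dimension, legitimately finds two. Which projection is right depends on which variables are decision-relevant.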

Sometimes people redraw boundaries for reasons of local expediency. For instance, the category of AGI seems to have been expanded implicitly in some contexts to include what might previously have just been called a really good machine learning library that can do many things humans can do. This allows AGI alignment to be a bigger-tent cause, and raise more money, than it would in the counterfactual where the old definitions were preserved.

This article seems to me to be outlining a principled case that such category redefinitions can be systematically distinguished from purely epistemic category redefinitions, with the implication that there's a legitimate interest in tracking which is which, and sometimes in resisting politicized recategorizations in order to defend the enterprise of shared mapmaking.

I don't see how this article argues against a wider AGI definition. The wider definition is still a correlational cluster.

The article doesn't say that it's worthwhile to keep the historical meaning of a term like AGI. It also doesn't say that it's good to draw the boundaries in a way that lets a person guess where the boundary is based on understanding the words artificial, general, and intelligence.

Nor is it a "thinner" boundary, such that item 38 ("Your definition draws a boundary around a cluster in an inappropriately 'thin' subspace of Thingspace that excludes relevant variables, resulting in fallacies of compression.") would be violated.

The article didn't "argue against" a wider AGI definition. It implied a more specific claim than "for" or "against."

The article starts by saying "It is what people should be trying to do", says in its middle "This leaves aspiring instructors of rationality in something of a predicament: in order to teach people how categories can be more or (ahem) less wrong," and ends by speaking about what people must do.

That does appear to me like an article that intends to make a case that people should prefer certain definitions over others.

If your case is rather that the value of the article lies in classifying the distinct ways boundaries are drawn, then it seems surprising to me that you read out of the article that certain claims should be classified as redrawing boundaries for reasons of local expediency, given that the article speaks neither about redrawing boundaries, nor about redefining boundaries, nor about classifying anything under the suggested category of "local expediency".

Rationality discourse is necessarily about specific contexts and purposes. I don't think the Sequences imply that a spy should always reveal themselves, or that actors in a play should refuse to perform the same errors with the same predictable bad consequences two nights in a row. Discourse about how to speak the truth efficiently, on a site literally called "Less Wrong," shouldn't have to explicitly disclaim that it's meant as advice within that context every time, even if it's often helpful to examine what that means and when and how it is useful to prioritize over other desiderata.

I'm not sure what your position happens to be. Is it "This post isn't advice. It's wrong for you (ChristianKl) to expect that the author explicitly disclaims giving advice when he doesn't intend to give advice"?

If that's the case, it seems strange to me. This post contains explicit statements about what people should/must do. It contains those in the beginning and in the end, which are usually the places where an essay states its purpose.

It's bad to be too vague to be wrong.

Postmodern writing about how to speak truth efficiently that's too vague to be wrong is problematic, and I don't think having a bunch of LW signaling and cheers for rationalists makes it better.

The article seems indirectly relevant to example 4, in which an epistemic dispute about how to divide up categories is getting mixed with a prudential dispute on which things to prioritize. Once a category is clearly designated as "that which is to be prioritized," it becomes more expensive to improve the expressive power of your vocabulary by redrawing the conceptual boundaries, since this might cause your prioritization to deteriorate.

Possibly the right way to proceed in that case would be to work out a definition of the original category which more explicitly refers to the reasons you think it's the right category to prioritize, perhaps assigning this a new name, so that these discussions can be separated.

This makes me curious - have you found that terminological debates often lead to interesting ideas? Can you give an example?

My interest in terminological debates is usually not to discover new ideas but to try to prevent confusion (when readers are likely to infer something wrong from a name, e.g., because of different previous usage or because a compound term is defined to mean something that's different from what one would reasonably infer from the combination of individual terms). But sometimes terminological debates can uncover hidden assumptions and lead to substantive debates about them. See here for an example.

Whether to call something dephlogisticated air or oxygen was a very important terminological debate in chemistry, even when the correlational cluster was the same. It matters whether you conceptualize it as the absence of something or as a positive existence.

In medicine, the recent debate about renaming chronic fatigue syndrome (CFS) to systemic exertion intolerance disease (SEID) is quite an interesting one.

With CFS, it's quite unclear where to draw the boundary. With SEID, you can have someone exercise and then observe how long their body needs to recover; when they take much longer than expected to recover from the exertion, you can put the SEID diagnosis on them.

CFS and SEID are both cases where certain states correlate with each other. Zack's post doesn't help us at all to reason about whether we should prefer CFS or SEID as a term.

CFS and SEID are both cases where certain states correlate with each other. Zack's post doesn't help us at all to reason about whether we should prefer CFS or SEID as a term.

I'm definitely not claiming to have the "correct" answer to all terminological disputes. (As the post says, "Of course, there isn't going to be a unique way to encode the knowledge into natural language.")

Suppose, hypothetically, that it were discovered that there are actually two or more distinct etiologies causing cases that had historically been classified as "chronic fatigue syndrome", and cases with different etiologies responded better to different treatments. In this hypothetical scenario, medical professionals would want to split what they had previously called "chronic fatigue syndrome" into two or more categories to reflect their new knowledge. I think someone who insisted that "chronic fatigue syndrome" was still a good category given the new discovery of separate etiologies would be making a mistake (with respect to the goals doctors have when they talk about diseases), even if the separate etiologies had similar symptoms (which is what motivated the CFS label in the first place).

In terms of the configuration space visual metaphor, we would say that while "chronic fatigue syndrome" is a single cluster in the "symptoms" subspace of Diseasespace, more variables than just symptoms are decision-relevant to doctors, and the CFS cluster doesn't help them reason about those other variables.

When rationalists say that definitions can be wrong, we don't mean that there's a unique category boundary that is the True floating essence of a word, and that all other possible boundaries are wrong. We mean that in order for a proposed category boundary to not be wrong, it needs to capture some statistical structure in reality, even if reality is surprisingly detailed and there can be more than one such structure.

So, I got this part. And it seemed straightforwardly true to me, and seemed like a reasonably short inferential step away from other stuff LW has talked about. Categories are useful as mental compressions. Mental compressions should map to something. There are multiple ways you might want to cluster and map things. So far so straightforward.

And then the rest of the article left me more confused, and the disagreements in the comments got me even more confused.

Is the above claim the core claim of the article? If so, I'm confused what other people are objecting to. If not, I'm apparently still confused about the point of the article.

[edit: fwiw, I am aware of the subtext/discussion that the post is an abstraction of, and even taking that into account still feel fairly confused about some of the responses]

Interesting article. I dare not say I understand it fully. But in arguing that some categories are more or less wrong than others, is it fair to say you are arguing against the ugly duckling theorem?

Well, I usually try not to argue against theorems (as contrasted to arguing that a theorem's premises don't apply in a particular situation)—but in spirit, I guess so! Let me try to work out what's going on here—

The boxed example on the Wikipedia page you link, following Watanabe, posits a universe of three ducks—a White duck that comes First, a White duck that is not First, and a nonWhite duck that is not First—and observes that every pair of ducks agrees on half of the possible logical predicates that you can define in terms of Whiteness and Firstness. Generally, there are sixteen possible truth functions on two binary variables (like Whiteness or Firstness), but here only eight of them are distinct. (Although really, only eight of them could be distinct, because that's the number of possible subsets of three ducks (2³ = 8).) In general, we can't measure the "similarity" between objects by counting the number of sets that group them together, because that's the same for any pair of objects. We also get a theorem on binary vectors: if you have some k-dimensional vectors of bits, you can use Hamming distance to find the "most dissimilar" one, but if you extend the vectors into 2^(2^k)-dimensional vectors of the values of all k-ary boolean functions on the original k bits, then you can't.
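This is easy to check numerically. Here's a sketch for Watanabe's three-duck universe (k = 2 features, so there are 2^(2^2) = 16 truth tables); the encoding of the ducks as (White, First) bit pairs is mine:

```python
from itertools import product

K = 2
INPUTS = list(product([0, 1], repeat=K))       # all k-bit feature vectors
TABLES = list(product([0, 1], repeat=2 ** K))  # all 2**(2**k) = 16 truth tables

def extended(x):
    """Extend a k-bit vector into the 2**(2**k)-bit vector of outputs
    of every k-ary boolean function evaluated at x."""
    i = INPUTS.index(x)
    return [table[i] for table in TABLES]

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

# Watanabe's ducks as (White, First) bits:
ducks = [(1, 1), (1, 0), (0, 0)]
pairs = [(a, b) for i, a in enumerate(ducks) for b in ducks[i + 1:]]

# Raw Hamming distances single out the nonWhite, non-First duck:
raw = [hamming(a, b) for a, b in pairs]                       # [1, 2, 1]

# Extended distances are identical for every pair of distinct ducks:
ext = [hamming(extended(a), extended(b)) for a, b in pairs]   # [8, 8, 8]
```

Any two distinct ducks disagree on exactly half of the 16 predicates, so "similarity counted over all predicates" can't pick out an odd duck; only a weighting of the predicates (an inductive bias) can.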

Watanabe concludes, "any objects, in so far as they are distinguishable, are equally similar" (!!).

So, I think the reply to this is going to have to do with inductive bias and the "coincidence" that we in fact live in a low-entropy universe where some cognitive algorithms actually do have an advantage, even if they wouldn't have an advantage averaged over all possible universes? Unfortunately, I don't think I understand this in enough detail to explain it well (mumble mumble, new riddle of induction, blah blah, no canonical universal Turing machine for Solomonoff induction), but the main point I'm trying to make in my post is actually much narrower and doesn't require us to somehow find non-arbitrary canonical categories or reason about all possible categories.

I'm saying that which "subspace" of properties a rational agent is interested in will depend on the agent's values, but given such a choice, the categories the agent ends up with are going to be the result of running some clustering algorithm on the actual distribution of things in the world, which depends on the world, not the agent's values. In terms of Watanabe's ducks: you might not care about a duck's color or its order, but redefining Whiteness to include the black duck is cheating; it's wireheading yourself; it can't help you optimize the ducks.

Similarly, the primary thing when you take a word in your lips is your intention to reflect the territory, whatever the means

This sentence sounds to me like you want to use Korzybski's metaphor while ignoring the point of his argument. According to him, language is supposed to be used to create semantic reactions in the audience, and the "is of identity" is to be avoided.

The essay feels like you struggle with the "is of identity", but you are neither willing to go Korzybski's way nor willing to provide a good argument for why we should use the "is of identity".

Do not ask whether there's a rule of rationality saying that you shouldn't call dolphins fish. Ask whether dolphins are fish.

That feels to me very wrong. Beliefs are supposed to pay rent in anticipated experiences and discussing whether dolphins are fish in the abstract is detached from anticipated experiences.

Context matters a great deal for what words mean. Thomas Kuhn asked both physicists and chemists whether helium is a molecule:

Both answered without hesitation, but their answers were not the same. For the chemist the atom of helium was a molecule because it behaved like one with respect to the kinetic theory of gases. For the physicist, on the other hand, the helium atom was not a molecule because it displayed no molecular spectrum.

If you use either notion of a molecule in the wrong community, you are going to run into problems. Asking "Is helium a molecule?" in the abstract is not helpful.

In standard English the statement "X is a Y" often means that within the relevant classification system X is a member of category Y. Which classification system is relevant often differs by context, but the OP deals with that explicitly:

in order for a proposed category boundary to not be wrong, it needs to capture some statistical structure in reality, even if reality is surprisingly detailed and there can be more than one such structure.

"The map is not the territory" is a slogan that was created to criticize this usage of "is a", within a dense 750-page book in which one of the main messages is that the "is a" shouldn't be used. When quoting it, I think that paragraph fails to adequately make a case that this common language usage is desirable, and if so, when it's desirable.

Saying that the primary intention with which language is used isn't to create some effect in the recipient of the language act is a big claim, and Zack simply states it without any reflection.

My first reaction to the text was like Wei Dai's "I don't really understand what you're suggesting here", where I'm unsure what implications are supposed to be drawn for practical language use. The second is noting that the text gets basics* wrong, like the primary intention with which words are used.

*: I mean basic in the sense of fundamental, not as in easy to understand.

Saying that the primary intention with which language is used isn't to create some effect in the recipient of the language act

It's notable to me that both of the passages from this post that you quoted in the great-grandparent comment were from the final section. Would your assessment of the post change if you pretend it had ended just before the Musashi quote, with the words "resulting in fallacies of compression"?

I was trying to create an effect in the recipients of the language act by riffing off Yudkowsky's riff off Musashi in "Twelve Virtues of Rationality", which I expected many readers to be familiar with (and which is the target of the hyperlink with the text "is quoted"). My prereaders seemed to get it, but it might have been the wrong choice if too many readers' reactions were like yours.

I don't think it's hard to "get" the text in a certain way for a person who doesn't have strong opinions about terminology. It's internally consistent and doesn't conflict with other LW writing. I see how most people at my dojo would likely say "yeah, right".

The problem is that if you want to make inferences based on the text, it doesn't seem that the concepts pay rent. I don't think your prereaders read it while asking themselves "Does this pay rent?" That's also likely why Wei Dai's request for practical examples went unanswered.

The objection I voiced isn't to the Musashi quote. That's a stylistic choice which is defensible. My objection is to the text afterwards, which reads to me like a summary of the point you want to make.

The values that Yudkowsky writes about in the linked article are about empiricism, but your post is detached from any empiricism and is instead about the search for the essences of words.

The search for transcendent essences should generally be done with caution, and you should get clear about why you seek transcendence from context.

That's also likely why Wei Dai's request for practical examples went unanswered.

your post is detached from any empiricism and is instead about the search for the essences of words.

I don't think this is a fair characterization of the post.

I need to go get dressed and catch a train now. I'll ping you when my reply to Wei is up.

treating reality as fixed and self as fixed and the discovery of the proper mapping between self concepts and reality concepts is doomed to failure because both your own intentions are fluid depending on what you are trying to do and your own sense of reality is fluid (including self model). Ontologies are built to be thrown away. They break in the tails. Fully embracing and extending the Wittgensteinian revolution prevents you from wasting effort resisting this.

This seems technically true but not relevant. Important classes of intersubjective coordination require locally stable category boundaries, and some ontologies have more variation we care about concealed in the tails than others.

There are processes that tend towards the creation of ontologies with stable expressive power, and others that make maps worse for navigation. It's not always expedient to cooperate with the making of a map that lets others find you, but it's important to be able to track which way you're pushing if you want there to sometimes be good maps.

I'm saying that this post itself is falling prey to the thing it advises against. Better to point at a cluster that helps navigate, like Hanson's babblers, than to talk about the information-theoretic content of aggregate clusters.

It seems to me like the OP is motivated by a desire to improve decisionmaking processes by making a decisive legal argument against corruption in front of a corrupt court, and that this is an inefficient way of coordinating to move people who are reachable to a better equilibrium.

Does that seem like substantively the same objection to you?

I found parts of the post object-level helpful, like the bit I directly commented on, but overall agree it's giving LW too much credit for coordinating towards "Rationality." But people like Zack will correctly believe that LW's corruption is not common knowledge if people like us aren't willing to state the obvious explicitly.

Yeah, pointing at the same stuff. That clarification helped.

There is an important difference between "identifying this pill as not being 'poison' allows me to focus my uncertainty about what I'll observe after administering the pill to a human (even if most possible minds have never seen a 'human' and would never waste cycles imagining administering the pill to one)" and "identifying this pill as not being 'poison', because if I publicly called it 'poison', then the manufacturer of the pill might sue me."

What is that sentence supposed to tell me? It's not clear whether or not that important difference is supposed to imply to the reader that one is better than the other. Given that there seems to be a clear value judgement in the others, maybe it does here?

Reading it leaves me, as a reader, having to construct an example of what you might be pointing at.

You might run standard tox tests and your mice are dead. Mice differ from humans, so you might want to avoid the term "poison", in contrast to the general way people think about tox testing, because you don't care about mice? Is a general critique of the way we do tox testing intended or not?

The part about most possible minds never having seen a human feels like a digression to me, made with words that are unnecessarily obscure (most people in society won't understand what wasting cycles is about) when it would be quite easy to say that you care about humans more than mice.

Is the claim that it's bad to use words in a way that conforms to the standards of a powerful institution that enforces certain expectations of what people can expect when they hear a certain word? Boo Brussels? Boo journals who refuse to publish papers that use words when community standards of when certain words should be used aren't met?

To those people who proofread and apparently didn't find an issue with that sentence: is it really necessary to mix all those different issues into a six-line sentence?

It's not clear whether or not that important difference is supposed to imply to the reader that one is better than the other. Given that there seems to be a clear value judgement in the others, maybe it does here?

All three paragraphs starting with "There's an important difference [...]" are trying to illustrate the distinction between choosing a model because it reflects value-relevant parts of reality (which I think is good), and choosing a model because of some non-reality-mapping consequences of the choice of model (which I think is generally bad).

words that are unnecessarily obscure (most people in society won't understand what wasting cycles is about)

The primary audience of this post is longtime Less Wrong readers; as an author, I'm not concerned with trying to reach "most people in society" with this post. I expect Less Wrong readers to have trained up generalization instincts motivating the leap to thinking about AIs or minds-in-general even though this would seem weird or incomprehensible to the general public.

To those people who proofread and apparently didn't find an issue in that sentence, is it really necessary to mix all those different issues into a 6-line sentence?

It's true that I tend to have a "dense" writing style (with lots of nested parentheticals and subordinate clauses), and that I should probably work on writing more simply in order to be easier to read. Sorry.

I do find myself somewhat confused about the hostility in this comment. It's hard to write good things, and there will always be misunderstandings. Many posts on LessWrong are unnecessarily confusing, including many posts by Eliezer, usually just because it takes a lot of effort, time and skill to polish a post to the point where it's completely clear to everyone on the site (and in many technical subjects achieving that bar is often impossible).

Recommendations for how to phrase things in a clear way seem good to me, and I appreciate them on my writing, but doing so in a way that implies some kind of major moral failing seems like it makes people overall less likely to post, and also overall less likely to react positively to feedback.

You seem to pose a model where a post is either saying good things or saying things uncleanly in a way that's easily misunderstood; a model whereby it's not important to analyze which of the claims being made are wrong.

My first answer pointed out statements in the post that I consider to be clearly wrong and important (beliefs that many people hold and that hold back intellectual progress on the topic). The response seemed to be along the lines of:

"I didn't mean to imply that what I claimed to be true (" Similarly, the primary thing when you take a word in your lips is your intention to reflect the territory, whatever the means"), I said that because it seems to send the right tribal signals because it looks similar to what EY wrote.

Besides, the people in my tribe that I showed my draft to liked it."

Defending the post as being tribally right instead of either allowing claims to be falsified or defending the claims on their merits feels to me like a violation of debate norms that raises emotional hostility.

I feel that it's bad to assume by default that any disagreement is due to misunderstandings and not substance.

I do think that emotion is justified in the sense that if we get a lot of articles that are full of tribal signaling and attempts to look like EY posts but endorse misconceptions, that would be problematic to LW in a way that posts that are simply low quality because writing good is hard wouldn't be (and that wouldn't trigger emotions).

After rereading the post a few times, I think you are just misunderstanding it?

Like, I can't make sense of your top-level comment in my current interpretation of the post, and as such I interpreted your comment as asking for clarification in a weirdly hostile tone (which was supported by your first sentence being "What is that sentence supposed to tell me?"). I generally think it's a bad idea to start substantive criticisms of a post with a rhetorical question that's hard to distinguish from a genuine question (and probably would advise against rhetorical questions in general, but am less confident of that).

To me the section you quoted seems relatively clear, and makes a pretty straightforwardly true point, and from my current vantage point I fail to understand your criticism of it. I would be happy to try to explain my current interpretation, but would need a bit more help understanding what your current perspective is.

I have written multiple posts in this thread, and I wouldn't expect you to make sense of the tone by treating this post in isolation.

In a way, it's a straightforwardly true point to say that apples are significantly different from tomatoes. It's defensibly true in a certain sense.

At the same time, if a reader wants to learn something from the statement and transfer the knowledge to another case, they need a model of what kind of significant difference is implied.

You might read the statement as being about how tomatoes are vegetables for tariff purposes or for cooking purposes, and how scientific taxonomy isn't the only taxonomy that matters, but it's very motte-and-bailey about that issue. The motte-and-bailey-ness then makes it hard to falsify the claims.

Are you saying people should never casually make such claims about apples and tomatoes? I haven’t tried to parse your comments in detail, apologies if I'm misunderstanding. But they seem to be implying a huge amount of friction on conversation that does not seem practical to me. (i.e. only discuss things if you're going to take the time to clarify details of your model. The reasons we have clusters and words and shorthand is because that's a lot of effort that most of the time isn't worth it)

A model should generally be clear enough to be falsifiable. It might be okay for a paragraph not to expand an idea in enough detail for that, but when there's a >3800-word essay about a model that avoids being falsifiable and is instead full of applause lights, I do consider that bad.

I worry that we're spending a LOT of energy on trying to "carve at the joints" of something that has no joints, or is so deep that the joints don't exist in the dimensions we perceive. Categories, like all models, can be better or worse for a given purpose, but they're never actually right.

The key to this is "for a purpose". Models are useful for predictions of something, and sometimes for shorthand communication of some kinds of similarity.

Don't ask whether dolphins are fish. Don't believe or imply that category is identity. Ask whether this creature needs air. Ask how fast it swims, etc. When talking with people of similar background and shared context, call it a fish or an aquatic mammal, depending on what you want to communicate.

We agree that models are only better or worse for a purpose, but ...

If there are systematic correlations between many particular creature-features like whether it needs air, how fast it swims, what it's shaped like, what its genome is, &c., then it's adaptive to have a short code for the conjunction of those many features that such creatures have in common.

Category isn't identity, but the cognitive algorithm that makes people think category is identity actually performs pretty well when things are tightly-clustered in configuration space rather than evenly distributed, which actually seems to be the case for a lot of things! (E.g., while there are (or were) transitional forms between species related by evolutionary descent, it makes sense that we have separate words for cats and dogs rather than talking about individual creature properties of ear-shape, &c., because there aren't any half-cats in our real world.)
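A back-of-the-envelope way to see the "short code" point, using a deliberately toy world of my own invention (eight perfectly correlated binary features; real correlations are never this clean):

```python
import math

# Toy world (invented numbers): 8 binary creature-features that are
# perfectly correlated in practice. A creature either has all of the
# "cat" features or none of them, each case with probability 1/2.
p = 0.5
bits_per_feature = 1.0  # each feature alone looks like a fair coin

# Encoding the 8 features independently costs 8 bits per creature:
naive_cost = 8 * bits_per_feature

# But the joint distribution has only two states actually realized,
# so a single category label ("cat" vs. "not-cat") carries all the
# information the 8 features do:
entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))  # 1.0 bit
```

The 7-bit saving comes from the redundancy the correlations put between the features. With imperfect correlations the category label buys less than this, but still something, which is roughly why tightly-clustered thingspaces reward having a word for the cluster.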

Sure, casual use of categories is convenient and pretty good for a lot of purposes. For unimportant cases (including cases where the exceptions don't come into play, like sailors calling dolphin "fish"), go for it. Use whatever words minimize the cognitive load on your conversational partners and allow them to best navigate the world they're in.

Where precision matters, though, you're better off using more words. Don't try to cram so much inferential power into a categorization that's not a good fit for the domain of predictions you're making.

And because these are different needs, be aware that different weights and rigor will be applied. If someone is casually using a category "wrong", you have to decide if the exceptions matter enough to point them out (that is, use more words to get more precision), or if they're just optimizing for brevity on a different set of dimensions than you prefer. Worse, they (and you!) may not fully know what dimensions are important, so your compression may be more wrong than the one you're trying to improve.

Sure, casual use of categories is convenient and pretty good for a lot of purposes. [...] Where precision matters, though, you're better off using more words. Don't try to cram so much inferential power into a categorization that's not a good fit for the domain of predictions you're making.

So, I actually don't think "casual" vs. "precise" is a good characterization of the distinction I was trying to make in the grandparent! I'm saying that for "sparse", tightly-clustered distributions in high-dimensional spaces, something like "essentialism" is actually doing really useful cognitive work, and using more words to describe more basic, lower-level ("precise"?) features doesn't actually get you better performance—it's not just about minimizing cognitive load.

A good example might be the recognition of accents. Which description is more useful, both for your own thinking, and for communicating your observations to others—

At the level of consciousness, it's much easier to correctly recognize accents than to characterize and articulate all the individual phoneme-level features that your brain is picking up on to make the categorization. Categories let you make inferences about hidden variables that you haven't yet observed in a particular case, but which are known to correlate with features that you have observed. Once you hear the non-rhoticity in someone's speech, your brain also knows how to anticipate how they'll pronounce vowels that they haven't yet said—and where the person grew up! I think this is a pretty impressive AI capability that shouldn't be dismissed as "casual"!

Accents are a good example. It's easy to offend someone or to make incorrect predictions based on "has a British accent", when you really only know some patterns of pronunciation. In some contexts, that's a fine compression; way easier to process, communicate and remember. In other contexts, you're better off highlighting and acknowledging that your data supports many interpretations, and you should preserve that uncertainty in your communication and predictions.

"casual" vs "precise" are themselves lossy compression of fuzzy concepts, and what I really mean is that the use of compression is valid and helpful sometimes, and harmful and misleading at other times. My point is that the distinction is _NOT_ primarily about how tight the cluster or how close the match to some dimensions of reality in the abstract. The acceptability of the compression is about context and uses for the compressed or less-compressed information, and whether the lost details are important for the purpose of the communication or prediction. It's whether it meets the needs of the model, not how close it is to "reality".

Note also that I recognize that no model and no communication is actually full-fidelity. Everything any agent knows is compressed and simplified from reality. The question is how much further compression is valuable for what purposes.

Essentialism is wrong. Conceptual compression and simplified modeling are always necessary, and sometimes even an extreme compaction is good enough for a purpose.