“MAN EATING PIRANHA MISTAKENLY SOLD AS PET FISH” — example news headline from Steven Pinker’s The Sense of Style

The rule is that you use hyphens for compound modifiers like the ones in natural-language processing, high-impact opportunities, cost-effectiveness measures, high-status employers, and so on. Don’t hyphenate inside compound proper nouns (“New York-based company”), and don’t add a hyphen after an adverb ending in -ly, but do add one after other adverbs (“stern-looking boss”). You can use suspended hyphens when talking about “latex- and phthalate-free gloves.”
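As a rough illustration, the rule can be sketched in a few lines of Python. The function name `compound_modifier` is hypothetical, and the "ends in -ly" test is a crude stand-in for real part-of-speech knowledge: it would misfire on words like "early" or "family". A compound proper noun is passed as a single item so that it is never split.

```python
def compound_modifier(*words):
    """Join modifier words with hyphens, but use a space after an -ly word.

    Assumption: any word ending in "ly" is treated as an -ly adverb.
    """
    parts = [words[0]]
    for prev, word in zip(words, words[1:]):
        sep = " " if prev.lower().endswith("ly") else "-"
        parts.append(sep + word)
    return "".join(parts)


print(compound_modifier("high", "impact"))           # high-impact
print(compound_modifier("hopefully", "corrigible"))  # hopefully corrigible
print(compound_modifier("New York", "based"))        # New York-based
```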

But hyphens are under attack. The Chicago Manual of Style “prefers a spare hyphenation style.” The AP Stylebook says that “the fewer hyphens the better.” In older texts you see a lot more hyphenation than you do today.

Part of this is because of a good trend of combining compound nouns, turning e-mail and fire-fly into email and firefly. But part of it involves replacing hyphens with spaces, turning high-school seniors and ice-cream cones into high school seniors and ice cream cones. Some people think hyphens just look bad.

But hyphens are excellent because they improve the readability of text—the speed at which it can be understood, even at a less-than-perceptible level. In fact, it would probably be an improvement to language if it became acceptable and normal to hyphenate compound nouns simply to make the noun phrase faster to read. But first I hope we can return to making references to chocolate-chip cookies.

Skimming the curated posts that are on LessWrong right now, as a random sample:

  • A Shutdown Problem Proposal → A Shutdown-Problem Proposal
  • hopefully-corrigible agent → hopefully corrigible agent
  • large scale X → large-scale X

A good example of hyphen use: “to make any child-agents it creates responsive-but-not-manipulative to the shutdown button, recursively.”

Kaarel (3mo)

I find [the use of square brackets to show the merge structure of [a linguistic entity that might otherwise be confusing to parse]] delightful :)

leogao (3mo)

hot take: if you find that your sentences can't be parsed reliably without brackets, that's a sign you should probably refactor your writing to be clearer

Refactoring your writing for clarity is taxing and will reduce overall word count on LW. That would be an improvement for some users but not others.

I know some major offenders when it comes to unnecessary-hyphenation-trains, but usually I still find all their posts and comments net positive.

Of course, I would be happy if those users could increase clarity without sacrificing other things.

See also https://en.m.wiktionary.org/wiki/crash_blossom , which can be mitigated with better use of hyphens.

Hyphens are like parentheses, but only with one level. Which is okay in most situations because more levels would be more difficult to parse or pronounce.

But it is easier to understand if you change the sentence so that the parentheses become unnecessary.

"natural-language processing" = (natural language) processing = processing of natural languages

So it seems like we have a tradeoff between "speed of speech" and "simplicity of parsing". If you talk about the same topic all the time, speed becomes more important than parsing, because everyone already knows what you mean. So you compress the words (get rid of a preposition or two) by using hyphens. Then a noob comes and complains.

What is the purpose of the -ly exception? What's wrong with "hopefully-corrigible agent" other than that it breaks the rule?

I think it’s because with the words in -ly you know they’re supposed to refer to the noun? With a "stern looking man", you might have doubts whether the guy is stern-looking or both "stern" and "looking" at something, because stern is an adjective. A sternly looking man can’t both be sternly and be looking.

In chemical names it's so hard. At least when one is not a chemist. They say "follow the IUPAC recommendations", which in practice means "find someone who knows how to follow them".

I agree.

Relevant problem: how should one handle higher-order hyphenation? E.g., imagine one is talking about cost-effective measures, but has in mind the measures' effectiveness specifically relative to marginal costs. Building it up, we have "marginal-cost effectiveness", and then we want to turn that whole phrase into a compound modifier. But "marginal-cost-effective measures" looks very awkward! It reads as if we had hyphenated the unhyphenated phrase "marginal cost effectiveness": within a hyphenated expression, there is no way to tell an original hyphen from an original space!

It becomes especially relevant in the case of longer composite modifiers, like your "responsive-but-not-manipulative" example.

Can we fix that somehow?

One solution I've seen in the wild is to increase the length of the hyphen depending on its "degree", i.e. use an en dash in place of a hyphen. Example: "marginal-cost–effective measures". (On Windows, an en dash can be inserted by typing 0150 on the keypad while holding ALT. See methods for other platforms here.)

In practice you basically never go beyond the second-degree expressions, but there's space to expand to third-degree expressions by the use of an even-longer em dash (—, 0151 while holding ALT).

Though I expect these aren't "official" rules at all.
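A minimal sketch of that "degree" scheme in Python, assuming the compound is written as nested tuples (the `render` and `depth` helpers are hypothetical names, not an established convention). It also checks that the ALT codes above are just the Windows-1252 byte values for the en dash and em dash.

```python
# Dash per nesting level: hyphen-minus, en dash, em dash.
DASHES = ["-", "\u2013", "\u2014"]


def depth(node):
    """Nesting depth: 0 for a plain word, 1 + deepest child for a tuple."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(child) for child in node)


def render(node):
    """Join the parts of a compound with the dash matching its depth."""
    if isinstance(node, str):
        return node
    d = depth(node)
    if d > len(DASHES):
        raise ValueError("nesting too deep for three dash lengths")
    return DASHES[d - 1].join(render(child) for child in node)


# ALT+0150 and ALT+0151 correspond to bytes 150 and 151 in Windows-1252.
en_dash = bytes([150]).decode("cp1252")
em_dash = bytes([151]).decode("cp1252")

compound = (("marginal", "cost"), "effective")
print(render(compound) + " measures")
```

The inner pair is joined with a plain hyphen and the outer pair with an en dash, which reproduces the example from the comment.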

Seems like brackets would remove this problem, at the cost of being highly nonstandard and perhaps jarring to some people.

I was jarred and grossed out the first time I encountered brackets used this way. But at the end of the day, I think 20th century writing conventions just aren't quite good enough for what we want to do on LW. (Relatedly, I have higher tolerance for jargon than a lot of other people.)

Caveat: brackets can be great for increasing the specificity of what you are able to say, but I sometimes see the specificity of people's thoughts fail to keep up with the specificity of their jargon and spoken concepts, which can be grating.

Could refer to them in writing as "MC-effectiveness measures"

I basically can't read stuff without noticing typos and grammar issues, so I make a ton of typo and edit suggestions. For some authors and works (currently the web serial Super Supportive in particular), a significant fraction of my suggestions consists of using more hyphens.

Roko (3mo)

Yes. I have noticed that I prefer hyphens, and now that I think about why, it's because they make writing less ambiguous.

Interestingly, the Serbian language does not have this problem, because adjectives are merged even in speech using the -o- infix. For example, 'svetli žuti pas' = "light yellow dog", 'svetložuti pas' = "light-yellow dog".

It's interesting that this problem does not exist in fusional languages. For instance, in Polish it's even impossible (or really hard) to say:

MAN EATING PIRANHA MISTAKENLY SOLD AS PET FISH

in such an ambiguous way. For instance:

(MAN (EATING (PIRANHA))) MISTAKENLY SOLD AS PET FISH = "człowiek jedzący piranię omyłkowo sprzedany jako ryba domowa"

((MAN-EATING) PIRANHA) MISTAKENLY SOLD AS PET FISH = "jedząca ludzi pirania omyłkowo sprzedana jako ryba domowa"

MAN EATING (PIRANHA (MISTAKENLY SOLD AS PET FISH)) = "człowiek jedzący piranię omyłkowo sprzedaną jako ryba domowa"

I agree. Hooray for hyphens! We want more hyphens to-day!