What is a sentence anyway... is there something special about a period, as opposed to other punctuation marks? Many are available: the colon is a possibility; also its half-brother; and the comma, of course... also the ellipsis -- even the mighty m-dash!
Orwell noted that the semicolon is almost redundant. I wonder if sentences that once would have had a semicolon halfway through are now split into two sentences.
...This mermaid of the punctuation world—period above, comma below—is viewed with suspicion by many people, including well-known writers. George Orwell deliberately avoided semicolons in his novel Coming Up for Air (London: V. Gollancz, 1939). As he explained to his editor (Roger Senhouse) at the time, “I had decided … that the semicolon is an unnecessary stop and that I would write my next book without one” (quoted in George Orwell: The Collected Essays, Journalism & Letters, ed. Sonia Orwell and Ian Angus, in Vol. 4: In Front of Your Nose, Jaffrey, NH: David R. Godine, 2000). Kurt Vonnegut had this advice for writers: “First rule: Do not use semicolons. They are transvestite hermaphrodites representing absolutely nothing. All they do is show you’ve been to college” (A Man Without a Country, New York: Seven Stories Press, 2005).
[...] British journalist Lynne Truss affirmed that “a full stop ought always to be an alternative” to the semicolon (Eats, Shoots & Leaves, New York: Gotham Books, 2004). The American writer Noah Lukeman views the semicolon as a mark more suitable for creative writing. Otherwise, he argues, “The first thing to realize is that
The colon seems optional to me, but quotation marks absolutely aren't, as evidenced by how comparatively unreadable this author's dialogue looks. From his book "The Road":
He screwed down the plastic cap and wiped the bottle off with a rag and hefted it in his hand. Oil for their little slutlamp to light the long gray dusks, the long gray dawns. You can read me a story, the boy said. Cant you, Papa? Yes, he said. I can.
That already looks unnecessarily hard to read even though the dialogue is so short. I guess the author made it work somehow, but this seems like artificially challenging oneself to write a novel without the letter 'E': intriguing, but not beneficial to either reader or prose.
I respectfully disagree. As with the minor edit on the Boccaccio quote in another of my comments here, eliminating quotes fundamentally changes the way we interpret the scene.
With quotes (and especially with the way dialog is typically paragraphed), human speech is implicitly shown to be so drastically separate from the sensory component of the scene that it requires completely different formatting from the rest of the text.
By eliminating quotes and dialog paragraphing, human speech becomes just another element in the scene being depicted, not separate or any more or less important than the action of screwing down the plastic cap or the functional importance of the oil in the lamp.
The absence of quotes only makes it harder to read if you, the reader, resist this aesthetic and try to force the dialog to be of greater importance than McCarthy is allowing it to be in his novel.
He screwed down the plastic cap and wiped the bottle off with a rag and hefted it in his hand. Oil for their little slutlamp to light the long gray dusks, the long gray dawns.
"You can read me a story," the boy said. "Cant you, Papa?"
"Yes," he said. "I can."
See how the social interaction between Papa and th...
Should we extend the scope of the data to include pre-Carolingian texts, it would of course approach infinite sentence length as punctuation had rarely been implemented. Even worse, should we go back into ancient Roman or Greek texts, a naive appraisal might also lead us to believe that syllables per word also approach staggering levels of complexity, since the convention of placing spaces or interpuncts between words was uncommon.
Indeed, spacing between words, capitalization, and punctuation were expressly introduced for readability incidentally, a consequence of the practical ease in the mindless and error-prone process of manual copying of documents, which came before the invention of the printing press.
It's not controversial to say that writing with sentences and spacing between words is easier to understand. But what do we gain by counting the punctuation in Chaucer's romance, a drama by Dickens, and a novel by Rowling, and comparing them? The fact of the matter is that modern editions of Chaucer's works have recognizable punctuation only in translation. You're going to see strange things like interpuncts which are not clearly correlating to modern punctuation, and are requiri...
Interestingly, breaking up long sentences into shorter ones by replacing a transitional word with a period does not quite capture the same nuance as the original. Here's a translation of Boccaccio, and a version where I add a period in the middle.
Wherefore, as it falls to me to lead the way in this your enterprise of storytelling, I intend to begin with one of His wondrous works, that, by hearing thereof, our hopes in Him, in whom is no change, may be established, and His name be by us forever lauded.
Wherefore, as it falls to me to lead the way in this your enterprise of storytelling, I intend to begin with one of His wondrous works. By hearing thereof, our hopes in Him, in whom is no change, may be established, and His name be by us forever lauded.
By replacing ", that," with a period, my revision completely changes our relationship with the narrator. In the original translation, the narrator is both announcing his goal and describing what he plans to do to achieve it.
In the revised version, he's describing his plan of action and a potential effect of that plan. We might assume that he's choosing that plan in order to bring about that effect, but it's no longer explicit in the text...
I quite like the article The Rise and Fall of the English Sentence, which partially attributes reduced structural complexity to increase in noun compounds (like "state hate crime victim numbers" rather than "the numbers of victims who have experienced crimes that were motivated by hatred directed at their ethnic or racial identity, and who have reported these crimes to the state")
There is a relatively new, practical reason to write short sentences: they are less likely to be mangled by automated translation software. Sentences often become long via multiple clauses. Automated translators can mangle such sentences by (for example) mistakenly applying words to the incorrect clause. If you split such sentences, you make such translations more reliable. Most of our writing now potentially has global reach. So you can be understood by more people if you meet translation software half-way.
Reading this post, my immediate hunch is that the decline in sentence lengths has a lot to do with the historical role of Latin grammar and how deeply it influenced educated English writers. Latin inherently facilitates longer, complex sentences due to its use of grammatical inflections, declensions, and verb conjugations, significantly reducing reliance on prepositions and conjunctions. This syntactic flexibility allowed authors to naturally craft extensive yet smooth-flowing sentences. Latin's liberating lack of fixed word order and its fun little rhetorical devices combine to support nuanced, flexible thinking. From my own experience studying Latin 7th-12th grade, I find this sort of stuff contributes significantly to freer, more expansive expression when writing or speaking in English, and I often can immediately tell when speaking with or reading something written by someone else who studied Latin. An easy "tell" is when they say "having done x."
Educated English writers historically learned Latin as a foundational part of their education, internalizing this syntactic complexity. As a result, English prose from authors like Chaucer, Samuel Johnson, and Henry James shows a clear...
In China, there was a parallel but more abrupt change from Classical Chinese writing (very terse and literary) to vernacular writing (closer to spoken language and easier to understand). I attribute this to Classical Chinese being better for signaling intelligence, vernacular Chinese being better for practical communications, higher usefulness/demand for practical communications, and new alternative avenues for intelligence signaling (e.g., math, science). These shifts also seem to be an additional explanation for decreasing sentence lengths in English.
Promoted to curated: I don't think this post is earth-shattering, but it's good, short, and answers an interesting question, and does so with a reasonable methodology and curiosity. And it's not about AI, for once, which is a nice change of pace from our curation schedule these days.
At first, I thought this post would be about prison sentences.
I got curious and checked if DeepResearch would have anything to add. It agreed with your post and largely outlined the same categories (plus a few that you didn't cover because you were focused on an earlier time than the screen era): "Cognitive Load & Comprehension, Mass Literacy & Broad Audiences, Journalism & Telegraphic Brevity, Attention Span & Media Competition, Digital Communication & Screen Reading, Educational & Stylistic Norms".
The last one I thought was interesting and not obvious from your post:
Related: https://www.lesswrong.com/posts/Pweg9xpKknkNwN8Fx/have-attention-spans-been-declining
Another related thing is that the grammar of languages appears to be getting simpler with time. Compare the grammar of Latin to that of modern French or Spanish. Or maybe not quite simpler but more structured/regular/principled, as something like the latter has been reproduced experimentally https://royalsocietypublishing.org/doi/10.1098/rspb.2019.1262 (to the extent that this paper's findings generalize to natural language evolution).
FWIW there is a theory that there is a cycle of language change, though it seems maybe there is not a lot of evidence for the isolating -> agglutinating step. IIRC the idea is something like that if you have a "simple" (isolating) language that uses helper words instead of morphology eventually those words can lose their independent meaning and get smushed together with the word they are modifying.
Humans didn't always speak in 50-word sentences. If you want to figure out how we came to be trending away from that, you should try to figure out how, when, and why that became normal in the first place.
This may be because editing has become easier and faster to iterate.
It's comparatively easy to identify sentences that are too long. Is it easy to identify sentences that are too short? You can always add an additional sentence, but finding examples where sentences themselves should be longer is much harder. With more editing cycles, this leads to shorter and shorter sentences.
Many short sentences can add up to a very long text. The cost of paper, ink, typesetting and distribution would incentivize using fewer letters, but not shorter sentences.
I tend to follow the linguist McWhorter on historical trends in languages, in believing (controversially!) that undisrupted languages become weirder over time and only gain learnability through pragmatic pressures, as in trading, slavery, conquest, etc., which can increase the number of a language's second-language learners (who edit for ease of learning as they learn).
A huge number of phonemes? Probably it's some language in the mountains with little tourism, trade, or conquest for the last 8,000 years. Every verb conjugates irregularly? Likely...
shorter sentences are better because they communicate more clearly. i used to speak in much longer and more abstract sentences, which made it harder to understand me. i think using shorter and clearer sentences has been obviously net positive for me. it even makes my thinking clearer, because you need to really deeply understand something to explain it simply.
Shorter sentences are better. Why? Because they communicate clearly. I used to speak in long sentences. And they were abstract. Thus I was hard to understand. Now I use short sentences. Clear sentences.
It's been net-positive. It even makes my thinking clearer. Why? Because you need to deeply understand something to explain it simply.
Yes, and this also applies to your version! For difficult or subtle thoughts, short sentences have to come strictly after the long sentences. If you're having enough such thoughts, it doesn't make sense to restrict long sentences out of communication channels; how else are you supposed to have the thoughts?
I write shorter sentences thanks to the editing work of LW editor @JustisMills and the book Several Short Sentences About Writing.
Literacy seems to make sense to me but I might be missing something in the post. Writing is language and language is communication so at least two sides.
As more people learned to read, they also learned to write, and written communication increased. However, even with modest literacy one can read a long sentence, or at least can when it is written by a skilled writer. But being able to read does not, I suspect, lead to writing skill in most cases.
As more people started communicating via writing (think of things like the expansion of schools and education), the skill level of the average writer likely declined. That probably led to training the next generation of writers to use simpler sentence structures.
I wonder what the trend is across different languages, and whether that'd help shed some light on various hypotheses. For example, I rarely read Chinese, and even more rarely classical texts, but my vague impression is that older Chinese texts are understood to be much more semantically dense than modern Chinese books, which are slightly more semantically dense compared to English.
I can't find one with older Chinese and modern Chinese side-by-side (because most modern Chinese readers are expected to understand older vernacular and even classical Chin...
Thanks for this post. I would argue that part of an explanation here could also be economic: modernity brings specialization and a move from the artisan economy of objects as uncommon, expensive, multipurpose, and with a narrow user base (illuminated manuscripts, decorative furniture) to a more utilitarian and targeted economy. Early artisans need to compete for a small number of rich clients by being the most impressive, artistic, etc., whereas more modern suppliers follow more traditional laws of supply and demand and track more costs (cost-effectiveness...
One small, anecdotal piece of support for your 'improved-readability' hypothesis: ime, contemporary French tends to use longer sentences than English, where I think (native Francophones feel free to correct me) there's much less cultural emphasis on writing 'accessibly'.
E.g., I'd say the (state-backed) style guidelines of Académie Française seem motivated by an ideal that's much closer to "beautiful writing" than "accessible writing". And a couple minutes Googling led me to footnote 5 of this paper, which implies that the concept of "reader-centred l...
The average reader has gotten dumber and prefers shorter, simpler sentences.
I suspect that the average reader is now getting smarter, because there are increasingly ways to get the same information that require less literacy: videos, text-to-speech, Alexa and Siri, ten thousand news channels on YouTube. You still need some literacy to find those resources, but it's fine if you find reading difficult and unpleasant, because you only need to exercise it briefly. And less is needed every year.
I also expect that the average reader of books is getting much smar...
I suggest an additional explanation.
The bigger the audience is, the more people there are who won't know a specific idea/concept/word (xkcd's comic #1053 "Ten Thousand" captures this quite succinctly), so you'll simply have to shorten.
I took the logarithm of sentence length and linearly fitted it against the logarithm of world population (that shouldn't really be precise, since authors presumably cared mostly about their own society, but that would be more time-expensive to check).
Relevant lines of Python REPL
>>> import math
>>> wps = [49, 50, 42, 20, 21,
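For anyone who wants to try a fit like this themselves, here is a minimal sketch. The population series from the REPL session above is cut off, so this substitutes a different predictor: the death years given in the post (Rowling is left out because the post gives no exact year for her). That makes it a log-linear fit of sentence length against time, not the log-log fit against population described above. The helper name `fit_line` is made up for illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Figures quoted in the post: (author, year of death, words per sentence).
authors = [
    ("Chaucer", 1400, 49),
    ("Spenser", 1599, 50),
    ("Austen", 1817, 42),
    ("Dickens", 1870, 20),
    ("Emerson", 1882, 21),
    ("Lawrence", 1930, 14),
    ("Steinbeck", 1968, 18),
]

# Assumption: death year stands in for the truncated population series.
years = [year for _, year, _ in authors]
log_wps = [math.log(wps) for _, _, wps in authors]

slope, intercept = fit_line(years, log_wps)

# exp(slope * 100) is the multiplicative change in fitted length per century.
per_century = math.exp(slope * 100)
print(slope, per_century)
```

`per_century` is the factor by which the fitted sentence length changes every hundred years; a value below 1 means sentences are getting shorter. With only seven points this is a rough trend line at best.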
How does this correlate with vocabulary changes? If the common vocabulary expanded, is the information content of sentences still decreasing?
With regard to the State of the Union address, one contributing factor might be the method of delivery. The State of the Union is now intended for a television or radio audience, and the spoken format favors shorter, simpler sentence structures than print does.
In school and out of it, I’d been told repeatedly my sentences were run-on, which, probably fair enough. I do think varying sentence length is nice, and trying to give your reader easier to consume media is nice. But sometimes you just wanna go on a big long ramble about an idea with all sorts of corollaries which seem like they should be a part of the main sentence, and it’s hard to know if they really should be a part of their own sentence. Probably, but maybe I defer too much to everyone who ever told me, this is a run-on.
It's worth noting that we observe other forms of simplification of language as well. English reduced the number of verb inflections. The distinction between singular and plural second-person pronouns (thou versus you) disappeared.
There's the argument that increasing access to information creates competition for attention, which drives language to be more concise and readable, e.g. https://www.nature.com/articles/s44271-024-00117-1
Both examples are harder to understand than necessary. Either "The firefighters who'd been sleeping jumped into action when the alarm sounded." or "When the alarm sounded, the firefighters who'd been sleeping jumped into action." seem much more understandable. The actual short version that flows would be "Then sleeping firefighters jumped into action when the alarm sounded."
Long version: The problem I see with the examples of Hypotaxis and Parataxis might be that it is artificially chunking up the ideas involved into separate bits when it is unnecessary, a...
I've always been a sucker for long sentences. This doesn't mean that I'm overly obsessed by them. Even short ones do have their place. I think I prefer them. You might, too.
But a long sentence — one able to rise to the complicated challenge of a new journey — merits clear regard if without sacrificing speed it happily sweeps us along over the last bumpy road toward home, like riding with John Wayne as he pulls into Dodge City, gets down, casually ambles over to get a bourbon, and says, "Howdy, boys. What do y'all do here for fun?" I mean old Texas tumbleweeds really do roll.
>Sentence lengths have declined.
Data: I looked for similar data on sentence lengths in German, and the first result I found covering a similar timeframe was Wikipedia referencing Kurt Möslein: Einige Entwicklungstendenzen in der Syntax der wissenschaftlich-technischen Literatur seit dem Ende des 18. Jahrhunderts. (1974), which does not find the same trend:
Year | wps
1770 | 24.50
1800 | 25.54
1850 | 32.00
1900 | 23.58
1920 | 22.72
1940 | 19.60
1960 | 19.90
This data on scientific writing starts lower than any of your English examples from that time, and increases initially, but arrives...
Reading this article made me immediately think about Russian literature, particularly from the 19th century. Mostly because of my background and how I am still working on adapting to English in my speaking and writing. Russian authors of that time are famous for their elaborate, intricate, and syntactically rich sentences. This isn't unique just to Tolstoy, Dostoevsky, Turgenev, and even Bulgakov later on. The Russian literary tradition is about constructing entire emotional and philosophical worlds within a single sentence where thoughts cascade into one ...
A useful direction for further research would be to analyze fiction and non-fiction as separate corpora. Longer sentences may reflect a tendency towards hypotaxis, and they may also be a deliberate stylistic choice for fiction writers, perhaps setting a more languid and relaxed tone not appropriate for much non-fiction.
I think the point about the Internet enabling a "wordier style" due to lower printing costs actually gets things backwards.
What actually matters is the competition for attention. Consider that as the barrier to entry to publishing has dropped, the number of suppliers has exploded, while the number of consumers has grown much more slowly. That means there's far more supply competing for limited attention. This creates enormous selection pressure for ideas to be consumable, instantly engaging, and spreadable.
I think eloquence and beautiful language rightfull...
Having had my writing criticized for decades on account of an allegedly overly hypotactic style, with parentheticals and qualifiers inserted freely, sometimes to multiple levels, I think it's safe to say I am attempting to reverse this trend single-handedly.
Pros: less to hold in your head at once, letting you focus on the content rather than keeping the words straight. (The longer the worse, and using different languages also makes this harder)
Cons: writers have less stylistic space in fewer words.
Sentences should be shorter rather than longer, except where there is good reason (keeping out the less intelligent, or stylistic reasons).
This is an interesting question and you have made many pertinent points, but it remains unclear to me why a move from listening to silent reading creates selective pressure for styles that can be received and understood quickly. If that is an advantage in silent reading, why less so for the same words spoken? After all, listening seems to be burdened with a few additional barriers to comprehension, such as in disambiguating homophones and the inability to skip backwards and re-hear what was just said.
The preference for brevity in telegraphy and newspapers ...
Sentence lengths have declined. The average sentence length was 49 words for Chaucer (died 1400), 50 for Spenser (died 1599), 42 for Austen (died 1817), 20 for Dickens (died 1870), 21 for Emerson (died 1882), 14 for D.H. Lawrence (died 1930), and 18 for Steinbeck (died 1968). J.K. Rowling averaged 12 words per sentence (wps) writing the Harry Potter books 25 years ago.
So the decline predates television, the radio, and the telegraph—it’s been going on for centuries. The average sentence length in newspapers fell from 35wps to 20wps between 1700 and 2000. The presidential State of the Union address has gone from 40wps down to under 20wps, and the inaugural addresses had a similar decline. (From Jefferson through T. Roosevelt, the SOTU address was delivered to Congress without any speech, and print was the main way that inaugural addresses were consumed for most of their history.) Warren Buffett’s annual letter to shareholders dropped from 17.4wps to 13.4wps between 1974 and 2013.
SlateStarCodex’s ten recommended blog posts have 22wps. My own top 10 posts have 20wps. Even top medical journals have under 25wps. The FAA, the European Commission, and various legal institutions have style guides recommending that writers stay under 20wps. Skimming r/writing, it looks like people recommend 10-15wps for fiction (HPMOR has 15wps). It’s possible that sentence lengths will stop declining only when we hit a physical limit on how short sentences can reasonably become. The best-selling hardboiled novella The Postman Always Rings Twice (1934) has 11wps, while I saw one source claiming that Jurassic Park (1990) has only 9wps.
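For readers who want to check numbers like these against their own texts, a rough way to measure words per sentence is sketched below. The splitting rule (a sentence ends at any run of '.', '!' or '?') and the function name `words_per_sentence` are assumptions made for illustration, not the method behind the figures above.

```python
import re

def words_per_sentence(text):
    """Average number of whitespace-separated words per sentence."""
    # Naive rule: a sentence ends at any run of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    total_words = sum(len(s.split()) for s in sentences)
    return total_words / len(sentences)

sample = "Shorter sentences are better. Why? Because they communicate clearly."
print(words_per_sentence(sample))  # three sentences of 4, 1 and 4 words -> 3.0
```

This rule treats abbreviations such as "Mr." or "J.K." as sentence ends, so it will understate sentence length in text that uses them. A serious measurement would need a proper sentence tokenizer and a decision about how to count semicolons, dialogue, and headings.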
Several explanations present themselves for why sentence lengths have decreased. They aren’t mutually exclusive; it could be that all of them contributed.
One is that the average reader could have been smarter in the past, because literacy used to be more limited.
Full literacy didn’t appear until the turn of the 20th century in England. America had an earlier rise in literacy and the vast majority of free men could read by the 1800s, though like England it took until the 1900s to reach full literacy. It does seem broadly true that sentence lengths are higher in areas with more advanced readers; Stuart Little, the 1945 children’s book quoted at the top, has 13wps, while scientific journals often have 25wps. On the other hand, sentence lengths continued to decline throughout the 1900s, well after we reached full literacy.
Another theory is that journalists inspired a terser style. The newspaper industry grew throughout the 19th century and they saved money when they used fewer words. Many great American writers like Twain, Whitman, Hemingway, and Steinbeck were journalists and influenced by newspaper style. There are whole grammatical structures like the appositive noun phrase (the part set off by commas in “Mr. Smith, a Manhattan accountant, said…”) that are associated with newspapers and clearly have brevity in mind.
Another theory has to do with a transition from reading aloud to reading silently. Reading texts aloud to a group continued as a social practice into the Victorian era, and illiterates would even pay to listen to readings of Dickens. Works written up to this period would have often been written with listeners in mind. An interesting 2008 paper discusses how Dickens in particular uses punctuation and other markers to help orators read his novels. But eventually it became most common to read silently and one consequence was that punctuation became standardized on syntactic (i.e. grammatical) rather than prosodic grounds. I’m not sure if it follows that sentence lengths would also go down. Spoken language is surprisingly complex and actually contains more subordinate clauses than professional/academic writing. For example, I found some transcripts of interviews from Brandon Sanderson—a popular fantasy author whose Stormlight Archive series averages only 9 words per sentence—and measured his extemporaneous speech at ~20 words per sentence (and that includes a bunch of short sentences like “Yeah” or “I don’t know”).
The simplest theory is just that shorter sentences reflect better writing. When you see those ratings of a text’s reading difficulty in terms of a 4th-grade reading level or 10th-grade reading level and so on, those ratings are based on the Flesch-Kincaid readability tests, each of which is just a weighted sum of the text’s words-per-sentence and syllables-per-word measures. On the Flesch Reading Ease scale, where roughly ten points separate adjacent grade bands, a drop of about one grade level in readability thus comes from ~10 additional words per sentence or ~0.11 additional syllables per word. Studies invariably show that sentences with fewer words are easier for readers to understand quickly.
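To make the weighted sum concrete, here are the two standard Flesch formulas written out as a small sketch. The coefficients are the published ones; the function names and the sample inputs (20 versus 30 words per sentence at 1.5 syllables per word) are chosen for illustration only. The ~10 words and ~0.11 syllables figures above line up with the Reading Ease coefficients, where roughly ten points separate adjacent grade bands at the easier end of the scale.

```python
def flesch_reading_ease(wps, spw):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * wps - 84.6 * spw

def flesch_kincaid_grade(wps, spw):
    """Flesch-Kincaid grade level: higher values mean harder text."""
    return 0.39 * wps + 11.8 * spw - 15.59

# Hypothetical comparison: same vocabulary, sentences ten words longer.
short_text = (20, 1.5)
long_text = (30, 1.5)

ease_drop = flesch_reading_ease(*short_text) - flesch_reading_ease(*long_text)
grade_rise = flesch_kincaid_grade(*long_text) - flesch_kincaid_grade(*short_text)

print(round(ease_drop, 2))   # 10.15 Reading Ease points lost
print(round(grade_rise, 2))  # 3.9 grade levels gained
```

The two scales disagree about how much a word costs: under the grade-level formula, ten extra words per sentence works out to about 3.9 grades, not one. Either way, sentence length moves the score directly, which is why shorter sentences always rate as more readable on these measures.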
Others have suggested this for a long time; in one of the earliest analyses of sentence length, Lucius Sherman in Analytics of Literature (1893) wrote that the “heaviness” of sentences also decreased over time as sentence lengths decreased, and that Elizabethan writers “are prevailingly either crabbed or heavy … ordinary modern prose, on the other hand, is clear, and almost as effective to the understanding as oral speech.”
Part of this was because older writers affected a Latinate style. The “periodic sentence,” which saves the main clause for the end after multiple dependent clauses are presented first, was common and exemplified in the extreme by writers like Samuel Johnson and Henry James. Consider the Stuart Little quote at the top: the main clause “Stuart stopped to get a drink of sarsaparilla” is preceded by a prepositional phrase “in the loveliest town of all” and four lengthy dependent clauses starting with “where.” This Latinate style included a preference for hypotaxis (connecting clauses with conjunctions or relative pronouns) over parataxis (presenting clauses sequentially without subordination):
It seems like the improved-readability effect provides most of the explanation. As more readers appeared and read more often (and read silently), selective pressure increased for styles that could be read and understood quickly. The telegraph and newspapers encouraged brevity as well. In principle, you could imagine that the Internet would have enabled a wordier style because it removed the financial costs of physically printing more words, but any effect like that hasn’t overcome the other trends.