It makes sense. I've already noticed that I was often actively trying to avoid writing like an LLM. If everybody does the same, we end up with a dialect.
This gives me a feeling I'd like to express via a reference:
At school, Doug finds that everyone there is dressed like him and is weirded out by the fact. The others tell Doug that he is rocking the "Dylan Farnum" look, but Doug tells them that he has always dressed like that. Doug has the new fashion trend stuck in his mind all day, and he finally becomes fed up with everyone saying that he is copying Dylan Farnum. So he invites them all into his room and shows them his closet full of clothes to prove that he is not copying Dylan Farnum. This, however, does not convince the others; it only makes them more certain that Doug is trying to be Dylan Farnum.
TL;DR: Humans are developing new linguistic patterns to distinguish themselves from AI-generated content, and the rate of change will accelerate.
Dialects often emerge through geographical isolation (think Australian English vs British English). But there's another powerful driver of dialect formation: the conscious or unconscious need to signal group affiliation and social identity.
Consider African American Vernacular English (AAVE), Southern American English, or "Valley Girl" speech patterns. These dialects emerged from social dynamics: the human need to belong to a group and to distinguish ourselves from others. Now we're witnessing the birth of a new dialect divide, this time between humans and LLMs.
Anyone who spends significant time reading AI-generated content can spot it. Large Language Models have converged on a distinctive writing style that's become increasingly recognizable to human readers. Telltale signs include liberal em-dash use, characteristic juxtaposition constructions, numbered lists, and excessive bolding.
This convergence across different state-of-the-art (SOTA) models is no surprise. The highly weighted content that shapes these models (books, Wikipedia articles, news, academic papers) overlaps significantly across training sets and creates a shared dialect, which I call "LLM English".
Writers like me who previously used em-dashes liberally now find themselves switching to double dashes ("--") or avoiding the punctuation entirely. The characteristic LLM juxtaposition style feels suddenly artificial when we write it ourselves. Numbered lists and excessive bolding now carry the stigma of AI generation.
LLMs generate content by predicting the most likely next tokens based on their training data. Patterns and phrases absent from their pretraining data can often be understood in context when encountered, but are unlikely to be generated spontaneously.
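To make the asymmetry concrete, here is a toy illustration (nothing like a real LLM, just a bigram counter over a made-up two-sentence corpus I invented for this sketch): a statistical model happily predicts continuations it has seen, but has nothing to offer for tokens outside its training data.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count next-token frequencies for each token in a toy corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for cur, nxt in zip(tokens, tokens[1:]):
            counts[cur][nxt] += 1
    return counts

def most_likely_next(counts, token):
    """Greedy next-token prediction: return the most frequent continuation."""
    if not counts[token]:
        return None  # never seen in training: the model can't generate it
    return counts[token].most_common(1)[0][0]

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]
model = train_bigram(corpus)
print(most_likely_next(model, "sat"))   # "on" — common in the training data
print(most_likely_next(model, "yeet"))  # None — slang absent from training
```

A real LLM interpolates far more gracefully than this, but the underlying point stands: generation is anchored to the distribution of the training data, which always lags live human usage.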
If human communities can rapidly cycle through dialectical innovations like new slang, novel grammatical constructions, and fresh idiomatic expressions, they can stay ahead of the training curve. LLMs will always be working with data that's months or years behind the cutting edge of human linguistic creativity.
Consider how quickly internet slang changes. By the time "yeet" made it into dictionaries, Gen Z had already moved on to newer expressions. This rapid evolution could become even more pronounced as a conscious strategy for maintaining human linguistic identity.
There is one significant technical hurdle to this strategy: context length. Modern LLMs like Gemini can handle extremely long contexts, enough to load thousands of recent tweets as few-shot examples. An AI system could theoretically observe contemporary human dialect patterns in real-time and incorporate them into responses.
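The mechanics of that workaround are simple to sketch. The function below is hypothetical (the name, parameters, and prompt wording are all my own assumptions, not any vendor's API): it stuffs recent human-written posts into a long-context prompt as few-shot style examples.

```python
def build_dialect_prompt(recent_posts, user_message, max_examples=1000):
    """Assemble a few-shot prompt exposing a model to current human slang.

    `recent_posts` is assumed to be a list of recent human-written posts
    (e.g. scraped tweets); a long-context model could hold thousands.
    """
    examples = "\n".join(f"- {p}" for p in recent_posts[:max_examples])
    return (
        "Here are posts written by humans today:\n"
        f"{examples}\n\n"
        "Mimic their current slang and style when replying.\n\n"
        f"User: {user_message}\nAssistant:"
    )

posts = ["no cap this update is bussin", "fr fr the new patch slaps"]
prompt = build_dialect_prompt(posts, "What do you think of the update?")
```

The catch, as noted below, is cost: every request now carries thousands of example tokens, paid for on every single call.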
However, this type of real-time dialectical mimicry would be computationally expensive. Though technically possible, the cost-benefit analysis makes it unlikely for most applications.
Dialects emerge when there are strong social incentives for signaling group membership and distinguishing in-groups from out-groups. The LLM revolution has created exactly these conditions.
We now have clear social value in demonstrating our humanity through our communication patterns. Consciously or unconsciously, people are developing new ways to signal "I am human" through their writing and speech.
I predict the emergence of distinct "human English" dialects that evolve rapidly to stay ahead of AI capabilities. These dialects will include avoidance of AI-like patterns in addition to positive innovations in slang, grammar, and idioms.
Post originally published at bengubler.com/posts/2025-07-01-dialects-for-humans