My theory is that the LLM knows what is persuasive and brute-forces it. So we see a ton of repetition (which, ironically, is a costly signal for a transformer, like writing with pen and ink), and we see antithesis. Regarding other persuasive methods (alliteration etc.), more research is needed. It seems to be more and more fine-tuned to what people 'like', possibly through a feedback loop.
What does it mean? It means that we writers are under-using persuasive brute-force methods. We write to sound natural and not specifically persuasive, though in general we do want to be persuasive; this comes from a very human instinct to seem as if we are "not trying". LLMs don't play that self-conscious game. They find the methods, and then use them ad nauseam because they work. Why do we have this self-consciousness, such that we become hyper-aware of AI's use of persuasion?
It works for getting the typical LMArena user to click the like button, but it's not clear that it works for persuasion or anything else. Personally, I find the style very off-putting and usually stop reading when I notice it.
My guess is that they do so in imitation of humans who do the same thing when asked the sorts of questions that people ask LLMs. It's not an LLM thing; it's a thing one does to make distinctions clear, when the other person might otherwise conflate two distinct entities, clusters, or topics. It just so happens that people ask LLMs a lot of that sort of question, and thus elicit a lot of that particular response.
(I also use em dashes, yes.)
More specifically, they may be emulating the Kenyans who were hired to create much of the training data. "I'm Kenyan. I Don't Write Like ChatGPT. ChatGPT Writes Like Me."
Note: I can't verify that the post I linked is legitimate. For all I know it could be generated by ChatGPT instructed to emulate a Kenyan writing about ChatGPT. HN discussion here.
There seem to be common patterns in how LLMs write text that are shared across the LLMs of different companies and that differ from usual human writing.
How much do we know about why LLMs pick certain patterns? Do we know why they use "It's not an X, it's a Y"? If not, might understanding why they pick patterns like that help us better understand how LLMs reason?