This is not directly an answer to your question, but it may resolve part of your problem. I have this in my custom instructions: "First 2–3 lines give TL;DR, full explanation follows."
Depending on the question, it often goes beyond 2–3 lines, but even in Deep Research it never goes beyond 10 lines of TL;DR.
Try using third-person language in system prompts, especially with Claude. E.g. "Claude will give concise responses to this user."
Try More Dakka: use a longer system prompt containing a dozen positive/negative examples of desired responses. I used to have one with about five anti-examples of Claude-isms (specific phrases like "The key insight"), which worked pretty well for getting Claude to avoid those phrases. Each time Claude started using a new annoying phrase, I'd add it.
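If you maintain the anti-example list in code rather than by hand-editing the prompt, adding a new Claude-ism is a one-line change. A minimal sketch; the phrases and wording below are illustrative, not anyone's actual prompt:

```python
# Assemble a third-person system prompt from positive examples and a
# running list of banned phrases. Extend BANNED_PHRASES whenever a new
# annoying pattern shows up in responses.

BANNED_PHRASES = [
    "The key insight",
    "That's a fascinating question",
    "It's worth noting that",
]

GOOD_EXAMPLES = [
    'Q: "How do I list only directories?" A: "ls -d */"',
]

def build_system_prompt(banned, good):
    """Return a concise-response system prompt with explicit anti-examples."""
    lines = [
        "Claude will give concise responses to this user.",
        "Claude never uses the following phrases:",
    ]
    lines += [f'- "{p}"' for p in banned]
    lines.append("Examples of desired response style:")
    lines += [f"- {e}" for e in good]
    return "\n".join(lines)

print(build_system_prompt(BANNED_PHRASES, GOOD_EXAMPLES))
```

The resulting string goes into whatever system-prompt field your client or API exposes.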
This doesn't really help you, but I think you're fighting the weights and you're not going to win. Some of this is intentional training, but I'd guess that most of it is that the assistant persona that happens to be useful is entangled with this behavior. Even if you could come up with instructions that push the assistant out of this persona, you will likely make it worse at everything else at the same time.
Some relevant posts: one arguing that Opus talks in a way you'd likely find even more annoying, and that this is probably important to its alignment, and the owl post.
If you really hate this to the point of being willing to write your own code to handle it, my best idea would be to have another model, like Sonnet, summarize every response.
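A minimal sketch of that idea. The model call is left pluggable so you can wire in whichever provider's chat endpoint you use; the function names here are my own, not from any SDK, and the stub below stands in for a real API call:

```python
def summarize_response(response_text, call_model, max_sentences=3):
    """Ask a second model to compress a verbose response.

    call_model: a function taking (system_prompt, user_text) and
    returning the model's reply as a string -- in practice, a thin
    wrapper around your provider's chat-completion endpoint.
    """
    system = (
        f"Rewrite the following answer in at most {max_sentences} "
        "sentences. Keep only the direct answer; drop background, "
        "caveats, and follow-up suggestions."
    )
    return call_model(system, response_text)

# Stub in place of a real API call, so the pipeline shape is testable:
def fake_model(system, text):
    # Crude stand-in: keep only the first sentence.
    return text.split(".")[0] + "."

long_answer = "Use ls -d */. Historically, the ls command dates back to Multics."
print(summarize_response(long_answer, fake_model))  # → Use ls -d */.
```

The same wrapper works for streaming UIs if you buffer the full response first; you pay one extra (cheap) model call per question.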
This isn't really a LessWrong-style post, but I'm getting desperate, and I think the people here are relatively likely to have tips, or at least sympathy.
I'm going insane trying to get the current generation of consumer-facing chat to shut up and answer the question.
I ask a question. Usually a technical question, but not always. Often one that could be answered in a couple of sentences. Usually with a chosen set of relevant information, relatively tersely expressed.
I get back an answer, often the right answer... buried somewhere in a wall of dross. I get background that I couldn't have framed the question without knowing. I get maybe-vaguely-related "context". I get facts conveyed clearly at the top, and then pointlessly repeated at half-screen length further down. I get unasked-for code. All followed by distracting "Do you want me to" suggestions.
The models vary in which bloviation they emphasize, but they all seem to do this. Of the "big three", Claude is probably least annoying.
I have "personalization" prompts describing what I know... but, for example, apparently a CS degree and 30+ years of programming and sysadmin work don't suggest I already know how to create a two-line shell script. I have text telling the model not to praise me, not to say "that's insightful"... but I'll still get "that's a fascinating question" (looking at you, Claude). I have prompts specifically saying to keep it brief, not to go beyond the question asked, not to add step-by-step instructions, and not to give me caveats unless there's a reason to think I might not know them. All of that may help. It does not fix the problem.
I actually asked GPT 5.2 Thinking how I could improve my personalization. It basically said "You've done all you can. You are screwed. Maybe if you put it in every single question." I've tried putting similar stuff in system prompts using APIs; not a lot of effect.
This is madness... and it looks to me like intentionally-trained-in madness. Am I the only one who's bothered by it? Who wants it? Is this really what gets thumbs-upped?
And, most importantly, has anybody found a working way to escape it?
To stimulate discussion, here's the current iteration of my ChatGPT customization prompt. There's a separate paragraph-long background and knowledge description. Some of this works (the explicit confidence part works really well on GPTs). Some of it may work, but I can't be sure. But there seems to be no way to tame the verbosity.