This is not directly an answer to your question, but it may resolve part of your problem. I have this in my custom instructions: "First 2–3 lines give TL;DR, full explanation follows."
Depending on the question, it often goes beyond 2–3 lines, but even in Deep Research it never goes beyond 10 lines of TL;DR.
Try using third-person language in system prompts, especially with Claude. E.g. "Claude will give concise responses to this user."
Try More Dakka: use a longer system prompt containing a dozen positive/negative examples of desired responses. I used to have one with about five anti-examples of Claude-isms (specific phrases like "The key insight"), which worked pretty well for getting Claude to avoid those phrases. Each time Claude started using a new annoying phrase, I'd add it.
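If you maintain the anti-example list in code rather than by hand-editing the prompt, adding a new Claude-ism is a one-line change. A minimal sketch; the phrases and wording below are illustrative, not anyone's actual prompt:

```python
# Assemble a third-person system prompt from positive examples and a
# running list of banned phrases. Extend BANNED_PHRASES whenever a new
# annoying pattern shows up in responses.

BANNED_PHRASES = [
    "The key insight",
    "That's a fascinating question",
    "It's worth noting that",
]

GOOD_EXAMPLES = [
    'Q: "How do I list only directories?" A: "ls -d */"',
]

def build_system_prompt(banned, good):
    """Return a concise-response system prompt with explicit anti-examples."""
    lines = [
        "Claude will give concise responses to this user.",
        "Claude never uses the following phrases:",
    ]
    lines += [f'- "{p}"' for p in banned]
    lines.append("Examples of desired response style:")
    lines += [f"- {e}" for e in good]
    return "\n".join(lines)

print(build_system_prompt(BANNED_PHRASES, GOOD_EXAMPLES))
```

The resulting string goes into whatever system-prompt field your client or API exposes.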
This doesn't really help you, but I think you're fighting the weights and you're not going to win. Some of this is intentional training, but I'd guess that most of it is that the assistant persona that happens to be useful is entangled with this behavior. Even if you could come up with instructions that push the assistant out of this persona, you will likely make it worse at everything else at the same time.
Some relevant posts: one arguing that Opus talks in a way you'd likely find even more annoying, and that this is probably important to its alignment, and the owl post.
If you really hate this to the point of being willing to write your own code to handle it, my best idea would be to have another model, like Sonnet, summarize every response.
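A minimal sketch of that idea. The model call is left pluggable so you can wire in whichever provider's chat endpoint you use; the function names here are my own, not from any SDK, and the stub below stands in for a real API call:

```python
def summarize_response(response_text, call_model, max_sentences=3):
    """Ask a second model to compress a verbose response.

    call_model: a function taking (system_prompt, user_text) and
    returning the model's reply as a string -- in practice, a thin
    wrapper around your provider's chat-completion endpoint.
    """
    system = (
        f"Rewrite the following answer in at most {max_sentences} "
        "sentences. Keep only the direct answer; drop background, "
        "caveats, and follow-up suggestions."
    )
    return call_model(system, response_text)

# Stub in place of a real API call, so the pipeline shape is testable:
def fake_model(system, text):
    # Crude stand-in: keep only the first sentence.
    return text.split(".")[0] + "."

long_answer = "Use ls -d */. Historically, the ls command dates back to Multics."
print(summarize_response(long_answer, fake_model))  # → Use ls -d */.
```

The same wrapper works for streaming UIs if you buffer the full response first; you pay one extra (cheap) model call per question.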
This isn't really a LessWrong-style post, but I'm getting desperate, and I think the people here are relatively likely to have tips, or at least sympathy.
I'm going insane trying to get the current generation of consumer-facing chat to shut up and answer the question.
I ask a question. Usually a technical question, but not always. Often one that could be answered in a couple of sentences. Usually with a chosen set of relevant information, relatively tersely expressed.
I get back an answer, often the right answer... buried somewhere in a wall of dross. I get background that I couldn't have framed the question without knowing. I get maybe-vaguely-related "context". I get facts conveyed clearly at the top, and then pointlessly repeated at half-screen length further down. I get unasked-for code. All followed by distracting "Do you want me to" suggestions.
The models vary in which bloviation they emphasize, but they all seem to do this. Of the "big three", Claude is probably least annoying.
I have "personalization" prompts describing what I know... but, for example, apparently a CS degree and 30+ years of programming and sysadmin work don't suggest I already know how to create a two-line shell script. I have text telling the model not to praise me, not to say "that's insightful"... but I'll still get "that's a fascinating question" (looking at you, Claude). I have prompts specifically saying to keep it brief, not to go beyond the question asked, not to add step-by-step instructions, and not to give me caveats unless there's a reason to think I might not know them. All of that may help. It does not fix the problem.
I actually asked GPT 5.2 Thinking how I could improve my personalization. It basically said "You've done all you can. You are screwed. Maybe if you put it in every single question." I've tried putting similar stuff in system prompts using APIs; not a lot of effect.
This is madness... and it looks to me like intentionally-trained-in madness. Am I the only one who's bothered by it? Who wants it? Is this really what gets thumbs-upped?
And, most importantly, has anybody found a working way to escape it?
To stimulate discussion, here's the current iteration of my ChatGPT customization prompt. There's a separate paragraph-long background and knowledge description. Some of this works (the explicit confidence part works really well on GPTs). Some of it may work, but I can't be sure. But there seems to be no way to tame the verbosity.