I read your title and thought "exactly!". I then read your post, and it was pretty much exactly what I expected after reading the title. So, ironically, it seems like you perfectly compressed the state of your mind into a few words. :) But to be fair, that's probably mostly because we've had very similar experiences, and it doesn't translate to human<->LLM communication in general.
When vibe-coding, many things go really fast, but I often end up in cases where the change I want is very nuanced, and I can tell that just blurting it out would cause the LLM to do something different from what I have in mind. So I sometimes have to write something like five paragraphs to describe one relatively small change. Then the LLM comes up with a plan, which I have to read, which again takes time, and sometimes there are one or two more details to clear up. It's a whole process, and all of it would happen naturally, without me even noticing, if I were writing the code myself.
A year ago I wrote a post in a somewhat similar direction, but the recent months of vibe coding with Opus 4.5 have given me a new appreciation for all the bottlenecks that remain. Once "writing code" is automated (which is basically now), programmers aren't instantly replaced (evidently); we just move on to the next bottleneck below. So the average programmer will maybe be sped up by some percentage, with only extreme outliers getting a multiple-fold increase in output, and the rest of the work merely shifts toward different things. It's still kind of mind-blowing to me that that's how it is. Perhaps it gets "solved" once the entire stack, from CEO to PM to testers to programmers, is AIs. But then they would also have to communicate with each other (and sometimes with themselves, until continual learning is solved) via not-perfectly-efficient means, and would still run into these coordination-overhead issues. I guess all that overhead is less notable when the systems themselves run at 100x our speed and work 24 hours a day.
Just a silly idea: If many people start using LLMs, and as a result of that learn to better translate their intuitions into explicit descriptions... perhaps this could help us solve alignment.
I mean, a problem with alignment is that we have some ideas of what is good but can't make them explicit. Maybe the reason is that in the past we had no incentive to become good at expressing our ideas explicitly; instead, we had an incentive to bullshit. But when everyone uses LLMs to get things done, that will create an incentive to express your ideas well, so that the LLM can implement them more properly.
That's interesting. I had the same problem a while ago, and what I did was take a lot of pictures of the room where I live and tell the LLM:
"Here are many pictures of the room where I live. You have to infer my intentions just from what you can identify in the pictures."
And this worked pretty well, because my room reveals a lot about my goals and my way of living.
This works much better than language prompts because a single picture contains much more information than a single sentence. I'm not sure how many tokens a picture takes up on average, but I just tested it with Gemini 3 Pro in AI Studio, and it was about 1,000 tokens. Writing 1,000 tokens of language takes a lot of time; taking and uploading a photo takes only about 10 seconds.
So if you were to take 10 pictures of your room, that would deliver 10,000 tokens to the LLM in only about 100 seconds. And photos of your room are not useless information: every object in your room, and its spatial relationship to the other objects, says a lot about your intentions.
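To make the throughput comparison concrete, here is a back-of-the-envelope sketch. The ~1,000 tokens per image and ~10 seconds per photo are the figures from above; the typing speed (~40 words per minute) and tokens-per-word ratio (~1.3 for English) are my own rough assumptions, not measurements.

```python
# Rough comparison of information-transfer rates: photos vs. typing.
# Assumptions are labeled; only the image figures come from the text above.

TOKENS_PER_IMAGE = 1_000   # measured informally with Gemini 3 Pro (see above)
SECONDS_PER_PHOTO = 10     # take + upload one photo (from the text above)
WORDS_PER_MINUTE = 40      # assumed typing speed
TOKENS_PER_WORD = 1.3      # assumed English tokenization ratio (rule of thumb)

photo_rate = TOKENS_PER_IMAGE / SECONDS_PER_PHOTO       # tokens/second via photos
typing_rate = WORDS_PER_MINUTE * TOKENS_PER_WORD / 60   # tokens/second via typing

print(f"photos: {photo_rate:.1f} tokens/s")    # 100.0 tokens/s
print(f"typing: {typing_rate:.2f} tokens/s")   # 0.87 tokens/s
print(f"ratio:  {photo_rate / typing_rate:.0f}x")
```

Under these assumptions, photos move tokens into the context roughly two orders of magnitude faster than typing, though of course not every image token is as targeted as a deliberately written one.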
Of course, it's important to note that uploading these pictures to the servers of various AI companies is a rather large privacy risk. And once an AI becomes very intelligent and does not really care about your well-being, it can use this information to pursue goals that diverge from yours.
So only try this with local models.
I write specialized data structure software for bioinformatics. I use AI to help with this on a daily basis and find that it speeds up my coding by quite a bit. But it's not a 10x efficiency boost like some people are experiencing, and I've been wondering why. Of course, it could just be a skill issue on my part, but I think there is a deeper explanation, which I want to try to articulate here.
In heavily AI-assisted programming, most time is spent trying to make the AI understand what you want, so it can write an approximation of it. For some people, most of the work has shifted from writing code to writing requirement documents for the AI and watching over it as it executes. In this mode of work, we don't write solutions; we describe problems, and the limiting factor is how fast we can specify.
I want to extend this idea one step deeper. I think that the bottleneck is actually in synchronizing the internal state of my mind with the internal state of the LLM. Let me explain.
The problem is that there is a very large context in my brain that dictates how the code should be written. Communicating this context to the AI through language is a lot of work. People are creating elaborate setups for Claude Code to get it to understand their preferences. But the thing is, my desires and preferences are mostly not stored in natural-language form in my brain. They are stored in some kind of native neuralese specific to my own mind. I cannot articulate my preferences completely and clearly. Sometimes I'm not even aware of a preference until I see it violated.
The hard part is transferring the high-dimensional and nuanced context in my head into the high-dimensional state of the LLM. But these two computers (my brain and the LLM) run on entirely different operating systems, and the internal representations are not compatible.
When I write a prompt for the AI, the AI tries to approximate what my internal state is, what I want, and how I want it done. If I could encode the entirety of the state of my mind in the LLM, I'm sure it could do my coding work. It is vastly more knowledgeable, and faster at reasoning and typing. For any reasonable program I want to write, there exists a context and a short series of prompts that achieves that.
But synchronizing two minds is a lot of work. This is why I find that for most important and precise programming tasks, adding another mind to the process usually slows me down.