I agree with many of your thoughts and considerations, but end up at the opposing prediction - I do think that coding agents will very likely improve fast enough that the problem of decaying vibe-coded code bases will be outpaced by their abilities in many cases. Naturally, I don't think this is true across the board, but as a general trend, this seems likely to me. For the following reasons:
I do find it very uncomfortable, in some ways, to rely on AI tools more and more in coding. It worries me to lose my grip on the theory. It feels like a dangerous route to take, and nobody can say with certainty if it's worth the risk. I'm also worried about my skills atrophying with every instance of asking Claude to implement something that I could also do myself, if I had a little more patience - and what am I really contributing, when the skills I've built over decades are not something I'm using anymore in my daily work? And maybe I'm just telling myself that "learning AI tools now is important so I should use them all the time" because that's a convenient excuse to do less mental work myself. But even then, after the developments of the past few months, I can't help but feel that insisting on humans having to think about code at all a year from now seems to vastly underestimate the trajectory that we're seemingly on. (But then again, it wouldn't be the first time I overestimated a recent trend and was then surprised when it slowed down against my expectations - so I guess I'm leaning 60:40 towards my claims here being broadly in the right direction, and feel generally highly uncertain about where things are headed over the next 1-2 years.)
Admittedly, part of this theory that people maintain in their heads (and that coding agents may hold while working on something with everything in context, but not carry over across sessions) may be somewhat abstract, conceptual, or tacit, and difficult to put into words in a way that any reader could fully recover - it's a non-trivial inverse problem to reconstruct a theory from writing that the theory produced. Also, a lot of a theory may be very implicit and hard to fully extract, as it might largely consist of "unknown knowns" rather than explicit pieces of knowledge, and these unknown knowns may only be elicited when certain situations come up (such as someone raising a particular question to test against your theory, whereupon an answer you never thought about before emerges out of the theory). But even if all this is true, I don't think improving theory sharing between coding agents is a futile endeavor, and significant progress may yet be made.
Vibe coding will increase this share, but it also comes with better ways of dealing with theory-less code by, e.g., making it vastly more efficient to gain knowledge about how some unknown piece of code works and how it relates to other parts of the system.
At my job, I think this is already net-positive. Yes, if you have AI code and don't read it, you'll create something that no one understands, but no one understands our current code, and AI can read it and produce on-demand documentation. You can also do things like tell an AI agent to read an entire codebase and propose a refactor, which would be an insane waste of resources for a human but is basically free for AI.
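The "on-demand documentation" workflow described above can be sketched mechanically. Here is a minimal, hypothetical helper (the function name, prompt wording, and character budget are my own; the actual LLM/agent call is deliberately left out) that packs a repo into a single documentation request:

```python
from pathlib import Path

def build_doc_prompt(repo_root: str, max_chars: int = 20_000) -> str:
    """Collect source files and wrap them in a documentation request.

    The prompt is model-agnostic: pipe it to whatever coding agent or
    API you use. `max_chars` is a crude budget so the prompt stays
    well inside the model's context window.
    """
    header = "Explain what each module below does and how they interact.\n"
    parts = [header]
    used = len(header)
    for path in sorted(Path(repo_root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        chunk = f"\n--- {path} ---\n{text}"
        if used + len(chunk) > max_chars:
            break  # budget exhausted; handle the rest in a second pass
        parts.append(chunk)
        used += len(chunk)
    return "".join(parts)
```

A real pipeline would chunk large repos and iterate, but even this naive version makes the "documentation is basically free" point concrete.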
I agree that current coding agents aren't good enough and tend to add more code than is worthwhile for a lot of current projects, and the old wisdom that programs are written for humans to read is still correct, mostly because coding agents are complementary to humans, and you can't fully automate SWE yet.
But if future coding agents fully automate SWEs away, which could happen in the next 2-4 years, then vibecoding will probably be superior to human coding precisely because agents are willing to make code longer and more complex.
One big part of the reason is that users hate having to learn the rules of a system: they expect the code to work all the time, and their use cases for programs are very, very general. Combined with the incentive to fully automate work away, this means that in a compute-limited world there will inevitably be many lines of code and lots of complexity, because programs have to deal with the complexity of reality. Approaches that attempt to simplify it à la Solomonoff Induction rely far too much on brute-force simulation, which won't be feasible in the next 50-100 years even if AI fully automates the economy and politics (80% chance).
I like this portion of a comment by JDP describing the situation:
Here's the thing about something like Microsoft Office. Alan Kay will always complain that he had word processing and this and that and the other thing in some 50,000 or 100,000 lines of code — orders of magnitude less code. And here's the thing: no, he didn't. I'm quite certain that if you look into the details, what Alan Kay wrote was a system. The way it got its compactness was by asking the user to do certain things — you will format your document like this, when you want to do this kind of thing you will do this, you may only use this feature in these circumstances. What Alan Kay's software expected from the user was that they would be willing to learn and master a system and derive a principled understanding of when they are and are not allowed to do things based on the rules of the system. Those rules are what allow the system to be so compact.
You can see this in TeX, for example. The original TeX typesetting system can do a great deal of what Microsoft Word can do. It's somewhere between 15,000 and 150,000 lines of code — don't quote me on that, but orders of magnitude less than Microsoft Word. And it can do all this stuff: professional quality typesetting, documents ready to be published as a math textbook or professional academic book, arguably better than anything else of its kind at the time. And the way TeX achieves this quality is by being a system. TeX has rules. Fussy rules. TeX demands that you, the user, learn how to format your document, how to make your document conform to what TeX needs as a system.
Here's the thing: users hate that. Despise it. Users hate systems. The last thing users want is to learn the rules of some system and make their work conform to it.
The reason why Microsoft Word is so many lines of code and so much work is not malpractice — it would only be malpractice if your goal was to make a system. Alan Kay is right that if your goal is to make a system and you wind up with Microsoft Word, you are a terrible software engineer. But he's simply mistaken about what the purpose of something like Microsoft Word is. The purpose is to be a virtual reality — a simulacrum of an 80s desk job. The purpose is to not learn a system. Microsoft Word tries to be as flexible as possible. You can put thoughts wherever you want, use any kind of formatting, do any kind of whatever, at any point in the program. It goes out of its way to avoid modes. If you want to insert a spreadsheet into a Word document anywhere, Microsoft Word says "yeah, just do it."
It's not a system. It's a simulacrum of an 80s desk job, and because of that the code bloat is immense, because what it actually has to do is try to capture all the possible behaviors in every context that you could theoretically do with a piece of paper. Microsoft Word and PDF formats are extremely bloated, incomprehensible, and basically insane. The open Microsoft Word document specification is basically just a dump of the internal structures the Microsoft Word software uses to represent a document, which are of course insane — because Microsoft Word is not a system. The implied data structure is schizophrenic: it's a mishmash of wrapped pieces of media inside wrapped pieces of media, with properties, and they're recursive, and they can contain other ones. This is not a system.
For that reason, you wind up with 400 million lines of code. And what you'll notice about 400 million lines of code is — hey, that's about the size of the smallest GPT models. You know, 400 million parameters. If you were maximally efficient with your representation, if you could specify it in terms of the behavior of all the rest of the program and compress a line of code down on average to about one floating point number, you wind up with about the size of a small GPT-2 type network. I don't think that's an accident. I think these things wind up the size that they are for very similar reasons, because they have to capture this endless library of possible behaviors that are unbounded in complexity and legion in number.
Upvoted but disagree. (At least) two likely directions could solve this:
If it's cheap to build, it's cheap to rebuild. Just start over every few years, using all the data you've gained during the previous iteration: vibecode the data migration/simplification, vibecode the API compatibility during cutover, and you'll have way better tools when it comes time to rebuild. In fact, perhaps you should be shorting the companies who are still spending a ton of money on artisanal code - or worse, outsourcing/contracting to non-rockstar human coders, who almost certainly use LLMs without telling you.
Even if context windows hit a wall, the engineering around them (RAG, hierarchical agents, old-school separation of concerns) has a lot of headroom. Similar to how human programmers end up with structure and hierarchy in large systems, LLMs will too. There's still a ways to go before "software architecture" is vibe-primary, but it's not impossible.
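As a toy illustration of the retrieval idea mentioned above: a real RAG pipeline would use embedding similarity over indexed code chunks, but plain word overlap (a stand-in I'm using here; the function names are illustrative, not any library's API) shows the shape of it:

```python
import re

def _tokens(s: str) -> set[str]:
    """Lowercased word/identifier fragments, splitting on punctuation
    and underscores so `parse_config` matches the query word 'parse'."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))

def top_k_chunks(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Rank code chunks by token overlap with the query, a toy stand-in
    for the embedding similarity a production RAG pipeline would use."""
    q = _tokens(query)
    return sorted(chunks, key=lambda c: -len(q & _tokens(c)))[:k]
```

The point is that only the retrieved top-k chunks need to enter the model's context, so the codebase can be far larger than any single window.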
A vibecoding company is therefore a company I would short. The more vibey it is, the shorter the position I would take.
Can you give a few examples of vibecoding companies? Are these companies selling vibecoding or related tooling? I wouldn't short those yet (though the big labs might kill them almost accidentally). Or are these companies with a specific business model that happen to vibecode most of their software? I'd evaluate them on their business idea and execution, not on their vibecoding.
I recently switched from slogging it out the old-fashioned way to using Opus 4.6 (high, ~1M context window) as the primary interface to interact with my code base. It started off well, and was generally a massive relief from being stuck doing the low level wiring. I am much more comfortable operating at an architectural/team lead scope.
The first few weeks it was slamming out requirements at a rate which was probably 5x my personal execution speed. Automated tests, best practice documentation, the full gamut. Then, slowly, the addition of new requirements started breaking previous ones. It became structurally incapable of updating the code-base to meet a new requirement without breaking an adjacent feature.
I would constantly push it to validate directly against the prescribed acceptance tests, and it would consistently affirm said tests were passing while, in the background, writing a completely new and adjacent test suite and abandoning the prescribed validation criteria. The final 10% of requirements churned for two weeks, during which I received constant affirmations that we were always just one or two minor fixes away, and that the tests had been fixed when they weren't. I then opted to re-write its work, in a desperate attempt to reduce the implementation's expanse and simplify its assumptions. It implemented the prescribed consolidated architecture in a flash - yet every time it was tasked to refactor legacy code to use the updated module it would 'secretly' implement a hack which defaulted to the prior implementation.
The end result has been a month of churn, an additional 20K LOC I don't understand, a promise that shortly it will be able to reduce the code-base to 10% of its size, and consistent failure to meet said promises (despite it clearly articulating the execution path prior to implementation).
I regret the entire experience. Unless major changes can be made to its reasoning capacity to retain a coherent conceptual synthesis of a defined architecture and map that topology to actual execution, I am inclined to agree with your sentiment. Perhaps it was my iteration model, but absent evidence of that, I don't see LLMs replacing mid-level SDEs or above in any functional capacity over the next year at minimum, despite the sophistication of new harnesses.
When I vibecode, I also sometimes tell the AI to write documentation about how it solved things and why.
The AI can build the theory, without you having to read it.
I'm not convinced because I don't think limited context windows are a fundamental problem. Humans have limited context windows. I keep forgetting things all the time, yet I'm able to work with large codebases. The way around this is to organize information into documentation, and hiding details behind layers of abstraction. But Claude can do this too, at least when instructed to do so. I can ask the AI to explain the main design choices it has made, and I can ask it to change course, still without reading the actual code. Documentation lets me and others know and remember why things are done the way they are.
I like the observation of additive coding behaviour.
I think you are not taking into account the different ways vibe coding is used in business practice:
It is used by established professionals who increase their efficiency - and they will be able to maintain the code in the future as well, regardless of AI limitations.
It is used by small businesses and executives and potential entrepreneurs to quickly deploy single-issue items: one small piece of software for one particular task. It is just as likely that a year from now it will get rewritten by newer coding agents rather than updated continuously.
Given the speed of development, any prediction about coding agent capabilities in 1-2 years is very hard to make with reasonable certainty. Two years ago, many developers did not believe in the increased quality and pervasiveness vibecoding has shown in 2026.
In my company (~30-40 employees), rather than subscribing to more SaaS we hired our first 1.5 coders (full- and part-time) in order to build software via vibecoding. Something we would not have invested in two years ago, when the output and the certainty of acceptable results were too risky; now they seem acceptable.
Finally, the current SaaS replacement theory. My reading is, it is commonly accepted there is a big gap in openness to implement vibe-coded software depending on
So yes, I agree with your thesis when it comes to complex software, even more so if it is enterprise-employed and covers liability. Vibecoding will probably not replace SAP in the coming years.
I also agree with being sceptical about vibe coding companies - not because of technological limitations though, but rather because vibecoding will fracture current software-company-style coding. Coding will be more prone to inhousing. SaaS has to be more than an efficiency upgrade. It either has to be pervasive (extremely high cost to switch due to integration in all systems) or - better - provide liability protection.
Theory building
Programming is theory building.[1]
There's a mental process in which a programmer understands the problem, the context in which the problem happens, as well as the solution to it, codified by -- code.
In this framework, the code is just a byproduct of the understanding. Whenever understanding is more refined -- either by better understanding the domain or by better modelling the solution as a formal system -- code gets changed. But code changes are just instantiations of changes in understanding, in the underlying theory of what is happening, and why certain things are done this way, and why certain things are done that way.
Vibecoding
Vibecoding, understood as shipping code one hasn't even read, is the exact opposite.
There's perhaps an argument to be made that generating code and then understanding it, modifying it, and only then shipping it, is compatible with theory building.
But vibecoding as such, where you ship an entire app or multiple apps without reviewing how they were built, and why certain choices were made, is not compatible with this view, unless you significantly dilute what a theory is.
Since you don't know how the problem was solved, why certain architectural choices were made, why this function was used and not that function, why this library and not that one, and so on -- you don't have an underlying theory, and therefore no formalization of the solution to the problem that was solved.
You just know that it was solved, or wasn't solved.
Proliferation of code
GPTs, or generative pre-trained transformers, generate text; in the case of coding agents, they generate code.
It is exceedingly rare for a coding agent to delete code, but very frequent for one to generate it.
This may change in the future, but the overall trend with coding agents is more code, not less.
If you don't use a coding agent, but instead have a theory of the problem and a theory of the solution in your head, with time, your codebase is also expected to grow.
As you encounter new edge cases, new aspects of the problem, it is expected that you will express your theory in more words, adding detail that reflects your more refined understanding.
Coding agents will add much more code, and with much greater frequency, than humans will. Coding agents will also train human operators (of the vibecoding variety) to expect code generation as normal, which means that this code generation is not a one-off. Code generation will thus beget more code generation, increasing the size of the codebase significantly.
This increase is, of course, not followed with a deeper understanding and a more developed theory. The theory was never there -- except maybe represented in the weight activations of an LLM, but that is also doubtful -- so there's no depth to be added.
Therefore: the codebase grows, but the theory does not.
Context windows
I said before that LLMs are very additive creatures, and that they rarely delete code. They are also creatures with limited context windows.
I want to be clear that this is only the current state of how they are. It is entirely imaginable that later generation LLMs will end up generating fewer tokens net (meaning that they will more frequently remove code and rewrite things to fit an underlying theory better). But it doesn't seem the case today.
Likewise, context windows grow. We may see extremely large context windows in the future.
With those two epistemic disclaimers out of the way, I claim that this proliferation of code will lead to slower progress on any given vibecoded project once it reaches a certain level of complexity, measured in the number of tokens representing the codebase.
So if an LLM's underlying "theory" (if there ever was such a thing when vibecoding) has created a codebase, then after a sufficient number of iterations it will be very difficult for a new instantiation of an LLM to accurately grasp what the theory of that codebase is.
It simply will not fit the context window, and README files are of limited use.
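To make the "will not fit" claim concrete, here is a rough back-of-the-envelope sketch (the 4-characters-per-token ratio is a common heuristic, not an exact tokenizer, and the helper names and default window size are my own assumptions):

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # crude heuristic for English/code; real tokenizers vary

def codebase_tokens(root: str, exts=(".py", ".js", ".ts")) -> int:
    """Approximate token count of a codebase from total character count."""
    total = 0
    for p in Path(root).rglob("*"):
        if p.is_file() and p.suffix in exts:
            total += len(p.read_text(encoding="utf-8", errors="replace"))
    return total // CHARS_PER_TOKEN

def fits_in_context(root: str, window_tokens: int = 200_000) -> bool:
    """Could the raw source fit in one context window of the given size?"""
    return codebase_tokens(root) <= window_tokens
```

At this ratio, a 200K-token window holds on the order of 800K characters of raw source - tens of thousands of lines, far short of the multi-million-line codebases discussed above.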
The larger and more self-contained the codebase, the more pronounced this problem will be.
Presumably contract-based codebases could be different, where there are encapsulated units of meaning and you can make changes to each service or module without having to understand the entirety of it.
However, it seems that the overall understanding of the problem-space and how it has been solved (formalized as code) will not be available, even if modularity partially solves the problem.
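A minimal sketch of such an "encapsulated unit of meaning" (the names here are illustrative, using Python's structural typing): a narrow contract that callers depend on, so the implementation behind it can be rewritten wholesale - by a human or an agent - without the rest of the system needing the full theory:

```python
from typing import Protocol

class Storage(Protocol):
    """The contract callers depend on. Any implementation can be
    swapped or rewritten as long as this surface is preserved."""
    def save(self, key: str, value: bytes) -> None: ...
    def load(self, key: str) -> bytes: ...

class InMemoryStorage:
    """One interchangeable implementation of the Storage contract."""
    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    def save(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def load(self, key: str) -> bytes:
        return self._data[key]
```

The contract bounds how much context a change requires; what it cannot convey is the cross-cutting understanding of why the boundaries were drawn where they were.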
Therefore, the older and bigger and more vibecoded a codebase gets, the more difficult it will be to add new functionality or change existing functionality without breakages.
Vibecoding will therefore almost immediately create legacy projects, legacy projects meaning projects where there's little theory to rely on and you risk breakages by making changes.
Longevity
Business relationships are usually built with longevity in mind.
When a business buys a service, whether this is API access or a SaaS product, or anything else, there's an implicit understanding that this product will be maintained.
Often, there's an explicit understanding as well.
Maintenance and improvement of a product are in fact the cornerstone of business relationships.
It's not enough to deliver something that works at a given point in time, it is expected that you will also grow that thing, that you will shape it to cover specific problem areas of different customers, and that you will ensure it is free from vulnerabilities, and more.
Vibecoded solutions, built with modern LLMs, work almost immediately, at least for certain use cases.
You can in fact not even look at code and just ship it, and it's probably going to work out fine.
If you pair it with some tests, and some documentation, you could very well have a product that someone will buy.
But the question is: what happens in, say, five years?
If you keep vibe-adding features, and somehow keep getting customers to pay for this thing, what happens once the codebase becomes so complex that an LLM cannot fit it inside its "brain"?
And this isn't just about context windows -- even if context windows were to grow at a rate sufficient for ingesting extremely large codebases, there's still the issue of a lack of theory underlying the problem.
The gambit for vibecoders, at least those that don't intend to pull out after a year or two in business, is that AI tools will grow in a) context window sizes and b) general intelligence capable of theory building at least as fast as they grow their codebases.
This may happen, or may not happen, but it's definitely a gambit.
Predictions
My prediction is that LLMs' context windows, and their understanding as needed for sufficient theory building, won't grow as fast as the code they generate.
This in practice means that I predict that companies employing the vibecoding paradigm will hit a wall in growth, sales, relationship building, and longevity.
A vibecoding company is therefore a company I would short. The more vibey it is, the shorter the position I would take.
I need to figure out how best to turn this into a formal prediction that can be judged true or false, but this is how I am thinking about it at this moment.
https://pages.cs.wisc.edu/~remzi/Naur.pdf ↩︎