Rapid advances in AI have led to increasing automation of software production[1], particularly through agentic coding tools like Claude Code. Open source projects like Gas Town are now attempting to fully automate the software development process. However, I argue that this is economically[2] infeasible for the vast majority of production ("enterprise") use cases today. Beyond an inflection point, the marginal cost of further automating software production rises steeply enough that additional automation becomes detrimental at current levels of model intelligence. For most medium to large companies, it is not financially prudent to attempt to automate the entire software production process.
When building a factory, planners must decide on the optimal ratio of labor (people) to capital (machines) for production. Even when full automation is technically possible, the marginal cost of those final steps is so high that it remains more efficient to automate only partially, retaining labor for the activities where humans hold a comparative advantage over machines. This may be counterintuitive for engineers, who have a drive to automate as many tasks as possible.
We can apply the same framework of labor-capital splits to software production. Here, labor refers to the activities of all people involved in producing software, most notably software engineers. I’ll use capital to mean any non-labor asset used in the production of software. For simplicity’s sake I won’t distinguish between a company’s own capital assets and the assets it rents. So I consider Claude Code capital, even though it would show up as an operating expense on the income statement rather than as a capital investment on the balance sheet.
Software development is traditionally a labor-intensive activity, even if the output of development, software, is a capital asset that can be leveraged across a huge consumer base. Producing software has long involved teams of product managers, quality assurance testers, support agents, designers, and, of course, programmers. That is not to say that the labor mix in software production has been static, or that software development has not become more efficient over time. The past decades saw increasing levels of automation in software operations, mostly thanks to the advent of cloud computing and the associated DevOps and SRE practices, which eliminated or reduced many traditional IT and ops positions (like database administrators). Yet labor has remained essential for the actual production of software.
Just as the mechanical loom in the Industrial Revolution shifted production away from textile workers, Claude Code in the AI revolution allows us to shift software production from labor (programmers and the rest) to capital (rented or otherwise).
Epochal shifts in modes of production do not complete overnight. Capital and labor are not a binary choice but a ratio that a company must calibrate to optimize its balance sheet. Consider the automation of software operations: the cloud did not immediately eliminate traditional servers. Advances in virtual machines and containerization spread slowly through the software community, while new roles and practices gradually evolved to capitalize on those changes. Even today, with serverless technology broadly adopted, we have not unified around a single kind of compute or storage layer, but make tradeoffs in how thoroughly we adopt serverless abstractions. We should consider agentic development in the same light: not as a binary, but as a series of tradeoffs over the degree of automation that is desirable.
So what is the optimal degree of software automation today? I argue that the optimal labor-capital split is a function of model intelligence, where intelligence is a catch-all term for reasoning, instruction compliance, and so on. Beyond this ratio there are no further marginal gains in efficiency, and companies experience diminishing, then negative, returns on further automation of software production. Here’s an illustrative chart of the phenomenon.
Purely illustrative, though shape and optima approximate my own best guess
Remember that “software production” refers to the entire set of activities necessary to produce software, not just writing code.
Please don’t anchor too hard on the specific numbers I’ve proposed, which are more vibes than rigor. I’ve calibrated my guess against the Stanford study citing productivity gains of around 10-40% depending on the complexity of the task (the study predates the latest coding models). The initial productivity gains of agentic coding are strong (consider the automation of rote activities and greenfield projects), but they taper off into a wide trough of mildly differentiated gains. This is where many software developers are spending their time today, experimenting with different agentic coding strategies. The marginal costs are similar in the region around the optimum, so there is only a small penalty for automating a little too far. But beyond that the costs climb toward a vertical asymptote: you will hit a wall if you try to automate your entire workflow.
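If it helps to see that shape as a formula rather than a chart, here is a minimal sketch in Python. Every constant in it is an assumption I’ve invented to echo the curve described above (fast early gains, a wide flat region near the optimum, a wall near full automation); none of them are estimates of real quantities.

```python
import math

def net_productivity_gain(automation: float) -> float:
    """Toy curve shaped to match the illustrative chart.

    `automation` is the fraction of software production handed to agents (0..1).
    All constants are made up for illustration.
    """
    gains = 0.40 * (1 - math.exp(-4 * automation))  # diminishing returns on automation
    costs = 0.02 * automation / (1 - automation)    # costs blow up near full automation
    return gains - costs

for a in (0.1, 0.3, 0.5, 0.7, 0.9, 0.99):
    print(f"automation {a:.0%}: net gain {net_productivity_gain(a):+.0%}")
# automation 10%: +13%   automation 50%: +33%   automation 90%: +21%
# automation 30%: +27%   automation 70%: +33%   automation 99%: -159%
```

Note the wide plateau between roughly 50% and 70% automation, where the penalty for missing the optimum is small, and the collapse as automation approaches 100%.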
Today’s models are great at one-shotting discrete tasks, but they still fall over when faced with complexity and expanding requirements. It is well understood that filling the context window rapidly reduces the quality of current models. But, in the context of software development, I would add that even minor, seemingly harmless deviations from instructions and design principles cascade into core technical problems. I naively reason about this as an issue of compounding errors. Suppose Claude Code is 5% “off” expectations for every changeset, meaning it very slightly deviates from requirements or design principles. Without manual feedback or review, Claude will push your codebase another 5% off course with every subsequent change. After 5 iterations you would have deviated roughly 28% from expectations; after 10 iterations, roughly 63%. In practice error rates probably do not compound so neatly, but it is true that without human supervision vibe-coding comes to a halt under the weight of its own technical debt. Agentic coding continually adds entropy to the codebase. So small gains in adherence and reasoning can yield massive gains in the number of changesets feasible without human intervention.
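To make the compounding concrete, here is the naive model in a few lines of Python; the 5% per-changeset drift is purely illustrative, not a measured error rate.

```python
def cumulative_deviation(per_change_error: float, iterations: int) -> float:
    """Naive compounding model: each unreviewed changeset drifts a further
    per_change_error away from requirements and design principles."""
    return (1 + per_change_error) ** iterations - 1

# Illustrative only: a 5% per-changeset drift, left unreviewed, compounds quickly.
for n in (1, 5, 10, 20):
    print(f"after {n:>2} changesets: {cumulative_deviation(0.05, n):.0%} off expectations")
# after  1 changesets: 5% off expectations
# after  5 changesets: 28% off expectations
# after 10 changesets: 63% off expectations
# after 20 changesets: 165% off expectations
```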
The relative cost of automating software production is sensitive to the fees charged by Anthropic and its competitors. We are all beneficiaries of a competitive market with large subsidies. I did not find any public numbers confirming the “true cost” of a typical Claude Max plan, or any company’s real per-token cost for a SOTA reasoning model. We do have reports that in 2025 Anthropic suffered a $5.2 billion loss against $9 billion in ARR, while OpenAI expected to spend $22 billion against $13 billion in sales. With losses of that scale it seems reasonable to expect that real costs for Claude Code users are at least 2x what we pay, and if we further assume that a small percentage of Claude Code users consume the vast majority of tokens, the real costs for those heavy users are likely much greater. That said, I’d guess there are few hard technical limits on optimizing the inference layer of any given model, so Anthropic and its competitors could likely engineer away some of the cost problem if they were compelled to chase profitability. When making critical budget and staffing decisions, engineers and business leaders would be prudent to keep in mind that these prices might someday rise to better reflect real costs.
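As a back-of-the-envelope check, the reported figures above imply blended cost-to-revenue multiples somewhat below 2x; the jump to “at least 2x for Claude Code users” rests on my assumption that heavy agentic users consume a disproportionate share of tokens, which I cannot quantify.

```python
def cost_multiple(revenue_billions: float, loss_billions: float) -> float:
    """Implied spend per dollar of revenue, assuming loss ~= spend - revenue."""
    return (revenue_billions + loss_billions) / revenue_billions

print(f"Anthropic 2025 (reported): ~{cost_multiple(9, 5.2):.1f}x revenue")       # ~1.6x
print(f"OpenAI 2025 (reported):    ~{cost_multiple(13, 22 - 13):.1f}x revenue")  # ~1.7x

# These are blended averages across all revenue. If a small share of Claude Code
# users drive most token consumption, their individual multiple sits well above this.
```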
I’ve charted the labor-capital split relative to a pre-AI baseline of software output (say, late 2023). The bull case for the labor market is that overall software output will grow apace with the new efficiencies. The low cost of software production creates competitive pressure to add new features, develop new products, and automate increasing amounts of the economy. So even if fewer software engineers are needed relative to overall software output, the absolute amount of software produced increases so much that the number of software engineers employed remains roughly constant. As long as engineers, or humans generally, retain some comparative advantage over AI, there is no economic incentive toward full automation, and the labor market could remain steady.
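A toy bit of arithmetic makes the bull case explicit: headcount stays flat only if demand for software grows as fast as output per engineer. The multipliers below are invented for illustration.

```python
# Toy arithmetic for the bull case. All numbers are invented for illustration.
baseline_output = 100        # arbitrary units of software shipped pre-AI
baseline_engineers = 100     # engineers employed at baseline productivity

productivity_multiplier = 3  # assume agents triple output per engineer
demand_multiplier = 3        # assume cheaper software triples what gets built

output_per_engineer = (baseline_output / baseline_engineers) * productivity_multiplier
engineers_needed = (baseline_output * demand_multiplier) / output_per_engineer
print(engineers_needed)      # 100.0 -> headcount holds only if demand keeps pace
```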
I fear the bull case may prove to be technically true and substantively false.
First, there is an inherent lag between the release of a new model, the subsequent development of new automation techniques, and the identification of the new optimum. The continued rapid pace of change, and the various incentives to manage costs, mean that many (if not most) companies will over- or underestimate the labor and capital investment necessary to reach the optimal labor-capital split for a given level of model intelligence. An important takeaway from my weeks experimenting with open source libraries and other tools is that generic solutions are very unlikely to substitute for a company’s unique needs at the frontier of capabilities; they are incomplete or incorrectly generalized. Even if the initial gains are achieved through Claude Code (rented capital), reaching an optimal labor-capital ratio will require a lot of bespoke development. Executives who fail to understand this may implement hiring freezes and premature layoffs and generally fail to seize productivity gains, or do the opposite: over-invest in AI automation and fail to reallocate labor according to the comparative advantage humans retain over AI.
I am particularly concerned that most executives will interpret AI as an opportunity to shift resourcing away from engineering toward traditional business functions like finance and marketing, when the most efficient use of labor may actually be to have engineering automate internal business processes and move many traditional roles toward new, AI-enhanced positions.
The second risk I see is the declining marginal utility of software. It may be that we have a near-infinite number of processes to automate, in which case the long-term risk is small. But, as with the efficient allocation of labor, the real bottleneck is our own ability to identify new applications of software as marginal costs drop and capabilities increase. So far, it seems to me, we have seen very few practical innovations using generative AI outside the models’ own chat interfaces and software development. Those advances alone are huge, but they suggest we are already struggling to invent products on top of the current frontier, let alone the frontier of tomorrow.
The third problem, and the most profound for our lifetimes, is that we have no guarantee that humans will retain any comparative advantage over AI. It is of course comforting to cling to this idea, and we could feasibly retain important comparative advantages for years and years to come. But if we extrapolate into the coming decades, I would anticipate we reach a point where humans have no meaningful comparative advantage over AI and are no longer needed in the production of software or any other asset.
[1] Throughout I will refer to "software production" rather than software development, to distinguish the former as the entire set of activities required to produce software (not just writing code).
[2] I am also very skeptical that it is technically achievable at current levels of model intelligence. Maybe with enough time and effort, we could Rube-Goldberg our way to ~90% automation of software production. But today's model intelligence provides a hard limit on the quality of product and test specifications, necessitating some amount of human intervention.