Since the end of (very weak) training scaling laws
Precisely because the scaling laws are somewhat weak, nothing so far indicates that they are ending (the only sense in which they might end is running out of text data, but models trained on 2024 compute should still have more than enough). The scaling laws have held for many orders of magnitude, and they are going to hold for a bit further. That's plausibly not enough, even with something to serve the role of continual learning (beyond in-context learning on ever larger contexts). But there is still another 100x-400x in compute to go, compared to the best models deployed today. The 100x-400x models will likely be trained in 2029-2031, at which point pre-AGI funding for training systems mostly plateaus. That is (a bit more than) a full step from GPT-2 to GPT-3, or from GPT-3 to the original Mar 2023 GPT-4 (after the original Mar 2023 GPT-4, and with the exception of GPT-4.5, OpenAI's naming convention no longer tracks pretraining compute). And we still haven't seen such a full step beyond the original Mar 2023 GPT-4, only half of a step (10x-25x), out of a total of 3-4 halves-of-a-step (the 2022-2030 training compute ramp, 2000x-10,000x in total: the higher end if the BF16 to NVFP4 transition is included, the lower end if even in 2030 there are no 5 GW training systems and somehow BF16 must still be used for the largest models).
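Here's a back-of-the-envelope check on those step sizes; the multipliers are just the rough ranges quoted above, not precise measurements.

```python
import math

# Back-of-the-envelope check of the step arithmetic above.
# A "full step" (GPT-2 -> GPT-3, or GPT-3 -> Mar 2023 GPT-4) is roughly 100x
# in pretraining FLOPs; a "half-step" is therefore roughly 10x-25x.
full_step = 100
half_steps = (10, 25)

# Estimated total 2022-2030 training compute ramp (lower/higher end from above):
for total_ramp in (2_000, 10_000):
    for h in half_steps:
        n = math.log(total_ramp, h)
        print(f"{total_ramp}x ramp = {n:.1f} half-steps of {h}x each")

# Headroom left relative to the best models deployed today:
for remaining in (100, 400):
    n = math.log(remaining, full_step)
    print(f"{remaining}x remaining = {n:.1f} full steps still to come")
```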
Since the original Mar 2023 GPT-4, models that were allowed to get notably larger and made full use of the other contemporary techniques only appeared in late 2025 (likely Gemini 3 Pro and Opus 4.5). These models are probably sized compute-optimally for 2024 levels of pretraining compute (as in 100K H100s, 10x-25x the FLOPs of the original Mar 2023 GPT-4), might have been pretrained with that amount of compute or a bit more, plus pretraining-scale RLVR. All the other models we've seen so far are either smaller than compute-optimal for even 2024 levels of pretraining compute (Gemini 2.5 Pro, Grok 4, especially GPT-5), or didn't get the full benefit of RLVR relative to their pretraining (Opus 4.0, GPT-4.5), and so in some ways looked underwhelming compared to the other (smaller) models that were more comprehensively trained.
The buildout of GB200/GB300 NVL72 will be complete at flagship-model scale in 2026, making it possible to easily serve models sized compute-optimally for 2024 levels of compute (MoE models with many trillions of total params). More training compute is available now, and will be available in 2026, than there was in 2024, but most of the currently available inference hardware can't efficiently serve models sized compute-optimally for this compute (at tens of trillions of total params); the exceptions are Ironwood TPUs (being built in 2026, for Google and Anthropic) and then Nvidia Rubin Ultra NVL576 (which will only get built in sufficient amounts in 2029, maybe late 2028).
So the next step of scaling will probably come in late 2026 to early 2027 from Google and Anthropic (while OpenAI will only be catching up to the late 2025 models from Google and Anthropic, though of course in 2026 they'll have better methods than Google and Anthropic had in 2025). Training compute will then keep increasing fairly quickly until 2029-2031 (with 5 GW training systems, which is at least $50bn per year in training compute, or $100bn per year in total for each AI company if inference consumes half of the budget). After Rubin Ultra NVL576 (in 2029), and to some extent even Ironwood (in 2026), inference hardware will no longer be a notable constraint on scaling, and once AI companies are working with 10 GW of compute (half for training, half for inference), pretraining compute will no longer grow much faster than the price-performance of hardware, which is much slower than the buildout trend of 2022-2026, and slower even than the likely ramp-off in 2026-2030. I only expect 2 GW training systems in 2028, rather than the 5 GW that the 2022-2026 trend would ask for in 2028. But by 2030 the combination of continuing buildout and somewhat better hardware should still reach the levels that would have been on-trend for 2028, following 2022-2026.
That scenario is not impossible. If we aren't in a bubble, I'd expect something like that to happen.
It's still premised on the idea that more training/inference/resources will result in qualitative improvements.
We've seen model after model getting better and better, without any of them overcoming the fundamental limitations of the genre. Fundamentally, they still break when out of distribution (this is hidden in part by their extensive training, which puts more stuff in distribution without solving the underlying issue).
So your scenario is possible; I had similar expectations a few years ago. But I'm seeing more and more evidence against it, so I'm giving it a lower probability (maybe 20%).
I'm responding to the claim that training scaling laws "have ended", even as the question of "the bubble" might be relevant context. The claim isn't very specific, and useful ways of making it specific seem to make it false, either in itself or in the implication that the observations so far have something to say in support of the claim.
The scaling laws don't depend on how much compute we'll be throwing at training or when; they predict how perplexity depends on the amount of compute. For scaling laws in this sense to become false, we'd need to show that perplexity starts depending on compute in some different way (with more compute). Not having enough compute doesn't show that the scaling laws have broken. Even not having enough data doesn't show this.
For practical purposes, scaling laws could be said to fail once they can no longer be exploited for making models better. As I outlined, there's going to be significantly more compute soon (this remains the case with "a bubble", which might have the power to cut compute by as much as 3x relative to the more optimistic 200x-400x projection for models by 2031, compared to the currently deployed models). Text data is plausibly in some trouble even for training with 2026 compute, and likely in a lot of trouble for training with 2028-2030 compute. But this hasn't happened yet, so the claim of scaling laws "having ended", past tense, would still be false in this sense. Instead, there would be a prediction that the scaling laws will in some practical sense end in a few years, before compute stops scaling even at pre-AGI funding levels. But also, the data efficiency I'm using to predict that text data will be insufficient (even with repetition) is a product of public pre-LLM-secrecy research that almost always took unlimited data for granted, so it's possible that spending a few years explicitly searching for ways to overcome data scarcity will let AI companies sidestep this issue, at least until 2030. So I wouldn't even predict with high confidence that text data will run out by 2030; it's merely my baseline expectation.
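To pin down what's being claimed: the scaling laws here are fits of loss against compute/params/data, not statements about how much compute gets built. A minimal sketch of that, and of where the text-data worry comes from, using the Chinchilla-style parameterization - the constants are the published Hoffmann et al. (2022) fits, and the ~2e25 FLOPs figure for the original GPT-4 is a commonly cited outside estimate, so treat all of it as illustrative rather than as what any lab actually uses:

```python
import math

# Chinchilla-style scaling law: loss (log of perplexity) as a function of
# parameters N and training tokens D. Constants are the published
# Hoffmann et al. (2022) fits; whatever the labs use internally will differ.
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(N, D):
    return E + A / N**alpha + B / D**beta

# Compute-optimal rule of thumb from the same paper: D ~ 20 * N, with C ~ 6 * N * D,
# so for a compute budget C (in FLOPs): D ~ sqrt(C / 0.3).
def optimal_tokens(C):
    return math.sqrt(C / 0.3)

# Illustrative budgets. Assumption: original GPT-4 at ~2e25 FLOPs (a commonly
# cited outside estimate), scaled by the 10x-400x headroom discussed above.
gpt4_flops = 2e25
for mult in (1, 25, 100, 400):
    C = gpt4_flops * mult
    D = optimal_tokens(C)
    N = D / 20
    print(f"{mult:>4}x GPT-4 compute: ~{D/1e12:.0f}T tokens, "
          f"~{N/1e12:.2f}T params, predicted loss {loss(N, D):.2f}")
# At the top end this wants on the order of 100T+ unique-ish tokens,
# which is where the text-data worry above comes from.
```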
It's still premised on the idea that more training/inference/resources will result in qualitative improvements.
I said nothing about qualitative improvements. Sufficiently good inference hardware makes it cheap to make models a lot bigger, so if there is some visible benefit at all, it will arrive at the pace of the buildouts of better inference hardware. But also, conversely, if there isn't enough inference hardware, you physically can't serve something as a frontier model (for a large user base) even if it offers qualitative improvements, unless you restrict demand (with very high prices or rate limits).
So your scenario is possible; I had similar expectations a few years ago. But I'm seeing more and more evidence against it, so I'm giving it a lower probability (maybe 20%).
This is not very specific, similarly to the claim about training scaling laws "having ended". Even with "a bubble" (that bursts before 2031), some AI companies (like Google) might survive in an OK shape. These companies will also have their pick of the wreckage of the other AI companies, including both researchers and the almost-ready datacenter sites, which they can use to make their own efforts stronger. The range of scenarios I outlined only needs 2-4 GW of training compute by 2030 for at least one AI company (in addition to 2-4 GW of inference compute), which revenues of $40-80bn should be sufficient to cover (especially as the quality of inference hardware stops being a bottleneck, so that even older hardware will again become useful for serving current frontier models). Google has been spending this kind of money on datacenter capex as a matter of course for many years now.
OpenAI is projecting about $20bn of revenue in its current state, with the 800M+ free users not being monetized (which is likely to change). These numbers can plausibly grow to give at least $50bn per year to the leading model company by 2030 (even if it's not OpenAI); this seems like a very conservative estimate. It doesn't depend on qualitative improvements in LLMs, or on promises of more than a trillion dollars in datacenter capex. Also, the capex numbers might even scale down gracefully if $50bn per year from one company by 2030 turns out to be all that's actually available.
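As a rough sanity check (the per-user figure below is made up purely for illustration): getting from ~$20bn to ~$50bn by 2030 needs only about 20% annual growth, and even light monetization of the free user base lands in the same ballpark.

```python
# Rough sanity check of the revenue claim above; the numbers are placeholders.

current_revenue = 20e9       # the ~$20bn/year run-rate mentioned above
target_revenue = 50e9        # the $50bn/year-by-2030 figure
years = 5                    # roughly 2025 -> 2030

required_growth = (target_revenue / current_revenue) ** (1 / years) - 1
print(f"required growth: {required_growth:.0%} per year")   # ~20%/year

# Separately: what would light monetization of the free user base add?
free_users = 800e6           # the 800M+ free users mentioned above
assumed_arpu = 2.0           # assumed $2/user/month (made up for illustration)
annual = free_users * assumed_arpu * 12
print(f"free-user monetization at ${assumed_arpu}/month: ${annual / 1e9:.0f}bn/year")
```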
No large surge in new products and software.
Both OpenAI's and Anthropic's revenue has increased massively in one year: roughly 3½-fold for OpenAI and 9-fold for Anthropic. I agree, those are not (largely) new products or software — but they're pretty astonishing revenue growth rates, and a pretty large chunk of these revenues is driven by coding usage.
More generally, if AGI-from-LLMs in 3–5 years does actually happen (which is definitely at the short end of my personal timelines, but roughly what the frontier labs appear to be betting on, judging from their actions rather than their investor-facing rhetoric), that doesn't predict most of the things on your bullet list until near the end of those 3–5 years. While LLM capabilities are still subhuman in most respects, their economic impact will be limited.
As you say, one area where they are already starting to be genuinely useful is some more routine forms of coding. A leading indicator I think you should be looking at is that, according to Google, they've recently reached "50% of code by character count was generated by LLMs". Since Google haven't massively cut their headcount, that suggests they're now producing code at roughly twice the rate of a few years ago (at least by character count). That's not a "large surge in new products and software" yet — but it might show up as a noticeable acceleration in Google product releases next year. Some other areas where we're already seeing signs of usefulness are legal research and routine customer service.
In general, something growing via an exponential or logistic-curve process looks small until shortly before it isn't — and that's even more true when it's competing with an established alternative.
Now, to be clear, my personal median timeline for AGI is something like 10 years, most likely from LLMs+other things bolted on top — which gives plenty of time for a trough of disillusionment from those who were expecting (or were sold) 3–5 or even 2 years. I would also be not-very-surprised by 5 years, or by 20. IMO, there are several remaining hard-looking problems (e.g. continual learning, long-term planning, long-term credit assignment, alignment, reliability/accuracy, good priors, maybe causal world models), some of which don't look obviously amenable to simple scaling, but might turn out to be, or might be amenable to scaling plus a whole lot of research and engineering, or one-or-two might actually need a whole additional paradigm.
In simple economic terms, other than Tesla, the other six of the "magnificent seven" have not (so far) reached the Price/Earnings levels characteristic of bubbles just before they burst — they look more typical of those for a historically-fast-growing company. In past bubbles, initial voices warning that it was a bubble generally predated the actual bursting by a couple of years. So my economic opinion is that we're not in a ready-to-burst bubble YET. But most significant technological revolutions (e.g. railways, the internet) did produce a bubble at some point.
Both OpenAI's and Anthropic's revenue has increased massively in one year: roughly 3½-fold for OpenAI and 9-fold for Anthropic.
Their product is in demand, but they lose money on each customer - so they take in a lot of money to grow their customer base, and lose even more money.
They need to transition to making money. To do so they need something like network effects (social media, Uber/Lyft to some extent), returns to scale, or some massive first mover advantage. I don't see that yet.
As you say, one area where they are already starting to be genuinely useful is some more routine forms of coding. A leading indicator I think you should be looking at is that, according to Google, they've recently reached "50% of code by character count was generated by LLMs".
That's less than I was expecting. And my personal experience of coding with LLMs (and speaking with others who do) is that it takes a lot of work to make it function - the LLM will write most of the code, but it's often a long process from there to a working program - and a much longer process to a working, interpretable program. And much longer to get a working program that fits well into a codebase.
When you code with LLMs, it feels like you're really productive, because you're always doing stuff - but often it actually slows you down. https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
Now, I feel that the coding models are better than they were at the time of that study, especially for routine tasks.
So my median expectation is that moving 50% of coding to LLMs might increase Google's productivity by 10%. But 25%, or even -5%, are also possible.
In general, something growing via an exponential or logistic-curve process looks small until shortly before it isn't — and that's even more true when it's competing with an established alternative.
Shipping finished code is a process involving a lot of steps, only some of which are automated. So (Amdahl's Law) the time to finished coding will be determined by those parts of the process that aren't easily automated. If time to write code falls to zero but time to review code stays the same or even increases, then we'll only get a mild speedup.
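A toy Amdahl's Law calculation makes the point; the 40% fraction below is invented for illustration, not a measurement.

```python
# Toy Amdahl's Law calculation; the 40% fraction is invented, not measured.
# If only the code-writing fraction p of the end-to-end pipeline gets faster
# by a factor s, the overall speedup is 1 / ((1 - p) + p / s).

def overall_speedup(p, s):
    return 1.0 / ((1.0 - p) + p / s)

p = 0.40  # assumed share of "shipping finished code" spent actually writing code
for s in (2, 5, float("inf")):
    print(f"code-writing {s}x faster -> whole pipeline "
          f"{overall_speedup(p, s):.2f}x faster")
```

Even making code-writing infinitely fast only gets ~1.7x here; everything else is capped by the unautomated steps.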
The other problem is that logistic curves close to their inflection point, logistic curves way before their inflection point, and true exponentials all look the same (see our paper https://arxiv.org/abs/2109.08065). OK, we might be on the verge of great LLM-based improvements - but these have been promised for a few years now. And (this is entirely my personal feeling) they feel further away now than they did in the GPT-3.5 era.
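A tiny synthetic illustration of that identifiability problem (not data from the paper): well before its inflection point, a logistic curve is numerically almost indistinguishable from a pure exponential.

```python
import numpy as np

# Synthetic illustration (not data from the paper): well before its inflection
# point, a logistic K / (1 + exp(-r * (t - t0))) is numerically almost the same
# curve as the exponential K * exp(r * (t - t0)).

K, r, t0 = 1.0, 1.0, 10.0        # logistic with inflection point at t = 10
t = np.linspace(0, 5, 6)         # only observe the early regime, t << t0

logistic = K / (1 + np.exp(-r * (t - t0)))
exponential = K * np.exp(r * (t - t0))

rel_gap = np.abs(exponential - logistic) / logistic
print("max relative gap over the observed range:", rel_gap.max())  # under 1%
```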
In simple economic terms, other than Tesla, the other six of the "magnificent seven" have not (so far) reached the Price/Earnings levels characteristic of bubbles just before they burst — they look more typical of those for a historically-fast-growing company.
The magnificent seven have strong non-AI income streams. I expect them to survive a bubble burst. If OpenAI had stocks, their P/E ratio would be... interesting. Well, actually, it would be quite boring, because it would be negative.
Generally, I agree.
One viewpoint that I haven't seen used much to look at foundation-lab economics is ROI: they spend a ton of money training a model (including compute costs and researcher costs), and they then deploy it. After allowing for inference and other costs of serving it, does the revenue they make on serving that model (before it becomes obsolete) pay for its training costs (plus interest), or not? (Another way to look at this is that a newly trained SOTA model is a form of – rapidly depreciating – capital.) I.e. would they be making a profit or a loss at steady state, if it weren't the case that the next model is far more expensive to train? I think this is actually a fairly reasonable economic model (making movies is rather similar). Note that there's a built-in improvement if progress in AI slows — models then stay SOTA for longer after you train them, and thus depreciate more slowly, so (as long as you are charging more than their inference serving cost) they can make more money; and presumably the speed of increase of model training costs drops too, so your actual balance-sheet profit and loss get closer to the ROI analysis.
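Here is a minimal sketch of that per-model ROI framing; every number is a placeholder chosen for illustration, not an estimate for any particular lab.

```python
# Minimal sketch of the per-model ROI framing; every number is a placeholder.

def per_model_roi(training_cost, monthly_revenue, serving_cost_fraction,
                  months_until_obsolete, annual_interest=0.10):
    """Gross serving margin over the model's SOTA lifetime, divided by the
    training outlay grown at the cost of capital, minus one."""
    gross_margin = (monthly_revenue * (1 - serving_cost_fraction)
                    * months_until_obsolete)
    capital = training_cost * (1 + annual_interest) ** (months_until_obsolete / 12)
    return gross_margin / capital - 1

# A model that cost $1bn to train (including failed runs and research overhead),
# earns $300m/month, spends 60% of that on serving, and stays SOTA for 6 months:
print(f"{per_model_roi(1e9, 3e8, 0.60, 6):+.0%}")    # about -31%
# The same model, if it stayed SOTA for 18 months (i.e. progress slows):
print(f"{per_model_roi(1e9, 3e8, 0.60, 18):+.0%}")   # about +87%
```

In this toy version the SOTA lifetime dominates everything else, which is the "depreciate slower if progress slows" point above.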
FWIW, I asked Claude Opus 4.5 in research mode to attempt to do this per-model-ROI analysis for OpenAI, and then for Anthropic, from what public materials it could locate, and it seemed to think that even in this framework OpenAI's ROI is deeply negative: primarily because a) training run investment includes not only the final successful run but also failed runs (the same issue as in the numbers DeepSeek released) b) revenue earnings are depressed by competition so are not much above serving costs, and c) model depreciation cycles are viciously short, generally less than 6 months.
So, even on a per-model ROI basis, OpenAI are still in a "burning VC money to gain market share and intellectual capital" mode.
Of Anthropic, it seemed to think their per-model ROI was also still negative, but less so for a variety of reasons (fewer failed training runs, slower model obsolescence), and was improving. It found their predictions of profitability by 2028 plausible. (I didn't ask it whether it might be biased.)
However, in an AI slowdown, factor c) automatically improves, and there are fairly obvious levers OpenAI could pull to improve a) and b) — some of which apparently Anthropic are already pulling.
For both companies, it mentioned that users in their highest individual subscription tiers often have usage so high that serving them loses money. So I expect we'll eventually see tighter usage caps and even higher subscription tiers.
We might be in a generative AI bubble. There are many potential signs of this around:
If LLMs were truly on the path to AGI[1], I would be expecting the opposite of many of these - opportunities for LLM usage opening up all over the place, huge disruption in the job markets at the same time as completely novel products enter the economy and change its rate of growth. And I would expect the needed compute investments to be declining due to large efficiency gains, with LLM errors being subtle and beyond the ability of humans to understand.
Thus the world does not look like one where LLM-to-AGI is imminent; it looks a lot more like one where generative AI keeps hitting bottleneck after bottleneck - when, precisely, will the LLMs stop hallucinating? When will image composition work reliably[2]?
Remember when GPT-3.5 came out? It did feel like we were on the cusp of something explosive, with countless opportunities being enthusiastically seized and companies promising transformations in all kinds of domains.
But that didn’t happen. Generative AI has a lot of uses and many good possibilities. But in terms of R&D progress, it now feels like an era of repeated bottlenecks slowly and painfully overcome. LLMs are maturing as a technology, but their cutting-edge performance is improving only slowly - outside of coding, which is showing some definite upswing.
A bubble wouldn't mean that generative AI is useless. It might even be transformative and a huge boost to the economy. It just means that the generative AI companies cannot monetise it to the level required to justify the huge investments being made.
And the investments being made are huge. See arguments like "Big Tech Needs $2 Trillion In AI Revenue By 2030 or They Wasted Their Capex" (if you want a well-researched skeptical take on the economics of LLMs, the whole of Ed Zitron's blog is a good source - stick to the information, not his opinions, and be warned that he is extremely uncharitable towards AI safety).
There are many reasons why generative AI companies might fail at monetising. Since the end of (very weak) training scaling laws, we've been in an "inference" scaling situation, buying and building huge data centers. But that isn't enough for a moat - they need economies of scale, not just a large collection of expensive GPUs.
That's because open source models are a few months, maybe a year, behind the top models. If the top LLM companies really become profitable, it will be worth it for others to buy up a small bunch of GPUs, design a nice front end, and run DeepSeek or a similar model cheaply. Unless they can clearly differentiate themselves, this puts a ceiling on what the top companies can charge.
So it's perfectly possible that generative AI is completely transformational and that we are still in an AI bubble, because LLM companies can't figure out how to capture that value.
If LLMs were a quick path to AGIs, then we'd certainly not be in a bubble. So, if we are in a bubble, they're not AGIs, nor the path to AGIs, nor probably the road to the avenue to the lane to the path to AGIs.
And the big companies like OpenAI and Anthropic, that have been pushing the LLM-to-AGI narrative, will take a huge reputational hit. OpenAI especially has been using the "risk" of AGI as a way to generate excitement and pump up their valuation. A technology so dangerous it could end the world - think of what it could do to your stock values!
And if the bubble bursts, talk of AGI and AGI risk will be seen as puffery, as tools of bullshit artists or naive dupes. It will be difficult to get people to take those ideas seriously.
There will be some positives. The biggest positive is that LLMs would not be proto-AGIs: hence there will be more time to prepare for AGI. Another positive is that LLMs may be available for alignment purposes (I'll present one possible approach in a subsequent paper).
Some of these things are things we should probably be doing anyway; others are conditional on generative AI being a bubble. The list is non-exhaustive and intended to start discussion:
In a subsequent post, I'll discuss how we might improve our AGI predictions - almost any advance in computer science could lead to AGI via recursive self-improvement, but can we identify those that are genuinely likely to do so?
I've had very painful experiences trying to use these tools to generate any image that is a bit unusual. I've used the phrase "Gen AIs still can't count" many a time.
Ed will be the kind of person who will be seen as having "been right all along" if there is an AI bubble.
It's paywalled, but he talks about the AI 2027 paper, concluding:
[...] Everything is entirely theoretical, taped together with charts that have lines that go up and serious, scary language that, when boiled down, mostly means "then the AI became really good at stuff."
I fucking hate the people that wrote this. I think they are craven grifters writing to cause intentional harm, and should have been mocked and shunned rather than given news articles or humoured in any way.
And in many ways they tell the true story of the AI boom — an era that stopped being about what science and technology could actually do, focusing instead on marketing bullshit and endless growth.
This isn't a "scenario for the future." It's propaganda built to scare you and make you believe that OpenAI and Large Language Models are capable of doing impossible things.
It's also a powerful representation of the nebulous title of "AI researcher," which can mean everything from "gifted statistician" to "failed philosophy PHD that hung around with people who can actually write software."
Note that, in general, the quality of his arguments and research is much higher than this vitriol would suggest.