What's up with Anthropic predicting AGI by early 2027?

by ryan_greenblatt
3rd Nov 2025
Comments
Vladimir_Nesov:

Late 2026 is also when Anthropic will already have their gigawatt of TPUs, so by early 2027 they'll have had some time with them. If these are Ironwood TPUs, they have 50 TB of HBM per pod, so models with tens of trillions of total params are efficient to inference or train with RLVR there. A model could be pretrained on a sufficient amount of weaker hardware in advance, ready for RLVR as soon as the TPUs get properly online. At gigawatt scale pretraining, about 3T active params might be compute optimal, so this is the right kind of system to avoid having to settle for smaller MoE models due to insufficient HBM per scale-up world.

Gigawatt-scale training systems of 2026 are step 2 of 3 in the rapid scaling progression that started in 2022, advancing 12x in raw compute every 2 years (and possibly 2x on top of that from adoption of lower precision in training), a pace that held in 2024 and looks on track for 2026. Step 3 might take longer than 2 years if 5 gigawatt training systems aren't yet built in 2028 (I only expect 2 gigawatt in 2028, but 5 gigawatt will probably be there by end of 2030). Scaling by yet another step of this progression would require an AI company with $600bn revenue, so this might take a while (absent AGI).
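A rough back-of-the-envelope sketch of both claims; the bytes-per-parameter figures and the 2022 baseline of 1x are illustrative assumptions rather than sourced numbers:

```python
# Rough arithmetic for the claims above; figures marked "assumed" are illustrative.

POD_HBM_TB = 50  # stated: ~50 TB of HBM per Ironwood pod

# Parameters that fit in one pod's HBM at different precisions (ignoring KV cache,
# activations, and replication overhead, so these are loose upper bounds).
for bytes_per_param, label in [(2, "bf16, assumed"), (1, "fp8, assumed")]:
    max_params_trillions = POD_HBM_TB * 1e12 / bytes_per_param / 1e12
    print(f"{label}: ~{max_params_trillions:.0f}T params fit in {POD_HBM_TB} TB of HBM")

# The 12x-per-2-years raw-compute progression, taking the 2022 training systems as a 1x
# baseline (assumed). Lower-precision training could add a further ~2x on top of this.
compute = 1.0
for year in (2024, 2026, 2028):
    compute *= 12
    print(f"{year}: ~{compute:.0f}x raw training compute vs 2022")
```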

Nathan Helm-Burger:

In my mental model, I expect we are at least one fundamental breakthrough away from AGI (in keeping with François Chollet's points about intelligence being about rapid efficient learning rather than just applying broad knowledge).

It seems difficult to me to predict how far we are from a breakthrough that gives us a significant improvement on this metric.

So, to me, it seems really important to ask how much a given level of LLM coding assistance is enabling researchers to iterate more easily over a broader range of experiments. I don't have sufficient insight into the research patterns of AI companies to have a good sense of novel experiments per researcher per month (ERM). My expectation is that an increase in ERM would give us some sense of how to update from the base rate of major conceptual breakthroughs (estimated in your article as 1 per 10 years, at 2010-2020 levels of researcher hours per year).

To figure out current odds of breakthrough per year, I'd want to know how many more researchers worldwide are working on ML than in the 2010-2020 period. I'd want to discount this by assuming many of the marginal additions are not as inspired and experienced as the earlier researchers, and are retreading known ground more, and are running less well designed experiments.

Then I'd want to make an upward adjustment for the expected boost to ERM from LLM assistance. LLMs may also be helpful enough as research-design assistants to partly offset the decrease in experiment quality caused by the addition of many marginal researchers (by bumping up the low end, perhaps pushing some experiments that were below the threshold of relevance over it).

If, over the next 5 years, we see a gradual average improvement to the "ML breakthrough rate", we should expect the next breakthrough to arrive in more like 4-6 years rather than 8-12.
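Here's a minimal Fermi-style sketch of the update I have in mind; every multiplier is a placeholder assumption to illustrate the arithmetic, not a number I'm confident in:

```python
# Toy Fermi estimate: adjust a base rate of major conceptual breakthroughs
# (~1 per 10 years at 2010-2020 effort levels, per the post) by effort and
# productivity factors. Every multiplier below is an illustrative assumption.

base_rate_per_year = 1 / 10      # ~1 major breakthrough per decade (from the post)

researcher_growth = 4.0          # assumed: 4x more ML researchers than in 2010-2020
quality_discount = 0.4           # assumed: marginal researchers contribute ~40% as much
llm_erm_multiplier = 1.3         # assumed: LLM assistance boost to experiments/researcher/month

effective_effort = researcher_growth * quality_discount * llm_erm_multiplier
adjusted_rate = base_rate_per_year * effective_effort

print(f"Effective effort multiplier: {effective_effort:.2f}x")
print(f"Adjusted rate: ~{adjusted_rate:.2f} breakthroughs/year "
      f"-> expected wait ~{1 / adjusted_rate:.1f} years")
```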

If a big enough breakthrough in "learning and extrapolation rate from limited data" (aka Chollet-style intelligence) does get found, I think that puts us on a fundamentally different trend line.

I also think there's a third kind of "brain capability" axis along which we might see a breakthrough, though probably a less impactful one than the Chollet-style axis. This is what I'd call the "evolutionary programming" style: evolution has managed to shape the genetics of a baby horse such that its fetal brain development is sufficient for it to walk soon after birth with nearly no learning. This seems different from both "inefficiently learned knowledge and heuristics" (the current style) and the Chollet-style. This third axis would be something like researchers being able to explicitly engineer a target skill into a model such that the model needs only a tiny amount of training on the targeted task to be competent.

Nathan Helm-Burger:

Note that if the "evolutionary programming" axis is real, and a breakthrough does occur there, it might mean researchers could "pre-program" a model to be good at rapid learning (i.e., increase its Chollet-style intelligence).

I suspect that something like this process of evolution predisposing a brain to be better at abstract learning is a key factor in the differentiation of humans from previous primates.

What's up with Anthropic predicting AGI by early 2027?

As far as I'm aware, Anthropic is the only AI company with official AGI timelines[1]: they expect AGI by early 2027. In their recommendations (from March 2025) to the OSTP for the AI action plan they say:

As our CEO Dario Amodei writes in 'Machines of Loving Grace', we expect powerful AI systems will emerge in late 2026 or early 2027. Powerful AI systems will have the following properties:

  • Intellectual capabilities matching or exceeding that of Nobel Prize winners across most disciplines—including biology, computer science, mathematics, and engineering.

[...]

They often describe this capability level as a "country of geniuses in a datacenter".

This prediction is repeated elsewhere and Jack Clark confirms that something like this remains Anthropic's view (as of September 2025). Of course, just because this is Anthropic's official prediction[2] doesn't mean that all or even most employees at Anthropic share the same view.[3] However, I do think we can reasonably say that Dario Amodei, Jack Clark, and Anthropic itself are all making this prediction.[4]

I think the creation of transformatively powerful AI systems—systems as capable or more capable than Anthropic's notion of powerful AI—is plausible in 5 years and is more likely than not within 10 years. Correspondingly, I think society is massively underpreparing for the risks associated with such AI systems.

However, I think Anthropic's predictions are very unlikely to come true (using the operationalization of powerful AI that I give below, I think powerful AI by early 2027 is around 6% likely). I do think they should get some credit for making predictions at all (though I wish the predictions were more precise and better operationalized, and that they had also made intermediate predictions for milestones prior to powerful AI). In this post, I'll try to more precisely operationalize Anthropic's prediction so that it can be falsified or proven true, talk about what I think the timeline up through 2027 would need to look like for this prediction to be likely, and explain why I think the prediction is unlikely to come true.

[Thanks to Ajeya Cotra, Ansh Radhakrishnan, Buck Shlegeris, Daniel Kokotajlo, Eli Lifland, James Bradbury, Lukas Finnveden, and Megan Kinniment for comments and/or discussion.]

What does "powerful AI" mean?

Anthropic has talked about what powerful AI means in a few different places. Pulling from the essay by Dario Amodei Machines of Loving Grace[5]:

  • In terms of pure intelligence, it is smarter than a Nobel Prize winner across most relevant fields – biology, programming, math, engineering, writing, etc. This means it can prove unsolved mathematical theorems, write extremely good novels, write difficult codebases from scratch, etc.
  • In addition to just being a "smart thing you talk to", it has all the "interfaces" available to a human working virtually, including text, audio, video, mouse and keyboard control, and internet access. It can engage in any actions, communications, or remote operations enabled by this interface, including taking actions on the internet, taking or giving directions to humans, ordering materials, directing experiments, watching videos, making videos, and so on. It does all of these tasks with, again, a skill exceeding that of the most capable humans in the world.
  • It does not just passively answer questions; instead, it can be given tasks that take hours, days, or weeks to complete, and then goes off and does those tasks autonomously, in the way a smart employee would, asking for clarification as necessary.[6]
  • The resources used to train the model can be repurposed to run millions of instances of it (this matches projected cluster sizes by ~2027), and the model can absorb information and generate actions at roughly 10x-100x human speed. It may however be limited by the response time of the physical world or of software it interacts with.

We could summarize this as a "country of geniuses in a datacenter".

It's not entirely clear whether the term "powerful AI" is meant to refer to some reasonably broad range of capability, where the prediction refers to the start of this range and the text from Machines of Loving Grace describes a central or upper part of it. However, the discussion in the recommendations to the OSTP is pretty similar, which implies that the prediction corresponds to a version of "powerful AI" matching this description. And given Anthropic's communications, it would be pretty misleading if this weren't roughly the description that is supposed to go along with the prediction.

While some aspects of "powerful AI" are clear, the descriptions given don't fully clarify key aspects of this level of capability. So I'll make some inferences and try to make the description more precise. If I'm wrong about what is being predicted, hopefully someone will correct me![7]

In particular, it seems important to more precisely operationalize what things powerful AI could automate. Based on Dario's description (which includes a high bar for capabilities and includes being able to run many copies at pretty high speeds), I think powerful AI would be capable of:

  • Fully or virtually fully automating AI R&D. As in, it would be able to autonomously advance AI progress[8] without human help[9] at a rate at least comparable to how fast AI progress would proceed with human labor.[10]
  • Being able to fully or virtually fully automate work on scientific R&D that could be done remotely within most companies/labs in most of the relevant fields (after being given sufficient context). As in, not necessarily being able to automate all such companies at once, but for most relevant fields, the AIs can virtually fully automate work that can be done remotely for any given single company (or at least the large majority of such companies). Correspondingly, the AIs would be capable of automating at least much of cognitive labor involved in R&D throughout the economy (though there wouldn't necessarily be the compute to automate all of this at once).
  • Being able to automate the vast majority of white-collar jobs that can be done remotely (or tasks within white-collar jobs that can be done remotely). Again, this doesn't mean all such jobs could be automated at once as there might not be enough compute for this, but based on Dario's description it does seem like there would be enough instances of AI that a substantial fraction (>25%?) of white-collar work that can be done remotely in America could be automated (if AI capacity were spent on this and regulation didn't prevent this).

Supposing that the people making this prediction don't dispute this characterization, we can consider the prediction clearly falsified if AIs obviously don't seem capable of any of these by the start of July 2027.[11] Minimally, I think the capacity for AIs to fully or virtually fully automate AI R&D seems like it would pretty clearly be predicted and this should be relatively easy to adjudicate for at least AI company employees. The other types of automation could be messier and slower to adjudicate[12] and adjudication on publicly available evidence could be delayed if the most powerful AIs aren't (quickly) externally deployed.[13]

Regardless, I currently expect the prediction to be clearly falsified by the middle of 2027. I do expect we'll see very impressive AI systems by early 2027 that perhaps accelerate research engineering within frontier AI companies by around 2x[14] and that succeed more often than not in autonomously performing tasks that would take a skilled human research engineer within the company (who doesn't have that much context on the specific task) a full work day.[15]

Another question that is worth clarifying is what probability Anthropic is assigning to this prediction. Some of Anthropic's and Dario's statements sound more like a >50% probability (e.g. "we expect")[16] while others sound more like predicting a substantial chance (>25%?) with words like "could" or "pretty well on track". For now, I'll suppose they intended to mean that there was around a 50% probability of powerful AI by early 2027. A calibrated forecast of 50% probability has a substantial chance of being wrong, so we shouldn't update too hard based on just this prediction being falsified in isolation. However, if the prediction ends up not coming true, I do think it's very important for proponents of the prediction to admit it was falsified and update.[17]

Earlier predictions

Unfortunately, waiting until the middle of 2027 to adjudicate this isn't ideal. If the prediction is wrong, then we'd ideally be able to falsify it earlier. And if it is right, then hopefully we'd be able to get some sign of this earlier so that we can change our plans based on the fact that wildly transformative (and dangerous) AI systems will probably be created by early 2027! Are there earlier predictions which shed some light on this? Ideally, we'd have earlier predictions that I (and others) don't expect to come true but Anthropic does expect to come true and ideally they would also suffice to update me (and others) substantially towards Anthropic's perspective on timelines to powerful AI.

Dario has said (source) that he expects 90% of code to be written by AI sometime between June 2025 and September 2025 and that "we may be in a world where AI is writing essentially all of the code" by around March 2026.[18] My understanding is that the prediction that 90% of code will be written by AIs hasn't come true, though the situation is somewhat complicated. I discuss this much more here.

Regardless, I think that "fraction of (lines of) code written by AIs" isn't a great metric: it's hard to interpret because there isn't a clear relationship between fraction of lines written by AI and how much AIs are increasing useful output. For instance, Anthropic employees say that Claude is speeding them up by "only" 20-40% based on some results in the Sonnet 4.5 system card despite AIs writing a reasonably high fraction of the code (probably a majority of committed code and a larger fraction for things like single-use scripts). And "AI is writing essentially all of the code" is compatible with a range of possibilities that differ in their implications for productivity.

Unfortunately, Anthropic (and Dario) haven't made any other predictions (as far as I am aware) for what they expect to happen towards the beginning of 2026 if they are right about expecting powerful AI by early 2027. It would be great if they made some predictions. In the absence of these predictions, I'll try to steelman some version of this view and talk about what I think we should expect to have happen by various points given this view. I'll assume that they expect reasonably smooth/continuous progress rather than expecting powerful AI by early 2027 due to anticipating a massive breakthrough or something else that would cause a large jump in progress. Thus, we should be able to work backward from seeing powerful AI by early 2027 towards earlier predictions.

A proposed timeline that Anthropic might expect

I'll first sketch out a timeline that involves powerful AI happening a bit after the start of 2027 (let's suppose it's fully done training at the start of March 2027) and includes some possible predictions. It's worth noting that if their view is that there is a (greater than) 50% chance of powerful AI by early 2027, then presumably this probability mass is spread out at least a bit and they put some substantial probability on this happening before 2027. Correspondingly, the timeline I outline is presumably around median as far as how aggressive it is relative to their expectations (assuming that they do in fact put >50% on powerful AI by early 2027) while they must put substantial weight (presumably >25%) on substantially more aggressive timelines where powerful AI happens before October 2026 (~11 months from now).

I'll work backwards from powerful AI being done training at the start of March 2027. My main approach for generating this timeline is to take the timeline in the AI 2027 scenario and then compress it down to take 60% as much time to handle the fact that we're somewhat behind what AI 2027 predicted for late 2025 and powerful AI emerges a bit later than March 2027 in the AI 2027 scenario. I explain my process more in "Appendix: deriving a timeline consistent with Anthropic's predictions".

Figure 1: Predictions from the proposed timeline for the length of engineering tasks within the AI company that AIs can complete autonomously, plotted alongside estimates of historical values for this quantity based on METR's time-horizon data.[19]

Here's a qualitative timeline (working backward):

  • March 2027: Powerful AI is built (see operationalization above). Getting to this milestone required massive acceleration of AI R&D progress as discussed in earlier parts of the timeline.
  • February 2027: AIs can now fully (or virtually fully) automate AI R&D. AI R&D is proceeding much faster due to this automation and even prior to full automation there was large acceleration. Performance in domains other than AI R&D lags behind somewhat, but with AIs accelerating AI development, there is rapid improvement in how little data is required for AIs to perform well in some domain (both due to general-purpose learning abilities and domain-specific adaptations) and AIs are also accelerating the process of acquiring relevant data.
  • December 2026: AIs can now fully (or virtually fully) automate research engineering and can complete work much faster than humans and at much greater scale.[20] This required patching the remaining holes in AI's skill profiles, but this could happen quickly with acceleration from AI labor. AI R&D is substantially accelerated and this acceleration picks up substantially from here allowing us to quickly reach powerful AI just 3.5 months later. A bit earlier than this (perhaps in October or November), AIs became capable of usually completing massive green-field (as in, from scratch) easy-to-check software projects like reimplementing the Rust compiler from scratch in C while achieving similar performance (in compilation time and executable performance).[21]
  • September 2026: Research engineers are now accelerated by around 5x[22] and other types of work are also starting to substantially benefit from AI automation. The majority of the time, AIs successfully complete (randomly sampled)[23] engineering tasks within the AI company that would take decent human engineers many months and can complete the vast majority (90%) of tasks that would take a few weeks. These numbers assume the human engineer we're comparing to doesn't have any special context on the task but does have the needed skill set to complete the task.[24] AIs still fail to complete many tasks that only the best software engineers can do in a week and have a bunch of holes in their skill profile that prevent perfect reliability on even pretty small tasks. But they are now pretty good at noticing when they've failed or will be unable to complete some task. Thus, with a bit of management, they can functionally automate the job of a junior research engineer (though they need help in a few places where most human employees wouldn't and are much better than a human employee on speed and some other axes).
  • June 2026: Research engineers are accelerated by almost 3x. When given pretty hard, but relatively self-contained tasks like "write an efficient and production-ready inference stack for Deepseek V3 for Trainium (an AI chip developed by Amazon)", AIs usually succeed.[25] AIs succeed the majority of the time at engineering tasks that would typically take employees 2 weeks and succeed the vast majority of the time on tasks that take a day or two. AIs are now extremely reliable on small (e.g. 30 minute) self-contained and relatively easy-to-check tasks within the AI company, though aren't totally perfect yet; they're more reliable than the vast majority of human engineers at these sorts of tasks even when the humans are given much more time (e.g. 1 week).
  • March 2026: Research engineers are accelerated by 1.8x. AIs succeed the majority of the time at tasks that would have taken an engineer a day and perform somewhat better than this on particularly self-contained and easy-to-check tasks. For instance, AIs usually succeed at autonomously making significant end-to-end optimizations to training or inference within the company's actual codebase (in cases where human experts would have been able to succeed at a similar level of optimization in a few days).[26] AIs have been given a lot of context on the codebase allowing them to pretty reliably zero-shot small self-contained tasks that someone who was very familiar with the relevant part of the codebase could also zero-shot. Engineers have figured out how to better work with AIs and avoid many productivity issues in AI-augmented software engineering that were previously slowing things down and top-down efforts have managed to diffuse this through the company. Now, humans are writing very little code manually and are instead managing AIs.
  • October 2025: Research engineers are perhaps accelerated by 1.3x. This is right now. (This productivity multiplier is somewhat higher than what I expect right now, but perhaps around what Anthropic expects?)

Here's a quantitative timeline. Note that this timeline involves larger productivity multipliers than I expect at a given level of capability/automation, but I think this is more consistent with what Anthropic expects.

| Date | Qualitative milestone | Engineering multiplier[27] | AI R&D multiplier[28] | 50%/90%-reliability time-horizon for internal engineering tasks[29] |
|---|---|---|---|---|
| March 2027 | Powerful AI | 600x | 100x[30] | ∞/∞ |
| February 2027 | Full automation of AI R&D | 200x | 35x | ∞/∞ |
| Dec. 2026 | Full automation of research engineering | 50x | 6x | ∞/∞ |
| Sept. 2026 | Vast majority automated | 5x | 2x | 10 months/3 weeks[31] |
| June 2026 | Most engineering is automated | 3x | 1.5x | 2 weeks/1.5 days |
| March 2026 | Large AI augmentation | 1.8x | 1.25x | 1 day/1 hour |
| Oct. 2025 | Significant AI augmentation | 1.3x | 1.1x | 1.5 hours/0.2 hours[32] |
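To see how fast this progression is in time-horizon terms, here's a small sketch converting the proposed timeline's 50%-reliability milestones into implied doubling times; the work-hours conversions (8 hours per day, ~167 hours per month) are assumptions on my part:

```python
import math

WORK_HOURS_PER_MONTH = 167  # assumption: ~40 h/week * 52 weeks / 12 months

# (months after Oct 2025, 50%-reliability time-horizon in work-hours) for the proposed
# timeline, up to the last finite milestone (Sept. 2026).
proposed = [
    (0, 1.5),                         # Oct. 2025: 1.5 hours
    (5, 8),                           # March 2026: 1 day (~8 work-hours, assumed)
    (8, 2 * 5 * 8),                   # June 2026: 2 weeks (~80 work-hours, assumed)
    (11, 10 * WORK_HOURS_PER_MONTH),  # Sept. 2026: 10 months
]

for (m0, h0), (m1, h1) in zip(proposed, proposed[1:]):
    doublings = math.log2(h1 / h0)
    print(f"months {m0:>2} -> {m1:>2}: {doublings:.1f} doublings, "
          f"~{(m1 - m0) / doublings:.1f} months per doubling")
```

The implied doubling time shrinks from around 2 months to well under 1 month over the course of 2026, far faster than the historical trend discussed below.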

I've focused on accelerating engineering (and then this increasingly accelerating AI R&D) as I think this is a key part of Anthropic's perspective while also being relatively easy to track. Accelerating and automating engineering is also key given my views though perhaps a bit less central.

Why powerful AI by early 2027 seems unlikely to me

As stated earlier, I think powerful AI by early 2027 is around 6% likely, so pretty unlikely.[33] (I think powerful AI happening this soon probably requires an algorithmic breakthrough that causes much faster AI progress than the current trend.[34]) To be clear, this probability is still high enough that it is very concerning!

Trends indicate longer

My main reason for thinking this is unlikely is that this would require progress that is way faster than various trends indicate.

METR has done work demonstrating a pretty long-running exponential trend in the length of software engineering tasks that AIs can complete half of the time.[35] This trend predicts that by the end of 2026, AIs will be able to complete easy-to-check benchmark style tasks from METR's task suite that take 16 hours around 50% of the time and tasks that take 3 hours around 80% of the time. While METR's task suite is imperfect, my understanding is that we observe broadly similar or lower time horizons on other distributions of at least somewhat realistic software engineering tasks (including people trying to use AIs to help with their work). Naively, I would expect that AIs perform substantially worse on randomly selected engineering tasks within AI companies than on METR's task suite. (To get the human duration for a task, we see how long it takes engineers at the company without special context but with the relevant skill set.) So the trend extrapolation predicts something much less aggressive for December 2026 than what happened in the above timeline (full automation of research engineering) and more generally the trend predicts powerful AI (which happens after automation of engineering) is further away.
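As a rough illustration of this extrapolation, here's a minimal sketch; the starting 50%-reliability horizon and the doubling time are assumptions chosen to be roughly in line with the trend, and the 0.5 conversion to internal tasks is the guess from footnote 19:

```python
from datetime import date

# Simple exponential extrapolation of a METR-style 50%-reliability time-horizon.
# Starting horizon and doubling time are illustrative assumptions; the 0.5 conversion
# factor to internal AI-company tasks is the guess from footnote 19.
start = date(2025, 10, 1)
metr_horizon_hours = 3.0      # assumed 50%-horizon on METR-suite-style tasks, Oct 2025
doubling_time_months = 6.0    # assumed doubling time, roughly in line with the recent trend
internal_conversion = 0.5     # internal tasks ~0.5x the METR-suite horizon (footnote 19)

for target in [date(2026, 6, 1), date(2026, 12, 31), date(2027, 7, 1)]:
    months = (target - start).days / 30.44
    horizon = metr_horizon_hours * 2 ** (months / doubling_time_months)
    print(f"{target}: METR-suite ~{horizon:.0f} h, "
          f"internal tasks ~{horizon * internal_conversion:.0f} h at 50% reliability")
```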

Figure 1 (above) shows how the proposed timeline requires far above-trend progress.

Other trends also suggest that we're unlikely to see powerful AI by early 2027. This is my sense from qualitative extrapolation based on AI usefulness and benchmarks (as in, it doesn't feel like another year or so of progress suffices for getting that close to powerful AI). I also think naive extrapolations of benchmarks much easier than automating engineering within AI companies (e.g. SWE-bench, RE-bench, terminal-bench) suggest they will probably take another year or more to saturate. And I expect a pretty large gap between saturating these easy benchmarks and fully automating engineering in AI companies (less than a year seems unlikely at the current pace of progress; a few years seems plausible).

My rebuttals to arguments that trend extrapolations will underestimate progress

One possible objection to these trend extrapolations is that you expect AI R&D to greatly accelerate well before full automation of engineering, resulting in above-trend progress. I'm skeptical of this argument as I discuss in this prior post. In short: AIs can speed up engineering quite a bit before this results in massive changes to the rate of AI progress, and for this to yield powerful AI by early 2027, you really need a pretty massive speedup relatively soon.

To be clear, I do think timelines to powerful AI are substantially shortened by the possibility that AI R&D automation massively speeds up progress; I just think we only see a large speedup at a higher level of capability that is further away. (This massive speedup isn't guaranteed, but it seems likely to me and that makes shorter timelines much more likely.)

Another possible objection is that we haven't yet done a good version of scaling up RL and once people figure this out early next year, we'll see above-trend progress. I argue against this in another post.

Another objection is that you expect inherent superexponentiality in the time-horizon trend and you expect this to kick in strongly enough within the next 12 months (presumably somewhere around 2-hour to 8-hour 50% reliability time-horizon) to yield full automation of research engineering by the end of 2026. This would require very strong superexponentiality that almost fully kicks in within the next two doublings, so it seems unlikely to me. I think this can be made roughly consistent with the historical trend with somewhat overfit parameters, but it still requires a deviation from a simpler and better fit to the historical trend within a pretty small (and specific!) part of the time-horizon curve.
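To get a feel for how strong this superexponentiality would need to be, here's a toy model where each successive doubling of the time-horizon takes a constant fraction r of the previous one's duration (so the horizon diverges in finite time); the starting doubling time and the window length are illustrative assumptions:

```python
# Toy superexponential model: each doubling of the time-horizon takes r times as long
# as the previous one (r < 1), so the horizon diverges after roughly D / (1 - r) months,
# where D is the current doubling time. Starting values are illustrative assumptions.

current_doubling_months = 6.0   # assumed current doubling time for internal-task horizon
months_available = 14           # roughly Oct 2025 -> end of 2026

for r in (0.9, 0.8, 0.7, 0.6, 0.5):
    months_to_divergence = current_doubling_months / (1 - r)
    verdict = "diverges in time" if months_to_divergence <= months_available else "too slow"
    print(f"r = {r:.1f}: horizon diverges after ~{months_to_divergence:.0f} months ({verdict})")
```

Under these assumed numbers, each doubling would need to take barely over half as long as the previous one for the horizon to diverge by the end of 2026, which is the sense in which the superexponentiality would have to kick in almost entirely within the next couple of doublings.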

Another objection is that you expect a massive algorithmic breakthrough that results in massively above trend progress prior to 2027. This is a pretty specific claim about faster than expected progress, so I'm skeptical by default. I think some substantial advances are already priced in for the existing trend. The base rate also isn't that high: massive (trend-breaking) breakthroughs seem to happen at a pretty low rate in AI, at least with respect to automation of software engineering (more like every 10 years than every couple of years).[36]

Another counterargument I've recently heard that I'm more sympathetic to goes something like:

Look, just 1.5 years ago AIs basically couldn't do agentic software engineering at all.[37] And now they're actually kind of decent at all kinds of agentic software engineering. This is a crazy fast rate of progress and when I qualitatively extrapolate it really seems to me like in another 1.5 years or so AIs will be able to automate engineering. I don't really buy this time-horizon trend or these other trends. After all, every concrete benchmark you can mention seems like it's going to saturate in a year or two and your argument depends on extrapolating beyond these benchmarks using abstractions (like time-horizon) I don't really buy. Besides, companies haven't really optimized for time-horizon, so once they get AIs to be decent agents on short-horizon tasks (which is pretty close), they'll just explicitly optimize for getting AIs good at completing longer tasks and this will happen quickly. After all, the AIs seem pretty close when I look at them and a year of progress is really a lot.

I'm somewhat sympathetic to being skeptical of the trend extrapolations I gave above because AIs haven't seriously been doing agentic software engineering for very long (getting a longer period for the trend requires looking at models that can barely do agentic software engineering). More generally, we shouldn't put that much weight on the time-horizon abstraction being a good way to extrapolate (for instance, the trend has only been making predictions for a short period and selection effects in finding such a trend could be significant). This pushes me toward being more uncertain and putting more weight on scenarios where AIs are massively (>10x) speeding up engineering in AI companies by the end of 2026.[38]

That said, even if AIs are fully automating engineering in AI companies by the end of 2026, I still think powerful AI by early 2027 is less likely than not. And more generally, I think there are some reasons to expect additional delays as I'll discuss in the next section.

Naively trend extrapolating to full automation of engineering and then expecting powerful AI just after this is probably too aggressive

One forecasting strategy would be to assume AIs can fully automate engineering once they can do multi-month tasks reliably, trend extrapolate to this point using the METR trend, and then expect powerful AI a short period after this. I think this will result in overly aggressive predictions. I'm effectively using this strategy as a loose/approximate lower bound on how soon we'll see powerful AI, but I think there are good reasons to think things might take longer.

One important factor is that time horizons on METR's task suite are probably substantially higher than in practice time horizons for satisfactorily completing (potentially messy) real world tasks within AI companies. (For instance, see here and here.) One complication is that AIs might be particularly optimized to be good at tasks within AI companies (via mechanisms like having a fine-tuned AI that's trained on the AI company's codebase and focusing RL on AI R&D tasks).

Another related factor is that time horizons are measured relative to pretty good human software engineers, but not the best human research engineers. Full automation of engineering requires beating the best human engineers at the hardest tasks and even just massively speeding up overall engineering (e.g. by 10x) might require the same. Part of this is that some tasks are harder than other tasks (at least for humans) and require a much better engineer to complete them, at least to get the task done in a reasonable amount of time. Thus, 50% reliability at some time horizon relative to decent human engineers might still imply much worse performance at that same time horizon than the best research engineers in AI companies, at least at hard tasks. In general, AI often seems to take a while (as in, more than a year) to go from beating decent human professionals at something to beating all human professionals.

I also expect last-mile problems for automation: a bunch of effort will be needed to get AIs good at the remaining things they are still bad at but that are needed to automate engineering, AI R&D, or other professions (this may or may not be priced into trends like the METR time-horizon trend[39]). Another way to put this is that there will probably be a somewhat long tail of skills/abilities that are needed for full automation (but that aren't needed for most moderate-length tasks) and that are particularly hard to get AIs good at using available approaches. This means there might be a substantial gap between "AIs can almost do everything in engineering and can do many extremely impressive things" and "AIs can fully or virtually fully automate engineering in AI companies". I do think this gap will be crossed faster than you might otherwise expect due to AIs speeding up AI R&D with partial automation (particularly partial automation of engineering, but somewhat broader than this). However, partial automation of engineering that results in engineers being effectively 10x faster (a pretty high milestone!) might only make AI progress around 70% faster.[40]
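The last point can be illustrated with a simple serial-bottleneck (Amdahl-style) calculation: if engineering labor is only part of what drives AI progress and the other inputs (compute, experiment runtime, non-engineering labor) aren't sped up, even a 10x engineering speedup yields a much smaller overall multiplier. The engineering share below is an illustrative assumption chosen so that 10x lands near the ~70% figure above:

```python
# Amdahl-style toy model: overall speedup when only the "engineering" share of what
# drives AI progress is accelerated. The 0.45 share is an illustrative assumption.

def overall_speedup(engineering_share: float, engineering_multiplier: float) -> float:
    # Time per unit of progress: non-engineering part unchanged, engineering part
    # divided by the multiplier.
    return 1 / ((1 - engineering_share) + engineering_share / engineering_multiplier)

share = 0.45  # assumed fraction of the serial path through progress that is engineering labor
for mult in (2, 5, 10, 100):
    print(f"{mult:>3}x engineering -> ~{overall_speedup(share, mult):.2f}x overall AI progress")
```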

I also think it's reasonably likely that there is a decently big gap (e.g. >1 year) between fully automating engineering and powerful AI, though I'm sympathetic to arguments that there won't be a big gap. For powerful AI to come very shortly after full automation of engineering, the main story would be that you get to full automation of AI R&D shortly after fully automating engineering (because the required amount of further progress is small and/or fully automating engineering greatly speeds up AI progress) and that full automation of AI R&D allows for quickly getting AIs that can do ~anything (which is required for powerful AI as defined above). But, this story might not work out and we might have a while (1-4 years?) between full or virtually full automation of engineering in AI companies and powerful AI.

What I expect

Here is a table comparing my quantitative predictions for 2026 to what we see for the proposed timeline consistent with Anthropic's predictions that I gave above:

| Date | Proposed: engineering multiplier | Proposed: 50%/90%-reliability time-horizon for internal engineering tasks | Mine: engineering multiplier | Mine: 50%/90%-reliability time-horizon for internal engineering tasks |
|---|---|---|---|---|
| Dec. 2026 | 50x | ∞/∞ | 1.75x | 7 hours / 1 hour |
| Sept. 2026 | 5x | 10 months/3 weeks | 1.6x | 5 hours / 0.75 hours |
| June 2026 | 3x | 2 weeks/1.5 days | 1.45x | 3.5 hours / 0.5 hours |
| March 2026 | 1.8x | 1 day/1 hour | 1.35x | 2.5 hours / 0.35 hours |
| Oct. 2025 | 1.3x | 1.5 hours/0.2 hours | 1.2x | 1.5 hours[41]/0.2 hours |

Figure 2: A comparison between my predictions and the predictions from the proposed timeline. Note that the historical values based on METR's data are estimates for this quantity (see the footnote on the caption for Figure 1 for details).

My quantitative predictions are mostly me trying to extrapolate out trends. This is easiest for 50%/90% reliability time-horizon as we have some understanding of the doubling time.[42]
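As a sanity check on these numbers, here's the doubling time implied by my 50%-reliability column above (the proposed timeline's column instead diverges to an unbounded horizon over the same period):

```python
import math

# 50%-reliability time-horizon for internal engineering tasks from my predictions above,
# as (months after Oct 2025, hours).
mine = [(0, 1.5), (5, 2.5), (8, 3.5), (11, 5.0), (14, 7.0)]

(m0, h0), (m1, h1) = mine[0], mine[-1]
doublings = math.log2(h1 / h0)
print(f"{doublings:.1f} doublings over {m1 - m0} months "
      f"-> ~{(m1 - m0) / doublings:.1f} months per doubling")
```

This works out to a doubling time of roughly 6 months, i.e. roughly continuing the existing trend rather than breaking above it.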

Notably, my predictions quite quickly start to deviate far from the proposed timeline I gave earlier that yields powerful AI in March 2027. Thus, it should be reasonably doable to update a decent amount throughout 2026, provided the proposed timeline is a reasonable characterization of Anthropic's perspective.

As far as more qualitative predictions, I generally expect December 2026 to look similar to the description of March 2026 I gave in my proposed timeline above (as in, the proposed timeline matching Anthropic's predictions). (I generally expect a roughly 3-4x slower progression than the proposed timeline matching Anthropic's predictions, at least in the next year or two and prior to large accelerations in AI R&D.)

What updates should we make in 2026?

If something like my median expectation for 2026 happens

Suppose that what we see in 2026 looks roughly like what I expect (the ~median outcome), with AIs capable of substantially accelerating engineering in AI companies (by 1.75x!) and typically performing near day-long tasks by the end of 2026.[43] How should various people update?

I'll probably update towards slightly later timelines[44] and a somewhat lower probability of seeing faster than trend progress (prior to massive automation of AI R&D, e.g. close to full automation of research engineering). This would cut my probability of seeing full automation of AI R&D prior to 2029 by a decent amount (as this would probably require faster than trend progress).[45] However, I'd also update toward the current paradigm continuing to progress at a pretty fast clip and this would push towards expecting powerful AI in the current paradigm within 15 years (and probably within 10).

How should Anthropic update? I think this would pretty dramatically falsify their current perspectives, so they should update towards putting much more weight on figuring out what trends to extrapolate, towards extrapolating something like the time-horizon trend in particular, and towards being somewhat more conservative in general. They should also admit their prediction was wrong (and hopefully make more predictions about what they now expect in the future so their perspective is clear). It should probably be pretty clear that they are going to be wrong (given the information they have access to) by the end of 2026 and probably they can get substantial evidence of their prediction being wrong earlier (by the middle of 2026 or maybe even right now?).

In practice, it might be tricky to adjudicate various aspects of my predictions (e.g. the speedup to engineers in AI companies).

If something like the proposed timeline (with powerful AI in March 2027) happens through June 2026

If in June 2026, AIs are accelerating research engineers by something like 3x (or more) and are usually succeeding in completing multi-week tasks within AI companies (or something roughly like this), then I would update aggressively toward shorter timelines though my median to powerful AI would still be after early 2027. Here's my guess for how I'd update (though the exact update would depend on other details): I'd expect that AI will probably be dramatically accelerating engineering by the end of 2026 (probably >10x), I'd have maybe 20% on full AI R&D automation by early 2027 (before May), and my median for full AI R&D automation would probably be pulled forward to around mid-2029. (I'd maybe put 15% on powerful AI by early 2027, 25% by mid-2028, and 50% by the start of 2031, though I've thought a bit less about forecasting powerful AI specifically.)

I don't really know exactly how Anthropic should update, but presumably under their current views they should gain somewhat more confidence in their current perspective.

If AI progress looks substantially slower than what I expect

It seems plausible that AI progress will obviously be slower in 2026 or that we'll determine that even near the end of 2026 AIs aren't seriously speeding up engineers in AI companies (or are maybe even slowing them down). In this case, I'd update towards longer timelines and a higher chance that we won't see powerful AI anytime soon. Presumably Anthropic should update even more dramatically than if what I expect happens. (It's also possible this would correspond to serious financial issues for AI companies, though I'd guess probably even somewhat slower progress than what I expect would suffice for continuing high levels of investment.)

If AI progress is substantially faster than I expect, but slower than the proposed timeline (with powerful AI in March 2027)

If progress is somewhat faster than I expect, I'd update towards AI progress accelerating more and earlier than I expected (as in, accelerated when represented in trends/metrics I'm tracking; it might not be well understood as accelerating if you were tracking the right underlying metric). I'd generally update towards shorter timelines and a higher probability of the current paradigm (or something similar) resulting in powerful AI. I think Anthropic should update towards longer timelines, but this might depend on the details.

Appendix: deriving a timeline consistent with Anthropic's predictions

I'll pull some from the AI 2027 timeline and takeoff trajectory, as my understanding is that this sort of takeoff trajectory roughly matches Anthropic's/Dario's expectations (at least up until roughly powerful AI level capability, possibly Anthropic expects a slower industrial takeoff). Even if they reject other aspects of the AI 2027 takeoff trajectory, I think AI greatly accelerating AI R&D (or some other type of self-improvement loop) is very likely needed to see powerful AI by early 2027, so at least this aspect of the AI 2027 trajectory can safely be assumed. (Hopefully Anthropic is noticing that their views about timelines also imply a high probability of an aggressive software-only intelligence explosion and they are taking this into account in their planning!)

Based on the operationalization I gave earlier, powerful AI is more capable than the notion of Superhuman AI Researcher used in AI 2027, but somewhat less capable than the notion of Superhuman Remote Worker. I'll say that powerful AI is halfway between these capabilities and in the AI 2027 scenario, this halfway point occurs in September 2027. We need to substantially compress this timeline relative to AI 2027, because we're instead expecting this to happen in March 2027 rather than September (6 months earlier) and also the current level of capabilities we see (as of October 2025) are probably somewhat behind the AI 2027 scenario (I think roughly 4 months behind[46]). This means the timeline takes around 60% as long as it did in AI 2027.[47]

In AI 2027, the superhuman coder level of capability is reached 6 months before powerful AI occurs, so this will be 3.5 months before March 2027 in our timeline. From here, I've worked backward filling in relevant milestones to interpolate to superhuman coder with the assumption that progress is speeding up some over time. I was a bit opinionated in adjusting the exact numbers and details.
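Here's a small sketch of the compression described above (and in footnote 47); the only milestone offset used is the superhuman coder one, and the exact rounding is my own:

```python
# Compress the remaining AI 2027 trajectory (27 months from its June-2025-equivalent
# point to powerful AI in Sept 2027) into the shorter window ending in March 2027.
ai2027_remaining_months = 27
months_cut = 10  # 6 months earlier target + ~4 months currently behind the scenario
factor = (ai2027_remaining_months - months_cut) / ai2027_remaining_months
print(f"compression factor: ~{factor:.2f}")  # ~0.6, i.e. "around 60% as long"

# Example: AI 2027 has superhuman coder 6 months before powerful AI; compressed, that is
# roughly 3.5-4 months before March 2027 (the timeline above uses 3.5 months).
print(f"superhuman coder offset: ~{6 * factor:.1f} months before March 2027")
```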


  1. Recently (in fact, after I initially drafted this post), OpenAI expressed a prediction/goal of automated AI research by March 2028. Specifically, Jakub Pachocki said: "...anticipating this progress, we of course make plans around it internally. And we want to provide some transparency around our thinking there. And so we want to take this maybe somewhat unusual step of sharing our internal goals and goal timelines towards these very powerful systems. And, you know, these particular dates, we absolutely may be quite wrong about them. But this is how we currently think. This is currently how we plan and organize." It's unclear what confidence in this prediction OpenAI is expressing and how much it is a prediction rather than just being an ambitious goal. The arguments I discuss in this post are also mostly applicable to this prediction. ↩︎

  2. In this post, I'll often talk about Anthropic as an entity (e.g. "Anthropic's prediction", "Anthropic thinks", etc.). I of course get that Anthropic isn't a single unified entity with coherent beliefs, but I still think talking in this way is reasonable because there are outputs from Anthropic expressing "official" predictions and because Dario does in many ways represent and lead the organization and does himself have beliefs. If you want, you can imagine replacing "Anthropic" with "Dario" in places where I refer to Anthropic as an entity in this post. ↩︎

  3. In fact, I'm not aware of anyone at Anthropic other than Jack Clark and Dario who have timelines this short, though I think many people expect only somewhat longer and the discussion in this post is still applicable to somewhat longer timelines. ↩︎

  4. It's possible that Dario/Anthropic have updated towards longer timelines, but if so, there isn't public evidence of this. ↩︎

  5. I cut a few of the bullets that seemed less relevant for brevity. ↩︎

  6. It's unclear to me whether Dario means the AI does tasks that would take an experienced human hours/days/weeks or that the AI is autonomously working for hours/days/weeks without human involvement (possibly completing tasks which are much larger in scope because AIs likely work faster than humans at tasks that they are good at). ↩︎

  7. I normally forecast to more specific milestones like "AIs capable of full automation of AI R&D", but I'll stick with powerful AI for this post. ↩︎

  8. At least AI progress given a fixed supply of data and compute, the AIs wouldn't necessarily be able to automate collecting data from the physical world insofar as this is important. I'm skeptical that this is that important. I also think Anthropic doesn't think this is that important, at least my guess is that they probably don't think spending lots of time acquiring additional data is necessary for very high levels of capability. ↩︎

  9. Beyond e.g. rarely asking humans about how to resolve various trade-offs between priorities that come up. Or perhaps very rare involvement by humans that doesn't substantially bottleneck progress. ↩︎

  10. Perhaps Dario intends "smarter than a Nobel Prize winner" to not include the level of cognitive diversity within a large company and he also thinks this cognitive diversity is critical for AI progress or progress in other scientific domains. In this case, the AI would be able to automate any given job, but not automate whole companies without a bunch of human help. I'll assume this isn't what Dario means because it would be incongruous with usage of the term "country of geniuses in a datacenter". ↩︎

  11. I interpret "early 2027" as "within the first third of 2027", but we could charitably interpret "early 2027" as "within the first half of 2027". ↩︎

  12. This is because giving the AI the relevant knowledge to do those jobs could take some time and diffusion might not happen that quickly (while I expect that AI companies will try hard to automate AI R&D as soon as they can, as much as they can). (It could still be easy to quickly adjudicate if AIs end up substantially more capable than needed for automating scientific R&D in most of the relevant fields.) ↩︎

  13. It's worth clarifying that the prediction made by Anthropic isn't (as far as I can tell) that powerful AIs will necessarily be externally deployed by early 2027 (they could be kept secret inside of AI companies), but I currently expect strong public evidence of these capabilities within a short period regardless. This is both because I expect external deployment of powerful AI (or at least systems close to powerful AI) within a short period and even in the absence of external deployment, we may be able to get strong evidence via other routes (e.g. transparency resulting in credible evidence). ↩︎

  14. By accelerating research engineering by 2x, I mean "the acceleration to research engineering activities is as valuable to the company as it would be to have all of their research engineers operate 2x faster when doing work that is reasonably centrally research engineering work (including parts of their job which aren't coding like high level software design and meetings, though AIs wouldn't necessarily have to accelerate every part of the job for an overall 2x boost)". Note that accelerating research engineering by 2x results in a substantially less than 2x acceleration to overall AI progress. See here for discussion. This is also different from the notion of AI R&D labor acceleration I define here (the notion discussed in the linked post is a broader notion that includes all labor, not just engineering). Also, note that when I said "2x acceleration to research engineering", I mean "2x acceleration to things that are best described as research engineering work (but still including messy aspects of these jobs, e.g. not just literal coding)". ↩︎

  15. I also expect that by early 2027 (start of May 2027) annualized AI company revenue will be around $100 billion and that reasonably large fractions of software engineering outside of AI companies will be transformed substantially, though to a lesser extent than within AI companies. ↩︎

  16. Strangely, in their blog post about their recommendations to the OSTP Anthropic says something substantially stronger than what they say elsewhere: "As our CEO Dario Amodei writes in 'Machines of Loving Grace', we expect powerful AI systems will emerge in late 2026 or early 2027". Dario doesn't say this in Machines of Loving Grace, instead he says a substantially weaker statement: "I think it could come as early as 2026, though there are also ways it could take much longer.". Further, in their actual submission to the OSTP, they say "Based on current research trajectories, we anticipate that powerful AI systems could emerge as soon as late 2026 or 2027." which is also weaker. The submission to the OSTP did say "Powerful AI technology will be built during this Administration." which implies a pretty high probability (maybe 80%?) on powerful AI prior to January 2029 and thus probably a decently high probability by the middle of 2027 (only 1.5 years earlier), though this could be consistent with thinking powerful AI by early 2027 is only plausible (e.g. 25% likely). ↩︎

  17. I would also feel substantially more sympathetic (and retain more weight on the views of the relevant people) if intermediate predictions (for e.g. the beginning/middle of 2026) were made by relevant proponents and the relevant proponents acknowledged these predictions were falsified insofar as that ends up being the case. Note that even if intermediate predictions consistent with shorter timelines are falsified, that shouldn't necessarily result in having (much) longer timelines because there might be other stronger updates pointing in favor of shorter timelines. ↩︎

  18. The full quote is: "I think we'll be there in three to six months—where AI is writing 90 percent of the code. And then in twelve months, we may be in a world where AI is writing essentially all of the code." ↩︎

  19. Because METR doesn't directly measure time-horizon on internal engineering tasks relative to an AI company's engineers (instead measuring this on their task suite with contractors for the human baseline), we need to convert these values. I've done this by multiplying by 0.5 which is my guess for the correspondence between these numbers for a given model, at least over the past 6 months or so. (This lines up with the initial value used for October of 1.5 hours for the 50% reliability time-horizon on internal tasks.) ↩︎

  20. This matches the Superhuman Coder level of capabilities from AI 2027. ↩︎

  21. And can also complete these projects around 30x faster than a small human team, making this a somewhat feasible benchmark to run. ↩︎

  22. As discussed earlier, when I say this, I mean "research engineers are as productive as if they operated 5x faster when doing research engineering work (including non-coding activities like meetings, though AIs don't have to speed up all aspects of the job to achieve an overall acceleration of 5x)". I don't necessarily mean their output is 5x higher as this might (in some cases) depend on other inputs like compute. ↩︎

  23. Doing the decomposition into smaller tasks and then sampling might be messy. I'll ignore these details and assume some reasonable option exists. ↩︎

  24. That said, for this timeline at the point of September 2026, AIs can usually acquire the relevant context quickly and easily. ↩︎

  25. A more general operationalization is: AIs can implement production-ready inference for a new (importantly different architecture) AI in the context of an existing codebase with existing examples and another implementation of this new AI (for different hardware) that can be probed, and these AIs can reach performance (and stability etc.) similar to a well optimized human implementation. The implementation would need to be mergeable as is, but wouldn't necessarily need to handle actually being fully deployable. ↩︎

  26. Maybe we can operationalize this as: after humans have gotten the inference/training implementation basically working, and have implemented good correctness tests, the AI can improve the implementation by as much as an expert human would in a few days, in a PR that's at least as mergeable as typical human PRs, more than half of the time. ↩︎

  27. The boost by AIs to engineering productivity is as good as an X times speed up to how fast engineers at the company work. Like, for 10x, it is as good as if all of the engineers thought, typed, talked, etc. 10x faster. ↩︎

  28. I'm using the same notion of AI R&D multiplier as AI 2027. As in, the pre-diminishing returns multiplier of algorithmic progress on top of the amount of compute and human labor at a given amount in time. Note that this isn't a multiplier of overall AI progress which is also driven by scaling up compute and scaling up spending on data. I'm also basing these numbers on AI 2027 to a significant degree, though assuming somewhat more aggressive numbers based on Anthropic predicting a faster takeoff. ↩︎

  29. This is for randomly sampled engineering tasks within the company, when comparing to a typical/normal engineer who has the relevant skill set but who doesn't have special context. I give both the 50% reliability number and the 90% reliability number. ↩︎

  30. It's possible that Anthropic expects less acceleration in AI R&D from powerful AI (and full automation of AI R&D) and instead expects that the amount of additional effort required to go from "full automation of research engineering" to "powerful AI" isn't that high. I've pulled in AI R&D acceleration numbers by assuming Anthropic thinks acceleration will be somewhat more aggressive than predicted by AI 2027 in keeping with their more aggressive forecast. I personally expect lower multipliers. ↩︎

  31. I tentatively think the gap between 50% and 90% reliability will increase when AIs are very capable and good long horizon agents but still have some things they can't do. This is because there are many very long/large tasks that are at least somewhat specialized or that at least avoid potential weaknesses but when randomly sampling tasks a bunch of tasks will still touch upon weaknesses. (Another way to put this is that tasks will cluster some in the skill set required and many/most big tasks won't be diverse enough to hit the AI's weaknesses.) ↩︎

  32. It's plausible that the 90% reliability time horizon for internal engineering tasks is actually substantially lower right now; I don't have a strong view about this number and it might be driven by a pretty small fraction of tasks that current AIs are very bad at. I don't think this makes a huge difference to the forecast either way. ↩︎

  33. This is my view after putting some weight on the views of people predicting very short timelines, including some weight on predictions from Dario/Anthropic. ↩︎

  34. To be clear, this algorithmic breakthrough would be reasonably likely to involve scaling up some new type of training that requires collecting new types of data. (E.g., see recent advances in RLVR.) So while the breakthrough would have to be algorithmic to be fast enough, progress could also involve a bunch of data collection and scaling up training work that happens over some longer period of time smoothing out progress (though this would have to happen pretty quickly). In general, I'd guess that large breakthroughs tend to get smoothed out over some period due to iteration and figuring out how to best leverage the new thing. ↩︎

  35. It would be reasonable to start the trend with GPT-3.5 in which case the trend is around 3.5 years old or with GPT-2 in which case the trend is around 6.75 years old. (The measurements for GPT-2 and GPT-3 are more dubious, so I'm sympathetic to ignoring these models.) ↩︎

  36. There are also pretty good reasons to think that the rate of massive breakthroughs / paradigm changes that result in above trend progress should go down over time as the field of AI grows. This is via the law of large numbers; however, research progress being possibly heavy tailed complicates this argument. ↩︎

  37. Sonnet 3.5 was released 1.5 years ago and I'm calling that as the first AI that could do agentic software engineering a bit. ↩︎

  38. I think "accelerating engineering using AIs is as useful as making all engineers at the company >10x faster" is around 15% likely by the end of 2026. ↩︎

  39. It could be priced in for the time horizon trend because increasingly long tasks require being able to complete increasingly hard and diverse subtasks. See here for some discussion of this perspective. Even if it is already priced in, this does imply that you might only reach full automation of engineering at the point of surprisingly high time horizons (rather than seeing inherent superexponentiality in the trend). E.g., maybe you get full automation when the trend predicts AIs can reliably complete many year long tasks selected from a diverse distribution of messy tasks from within AI companies. This would be because this long time horizon corresponded to resolving the final last mile problems. ↩︎

  40. See here for discussion and the rest of the post for a broader argument like this. ↩︎

  41. I expect a bit lower than 1.5 hours, but the comparison is cleanest if I align the initial values between my prediction and the proposed timeline and use the same conversion factor. ↩︎

  42. Though we don't necessarily know the conversion between performance on METR's task suite (with relatively benchmarkable tasks) and performance on actual tasks within AI companies. These could differ in either the doubling time or the initial time-horizon. I'm also looking at 90% reliability while METR measures 80% reliability, though some fraction of tasks may be invalid/impossible meaning that 80%-reliability on METR's task suite might be closer to something like 85% in practice. ↩︎

  43. By the end of 2026, METR's task suite may no longer be meaningful due to not having enough hard tasks from a representative distribution. Regardless, we should be able to get a sense for how good AIs are at autonomous software engineering using other benchmarks and qualitative reporting. ↩︎

  44. In the median outcome, I expect to update towards longer timelines but this doesn't violate conservation of expected evidence because there is some chance I update a lot towards much shorter timelines. (In other words, my update distribution is asymmetric.) See also Joe Carlsmith's blog post about predictable updating. ↩︎

  45. However, I would still think that massive automation of engineering within AI companies (accelerating engineering by >5x) before 2029 is totally plausible and this could pose substantial risk of sabotage from misaligned AIs. ↩︎

  46. AI 2027 forecasts a ~1.25x AI R&D speedup for October while I think the current speedup is probably more like 1.1x or possibly lower. In terms of qualitative descriptions of capabilities, we seem somewhat behind though not that far behind. ↩︎

  47. We're at the equivalent of 6/2025 in the AI 2027 trajectory, which is 27 months before 9/2027, and we're cutting 10 months from this trajectory, meaning the remaining trajectory is compressed to (27-10) / 27 ≈ 0.6 of its original length. ↩︎