Comments by romeo (LessWrong, sorted by newest)
Serving LLM on Huawei CloudMatrix
romeo · 2mo · 10

The CloudMatrix announcements indeed predated AI 2027, but the compute forecast did make predictions about how much compute China will have, including domestic production, smuggling, and legal purchases of foreign chips, and found that China would still be significantly behind by 2027. The CloudMatrix doesn't change this because it's still around 2x less cost-efficient than what US companies have access to, and US companies are investing around 4-5x as much as their Chinese counterparts. This follow-up blog post addressed the concern that we underestimated China, focusing on this compute gap.

I think China has a very serious chance of overtaking the US in terms of both compute and overall frontier AI capabilities post-2030, since they might crack EUV by then and the US will start running into more significant power bottlenecks that China won't face. 
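As a rough sketch of how those two factors compound (treating the ~2x cost-efficiency gap and the ~4-5x investment gap as independent multipliers, which is a simplification):

```python
# Rough sketch: combine the two gaps stated above into an effective-compute gap.
# Assumes the factors multiply independently (a simplification).
cost_efficiency_gap = 2          # CloudMatrix ~2x less cost-efficient per dollar
for investment_gap in (4, 5):    # US labs investing ~4-5x their Chinese counterparts
    print(f"{investment_gap}x spend * {cost_efficiency_gap}x cost-efficiency "
          f"=> ~{investment_gap * cost_efficiency_gap}x effective compute")
# => ~8x and ~10x
```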

Shortform
romeo · 4mo · 100

I agree that it's very plausible that China would steal the weights of Agent-3 or Agent-4 after stealing Agent-2. This was a toss-up when writing the story; we ultimately went with just stealing Agent-2 for a combination of reasons. From memory, the most compelling were something like:

  1. OpenBrain + the national security state can pretty quickly/cheaply significantly increase the difficulty and, importantly, the lead time required for another weights theft.
  2. Through 2027 China is only 2-3 months behind, so the upside from another weights theft (especially when you consider the lead time needed) is not worth the significantly increased cost from (1). 

Something that is a bit more under-explored is potential sabotage between the projects. We made the (uncertain) assumption that the efforts on both sides would roughly cancel out, but we were quite uncertain about the offense-defense balance. I think a more offense-favored reality could change the story quite a bit, basically with the idea of MAIM slowing both sides down a bunch.

Early Chinese Language Media Coverage of the AI 2027 Report: A Qualitative Analysis
romeo · 4mo · 30

FYI, Scott Alexander wrote up AI 2027: Media, Reactions, Criticism.

Slowdown After 2028: Compute, RLVR Uncertainty, MoE Data Wall
romeo · 4mo · 30

so maybe the second 2,000x of scaling should be reached by 2045 instead.

Yeah, sounds reasonable; that would match up with my 1.56x/year number. So to summarize, we both think this is roughly plausible for 2028-2045?

1.3x/year (compute production) x 1.2x/year (compute efficiency) ~= 1.55x/year (compute available)

1.1x/year (investment) x 1.4x/year (price performance) ~= 1.55x/year (compute available)

So a 3x slowdown compared to the 2022-2028 trend (~3.5x/year).
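For concreteness, here's the arithmetic behind those two lines and the dates (measuring the 'slowdown' as a ratio of log growth rates, i.e. how much longer each doubling takes, which is my reading of the comparison):

```python
import math

# Two factorizations of post-2028 compute growth that roughly converge:
supply_side = 1.3 * 1.2   # compute production x compute efficiency  ~= 1.56x/year
demand_side = 1.1 * 1.4   # investment x price performance           ~= 1.54x/year
print(f"supply side: ~{supply_side:.2f}x/year, demand side: ~{demand_side:.2f}x/year")

# Consistency check: the "second 2,000x of scaling by 2045" claim.
years_to_2000x = math.log(2000) / math.log(1.55)
print(f"2,000x at ~1.55x/year takes ~{years_to_2000x:.0f} years")  # ~17 years, 2028 -> ~2045

# "3x slowdown" vs the 2022-2028 trend of ~3.5x/year (ratio of log growth rates):
print(f"slowdown: ~{math.log(3.5) / math.log(1.55):.1f}x")         # ~2.9x, i.e. roughly 3x
```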

Slowdown After 2028: Compute, RLVR Uncertainty, MoE Data Wall
romeo · 4mo · 30

To be clear, I don't think the profit margin is the only thing that explains the discrepancy.

I think the relevant question is more like: under my method, is 1.3x (production) x 1.2x (hardware) = 1.56x/year realistic over 2-3 decades, or am I being too bullish? You could ask an analogous question about your method (i.e., is 1x investment and 1.4x price performance realistic over the next 2-3 decades?). Two different ways of looking at it that should converge.

If I'm not being too bullish with my numbers (which is very plausible, e.g., it could easily be 1.2x, 1.2x), then I'd guess the discrepancy with your method comes from it being muddled with economic factors (not just chip designer profit margins but supply/demand factors affecting costs across the entire supply chain, e.g., down to how much random equipment costs and salaries for engineers). Maybe 1x investment is too low; maybe it should be multiplied by inflation and GDP growth?
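To illustrate how much that question matters over the relevant horizon, a small sketch of cumulative compute growth under my numbers versus the more pessimistic 1.2x, 1.2x case (the horizon lengths are just for illustration):

```python
# Cumulative compute growth under the two parameterizations mentioned above.
scenarios = {
    "1.3x production * 1.2x hardware": 1.3 * 1.2,  # ~1.56x/year
    "1.2x production * 1.2x hardware": 1.2 * 1.2,  # ~1.44x/year
}
for label, rate in scenarios.items():
    for years in (20, 30):
        print(f"{label}: ~{rate ** years:.0e}x total over {years} years")
# 1.56x/year: ~7e3x over 20y, ~6e5x over 30y
# 1.44x/year: ~1e3x over 20y, ~6e4x over 30y
```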

Slowdown After 2028: Compute, RLVR Uncertainty, MoE Data Wall
romeo · 4mo · 113

When funding stops increasing, the current pace of 3.55x per year (fueled by increasing funding) regresses to the pace of improvement in price-performance of compute of 1.4x per year, which is 3.7x slower. If the $140bn training systems of 2028 do get built, they'll each produce about 1.5e22 BF16 FLOP/s of compute, enough to train models for about 5e28 BF16 FLOPs.


This is a nice way to break it down, but I think it might have weird dependencies, e.g., on chip designer profit margins.

Instead of: 

training run investment ($) x hardware price performance (FLOP/$) = training compute (FLOP)

Another possible breakdown is: 

hardware efficiency per unit area (FLOP/s/mm^2) x global chip production (mm^2) x global share of chips used in training run (%) x training time (s) = training compute (FLOP)

This gets directly at the supply side of compute. It's basically 'Moore's law x AI chip production x share of chips used'. In my model, the factors over the next three years are 1.35 x 1.65 x 1.5 ~= 3.4x, so it matches your 3.55x/year pretty closely. Where we differ slightly, I think, is in the later compute slowdown.
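Spelled out, the supply-side breakdown with the near-term factors above (the per-factor numbers are the same ones listed; the breakdown itself is just the identity stated, not a separate model):

```python
# Supply-side breakdown of training-compute growth over the next ~3 years,
# using the per-factor growth rates listed above.
hardware_per_area = 1.35  # FLOP/s/mm^2 per year ("Moore's law")
chip_production   = 1.65  # leading-edge AI chip area produced per year
share_of_chips    = 1.5   # share of global AI chips devoted to the largest run

print(f"~{hardware_per_area * chip_production * share_of_chips:.2f}x/year")
# ~3.34x/year, close to the quoted ~3.55x/year
```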

Under my model there are also one-time gains happening in AI chip production and share of chips used (as a result of the one-time spending gains in your model). Chip production has one-time gains because AI only uses 5-10% of TSMC's leading nodes and is using up spare capacity as fast as packaging/memory can be produced. Once this caps out, I think the 1.65x will default to something like 1.2-1.4x as it gets bottlenecked on fab expansion (assuming, like you said, an investment slowdown). 'Share of chips used' growth goes to 1x by definition.

Even taking the lower end of that estimate would mean that 'Moore's law' hardware gains would have to slow down ~2x, to 1.16x/year, to match your 1.4x number. I do think hardware gains will slow somewhat, but 1.16x is below what I would bet. Taking my actual medians, I think I'm at something like 1.3x (production) x 1.2x (hardware) = 1.56x/year, so more like a 2.8x slowdown, not a 3.7x slowdown.
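Spelling that arithmetic out (reading the 'Nx slowdown' figures as ratios of log growth rates, i.e. how much longer each doubling takes):

```python
import math

# Low-end production growth (1.2x/year) with total compute growth of 1.4x/year
# implies a hardware gain of:
implied_hw = 1.4 / 1.2
print(f"implied hardware gain: ~{implied_hw:.2f}x/year")   # ~1.17x/year (the ~1.16x figure above)

# That's roughly a 2x slowdown from 1.35x/year in log terms:
print(f"hardware slowdown: ~{math.log(1.35) / math.log(implied_hw):.1f}x")   # ~1.9x

# My medians (1.3 x 1.2 = 1.56x/year) vs the current ~3.55x/year trend,
# compared with the 1.4x/year case:
print(f"1.56x/year: ~{math.log(3.55) / math.log(1.56):.1f}x slowdown")  # ~2.8x
print(f"1.40x/year: ~{math.log(3.55) / math.log(1.40):.1f}x slowdown")  # ~3.8x (the quoted ~3.7x)
```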

So resolving the discrepancy: it seems like my model is basically saying that your model overestimates the slowdown because it assumes profit margins stay fixed, whereas under slowing investment growth these should collapse? That doesn't feel like it fully explains it, though, since it seems like it should be a one-time fall (albeit a big one). Maybe in the longer term (i.e., post-2035) I agree with you more and my 1.3x production number is too bullish.

romeo's Shortform
romeo · 4mo · 252

A brief history of things that have defined my timelines to AGI since learning about AI safety <2 years ago

  • Bio anchors gave me a rough ceiling around 1e40 FLOP for how much compute will easily make AGI.
  • Fun with +12 OOMs of Compute brought that same 'training-compute-FLOP needed for AGI' down a bunch to around 1e35 FLOP.
  • Researching how much compute is scaling in the near future.

At this point I think my distribution was pretty concentrated across ~1e27-1e33 FLOP, with a very long tail, and something like a 2030-2040 50% CI.

  • The benchmarks+gaps argument to partial AI research automation.
  • The takeoff forecast for how partial AI research automation will translate to algorithmic progress.
  • The trend in METR's time horizon data.

At this point my middle 50% CI is something like 2027-2035, and it would be tighter if not for a long tail that I keep around just because I think it's reasonable to have a bunch of uncertainty. Though I do wish I had more arguments in place to justify the tail or make it bigger, ones that compete, in how compelling they feel to me, with the ones above.

romeo's Shortform
romeo · 4mo · 31

Thanks for linking. I skimmed the early part of this post because you labelled it explicitly as viewpoints. Then I saw that you engaged with a bunch of arguments about short timelines, but they are all pretty weak/old ones that I never found very convincing (the only exception is that bio anchors gave me a ceiling early on of around 1e40 FLOP for the compute needed to make AGI). Then you got to LLMs and acknowledged:

  • The existence of today's LLMs is scary and should somewhat shorten people's expectations about when AGI comes.

But then you gave a bunch of points about the things LLMs are missing and suck at, which I already agree with.

Aside: Have I mischaracterized so far? Please let me know if so.

So, do you think you have arguments against the 'benchmarks+gaps argument' for timelines to AI research automation, or why AI research automation won't translate to much algorithmic progress? Or any of the other things that I listed as ones that moved my timelines down:

  • Fun with +12 OOMs of Compute: IMO a pretty compelling writeup that brought my uncertainty over 'training-compute-FLOP needed for AGI' down a bunch to around 1e35 FLOP.
  • Researching how much compute is scaling.
  • The benchmarks+gaps argument to partial AI research automation
  • The takeoff forecast for how partial AI research automation will translate to algorithmic progress.
  • The recent trend in METR's time horizon data.
romeo's Shortform
romeo · 4mo · 30

But instead of discussing the crux of which system is relevant (which has to be about details of recursive self-improvement), only the proponents will tend to talk about software-only singularity, while the opponents will talk about different systems whose scaling they see as more relevant, such as the human economy or datacenter economy.

Totally agree! Thank you for phrasing it elegantly. This is basically what I commented on Ege's post yesterday; I asked him to engage with the actual crux and make arguments about why the software-only singularity is unlikely.

romeo's Shortform
romeo · 4mo · 30

There's an entire class of problem within ML that I would see as framing problems and the one thing I think LLMs don't help that much with is framing. 

Could you say more about this? What do you mean by framing in this context? 

There's this quote I've been seeing from Situation Awareness that all you have to do is "believe in a straight line on a curve" and when I hear that and see the general trend extrapolations my spider senses start tingling.

Yeah, that's not really compelling to me either; SitA didn't move my timelines. Curious if you have engaged with the benchmarks+gaps argument for timelines to AI R&D automation (timelines forecast), and then the AI algorithmic progress that automation would drive (takeoff forecast). These are the things that actually moved my view.

Thanks for the link, that's compelling.
