Comments by romeo (LessWrong, sorted by newest)
Serving LLM on Huawei CloudMatrix
romeo · 2mo · 10

The CloudMatrix announcements indeed predated AI 2027, but the compute forecast did make predictions about how much compute China will have, including domestic production, smuggling, and legal purchases of foreign chips, and found that China would still be significantly behind by 2027. The CloudMatrix doesn't change this because it's still around 2x less cost-efficient than what US companies have access to, and US companies are investing around 4-5x as much as their Chinese counterparts. This follow-up blog post addressed the concern that we underestimated China, focusing on this compute gap.

I think China has a very serious chance of overtaking the US in terms of both compute and overall frontier AI capabilities post-2030, since they might crack EUV by then and the US will start running into more significant power bottlenecks that China won't face. 
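As a rough sketch of how those two factors compound (treating the ~2x cost-efficiency gap and the ~4-5x investment gap as independent multipliers, which is a simplification):

```python
# Rough sketch: combine the two gaps stated above into an effective-compute gap.
# Assumes the factors multiply independently (a simplification).
cost_efficiency_gap = 2          # CloudMatrix ~2x less cost-efficient per dollar
for investment_gap in (4, 5):    # US labs investing ~4-5x their Chinese counterparts
    print(f"{investment_gap}x spend * {cost_efficiency_gap}x cost-efficiency "
          f"=> ~{investment_gap * cost_efficiency_gap}x effective compute")
# => ~8x and ~10x
```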

Shortform
romeo · 4mo · 100

I agree that it's very plausible that China would steal the weights of Agent-3 or Agent-4 after stealing Agent-2. This was a toss-up when writing the story; we ultimately went with just stealing Agent-2 for a combination of reasons. From memory, the most compelling were something like:

  1. OpenBrain + the national security state can pretty quickly/cheaply significantly increase the difficulty and, importantly, the lead time required for another weights theft.
  2. Through 2027 China is only 2-3 months behind, so the upside from another weights theft (especially when you consider the lead time needed) is not worth the significantly increased cost from (1). 

Something that is a bit more under-explored is potential sabotage between the projects. We made the (uncertain) assumption that the efforts on both sides would roughly cancel out, but we were quite uncertain about the offense-defense balance. I think a more offense-favored reality could change the story quite a bit, basically with the idea of MAIM slowing both sides down a bunch.

Early Chinese Language Media Coverage of the AI 2027 Report: A Qualitative Analysis
romeo · 4mo · 30

FYI, Scott Alexander wrote up AI 2027: Media, Reactions, Criticism.

Slowdown After 2028: Compute, RLVR Uncertainty, MoE Data Wall
romeo · 4mo · 30

so maybe the second 2,000x of scaling should be reached by 2045 instead.

Yeah, sounds reasonable; that would match up with my 1.56x/year number. So to summarize, we both think this is roughly plausible for 2028-2045?

1.3x/year (compute production) x 1.2x/year (compute efficiency) ~= 1.55x/year (compute available)

1.1x/year (investment) x 1.4x/year (price performance) ~= 1.55x/year (compute available)

So a 3x slowdown compared to the 2022-2028 trend (~3.5x/year).
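For concreteness, here's the arithmetic behind those two lines and the dates (measuring the 'slowdown' as a ratio of log growth rates, i.e. how much longer each doubling takes, which is my reading of the comparison):

```python
import math

# Two factorizations of post-2028 compute growth that roughly converge:
supply_side = 1.3 * 1.2   # compute production x compute efficiency  ~= 1.56x/year
demand_side = 1.1 * 1.4   # investment x price performance           ~= 1.54x/year
print(f"supply side: ~{supply_side:.2f}x/year, demand side: ~{demand_side:.2f}x/year")

# Consistency check: the "second 2,000x of scaling by 2045" claim.
years_to_2000x = math.log(2000) / math.log(1.55)
print(f"2,000x at ~1.55x/year takes ~{years_to_2000x:.0f} years")  # ~17 years, 2028 -> ~2045

# "3x slowdown" vs the 2022-2028 trend of ~3.5x/year (ratio of log growth rates):
print(f"slowdown: ~{math.log(3.5) / math.log(1.55):.1f}x")         # ~2.9x, i.e. roughly 3x
```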

Slowdown After 2028: Compute, RLVR Uncertainty, MoE Data Wall
romeo · 4mo · 30

To be clear, I don't think the profit margin is the only thing that explains the discrepancy.

I think the relevant question is more like: under my method, is 1.3x (production) x 1.2x (hardware) = 1.56x/year realistic over 2-3 decades, or am I being too bullish? You could ask an analogous question about your method (i.e., is 1x investment and 1.4x price performance realistic over the next 2-3 decades?). Two different ways of looking at it that should converge.

If I'm not being too bullish with my numbers (which is very plausible, e.g., it could easily be 1.2x, 1.2x), then I'd guess the discrepancy with your method comes from it being muddled with economic factors (not just chip designer profit margins but supply/demand factors affecting costs across the entire supply chain, e.g., down to how much random equipment costs and salaries for engineers). Maybe 1x investment is too low; maybe it should be multiplied by inflation and GDP growth?
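To illustrate how much that question matters over the relevant horizon, a small sketch of cumulative compute growth under my numbers versus the more pessimistic 1.2x, 1.2x case (the horizon lengths are just for illustration):

```python
# Cumulative compute growth under the two parameterizations mentioned above.
scenarios = {
    "1.3x production * 1.2x hardware": 1.3 * 1.2,  # ~1.56x/year
    "1.2x production * 1.2x hardware": 1.2 * 1.2,  # ~1.44x/year
}
for label, rate in scenarios.items():
    for years in (20, 30):
        print(f"{label}: ~{rate ** years:.0e}x total over {years} years")
# 1.56x/year: ~7e3x over 20y, ~6e5x over 30y
# 1.44x/year: ~1e3x over 20y, ~6e4x over 30y
```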

Slowdown After 2028: Compute, RLVR Uncertainty, MoE Data Wall
romeo · 4mo · 113

When funding stops increasing, the current pace of 3.55x per year (fueled by increasing funding) regresses to the pace of improvement in price-performance of compute of 1.4x per year, which is 3.7x slower. If the $140bn training systems of 2028 do get built, they'll each produce about 1.5e22 BF16 FLOP/s of compute, enough to train models for about 5e28 BF16 FLOPs.


This is a nice way to break it down, but I think it might have weird dependencies, e.g., on chip designer profit margins.

Instead of: 

training run investment ($) x hardware price performance (FLOP/$) = training compute (FLOP)

Another possible breakdown is: 

hardware efficiency per unit area (FLOP/s/mm^2) x global chip production (mm^2) x global share of chips used in training run (%) x training time (s) = training compute (FLOP)

This gets directly at the supply side of compute. It's basically 'Moore's law x AI chip production x share of chips used'. In my model, the factors over the next three years are 1.35 x 1.65 x 1.5 ~= 3.4x, so it matches your 3.55x/year pretty closely. Where we differ slightly, I think, is in the later compute slowdown.
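Spelled out, the supply-side breakdown with the near-term factors above (the per-factor numbers are the same ones listed; the breakdown itself is just the identity stated, not a separate model):

```python
# Supply-side breakdown of training-compute growth over the next ~3 years,
# using the per-factor growth rates listed above.
hardware_per_area = 1.35  # FLOP/s/mm^2 per year ("Moore's law")
chip_production   = 1.65  # leading-edge AI chip area produced per year
share_of_chips    = 1.5   # share of global AI chips devoted to the largest run

print(f"~{hardware_per_area * chip_production * share_of_chips:.2f}x/year")
# ~3.34x/year, close to the quoted ~3.55x/year
```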

Under my model there are also one-time gains happening in AI chip production and share of chips used (as a result of the one-time spending gains in your model). Chip production has one-time gains because AI only uses 5-10% of TSMC's leading nodes and is using up spare capacity as fast as packaging/memory can be produced. Once this caps out, I think the 1.65x will default to something like 1.2-1.4x as it gets bottlenecked on fab expansion (assuming, like you said, an investment slowdown). 'Share of chips used' growth goes to 1x by definition.

Even taking the lower end of that estimate would mean that 'Moore's law' hardware gains would have to slow down ~2x, to 1.16x/year, to match your 1.4x number. I do think hardware gains will slow somewhat, but 1.16x is below what I would bet. Taking my actual medians, I think I'm at something like 1.3x (production) x 1.2x (hardware) = 1.56x/year, so more like a 2.8x slowdown, not a 3.7x slowdown.
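Spelling that arithmetic out (reading the 'Nx slowdown' figures as ratios of log growth rates, i.e. how much longer each doubling takes):

```python
import math

# Low-end production growth (1.2x/year) with total compute growth of 1.4x/year
# implies a hardware gain of:
implied_hw = 1.4 / 1.2
print(f"implied hardware gain: ~{implied_hw:.2f}x/year")   # ~1.17x/year (the ~1.16x figure above)

# That's roughly a 2x slowdown from 1.35x/year in log terms:
print(f"hardware slowdown: ~{math.log(1.35) / math.log(implied_hw):.1f}x")   # ~1.9x

# My medians (1.3 x 1.2 = 1.56x/year) vs the current ~3.55x/year trend,
# compared with the 1.4x/year case:
print(f"1.56x/year: ~{math.log(3.55) / math.log(1.56):.1f}x slowdown")  # ~2.8x
print(f"1.40x/year: ~{math.log(3.55) / math.log(1.40):.1f}x slowdown")  # ~3.8x (the quoted ~3.7x)
```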

So resolving the discrepancy: it seems like my model is basically saying that your model overestimates the slowdown because it assumes profit margins stay fixed, whereas under slowing investment growth these should collapse? That doesn't feel like it fully explains it, though, since it seems like it should be a one-time fall (albeit a big one). Maybe in the longer term (i.e., post-2035) I agree with you more and my 1.3x production number is too bullish.

romeo's Shortform
romeo · 4mo · 252

A brief history of things that have defined my timelines to AGI since learning about AI safety <2 years ago

  • Bio anchors gave me a rough ceiling around 1e40 FLOP for how much compute will easily make AGI.
  • Fun with +12 OOMs of Compute brought that same 'training-compute-FLOP needed for AGI' down a bunch to around 1e35 FLOP.
  • Researching how much compute is scaling in the near future.

At this point I think my distribution was pretty concentrated across ~1e27-1e33 FLOP, with a very long tail, and something like a 2030-2040 50% CI.

  • The benchmarks+gaps argument to partial AI research automation.
  • The takeoff forecast for how partial AI research automation will translate to algorithmic progress.
  • The trend in METR's time horizon data.

At this point my middle 50% CI is something like 2027-2035, and it would be tighter if not for a long tail that I keep around just because I think it's reasonable to have a bunch of uncertainty. Though I do wish I had more arguments in place to justify the tail or make it bigger, ones that compete, in how compelling they feel to me, with the ones above.

romeo's Shortform
romeo · 4mo · 31

Thanks for linking. I skimmed the early part of this post because you labelled it explicitly as viewpoints. Then I saw that you engaged with a bunch of arguments about short timelines, but they are all pretty weak/old ones that I never found very convincing (the only exception is that bio anchors gave me a ceiling early on of around 1e40 FLOP for the compute needed to make AGI). Then you got to LLMs and acknowledged:

  • The existence of today's LLMs is scary and should somewhat shorten people's expectations about when AGI comes.

But then you gave a bunch of points about the things LLMs are missing and suck at, which I already agree with.

Aside: Have I mischaracterized so far? Please let me know if so.

So, do you think you have arguments against the 'benchmarks+gaps argument' for timelines to AI research automation, or why AI research automation won't translate to much algorithmic progress? Or any of the other things that I listed as ones that moved my timelines down:

  • Fun with +12 OOMs of Compute: IMO a pretty compelling writeup that brought my uncertainty over 'training-compute-FLOP needed for AGI' down a bunch to around 1e35 FLOP.
  • Researching how much compute is scaling.
  • The benchmarks+gaps argument to partial AI research automation
  • The takeoff forecast for how partial AI research automation will translate to algorithmic progress.
  • The recent trend in METR's time horizon data.
romeo's Shortform
romeo · 4mo · 30

But instead of discussing the crux of which system is relevant (which has to be about details of recursive self-improvement), only the proponents will tend to talk about software-only singularity, while the opponents will talk about different systems whose scaling they see as more relevant, such as the human economy or datacenter economy.

Totally agree! Thank you for phrasing it elegantly. This is basically what I commented on Ege's post yesterday; I asked him to engage with the actual crux and make arguments about why the software-only singularity is unlikely.

romeo's Shortform
romeo · 4mo · 30

There's an entire class of problem within ML that I would see as framing problems and the one thing I think LLMs don't help that much with is framing. 

Could you say more about this? What do you mean by framing in this context? 

There's this quote I've been seeing from Situation Awareness that all you have to do is "believe in a straight line on a curve" and when I hear that and see the general trend extrapolations my spider senses start tingling.

Yeah, that's not really compelling to me either; SitA didn't move my timelines. Curious if you have engaged with the benchmarks+gaps argument for timelines to AI R&D automation (timelines forecast), and then the AI algorithmic progress that automation would drive (takeoff forecast). These are the things that actually moved my view.

Thanks for the link, that's compelling.
