The 0.2 OOMs/year target

0.2 OOMs/year was the pre-AlexNet growth rate in ML systems.

I think you'd want to set the limit to something slightly faster than Moore's law. Otherwise you have a constant large compute overhang.

Ultimately, we're going to be limited by Moore's law (or its successor) growth rates anyway. We're on a kind of z-curve right now, where we're transitioning from ML compute being some small constant fraction of all compute to some much larger constant fraction of all compute. Before the transition it grows at the same speed as compute in general. After the transition it also grows at the same speed as compute in general. In the middle it grows faster as we rush to spend a much larger share of GWP on it.

From that perspective, Moore's law growth is the minimum growth rate you might have (unless annual spend on ML shrinks). And the question is just whether you transition from the small constant fraction of all compute to the large constant fraction of all compute slowly or quickly.
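The transition described above can be illustrated with a toy model. This is only a sketch: it assumes ML's share of all compute moves along a logistic curve (in log-space) between a small and a large constant fraction, while total compute doubles every 2 years. The function names, the 0.1% and 10% shares, the midpoint year, and the steepness are all illustrative assumptions, not figures from this thread.

```python
import math

# Assumption: total compute doubles every 2 years (Moore's-law-like growth).
MOORE_OOMS_PER_YEAR = math.log10(2) / 2.0

def ml_share(year, low=1e-3, high=1e-1, midpoint=2030.0, steepness=0.5):
    """Hypothetical ML share of all compute: logistic in log-space from low to high."""
    s = 1.0 / (1.0 + math.exp(-steepness * (year - midpoint)))
    log_share = math.log10(low) + s * (math.log10(high) - math.log10(low))
    return 10.0 ** log_share

def ml_growth_ooms_per_year(year, dt=1e-3):
    """ML compute growth = growth of total compute + growth of ML's share."""
    share_growth = (math.log10(ml_share(year + dt))
                    - math.log10(ml_share(year - dt))) / (2 * dt)
    return MOORE_OOMS_PER_YEAR + share_growth

# Before, during, and after the (assumed) transition.
for y in (2000, 2030, 2060):
    print(y, round(ml_growth_ooms_per_year(y), 3))
```

In this toy model the growth rate sits at the Moore's-law floor before and after the transition and rises above it only while the share is changing, which is the shape the comment describes.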

Trying to not do the transition at all (i.e. trying to grow at exactly the same rate as compute in general) seems potentially risky, because the resulting constant compute overhang means it's relatively easy for someone somewhere to rush ahead locally and build something much better than SOTA.

If, on the other hand, you say full steam ahead and don't try to slow the transition at all, then on the plus side the compute overhang goes away, but on the minus side, you might rush into dangerous and destabilizing capabilities.

Perhaps a middle path makes sense, where you slow the growth rate down from current levels, but also slowly close the compute overhang gap over time.

[-]Hoagy3y116

Moore's law is a doubling every 2 years, while this proposes doubling every 18 months, so pretty much what you suggest (not sure if you were disagreeing tbh but seemed like you might be?)

[-]ESRogs3y42

Ah, good point!

[-]Lukas_Gloor3y42

Otherwise you have a constant large compute overhang.

I think we should strongly consider finding a way of dealing with that rather than only looking at solutions that produce no overhang. For all we know, total compute required for TAI (especially factoring in future algorithmic progress) isn't far away from where we are now. Dealing with the problem of preventing defectors from exploiting a compute overhang seems potentially easier than solving alignment on a very short timescale.

[-]ESRogs3y20

I suppose a possible mistake in this analysis is that I'm treating Moore's law as the limit on compute growth rates, and this may not hold once we have stronger AIs helping to design and fabricate chips.

Even so, I think there's something to be said for trying to slowly close the compute overhang gap over time.

[-]PeterMcCluskey3y1413

Upvoted for creative thinking. I'm having trouble deciding whether it's a good idea.

[-]Hoagy3y86

0.2 OOMs/year is equivalent to a doubling time of 8 months.

I think this is wrong: that's nearly 8 doublings in 5 years. It should instead be a doubling every 5 / log2(10) ≈ 1.5 years.
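The arithmetic here is easy to check. A minimal sketch, with function names chosen purely for illustration:

```python
import math

def doubling_time_years(ooms_per_year):
    """Years per doubling for a quantity growing at the given OOMs/year."""
    return math.log10(2) / ooms_per_year

def doublings_in(years, doubling_time):
    """Number of doublings that fit into a span of years."""
    return years / doubling_time

# 0.2 OOMs/year: about 1.5 years per doubling, the same as 5 / log2(10).
print(doubling_time_years(0.2))

# An 8-month doubling time would give 7.5 doublings in 5 years,
# far more than the log2(10), roughly 3.3, doublings that 1 OOM allows.
print(doublings_in(5, 8 / 12))
```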

I think pushing GPT-4 out to 2029 would be a good level of slowdown from 2022, but assuming that we could achieve that level of impact, what's the case for having a fixed exponential increase? Is it to let off some level of 'steam' in the AI industry? So that we can still get AGI in our lifetimes? To make it seem more reasonable to policymakers?

Personally, I would still rather have a moratorium until some measure of progress in understanding is reached. We don't have a fixed temperature increase per decade built into our climate targets.

[-]Cleo Nardo3y60

The 0.2 OOMs/year target would be an effective moratorium until 2029, because GPT-4 overshot the target.
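As a rough check of the 2029 figure, the sketch below finds the first year in which a given training footprint falls under the limit. It takes the schedule from the post's table (10^24 FLOPs in 2022, rising 0.2 OOMs/year). The 2e25 FLOPs value is an assumed, illustrative footprint for a GPT-4-scale run and is not stated anywhere in this thread; the function name is likewise hypothetical.

```python
import math

BASE_YEAR = 2022          # year the limit is pinned to
BASE_LOG10_FLOPS = 24.0   # limit in that year: 10^24 FLOPs
OOMS_PER_YEAR = 0.2

def first_permitted_year(training_flops):
    """First year whose limit is at least training_flops."""
    excess_ooms = math.log10(training_flops) - BASE_LOG10_FLOPS
    # The small epsilon guards against floating-point noise at exact boundaries.
    years_needed = math.ceil(excess_ooms / OOMS_PER_YEAR - 1e-9)
    return BASE_YEAR + years_needed

# ASSUMPTION: 2e25 FLOPs stands in for a GPT-4-scale training run.
print(first_permitted_year(2e25))
```

Under that assumed footprint the run first becomes permissible in 2029, consistent with the comment above.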

[-]Cleo Nardo3y40

Yep, thanks! 0.2 OOMs/year is equivalent to a doubling time of 18 months. I think that was just a typo.

[-]Ben Livengood3y76

I wonder if a basket of SOTA benchmarks would make more sense. Allow no more than X% increase in performance across the average of the benchmarks per year. This would capture the FLOPS metric along with potential speedups, fine-tuning, or other strategies.

Conveniently, this is how teams already rank their models against each other, so there's ample evidence of past progress and researchers are incentivized to report accurately. There's no incentive to "cheat" if researchers are not allowed to publish greater increases on SOTA benchmarks than the limit allows (e.g. journals would say "shut it down" instead of publishing the paper), unless an actor wanted to simply jump ahead of everyone else and go for a singleton on their own, which is already an unavoidable risk without an EY-style coordinated hard stop.
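One way the basket check could look, as a minimal sketch: it assumes scores where higher is better, equal weighting across benchmarks, and a cap expressed as a fraction per year. The benchmark names, scores, and function names are hypothetical.

```python
def basket_increase(previous_scores, new_scores):
    """Mean relative improvement across the benchmarks present in both dicts."""
    shared = previous_scores.keys() & new_scores.keys()
    if not shared:
        raise ValueError("no shared benchmarks to compare")
    gains = [(new_scores[b] - previous_scores[b]) / previous_scores[b]
             for b in shared]
    return sum(gains) / len(gains)

def within_cap(previous_scores, new_scores, max_increase):
    """True if the average improvement does not exceed the allowed fraction."""
    return basket_increase(previous_scores, new_scores) <= max_increase

# Hypothetical SOTA scores and a candidate model's scores.
sota = {"bench_a": 50.0, "bench_b": 80.0}
candidate = {"bench_a": 60.0, "bench_b": 84.0}

# Gains of 20% and 5% average to 12.5%, so a 10% cap would be exceeded.
print(within_cap(sota, candidate, max_increase=0.10))
```

Equal weighting is only one choice; a real basket would need a rule for weighting benchmarks and for handling saturated ones.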

[-]Cleo Nardo3y20

Great idea! Let's measure algorithmic improvement in the same way economists measure inflation, with a basket-of-benchmarks.

This basket can itself be adjusted over time so that it continuously reflects the current use-cases of SOTA AI.

I haven't thought about it much, but my guess is the best thing to do is to limit training compute directly but adjust the limit using the basket-of-benchmarks.

[-]Ben Livengood3y10

One weakness I realized overnight is that this incentivizes branching out into new problem domains. One potential fix, when novel domains show up, is to shoehorn the big LLMs into solving that domain on the same benchmark and limit new types of models/training to what the LLMs can accomplish in that new domain. Basically, this sets an initially low SOTA that can grow at the same percentage as the rest of the basket. This might prevent leapfrogging the general models with narrow ones that are mostly mesa-optimizers or similar.

[-]Emerson Spartz3y75

I think this is a really promising idea.

If the goal is to unify diverse stakeholders, including non-technical ones, I wonder if it would be better to use a less-wonky target (e.g. "50%" instead of "0.2 OOMs").
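For reference, a growth rate in OOMs/year converts to a yearly percentage increase as follows (a small sketch; the function name is illustrative):

```python
def ooms_to_percent_per_year(ooms_per_year):
    """Yearly percentage increase for growth expressed in orders of magnitude per year."""
    return (10.0 ** ooms_per_year - 1.0) * 100.0

# 0.2 OOMs/year is a bit under 60% growth per year.
print(round(ooms_to_percent_per_year(0.2), 1))
```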

[-]DeAnno3y7-3

I think you have to set this up in such a way that the current ceiling is where we already are, not back in time to before GPT-4. If you don't, then the chance it actually gets adopted seems vastly lower, since all adopters that didn't make their own GPT-4 already have to agree to be 2nd-class entities until 2029.

It's very difficult to talk about nuclear non-proliferation when a bunch of people already have nukes. If you can actually enforce it, that's a different story, but if you could actually enforce anything relating to this mess the rest just becomes details anyway.

[-]Cleo Nardo3y60

Nuclear non-proliferation worked despite the fact that many countries with nuclear weapons were "grandfathered in".
If the y-axis for the constraint is fixed to the day of the negotiation, then stakeholders who want a laxer constraint are incentivised to delay negotiation. To avoid that hazard, I have picked a Schelling date (2022) to fix the y-axis.
The purpose of this article isn't to propose any policy, strategy, treaty, agreement, law, etc. which might achieve the 0.2 OOMs/year target. Instead, the purpose of this article is to propose the target itself. This has inherent coordination benefits, cf. the 2°C target.

[-]DeAnno3y1-6

Nuclear non-proliferation worked because the grandfathered-in countries had all the power and the ones who weren't were under the implicit threat of embargo, invasion, or even annihilation. Despite all its accomplishments, GPT-4 does not give OpenAI the ability to enforce its monopoly with the threat of violence.

Not to mention that 3-4 of the 5 listed countries non-party to the treaty developed nukes anyway. If Meta decides to flagrantly ignore the 0.2 OOM limit and creates something actually dangerous it's not going to sit quietly in a silo waiting for further mistakes to be made before it kills us all.

[-]Cleo Nardo3y50

I think you've misunderstood what we mean by "target". Similar issues applied to the 2°C target, which nonetheless yielded significant coordination benefits.

The 2°C target helped facilitate coordination between nations, organisations, and individuals.
It provided a clear, measurable goal.
It provided a sense of urgency and severity.
It promoted a sense of shared responsibility.
It helped to align efforts across different stakeholders.
It created a shared understanding of what success would look like.
The AI governance community should converge around a similar target.

[-]jacob_cannell3y62

Unfortunately we may already have enough compute, and it will be difficult to enforce a ban on decentralized training (which isn't competitive yet, but likely could be with more research).

[-]Cleo Nardo3y30

This isn't a policy proposal, it's a target, like the 2°C climate target.

[-]NickGabs3y36

I think concrete ideas like this that take inspiration from past regulatory successes are quite good, esp. now that policymakers are discussing the issue.

[-]Cervera3y36

I appreciate the concept.

I wonder what the hardware overhang would look like after the moratorium ends or somebody bypasses it.

On first thought it doesn't look like a robust solution. I assume a 15-20% improvement in compute access per year. I would need to plot it against the moratorium threshold and see whether over time one gets closer to the Yudkowsky airstrike threshold, but I assume not: 20% vs 54%.

I don't know! Maybe this is a good idea.
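The comparison can be done without a plot. The sketch below takes the 20% yearly hardware improvement assumed in this comment and measures how far the 0.2 OOMs/year cap pulls away from it over time; the function name and the choice to express the gap in OOMs are illustrative.

```python
import math

def gap_in_ooms(years, cap_ooms_per_year=0.2, hardware_growth=0.20):
    """OOMs by which the cap has risen relative to hardware after `years`."""
    hardware_ooms_per_year = math.log10(1.0 + hardware_growth)
    return years * (cap_ooms_per_year - hardware_ooms_per_year)

# A positive and growing gap means the cap rises faster than hardware improves.
for years in (1, 5, 10):
    print(years, round(gap_in_ooms(years), 2))
```

Under these assumptions the cap gains a little over 1 OOM on hardware per decade, so any catching up would have to come from spending or algorithmic progress rather than from hardware alone.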

[-]Cleo Nardo3y30

I've added a section on hardware:

Comparing 0.2 OOMs/year target to hardware growth-rates:
Moore's Law states that the number of transistors per integrated circuit doubles roughly every 2 years.
Koomey's Law states that the FLOPs-per-Joule doubled roughly every 1.57 years until 2000, whereupon it began doubling roughly every 2.6 years.
Huang's Law states that the growth-rate of GPU performance exceeds that of CPU performance. This is a somewhat dubious claim, but nonetheless I think the doubling time of GPUs is longer than 18 months.
In general, the 0.2 OOMs/year target is faster than the current hardware growth-rate.
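To put the doubling times quoted above on the same scale as the target, they can be converted to OOMs/year. A minimal sketch, using only the figures from this comment; the function name is illustrative:

```python
import math

def ooms_per_year(doubling_time_years):
    """Growth rate in orders of magnitude per year for a given doubling time."""
    return math.log10(2) / doubling_time_years

rates = {
    "Moore's Law (2 years)": ooms_per_year(2.0),
    "Koomey's Law before 2000 (1.57 years)": ooms_per_year(1.57),
    "Koomey's Law after 2000 (2.6 years)": ooms_per_year(2.6),
    "18-month doubling": ooms_per_year(1.5),
}

for name, rate in rates.items():
    print(f"{name}: {rate:.3f} OOMs/year")
```

All three hardware figures come out below 0.2 OOMs/year, while an 18-month doubling corresponds to almost exactly 0.2.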

[-]Christopher King3y36

Suggestion: mention that OOM means "orders of magnitude". I was confused until I noticed the powers of 10.

[-]IC Rainbow3y10

Confused how?.. The only thing that comes to mind is that it's FOOM sans F. Asking for a 0.2 FOOMs limit seems reasonable given the current trajectory 😅

Year	log10 of maximum training footprint (FLOPs)	Maximum training footprint (FLOPs)
2020	23.6	3.98E+23
2021	23.8	6.31E+23
2022	24.0	1.00E+24
2023	24.2	1.58E+24
2024	24.4	2.51E+24
2025	24.6	3.98E+24
2026	24.8	6.31E+24
2027	25.0	1.00E+25
2028	25.2	1.58E+25
2029	25.4	2.51E+25
2030	25.6	3.98E+25
2031	25.8	6.31E+25
2032	26.0	1.00E+26
2033	26.2	1.58E+26
2034	26.4	2.51E+26
2035	26.6	3.98E+26
2036	26.8	6.31E+26
2037	27.0	1.00E+27
2038	27.2	1.58E+27
2039	27.4	2.51E+27
2040	27.6	3.98E+27
2041	27.8	6.31E+27
2042	28.0	1.00E+28
2043	28.2	1.58E+28
2044	28.4	2.51E+28
2045	28.6	3.98E+28
2046	28.8	6.31E+28
2047	29.0	1.00E+29
2048	29.2	1.58E+29
2049	29.4	2.51E+29
2050	29.6	3.98E+29
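The table follows a single rule: the limit is 10^24 FLOPs in 2022 and rises by 0.2 OOMs each year. A short sketch that regenerates it, with function and parameter names chosen for illustration:

```python
def yearly_limits(start_year=2020, end_year=2050, base_year=2022,
                  base_log10_flops=24.0, ooms_per_year=0.2):
    """Return (year, log10 limit, limit in FLOPs) for each year in the range."""
    rows = []
    for year in range(start_year, end_year + 1):
        log10_limit = base_log10_flops + ooms_per_year * (year - base_year)
        rows.append((year, round(log10_limit, 1), 10.0 ** log10_limit))
    return rows

# Print tab-separated rows in the same layout as the table above.
for year, log10_limit, limit in yearly_limits():
    print(f"{year}\t{log10_limit:.1f}\t{limit:.2E}")
```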

