I agree with all those points. The main point I'm making is just that hardware efficiency seems like it should also be a function of total compute capex, which would mean it moves in the same way as algorithmic progress, and would further exacerbate a slowdown. Do you basically agree with that?
Interesting point. But could I check why you and @Vladimir_M are confident Moore's law continues at all?
I'd have guessed that maintaining the rate of gains in hardware efficiency will require exponentially increasing chip R&D spending and researcher hours.
But if total spending on chips has plateaued, then Nvidia etc.'s R&D spending will also have plateaued, which I think would imply hardware efficiency gains drop to near zero.
Interesting that the growth rate of technical alignment researchers seems to be a bit slower than that of capabilities researchers. Compare:
> The exponential models describe a 24% annual growth rate in the number of technical AI safety organizations and a 21% growth rate in the number of technical AI safety FTEs.
To the following taken from here:
> In 2021, OpenAI had about 300 employees; today, it has about 3,000. Anthropic and DeepMind have also grown more than 3x, and new companies have entered. The number of ML papers produced per year has roughly doubled every two years.
This suggests to me that the capabilities field is growing more like 30-40% per year.
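For what it's worth, the implied annual rates can be back-calculated from those figures (the 3-year span for OpenAI's headcount growth is my assumption, not stated in the quote):

```python
# Back-calculating implied compound annual growth rates from the quoted figures.
# The timespan assumed for OpenAI's headcount is my own guess.

def annual_growth(start, end, years):
    """Compound annual growth rate implied by growing from start to end over `years`."""
    return (end / start) ** (1 / years) - 1

# OpenAI headcount: ~300 (2021) -> ~3,000 (assumed ~3 years later)
openai = annual_growth(300, 3000, 3)   # ~1.15, i.e. ~115%/yr

# ML papers roughly doubling every two years
papers = annual_growth(1, 2, 2)        # ~0.41, i.e. ~41%/yr

# Technical AI safety FTEs, from the first quote
safety = 0.21

print(f"OpenAI headcount: {openai:.0%}/yr")
print(f"ML papers:        {papers:.0%}/yr")
print(f"Safety FTEs:      {safety:.0%}/yr")
```

Headcount at the leading labs has grown much faster than the field-wide paper count, so a field-level 30-40% figure sits between the two.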
In addition, governance seems to be growing even more slowly than technical safety, including over the last 3 years.
Hmm interesting.
I see the case for focusing only on compute (since it's relatively objective), but it still seems important to try to factor in some amount of algorithmic progress to pretraining (which means the cost of achieving GPT-6-level performance will be dropping over time).
The points on Epoch are getting outside of my expertise – I see my role as to synthesise what experts are saying. It's good to know these critiques exist and it would be cool to see them written up and discussed.
Thanks, it was a video so didn't survive the cross-post. I've put in a hyperlink for now.
Thanks, it's useful to have these figures and independent data on these calculations.
I've been estimating it based on a 500x increase in effective FLOP per generation, rather than 100x of regular FLOP.
Rough calculations are here.
At the current trajectory, the GPT-6 training run costs $6bn in 2028, and GPT-7 costs $130bn in 2031.
I think that makes GPT-8 a couple of trillion in 2034.
You're right that if you wanted to train GPT-8 in 2031 instead, then it would cost roughly 500x more than training GPT-7 that year.
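As a rough sanity check, those numbers hang together in a toy calculation (only the 500x-per-generation multiplier and the $6bn/$130bn figures are from the comment; the ~22x cost growth per generation is derived from them):

```python
# Toy extrapolation of training-run costs, using the figures above:
# 500x effective FLOP per generation; GPT-6 ~$6bn (2028); GPT-7 ~$130bn (2031).

FLOP_PER_GEN = 500                    # effective-FLOP multiplier per generation
COST_GPT6, COST_GPT7 = 6e9, 130e9     # dollars

# Dollar cost grows ~22x per generation, far less than 500x, because the
# cost per effective FLOP falls over the ~3 years between generations.
cost_per_gen = COST_GPT7 / COST_GPT6              # ~21.7x
decline_3yr = FLOP_PER_GEN / cost_per_gen         # ~23x cheaper per eff. FLOP

cost_gpt8_2034 = COST_GPT7 * cost_per_gen         # ~$2.8tn on trend
cost_gpt8_2031 = COST_GPT7 * FLOP_PER_GEN         # no time for cost declines

print(f"GPT-8 in 2034: ${cost_gpt8_2034/1e12:.1f}tn")
print(f"GPT-8 in 2031: ${cost_gpt8_2031/1e12:.0f}tn (= 500x GPT-7 that year)")
```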
I'm open to that, and I felt unsure the post was a good idea after I released it. I had some discussion with him on Twitter afterwards, where we smoothed things over a bit: https://x.com/GaryMarcus/status/1888604860523946354
Thanks for the comments!
I'm especially keen to explore bottlenecks (e.g. another suggestion I saw is that reaching 1 billion robots a year would require 10x current global lithium production to supply the batteries).
A factor of 2 for increased difficulty due to processing intensity seems reasonable, and I should have included it. (Though my estimates were only to an order of magnitude, so this probably won't change the bottom line; and on the other side, many robots will weigh <100kg and some will be non-humanoid.)
Interesting point.