I think Tetlock and co. might have already done some related work?
Question decomposition is part of the superforecasting commandments, though I can't recall off the top of my head if they were RCT'd individually or just as a whole.
ETA: This is the relevant paper (h/t Misha Yagudin). It was not about the 10 commandments. Apparently those haven't been RCT'd at all?
I co-wrote a detailed response here:
https://www.cser.ac.uk/news/response-superintelligence-contained/
Essentially, this type of reasoning proves too much: it implies that we cannot establish any properties whatsoever of any program, which is clearly false (formal verification does exactly that).
Here is some data via Matthew Barnett and Jess Riedl:
The number of cumulative miles driven by Cruise's autonomous cars is growing exponentially, at roughly 1 OOM per year.
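To make that rate concrete, here is a minimal sketch (the growth rate is the only input; everything else is derived from it):

```python
import math

# Cumulative miles growing at ~1 OOM (10x) per year.
oom_per_year = 1.0
monthly_factor = 10 ** (oom_per_year / 12)           # ~1.21: ~21% growth per month
doubling_months = 12 * math.log10(2) / oom_per_year  # ~3.6 months to double

print(f"monthly growth factor: {monthly_factor:.2f}")
print(f"doubling time: {doubling_months:.1f} months")
```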
That is, to a first approximation, correct.
Davidson's takeoff model illustrates this point: for some parameter settings a "software singularity" occurs, because software progress is not constrained by capital inputs to the same degree.
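To see the mechanism, here is a toy sketch (my own simplification for illustration, not Davidson's actual model; the returns parameter `r` and the update rule are assumptions):

```python
# Toy model of a "software singularity": software level s grows at a rate
# proportional to s**r. With r > 1 (increasing returns to software), s
# diverges in finite time; with r <= 1 it grows steadily instead.
def simulate(r: float, steps: int = 200, dt: float = 0.05) -> float:
    s = 1.0
    for _ in range(steps):
        s += dt * s ** r
        if s > 1e12:  # treat as having diverged
            return float("inf")
    return s

for r in (0.7, 1.0, 1.3):
    print(f"r = {r}: level after {200 * 0.05:.0f} time units = {simulate(r):.3g}")
```

With increasing returns (r > 1) progress blows up within the simulated window; at or below r = 1 it stays finite.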
I would point out, however, that our current understanding of how software progress happens is somewhat poor. Experimentation is definitely a big component of software progress, and it is often understated on LessWrong.
More research on this soon!
algorithmic progress is currently outpacing compute growth by quite a bit
This is not right, at least in computer vision. They seem to be the same order of magnitude.
Physical compute has grown at 0.6 OOM/year, and compute requirements have decreased at 0.1 to 1.0 OOM/year; see a summary here or an in-depth investigation here.
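Putting those figures together gives a quick back-of-the-envelope for effective compute growth:

```python
# Back-of-the-envelope: effective compute growth combines hardware scaling
# with algorithmic efficiency gains (both in OOM/year, figures from above).
physical_growth = 0.6           # growth of physical compute
algorithmic_gains = (0.1, 1.0)  # range of compute-requirement reductions

for algo in algorithmic_gains:
    effective = physical_growth + algo
    print(f"algorithmic {algo:.1f} OOM/yr -> effective {effective:.1f} OOM/yr")
```

On this decomposition the algorithmic contribution ranges from a modest fraction to over half of the total, i.e. the same order of magnitude as hardware rather than "quite a bit" more.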
Another relevant quote:
Algorithmic progress explains roughly 45% of performance improvements in image classification, and most of this occurs through improving compute-efficiency.
Thanks!
Our current best guess is that this includes costs other than the amortized compute of the final training run.
If no extra information surfaces, we will add a note clarifying this and/or adjust our estimate.
Thanks Neel!
The difference between tf16 and FP32 comes to a ~15x factor, IIRC. That said, ML developers seem to prioritise characteristics other than cost-effectiveness when choosing GPUs, such as raw performance and interconnect, so you can't just multiply the top price-performance we showcase by this factor and expect it to match the cost-performance of the largest ML runs today.
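To illustrate with made-up numbers (everything below except the x15 factor is a hypothetical placeholder):

```python
# Hypothetical illustration of why the naive multiplication overestimates.
top_fp32_price_perf = 1.0  # normalized: best FP32 FLOP/s per dollar in the data
tf16_factor = 15           # rough tf16-vs-FP32 throughput factor from above

naive_estimate = top_fp32_price_perf * tf16_factor

# Assumed: large ML runs use GPUs chosen for raw performance and interconnect,
# which sit below the price-performance frontier in cost-effectiveness.
selection_penalty = 0.3    # placeholder fraction of frontier cost-effectiveness
plausible_estimate = naive_estimate * selection_penalty

print(f"naive: {naive_estimate:.0f}x, after selection penalty: {plausible_estimate:.1f}x")
```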
More soon-ish.
I agree! I'd be quite interested in looking at TAS data, for the reason you mentioned.