This is a linkpost for https://shape-of-code.com/2022/03/13/growth-in-flops-used-to-train-ml-models/

Given the ongoing history of continually increasing compute power, what is the maximum compute power that might be available to train ML models in the coming years?


Speaking of compute and experience curves, Karpathy just posted about replicating Le Cun's 1989 pre-MNIST digit classifying results and what difference compute & methods make: https://karpathy.github.io/2022/03/14/lecun1989/

Thanks, an interesting read until the author peers into the future. Moore's law is on its last legs, so the historical speed-ups will soon be just that, something that once happened. There are some performance improvements still to come from special-purpose CPUs, and half-precision floating-point will reduce memory traffic (which can then be traded for CPU performance).
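A rough illustration of the memory-traffic point: storing weights or activations in half precision halves the number of bytes that have to move between memory and the processor. A minimal sketch in Python/NumPy, assuming a hypothetical 4096x4096 weight matrix (the 2x figure is just the dtype-size ratio; actual speed-ups depend on the hardware):

```python
import numpy as np

# Hypothetical layer of 4096 x 4096 weights, purely for illustration.
n = 4096
weights_fp32 = np.random.rand(n, n).astype(np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

bytes_fp32 = weights_fp32.nbytes  # 4 bytes per element
bytes_fp16 = weights_fp16.nbytes  # 2 bytes per element

print(f"fp32: {bytes_fp32 / 2**20:.1f} MiB")
print(f"fp16: {bytes_fp16 / 2**20:.1f} MiB")
print(f"memory traffic ratio: {bytes_fp32 / bytes_fp16:.1f}x")
```

This only shows the halved storage footprint; whether that translates into throughput depends on whether the workload is memory-bound and on hardware support for half-precision arithmetic.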

Thanks for writing! I don't see an actual answer to the question asked in the beginning -- "Given the ongoing history of continually increasing compute power, what is the maximum compute power that might be available to train ML models in the coming years?" Did I miss it?