This seems to strongly suggest that a "software-only" intelligence explosion is unlikely, or at least would be quite slow.
Except that Beren has claimed that SOTA algorithmic progress is mostly data progress, which could also mean that the explosion may be based on architectures that have yet to be found, like the right version of neuralese. As far as I am aware, architectures like the one in Meta's Coconut paper or this paper on arXiv forget everything once they output a token, which makes them unlikely to be the optimal architecture.
As the person who wrote that post, I think the title now overclaims, and the only reason I chose it was that it was the title used originally.
I'd probably argue that it explains a non-trivial amount of progress, but nowadays I'd focus much more on compute as the main driver of AI progress in general.
And yes, new architectures/loss functions could change the situation, but the paper is evidence that we should expect such architectures to need a lot of compute before they become more efficient.
This is a linkpost to a new Substack article from MIT FutureTech explaining our recent paper On the Origins of Algorithmic Progress in AI.
We demonstrate that some algorithmic innovations have efficiency gains that grow larger as pre-training compute increases. These scale-dependent innovations constitute the majority of pre-training efficiency gains over the last decade, which may imply that what looks like algorithmic progress is driven by compute scaling rather than by many incremental innovations.
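To make the distinction concrete, here is a minimal toy sketch of what a scale-dependent innovation looks like. This is not the paper's model: the 1.5x figure, the reference scale c0, and the exponent gamma are invented purely for illustration.

```python
# Toy illustration (not the paper's model): compute-equivalent efficiency
# multipliers for two hypothetical innovations, one scale-independent and
# one scale-dependent. All numbers are made up for illustration.

def scale_independent_gain(compute: float) -> float:
    """A fixed 1.5x efficiency multiplier at any compute budget (assumed)."""
    return 1.5

def scale_dependent_gain(compute: float, c0: float = 1e18, gamma: float = 0.15) -> float:
    """A multiplier that grows as a power law in compute above an assumed
    reference scale c0 (FLOP); below c0 it provides no benefit."""
    return max(1.0, (compute / c0) ** gamma)

for flop in [1e18, 1e20, 1e22, 1e24]:
    fixed = scale_independent_gain(flop)
    scaled = scale_dependent_gain(flop)
    total = fixed * scaled
    # Rough attribution of the gain above baseline to the scale-dependent part.
    share = (scaled - 1) / (total - 1) if total > 1 else 0.0
    print(f"{flop:.0e} FLOP: fixed={fixed:.2f}x, scale-dependent={scaled:.2f}x, "
          f"total={total:.2f}x, scale-dependent share of gain={share:.0%}")
```

In this toy setup, the fixed innovation accounts for all of the measured gain at small compute budgets, while at frontier-scale budgets most of the measured efficiency gain comes from the scale-dependent innovation, so the apparent algorithmic progress is contingent on compute scaling.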
From the paper, our core contributions are:
MIT FutureTech is an interdisciplinary lab at the intersection of computer science and economics, focused specifically on trends in AI and computing, and funded in part by Coefficient Giving.