I think it's premature to conclude that AGI progress will consist of large pre-trained transformers indefinitely into the future. They are surprisingly(?) effective, but for comparison they are not as effective as AlphaZero and AlphaStar are in their narrow domains, where policy and value networks paired with Monte-Carlo tree search get by with orders of magnitude fewer parameters. We don't know what MCTS on arbitrary domains will look like with networks 2-4 OOM larger, which are within reach now. We haven't formulated methods of self-play for improvement with LLMs, and I think that's also a potentially large overhang.
There's also a human limit to the types of RSI we can imagine, and once pre-trained transformers exceed human intelligence in the domain of machine learning, those limits won't apply. I think there's probably significant overhang in prompt engineering, especially when new capabilities emerge from scaling, that could be exploited by removing the serial bottleneck of humans trying out prompts by hand.
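As a concrete illustration of what removing that bottleneck might look like, here is a minimal sketch of an automated prompt-search loop. Everything in it is a stand-in assumption: `run_model` would be a real LLM call, `grade` a real autograder, and `mutate` could itself be the model rewriting its own instructions; the dummies below just make the loop runnable.

```python
import random

random.seed(0)

def run_model(prompt: str, task_input: str) -> str:
    # Hypothetical LLM call; a dummy echo stands in for illustration.
    return f"{prompt} :: {task_input}"

def grade(output: str) -> float:
    # Hypothetical autograder; a dummy score stands in for illustration.
    return min(len(output) / 100, 1.0)

def mutate(prompt: str) -> str:
    # In practice this could be the model proposing edits to its own prompt.
    return prompt + random.choice([" Think step by step.", " Check your work.", " Be concise."])

def evaluate(prompt: str, dev_set: list[str]) -> float:
    return sum(grade(run_model(prompt, x)) for x in dev_set) / len(dev_set)

def search(seed: str, dev_set: list[str], iters: int = 50) -> str:
    # Hill-climb over prompts: no human in the serial loop.
    best, best_score = seed, evaluate(seed, dev_set)
    for _ in range(iters):
        cand = mutate(best)
        score = evaluate(cand, dev_set)
        if score > best_score:
            best, best_score = cand, score
    return best

print(search("Solve the task:", ["2+2", "capital of France"]))
```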
Finally, I don't think GOFAI is dead; it's still in its long winter, waiting to bloom when enough intelligence is put into it. We don't know the intelligence/capability threshold necessary to make substantial progress there. Generally, the bottleneck has been identifying useful mappings from the real world to mathematics and algorithms. Humans are pretty good at that, but we stalled at formalizing effective general intelligence itself. Our abstraction/modeling abilities, working memory, and time are too limited, and we have no idea where those limits come from, whether LLMs are subject to the same or similar limits, or how those limits are reduced or removed with model scaling.
If deep learning yields AGI, the question is how far its intelligence can jump beyond human level before it runs out of compute available in the world, using the improvements that can be made very quickly at the current level of intelligence. In short sprints, a hoard of hand-tuned constant-factor improvements can look as good as asymptotic improvement, so the hypothetical impossibility of the latter doesn't put convincing bounds on how far this can be pushed before running out of steam. And if by that point procedures for bootstrapping nanotech have become obvious, this keeps going, transitioning into disassembling the world for more compute without pause. All without refuting the bitter lesson.
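To make the constant-factors-vs-asymptotics point concrete, here is a toy comparison under made-up cost models (both op counts are assumptions, not measurements): a 100x constant-factor speedup of an O(n^2) routine beats an asymptotically better O(n log n) rewrite with a large constant over a wide range of practical problem sizes.

```python
import math

def ops_constant_factor(n):
    # O(n^2) routine after hand-tuned 100x constant-factor improvements
    return n ** 2 / 100

def ops_asymptotic(n):
    # O(n log n) rewrite with a large (assumed) constant factor
    return 5_000 * n * math.log2(n)

for n in (10**3, 10**4, 10**5, 10**6):
    print(n, ops_constant_factor(n) < ops_asymptotic(n))
# The constant-factor version does fewer ops everywhere up to roughly n ~ 10^7,
# despite being asymptotically worse.
```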
I think you forgot one critical thing. Why does the normal argument for RSI's inevitability fail? The answer is: it doesn't.
Even though there is some research in the direction of a neural network changing each of its weights directly, that isn't important to the main argument, because the argument is about improving source code. The weights are more like compiled code.
In the context of deep learning, the source code consists of things like the model architecture, the loss function, the optimizer and its hyperparameters, the training loop, and the data pipeline.
So the question is whether a deep learning model could improve any of this code. Whether it could improve its "compiled code" (the weights) is also probably a yes, but that isn't what the argument is based on.
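For concreteness, here is a minimal sketch (assuming PyTorch, purely as an illustration) of what that "source code" looks like: a short, human-readable description of architecture, loss, optimizer, and training step. The weights that training produces are the "compiled" artifact.

```python
import torch
import torch.nn as nn

model = nn.Sequential(                                  # architecture
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 10),
)
loss_fn = nn.CrossEntropyLoss()                         # training objective
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)    # optimizer + hyperparameters

def train_step(x, y):
    # one gradient update; the data pipeline that feeds (x, y) is omitted
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    return loss.item()
```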
It seems pretty clear to me that AIs could get really good at understanding and predicting the results of editing model weights, in the same way they can get good at predicting how proteins will fold. From there, directly creating circuits that add XYZ reasoning functionality seems at least possible.
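A toy version of "directly creating circuits" already exists at small scale: you can write the weights of a tiny ReLU network by hand so that it computes XOR, with no training involved. Whether anything like this scales to inserting new reasoning functionality into a large model is exactly the open question, so treat this as an illustration of the concept, not evidence.

```python
import numpy as np

# Hand-wired weights: hidden unit 1 computes x1 + x2, hidden unit 2 computes
# x1 + x2 - 1 (so it only fires when both inputs are on); output = h1 - 2*h2 = XOR.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
W2 = np.array([1.0, -2.0])

def xor_net(x):
    h = np.maximum(0.0, W1 @ x + b1)   # ReLU hidden layer
    return W2 @ h

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, xor_net(np.array(x, dtype=float)))   # prints 0, 1, 1, 0
```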
This is a solid argument inasmuch as we define RSI to be about an agent self-modifying its own weights / other inscrutable reasoning atoms. That does seem to be quite hard given our current understanding.
But there are tons of opportunities for an agent to improve its own reasoning capacity otherwise. At a very basic level, the agent can do at least two other things:
"Most problems in computer science have superlinear time complexity"
On one hand, sure, improving on this is (likely) impossible in the limit because of fundamental complexity properties. On the other hand, the agent can still become vastly smarter than humans. A particular example: the human mind, without any assistance, is very bad at solving 3SAT. But we've invented computers, and then constraint solvers, and now we are able to solve 3SAT instances much, much faster, even though 3SAT is (likely) exponentially hard. So the RSI argument here is: the smarter (or faster) the model is, the more special-purpose tools it can create to efficiently solve specific problems and thus upgrade its reasoning ability. Not to infinity, but likely far beyond humans.
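A minimal sketch of the "build a tool" move: a tiny DPLL-style SAT solver (a simplification of what real constraint solvers do) that handles random 3SAT instances far faster than unaided enumeration, even though 3SAT is (likely) exponentially hard in the worst case. The instance generator and all parameters below are arbitrary choices for illustration.

```python
import random

def dpll(clauses, assignment=None):
    # clauses: list of clauses, each a list of signed variable indices (e.g. [1, -3, 7]).
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
            continue                      # clause already satisfied
        rest = [lit for lit in clause if abs(lit) not in assignment]
        if not rest:
            return None                   # clause falsified -> backtrack
        simplified.append(rest)
    if not simplified:
        return assignment                 # all clauses satisfied
    for clause in simplified:             # unit propagation: forced assignments first
        if len(clause) == 1:
            lit = clause[0]
            return dpll(simplified, {**assignment, abs(lit): lit > 0})
    var = abs(simplified[0][0])           # branch on the first unassigned variable
    for value in (True, False):
        result = dpll(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None

# A small random 3SAT instance.
random.seed(1)
n_vars, n_clauses = 30, 100
instance = [[random.choice([-1, 1]) * v for v in random.sample(range(1, n_vars + 1), 3)]
            for _ in range(n_clauses)]
print(dpll(instance) is not None)         # True if the instance is satisfiable
```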
So I'm going to strong disagree here.
First of all, as it turned out in practice, scale was everything. This means that any AI idea you want to name was never actually attempted, unless it was based on a transformer and worked on by one of approximately three labs.
We can just ignore the thousands of other AI methods humans have tried, because they were never attempted at a relevant scale.
Therefore, RSI has never been tried.
Second, you can easily design a variation on RSI that works fine with current paradigms.
It's not precisely RSI but it's functionally the same thing. Here are the steps:
1. Build a benchmark of many tasks. Tasks must be autogradeable; human participants must be able to 'play' the tasks so we have a control-group score; tasks must push the edge of human cognitive ability (so the average human scores nowhere close to the max, and the top 1% of humans don't max the bench either); and there must be many tasks with a rich permutation space (so it isn't possible for a model to memorize all the permutations).
2. Define a heuristic weighted score on this benchmark intended to measure how "AGI-like" a model is. It might be the RMSE across the benchmark, but with a lot of the score weighting on zero-shot, cross-domain/multimodal tasks; one possible weighting is sketched right after this list. That is, the kind of model that can use information from many different previous tasks on a complex exercise it has never seen before is closer to an AGI, or closer to replicating a "Leonardo da Vinci", whose exceptional human performance presumably came from exactly this kind of cross-domain knowledge.
3. In the computer-science task set, include tasks to design an AGI for a bench like this. The model proposes a design and, if that design has already been tested, immediately receives detailed feedback on how it performed.
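A sketch of what the weighted score in step 2 could look like. The 2x weights for zero-shot and cross-domain tasks are arbitrary assumptions used only to show the weighting idea; the comment itself suggests an RMSE-like aggregate.

```python
from dataclasses import dataclass

@dataclass
class TaskResult:
    score: float          # normalized 0..1 against the human control group
    zero_shot: bool       # task never seen in any form during training
    cross_domain: bool    # task requires combining multiple domains/modalities

def agi_score(results: list[TaskResult]) -> float:
    # Weighted average with extra (assumed) weight on the more "AGI-like" tasks.
    total, weight_sum = 0.0, 0.0
    for r in results:
        w = 1.0
        if r.zero_shot:
            w *= 2.0
        if r.cross_domain:
            w *= 2.0
        total += w * r.score
        weight_sum += w
    return total / weight_sum

print(agi_score([TaskResult(0.8, True, True), TaskResult(0.5, False, False)]))  # 0.74
```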
The "design an AGI" subtask can be much simpler than "write all the boilerplate in Python", but these models will be able to do that if needed.
As task scores approach human level across a broad set of tasks, you have an AGI. You would expect it to almost immediately improve to a low superintelligence. As AGIs get used in the real world and fail to perform well at something, you add more tasks to the bench, and/or automate the creation of simulated scenarios that use robotics data.
Why aren't we already doing this if it's so simple?
Because each AGI-candidate training run has to be at least twice as large as llama-65b, which means $2M+ in training costs per run. And you need to explore the possibility space pretty broadly, so figure several thousand runs to really get to a decent (and still not optimal) AGI design.
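Rough arithmetic behind "each attempt is too expensive", using the comment's own numbers as assumptions:

```python
cost_per_run = 2_000_000     # >= 2x llama-65b scale -> ~$2M+ per training run
runs_needed = 3_000          # "several thousand" candidate designs to search the space
print(f"~${cost_per_run * runs_needed / 1e9:.0f}B total")   # on the order of $6B
```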
This is one of the reasons foom cannot happen. At least not without a lot more compute than we have now. Each attempt is too expensive.
Can we refine the above algorithm into something more compute-efficient? Yes, somewhat (by going to a modular architecture, where each "AGI candidate" is composed of hundreds of smaller networks, and we reuse most of them between candidates), but it's still going to require a lot more compute than llama-65b took to train.
Direct self-improvement (i.e. rewriting itself at the cognitive level) does seem much, much harder with deep learning systems than with the sort of systems Eliezer originally focused on.
In DL, there is no distinction between "code" and "data"; it's all messily packed together in the weights. Classic RSI relies on the ability to improve and reason about the code (relatively simple) without needing to consider the data (irreducibly complicated).
Any verification that a change to the weights/architecture preserves a particular non-trivial property (e.g. avoiding value drift) is likely to be commensurate in complexity with the weights themselves. So... very complex.
The safest "self-improvement" changes probably look more like performance/parallelization improvements than "cognitive" changes. There are likely to be many opportunities for immediate performance improvements, but that could quickly asymptote.
I think that recursive self-empowerment might now be a more accurate term than RSI for a possible source of foom. That is, the creation of accessory tools for capability increase. More like a metaphorical spider at the center of an increasingly large web. Or (more colorfully) a shoggoth spawning a multitude of extra tentacles.
The change is still recursive in the sense that marginal self-empowerment increases the ability to self-empower.
So I'd say that a "foom" is still possible in DL, but is both less likely and almost certainly slower. However, even if a foom is days or weeks rather than minutes, many of the same considerations apply. Especially if the AI has already broadly distributed itself via the internet.
Perhaps instead of just foom, we get "AI goes brrrr... boom... foom".
Hypothetical examples of such performance improvements include: more efficient matrix multiplication, faster floating-point arithmetic, better techniques for avoiding memory bottlenecks, finding acceptable latency vs. throughput trade-offs, parallelization, better usage of GPU L1/L2/etc. caches, NN "circuit" factoring, and many other algorithmic improvements that I'm not qualified to predict.
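As a toy illustration of what a pure performance change (as opposed to a cognitive one) looks like, here is a naive versus cache-blocked matrix multiply in NumPy. Real gains of this kind live in BLAS/CUDA kernels rather than Python, and the block size is an arbitrary assumption; the point is only that the outputs are identical while the memory-access pattern changes.

```python
import numpy as np

def matmul_naive(A, B):
    n, k = A.shape
    _, m = B.shape
    C = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i, j] += A[i, p] * B[p, j]
    return C

def matmul_blocked(A, B, block=16):
    # Same arithmetic, reorganized into tiles to improve cache reuse.
    n, k = A.shape
    _, m = B.shape
    C = np.zeros((n, m))
    for i0 in range(0, n, block):
        for p0 in range(0, k, block):
            for j0 in range(0, m, block):
                C[i0:i0+block, j0:j0+block] += (
                    A[i0:i0+block, p0:p0+block] @ B[p0:p0+block, j0:j0+block]
                )
    return C

A, B = np.random.rand(64, 64), np.random.rand(64, 64)
print(np.allclose(matmul_naive(A, B), matmul_blocked(A, B)))   # True: identical result
```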