Overall take: unimpressed.
Very simple gears in a subculture's worldview can keep being systematically misperceived if that subculture isn't considered worthy of curious attention. On the local llama subreddit, I keep seeing assumptions that AI safety people call for never developing AGI, or claim that the current models can contribute to destroying the world. Almost never does anyone bother to contradict such claims or assumptions. This doesn't happen because it's difficult to figure out; it happens because the AI safety subculture is seen as unworthy of engagement, and so people don't learn what it's actually saying, and don't correct each other on errors about what it's actually saying.
This gets far worse with more subtle details: the required standard of willingness to engage rises to actually studying what the others are saying, since such details would be difficult to figure out even with curious attention. Rewarding engagement is important.
I'm being a bit simplistic. The point is that it needs to stop being a losing or a close race, and all runners getting faster doesn't obviously help with that problem. I guess there is some refactor vs. rewrite feel to the distinction between the project of stopping humans from building AGIs right now, and the project of getting the first AGIs to work on alignment and global security in a post-AGI world faster than other AGIs overshadow such work. The former has near/concrete difficulties; the latter has nebulous difficulties that don't as readily jump to attention. The whole problem is messiness and lack of coordination, so starting from scratch with AGIs seems more promising than reforming human society. But without strong coordination on the development and deployment of the first AGIs, the activities of AGIs are going to be just as messy and uncoordinated, only unfolding much faster, and that's not even counting the risk of getting a superintelligence right away.
The relevant thing is how the probability both gets clearer and improves with further research enabled by a pause. Currently, as a civilization, we are at the startled non-sapient deer stage; that's not a position from which to decide the future of the universe.
Plans that rely on aligned AGIs working on alignment faster than humans would need to ensure that no AGIs work on anything else in the meantime. The reason humans have no time to develop alignment of superintelligence is that other humans develop misaligned superintelligence faster. Similarly, by default, very fast AGIs working on alignment end up having to compete with very fast AGIs working on other things that lead to misaligned superintelligence. Preventing aligned AGIs from building misaligned superintelligence is not clearly more manageable than preventing humans from building AGIs.
Quantum nondeterminism is going to make an address not much better than compressing the local content directly: searching for the thing rather than pointing at a location. And to the extent the laws of physics follow from the local content anyway (my mind holds memories of observing the world and of physics textbooks), additionally specifying them does nothing. So it's unclear whether the salience of the laws of physics in shortest descriptions is correct.
My point is that the elegance of natural impact regularization takes different shapes for different minds, and paving over everything is only elegant for minds that care about the state of the physical world at some point in time, rather than about the arc of history.
Aligning human-level AGIs is important to the extent there is a risk that it doesn't happen before it's too late. Similarly with setting up a world where initially aligned human-level AGIs don't soon disempower humans (as literal humans might in the shoes of these AGIs), or fail to protect the world from misused or misaligned AGIs or superintelligences.
Then there is the problem of aligning superintelligences, and of setting up a world where initially aligned superintelligences don't cause disempowerment of humans down the line (whether that involves extinction or not). Humanity is a very small phenomenon compared to a society of superintelligences; remaining in control of it is a very unusual situation. (Humanity eventually growing up to become a society of superintelligences while holding off on creating a society of alien superintelligences in the meantime seems like a more plausible path to success.)
Solving any one of these problems doesn't diminish the importance of the others, which remain as sources of possible doom unless they too get solved before it's too late. The urgency of all of these problems originates from the risk of succeeding in developing AGI. Tasking the first aligned AGIs with solving the rest of the problems caused by the technology that enables their existence seems like the only plausible way of keeping up, since by default all of this likely occurs within a matter of years from the development of the first AGIs. Though economic incentives in AGI deployment risk escalating the problems faster than AGIs can implement solutions to them, just as the initial development of AGIs risks creating problems faster than humans can prepare for them.
The argument depends on awareness that the canvas is at least a timeline (but potentially also various counterfactuals and frames), not a future state of the physical world in the vicinity of the agent at some point in time. Otherwise, elegance asks planning to pave over the world to make it easier to reason about. In contrast, a timeline will carry permanent scars from the paving-over, which might be harder to reason through sufficiently beforehand than keeping closer to the status quo, or even developing affordances to maintain it.
Interestingly, this seems to predict that preference for "low impact" is more likely for LLM-ish things trained on human text (than for de novo RL-ish things or decision-theory-inspired agents), but for reasons that have nothing to do with becoming motivated to pursue human values. Instead, the relevant imitation is of the ontology of caring about timelines, counterfactuals, and frames.
LLMs will soon scale beyond the available natural text data, and generation of synthetic data is some sort of change of architecture, potentially a completely different source of capabilities. So scaling LLMs much further without a change of architecture is an expectation about something counterfactual. It makes sense as a matter of theory, but it's not relevant for forecasting.
Three quantities matter: how much compute is going into a training run, how much natural text data it wants, and how much data is available. For training compute, there are claims that multi-billion-dollar runs are plausible and possibly planned in 2-5 years. Eyeballing various trends, GPU shipping numbers, and revenues, it looks like about 3 OOMs of compute scaling is possible before industrial capacity constrains the trend and the scaling slows down. This assumes that there are no overly dramatic profits from AI (which might lead to finding ways of scaling supply chains faster than usual), and no overly dramatic lack of new capabilities with further scaling (which would slow down investment in scaling). That gives about 1e28-1e29 FLOPs at the slowdown in 4-6 years.
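For concreteness, a minimal back-of-envelope sketch of that compute figure, assuming a current frontier training run of roughly 2e25 FLOPs (my assumption for illustration, not a figure from the discussion above):

```python
# Back-of-envelope for the compute ceiling (a sketch, not a forecast).
# Assumed (not from the discussion): current frontier runs are ~2e25 FLOPs.
current_frontier_flops = 2e25
scaling_ooms = 3  # ~3 orders of magnitude before industrial capacity binds

flops_at_slowdown = current_frontier_flops * 10**scaling_ooms
print(f"Compute at slowdown: {flops_at_slowdown:.0e} FLOPs")
# -> 2e+28 FLOPs, consistent with the 1e28-1e29 range above
```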
At 1e28 FLOPs, Chinchilla scaling asks for 200T-250T tokens. Various sparsity techniques increase effective compute, asking for even more tokens (when optimizing loss given fixed hardware compute).
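A rough check on that token figure, using the Chinchilla rule of thumb C ≈ 6·N·D with a compute-optimal ratio of roughly 20-25 tokens per parameter; the coefficients are approximate, so treat this as a sketch rather than a precise derivation:

```python
import math

# Chinchilla-style rule of thumb: training compute C ~= 6 * N * D, with the
# compute-optimal amount of data D ~= k * N for k around 20-25 tokens per
# parameter. Substituting N = D / k gives C = 6 * D**2 / k, so
# D = sqrt(k * C / 6).
def chinchilla_optimal_tokens(compute_flops: float, tokens_per_param: float) -> float:
    return math.sqrt(tokens_per_param * compute_flops / 6)

for k in (20, 25):
    tokens = chinchilla_optimal_tokens(1e28, k)
    print(f"k = {k}: ~{tokens / 1e12:.0f}T tokens at 1e28 FLOPs")
# k = 20 -> ~183T, k = 25 -> ~204T: the same ballpark as 200T-250T,
# with sparsity pushing the demand higher still.
```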
At the outside, there are 20M-150M accessible books, some text from video, and 1T web pages of extremely dubious uniqueness and quality. That might give about 100T tokens, if LLMs are used to curate? There's some discussion (incl. comments) here; this is the figure I'm most uncertain about. In practice, absent good synthetic data, I expect multimodality to fill the gap, but that's not going to be as useful as good text for improving chatbot competence. (Possibly the issue with the original claim in the grandparent is what I meant by "soon".)
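One hedged way such a tally might go; every per-source number below is an assumption of mine for illustration, not a figure from the linked discussion:

```python
# Very rough tally of natural text availability. All numbers are assumptions
# for illustration, not figures from the linked discussion.
book_tokens = 100e6 * 100e3      # ~100M accessible books x ~100k tokens each
web_tokens = 1e12 * 400 * 0.2    # ~1T pages x ~400 tokens, ~20% surviving
                                 # deduplication and quality filtering
video_tokens = 10e12             # speculative allowance for transcribed speech

total = book_tokens + web_tokens + video_tokens
print(f"books ~{book_tokens / 1e12:.0f}T, web ~{web_tokens / 1e12:.0f}T, "
      f"video ~{video_tokens / 1e12:.0f}T, total ~{total / 1e12:.0f}T tokens")
# On these assumptions the total lands near 100T tokens, with the web
# estimate dominating the uncertainty.
```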