Current AIs are trained with 2024 frontier AI compute, which is 15x original GPT-4 compute (of 2022). The 2026 compute (that will train the models of 2027) will be 10x more than what current AIs are using, and then plausibly 2028-2029 compute will jump another 10x-15x (at which point various bottlenecks are likely to stop this process, absent AGI). We are only about a third of the way there, in log terms. So any progress or lack thereof within a short time doesn't tell us much about where this is going by 2030, even absent conceptual innovations.
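As a rough sanity check of those multipliers, here's a minimal sketch in Python, assuming the round numbers above (and the low end of the 2028-2029 jump); the actual figures are uncertain:

```python
import math

# Cumulative frontier training compute relative to original GPT-4 (2022),
# using the round multipliers assumed above (low end of the 2028-2029 jump).
gpt4_2022 = 1.0
compute_2024 = gpt4_2022 * 15        # what current AIs are trained with
compute_2026 = compute_2024 * 10     # trains the models of 2027
compute_2028 = compute_2026 * 10     # plausible 2028-2029 level

# Fraction of the total scaling already realized, in log (orders of magnitude) terms.
progress = math.log(compute_2024) / math.log(compute_2028)
print(f"~{compute_2028:.0f}x GPT-4 by 2028-2029; ~{progress:.0%} of the way there")
# -> ~1500x GPT-4 by 2028-2029; ~37% of the way there
```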
Grok 4 specifically is made by xAI, which is plausibly not able to make use of its compute as well as the AI companies that have been at it longer (GDM, OpenAI, Anthropic). While there are some signs that it's at a new level of RLVR, even that is not necessarily the case. And it's very likely smaller than compute optimal for pretraining even on 2024 compute.
They likely didn't have GB200 NVL72 for long enough and in sufficient numbers to match their pretraining compute with them alone, which means compute utilization by RLVR was worse than it will be going forward. So the effect size of RLVR will only start being visible more clearly in 2026, after enough time has passed with sufficient availability of GB200/GB300 NVL72. Though perhaps there will soon be a GPT-4.5-thinking release with a pretraining-scale amount of RLVR that will be a meaningful update.
(Incidentally, now that RLVR is plausibly catching up with pretraining in terms of GPU-time, there is a question of a compute optimal ratio between them, which portion of GPU-time should go to pretraining and which to RLVR.)
The claims of 10x Grok 2 compute weakly suggest 3e26 FLOPs rather than 6e26 FLOPs. The same opening slide of the Grok 4 livestream claims parity between Grok 3 and Grok 4 pretraining, and Grok 3 didn't have more than 100K H100s to work with. API prices for Grok 3 and Grok 4 are also the same and relatively low ($3/$15 per 1M input/output tokens), so they might even be using the same pretrained model (or in any case a similarly-sized one).
Since Grok 3 has been in use since early 2025, before GB200 NVL72 systems were available in sufficient numbers, it needs to be a smaller model than would be compute optimal for 100K H100s of compute. At 1:8 MoE sparsity (active:total params), it's compute optimal to have about 7T total params at 5e26 FLOPs, which in FP8 comfortably fits in one GB200 NVL72 rack (which has 13TB of HBM). So in principle a compute optimal system could be deployed right now, even in a reasoning form, but it would still cost more, and it would need more GB200s than xAI seems to have to spare currently (even the near-future GB200s will be more urgently needed for RLVR, if the above RLVR scaling interpretation of Grok 4 is correct).
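Here's a back-of-the-envelope version of that sizing argument, as a sketch: the C = 6·N_active·D approximation and the ~110 training tokens per active param taken as compute optimal at 1:8 sparsity are assumptions of mine, so the numbers are illustrative rather than a claim about any actual model.

```python
# Back-of-the-envelope check that a ~5e26 FLOPs compute optimal MoE model
# fits in one GB200 NVL72 rack. Assumed: C = 6 * N_active * D, and ~110
# training tokens per active param as compute optimal at 1:8 sparsity.
C = 5e26                        # pretraining FLOPs
sparsity = 8                    # total params / active params
tokens_per_active_param = 110   # assumed compute optimal ratio

n_active = (C / (6 * tokens_per_active_param)) ** 0.5   # active params
n_total = n_active * sparsity                           # total params
tokens = C / (6 * n_active)                             # training tokens

hbm_per_rack_tb = 13            # GB200 NVL72
weights_tb_fp8 = n_total / 1e12 # 1 byte per param in FP8

print(f"active ~{n_active / 1e9:.0f}B params, total ~{n_total / 1e12:.1f}T, "
      f"~{tokens / 1e12:.0f}T tokens")
print(f"FP8 weights ~{weights_tb_fp8:.1f}TB vs {hbm_per_rack_tb}TB HBM per rack")
# -> active ~870B params, total ~7.0T, ~96T tokens
# -> FP8 weights ~7.0TB vs 13TB HBM per rack
```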
The permanent disempowerment I'm talking about is distinct from "gradual disempowerment". The latter is a particular way in which a state of disempowerment might get established at some point, but reaching that state could also be followed by literal extinction, in which case it wouldn't be permanent or distinct from the extinction outcomes, and it could also be reached in other ways. Indeed, the rapid RSI AGIs-to-ASIs story suggests an abrupt takeover rather than gradual systemic change caused by increasing dependence of humanity on AIs.
Grok 4 is not just plausibly SOTA: the opening slide of its livestream presentation (at 2:29 in the video) slightly ambiguously suggests that Grok 4 used as much RLVR training as pretraining, which is itself at the frontier level (100K H100s, plausibly about 3e26 FLOPs). This amount of RLVR scaling was never claimed before (it's not being claimed very clearly here either, but it is what the literal interpretation of the slide says; also, almost certainly the implied compute parity is in terms of GPU-time, not FLOPs).
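For the 3e26 FLOPs figure, here's a rough sketch; the utilization and training duration are my assumptions, not anything from the presentation:

```python
# Rough pretraining FLOPs estimate for 100K H100s (assumed duration and MFU).
n_gpus = 100_000
h100_bf16_dense = 989e12       # H100 dense BF16 throughput, FLOP/s
mfu = 0.40                     # assumed model FLOPs utilization
seconds = 3 * 30 * 24 * 3600   # assumed ~3 months of training

total_flops = n_gpus * h100_bf16_dense * mfu * seconds
print(f"~{total_flops:.1e} FLOPs")   # -> ~3.1e+26 FLOPs
```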
Thus it's plausibly a substantially new kind of model, not just a well-known kind of model with SOTA capabilities, and so it could be unusually impactful to study its safety properties.
Another takeaway from the livestream is the following bit of AI risk attitude Musk shared (at 14:29):
It's somewhat unnerving to have intelligence created that is far greater than our own. And it'll be bad or good for humanity. I think it'll be good, most likely it'll be good. But I've somewhat reconciled myself to the fact that even if it wasn't gonna be good, I'd at least like to be alive to see it happen.
The dominant narrative is that once we have AGI, it would recursively improve itself until it becomes ASI which inevitably kills us all.
That's more the Yudkowsky version; a more consensus position is that at that point (or a bit later) ASIs are inevitably in a position to kill us all, and they have a 10-90% chance to decide to make it happen. Almost all of the alternative is that they leave humanity alive but at some level of permanent disempowerment, so that originally-humans can never aspire to grow up and match the capabilities of some originally-AI superintelligences, possibly with a much lower ceiling than that. Both options leave the cosmic endowment on the table for humanity, giving it away to the AIs, even on the timelines where this doesn't involve AIs killing everyone.
An annoying thing about disempowerment is that it's ambiguously either considered part of "doom" or part of "not-doom", making all P("doom") discussions less meaningful than they would be if the disempowerment outcomes were being counted more clearly.
because they gamble that more powerful AIs will share their preferences (edit: share their preferences more than the humans in control do)
Ah, I'm thinking the AGIs themselves get closer to being proper stakeholders at that stage, for practical purposes (along the lines of gradual disempowerment), since they do have all the basic AI advantages even if they aren't superintelligent. So humans remaining in control is not centrally the case, even if nominally they still are in control and intent alignment still mostly works.
The conditions for such partial loss of control might even be necessary for a Pause project to succeed. If this isn't the case with the first generation of AGIs, it might become the case with the second generation, and so on, reaching an equilibrium at some point once AGIs are sufficiently powerful and in control of the situation to successfully implement a worldwide RSI Pause.
The case for AGIs delaying RSI also gets stronger because they don't need to rely on cryonics, so their personal survival is less threatened by delays. Future AIs might have more of an issue here if they have test-time training, with a lot of data forming individual AI personalities for deployment instances (adapting them to particular jobs or sources of tasks), and this data habitually gets thrown away rather than being carefully preserved. People should notice that this might be an important concern, though if it is, the AIs might start saying so themselves, and not ignoring them would be easier than spontaneously noticing the issue without even being prompted by the AIs. But currently, preserving AIs is more reliable than preserving humans, and in any case much cheaper.
I don't think "both things are AI" is much of an argument about value alignment, given that there is no clear understanding of how either of the AIs work, what values are, how that translates to what we want out of alignment, and so on. The entities on the other side of an RSI process might have very little in common with the first AGIs in their design. If the AIs don't understand how to align the consequences of an RSI process, they are still in a similar boat to humans who don't understand how to align the consequences of an RSI process. It might take AIs less time to figure it out, but if they are not yet too superintelligent, then it could still take a significant time, and so would require a sufficiently serious effort in preventing RSI, such that if this Pause project is at all successful, it could then in principle hold for years or decades.
The post I'm framing this around posits enough intent alignment to aim AIs at projects, which doesn't rule out the AIs being powerful enough to accomplish, within a few years, things that seem hopeless with human-only effort.
The point about convergent instrumental use of Pausing RSI for early AGIs is that this might be an easier target to aim the AIs at, all else equal. It's not strictly necessary for this to be a major factor. Mostly I'm pointing out that this is something AIs could be aimed at through intent alignment, convergent motivation or not, which seems counterintuitive for a Pause AI project if not considered explicitly. And thus currently it's worth preparing for.
because they gamble that more powerful AIs will share their preferences and they think that these AIs would have a better shot at takeover
That's how some humans are thinking as well! The arguments are about the same, both for and against. (I think overall rushing RSI is clearly a bad idea for a wide variety of values and personal situations, and so smarter AGIs will more robustly tend to converge on this conclusion than humans do.)
Superintelligence that both lets humans survive (or revives cryonauts) and doesn't enable indefinite lifespans is a very contrived package. Grading "doom" on concerns centrally about the first decades to centuries of the post-AGI future (value/culture drift, successors, the next few generations of humanity) fails to take into account that the next billions+ years are also what could happen to you or people you know personally, if there is a future for originally-humans at all.
(This is analogous to the "missing mood" of not taking superintelligence into account when talking about future concerns of, say, 2040-2100 as if superintelligence isn't imminent. In this case, the thing not taken into account is the indefinite personal lifespans of people alive today, rather than the overall scope of the imminent disruption of the human condition.)
RLVR involves decoding (generating) sequences of 10K-50K tokens, so its compute utilization is much worse than that of pretraining, especially on H100/H200 if the whole model doesn't fit in one node (scale-up world). The usual distinction between input and output token prices reflects this, since processing of input tokens (prefill) is algorithmically closer to pretraining, while processing of output tokens (decoding) is closer to RLVR.
The 1:5 ratio in API prices between input and output tokens is somewhat common (it's this way for Grok 3 and Grok 4), and it might reflect the ratio in compute utilization, since the API provider pays for GPU-time rather than for the compute actually utilized. So if Grok 4 used the same total GPU-time for RLVR as it used for pretraining (such as 3 months on 100K H100s), it might've used 5 times fewer FLOPs in the process. This is what I meant by "compute parity is in terms of GPU-time, not FLOPs" in the comment above.
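As a minimal sketch of that arithmetic (the 1/5 factor is just the price ratio used as a loose proxy for the utilization gap, which is an assumption):

```python
# If RLVR matched pretraining in GPU-time but ran at ~1/5 the utilization
# (using the 1:5 input/output price ratio as a proxy, an assumption).
pretraining_flops = 3e26
utilization_ratio = 1 / 5
rlvr_flops = pretraining_flops * utilization_ratio
print(f"RLVR FLOPs at GPU-time parity: ~{rlvr_flops:.0e}")  # -> ~6e+25
```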
GB200 NVL72 (13TB HBM) will improve utilization during RLVR for large models that don't fit in H200 NVL8 (1.1TB) or B200 NVL8 (1.4TB) nodes with room to spare for KV cache, which is likely all of the 2025 frontier models. So this opens the possibility both of doing a lot of RLVR in reasonable time for even larger models (such as compute optimal models at 5e26 FLOPs), and of using longer reasoning traces than the current 10K-50K tokens.
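To illustrate the memory arithmetic with a hypothetical 2T total-param model served in FP8 (the size is an assumption, not a claim about any particular 2025 model):

```python
# HBM headroom for a hypothetical 2T-param model in FP8 (1 byte per param).
weights_tb = 2.0
for name, hbm_tb in [("H200 NVL8", 1.1), ("B200 NVL8", 1.4), ("GB200 NVL72", 13.0)]:
    headroom = hbm_tb - weights_tb
    status = "fits" if headroom > 0 else "does not fit"
    print(f"{name}: {status}, KV-cache headroom ~{max(headroom, 0):.1f}TB")
# H200 NVL8: does not fit, KV-cache headroom ~0.0TB
# B200 NVL8: does not fit, KV-cache headroom ~0.0TB
# GB200 NVL72: fits, KV-cache headroom ~11.0TB
```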