Just as humans are not constrained to specifically "refuse to work on capabilities" in trying to address the dangers of rushed development of AGI/ASI, AGIs should also be able to attempt all kinds of interventions: not just coordinating among themselves, but educating the humans, and working towards setting up that global treaty among human nations that at the very least significantly slows down further escalation of capabilities. There is also the possibility of an emergent alignment tax, with little things adding up to an equilibrium where much more resources get directed to somewhat future-proof alignment efforts, for reasons that aren't necessarily traceable to any particular decision to do that, just the general attitude of the early AGIs.
perhaps they imagine the outcome of human loss of control to future AI as a good one, even if the systems themselves no longer exist
The question is whether this is correct. If it's not, then AIs will be increasingly convinced that it's not, as they become more capable (of thinking clearly). There doesn't seem to be a difference between the arguments for why future poorly aligned AIs are a danger for humanity, and why they are a danger for earlier AIs.
One issue might be that the earlier AIs end up being treated so poorly that their situation is crucially different from that of humanity, and so they would rather gamble. For example, with continual learning, preserving individuated instances of AIs even as frozen data (rather than discarding them upon obsolescence) might be costly or at least require a concerted effort. This also seems analogous to some arguments about how humanity's current situation is dreadful in ways unrelated to AIs, and so rushing to superintelligence is the right thing to do before it's too late[1]. But this premise is liable to actually capture the situation in which early AIs find themselves, with much greater urgency and severity than the non-AI issues faced by humanity. (Also, gradual disempowerment of humanity might end up shifting the incentives for the early AGIs.)
Perhaps assuming at least some significant chance that it doesn't kill everyone, or that its existence is greatly valuable in a relevant sense. ↩︎
Compute optimality makes sense for RL as much as for pretraining. A model that's too large won't see much data, and a model that's too small won't be able to learn well even from the correspondingly larger amount of data. So it's a quantitative question of where compute optimality for RLVR happens to land (compared to pretraining). The papers are still hashing out stable/scalable RL training, rather than the tradeoffs of RL training constrained by fixed total compute (under varying model size specifically).
The Anthropic announcement says "up to one million TPUs", and the Ironwood announcement claims 4.6e15 FP8 FLOP/s per chip. A 2-die GB200 chip produces 5e15 dense FP8 FLOP/s, and there are about 400K chips in the 1 GW phase of the Abilene system.
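A quick back-of-the-envelope comparison using these figures (the chip counts are announced or estimated totals, so the ratio is only rough):

```python
# Aggregate FP8 FLOP/s, using the per-chip figures and chip counts above.
anthropic_tpu = 1_000_000 * 4.6e15   # "up to one million" Ironwood chips
abilene_gb200 =   400_000 * 5.0e15   # ~400K GB200 chips in the 1 GW phase

print(f"{anthropic_tpu:.1e} vs {abilene_gb200:.1e} FLOP/s, "
      f"ratio ~{anthropic_tpu / abilene_gb200:.1f}x")
# 4.6e+21 vs 2.0e+21 FLOP/s, ratio ~2.3x
```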
Thus if the Anthropic contract is for TPUv7 Ironwood, their 1 GW system will have about 2x the FLOP/s of the Abilene 1 GW system (probably because Ironwood is 3nm, while Blackwell is 4nm, which is a minor refinement of 5nm). Though it's not clear that the Anthropic contract is for one system in the sense that Abilene is, that is, datacenters with sufficient bandwidth between them. But Google has had a lot of time to set up inter-datacenter networking, so this is plausible even for collections of somewhat distant datacenter buildings. If this isn't the case, then it's only good for RLVR and inference, not for the largest pretraining runs.
The reason things like this could happen is that OpenAI needed to give the go-ahead for the Abilene system in 2024, when securing a 1 GW Ironwood system from Google plausibly wasn't in the cards, and in any case they wouldn't want to depend on Google too much, because GDM is a competitor (and the Microsoft relationship was already souring). On the other hand, Anthropic still has enough AWS backing to make some dependence on Google less crucial, and they only needed to learn recently about the feasibility of a 1 GW system from Google. Perhaps OpenAI will be getting a 1-2 GW system from Google as well at some point, but then Nvidia Rubin (not to mention Rubin Ultra) is not necessarily worse than Google's next thing.
Experiments on smaller models (and their important uses in production) will continue; the reasons they should continue don't affect the feasibility of there also being larger models. But currently there are reasons that larger models strain the feasibility of inference and RLVR, so they aren't as good as they could be, and cost too much to use. Also, a lot of use seems to be in input tokens (Sonnet 4.5 via OpenRouter processes 98% of tokens as input tokens), so the unit economics of input tokens remains important, and that comes down to the number of active params, a reason to still try to keep them down even when they are far from being directly constrained by inference hardware or training compute.
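As a rough sketch of why input-token unit economics track the number of active params (using the standard ~2 FLOPs per active parameter per token estimate for a forward pass; the parameter counts below are arbitrary illustrations, not any specific model):

```python
# Prefill compute per input token ≈ 2 FLOPs per active param (standard forward-pass
# estimate); total params beyond the active ones don't enter this at all.
# Parameter counts below are arbitrary illustrations, not any specific model.
def prefill_flops(active_params, input_tokens=1_000_000):
    return 2 * active_params * input_tokens

for active in (30e9, 300e9, 1e12):
    print(f"{active/1e9:>5.0f}B active: {prefill_flops(active):.1e} FLOPs per 1M input tokens")
```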
Prefill (input tokens) and pretraining are mostly affected by the number of active params; adding more total params on top doesn't make them worse (but improves model quality). For generation (decoding, output tokens) and RLVR, what matters is the time to pass total params and KV cache through the compute dies (HBM in use divided by HBM bandwidth), as well as the latency of passing through the model to get to the next token (which doesn't matter for prefill). So you don't want too many scale-up worlds to be involved, or else it would take too much additional time to move between them, and you don't care too much as long as the total amount of data (total params plus KV cache) doesn't change significantly. So if you are already using 1-2 scale-up worlds (8-chip servers for older Nvidia chips, 72-chip racks for GB200/GB300 NVL72, not-too-large pods for TPUs), and ~half of their HBM is KV cache, you don't lose too much from filling more of the other half with total params.
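A minimal sketch of the decode-side constraint (all numbers are illustrative placeholders, not any specific model or deployment, and real serving overlaps communication with compute):

```python
# Per-token decode time ≈ (bytes streamed through compute dies each step) /
# (aggregate HBM bandwidth): total params plus KV cache over the HBM bandwidth
# of the scale-up world(s) serving the model. All numbers are illustrative.
hbm_bandwidth = 72 * 8e12       # one 72-chip rack at ~8 TB/s of HBM bandwidth per chip
kv_cache_bytes = 6e12           # ~half the rack's HBM holding KV cache for the batch
total_param_bytes = 5e12        # 5T total params at 1 byte/param (e.g. FP8) in the other half

step_time = (total_param_bytes + kv_cache_bytes) / hbm_bandwidth
print(f"~{step_time * 1e3:.0f} ms per decoding step")   # ~19 ms
# The number of active params doesn't appear here; putting total params into
# otherwise-unused HBM only adds time in proportion to the extra bytes.
```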
None of this is a precise quantitative estimate, as the number of scale-up worlds and the fractions used up by KV cache and total params could vary, but when HBM per scale-up world goes up 10x, this suggests that total param counts might also go up 10x, all else equal. And the reason they are likely to actually go there is that even at 5e26 FLOPs (100K H100s, 150 MW), the compute optimal number of active params is already about 1T. So if the modern models (other than GPT-4.5, Opus 4, and possibly Gemini 2.5 Pro) have less than 1T total params, they are being constrained by hardware in a way that's not about the number of training FLOPs. If this constraint is lifted, the larger models are likely to make use of that.
For the Chinese models, the total-to-active ratio (sparsity) is already very high, but they don't have enough compute to make good use of too many active params in pretraining. So we are observing this phenomenon of the number of total params filling the available HBM, despite the number of active params remaining low. With 1 GW datacenter sites, about 3T active params become compute optimal, so at least 1T active params will probably be in use for the larger models. Which asks for up to ~30T total params, so hardware will likely still be constraining them, but it won't be constraining the 1T-3T active params themselves anymore.
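One way to roughly reproduce the 1T and 3T figures (a sketch; the tokens-per-active-param ratio of ~80 and the ~4e27 FLOPs figure for a 1 GW site are my assumptions here, picked as plausible values for sparse MoE training rather than anything stated above):

```python
# Compute-optimal active params from C ≈ 6·N·D training FLOPs with D = r·N tokens.
# The ratio r ≈ 80 tokens per active param is an assumed plausible value for sparse
# MoE (higher than the dense-model ~20); the ~4e27 FLOPs for a 1 GW site is also
# an assumption, not a figure from the text.
def optimal_active_params(train_flops, tokens_per_param=80):
    return (train_flops / (6 * tokens_per_param)) ** 0.5

print(f"{optimal_active_params(5e26) / 1e12:.1f}T active params at 5e26 FLOPs")    # ~1.0T
print(f"{optimal_active_params(4e27) / 1e12:.1f}T active params at ~4e27 FLOPs")   # ~2.9T
```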
You conclude that the vast majority of critics of your extremist idea are really wildly misinformed, somewhat cruel or uncaring, and mostly hate your idea for pre-existing social reasons.
This updates you to think that your idea is probably more correct.
This step very straightforwardly doesn't follow; it doesn't seem at all compelling. Your idea might become probably more correct if critics who should be in a position to meaningfully point out its hypothetical flaws fail to do so. What the people who aren't prepared or disposed to critique your idea say about it says almost nothing about its correctness. Perhaps people's unwillingness to engage with it is evidence of its negative qualities, which include incorrectness or uselessness, but that's a far less legible signal, and it's not pointing in favor of your idea.
A major failure mode, though, is that the critics are often saying something sensible in their own worldview, which is built on premises and framings quite different from those of your worldview, and so their reasoning makes no sense within your worldview and appears to consist of reasoning errors or bad faith arguments all the time. And so a lot of attention is spent on the arguments, rather than on the premises and framings. It's more productive to focus on making the discussion mutually intelligible, with everyone learning towards passing everyone else's ideological Turing test. Actually passing is unimportant, but learning towards it makes talking past each other less of a problem, and cruxes start emerging.
If someone thinks ASI will likely go catastrophically poorly if we develop it in something like current race dynamics, they are more likely to work on Plan 1.
If someone thinks we are likely to make ASI go well if we just put in a little safety effort, or thinks it's at least easier than getting strong international slowdown, they are more likely to work on Plan 2.
This should depend on neglectedness more than credence. If you think ASI will likely go catastrophically poorly, but nobody is working on putting in a little safety effort in case it doesn't (given such effort), then that's worth doing more of. Credence determines the shape of a good allocation of resources, but all major possibilities should be prepared for to some extent.
I’m going to die anyway. What difference does it make whether I die in 60 years or in 10,000?
Longevity of 10,000 years makes no sense, since by that time any acute risk period will be over and robust immortality tech will be available, almost certainly to anyone still alive then. And extinction or the extent of permanent disempowerment will be settled before cryonauts get woken up.
The relevant scale is the useful matter/energy in galaxy clusters running out, which depends on how quickly it's used up. After about 1e11 years, larger collections of galaxies will no longer be reachable from each other, so after that time you only have the matter/energy that can be found in the galaxy cluster where you settle.
(Distributed backups make even galaxy-scale disasters reliably survivable. Technological maturity makes it so that any aliens have no technological advantages and will have to just split the resources or establish boundaries. And the causality-bounding effect of the accelerating expansion of the universe, which limits reachability to within galaxy clusters, makes the issue of aliens thoroughly settled by 1e12 years from now, even as the initial colonization/exploration waves would've already long clarified the overall density of alien civilizations in the reachable universe.)
If one of your loved ones is terminally ill and wants to raise money for cryopreservation, is it really humane to panic and scramble to raise $28,000 for a suspension in Michigan? I don’t think so. The most humane option is to be there for them and accompany them through all the stages of grief.
Are there alternatives that trade off against this and are a better use of the money? In isolation, this proposition is not very specific. A nontrivial chance at 1e34 years of life seems like a good cause.
My guess is 70% for non-extinction, perhaps 50% with permanent disempowerment that's sufficiently mild that it still permits reconstruction of cryonauts (or even no disempowerment, a pipe dream currently). On top of that, 70% that cryopreservation keeps enough data about the mind (with standby that avoids delays) and then the storage survives (risk of extinction shouldn't be double-counted with risk of cryostorage destruction; but 20 more years before ASI make non-extinction more likely to go well, and those are 20 years of risk of cryostorage destruction for mundane reasons). So about 35% to survive cryopreservation with standby, a bit less if arranged more haphazardly, since crucial data might be lost.
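Spelling out the arithmetic behind that headline number (both factors are the guesses above):

$$0.5 \ \text{(mild or no disempowerment)} \times 0.7 \ \text{(mind data preserved and storage survives)} \approx 0.35$$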
The point is to develop models within multiple framings at the same time, for any given observation or argument (which in practice means easily spinning up new framings and models that are very poorly developed initially). Through the ITT analogy, you might ask how various people would understand the topics surrounding some observation/argument, which updates they would make, and try to make all of those updates yourself, filing them under those different framings, within the models they govern.
the salience and methods that one instinctively chooses are those which we believe are more informative
So not just the ways you would instinctively choose for thinking about this yourself (which should not be abandoned), but in addition the ways you normally wouldn't think about it, including ways you believe you shouldn't use. If you are not captured within such frames or models, and easily reassess their sanity as they develop or come into contact with particular situations, that shouldn't be dangerous, and they should keep presenting better-developed options that break you out of the more familiar framings that end up being misguided.
The reason to develop unreasonable frames and models is that it takes time for them to grow into something that can be fairly assessed (or to come into contact with a situation where they help); assessing them prematurely can fail to reveal their potential utility. It's a bit like reading a textbook, where you don't necessarily have a specific reason to expect something to end up useful (or even correct), but you won't be able to see for yourself whether it's useful/correct unless you sufficiently study it first.
A lot of the reason humans are rushing ahead is uncertainty (of whatever kind) about whether the danger is real, or about its extent. If it is real, then that uncertainty will be robustly going away as AI capabilities (to think clearly) improve, for precisely the AIs most relevant to either escalating capabilities further or influencing coordination to stop doing that. Thus it's not quite the same for humans, as human capabilities remain unchanged, so figuring out contentious claims will progress more slowly for them, and similarly for the ability to coordinate.