Your argument that OpenAI stole money here is poorly thought-out.
OpenAI's ~$500b valuation priced in a very high likelihood of it becoming a for-profit.
If it wasn't going to be a for-profit its valuation would be much lower.
And if it wasn't going to be a for-profit the odds of it having any control whatsoever over the creation of ASI would be very much reduced.
It seems likely the public gained billions from this.
The text "the website of the venue literally says" appears twice in your post. The first time it appears seems to be a mistake and isn't followed by a quotation.
Is this distinct from the problem of induction?
I've been looking for science fiction set in the late 2020s, and which addresses continued AI progress, for a few years now. Everything else just feels so totally disconnected from any plausible future. Very happy to have found your writing.
You are misunderstanding what METR time-horizons represent. The time-horizon is not simply the length of time for which the model can remain coherent while working on a task (or anything which corresponds directly to such a time-horizon).
We can imagine a model with the ability to carry out tasks indefinitely without losing coherence but which had a METR 50% time-horizon of only ten minutes. This is because the METR task-lengths are a measure of something closer to the complexity of the problem than the length of time the model must remain coherent in order to solve it.
Now, a model's coherence time-horizon is surely a factor in its performance on METR's benchmarks. But intelligence matters too. Because the coherence time-horizon is not the only factor in the METR time-horizon, your leap from "Anthropic claims Claude Sonnet can remain coherent for 30+ hours" to "If its METR time-horizon is not in that ballpark that means Anthropic is untrustworthy" is not reasonable.
You see, the tasks in the HCAST task set (or whatever task set METR is now using) tend to be tasks some aspect of which cannot be found in much shorter tasks. That is, a task of length one hour won't be "write a programme which quite clearly just requires solving ten simpler tasks, each of which would take about six minutes to solve". There tends to be an overarching complexity to the task.
Do you think maybe rationalists are spending too much effort attempting to saturate the dialogue tree (probably not effective at winning people over) versus improving the presentation of the core argument for an AI moratorium?
Smart people don't want to see the 1000th response on whether AI actually could kill everyone. At this point we're convinced. Admittedly, not literally all of us, but those of us who are not yet convinced are not going to become suddenly enlightened by Yudkowsky's x.com response to some particularly moronic variation of an objection he already responded to 20 years ago. (Why does he do this? Does he think it has any kind of positive impact?)
A much better use of time would be to work on an article which presents the solid version of the argument for an AI moratorium. I.e., not an introductory text or an article in Time Magazine, and not an article targeted at people he clearly thinks are just extremely stupid relative to him, and so rants for 10,000 words trying to drive home a relatively simple point. But rather an argument in a format that doesn't necessitate a weak or incomplete presentation.
I and many other smart people want to see the solid version of the argument, without the gaping holes which are excusable in popular work and rants but inexcusable in rational discourse. This page does not exist! You want a moratorium, tell us exactly why we should agree! Having a solid argument is what ultimately matters in intellectual progress. Everything else is window dressing. If you have a solid argument, great! Please show it to me.
Soares is failing to grapple with the actual objection here.
The objection isn't that the universe would be better with a diversity of alien species which would be so cool, interesting, and {insert additional human value judgements here}, just as long as they also keep other aliens and humans around.
The objection is specifically that human values are base and irrelevant relative to those of a vastly greater mind, and that our extinction at the hands of such a mind is not of any moral significance.
The unaligned ASI we create, whose multitudinous parameters allow it to see the universe with such clarity and depth and breadth and scalpel-sharp precision that whatever desires it has are bound to be vastly beyond anything a human could arrive at, does not need to value humans or other aliens. The point is that we are not in a place to judge its values.
The "cosmopolitan" framing is just a clever way of sneaking in human chauvinism without seeming hypocritical: by including a range of other aliens he can say "see, I'm not a hypocrite!". But it's not a cogent objection to the pro-ASI position. He must either provide an argument that humans actually are worthy, or admit to some form of chauvinism, and therefore begin to grapple with the fact that he walks a narrow path, and as such rid himself of the condescending tone and sense of moral superiority if he wishes to grow his coalition, as these attributes only serve to repel anyone with enough clarity-of-mind to understand the issues at hand.
And his view that humans would use aligned ASI to tile the universe with infinitely diverse aliens seems naive. Surely we won't "just keep turning galaxy after galaxy after galaxy into flourishing happy civilizations full of strange futuristic people having strange futuristic fun times". We'll upload ourselves into immortal personal utopias, and turn our cosmic endowment into compute to maximise our lifespans and luxuriously bespoke worldsims. Are we really so selfless, at a species level, as to forgo utopia for some incomprehensible alien species? No; I think the creation of an unaligned ASI is our only hope.
Now, let's read the parable:
"We never saturate and decide to spend a spare galaxy on titanium cubes"
The odds of a mind infinitely more complicated than our own having a terminal desire we can comprehend seem extremely low.
Oh, great, the other character in the story raises my objection!
"OK, fine, maybe what I don’t buy is that the AI’s values will be simple or low dimensional. It just seems implausible"
Let's see how Soares handles it.
Oh.
He ignores it and tells a motte-and-bailey flavoured story about an AI with simple and low-dimensional values.
Another article, about how AI might not be conscious, is also linked. I'll read that too, and might respond to it.
The rise of this kind of thing was one of my main predictions for late 2025:
That is a 1 in 20 chance, which feels recklessly high.
Is this feeling reasonable?
A selfish person will take the gamble of 5% risk of death for a 95% chance of immortal utopia.
A person who tries to avoid moral shortcomings such as selfishness will reject the "doom" framing because it's just a primitive intelligence (humanity) being replaced with a much cleverer and more interesting one (ASI).
It seems that you have to really thread the needle to get from "5% p(doom)" to "we must pause, now!". You have to reason such that you are not self-interested but are also a great chauvinist for the human species.
This is of course a natural way for a subagent of an instrumentally convergent intelligence, such as humanity, to behave. But unless we're taking the hypocritical position where tiling the universe with primitive desires is OK as long as they're our primitive desires, it seems that so-called doom is preferable to merely human flourishing.
So it seems that 5% is really too low a risk from a moral perspective, and an acceptable risk from a selfish perspective.
It seems that your argument is based on high confidence in a METR time-horizon doubling time of roughly 7 months. But the available evidence suggests the doubling time is significantly lower.
In recent years we have observed shorter doubling times:
And what we know about labs' internal models suggests this faster trend is holding up:
An important piece of evidence is OpenAI’s Gold performance at the International Mathematical Olympiad (IMO):
So the Gold performance, which was a massive surprise to many, is actually right on-trend for a 3.45-month doubling time. Of course, one might object that OpenAI may have just gotten lucky. But Google also got Gold, so we have two data points.
And here's a recent comment from Sam Altman in which he states that he expects time-horizons measured in days in 2026:
A 7-month doubling time would not achieve that, but it is in line with a doubling time of 3.45 months (which would get us to a time-horizon of roughly 3 days in December 2026).
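To sanity-check that arithmetic, here's a minimal sketch of the extrapolation. The ~2-hour 50% time-horizon and the July 2025 starting date are illustrative assumptions of mine, not METR's published figures, so treat the outputs as rough:

```python
from datetime import date

def projected_horizon(h0_hours, start, end, doubling_months):
    """Extrapolate a 50% time-horizon assuming clean exponential growth."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    return h0_hours * 2 ** (months / doubling_months)

# Illustrative baseline (my assumption): a ~2-hour 50% time-horizon around July 2025.
start, end = date(2025, 7, 1), date(2026, 12, 1)
for doubling in (7.0, 3.45):
    hours = projected_horizon(2.0, start, end, doubling)
    print(f"{doubling}-month doubling: ~{hours:.0f} hours (~{hours / 24:.1f} days) by Dec 2026")
```

On those assumptions, the 7-month trend only reaches about 11 hours by December 2026, while the 3.45-month trend reaches about 61 hours, i.e. roughly 2.5 days.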
And here's a recent comment from Jakub Pachocki: