I'm trying to look at how increasing model time horizons amplifies AI researcher productivity. For example, if a researcher had a programming agent that could reliably complete programming tasks of up to a week in length, could they simply automate thousands of experiments in parallel using these agents? That is, come up with a bunch of possibly interesting ideas and have the agent iterate over variations of each? Or are experiments overwhelmingly compute-constrained rather than programming-time constrained?
Someone approaches you with a question:
"I have read everything I could find that rationalists have written on AI safety. I came across many interesting ideas, I studied them carefully until I understood them well, and I am convinced that many are correct. Now I'm ready to see how all the pieces fit together to show that an AI moratorium is the correct course of action. To be clear, I don't mean a document written for the layperson, or any other kind of introductory document. I'm ready for the real stuff now. Show me your actual argument in all its glory. Don't hold back."
After some careful consideration, you:
(a) helpfully provide a link to A List of Lethalities
(b) suggest that he read the Sequences
(c) patiently explain that if he was smart enough to understand the argument then he would have already figured it out for himself
(d) leave him on read
(e) explain that the real argument was written once, but it has since been taken down, and unfortunately nobody's gotten around to rehosting it since
(f) provide a link to a page which presents a sound argument[0] in favour of an AI moratorium
===
Hopefully, the best response here is obvious. But currently no such page exists.
It's a stretch to expect to be taken seriously without such a page.
[0] By this I mean an argument whose premises are all correct and which collectively entail the conclusion that an AI moratorium should be implemented.
How good is the argument for an AI moratorium? Tools exist which would help us get to the bottom of this question. Obviously, the argument first needs to be laid out clearly. Once we have the argument laid out clearly, we can subject it to the tools of analytic philosophy.
But I've looked far and wide and, surprisingly, have not found any serious attempt at laying the argument out in a way that makes it readily amenable to analysis.
Here’s an off-the-cuff attempt:
P1. ASI may not be far off
P2. ASI would be capable of exterminating humanity
P3. We do not know how to create an aligned ASI
P4. If we create ASI before knowing how to align ASI, the ASI will ~certainly be unaligned
P5. Unaligned ASI would decide to exterminate humanity
P6. Humanity being exterminated by ASI would be a bad thing
C. Humanity should implement a moratorium on AI research until we know how to create an aligned ASI
My off-the-cuff formulation of the argument is obviously far too minimal to be helpful. Each premise has a wide literature associated with it and should itself have an argument presented for it (and the phrasing and structure can certainly be refined).
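To illustrate the kind of layout that would make the argument mechanically checkable, here is a minimal propositional sketch in Lean. It is purely illustrative: every proposition name is a placeholder of mine, and the entailment only goes through once a bridging premise (P7 below, my addition, not part of the list above) is made explicit.

```lean
/-- A minimal propositional sketch of the argument above.
    Each premise is an opaque proposition; P7 is a bridging premise
    that the list above leaves implicit. -/
theorem moratorium_argument
    (ASISoon CapableOfExtermination AlignmentUnsolved UnalignedIfBuilt
     WouldExterminate ExterminationBad MoratoriumWarranted : Prop)
    (p1 : ASISoon)
    (p2 : CapableOfExtermination)  -- does no logical work in this minimal rendering
    (p3 : AlignmentUnsolved)
    (p4 : AlignmentUnsolved → UnalignedIfBuilt)
    (p5 : UnalignedIfBuilt → WouldExterminate)
    (p6 : ExterminationBad)
    -- P7 (my addition): if ASI may come soon, would exterminate us,
    -- and that would be bad, then a moratorium is warranted.
    (p7 : ASISoon → WouldExterminate → ExterminationBad → MoratoriumWarranted) :
    MoratoriumWarranted :=
  p7 p1 (p5 (p4 p3)) p6
```

Even this toy rendering surfaces useful facts: P2 does no inferential work as stated (it presumably belongs inside the case for P5), and the normative jump from “this would be bad” to “therefore a moratorium” needs its own premise.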
If we had a canonical formulation of the argument for an AI moratorium, the quality of discourse would immediately, immensely improve.
Instead of constantly talking past each other, retreading old ground, and spending large amounts of mental effort just trying to figure out what exactly the argument for a moratorium even is, one could simply say “my issue is with P6”. Their interlocutor would respond “What's your issue with the argument for P6?”, they would answer “Subpremise 4, because it's question-begging”, and now the two are in the perfect position for a genuinely productive conversation!
I’m shocked that this project has not already been carried out. I’m happy to lead such a project if anyone wants to fund it.
With pre-RLVR models we went from a 36-second 50% time horizon to a 29-minute horizon.
Between GPT-4 and Claude 3.5 Sonnet (new) we went from 5 minutes to 29 minutes.
I've looked carefully at the graph, but I see no sign of a plateau, nor even of a slowdown.
I'll do some calculation to ensure I'm not missing anything obvious or deceiving myself.
I don't see any sign of a plateau here. Things were a little behind-trend right after GPT-4, but of course there will be short behind-trend periods just as there will be short above-trend periods, even assuming the trend is projectable.
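Here is the back-of-the-envelope check I have in mind: under a clean exponential, the implied doubling time over a span is elapsed time × ln 2 / ln(end ÷ start). A quick sketch, where the horizon figures are the ones mentioned above but the elapsed times are my own rough assumptions rather than METR's numbers:

```python
import math

def doubling_time_months(h_start_min, h_end_min, elapsed_months):
    """Implied doubling time of the 50% time horizon, assuming clean exponential growth."""
    return elapsed_months * math.log(2) / math.log(h_end_min / h_start_min)

# GPT-4 -> Claude 3.5 Sonnet (new): 5 min -> 29 min.
# The ~19-month gap (March 2023 to October 2024) is my rough assumption.
print(round(doubling_time_months(5, 29, 19), 1))  # ~7.5 months per doubling

# Pre-RLVR span: 36 s -> 29 min. The elapsed time is a placeholder, since it
# depends on which model the 36-second figure refers to; substitute real dates.
assumed_elapsed = 48  # months (placeholder)
print(round(doubling_time_months(36 / 60, 29, assumed_elapsed), 1))
```

If the per-segment doubling times land in the same ballpark as the long-run figure, that is a decent sanity check against a hidden slowdown.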
I'm not sure why you are starting from GPT-4 and ending at GPT-4o. Starting with GPT-3.5 and ending with Claude 3.5 Sonnet (new) seems more reasonable, since these were all post-RLHF, non-reasoning models.
AFAIK the Claude 3.5 models were not trained on data from reasoning models?
I don't think there was a plateau. Is there a reason you're ignoring Claude models?
Greenblatt's predictions don't seem pertinent.
There’s a high bar to clear here: LLM capabilities have so far progressed at a hyper-exponential rate with no signs of a slowdown [1].
So, an argument for the claim that we’re about to plateau has to be more convincing than induction from this strong pattern we’ve observed since at least the release of GPT-2 in February 2019.
Your argument does not pass this high bar. You have made the same kind of argument that has been made again and again (and proven wrong again and again) throughout the past seven years we have been scaling up GPTs.
One can’t simply point out that the things LLMs currently cannot do are hard in a way that the things they currently can do are not. Of course the things they cannot do are different from the things they can; that has been true at every stage of the capability gains we have observed so far, so it cannot be used as evidence that the observed pattern is unlikely to continue.
So, you would need to go further. You would need to demonstrate that they’re different in a way that meaningfully departs from how past, successfully gained capabilities differed from earlier ones.
To make this more concrete, claims based on supposed architectural limitations are not an exception to this rule: many such claims have been made in the past and proven incorrect. The base rate here is unfavourable to the pessimist.
Even solid proofs of fundamental limitations are not, in themselves, sufficient: these tend to be arguments that LLMs cannot do X by means Y, rather than arguments that LLMs cannot do X at all.
To be convincing, you have to make an argument that fundamentally differentiates your objection from past failed objections.
[1] Based on METR's research: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
Could there be an observation bias at play here? Could it be that most extremely beautiful women do live glamorous lives but you are not a part of those scenes?
The common theme here is that the capabilities frontier is more jagged than expected. So the way in which people modeled takeoff in the pre-LLM era was too simplistic.
Takeoff used to be seen as equivalent to the time between AGI and ASI.
In reality, we got programs which are not AGI but which have capabilities that most people in the past would have assumed to entail AGI.
So we have pretty-general intelligence that is better than most humans in some areas and is amplifying programming and mathematics productivity. Takeoff has, I think, begun, but under quite different conditions than people used to model.
It's not clear to me that this is a strong enough theory to inform how we think about LLM psychosis. The gap between the two phenomena is just too big.
In fact, I'd probably characterise the yes-man situation as some form of delusion less extreme than psychosis.
The CEO's ideas begin grounded in reality (they must have been, in order for him to amass his yes-men in the first place). And even once he is surrounded by yes-men, his ideas remain constrained by the scope of his role as CEO, and hard data on profit, growth, market share, etc. keep him tethered.
LLM psychosis is different: people jump to theories almost arbitrarily disconnected from reality, which the LLM immediately amplifies: it affirms their ideas, supplies additional evidence, lists 50 different reasons they're clearly right, etc.
I think the better parallel is getting caught up in a conspiracy theory, whose believers manage to contort any evidence so that it confirms the theory, dismiss anyone who disagrees as brainwashed, have arguments which seem perfectly logical to anyone who doesn't happen to have specialised knowledge, etc.
This parallel seems potentially more informative.
Is this feeling reasonable?
A selfish person will take the gamble of a 5% risk of death for a 95% chance of immortal utopia.
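Spelled out as a quick expected-value check (the U terms are placeholders I'm introducing here, not part of the original claim): the selfish agent takes the gamble whenever 0.95·U(immortal utopia) + 0.05·U(death) > U(status quo), which holds for anyone who values immortal utopia vastly more than the status quo and does not treat death as infinitely bad.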
A person who tries to avoid moral shortcomings such as selfishness will reject the "doom" framing because it's just a primitive intelligence (humanity) being replaced with a much cleverer and more interesting one (ASI).
It seems that you have to really thread the needle to get from "5% p(doom)" to "we must pause, now!". You have to reason in a way that is not self-interested, yet is also deeply chauvinistic on behalf of the human species.
This is of course a natural way for a subagent of an instrumentally convergent intelligence, such as humanity, to behave. But unless we're taking the hypocritical position that tiling the universe with primitive desires is OK as long as they're our primitive desires, it seems that so-called doom is preferable to merely human flourishing.
So it seems that 5% is really too low a risk from a moral perspective, and an acceptable risk from a selfish perspective.