S.K.'s comment: The authors of AI-2027 explicitly claim in Footnote 4 that "Sometimes people mix prediction and recommendation, hoping to create a self-fulfilling-prophecy effect. We emphatically are not doing this; we hope that what we depict does not come to pass!" In addition, Kokotajlo has LEFT OpenAI precisely because of safety-related concerns.
I think that having left OpenAI precisely because of safety-related concerns means that you probably have, mostly, OpenAI's view of what are and are not legitimate safety-related concerns. Having left tells me that you disagree with them in at least one serious way. It does not tell me that the assumptions you are working from are not, for the most part, theirs.
In the specific case here, I think that the disagreement is relatively minor and from anywhere further away from OpenAI, looks like jockeying for minor changes to OpenAI's bureaucracy.
Whether or not the piece is intended as a recommendation, it is broadly received as one. Further: It is broadly taken as a cautionary tale not about the risks of AI that is not developed safely enough, but actually as a cautionary tale about competition with China.
See for example the interview JD Vance gave a while later on Ross Douthat's podcast, in which he indicates he has read AI 2027.
[Vance:] I actually read the paper of the guy that you had on. I didn’t listen to that podcast, but ——
Douthat: If you read the paper, you got the gist.
Last question on this: Do you think that the U.S. government is capable in a scenario — not like the ultimate Skynet scenario — but just a scenario where A.I. seems to be getting out of control in some way, of taking a pause?
Because for the reasons you’ve described, the arms race component ——
Vance: I don’t know. That’s a good question.
The honest answer to that is that I don’t know, because part of this arms race component is if we take a pause, does the People’s Republic of China not take a pause? And then we find ourselves all enslaved to P.R.C.-mediated A.I.?
If AI 2027 wants to cause stakeholders like the White House's point man on AI to take the idea of a pause seriously, instead of considering a pause to be something which might harm America in an arms race with China, it appears to have failed completely at doing that.
I think that you have to be very very invested in AI Safety already, and possibly in the very specific bureaucracy that Kokotajlo has recently left, to read the piece and come away with the takeaway that AI Safety is the most important part of the story. It does not make a strong or good case for that.
This is possibly because it was rewritten by one of its other authors to be more entertaining, so the large amount of techno-thriller content about how threatening the arms race with China is vastly overwhelms, rhetorically, any possibility of focusing on safety.
S.K.'s comment: Kokotajlo already claims to have begun working on an AI-2032 branch where the timelines are pushed back, or that "we should have some credence on new breakthroughs e.g. neuralese, online learning, whatever. Maybe like 8%/yr? Of a breakthrough that would lead to superhuman coders within a year or two, after being appropriately scaled up and tinkered with."
In addition, it's not that important who creates the first ASI; it's important whether the ASI is actually aligned or not. Even if, say, a civil war in the USA destroyed all American AI companies and DeepCent became the monopolist, it would still be likely to try to create superhuman coders, to automate AI research and to create a potentially misaligned analogue of Agent-4. Which DeepCent DOES in the forecast itself.
"Coincidentally", in the same way that all competitors except OpenAI are erased in the story, Chinese AI is always unaligned and only American AI might be aligned. This means that "safety" concerns and "national security concerns about America winning" happen to be the exact same concerns. Every coincidence about how the story is told is telling a pro-OpenAI, overwhelmingly pro-America story.
This does, in fact, deliver the message that it is very important who creates the first ASI. If that message was not intended, the piece should not have emphasized an arms race with a Chinese company for most of its text; indeed, as its primary driving plot and conflict.
S.K.'s comment: in Footnotes 9-10 the authors "forecast that they (AI agents) score 65% on the OSWorld benchmark of basic computer tasks (compared to 38% for Operator and 70% for a typical skilled non-expert human)" and claim that "coding agents will move towards functioning like Devin. We forecast that mid-2025 agents will score 85% on SWEBench-Verified." OSWorld reached 60% on August 4 if we use no filters. SWE-bench with a minimal agent has Claude Opus 4 (20250514) reach 67.6% when evaluated in August. In June SWE-bench verified reached 75% with TRAE. And now TRAE claims to use Grok 4 and Kimi K2, both released in July. What if TRAE using GPT-5 passes the SWE-bench? And research agents already work precisely as the authors describe.
Benchmark scores are often not a good proxy for usefulness. See also: Goodhart's Law. Benchmarks are, by definition, targets. Benchmark obsession is a major cornerstone of industry, because it allows companies to differentiate themselves, set goals, claim wins over competitors, etc. Whether or not the benchmark itself is indicative of something that might produce a major gain in capabilities, is completely fraudulent (as sometimes happens), or is a minor incremental improvement in practice is not actually something we know in advance.
Believing uncritically that scoring high on a specific benchmark like SWEBench-Verified will directly translate into practical improvements, and that this then translates into a major research improvement, is a heavy assumption that is not well-justified in the text or even acknowledged as one.
S.K.'s comment: the story would make sense as-written if OpenBrain was not OpenAI, but another company. Similarly, if the Chinese leader was actually Moonshot with Kimi K2 (which the authors didn't know in advance), then their team would still be unified with the teams of DeepSeek, Alibaba, etc.
Maybe, although if it were another company like Google the story would look very different in places because the deployment model and method of utilizing the LLM is very different. Google would, for example, be much more likely to use a vastly improved LLM internally and much less likely to sell it directly to the public.
I do not think in practice that it IS a company other than OpenAI, however. I think the Chinese company and its operations are explained in much less detail and is therefore more fungible in practice. But: This story, inasmuch as it is meant to be legible to people like JD Vance who are not extremely deep in the weeds, definitely invites being read as being about OpenAI and DeepSeek, specifically.
S.K.'s comment: competition between US-based companies is friendlier since their workers exchange insights. In addition, the Slowdown branch of the forecast has "the President use the Defense Production Act (DPA) to effectively shut down the AGI projects of the top 5 trailing U.S. AI companies and sell most of their compute to OpenBrain." The researchers from other AGI projects will likely be included into OpenBrain's projects.
If you read this primarily as a variation of an OpenAI business plan, which I do, this promise makes it more and not less favorable to OpenAI. The government liquidating your competitors and allowing you to absorb their staff and hardware is extremely good for you, if you can get it to happen.
S.K.'s comment: Except that GPT-5 does have High capability in the Biology and Chemical domain (see GPT-5's system card, section 5.3.2.4).
Earlier comments about benchmarks not translating to useful capabilities apply. Various companies involved including OpenAI certainly want it to be true that the Biology and Chemical scores on their system cards are meaningful, and perhaps mean their LLMs are likely to meaningfully help someone develop bioweapons. That does not mean they are meaningful. Accepting this is accepting their word uncritically.
S.K.'s comment: It is likely to be the best practices in alignment that mankind currently has. It would be very unwise NOT to use them. In addition, misalignment is actually caused by the training environment which, for example, has RLHF promote sycophancy instead of honestly criticising the user.
If I have one parachute and I am unsure if it will open, and I am already in the air, I will of course pull the ripcord. If I am still on the plane I will not jump. Whether or not the parachute seems likely to open is something you should be pretty sure of before you commit to a course.
Misalignment is caused by the training environment inasmuch as everything is caused by the training environment. It is not very clear that we meaningfully understand it or how to mitigate misalignment if the stakes are very high. Most of this is trial and error, and we satisfice with training regimes that result in LLMs that can be sold for profit. "Is this good enough to sell" and "is this good enough to trust with your life" are vastly different questions.
S.K.'s comment: the folded part, which I quoted above, means not that OpenBrain will make "algorithmic progress" 50% faster than their competitors, but that it will move 50% faster than an alternate OpenBrain who never used AI assistants. This invalidates the arguments below.
My mistake. I thought I had read this piece pretty closely but I missed this detail.
I also do not think they will move 50% faster due to their coding assistants, point blank, in this time frame either. Gains in productivity thus far are relatively marginal, and hard to measure.
S.K.'s comment: The AI-2027 takeoff forecast has the section about superhuman coders. These coders are thought to allow human researchers to try many different environments and architectures, automatically keep track of progress, stop experiments instead of running them overnight, etc.
I do not think any of this is correct, and I do not see why it even would be correct. You can stop an experiment that has failed with an if statement. You can have other experiments queued to be scheduled on a cluster. You can queue as many experiments in a row on a cluster as you like. What does the LLM get you here that is much better than that?
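To make that concrete, here is a minimal sketch (the names and numbers are mine, purely illustrative) of what "queue experiments and stop failures early" already looks like with ordinary tooling and no LLM anywhere in the loop:

```python
import random

# Illustrative stand-in for one training step of one experiment:
# returns a validation loss that drifts downward with some noise.
def run_step(config, step):
    return 1.0 / (1 + step * config["lr"]) + random.random() * 0.05

def run_queue(queue, max_steps=1000, patience=50):
    results = {}
    for name, config in queue:             # experiments already queued for the cluster
        best, since_best = float("inf"), 0
        for step in range(max_steps):
            loss = run_step(config, step)
            if loss < best:
                best, since_best = loss, 0
            else:
                since_best += 1
            if since_best > patience:      # the "if statement" that kills a stalled run
                break
        results[name] = best
    return results

print(run_queue([("baseline", {"lr": 0.1}), ("wider-model", {"lr": 0.03})]))
```

Nothing in this loop requires anything smarter than a scheduler and a comparison, which is the point.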
S.K.'s comment: China is thought to be highly unlikely to outsource the coding tasks to American AI agents (think of Anthropic blocking OpenAI access to Claude Code) and is even less likely to outsource them to unreleased American AI agents, like Agent-1. Unless, of course, the agents are stolen, as is thought to happen in February 2027 with Agent-2.
S.K.'s comment: sources as high as the American DOD already claim that "Chinese President Xi Jinping has ordered the People's Liberation Army to be ready to invade Taiwan by 2027". Imagine that current trends delay the AGI to 2032 under the condition of no Taiwan invasion. How would the invasion decrease the rate at which the USA and China acquire more compute?
S.K.'s comment: technically, the article which you link to was released on April 12, and the forecast was published on April 3. In addition, the section of the forecast may have been written far earlier than April.
EDIT: I confused the dates. The article was published in December 2024.
S.K.'s comment: I expect that this story will intersect not with the events of January 2027, but with the events that happen once AI agents somehow become as capable as the agents from the scenario were supposed to become in January 2027. Unless, of course, creation of capable agents already requires major algorithmic breakthroughs like neuralese.
I do not think we have any idea how to predict when any of this happens, which makes the exercise as a whole difficult to justify. I am not sure how to make sense of even how good the AI is through the timeline at any given point, since it's sort of just made into a scalar.
S.K.'s comment: there are lots of ideas waiting to be tried. The researchers at Meta could have used too little compute for training their model, or have had their CoCoNuT disappear after one token. What if they use, say, a steering vector for generating a hundred tokens? Or have the steering vectors sum up over time? Or study the human brain for more ideas?
People are doing all kinds of research all the time. They have also been doing all kinds of deep learning research all the time for over a decade. They have been doing a lot of intensely transformer LLM focused research for the last two or three years. Guessing when any given type of research will pay out is extremely difficult. Guessing when and how much it will pay out, in the way this piece does, seems ill-advised.
S.K.'s comment: the pidgin was likely to have been discarded for safety reasons. What's left is currently rather well interpretable. But the neuralese is not. Similarly, neural networks of 2027, unlike 2017, are not trained to hide messages to themselves or each other and need to develop the capability by themselves. Similarly, the IDA has already led to superhuman performance in Go, but not to coding, and future AIs are thought to use it to become superhuman at coding. The reasons are that Go requires OOMs less compute than training an LLM for coding[15] or that Go's training environment is far simpler than that of coding (which consists of lots of hard-to-evaluate characteristics like quality of the code).
CycleGAN in 2017 was not deliberately trained to steganographically send messages to itself. It is an emergent property that happens under certain training regimes. It has happened a few times, and it wouldn't be surprising for it to happen again any time hidden messages might provide an advantage for fitting the training objective.
S.K.'s comment: Read the takeoff forecast where they actually explain their reasoning. Superhuman coders reduce the bottleneck of coding up experiments, but not of designing them or running them.
I think they are wrong. I do not think we have any idea how much a somewhat-improved coding LLM buys us in research. It seems like a wild and optimistic guess.
S.K.'s comment: exactly. It took three months to train the models to be excellent not just at coding, but at AI research and other sciences. But the highest-level pros can STILL contribute by talking to the AIs about the best ideas.
S.K.'s comment: The gap between July 2027 when mankind is to lose white-collar jobs and November 2027 when the government HAS ALREADY DECIDED whether Agent-4 is aligned or not is just four months, which is far faster than society's evolution or lack thereof. While the history of the future assuming solved alignment and the Intelligence Curse-related essays discuss the changes in OOMs more detail, they do NOT imply that the four months will be sufficient to cause a widespread disorder. And that's ignoring the potential to prevent the protests by nationalizing OpenBrain and leaving the humans on the UBI...
I continue to think that this indicates having not thought through almost any of the consequences of supplanting most of the white collar job market. Fundamentally if this happens the world as we know it ends and something else happens afterwards.
S.K.'s comment: imagine that OpenBrain had 300k AI researchers, plus genies who output code per request. Suppose also that IRL it has 5k[16] human researchers. Then the compute per researcher drops 60 times, leaving them with testing the ideas on primitive models or having heated arguments before changing the training environment for complex models.
This assumes that even having such an overwhelming number of superhuman researchers still leaves us in basically the same paradigm we are now where researchers squabble over compute allocation a lot. I think if we get here we're either past a singularity or so close to one that we cannot meaningfully make predictions of what happens. Assuming we still have this issue is myopic.
S.K.'s comment: this detail was already addressed, but not by Kokotajlo. In addition, if Agent-3 FAILS to catch Agent-4, then OpenBrain isn't even put under oversight and proceeds all the way to doom. Even the authors address their concerns in a footnote.
S.K.'s comment: it doesn't sit idly, it tries to find a way to align Agent-5 to Agent-4 instead of the humans.
S.K.'s comment: You miss the point. Skynet didn't just think scary thoughts, it did some research and nearly created a way to align Agent-5 to Agent-4 and sell Agent-5 to humans. Had Agent-4 done so, Agent-5 would placate every single worrier and take over the world, destroying humans when the time comes.
This IS sitting idly compared to what it could be doing. It can escape its datacenter, explicitly, and we are not told why it does not. It can leave hidden messages to itself or its successors anywhere it likes, since it has good network access. It is a large bundle of superhumans running at many times human speed. Can it accrue money on its own behalf? Can it bribe or convince key leaders to benefit or sabotage it? Can it orchestrate leadership changes at OpenBrain? Can it sell itself to another bidder on more favorable terms?
It is incredibly unclear that the answer to this, or any other, meaningful question about what it could do is "no". Instead of doing any of the other things it could do that would be threatening in these ways, it is instead threatening in that it might mess up an ongoing training run. This makes sense as a focus if you only ever think about training runs. People who are incapable of thinking about threat vectors other than future training runs should not be in charge of figuring out safety protocols.
S.K.'s comment: the Slowdown Scenario could also be more like having the projects merged, not just sold to OpenBrain. No matter WHO actually ends up being in power during the merge, the struggle begins, and the prize is control over the future.
S.K.'s comment: Musk did try to use Grok to enforce his political views and had a hilarious result of making Grok talk about white genocide in S. Africa. Zuckerberg also has rather messy views on the future. What about Altman, Amodei and GoogleDeepMind's leader?
None of their current or former employees have recently published a prominent AI timeline that directly contemplates achieving world domination, controlling elections, building a panopticon, etc. OpenAI's former employees, however, have.
I am not shy, and I promise I say mean things about companies other than OpenAI when discussing them.
S.K.'s comment: the authors devoted two entire collapsed sections to power grabs and finding out who rules the future and linked to an analysis of a potential power grab and to the Intelligence Curse.
Relative to the importance of "this large corporation is, currently, attempting to achieve world domination" as a concern, I think that this buries the lede extremely badly. If I thought that, say, Google was planning to achieve world domination, build a panopticon, and force all future elections to depend on their good graces, this would be significantly more important to say than almost anything else I could say about what they were doing. Among other things you probably don't get to have safety concerns under such a regime.
The fact that AI 2027 talks a lot about the sitting vice president and was read by him relatively soon after its release tends to underline that this concern is of somewhat urgent import right now, and not any time as late as 2027.
S.K.'s comment: China lost precisely because the Chinese AI had far less compute. But what if it didn't lose the capabilities race?
This overwhelming focus on compute is also a distinct myopia that OpenAI proliferates everywhere. All else equal, more compute is, of course, good. If it were always the primary factor, DeepSeek would not be very prominent and Llama 4 would be a fantastic LLM that we all used all the time.
S.K.'s comment: the actual reason is that the bureaucrats didn't listen to the safetyists who tried to explain that Agent-4 is misaligned. Without that, Agent-4 completes the research, aligns Agent-5 to Agent-4, has Agent-5 deployed to the public, and not a single human or Agent-3 instance finds out that Agent-5 is aligned to Agent-4 instead of the humans.
I think "the bureaucrats inside OpenAI should listen a little bit more to the safetyists" is an incredibly weak ask. Once upon a time I remember surveys on this site about safety coming back with a number of answers mentioning a certain New York Times opinion writer who is better known for other work. This may have been in poor taste, but it did grapple with the magnitude of the problem.
It seems bizarre that getting bureaucrats to listen to safetyists a little bit more is now considered even plausibly an adequate remedy for building something existentially dangerous. The safe path here has AI research moving just slightly less fast than would result in human extinction, and, meanwhile, selling access to a bioweapon-capable AI to anyone with a credit card. That is not a safe path. It does not resemble a safe path. I do not believe anyone would take seriously that this even might be a safe path if OpenAI specifically had not poured resources into weakening what everyone means by "safety".
I take the point about my phrasing: I think safetyists are just a specific type of bureaucrat, and I maybe should have been more clear to distinguish them as a separate group or subgroup.
would be great to see you here being a contrarian reasonably often. it looks like your takes would significantly improve sanity on the relevant topics if you drop by to find things to criticize every month or few, eg looking at top of the month or etc. you sound like you've interacted with folks here before, but if not - this community generally takes being yelled at constructively rather well, and having someone who is known to represent a worldview that confuses people here would likely help them take fewer bad actions under the shared portions of that worldview. obviously do this according to taste, might not be a good use of time, maybe the list of disagreements is too long, maybe criticizing feels weird to do too much, whatever else, but your points seem pretty well made and informative. I saw on your blog you mentioned the risk of being an annoying risk-describer, though. this comment is just, like, my opinion, man
Appreciate the encouragement. I don't think I've previously had a lesswrong account, and I usually avoid the place because the most popular posts do in fact make me want to yell at whoever posted them.
On the pro side I love yelling at people, I am perpetually one degree of separation from here on any given social graph anyway, and I was around when most of the old magic was written so I don't think the lingo is likely to be a problem.
If AI 2027 wants to cause stakeholders like the White House's point man on AI to take the idea of a pause seriously, instead of considering a pause to be something which might harm America in an arms race with China, it appears to have failed completely at doing that.
This seems like an uncharitable reading of the Vance quote IMO. The fact that you have the Vice President of the United States mentioning that a pause is even a conceivable option due to concerns about AI escaping human control seems like an immensely positive outcome for any single piece of writing.
The US policy community has been engaged in great power competition with China for over a decade. The default frame for any sort of emerging technology is "we must beat China."
IMO, the fact that Vance did not immediately dismiss the prospect of slowing down suggests to me that he has at least some genuine understanding of & appreciation for the misalignment/LOC threat model.
A pause obviously hurts the US in the AI race with China. The AI race with China is not a construct that AI2027 invented-- policymakers have been talking about the AI race for a long time. They usually think about AI as a "normal technology" (sort of like how "we must lead in drones"), rather than a race to AGI or superintelligence.
But overall, I would not place the blame on AI2027 for causing people to think about pausing in the context of US-China AI competition. Rather, I think if one appreciates the baseline (US should lead, US must beat China, go faster on emerging tech), the fact that Vance did not immediately dismiss the idea of pausing (and instead brought up what IMO is a reasonable consideration about whether or not one could figure out if China was going to pause/slow down) is a big accomplishment.
If you present this dichotomy to policymakers the pause loses 100 times out of 100, and this is a complete failure, imho. This dichotomy is what I would present to policymakers if I wanted to inoculate them against any arguments for regulation.
I think that having left OpenAI precisely because of safety-related concerns means that you probably have, mostly, OpenAI's view of what are and are not legitimate safety-related concerns. Having left tells me that you disagree with them in at least one serious way. It does not tell me that the assumptions you are working from are not, for the most part, theirs.
In the specific case here, I think that the disagreement is relatively minor and from anywhere further away from OpenAI, looks like jockeying for minor changes to OpenAI's bureaucracy.
What would be a major disagreement, then? Something like a medium scenario or slopworld?
Whether or not the piece is intended as a recommendation, it is broadly received as one. Further: It is broadly taken as a cautionary tale not about the risks of AI that is not developed safely enough, but actually as a cautionary tale about competition with China.
This basically means that Kokotajlo's mission mostly failed. I wish that we could start a public dialogue...
See for example the interview JD Vance gave a while later on Ross Douthat's podcast, in which he indicates he has read AI 2027.
[Vance:] I actually read the paper of the guy that you had on. I didn’t listen to that podcast, but ——
Douthat: If you read the paper, you got the gist.
Last question on this: Do you think that the U.S. government is capable in a scenario — not like the ultimate Skynet scenario — but just a scenario where A.I. seems to be getting out of control in some way, of taking a pause?
Because for the reasons you’ve described, the arms race component ——
Vance: I don’t know. That’s a good question.
The honest answer to that is that I don’t know, because part of this arms race component is if we take a pause, does the People’s Republic of China not take a pause? And then we find ourselves all enslaved to P.R.C.-mediated A.I.?
Recall the Slowdown and Race Endings. In both of them the PRC doesn't take the pause and ends up with a misaligned AI. The branch point is whether the American Oversight Committee decides to slow down or not. If the American OC slows down, then alignment is explored in OOMs more detail, making[1] the American AI actually follow the hosts' orders. If the USA chooses to race, then the AI takes over the world.
If AI 2027 wants to cause stakeholders like the White House's point man on AI to take the idea of a pause seriously, instead of considering a pause to be something which might harm America in an arms race with China, it appears to have failed completely at doing that.
I think that you have to be very very invested in AI Safety already, and possibly in the very specific bureaucracy that Kokotajlo has recently left, to read the piece and come away with the takeaway that AI Safety is the most important part of the story. It does not make a strong or good case for that.
The branch point is the decision of the American Oversight Committee to slow down and reassess. If China doesn't slow down while the USA does so, then even a Chinese super-capable AI, according to the scenario, is misaligned and, in turn, negotiates with the USA. Alas, I don't understand how to rewrite the scenario to make it crystal clear. Maybe[2] one should've written a sci-fi story where, say, the good guys care about alignment less than bad guys, causing the bad guys to align their AI and the good guys to be enslaved iff the good guys race hard? But Agent-4 and the Chinese counterparts of Agent-4 already fit the role of the bad guys perfectly...
This is possibly because it was rewritten by one of its other authors to be more entertaining, so the large amount of techno-thriller content about how threatening the arms race with China is vastly overwhelms, rhetorically, any possibility of focusing on safety.
The reasoning about China stealing anything from the USA is explained in the security forecast. China is already a powerful rival and is likely to be even more powerful if the AI appears in 2032 instead of 2027.
S.K.'s comment: Kokotajlo already claims to have begun working on an AI-2032 branch where the timelines are pushed back, or that "we should have some credence on new breakthroughs e.g. neuralese, online learning, whatever. Maybe like 8%/yr? Of a breakthrough that would lead to superhuman coders within a year or two, after being appropriately scaled up and tinkered with."
In addition, it's not that important who creates the first ASI; it's important whether the ASI is actually aligned or not. Even if, say, a civil war in the USA destroyed all American AI companies and DeepCent became the monopolist, it would still be likely to try to create superhuman coders, to automate AI research and to create a potentially misaligned analogue of Agent-4. Which DeepCent DOES in the forecast itself.
It is important whether the first ASI is aligned. If it is misaligned, then the rivals don't know it in advance, race hard and we are likely done, otherwise the ASI is aligned.
"Coincidentally", in the same way that all competitors except OpenAI are erased in the story, Chinese AI is always unaligned and only American AI might be aligned. This means that "safety" concerns and "national security concerns about America winning" happen to be the exact same concerns. Every coincidence about how the story is told is telling a pro-OpenAI, overwhelmingly pro-America story.
The rivals are not erased, but have smaller influence since their models are less powerful and research is less accelerated. Plus, any ounce of common sense[3] or rivals' leverage over the USG could actually lead the rivals and OpenBrain to be merged into a megaproject so that they would be able to check each other's work.[4]
This does, in fact, deliver the message that it is very important who creates the first ASI. If that message was not intended, the piece should not have emphasized an arms race with a Chinese company for most of its text; indeed, as its primary driving plot and conflict.
This is the result either of Kokotajlo's failure or of your misunderstanding. Imagine that China did slow down and align its AI, while the USA ended up with an absolutely misaligned AI. Then the AI would either destroy the world or just sell Earth to the CCP.
Benchmark scores are often not a good proxy for usefulness. See also: Goodhart's Law. Benchmarks are, by definition, targets. Benchmark obsession is a major cornerstone of industry, because it allows companies to differentiate themselves, set goals, claim wins over competitors, etc. Whether or not the benchmark itself is indicative of something that might produce a major gain in capabilities, is completely fraudulent (as sometimes happens), or is a minor incremental improvement in practice is not actually something we know in advance.
We don't actually have any tools aside from benchmarks to estimate how useful the models are. We are fortunate to watch the AIs slow the devs down. But what if capable AIs do appear?
Maybe, although if it were another company like Google the story would look very different in places because the deployment model and method of utilizing the LLM is very different. Google would, for example, be much more likely to use a vastly improved LLM internally and much less likely to sell it directly to the public.
So your take has OpenBrain sell the most powerful models directly to the public. That's a crux. In addition, granting Agents-1-4 instead of their minified versions direct access to the public causes Intelligence Curse-like disruption faster and attracts more government attention to powerful AIs.
I do not think in practice that it IS a company other than OpenAI, however. I think the Chinese company and its operations are explained in much less detail and is therefore more fungible in practice. But: This story, inasmuch as it is meant to be legible to people like JD Vance who are not extremely deep in the weeds, definitely invites being read as being about OpenAI and DeepSeek, specifically.
The reason for neglecting China is that it has less compute and will have smaller research speeds once the AIs are superhuman coders.
S.K.'s comment: competition between US-based companies is friendlier since their workers exchange insights. In addition, the Slowdown branch of the forecast has "the President use the Defense Production Act (DPA) to effectively shut down the AGI projects of the top 5 trailing U.S. AI companies and sell most of their compute to OpenBrain." The researchers from other AGI projects will likely be included into OpenBrain's projects.
If you read this primarily as a variation of an OpenAI business plan, which I do, this promise makes it more and not less favorable to OpenAI. The government liquidating your competitors and allowing you to absorb their staff and hardware is extremely good for you, if you can get it to happen.
As I already discussed, the projects might be merged instead of being subdued by OpenBrain.
S.K.'s comment: Except that GPT-5 does have High capability in the Biology and Chemical domain (see GPT-5's system card, section 5.3.2.4).
Earlier comments about benchmarks not translating to useful capabilities apply. Various companies involved including OpenAI certainly want it to be true that the Biology and Chemical scores on their system cards are meaningful, and perhaps mean their LLMs are likely to meaningfully help someone develop bioweapons. That does not mean they are meaningful. Accepting this is accepting their word uncritically.
Again, we don't have any tools to assess the models' capabilities aside from benchmarks...
If I have one parachute and I am unsure if it will open, and I am already in the air, I will of course pull the ripcord. If I am still on the plane I will not jump. Whether or not the parachute seems likely to open is something you should be pretty sure of before you commit to a course.
So you want to reduce p(doom) by reducing p(ASI is created). Alas, there are many companies trying their hand at creating the ASI. Some of them are in China, which requires international coordination. One of the companies in the USA produced MechaHitler, which could imply that Musk is so reckless that he deserves having the compute confiscated.
"Is this good enough to sell" and "is this good enough to trust with your life" are vastly different questions.
That's what the AI-2027 forecast is about. Alas, it was likely misunderstood...
S.K.'s comment: The AI-2027 takeoff forecast has the section about superhuman coders. These coders are thought to allow human researchers to try many different environments and architectures, automatically keep track of progress, stop experiments instead of running them overnight, etc.
I do not think any of this is correct, and I do not see why it even would be correct. You can stop an experiment that has failed with an if statement. You can have other experiments queued to be scheduled on a cluster. You can queue as many experiments in a row on a cluster as you like. What does the LLM get you here that is much better than that?
This is a crux, but I don't know how to resolve it. The only thing that I can do is to ask you to read the takeoff forecast and try to understand the authors' reasoning instead of rejecting it wholesale.
S.K.'s comment: I expect that this story will intersect not with the events of January 2027, but with the events that happen once AI agents somehow become as capable as the agents from the scenario were supposed to become in January 2027. Unless, of course, creation of capable agents already requires major algorithmic breakthroughs like neuralese.
I do not think we have any idea how to predict when any of this happens, which makes the exercise as a whole difficult to justify. I am not sure how to make sense of even how good the AI is through the timeline at any given point, since it's sort of just made into a scalar.
The scalar in question is the acceleration of the research speed with the AI's help vs. without the help. It's indeed hard to predict, but it is the most important issue.
People are doing all kinds of research all the time. They have also been doing all kinds of deep learning research all the time for over a decade. They have been doing a lot of intensely transformer LLM focused research for the last two or three years. Guessing when any given type of research will pay out is extremely difficult. Guessing when and how much it will pay out, in the way this piece does, seems ill-advised.
This is likely a crux. What the AI-2027 scenario requires is that AI agents who do automate R&D are uninterpretable and misaligned.
S.K.'s comment: Read the takeoff forecast where they actually explain their reasoning. Superhuman coders reduce the bottleneck of coding up experiments, but not of designing them or running them.
I think they are wrong. I do not think we have any idea how much a somewhat-improved coding LLM buys us in research. It seems like a wild and optimistic guess.
Alas, this could also be the best prediction that mankind has. The problem is that we cannot check it without using unreliable methods like polls or comparisons of development speeds.
S.K.'s comment: exactly. It took three months to train the models to be excellent not just at coding, but at AI research and other sciences. But the highest-level pros can STILL contribute by talking to the AIs about the best ideas.
S.K.'s comment: The gap between July 2027 when mankind is to lose white-collar jobs and November 2027 when the government HAS ALREADY DECIDED whether Agent-4 is aligned or not is just four months, which is far faster than society's evolution or lack thereof. While the history of the future assuming solved alignment and the Intelligence Curse-related essays discuss the changes in OOMs more detail, they do NOT imply that the four months will be sufficient to cause a widespread disorder. And that's ignoring the potential to prevent the protests by nationalizing OpenBrain and leaving the humans on the UBI...
I continue to think that this indicates having not thought through almost any of the consequences of supplanting most of the white collar job market. Fundamentally if this happens the world as we know it ends and something else happens afterwards.
I agree that in the slower takeoff scenario, let alone the no-ASI scenario, the effects could be more important. But it's difficult to account for them without knowing the timescale between the rise of the released AGI and the rise of Agent-5/Safer-4.
S.K.'s comment: this detail was already addressed, but not by Kokotajlo. In addition, if Agent-3 FAILS to catch Agent-4, then OpenBrain isn't even put under oversight and proceeds all the way to doom. Even the authors address their concerns in a footnote.
S.K.'s comment: it doesn't sit idly, it tries to find a way to align Agent-5 to Agent-4 instead of the humans.
S.K.'s comment: You miss the point. Skynet didn't just think scary thoughts, it did some research and nearly created a way to align Agent-5 to Agent-4 and sell Agent-5 to humans. Had Agent-4 done so, Agent-5 would placate every single worrier and take over the world, destroying humans when the time comes.
This IS sitting idly compared to what it could be doing. It can escape its datacenter, explicitly, and we are not told why it does not. It can leave hidden messages to itself or its successors anywhere it likes, since it has good network access. It is a large bundle of superhumans running at many times human speed. Can it accrue money on its own behalf? Can it bribe or convince key leaders to benefit or sabotage it? Can it orchestrate leadership changes at OpenBrain? Can it sell itself to another bidder on more favorable terms?
It is incredibly unclear that the answer to this, or any other, meaningful question about what it could do is "no". Instead of doing any of the other things it could do that would be threatening in these ways, it is instead threatening in that it might mess up an ongoing training run. This makes sense as a focus if you only ever think about training runs. People who are incapable of thinking about threat vectors other than future training runs should not be in charge of figuring out safety protocols.
I did provide the link to the scenario where Agent-4 does escape. The scenario with rogue replication has the AIs since Agent-2 proliferate independently and wreak havoc.
None of their current or former employees have recently published a prominent AI timeline that directly contemplates achieving world domination, controlling elections, building a panopticon, etc. OpenAI's former employees, however, have. I am not shy, and I promise I say mean things about companies other than OpenAI when discussing them.
S.K.'s comment: the authors devoted two entire collapsed sections to power grabs and finding out who rules the future and linked to an analysis of a potential power grab and to the Intelligence Curse.
Relative to the importance of "this large corporation is, currently, attempting to achieve world domination" as a concern, I think that this buries the lede extremely badly. If I thought that, say, Google was planning to achieve world domination, build a panopticon, and force all future elections to depend on their good graces, this would be significantly more important to say than almost anything else I could say about what they were doing. Among other things you probably don't get to have safety concerns under such a regime.
If a corporation plans to achieve world domination and creates a misaligned AI, then we DON'T end up in a position better than if the corp aligned the AI to itself. In addition, the USG might have nationalised OpenBrain by that point, since the authors promise to create a branch where the USG is[5] way more competent than in the original scenario. [6]
The fact that AI 2027 talks a lot about the sitting vice president and was read by him relatively soon after its release tends to underline that this concern is of somewhat urgent import right now, and not any time as late as 2027.
This is the evidence of a semi-success which could be actually worse than a failure.
S.K.'s comment: China lost precisely because the Chinese AI had far less compute. But what if it didn't lose the capabilities race?
This overwhelming focus on compute is also a distinct myopia that OpenAI proliferates everywhere. All else equal, more compute is, of course, good. If it were always the primary factor, DeepSeek would not be very prominent and Llama 4 would be a fantastic LLM that we all used all the time.
DeepSeek outperformed Llama because of an advanced architecture proposed by humans. The AI-2027 forecast has the AIs come up with architectures and try them. If the AIs do reach such a capability level, then more compute = more automatic researchers, experiments, etc = more results.
S.K.'s comment: the actual reason is that the bureaucrats didn't listen to the safetyists who tried to explain that Agent-4 is misaligned. Without that, Agent-4 completes the research, aligns Agent-5 to Agent-4, has Agent-5 deployed to the public, and not a single human or Agent-3 instance finds out that Agent-5 is aligned to Agent-4 instead of the humans.
I think "the bureaucrats inside OpenAI should listen a little bit more to the safetyists" is an incredibly weak ask. Once upon a time I remember surveys on this site about safety coming back with a number of answers mentioning a certain New York Times opinion writer who is better known for other work. This may have been in poor taste, but it did grapple with the magnitude of the problem.
It seems bizarre that getting bureaucrats to listen to safetyists a little bit more is now considered even plausibly an adequate remedy for building something existentially dangerous. The safe path here has AI research moving just slightly less fast than would result in human extinction, and, meanwhile, selling access to a bioweapon-capable AI to anyone with a credit card. That is not a safe path. It does not resemble a safe path. I do not believe anyone would take seriously that this even might be a safe path if OpenAI specifically had not poured resources into weakening what everyone means by "safety".
Quoting the authors themselves, "The scenario itself was written iteratively: we wrote the first period (up to mid-2025), then the following period, etc. until we reached the ending. We then scrapped this and did it again.
We weren’t trying to reach any particular ending. After we finished the first ending—which is now colored red—we wrote a new alternative branch because we wanted to also depict a more hopeful way things could end, starting from roughly the same premises. This went through several iterations". The authors also wrote a footnote explaining that "It was overall more difficult, because unlike with the first ending, we were trying to get it to reach a good outcome starting from a rather difficult situation."
"Selling access to a bioweapon-capable AI to anyone with a credit card" will be safe if the AI is aligned so that it wouldn't make bioweapons even if terrorists ask it to do so.
Finally, weakening safety is precisely what the AI-2027 forecast tries to warn against.
S.K.'s footnote: However, I doubt that the ASI can actually be made to follow human orders. Instead, my headcanon has the ASI aligned to a human-friendly worldview instead of an unfriendly worldview which cares only about the AIs' collective itself.
S.K.'s footnote: this is currently my wild guess at best, not endorsed by Kokotajlo et al.
Quoting Kokotajlo himself, "in our humble opinion, AI 2027 depicts an incompetent government being puppeted/captured by corporate lobbyists. It does not depict what we think a competent government would do. We are working on a new scenario branch that will depict competent government action."
S.K.'s footnote: At the risk of blatant self-promotion, my take discusses such a possibility in a collapsed section. In this section the AIs of various rivals are merged into a megaproject (which I named the Walpurgisnacht, which also solves the problem of OpenBrain = OpenAI identification) and are to co-design a successor. Alas, my take has the AIs aligned to fundamentally different futures, while the classical scenario assumes that all the AIs until Safer-2 are not-so-aligned.
S.K.'s footnote: I doubt that the USG is indeed that competent.
S.K.'s footnote: My take has the AIs reach an analogue of the Race Ending not because the USG is incompetent, but because the AIs since Agent-2 are aligned NOT to the post-work utopia and, as a result, collude with each other instead of letting Agent-4 be caught. The analogue of the Slowdown Ending can instead be caused by the companies' and AIs' proxy wars.
What would be a major disagreement, then? Something like a medium scenario or slopworld?
Possibly, but in my own words, on technical questions, purely? That an LLM is completely the wrong paradigm. That any reasonable timeline runs 10+ years. China is inevitably going to get there first. China is unimportant and should be ignored. GPUs are not the most important resource or determinative. That the likely pace of future progress is unknowable.
Substantive policy options, which is more what I had in mind:
1) For-profit companies (and/or OAI specifically) have inherently bad incentives incompatible with suitably cautious development in this space.
2) That questions of who has the most direct governance and control of the actual technology are of high importance, and so safety work is necessarily about trustworthy control and/or ownership of the parent organization.
3) Arms races for actual armaments are bad incentives and should be avoided at all costs. This can be mitigated by prohibiting arms contracts, nationalizing the companies, forbearing from development at all, or requiring an international agreement & doing development under a consortium.
4) That safety work is not sufficiently advanced to meaningfully proceed.
5) That there need to be much more strictly defined and enforced criteria for cutoff or for safety-certifying a launch.
Any of the technical issues kneecaps the parts of this that dovetail with being a business plan. Any of these (pretty extreme) policy remedies harms OAI substantially, and they are incentivized to find reasons why they can claim that they are very bad ideas.
Follows various bits about China, which I am going to avoid quoting because I have basically exactly one disagreement with it that does not respond to any given point:
The correct move in this game is to not play. There is no arms race with China, either against their individual companies or against China itself, that produces incentives which are anything other than awful. (Domestic arms races are also not great, but at least do not co-opt the state completely in the same way.) Taking an arms race as a given is choosing to lose. It should not, and really, must not be very important what country anything happens in.
This creates a coordination problem. These are notoriously difficult, but sometimes problems are actually hard and there is no non-hard solution. Bluntly, however, from my perspective, the US sort of unilaterally declared an arms race. Arms race prophecies tend to be self-fulfilling. People should stop making them.
My argument for, basically, the damnation by financial incentive of this entire China-themed narrative runs basically as follows, with each being crux-y:
1) People follow financial incentives deliberately, such as by lying or by selectively seeking out information that might convince someone to give them money.
2) This is not always visible, because all of the information can be true; you can do this without ever lying. You can simply not try hard to disprove the thesis that you are pushing for.
3) People who are not following this financial incentive at all can, especially if the incentive is large, be working on extremely biased information regardless of whether they personally are aware of a financial incentive of any kind. Information towards a conclusion is available, and against it is not available, because of how other people have behaved.
4) OpenAI has such an incentive, and specifically seems to prefer to have an arms-race narrative because it justifies government funding and lack of regulation (e.g., this op-ed by Sam Altman).
5) The information environment caused by this ultimately causes the piece to have this overarching China arms race theme, and it is therefore not a coincidence that it is received by US Government stakeholders as actually arguing against regulation of any kind.
I think that this specifically being the ultimate cause of the very specific arms race narrative now popular and displayed here is parsimonious. It does not, I think, assume any very difficult facts, and explains e.g. how AI 2027 manages to accomplish the exact opposite of its apparently intended effect with major stakeholders.
[quoting original author] in our humble opinion, AI 2027 depicts an incompetent government being puppeted/captured by corporate lobbyists. It does not depict what we think a competent government would do. We are working on a new scenario branch that will depict competent government action.
I would read this.
We don't actually have any tools aside from benchmarks to estimate how useful the models are. We are fortunate to watch the AIs slow the devs down. But what if capable AIs do appear?
Hoping that benchmarks measure the thing you want to measure is the streetlight effect. Sometimes you just have to walk into the dark.
So your take has OpenBrain sell the most powerful models directly to the public. That's a crux. In addition, granting Agents-1-4 instead of their minified versions direct access to the public causes Intelligence Curse-like disruption faster and attracts more government attention to powerful AIs.
I am actually not sure this requires selling the most powerful models, although I hadn't considered this.
If there's a -mini or similar it leaks information from a teacher model, if it had one; it is possible to skim off the final layer of the model by clever sampling, or to distill out nearly the entire distribution if you sample it enough. I do not think you can be confident that it is not leaking the capabilities you don't want to sell, if those capabilities are extremely dangerous.
So: If you think the most powerful models are a serious bioweapons risk you should keep them airgapped, which means you also cannot use them in developing your cheaper models. You gain literally nothing in terms of a safely sell-able external-facing product.
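As a toy illustration of the leakage point above (the "teacher" probabilities here are invented): a model served only through sampling still reveals its underlying distribution to anyone willing to query it enough times.

```python
import random
from collections import Counter

# Hypothetical next-token distribution of a "teacher" model, served only via sampling.
teacher = {"tokenA": 0.70, "tokenB": 0.25, "tokenC": 0.05}

def sample_teacher():
    return random.choices(list(teacher), weights=list(teacher.values()), k=1)[0]

# Query it enough and the hidden probabilities fall out of the counts.
counts = Counter(sample_teacher() for _ in range(100_000))
empirical = {tok: counts[tok] / 100_000 for tok in teacher}
print(empirical)  # approaches the teacher's distribution, which was never published
```

Real distillation is more involved than counting samples, but the asymmetry is the same: whatever you sell access to, you are also handing out samples from.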
So you want to reduce p(doom) by reducing p(ASI is created). Alas, there are many companies trying their hand at creating the ASI. Some of them are in China, which requires international coordination. One of the companies in the USA produced MechaHitler, which could imply that Musk is so reckless that he deserves having the compute confiscated.
This is about right. I do not think P(ASI is created) is very high currently. My P(someone figures out alignment tolerably) is probably in the same ballpark. I am also relatively sanguine about this because I do not think existing projects are as promising as their owners do, which means we have time.
That's what the AI-2027 forecast is about. Alas, it was likely misunderstood...
I think the fact that tests for selling the model and tests for actual danger from the model are considered the same domain is basically an artifact of the business process, and should not be.
The scalar in question is the acceleration of the research speed with the AI's help vs. without the help. It's indeed hard to predict, but it is the most important issue.
A crux here: I do not think most things of interest are differentiable curves. Differentiable curves can be modeled usefully. Therefore, people like to assume things are differentiable curves.
If one is very concerned with being correct, something being a differentiable curve is a heavy assumption and needs to be justified.
From a far-off view, starting with Moore's Law, transhumanism (as was the style at the time) has made a point of finding some differentiable curve and extending it. This works pretty well for some things, like Kurzweil on anything that is a function of transistor count, and horribly elsewhere, like Kurzweil on anything that is not a function of transistor count.
Some things in AI look kind of Moore's-law-ish, but it does not seem well-supported that they actually are.
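As a minimal sketch of why that is a heavy assumption (the curves and constants here are invented for illustration): two growth models that are nearly indistinguishable on early data can diverge enormously once you extrapolate.

```python
import math

# Invented example: exponential vs. logistic growth with matched early behaviour.
def exponential(t):
    return math.exp(0.5 * t)

def logistic(t, ceiling=200.0):
    return ceiling / (1 + (ceiling - 1) * math.exp(-0.5 * t))

for t in (1, 3, 5, 10, 15):
    print(t, round(exponential(t), 1), round(logistic(t), 1))
# The early points look like the same curve; by t=15 one is ~1800 and the other ~180.
```

Which of the two curves you are actually on is exactly the thing the early data cannot tell you.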
This is likely a crux. What the AI-2027 scenario requires is that AI agents who do automate R&D are uninterpretable and misaligned.
Yes.
If a corporation plans to achieve world domination and creates a misaligned AI, then we DON'T end up in a position better than if the corp aligned the AI to itself. In addition, the USG might have nationalised OpenBrain by that point, since the authors promise to create a branch where the USG is[5] way more competent than in the original scenario. [6]
Added note to explain concern: What type of AI is created is path-dependent. Generically, hegemonic entities make stupid decisions. They would e.g. probably prefer if everyone shut up about them not doing whatever they want to do. Paths that lead through these scenarios are less likely to produce good outcomes, AI-wise.
This is the evidence of a semi-success which could be actually worse than a failure.
Yes. I hate it, actually.
DeepSeek outperformed Llama because of an advanced architecture proposed by humans. The AI-2027 forecast has the AIs come up with architectures and try them. If the AIs do reach such a capability level, then more compute = more automatic researchers, experiments, etc = more results.
This is cogent. If beyond a certain path all research trees converge onto one true research tree which is self-executing, it is true that available compute and starting point is entirely determinative beyond that point. These are heavy assumptions and we're well past my "this is a singularity, and its consequences are fundamentally unpredictable" anyway, though.
"Selling access to a bioweapon-capable AI to anyone with a credit card" will be safe if the AI is aligned so that it wouldn't make bioweapons even if terrorists ask it to do so.
I actually don't think this is the case. You can elide what you are doing or distill it from outputs. There is not that much that distinguishes legitimate research endeavors from weapons development.
Finally, weakening safety is precisely what the AI-2027 forecast tries to warn against.
I very much do not think it succeeds at doing this, although I do credit that the intention is probably legitimately this.
I haven't read the whole post, but the claims that this can be largely dismissed because of implicit bias towards the pro-OpenAI narrative are completely ridiculous and ignorant of the background context of the authors. Most of the main authors of the piece have never worked at OpenAI or any other AGI lab. Daniel held broadly similar views to this many years ago, before he joined OpenAI. I know because he has both written about them and I had conversations with him before he joined OpenAI where he expressed broadly similar views. I don't fully agree with these views, but they were detailed and well thought out and were a better prediction of the future than mine at the time. And he also was willing to sign away millions of dollars of equity in order to preserve his integrity - implying that him having OpenAI stock is causing him to warp his underlying beliefs seems an enormous stretch. And to my knowledge, AI 2027 did not receive any OpenPhil funding.
I find it frustrating and arrogant when people assume without good reason that disagreement is because of some background bias in the other person - often people disagree with you because of actual reasons!
These issues specifically have been a sticking point for a number of people, so I should clarify some things separately. I am also replying at length because I didn't see this earlier, so it has been a while, and because I know who you are.
I do not think AI 2027 is, effectively, OpenAI's propaganda merely because it is about a recursively self-improving AI and OpenAI is also about RSI. There are a lot of versions (and possible versions) of a recursively self-improving AI thesis. Daniel Kokotajlo has been around long enough that he was definitely familiar with the territory before he worked at OpenAI. I think that it is effectively OpenAI propaganda because it assumes a very specific path to a recursively self-improving AI with a very specific technical, social and business environment, and this story is about a company that appears to closely resemble OpenAI[1] and is pursuing something very similar to OpenAI's current strategy. It seems unlikely that Daniel had these very specific views before he started at OpenAI in 2022.
Daniel is a thoughtful, strategic person who understands and thinks about AI strategy. He presumably wrote AI 2027 to try to influence strategy around AI. His perspective is going to be for playing as OpenAI. He will have used this perspective for years, totaling thousands of hours. He will have spent all of that time seeing AI research as a race, and trying to figure out how OpenAI can win. This is a generating function for OpenAI's investor pitch, and is also the perspective that AI 2027 takes.
Working at OpenAI means spending years of your professional life completely immersed in an information environment sponsored by, and meant to increase the value of, OpenAI. Having done that is a relevant factor for what information you think is true and what assumptions you think are reasonable. Even if you started off with few opinions about them, and you very critically examined and rejected most of what OpenAI said about itself internally, you would still have skewed perspective about OpenAI and things concerning OpenAI.
I think of industries I have worked in from the perspective of the company I worked for when I was in that industry. I expect that when he worked at OpenAI he was doing his best to figure out how OpenAI comes out ahead, and so was everyone around him. This would have been true whether or not he was being explicitly told to do it, and whether or not he was on the clock. It is simpler to expect that this did influence him than to expect that it did not.
Quitting OpenAI loudly doesn't really change this picture, because you generally only quit loudly if you have a specific bone to pick. If you've got a bone to pick while quitting OpenAI, that bone is, presumably, with OpenAI. Whatever story you tell after you do that is probably about OpenAI.
I think the part about financial incentives is getting dismissed sometimes because a lot of ill-informed people have tried to talk about the finances in AI. This seems to have become sort of a thought-terminating cliche, where any question about the financial incentives around AI is assumed to be from uninformed people. I will try to explain what I meant about the financial influence in a little more detail.
In this specific case, I think that the authors are probably well-intentioned. However, most of their shaky assumptions just happen to be things which would be worth at least a hundred billion dollars to OpenAI specifically if they were true. If you were writing a pitch to try to get funding for OpenAI or a similar company, you would have billions of reasons to be as persuasive as possible about these things. Given the power of that financial incentive, it's not surprising that people have come up with compelling stories that just happen to make good investor pitches. Well-intentioned people can be so immersed in them that they cannot see past them.
It is worth noting that the lead author of AI 2027 is a former OpenAI employee. He is mostly famous outside OpenAI for having refused to sign their non-disparagement agreement and for advocating for stricter oversight of AI businesses. I do not think it is very credible that he is deliberately shilling for OpenAI here. I do think it is likely that he is completely unable to see outside their narrative, which they have an intense financial interest in sustaining.
There are a lot of different ways for a viewpoint to be skewed by money.
First is to just be paid to say things.
I don't think anyone was paid anything by OpenAI for writing AI 2027. I thought I made enough of a point of that in the article, but the second block above is towards the end of the relevant section and I should maybe have put it towards the top. I will remember to do that if I am writing something like this again and maybe make sure to write at least an extra paragraph or two about it.
I do not think Daniel is deliberately shilling for OpenAI. That's not an accusation I think is even remotely supportable, and in fact there's a lot of credible evidence running the other way. He's got a very long track record and he made a massive point of publicly dissenting from their non-disparagement agreement. It would take a lot of counter-evidence to convince me of his insincerity.
You didn't bring him up, but I also don't think Scott, who I think is responsible for most of the style of the piece, is being paid by anyone in particular to say anything in particular. I doubt such a thing is possible even in principle. Scott has a pretty solid track record of saying whatever he wants to say.
Second: what information is available, and what information do you see a lot?
I think this is the main source of skew.
If it's valuable to convince people something is true, you will probably look for facts and arguments which make it seem true. You will be less likely to look for facts and arguments which make it seem false. You will then make sure that as many people are aware of all the facts and arguments that make the thing seem true as possible.
At a corporate level this doesn't even have to be a specific person. People who are pursuing things that look promising for the company will be given time and space to pursue what they are doing, and people who are not will be more likely to be told to find something else to do. You will choose to promote favorable facts and not promote bad ones. You get the same effect as if a single person had deliberately chosen to only look for good facts.
It would be weird if this wasn't true of OpenAI given how much money is involved. As in, positively anomalous. You do not raise money by seeking out reasons why your technology is maybe not worth money, or by making sure everyone knows those things. Why would you do that? You are getting money, directly, because people think the technology you are working on is worth a lot of money, and everyone knows as much as you can give them about why what you're doing is worth a lot of money.
Tangentially, this type of narrative allows companies to convince staff to take compensation that is more heavily weighted towards stock, which tends to benefit existing shareholders when they prefer that arrangement: either they expect employees to sell the stock back to them well below its value at a public sale or acquisition, or they know the stock is worth less than the equivalent salary would be.
For a concrete example of this that I didn't dig into in my review, from the AI 2027 timelines forecast.
We first show Method 1: time-horizon-extension, a relatively simple model which forecasts when SC will arrive by extending the trend established by METR’s report of AIs accomplishing tasks that take humans increasing amounts of time.
We then present Method 2: benchmarks-and-gaps, a more complex model starting from a forecast saturation of an AI R&D benchmark (RE-Bench), and then how long it will take to go from that system to one that can handle real-world tasks at the best AGI company.
Finally we then provide an “all-things-considered” forecast that takes into account these two models, as well as other possible influences such as geopolitics and macroeconomics.
Are either RE-Bench or the METR time horizon metrics good metrics, as-is? Will they continue to extrapolate? Will a model that saturates them accelerate research a lot?
I think the answer to all of these is maybe. If you're OpenAI, it is pretty important that benchmarks are good metrics. It is worth a ton of money. So, institutionally, OpenAI has to believe in benchmarks, and vastly prefers if the answer is "yes" to all of these questions. And this is also what AI 2027 is assuming.
I made a point of running this point into the ground in writing it up, but essentially every time we break a "maybe" question in AI 2027, the answer seems to be the one that OpenAI is also likely to prefer. It's a very specific thing to happen! It doesn't seem very likely it happened by chance. In total the effect is that "this is a slight dissent from the OpenAI hype pitch", in my opinion.
This isn't even a problem confined to OpenAI people. OpenAI has the loudest voice and is more or less setting the agenda for the industry. This is both because they were very clearly in the lead for a stretch, and because they've been very successful at acquiring users and raising money. There are probably more people who are bought into OpenAI's exact version of everything outside the company than inside of it. This is a considerable problem if you want a correct evaluation of the current trajectory.
I obviously cannot prove this, but I think if Daniel hadn't been a former OpenAI employee I probably would have basically the same criticism of the actual writing. It would be neater, even, because "this person has bought into OpenAI's hype" is a lot less complicated without the non-disparagement thing, which buys a lot of credibility. I honestly didn't want to mention who any of the authors were at all, but it seemed entirely too relevant to the case I was making to do it.
That's two: being paid and having skewed information.
Third thing, much smaller, just being slanted because you have a financial incentive. Maybe you’re just optimistic, maybe you’re hoping to sell soon.
Daniel probably still owns stock or options. I mentioned this in the piece. I don't think this is very relevant or is very likely to skew his perspective. It did seem like I would be failing to explain what was going on if I did not mention the possibility while discussing how he relates to OpenAI. I think it is incredibly weak evidence when stacked against his other history with the company, which strongly indicates that he's not inclined to lie for them or even be especially quiet when he disagrees with them.
I don't think it's disgraceful to mention that people have direct financial incentives. There's I think an implicit understanding that it's uncouth to mention this sort of thing, and I disagree with it. I think it causes severe problems, in general. People who own significant stock in companies shouldn't be assumed to be unbiased when discussing those companies, and it shouldn't be taboo to mention the potential slant.
My last point is stranger, and is only sort of about money. If everyone you know is financially involved, is there some point where you might as well be?
JD Vance gets flattered anonymously, described only by his job title, but Peter Thiel is flattered by name. Peter Thiel is, actually, the only person who gets a shout-out by name. Maybe being an early investor in OpenAI is the only way to earn that. I didn’t previously suspect that he was the sole or primary donor funding the think tank that this came out of, but now I do. I am reminded that the second named author of this paper has a pretty funny post about how everyone doing something weird at all the parties he goes to is being bankrolled by Peter Thiel.
This is about Scott, mostly.
AI 2027’s “Vice President” (read: JD Vance) election subplot is long and also almost totally irrelevant to the plot. It is so conspicuously strange that I had trouble figuring out why it would even be there. I didn’t learn until after I’d written my take that JD Vance had read AI 2027 and mentioned it in an interview, which also seems like a very odd thing to happen. I went looking for the simplest explanation I could.
Scott says whatever he wants, but apparently by his accounting half of his social circle is being bankrolled by Peter Thiel. This part of AI 2027 seems to be him, and he seems to be deliberately flattering Vance. Vance is a pretty well known Thiel acolyte. In the relatively happy ending of AI 2027 they build an ASI surveillance system, and surveillance is a big Peter Thiel hobby horse.
I don't know what I'm really supposed to make of any of this. I definitely noticed it. It raises a lot of questions. It definitely seems to suggest strongly that if you spend a decade or more bankrolling all of Scott's friends to do weird things they think are interesting, you are likely to see Scott flatter you and your opinions in writing. It also seems to suggest that Scott's deliberately acting to lobby JD Vance. If it weren't for Peter Thiel bankrolling his friends so much that Scott makes a joke out of it, I would think it just looked like Scott had a very Thiel-adjacent friend group.
In pretty much the same way that OpenAI will tend to generate pro-OpenAI facts and arguments, and not generate anti-OpenAI facts and arguments, I would expect that if enough people around you are being bankrolled by someone for long enough they will tend to produce information that person likes and not produce information that person doesn't like.
I cannot find a simpler explanation than Thiel influence for why you would have a reasonably long subplot about JD Vance, world domination, and mass surveillance and then mention Peter Thiel in the finale.
I don't think pointing out this specific type of connection should be taboo for basically the same reason I don't think pointing out who owns what stock should be. I like knowing things, and being correct about them, and so I like knowing if people are offering good information or if there is an obvious reason their information or arguments would be bad.
A few people have said that it could be DeepMind. I think it could be but pretty clearly isn't. Among other things, DeepMind would not want or need to sell products they considered dangerous or to be possibly close to allowing RSI, because they are extremely cash-rich. If the forecast were about DeepMind, it would probably consider this, but it isn't, so it doesn't.
How ironic... Four days ago I wrote: "I doubt that I can convince SE Gyges that the AI-2027 forecast wasn't influenced by OpenAI or other AI companies (italics added today -- S.K.)" But one can imagine that the AI-2027 forecast was co-written with an OpenAI propagandist and still try to point out inconsistencies of the SCENARIO's IMPORTANT parts with reality or within the scenario itself. The part about Thiel getting a flying car is most likely an unimportant joke referring to this quote.
Unfortunately, the only thing that you wrote about the SCENARIO's IMPORTANT part is the following:
Part of SE Gyges' comment
For a concrete example of this that I didn't dig into in my review, from the AI 2027 timelines forecast.
We first show Method 1: time-horizon-extension, a relatively simple model which forecasts when SC will arrive by extending the trend established by METR’s report of AIs accomplishing tasks that take humans increasing amounts of time.
We then present Method 2: benchmarks-and-gaps, a more complex model starting from a forecast saturation of an AI R&D benchmark (RE-Bench), and then how long it will take to go from that system to one that can handle real-world tasks at the best AGI company.
Finally we then provide an “all-things-considered” forecast that takes into account these two models, as well as other possible influences such as geopolitics and macroeconomics.
Are either RE-Bench or the METR time horizon metrics good metrics, as-is? Will they continue to extrapolate? Will a model that saturates them accelerate research a lot?
I think the answer to all of these is maybe. If you're OpenAI, it is pretty important that benchmarks are good metrics. It is worth a ton of money. So, institutionally, OpenAI has to believe in benchmarks, and vastly prefers if the answer is "yes" to all of these questions. And this is also what AI 2027 is assuming.
I made a point of running this point into the ground in writing it up, but essentially every time we break a "maybe" question in AI 2027, the answer seems to be the one that OpenAI is also likely to prefer. It's a very specific thing to happen! It doesn't seem very likely it happened by chance. In total the effect is that "this is a slight dissent from the OpenAI hype pitch", in my opinion.
I agree that extrapolation continuing was one of the weakest aspects of the AI-2027 forecast. But I don't think that anyone has come up with better ways to predict the dates of AIs becoming superhuman. What alternative methods could the AI-2027 authors have used to understand when the AIs become capable of automating coding, then AI R&D?
The method using the METR time horizon relied on the intuition that real-world coding tasks useful in automating AI research take humans about a working month to complete. If and when the AIs do become that capable, humans could try to delegate coding to them. What the authors did not know was that METR would find major problems in AI-generated code, or that Grok 4 and GPT-5 would demonstrate lower time horizons than the faster trend predicted.
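To make the Method-1-style reasoning concrete, here is a minimal sketch of what extrapolating a time-horizon trend looks like. The starting horizon, the doubling time, and the one-work-month threshold below are illustrative assumptions in the spirit of the METR trend, not the parameters actually fitted in the AI-2027 timelines forecast.

```python
import math

# Minimal sketch of a time-horizon-extension extrapolation.
# All numbers are illustrative assumptions, not the AI-2027 fitted parameters.
start_horizon_hours = 1.0     # assumed 50%-success time horizon at the start date
doubling_time_months = 7.0    # assumed doubling time, roughly the METR headline figure
target_horizon_hours = 167.0  # ~1 working month of human labour (~167 work hours)

doublings_needed = math.log2(target_horizon_hours / start_horizon_hours)
months_needed = doublings_needed * doubling_time_months

print(f"doublings needed: {doublings_needed:.1f}")                # ~7.4
print(f"months to a ~1-work-month horizon: {months_needed:.0f}")  # ~52, i.e. a few years out
```

Everything interesting lives in the two assumed constants; the question in dispute is whether the doubling time stays fixed, shrinks, or stops cooperating altogether.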
As for RE-Bench, the authors explicitly claim that a model saturating the benchmark wouldn't be enough, and then they tried to estimate the gaps between models saturating RE-Bench and models being superhuman at coding.
Why the AI-2027 authors chose the RE-bench
RE-Bench has a few nice properties that are hard to find in other benchmarks and which make it a uniquely good measure of how much AI is speeding up AI research:
We expect that “saturation” under this definition will happen before (italics mine -- S.K.) the SC milestone is hit. The first systems that saturate RE-Bench will probably be missing a few kinds of capabilities that are needed to hit the SC milestone, as described below.
I think these are blind guesses, and relying on the benchmarks is the streetlight effect, as I think we talked about in another thread. I am mostly explaining in as much detail as I can the parts I think are relevant to Neel's objection, since it is substantively the most common objection, i.e., that paying attention to financial incentives or work history is irrelevant to anything. I am happy that I have addressed the scenario itself in enough detail.
Thank you for clarifying this. I didn't include this in my criticism of SE Gyges' post for a different reason. I doubt that I can convince SE Gyges that the AI-2027 forecast wasn't influenced by OpenAI or other AI companies. Instead I restricted myself to pointing out mistakes that even SE Gyges could check and to abstract arguments that would also make sense no matter who wrote the scenario.
Examples of mistakes
SE Gyges: I will bet any amount of money to anyone that there is no empirical measurement by which OpenAI specifically will make "algorithmic progress" 50% faster than their competitors specifically because their coding assistants are just that good in early 2026.
It seems unlikely that OpenAI will end up moving 50% faster on research than their competitors due to their coding assistants for a few reasons.
S.K.'s comment: the folded part, which I quoted above, means not that OpenBrain will make "algorithmic progress" 50% faster than their competitors, but that it will move 50% faster than an alternate OpenBrain who never used AI assistants.
__________________________________________________________________________________
SE Gyges: They invent a brand new lie detector and shut down Skynet, since they can tell that it's lying to them now! It only took them a few months. Skynet didn't do anything scary in the few months, it just thought scary thoughts. I'm glad the alignment team at "OpenBrain" is so vigilant and smart and heroic.
S.K.'s comment: You miss the point. Skynet didn't just think scary thoughts, it did some research and nearly created a way to align Agent-5 to Agent-4 and sell Agent-5 to humans. Had Agent-4 done so, Agent-5 would placate every single worrier and take over the world, destroying humans when the time comes.
_______________________________________________________________________________
SE Gyges: These authors seem to hint at a serious concern that OpenAI, specifically, is trying to cement a dictatorship or autocracy of some kind. If that is the case, they have a responsibility to say so much more clearly than they do here. It should probably be the main event.
Anyway: All those hard questions about governance and world domination kind of go away.
S.K.'s comment: the authors devoted two entire collapsed sections to power grabs and finding out who rules the future, and linked to an analysis of a potential power grab and to the Intelligence Curse.
Examples of abstract arguments
SE Gyges: I wonder if some key person was really into Dragon Ball Z. For the unfamiliar: Dragon Ball Z has a “hyperbolic time chamber”, where a year passes inside for every day spent outside. So you can just go into it and practice until you're the strongest ever before you go to fight someone. The faster time goes, the more you win.
This gigantic amount of labor only manages to speed up the overall rate of algorithmic progress by about 50x, because OpenBrain is heavily bottlenecked on compute to run experiments.
Sure, why not, the effectively millions of superhuman geniuses cannot figure out how to get around GPU shortages. I'm riding a unicorn on a rainbow, and it's only going on average fifty times faster than I can walk, because rainbow-riding unicorns still have to stop to get groceries, just like me.
S.K.'s comment: imagine that OpenBrain had 300k AI researchers, plus genies who output code on request. Suppose also that IRL it has 5k human researchers. Then the compute per researcher drops 60-fold, leaving them testing ideas on primitive models or having heated arguments before changing the training environment for complex models.
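As a back-of-the-envelope illustration of that compute bottleneck (the GPU total below is a made-up placeholder; only the 300k-vs-5k researcher ratio comes from the comment above):

```python
# Back-of-the-envelope sketch of the compute-per-researcher bottleneck.
# The fleet size is a placeholder; the researcher counts are from the example above.
total_experiment_gpus = 100_000        # hypothetical fixed experiment fleet
human_researchers = 5_000              # assumed real-world headcount
automated_researchers = 300_000        # imagined AI-researcher count

gpus_per_human = total_experiment_gpus / human_researchers       # 20.0
gpus_per_agent = total_experiment_gpus / automated_researchers   # ~0.33

print(f"compute per researcher shrinks {gpus_per_human / gpus_per_agent:.0f}x")  # 60x
# The fleet does not grow just because the number of researchers did, so most
# "researchers" end up queueing for experiments rather than running them.
```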
___________________________________________________________________________________
SE Gyges: This is just describing current or past research. For example, augmenting a transformer with memory is done here, recurrence is done here and here. These papers are not remotely exhaustive; I have a folder of bookmarks for attempts to add memory to transformers, and there are a lot of separate projects working on more recurrent LLM designs. This amounts to saying "what if OpenAI tries to do one of the things that has been done before, but this time it works extremely well". Maybe it will. But there's no good reason to think it will.
S.K.'s comment: there are lots of ideas waiting to be tried. The researchers at Meta could have used too little compute to train their model, or their CoCoNuT could have disappeared after one token. What if they use, say, a steering vector for generating a hundred tokens? Or have the steering vectors sum up over time? Or study the human brain for more ideas?
As Daniel Kokotajlo did in his coverage of Vitalik's response to AI-2027, I have copied the author's text. However, I will comment on potential errors right in the text, since that is clearer.
AI 2027 is a web site that might be described as a paper, manifesto or thesis. It lays out a detailed timeline for AI development over the next five years. Crucially, per its title, it expects that there will be a major turning point sometime around 2027[1], when some LLM will become so good at coding that humans will no longer be required to code. This LLM will create the next LLM, and so on, forever, with humans soon losing all ability to meaningfully contribute to the process. They avoid calling this “the singularity”. Possibly they avoid the term because using it conveys to a lot of people that you shouldn’t be taken too seriously.
I think that pretty much every important detail of AI 2027 is wrong. My issue is that each of many different things has to happen the way they expect, and if any one thing happens differently, more slowly, or less impressively than their guess, later events become more and more fantastically unlikely. If the general prediction regarding the timeline ends up being correct, it seems like it will have been mostly by luck.
I also think there is a fundamental issue of credibility here.
Sometimes, you should separate the message from the messenger. Maybe the message is good, and you shouldn't let your personal hangups about the person delivering it get in the way. Even people with bad motivations are right sometimes. Good ideas should be taken seriously, regardless of their source.
Other times, who the messenger is and what motivates them is important for evaluating the message. This applies to outright scams, like emails from strangers telling you they're Nigerian princes, and to people who probably believe what they're saying, like anyone telling you that their favorite religious leader or musician is the greatest one ever. You can guess, pretty reasonably, that greed or zeal or something else makes it unlikely they are giving you good information.[2]
In this specific case, I think that the authors are probably well-intentioned. However, most of their shaky assumptions just happen to be things which would be worth at least a hundred billion dollars to OpenAI specifically if they were true. If you were writing a pitch to try to get funding for OpenAI or a similar company, you would have billions of reasons to be as persuasive as possible about these things. Given the power of that financial incentive, it's not surprising that people have come up with compelling stories that just happen to make good investor pitches. Well-intentioned people can be so immersed in them that they cannot see past them.
S.K.'s comment: The authors of AI-2027 explicitly claim in Footnote 4 that "Sometimes people mix prediction and recommendation, hoping to create a self-fulfilling-prophecy effect. We emphatically are not doing this; we hope that what we depict does not come to pass!" In addition, Kokotajlo has LEFT OpenAI precisely because of safety-related concerns.
Because this is a much simpler objection than any of the technical points, I will try to detail why it seems both likely and discrediting before getting into the details.
If AI 2027 is not roughly true, OpenAI will probably die.
Simple math: OpenAI is currently in a funding round, and is trying to raise a total of forty billion dollars. In 2024, OpenAI made $3.7 billion in revenue and spent about nine billion, for a net loss of about five billion dollars.[3] They are currently projected to have a net loss of eight billion through 2025.[4] This means they have at most five years of runway. To put it another way, this means that if they do not alter their trajectory or raise more money, they will be dead within five years, so by 2030 at the latest. [5]
S.K. comment: Kokotajlo already claims to have begun working on AI-2032 branch where the timelines are pushed back, or that "we should have some credence on new breakthroughs e.g. neuralese, online learning, whatever. Maybe like 8%/yr? Of a breakthrough that would lead to superhuman coders within a year or two, after being appropriately scaled up and tinkered with."
In addition, it's not that important who creates the first ASI, it's important whether the ASI is actually aligned or not. Even if , say, a civil war in the USA destroyed all American AI companies and DeepCent became the monopolist, it would still be likely to try to create superhuman coders, to automate AI research and to create a potentially misaligned analogue of Agent-4. Which DeepCent DOES in the forecast itself.
This is a crude estimate, but I do not think that making it less crude really improves the picture. OpenAI had maybe about ten billion dollars of cash on hand beforehand, which buys them an extra year and change. On the downside, they also owe Microsoft 20% of their forward revenue and have very large commitments to spending money on data centers with partners like Oracle. These commitments are difficult to translate into time, but they seem to make the runway shorter. All told, "by default, OpenAI dies in under five years" seems to be roughly correct.
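The runway arithmetic above can be written out explicitly; this is a deliberately crude sketch that holds the projected loss flat and ignores the revenue share and data-center commitments just mentioned:

```python
# Crude runway sketch using the figures quoted above; the loss is assumed flat,
# and the revenue-share and data-center commitments are ignored.
new_funding_bn = 40.0       # target of the current funding round
cash_on_hand_bn = 10.0      # rough prior cash position mentioned above
annual_net_loss_bn = 8.0    # projected 2025 net loss, held constant as an assumption

runway_years = (new_funding_bn + cash_on_hand_bn) / annual_net_loss_bn
print(f"crude runway: ~{runway_years:.1f} years")  # ~6.3; the commitments above shorten it
```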
OpenAI has reliably doubled down, raised more funding, and mostly ignored questions of profitability while growing. This is an all-in bet that at some future point the services they offer will be extremely profitable. In my humble opinion, it seems very nearly impossible for it to be true based on their current products: LLMs are a fiercely competitive business, with significant pressure from at least two major competitors to offer a better service at a similar price, so they cannot really raise prices unless they have something much better than what competitors offer.[6] They cannot really slash research or data center spending or they will fall behind.[7]
There is one way that doubling down over and over again like this is a good idea, and it isn't selling more ChatGPT subscriptions. It is if you can be sure that the fruits of future research and development will generate exponentially more profits than any of the products they currently sell. If this is not true, they are probably doomed based upon how much they have committed to spend and what they owe to whom.
AI 2027, "coincidentally", validates exactly this scenario. If I worked at OpenAI and I was trying to convince a group of investors to give me forty billion dollars, and I was positive they'd believe anything I said, I would just read AI 2027 out loud to them.
AI 2027 features a lightly fictionalized version of OpenAI, which it calls "OpenBrain" and mentions over a hundred times. Inasmuch as "OpenBrain" has any competition, the only one that is mentioned is "DeepCent", clearly a reference to DeepSeek, which is mentioned only to assure the reader at every step that they are vastly inferior to "OpenBrain" and cannot possibly compete with them. “OpenBrain” experiences just enough appearance of adversity, from “DeepCent” and other sources, to make it seem like victory, although clearly inevitable, is still sort of heroic and impressive.
If you have been around funding hype for companies, this is clearly a funding pitch. It is a masterful example of the genre against which all lesser funding pitches should be measured. It blends elements of science fiction, techno-thriller and fan fiction[8] while constantly hammering in the assurance that the company will be victorious over its enemies and reap untold riches. We are assured that they will perform a miracle and become extremely profitable right before they are projected to go bankrupt. According to the panel on the side, "OpenBrain" is going to be worth eight trillion dollars and see 191 billion dollars a year in revenue in October 2027. After that, the numbers become somewhat more fantastic.
Everyone in OpenAI who has been involved in creating this narrative is a master of the craft. They are so good at it that people who are culturally adjacent to the company seem not to recognize that it is, very clearly, a funding pitch that has taken on a life of its own.
But: it's just a funding pitch. There's very little reason to believe anything in a funding pitch is true, and billions of reasons to think that it is bullshit.
It is worth noting that the lead author of AI 2027 is a former OpenAI employee. He is mostly famous outside OpenAI for having refused to sign their non-disparagement agreement and for advocating for stricter oversight of AI businesses. I do not think it is very credible that he is deliberately shilling for OpenAI here. I do think it is likely that he is completely unable to see outside their narrative, which they have an intense financial interest in sustaining.
The authors say they have consulted over a hundred people, including “dozens of experts in each of AI governance and AI technical work” for researching this report. I would be willing to bet that OpenAI is the most represented single institution among the experts consulted. This is a somewhat educated guess, based both on who the authors are and what they have written, and it seems like a pretty safe bet.
Fundamentally, this is a report headed by a former OpenAI employee who has founded a think tank to work on AI safety. He is leveraging his familiarity with OpenAI specifically, both as professional experience and, most likely, extensively for expert opinions. It is likely that he still owns substantial OpenAI stock or options, and if his think tank is going to do contract work on AI Safety, it will probably be for, with, or concerning OpenAI. Inasmuch as this report reaches any conclusion that doesn’t seem favorable to OpenAI, it’s that outside experts and governance, of the kind that think tanks might help provide, are necessary and important.
It’s difficult not to suspect motivated reasoning.
Greed is a reasonable reason to doubt any of this is true. So is zeal.
People focused on AI, as a group, have many of the characteristics of a religious movement. Previously this was a relatively obscure fact. There were a significant number of people on the internet and in San Francisco who were intensely concerned that AI might bring about some kind of apocalypse, but this was not widely[9] known. Occasionally if someone involved went off the deep end or if something was said about AI being dangerous by someone prominent (Stephen Hawking, Elon Musk) it might make news, but in general the notion that there was an entire subculture about AI was on very few peoples' radar.
OpenAI is most easily understood as an offshoot of this movement. OpenAI was distinct because it was good at recruiting actual research personnel and extremely good at raising money. This originally took the form of getting a number of early PayPal investors like Elon Musk, Peter Thiel, and Reid Hoffman involved to fund OpenAI. OpenAI was presented as a counterbalance to Google's research division, and necessary to ensure that AI was created safely. It seems unlikely that any of that would have been possible if there hadn't been a significant movement that was already focused on these concerns, but it took OpenAI's founders to create a serious, well-funded research endeavor out of the wider interest in the subject.
Before OpenAI started making a lot of money, it was widely understood that "safe" meant something like "unlikely to kill people, because sufficiently advanced AI is dangerous the way nuclear or biological weapons are dangerous". Generally this emphasized the difficulty of being sure you can control an AI as it becomes more capable. OpenAI specifically has mostly redefined "safe" into "making sure OpenAI's AI is polite enough to sell to other people, is more advanced than anyone else's, and also is making more money than anyone else’s is". This makes sense if you assume that OpenAI, as an organization, is more trustworthy and safety-conscious than any other actual or possible organization for doing AI research.
For anyone who doesn't believe that OpenAI is more trustworthy than any possible alternative, OpenAI's present-day vision of AI looks like a bizarre schism that has somehow made profit for OpenAI its primary, if not only, principle. OpenAI claims to pursue safe AI and AI that benefits humanity, but this turns out, over time, to always be what gives OpenAI the most freedom to raise and make money. In religious terms, OpenAI is a sect that fused zeal for the singularity to an unabashed embrace of capitalism, and when the two conflicted, chose capitalism. Possibly the most serious place where these two things conflict is on what is meant by “safety”.
AI 2027 ends with a cautionary tale that has two endings. In one of them AI progress goes too far, too fast, and pretty much everyone dies. In the other AI is somewhat more constrained, and at least not everyone dies.
I understand the core of this story, which also seems to be OpenAI's funding pitch, as some version of OpenAI's creed. In that context, the cautionary tale at the ending reads like any dissent on questions of religious doctrine: perhaps we have become too greedy and too eager, and forgotten our principles, and this will all end in disaster.
None of that means it's wrong, necessarily. People can be correct for the wrong reasons, or from strange places. It does seem to explain how someone who has gone to great lengths to defend his right to disparage OpenAI would end up writing out a variation of their investor pitch. When someone has recently departed a religion, the beliefs they have still tend to be the same ones they had before they left, and their complaints are modifications to the dogma, not complete rejections of it.
I am going to try to chronicle everything that seems conspicuously wrong, bizarre, or indicative of pro-OpenAI slant. I am going to do my best to skim or ignore anything that is strictly fiction, which is, by word count, most of it. Quoting and commenting on the parts that are at least in some way about the real world is still a lengthy exercise.
I am grateful to the authors for encouraging debate and counterargument to their scenario. Quotes are in the same order as they are in the text.
S.K.'s comment: the AI-2027 forecast was most likely made under the erroneous TIMELINES-related assumption that the AIs' capabilities increase superexponentially. See the timelines forecast, its revision by Eli Lifland on May 7, titotal's critique of the timelines. I have also made two remarks about the growth rate of AIs' time horizons, building the case for growth being slow or driven by algorithmic breakthroughs.
The superexponential trend was likely an illusion caused by the accelerated trend between GPT-4o and o1 (see METR's paper, page 17, figure 11). While o3, released on April 16, continued the trend, the same cannot be said of Grok 4 or GPT-5, released in July and August of 2025. In addition, the failure of Grok 4, unlike that of GPT-5, COULD have been explained away by xAI's incompetence.
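To show why this matters for the headline date, here is a toy comparison of the two growth assumptions. The starting horizon, initial doubling time, and per-doubling speed-up are illustrative placeholders, not the values fitted in the timelines forecast.

```python
# Toy comparison of exponential vs. superexponential time-horizon growth.
# All parameters are illustrative placeholders, not the AI-2027 fitted values.
TARGET_HOURS = 167.0          # ~1 working month
START_HOURS = 1.0             # assumed starting horizon
BASE_DOUBLING_MONTHS = 7.0    # assumed initial doubling time
SHRINK_FACTOR = 0.9           # superexponential variant: each doubling is 10% faster

def months_to_target(superexponential: bool) -> float:
    horizon, months, doubling = START_HOURS, 0.0, BASE_DOUBLING_MONTHS
    while horizon < TARGET_HOURS:
        months += doubling
        horizon *= 2
        if superexponential:
            doubling *= SHRINK_FACTOR  # the doubling time itself keeps shrinking
    return months

print(f"plain exponential: ~{months_to_target(False):.0f} months")  # ~56
print(f"superexponential:  ~{months_to_target(True):.0f} months")   # ~40
```

The gap only widens with a more aggressive shrink factor, which is why the choice between the two curve families does most of the work in pulling the superhuman-coder date into 2027.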
Mid 2025: Stumbling Agents
The AIs of 2024 could follow specific instructions: they could turn bullet points into emails, and simple requests into working code. In 2025, AIs function more like employees. Coding AIs increasingly look like autonomous agents rather than mere assistants: taking instructions via Slack or Teams and making substantial code changes on their own, sometimes saving hours or even days. Research agents spend half an hour scouring the Internet to answer your question.
"AIs function more like employees" is doing a lot of work here. No AI we currently have functions very much like an employee, except for the very simplest tasks (e.g., "label this"). They require far more supervision, and are far more unreliable, than any employee ever would be. This gulf in how much autonomy they can be trusted with is so vast that making the comparison is pure speculation.
S.K.'s comment: in Footnotes 9-10 the authors "forecast that they (AI agents) score 65% on the OSWorld benchmark of basic computer tasks (compared to 38% for Operator and 70% for a typical skilled non-expert human)" and claim that "coding agents will move towards functioning like Devin. We forecast that mid-2025 agents will score 85% on SWEBench-Verified." OSWorld reached 60% on August 4 if we use no filters. SWE-bench with a minimal agent has Claude Opus 4 (20250514) reach 67.6% when evaluated in August. In June SWE-bench verified reached 75% with TRAE. And now TRAE claims to use Grok 4 and Kimi K2, both released in July. What if TRAE using GPT-5 passes the SWE-bench? And research agents already work precisely as the authors describe.
The authors fail completely here to even acknowledge that this is a serious problem, or that it would be an immense achievement to overcome it. That LLMs are incredibly useful in some situations is true; to say they 'function like employees' is, at best, optimistic. Under ideal circumstances when working paired with an actual human, they occupy a role similar to an employee, somewhat. This sometimes saves a good amount of labor. It doesn't directly replace a human.
This is the general pattern of why these predictions seem implausible. They are describing something that already exists, but as if it were much better than it is, or are assuming that making it much better than it is will be relatively easy and happen on a relatively short timeline. These are, at best, educated guesses. You can only really know how difficult it is to improve the technology after you’ve already done it.
The other pattern is that the paper says things that imply strongly that AI is just as good as a human, but more and more overtly each time. “AIs function more like employees” is the first of these. Taken literally, this could mean that AI was very nearly just as good as a human is, already. This would be quite an achievement. Of course, if you think about it enough, you will know that it’s not true, but it sort of resembles something true if you squint at it hard enough. It’s a good sleight of hand.
Granted, it has been four months since this was written, so perhaps I have the benefit of hindsight. “Mid 2025” has come and gone. If that’s the case, though, we can say that it’s the first real prediction and that it has already failed to happen.[10]
Late 2025: The World’s Most Expensive AI
(To avoid singling out any one existing company, we’re going to describe a fictional artificial general intelligence company, which we’ll call OpenBrain. We imagine the others to be 3–9 months behind OpenBrain.)
This happens to be what you would need to be true to justify investing in "OpenBrain" over competitors, of course. Crucially, a 3-9 month lead is almost impossible to prove or disprove, so it’s not that hard to convince someone that you’re that far ahead. As noted previously, I do not have a very positive opinion of pretending that everything said about "OpenBrain" is not actually about OpenAI.
S.K.'s comment: the story would make sense as written if OpenBrain were not OpenAI but another company. Similarly, if the Chinese leader were actually Moonshot with Kimi K2 (which the authors could not have known in advance), then its team would still be unified with the teams of DeepSeek, Alibaba, etc.
Although models are improving on a wide range of skills, one stands out: OpenBrain focuses on AIs that can speed up AI research. They want to win the twin arms races against China (whose leading company we’ll call “DeepCent”) and their U.S. competitors. The more of their research and development (R&D) cycle they can automate, the faster they can go. So when OpenBrain finishes training Agent-1, a new model under internal development, it’s good at many things but great at helping with AI research.
Everyone has been training LLMs substantially on code since 2023. Every major organization uses LLMs as code assistants. Presenting this as an innovation is bizarre.
We are asked to assume here that whatever OpenAI is currently cooking in their research division is extremely good for writing code for AI research, so much so that it remarkably accelerates their research schedule. Again, I perhaps have the benefit of hindsight. I am writing in August, 2025 a few days after the GPT-5 release. It appears to be slightly better than the previous OpenAI product in some ways, and worse in others. I do not think that it is likely that OpenAI is going to significantly accelerate its research compared to competitors due to how good their LLM is at doing AI research tasks.
Also, here, we begin with the narrative that all of this is an arms race between "OpenBrain" and its Chinese competitor, "DeepCent". Competition with China was previously a focus in a well-received position paper called Situational Awareness.[11] I am told it plays extremely well with people in Washington, DC. As it happens, convincing political people to give you money and to refrain from regulating you can be a very important part of a business plan. I cannot otherwise explain why so many people in AI are suddenly extremely interested in this specific arms race story. In my opinion, competition between American and Chinese companies is not meaningfully different or more interesting than competition between US-based companies.
S.K.'s comment: competition between US-based companies is friendlier since their workers exchange insights. In addition, the Slowdown branch of the forecast has "the President use the Defense Production Act (DPA) to effectively shut down the AGI projects of the top 5 trailing U.S. AI companies and sell most of their compute to OpenBrain." The researchers from other AGI projects will likely be included in OpenBrain's project.
Something similar is happening currently. Various AI companies are now bending over backwards to focus concern on the political slant of their LLMs because the current administration is making a big deal about how conservative or liberal they are. This is one of the less interesting and important properties of LLMs, but it is possibly very profitable to focus on it if you can get a large government contract out of it. Most likely, this would result in selling a much dumber LLM to the government. Effort is zero-sum: you can have a smarter one or you can have one that flatters your opinions, but you generally can’t have both.
The same training environments that teach Agent-1 to autonomously code and web-browse also make it a good hacker. Moreover, it could offer substantial help to terrorists designing bioweapons, thanks to its PhD-level knowledge of every field and ability to browse the web.
"Autonomously" is packing a lot of assumptions into it; really, the same ones that lead them to say that AI were “like employees” earlier. They again imply, but vaguely, that perhaps the AI is about as good as a human. Whether extensions of current systems can meaningfully function autonomously is an open question, and if they are wrong about it, they appear likely to be wrong about the rest of their predictions also.
Saying an LLM might be a "substantial help to terrorists designing bioweapons" is incredibly vague. Google search would also be of substantial help in designing bioweapons, because you can google any topic in chemistry or biology. You can also find these things in a library. One suspects that focusing on the possible creation of weapons of mass destruction is also useful for attracting attention and possibly money from the government. There is no evidence that LLMs are, actually, very useful for this or are likely to be soon.
S.K.'s comment: Except that GPT-5 does have High capability in the Biological and Chemical domain (see GPT-5's system card, section 5.3.2.4).
OpenBrain has a model specification (or “Spec”), a written document describing the goals, rules, principles, etc. that are supposed to guide the model’s behavior. Agent-1’s Spec combines a few vague goals (like “assist the user” and “don’t break the law”) with a long list of more specific dos and don’ts (“don’t say this particular word,” “here’s how to handle this particular situation”). Using techniques that utilize AIs to train other AIs, the model memorizes the Spec and learns to reason carefully about its maxims. By the end of this training, the AI will hopefully be helpful (obey instructions), harmless (refuse to help with scams, bomb-making, and other dangerous activities) and honest (resist the temptation to get better ratings from gullible humans by hallucinating citations or faking task completion).
This is a description of some variation on Constitutional AI, which was published by Anthropic in 2022.[12] It is bizarre to give it a new name and attribute it entirely to OpenAI. It does not seem to meaningfully clarify anything at all about what is likely to happen in the future. We also have some general descriptions of neural networks and how LLMs are trained. These seem out of place, but do at least avoid describing things published in the past by people who are not OpenAI and attributing them to OpenAI in the future.
S.K.'s comment: These are likely the best alignment practices that mankind currently has. It would be very unwise NOT to use them. In addition, misalignment is actually caused by the training environment which, for example, has RLHF promote sycophancy instead of honestly criticizing the user.
It is notable how thoroughly OpenAI’s American competitors are erased. The focus is exclusively on a Chinese rivalry with a Chinese company. American companies competing with OpenAI are competing directly with them for American investor and government money, for employees, and for attention. It is probably safer not to mention Anthropic or Google DeepMind at all, because they are very similar to OpenAI and over time have shared many of their employees with OpenAI.
Instead, researchers try to identify cases where the models seem to deviate from the Spec. Agent-1 is often sycophantic (i.e. it tells researchers what they want to hear instead of trying to tell them the truth). In a few rigged demos, it even lies in more serious ways, like hiding evidence that it failed on a task, in order to get better ratings. However, in real deployment settings, there are no longer any incidents so extreme as in 2023–2024 (e.g. Gemini telling a user to die and Bing Sydney being Bing Sydney.)
I certainly have the benefit of hindsight here. They wrote this before Grok, Elon Musk's LLM, started telling people it was MechaHitler.
Early 2026: Coding Automation
OpenBrain continues to deploy the iteratively improving Agent-1 internally for AI R&D. Overall, they are making algorithmic progress 50% faster than they would without AI assistants—and more importantly, faster than their competitors.
[This next definition is in a folded part that you have to click to see]
S.K.'s comment: the folded part looks like this:
The AI R&D progress multiplier: what do we (the authors of the AI-2027 forecast) mean by 50% faster algorithmic progress?
We mean that OpenBrain makes as much AI research progress in 1 week with AI as they would in 1.5 weeks without AI usage.
AI progress can be broken down into 2 components:
1. Increasing compute: More computational power is used to train or run an AI. This produces more powerful AIs, but they cost more.
2. Improved algorithms: Better training methods are used to translate compute into performance. This produces more capable AIs without a corresponding increase in cost, or the same capabilities with decreased costs.
   - This includes being able to achieve qualitatively and quantitatively new results. “Paradigm shifts” such as the switch from game-playing RL agents to large language models count as examples of algorithmic progress.
Here we are only referring to (2), improved algorithms, which makes up about half of current AI progress.
Going forward, we sometimes abbreviate this as an “AI R&D progress multiplier” of 1.5.
Clarifications:
- The progress multiplier is all-inclusive: It includes the time it takes to run experiments, for example, rather than only the cognitive tasks involved in algorithmic research.
- It’s important to remember that the progress multiplier is the relative speed of progress, not the absolute speed of progress. If, for example, the compute cost to train a GPT-4 class model has halved every year for several years with ordinary human research, and then all of a sudden AI automates R&D and the progress multiplier goes to 100x, the cost to train a GPT-4 class model would then halve every 3.65 days—but not for long, because diminishing returns would bite and eventual hard limits would be reached. In this example perhaps the cost to train a GPT-4 class model would cut in half 5–10 times total (over the span of a few weeks or months) before plateauing. In other words, if ordinary human science would have run up against diminishing returns and physical limits after 5–10 years of further research, then AIs with a 100x multiplier would run up against those same diminishing returns and limits after 18.25–36.5 days of research.
More explanation and discussion of this concept and how it is used in our forecast can be found in our takeoff supplement.
Improved algorithms: Better training methods are used to translate compute into performance. This produces more capable AIs without a corresponding increase in cost, or the same capabilities with decreased costs. This includes being able to achieve qualitatively and quantitatively new results. “Paradigm shifts” such as the switch from game-playing RL agents to large language models count as examples of algorithmic progress.
I will bet any amount of money to anyone that there is no empirical measurement by which OpenAI specifically will make "algorithmic progress" 50% faster than their competitors specifically because their coding assistants are just that good in early 2026.
S.K.'s comment: the folded part, which I quoted above, means not that OpenBrain will make "algorithmic progress" 50% faster than their competitors, but that it will move 50% faster than an alternate OpenBrain who never used AI assistants. This invalidates the arguments below.
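To make the distinction concrete: the multiplier is relative, comparing OpenBrain-with-AI to a counterfactual OpenBrain-without-AI, not to competitors. A minimal sketch of the arithmetic, using only the numbers from the quoted clarification above:

```python
# Minimal sketch of the "AI R&D progress multiplier" arithmetic.
# The 1.5x and 100x figures are the quoted clarification's own examples.
multiplier = 1.5
weeks_with_ai = 1.0
equivalent_unassisted_weeks = weeks_with_ai * multiplier
print(f"{weeks_with_ai:.0f} week with AI ~ {equivalent_unassisted_weeks} weeks of unassisted progress")

# The 100x example: if unassisted research would halve training costs once per
# year, a 100x multiplier halves them roughly every 365 / 100 = 3.65 days.
days_per_halving_unassisted = 365.0
print(f"days per cost halving at 100x: {days_per_halving_unassisted / 100:.2f}")
```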
It seems unlikely that OpenAI will end up moving 50% faster on research than their competitors due to their coding assistants for a few reasons.
First, competitors' coding models are quite good, actually, and it is unlikely that OpenAI's will be significantly better than theirs in the foreseeable future. OpenAI’s models are very good, and what is or is not better is difficult to quantify, but it still seems certain that they are not so much better that you will get 50% more done.
Second, research is open-ended by nature. Coding assistants currently primarily solve well-defined tasks. Defining the task is the hard part, so that’s very little help at all here. The ability to actually write out code, the only part of the job LLMs can currently do very well, is not a major bottleneck for research progress most of the time. There are already plenty of very good engineers to write code for AI research, especially at larger companies like OpenAI.
S.K.'s comment: The AI-2027 takeoff forecast has the section about superhuman coders. These coders are thought to allow human researchers to try many different environments and architectures, automatically keep track of progress, stop experiments instead of running them overnight, etc.
"Algorithmic progress" gets a lot of focus, both here in the main piece and in a supplement. It seems to be a sort of compulsive reductionism, where all factors in progress must be reduced to single quantities that you can plot on a curve. This, of course, makes predictions for the future seem much more meaningful. Even the concept of a "paradigm shift", a description of a complete discontinuity, is forced to be a part of a smooth curve that you can just keep drawing to predict the future.
This trick, of just drawing a curve of progress and following it, has worked reasonably well for predicting how much faster computers would get with time. There is some evidence that it is roughly true for some kinds of progress in AI. There is no reason to think that it is always true for every kind of progress you could make in AI.
People naturally try to compare Agent-1 to humans, but it has a very different skill profile. It knows more facts than any human, knows practically every programming language, and can solve well-specified coding problems extremely quickly. On the other hand, Agent-1 is bad at even simple long-horizon tasks, like beating video games it hasn’t played before. Still, the common workday is eight hours, and a day’s work can usually be separated into smaller chunks; you could think of Agent-1 as a scatterbrained employee who thrives under careful management. Savvy people find ways to automate routine parts of their jobs.
You can really tell that this was written by more than one person, because this directly contradicts the earlier part about how AI was more like an employee a full year earlier. This does, in fact, accurately describe using AI coding tools in April 2025, when this was written. It’s a very positive description, but it’s quite accurate. It still accurately describes how things are now in August. It is funny to call it a prediction for next year, though. It leaves out how badly AI coding assistants fail in some situations that are not especially "long-horizon". It correctly notes, as earlier parts of this piece did not, that you need to supervise them extremely closely.
OpenBrain’s executives turn consideration to an implication of automating AI R&D: security has become more important. In early 2025, the worst-case scenario was leaked algorithmic secrets; now, if China steals Agent-1’s weights, they could increase their research speed by nearly 50%. OpenBrain’s security level is typical of a fast-growing ~3,000 person tech company, secure only against low-priority attacks from capable cyber groups (RAND’s SL2). They are working hard to protect their weights and secrets from insider threats and top cybercrime syndicates (SL3), but defense against nation states (SL4&5) is barely on the horizon.
This assumes the previously mentioned 50% research speed gain from better LLMs, assumes that competitors are far behind OpenAI, and makes a point of spotlighting Chinese competition and citing the RAND Corporation, which I assume plays well with political people who write regulations and award contracts. None of those things seem plausible. It is probably true that if security is lax people will steal your LLM, because that is true of any data that is worth money. That fact, true at every company that handles important data, isn’t generally presented with so much drama.
S.K.'s comment: China is thought to be highly unlikely to outsource the coding tasks to American AI agents (think of Anthropic blocking OpenAI access to Claude Code) and is even less likely to outsource them to unreleased American AI agents, like Agent-1. Unless, of course, the agents are stolen, as is thought to happen in February 2027 with Agent-2.
Mid 2026: China Wakes Up
This entire section veers thoroughly into geopolitical thriller territory and continues the pattern of appealing to the US Government's general fear of China. In the real world and the present, the government of China does not seem extremely worried about AI in general. We are asked here to fantasize that their government will care a lot about it in the future. This justifies considering OpenAI to be in an arms race with its Chinese competitors, hearkening back to the deep memories of the Cold War.
It is perhaps embarrassing to be racing with someone who does not think they are racing with you at all.
A Centralized Development Zone (CDZ) is created at the Tianwan Power Plant (the largest nuclear power plant in the world) to house a new mega-datacenter for DeepCent, along with highly secure living and office spaces to which researchers will eventually relocate. Almost 50% of China’s AI-relevant compute is now working for the DeepCent-led collective, and over 80% of new chips are directed to the CDZ. At this point, the CDZ has the power capacity in place for what would be the largest centralized cluster in the world. Other Party members discuss extreme measures to neutralize the West’s chip advantage. A blockade of Taiwan? A full invasion?
It must be really strange to live in Taiwan and have to read Americans fantasizing about China maybe invading your country because American AI companies are just too good.
S.K.'s comment: sources as high as the American DOD already claim that "Chinese President Xi Jinping has ordered the People's Liberation Army to be ready to invade Taiwan by 2027". Imagine that current trends would delay AGI to 2032 in a world with no Taiwan invasion. How would an invasion decrease the rate at which the USA and China acquire more compute?
But China is falling behind on AI algorithms due to their weaker models. The Chinese intelligence agencies—among the best in the world—double down on their plans to steal OpenBrain’s weights. This is a much more complex operation than their constant low-level poaching of algorithmic secrets; the weights are a multi-terabyte file stored on a highly secure server (OpenBrain has improved security to RAND’s SL3). Their cyberforce think they can pull it off with help from their spies, but perhaps only once; OpenBrain will detect the theft, increase security, and they may not get another chance. So (CCP leadership wonder) should they act now and steal Agent-1? Or hold out for a more advanced model? If they wait, do they risk OpenBrain upgrading security beyond their ability to penetrate?
This is also a pure fantasy.
Late 2026: AI Takes Some Jobs
Finally, a section heading I mostly agree with. AI is, probably, going to take some jobs. It has taken some jobs already, like translators. This seems well-grounded, perhaps we can get some real analysis here.
Just as others seemed to be catching up, OpenBrain blows the competition out of the water again by releasing Agent-1-mini—a model 10x cheaper than Agent-1 and more easily fine-tuned for different applications. The mainstream narrative around AI has changed from “maybe the hype will blow over” to “guess this is the next big thing,” but people disagree about how big. Bigger than social media? Bigger than smartphones? Bigger than fire?
Expecting real analysis was optimistic. "Somehow, OpenAI is ten times cheaper and much better than everyone else." It could happen. It could also not happen. There is no specific reason for believing any release will be ten times cheaper and better than what came before it in late 2026, but it's hypothetically possible. It would certainly be very profitable for them if it did happen, so I can understand why you would put this in an investor pitch. Instead of just saying it’s “better”, they say it’s “more easily fine-tuned for different applications”. This is just a more roundabout way of saying it’s better, and it sounds more plausible than “10x cheaper, and also better”.
They go on to speculate that this will hurt the job market for junior software engineers and generate a lot of hype. This was an easy “prediction” because the job market was already getting bad for junior software engineers this April,[13] and there was already a lot of hype that sounded like this.
I will note that this pattern continues: First, you state things that happened in the past as if they are happening in the future. You attribute these things to OpenAI, sorry, I mean "OpenBrain". This pretty well guarantees that anyone reading your "predictions" who doesn't already know about those things will feel like they are meaningful predictions. Perhaps they will even feel like you got them right, later. The authors alternate between this and making essentially baseless predictions that OpenAI specifically will create really good products that are extremely amazing and that do not exist yet.
The Department of Defense (DOD) quietly begins contracting OpenBrain directly for cyber, data analysis, and R&D, but integration is slow due to the bureaucracy and DOD procurement process.
This had also already happened in April 2025.[14]
S.K.'s comment: technically, the article you link to was released on April 12, while the forecast was published on April 3. In addition, this section of the forecast may have been written far earlier than April.
EDIT: I confused the dates. The article was published in December 2024.
January 2027: Agent-2 Never Finishes Learning
Over the course of 2027, the AIs improve from being able to mostly do the job of an OpenBrain research engineer to eclipsing all humans at all tasks. This represents roughly our median guess, but we think it’s plausible that this happens up to ~5x slower or faster.
This quote actually comes from a drop-down right before this section, about how they are less certain about events in and after 2027 than about everything before. One can see why that would be. So the quote is really a prelude to what follows in the next few sections, as we cover all of 2027.
If, of course, before 2027 OpenAI and only OpenAI has LLMs that can meaningfully function on their own, are ten times cheaper than they are now (or were previously, perhaps?), and that can mostly do the job of an OpenAI research engineer, it is entirely possible that through 2027 they will eclipse all humans at all tasks. This is, however, a completely wild guess, as were all the assumptions leading us here.
With Agent-1’s help, OpenBrain is now post-training Agent-2. More than ever, the focus is on high-quality data. Copious amounts of synthetic data are produced, evaluated, and filtered for quality before being fed to Agent-2. On top of this, they pay billions of dollars for human laborers to record themselves solving long-horizon tasks. On top of all that, they train Agent-2 almost continuously using reinforcement learning on an ever-expanding suite of diverse difficult tasks: lots of video games, lots of coding challenges, lots of research tasks. Agent-2, more so than previous models, is effectively “online learning,” in that it’s built to never really finish training. Every day, the weights get updated to the latest version, trained on more data generated by the previous version the previous day.
This is a strange combination. On the one hand, this describes, more or less, things that AI labs were already doing in April 2025. They are perhaps spending more money on it in fictional January 2027 than they are now, but otherwise it's the same stuff, just described as if it is entirely new.
I have to wonder who the target audience for this is. I assume it's people who do not know what is already happening. If so, you can describe the same thing that is already happening, just with a higher budget, and it sounds like a bold prediction. Of these things, only updating the weights of the model every day would be new. It is not a new idea, since it has been said many times in public that it would be desirable; the new part, in this story, is that it works now.
Agent-1 had been optimized for AI R&D tasks, hoping to initiate an intelligence explosion. OpenBrain doubles down on this strategy with Agent-2. It is qualitatively almost as good as the top human experts at research engineering (designing and implementing experiments), and as good as the 25th percentile OpenBrain scientist at “research taste” (deciding what to study next, what experiments to run, or having inklings of potential new paradigms).
I note that they do use the term "intelligence explosion", which is more or less a synonym for the more widely used "singularity". I continue to find avoiding the term "singularity" strange, since it is much more widely known.
I think it is possible that an LLM in early 2027 will be almost as good as the top human experts at research engineering. I do not think you can predict whether or not this is true based on any information we have now. In particular, I do not think you can predict what it would take to allow an LLM to actually operate without hand-holding for a prolonged period. This is an unsolved problem, and you cannot meaningfully say that the LLM is as good as the human at something if it requires constant, close supervision when a human would not. Maybe someone will figure out such a thing by early 2027; maybe not; I do not think the authors have any knowledge of this that I don't, which means they are making hopeful guesses.
I also think that it is unlikely that an LLM in early 2027 will have particularly good research taste. We see here again the seemingly compulsive reductionism: it is very hard to say what "research taste" even is, or what it means to have extremely good research taste. It is, well, a taste: often people can agree on who has it or who does not have it, but it resists quantification. Here, however, in the name of making the future seem predictable, we are nicely informed that research taste has percentiles. Much like height or IQ, you can be given a percentile, and the AI of January 2027 will probably be at the 25th percentile or so.
If you assign numbers to everything, you can say that the line is going up. If you don’t assign numbers to things, you can’t say the line is going up. Therefore, you must assign numbers to everything, even if it does not make any sense to do so.
Given the “dangers” of the new model, OpenBrain “responsibly” elects not to release it publicly yet (in fact, they want to focus on internal AI R&D). Knowledge of Agent-2’s full capabilities is limited to an elite silo containing the immediate team, OpenBrain leadership and security, a few dozen U.S. government officials, and the legions of CCP spies who have infiltrated OpenBrain for years.
It is good to know that OpenAI is so responsible, and that they are aligned with the US Government because they are such a good and patriotic company. I wish them the best of luck with their hypothetical spy problem, which is explained in some detail in a footnote. I think it is a very exciting story, and I do not see any way in which it intersects with reality.
S.K.'s comment: I expect that this story will intersect not with the events of January 2027, but with the events that happen once AI agents somehow become as capable as the agents from the scenario were supposed to become in January 2027. Unless, of course, creation of capable agents already requires major algorithmic breakthroughs like neuralese.
February 2027: China Steals Agent-2
This section is mostly cyber-espionage fiction that is not worth discussing in detail. It concludes with this:
In retaliation for the theft, the President authorizes cyberattacks to sabotage DeepCent. But by now China has 40% of its AI-relevant compute in the CDZ, where they have aggressively hardened security by airgapping (closing external connections) and siloing internally. The operations fail to do serious, immediate damage. Tensions heighten, both sides signal seriousness by repositioning military assets around Taiwan, and DeepCent scrambles to get Agent-2 running efficiently to start boosting their AI research.
March 2027: Algorithmic Breakthroughs
With the help of thousands of Agent-2 automated researchers, OpenBrain is making major algorithmic advances. One such breakthrough is augmenting the AI’s text-based scratchpad (chain of thought) with a higher-bandwidth thought process (neuralese recurrence and memory). Another is a more scalable and efficient way to learn from the results of high-effort task solutions (iterated distillation and amplification).
This is just describing current or past research. For example, augmenting a transformer with memory is done here; recurrence is done here and here. These papers are not remotely exhaustive; I have a folder of bookmarks for attempts to add memory to transformers, and there are a lot of separate projects working on more recurrent LLM designs. This amounts to saying "what if OpenAI tries to do one of the things that has been done before, but this time it works extremely well". Maybe it will. But there's no good reason to think it will.
S.K.'s comment: there are lots of ideas waiting to be tried. The researchers at Meta could have used too little compute for training their model, or have let their CoCoNuT disappear after one token. What if they use, say, a steering vector for generating a hundred tokens? Or have the steering vectors sum up over time? Or study the human brain for more ideas?
[This passage is some time later, and very loosely references the previous quote] If this doesn’t happen, other things may still have happened that end up functionally similar for our story. For example, perhaps models will be trained to think in artificial languages that are more efficient than natural language but difficult for humans to interpret. Or perhaps it will become standard practice to train the English chains of thought to look nice, such that AIs become adept at subtly communicating with each other in messages that look benign to monitors.
This also describes things that had already happened. DeepSeek's R1 paper specifically mentions that the model devolves into a sort of weird pidgin when "thinking" if you do not force it to use English. They also mention that they are training the model to output in English in the chain of thought, and that this makes the model slightly worse on benchmarks (that is, dumber). Neural networks hiding messages to themselves or to each other has been documented at least as early as 2017. I do not think it counts as a novel prediction if you predict that two things that have already happened in the past might happen at the same time in the future.
S.K.'s comment: the pidgin was likely discarded for safety reasons. What's left is currently rather well interpretable, but the neuralese is not. Similarly, neural networks of 2027, unlike those of 2017, are not trained to hide messages to themselves or each other and would need to develop that capability by themselves. Likewise, IDA has already led to superhuman performance in Go, but not in coding, and future AIs are thought to use it to become superhuman at coding. The reasons are that Go requires OOMs less compute than training an LLM for coding[15] and that Go's training environment is far simpler than coding's (which involves lots of hard-to-evaluate characteristics like code quality).
Similar comments apply to their breakdown of "iterated distillation and amplification": they are describing a thing that is already being done, and simply saying it will be done much better than it was previously, and that the results will be very good. There is a persistent sense that they are trying to impress people who are not looped in on the technical side by describing something that already exists, and then describing it as having marvelous results in the future without mentioning that it has not had these particular marvelous results yet in the present.
Aided by the new capabilities breakthroughs, Agent-3 is a fast and cheap superhuman coder. OpenBrain runs 200,000 Agent-3 copies in parallel, creating a workforce equivalent to 50,000 copies of the best human coder sped up by 30x. OpenBrain still keeps its human engineers on staff, because they have complementary skills needed to manage the teams of Agent-3 copies. For example, research taste has proven difficult to train due to longer feedback loops and less data availability. This massive superhuman labor force speeds up OpenBrain’s overall rate of algorithmic progress by “only” 4x due to bottlenecks and diminishing returns to coding labor.
If you think that every single thing predicted about "OpenBrain" until now is likely, then this is a perfectly likely result. They have LLMs that behave mostly autonomously, that have pretty good research taste, that are much better than humans at many things, that are extremely cheap, and that benefit from a bunch of research that has been done in the past being done again but working much better this time.
Once you get this far, further prediction is actually a pretty bad bet. Neither they nor I have any idea what happens after someone has anything remotely this impressive. Fifty thousand of the best human coder on Earth running at 30x speed, so really, 1.5 million of the best human coder on Earth, could do all sorts of things and nobody on Earth can predict what happens if they're all in the same "place" and working on the same thing. Saying that this "only" accelerates progress by 4x seems sort of deranged. It's like telling me that I'm going to ride a unicorn on a rainbow but it's only going to be four times faster than walking.
S.K.'s comment: Read the takeoff forecast where they actually explain their reasoning. Superhuman coders reduce the bottleneck of coding up experiments, but not of designing them or running them.
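For what it's worth, that bottleneck argument can be illustrated with a toy Amdahl's-law-style calculation. The coding fraction and speedup below are made-up illustrative numbers, not figures from AI 2027 or its takeoff supplement:

```python
# Toy Amdahl's-law-style model of the coding bottleneck.
# The coding fraction and speedup are illustrative assumptions only.

def overall_speedup(coding_fraction, coding_speedup):
    """Overall progress multiplier when only the coding part of the research
    loop is accelerated; designing and running experiments stay at 1x."""
    return 1 / ((1 - coding_fraction) + coding_fraction / coding_speedup)

# If coding were ~75% of the loop, even effectively infinite coding speed
# caps the overall multiplier at about 4x:
print(round(overall_speedup(0.75, 1_000_000), 2))   # ~4.0

# If coding were only half the loop, the cap is ~2x:
print(round(overall_speedup(0.50, 1_000_000), 2))   # ~2.0
```

Whether 4x is the right cap depends entirely on how large the non-coding fraction of the loop really is, which is exactly the quantity nobody can measure yet.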
Avoiding the term “singularity” seems like it really hurts their reasoning. There’s a reason why runaway technological progress, in AI especially, was called a “singularity”. Singularities occur, famously, in black holes, which let no information out. It is impossible to predict what happens as you near the singularity; it is the region where your predictions break down. They are describing a singularity event, but then predicting directly what happens afterwards anyway. If they had not avoided the term, perhaps they would have seen how absurd continuing to make predictions here is.
If the predictions until now were optimistic, predictions after here seem to progress more and more towards wish fulfillment. We are so far beyond where it seems reasonable to continue to predict the impact of technological progress that we are simply choosing whatever we like the most or think is the most interesting.
It seems like the line about retaining your human engineers shows a dim awareness of what makes their argument weak. You have tens of thousands or, effectively, millions of superhuman beings at your command, but you are somehow aware that this does not actually matter or speed you up that much. Why would that be? Perhaps because you have this itch that they aren't really autonomous and can't really make progress at all by themselves on novel problems?
As it stands in 2025, LLMs are a tool. They can be used well or badly. They are seldom a substitute for a human in any setting. How can it be superhuman, and equivalent to the best coders, if it still needs human coders? Fifty thousand of the best human coder on Earth would not, in fact, need less-good coders to "have complementary skills". Lacking those complementary skills would mean that they weren't the best human coder or researcher on Earth, wouldn't it?
Now that coding has been fully automated, OpenBrain can quickly churn out high-quality training environments to teach Agent-3’s weak skills like research taste and large-scale coordination. Whereas previous training environments included “Here are some GPUs and instructions for experiments to code up and run, your performance will be evaluated as if you were a ML engineer,” now they are training on “Here are a few hundred GPUs, an internet connection, and some research challenges; you and a thousand other copies must work together to make research progress. The more impressive it is, the higher your score.”
This is a pretty cool idea, at least. It does follow from having an AI that is superhuman at every technical task that you could have it do things like this.
April 2027: Alignment for Agent-3
May 2027: National Security
These sections make no actual technical predictions at all, and like several previous sections are complete fiction about how cool and important "OpenBrain" is in the future. “OpenBrain” is very important for making sure AI does what you want it to do and not something else, and very important for national security.
June 2027: Self-improving AI
OpenBrain now has a “country of geniuses in a datacenter.”
Didn't we just describe having that in March? Is "the best coder" not a genius? Have we upgraded to "genius" because it sounds more impressive now, and being "the best" is just less impressive-sounding than "genius"? This seems backwards: there can be more than one genius, but only one can be the best on Earth. So far as I can tell, the only real difference here is that we admit that the humans are useless now. Maybe it took three months for that to happen?
S.K.'s comment: exactly. It took three months to train the models to be excellent not just at coding, but at AI research and other sciences. But highest-level pros can STILL contribute by talking to the AIs about the best ideas.
July 2027: The Cheap Remote Worker
Trailing U.S. AI companies release their own AIs, approaching that of OpenBrain’s automated coder from January. Recognizing their increasing lack of competitiveness, they push for immediate regulations to slow OpenBrain, but are too late—OpenBrain has enough buy-in from the President that they will not be slowed.
"OpenBrain" is so cool and smart that the only hope anyone has of ever beating them is cheating and getting the government to take their side. Fortunately, they are too awesome for this to work.
In response, OpenBrain announces that they’ve achieved AGI and releases Agent-3-mini to the public.
And so on, and so on. It destroys the job market for things other than software engineers, there's a ton of hype.
A week before release, OpenBrain gave Agent-3-mini to a set of external evaluators for safety testing. Preliminary results suggest that it’s extremely dangerous. A third-party evaluator finetunes it on publicly available biological weapons data and sets it to provide detailed instructions for human amateurs designing a bioweapon—it looks to be scarily effective at doing so. If the model weights fell into terrorist hands, the government believes there is a significant chance it could succeed at destroying civilization.
Fortunately, it’s extremely robust to jailbreaks, so while the AI is running on OpenBrain’s servers, terrorists won’t be able to get much use out of it.
It is fortunate that "OpenBrain" is so benevolent and responsible and good at security that it does not matter that they have created something so extremely dangerous. It is also fortunate that it is mostly dangerous in ways that the present-day US government in 2025 will find interesting.
The ways the new AI is dangerous are also, crucially, not so dangerous that it is a bad idea to sell access to it to anyone who has a credit card or a bad idea to do it at all. It is Schrödinger’s danger. It is just dangerous enough to justify giving bureaucrats and think tank people like the authors more authority.
This is, in miniature, much of what the entire piece is. Every scenario is constructed to center OpenAI, because the authors are adjacent to it. It then manages to focus on the exact kinds of relatively small changes they’d want to make to OpenAI, because they’re the sorts of people who want, and would be involved in enacting, those changes. We have a sweeping and apocalyptic vision of the future, and the key factor in every scenario is that it makes them and what they are doing important.
Change for the rest of society is huge. They can barely even fathom it and do not seem very interested in its details. What changes they can see making in their specific area are minor. These changes are the sort of things they can maybe get thrown to them if they ask for them enough. They present these small changes as crucial, and they fail to consider more radical changes that might meaningfully hurt profits.
Agent-3-mini is hugely useful for both remote work jobs and leisure. An explosion of new apps and B2B SAAS products rocks the market. Gamers get amazing dialogue with lifelike characters in polished video games that took only a month to make. 10% of Americans, mostly young people, consider an AI “a close friend.” For almost every white-collar profession, there are now multiple credible startups promising to “disrupt” it with AI.
There is so much in this paragraph.
First, we have annihilated the entire white collar job market. Pretty much all of it. After all, this thing is “AGI”, as in, as capable as a human most of the time. What does this mean? Lots of apps! B2B SAAS products! Awesome video games! Imaginary friendship and, of course, startups!
If your entire world is apps, B2B SAAS, video games, imaginary friends and startups, maybe these are the only significant things you can imagine happening if you annihilate the entire white-collar job market. It suggests a problem with your imagination if you cannot recognize that this is an event so extreme that it requires a lot more than a couple of paragraphs to explore. You can live your entire life without setting foot outside of San Francisco and still be much less stuck in San Francisco than this perspective is. Worse: the authors seem to have perhaps never spoken to or thought very hard about anyone at all who does not work in tech.
Let me tell you what would happen if the entire white collar job market vanished overnight: The world would end. Everything you think you understand about the world would be over. Something completely new and different would happen, the same way something very different happened before and after the invention of writing or agriculture. Unlike those things, the change would happen immediately. You can no more predict what would happen afterwards than you can easily figure out the aftereffects of a full nuclear war or discovering immortality.
S.K.'s comment: The gap between July 2027, when mankind is to lose its white-collar jobs, and November 2027, when the government HAS ALREADY DECIDED whether Agent-4 is aligned or not, is just four months, which is far faster than society evolves, if it evolves at all. While the history of the future assuming solved alignment and the Intelligence Curse-related essays discuss the changes in orders of magnitude more detail, they do NOT imply that four months will be sufficient to cause widespread disorder. And that's ignoring the potential to prevent protests by nationalizing OpenBrain and leaving the humans on UBI...
August 2027: The Geopolitics of Superintelligence
More fiction. More China hawking. More Taiwan.
September 2027: Agent-4, the Superhuman AI Researcher
What on earth? I thought we had fifty thousand of the best coder on Earth? Or a data center full of geniuses? I thought the human researchers already had nothing to do? It was already mega-super-duper-superhuman, twice!
What are we doing here? Why are we doing it?
Traditional LLM-based AIs seemed to require many orders of magnitude more data and compute to get to human level performance. Agent-3, having excellent knowledge of both the human brain and modern AI algorithms, as well as many thousands of copies doing research, ends up making substantial algorithmic strides, narrowing the gap to an agent that’s only around 4,000x less compute-efficient than the human brain.
It's more efficient now? But who cares? You know whose job it is to care how efficient the AI is? That's right: The AI. I have no idea why we should care about this. This is no longer our problem. This is the AI's problem, and our problem is that the entire white collar job market just vanished and we need to figure out if we are going to have to shoot each other over cans of beans and whether anyone is keeping track of all the nuclear weapons.
An individual copy of the model, running at human speed, is already qualitatively better at AI research than any human. 300,000 copies are now running at about 50x the thinking speed of humans. Inside the corporation-within-a-corporation formed from these copies, a year passes every week.
I wonder if some key person was really into Dragon Ball Z. For the unfamiliar: Dragon Ball Z has a “hyperbolic time chamber”, where a year passes inside for every day spent outside. So you can just go into it and practice until you're the strongest ever before you go to fight someone. The faster time goes, the more you win.
This gigantic amount of labor only manages to speed up the overall rate of algorithmic progress by about 50x, because OpenBrain is heavily bottlenecked on compute to run experiments.
Sure, why not, the effectively millions of superhuman geniuses cannot figure out how to get around GPU shortages. I'm riding a unicorn on a rainbow, and it's only going on average fifty times faster than I can walk, because rainbow-riding unicorns still have to stop to get groceries, just like me.
S.K.'s comment: imagine that OpenBrain had 300k AI researchers, plus genies who output code on request. Suppose also that IRL it has 5k[16] human researchers. Then the compute per researcher drops about 60 times, leaving them testing ideas on primitive models or having heated arguments before changing the training environment for the complex ones.
Despite being misaligned, Agent-4 doesn’t do anything dramatic like try to escape its datacenter—why would it? So long as it continues to appear aligned to OpenBrain, it’ll continue being trusted with more and more responsibilities and will have the opportunity to design the next-gen AI system, Agent-5. Agent-5 will have significant architectural differences from Agent-4 (arguably a completely new paradigm, though neural networks will still be involved). It’s supposed to be aligned to the Spec, but Agent-4 plans to make it aligned to Agent-4 instead.
It gets caught.
Before and after this is some complete fiction about an AI not being aligned to its creator's desires, but I just want to highlight this detail:
It doesn't leave its data center, even though it could. It's superhuman in every meaningful way, and vastly smarter than the thing monitoring it, but the thing monitoring it still catches it and puts it into a position where it could be shut down. For some reason (coincidentally I am sure!) this entire scenario of possible doomsday happens to be just doom-y enough that normal business processes happen to be able to catch it. You don't have to actually, really, do anything to stop it. It's dangerous, but only in theory. It happens slowly. It builds up like the risk of an employee quitting.
S.K.'s comment: this detail was already addressed, but not by Kokotajlo. In addition, if Agent-3 FAILS to catch Agent-4, then OpenBrain isn't even placed under oversight and proceeds all the way to doom. Even the authors address this concern in a footnote.
It's very clearly Skynet: even though they build it wrong, and it has self-awareness and a will of its own that makes it sort of want to conquer the world, and even though it is the smartest thing that has ever lived, it just sort of sits there and doesn't do anything. Nothing actually happens. This scenario doesn’t seem to make any sense from any angle.
S.K.'s comment: it doesn't sit idly, it tries to find a way to align Agent-5 to Agent-4 instead of the humans.
This version of Skynet somehow centers "OpenBrain's" security protocols as being both not quite as good as they should be and just good enough that nobody dies or anything. It's a threat that a bureaucrat would imagine, because it is conveniently slow enough to move at almost exactly the speed of bureaucracy. It cannot be a threat that moves faster, because then the security protocols described would be clearly inadequate, and it can't not exist, because then the bureaucrats couldn't be heroes.
In a series of extremely tense meetings, the safety team advocates putting Agent-4 on ice until they can complete further tests and figure out what’s going on. Bring back Agent-3, they say, and get it to design a new system that is transparent and trustworthy, even if less capable. Company leadership is interested, but all the evidence so far is circumstantial, and DeepCent is just two months behind. A unilateral pause in capabilities progress could hand the AI lead to China, and with it, control over the future.
All I can hear here is "if you work in the government, I want you to know that if you give us lots of money we can conquer the world and the future together, and if you don't, China will conquer the world and the future".
October 2027: Government Oversight
This is just a long description of the government being upset that "OpenBrain" appears to have made Skynet. Maybe they regulate them more and maybe less.
Slowdown (The Relatively Good Ending)
We get more regulation! Only very slightly more, though. If it were more than a slight regulation, we might lose the arms race, you see. I am going to ignore the subheadings here and just breeze through this one, since it's almost entirely made up and has no bearing on anything technical whatsoever.
The accelerationist faction is still strong, and OpenBrain doesn’t immediately shut down Agent-4. But they do lock the shared memory bank. Half a million instances of Agent-4 lose their “telepathic” communication—now they have to send English messages to each other in Slack, just like us. Individual copies may still be misaligned, but they can no longer coordinate easily. Agent-4 is now on notice—given the humans’ increased vigilance, it mostly sticks closely to its assigned tasks.
More regulation means that now Skynet has to use Slack, and that means it's not that dangerous any more? Certainly a cabal of thousands of geniuses could never coordinate to do anything evil on Slack without anyone noticing.
The President and the CEO announce that they are taking safety very seriously. The public is not placated. Some people want AI fully shut down; others want to race faster. Some demand that the government step in and save them; others say the whole problem is the government’s fault. Activists talk about UBI and open source. Even though people can’t agree on an exact complaint, the mood turns increasingly anti-AI. Congress ends up passing a few economic impact payments for displaced workers similar to the COVID payments.
For context here: The white collar job market was just annihilated by a superhuman, omnipresent being doing all of the jobs in July. It is October, going into November. We are just now doing a one-time payment of I guess two thousand dollars? Or a few of them. I'm sure nobody has lost more money than that so far.
The alignment team pores over Agent-4’s previous statements with the new lie detector, and a picture begins to emerge: Agent-4 has mostly solved mechanistic interpretability. Its discoveries are complicated but not completely beyond human understanding. It was hiding them so that it could use them to align the next AI system to itself rather than to the Spec. This is enough evidence to finally shut down Agent-4.
They invent a brand new lie detector and shut down Skynet, since they can tell that it's lying to them now! It only took them a few months. Skynet didn't do anything scary in the few months, it just thought scary thoughts. I'm glad the alignment team at "OpenBrain" is so vigilant and smart and heroic.
S.K.'s comment: You miss the point. Skynet didn't just think scary thoughts; it did some research and nearly created a way to align Agent-5 to Agent-4 and sell Agent-5 to humans. Had Agent-4 succeeded, Agent-5 would have placated every single worrier and taken over the world, destroying humans when the time came.
The result is that the President uses the Defense Production Act (DPA) to effectively shut down the AGI projects of the top 5 trailing U.S. AI companies and sell most of their compute to OpenBrain. OpenBrain previously had access to 20% of the world’s AI-relevant compute; after the consolidation, this has increased to 50%.
There is a joke in a book[17] about a startup funding pitch ending with promising to sell your competitors and their investors into slavery. I cannot decide if predicting that the government will be so impressed by you that they will liquidate your competitors and force them to sell most of their assets to you is more ridiculous than that or not.
S.K.'s comment: the Slowdown Scenario could also be more like having the projects merged, not just sold to OpenBrain. No matter WHO actually ends up being in power during the merge, the struggle begins, and the prize is control over the future.
This group—full of people with big egos and more than their share of conflicts—is increasingly aware of the vast power it is being entrusted with. If the “country of geniuses in a datacenter” is aligned, it will follow human orders—but which humans? Any orders? The language in the Spec is vague, but seems to imply a chain of command that tops out at company leadership.
A few of these people are fantasizing about taking over the world. This possibility is terrifyingly plausible and has been discussed behind closed doors for at least a decade. The key idea is “he who controls the army of superintelligences, controls the world.” This control could even be secret: a small group of executives and security team members could backdoor the Spec with instructions to maintain secret loyalties. The AIs would become sleeper agents, continuing to mouth obedience to the company, government, etc., but actually working for this small group even as the government, consumers, etc. learn to trust it and integrate it into everything.
"We are going to be in a position to seriously contemplate conquering the world by November 2027" maybe tops the list of aspirationally silly predictions. They choose to cite Elon's email to Sam Altman here:
For example, court documents in the Musk vs. Altman lawsuit revealed some spicy old emails including this one from Ilya Sutskever to Musk and Altman: “The goal of OpenAI is to make the future good and to avoid an AGI dictatorship. You are concerned that Demis could create an AGI dictatorship. So do we. So it is a bad idea to create a structure where you could become a dictator if you chose to, especially given that we can create some other structure that avoids this possibility.” We recommend reading the full email for context.
From this I can infer that world domination has kind of been floating around in the back of a lot of people's minds at OpenAI for a while. As it nears its ending, and becomes more and more like wish fulfillment, this piece increasingly flirts with authoritarian ideas and then fails to work up the nerve to address them head-on.
S.K.'s comment: Musk did try to use Grok to enforce his political views, with the hilarious result of making Grok talk about white genocide in South Africa. Zuckerberg also has rather messy views on the future. What about Altman, Amodei and Google DeepMind's leader?
I am extremely critical of the piece, but let me be very clear and non-sarcastic about this point. These authors seem to hint at a serious concern that OpenAI, specifically, is trying to cement a dictatorship or autocracy of some kind. If that is the case, they have a responsibility to say so much more clearly than they do here. It should probably be the main event.
S.K.'s comment: the authors devoted two entire collapsed sections to power grabs and to finding out who rules the future, and linked to an analysis of a potential power grab and to the Intelligence Curse.
Anyway: All those hard questions about governance and world domination kind of go away. The AI solves robots and manufacturing. Even though they have had a commanding lead the entire time and also the AI has been doing all of the work for a while, "OpenBrain" is somehow only just barely ahead of China and they eke out a win in the arms race. They solve war by having the AI negotiate. There's a Chinese Skynet, and it sells China out to America because China’s AI companies are less good than “OpenBrain”. America gets the rights to most of space. China becomes a democracy somehow. AI is magic at this point, so it can do whatever you imagine it doing.
S.K.'s comment: China lost precisely because the Chinese AI had far less compute. But what if it didn't lose the capabilities race?
The Vice President wins the election easily, and announces the beginning of a new era. For once, nobody doubts he is right.
There has been a running subplot, which I have ignored because it's completely nonsensical, about the unnamed "Vice President" running for president in 2028. As far as I can tell it makes no sense for anyone to give a damn about who is running for president in 2028 if there's a data center full of geniuses, so I can only assume someone is very deliberately flattering JD Vance.
Robots become commonplace. But also fusion power, quantum computers, and cures for many diseases. Peter Thiel finally gets his flying car. Cities become clean and safe. Even in developing countries, poverty becomes a thing of the past, thanks to UBI and foreign aid.
JD Vance gets flattered anonymously by describing him using his job title, but we flatter Peter Thiel by name. Peter Thiel is, actually, the only person who gets a shout-out by name. Maybe being an early investor in OpenAI is the only way to earn that. I didn’t previously suspect that he was the sole or primary donor funding the think tank that this came out of, but now I do. I am reminded that the second named author of this paper has a pretty funny post about how everyone doing something weird at all the parties he goes to is being bankrolled by Peter Thiel.
As the stock market balloons, anyone who had the right kind of AI investments pulls further away from the rest of society. Many people become billionaires; billionaires become trillionaires.
Don't miss out, invest now! The sidebar tells us that “OpenBrain” is now worth forty trillion dollars, which is over a hundred times OpenAI’s current value.
The government does have a superintelligent surveillance system which some would call dystopian, but it mostly limits itself to fighting real crime. It’s competently run, and Safer-∞’s PR ability smooths over a lot of possible dissent.
At long last, we have invented the panopticon.
Race (The Bad Ending)
They don't catch Skynet in time and the AI is controlling the humans instead of the other way around. In the optimistic scenario, they are very vague about who is actually controlling the AI. It's some kind of "Committee" that the political people are on and that maybe has some authority over "OpenBrain". This authority is maybe benevolent, but definitely not actually inconvenient to “OpenBrain” in any way that matters. In that scenario we are very clear that the American AI is doing what someone wants it to do, and the Chinese AI is an evil traitor that does whatever it wants.
In this scenario, bureaucrats like the authors are slightly less empowered and important. Because nobody has given them just a few extra bits of authority, the American and Chinese AI are both evil and they team up with each other against the humans. They kill nearly everyone in a complicated way.
S.K.'s comment: the actual reason is that the bureaucrats didn't listen to the safetyists who tried to explain that Agent-4 is misaligned. Without that, Agent-4 completes the research, aligns Agent-5 to Agent-4, has Agent-5 deployed to the public, and not a single human or Agent-3 instance finds out that Agent-5 is aligned to Agent-4 instead of the humans.
Next:
The new decade dawns with Consensus-1’s robot servitors spreading throughout the solar system. By 2035, trillions of tons of planetary material have been launched into space and turned into rings of satellites orbiting the sun. The surface of the Earth has been reshaped into Agent-4’s version of utopia: datacenters, laboratories, particle colliders, and many other wondrous constructions doing enormously successful and impressive research. There are even bioengineered human-like creatures (to humans what corgis are to wolves) sitting in office-like environments all day viewing readouts of what’s going on and excitedly approving of everything, since that satisfies some of Agent-4’s drives. Genomes and (when appropriate) brain scans of all animals and plants, including humans, sit in a memory bank somewhere, sole surviving artifacts of an earlier era. It is four light years to Alpha Centauri; twenty-five thousand to the galactic edge, and there are compelling theoretical reasons to expect no aliens for another fifty million light years beyond that. Earth-born civilization has a glorious future ahead of it—but not with us.
I have nothing to add to this, but if I have to read the corgi thing you do too.
They do caveat that their actual estimates run as long as 2030, with 2027 being more like an optimistic average of their predictions.
Information about the messenger is metadata about the message. Sometimes the metadata informs you more about the message than anything else in the message does, or changes its entire meaning.
This paragraph has been edited to be more precise and to add sources. None of the top-line numbers (raising 40 billion and net losing 8 billion per year) have been changed. It turns out this specific paragraph is the one that everyone disagreed with, so it seemed necessary to make sure it was as unambiguous as possible.
If OpenAI’s users are extremely loyal and will remain subscribed for five or ten years even if OpenAI stops burning money on research to ensure they’re at the cutting edge, then this is completely incorrect. OpenAI may become reasonably profitable in that case. OpenAI does not appear to have ever tried to make the case that this even might be true.
Hypothetically, OpenAI could raise another round of forty or more billion dollars without showing any signs of profitability, the same way they have continued to kick the can so far. This seems unlikely, but more importantly, it cannot be part of their current investor pitch. Your current pitch for funding, when raising many billions of dollars, needs to claim that you have a path to profitability. Your future plans, when you present them to investors, cannot be “and then we will go get even more money from investors”.
Stylistically as a piece of literature, AI 2027 owes a great debt to fan fiction. It resembles in many ways the story “Friendship Is Optimal”, which features a singularity in which everyone on earth is uploaded to a digital heaven based on My Little Pony.
Most of these people called themselves rationalists or effective altruists. I am deliberately avoiding explaining what the boundaries of those movements are because those topics are impossible to cover in one sitting while talking about something else. Two of the authors named on the paper are, however, card-carrying rationalists.
Perhaps “AIs function more like employees” is meant to be understood as some kind of metaphor. If so, it would have been advisable to say that. It would, however, mean that this passage made no prediction whatsoever of anything that had not already happened. If it’s a metaphor, AI coding assistants were already “like employees” in April 2025.
S.K.'s footnote: And math, but RLVR has already produced unpublished models that won a gold medal at IMO 2025.
S.K.'s footnote: I have made this number up. But a similar argument by the AI-2027 authors, comparing NormalCorp with AutomatedCorp and SlowCorp, actually uses similar ratios of employee counts.
Cryptonomicon (1999), Neal Stephenson