x

LESSWRONG

LW

Sparks of RSI? — LessWrong

14

Sparks of RSI?

by Nathan Helm-Burger

14th Mar 2026

1 min read

14

Are your long-running agents self-improving in loops with minimal prompting? Mine sure are!

I think we're seeing the first sparks of RSI here, folks. I'm expecting the frontier labs to scramble furiously to push this forward, finding and patching the meta-failure-modes. Thus, I expect next versions to be even better at this.

Here's what some other people are saying/claiming:

https://x.com/shreyasnsharma/status/2032567729560105117

https://x.com/varun_mathur/status/2032671842230501729

https://x.com/TuXinming/status/2032478765033701835

https://x.com/andrewwhite01/status/2031761577943425475

https://x.com/aramh/status/2029553870502756706

https://x.com/polynoamial/status/2029622090152956335

https://t.co/znsJlcww5r

And many more. This is just a few examples. Not super impressive so far, but if this "task" goes the way many others have of first showing signs of progress in the 1-3% accuracy range, then rapidly shooting upwards over the next couple of model versions.... Yeah.

Basically, I think we're in crunch time. Automated alignment time is here. Get cracking.

16Vladimir_Nesov

13Nathan Helm-Burger

6Vladimir_Nesov

2Nathan Helm-Burger

14Vladimir_Nesov

3Nathan Helm-Burger

3Gordon Seidoh Worley

9

14

New Comment

9 comments, sorted by

Click to highlight new comments since: Today at 5:24 PM

[-]Vladimir_Nesov2mo161

When RSI is not unbounded (doesn't take steps that eventually proceed far beyond the modern civilization), it shouldn't count as RSI. Some test-time training things might work better than pure in-context learning, but not all post-deployment improvement is RSI, and not all RSI must happen post-deployment.

The most near-term path to RSI that seems plausible to me is not about any sort of continual learning or scaffolding, but automation of routine AI R&D, because it gets to leverage RLVR, the only method of training a wide variety of deep skills that currently works with LLMs. Automating application of RLVR to LLMs (part of routine AI R&D) thus suggests the possibility of genuine RSI. On the other hand, all the post-deployment things don't have the immediate potential to make LLMs able to play good chess, or to become fluent in a novel topic of math (that is, something not already trained into the LLM with RLVR before deployment). These things might help indirectly though, as part of automated routine AI R&D, and of training LLMs to get better at routine AI R&D.

[-]Nathan Helm-Burger2mo1310

This argument sounds to me like "Sparks are not wildfires. Even campfires are not wildfires. Until the fire is huge and raging out of control, unable to be easily extinguished, it is not the scary thing."

Sure. Ok. That's not what I mean. I mean: this small harmless thing (sparks) seems to me to have the potential to lead rapidly to the dangerous thing in the future (wildfire).

The point I'm trying to make is that the time to take preparatory action against wildfires is before they are present where you are.

What I'm saying is that I am standing deep in a drought-striken forest with no way out. I am watching some reckless children trying to build a bonfire. Currently they are managing to make sparks. The sparks haven't caught yet, they do not self-sustain or grow. I am nevertheless predicting that soon the massive amount of optimization power being applied to this goal will show progress. The sparks will become a small fire, able to sustain if cared for and supplied with adequate material and care (e.g. shelter from extinguishment).

If you have ever built a fire with only ancient tools, you know that the step from no sparks to sparks can actually be the hardest. The next hardest step is small spark to small ember. After that things get rapidly easier and faster. The exothermic reaction gets stronger, more easily sustained, more easily increased, less picky about the quality of the fuel it is given. This pattern holds true all the way up to an out-of-control wildfire. Campfire to bonfire is easy. Bonfire or campfire to wildfire is so easy that it commonly happens by accident in dry woods, which is why campfires are forbidden even to people who claim intent to keep their fires small and under control. Too easy.

[-]Vladimir_Nesov2mo62

There are sparks all over the place right now, sure. What matters is which of them get to contribute to wildfires, and which are lost in those wildfires started by completely different sparks, never given the opportinity to develop because of the order in which things happen. Not all sparks are sparks of wildfires. RSI is the wildfire, and not all sparks are relevant to RSI. Especially in actuality where something else happens first and ends their relevance, rather than in principle where in isolation and with enough resources they could be developed further.

Specificity is important when there are so many sparks. So I'm gesturing at a specific spark, automation of routine R&D, that plausibly might actually cause a wildfire before the other sparks grow similarly dangerous. Maybe it doesn't catch, but in that case the other things still need to have a path towards learning of novel RLVR-level skills to have the potential, and many of them probably don't, at least on their own.

Non-specific sparks matter for defense in depth, as in computer security where you harden the system against even the patently impossible interventions wherever it's not too costly to do that. AI is existentially dangerous and nothing remotely close to the current methods can change that. But most sparks don't matter for forecasting what is going to actually happen.

[-]Nathan Helm-Burger2mo20

Not all sparks are sparks of wildfires. RSI is the wildfire, and not all sparks are relevant to RSI.

Yes, that's fair. I am claiming that these sparks are relevant. I am making a prediction. I am not proving it. I could very well be wrong. That's an important distinction.

[-]teradimich2mo30

You previously estimated ~10% probability for AGI in 2025–2027 with a median around the mid-2030s. Have your timelines updated since then?

[-]Vladimir_Nesov2mo140

My timelines didn't notably update overall (I wrote this comment before re-reading the comment you linked). Automation of routine AI R&D is an answer to the question of what specifically causes AGI/RSI in 2026-2027, if it happens this early, but in my model getting a clearer sense of what form this might take doesn't make AGI in 2026-2027 more likely. Most of my AGI probability is in unknown breakthroughs or scaling outcomes (where quantity becomes quality in a capability that wouldn't a priori obviously be able to go that far, before the necessary quantity actually arrives). These things are enabled by more compute and then either follow quickly (low-hanging fruit at a given level of compute, unlikely to be accessed earlier even if possible in principle) or take multiple years (when needing human-invented conceptual advancements). As compute grows faster/slower, this directly influences the probability of AGI/RSI per year during the few years after that.

I expect compute buildout (for individual AI companies) to continue at the current pace (of 2-4x more becoming available each year) in 2022-2029, perhaps with low-hanging fruit getting picked through 2032, then slower growth in 2029-2035, and even slower after that (absent AGI). Without AGI by 2035-2045, a lasting ban/pause gets more likely as global cultural attitudes might change. So the highest per-year probability is in 2027-2032, then notably lower in 2032-2038, and even lower after that. And I'm placing the median in 2032-2033. Which means 10% per year in 2027-2032, extending the first 10% to 2026-2027 since there's some visibility into the very near future that says this probably isn't happening right now.

New-for-me considerations from mid 2025 to now are a clearer picture of capabilities of RLVR and its implications for AI company revenues, and some details on what might happen around the Rubin Ultra buildout. Turns out RLVR works for IMO gold even with relatively small models (DeepSeek-V3), and there are now some LLM solutions to technical open problems, so it's probably sufficient for training the deep skills aspect of AGI (it's more than mere elicitation), especially with bigger models. Though jaggedness still makes it less useful than that suggests. This made 100 billion dollar revenues (2-5 GW training systems) for AI companies before 2030 more likely than o1/o3 suggested on their own. Scaffoldings like Claude Code, especially with better post-deployment adaptation (what's being foreshadowed as "continual learning"), make even trillion dollar revenues before 2030-2032 plausible (which means 30-50 GW training systems, but it'll take more time to scale the supply chains and actually build that with hardware of the same generation, as a single system for an individual AI company, probably closer to mid-2030s).

For the Rubin Ultra buildout (2028-2029), individual 5 GW systems don't seem to be in the works, which previously seemed to suggest it already starts a slowdown in the trend, which then only lasts at the current pace during 2022-2026, and goes 2x slower in 2026-2029. (Trillion dollar revenues only extend the slower part of the trend, after the initial slowdown somewhere in 2026-2029, as the supply chains struggle to catch up to the available funding, and before compute mostly stops growing other than through improved price performance of hardware.) But Nvidia's bet on FP8 in Rubin makes 2027-2029 hardware 2x-4x more performant per GW than I expected (2x from chips that are faster in FP8 because they no longer care about BF16 as much, and maybe another 2x from the confidence that FP8 is a first-class citizen in training of even the largest models, where this confidence wasn't already priced in). So even 2 GW Rubin training systems remain on trend, even though the trend previously asked for 5 GW training systems for the same compute. The fact that Nvidia is making this bet means others will likely be doing the same, so this doesn't necessarily only concern OpenAI.

[-]Nathan Helm-Burger2mo30

I, also, would like to hear @Vladimir_Nesov 's response to this.

Someone recently asked me if I have updated my 2024-2027 estimate for AGI. My answer was that I have adjusted the early side of that estimate (of course), but not the later side. I continue to expect AGI before 2028.

[-]james oofou2mo50

I think it's pretty clear we're in the early stages of an intelligence explosion. Researchers are already sped up ~20% as of early February, so doubling times are shortening. Each doubling of time horizon increases researcher productivity by a greater amount than the previous doubling. There's enough substitutivity that compute won't be enough of a constraint to block a singularity.

[-]Gordon Seidoh Worley2mo31

To me the clear limiting factor to RSI right now is the lack of real continual learning. Current models forget too quickly and struggle to maintain attention on larger context windows to make huge progress within runs. But they can make more than zero self-improvement, which is enough to be dangerous.

More from Nathan Helm-Burger

Curated and popular this week