Was a philosophy PhD student, left to work at AI Impacts, then Center on Long-Term Risk, then OpenAI. Quit OpenAI due to losing confidence that it would behave responsibly around the time of AGI. Now executive director of the AI Futures Project. I subscribe to Crocker's Rules and am especially interested to hear unsolicited constructive criticism. http://sl4.org/crocker.html
Some of my favorite memes: [meme images not reproduced here; one by Rob Wiblin, one from xkcd]
My EA Journey, depicted on the whiteboard at CLR: [whiteboard photo; h/t Scott Alexander]
The qualitative conclusions seem to be the same if you use a power law and most of the progress comes from the change in slope:
However, if half the progress comes from change in intercept, we get this weird graph with a discontinuity at the end:
Not sure what's going on there. Maybe the intercept has risen enough that there is no longer any crossover point, despite the slope of the AI line still being shallower than the slope of the human line?
Update: If we assume that some % of the progress comes from improved slope and some % comes instead from rising intercept, we get an interesting result: The curve looks more exponential at the beginning, and more superexponential towards the end:
Also, the doubling time difficulty decay model makes worse and worse predictions, since it doesn't account for the intercept increase component of the progress!
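Not the actual vibecoded widget, but here's a minimal sketch of what I mean, in the simplest (linear-in-log-time-budget) version of the toy model from the musing below, with a made-up knob `frac_intercept` for how much of each step's progress goes into raising the intercept rather than steepening the slope; all the constants are illustration values:

```python
import math

# Minimal sketch (made-up constants, not the actual widget): performance is
# linear in log(time budget) for both humans and AIs. Each step, a fraction
# `frac_intercept` of the AI's progress raises its intercept, and the rest
# closes the slope gap toward a limit of 2x the human slope.

def crossover_horizons(frac_intercept, steps=800, report_every=100):
    human_slope, limit_slope = 1.0, 2.0
    ai_intercept, ai_slope = 2.0, 0.2       # AI starts higher but shallower; human intercept = 0
    horizons = []
    for step in range(steps + 1):
        if step % report_every == 0:
            if ai_slope >= human_slope:
                horizons.append(math.inf)   # no crossover point any more
            else:
                horizons.append(math.exp(ai_intercept / (human_slope - ai_slope)))
        ai_intercept += frac_intercept * 0.002
        ai_slope += (1 - frac_intercept) * 0.001 * (limit_slope - ai_slope)
    return horizons

for frac in (0.0, 0.5, 1.0):                # slope-only, mixed, intercept-only progress
    print(f"frac_intercept={frac}:", [f"{h:.2g}" for h in crossover_horizons(frac)])
```

With frac_intercept = 0 the crossover horizon blows up to infinity in finite time; with frac_intercept = 1 it grows exponentially forever; with frac_intercept = 0.5 it looks roughly exponential early and superexponential late, which is the pattern described above.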
Here's a possible launch-blocking strategy: Orbital lasers and/or kinetic interceptors.
You have a dense grid of laser satellites in orbit. Those on earth trying to get up into space face some disadvantages:
If they stay down on earth and shoot up at your satellites while your satellites shoot back at them, e.g. with lasers and kinetic interceptors, the gravity well gives them an inherent disadvantage. Their stuff has to climb out of the gravity well to hit your satellites, whereas your stuff just has to fall down at the right angle. On top of that, the atmosphere and day-night cycle make their solar panels much less efficient than yours, so in a war of lasers you might just be able to outproduce them, and the atmosphere attenuates their beams anyway.
If they try to power up into space to fight you there using rockets, well, a big rocket can be blown up by a small bullet or laser burst as it exits the atmosphere and tries to accelerate to orbital velocity. Very vulnerable. If they try to armor it, they need to armor not just the payload but the rocket itself, which is something like an OOM bigger than the payload; and the armor itself adds mass, cutting deeply into payload capacity (see the rough rocket-equation sketch below).
A counterpoint is that you have to spread your satellites out in a grid, whereas they can launch their entire fleet all at once from a single location to try to break through the grid. But yeah.
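To put rough numbers on the armor penalty, here's a back-of-the-envelope sketch using the ideal rocket equation. All the figures (Isp, delta-v, masses) are made-up toy values for a single generic stage rather than any real launcher (which would stage), so it only illustrates the shape of the tradeoff:

```python
import math

# Toy armor-penalty calculation (all values made up): for a fixed delta-v and
# gross liftoff mass, the rocket equation fixes the burnout mass, so every
# tonne of armor comes straight out of the payload budget -- and the airframe
# you'd have to cover is several times heavier than the payload itself.

G0 = 9.81          # m/s^2
ISP = 350.0        # s (toy value)
DELTA_V = 8000.0   # m/s (toy value)
GROSS = 1000.0     # tonnes at ignition (toy value)
STRUCTURE = 70.0   # tonnes of tanks/engines/skin (toy value)

mass_ratio = math.exp(DELTA_V / (ISP * G0))   # required m_initial / m_burnout
burnout_budget = GROSS / mass_ratio           # tonnes left at burnout

for armor_frac in (0.0, 0.1, 0.3):            # armor as a fraction of structure mass
    armor = armor_frac * STRUCTURE
    payload = burnout_budget - STRUCTURE - armor
    print(f"armor = {armor:4.1f} t ({armor_frac:.0%} of structure) -> payload = {payload:5.1f} t")
```

With these made-up numbers, armor equal to just 30% of the structure mass takes the payload from ~27 t down to ~6 t; because the airframe is so much heavier than the payload, even a modest armor fraction eats most of the payload capacity.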
Musing:
Consider a graph with "Performance, i.e. how much diverse valuable stuff you can accomplish" on the Y-axis, and "Time budget, i.e. how long you are allowed to operate as an agent with a computer" on the X-axis.
Suppose that initially, frontier AIs are broadly superhuman when given very small time budgets, but subhuman when given large time budgets. That is, humans are better able to 'scale time budget' than AIs; humans are better at 'long-horizon agency skills.'
This would be represented as a human line on the graph, and an AI line on the graph, that both go up and to the right, but that intersect: the AI line starts higher but has a lower slope.
Suppose further that as AI progress continues, frontier AIs gradually get better at converting time budget into performance / scaling time budget / long-horizon agency skills. That is, the slope of the AI line increases.
Perhaps it gradually gets closer to the theoretical limits, e.g. each time step the slope of the AI line gets 0.1% closer to the slope of the theoretical-limits-of-agency-skills line. (Which must of course be higher-slope than the human line; let's be conservative and say 2x higher.)
What happens? See this vibecoded graph:
tl;dr: As AI progress continues and AIs gradually get better at agency skills, the "crossover point" moves out to larger and larger time horizons. That is, the crossover between the time budgets for which AIs outperform humans and the time budgets for which AIs underperform gets longer. (See the green line on the right graph, which is the actual data plus some added noise.)
Specifically it gets longer exponentially... wait no, superexponentially! It only looks like an exponential initially. But as the slope of the AI line starts to get close to the slope of the human line, it starts to bend up a bit and then shoots up to infinity, and then it's over: AIs have better slope than humans; there is no longer any crossover point. AIs have "infinite horizon length" now.
(Bonus: The dotted lines are generated by fitting a simple "doubling difficulty decay rate" function to the noisy data, given different amounts of initial noisy data. After 50 data points they approximate pretty well.)
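For what it's worth, here's a minimal sketch of the basic computation, assuming performance is linear in log(time budget) and using made-up constants rather than anything calibrated to the METR data:

```python
import math

# Minimal sketch of the toy model above (made-up constants, not the actual
# vibecoded widget): performance is linear in log(time budget); the AI slope
# closes 0.1% of its gap to the theoretical limit (2x human slope) each step.

human_slope = 1.0
limit_slope = 2.0 * human_slope        # theoretical limit on agency-skill slope
ai_intercept, ai_slope0 = 2.0, 0.2     # AI starts higher (short tasks) but shallower; human intercept = 0

for step in range(0, 1001, 100):
    ai_slope = limit_slope - (limit_slope - ai_slope0) * (1 - 0.001) ** step
    if ai_slope >= human_slope:
        print(f"step {step:4d}: AI slope >= human slope -> no crossover ('infinite horizon')")
        break
    log_t = ai_intercept / (human_slope - ai_slope)   # where the two lines meet
    print(f"step {step:4d}: crossover time budget ~ {math.exp(log_t):.2g}")
```

The printed crossover budget creeps up, then explodes, then stops existing, which is the qualitative story in the paragraphs above.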
I don't want to lean on this too heavily, partly because maybe Claude made mistakes for all I know, but yeah. I think this captures and articulates a bunch of the intuitions I have about why it's wrong to use an exponential fit for the METR data. The underlying phenomenon just seems kinda analogous to this toy model.
That said, even on this toy model, there are two possible explanations for an observed pattern of the crossover-point horizon length increasing. One explanation is that the slope of the AI line is getting steeper; another is that the intercept is rising. In principle it might be that the intercept is rising exponentially while the slope stays the same, in which case the exponential fit would be appropriate and you'd never actually reach human-equivalent slope; there would always be some time budget / horizon length at which humans outperformed AIs.
I think it would be pretty exciting to try to get evidence about how much of the METR horizon length progress is coming from intercept rising vs. slope increasing.
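To spell out why the two explanations come apart, in the simplest linear-in-log-budget version of the toy model (my notation, with $a$ for intercepts and $s$ for slopes):

$$\log T^*(t) \;=\; \frac{a_{\mathrm{AI}}(t) - a_{\mathrm{H}}}{s_{\mathrm{H}} - s_{\mathrm{AI}}(t)}.$$

If $s_{\mathrm{AI}}$ stays fixed and the intercept gap grows steadily, $T^*$ grows exponentially forever and the exponential fit is exactly right; if instead $s_{\mathrm{AI}}(t) \to s_{\mathrm{H}}$, the denominator shrinks toward zero and $T^*$ diverges, i.e. the crossover point stops existing. Over a short window both regimes look like "horizon length going up fast," which is why it would take careful work to tell them apart in the METR data.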
It's probably too late now, but maybe what we should have done is:
AGI = artificial general intelligence = an AI system that is generally capable rather than just narrowly capable
ASI = AGI that is better than the best humans at everything while also being faster and cheaper
We could then say that these are AGI companies building lots of little AGIs, and that they are on a path to ASI but aren't there yet and probably won't be for several more years. Claude is an AGI. GPT-5 is an AGI. Etc. They keep improving in capability and generality, and one day they'll be ASIs, but they aren't yet.
Thank you for looking into this in depth! My apologies, I have only skimmed the report so far, but I have a few questions:
(1) Suppose someone used 100 Starship launches to put optimal debris into orbit. Suppose someone else had capacity for 1000 Starship launches. Could they just... put armor on their spaceships/satellites and/or build them with redundant structures so they can take a hit or two? This would increase the weight, of course, but maybe that's fine since there's plenty of launch capacity?
(2) My understanding is that if you orbit closer to earth, atmospheric drag becomes a problem. But it is less of a problem for objects with lower surface-area-to-mass ratios, which inherently advantages larger objects (quick scaling sketch below). So... couldn't you just have fewer, bigger, armored satellites that orbit closer to earth, where the little debris can't go?
(3) Another strategy of course would be to boost out to a high orbit away from all the debris. It seems you considered this strategy and calculated that with e.g. 40 Starships' worth of debris it wouldn't work. What's the difference in difficulty between trying to block boosts to normal orbits like LEO vs. trying to block something going into a very high orbit or escaping orbit entirely? Does it take e.g. an OOM less debris to block the former vs. the latter? Five OOMs?
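(A quick scaling check behind (2), idealizing a satellite as a uniform solid sphere of density $\rho$ and radius $r$; real satellites obviously aren't, so this only shows the direction of the effect:

$$\frac{A}{m} = \frac{\pi r^2}{\tfrac{4}{3}\pi r^3 \rho} = \frac{3}{4\rho r} \propto \frac{1}{r},$$

so the drag deceleration at a given altitude falls roughly linearly with linear size, which is the sense in which bigger objects can afford to sit lower.)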
Thank you for the feedback. I feel you. However it seems like you were thinking of the purpose of this project as more "scary demo"-y than I was. If this project lengthens people's timelines, well, maybe that's correct and valuable?
I am quite worried though that the AI village might be systematically underestimating AI capabilities due to e.g. the harness/scaffold being suboptimal, due to the AIs not having been trained to use it, and due to the AIs tripping over each other in various ways.
What is Meet PERCEY and what is AutoFac?
Note that since the value of the equity is constantly going up, that's a huge underestimate.
“Our greatest fear should not be of failure, but of succeeding at something that doesn't really matter.” –attributed to DL Moody[1]
I don't think what you are doing makes the situation worse. Perhaps you do think that of me though; this would be understandable...
I'm not sure your model makes sense (e.g. is Phase 3 really a distinct phase?). If it does, I'd guess that humans also have a Phase 3, and/or that current AI agents might already not; it's just that for humans the transition from Phase 2 to Phase 3 happens farther out than it does for current AIs. In the future there will be AIs whose transition is farther out still, possibly infinitely far.
I think you are talking about a different notion of time horizon than me. You are talking about the transition between phase 2 and 3 for a given human or AI, whereas I'm talking about the crossover point where humans start to be better than AIs, which currently exists but won't always exist.
EDIT: I do think you make good points though, thanks for the comment.
Also, fwiw, it's not true that my model assumes that any given AI's ability to solve tasks continues to grow at the same rate when you give it more inference-time compute; the widget Claude built has settings for whether the underlying relationship is linear, asymptotic, or power law, and qualitatively the results are similar in all three cases. (Basically, it's fine to assume that AIs get diminishing returns to more time budget, as long as we also assume that humans get diminishing returns to more time budget, which seems reasonable. In that case, AIs' returns will diminish faster at first, but as they improve, their slope will increase until it matches and then surpasses the human slope.)
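Not the actual widget, but here's one way to parametrize those three shapes, with made-up constants; each family gives the same qualitative picture of the AI ahead at small time budgets and the human ahead at large ones, with a finite crossover in between:

```python
import math

# Three toy "time budget -> performance" shapes (made-up constants, not the
# actual widget). In each, the AI starts ahead at small budgets but the human
# curve overtakes it at some finite crossover budget.

def crossover(human, ai, t_max=1e6):
    """Smallest budget on a coarse log grid where the human pulls ahead."""
    t = 0.01
    while t < t_max:
        if human(t) > ai(t):
            return t
        t *= 1.1
    return math.inf

shapes = {
    "linear":     (lambda t: 1.0 * t,                         lambda t: 5.0 + 0.3 * t),
    "power law":  (lambda t: t ** 1.0,                        lambda t: 5.0 * t ** 0.3),
    "asymptotic": (lambda t: 10 * (1 - math.exp(-t / 10.0)),  lambda t: 6 * (1 - math.exp(-t / 1.0))),
}

for name, (human, ai) in shapes.items():
    print(f"{name:10s}: human overtakes the AI near t = {crossover(human, ai):.1f}")
```

And in each family, letting the AI's parameters creep toward (and past) the human's pushes the crossover out and eventually eliminates it, which is the point of the toy model above.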