AI Timelines

habryka; Daniel Kokotajlo; Ajeya Cotra; Ege Erdil

Introduction

How many years will pass before transformative AI is built? Three people who have thought about this question a lot are Ajeya Cotra from Open Philanthropy, Daniel Kokotajlo from OpenAI and Ege Erdil from Epoch. Despite each spending at least hundreds of hours investigating this question, they still disagree substantially about the relevant timescales. For instance, here are their median timelines for one operationalization of transformative AI:

Median Estimate for when 99% of currently fully remote jobs will be automatable
Daniel	4 years
Ajeya	13 years
Ege	40 years

You can see the strength of their disagreements in the graphs below, where they give very different probability distributions over two questions relating to AGI development (note that these graphs are very rough and are only intended to capture high-level differences, and especially aren't very robust in the left and right tails).

In what year would AI systems be able to replace 99% of current fully remote jobs?

In what year will the energy consumption of humanity or its descendants be 1000x greater than now?

Median indicated by small dotted line. Note that Ege's median is outside of the bounds, at 2177.

So I invited them to have a conversation about where their disagreements lie, sitting down for 3 hours to have a written dialogue. You can read the discussion below, which I personally found quite valuable.

The dialogue is roughly split in two, with the first part focusing on disagreements between Ajeya and Daniel, and the second part focusing on disagreements between Daniel/Ajeya and Ege.

I'll summarize the discussion here, but you can also jump straight in.

Summary of the Dialogue

Some Background on their Models

Ajeya and Daniel are using a compute-centric model for their AI forecasts, illustrated by Ajeya's draft AI Timelines report, and Tom Davidson's takeoff model where the question of "when transformative AI" gets reduced to "how much compute is necessary to get AGI and when will we have that much compute? (modeling algorithmic advances as reductions in necessary compute)".

Ege, by contrast, thinks such models should have a lot of weight in our forecasts, but that they likely miss important considerations and don't have enough evidence to justify the extraordinary predictions they make.

Habryka's Overview of Ajeya & Daniel discussion

Ajeya thinks translating AI capabilities into commercial applications has gone slower than expected ("it seems like 2023 brought the level of cool products I was naively picturing in 2021") and similarly thinks there will be a lot of kinks to figure out before AI systems can substantially accelerate AI development.
Daniel agrees that impactful commercial applications have been slower than expected, but also thinks that the parts that made that slow can be automated substantially, and that a lot of the complexity comes from shipping something that can be useful to general consumers, and that for applications internal to the company, these capabilities can be unlocked faster.
Compute overhangs also play a big role in the differences between Ajeya and Daniel's timelines. There is currently substantial room to scale up AI by just spending more money on readily available compute. However, within a few years, increasing the amount of training compute further will require accelerating the semiconductor supply chain, which probably can't be easily achieved by just spending more money. This creates a "compute overhang" that accelerates AI progress substantially in the short run. Daniel thinks it's more likely than not that we will get transformative AI before this compute overhang is exhausted. Ajeya thinks that is plausible, but overall it's more likely to happen after, which broadens her timelines quite a bit.

These disagreements probably explain some but not most of the differences in the timelines for Daniel and Ajeya.

Habryka's Overview of Ege & Ajeya/Daniel Discussion

Ege thinks that Daniel's forecast leaves very little room for Hofstadter's law ("It always takes longer than you expect, even when you take into account Hofstadter's Law"), and in general that there will be a bunch of unexpected things that go wrong on the path to transformative AI.
Daniel thinks that Hofstadter's law is inappropriate for trend extrapolation. I.e. it doesn't make sense to look at Moore's law and be like "ah, and because of planning fallacy the slope of this graph from today is half of what it was previously".
Both Ege and Ajeya don't expect a large increase in transfer learning ability in the next few years. For Ege this matters a lot because it's one of the top reasons why AI will not speed up the economy and AI development that much. Ajeya thinks we can probably speed up AI R&D anyways by making AI that doesn't have transfer as good as humans, but is just really good at ML engineering and AI R&D because it was directly trained to be.
Ege expects that AI will have a large effect on the economy, but has substantial probability on persistent deficiencies that prevent AI from fully automating AI R&D or very substantially accelerating semiconductor progress.

Overall, whether AI will get substantially better at transfer learning (e.g. seeing an AI be trained on one genre of video game and then very quickly learn to play another genre of video game) would update all participants substantially towards shorter timelines.

We ended the dialogue with Ajeya, Daniel and Ege by putting numbers on how much various AGI milestones would cause them to update their timelines (with the concrete milestones proposed by Daniel). Time constraints made it hard to go into as much depth as we would have liked, but Daniel and I are excited about fleshing out more concrete scenarios of how AGI could play out and then collecting more data on how people would update in such scenarios.

The Dialogue

habryka

Daniel, Ajeya, and Ege all seem to disagree quite substantially on the question of "how soon is AI going to be a really big deal?". So today we set aside a few hours to try to dig into that disagreement, and what the most likely cruxes between your perspectives might be.

To keep things grounded and to make sure we don't misunderstand each other, we will be starting with two reasonably well-operationalized questions:

In what year would AI systems be able to replace 99% of current fully remote jobs? (With operationalization stolen from an AI forecasting slide deck that Ajeya shared)
In what year will the energy consumption of humanity or its descendants be 1000x greater than now?

These are of course both very far from a perfect operationalization of AI risk (and I think for most people both of these questions are farther off than their answer to "how long are your timelines?"), but my guess is it will be good enough to elicit most of the differences in y'all's models and make it clear that there is indeed disagreement.

Visual probability distributions

habryka

To start us off, here are two graphs of y'all's probability distributions:

**When will 99% of fully remote jobs be automated?**

Opening statements

habryka

Ok, so let's get started:

What is your guess about which belief of yours the other two most disagree with, that might explain some of the divergence in your forecasts?

Daniel

Daniel Kokotajlo

I don't understand Ege's views very well at all yet, so I don't have much to say there. By contrast I have a lot to say about where I disagree with Ajeya. In brief: My training compute requirements distribution is centered a few OOMs lower than Ajeya's is. Why? For many reasons, but (a) I am much less enthusiastic about the comparison to the human brain than she is (or than I used to be!) and (b) I am less enthusiastic about the horizon length hypothesis / I think that large amounts of training on short-horizon tasks combined with small amounts of training on long-horizon tasks will work (after a few years of tinkering maybe).

habryka

Daniel: Just to clarify, it sounds like you approximately agree with a compute-focused approach of AI forecasting? As in, the key variable to forecast is how much compute is necessary to get AGI, with maybe some adjustment for algorithmic progress, but not a ton?

How do things like "AIs get good at different horizon lengths" play into this (which you were also mentioning as one of the potential domains of disagreement)?

(For readers: The horizon length hypothesis is that the longer the feedback loops for a task are, the harder it is for an AI to get good at this task.

Balancing a broom on one end has feedback loops of less than a second. The task of "running a company" has month-to-year long feedback loops. The hypothesis is that we need much more compute to get AIs that are better at the second than the first. See also Richard Ngo's t-AGI framework which posits that the domain in which AI is generally intelligent will gradually expand from short time horizons to long time horizons.)

Daniel Kokotajlo

Yep I think Ajeya's model (especially the version of it expanded by Davidson & Epoch) is our best current model of AGI timelines and takeoff speeds. I have lots to critique about it but it's basically my starting point. And I am qualified to say this, so to speak, because I actually did consider about a half-dozen different models back in 2020 when I was first starting to research the topic and form my own independent impressions, and the more I thought about it the more I thought the other models were worse. Examples of other models: Polling AI scientists, extrapolating gross world product (GWP) a la Roodman, deferring to what the stock markets imply, Hanson's weird fractional progress model thingy, the semi-informative prior models... I still put some weight on those other models but not much.

As for the horizon lengths question: This feeds into the training compute requirements variable. IIRC Ajeya's original model had different buckets for short, medium, and long-horizon, where e.g. medium-horizon bucket meant roughly "Yeah we'll be doing a combo of short horizon and long horizon training, but on average it'll be medium-horizon training, such that the compute costs will be e.g. [inference FLOPs]*[many trillions of datapoints, as per scaling laws applied to bigger-than-human-brain-models]*[4-6 OOMs of seconds of subjective experience per datapoint on average]."

So Ajeya had most of her mass on the medium and long horizon buckets, whereas I was much more bullish that the bulk of the training could be short horizon with just a "cherry on top" of long-horizon. Quantitatively I was thinking something like "Say you have 100T datapoints of next-moment-prediction as part of your short-horizon pre-training. I claim that you can probably get away with merely 100M datapoints of million-second-long tasks, or less."

For some intuitions why I think this, it may help to read this post and/or this comment thread.
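For readers: a rough sketch of the arithmetic behind the two comments above, not a reproduction of anyone's actual model. The datapoint counts and the million-second task length are the figures Daniel quotes. Everything else is an assumption made only for illustration: one subjective second per short-horizon datapoint, a medium-horizon bucket with the same 100T datapoints, and 10^5 seconds per datapoint as the midpoint of the "4-6 OOMs" range. The per-second inference cost is common to both sides, so it cancels in the ratio.

```python
import math

# Figures quoted in the dialogue.
SHORT_HORIZON_DATAPOINTS = 100 * 10**12  # "100T datapoints of next-moment-prediction"
LONG_HORIZON_DATAPOINTS = 100 * 10**6    # "100M datapoints of million-second-long tasks"
LONG_HORIZON_SECONDS = 10**6             # "million-second-long"

# Assumptions for illustration only (NOT from the dialogue).
SHORT_HORIZON_SECONDS = 1                # assumed: ~1 subjective second per datapoint
MEDIUM_BUCKET_DATAPOINTS = 100 * 10**12  # assumed: same count as the short-horizon run
MEDIUM_BUCKET_SECONDS = 10**5            # assumed: midpoint of "4-6 OOMs of seconds"


def subjective_seconds(datapoints: int, seconds_per_datapoint: int) -> int:
    """Total subjective seconds of experience consumed by a training run."""
    return datapoints * seconds_per_datapoint


short_part = subjective_seconds(SHORT_HORIZON_DATAPOINTS, SHORT_HORIZON_SECONDS)
long_part = subjective_seconds(LONG_HORIZON_DATAPOINTS, LONG_HORIZON_SECONDS)
cherry_on_top_total = short_part + long_part

medium_bucket_total = subjective_seconds(MEDIUM_BUCKET_DATAPOINTS, MEDIUM_BUCKET_SECONDS)

# Gap between the two training recipes, in orders of magnitude (OOMs).
gap_ooms = math.log10(medium_bucket_total / cherry_on_top_total)

print(f"short-horizon part:   {short_part:.1e} subjective seconds")
print(f"long-horizon part:    {long_part:.1e} subjective seconds")
print(f"medium-bucket recipe: {medium_bucket_total:.1e} subjective seconds")
print(f"gap: {gap_ooms:.1f} OOMs")
```

Under these assumed numbers the long-horizon "cherry on top" costs about as much as the short-horizon pre-training itself, and the combined recipe comes out several OOMs cheaper than the medium-horizon bucket. Changing any of the assumed constants moves the gap accordingly.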

Ege

Ege Erdil

I think my specific disagreements with Ajeya and Daniel might be a little different, but an important meta-level point is my general skepticism of arguments that imply wild conclusions. This becomes especially relevant with predictions of a 3 OOM increase in our energy consumption in the next 10 or 20 years. It's possible to tell a compelling story about why that might happen, but also possible to do the opposite, and judging how convincing those arguments should be is difficult for me.

Daniel Kokotajlo

OK, in response to Ege, I guess we disagree about this "that-conclusion-is-wild-therefore-unlikely" factor. I think for things like this it's a pretty poor guide to truth relative to other techniques (e.g. models, debates between people with different views, model-based debates between people with different views). I'm not sure how to make progress on resolving this crux. Ege, you say it's mostly in play for the 1000x energy consumption thing; wanna focus on discussing the other question instead?

Ege Erdil

Sure, discussing the other question first is fine.

I'm not sure why you think heuristics like "I don't update as much on specific arguments because I'm skeptical of my ability to do so" are ineffective, though. For example, this seems like it goes against the fractional Kelly betting heuristic from this post, which I would endorse in general: you want to defer to the market to some extent because your model has a good chance of being wrong.

I don't know if it's worth going down this tangent right now, though, so it's probably more productive to focus on the first question for now.
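For readers: a minimal sketch of why fractional Kelly betting can be read as partial deference to the market. It uses the standard Kelly formula for a simple binary bet. The specific probabilities and the function names are hypothetical and chosen only for illustration. For a binary bet priced at market-implied probability q, betting a fraction lam of the full Kelly stake gives the same stake as betting full Kelly on a belief shrunk toward the market, lam * p + (1 - lam) * q.

```python
import math


def kelly_fraction(p: float, q: float) -> float:
    """Full Kelly stake (fraction of bankroll) on a binary event.

    p: your probability that the event happens.
    q: market-implied probability, i.e. net odds b = (1 - q) / q.
    The usual formula p - (1 - p) / b simplifies to (p - q) / (1 - q).
    """
    return (p - q) / (1.0 - q)


def fractional_kelly(p: float, q: float, lam: float) -> float:
    """Bet only a fraction lam of the full Kelly stake."""
    return lam * kelly_fraction(p, q)


def shrunk_belief(p: float, q: float, lam: float) -> float:
    """Blend your own probability with the market's, with weight lam on your own."""
    return lam * p + (1.0 - lam) * q


# Hypothetical numbers for illustration only.
my_probability = 0.60
market_probability = 0.50
trust_in_own_model = 0.5

stake_a = fractional_kelly(my_probability, market_probability, trust_in_own_model)
stake_b = kelly_fraction(
    shrunk_belief(my_probability, market_probability, trust_in_own_model),
    market_probability,
)

print(f"half-Kelly stake on own belief:   {stake_a:.3f}")
print(f"full-Kelly stake on shrunk belief: {stake_b:.3f}")
```

The two stakes agree, which is the sense in which betting less than full Kelly amounts to giving the market's view some weight against your own model.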

I think my wider distribution on the first question is also affected by the same high-level heuristic, though to a lesser extent. In some sense, if I were to fully condition on the kind of compute-based model that you and Ajeya seem to have about how AI is likely to develop, I would probably come up with a probability distribution for the first question that more or less agrees with Ajeya's.

habryka

That's interesting. I think digging into that seems good to me.

Can you say a bit more about how you are thinking about it at a high level? My guess is you have a bunch of broad heuristics, some of which are kind of like "well, the market doesn't seem to think AGI is happening soon?", and then those broaden your probability mass, but I don't know whether that's a decent characterization, and would be interested in knowing more of the heuristics that drive that.

Ege Erdil

I'm not sure I would put that much weight on the market not thinking it's happening soon, because I think it's actually fairly difficult to tell what market prices would look like if the market did think it was happening soon.

Setting aside the point about the market and elaborating on the rest of my views: I would give a 50% chance that in 30 years, I will look back on something like Tom Davidson's takeoff model and say "this model captured all or most of the relevant considerations in predicting how AI development was likely to proceed". For me, that's already a fairly high credence to have in a specific class of models in such an uncertain domain.

However, conditional on this framework being importantly wrong, my timelines get substantially longer because I see no other clear path from where we are to AGI if the scaling pathway is not available. There could be other paths (e.g. large amounts of software progress) but they seem much less compelling.

If I thought the takeoff model from Tom Davidson (and some newer versions that I've been working on personally) were basically right, my forecasts would just look pretty similar to the forecasts of that model, and based on my experience with playing around in these models and the parameter ranges I would consider plausible, I think I would just end up agreeing with Ajeya on the first question.

Does that explanation make my view somewhat clearer?
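For readers: one way to picture the structure of Ege's view is as a 50/50 mixture of two forecast distributions, one for "the compute-centric framework is basically right" and one for "it is importantly wrong". The sketch below is only that structure. The 13-year median for the first branch is Ajeya's median from the table in the introduction. The lognormal shape, both spread parameters, and the 150-year median for the second branch are hypothetical placeholders, not numbers anyone in the dialogue gave.

```python
import math


def lognormal_cdf(t: float, median: float, sigma: float) -> float:
    """P(T <= t) for a lognormal with the given median and log-space spread."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - math.log(median)) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def mixture_cdf(t: float, weight_a: float, comp_a: tuple, comp_b: tuple) -> float:
    """CDF of a two-component mixture; each component is (median, sigma)."""
    return (weight_a * lognormal_cdf(t, *comp_a)
            + (1.0 - weight_a) * lognormal_cdf(t, *comp_b))


def mixture_median(weight_a: float, comp_a: tuple, comp_b: tuple,
                   lo: float = 1e-3, hi: float = 1e5) -> float:
    """Find the mixture median (in years) by bisection on a log scale."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if mixture_cdf(mid, weight_a, comp_a, comp_b) < 0.5:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# (median in years, sigma). Sigmas and the second median are hypothetical.
FRAMEWORK_RIGHT = (13.0, 1.0)    # median taken from Ajeya's stated 13 years
FRAMEWORK_WRONG = (150.0, 1.5)   # placeholder for "substantially longer"
CREDENCE_IN_FRAMEWORK = 0.5      # "I would give a 50% chance ..."

median_years = mixture_median(CREDENCE_IN_FRAMEWORK, FRAMEWORK_RIGHT, FRAMEWORK_WRONG)
print(f"mixture median: {median_years:.0f} years")
```

Whatever placeholders are used, the mixture median always lands between the two branch medians, so a forecaster who fully accepts the first branch's numbers can still end up with a much later overall median.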

habryka

However, conditional on this framework being importantly wrong, my timelines get substantially longer because I see no other clear path from where we are to AGI if the scaling pathway is not available. There could be other paths (e.g. large amounts of software progress) but they seem much less compelling.

This part really helps, I think.

I would currently characterize your view as "Ok, maybe all we need is to increase compute scaling and do some things that are strictly easier than that (and so will be done by the time we have enough compute). But if that's wrong, forecasting when we'll get AGI gets much harder, since we don't really have any other concrete candidate hypothesis for how to get to AGI, and that implies a huge amount of uncertainty on when things will happen".

Ege Erdil

I would currently characterize your view as "Ok, maybe all we need is to increase compute scaling and do some things that are strictly easier than that (and so will be done by the time we have enough compute). But if that's wrong, forecasting when we'll get AGI gets much harder, since we don't really have any other concrete candidate hypothesis for how to get to AGI, and that implies a huge amount of uncertainty on when things will happen".

That's basically right, though I would add the caveat that entropy is relative so it doesn't really make sense to have a "more uncertain distribution" over when AGI will arrive. You have to somehow pick some typical timescale over which you expect that to happen, and I'm saying that once scaling is out of the equation I would default to longer timescales that would make sense to have for a technology that we think is possible but that we have no concrete plans for achieving on some reasonable timetable.

Ajeya Cotra

I see no other clear path from where we are to AGI if the scaling pathway is not available. There could be other paths (e.g. large amounts of software progress) but they seem much less compelling.

I think it's worth separating the "compute scaling" pathway into a few different pathways, or else giving the generic "compute scaling" pathway more weight because it's so broad. In particular, I think Daniel and I are living in a much more specific world than just "lots more compute will help;" we're picturing agents built from LLMs, more or less. That's very different from e.g. "We can simulate evolution." The compute scaling hypothesis encompasses both, as well as lots of messier in-between worlds. It's pretty much the one paradigm that anyone in the past who was trying to forecast timelines and got anywhere close to predicting when AI would start getting interesting used. Like I think Moravec is looking super good right now. In some sense, "we come up with a brilliant insight to do this way more efficiently than nature even when we have very little compute" is a hypothesis that should have had <50% weight a priori, compared to "capabilities will start getting good when we're talking about macroscopic amounts of compute."

Or maybe I'd say on priors you could have been 50/50 between "things will get more and more interesting the more compute we have access to" and "things will stubbornly stay super uninteresting even if we have oodles of compute because we're missing deep insights that the compute doesn't help us get"; but then when you look around at the world, you should update pretty hard toward the first.

Ajeya

Ajeya Cotra

On Daniel's opening points: I think I actually just agree with both a) and b) right now — or rather, I agree that the right question to ask about the training compute requirement is something more along the lines of "How many GPT-N to GPT-N.5 jumps do we think it would take?", and that short horizon LLMs plus tinkering looks more like "the default" than like "one of a few possibilities," where other possibilities included a more intense meta-learning step (which is how it felt in 2019-20). The latter was the biggest adjustment in my updated timelines.

That said though, I think two important object-level points push the "needed model size" and "needed amount of tinkering" higher in my mind than it is for Daniel:

In-context learning does seem pretty bad, and doesn't seem to be improving a huge amount. I think we can have TAI without really strong human-like in-context learning, but it probably requires more faff than if we had that out of the gate.
Relatedly, adversarial robustness seems not-great right now. This also feels overcomable, but I think it increases the scale that you need (by analogy, like 5-10 years ago it seemed like vision systems were good enough for cars except in the long tail / in adversarial settings, and I think vision systems had to get a fair amount bigger, plus there had to be a lot more tinkering on the cars, to get to now where they're starting to be viable).

And then a meta-level point is that I (and IIRC Metaculus, according to my colleague Isabel) have been kind of surprised for the last few years about the lack of cool products built on LLMs (it seems like 2023 brought the level of cool products I was naively picturing in 2021). I think there's a "reality has a lot of detail, actually making stuff work is a huge pain" dynamic going on, and it lends credence to the "things will probably be fairly continuous" heuristic that I already had.

A few other meta-points:

The Paul self-driving car bets post was interesting to me, and I place some weight on "Daniel is doing the kind of 'I can see how it would be done so it's only a few years away' move that I think doesn't serve as a great guide to what will happen in the real world."
Carl is the person who seems like he's been the most right when we've disagreed, so he's probably the one guy whose views I put the most weight on. But also Carl seems like he errs aggressive and errs in the direction of believing people will aggressively pursue the most optimal thing (being more surprised than I was, for a longer period of time, about how people haven't invested more in AI and accomplished more with it by now). His timelines are longer than Daniel's, and I think mine are a bit longer than his.
In general, I do count Daniel as among a pretty small set of people who were clearly on record with views more correct than mine in 2020 about both the nature of how TAI would be built (LLMs+tinkering) and how quickly things would progress. Although it's a bit complicated because 2020-me thought we'd be seeing more powerful LLM products by now.
Other people who I think were more right include Carl, Jared Kaplan, Danny Hernandez, Dario, Holden, and Paul. Paul is interesting because I think he both put more weight than I did on "it's just LLMs plus a lot of decomposition and tinkering" but also puts more weight than either me or Daniel on "things are just hard and annoying and take a long time;" this left him with timelines similar to mine in 2020, and maybe a bit longer than mine now.
Oh — another point that seems interesting to discuss at some point is that I suspect Daniel generally wants to focus on a weaker endpoint because of some sociological views I disagree with. (Screened off by the fact that we were answering the same question about remotable jobs replacement, but I think hard to totally screen off.)

On in-context learning as a potential crux

Daniel Kokotajlo

Re: Ajeya:

Interesting, I thought the biggest adjustment to your timelines was the pre-AGI R&D acceleration modelled by Davidson. That was another disagreement between us originally that ceased being a disagreement once you took that stuff into account.
re: in-context learning: I don't have much to say on this & am curious to hear more. Why do you think it needs to get substantially better in order to reach AGI, and why do you think it's not on track to do so? I'd bet that GPT4 is way better than GPT3 at in-context learning for example.
re: adversarial robustness: Same question I guess. My hot take would be (a) it's not actually that important, the way forward is not to never make errors in the first place but rather to notice and recover from them enough that the overall massive parallel society of LLM agents moves forward and makes progress, and (b) adversarial robustness is indeed improving. I'd be curious to hear more, perhaps you have data on how fast it is improving and you extrapolate the trend and think it'll still be sucky by e.g. 2030?
re: schlep & incompetence on the part of the AGI industry: Yep, you are right about this, and I was wrong. Your description of Carl also applies to me historically; in the past three years I've definitely been a "this is the fastest way to AGI, therefore at least one of the labs will do it with gusto" kind of guy, and now I see that is wrong. I think basically I fell for the planning fallacy & efficient market hypothesis fallacy.

However, I don't think this is the main crux between us, because it basically pushes things back by a few years, it doesn't e.g. double (on a log scale) the training requirements. My current, updated model of timelines, therefore, is that the bottleneck in the next five years is not necessarily compute but instead quite plausibly schlep & conviction on the part of the labs. This is tbh a bit of a scary conclusion.

Ajeya Cotra

re: in-context learning: I don't have much to say on this & am curious to hear more. Why do you think it needs to get substantially better in order to reach AGI, and why do you think it's not on track to do so? I'd bet that GPT4 is way better than GPT3 at in-context learning for example.

The traditional image of AGI involves having an AI system that can learn new (to it) skills as efficiently as humans (with as few examples as humans would need to see). I think this is not how the first transformative AI system will look, because ML is less sample efficient than humans and it doesn't look like in-context learning is on track to being able to do the kind of fast sample-efficient learning that humans do. I think this is not fatal for getting TAI, because we can make up for it by a) the fact that LLMs' "ancestral memory" contains all sorts of useful information about human disciplines that they won't need to learn in-context, and b) explicitly guiding the LLM agent to "reason out loud" about what lessons it should take away from its observations and putting those in an external memory it retrieves from, or similar.

I think back when Eliezer was saying that "stack more layers" wouldn't get us to AGI, this is one of the kinds of things he was pointing to: that cognitively, these systems didn't have the kind of learning/reflecting flexibility that you'd think of re AGI. When people were talking about GPT-3's in-context learning, I thought that was one of the weakest claims by far about its impressiveness. The in-context learning at the time was like: you give it a couple of examples of translating English to French, and then you give it an English sentence, and it dutifully translates that into French. It already knew English and it already knew French (from its ancestral memory), and the thing it "learned" was that the game it was currently playing was to translate from English to French.

I agree that 4 is a lot better than 3 (for example, you can teach 4 new games like French Toast or Hitler and it will play them — unless it already knows that game, which is plausible). But compared to any object-level skill like coding (many of which are superhuman), in-context learning seems quite subhuman. I think this is related to how ARC Evals' LLM agents kind of "fell over" doing things like setting up Bitcoin wallets.

Like Eliezer often says, humans evolved to hunt antelope on the savannah, and that very same genetics coding for the very same brain can build rockets and run companies. Our LLMs right now generalize further from their training distribution than skeptics in 2020 would have said, and they're generalizing further and further as they get bigger, but they have nothing like the kind of savannah-to-boardroom generalization we have. This can create lots of little issues in lots of places when an LLM will need to digest some new-to-it development and do something intelligent with it. Importantly, I don't think this is going to stop LLM-agent-based TAI from happening, but it's one concrete limitation that pushes me to thinking we'll need more scale or more schlep than it looks like we'll need before taking this into account.

Adversarial robustness, which I'll reply to in another comment, is similar: a concrete hindrance that isn't fatal but is one reason I think we'll need more scale and schlep than it seems like Daniel does (despite agreeing with his concrete counterarguments of the form "we can handle it through X countermeasure").

Daniel Kokotajlo

Re: Ajeya: Thanks for that lengthy reply. I think I'll have to ponder it for a bit. Right now I'm stuck with a feeling that we agree qualitatively but disagree quantitatively.

Ege Erdil

I think it's worth separating the "compute scaling" pathway into a few different pathways, or else giving the generic "compute scaling" pathway more weight because it's so broad. In particular, I think Daniel and I are living in a much more specific world than just "lots more compute will help;" we're picturing agents built from LLMs, more or less. That's very different from e.g. "We can simulate evolution." The compute scaling hypothesis encompasses both, as well as lots of messier in-between worlds.

I think it's fine to incorporate these uncertainties as a wider prior over the training compute requirements, and I also agree it's a reason to put more weight on this broad class of models than you otherwise would, but I still don't find these reasons compelling enough to go significantly above 50%. It just seems pretty plausible to me that we're missing something important, even if any specific thing we can name is unlikely to be what we're missing.

To give one example, I initially thought that the evolution anchor from the Bio Anchors report looked quite solid as an upper bound, but I realized some time after that it doesn't actually have an appropriate anthropic correction and this could potentially mess things up. I now think if you work out the details this correction turns out to be inconsequential, but it didn't have to be like that: this is just a consideration that I missed when I first considered the argument. I suppose I would say I don't see a reason to trust my own reasoning abilities as much as you two seem to trust yours.

The compute scaling hypothesis is much broader, and it's pretty much the one paradigm that anyone in the past who was trying to forecast timelines and got anywhere close to predicting when AI would start getting interesting used. Like I think Moravec is looking super good right now.

My impression is that Moravec predicted in 1988 that we would have AI systems comparable to the human brain in performance around 2010. If this actually happens around 2037 (your median timelines), Moravec's forecast will have been off by around a factor of 2 in terms of the time differential from when he made the forecast. That doesn't seem "super good" to me.

Maybe I'm wrong about exactly what Moravec predicted - I didn't read his book and my knowledge is second-hand. In any event, I would appreciate getting some more detail from you about why you think he looks good.
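For readers: the "factor of 2" above is the ratio of the realized wait to the forecast wait, both measured from the year the forecast was made. A quick check using the years as Ege states them (which he notes are second-hand):

```python
# Years as stated in the comment above.
FORECAST_MADE = 1988        # when Moravec is said to have made the prediction
FORECAST_ARRIVAL = 2010     # predicted arrival of brain-comparable AI
HYPOTHETICAL_ARRIVAL = 2037 # Ajeya's median, used here as a hypothetical outcome

forecast_wait = FORECAST_ARRIVAL - FORECAST_MADE        # years Moravec expected to wait
realized_wait = HYPOTHETICAL_ARRIVAL - FORECAST_MADE    # years actually waited in this scenario
miss_factor = realized_wait / forecast_wait

print(f"forecast wait: {forecast_wait} years")
print(f"realized wait: {realized_wait} years")
print(f"miss factor:   {miss_factor:.1f}x")
```

This gives 22 years forecast against 49 years realized, so the forecast would have been off by a bit more than a factor of 2 in elapsed time.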

Or maybe I'd say on priors you could have been 50/50 between "things will get more and more interesting the more compute we have access to" and "things will stubbornly stay super uninteresting even if we have oodles of compute because we're missing deep insights that the compute doesn't help us get"; but then when you look around at the world, you should update pretty hard toward the first.

I agree that if I were considering two models at those extremes, recent developments would update me more toward the former model. However, I don't actually disagree with the abstract claim that "things will get more and more interesting the more compute we have access to" - I expect more compute to make things more interesting even in worlds where we can't get to AGI by scaling compute.

Ege Erdil

I agree that 4 is a lot better than 3 (for example, you can teach 4 new games like French Toast or Hitler and it will play them — unless it already knows that game, which is plausible).

A local remark about this: I've seen a bunch of reports from other people that GPT-4 is essentially unable to play tic-tac-toe, and this is a shortcoming that was highly surprising to me. Given the amount of impressive things it can otherwise do, failing at playing a simple game whose full solution could well be in its training set is really odd.

So while I agree 4 seems better than 3, it still has some bizarre weaknesses that I don't think I understand well.

habryka

Ege: Just to check, GPT-4V (vision model) presumably can play tic-tac-toe easily? My sense is that this is just one of these situations where tokenization and one-dimensionality of text makes something hard, but it's trivial to get the system to learn it if it's in a more natural representation.

Ege Erdil

Just to check, GPT-4V (vision model) presumably can play tic-tac-toe easily?

This random Twitter person says that it can't. Disclaimer: haven't actually checked for myself.

Taking into account government slowdown

habryka

As a quick question, to what degree do y'alls forecasts above take into account governments trying to slow things down and companies intentionally going slower because of risks?

Seems like a relevant dimension that's not obviously reflected in usual compute models, and just want to make sure that's not accidentally causing some perceived divergence in people's timelines.

Daniel Kokotajlo

I am guilty of assuming governments and corporations won't slow things down by more than a year. I think I mostly still endorse this assumption but I'm hopeful that instead they'll slow things down by several years or more. Historically I've been arguing with people who disagreed with me on timelines by decades, not years, so it didn't seem important to investigate this assumption. That said I'm happy to say why I still mostly stand by it. Especially if it turns out to be an important crux (e.g. if Ege or Ajeya think that AGI would probably happen by 2030 absent slowdown).

habryka

That said I'm happy to say why I still mostly stand by it.

Cool, might be worth investigating later if it turns out to be a crux.

Ege Erdil

As a quick question, to what degree do y'alls forecasts above take into account governments trying to slow things down and companies intentionally going slower because of risks?
Seems like a relevant dimension that's not obviously reflected in usual compute models, and just want to make sure that's not accidentally causing some perceived divergence in people's timelines.

Responding to habryka: I do think government regulations, companies slowing down because of risks, companies slowing down because they are bad at coordination, capital markets being unable to allocate the large amounts of capital needed for huge training runs for various reasons, etc. could all be important. However, my general heuristic for thinking about the issue is more "there could be a lot of factors I'm missing" and less "I think these specific factors are going to be very important".

In terms of the impact of capable AI systems, I would give significantly less than even but still non-negligible odds that these kinds of factors end up limiting the acceleration in economic growth to e.g. less than an order of magnitude.

Ajeya Cotra

As a quick question, to what degree do y'alls forecasts above take into account governments trying to slow things down and companies intentionally going slower because of risks?

I include this in a long tail of "things are just slow" considerations, although in my mind it's mostly not people making a concerted effort to slow down because of x-risk, but rather just the thing that happens to any sufficiently important technology that has a lot of attention on it: a lot of drags due to the increasing number of stakeholders, both drags where companies are less blasé about releasing products because of PR concerns, and drags where governments impose regulations (which I think they would have in any world, with or without the efforts of the x-risk-concerned contingent).

habryka

Slight meta: I am interested in digging in a bit more to find some possible cruxes between Daniel and Ajeya, before going more in-depth between Ajeya and Ege, just to keep the discussion a bit more focused.

Recursive self-improvement and AIs speeding up R&D

habryka

Daniel: Just for my own understanding, you have adjusted the compute-model to account for some amount of R&D speedup as a result of having more AI researchers.

To what degree does that cover classical recursive self-improvements or things in that space? (E.g. AI systems directly modifying their training process or weights, or developing their own pre-processing modules?)

Or do you expect a feedback loop that's more "AI systems do research that routes through humans understanding those insights and being in the loop on implementing them to improve the AI systems"?

Daniel Kokotajlo

When all we had was Ajeya's model, I had to make my own scrappy guess at how to adjust it to account for R&D acceleration due to pre-AGI systems. Now we have Davidson's model so I mostly go with that.

It covers recursive-self-improvement as a special case. I expect that to be what the later, steeper part of the curve looks like (basically a million AutoGPTs running in parallel across several datacenters, doing AI research but 10-100x faster than humans would, with humans watching the whole thing from the sidelines clapping as metrics go up); the earlier part of the curve looks more like "every AGI lab researcher has access to a team of virtual engineers that work at 10x speed and sometimes make dumb mistakes" and then the earliest part of the curve is what we are seeing now with Copilot and ChatGPT helping engineers move slightly faster.

Ajeya Cotra

Interesting, I thought the biggest adjustment to your timelines was the pre-AGI R&D acceleration modelled by Davidson. That was another disagreement between us originally that ceased being a disagreement once you took that stuff into account.

These are entangled updates. If you're focusing on just "how can you accelerate ML R&D a bunch," then it seems less important to be able to handle low-feedback-loop environments quite different from the training environment. By far the biggest reason I thought we might need longer horizon training was to imbue the skill of efficiently learning very new things (see here).

Ajeya Cotra

Right now I'm stuck with a feeling that we agree qualitatively but disagree quantitatively.

I think this is basically right!

Ajeya Cotra

re: adversarial robustness: Same question I guess. My hot take would be (a) it's not actually that important, the way forward is not to never make errors in the first place but rather to notice and recover from them enough that the overall massive parallel society of LLM agents moves forward and makes progress, and (b) adversarial robustness is indeed improving. I'd be curious to hear more, perhaps you have data on how fast it is improving and you extrapolate the trend and think it'll still be sucky by e.g. 2030?

I'll give a less lengthy reply here, since structurally it's very similar to in-context learning, and has the same "agree-qualitatively-but-not-quantitatively" flavor. (For example, I definitely agree that the game is going to be coping with errors and error-correction, not never making errors; we're talking about whether that will take four years or more than four years.)

"Not behaving erratically / falling over on super weird or adversarial inputs" is a higher-level-of-abstraction cognitive skill humans are way better at than LLMs. LLMs are improving at this skill with scale (like all skills), and there are ways to address it with schlep and workflow rearrangements (like all problems), and it's unclear how important it is in the first place. But it's plausibly fairly important, and it seems like their current level is "not amazing," and the trend is super unclear but not obviously going to make it in four years.

In general, when you're talking about "Will it be four years from now or more than four years from now?", uncertainty and FUD on any point (in-context-learning, adversarial robustness, market-efficiency-and-schlep) pushes you toward "more than four years from now" — there's little room for it to push in the other direction.

Ege Erdil

In general, when you're talking about "Will it be four years from now or more than four years from now?", uncertainty and FUD on any point (in-context-learning, adversarial robustness, market-efficiency-and-schlep) pushes you toward "more than four years from now"

I'm curious why Ajeya thinks this claim is true for "four years" but not true for "twenty years" (assuming that's an accurate representation of her position, which I'm not too confident about).

Ajeya Cotra

I'm curious why Ajeya thinks this claim is true for "four years" but not true for "twenty years" (assuming that's an accurate representation of her position, which I'm not too confident about).

I don't think it's insane to believe this to be true of 20 years, but I think we have many more examples of big, difficult, society-wide things happening over 20 years than over 4.

Daniel Kokotajlo

Quick comment re: in-context learning and/or low-data learning: It seems to me that GPT-4 is already pretty good at coding, and a big part of accelerating AI R&D seems very much in reach -- like, it doesn't seem to me like there is a 10-year, 4-OOM-training-FLOP gap between GPT-4 and a system which is basically a remote-working OpenAI engineer that thinks at 10x serial speed. Even if the research scientists are still human, this would speed things up a lot I think. So while I find the abstract arguments about how LLMs are worse at in-context learning etc. than humans plausible, when I think concretely about AI R&D acceleration it still seems like it's gonna start happening pretty soon, and that makes me also update against the abstract argument a bit.

habryka

So, I kind of want to check an assumption. On a compute-focused worldview, I feel a bit confused about how having additional AI engineers helps that much. Like, maybe this is a bit of a strawman, but my vibe is that there hasn't really been much architectural innovation or algorithmic progress in the last few years, and the dominant speedup has come from pouring more compute into existing architectures (with some changes to deal with the scale, but not huge ones).

Daniel, could you be more concrete about how a 10x AI engineer actually helps develop AGI? My guess is on a 4-year timescale you don't expect it to route through semiconductor supply chain improvements.

And then I want to check what Ajeya thinks here and whether something in this space might be a bit of a crux. My model of Ajeya does indeed think that AI systems in the next few years will be impressive, but not really actually that useful for making AI R&D go better, or at least not like orders of magnitude better.

Ege Erdil

Like, maybe this is a bit of a strawman, but my vibe is that there hasn't really been much architectural innovation or algorithmic progress in the last few years, and the dominant speedup has come from pouring more compute into existing architectures (with some changes to deal with the scale, but not huge ones).

My best guess is that algorithmic progress has probably continued at a rate of around a doubling of effective compute per year, at least insofar as you buy that one-dimensional model of algorithmic progress. Again, model uncertainty is a significant part of my overall view about this, but I think it's not accurate to say there hasn't been much algorithmic progress in the last few years. It's just significantly slower than the pace at which we're scaling up compute so it looks relatively less impressive.

(Daniel, Ajeya +1 this comment)
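
To make the comparison concrete, here is a rough sketch of the one-dimensional model mentioned above, in which algorithmic progress acts as a multiplier on physical compute. The one-doubling-per-year algorithmic rate is Ege's best guess from the comment. The 4x/year physical compute growth is a purely hypothetical placeholder chosen for illustration, not a figure from this discussion, and the function name is likewise illustrative.

```python
import math


def effective_compute_ooms(years: float,
                           physical_growth_per_year: float,
                           algo_doublings_per_year: float = 1.0) -> dict:
    """Orders of magnitude (OOMs) of effective compute gained over `years`,
    split into physical-compute and algorithmic contributions."""
    physical = years * math.log10(physical_growth_per_year)
    algorithmic = years * algo_doublings_per_year * math.log10(2)
    return {
        "physical": physical,
        "algorithmic": algorithmic,
        "total": physical + algorithmic,
    }


# ASSUMPTION: 4x/year physical growth is a placeholder, not a sourced number.
result = effective_compute_ooms(years=4, physical_growth_per_year=4.0)
for name, ooms in result.items():
    print(f"{name}: {ooms:.2f} OOM")
```

Under these placeholder inputs, the algorithmic share is real but smaller than the physical share, which is the sense in which a doubling per year can look "relatively less impressive" next to compute scaling.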

habryka

I was modeling one doubling a year as approximately not very much, compared to all the other dynamics involved, though of course it matters a bunch over the long run.

Daniel Kokotajlo

Re: Habryka's excellent point about how maybe engineering isn't the bottleneck, maybe compute is instead:

My impression is that roughly half the progress has come from increased compute and the other half from better algorithms. Going forward when I think concretely about the various limitations of current algorithms and pathways to overcome them -- which I am hesitant to go into detail about -- it sure does seem like there are still plenty of low and medium-hanging fruit to pick, and then high-hanging fruit beyond which would take decades for human scientists to get to but which can perhaps be reached much faster during an AI takeoff.

I am on a capabilities team at OpenAI right now and I think that we could be going something like 10x faster if we had the remote engineer thing I mentioned earlier. And I think this would probably apply across most of OpenAI research. This wouldn't accelerate our compute acquisition much at all to be clear, but that won't stop a software singularity from happening. Davidson model backs this up I think -- I'd guess that if you magically change it to keep hardware & compute progress constant, you still get a rapid R&D acceleration, just a somewhat slower one.

I'd think differently if I thought that parameter count was just Too Damn Low, like I used to think. If I was more excited about the human brain size comparison, I might think that nothing short of 100T parameters (trained according to Chinchilla also) could be AGI, and therefore that even if we had a remote engineer thinking at 10x speed we'd just eat up the low-hanging fruit and then stall while we waited for bigger computers to come online. But I don't think that.

Ajeya Cotra

On a compute-focused worldview, I feel a bit confused about how having additional AI engineers helps that much. Like, maybe this is a bit of a strawman, but my vibe is that there hasn't really been much architectural innovation or algorithmic progress in the last few years, and the dominant speedup has come from pouring more compute into existing architectures (with some changes to deal with the scale, but not huge ones).

I think there haven't been flashy paradigm-shifting insights, but I strongly suspect each half-GPT was a hard-won effort on a lot of fronts, including both a lot of mundane architecture improvements (like implementing long contexts in less naive ways that don't incur quadratic cost), a lot of engineering to do the model parallelism and other BS that is required to train bigger models without taking huge GPU utilization hits, and a lot of post-training improvements to make usable nice products.

habryka

Ajeya: What you say seems right, but the things you say also don't sound like the kind of thing that when you accelerate them 10x, then you get AGI 10x earlier. As you said, a lot of BS required to train large models, a lot of productization, but that doesn't speed up the semiconductor supply chain.

The context length and GPU utilization thing feels most relevant.

Ajeya Cotra

Ajeya: What you say seems right, but the things you say also don't sound like the kind of thing that when you accelerate them 10x, then you get AGI 10x earlier. As you said, a lot of BS required to train large models, a lot of productization, but that doesn't speed up the semiconductor supply chain.

Yeah, TBC, I think there's a higher bar than Daniel thinks there is to speeding stuff up 10x for reasons like this. I do think that there's algorithm juice, like Daniel says, but I don't think that a system you look at and naively think "wow this is basically doing OAI ML engineer-like things" will actually lead to a full 10x speedup; 10x is a lot.

I think you will eventually get the 10x, and then the 100x, but I'm picturing that happening after some ramp-up where the first ML-engineer-like systems get integrated into workflows, improve themselves, change workflows to make better use of themselves, etc.

Ajeya Cotra

Quick comment re: in-context learning and/or low-data learning: It seems to me that GPT-4 is already pretty good at coding, and a big part of accelerating AI R&D seems very much in reach.

Agree this is the strongest candidate for crazy impacts soon, which is why my two updates of "maybe meta-learning isn't that important and therefore long horizon training isn't as plausibly necessary" and "maybe I should just be obsessed with forecasting when we have the ML-research-engineer-replacing system because after that point progress is very fast" are heavily entangled. (Daniel reacts "+1" to this)

-- like, it doesn't seem to me like there is a 10-year, 4-OOM-training-FLOP gap between GPT-4 and a system which is basically a remote OpenAI engineer that thinks at 10x serial speed

I don't know, 4 OOM is less than two GPTs, so we're talking less than GPT-6. Given how consistently I've been wrong about how well "impressive capabilities in the lab" will translate to "high economic value" since 2020, this seems roughly right to me?

Daniel Kokotajlo

I don't know, 4 OOM is less than two GPTs, so we're talking less than GPT-6. Given how consistently I've been wrong about how well "impressive capabilities in the lab" will translate to "high economic value" since 2020, this seems roughly right to me?

I disagree with this update -- I think the update should be "it takes a lot of schlep and time for the kinks to be worked out and for products to find market fit" rather than "the systems aren't actually capable of this." Like, I bet if AI progress stopped now, but people continued to make apps and widgets using fine-tunes of various GPTs, there would be OOMs more economic value being produced by AI in 2030 than today.

And so I think that the AI labs will be using AI remote engineers much sooner than the general economy will be. (Part of my view here is that around the time it is capable of being a remote engineer, the process of working out the kinks / pushing through schlep will itself be largely automatable.)

Ege Erdil

Like, I bet if AI progress stopped now, but people continued to make apps and widgets using fine-tunes of various GPTs, there would be OOMs more economic value being produced by AI in 2030 than today.

I'm skeptical we would get 2 OOMs or more with just the current capabilities of AI systems, but I think even if you accept that, scaling from $1B/yr to $100B/yr is easier than from $100B/yr to $10T/yr. Accelerating AI R&D by 2x seems more like the second change to me, or even bigger than that.

Ajeya Cotra

And so I think that the AI labs will be using AI remote engineers much sooner than the general economy will be. (Part of my view here is that around the time it is capable of being a remote engineer, the process of working out the kinks / pushing through schlep will itself be largely automatable.)

I agree with this

Daniel Kokotajlo

Yeah idk I pulled that out of my ass, maybe 2 OOM is more like the upper limit given how much value there already is. I agree that going from X to 10X is easier than going from 10X to 100X, in general. I don't think that undermines my point though. I disagree with your claim that making AI progress go 2x faster is more like scaling from $100B to $10T-- I think it depends on when it happens! Right now in our state of massive overhang and low-hanging-fruit everywhere, making AI progress go 2x faster is easy.

Also to clarify when I said 10x faster I meant 10x faster algorithmic progress; compute progress won't accelerate by 10x obviously. But what this means is that I think we'll transition from a world where half or more of the progress is coming from scaling compute, to a world where most of the progress is coming from algorithmic improvements / pushing-through-schlep.

Do we expect transformative AI pre-overhang or post-overhang?

habryka

I think a hypothesis I have for a possible crux for a lot of the disagreement between Daniel and Ajeya is something like "will we reach AGI before the compute overhang is over vs. after?".

Like, inasmuch as we think we are in a compute-overhang situation, there is an extremization that applies to people's timelines where if you think we'll get there using just remaining capital and compute, you expect quite short timelines, but if you expect it will require faster chips or substantial algorithmic improvements, you expect longer, and with less probability-mass in-between.

Curious about Daniel and Ajeya answering the question of "what probability do you assign to AGI before we exhausted the current compute overhang vs. after?"

Ajeya Cotra

"what probability do you assign to AGI before we exhausted the current compute overhang vs. after?"

I think there are different extremities of compute overhang. The most extreme one which will be exhausted most quickly is like "previously these companies were training AI systems on what is essentially chump change, and now we're starting to get into a world where it's real money, and soon it will be really real money." I think within 3-4 years we'll be talking tens of billions for a training run; I think the probability we get drop-in replacements for 99% remotable jobs (regardless of whether we've rolled those drop-in replacements out everywhere) by then is something like...25%?

And then after that progress is still pretty compute-centric, but it moves slower because you're spending very real amounts of money, and you're impacting the entire supply chain: you need to build more datacenters which come with new engineering challenges, more chip-printing facilities, more fabs, more fab equipment manufacturing plants, etc.

Daniel Kokotajlo

re: Habryka: Yes we disagree about whether the current overhang is enough. But the cruxes for this are the things we are already discussing.

habryka

re: Habryka: Yes we disagree about whether the current overhang is enough. But the cruxes for this are the things we are already discussing.

Cool, that makes sense. That does seem like it might exaggerate the perceived disagreements between the two of you, when you just look at the graphs, though it's of course still highly decision-relevant to dig deeper into whether this is true or not.

Hofstadter's law in AGI forecasting

Ajeya Cotra

TBC Daniel, I think we differ by a factor of 2 on the probability for your median scenario. I feel like a general structure of our disagreements has been like: you (Daniel) are saying a scenario that makes sense and which I place a lot of weight on, but it seems like there are other scenarios and it seems like your whole timetable leaves little room for Hofstadter's law.

Ege Erdil

I feel like a general structure of our disagreements has been like: you (Daniel) are saying a scenario that makes sense and which I place a lot of weight on, but it seems like there are other scenarios and it seems like your whole timetable leaves little room for Hofstadter's law.

I think this also applies to the disagreement between me and Ajeya.

Daniel Kokotajlo

A thing that would change my mind is if I found other scenarios more plausible. Wanna sketch some?

Regarding Hofstadter's law: A possible crux between us is that you both seem to think it applies on timescales of decades -- a multiplicative factor on timelines -- whereas I think it's more like "add three years." Right?

Ege Erdil

Re: Hofstadter's law: A possible crux between us is that you both seem to think it applies on timescales of decades -- a multiplicative factor on timelines -- whereas I think it's more like "add three years." Right?

Yes, in general, that's how I would update my timelines about anything to be longer, not just AGI. The additive method seems pretty bad to me unless you have some strong domain-specific reason to think you should be making an additive update.

Daniel Kokotajlo

Yes, in general, that's how I would update my timelines about anything to be longer, not just AGI. The additive method seems pretty bad to me unless you have some strong domain-specific reason to think you should be making an additive update.

Excellent. So my reason for doing the additive method is that I think Hofstadter's law / schlep / etc. is basically the planning fallacy, and it applies when your forecast is based primarily on imagining a series of steps being implemented. It does NOT apply when your forecast is based primarily on extrapolating trends. Like, you wouldn't look at a graph of exponential progress in Moore's law or solar power or whatever and then be like "but to account for Hofstadter's Law I will assume things take twice as long as I expect, therefore instead of extrapolating the trend-line straight I will cut its slope by half."

And when it comes to AGI timelines, I think that the shorter-timeline scenarios look more subject to the planning fallacy, whereas the longer-timeline scenarios look more like extrapolating trends.

So in a sense I'm doing the multiplicative method, but only on the shorter worlds. Like, when I say 2027 as my median, that's kinda because I can actually quite easily see it happening in 2025, but things take longer than I expect, so I double it... I'm open to being convinced that I'm not taking this into account enough and should shift my timelines back a few years more; however I find it very implausible that I should add e.g. 15 years to my median because of this.
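
The two ways of applying Hofstadter's law discussed above can be sketched as follows. This is only an illustration: the function names are hypothetical, the factor of 2 and the "add three years" figure come from the comments above, and the reference year of 2023 is an assumption inferred from the 2027 median being roughly four years out.

```python
def multiplicative_adjustment(now: int, naive_year: int, factor: float = 2.0) -> float:
    """Stretch the time remaining until the naive forecast by `factor`."""
    return now + factor * (naive_year - now)


def additive_adjustment(naive_year: int, extra_years: float = 3.0) -> float:
    """Push the naive forecast back by a fixed number of years."""
    return naive_year + extra_years


NOW = 2023  # ASSUMPTION: inferred reference year for the dialogue

# Short horizon: the two methods land close together.
short_mult = multiplicative_adjustment(NOW, 2025)   # doubling a 2-year horizon
short_add = additive_adjustment(2025)

# Long horizon (2043 is an arbitrary illustrative year): they diverge sharply.
long_mult = multiplicative_adjustment(NOW, 2043)
long_add = additive_adjustment(2043)

print(short_mult, short_add, long_mult, long_add)
```

The gap between the two methods grows with the length of the naive horizon, which is why the choice between them matters much more for longer-timeline scenarios than for shorter ones.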

Summary of where we are at so far and exploring additional directions

habryka

We've been going for a while and it might make sense to take a short step back. Let me try to summarize where we are at:

We've been mostly focusing on the disagreement between Ajeya and Daniel. It seems like one core theme in the discussion has been the degree to which "reality has a lot of detail and kinks need to be figured out before AI systems are actually useful". Ajeya currently thinks that while it is true that AGI companies will have access to these tools earlier, there still will be a lot of stuff to figure out before you actually have a system equivalent to a current OAI engineer. Daniel made a similar update in noticing a larger-than-he-expected delay in the transition from "having all the stuff necessary to make a more capable system, like architecture, compute, training setup" to "actually producing a more capable system".

However, it's also not clear how much this actually explains the differences in the timelines for the two of you.

We briefly touched on compute overhangs being a thing that's very relevant to both of your distributions, in that Daniel assigns substantially higher probability to a very high R&D speed-up before the current overhang is exhausted, which pushes his probability mass a bunch earlier. And correspondingly Ajeya's timelines are pretty sensitive to relatively small changes in compute requirements on the margin, since that would push a bunch of probability mass into the pre-overhang world.

Ajeya Cotra

I'll put in a meta note here that I think it's pretty challenging to argue about a 25% vs a 50% on the Daniel scenario; that is literally one bit of evidence one of us sees that the other doesn't. It seems like Daniel thinks I need stronger arguments/evidence than I have to be at 25% instead of 50%, but it's easy to find one bit somewhere and hard to argue about whether it really is one bit.
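
One way to read "one bit" here is as the base-2 log of the ratio between the two probabilities. The sketch below computes that, and also the log of the odds ratio, which is the quantity a Bayesian update with a likelihood ratio would use. Which reading is intended is an assumption on my part, and the function names are illustrative.

```python
import math


def bits_from_probability_ratio(p_low: float, p_high: float) -> float:
    """log2 of the ratio of the two probabilities."""
    return math.log2(p_high / p_low)


def odds(p: float) -> float:
    """Convert a probability to odds in favor."""
    return p / (1.0 - p)


def bits_from_odds_ratio(p_low: float, p_high: float) -> float:
    """log2 of the likelihood ratio needed to move from p_low to p_high."""
    return math.log2(odds(p_high) / odds(p_low))


prob_bits = bits_from_probability_ratio(0.25, 0.50)
odds_bits = bits_from_odds_ratio(0.25, 0.50)
print(f"probability ratio: {prob_bits:.2f} bits, odds ratio: {odds_bits:.2f} bits")
```

On either reading the gap is on the order of one to two bits, which supports the point that it is a small amount of evidence to argue over.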

Exploring conversational directions

Daniel Kokotajlo

In case you're interested, here are some possible conversation topics/starters:

(1) I could give a scenario in which AGI happens by some very soon date, e.g. December 2024 or 2026, and then we could talk about what parts of the scenario are most unlikely (~= what parts would cause the biggest updates to us if we observed them happening)

(2) Someone without secrecy concerns (i.e. someone not working at OpenAI, i.e. Ajeya or Ege or Habryka) could sketch what they think they would aim to have built by 2030 if they were in charge of a major AI lab and were gunning for AGI asap. Parameter count, training FLOP, etc. taken from standard projections, but then more details like what the training process and data would look like etc. Then we could argue about what this system would be capable of and what it would be incapable of, e.g. how fast would it speed up AI R&D compared to today.

(2.5) As above except for convenience we use Steinhardt's What will GPT-2030 look like? and factor the discussion into (a) will GPT-2030 be capable of the things he claims it will be capable of, and (b) will that cause a rapid acceleration of AI R&D leading shortly to AGI?

(3) Ege or Ajeya could sketch a scenario in which the year 2035 comes and goes without AGI, despite there being no AI progress slowdown (no ban, no heavy regulation, no disruptive war, etc.). Then I could say why I think such a scenario is implausible, and we could discuss more generally what that world looks like.

Ajeya Cotra

On Daniel's four topics:

(1) I could give a scenario in which AGI happens by some very soon date, e.g. December 2024 or 2026, and then we could talk about what parts of the scenario are most unlikely (~= what parts would cause the biggest updates to us if we observed them happening)

I suspect I'll be like "Yep, seems plausible, and my probability on it coming to pass is 2-5x smaller."

(2) Someone without secrecy concerns (i.e. someone not working at OpenAI, i.e. Ajeya or Ege or Habryka) could sketch what they think they would aim to have built by 2030 if they were in charge of a major AI lab and were gunning for AGI asap. Parameter count, training FLOP, etc. taken from standard projections, but then more details like what the training process and data would look like etc. Then we could argue about what this system would be capable of and what it would be incapable of, e.g. how fast would it speed up AI R&D compared to today.

I could do this if people thought it would be useful.

(2.5) As above except for convenience we use Steinhardt's What will GPT-2030 look like? and factor the discussion into (a) will GPT-2030 be capable of the things he claims it will be capable of, and (b) will that cause a rapid acceleration of AI R&D leading shortly to AGI?

I like this blog post but I feel like it's quite tame compared to what both Daniel and I think is plausible so not sure if it's the best thing to anchor on.

(3) Ege or Ajeya could sketch a scenario in which the year 2035 comes and goes without AGI, despite there being no AI progress slowdown (no ban, no heavy regulation, no disruptive war, etc.). Then I could say why I think such a scenario is implausible, and we could discuss more generally what that world looks like.

I can do this if people thought it would be useful.

Ege's median world

Ege Erdil

My median world looks something like this: we keep scaling compute until we hit training runs at a size of 1e28 to 1e30 FLOP in maybe 5 to 10 years, and after that scaling becomes increasingly difficult because of us running up against supply constraints. Software progress continues but slows down along with compute scaling. However, the overall economic impact of AI continues to grow: we have individual AI labs in 10 years that might be doing on the order of e.g. $30B/yr in revenue.

We also get more impressive capabilities: maybe AI systems can get gold on the IMO in five years, we get more reliable image generation, GPT-N can handle more complicated kinds of coding tasks without making mistakes, stuff like that. So in 10 years AI systems are just pretty valuable economically, but I expect the AI industry to look more like today's tech industry - valuable but not economically transformative.

This is mostly because I don't expect just putting 1e30 FLOP of training compute into a system will be enough to get AI systems that can substitute for humans on most or all tasks of the economy. However, I would not be surprised by a mild acceleration of overall economic growth driven by the impact of AI.
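
For a sense of scale, the scenario above (training runs of 1e28 to 1e30 FLOP in 5 to 10 years) implies a particular annual growth rate in training compute once a starting point is fixed. The sketch below is hedged accordingly: the 1e25 FLOP starting value is a hypothetical placeholder chosen for round numbers, not a figure from this discussion, and the function name is illustrative.

```python
def implied_annual_growth(start_flop: float, target_flop: float, years: float) -> float:
    """Constant yearly multiplier needed to go from start_flop to target_flop."""
    return (target_flop / start_flop) ** (1.0 / years)


# ASSUMPTION: placeholder starting point for today's largest training run.
HYPOTHETICAL_START_FLOP = 1e25

# The two ends of the range quoted above.
fast = implied_annual_growth(HYPOTHETICAL_START_FLOP, 1e28, years=5)
slow = implied_annual_growth(HYPOTHETICAL_START_FLOP, 1e30, years=10)

print(f"1e28 in 5 years:  {fast:.2f}x per year")
print(f"1e30 in 10 years: {slow:.2f}x per year")
```

Changing the placeholder shifts both numbers, so the output should be read as a way to compare the two ends of the range rather than as a forecast.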

Ajeya Cotra

This is mostly because I don't expect just putting 1e30 FLOP of training compute into a system will be enough to get AI systems that can substitute for humans on most or all tasks of the economy.

To check, do you think that having perfect ems of some productive human would be transformative, a la the Duplicator?

If so, what is the main reason you don't think a sufficiently bigger training run would lead to something of that level of impact? Is this related to the savannah-to-boardroom generalization / human-level learning-of-new things point I raised previously?

Ege Erdil

To check, do you think that having perfect ems of some productive human would be transformative, a la the Duplicator?

Eventually, yes, but even there I expect substantial amounts of delay (median of a few years, maybe as long as a decade) because people won't immediately start using the technology.

If so, what is the main reason you don't think a sufficiently bigger training run would lead to something of that level of impact? Is this related to the savannah-to-boardroom generalization / human-level learning-of-new things point I raised previously?

I think that's an important part of it, yes. I expect the systems we'll have in 10 years will be really good at some things with some bizarre failure modes and domains where they lack competence. My example of GPT-4 not being able to play tic-tac-toe is rather anecdotal, but I would worry about other things of a similar nature when we actually want these systems to replace humans throughout the economy.

Daniel Kokotajlo

Eventually, yes, but even there I expect substantial amounts of delay (median of a few years, maybe as long as a decade) because people won't immediately start using the technology.

Interestingly, I think in the case of ems this is more plausible than in the case of normal AGI, because normal AGI will be more easily extendible to superhuman levels.

Ajeya Cotra

FWIW I think the kind of AGI you and I are imagining as the most plausible first AGI is pretty janky, and the main way I see it improving stuff is by doing normal ML R&D, not galaxy-brained "editing its own source code by hand" stuff. The normal AI R&D could be done by all the ems too.

(It depends on where the AI is at when you imagine dropping ems into the scenario.)

Daniel Kokotajlo

I agree with that. The jankiness is a point in my favor, because it means there's lots of room to grow by ironing out the kinks.

Daniel Kokotajlo

Overall Ege, thanks for writing that scenario! Here are some questions / requests for elaboration:

(1) So in your median world, when do we finally get to AGI, and what changes between 2030 and then that accounts for the difference?

(2) I take it that in this scenario, despite getting IMO gold etc. the systems of 2030 are not able to do the work of today's OAI engineer? Just clarifying. Can you say more about what goes wrong when you try to use them in such a role? Or do you think that AI R&D will indeed benefit from automated engineers, but that AI progress will be bottlenecked on compute or data or insights or something that won't be accelerating?

(3) What about AI takeover? Suppose an AI lab in 2030, in your median scenario, "goes rogue" and decides "fuck it, let's just deliberately make an unaligned powerseeking AGI and then secretly put it in charge of the whole company." What happens then?

Ege Erdil

(1) So in your median world, when do we finally get to AGI, and what changes between 2030 and then that accounts for the difference?

(2) I take it that in this scenario, despite getting IMO gold etc. the systems of 2030 are not able to do the work of today's remote OAI engineer? Just clarifying. Can you say more about what goes wrong when you try to use them in such a role? Or do you think that AI R&D will indeed benefit from automated engineers, but that AI progress will be bottlenecked on compute or data or insights or something that won't be accelerating?

(3) What about AI takeover? Suppose an AI lab in 2030, in your median scenario, "goes rogue" and decides "fuck it, let's just deliberately make an unaligned powerseeking AGI and then secretly put it in charge of the whole company." What happens then?

(1): I'm sufficiently uncertain about this that I don't expect my median world to be particularly representative of the range of outcomes I consider plausible, especially when it comes to giving a date. What I expect to happen is a boring process of engineering that gradually irons out the kinks of the systems, gradual hardware progress allowing bigger training runs, better algorithms allowing for better in-context learning, and many other similar things. As this continues, I expect to see AIs substituting for humans on more and more tasks in the economy, until at some point AIs become superior to humans across the board.

(2): AI R&D will benefit from AI systems, but they won't automate everything an engineer can do. I think when you try to use the systems in practical situations, they might lose coherence over long chains of thought, or be unable to effectively debug non-performant complex code, or not be able to have as good intuitions about which research directions would be promising, et cetera. In 10 years I fully expect many people in the economy to substantially benefit from AI systems, and AI engineers probably more than most.

(3): I don't think anything notable would happen. I don't believe the AI systems of 2030 will be capable enough to manage an AI lab.

Ajeya Cotra

I think Ege's median world is plausible, just like Daniel's median world; I think my probability on "Ege world or more chill than that" is lower than my probability on "Daniel world or less chill than that." Earlier I said 25% on Daniel-or-crazier, I think I'm at 15% on Ege-or-less-crazy.

Daniel Kokotajlo

Re: the "fuck it" scenario: What I'm interested in here is what skills you think the system would be lacking, that would make it fail. Like right now for example we had a baby version of this with ChaosGPT4, which lacked strategic judgment and also had a very high mistakes-to-ability-to-recover-from-mistakes ratio, and also started from a bad position (being constantly monitored, zero human allies). So all it did was make some hilarious tweets and get shut down.

Ajeya Cotra

Ege, do you think you'd update if you saw a demonstration of sophisticated sample-efficient in-context learning and far-off-distribution transfer?

E.g. suppose some AI system was trained to learn new video games: each RL episode was it being shown a video game it had never seen, and it's supposed to try to play it; its reward is the score it gets. Then after training this system, you show it a whole new type of video game it has never seen (maybe it was trained on platformers and point-and-click adventures and visual novels, and now you show it a first-person-shooter for the first time). Suppose it could get decent at the first-person-shooter after like a subjective hour of messing around with it. If you saw that demo in 2025, how would that update your timelines?

Ege Erdil

Ege, do you think you'd update if you saw a demonstration of sophisticated sample-efficient in-context learning and far-off-distribution transfer?

Yes.

Suppose it could get decent at the first-person-shooter after like a subjective hour of messing around with it. If you saw that demo in 2025, how would that update your timelines?

I would probably update substantially towards agreeing with you.

Daniel Kokotajlo

(1): I'm sufficiently uncertain about this that I don't expect my median world to be particularly representative of the range of outcomes I consider plausible, especially when it comes to giving a date. What I expect to happen is a boring process of engineering which gradually irons out the kinks of the systems, gradual hardware progress allowing bigger training runs, better algorithms allowing for better in-context learning, and many other similar things. As this continues, I expect to see AIs substituting for humans on more and more tasks in the economy, until at some point AIs become superior to humans across the board.

Your median is post-2060 though. So I feel like you need to justify why this boring process of engineering is going to take 30 more years after 2030. Why 30 years and not 300? Indeed, why not 3?

Daniel Kokotajlo

(2): AI R&D will benefit from AI systems, but they won't automate everything an engineer can do. I think when you try to use the systems in practical situations, they might lose coherence over long chains of thought, or be unable to effectively debug non-performant complex code, or not be able to have as good intuitions about which research directions would be promising, et cetera. In 10 years I fully expect many people in the economy to substantially benefit from AI systems, and AI engineers probably more than most.

How much do you think they'll be automating/speeding things up? Can you give an example of a coding task such that, if AIs can do that coding task by, say, 2025, you'll update significantly towards shorter timelines, on the grounds that they are by 2025 doing things you didn't expect to be doable by 2030?

(My position is that all of these deficiencies exist in current systems but (a) will rapidly diminish over the next few years and (b) aren't strong blockers to progress anyway, e.g. even if they don't have good research taste they can still speed things up substantially just by doing the engineering and cutting through the schlep)

Ege Erdil

Your median is post-2060 though. So I feel like you need to justify why this boring process of engineering is going to take 30 more years after 2030. Why 30 years and not 300? Indeed, why not 3?

I don't think it's going to take ~30 (really 40 per the distribution I submitted) years after 2030, that's just my median. I think there's a 1/3 chance it takes more than 75 and 1/5 chance it takes more than 175.

If you're asking me to justify why my median is around 2065, I think this is not really that easy to do as I'm essentially just expressing the betting odds I would accept based on intuition.

Formalizing it is tricky, but I think I could say I don't find it that plausible the problem of building AI is so hard that we won't be able to do it even after 300 years of hardware and software progress. Just the massive scaling up of compute we could get from hardware progress and economic growth over that kind of timescale would enable things that look pretty infeasible over the next 20 or 30 years.

Far-off-distribution transfer

habryka

The Ege/Ajeya point about far-off-distribution transfer seems like an interesting maybe-crux, so let's go into that for a bit.

My guess is Ajeya has pretty high probability that that kind of distribution transfer will happen within the next few years and very likely the next decade?

Ajeya Cotra

Yeah, FWIW I think the savannah-to-boardroom transfer stuff is probably underlying past-Eliezer (not sure about current Eliezer) and also a lot of "stochastic parrot"-style skeptics. I think it's a good point under-discussed by the short timelines crowd, though I don't think it's decisive.

Ajeya Cotra

My guess is Ajeya has pretty high probability that that kind of distribution transfer will happen within the next few years and very likely the next decade?

Actually I'm pretty unsure, and slightly lean toward no. I just think it'll take a lot of hard work to make up for the weaknesses of not having transfer this good. Paul has a good unpublished Google doc titled "Doing without transfer." I think by the time systems are transformative enough to massively accelerate AI R&D, they will still not be that close to savannah-to-boardroom level transfer, but it will be fine because they will be trained on exactly what we wanted them to do for us. (This btw also underlies some lower-risk-level intuitions I have relative to MIRI crowd.)

habryka

Actually I'm pretty unsure, and slightly lean toward no.

Oh, huh, that is really surprising to me. But good to have that clarified.

Ajeya Cotra

Yeah, I just think the way we get our OAI-engineer-replacing-thingie is going to be radically different cognitively than human OAI-engineers, in that it will have coding instincts honed through ancestral memory the way grizzly bears have salmon-catching instincts baked into them through their ancestral memory. For example, if you give it a body, I don't think it'd learn super quickly to catch antelope in the savannah, the way a baby human caveperson could learn to code if you transported them to today.

But it's salient to me that this might just leave a bunch of awkward gaps, since we're trying to make do with systems holistically less intelligent than humans, but just more specialized to coding, writing, and so on. This is why I think the Ege world is plausible.

I also dislike using the term AGI for this reason. (Or rather, I think there is a thing people have in mind by AGI which makes sense, but it will come deep into the Singularity, after the earlier transformative AI systems that are not AGI-in-this-sense.)

Ege Erdil

I also dislike using the term AGI for this reason.

In my median world, the term "AGI" also becomes increasingly meaningless because different ways people have operationalized criteria for what counts as AGI and what doesn't begin to come apart. For example, we have AIs that can pass the Turing test for casual conversation (even if judges can ask about recent events), but these AIs can't be plugged in to do an ordinary job in the economy.

Ajeya Cotra

In my median world, the term "AGI" also becomes increasingly meaningless because different ways people have operationalized criteria for what counts as AGI and what doesn't begin to come apart. For example, we have AIs that can pass the Turing test for casual conversation (even if judges can ask about recent events), but these AIs can't be plugged in to do an ordinary job in the economy.

Yes, I'm very sympathetic to this kind of thing, which is why I like TAI (and it's related to the fact that I think we'll first have grizzly-bears-of-coding, not generally-intelligent-beings). But it bites much less in my view because it's all much more compressed and there's a pretty shortish period of a few years where all plausible things people could mean by AGI are achieved, including the algorithm that has savannah-to-boardroom-level transfer.

A concrete scenario & where its surprises are

Daniel Kokotajlo

We can delete this hook later if no one bites, but in case someone does, here's a scenario I think it would be productive to discuss:

(1) Q1 2024: A bigger, better model than GPT-4 is released by some lab. It's multimodal; it can take a screenshot as input and output not just tokens but keystrokes and mouseclicks and images. Just like with GPT-4 vs. GPT-3.5 vs. GPT-3, it turns out to have new emergent capabilities. Everything GPT-4 can do, it can do better, but there are also some qualitatively new things that it can do (though not super reliably) that GPT-4 couldn't do.

(2) Q3 2024: Said model is fine-tuned to be an agent. It was already better at being strapped into an AutoGPT harness than GPT-4 was, so it was already useful for some things, but now it's being trained on tons of data to be a general-purpose assistant agent. Lots of people are raving about it. It's like another ChatGPT moment; people are using it for all the things they used ChatGPT for but then also a bunch more stuff. Unlike ChatGPT you can just leave it running in the background, working away at some problem or task for you. It can write docs and edit them and fact-check them; it can write code and then debug it.

(3) Q1 2025: Same as (1) all over again: An even bigger model, even better. Also it's not just AutoGPT harness now, it's some more sophisticated harness that someone invented. Also it's good enough to play board games and some video games decently on the first try.

(4) Q3 2025: OK now things are getting serious. The kinks have generally been worked out. This newer model is being continually trained on oodles of data from a huge base of customers; they have it do all sorts of tasks and it tries and sometimes fails and sometimes succeeds and is trained to succeed more often. Gradually the set of tasks it can do reliably expands, over the course of a few months. It doesn't seem to top out; progress is sorta continuous now -- even as the new year comes, there's no plateauing, the system just keeps learning new skills as the training data accumulates. Now many millions of people are basically treating it like a coworker and virtual assistant. People are giving it their passwords and such and letting it handle life admin tasks for them, help with shopping, etc. and of course quite a lot of code is being written by it. Researchers at big AGI labs swear by it, and rumor is that the next version of the system, which is already beginning training, won't be released to the public because the lab won't want their competitors to have access to it. Already there are claims that typical researchers and engineers at AGI labs are approximately doubled in productivity, because they mostly have to just oversee and manage and debug the lightning-fast labor of their AI assistant. And it's continually getting better at doing said debugging itself.

(5) Q1 2026: The next version comes online. It is released, but it refuses to help with ML research. Leaks indicate that it doesn't refuse to help with ML research internally, and in fact is heavily automating the process at its parent corporation. It's basically doing all the work by itself; the humans are basically just watching the metrics go up and making suggestions and trying to understand the new experiments it's running and architectures it's proposing.

(6) Q3 2026: Superintelligent AGI happens, by whatever definition is your favorite. And you see it with your own eyes.

Question: Suppose this scenario happens. What does your credence in "AGI by 2027" look like at each of the 6 stages? E.g. what are the biggest updates, and why?

My own first-pass unconfident answer is:
0 -- 50%
1 -- 50%
2 -- 65%
3 -- 70%
4 -- 90%
5 -- 95%
6 -- 100%

Ajeya Cotra

(3) Q1 2025: Same as (1) all over again: An even bigger model, even better. Also it's not just AutoGPT harness now, it's some more sophisticated harness that someone invented. Also it's good enough to play board games and some video games decently on the first try.

I don't know how much I care about this (not zero), but I think someone with Ege's views should care a lot about how it was trained. Was it trained on a whole bunch of very similar board games and video games? How much of a distance of transfer is this, if savannah to boardroom is 100?

Ege Erdil

FWIW I interpreted this literally: we have some bigger model like chatgpt that can play some games decently on the first try, and conditional on (2) my median world has those games being mostly stuff similar to what it's seen before

so i'm not assuming much evidence of transfer from (2), only some mild amount

habryka

Yeah, let's briefly have people try to give probability estimates here, though my model of Ege feels like the first few stages have a ton of ambiguity in their operationalization, which will make it hard to answer in concrete probabilities.

Ajeya Cotra

+1, I also find the ambiguity makes answering this hard

I'll wait for Ege to answer first.

Ege Erdil

Re: Daniel, according to my best interpretation of his steps:

0 -- 6%
1 -- 6%
2 -- 12%
3 -- 15%
4 -- 30%
5 -- 95%
6 -- 100%

Ajeya Cotra

Okay here's my answer:

0 -- 20%
1 -- 28%
2 -- 37%
3 -- 50%
4 -- 75%
5 -- 87%
6 -- 100%

My updates are spread out pretty evenly because the whole scenario seems qualitatively quite plausible and most of my uncertainty is simply whether it will take more scale or more schlep at each stage than is laid out here (including stuff like making it more reliable for a combo of PR and regulation and usable-product reasons).
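As a rough illustration of Daniel's question about where the biggest updates fall, the three sets of credences given above can be compared stage by stage. This is a minimal Python sketch: the numbers are copied from the answers above, while the helper names (`stage_updates`, `biggest_update`) and the choice to measure an update in percentage points (rather than, say, log-odds) are assumptions made purely for illustration.

```python
# Credences (in %) in "AGI by 2027" at each stage of Daniel's scenario,
# copied from the three answers above. Index 0 is the prior before stage 1.
credences = {
    "Daniel": [50, 50, 65, 70, 90, 95, 100],
    "Ege": [6, 6, 12, 15, 30, 95, 100],
    "Ajeya": [20, 28, 37, 50, 75, 87, 100],
}


def stage_updates(values):
    """Percentage-point change on reaching each of stages 1..6 (an assumed measure)."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def biggest_update(values):
    """Return (stage, size) of the largest single percentage-point update."""
    updates = stage_updates(values)
    size = max(updates)
    return updates.index(size) + 1, size


for name, values in credences.items():
    stage, size = biggest_update(values)
    print(f"{name}: updates {stage_updates(values)}, biggest at stage {stage} (+{size} points)")
```

Measured this way, Ege's largest update falls at stage 5, while Daniel's and Ajeya's largest updates both fall at stage 4. A different measure of update size could rank the stages differently.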

Daniel Kokotajlo

Thanks both! I am excited about this for a few reasons. One, I think it might help to focus the discussion on the parts of the story that are biggest updates for you (and also on the parts that are importantly ambiguous! I'm curious to hear about those!), and two, as the next three years unfold, we'll be able to compare what happens to this scenario.

Ege Erdil

unfortunately i think the scenarios are vague enough that as a practical matter it will be tricky to adjudicate or decide whether they've happened or not

Daniel Kokotajlo

I agree, but I still think it's worthwhile to do this. Also this was just a hastily written scenario, I'd love to improve it and make it more precise, and I'm all ears for suggestions!

Ajeya Cotra

Ege, I'm surprised you're at 95% at stage 5, given that stage 5's description is just that AI is doing a lot of AI R&D and leaks suggest it's going fast. If your previous timelines were several decades, then I'd think even with non-god-like AI systems speeding up R&D it should take like a decade?

Ege Erdil

I think once you're at step 5 it's overwhelmingly likely that you already have AGI. The key sentence for me is "it's basically doing all the work by itself" - I have a hard time imagining worlds where an AI can do basically all of the work of running an AI lab by itself but AGI has still not been achieved.

If the AI's role is more limited than this, then my update from 4 to 5 would be much smaller.

Ajeya Cotra

I thought Daniel said it was doing all the ML R&D by itself, and the humans were managing it (the AIs are in the role of ICs and the humans are in the role of managers at a tech company). I don't think it's obvious that just because some AI systems can pretty autonomously do ML R&D, they can pretty autonomously do everything, and I would have expected your view to agree with me more there. Though maybe you think that if it's doing ML R&D autonomously, it must have intense transfer / in-context-learning and so it's almost definitely across-the-board superhuman?

Ege Erdil

If it's only doing the R&D then I would be lower than 95%, and the exact probability I give for AGI just depends on what that is supposed to mean. That's an important ambiguity in the operationalization Daniel gives, in my opinion.

In particular, if you have a system that can somehow basically automate AI R&D but is unable to take over the other tasks involved in running an AI lab, that's something I don't expect and would push me far below the 95% forecast I provided above. In this case, I might only update upwards by some small amount based on (4) -> (5), or maybe not at all.

Overall summary, takeaways and next steps

habryka

Here is a summary of the discussion so far:

Daniel made an argument against Hofstadter's law for trend extrapolation and we discussed the validity of that for a bit.

A key thing that has come up as an interesting crux/observation is that Ege and Ajeya both don't expect a massive increase in transfer learning ability in the next few years. For Ege this matters a lot because it's one of the top reasons why AI will not speed up the economy and AI development that much. Ajeya thinks we can probably speed up AI R&D anyways by making grizzly-bear-like-AI that doesn't have transfer as good as humans, but is just really good at ML engineering and AI R&D because it was directly trained to be.

This makes observing substantial transfer learning a pretty relevant crux for Ege and Ajeya in the next few years/decades. Ege says he'd have timelines more similar to Ajeya's if he observed this.

Daniel and Ajeya both think that the most plausible scenario is grizzly-bear-like AI with subhuman transfer but human-level or superhuman ML engineering skills, but while Daniel thinks it'll be relatively fast to work with the grizzly-bear-AIs to massively accelerate R&D, Ajeya thinks that the lower-than-human level "general intelligence" / "transfer" will be a hindrance in a number of little ways, making her think it's plausible we'll need bigger models and/or more schlep. If Ajeya saw extreme transfer work out, she'd update more toward thinking everything will be fast and easy, and thus have Daniel-like timelines (even though Daniel himself doesn't consider extreme transfer to be a crux for him.)

Daniel and Ege tried to elicit what concretely Ege expects to happen over the coming decades when AI progress continues but doesn't end up that transformative. Ege expects that AI will have a large effect on the economy, but assigns a substantial amount of probability to persistent deficiencies that prevent it from fully automating AI R&D or very substantially accelerating semiconductor progress.

(Ajeya, Daniel and Ege all thumbs-up this summary)

Ajeya Cotra

Okay thanks everyone, heading out!

habryka

Thank you Ajeya!

Daniel Kokotajlo

Yes, thanks Ajeya, Ege and Oliver! Super fun.

habryka

Thinking about future discussions on this topic, I think putting probabilities on the scenario that Daniel outlined was a bit hard given the limited time we had, but I quite like the idea of doing a more parallelized and symmetric version of this kind of thing where multiple participants output a concrete sequence of events, and then have other people forecast how they would update on each of those observations, which does seem like a fun way to elicit disagreements and cruxes.

Daniel Kokotajlo

I had a nice conversation with Ege today over dinner, in which we identified a possible bet to make! Something I think will probably happen in the next 4 years, that Ege thinks will probably NOT happen in the next 15 years, such that if it happens in the next 4 years Ege will update towards my position and if it doesn't happen in the next 4 years I'll update towards Ege's position.

Drumroll...

I (DK) have lots of ideas for ML experiments, e.g. dangerous capabilities evals, e.g. simple experiments related to paraphrasers and so forth in the Faithful CoT agenda. But I'm a philosopher, I don't code myself. I know enough that if I had some ML engineers working for me that would be sufficient for my experiments to get built and run, but I can't do it by myself.

When will I be able to implement most of these ideas with the help of AI assistants basically substituting for ML engineers? So I'd still be designing the experiments and interpreting the results, but AutoGPT5 or whatever would be chatting with me and writing and debugging the code.

I think: Probably in the next 4 years. Ege thinks: probably not in the next 15.

Ege, is this an accurate summary?

Ege Erdil

Yes, this summary seems accurate.

Vasco Grilo

Thanks for the update, Daniel! How about the predictions about energy consumption?

In what year will the energy consumption of humanity or its descendants be 1000x greater than now?

Your median date for humanity's energy consumption being 1 k times as large as now is 2031, whereas Ege's is 2177. What is your median primary energy consumption in 2027 as reported by Our World in Data as a fraction of that in 2023? Assuming constant growth from 2023 until 2031, your median fraction would be 31.6 (= (10^3)^((2027 - 2023)/(2031 - 2023))). I would be happy to set up a bet where:

* I give you 10 k€ if the fraction is higher than 31.6.
* You give me 10 k€ if the fraction is lower than 31.6.

I would then use the 10 k€ to support animal welfare interventions.
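The 31.6 figure in the comment above can be reproduced with a few lines of Python. This is only a sketch of the constant-growth assumption stated in that comment; the function name `constant_growth_fraction` is hypothetical, and "constant growth" is read here as a constant exponential rate.

```python
def constant_growth_fraction(total_multiple, start_year, end_year, query_year):
    """Multiple reached by query_year, assuming growth at a constant exponential
    rate that starts at 1x in start_year and hits total_multiple in end_year."""
    elapsed_share = (query_year - start_year) / (end_year - start_year)
    return total_multiple ** elapsed_share


# 1000x by 2031, starting from 2023, evaluated at the halfway point in 2027.
fraction_2027 = constant_growth_fraction(1000, 2023, 2031, 2027)
print(round(fraction_2027, 1))  # 31.6

# The same assumption implies this growth factor every single year.
annual_factor = constant_growth_fraction(1000, 2023, 2031, 2024)
print(round(annual_factor, 2))  # 2.37
```

Because 2027 is halfway between 2023 and 2031, the fraction is simply the square root of 1000. The implied growth factor of roughly 2.37x per year shows how strong the constant-growth assumption is.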

Daniel Kokotajlo

To be clear, my view is that we'll achieve AGI around 2027, ASI within a year of that, and then some sort of crazy robot-powered self-replicating economy within, say, three years of that. So 1000x energy consumption around then or shortly thereafter (depends on the doubling time of the crazy superintelligence-designed-and-managed robot economy). So, the assumption of constant growth from 2023 to 2031 is very false, at least as a representation of my view. I think my median prediction for energy consumption in 2027 is the same as yours.

Vasco Grilo

Thanks, Daniel! [...] Is your median date of ASI as defined by Metaculus around 2028 July 1 (it would be if your time until AGI was strongly correlated with your time from AGI to ASI)? If so, I am open to a bet where:

* I give you 10 k€ if ASI happens by the end of 2028 (slightly after your median, such that you have a positive expected monetary gain).
* Otherwise, you give me 10 k€, which I would donate to animal welfare interventions.

Daniel Kokotajlo

That's better, but the problem remains that I value pre-AGI money much more than I value post-AGI money, and you are offering to give me post-AGI money in exchange for my pre-AGI money (in expectation). You could instead pay me $10k now, with the understanding that I'll pay you $20k later in 2028 unless AGI has been achieved in which case I keep the money... but then why would I do that when I could just take out a loan for $10k at low interest rate? I have in fact made several bets like this, totalling around $1k, with 2030 and 2027 as the due date iirc. I imagine people will come to collect from me when the time comes, if AGI hasn't happened yet. But it wasn't rational for me to do that, I was just doing it to prove my seriousness.

Vasco Grilo

Have you or other people worried about AI taken such loans (e.g. to increase donations to AI safety projects)? If not, why?

Daniel Kokotajlo

Idk about others. I haven't investigated serious ways to do this,* but I've taken the low-hanging fruit -- it's why my family hasn't paid off our student loan debt for example, and it's why I went for financing on my car (with as long a payoff time as possible) instead of just buying it with cash.

*Basically I'd need to push through my ugh field and go do research on how to make this happen. If someone offered me a $10k low-interest loan on a silver platter I'd take it.

Vasco Grilo

We could set up the bet such that it would involve you losing/gaining no money in expectation under your views, whereas you would lose money in expectation with a loan? Also, note the bet I proposed above was about ASI as defined by Metaculus, not AGI.

Daniel Kokotajlo

I gain money in expectation with loans, because I don't expect to have to pay them back. What specific bet are you offering?

Vasco Grilo

I see. I was implicitly assuming a nearterm loan or one with an interest rate linked to economic growth, but you might be able to get a longterm loan with a fixed interest rate. [...] I transfer 10 k today-€ to you now, and you transfer 20 k today-€ to me if there is no ASI as defined by Metaculus on date X, which has to be sufficiently far away for the bet to be better than your best loan. X could be 12.0 years (= LN(0.9*20*10^3/(10*10^3))/LN(1 + 0.050)) from now assuming a 90 % chance I win the bet, and an annual growth of my investment of 5.0 %. However, if the cost-effectiveness of my donations also decreases 5 %, then I can only go as far as 6.00 years (= 12.0/2). I also guess the stock market will grow faster than suggested by historical data, so I would only want to have X roughly as far as in 2028. So, at the end of the day, it looks like you are right that you would be better off getting a loan.
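The break-even horizon in the comment above follows from a one-line formula, sketched here in Python. The function and parameter names are hypothetical, and the reading of the formula (the bet stops being worthwhile once investing the stake at the assumed growth rate would match the bet's expected payout) is an interpretation of the comment, not something it states outright.

```python
import math


def break_even_years(win_probability, payout, stake, annual_growth):
    """Years after which a stake compounding at annual_growth would equal
    the bet's expected payout (win_probability * payout)."""
    expected_payout = win_probability * payout
    return math.log(expected_payout / stake) / math.log(1 + annual_growth)


# Figures from the comment: 90 % chance of winning, 20 k paid out on a
# 10 k stake, and 5 % annual growth of the invested stake.
years = break_even_years(0.9, 20_000, 10_000, 0.05)
print(round(years, 1))  # 12.0

# The comment halves the horizon when donation cost-effectiveness also
# falls by 5 % per year, which amounts to doubling the log growth rate.
print(round(years / 2, 1))  # 6.0
```

With these inputs, the bet only beats simply investing the money if the resolution date is less than about 12 years away, or about 6 years once the second 5 % effect is included.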

Daniel Kokotajlo

Thanks for doing the math on this and changing your mind! <3

Vasco Grilo

You are welcome! [...] Here is a bet which would be worth it for me even with more distant resolution dates. If, until the end of 2028, Metaculus' question about ASI:

* Resolves with a given date, I transfer to you 10 k 2025-January-$.
* Does not resolve, you transfer to me 10 k 2025-January-$.
* Resolves ambiguously, nothing happens.

This bet involves fixed prices, so I think it would be neutral for you in terms of purchasing power right after resolution if you had the end of 2028 as your median date of ASI. I would transfer you nominally more money if you won than you nominally would transfer to me if I won, as there would tend to be more inflation if you won. I think mid 2028 was your median date of ASI, so the bet resolving at the end of 2028 may make it worth it for you. If not, it can be moved forward.

It would still be the case that the purchasing power of a nominal amount of money would decrease faster after resolution if you won than if I did. However, you could mitigate this by investing your profits from the bet if you win. The bet may still be worse than some loans, but you can always make the bet and ask for such loans?

Daniel Kokotajlo

I think I still don't understand, sorry. Does "today" refer to the date the metaculus question resolves, or to today? What does today-$ mean?

Vasco Grilo

Sorry for the lack of clarity! "today-$" refers to January 2025. For example, assuming prices increased by 10 % from this month until December 2028, the winner would receive 11 k$ (= 10*10^3*(1 + 0.1)).

Vasco Grilo

Thanks, Daniel. That makes sense. [...] My offer was also in this spirit of you proving your seriousness. Feel free to suggest bets which would be rational for you to take. Do you think there is a significant risk of a large AI catastrophe in the next few years? For example, what do you think is the probability of human population decreasing from (mid) 2026 to (mid) 2027?

You are basically asking me to give up money in expectation to prove that I really believe what I'm saying, when I've already done literally this multiple times. (And besides, hopefully it's pretty clear that I am serious from my other actions.) So, I'm leaning against doing this, sorry. If you have an idea for a bet that's net-positive for me I'm all ears.

Yes I do think there's a significant risk of large AI catastrophe in the next few years. To answer your specific question, maybe something like 5%? idk.

3Vasco Grilo2y

Are you much higher than Metaculus' community on "Will ARC find that GPT-5 has autonomous replication capabilities?"?

3Daniel Kokotajlo2y

Good question. I guess I'm at 30%, so 2x higher? Low confidence: I haven't thought about it much, there's a lot of uncertainty about what METR/ARC will classify as success, and I also haven't reread ARC/METR's ARA eval to remind myself of how hard it is.

3IlluminateReality2y

Have your probabilities for AGI on given years changed at all since this breakdown you gave 7 months ago? I, and I’m sure many others, defer quite a lot to your views on timelines, so it would be good to have an updated breakdown.
15% - 2024
15% - 2025
15% - 2026
10% - 2027
5% - 2028
5% - 2029
3% - 2030
2% - 2031
2% - 2032
2% - 2033
2% - 2034
2% - 2035

8Daniel Kokotajlo2y

My 2024 probability has gone down from 15% to 5%. Other than that things are pretty similar, so just renormalize I guess.
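As a rough illustration of what "just renormalize" could mean, the sketch below sets 2024 to 5% and rescales every other bucket proportionally so the total stays at 100%. Two things here are assumptions rather than anything stated in the thread: that the 22% not covered by the listed years is a "later or never" bucket, and that this bucket is rescaled along with the listed years.

```python
# Breakdown quoted in the question above (percent per year).
original = {2024: 15, 2025: 15, 2026: 15, 2027: 10, 2028: 5, 2029: 5,
            2030: 3, 2031: 2, 2032: 2, 2033: 2, 2034: 2, 2035: 2}


def renormalize(breakdown, year, new_value):
    """Set one year's probability to new_value and rescale all other mass,
    including the implicit 'later or never' remainder, to total 100%."""
    later_or_never = 100 - sum(breakdown.values())  # assumed leftover bucket
    scale = (100 - new_value) / (100 - breakdown[year])
    updated = {y: (new_value if y == year else p * scale)
               for y, p in breakdown.items()}
    updated["later or never"] = later_or_never * scale
    return updated


updated = renormalize(original, 2024, 5)
for bucket, percent in updated.items():
    print(f"{bucket}: {percent:.1f}%")
```

Other readings are possible, for example spreading the freed 10 points only over the listed years; the thread does not specify which one is intended.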

9habryka2y

I am not Daniel, but why would "constant growth" make any sense under Daniel's worldview? The whole point is that AI can achieve explosive growth, and right now energy consumption growth is determined by human growth, not AI growth, so it seems extremely unlikely for growth between now and then to be constant.

7ryan_greenblatt2y

Daniel almost surely doesn't think growth will be constant. (Presumably he has a model similar to the one here.) I assume he also thinks that by the time energy production is >10x higher, the world has generally been radically transformed by AI.

1Vasco Grilo2y

Thanks, Ryan. [...] That makes sense. Daniel, my terms are flexible. Just let me know what your median fraction for 2027 is, and we can go from there. [...] Right. I think the bet is roughly neutral with respect to monetary gains under Daniel's view, but Daniel may want to go ahead despite that to show that he really endorses his views. Not taking the bet may suggest Daniel is worried about losing 10 k€ in a world where 10 k€ is still relevant.

6Daniel Kokotajlo2y

I'm not sure I understand. You and I, as far as I know, have the same beliefs about world energy consumption in 2027, at least on our median timelines. I think it could be higher, but only if AGI timelines are a lot shorter than I think and takeoff is a lot faster than I think. And in those worlds we probably won't be around to resolve the bet in 2027, nor would I care much about winning that bet anyway. (Money post-singularity will be much less valuable to me than money before the singularity)

My sense is that this post holds up pretty well. Most of the considerations under discussion still appear live and important including: in-context learning, robustness, whether jank AI R&D accelerating AIs can quickly move to more general and broader systems, and general skepticism of crazy conclusions.

At the time of this dialogue, my timelines were a bit faster than Ajeya's. I've updated toward the views Daniel expresses here and I'm now about half way between Ajeya's views in this post and Daniel's (in geometric mean).
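For readers unfamiliar with the phrase, a geometric-mean midpoint can be sketched as follows. The inputs are an assumption: they are the median estimates from the table in the introduction (4 and 13 years for the remote-jobs question), used purely as an example, since the comment does not say which quantity is being averaged.

```python
import math

# Median estimates from the introduction's table, in years (illustrative inputs).
daniel_median_years = 4.0
ajeya_median_years = 13.0

# Geometric mean: the midpoint on a log scale, i.e. the value that is the same
# multiplicative factor away from both endpoints.
geometric_midpoint = math.sqrt(daniel_median_years * ajeya_median_years)

# Arithmetic mean, for comparison.
arithmetic_midpoint = (daniel_median_years + ajeya_median_years) / 2

print(f"Geometric midpoint:  {geometric_midpoint:.1f} years")
print(f"Arithmetic midpoint: {arithmetic_midpoint:.1f} years")
```

The geometric midpoint sits closer to the shorter estimate than the arithmetic one does, which is why the choice of mean matters when the two views differ by a factor of three.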

My read is that Daniel looks somewhat too aggressive in his predictions for 2024, though it is a bit unclear exactly what he was expecting. (This concrete scenario seems substantially more bullish than what we've seen in 2024, but not by a huge amount. It's unclear if he was intending these to be mainline predictions or a 25th percentile bullish scenario.)

AI progress appears substantially faster than the scenario outlined in Ege's median world. In particular:

On "we have individual AI labs in 10 years that might be doing on the order of e.g. $30B/yr in revenue". OpenAI made $4 billion in revenue in 2024 and based on historical trends it looks like AI company re

[...]

I agree the discussion holds up well in terms of the remaining live cruxes. Since this exchange, my timelines have gotten substantially shorter. They're now pretty similar to Ryan's (they feel a little bit slower but within the noise from operationalizations being fuzzy; I find it a bit hard to think about what 10x labor inputs exactly looks like).

The main reason they've gotten shorter is that performance on few-hour agentic tasks has moved almost twice as fast as I expected, and this seems broadly non-fake (i.e. it seems to be translating into real world use with only a moderate lag rather than a huge lag), though this second part is noisier and more confusing.

This dialogue occurred a few months after METR released their pilot report on autonomous replication and adaptation tasks. At the time it seemed like agents (GPT-4 and Claude 3 Sonnet iirc) were starting to be able to do tasks that would take a human a few minutes (looking something up on Wikipedia, making a phone call, searching a file system, writing short programs).

Right around when I did this dialogue, I launched an agent benchmarks RFP to build benchmarks testing LLM agents on many-step real-world tasks. Thr...

One thing that I think is interesting, which doesn't affect my timelines that much but cuts in the direction of slower: once again I overestimated how much real world use anyone who wasn't a programmer would get. I definitely expected an off-the-shelf agent product that would book flights and reserve restaurants and shop for simple goods, one that worked well enough I would actually use it (and I expected that to happen before the one hour plus coding tasks were solved; I expected it to be concurrent with half hour coding tasks).

I can't tell if the fact that AI agents continue to be useless to me is a portent that the incredible benchmark performance won't translate as well as the bullish people expect to real world acceleration; I'm largely deferring to the consensus in my local social circle that it's not a big deal. My personal intuitions are somewhat closer to what Steve Newman describes in this comment thread.

It seems like anecdotally folks are getting like +5%-30% productivity boost from using AI; it does feel somewhat aggressive for that to go to 10x productivity boost within a couple years.

Of course AI company employees have the most hands-on experience

FWIW I am not sure this is right--most AI company employees work on things other than "try to get as much work as possible from current AI systems, and understand the trajectory of how useful the AIs will be". E.g. I think I have more personal experience with running AI agents than people at AI companies who don't actively work on AI agents.

There are some people at AI companies who work on AI agents that use non-public models, and those people are ahead of the curve. But that's a minority.

Yeah, good point, I've been surprised by how uninterested the companies have been in agents.

Another effect here is that the AI companies often don't want to be as reckless as I am, e.g. letting agents run amok on my machines.

Interestingly, I've heard from tons of skeptics I've talked to (e.g. Tim Lee, CSET people, AI Snake Oil) that timelines to actual impacts in the world (such as significant R&D acceleration or industrial acceleration) are going to be way longer than we say because AIs are too unreliable and risky, therefore people won't use them. I was more dismissive of this argument before but:

It matches my own lived experience (e.g. I still use search way more than LLMs, even to learn about complex topics, because I have good Google Fu and LLMs make stuff up too much).
As you say, it seems like a plausible explanation for why my weird friends make way more use out of coding agents than giant AI companies.

7Daniel Kokotajlo1y

I tentatively remain dismissive of this argument. My claim was never "AIs are actually reliable and safe now" such that your lived experience would contradict it. I too predicted that AIs would be unreliable and risky in the near-term. My prediction is that after the intelligence explosion the best AIs will be reliable and safe (insofar as they want to be, that is.) ...I guess just now I was responding to a hypothetical interlocutor who agrees that AI R&D automation could come soon but thinks that that doesn't count as "actual impacts in the world." I've met many such people, people who think that software-only singularity is unlikely, people who like to talk about real-world bottlenecks, etc. But you weren't describing such a person, you were describing someone who also thinks we won't be able to automate AI R&D for a long time. There I'd say... well, we'll see. I agree that AIs are unreliable and risky and that therefore they'll be able to do impressive-seeming stuff that looks like they could automate AI R&D well before they actually automate AI R&D in practice. But... probably by the end of 2025 they'll be hitting that first milestone (imagine e.g. an AI that crushes RE-Bench and also can autonomously research & write ML papers, except the ML papers are often buggy and almost always banal / unimportant, and the experiments done to make them had a lot of bugs and wasted compute, and thus AI companies would laugh at the suggestion of putting said AI in charge of a bunch of GPUs and telling it to cook.) And then two years later maybe they'll be able to do it for real, reliably, in practice, such that AGI takeoff happens. Maybe another thing I'd say is "One domain where AIs seem to be heavily used in practice, is coding, especially coding at frontier AI companies (according to friends who work at these companies and report fairly heavy usage). This suggests that AI R&D automation will happen more or less on schedule."

7Ajeya Cotra1y

I'm not talking about narrowly your claim; I just think this very fundamentally confuses most people's basic models of the world. People expect, from their unspoken models of "how technological products improve," that long before you get a mind-bendingly powerful product that's so good it can easily kill you, you get something that's at least a little useful to you (and then you get something that's a little more useful to you, and then something that's really useful to you, and so on). And in fact that is roughly how it's working — for programmers, not for a lot of other people. Because I've engaged so much with the conceptual case for an intelligence explosion (i.e. the case that this intuitive model of technology might be wrong), I roughly buy it even though I am getting almost no use out of AIs still. But I have a huge amount of personal sympathy for people who feel really gaslit by it all.

To put it another way: we probably both agree that if we had gotten AI personal assistants that shop for you and book meetings for you in 2024, that would have been at least some evidence for shorter timelines. So their absence is at least some evidence for longer timelines. The question is what your underlying causal model was: did you think that if we were going to get superintelligence by 2027, then we really should see personal assistants in 2024? A lot of people strongly believe that, you (Daniel) hardly believe it at all, and I'm somewhere in the middle.

If we had gotten both the personal assistants I was expecting, and the 2x faster benchmark progress than I was expecting, my timelines would be the same as yours are now.

8Daniel Kokotajlo1y

That's reasonable. Seems worth mentioning that I did make predictions in What 2026 Looks Like, and eyeballing them now I don't think I was saying that we'd have personal assistants that shop for you and book meetings for you in 2024, at least not in a way that really works. (I say at the beginning of 2026 "The age of the AI assistant has finally dawned.") In other words I think even in 2021 I was thinking that widespread actually useful AI assistants would happen about a year or two before superintelligence. (Not because I have opinions about the orderings of technologies in general, but because I think that once an AGI company has had a popular working personal assistant for two years they should be able to figure out how to make a better version that dramatically speeds up their R&D.)

4Noosphere891y

Indeed, I believe this is the main explanation for why my median timelines are longer than those of, say, Situational Awareness, and why AI isn't nearly as impactful as people used to think it would be back in the day. The big difference from a lot of skeptics is that I believe this adds at most 1-2 decades to the timeline, not multiple decades, before AI becomes very, very useful.

8Ajeya Cotra1y

Yeah TBC, I'm at even less than 1-2 decades added, more like 1-5 years.

7Tao Lin1y

i've recently done more letting AI agents run amok, and i've found Claude was actually more aligned: it did stuff i asked it not to do much less often than oai models, enough that it actually made a difference lol

2Daniel Kokotajlo1y

lol what? Can you compile/summarize a list of examples of AI agents running amok in your personal experience? To what extent was it an alignment problem vs. a capabilities problem?

9Tao Lin1y

not running amok, just not reliably following instructions like "only modify files in this folder" or "don't install pip packages". Claude follows instructions correctly, some other models are mode collapsed into a certain way of doing things, eg gpt-4o always thinks it's running python in chatgpt code interpreter and you need very strong prompting to make it behave in a way specific to your computer

5Tao Lin1y

a hypothetical typical example would be it tries to use the file /usr/bin/python because it's memorized that that's the path to python, that fails, then it concludes it must create that folder, which would require sudo permissions; if it can, it could potentially mess something up

7maxnadeau1y

You mentioned CyBench here. I think CyBench provides evidence against the claim "agents are already able to perform self-contained programming tasks that would take human experts multiple hours". AFAIK, the most up-to-date CyBench run is in the joint AISI o1 evals. In this study (see Table 4.1, and note the caption), all existing models (other than o3, which was not evaluated here) succeed on 0/10 attempts at almost all the Cybench tasks that take >40 minutes for humans to complete.

5elifland1y

I believe Cybench first solve times are based on the fastest top professional teams, rather than typical individual CTF competitors or cyber employees, for which the time to complete would probably be much higher (especially for the latter).

5maxnadeau1y

Do you think that cyber professionals would take multiple hours to do the tasks with 20-40 min first-solve times? I'm intuitively skeptical. One (edit: minor) component of my skepticism is that someone told me that the participants in these competitions are less capable than actual cyber professionals, because the actual professionals have better things to do than enter competitions. I have no idea how big that selection effect is, but it at least provides some countervailing force against the selection effect you're describing.

8elifland1y

Yes, that would be my guess, medium confidence. [...] I'm skeptical of your skepticism. Not knowing basically anything about the CTF scene but using the competitive programming scene as an example, I think the median competitor is much more capable than the median software engineering professional, not less. People like competing at things they're good at.

7Neel Nanda1y

I don't know much about CTF specifically, but based on my maths exam/olympiad experience I predict that there's a lot of tricks to go fast (common question archetypes, saved code snippets, etc) that will be top of mind for people actively practicing, but not for someone with a lot of domain expertise who doesn't explicitly practice CTF. I also don't know how important speed is for being a successful cyber professional. They might be able to get some of this speed up with a bit of practice, but I predict by default there's a lot of room for improvement.

That concrete scenario was NOT my median prediction. Sorry, I should have made that more clear at the time. It was genuinely just a thought experiment for purposes of eliciting people's claims about how they would update on what kinds of evidence. My median AGI timeline at the time was 2027 (which is not that different from the scenario, to be clear! Just one year delayed basically.)

To answer your other questions:
--My views haven't changed much. Performance on the important benchmarks (agency tasks such as METR's RE-Bench) has been faster than I expected for 2024, but the cadence of big new foundation models seems to be slower than I expected (no GPT-5; pretraining scaling is slowing down due to data wall apparently? I thought that would happen more around GPT-6 level). I still have 2027 as my median year for AGI.
--Yes, I and others have run versions of that exercise several times now and yes people have found it valuable. The discussion part, people said, was less valuable than the "force yourself to write out your median scenario" part, so in more recent iterations we mostly just focused on that part.

9Ege Erdil1y

I think overall things have been moving faster than I've expected, though more in some dimensions than others. The point about revenue is particularly salient to me and I would now put the complete automation of remotable jobs 30 years out in my median world instead of 40 years out. Progress on long context coherence, agency, executive function, etc. remains fairly "on trend" despite the acceleration of progress in reasoning and AI systems currently being more useful than I expected, so I don't update down by 2x or 3x (which is more like the speedup we've seen relative to my math or revenue growth expectations).

So your median for the complete automation of remotable jobs is 2055?

What about for the existence of AI systems which can completely automate AI software R&D? (So, filling the shoes of the research engineers and research scientists etc. at DeepMind, the members of technical staff at OpenAI, etc.)

What about your 10th percentile, instead of your median?

Progress on long context coherence, agency, executive function, etc. remains fairly "on trend" despite the acceleration of progress in reasoning and AI systems currently being more useful than I expected, so I don't update down by 2x or 3x (which is more like the speedup we've seen relative to my math or revenue growth expectations).

According to METR, if I recall correctly, 50%-horizon length of LLM-based AI systems has been doubling roughly every 200 days for several years, and seems to if anything be accelerating recently. And it's already at 40 minutes. So in, idk, four years, if trends continue, AIs should be able to show up and do a day's work of autonomous research or coding as well as professional humans.* (And that's assuming an exponential trend, whereas it'll have to be superexponential eventually. Though of course invest...
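The exponential extrapolation described in this comment can be sketched as below, using only the figures quoted above (a 40-minute horizon doubling every 200 days). Treating "a day's work" as 8 hours is an assumption, and the function names are hypothetical. The sketch does not try to reproduce the "four years" figure, which may rest on a stricter reliability threshold or other considerations not captured here.

```python
import math


def days_until_horizon(current_minutes, target_minutes, doubling_days):
    """Days needed for the horizon to grow from current to target,
    assuming a constant doubling time (pure exponential trend)."""
    doublings = math.log2(target_minutes / current_minutes)
    return doublings * doubling_days


def horizon_after(current_minutes, days, doubling_days):
    """Horizon length in minutes after the given number of days on the same trend."""
    return current_minutes * 2 ** (days / doubling_days)


current_horizon_min = 40      # quoted in the comment
doubling_time_days = 200      # quoted in the comment
workday_min = 8 * 60          # assumption: "a day's work" is 8 hours

days_needed = days_until_horizon(current_horizon_min, workday_min, doubling_time_days)
print(f"Days to reach an 8-hour horizon: {days_needed:.0f}")

four_years_days = 4 * 365.25
hours_at_four_years = horizon_after(current_horizon_min, four_years_days,
                                    doubling_time_days) / 60
print(f"Horizon after four years: {hours_at_four_years:.0f} hours")
```

Under these inputs the 50% horizon reaches 8 hours in roughly two years, so the four-year figure leaves a margin over the naive trend.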

1Matt Putz1y

I'm curious what the biggest factors were that made you update?

6ryan_greenblatt1y

Mostly faster benchmark performance than I expected (see Ajeya's comment here) and o3 (and o1) being evidence that RL training can scalably work and RL can plausibly scale very far.

This post taught me a lot about different ways of thinking about timelines, thanks to everyone involved!

I’d like to offer some arguments that, contra Daniel’s view, AI systems are highly unlikely to be able to replace 99% of current fully remote jobs anytime in the next 4 years. As a sample task, I’ll reference software engineering projects that take a reasonably skilled human practitioner one week to complete. I imagine that, for AIs to be ready for 99% of current fully remote jobs, they would need to be able to accomplish such a task. (That specific category might be less than 1% of all remote jobs, but I imagine that the class of remote jobs requiring at least this level of cognitive ability is more than 1%.)

Rather than referencing scaling laws, my arguments stem from analysis of two specific mechanisms which I believe are missing from current LLMs:

Long-term memory. LLMs of course have no native mechanism for retaining new information beyond the scope of their token buffer. I don’t think it is possible to carry out a complex extended task, such as a week-long software engineering project, without long-term memory to manage the task, keep track of intermediate thoughts regarding

[...]

Thanks for this thoughtful and detailed and object-level critique! Just the sort of discussion I hope to inspire. Strong-upvoted.

Here are my point-by-point replies:

Of course there are workarounds for each of these issues, such as RAG for long-term memory, and multi-prompt approaches (chain-of-thought, tree-of-thought, AutoGPT, etc.) for exploratory work processes. But I see no reason to believe that they will work sufficiently well to tackle a week-long project. Briefly, my intuitive argument is that these are old school, rigid, GOFAI, Software 1.0 sorts of approaches, the sort of thing that tends to not work out very well in messy real-world situations. Many people have observed that even in the era of GPT-4, there is a conspicuous lack of LLMs accomplishing any really meaty creative work; I think these missing capabilities lie at the heart of the problem.

I agree that if no progress is made on long-term memory and iterative/exploratory work processes, we won't have AGI. My position is that we are already seeing significant progress in these dimensions and that we will see more significant progress in the next 1-3 years. (If 4 years from now we haven't seen such progress I'll admit...

Likewise, thanks for the thoughtful and detailed response. (And I hope you aren't too impacted by current events...)

I agree that if no progress is made on long-term memory and iterative/exploratory work processes, we won't have AGI. My position is that we are already seeing significant progress in these dimensions and that we will see more significant progress in the next 1-3 years. (If 4 years from now we haven't seen such progress I'll admit I was totally wrong about something). Maybe part of the disagreement between us is that the stuff you think are mere hacky workarounds, I think might work sufficiently well (with a few years of tinkering and experimentation perhaps).

Wanna make some predictions we could bet on? Some AI capability I expect to see in the next 3 years that you expect to not see?

Sure, that'd be fun, and seems like about the only reasonable next step on this branch of the conversation. Setting good prediction targets is difficult, and as it happens I just blogged about this. Off the top of my head, predictions could be around the ability of a coding AI to work independently over an extended period of time (at which point, it is arguably an "engineering AI"). Two di...

Oooh, I should have thought to ask you this earlier -- what numbers/credences would you give for the stages in my scenario sketched in the OP? This might help narrow things down. My guess based on what you've said is that the biggest update for you would be Step 2, because that's when it's clear we have a working method for training LLMs to be continuously-running agents -- i.e. long-term memory and continuous/exploratory work processes.

6Vladimir_Nesov3y

The timelines-relevant milestone of AGI is ability to autonomously research, especially AI's ability to develop AI that doesn't have particular cognitive limitations compared to humans. Quickly giving AIs experience at particular jobs/tasks that doesn't follow from general intelligence alone is probably possible through learning things in parallel or through AIs experimenting with greater serial speed than humans can. Placing that kind of thing into AIs is the schlep that possibly stands in the way of reaching AGI (even after future scaling), and has to be done by humans. But also reaching AGI doesn't require overcoming all important cognitive shortcomings of AIs compared to humans, only those that completely prevent AIs from quickly researching their way into overcoming the rest of the shortcomings on their own. It's currently unclear if merely scaling GPTs (multimodal LLMs) with just a bit more schlep/scaffolding won't produce a weirdly disabled general intelligence (incapable of replacing even 50% of current fully remote jobs at a reasonable cost or at all) that is nonetheless capable enough to fix its disabilities shortly thereafter, making use of its ability to batch-develop such fixes much faster than humans would, even if it's in some sense done in a monstrously inefficient way and takes another couple giant training runs (from when it starts) to get there. This will be clearer in a few years, after feasible scaling of base GPTs is mostly done, but we are not there yet.

2Eli Tyre3y

I think a lot of the forecasted schlep is not commercialization, but research and development to get working prototypes. It can be that there are no major ideas that you need to find, but that your current versions don't really work because of a ton of finicky details that you haven't optimized yet. But once you do, your system will basically work.

Here's a sketch for what I'd like to see in the future--a better version of the scenario experiment done above:

2-4 people sit down for a few hours together.
For the first 1-3 hours, they each write a Scenario depicting their 'median future' or maybe 'modal future.' The scenarios are written similarly to the one I wrote above, with dated 'stages.' The scenarios finish with superintelligence, or else it-being-clear-superintelligence-is-many-decades-away-at-least.
As they write, they also read over each other's scenarios and ask clarifying questions. E.g. "You say that in 2025 they can code well but unreliably -- what do you mean exactly? How much does it improve the productivity of, say, OpenAI engineers?"
By the end of the period, the scenarios are finished & everyone knows roughly what each stage means because they've been able to ask clarifying questions.
Then for the next hour or so, they each give credences for each stage of each scenario. Credences in something like "ASI by year X" where X is the year ASI happens in the scenario.
They also of course discuss and critique each other's credences, and revise their own.
At the end, hopefully some interesting movements will have

[...]

Blast from the past! Now that it's been a few years, some of the stages in my mini-scenario have indeed come to pass, albeit a few months later than in the scenario. I'd say we are roughly at stage 4 now:

(1) Q1 2024: A bigger, better model than GPT-4 is released by some lab. It's multimodal; it can take a screenshot as input and output not just tokens but keystrokes and mouseclicks and images. Just like with GPT-4 vs. GPT-3.5 vs. GPT-3, it turns out to have new emergent capabilities. Everything GPT-4 can do, it can do better, but there are also some qualitatively new things that it can do (though not super reliably) that GPT-4 couldn't do.
(2) Q3 2024: Said model is fine-tuned to be an agent. It was already better at being strapped into an AutoGPT harness than GPT-4 was, so it was already useful for some things, but now it's being trained on tons of data to be a general-purpose assistant agent. Lots of people are raving about it. It's like another ChatGPT moment; people are using it for all the things they used ChatGPT for but then also a bunch more stuff. Unlike ChatGPT you can just leave it running in the background, working away at some problem or task for you. It can write docs

[...]

6Lukas Finnveden18d

If it was truly at "what people imagined stage 4 to be", you might think that you/Ege/Ajeya are supposed to assign 90%/30%/75% to AGI within the next ~2.5 years. (Though ofc you could have had other updates that cancel out something here.) I think in fact all of you are lower than your own numbers there.

4Lukas Finnveden18d

My sense is that this isn't a big part of the story for how new models' capabilities are being increased. Though I don't think we know for sure either way. [...] This seems accurate for coders. Is it true for people who aren't coders? It's not really true for my job or life admin tasks (like I use the models a fair bit, but it's more in chat-bot mode than in agent-mode / trusting the models to do a lot of stuff for me) but maybe it's more true for others.

4Ege Erdil17d

I think my prediction about [...] has come true (unlike many of my actual predictions about the rate of progress, which were consistently too bearish about the median world due to how I was hedging against the current paradigm of scaling running into a bottleneck). Looking back at the dialogue, I can't actually remember how I interpreted Daniel's stage 4 when I offered my probability estimate of 30%. I don't currently think there's a 30% chance of "superintelligent AGI" over the next few years, which makes me think what I had in mind for stage 4 was something more impressive than what actually ended up happening. This also matches my rough sense that the current world is more like my 75th - 80th percentile world from late 2023 and not the 90th+ percentile that would be needed to justify an update from 6% to 30%, which is the update I said I would make if I observed (4) happen.
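One way to see how large an update from 6% to 30% is: convert both probabilities to odds and take the ratio, which gives the likelihood ratio (Bayes factor) the evidence would need to carry. This is a generic illustration using only the two numbers quoted above; it is not a reconstruction of how the estimate was actually produced.

```python
def odds(probability: float) -> float:
    """Convert a probability into odds in favor."""
    return probability / (1.0 - probability)


prior = 0.06      # credence before observing stage (4), as quoted
posterior = 0.30  # credence after observing stage (4), as quoted

# In odds form, Bayes' rule reads: posterior odds = prior odds * likelihood ratio.
implied_likelihood_ratio = odds(posterior) / odds(prior)
print(f"Implied likelihood ratio: {implied_likelihood_ratio:.1f}")
```

In other words, the observation would have to be about 6.7 times more likely in worlds with near-term superintelligent AGI than in worlds without it.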

4Daniel Kokotajlo17d

tbf I myself think that it's not clear we're in stage 4, I think stage 3 is arguably where we're at. But we seem closer to 4 than to 3 imo. Maybe we should interpolate between them?

4Ege Erdil16d

I think we're in stage 4 in some ways and stage 3 in others, yeah. Looking back at the scenario, I think if we just push your predictions forward by 70% (multiply the time gap from when they were made to when the predicted events would happen by 1.7) they look pretty good. Arguably (2) happened in Q2 2025 (first time I remember AI agents being a big deal was with o3) and (3) happened in Q4 2025 (I think there was a transition in how autonomously the agents could function around that time, it's definitely the time when I felt comfortable just letting the AI write code without manually reviewing everything it was writing). If we take that view, (4) will probably "really happen" in Q4 2026.

4Daniel Kokotajlo15d

Yep! Thanks. Note that the scenario I gave wasn't actually a prediction, or at least, it wasn't my median world. I said elsewhere in thread that my median was 2027 for AGI, and implied that my median for ASI was more like 27/28: [...] So my actual prediction would have been like the scenario but stretched out another 1-2 years.

4Ege Erdil15d

Makes sense.

1000x energy consumption in 10-20 years is a really wild prediction, I would give it a <0.1% probability.
It's several orders of magnitude faster than any previous multiple, and requires large amounts of physical infrastructure that takes a long time to construct.

1000x is a really, really big number.

Baseline

2022 figures, total worldwide consumption was 180 PWh/year^[1]

Of that:

Oil: 53 PWh
Coal: 45 PWh
Gas: 40 PWh
Hydro: 11 PWh
Nuclear: 7 PWh
Modern renewable: 13 PWh
Traditional: 11 PWh

(2 sig fig because we're talking about OOM here)

There has only been a x10 multiple in the last 100 years - humanity consumed approx. 18 PWh/year around 1920 or so (details are sketchy for obvious reasons).

Looking at doubling time, we have:

1800 (5653 TWh)
1890 (10684 TWh) - 90 years
1940 (22869 TWh) - 50 years
1960 (41814 TWh) - 20 years
1978 (85869 TWh) - 18 years
2018 (172514 TWh) - 40 years

So historically, the fastest doubling has taken about 18 to 20 years.
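The comparison between historical doublings and a 1000x increase can be sketched numerically. The data points are the ones listed above; the 20-year window for reaching 1000x is the upper end of the "10-20 years" range under discussion, and the helper names are hypothetical.

```python
import math

# (year, world energy consumption in TWh), taken from the list above.
series = [(1800, 5653), (1890, 10684), (1940, 22869),
          (1960, 41814), (1978, 85869), (2018, 172514)]


def annual_growth(start_value, end_value, years):
    """Constant annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Length and implied annual growth rate of each interval between listed years.
intervals = []
for (y0, v0), (y1, v1) in zip(series, series[1:]):
    years = y1 - y0
    intervals.append((y0, y1, years, annual_growth(v0, v1, years)))
    print(f"{y0}-{y1}: {years} years, {intervals[-1][3]:.1%} per year")

shortest = min(intervals, key=lambda row: row[2])

# Doubling time and growth rate needed for a 1000x increase within 20 years.
required_doubling_years = 20 / math.log2(1000)
required_growth = annual_growth(1, 1000, 20)
print(f"Shortest listed doubling: {shortest[2]} years")
print(f"1000x in 20 years needs a doubling every {required_doubling_years:.1f} years "
      f"({required_growth:.0%} per year)")
```

A 1000x increase is about ten doublings, so fitting it into 20 years means doubling roughly every two years, compared with a shortest listed interval of 18 years.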

Build it anyway

It takes 5-10 years for humans to build a medium to large size power plant, assuming no legal constraints.
AGI is very unlikely to be able to build an individual plant much faster, although it could build more at once.

Let's ignore that and assume AGI can build instantly.

W

... (read more)

I strongly disagree. The underlying reason is that an actual singularity seems reasonably likely.

This involves super-exponential growth driven by vastly superhuman intelligence.

Large-scale fusion or literal Dyson spheres are both quite plausible relatively soon (<5 years) after AGI if growth isn't restricted by policy or coordination.

I think you aren't engaging with the reasons why smart people think that 1000x energy consumption could happen soon. It's all about the growth rates. Obviously anything that looks basically like human industrial society won't be getting to 1000x in the next 20 years; the concern is that a million superintelligences commanding an obedient human nation-state might be able to design a significantly faster-growing economy. For an example of how I'm thinking about this, see this comment.

IIUC, 1000x was chosen to be on-the-order-of the solar energy reaching the earth

I was surprised by this number (I would have guessed total power consumption was a much lower fraction of total solar energy), so I just ran some quick numbers and it basically checks out.

This document claims that "Averaged over an entire year, approximately 342 watts of solar energy fall upon every square meter of Earth. This is a tremendous amount of energy—44 quadrillion (4.4 x 10^16) watts of power to be exact."
Our World in Data says total energy consumption in 2022 was 179,000 terawatt-hours

Plugging this in and doing some dimensional analysis, it looks like the earth receives about 2000x the current energy consumption, which is the same OOM.

A NOAA site claims it's more like 10,000x:

173,000 terawatts of solar energy strikes the Earth continuously. That's more than 10,000 times the world's total energy use.

But plugging this number in with the OWiD value for 2022 gives about 8500x multiplier (I think the "more than 10000x" claim was true at the time it was made though). So maybe it's an OOM off, but for a loose claim using round numbers it seems close enough for me.
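The dimensional analysis can be reproduced with a short Python sketch. It uses only the three figures quoted in this comment; the helper name `yearly_energy_twh` is made up for illustration, and a year is taken as 8760 hours.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years

# Figures quoted in the comment above.
world_consumption_twh = 179_000   # Our World in Data, 2022, TWh per year
solar_power_quoted_w = 4.4e16     # "44 quadrillion watts" from the quoted document
solar_power_noaa_tw = 173_000     # NOAA figure, in terawatts


def yearly_energy_twh(power_watts: float) -> float:
    """Convert a continuous power in watts into energy per year in TWh."""
    return power_watts * HOURS_PER_YEAR / 1e12


# Ratio of incoming solar energy to world consumption, for each source.
ratio_quoted = yearly_energy_twh(solar_power_quoted_w) / world_consumption_twh
ratio_noaa = yearly_energy_twh(solar_power_noaa_tw * 1e12) / world_consumption_twh

print(f"quoted document: about {ratio_quoted:.0f}x")
print(f"NOAA figure:     about {ratio_noaa:.0f}x")
```

Both ratios land within an order of magnitude of 1000x, in line with the "about 2000x" and "about 8500x" estimates above.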

[edit: Just realized that Richard121 quotes some of the same figures above for total energy use and solar ir... (read more)

9Vladimir_Nesov3y

Drosophila biomass doubles every 2 days. Small things can assemble into large things. If AGI uses serial speed advantage to quickly design superintelligence, which then masters macroscopic biotech (or bootstraps nanotech if that's feasible) and of course fusion, that gives unprecedented physical infrastructure scaling speed. Given capability/hardware shape of AlphaFold 2, GPT-4, and AlphaZero, there is plausibly enough algorithmic improvement overhang to get there without needing to manufacture more compute hardware first, just a few model training runs.

1Bill Walsh1y

Thank you! Here's something that I suspect is getting missed: AI, even a hyperintelligent AI, doesn't have hands. Even a post-singularity AI can't build a power plant or a transmission line. Furthermore, the map is not the territory: a hyperintelligent AI might be able to solve the string theory equations and design a futuristic power plant, like Tony Stark's arc reactor, but it's <1% likely to be something we can build with present-day tooling that we have at scale. We're almost certainly going to have to (assuming that it gives us the designs) build new factories to build the tooling, to build the factories to build the tooling, to build the factory to build the arc reactors (likely, 90%+). And that's AFTER we build the exoscale LHC to validate the theoretical predictions (because any number of theories with valid math have been proven wrong). Who builds that? All of that is going to have to be built by humans, because we don't have the robots, or the factories to build the robots, or... And during this time, civilization can't "stop": those humans still need to be fed, and need entertainment, sleep, and downtime. Sure, there may be more labor available since pretty much all office jobs will be gone, but training those humans to be useful on a construction site, or factory floor? Not so trivial. And no guiding intelligence, regardless of how smart, can escape those constraints. It may be able to tell us exactly where to go to find the rare earths, but it can't move the dirt, build the refinery, or fabricate the finished products. Dyson spheres? Where is the rocketry? Or say it does better than that, say, repulsive lift off of earth's magnetic field: where are the miles of superconductors going to come from? New superconductive materials? Where's the factory to build them? I doubt that you're going to just "retool" a present wire drawing factory to draw unobtanium dioxide wire. Where's the asteroid mining infrastructure going to come from

This random Twitter person says that it can't. Disclaimer: haven't actually checked for myself.

https://chat.openai.com/share/36c09b9d-cc2e-4cfd-ab07-6e45fb695bb1

Here is me playing against GPT-4, no vision required. It does just fine at normal tic-tac-toe, and figures out anti-tic-tac-toe with a little bit of extra prompting.

GPT-4 can follow the rules of tic-tac-toe, but it cannot play optimally. In fact it often passes up opportunities for wins. I've spent about an hour trying to get GPT-4 to play optimal tic-tac-toe without any success.

Here's an example of GPT-4 playing sub-optimally: https://chat.openai.com/share/c14a3280-084f-4155-aa57-72279b3ea241

Here's an example of GPT-4 suggesting a bad move for me to play: https://chat.openai.com/share/db84abdb-04fa-41ab-a0c0-542bd4ae6fa1

7ReaderM3y

Optimal play requires explaining the game in detail. See here https://chat.openai.com/share/75758e5e-d228-420f-9138-7bff47f2e12d

2gwern3y

Have you or all the other tic-tac-toe people considered just spending a bit of time finetuning GPT-3 or GPT-4 to check how far away it is from playing optimally?

6[anonymous]3y

I would guess that you could train models to perfect play pretty easily, since the optimal tic-tac-toe strategy is very simple (Something like "start by playing in the center, respond by playing on a corner, try to create forks, etc".) It seems like few-shot prompting isn't enough to get them there, but I haven't tried yet. It would be interesting to see if larger sizes of gpt-3 can learn faster than smaller sizes. This would indicate to what extent finetuning adds new capabilities rather than surfacing existing ones. I still find the fact that gpt-4 cannot play tic-tac-toe despite prompting pretty striking on its own, especially since it's so good at other tasks.

4ReaderM3y

Optimal tic tac toe takes explaining the game in excruciating detail. https://chat.openai.com/share/75758e5e-d228-420f-9138-7bff47f2e12d

1clydeiii3y

Skip over tic-tac-toe and go straight to chess:

6Rafael Harth3y

Gonna share mine because that was pretty funny. I thought I played optimally missing a win whoops, but GPT-4 won anyway, without making illegal moves. Sort of.

@Daniel Kokotajlo it looks like you expect 1000x-energy 4 years after 99%-automation. I thought we get fast takeoff, all humans die, and 99% automation at around the same time (but probably in that order) and then get massive improvements in technology and massive increases in energy use soon thereafter. What takes 4 years?

(I don't think the part after fast takeoff or all humans dying is decision-relevant, but maybe resolving my confusion about this part of your model would help illuminate other confusions too.)

Good catch. Let me try to reconstruct my reasoning:

I was probably somewhat biased towards a longer gap because I knew I'd be discussing with Ege who is very skeptical (I think?) that even a million superintelligences in control of the entire human society could whip it into shape fast enough to grow 1000x in less than a decade. So I probably was biased towards 'conservatism.' (in scare quotes because the direction that is conservative vs. generous is determined by what other people think, not by the evidence and facts of the case)
As Habryka says, I think there's a gap between 99% automatable and 99% automated. I think the gap between AI R&D being 99% automatable and being actually automated will be approximately one day, unless there is deliberate effort to slow down. But automating the world economy will take longer because there won't be enough compute to replace all the jobs, many jobs will be 'sticky' and people won't just be immediately laid off, many jobs are partially physical and thus would require robots to fully automate, robots which need to be manufactured, etc.
I also think there's a gap between a fully automated economy and 1000x energy consumption. Napkin math: Sa

... (read more)

6Zach Stein-Perlman3y

Thanks! [...] I'm curious what fraction-of-2023-tasks-automatable and maybe fraction-of-world-economy-automated you think will occur at e.g. overpower time, and the median year for that. (I sometimes notice people assuming 99%-automatability occurs before all the humans are dead, without realizing they're assuming anything.)

Distinguishing:

(a) 99% remotable 2023 tasks automatable (the thing we forecast in the OP)
(b) 99% 2023 tasks automatable
(c) 99% 2023 tasks automated
(d) Overpower ability

My best guess at the ordering will be a->d->b->c.

Rationale: Overpower ability probably requires something like a fully functioning general purpose agent capable of doing hardcore novel R&D. So, (a). However it probably doesn't require sophisticated robots, of the sort you'd need to actually automate all 2023 tasks. It certainly doesn't require actually having replaced all human jobs in the actual economy, though for strategic reasons a coalition of powerful misaligned AGIs would plausibly wait to kill the humans until they had actually rendered the humans unnecessary.

My best guess is that a, d, and b will all happen in the same year, possibly within the same month. c will probably take longer for reasons sketched above.

5Richard1213y

That's wildly optimistic. There aren't any businesses that can change anywhere near that fast. Even if they genuinely wanted to, the laws that 99% of businesses are governed by mean that they genuinely can't do that. The absolute minimum time for such radical change under most jurisdictions is roughly six months. Looking at the history of step changes in industry/business such as the industrial and information revolutions, I think the minimum plausible time between "can be automated with reasonable accuracy" and "is actually automated" is roughly a decade (give or take five years), because the humans who would be 'replaced' will not go gently. That is far faster than either of the previous revolutions though, and a lot faster than the vast majority of people are capable of adapting. Which would lead to Interesting Times...

8O O3y

The idea is R&D will already be partially automated before hitting the 99% mark, so 99% marks the end of a gradual shift towards automation.

3Richard1213y

I think there is a significant societal difference, because that last step is a lot bigger than the ones before. In general, businesses tend to try to reduce headcount as people retire or leave, even if it means some workers have very little to do. Redundancies are expensive and take a long time - the larger they are, the longer it takes. Businesses are also primarily staffed and run by humans who do not wish to lose their own jobs. For a real-world example of a task that is already >99% automatable, consider real estate conveyancing. The actual transaction is already entirely automated via simple algorithms - the database of land ownership is updated indicating the new owner, and the figures representing monetary wealth are updated in two or more bank accounts. The work prior to that consists of identity confirmation, and document comprehension to find and raise possible issues that the buyer and seller need to be informed about. All of this is already reasonably practicable with existing LLMs and image matching. Have any conveyancing solicitors replaced all of their staff thusly?

6O O3y

Keep in mind Daniel said AI R&D

I think one component is that the prediction is for when 99% of jobs are automatable, not when they are automated (Daniel probably has more to say here, but this one clarification seems important).

Ege, do you think you'd update if you saw a demonstration of sophisticated sample-efficient in-context learning and far-off-distribution transfer?

Manifold Market on this question:

Curated. I feel like over the last few years my visceral timelines have shortened significantly. This is partly in contact with LLMs, particularly their increased coding utility, and a lot downstream of Ajeya's and Daniel's models and outreach (I remember spending an afternoon on an arts-and-crafts 'build your own timeline distribution' that Daniel had nerdsniped me with). I think a lot of people are in a similar position and have been similarly influenced. It's nice to get more details on those models and the differences between them, as well as to hear Ege pushing back with "yeah but what if there are some pretty important pieces that are missing and won't get scaled away?", which I hear from my environment much less often.

There are a couple of pieces of extra polish that I appreciate. First, having some specific operationalisations with numbers and distributions up-front is pretty nice for grounding the discussion. Second, I'm glad that there was a summary extracted out front, as sometimes the dialogue format can be a little tricky to wade through.

On the object level, I thought the focus on schlep in the Ajeya-Daniel section and slowness of economy turnover in the Ajaniel-Ege se... (read more)

If human-level AI is reached quickly mainly by spending more money on compute (which I understood to be Kokotajlo's viewpoint; sorry if I misunderstood), it'd also be quite expensive to do inference with, no? I'll try to estimate how it compares to humans.

Let's use Cotra's "tens of billions" for training compared to GPT-4's $100m+, for roughly a 300x multiplier. Let's say that inference costs are multiplied by the same 300x, so instead of GPT-4's $0.06 per 1000 output tokens, you'd be paying GPT-N $18 per 1000 output tokens. I think of GPT output as analog... (read more)

Nice analysis. Some thoughts:

1. If scaling continues with something like Chinchilla scaling laws, the 300x multiplier for compute will not be all lumped into increasing parameters / inference cost. Instead it'll be split roughly half and half. So maybe 20x more data/training time and 15x more parameters/inference cost. So, instead of $200/hr, we are talking more like $15/hr.

2. Hardware continues to improve in the near term; FLOP/$ continues to drop. As far as I know. Of course during AI boom times the price will be artificially high due to all the demand... Not sure which direction the net effect will be.

3. Reaching human-level AI might involve trading off inference compute and training compute, as discussed in Davidson's model (see takeoffspeeds.com and linked report) which probably is a factor that increases inference compute of the first AGIs (while shortening timelines-to-AGI) perhaps by multiple OOMs.

4. However much it costs, labs will be willing to pay. An engineer that works 5x, 10x, 100x faster than a human is incredibly valuable, much more valuable than if they worked only at 1x speed like all the extremely high-salaried human engineers at AI labs.
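Point 1 can be made concrete with a small Python sketch. It rests on two loudly flagged assumptions that are not in the thread: that compute-optimal scaling splits extra compute evenly in log space between parameters and training tokens (so each grows by the square root of the multiplier), and that inference price per token scales linearly with parameter count. The 300x multiplier and the $0.06 per 1000 output tokens are the figures from the parent comment.

```python
import math

compute_multiplier = 300          # training-compute scale-up assumed in the parent comment
gpt4_price_per_1k_tokens = 0.06   # USD per 1000 output tokens, from the parent comment

# Assumption: an even log-space split, so parameters and data each grow by sqrt(multiplier).
param_multiplier = math.sqrt(compute_multiplier)
data_multiplier = compute_multiplier / param_multiplier

# Assumption: inference price per token scales linearly with parameter count.
naive_price = gpt4_price_per_1k_tokens * compute_multiplier  # all compute into parameters
split_price = gpt4_price_per_1k_tokens * param_multiplier    # Chinchilla-style split

print(f"parameters x{param_multiplier:.1f}, data x{data_multiplier:.1f}")
print(f"naive price: ${naive_price:.2f} per 1000 tokens")
print(f"split price: ${split_price:.2f} per 1000 tokens")
```

The exact even split gives about 17x on each side rather than the rounded 20x and 15x used above, but the conclusion is the same: inference cost grows by roughly the square root of the training-compute multiplier, not by the full 300x.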

5hold_my_fish3y

Thanks, that's an excellent and important point that I overlooked: the growth rate of inference cost is about half that of training cost.

Subjectively there is clear improvement between 7b vs. 70b vs. GPT-4, each step 1.5-2 OOMs of training compute. The 70b models are borderline capable of following routine instructions to label data or pour it into specified shapes. GPT-4 is almost robustly capable of that. There are 3-4 more effective OOMs in the current investment scaling sprint (3-5 years), so another 2 steps of improvement if there was enough equally useful training data to feed the process, which there isn't. At some point, training gets books in images that weren't previously availabl... (read more)

6Tao Lin3y

Leela Zero uses MCTS; it doesn't play at a superhuman level in one forward pass (like GPT-4 can do in some subdomains) (I think; I didn't find any evaluations of Leela Zero at 1 forward pass), and I'd guess that the network itself doesn't contain any more generalized game-playing circuitry than an LLM; it just has good intuitions for Go. Nit: [...] 1.5 to 2 OOMs? 7b to 70b is 1 OOM of compute; adding in Chinchilla efficiency would make it like 1.5 OOMs of effective compute, not 2. And LLaMA 70b to GPT-4 is 1 OOM of effective compute according to OpenAI naming - LLaMA 70b is about as good as GPT-3.5. And I'd personally guess GPT-4 is 1.5 OOMs of effective compute above LLaMA 70b, not 2.

4Vladimir_Nesov3y

Good catch, since the context from LLMs is performance in one forward pass, the claim should be about that, and I'm not sure it's superhuman without MCTS. I think the intended point survives this mistake, that is it's a much smaller model than modern LLMs that has relatively very impressive performance primarily because of high quality of the synthetic dataset it effectively trains on. Thus models at the scale of near future LLMs will have a reality-warping amount of dataset quality overhang. This makes ability of LLMs to improve datasets much more impactful than their competence at other tasks, hence the anchors of capability I was pointing out were about labeling and rearranging data according to instructions. And also makes compute threshold gated regulation potentially toothless. [...] With Chinchilla scaling, compute is square of model size, so 2 OOMs under that assumption. Of course current 7b models are overtrained compared to Chinchilla (all sizes of LLaMA-2 are trained on 2T tokens), which might be your point. And Mistral-7b is less obviously a whole step below LLaMA-2-70b, so the full-step-of-improvement should be about earlier 7b models more representative of how the frontier of scaling advances, where a Chinchilla-like tradeoff won't yet completely break down, probably preserving data squared compute scaling estimate (parameter count no longer works very well as an anchor with all the MoE and sparse pre-training stuff). Not clear what assumptions make it 1.5 OOMs instead of either 1 or 2, possibly Chinchilla-inefficiency of overtraining? [...] I was going from EpochAI estimate that puts LLaMA 2 at 8e23 FLOPs and GPT-4 at 2e25 FLOPs, which is 1.4 OOMs. I'm thinking of effective compute in terms of compute necessary for achieving the same pre-training loss (using lower amount of literal compute with pre-training algorithmic improvement), not in terms of meaningful benchmarks for fine-tunes. In this sense overtrained smaller LLaMAs get even less effecti

4Buck3y

Iirc, original alphago had a policy network that was grandmaster level but not superhuman without MCTS.

6Ege Erdil3y

This is not quite true. Raw policy networks of AlphaGo-like models are often at a level around 3 dan in amateur rankings, which would qualify as a good amateur player but nowhere near the equivalent of grandmaster level. If you match percentiles in the rating distributions, 3d in Go is perhaps about as strong as an 1800 elo player in chess, while "master level" is at least 2200 elo and "grandmaster level" starts at 2500 elo. Edit: Seems like policy networks have improved since I last checked these rankings, and the biggest networks currently available for public use can achieve a strength of possibly as high as 6d without MCTS. That would be somewhat weaker than a professional player, but not by much. Still far off from "grandmaster level" though.

8Buck3y

According to figure 6b in "Mastering the Game of Go without Human Knowledge", the raw policy network has 3055 elo, which according to this other page (I have not checked that these Elos are comparable) makes it the 465th best player. (I don’t know much about this and so might be getting the inferences wrong, hopefully the facts are useful)

4Ege Erdil3y

I don't think those ratings are comparable. On the other hand, my estimate of 3d was apparently lowballing it based on some older policy networks, and newer ones are perhaps as strong as 4d to 6d, which on the upper end is still weaker than professional players but not by much. However, there is a big gap between weak professional players and "grandmaster level", and I don't think the raw policy network of AlphaGo could play competitively against a grandmaster level Go player.

The important thing for alignment work isn't the median prediction; if we had an alignment solution just by then, we'd have a 50% chance of dying from that lack.

I think the biggest takeaway is that nobody has a very precise and reliable prediction, so if we want to have good alignment plans in advance of AGI, we'd better get cracking.

I think Daniel's estimate does include a pretty specific and plausible model of a path to AGI, so I take his the most seriously. My model of possible AGI architectures requires even less compute than his, but I think the Hofst... (read more)

Knowing how much time we've got is important to using it well. It's worth this sort of careful analysis.

I found most of this to be wasted effort based on too much of an outside view. The human brain gives neither an upper nor lower bound on the computation needed to achieve transformative AGI. Inside views that include gears-level models of how our first AGIs will function seem much more valuable; thus Daniel Kokotajlo's predictions seem far better informed than the others here.

Outside views like "things take longer than they could, often a lot longer" are... (read more)

Ege, do you think you'd update if you saw a demonstration of sophisticated sample-efficient in-context learning and far-off-distribution transfer?

Yes.
Suppose it could get decent at the first-person-shooter after like a subjective hour of messing around with it. If you saw that demo in 2025, how would that update your timelines?
I would probably update substantially towards agreeing with you.

DeepMind released an early-stage research model SIMA: https://deepmind.google/discover/blog/sima-generalist-ai-agent-for-3d-virtual-environments/

It was tested on 6... (read more)

I disagree with this update -- I think the update should be "it takes a lot of schlep and time for the kinks to be worked out and for products to find market fit" rather than "the systems aren't actually capable of this." Like, I bet if AI progress stopped now, but people continued to make apps and widgets using fine-tunes of various GPTs, there would be OOMs more economic value being produced by AI in 2030 than today.

As a personal aside: Man, what a good world that would be. We would get a lot of the benefits of the early singularity, but not the risks.

Ma... (read more)

I am on a capabilities team at OpenAI right now

Um. What?

I guess I'm out of the loop. I thought you, Daniel, were doing governance stuff.

What's your rationale for working on capabilities if you think timelines are this compressed?

I'm doing safety work at a capabilities team, basically. I'm trying not to advance capabilities myself. I'm trying to make progress on a faithful CoT agenda. Dan Selsam, who runs the team, thought it would be good to have a hybrid team instead of the usual thing where the safety people and capabilities people are on separate teams and the capabilities people feel licensed to not worry about the safety stuff at all and the safety people are relatively out of the loop.

I found the discussion around Hofstadter's law in forecasting to be really useful as I've definitely found myself and others adding fudge factors to timelines to reflect unknown unknowns which may or may not be relevant when extrapolating capabilities from compute.

In my experience many people are of the feeling that current tools are primarily limited by their ability to plan and execute over longer time horizons. Once we have publicly available tools that are capable of carrying out even simple multi-step plans (book me a great weekend away with my parents with a budget of $x and send me the itinerary), I can see timelines amongst the general public being dramatically reduced.

7Daniel Kokotajlo3y

I think unknown unknowns are a different phenomenon than Hofstadter's Law / Planning Fallacy. My thinking on unknown unknowns is that they should make people spread out their timelines distribution, so that it has more mass later than they naively expect, but also more mass earlier than they naively expect. (Just as there are unknown potential blockers, there are unknown potential accelerants.) Unfortunately I think many people just do the former and not the latter, and this is a huge mistake.

5davidconrad3y

Interesting. I fully admit most of my experience with unknown unknowns comes from either civil engineering projects or bringing consumer products to market, both situations where the unknown unknowns are disproportionately blockers. But this doesn't seem to be the case with things like Moore's Law or continual improvements in solar panel efficiency where the unknowns have been relatively evenly distributed or even weighted towards being accelerants. I'd love to know if you have thoughts on what makes a given field more likely to be dominated by blockers or accelerants!

4Daniel Kokotajlo3y

Civil engineering projects and bringing consumer products to market are both exactly the sort of thing the planning fallacy applies to. I would just say what you've experienced is the planning fallacy, then. (It's not about the world, it's about our methods of forecasting -- when forecasting how long it will take to complete a project, humans seem to be systematically biased towards optimism.)

Thanks a lot for the summary at the start!

Could you elaborate on what it would mean to demonstrate 'savannah-to-boardroom' transfer? Our architecture was selected for in the wilds of nature, not our training data. To me it seems that when we use an architecture designed for language translation for understanding images we've demonstrated a similar degree of transfer.

I agree that we're not yet there on sample efficient learning in new domains (which I think is more what you're pointing at) but I'd like to be clearer on what benchmarks would show this. For example, how well GPT-4 can integrate a new domain of knowledge from (potentially multiple epochs of training on) a single textbook seems a much better test and something that I genuinely don't know the answer to.

I think it would be helpful if this dialog had a different name. I would hope this isn't the last dialog on timelines, and the current title is sort of capturing the whole namespace. Can we change it to something more specific?

2Ben Pace, the Vacationing Vagabond3y

You'd be more likely to get this change if you suggested a workable alternative.

3blf3y

An option is just to add the month and year, something like "November 2023 AI Timelines".

9Daniel Kokotajlo3y

How about "AI Timelines (Nov '23)"

A question for all: If you are wrong and in 4/13/40 years most of this fails to come true, will you blame it on your own models being wrong or shift goalposts towards the success of the AI safety movement / government crack downs on AI development? If the latter, how will you be able to prove that AGI definitely would have come had the government and industry not slowed down development?

To add more substance to this comment: I felt Ege came out looking the most salient here. In general, making predictions about the future should be backed by heavy un... (read more)

Thank you for raising this explicitly. I think probably lots of people's timelines are based partially on vibes-to-do-with-what-positions-sound-humble/cautious, and this isn't totally unreasonable so deserves serious explicit consideration.

I think it'll be pretty obvious whether my models were wrong or whether the government cracked down. E.g. how much compute is spent on the largest training run in 2030? If it's only on the same OOM as it is today, then it must have been government crackdown. If instead it's several OOMs more, and moreover the training runs are still of the same type of AI system (or something even more powerful) as today (big multimodal LLMs) then I'll very happily say I was wrong.

Re humility and caution: Humility and caution should push in both directions, not just one. If your best guess is that AGI is X years away, adding an extra dose of uncertainty should make you fatten both tails of your distribution -- maybe it's 2X years away, but maybe instead it's X/2 years away.
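One way to see the "fatten both tails" point is a toy Python sketch. It assumes, purely for illustration, that the timeline distribution is lognormal with median X; the value of X and the two spread parameters are hypothetical. Widening the spread leaves the median alone but adds equal mass below X/2 and above 2X.

```python
import math


def lognormal_cdf(t: float, median: float, sigma: float) -> float:
    """P(T <= t) for a lognormal with the given median and log-space standard deviation."""
    z = (math.log(t) - math.log(median)) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


median_years = 10.0  # hypothetical best guess X


def tail_masses(sigma: float) -> tuple[float, float]:
    """Return (P(T <= X/2), P(T >= 2X)) for the given spread."""
    early = lognormal_cdf(median_years / 2, median_years, sigma)
    late = 1.0 - lognormal_cdf(median_years * 2, median_years, sigma)
    return early, late


narrow = tail_masses(0.5)  # fairly confident forecaster (illustrative spread)
wide = tail_masses(1.0)    # same median, extra dose of uncertainty

print(f"narrow: P(<= X/2) = {narrow[0]:.3f}, P(>= 2X) = {narrow[1]:.3f}")
print(f"wide:   P(<= X/2) = {wide[0]:.3f}, P(>= 2X) = {wide[1]:.3f}")
```

Under this assumed shape the early and late tails grow together, which is the symmetric update described above; a forecaster who only lengthens the late tail is doing something different from simply adding uncertainty.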

(Exception is for planning fallacy stuff -- there we have good reason to think people are systematically biased toward shorter timelines. So if your AGI timelines are primarily based on p... (read more)

6Vladimir_Nesov3y

When models give particular ways of updating on future evidence, current predictions being wrong doesn't by itself make models wrong. Models learn, the way they learn is already part of them. An updating model is itself wrong when other available models are better in some harder-to-pin-down sense, not just at being right about particular predictions. When future evidence isn't in scope of a model, that invalidates the model. But not all models are like that with respect to relevant future evidence, even when such evidence dramatically changes their predictions in retrospect.

Great discussion! I am open to the following bet.

If, until the end of 2028, Metaculus' question about superintelligent AI:
Resolves non-ambiguously, I transfer to you 10 k January-2025-$ in the month after that in which the question resolved.
Does not resolve, you transfer to me 10 k January-2025-$ in January 2029. As before, I plan to donate my profits to animal welfare organisations.
The nominal amount of the transfer in $ is 10 k times the ratio between the consumer price index for all urban consumers and items in the United States, as reported b

... (read more)

4Daniel Kokotajlo1y

Thanks for proposing this bet. I think a bullet point needs to be added: [...] * The expected utility of money is the same to you in either case (i.e. if the utility you can get from additional money is the same after vs. before Metaculus-announcing-superintelligence). Note that I think it is very much not the same. In particular I value post-ASI-announcement dollars much less than pre-ASI-announcement dollars, maybe orders of magnitude less. (Analogy: Suppose we were betting on 'US Government announces nuclear MAD with Russia and China is ongoing and advises everyone seek shelter.' This is a more extreme example but gets the point across. If I somehow thought this was 60% likely to happen by 2028, it still wouldn't make sense for me to bet with you, because to a first approximation I dgaf about you wiring me $10k CPI-adjusted in the moments after the announcement.) As a result of the above I currently think that there is no bet we could make (at least not along the above lines) that would be rational for both of us to accept.

1Vasco Grilo1y

Thanks, Daniel. My bullet points are supposed to be conditions for the bet to be neutral "in terms of purchasing power, which is what matters if you also plan to donate the profits", not personal welfare. I agree a given amount of purchasing power will buy the winner less personal welfare given superintelligent AI, because then they will tend to have a higher real consumption in the future. Or are you saying that a given amount of purchasing power given superintelligent AI will buy not only less personal welfare, but also less impartial welfare via donations? If so, why? The cost-effectiveness of donations should ideally be constant across spending categories, including across worlds where there is or is not superintelligent AI by a given date. Funding should be moved from the least to the most cost-effective categories until their marginal cost-effectiveness is equalised. I understand the altruistic market is not efficient. However, for my bet not to be taken, I think one would have to argue about which concrete decisions major funders like Open Philanthropy are making badly, and why they imply spending more money on worlds where there is no superintelligent AI relative to what is being done at the margin.

5Daniel Kokotajlo1y

I am saying that expected purchasing power given Metaculus resolved ASI a month ago is less, for altruistic purposes, than given Metaculus did not resolve ASI a month ago. I give reasons in the linked comment. Consider the analogy I just made to nuclear MAD -- suppose you thought nuclear MAD was 60% likely in the next three years, would you take the sort of bet you are offering me re ASI? Why or why not? I do not think any market is fully efficient and I think altruistic markets are extremely fucking far from efficient. I think I might be confused or misunderstanding you though -- it seems you think my position implies that OP should be redirecting money from AI risk causes to causes that assume no ASI? Can you elaborate?

3Vasco Grilo1y

Fair! I have now added a 3rd bullet, and clarified the sentence before the bullets: [...] I agree the bet is not worth it if superintelligent AI as defined by Metaculus immediately implies donations can no longer do any good, but this seems like an extreme view. Even if AIs outperform humans in all tasks for the same cost, humans could still donate to AIs. I think the Cuban Missile Crisis is a better analogy for the period right after Metaculus' question resolves non-ambiguously than mutually assured destruction. For the former, there were still good opportunities to decrease the expected damage of nuclear war. For the latter, the damage had already been done.

5Daniel Kokotajlo1y

My view is not "can no longer do any good," more like "can do less good in expectation than if you still had some time left before ASI to influence things." For reasons why, see linked comment above. I think that by the time Metaculus is convinced that ASI already exists, most of the important decisions w.r.t. AI safety will have already been made, for better or for worse. Ditto (though not as strongly) for AI concentration-of-power risks and AI misuse risks.

2RHollerith1y

I think you mean in January 2029, or earlier if the question resolves before the end of 2028; otherwise there would be no need to introduce the CPI into the bet to keep things fair (or predictable).

3Vasco Grilo1y

Thanks, Richard! I have updated the bet to account for that. [...]

(5) Q1 2026: The next version comes online. It is released, but it refuses to help with ML research. Leaks indicate that it doesn't refuse to help with ML research internally, and in fact is heavily automating the process at its parent corporation. It's basically doing all the work by itself; the humans are basically just watching the metrics go up and making suggestions and trying to understand the new experiments it's running and architectures it's proposing.

@Daniel Kokotajlo, why do you think they would release it?

3Daniel Kokotajlo2y

Twas just a guess, I think it could go either way. In fact these days I'd guess they wouldn't release it at all; the official internal story would be it's for security and safety reasons.

A local remark about this: I've seen a bunch of reports from other people that GPT-4 is essentially unable to play tic-tac-toe, and this is a shortcoming that was highly surprising to me. Given the amount of impressive things it can otherwise do, failing at playing a simple game whose full solution could well be in its training set is really odd.

Huh. This is something that I could just test immediately, so I tried it.

It looks like this is true. When I play a game of tick-tack-toe with GPT-4 it doesn't play optimally, and it let me win in 3 turns.

http... (read more)

4Eli Tyre3y

Nope! It doesn't seem like it. https://chat.openai.com/share/b6878aae-faed-48a9-a15f-63981789f772 It played the exact same (bad) moves as before, and didn't notice when I had won the game. Also when I told it I won, it gave a false explanation for how. It seems like GPT-4 can't, or at least doesn't, play tick-tack-toe well?
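For reference, the "full solution" of tic-tac-toe mentioned above is small enough to compute exhaustively. The following is a minimal minimax sketch. It says nothing about how GPT-4 works or how the linked games were played; the board encoding (a 9-character string, row by row) and the function names are assumptions made for illustration.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def other(player):
    return "O" if player == "X" else "X"


@lru_cache(maxsize=None)
def value(board, player):
    """Minimax value from X's point of view with `player` to move:
    +1 means X wins, 0 means a draw, -1 means O wins, under optimal play."""
    won = winner(board)
    if won == "X":
        return 1
    if won == "O":
        return -1
    if " " not in board:
        return 0
    results = [value(board[:i] + player + board[i + 1:], other(player))
               for i in range(9) if board[i] == " "]
    return max(results) if player == "X" else min(results)


def best_move(board, player):
    """Return the index of an optimal move for `player` on `board`."""
    moves = [i for i in range(9) if board[i] == " "]

    def score(i):
        return value(board[:i] + player + board[i + 1:], other(player))

    return max(moves, key=score) if player == "X" else min(moves, key=score)


empty = " " * 9
print("value of the empty board:", value(empty, "X"))
print("X's move on 'XX OO    ':", best_move("XX OO    ", "X"))
```

An optimal player never loses from the empty board, so a game lost in three turns means at least one clearly suboptimal move was played.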

@Daniel Kokotajlo what odds would you give me for global energy consumption growing 100x by the end of 2028? I'd be happy to bet low hundreds of USD on the "no" side.

ETA: to be more concrete I'd put $100 on the "no" side at 10:1 odds but I'm interested if you have a more aggressive offer.

3Daniel Kokotajlo3y

As previously discussed a couple times on this website, it's not rational for me to make bets on my beliefs about these things. Because I either won't be around to collect if I win, or won't value the money nearly as much. And because I can get better odds on the public market simply by taking out a loan.

6jsd3y

For context, Daniel wrote Is this a good way to bet on short timelines? (which I didn't know about when writing this comment) 3 years ago. HT Alex Lawsen for the link.

3O O3y

How about a bet on whether there appears to be a clear path to X instead? Or even more objectively some milestone that will probably be hit before we actually hit X.

5Daniel Kokotajlo3y

Yep, I love betting about stuff like that. Got any suggestions for how to objectively operationalize it? Or a trusted third party arbiter?

How were the distributions near the top elicited from each participant?

7habryka3y

I asked Ajeya, Daniel, and Ege to input their predictions for the two operationalizations into the UI for a random Metaculus market without submitting, and send me screenshots of the UI. Then I traced it over with Adobe Illustrator, combined the different predictions, and then made the final images.

E.g. suppose some AI system was trained to learn new video games: each RL episode was it being shown a video game it had never seen, and it's supposed to try to play it; its reward is the score it gets. Then after training this system, you show it a whole new type of video game it has never seen (maybe it was trained on platformers and point-and-click adventures and visual novels, and now you show it a first-person-shooter for the first time). Suppose it could get decent at the first-person-shooter after like a subjective hour of messing around with it. If

... (read more)

The LessWrong Review runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2024. The top fifty or so posts are featured prominently on the site throughout the year.

Hopefully, the review is better than karma at judging enduring value. If we have accurate prediction markets on the review results, maybe we can have better incentives on LessWrong today. Will this post make the top fifty?

Yes, this summary seems accurate.

3Vasco Grilo2y

9Daniel Kokotajlo2y

1Vasco Grilo2y

6Daniel Kokotajlo2y

3Vasco Grilo2y

Have you or other people worried about AI taken such loans (e.g. to increase donations to AI safety projects)? If not, why?

3Daniel Kokotajlo2y

1Vasco Grilo2y

3Daniel Kokotajlo2y

I gain money in expectation with loans, because I don't expect to have to pay them back. What specific bet are you offering?

2Vasco Grilo2y

3Daniel Kokotajlo2y

Thanks for doing the math on this and changing your mind! <3

1Vasco Grilo1y

3Daniel Kokotajlo1y

I think I still don't understand, sorry. Does "today" refer to the date the metaculus question resolves, or to today? What does today-$ mean?

1Vasco Grilo1y

0Vasco Grilo2y

Yes I do think there's a significant risk of large AI catastrophe in the next few years. To answer your specific question, maybe something like 5%? idk.

3Vasco Grilo2y

Are you much higher than Metaculus' community on Will ARC find that GPT-5 has autonomous replication capabilities??

3Daniel Kokotajlo2y

3IlluminateReality2y

8Daniel Kokotajlo2y

My 2024 probability has gone down from 15% to 5%. Other than that things are pretty similar, so just renormalize I guess.
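One reading of "just renormalize" is: pin the updated year at its new probability and rescale every other bucket by a common factor so the total is unchanged. A minimal sketch of that reading follows. Only the 15% to 5% change comes from the comment; the other buckets and their values are hypothetical placeholders, and the function name is illustrative.

```python
def renormalize(dist, key, new_value):
    """Set dist[key] to new_value and rescale all other entries by a common
    factor so that the distribution keeps the same total mass."""
    total = sum(dist.values())
    rest = total - dist[key]
    scale = (total - new_value) / rest
    return {k: (new_value if k == key else v * scale)
            for k, v in dist.items()}


# Hypothetical per-year probabilities; only the 2024 figure is from the comment.
old = {"2024": 0.15, "2025": 0.15, "2026": 0.20, "later or never": 0.50}
new = renormalize(old, "2024", 0.05)

for bucket, p in new.items():
    print(f"{bucket:>15}: {p:.3f}")
```

The relative odds between the untouched buckets are preserved; the mass removed from 2024 is spread across them in proportion to their previous size.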

9habryka2y

7ryan_greenblatt2y

1Vasco Grilo2y

6Daniel Kokotajlo2y

AI progress appears substantially faster than the scenario outlined in Ege's median world. In particular:

On "we have individual AI labs in 10 years that might be doing on the order of e.g. $30B/yr in revenue". OpenAI made $4 billion in revenue in 2024 and based on historical trends it looks like AI company re

... (read more)

Right around when I did this dialogue, I launched an agent benchmarks RFP to build benchmarks testing LLM agents on many-step real-world tasks. Thr... (read more)

Of course AI company employees have the most hands-on experience

There are some people at AI companies who work on AI agents that use non-public models, and those people are ahead of the curve. But that's a minority.

Yeah, good point, I've been surprised by how uninterested the companies have been in agents.

Another effect here is that the AI companies often don't want to be as reckless as I am, e.g. letting agents run amok on my machines.

It matches my own lived experience (e.g. I still use search way more than LLMs, even to learn about complex topics, because I have good Google Fu and LLMs make stuff up too much).
As you say, it seems like a plausible explanation for why my weird friends make way more use out of coding agents than giant AI companies.

7Daniel Kokotajlo1y

7Ajeya Cotra1y

If we had gotten both the personal assistants I was expecting, and the 2x faster benchmark progress than I was expecting, my timelines would be the same as yours are now.

8Daniel Kokotajlo1y

4Noosphere891y

8Ajeya Cotra1y

Yeah TBC, I'm at even less than 1-2 decades added, more like 1-5 years.

7Tao Lin1y

i've recently done more AI agents running amok and i've found Claude was actually more aligned and did stuff i asked it not to much less than oai models, enough that it actually made a difference lol

2Daniel Kokotajlo1y

lol what? Can you compile/summarize a list of examples of AI agents running amok in your personal experience? To what extent was it an alignment problem vs. a capabilities problem?

9Tao Lin1y

5Tao Lin1y

7maxnadeau1y

5elifland1y

5maxnadeau1y

8elifland1y

7Neel Nanda1y

9Ege Erdil1y

Progress on long context coherence, agency, executive function, etc. remains fairly "on trend" despite the acceleration of progress in reasoning and AI systems currently being more useful than I expected, so I don't update down by 2x or 3x (which is more like the speedup we've seen relative to my math or revenue growth expectations).

1Matt Putz1y

I'm curious what the biggest factors were that made you update?

6ryan_greenblatt1y

Mostly faster benchmark performance than I expected (see Ajeya's comment here) and o3 (and o1) being evidence that RL training can scalably work and RL can plausibly scale very far.

This post taught me a lot about different ways of thinking about timelines, thanks to everyone involved!

Rather than referencing scaling laws, my arguments stem from analysis of two specific mechanisms which I believe are missing from current LLMs:

Long-term memory. LLMs of course have no native mechanism for retaining new information beyond the scope of their token buffer. I don’t think it is possible to carry out a complex extended task, such as a week-long software engineering project, without long-term memory to manage the task, keep track of intermediate thoughts regarding

... (read more)

Thanks for this thoughtful and detailed and object-level critique! Just the sort of discussion I hope to inspire. Strong-upvoted.

Here are my point-by-point replies:

Of course there are workarounds for each of these issues, such as RAG for long-term memory, and multi-prompt approaches (chain-of-thought, tree-of-thought, AutoGPT, etc.) for exploratory work processes. But I see no reason to believe that they will work sufficiently well to tackle a week-long project. Briefly, my intuitive argument is that these are old school, rigid, GOFAI, Software 1.0 sorts of approaches, the sort of thing that tends to not work out very well in messy real-world situations. Many people have observed that even in the era of GPT-4, there is a conspicuous lack of LLMs accomplishing any really meaty creative work; I think these missing capabilities lie at the heart of the problem.

Likewise, thanks for the thoughtful and detailed response. (And I hope you aren't too impacted by current events...)

I agree that if no progress is made on long-term memory and iterative/exploratory work processes, we won't have AGI. My position is that we are already seeing significant progress in these dimensions and that we will see more significant progress in the next 1-3 years. (If 4 years from now we haven't seen such progress I'll admit I was totally wrong about something). Maybe part of the disagreement between us is that the stuff you think are mere hacky workarounds, I think might work sufficiently well (with a few years of tinkering and experimentation perhaps).

Wanna make some predictions we could bet on? Some AI capability I expect to see in the next 3 years that you expect to not see?
