I had a nice conversation with Ege today over dinner, in which we identified a possible bet to make! Something I think will probably happen in the next 4 years, that Ege thinks will probably NOT happen in the next 15 years, such that if it happens in the next 4 years Ege will update towards my position and if it doesn't happen in the next 4 years I'll update towards Ege's position.
Drumroll...
I (DK) have lots of ideas for ML experiments: dangerous-capabilities evals, simple experiments related to paraphrasers in the Faithful CoT agenda, and so forth. But I'm a philosopher; I don't code myself. I know enough that if I had some ML engineers working for me, that would be sufficient for my experiments to get built and run, but I can't do it by myself.
When will I be able to implement most of these ideas with the help of AI assistants basically substituting for ML engineers? So I'd still be designing the experiments and interpreting the results, but AutoGPT5 or whatever would be chatting with me and writing and debugging the code.
I think: Probably in the next 4 years. Ege thinks: probably not in the next 15.
Ege, is this an accurate summary?
You are basically asking me to give up money in expectation to prove that I really believe what I'm saying, when I've already done literally this multiple times. (And besides, hopefully it's pretty clear that I am serious from my other actions.) So, I'm leaning against doing this, sorry. If you have an idea for a bet that's net-positive for me I'm all ears.
Yes I do think there's a significant risk of large AI catastrophe in the next few years. To answer your specific question, maybe something like 5%? idk.
My sense is that this post holds up pretty well. Most of the considerations under discussion still appear live and important including: in-context learning, robustness, whether jank AI R&D accelerating AIs can quickly move to more general and broader systems, and general skepticism of crazy conclusions.
At the time of this dialogue, my timelines were a bit faster than Ajeya's. I've updated toward the views Daniel expresses here and I'm now about halfway between Ajeya's views in this post and Daniel's (in geometric mean).
My read is that Daniel looks somewhat too aggressive in his predictions for 2024, though it is a bit unclear exactly what he was expecting. (This concrete scenario seems substantially more bullish than what we've seen in 2024, but not by a huge amount. It's unclear if he was intending these to be mainline predictions or a 25th percentile bullish scenario.)
AI progress appears substantially faster than the scenario outlined in Ege's median world. In particular:
I agree the discussion holds up well in terms of the remaining live cruxes. Since this exchange, my timelines have gotten substantially shorter. They're now pretty similar to Ryan's (they feel a little bit slower but within the noise from operationalizations being fuzzy; I find it a bit hard to think about what 10x labor inputs exactly looks like).
The main reason they've gotten shorter is that performance on few-hour agentic tasks has moved almost twice as fast as I expected, and this seems broadly non-fake (i.e. it seems to be translating into real world use with only a moderate lag rather than a huge lag), though this second part is noisier and more confusing.
This dialogue occurred a few months after METR released their pilot report on autonomous replication and adaptation tasks. At the time it seemed like agents (GPT-4 and Claude 3 Sonnet iirc) were starting to be able to do tasks that would take a human a few minutes (looking something up on Wikipedia, making a phone call, searching a file system, writing short programs).
Right around when I did this dialogue, I launched an agent benchmarks RFP to build benchmarks testing LLM agents on many-step real-world tasks. Thr...
One thing that I think is interesting, which doesn't affect my timelines that much but cuts in the direction of slower: once again I overestimated how much real world use anyone who wasn't a programmer would get. I definitely expected an off-the-shelf agent product that would book flights and reserve restaurants and shop for simple goods, one that worked well enough I would actually use it (and I expected that to happen before the one hour plus coding tasks were solved; I expected it to be concurrent with half hour coding tasks).
I can't tell if the fact that AI agents continue to be useless to me is a portent that the incredible benchmark performance won't translate as well as the bullish people expect to real world acceleration; I'm largely deferring to the consensus in my local social circle that it's not a big deal. My personal intuitions are somewhat closer to what Steve Newman describes in this comment thread.
It seems like anecdotally folks are getting like +5%-30% productivity boost from using AI; it does feel somewhat aggressive for that to go to 10x productivity boost within a couple years.
Of course AI company employees have the most hands-on experience
FWIW I am not sure this is right--most AI company employees work on things other than "try to get as much work as possible from current AI systems, and understand the trajectory of how useful the AIs will be". E.g. I think I have more personal experience with running AI agents than people at AI companies who don't actively work on AI agents.
There are some people at AI companies who work on AI agents that use non-public models, and those people are ahead of the curve. But that's a minority.
Yeah, good point, I've been surprised by how uninterested the companies have been in agents.
Another effect here is that the AI companies often don't want to be as reckless as I am, e.g. letting agents run amok on my machines.
Interestingly, I've heard from tons of skeptics I've talked to (e.g. Tim Lee, CSET people, AI Snake Oil) that timelines to actual impacts in the world (such as significant R&D acceleration or industrial acceleration) are going to be way longer than we say because AIs are too unreliable and risky, therefore people won't use them. I was more dismissive of this argument before but:
To put it another way: we probably both agree that if we had gotten AI personal assistants that shop for you and book meetings for you in 2024, that would have been at least some evidence for shorter timelines. So their absence is at least some evidence for longer timelines. The question is what your underlying causal model was: did you think that if we were going to get superintelligence by 2027, then we really should see personal assistants in 2024? A lot of people strongly believe that, you (Daniel) hardly believe it at all, and I'm somewhere in the middle.
If we had gotten both the personal assistants I was expecting, and the 2x faster benchmark progress than I was expecting, my timelines would be the same as yours are now.
That concrete scenario was NOT my median prediction. Sorry, I should have made that more clear at the time. It was genuinely just a thought experiment for purposes of eliciting people's claims about how they would update on what kinds of evidence. My median AGI timeline at the time was 2027 (which is not that different from the scenario, to be clear! Just one year delayed basically.)
To answer your other questions:
--My views haven't changed much. Performance on the important benchmarks (agency tasks such as METR's RE-Bench) has been faster than I expected for 2024, but the cadence of big new foundation models seems to be slower than I expected (no GPT-5; pretraining scaling is slowing down due to data wall apparently? I thought that would happen more around GPT-6 level). I still have 2027 as my median year for AGI.
--Yes, I and others have run versions of that exercise several times now and yes people have found it valuable. The discussion part, people said, was less valuable than the "force yourself to write out your median scenario" part, so in more recent iterations we mostly just focused on that part.
This post taught me a lot about different ways of thinking about timelines, thanks to everyone involved!
I’d like to offer some arguments that, contra Daniel’s view, AI systems are highly unlikely to be able to replace 99% of current fully remote jobs anytime in the next 4 years. As a sample task, I’ll reference software engineering projects that take a reasonably skilled human practitioner one week to complete. I imagine that, for AIs to be ready for 99% of current fully remote jobs, they would need to be able to accomplish such a task. (That specific category might be less than 1% of all remote jobs, but I imagine that the class of remote jobs requiring at least this level of cognitive ability is more than 1%.)
Rather than referencing scaling laws, my arguments stem from analysis of two specific mechanisms which I believe are missing from current LLMs:
Thanks for this thoughtful and detailed and object-level critique! Just the sort of discussion I hope to inspire. Strong-upvoted.
Here are my point-by-point replies:
Of course there are workarounds for each of these issues, such as RAG for long-term memory, and multi-prompt approaches (chain-of-thought, tree-of-thought, AutoGPT, etc.) for exploratory work processes. But I see no reason to believe that they will work sufficiently well to tackle a week-long project. Briefly, my intuitive argument is that these are old school, rigid, GOFAI, Software 1.0 sorts of approaches, the sort of thing that tends to not work out very well in messy real-world situations. Many people have observed that even in the era of GPT-4, there is a conspicuous lack of LLMs accomplishing any really meaty creative work; I think these missing capabilities lie at the heart of the problem.
Likewise, thanks for the thoughtful and detailed response. (And I hope you aren't too impacted by current events...)
I agree that if no progress is made on long-term memory and iterative/exploratory work processes, we won't have AGI. My position is that we are already seeing significant progress in these dimensions and that we will see more significant progress in the next 1-3 years. (If 4 years from now we haven't seen such progress I'll admit I was totally wrong about something). Maybe part of the disagreement between us is that the stuff you think are mere hacky workarounds, I think might work sufficiently well (with a few years of tinkering and experimentation perhaps).
Wanna make some predictions we could bet on? Some AI capability I expect to see in the next 3 years that you expect to not see?
Sure, that'd be fun, and seems like about the only reasonable next step on this branch of the conversation. Setting good prediction targets is difficult, and as it happens I just blogged about this. Off the top of my head, predictions could be around the ability of a coding AI to work independently over an extended period of time (at which point, it is arguably an "engineering AI"). Two di...
Oooh, I should have thought to ask you this earlier -- what numbers/credences would you give for the stages in my scenario sketched in the OP? This might help narrow things down. My guess based on what you've said is that the biggest update for you would be Step 2, because that's when it's clear we have a working method for training LLMs to be continuously-running agents -- i.e. long-term memory and continuous/exploratory work processes.
Here's a sketch for what I'd like to see in the future--a better version of the scenario experiment done above:
1000x energy consumption in 10-20 years is a really wild prediction, I would give it a <0.1% probability.
It's several orders of magnitude faster than any previous multiple, and requires large amounts of physical infrastructure that takes a long time to construct.
1000x is a really, really big number.
In 2022 figures, total worldwide energy consumption was about 180 PWh/year.[1]
Of that:
(2 sig fig because we're talking about OOM here)
There has only been a 10x increase in the last 100 years: humanity consumed approx. 18 PWh/year around 1920 or so (historical data is sketchy for obvious reasons).
Looking at doubling times:
1800: 5,653 TWh
1890: 10,684 TWh (90 years)
1940: 22,869 TWh (50 years)
1960: 41,814 TWh (20 years)
1978: 85,869 TWh (18 years)
2018: 172,514 TWh (40 years)
So historically, the fastest doubling took about 18-20 years.
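As a sanity check on the figures above, here's a quick sketch (using the TWh values quoted in this comment) that confirms the historical doubling times and shows how far outside that range a 1000x-in-20-years trajectory would be:

```python
import math

# World energy consumption milestones quoted above (TWh/year).
milestones = [
    (1800, 5_653),
    (1890, 10_684),
    (1940, 22_869),
    (1960, 41_814),
    (1978, 85_869),
    (2018, 172_514),
]

# Each step is roughly a doubling; print the multiple and elapsed years.
for (y0, e0), (y1, e1) in zip(milestones, milestones[1:]):
    print(f"{y0}->{y1}: {e1 / e0:.2f}x in {y1 - y0} years")

# Growing 1000x requires ~10 doublings (2**10 = 1024). Doing that in
# 20 years means one doubling every ~2 years, roughly 10x faster than
# the fastest historical doubling in the list above (~18-20 years).
doublings = math.log2(1000)
print(f"1000x = {doublings:.1f} doublings; "
      f"over 20 years that's one every {20 / doublings:.1f} years")
```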
It takes 5-10 years for humans to build a medium to large size power plant, assuming no legal constraints.
AGI is very unlikely to be able to build an individual plant much faster, although it could build more at once.
Let's ignore that and assume AGI can build instantly.
I strongly disagree. The underlying reason is that an actual singularity seems reasonably likely.
This involves super-exponential growth driven by vastly superhuman intelligence.
Large-scale fusion or literal Dyson spheres are both quite plausible relatively soon (<5 years) after AGI if growth isn't restricted by policy or coordination.
I think you aren't engaging with the reasons why smart people think that 1000x energy consumption could happen soon. It's all about the growth rates. Obviously anything that looks basically like human industrial society won't be getting to 1000x in the next 20 years; the concern is that a million superintelligences commanding an obedient human nation-state might be able to design a significantly faster-growing economy. For an example of how I'm thinking about this, see this comment.
I was surprised by this number (I would have guessed total power consumption was a much lower fraction of total solar energy), so I just ran some quick numbers and it basically checks out.
Plugging this in and doing some dimensional analysis, it looks like the solar energy striking the Earth is about 2000x current energy consumption, which is the same OOM.
A NOAA site claims it's more like 10,000x:
173,000 terawatts of solar energy strikes the Earth continuously. That's more than 10,000 times the world's total energy use.
But plugging this number in with the OWiD value for 2022 gives about 8500x multiplier (I think the "more than 10000x" claim was true at the time it was made though). So maybe it's an OOM off, but for a loose claim using round numbers it seems close enough for me.
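For anyone who wants to redo the arithmetic, here's the dimensional-analysis sketch using the NOAA figure of 173,000 TW of continuous solar power and the ~180 PWh/year consumption figure quoted above (the exact multiplier depends on which irradiance and consumption estimates you plug in):

```python
# Continuous solar power striking the Earth, per the NOAA quote (TW).
solar_power_tw = 173_000
hours_per_year = 8766            # 365.25 days * 24 hours

# Total solar energy per year, converted to TWh/year.
solar_twh_per_year = solar_power_tw * hours_per_year   # ~1.5e9 TWh/yr

# 2022 worldwide consumption, converted from PWh/year to TWh/year.
consumption_twh_per_year = 180 * 1000                  # 180,000 TWh/yr

# Ratio of solar input to human consumption: roughly 8,400x,
# the same ballpark as the ~8500x figure mentioned above.
multiplier = solar_twh_per_year / consumption_twh_per_year
print(f"solar input / consumption ≈ {multiplier:.0f}x")
```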
[edit: Just realized that Richard121 quotes some of the same figures above for total energy use and solar ir...
A question for all: if you are wrong, and in 4/13/40 years most of this fails to come true, will you blame it on your own models being wrong, or shift the goalposts towards the success of the AI safety movement / government crackdowns on AI development? If the latter, how will you be able to prove that AGI definitely would have come had government and industry not slowed down development?
To add more substance to this comment: I felt Ege came out looking the most salient here. In general, making predictions about the future should be backed by heavy uncertainty. He didn't even disagree very strongly with most of the central premises of the other participants, he just placed his estimates much more humbly and cautiously. He also brought up the mundanity of progress and boring engineering problems, something I see as the main bottleneck in the way of a singularity. I wouldn't be surprised if the singularity turns out to be a physically impossible phenomenon because of hard limits in parallelisation of compute or queueing theory or supply chains or materials processing or something.
Thank you for raising this explicitly. I think probably lots of people's timelines are based partially on vibes-to-do-with-what-positions-sound-humble/cautious, and this isn't totally unreasonable so deserves serious explicit consideration.
I think it'll be pretty obvious whether my models were wrong or whether the government cracked down. E.g. how much compute is spent on the largest training run in 2030? If it's only on the same OOM as it is today, then it must have been government crackdown. If instead it's several OOMs more, and moreover the training runs are still of the same type of AI system (or something even more powerful) as today (big multimodal LLMs) then I'll very happily say I was wrong.
Re humility and caution: Humility and caution should push in both directions, not just one. If your best guess is that AGI is X years away, adding an extra dose of uncertainty should make you fatten both tails of your distribution -- maybe it's 2X years away, but maybe instead it's X/2 years away.
(Exception is for planning fallacy stuff -- there we have good reason to think people are systematically biased toward shorter timelines. So if your AGI timelines are primarily based on p...
This random Twitter person says that it can't. Disclaimer: haven't actually checked for myself.
https://chat.openai.com/share/36c09b9d-cc2e-4cfd-ab07-6e45fb695bb1
Here is me playing against GPT-4, no vision required. It does just fine at normal tic-tac-toe, and figures out anti-tic-tac-toe with a little bit of extra prompting.
GPT-4 can follow the rules of tic-tac-toe, but it cannot play optimally. In fact it often passes up opportunities for wins. I've spent about an hour trying to get GPT-4 to play optimal tic-tac-toe without any success.
Here's an example of GPT-4 playing sub-optimally: https://chat.openai.com/share/c14a3280-084f-4155-aa57-72279b3ea241
Here's an example of GPT-4 suggesting a bad move for me to play: https://chat.openai.com/share/db84abdb-04fa-41ab-a0c0-542bd4ae6fa1
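To make "passing up a win" concrete, here is a minimal sketch (my own illustration, not anything GPT-4 was actually tested against) that finds the immediate winning moves for a player, assuming a 9-character row-major board encoding with 'X', 'O', and '-' for empty squares:

```python
# All eight three-in-a-row lines on a 3x3 board (row-major indices).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winning_moves(board: str, player: str) -> list[int]:
    """Return indices where `player` completes three-in-a-row this turn."""
    moves = []
    for i, cell in enumerate(board):
        if cell != '-':
            continue
        b = board[:i] + player + board[i + 1:]
        if any(all(b[j] == player for j in line) for line in LINES):
            moves.append(i)
    return moves

# Example: X has two in the top row; square 2 is an immediate win.
# An optimal player never misses a move this function flags.
print(winning_moves("XX-O-O---", "X"))   # → [2]
```

A move that ignores a non-empty result here is exactly the kind of sub-optimal play described in the transcripts above.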
@Daniel Kokotajlo it looks like you expect 1000x-energy 4 years after 99%-automation. I thought we get fast takeoff, all humans die, and 99% automation at around the same time (but probably in that order) and then get massive improvements in technology and massive increases in energy use soon thereafter. What takes 4 years?
(I don't think the part after fast takeoff or all humans dying is decision-relevant, but maybe resolving my confusion about this part of your model would help illuminate other confusions too.)
Good catch. Let me try to reconstruct my reasoning:
Distinguishing:
(a) 99% remotable 2023 tasks automateable (the thing we forecast in the OP)
(b) 99% 2023 tasks automatable
(c) 99% 2023 tasks automated
(d) Overpower ability
My best guess at the ordering is (a) -> (d) -> (b) -> (c).
Rationale: Overpower ability probably requires something like a fully functioning general purpose agent capable of doing hardcore novel R&D. So, (a). However it probably doesn't require sophisticated robots, of the sort you'd need to actually automate all 2023 tasks. It certainly doesn't require actually having replaced all human jobs in the actual economy, though for strategic reasons a coalition of powerful misaligned AGIs would plausibly wait to kill the humans until they had actually rendered the humans unnecessary.
My best guess is that a, d, and b will all happen in the same year, possibly within the same month. c will probably take longer for reasons sketched above.
I think one component is that the prediction is for when 99% of jobs are automatable, not when they are automated (Daniel probably has more to say here, but this one clarification seems important).
Ege, do you think you'd update if you saw a demonstration of sophisticated sample-efficient in-context learning and far-off-distribution transfer?
Manifold Market on this question:
Introduction
How many years will pass before transformative AI is built? Three people who have thought about this question a lot are Ajeya Cotra from Open Philanthropy, Daniel Kokotajlo from OpenAI and Ege Erdil from Epoch. Despite each spending at least hundreds of hours investigating this question, they still disagree substantially about the relevant timescales. For instance, here are their median timelines for one operationalization of transformative AI:
You can see the strength of their disagreements in the graphs below, where they give very different probability distributions over two questions relating to AGI development (note that these graphs are very rough and are only intended to capture high-level differences, and especially aren't very robust in the left and right tails).
So I invited them to have a conversation about where their disagreements lie, sitting down for 3 hours to have a written dialogue. You can read the discussion below, which I personally found quite valuable.
The dialogue is roughly split in two, with the first part focusing on disagreements between Ajeya and Daniel, and the second part focusing on disagreements between Daniel/Ajeya and Ege.
I'll summarize the discussion here, but you can also jump straight in.
Summary of the Dialogue
Some Background on their Models
Ajeya and Daniel are using a compute-centric model for their AI forecasts, illustrated by Ajeya's draft AI Timelines report and Tom Davidson's takeoff model, where the question "when will we get transformative AI?" reduces to "how much compute is necessary to get AGI, and when will we have that much compute?" (modeling algorithmic advances as reductions in necessary compute).
Ege, by contrast, thinks such models should have a lot of weight in our forecasts, but that they likely miss important considerations and that there isn't enough evidence to justify the extraordinary predictions they make.
Habryka's Overview of Ajeya & Daniel discussion
These disagreements probably explain some but not most of the differences in the timelines for Daniel and Ajeya.
Habryka's Overview of Ege & Ajeya/Daniel Discussion
Overall, seeing AI get substantially better at transfer learning (e.g. seeing an AI be trained on one genre of video game and then very quickly learn to play another genre of video game) would update all participants substantially towards shorter timelines.
We ended the dialogue with Ajeya, Daniel and Ege putting numbers on how much various AGI milestones would cause them to update their timelines (with the concrete milestones proposed by Daniel). Time constraints made it hard to go into as much depth as we would have liked, but Daniel and I are excited about fleshing out more concrete scenarios of how AGI could play out and then collecting more data on how people would update in such scenarios.
The Dialogue
Visual probability distributions
Opening statements
Daniel
Ege
Ajeya
On in-context learning as a potential crux
Taking into account government slowdown
Recursive self-improvement and AI's speeding up R&D
Do we expect transformative AI pre-overhang or post-overhang?
Hofstadter's law in AGI forecasting
Summary of where we are at so far and exploring additional directions
Exploring conversational directions
Ege's median world
Far-off-distribution transfer
A concrete scenario & where its surprises are
Overall summary, takeaways and next steps