I had a nice conversation with Ege today over dinner, in which we identified a possible bet to make! Something I think will probably happen in the next 4 years, that Ege thinks will probably NOT happen in the next 15 years, such that if it happens in the next 4 years Ege will update towards my position and if it doesn't happen in the next 4 years I'll update towards Ege's position.
Drumroll...
I (DK) have lots of ideas for ML experiments, e.g. dangerous capabilities evals, e.g. simple experiments related to paraphrasers and so forth in the Faithful CoT agenda. But I'm a philosopher, I don't code myself. I know enough that if I had some ML engineers working for me that would be sufficient for my experiments to get built and run, but I can't do it by myself.
When will I be able to implement most of these ideas with the help of AI assistants basically substituting for ML engineers? So I'd still be designing the experiments and interpreting the results, but AutoGPT5 or whatever would be chatting with me and writing and debugging the code.
I think: Probably in the next 4 years. Ege thinks: probably not in the next 15.
Ege, is this an accurate summary?
This post taught me a lot about different ways of thinking about timelines, thanks to everyone involved!
I’d like to offer some arguments that, contra Daniel’s view, AI systems are highly unlikely to be able to replace 99% of current fully remote jobs anytime in the next 4 years. As a sample task, I’ll reference software engineering projects that take a reasonably skilled human practitioner one week to complete. I imagine that, for AIs to be ready for 99% of current fully remote jobs, they would need to be able to accomplish such a task. (That specific category might be less than 1% of all remote jobs, but I imagine that the class of remote jobs requiring at least this level of cognitive ability is more than 1%.)
Rather than referencing scaling laws, my arguments stem from analysis of two specific mechanisms which I believe are missing from current LLMs:
Thanks for this thoughtful and detailed and object-level critique! Just the sort of discussion I hope to inspire. Strong-upvoted.
Here are my point-by-point replies:
Of course there are workarounds for each of these issues, such as RAG for long-term memory, and multi-prompt approaches (chain-of-thought, tree-of-thought, AutoGPT, etc.) for exploratory work processes. But I see no reason to believe that they will work sufficiently well to tackle a week-long project. Briefly, my intuitive argument is that these are old school, rigid, GOFAI, Software 1.0 sorts of approaches, the sort of thing that tends to not work out very well in messy real-world situations. Many people have observed that even in the era of GPT-4, there is a conspicuous lack of LLMs accomplishing any really meaty creative work; I think these missing capabilities lie at the heart of the problem.
I agree that if no progress is made on long-term memory and iterative/exploratory work processes, we won't have AGI. My position is that we are already seeing significant progress in these dimensions and that we will see more significant progress in the next 1-3 years. (If 4 years from now we haven't seen such progress I'll admit...
Likewise, thanks for the thoughtful and detailed response. (And I hope you aren't too impacted by current events...)
I agree that if no progress is made on long-term memory and iterative/exploratory work processes, we won't have AGI. My position is that we are already seeing significant progress in these dimensions and that we will see more significant progress in the next 1-3 years. (If 4 years from now we haven't seen such progress I'll admit I was totally wrong about something). Maybe part of the disagreement between us is that the stuff you think are mere hacky workarounds, I think might work sufficiently well (with a few years of tinkering and experimentation perhaps).
Wanna make some predictions we could bet on? Some AI capability I expect to see in the next 3 years that you expect to not see?
Sure, that'd be fun, and seems like about the only reasonable next step on this branch of the conversation. Setting good prediction targets is difficult, and as it happens I just blogged about this. Off the top of my head, predictions could be around the ability of a coding AI to work independently over an extended period of time (at which point, it is arguably an "engineering AI"). Two di...
Oooh, I should have thought to ask you this earlier -- what numbers/credences would you give for the stages in my scenario sketched in the OP? This might help narrow things down. My guess based on what you've said is that the biggest update for you would be Step 2, because that's when it's clear we have a working method for training LLMs to be continuously-running agents -- i.e. long-term memory and continuous/exploratory work processes.
Here's a sketch for what I'd like to see in the future--a better version of the scenario experiment done above:
1000x energy consumption in 10-20 years is a really wild prediction, I would give it a <0.1% probability.
It's several orders of magnitude faster than any previous multiple, and requires large amounts of physical infrastructure that takes a long time to construct.
1000x is a really, really big number.
2022 figures, total worldwide consumption was 180 PWh/year[1]
Of that:
(2 sig fig because we're talking about OOM here)
There has only been a x10 multiple in the last 100 years - humanity consumed approx. 18 PWh/year around 1920 or so (details are sketchy for obvious reasons).
Looking at doubling time, we have:
1800 (5653 TWh)
1890 (10684 TWh) - 90 years
1940 (22869 TWh) - 50
1960 (41814 TWh) - 20
1978 (85869 TWh) - 18
2018 (172514 TWh) - 40
So historically, the fastest rate of doubling has been 20 years.
It takes 5-10 years for humans to build a medium to large size power plant, assuming no legal constraints.
AGI is very unlikely to be able to build an individual plant much faster, although it could build more at once.
Let's ignore that and assume AGI can build instantly.
I strongly disagree. The underlying reason is that an actual singularity seems reasonably likely.
This involves super-exponential growth driven by vastly superhuman intelligence.
Large scale fusion or literal dyson spheres are both quite plausible relatively soon (<5 years) after AGI if growth isn't restricted by policy or coordination.
I think you aren't engaging with the reasons why smart people think that 1000x energy consumption could happen soon. It's all about the growth rates. Obviously anything that looks basically like human industrial society won't be getting to 1000x in the next 20 years; the concern is that a million superintelligences commanding an obedient human nation-state might be able to design a significantly faster-growing economy. For an example of how I'm thinking about this, see this comment.
I was surprised by this number (I would have guessed total power consumption was a much lower fraction of total solar energy), so I just ran some quick numbers and it basically checks out.
Plugging this in and doing some dimensional analysis, it looks like the earth uses about 2000x the current energy consumption, which is the same OOM.
A NOAA site claims it's more like 10,000x:
173,000 terawatts of solar energy strikes the Earth continuously. That's more than 10,000 times the world's total energy use.
But plugging this number in with the OWiD value for 2022 gives about 8500x multiplier (I think the "more than 10000x" claim was true at the time it was made though). So maybe it's an OOM off, but for a loose claim using round numbers it seems close enough for me.
[edit: Just realized that Richard121 quotes some of the same figures above for total energy use and solar ir...
A question for all: If you are wrong and in 4/13/40 years most of this fails to come true, will you blame it on your own models being wrong or shift goalposts towards the success of the AI safety movement / government crack downs on AI development? If the latter, how will you be able to prove that AGI definitely would have come had the government and industry not slowed down development?
To add more substance to this comment: I felt Ege came out looking the most salient here. In general, making predictions about the future should be backed by heavy uncertainty. He didn't even disagree very strongly with most of the central premises of the other participants, he just placed his estimates much more humbly and cautiously. He also brought up the mundanity of progress and boring engineering problems, something I see as the main bottleneck in the way of a singularity. I wouldn't be surprised if the singularity turns out to be a physically impossible phenomenon because of hard limits in parallelisation of compute or queueing theory or supply chains or materials processing or something.
Thank you for raising this explicitly. I think probably lots of people's timelines are based partially on vibes-to-do-with-what-positions-sound-humble/cautious, and this isn't totally unreasonable so deserves serious explicit consideration.
I think it'll be pretty obvious whether my models were wrong or whether the government cracked down. E.g. how much compute is spent on the largest training run in 2030? If it's only on the same OOM as it is today, then it must have been government crackdown. If instead it's several OOMs more, and moreover the training runs are still of the same type of AI system (or something even more powerful) as today (big multimodal LLMs) then I'll very happily say I was wrong.
Re humility and caution: Humility and caution should push in both directions, not just one. If your best guess is that AGI is X years away, adding an extra dose of uncertainty should make you fatten both tails of your distribution -- maybe it's 2X years away, but maybe instead it's X/2 years away.
(Exception is for planning fallacy stuff -- there we have good reason to think people are systematically biased toward shorter timelines. So if your AGI timelines are primarily based on p...
This random Twitter person says that it can't. Disclaimer: haven't actually checked for myself.
https://chat.openai.com/share/36c09b9d-cc2e-4cfd-ab07-6e45fb695bb1
Here is me playing against GPT-4, no vision required. It does just fine at normal tic-tac-toe, and figures out anti-tic-tac-toe with a little bit of extra prompting.
GPT-4 can follow the rules of tic-tac-toe, but it cannot play optimally. In fact it often passes up opportunities for wins. I've spent about an hour trying to get GPT-4 to play optimal tic-tac-toe without any success.
Here's an example of GPT-4 playing sub-optimally: https://chat.openai.com/share/c14a3280-084f-4155-aa57-72279b3ea241
Here's an example of GPT-4 suggesting a bad move for me to play: https://chat.openai.com/share/db84abdb-04fa-41ab-a0c0-542bd4ae6fa1
@Daniel Kokotajlo it looks like you expect 1000x-energy 4 years after 99%-automation. I thought we get fast takeoff, all humans die, and 99% automation at around the same time (but probably in that order) and then get massive improvements in technology and massive increases in energy use soon thereafter. What takes 4 years?
(I don't think the part after fast takeoff or all humans dying is decision-relevant, but maybe resolving my confusion about this part of your model would help illuminate other confusions too.)
Good catch. Let me try to reconstruct my reasoning:
Distinguishing:
(a) 99% remotable 2023 tasks automateable (the thing we forecast in the OP)
(b) 99% 2023 tasks automatable
(c) 99% 2023 tasks automated
(d) Overpower ability
My best guess at the ordering will be a->d->b->c.
Rationale: Overpower ability probably requires something like a fully functioning general purpose agent capable of doing hardcore novel R&D. So, (a). However it probably doesn't require sophisticated robots, of the sort you'd need to actually automate all 2023 tasks. It certainly doesn't require actually having replaced all human jobs in the actual economy, though for strategic reasons a coalition of powerful misaligned AGIs would plausibly wait to kill the humans until they had actually rendered the humans unnecessary.
My best guess is that a, d, and b will all happen in the same year, possibly within the same month. c will probably take longer for reasons sketched above.
I think one component is that the prediction is for when 99% of jobs are automatable, not when they are automated (Daniel probably has more to say here, but this one clarification seems important).
Ege, do you think you'd update if you saw a demonstration of sophisticated sample-efficient in-context learning and far-off-distribution transfer?
Manifold Market on this question:
Curated. I feel like over the last few years my visceral timelines have shortened significantly. This is partly in contact with LLMs, particularly their increased coding utility, and a lot downstream of Ajeya's and Daniel's models and outreach (I remember spending an afternoon on an arts-and-crafts 'build your own timeline distribution' that Daniel had nerdsniped me with). I think a lot of people are in a similar position and have been similarly influenced. It's nice to get more details on those models and the differences between them, as well as to hear Ege pushing back with "yeah but what if there are some pretty important pieces that are missing and won't get scaled away?", which I hear from my environment much less often.
There are a couple of pieces of extra polish that I appreciate. First, having some specific operationalisations with numbers and distributions up-front is pretty nice for grounding the discussion. Second, I'm glad that there was a summary extracted out front, as sometimes the dialogue format can be a little tricky to wade through.
On the object level, I thought the focus on schlep in the Ajeya-Daniel section and slowness of economy turnover in the Ajaniel-Ege se...
If human-level AI is reached quickly mainly by spending more money on compute (which I understood to be Kokotajlo's viewpoint; sorry if I misunderstood), it'd also be quite expensive to do inference with, no? I'll try to estimate how it compares to humans.
Let's use Cotra's "tens of billions" for training compared to GPT-4's $100m+, for roughly a 300x multiplier. Let's say that inference costs are multiplied by the same 300x, so instead of GPT-4's $0.06 per 1000 output tokens, you'd be paying GPT-N $18 per 1000 output tokens. I think of GPT output as analog...
Nice analysis. Some thoughts:
1. If scaling continues with something like Chinchilla scaling laws, the 300x multiplier for compute will not be all lumped into increasing parameters / inference cost. Instead it'll be split roughly half and half. So maybe 20x more data/trainingtime and 15x more parameters/inference cost. So, instead of $200/hr, we are talking more like $15/hr.
2. Hardware continues to improve in the near term; FLOP/$ continues to drop. As far as I know. Of course during AI boom times the price will be artificially high due to all the demand... Not sure which direction the net effect will be.
3. Reaching human-level AI might involve trading off inference compute and training compute, as discussed in Davidson's model (see takeoffspeeds.com and linked report) which probably is a factor that increases inference compute of the first AGIs (while shortening timelines-to-AGI) perhaps by multiple OOMs.
4. However much it costs, labs will be willing to pay. An engineer that works 5x, 10x, 100x faster than a human is incredibly valuable, much more valuable than if they worked only at 1x speed like all the extremely high-salaried human engineers at AI labs.
Subjectively there is clear improvement between 7b vs. 70b vs. GPT-4, each step 1.5-2 OOMs of training compute. The 70b models are borderline capable of following routine instructions to label data or pour it into specified shapes. GPT-4 is almost robustly capable of that. There are 3-4 more effective OOMs in the current investment scaling sprint (3-5 years), so another 2 steps of improvement if there was enough equally useful training data to feed the process, which there isn't. At some point, training gets books in images that weren't previously availabl...
The important thing for alignment work isn't the median prediction; if we had an alignment solution just by then, we'd have a 50% chance of dying from that lack.
I think the biggest takeaway is that nobody has a very precise and reliable prediction, so if we want to have good alignment plans in advance of AGI, we'd better get cracking.
I think Daniel's estimate does include a pretty specific and plausible model of a path to AGI, so I take his the most seriously. My model of possible AGI architectures requires even less compute than his, but I think the Hofst...
Ege, do you think you'd update if you saw a demonstration of sophisticated sample-efficient in-context learning and far-off-distribution transfer?
Yes.
Suppose it could get decent at the first-person-shooter after like a subjective hour of messing around with it. If you saw that demo in 2025, how would that update your timelines?
I would probably update substantially towards agreeing with you.
DeepMind released an early-stage research model SIMA: https://deepmind.google/discover/blog/sima-generalist-ai-agent-for-3d-virtual-environments/
It was tested on 6...
I disagree with this update -- I think the update should be "it takes a lot of schlep and time for the kinks to be worked out and for products to find market fit" rather than "the systems aren't actually capable of this." Like, I bet if AI progress stopped now, but people continued to make apps and widgets using fine-tunes of various GPTs, there would be OOMs more economic value being produced by AI in 2030 than today.
As a personal aside: Man, what a good world that would be. We would get a lot of the benefits of the early singularity, but not the risks.
Ma...
I am on a capabilities team at OpenAI right now
Um. What?
I guess I'm out of the loop. I thought you, Daniel, were doing governance stuff.
What's your rationale for working on capabilities if you think timelines are this compressed?
I'm doing safety work at a capabilities team, basically. I'm trying not to advance capabilities myself. I'm trying to make progress on a faithful CoT agenda. Dan Selsam, who runs the team, thought it would be good to have a hybrid team instead of the usual thing where the safety people and capabilities people are on separate teams and the capabilities people feel licensed to not worry about the safety stuff at all and the safety people are relatively out of the loop.
I found the discussion around Hofstadter's law in forecasting to be really useful as I've definitely found myself and others adding fudge factors to timelines to reflect unknown unknowns which may or may not be relevant when extrapolating capabilities from compute.
In my experience many people are of the feeling that current tools are primarily limited by their ability to plan and execute over longer time horizons. Once we have publicly available tools that are capable of carrying out even simple multi-step plans (book me a great weekend away with my parents with a budget of $x and send me the itinerary), I can see timelines amongst the general public being dramatically reduced.
Could you elaborate on what it would mean to demonstrate 'savannah-to-boardroom' transfer? Our architecture was selected for in the wilds of nature, not our training data. To me it seems that when we use an architecture designed for language translation for understanding images we've demonstrated a similar degree of transfer.
I agree that we're not yet there on sample efficient learning in new domains (which I think is more what you're pointing at) but I'd like to be clearer on what benchmarks would show this. For example, how well GPT-4 can integrate a new domain of knowledge from (potentially multiple epochs of training on) a single textbook seems a much better test and something that I genuinely don't know the answer to.
I think it would be helpful if this dialog had a different name. I would hope this isn't the last dialog on timelines, and the current title is sort of capturing the whole namespace. Can we change it to something more specific?
A local remark about this: I've seen a bunch of reports from other people that GPT-4 is essentially unable to play tic-tac-toe, and this is a shortcoming that was highly surprising to me. Given the amount of impressive things it can otherwise do, failing at playing a simple game whose full solution could well be in its training set is really odd.
Huh. This is something that I could just test immediately, so I tried it.
It looks like this is true. When I play a game of tick-tack-toe with GPT-4 it doesn't play optimally, and it let me win in 3 turns.
http...
@Daniel Kokotajlo what odds would you give me for global energy consumption growing 100x by the end of 2028? I'd be happy to bet low hundreds of USD on the "no" side.
ETA: to be more concrete I'd put $100 on the "no" side at 10:1 odds but I'm interested if you have a more aggressive offer.
(5) Q1 2026: The next version comes online. It is released, but it refuses to help with ML research. Leaks indicate that it doesn't refuse to help with ML research internally, and in fact is heavily automating the process at its parent corporation. It's basically doing all the work by itself; the humans are basically just watching the metrics go up and making suggestions and trying to understand the new experiments it's running and architectures it's proposing.
@Daniel Kokotajlo, why do you think they would release it?
...E.g. suppose some AI system was trained to learn new video games: each RL episode was it being shown a video game it had never seen, and it's supposed to try to play it; its reward is the score it gets. Then after training this system, you show it a whole new type of video game it has never seen (maybe it was trained on platformers and point-and-click adventures and visual novels, and now you show it a first-person-shooter for the first time). Suppose it could get decent at the first-person-shooter after like a subjective hour of messing around with it. If
The LessWrong Review runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2024. The top fifty or so posts are featured prominently on the site throughout the year.
Hopefully, the review is better than karma at judging enduring value. If we have accurate prediction markets on the review results, maybe we can have better incentives on LessWrong today. Will this post make the top fifty?
The LessWrong Review runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2024. The top fifty or so posts are featured prominently on the site throughout the year.
Hopefully, the review is better than karma at judging enduring value. If we have accurate prediction markets on the review results, maybe we can have better incentives on LessWrong today. Will this post make the top fifty?
Introduction
How many years will pass before transformative AI is built? Three people who have thought about this question a lot are Ajeya Cotra from Open Philanthropy, Daniel Kokotajlo from OpenAI and Ege Erdil from Epoch. Despite each spending at least hundreds of hours investigating this question, they still still disagree substantially about the relevant timescales. For instance, here are their median timelines for one operationalization of transformative AI:
You can see the strength of their disagreements in the graphs below, where they give very different probability distributions over two questions relating to AGI development (note that these graphs are very rough and are only intended to capture high-level differences, and especially aren't very robust in the left and right tails).
So I invited them to have a conversation about where their disagreements lie, sitting down for 3 hours to have a written dialogue. You can read the discussion below, which I personally found quite valuable.
The dialogue is roughly split in two, with the first part focusing on disagreements between Ajeya and Daniel, and the second part focusing on disagreements between Daniel/Ajeya and Ege.
I'll summarize the discussion here, but you can also jump straight in.
Summary of the Dialogue
Some Background on their Models
Ajeya and Daniel are using a compute-centric model for their AI forecasts, illustrated by Ajeya's draft AI Timelines report, and Tom Davidson's takeoff model where the question of "when transformative AI" gets reduced to "how much compute is necessary to get AGI and when will we have that much compute? (modeling algorithmic advances as reductions in necessary compute)".
Whereas Ege thinks such models should have a lot of weight in our forecasts, but that they likely miss important considerations and doesn't have enough evidence to justify the extraordinary predictions it makes.
Habryka's Overview of Ajeya & Daniel discussion
These disagreements probably explain some but not most of the differences in the timelines for Daniel and Ajeya.
Habryka's Overview of Ege & Ajeya/Daniel Discussion
Overall, whether AI will get substantially better at transfer learning (e.g. seeing an AI be trained on one genre of video game and then very quickly learn to play another genre of video game) would update all participants substantially towards shorter timelines.
We ended the dialogue with Ajeya, Daniel and Ege by putting numbers on how much various AGI milestones would cause them to update their timelines (with the concrete milestones proposed by Daniel). Time constraints made it hard to go into as much depth as we would have liked, but me and Daniel are excited about fleshing more concrete scenarios of how AGI could play out and then collecting more data on how people would update in such scenarios.
The Dialogue
Visual probability distributions
Opening statements
Daniel
Ege
Ajeya
On in-context learning as a potential crux
Taking into account government slowdown
Recursive self-improvement and AI's speeding up R&D
Do we expect transformative AI pre-overhang or post-overhang?
Hofstadter's law in AGI forecasting
Summary of where we are at so far and exploring additional directions
Exploring conversational directions
Ege's median world
Far-off-distribution transfer
A concrete scenario & where its surprises are
Overall summary, takeaways and next steps