AGI Predictions

by Amandango, Ben Pace · 3 min read · 21st Nov 2020 · 31 comments

This post is a collection of key questions that feed into AI timelines and AI safety work where it seems like there is substantial interest or disagreement amongst the LessWrong community. 

You can make a prediction on a question by hovering over the widget and clicking. You can update your prediction by clicking at a new point, and remove your prediction by clicking on the same point. Try it out:

 

Add questions & operationalizations

This is not intended to be a comprehensive list, so I’d love for people to add their own questions – here are instructions on making your own embedded question. If you have better operationalizations of the questions, you can make your own version in the comments. If there's general agreement on an alternative operationalization being better, I'll add it into the post.

Questions

AGI definition

We’ll define AGI in this post as a unified system that, for almost all economically relevant cognitive tasks, at least matches any human's ability at the task. This is similar to Rohin Shah and Ben Cottier’s definition in this post.

Safety Questions

 

Timelines Questions

See Forecasting AI timelines, Ajeya Cotra’s OP AI timelines report, and Adam Gleave’s #AN80 comment, for more context on this breakdown. I haven’t tried to operationalize this too much, so feel free to be more specific in the comments.

The first three questions in this section are mutually exclusive — that is, the probabilities you assign to them should not sum to more than 100%.
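A quick way to check this constraint on your own answers is sketched below. The function name and the sample probabilities are hypothetical and are not part of the prediction widgets.

```python
def exclusive_predictions_consistent(probabilities, tolerance=1e-9):
    """Return True if probabilities for mutually exclusive outcomes sum to at most 1."""
    if any(p < 0 or p > 1 for p in probabilities):
        raise ValueError("each probability must lie in [0, 1]")
    return sum(probabilities) <= 1 + tolerance


# Hypothetical answers to the three mutually exclusive timelines questions.
print(exclusive_predictions_consistent([0.30, 0.45, 0.20]))  # True: sums to 0.95
print(exclusive_predictions_consistent([0.50, 0.45, 0.20]))  # False: sums to 1.15
```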

 

Non-technical factor questions

 

Operationalizations

Safety Questions

1. Will AGI cause an existential catastrophe?

  • Existential catastrophe is defined here according to Toby Ord’s definition in The Precipice: “An event that causes extinction or the destruction of humanity’s long-term potential”.
  • This assumes that everyone currently working on AI alignment continues to do so.

2. Will AGI cause an existential catastrophe without additional intervention from the AI Alignment research community?

  • Roughly, the AI Alignment research community includes people working at CHAI, MIRI, current safety teams at OpenAI and DeepMind, FHI, AI Impacts, and similar orgs, as well as independent researchers writing on the AI Alignment Forum.
  • “Without additional intervention” = everyone currently in this community stops working on anything directly intended to improve AI safety as of today, 11/20/2020. They may work on AI in a way that indirectly and incidentally improves AI safety, but only to the same degree as researchers outside of the AI alignment community are currently doing this.

3. Will there be an arms race dynamic in the lead-up to AGI?

  • An arms race dynamic is operationalized as: 2 years before superintelligent AGI is built, there are at least 2 companies/projects/countries at the cutting edge, each within 2 years of each other's technology, who are competing and not collaborating.

4. Will a single AGI or AGI project achieve a decisive strategic advantage?

  • This question uses Bostrom’s definition of decisive strategic advantage: “A level of technological and other advantages sufficient to enable it to achieve complete world domination” (Bostrom 2014).

5. Will > 50% of AGI researchers agree with safety concerns by 2030?

  • “Agree with safety concerns” means: broadly understand the concerns of the safety community, and agree that there is at least one concern such that we have not yet solved it and we should not build superintelligent AGI until we do solve it (Rohin Shah’s operationalization from this post).

6. Will there be a 4 year interval in which world GDP growth doubles before the first 1 year interval in which world GDP growth doubles?

  • This is essentially Paul Christiano’s operationalization of the rate of development of AI from his post on Takeoff speeds. I’ve used this specific operationalization rather than “slow vs fast” or “continuous vs discontinuous” due to the ambiguity in how people use these terms.

7. Will AGI cause existential catastrophe conditional on there being a 4 year period of doubling of world GDP growth before a 1 year period of doubling?

  • Uses the same definition of existential catastrophe as previous questions.

8. Will AGI cause existential catastrophe conditional on there being a 1 year period of doubling of world GDP growth without there first being a 4 year period of doubling?

  • For example, we go from current growth rates to doubling within a year.
  • Uses the same definition of existential catastrophe as previous questions.

 

Timelines Questions

9. Will we get AGI from deep learning with small variations, without more insights on a similar level to deep learning?

  • An example would be something like GPT-N + RL + scaling.

10. Will we get AGI from 1-3 more insights on a similar level to deep learning?

  • Self-explanatory.

11. Will we need > 3 breakthroughs on a similar level to deep learning to get AGI?

  • Self-explanatory.

12. Before reaching AGI, will we hit a point where we can no longer improve AI capabilities by scaling?

  • This includes: 1) We are unable to continue scaling, e.g. due to limitations on compute, dataset size, or model size, or 2) We can practically continue scaling but the increase in AI capabilities from scaling plateaus (see below).

13. Before reaching AGI, will we hit a point where we can no longer improve AI capabilities by scaling because we are unable to continue scaling?

  • Self-explanatory.

14. Before reaching AGI, will we hit a point where we can no longer improve AI capabilities by scaling because the increase in AI capabilities from scaling plateaus?

  • Self-explanatory.
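Since question 12 is described as covering the two cases asked about separately in questions 13 and 14, answers to the three can be sanity-checked against each other. The sketch below assumes those two cases are the only ways question 12 can resolve yes; that is a reading of the operationalization, not something the post states outright. The function names and sample numbers are hypothetical.

```python
def union_bounds(p_unable, p_plateau):
    """Bounds on P(unable OR plateau) given only the two individual probabilities."""
    lower = max(p_unable, p_plateau)         # the union is at least as likely as either part
    upper = min(1.0, p_unable + p_plateau)   # and at most the sum of the parts
    return lower, upper


def scaling_answers_consistent(p_either, p_unable, p_plateau, tolerance=1e-9):
    """Check that an answer to Q12 lies within the bounds implied by Q13 and Q14."""
    lower, upper = union_bounds(p_unable, p_plateau)
    return lower - tolerance <= p_either <= upper + tolerance


# Hypothetical answers: Q13 = 10%, Q14 = 30%, so Q12 should fall between 30% and 40%.
print(scaling_answers_consistent(0.35, 0.10, 0.30))
print(scaling_answers_consistent(0.20, 0.10, 0.30))
```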

 

Non-technical factor questions

15. Will we experience an existential catastrophe before we build AGI?

  • Existential catastrophe is defined here according to Toby Ord’s definition in The Precipice: “An event that causes extinction or the destruction of humanity’s long-term potential”.
  • This does not include events that would slow the progress of AGI development but are not existential catastrophes.

16. Will there be another AI Winter (a period commonly referred to as such) before we develop AGI?

  • From Wikipedia: “In the history of artificial intelligence, an AI winter is a period of reduced funding and interest in artificial intelligence research.”
  • This question asks whether people will *refer* to a period as an AI winter; for example, whether Wikipedia and similar sources refer to it as a third AI winter.

 

Additional resources

 

Big thanks to Ben Pace, Rohin Shah, Daniel Kokotajlo, Ethan Perez, and Andreas Stuhlmüller for providing really helpful feedback on this post, and suggesting many of the operationalizations.

31 comments

Great post! I am very curious about how people are interpreting Q10 and Q11, and what their models are. What are prototypical examples of 'insights on a similar level to deep learning'? 

Here's a break-down of examples of things that come to my mind:

Historical DL-level advances: 

  • the development of RL (Q-learning algorithm, etc.)
  • the original formulation of a single neuron, i.e. affine transformation + non-linearity

Future possible DL-level:

  • a successor to back-prop (e.g. how biological neurons learn)
  • a successor to the Q-learning family (e.g. neatly generalizing and extending 'intrinsic motivation' hacks)
  • full brain simulation
  • an alternative to the affine+activation recipe

Below DL-level major advances:

  • an elegant solution to learn from cross-modal inputs in a self-supervised fashion (babies somehow do it)
  • a breakthrough in active learning
  • a generalizable solution to learning disentangled and compositional representations
  • a solution to adversarial examples

Grey areas: 

  • breakthroughs in neural architecture search
  • a breakthrough in neural Turing machine-type research

I'd also like to know how people's thinking fits in with my taxonomy: Are people who leaned yes on Q11 basing their reasoning on the inadequacy of the 'below DL-level advances' list, or perhaps on the necessity of the 'DL-level advances' list? Or perhaps people interpreted those questions completely differently, and don't agree with my dividing lines?

Thank you for asking this question and for giving that break-down. I was wondering something similar. I am not an AI scientist but DL seems like a very big deal to me, and thus I was surprised that so many people seemed to think we need more insights on that level. My charitable interpretation is that they don't think DL is a big deal.

At time of writing, I'm assigning the highest probability to "Will AGI cause an existential catastrophe?" at 85%, with the next-highest predictions at 80% and 76%. Why ... why is everyone so optimistic?? Did we learn something new about the problem actually being easier, or our civilization more competent, than previously believed?

Should—should I be trying to do more x-risk-reduction-relevant stuff (somehow), or are you guys saying you've basically got it covered? (In 2013, I told myself it was OK for dumb little ol' me to personally not worry about the Singularity and focus on temporal concerns in order to not have nightmares, and it turned out that I have a lot of temporal concerns which could be indirectly very relevant to the main plot, but that's not my real reason for focusing on them.)

IMO, we decidedly do not "basically have it covered."

That said, IMO it is generally not a good idea for a person to try to force themselves on problems that will make them crazy, desperate need or no.

I am often tempted to downplay how much catastrophe-probability I see, basically to decrease the odds that people decide to make themselves crazy in the direct vicinity of alignment research and alignment researchers.

And on the other hand, I am tempted by the HPMOR passage:

"Girls?" whispered Susan. She was slowly pushing herself to her feet, though Hermione could see her limbs swaying and quivering. "Girls, I'm sorry for what I said before. If you've got anything clever and heroic to try, you might as well try it."

(To be clear, I have hope. Also, please just don't go crazy and don't do stupid things.)

For me, it's because there's disjunctively many ways that AGI could not happen (global totalitarian regime, AI winter, 55% CFR avian flu escapes a BSL4 lab, unexpected difficulty building AGI & the planning fallacy on timelines which we totally won't fall victim to this time...), or that alignment could be solved, or that I could be mistaken about AGI risk being a big deal, or... 

Granted, I assign small probabilities to several of these events. But my credence for P(AGI extinction | no more AI alignment work from community) is 70% - much higher than my 40% unconditional credence. I guess that means yes, I think AGI risk is huge (remember that I'm saying "40% chance we just die to AGI, unconditionally"), and that's after incorporating the significant contributions which I expect the current community to make. The current community is far from sufficient, but it's also probably picking a good amount of low-hanging fruit, and so I expect that its presence makes a significant difference.

EDIT: I'm decreasing the 70% to 60% to better match my 40% unconditional, because only the current alignment community stops working on alignment. 

I've gone from roughly 2/3 to 1/2 on existential catastrophe (I've put 58% here, was feeling pessimistic) based on the big projects having safety teams who I think are doing really good work. That probably falls under our civilization being more competent than previously believed.

In the following, an event is "catastrophic" if it endangers several human lives; it need not be an existential catastrophe.

Edit: I meant to say "deceptive alignment", but the meaning should be clear either way.

“Catastrophic” is normally used in the term “global catastrophic risk” and means something like “kills 100,000s of people”, so I do think “doesn’t necessarily kill but could’ve killed a couple of people” is a fairly different meaning. In retrospect I realize that I put my answer to the second question far too high: if it just means “a deceptively aligned system nearly gives a few people in hospital a fatal dosage but it’s stopped and we don’t know why the system messed up” then it’s quite plausible nothing this substantial will happen as a result of that.

“Catastrophic” is normally used in the term “global catastrophic risk” and means something like “kills 100,000s of people”, so I do think “doesn’t necessarily kill but could’ve killed a couple of people” is a fairly different meaning.

Agreed. In retrospect, I might have opted for "pre-AGI nearly-deadly accident caused by deceptive alignment." 

In retrospect I realize that I put my answer to the second question far too high: if it just means “a deceptively aligned system nearly gives a few people in hospital a fatal dosage but it’s stopped and we don’t know why the system messed up” then it’s quite plausible nothing this substantial will happen as a result of that.

I intended the situation to be more like "we catch the AI pretending to be aligned, but actually lying, and it kills, or almost kills, at least a few people as a result of that."

With #1, I'm trying to have people predict the "deception is robustly instrumental behavior, but AIs will be bad at it at first and so we'll catch them" scenario. #2 is trying to operationalize whether this would be viewed as a fire alarm.

Some ways you might think scenario #1 won't happen:

  • You don't think deception will be incentivized
  • Fast takeoff means the AI is never smart enough to deceive but dumb enough to get caught
  • Our transparency tools won't be good enough for many people to believe it was actually deceptively aligned

I suspect this is intentional, but the set of predictions is redundant, in the sense that probabilities for three of them mathematically imply the probability of the fourth due to the law of total probability.

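In particular, if #1 is $P(A)$ and #6 is $P(B)$, then #7 and #8 are $P(A \mid B)$ and $P(A \mid \neg B)$, and we have the equality

$$P(A) = P(B)\,P(A \mid B) + P(\neg B)\,P(A \mid \neg B).$$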

The probability I would assign to #8 intuitively is about 0.41. Math based on my other three predictions yields (doing the calculation now) 0.476. I am going to predict the math output rather than my intuition.

Did anyone else calculate their level of inconsistency?
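For anyone who wants to run the same check, here is a minimal sketch. It writes A for "AGI causes an existential catastrophe" (#1) and B for "a 4 year doubling comes before a 1 year doubling" (#6), and it follows the simplification above of treating #8 as conditioning on not-B. The function names and the input numbers are hypothetical; they are not the predictions referred to above, which were not all shared.

```python
def implied_p_a_given_not_b(p_a, p_b, p_a_given_b):
    """Solve P(A) = P(B) P(A|B) + P(not B) P(A|not B) for P(A|not B)."""
    if not 0 <= p_b < 1:
        raise ValueError("P(B) must lie in [0, 1) so that P(not B) > 0")
    return (p_a - p_b * p_a_given_b) / (1 - p_b)


def inconsistency(p_a, p_b, p_a_given_b, p_a_given_not_b):
    """Gap between the stated P(A) and the P(A) implied by the other three answers."""
    implied_p_a = p_b * p_a_given_b + (1 - p_b) * p_a_given_not_b
    return p_a - implied_p_a


# Hypothetical answers: #1 = 40%, #6 = 50%, #7 = 30%.
implied = implied_p_a_given_not_b(0.40, 0.50, 0.30)
print(f"implied #8: {implied:.3f}")
print(f"gap if #8 is answered 0.45 instead: {inconsistency(0.40, 0.50, 0.30, 0.45):+.3f}")
```

A result outside the range 0 to 1 means the first three answers cannot all be right, whatever is entered for the fourth.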

The probability I would assign to #8 intuitively is about 0.41. Math based on my other three predictions yields (doing the calculation now) 0.476. I am going to predict the math output rather than my intuition.

I think the correct response to this realization is not to revise your final answer so as to make it consistent with the first three. It is to revise all four answers so that they are maximally intuitive, subject to the constraint that they be jointly consistent. Which answer comes last is just an artifact of the order of presentation, so it isn't a rational basis for privileging some answers over others.

This is only true if, for example, you think AI would cause GDP growth. My model assigns a lot of probability to ‘AI kills everyone before (human-relevant) GDP goes up that fast’, so questions #7 and #8 are conditional on me being wrong about that. If we can last any small multiples of a year with AI smart enough to double GDP in that timeframe, then things probably aren't as bad as I thought.

How to add your own questions:

  1. Go to elicit.org/binary
  2. Type your question into the field at the top
  3. Click on the question title, and click the copy URL button
  4. Paste the URL into the LessWrong editor

See our launch post for more details!

There is a huge difference in the responses to Q1 (“Will AGI cause an existential catastrophe?”) and Q2 (“...without additional intervention from the existing AI Alignment research community”), to a point that seems almost unjustifiable to me. To pick the first matching example I found (and not to purposefully pick on anybody in particular), Daniel Kokotajlo thinks there's a 93% chance of existential risk without the AI Alignment community's involvement, but only 53% with. This implies that there's a ~43% chance of the AI Alignment community solving the problem, conditional on it being real and unsolved otherwise, but only a ~7% chance of it not occurring for any other reason, including the possibility of it being solved by the researchers building the systems, or the concern being largely incorrect.

What makes people so confident in the AI Alignment research community solving this problem, far above that of any other alternative?
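The two implied figures can be reproduced with a few lines of arithmetic. This is a sketch of one reading of the numbers: it treats the 93% as the chance the problem is "real and unsolved otherwise", and attributes the whole drop from 93% to 53% to the community. The function and variable names are hypothetical.

```python
def implied_split(p_without_community, p_with_community):
    """Split the avoided risk into 'averted for other reasons' and 'averted by the community'."""
    # Chance of no catastrophe even if the community stops working.
    p_averted_otherwise = 1 - p_without_community
    # Chance the community averts it, conditional on it otherwise happening.
    p_community_averts = (p_without_community - p_with_community) / p_without_community
    return p_averted_otherwise, p_community_averts


other, community = implied_split(0.93, 0.53)
print(f"averted for any other reason: {other:.0%}")   # 7%
print(f"averted by the community:     {community:.0%}")  # 43%
```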

I also noticed Daniel’s difference in probabilities there, and thought they were substantial. But it doesn’t seem unreasonable to me. The existing AI x-risk community has changed the global conversation on AI and also been responsible for much in the way of funding and direct research on many related technical problems. I could talk about the specific technical work, or the impact that things like the AI FOOM Debate had on Superintelligence had on OpenPhil, or CFAR on FLI on Musk on OpenAI. Or I could go into detail about the research being done on topics like Iterated Amplification and Agent Foundations and so on and ways that this seems to me to be clear progress on subproblems. I’m not sure exactly what alternatives you might have in mind.

To emphasize, the clash I'm perceiving is not the chance assigned to these problems being tractable, but to the relative probability of ‘AI Alignment researchers’ solving the problems, as compared to everyone else and every other explanation. In particular, people building AI systems intrinsically spend a degree of their effort, even if completely unconvinced about the merits of AI risk, trying to make systems aligned, just because that's a fundamental part of building a useful AI.

I could talk about the specific technical work, or the impact that things like the AI FOOM Debate had on Superintelligence had on OpenPhil, or CFAR on FLI on Musk on OpenAI. Or I could go into detail about the research being done on topics like Iterated Amplification and Agent Foundations and so on and ways that this seems to me to be clear progress on subproblems.

I have a sort of Yudkowskian pessimism towards most of these things (policy won't actually help; Iterated Amplification won't actually work), but I'll try to put that aside here for a bit. What I'm curious about is what makes these sorts of ideas only discoverable in this specific network of people, under these specific institutions, and particularly more promising than other sorts of more classical alignment.

Isn't Iterated Amplification in the class of things you'd expect people to try just to get their early systems to work, at least with ≥20% probability? Not, to be clear, exactly that system, but just fundamentally RL systems that take extra steps to preserve the intentionality of the optimization process.

To rephrase a bit, it seems to me that a worldview in which AI alignment is sufficiently tractable that Iterated Amplification is a huge step towards a solution, would also be a worldview in which AI alignment is sufficiently easy (though not necessarily easy) that there should be a much larger prior belief that it gets solved anyway.

FWIW, I made these judgments quickly and intuitively and thus could easily have just made a silly mistake. Thank you for pointing this out.

So, what do I think now, reflecting a bit more?

--The 7% judgment still seems correct to me. I feel pretty screwed in a world where our entire community stops thinking about this stuff. I think it's because of Yudkowskian pessimism combined with the heavy-tailed nature of impact and research. A world without this community would still be a world where people put some effort into solving the problem, but there would be less effort, by less capable people, and it would be more half-hearted/not directed at actually solving the problem/not actually taking the problem seriously.

--The other judgment? Maybe I'm too optimistic about the world where we continue working. But idk, I am rather impressed by our community and I think we've been making steady progress on all our goals over the last few years. Moreover, OpenAI and DeepMind seem to be taking safety concerns mildly seriously due to having people in our community working there. This makes me optimistic that if we keep at it, they'll take it very seriously, and that would be great.

I interpreted the question as something like "if nobody cares about safety and there isn't a community that takes a special interest in it, will we be safe". I don't think it's specifically this AI Alignment community solving it, it's just that if nobody tries to solve the problem, the problem will stay unsolved.

Edit: And I do now see that I misinterpreted the question. Updated my second estimate downwards because of that. Thanks for pointing this out!

I suspect this question is misworded:

Will there be a 4 year interval in which world GDP growth doubles before the first 1 year interval in which world GDP growth doubles?

Do you mean in which world GDP doubles? World GDP growth doubles when it goes from, say, 0.5% yearly growth to 1% yearly growth.

Personally, I suspect world GDP is most likely to next double in a period after a severe war or depression, so you might want to rephrase to avoid that scenario if that isn't what you're thinking about.
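To make the difference between "GDP doubles" and "GDP growth doubles" concrete, here is a small sketch. It assumes constant compound annual growth over the interval, which is a simplification, and the function name is hypothetical.

```python
def annual_growth_to_double(years):
    """Constant annual growth rate at which GDP doubles over the given number of years."""
    return 2 ** (1 / years) - 1


# GDP itself doubling requires very high growth rates.
print(f"double in 4 years: {annual_growth_to_double(4):.1%} per year")  # about 18.9%
print(f"double in 1 year:  {annual_growth_to_double(1):.1%} per year")  # 100.0%

# GDP *growth* doubling is a much weaker condition: going from 0.5% to 1%
# yearly growth doubles the growth rate while GDP rises by only 1% that year.
old_growth, new_growth = 0.005, 0.01
print(f"growth ratio: {new_growth / old_growth}, GDP multiplier: {1 + new_growth}")
```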

This was a good catch! I did actually mean world GDP, not world GDP growth. Because people have already predicted on this, I added the correct questions above as new questions, and am leaving the previous questions here for reference:

I really appreciate the effort that went into collecting all of these questions, framing them clearly, and coding the clickable predictions.

Thanks a lot for the feature and this post! I'll be really interested in an analysis after a lot of answers are in.

That was fun. This time, I tried not to update too much on other people's predictions. In particular, I'm at 1% for "Will we experience an existential catastrophe before we build AGI?" and at 70% for "Will there be another AI Winter (a period commonly referred to as such) before we develop AGI?", but would probably defer to a better aggregate on the second one.

So the following, for example, don't count as "existential risk caused by AGI", right?

  • many AIs
    • an economy run by advanced AIs amplifying negative externalities, such as pollution, leading to our demise
    • an em world with minds evolving to the point of being non-valuable anymore ("a Disneyland without children")
    • a war by transcending uploads
  • narrow AI
    • a narrow AI killing all humans (ex.: by designing grey goo, a virus, etc.)
    • a narrow AI eroding trust in society until it breaks apart
  • intermediary cause by an AGI, but not ultimate cause
    • a simulation shutdown because our AI didn't have a decision theory for acausal cooperation
    • an AI convincing a human to destroy the world

"Will > 50% of AGI researchers agree with safety concerns by 2030?"

From my research, I think they mostly already do; they just use different framings and care about different time frames.

I've been seeing an intermittent bug on a few of these where tapping to record an answer causes the question text to disappear. Sometimes scrolling away and back fixes it.

Chrome browser on Android phone.

This is intentional. The question text shares space with the list of users and their respective predictions. On mobile, this means when you tap on a section, you see the users who voted in the corresponding range, until you tap away.

Ah, makes sense. I guess I just need to get used to the interface.

Yeah, we had to make some tradeoffs because I really wanted them to fit into a small space, and also to never resize when you interact with them, while also not dominating any post they are in. Not sure whether we hit the perfect balance of the tradeoffs.

What level of background in AI alignment are you assuming/desiring for respondents? Is it just “all readers” where the assumption is that any cultural osmosis etc. is included in what you're trying to measure?

Yeah, any LWer is welcome to record their predictions :)