This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group and an index of posts so far, see the announcement post. For the schedule of future topics, see MIRI's reading guide.
Welcome. This week we discuss the second section in the reading guide, Forecasting AI. This is about predictions of AI, and what we should make of them.
This post summarizes the section and offers a few relevant notes and ideas for further investigation. My own thoughts and questions for discussion are in the comments.
There is no need to proceed in order through this post. Feel free to jump straight to the discussion. Where applicable, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).
Reading: Opinions about the future of machine intelligence, from Chapter 1 (p18-21) and Muehlhauser, When Will AI be Created?
Summary
Opinions about the future of machine intelligence, from Chapter 1 (p18-21)
- AI researchers hold a variety of views on when human-level AI will arrive, and what it will be like.
- A recent set of surveys of AI researchers produced the following median dates:
- for human-level AI with 10% probability: 2022
- for human-level AI with 50% probability: 2040
- for human-level AI with 90% probability: 2075
- Surveyed AI researchers in aggregate gave 10% probability to 'superintelligence' within two years of human level AI, and 75% to 'superintelligence' within 30 years.
- When asked about the long-term impacts of human-level AI, surveyed AI researchers gave the responses in the figure below (these are 'renormalized median' responses; 'TOP 100' is one of the surveyed groups, 'Combined' is all of them).
- There are various reasons to expect such opinion polls and public statements to be fairly inaccurate.
- Nonetheless, such opinions suggest that the prospect of human-level AI is worthy of attention.
Muehlhauser, When Will AI be Created?

- Predicting when human-level AI will arrive is hard.
- The estimates of informed people can vary between a small number of decades and a thousand years.
- Different time scales have different policy implications.
- Several surveys of AI experts exist, but Muehlhauser suspects sampling bias (e.g. optimistic views being sampled more often) makes such surveys of little use.
- Predicting human-level AI development is the kind of task that experts are characteristically bad at, according to extensive research on what makes people better at predicting things.
- People try to predict human-level AI by extrapolating hardware trends. This probably won't work, as AI requires software as well as hardware, and software appears to be a substantial bottleneck.
- We might try to extrapolate software progress, but software often progresses less smoothly, and is also hard to design good metrics for.
- A number of plausible events might substantially accelerate or slow progress toward human-level AI, such as an end to Moore's Law, depletion of low-hanging fruit, societal collapse, or a change in incentives for development.
- The appropriate response to this situation is uncertainty: you should neither be confident that human-level AI will take less than 30 years, nor that it will take more than a hundred years.
- We can still hope to do better: there are known ways to improve predictive accuracy, such as making quantitative predictions, looking for concrete 'signposts', looking at aggregated predictions, and decomposing complex phenomena into simpler ones.
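As a toy illustration of the 'aggregated predictions' point in the last bullet, here is a minimal sketch (with made-up numbers, not survey data) of why pooling forecasts with a median is more robust than averaging them:

```python
import statistics

# Hypothetical point forecasts of years until human-level AI from five
# forecasters; the numbers are invented for illustration only.
forecasts = [15, 25, 30, 40, 500]

print("mean:  ", statistics.mean(forecasts))    # 122 - dragged out by one extreme forecast
print("median:", statistics.median(forecasts))  # 30  - robust to the outlier
```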
Notes

- More (similar) surveys on when human-level AI will be developed
Bostrom discusses some recent polls in detail, and mentions that others are fairly consistent. Below are the surveys I could find. Several of them give dates when median respondents believe there is a 10%, 50% or 90% chance of AI, which I have recorded as '10% year' etc. If their findings were in another form, those are in the last column. Note that some of these surveys are fairly informal, and many participants are not AI experts (I'd guess especially in the Bainbridge, AI@50 and Klein ones). 'Kruel' is the set of interviews from which Nils Nilsson is quoted on p19. The interviews cover a wider range of topics, and are indexed here.
| Survey | 10% year | 50% year | 90% year | Other predictions |
|---|---|---|---|---|
| Michie 1972 (paper download) | | | | Fairly even spread between 20, 50 and >50 years |
| Bainbridge 2005 | | | | Median prediction 2085 |
| AI@50 poll 2006 | | | | 82% predict more than 50 years (>2056) or never |
| Baum et al AGI-09 | 2020 | 2040 | 2075 | |
| Klein 2011 | | | | median 2030-2050 |
| FHI 2011 | 2028 | 2050 | 2150 | |
| Kruel 2011- (interviews, summary) | 2025 | 2035 | 2070 | |
| FHI: AGI 2014 | 2022 | 2040 | 2065 | |
| FHI: TOP100 2014 | 2022 | 2040 | 2075 | |
| FHI: EETN 2014 | 2020 | 2050 | 2093 | |
| FHI: PT-AI 2014 | 2023 | 2048 | 2080 | |
| Hanson ongoing | | | | Most say have come 10% or less of the way to human level |

- Predictions in public statements
Polls are one source of predictions on AI. Another source is public statements: things people choose to say publicly. MIRI arranged for the collection of these public statements, which you can now download and play with (the original and info about it, my edited version and explanation for changes). The figure below shows the cumulative fraction of public statements claiming that human-level AI will be more likely than not by a particular year, or at least claiming something that can be broadly interpreted as that. It only includes recorded statements made since 2000. There are various warnings and details in interpreting this, but I don't think they make a big difference, so they are probably not worth considering unless you are especially interested. Note that the authors of these statements are a mixture of mostly AI researchers (including disproportionately many working on human-level AI), a few futurists, and a few other people.
Cumulative distribution of predicted date of AI (LH axis = fraction of people predicting human-level AI by that date)
As you can see, the median date (when the graph hits the 0.5 mark) for human-level AI here is much like that in the survey data: 2040 or so.
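If you download the MIRI dataset yourself, a cumulative curve like the one above is straightforward to reproduce. This is a minimal sketch, assuming the data has been saved as a CSV with one prediction per row; the file name and the `predicted_year` column are my placeholders, not necessarily what the actual files use.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder file and column names - adjust to match the actual MIRI dataset.
df = pd.read_csv("miri_ai_predictions.csv")
years = df["predicted_year"].dropna().sort_values()

# Cumulative fraction of statements predicting human-level AI by each year.
cumulative = [(i + 1) / len(years) for i in range(len(years))]

plt.step(years, cumulative, where="post")
plt.xlabel("Predicted year of human-level AI")
plt.ylabel("Cumulative fraction of predictions")
plt.show()

# The median prediction is where the curve crosses 0.5.
print("Median predicted year:", years.median())
```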
I would generally expect predictions in public statements to be relatively early, because people just don't tend to bother writing books about how exciting things are not going to happen for a while, unless their prediction is fascinatingly late. I checked this more thoroughly, by comparing the outcomes of surveys to the statements made by people in similar groups to those surveyed (e.g. if the survey was of AI researchers, I looked at statements made by AI researchers). In my (very cursory) assessment (detailed at the end of this page) there is a bit of a difference: predictions from surveys are 0-23 years later than those from public statements.

- What kinds of things are people good at predicting?
Armstrong and Sotala (p11) summarize a few research efforts in recent decades as follows.
Note that the problem of predicting AI mostly falls on the right. Unfortunately this doesn't tell us anything about how much harder AI timelines are to predict than other things, or the absolute level of predictive accuracy associated with any combination of features. However, if you have a rough idea of how well humans predict things, you might correct it downward when predicting how well humans predict future AI development and its social consequences.

- Biases
As well as just being generally inaccurate, predictions of AI are often suspected to be subject to a number of biases. Bostrom claimed earlier that 'twenty years is the sweet spot for prognosticators of radical change' (p4). A related concern is that people always predict revolutionary changes just within their lifetimes (the so-called Maes-Garreau law). Worse problems come from selection effects: the people making all of these predictions are selected for thinking AI is the best thing to spend their lives on, so might be especially optimistic. Further, more exciting claims of impending robot revolution might be published and remembered more often. More bias might come from wishful thinking: having spent a lot of their lives on it, researchers might hope especially hard for it to go well. On the other hand, as Nils Nilsson points out, AI researchers are wary of past predictions and so try hard to retain respectability, for instance by focusing on 'weak AI'. This could systematically push their predictions later.
We have some evidence about these biases. Armstrong and Sotala (using the MIRI dataset) find people are especially willing to predict AI around 20 years in the future, but couldn't find evidence of the Maes-Garreau law. Another way of looking for the Maes-Garreau law is via the correlation between age and predicted time to AI, which is weak (-.017) in the edited MIRI dataset. A general tendency to make predictions based on incentives rather than available information would be weakly supported by predictions not changing much over time, which is pretty much what we see in the MIRI dataset. In the figure below, 'early' predictions are made before 2000, and 'late' ones since then.
Cumulative distribution of predicted Years to AI, in early and late predictions.
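The age correlation mentioned above is the sort of check that is easy to rerun on the edited dataset. A minimal sketch, assuming (placeholder) columns for the predictor's age at the time of the statement and the predicted number of years until AI:

```python
import pandas as pd
from scipy.stats import pearsonr

# Placeholder file and column names - adjust to the actual edited dataset.
df = pd.read_csv("miri_ai_predictions.csv").dropna(subset=["age", "years_to_ai"])

r, p = pearsonr(df["age"], df["years_to_ai"])
print(f"Correlation between age and predicted years to AI: r = {r:.3f} (p = {p:.2f})")
# A strong Maes-Garreau effect would show up as a clearly negative r
# (older predictors squeezing AI into their remaining lifetimes).
```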
We can learn something about selection effects (whether AI researchers are especially optimistic about AI) by comparing groups that are more or less selected in this way. For instance, we can compare most AI researchers - who tend to work on narrow intelligent capabilities - with researchers of 'artificial general intelligence' (AGI), who specifically focus on creating human-level agents. The figure below shows this comparison with the edited MIRI dataset, using a rough assessment of who works on AGI vs. other AI and only predictions made from 2000 onward ('late'). Interestingly, the AGI predictions indeed look like the most optimistic half of the AI predictions.
Cumulative distribution of predicted date of AI, for AGI and other AI researchers
We can also compare other groups in the dataset - 'futurists' and other people (according to our own heuristic assessment). While the picture is interesting, note that both of these groups were very small (as you can see by the large jumps in the graph).
Cumulative distribution of predicted date of AI, for various groups
Remember that these differences may not be due to bias, but rather to better understanding. It could well be that AGI research is very promising, and the closer you are to it, the more you realize that. Nonetheless, we can say some things from this data. The total selection bias toward optimism in communities selected for optimism is probably not much more than the differences we see here - a few decades in the median - though it could plausibly be that large.
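The group comparisons above follow the same pattern. A minimal sketch, again with placeholder column names (a `group` label such as 'AGI', 'AI', 'futurist' or 'other', the year the statement was made, and the predicted year of AI):

```python
import pandas as pd

# Placeholder file and column names - the real dataset's labels may differ.
df = pd.read_csv("miri_ai_predictions.csv")
late = df[df["statement_year"] >= 2000]  # 'late' predictions, as in the figures above

# Median predicted date of human-level AI for each group, plus group sizes
# (small groups mean large jumps in the cumulative curves).
print(late.groupby("group")["predicted_year"].agg(["median", "count"]))
```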
These have been some rough calculations to get an idea of the extent of a few hypothesized biases. I don't think they are very accurate, but I want to point out that you can actually gather empirical data on these things, and claim that given the current level of research on these questions, you can learn interesting things fairly cheaply, without doing very elaborate or rigorous investigations.

- What definition of 'superintelligence' do AI experts expect within two years of human-level AI with probability 10% and within thirty years with probability 75%?
“Assume for the purpose of this question that such HLMI will at some point exist. How likely do you then think it is that within (2 years / 30 years) thereafter there will be machine intelligence that greatly surpasses the performance of every human in most professions?” See the paper for other details about Bostrom and Müller's surveys (the ones in the book).
In-depth investigations
If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some taken from Luke Muehlhauser's list:
- Instead of asking how long until AI, Robin Hanson's mini-survey asks people how far we have come (in a particular sub-area) in the last 20 years, as a fraction of the remaining distance. Responses to this question are generally fairly low - 5% is common. His respondents also tend to say that progress isn't especially accelerating. These estimates imply that in any given sub-area of AI, human-level ability should be reached in about 200 years (a minimal extrapolation sketch appears after this list), which is strongly at odds with what researchers say in the other surveys. An interesting project would be to expand Robin's survey, and try to understand the discrepancy, and which estimates we should be using. We made a guide to carrying out this project.
- There are many possible empirical projects which would better inform estimates of timelines e.g. measuring the landscape and trends of computation (MIRI started this here, and made a project guide), analyzing performance of different versions of software on benchmark problems to find how much hardware and software contributed to progress, developing metrics to meaningfully measure AI progress, investigating the extent of AI inspiration from biology in the past, measuring research inputs over time (e.g. a start), and finding the characteristic patterns of progress in algorithms (my attempts here).
- Make a detailed assessment of likely timelines in communication with some informed AI researchers.
- Gather and interpret past efforts to predict technology decades ahead of time. Here are a few efforts to judge past technological predictions: Clarke 1969, Wise 1976, Albright 2002, Mullins 2012, Kurzweil on his own predictions, and other people on Kurzweil's predictions.
- Above I showed you several rough calculations I did. A rigorous version of any of these would be useful.
- Did most early AI scientists really think AI was right around the corner, or was it just a few people? The earliest survey available (Michie 1973) suggests it may have been just a few people. For those that thought AI was right around the corner, how much did they think about the safety and ethical challenges? If they thought and talked about it substantially, why was there so little published on the subject? If they really didn’t think much about it, what does that imply about how seriously AI scientists will treat the safety and ethical challenges of AI in the future? Some relevant sources here.
- Conduct a Delphi study of likely AGI impacts. Participants could be AI scientists, researchers who work on high-assurance software systems, and AGI theorists.
- Signpost the future. Superintelligence explores many different ways the future might play out with regard to superintelligence, but cannot help being somewhat agnostic about which particular path the future will take. Come up with clear diagnostic signals that policy makers can use to gauge whether things are developing toward or away from one set of scenarios or another. If X does or does not happen by 2030, what does that suggest about the path we’re on? If Y ends up taking value A or B, what does that imply?
- Another survey of AI scientists’ estimates on AGI timelines, takeoff speed, and likely social outcomes, with more respondents and a higher response rate than the best current survey, which is probably Müller & Bostrom (2014).
- Download the MIRI dataset and see if you can find anything interesting in it.
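For the Hanson-style estimate in the first project above, here is the kind of back-of-the-envelope extrapolation involved, assuming progress continues at a constant linear rate and reading respondents' answers as a fraction of the total distance to human-level ability (both assumptions are contestable):

```python
def remaining_years(elapsed_years: float, fraction_covered: float) -> float:
    """Linear extrapolation: if fraction_covered of the total distance to
    human-level ability was covered in elapsed_years, estimate the years left."""
    return elapsed_years * (1 - fraction_covered) / fraction_covered

# ~10% of the way in 20 years implies roughly two more centuries;
# the common 5% answer implies roughly four.
print(remaining_years(20, 0.10))  # 180.0
print(remaining_years(20, 0.05))  # 380.0
```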
How to proceed
This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
Next week, we will talk about two paths to the development of superintelligence: AI coded by humans, and whole brain emulation. To prepare, read Artificial Intelligence and Whole Brain Emulation from Chapter 2. The discussion will go live at 6pm Pacific time next Monday 29 September. Sign up to be notified here.
I'm not convinced AI researchers are the most relevant experts for predicting when human-level AI will occur, nor the circumstances and results of its arrival. Similarly, I'm not convinced that excellence in baking cakes coincides that well with expertise in predicting the future of the cake industry, nor the health consequences of one's baking. Certainly both confer some knowledge, but I would expect someone with background in forecasting for instance to do better.
If a person wanted to make their prediction of human-level AI entirely based on what was best for them, without regard to truth, when would be the best time? Is twenty years really the sweetest spot?
I think this kind of exercise is helpful for judging the extent to which people's predictions really are influenced by other motives - I fear it's tempting to look at whatever people predict and see a story about the incentives that would drive them there, and take their predictions as evidence that they are driven by ulterior motives.
The question isn't asking when the best time for the AI to be created is. It's asking what the best time to predict the AI will be created is. E.g. what prediction sounds close enough to be exciting and to get me that book deal, but far enough away not to be obviously wrong, so that people will have forgotten about my prediction by the time it hasn't actually come true? This is an attempt to determine how much the predictions may be influenced by self-interest bias, etc.
If one includes not only the current state of affairs on Earth for predicting when superintelligent AI will occur, but considers the whole of the universe (or at least our galaxy) it raises the question of an AI-related Fermi paradox: Where are they?
I assume that extraterrestrial civilizations (given they exist) which have advanced to a technological society will have accelerated growth of progress similar to ours and create a superintelligent AI. After the intelligence explosion the AI would start consuming energy from planets and stars and convert matter...
The question “When Will AI Be Created?” is an interesting starting-point, but it is not sufficiently well-formulated for a proper Bayesian quantitative forecast.
We need to bring more rigor to this work.
The tactic from Luke’s article that has not been done in sufficient detail is “decomposition.”
My attention has shifted from the general question of “When will we have AGI,” to the question of what are some technologies which might become components of intelligent systems with self-improving capabilities, and when will these technologies become available.
Prop...
How would you like this reading group to be different in future weeks?
I think this is pretty good at the moment. Thanks very much for organizing this, Katja - it looks like a lot of effort went into this, and I think it will significantly increase the amount the book gets read, and dramatically increase the extent to which people really interact with the ideas.
I eagerly await chapter II, which I think is a major step up in terms of new material for LW readers.
"My own view is that the median numbers reported in the expert survey do not have enough probability mass on later arrival dates. A 10% probability of HLMI not having been developed by 2075 or even 2100... seems too low"
Should Bostrom trust his own opinion on this more than the aggregated judgement of a large group of AI experts?
'One result of this conservatism has been increased concentration on "weak AI" - the variety devoted to providing aids to human thought - and away from "strong AI" - the variety that attempts to mechanise human-level intelligence' - Nils Nilsson, quoted on p18.
I tend to think that 'weak AI' efforts will produce 'strong AI' regardless, and not take that much longer than if people were explicitly trying to get strong AI. What do you think?
Have there been any surveys asking experts when they expect superintelligence, rather than asking about HLMI? I'd be curious to see the results of such a survey, asking about time until "machine intelligence that greatly surpasses the performance of every human in most professions" as FHI's followup question put it, or similar. Since people indicated that they expected a fairly significant gap between HLMI and superintelligence, that would imply that the results of such a survey (asking only about superintelligence) should give longer time estimates, but I wouldn't be surprised if the results ended up giving very similar time estimates to the surveys about time until HLMI.
When we talk about the arrival of "human-level" AI, we don't really care if it is at a human level at various tasks that we humans work on. Rather, if we're looking at AI risk, we care about AI that's above human level in this sense: it can outwit us.
I can imagine some scenarios in which AI trounces humans badly on its way to goal achievement, while lacking most human areas of intelligence. These could be adversarial AIs: an algotrading AI that takes over the world economy in a day or a military AI that defeats an enemy nation in an instant w...
What did you find least persuasive in this week's reading?
Bostrom talks about a recent set of surveys of AI experts in which 'human-level machine intelligence' was given 10% probability of arriving by (median) 2022. How much should we be concerned about this low, but non-negligible chance? Are we as a society more or less concerned than we should be?
Was there anything in particular in this week's reading that you would like to learn more about, or think more about?
Which arguments do you think are especially strong in this week's reading?
Did you change your mind about anything as a result of this week's reading?
I respectfully disagree with Muehlhauser's claim that the AI expert surveys are of little use to us, due to selection bias. For one thing, I think the scale of the bias is unlikely to be huge: probably a few decades. For another thing, we can probably roughly understand its size. For a third, we know which direction the bias is likely to go in, so we can use survey data as lower bound estimates. For instance, if AI experts say there is a 10% chance of AGI by 2022, then probably it isn't higher than that.
After all this, when do you think human level AI will arrive?
Given all this inaccuracy, and potential for bias, what should we make of the predictions of AI experts? Should we take them at face value? Try to correct them for biases we think they might have, then listen to them? Treat them as completely uninformative?
The "Cumulative distribution of predicted Years to AI, in early and late predictions" chart is interesting... it looks like expert forecasts regarding how many years left until AI have hardly budged in ~15 years.
If we consider AI to be among the reference class of software projects, it's worth noting that software projects are famously difficult to forecast development timelines for and are famous for taking much longer than expected. And that's when there isn't even new math, algorithms, etc. to invent.
"Small sample sizes, selection biases, and - above all - the inherent unreliability of the subjective opinions elicited mean that one should not read too much into these expert surveys and interviews. They do not let us draw any strong conclusion." - Bostrom, p21
Do you agree that we shouldn't read too much into e.g. AI experts predicting human-level AI with 90% probability by 2075?
I think an important fact for understanding the landscape of opinions on AI, is that AI is often taken as a frivolous topic, much like aliens or mind control.
Two questions:
1) Why is this?
2) How should we take it as evidence? For instance, if a certain topic doesn't feel serious, how likely is it to really be low value? Under what circumstances should I ignore the feeling that something is silly?
In both GOFAI and ML, progress is hampered by the inconvenient fact that stupid methods perform unreasonably well. It's hard for a logic engine to out-perform a finite-state automaton or regular expressions; it's hard for an ML system to out-perform naive Bayes, Markov models, or regression. Intelligence is usually more trouble than it's worth.
Intelligometry
Opinions about the future and expert elicitation
Predictions of our best experts, statistically evaluated, are nonetheless biased. Thank you Katja for contributing additional results and compiling charts. But enlarging the number of people asked will not result in better predictive quality. It would be fun to see the results of a poll on HLMI timing within our reading group, but this would only tell us who we are and nothing about the future of AGI. Everybody in our reading group is at least a bit biased by having read chapters of Nick...