Jul 11, 2012
Predictions of the future rely, to a much greater extent than in most fields, on the personal judgement of the expert making them. Just one problem - personal expert judgement generally sucks, especially when the experts don't receive immediate feedback on their hits and misses. Formal models perform better than experts, but when talking about unprecedented future events such as nanotechnology or AI, the choice of the model is also dependent on expert judgement.
Ray Kurzweil has a model of technological intelligence development where, broadly speaking, evolution, pre-computer technological development, post-computer technological development and future AIs all fit into the same exponential increase. When assessing the validity of that model, we could look at Kurzweil's credentials, and maybe compare them with those of his critics - but Kurzweil has given us something even better than credentials, and that's a track record. In various books, he's made predictions about what would happen in 2009, and we're now in a position to judge their accuracy. I haven't been satisfied by the various accuracy ratings I've found online, so I decided to do my own.
Some have argued that we should penalise predictions that "lack originality" or were "anticipated by many sources". But hindsight bias means that we certainly judge many profoundly revolutionary past ideas as "unoriginal", simply because they are obvious today. And saying that other sources anticipated the ideas is worthless unless we can quantify how mainstream and believable those sources were. For these reasons, I'll focus only on the accuracy of the predictions, and make no judgement as to their ease or difficulty (unless they say things that were already true when the prediction was made).
Conversely, I won't be giving any credit for "near misses": this has the hindsight problem in the other direction, where we fit potentially ambiguous predictions to what we know happened. I'll be strict about the meaning of the prediction, as written. A prediction in a published book is a form of communication, so if Kurzweil actually meant something different to what was written, then the fault is entirely his for not spelling it out unambiguously.
One exception to that strictness: I'll be tolerant on the timeline, as I feel that a lot of the predictions were forced into a "ten years from 1999" format. So I'll count a prediction as accurate if it came true at any point up to the end of 2011, if data is available.
The number of predictions actually made seems to vary from source to source; I used my copy of "The Age of Spiritual Machines", which seems to be the original 1999 edition. In the chapter "2009", I counted 63 prediction paragraphs. I then chose ten numbers at random between 1 and 63, and analysed those ten predictions for correctness (those wanting to skip directly to the final score can scroll down). Given Kurzweil's nationality and location, I will assume all predictions refer only to technologically advanced nations, and specifically to the United States if there is any doubt. Please feel free to comment on my judgements below; we may be able to build a Less Wrong consensus verdict. It would be best if you tried to reach your own conclusions before reading my verdict or anyone else's. Hence I present the ten predictions, initially without commentary:
My scale for judging the predictions is: true, weakly true, weakly false, false.
Prediction 5: My office and the computer I'm typing on seem pretty full of cables. Nevertheless, it is true there has been a rise in wireless technology, and wireless computer components, even if they're not ubiquitous. I'll grade this as a weakly true.
Prediction 7: I have failed to find proper data for the first prediction. Anecdotally, it certainly seems false - keyboards are still in ubiquitous use, and I've never personally seen anyone use voice recognition to write documents of any length or even to send texts (a few personal experiments with Siri notwithstanding). The second claim is false: according to an assessment by the National Institute of Standards and Technology, the accuracy of CSR is still nowhere near surpassing human transcription. This lends extra credence to the first claim being false as well: without the diminished error rate, it's very hard to see CSR being used for the majority of text creation. False.
Prediction 8: Apart from the belief that the animated personality would be visual, this is a near-perfect description of Siri and similar assistants. The term "ubiquitous" is tricky, but if we interpret it to mean "to be found everywhere" (rather than "everyone has one"), then the prediction is weakly true (knocked down from true because of the uncertainty about ubiquity).
Prediction 18: Without needing to do the research, I think we can take this claim as evidently true.
Prediction 20: All the stuff about voice recognition is false. The only device that fits that description today is the smartphone, which had not achieved penetration of more than 50% among teenagers in 2011 (teenagers are the median "students of all ages"; adding in university students as well as pre-teens should lower the proportion, not raise it). "Learning materials are accessed through wireless communication" is hard to interpret, as it doesn't give any estimate of what proportion of learning material we are talking about. So though we can give Kurzweil kudos for imagining something like the smartphone, the prediction is weakly false.
Prediction 26: One can quibble about inexpensive, as the products seem to be in the $600 range, but those products certainly exist for book and magazine reading (though not for most signs and displays, as far as I can tell - certainly not in a form the blind can use). The second sentence is true for some screen readers, making the prediction essentially true.
Prediction 44: The relative quantifier in the last sentence ("though, are still predominantly conventional") makes it clear that we should expect intelligent highways to be common among long-distance highways - this isn't a few experimental roads we're talking about. Though we have a few self-driving cars, we have nothing like the intelligent roads implied in this prediction, which specifically implies that most cars on those roads will be self-driven. False.
Prediction 48: The first part of the prediction is true. The second sentence seems false, whether one measures the underclass through relative income (where inequality has been increasing) or through an absolute standard of educational attainment (where the various graduation rates have gone up, implying the underclass is decreasing). There are other ways one could measure the underclass, giving different results. Since one could read the underclass as increasing or decreasing, should we take Kurzweil's claim that it is stable as the correct mean? No. All that means is that had he spelt out his claim in more detail at the time, it would likely have ended up false. Ambiguity does not make a false statement true. The last sentence is virtually impossible to confirm or refute, so the whole prediction is weakly true and weakly false.
Prediction 53: This is a tricky one. The Wii and similar game consoles seem to fit the bill to some extent. However the tone suggests he is talking about a virtual reality experience, which is not what we currently have. So, does he mean virtual reality, or does he mean "games like what they had in 1999, except with much better graphics and features"? How would someone at the time have read the prediction? Again, ambiguity cannot be used to make a false statement true. I'm going to work on the assumption that had he merely meant "graphics and features of video games will improve a lot", he would have said so (certainly his prediction seems to promise much more than that). So the prediction is false.
But what if he was talking about modern games? For a start, his initial sentence gets the relative size of the industries wrong (though that can be read as a throw-away statement rather than a prediction). He also doesn't consider things like Facebook games, which make up a large part of the games industry, and are certainly not interactive virtual environments. What about "these virtual environments allow..."? Well, the statement is possibly an utter triviality, claiming that games exist which feature rafting, hang-gliding or erotic situations (that was already true in 1999). Or it claims that features like these are a major component of the most popular games today, which is false (now, if he'd said "blowing things up with a marvellous amount of weapons..."). Fantasy environments are a much more common feature, so I'm taking that as correct. Under this interpretation, the prediction is weakly true and weakly false for games. In total, reading the statement either way, I'll classify it as (contentiously) weakly false.
Note: I did read Kurzweil's assessment of his own predictions, after I had conducted my own analysis. In that assessment, nearly every ambiguous clause is interpreted in Kurzweil's favour. This could be Kurzweil twisting the predictions in his direction; it could be a blatant example of hindsight bias; or it could be that what Kurzweil meant to say was different from what he wrote. Unfortunately, there is no way for us to tell, so we must make do with what was written and interpret it as best we can.
So, out of the ten predictions, five are to some extent true, four are to some extent false, and one is unclassifiable (reading through the rest of the predictions, completely informally, these proportions seem roughly correct).
Now imagine Kurzweil as a predictor who gives predictions, each with independent probability p of being true (alternatively, assume that a fixed proportion p of the 63 predictions are true, and pretend 63 is high enough that we can treat p as continuous without much loss). If we start with a uniform prior on p between 0 and 1, then we can update given this data. Model prediction 48 as true or false with equal probability. Then the posterior must be proportional to (1-p)^5 p^5 + (1-p)^4 p^6:
This has a mean above 54%, which I'd say is excellent. A prediction record over 50% for a decade that included huge increases in computer power, September 11th and the great recession is intuitively a very good one. Alas there is no central repository of prediction records from various futurists, but in the absence of that, his track record certainly feels impressive. Don't let hindsight bias blind you to how hard this was, and don't simply think of every prediction as binary: generally, there are far more ways for a prediction to be false than there are for it to be true.
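For readers who want to check that figure, here is a minimal sketch of the calculation. It assumes the model described above (uniform prior, prediction 48 counted as true or false with equal weight) and uses the closed form for the integral of p^a (1-p)^b over [0, 1]. The function names are my own illustrative choices, not anything from Kurzweil's report.

```python
from fractions import Fraction
from math import factorial

def beta_integral(a, b):
    """Exact integral of p**a * (1-p)**b over [0, 1] for integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))

def posterior_mean(readings):
    """Posterior mean of p under a uniform prior.

    readings is a list of (n_true, n_false) pairs, one per equally
    likely way of counting the ambiguous predictions.
    """
    norm = sum(beta_integral(t, f) for t, f in readings)
    first_moment = sum(beta_integral(t + 1, f) for t, f in readings)
    return first_moment / norm

# Prediction 48 counted as false (5 true, 5 false) or as true (6 true, 4 false).
mean = posterior_mean([(5, 5), (6, 4)])
print(mean, float(mean))  # 6/11, roughly 0.545
```

Exact fractions are used only to avoid any worry about rounding; ordinary floats would do just as well here.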
On the other hand, if we look at Kurzweil's own ranking of the predictions he gave in the "Age of Spiritual Machines", he grades himself as having either 102 out of 108 or 127 out of 147 correct (with caveats that "even the predictions that were considered 'wrong' in this report were not all wrong"). I've plotted the lower 127/147≈0.86 accuracy on the above graph; that is very far from being a mean estimate (it's in the 99th percentile of the probability distribution). But let's give Kurzweil all we can: we'll reclassify the arguable prediction 53 as being true (posterior proportional to (1-p)^4 p^6 + (1-p)^3 p^7):
That is still not enough to make his accuracy estimate reasonable: his estimate is in the 96th percentile of the probability distribution. Let's be even more generous: let's reclassify the intermediate prediction 48 as also being true (posterior proportional to (1-p)^3 p^7):
Those were very generous adjustments; changing two results is a lot from a sample of ten. But even with the most generous adjustments and taking Kurzweil's lowest estimate of his own accuracy, he is still extraordinarily overconfident: his estimate is in the 94th percentile of the probability distribution. For fun, I flipped another prediction from false to true: even then, his estimate is in the 81st percentile of the probability distribution (and recall that if we were rigorous about the timeline that Kurzweil claimed, at least one of the true predictions would be false).
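The percentiles quoted above can be reproduced with a short sketch along the following lines. It assumes the same uniform-prior model, and relies on the standard identity that, for integer exponents, the partial integral of p^a (1-p)^b is a binomial tail probability times the full integral. The helper names and scenario labels are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from math import comb, factorial

def integral_up_to(a, b, x):
    """Exact integral of p**a * (1-p)**b over [0, x] for integers a, b >= 0."""
    n = a + b + 1
    full = Fraction(factorial(a) * factorial(b), factorial(n))  # value at x = 1
    # For integer parameters, the regularized incomplete beta function
    # equals the binomial tail probability P(Binomial(n, x) >= a + 1).
    tail = sum(comb(n, j) * x**j * (1 - x)**(n - j) for j in range(a + 1, n + 1))
    return full * tail

def posterior_cdf(readings, x):
    """P(p <= x) under a uniform prior, averaging over equally likely readings."""
    below = sum(integral_up_to(t, f, x) for t, f in readings)
    total = sum(integral_up_to(t, f, Fraction(1)) for t, f in readings)
    return below / total

claimed = Fraction(127, 147)  # Kurzweil's lower self-assessment
scenarios = {
    "as judged (48 split)": [(5, 5), (6, 4)],
    "53 counted as true": [(6, 4), (7, 3)],
    "48 also counted as true": [(7, 3)],
    "one more false flipped to true": [(8, 2)],
}
percentiles = {name: float(posterior_cdf(r, claimed)) for name, r in scenarios.items()}
for name, value in percentiles.items():
    print(f"{name}: {value:.3f}")
```

By my arithmetic the four values come out at roughly 0.994, 0.964, 0.949 and 0.820, which matches the percentiles in the text once they are rounded down.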
So what can this tell us about Kurzweil as a futurist, and about the predictions he makes? Essentially two points stand out:
So I feel we should take Kurzweil's predictions as a good baseline, with much wider error bars and caveats, paying relatively less attention to those areas where we feel that being a good Bayesian updater becomes important. We should thus probably pay more attention to his models than to his interpretation of his models.