Nov 11, 2009
In my journeys across the land, I have, to date, encountered four sets of probability calibration tests. (If you just want to make bets on your predictions, you can use Intrade or another prediction market, but these generally don't record calibration data, only which of your bets paid out.) If anyone knows of other tests, please do mention them in the comments, and I'll add them to this post. To avoid spoilers, please do not post what you guessed for the calibration questions, or what the answers are.
The first, to boast shamelessly, is my own, at http://www.acceleratingfuture.com/tom/?p=129. My tests use fairly standard trivia questions (samples: "George Washington actually fathered how many children?", "Who was Woody Allen's first wife?", "What was Paul Revere's occupation?"), with an emphasis on history and pop culture. The quizzes are scored automatically, and for each answer you choose whether to assign a probability of 96%, 90%, 75%, 50%, or 25%. There are five quizzes with fifty questions each: Quiz #1, Quiz #2, Quiz #3, Quiz #4 and Quiz #5.
The second is a project by John Salvatier (LW account) of the University of Washington, at http://calibratedprobabilityassessment.org/. There are three working sets of fifty questions each: two sets of general trivia, and one set of questions about relative distances between American cities (a fourth set, unfortunately, does not appear to be working at this time). The questions do not rotate, but are re-ordered upon refreshing. The probabilities are again multiple choice, with ranges of 51-60%, 61-70%, 71-80%, 81-90%, and 91-100%, for whichever answer you think is more probable. These quizzes are also scored by computer, but instead of spitting back numbers, the computer generates a graph showing the discrepancy between your actual accuracy rate and your claimed accuracy rate. Links: US cities, trivia #1, trivia #2.
The third is a quiz by Steven Smithee of Black Belt Bayesian (LW account here) at http://www.acceleratingfuture.com/steven/?p=96. There are three sets of five questions each, about history, demographics, and Google rankings, plus two sets of (non-testable) questions about the future and historical counterfactuals. (EDIT: Steven has built three more tests in addition to this one, at http://www.acceleratingfuture.com/steven/?p=102, http://www.acceleratingfuture.com/steven/?p=106, and http://www.acceleratingfuture.com/steven/?p=136.) This test must be graded manually, and the answers are in one of the comments below the test (don't look at the comments if you don't want spoilers!).
The fourth is a website by Tricycle Developments, the web developers who built Less Wrong, at http://predictionbook.com/. You can make your own predictions about real-world events, or bet on other people's predictions, at whatever probability you want, and the website records how often you were right relative to the probabilities you assigned. However, since all predictions are made in advance of real-world events, it may take quite a while (on the order of months to years) before you can find out how accurate you were.
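All four tests come down to the same bookkeeping: group your answers by the probability you stated, then compare that number with how often you were actually right. If you grade a quiz by hand, a rough sketch in Python might look like the following. The function name `calibration_table` and the sample data are made up for illustration; none of the sites above are known to work this way internally.

```python
from collections import defaultdict


def calibration_table(predictions):
    """Group (stated_probability, was_correct) pairs by stated probability.

    Returns a dict mapping each stated probability to a tuple of
    (number of answers, observed fraction correct), in ascending order
    of stated probability.
    """
    buckets = defaultdict(list)
    for stated, correct in predictions:
        buckets[stated].append(bool(correct))
    return {
        stated: (len(outcomes), sum(outcomes) / len(outcomes))
        for stated, outcomes in sorted(buckets.items())
    }


# Hypothetical results, invented for this example: each pair is
# (probability you assigned to your answer, whether the answer was right).
sample = [
    (0.9, True), (0.9, True), (0.9, False), (0.9, True),
    (0.75, True), (0.75, False),
    (0.5, True), (0.5, False),
]

for stated, (n, observed) in calibration_table(sample).items():
    print(f"stated {stated:.0%}: {observed:.0%} correct over {n} answers")
```

In this invented sample, the answers given at 90% were right only three times out of four, which is the kind of overconfidence these tests are meant to expose. With so few answers per bucket the numbers are very noisy, so a real check needs far more questions than this.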