To improve our prediction and calibration skills, the Cambridge, MA rationalist community has been running a prediction game (and keeping score) at a succession of rationalist group houses since 2013.
Below are the rules:
The game consists of a series of prediction markets. Each market consists of a question that will (within a reasonable timeframe) have a well-defined binary or multiple-choice answer, an initial probability estimate (called the "house bet"), and a sequence of players' probability estimates. Each prediction is scored on how much more or less accurate it is than the preceding prediction (we do it this way because the previous player's prediction is evidence, and one of the skills this game is meant to develop is updating properly based on what other people think).
Any player can create a market. To create a market, a player writes the question to be predicted, along with a house bet, on a whiteboard or on a giant sticky note on the wall. The house bet should be something generally reasonable, but does not need to be super well-informed (this is abusable in theory but has not been abused in practice).
To make a prediction, a player writes their name and their probability estimate under the most recent prediction (or the house bet if there are no predictions so far). The restrictions on predictions are:
When a market is settled (i.e. the correct answer becomes known), each prediction is given points equal to:
100 * log2(probability given to the correct answer / previous probability given to the correct answer)
In a binary market where the correct answer is no, each prediction's implied probability of "no" is used (e.g. if a player predicted 0.25, that is treated as p(no)=0.75).
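As an illustration, the scoring rule for a binary market can be sketched in Python. The function names (`prediction_points`, `score_market`) and the player names are hypothetical, not part of the game's actual spreadsheet; the sketch also assumes every probability is strictly between 0 and 1, since a prediction of exactly 0 or 1 would make the logarithm undefined or infinite.

```python
import math


def prediction_points(prob, prev_prob, outcome_yes=True):
    """Points for one prediction, given the prediction that preceded it.

    prob and prev_prob are probabilities of "yes", strictly between 0 and 1.
    If the market settles "no", the implied probabilities of "no" are used.
    """
    if not outcome_yes:
        prob, prev_prob = 1.0 - prob, 1.0 - prev_prob
    return 100.0 * math.log2(prob / prev_prob)


def score_market(house_bet, predictions, outcome_yes):
    """Score a whole binary market.

    predictions is a list of (player_name, probability_of_yes) pairs in the
    order they were written down. The first prediction is scored against
    the house bet, and each later one against the prediction before it.
    """
    points = {}
    prev = house_bet
    for name, prob in predictions:
        earned = prediction_points(prob, prev, outcome_yes)
        points[name] = points.get(name, 0.0) + earned
        prev = prob
    return points


# Hypothetical market: house bet 0.5, then two players, settled "no".
result = score_market(0.5, [("alice", 0.75), ("bob", 0.25)], outcome_yes=False)
for player, pts in result.items():
    print(f"{player}: {pts:+.1f}")
```

In this made-up market, the first player moved the estimate away from the correct answer and loses points, while the second player moved it back past the house bet and gains more than the first player lost. The points in a market always sum to 100 * log2(final probability of the correct answer / house-bet probability of the correct answer), because the intermediate ratios cancel.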
This is a strictly proper scoring rule, meaning that the optimal strategy (strategy with the highest expected points) is to bet one's true beliefs about the question.
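A quick numerical check can make this concrete. The sketch below considers a single prediction in a binary market in isolation, with a hypothetical true belief of 0.7 and a hypothetical previous prediction of 0.5, and searches a grid of possible reports for the one with the highest expected points. The function name `expected_points` is made up for this example.

```python
import math


def expected_points(report, belief, prev):
    """Expected points from reporting `report` when the player's true
    probability of "yes" is `belief` and the previous prediction was `prev`."""
    if_yes = 100.0 * math.log2(report / prev)
    if_no = 100.0 * math.log2((1.0 - report) / (1.0 - prev))
    return belief * if_yes + (1.0 - belief) * if_no


belief = 0.7  # hypothetical true belief
prev = 0.5    # hypothetical previous prediction

# Try every report from 0.01 to 0.99 and keep the best one.
candidates = [i / 100 for i in range(1, 100)]
best = max(candidates, key=lambda r: expected_points(r, belief, prev))
print(best)
```

The previous prediction only shifts the expected points by a constant, so it does not change which report is best; the maximum falls at the player's own belief.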
The points from each market are tracked in a spreadsheet, along with the date each market settled. The points from each market decay by a factor of e every 180 days.
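One way to read the decay rule is as continuous exponential decay, where points from a market that settled d days ago are multiplied by exp(-d / 180). That reading is an assumption; the spreadsheet may apply the decay differently (for example, in steps). The function name `decayed_points` and the dates are hypothetical.

```python
import math
from datetime import date, timedelta


def decayed_points(points, settled_on, today):
    """Current value of points from a market that settled on `settled_on`,
    assuming continuous decay by a factor of e every 180 days."""
    days = (today - settled_on).days
    return points * math.exp(-days / 180.0)


# Hypothetical example: 100 points, checked 180 days after settlement.
settled = date(2020, 1, 1)
later = settled + timedelta(days=180)
value = decayed_points(100.0, settled, later)
print(round(value, 1))
```

Under this reading, points lose half their value roughly every 125 days (180 * ln 2).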
The score of each player with a positive score is written on one of our whiteboards and is updated semi-regularly.
Example binary outcome market:
Example multiple-choice market:
Can you elaborate on whether there have been noticeable results in either A) taking successful actions based on the most recent predictions or B) improving the forecasting skills of the players? And if so, how were these things measured? How would you prefer to measure them?
We don't have well-defined stats on how well people's prediction skills have improved over time. From my anecdotal observations, pretty much everyone (myself included) starts out vastly overconfident, and then after losing a lot of points in their first few predictions, reaches an appropriate level of confidence. I'm not sure if anyone goes from ok to great though.
There have been few times that we've been able to take actions based on the predictions, because that requires the following combination of factors, which tend not to occur together:
The examples where the predictions led to decisions are: