Most times I go to update a forecast, nothing has happened. There are several ways to update on this, which I explore below.
This post was written after Ozzie Gooen suggested I play around with Squiggle, following up on his earlier Multivariate estimation & the Squiggly language. Squiggle is an experimental domain-specific language for working with probabilities, and it is currently at a very early stage: syntax and functionality are still being figured out. It can probably do anything Guesstimate can do, while being more easily copied and pasted, at the cost of not having a graphical interface. It currently uses vim as a text editor, so the user has to press i to switch to insert mode and edit the code.
1. Naïve decay
Perhaps the most straightforward way of updating on the passage of time is as follows: if one assigns a probability $p_y$ to an event happening within a year, then the monthly probability $p_m$ is such that $1-(1-p_m)^{12} = p_y$, or $p_m = 1-(1-p_y)^{1/12}$. Now, if a month has passed, then the remaining probability is $1-(1-p_m)^{11}$.
In Squiggle, this would look as follows:
    ## Press i to edit.
    ## Initial setup
    yearly_probability=0.9
    period_probability_function(epsilon)=1-(1-yearly_probability)^(1/epsilon)
    probability_decayed(t, total_time, period_probability) = 1-(1-period_probability)^(total_time-t)

    ## Monthly decomposition
    total_time_monthly=12 # months in a year
    monthly_probability=period_probability_function(total_time_monthly)
    probability_decayed_monthly(t)=probability_decayed(t, total_time_monthly, monthly_probability)
    probability_decayed_monthly
    ### probability_decayed_monthly(6)
And this would look like so:
The type signature of this function is $\text{Probability} \to (\text{Time} \to \text{Probability})$. That is, I take a probability (0.9), and output a function that goes from time passed, t, to the probability remaining after time t. If I wanted to know what the probability is after 6 months, I'd uncomment the last line, probability_decayed_monthly(6), in this case equal to 0.684.
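As a cross-check on the 0.684 figure, here is a minimal Python sketch of the same two functions. It is a translation for illustration rather than Squiggle code, and the Python function names are my own.

```python
def period_probability(yearly_probability: float, periods_per_year: int) -> float:
    """Per-period probability that compounds to the given yearly probability."""
    return 1 - (1 - yearly_probability) ** (1 / periods_per_year)


def probability_decayed(t: float, total_periods: int, per_period: float) -> float:
    """Probability that the event still happens in the periods left after t."""
    return 1 - (1 - per_period) ** (total_periods - t)


monthly = period_probability(0.9, 12)
after_six_months = probability_decayed(6, 12, monthly)
print(round(after_six_months, 3))  # 0.684
```

At t = 0 the function returns the original 0.9, and at t = 12 it returns 0, since no time is left for the event to happen.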
2. Unprincipled decay
A strategy which works well as a forecasting technique in practice, but which is totally unprincipled, is to consider probabilities over probabilities. I can ask myself what the lowest and highest odds I'd be willing to take would be, and they might be something like:
- 1:9, where I receive 9 if the event doesn't happen
- 1:4, where I pay 4 if the event doesn't happen.
These bets correspond to a 90% and an 80% probability, respectively. So I might consider the decay as before, except that this time I add some uncertainty.
To make the shape of the decay more salient, I'll use 1:20 vs 1:2 (roughly 95% vs 66%), rather than 1:9 vs 1:4 (90% vs 80%).
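The conversion from odds to probabilities used above can be sketched in a few lines of Python. The helper name is hypothetical, and it assumes the convention implied by the text, namely that odds of 1:9 mean a 90% probability.

```python
def odds_to_probability(a: float, b: float) -> float:
    """Probability implied by odds a:b, assuming 1:9 stands for 90%."""
    return b / (a + b)


print(odds_to_probability(1, 9))             # 0.9
print(odds_to_probability(1, 4))             # 0.8
print(round(odds_to_probability(1, 20), 3))  # 0.952
print(round(odds_to_probability(1, 2), 3))   # 0.667
```

Under this convention 1:20 and 1:2 come out slightly above the rounded 95% and 66% used in the code.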
    ## Initial setup
    yearly_probability_max=0.95
    yearly_probability_min=0.66
    period_probability_function(epsilon, yearly_probability)=1-(1-yearly_probability)^(1/epsilon)
    probability_decayed(t, time_periods, period_probability) = 1-(1-period_probability)^(time_periods-t)

    ## Monthly decomposition
    months_in_a_year=12
    monthly_probability_min=period_probability_function(months_in_a_year, yearly_probability_min)
    monthly_probability_max=period_probability_function(months_in_a_year, yearly_probability_max)
    probability_decayed_monthly_min(t)=probability_decayed(t, months_in_a_year, monthly_probability_min)
    probability_decayed_monthly_max(t)=probability_decayed(t, months_in_a_year, monthly_probability_max)
    probability_decayed_monthly(t)=probability_decayed_monthly_min(t) to probability_decayed_monthly_max(t)
    probability_decayed_monthly
    ## probability_decayed_monthly(6)
    ## mean(probability_decayed_monthly(6))
probability_decayed_monthly_min and probability_decayed_monthly_max are similar to the probability_decayed_monthly of the previous section.
And this would look as follows:
probability_decayed_monthly(6) is now a distribution:
The type signature of this function is: $(\text{Probability}, \text{Probability}) \to (\text{Time} \to (\text{Probability} \to \text{Density}))$. That is, I take a pair of probabilities (0.66, 0.95), and output a function that goes from time passed, t, to a probability density function. If I wanted to know what the probability of the event happening was after six months had passed, I'd look at the mean of the last picture, or uncomment the last line, mean(probability_decayed_monthly(6)), changing the type signature to $(\text{Probability}, \text{Probability}) \to (\text{Time} \to \text{Probability})$.
In practice, Squiggle doesn't represent a probability density as a function $\text{Probability} \to \text{Density}$, but as its own type for probability distributions, which allows for some optimizations. But writing the type signature as above makes the point that Squiggle, though a domain-specific language, can deal with convoluted inputs.
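To get a feel for the numbers behind mean(probability_decayed_monthly(6)), here is a rough Python sketch. It rests on an assumption worth flagging: that `a to b` denotes a lognormal distribution whose 5th and 95th percentiles are a and b, as later Squiggle versions document it. The version used in this post may have behaved differently, and the helper names are my own.

```python
from math import exp, log
from statistics import NormalDist


def probability_decayed(t, yearly_probability, periods=12):
    """Remaining probability after t of `periods` periods passed uneventfully."""
    per_period = 1 - (1 - yearly_probability) ** (1 / periods)
    return 1 - (1 - per_period) ** (periods - t)


def lognormal_mean_from_interval(low, high, coverage=0.90):
    """Mean of the lognormal whose central `coverage` interval is [low, high].

    ASSUMPTION: this mirrors what `low to high` means in later Squiggle versions.
    """
    z = NormalDist().inv_cdf(0.5 + coverage / 2)  # about 1.645 for 90%
    mu = (log(low) + log(high)) / 2
    sigma = (log(high) - log(low)) / (2 * z)
    return exp(mu + sigma**2 / 2)


low = probability_decayed(6, 0.66)
high = probability_decayed(6, 0.95)
mean_after_six_months = lognormal_mean_from_interval(low, high)
print(round(low, 3), round(high, 3), round(mean_after_six_months, 3))
```

Under that assumption the bounds after six months are roughly 0.42 and 0.78, and the mean sits at about 0.58, a little below their midpoint. A lognormal also puts a small amount of mass above 1, which is one more sense in which this approach is unprincipled.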
3. Decay according to Laplace's rule
So far, we've been decaying our probabilities without giving thought to the information we gain about the process which produces our events. For example, if the monthly probability of an event happening used to be 20%, but it hasn't happened in the last several years, maybe the probability isn't 20% anymore.
In particular, consider the following question on Good Judgment Open: Will Venezuelan opposition leader Juan Guaidó be detained or arrested by Venezuelan authorities before 1 January 2021? Guaidó was last arrested on the 13th of January 2019, very shortly after he became a major public figure, and the GJO question opened a year afterwards. Laplace's rule of succession estimates the probability of an event in the next period as $(s+1)/(n+2)$, where $s$ is the number of occurrences observed over $n$ past periods. One consideration when using Laplace's rule is whether to begin counting just before or just after the last incident; in this case I chose to begin just afterwards, because this produces a more reasonable initial probability.
    ## Setup
    number_of_arrests_since_last_time=0
    time_passed_since_arrest_before_question_opened=12 ## months
    remaining_time_until_question_resolution=12
    ##
    ## Updating according to Laplace's rule
    monthly_probability_Laplace(t) = (number_of_arrests_since_last_time+1)/(time_passed_since_arrest_before_question_opened+t+2)
    probability_decayed_monthly_Laplace(t)=1 - (1-monthly_probability_Laplace(t))^(remaining_time_until_question_resolution-t)
    probability_decayed_monthly_Laplace
Note that Laplace's rule comes from a very weak prior, so one might in general want to update even faster. In any case, decay according to Laplace's rule would look like:
The type signature is $\text{Time} \to (\text{Time} \to \text{Probability})$, where we take the amount of time already passed, and we produce a function which tells us the remaining probability after even more time has passed.
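The same calculation can be sketched in Python to read off a few points of the curve. This is an illustrative translation of the Squiggle code above; the function names are mine, and the defaults are the values used in the Guaidó example.

```python
def monthly_probability_laplace(t, arrests=0, months_before_open=12):
    """Laplace's rule, (successes + 1) / (trials + 2), after t more quiet months."""
    return (arrests + 1) / (months_before_open + t + 2)


def probability_decayed_laplace(t, months_remaining=12):
    """Probability of at least one arrest in the months left after t months."""
    per_month = monthly_probability_laplace(t)
    return 1 - (1 - per_month) ** (months_remaining - t)


for t in (0, 6, 12):
    print(t, round(probability_decayed_laplace(t), 3))
```

With these inputs the probability starts at about 59% when the question opens, falls to about 26% after six quiet months, and reaches 0 at resolution. The drop is faster than under naive decay because the monthly rate itself shrinks as uneventful months accumulate.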
Compare to the actual decay on GJO, which is both slower and discontinuous.
Updating because time has passed and nothing has happened is a time-intensive but perhaps easily automatable part of forecasting. One level of automation might be to have a button to automatically apply any of several types of decay. Another level of automation might be to decay probabilities automatically according to a predetermined schedule unless some news pops up in Google News, in which case forecasters are alerted. I'd also imagine that this sort of thing could be more integrated with a platform like Metaculus, such that the user doesn't have to input the current date, or the days passed since the last update, because these are automatically pulled by the platform.
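As a rough idea of what the simplest version of such automation could look like, here is a Python sketch that applies naive decay from dates alone. The function name and the example dates are hypothetical, and no real platform API is involved; a platform would supply the dates itself.

```python
from datetime import date


def decayed_probability(probability, forecast_date, resolution_date, today):
    """Naive decay: keep the per-day hazard constant and shrink the time window."""
    total_days = (resolution_date - forecast_date).days
    remaining_days = max((resolution_date - today).days, 0)
    return 1 - (1 - probability) ** (remaining_days / total_days)


# Hypothetical forecast: 90% made on 1 January 2020 for a question
# resolving on 1 January 2021, revisited on 1 July 2020.
updated = decayed_probability(
    probability=0.9,
    forecast_date=date(2020, 1, 1),
    resolution_date=date(2021, 1, 1),
    today=date(2020, 7, 1),
)
print(round(updated, 3))
```

In actual use, `today` would be `date.today()` and the other two dates would come from the question's metadata, so the forecaster would only have to confirm or override the suggested number.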