Forecasting rare events

5Alejandro1

2RomeoStevens

3polymathwannabe

0VipulNaik

New Comment

Another type of rare event, not as important as the ones you discuss, but with a large community devoted to forecasting its odds: major upsets in sports, like the recent Germany-Brazil 7-1 blowout in the World Cup. Here is a 538 article discussing it and comparing it to other comparable upsets in other sports.

Potentially valuable since you have a large amount of professionals not only attempting to forecast odds but actually betting beliefs.

Off the top of my head, other rare events worth anticipating:

Assassination of a head of state and/or coup d'état

War between and/or within highly developed countries

A new pandemic

Unavoidable meteorite

Extraterrestrial invasion

Thanks! I added pandemics (though not in the depth I should have). I'll look at some of the others.

In an earlier post, I looked at some general domains of forecasting. This post looks at some more specific classes of forecasting, some of which overlap with the general domains, and some of which are more isolated. The common thread to these classes of forecasting is that they involve

rare events.Different types of forecasting for rare events

When it comes to rare events, there are three different classes of forecasts:

Point-in-time-independent probabilistic forecasts: Forecasts that provide a probability estimate for the event occurring in a given timeframe, but with no distinction based on the point in time. In other words, the forecast may say "there is a 5% chance of an earthquake higher than 7 on the Richter scale in this geographical region in a year" but the forecast is not sensitive to the choice of year. These are sufficient to inform decisions on general preparedness. In the case of earthquakes, for instance, the amount of care to be taken in building structures can be determined based on these forecasts. On the other hand, it's useless for deciding the timing of specific activities.Point-in-time-dependent probabilistic forecasts: Forecasts that provide a probability estimate that varies somewhat over time based on history, but aren't precise enough for a remedial measure that substantially offsets major losses. For instance, if I know that an earthquake will occur in San Francisco in the next 6 months with probability 90%, it's still not actionable enough for a mass evacuation of San Francisco. But some preparatory measures may be undertaken.Predictions made with high confidence (i.e., a high estimated probability when the event is predicted) and a specific time, location, and characteristics: Precise predictions of date and time, sufficient for remedial measures that substantially offset major losses (but possibly at huge, if much smaller, cost). The situation with hurricanes, tornadoes, and blizzards is roughly in this category.Statistical distributions: normal distributions versus power law distributions

Perhaps the most ubiquitous distribution used in probability and statistics is the normal distribution. The normal distribution is a symmetric distribution whose probability density function decays

superexponentiallywith distance from the mean (more precisely, it is exponential decay in the square of the distance). In other words, the probability decays slowly at the beginning, and faster later. Thus, for instance, the ratio of pdfs for 2 standard deviations from the mean and 1 standard deviation from the mean is greater than the ratio of pdfs for 3 standard deviations from the mean and 2 standard deviations from the mean. To give explicit numbers: about 68.2% of the distribution lies between -1 and +1 SD, 95.4% lies between -2 and +2 SD, 99.7% lies between -3 and +3 SD, and 99.99% lies between -4 and +4 SD. So the probability of being more than 4 standard deviations is less than 1 in 10000.If the probability distribution for intensity looks (roughly) like a normal distribution, then high-intensity events are extremely unlikely. So, if the probability distribution for intensity is normal, we do not have to worry about high-intensity events much.

The types of situations where rare event forecasting becomes more important is where events that are high-intensity, or "extreme" in some sense, occur rarely but not as rarely as in a normal distribution. We say that the tails of such distributions are

thickerthan those of the normal distribution, and the distributions are termed "thick-tailed" or "fat-tailed" distributions. [Formally, the thickness of tails is measured using a quantity called excess kurtosis, which sees how the fourth central moment compares with the square of the second central moment (the second central moment is the variance, and it is the square of the standard deviation), then subtracts off the number 3, which is the corresponding value for the normal distribution. If the excess kurtosis for a distribution is positive, it is a thick-tailed distribution.]The most common example of such distributions that is of interest to us is power law distributions. Here, the probability is proportional to a negative power. So the decay is like a power. If you remember some basic precalculus/calculus, you'll recall that power functions (such as the square function or cube function) grow more slowly than exponential functions. So power law distributions decay more subexponentially: they decay more slowly than exponential decay (to be more precise, the decay

starts off as fast, then slows down). As noted above, the pdf for the normal distribution decays exponentially in thesquareof the distance from the mean, so the upshot is that power law distributions decay more slowly than normal distributions.For most of the rare event classes we discuss, to the extent that it has been possible to pin down a distribution, it has looked a lot more like a power law distribution than a normal distribution. Thus, rare events need to be heeded. (There's obviously a selection effect here: for those cases where the distributions are close to normal, forecasting rare events just isn't that challenging, so they wouldn't be included in my post).

UPDATE: Aaron Clauset, who appears in #4, pointed me (via email) to his Rare Events page, containing the code (Matlab and Python) that he used in his terrorism statistics paper mentioned as an update at the bottom of #4. He noted in the email that the statistical methods are fairly general, so interested people could use the code if they were interested in cross-applying to rare events in other domains.TalebismsOne of the more famous advocates of the idea that people overestimate the ubiquity of normal distributions and underestimate the prevalence of power law distributions is Nassim Nicholas Taleb. Taleb calls the world of normal distributions Mediocristan (the world of mediocrity, where things are mostly ordinary and weird things are very very rare) and the world of power law distributions Extremistan (the world of extremes, where rare and weird events are more common). Taleb has elaborated on this thesis in his book

The Black Swan, though some parts of the idea are also found in his earlier bookFooled by Randomness.I'm aware that a lot of people swear by Taleb, but I personally don't find his writing very impressive. He does cover a lot of important ideas but they didn't originate with him, and he goes off on a lot of tangents. In contrast, I found Nate Silver's

The Signal and the Noisea pretty good read, and although it wasn't focused on rare eventsper se, the parts of it that did discuss such forecasting were used by me in this post.(Sidenote: My criticism of Taleb is broadly similar to that offered by Jamie Whyte here in Standpoint Magazine. Also, here's a review by Steve Sailer of Taleb. Sailer is much more favorably inclined to the normal distribution than Taleb is, and this is probably related to his desire to promote IQ distributions/

The Bell Curvetype ideas, but I think many of Sailer's criticisms are spot on).Examples of rare event classes that we discuss in this postThe classes discussed in this post include:

at leastCategory #1, some argue it is Category #2 or Category #3. Hypothesized to follow a power law distribution.Other examples of rare events would also be appreciated.

#1: EarthquakesEarthquake prediction remains mostly in category 1: there are probability estimates of the occurrence of earthquakes of a given severity or higher within a given timeframe, but these estimates do not distinguish between different points in time. In

The Signal and the Noise, statistician and forecasting expert Nate Silver talks to Susan Hough (Wikipedia) of the United States Geological Survey and describes what she has to say about the current state of earthquake forecasting:The whole Silver chapter is worth reading, as is the Wikipedia page on earthquake prediction, which covers much of the same ground.

In fact, even for the time-independent earthquake forecasting, currently the best known forecasting method is the extremely simple Gutenberg-Richter law, which says that for a given location, the frequency of earthquakes obeys a power law with respect to intensity. Since the Richter scale is logarithmic (to base 10), this means that adding a point on the Richter scale makes the frequency of earthquakes decrease to a fraction of the previous value. Note that the Gutenberg-Richter law can't be the full story: there are probably absolute limits on the intensity of the earthquake (some people believe that an earthquake of intensity 10 or higher is impossible). But so far, it seems to have the best track record.

Why haven't we been able to come up with better models? This relates to the problem of overfitting common in machine learning and statistics: when the number of data points is very small, and quite noisy, then trying a more complicated law (with more freely varying parameters)

ends up fitting the noise in the data rather than the signal, and therefore ends up being a poor fit for new, out-of-sample data. The problem is dealt with in statistics using various goodness of fit tests and measures such as the Akaike information criterion, and it's dealt with in machine learning using a range of techniques such as cross-validation, regularization, and early stopping. These approaches can generally work well in situations where there is lots of dataandlots of parameters. But in cases where there is very little data, it often makes sense to just manually select a simple model. The Gutenberg-Richter law has two parameters, and can be fit using a simple linear regression. There isn't enough information to reliably fit even modestly more complicated models, such as the characteristic earthquake models, and past attempts based on characteristic earthquakes failed in both directions (a predicted earthquake at Parkfield never materialized, and the probability of the 2011 Japan earthquake was underestimated by the model relative to the Gutenberg-Richter law).Silver's chapter and other sources do describe some possibilities for short-term forecasting based on foreshocks and aftershocks, and seismic disturbances, but note considerable uncertainty.

The existence of time-independent forecasts for earthquakes has probably had major humanitarian benefits. Building codes and standards, in particular, can adapt to the probability of earthquakes. For instance, building standards are greater in the San Francisco Bay Area than in other parts of the United States, partly because of the greater probability of earthquakes. Note also that Gutenberg-Richter does make out-of-sample predictions: it can use the frequency of low-intensity earthquakes to predict the frequency of high-intensity earthquakes, and therefore obtain a time-independent forecast of such an earthquake in a region that may never have experienced it.

#2: Volcanic eruptionsVolcanoes are an easier case than earthquakes. Silver's book doesn't discuss them, but the Wikipedia article offers basic information. A few points:

coolingdirection), and might even shift the intercept of other long-term secular and cyclic trends in climate (the reason is that the dust particles released by volcanoes into the atmosphere reduce the extent to which solar radiation is absorbed). For instance, the 1991 Mount Pinatubo eruption is credited with causing the next 1-2 years to be cooler than they otherwise would be, masking the heating effect of a strong El Nino.#3: Extreme weather events (lightning, hurricanes/cyclones, blizzards, tornadoes)

Forecasting for lightning and thunderstorms has improved quite a bit over the last century, and falls squarely within Category #3. In

The Signal and the Noise, Nate Silver notes that the probability of an American dying from lightning has dropped from 1 in 400,000 in 1940 to 1 in 11,000,000 today, and a large part of the credit goes to better weather forecasting causing people to avoid the outdoors at the times and places that lightning might strike.Forecasting for hurricanes and cyclones (which are the same weather phenomenon, just at different latitudes) is quite good, and getting better. It falls squarely in category #3: in addition to having general probability estimates of the likelihood of particular types of extreme weather events, we can forecast them a day or a few days in advance, allowing for preparation and minimization of negative impact.

The precision for forecasting the eye of the storm has increased about 3.5-fold in length terms (so about 12-fold in area terms) over the last 25 years. Nate Silver notes that 25 years ago, the National Hurricane Center's forecasts for where a hurricane would hit on landfall, made three days in advance, were 350 miles off on average. Now they're about 100 miles off on average. Most of the major hurricanes that hit the United States, and many other parts of the world, were forecast well in advance, and people even made preparations (for instance, by declaring holidays, or stocking up on goods). Blizzard forecasting is also fairly impressive: I was at Chicago in 2011 when a blizzard hit, and it had been forecast at least a day in advance. With tornadoes, tornado warning alerts are often issued, albeit the tornado often doesn't actually touch down even after the alert is issued (fortunately for us).

See also my posts on weather forecasting and climate forecasting.

#4: Major terrorist actsTerrorist attacks are interesting. It has been claimed that the frequency-damage relationship for terrorist attacks follows a power law. The academic paper that popularized this observation is a paper by Aaron Clauset, Maxwell Young and Kristian Gleditsch titled "On the Frequency of Severe Terrorist Attacks" (Journal of Conflict Resolution 51(1), 58 - 88 (2007)), here. Bruce Schneier wrote a blog post about a later paper by Clauset and Frederick W. Wiegel, and see also more discussion here, here, here, and here (I didn't select these links through a very discerning process; I just picked the top results of a Google Search).

Silver's book does allude to power laws for terrorism, but I couldn't find any reference to Clauset in his book (oops, seems like my Kindle search was buggy!) and says the following about Clauset:

So terrorist attacks are

at leastin category 1. What about categories 2 and 3? Can we forecast terrorist attacks the way we can forecast volcanoes, or the way we can forecast hurricanes. One difference between terrorist acts and the "acts of God" discussed so far is that to the extent one has inside information about a terrorist attack that's good enough to predict it with high accuracy, it's usually also sufficient to actuallypreventthe terrorist attack. So Category 3 becomes trickier to define. Should we count the numerous foiled terrorist plots as evidence that terrorist acts can be successfully "predicted" or should we only consider successful terrorist acts in the denominator? And another complication is that terrorist acts are responsive to geopolitical decisions in ways that earthquakes are definitely not, with extreme weather events falling somewhere in between.As for Category 2, the evidence is unclear, but it's highly likely that terrorist acts can be forecast in a time-dependent fashion to quite a degree. If you want to crunch the numbers yourself, the Global Terrorism Database (website, Wikipedia) and Suicide Attack Database (website, Wikipedia) are available for you to use. I discussed some general issues with political and conflict forecasting in my earlier post on the subject.

UPDATE: Clauset emailed me with some corrections to this section of the post, which I have made. He also pointed to a recent paper he co-wrote with Ryan Woodward about estimating the historical and future probabilities of terror events, available on the ArXiV. Here's the abstract:#5: Power outagesPower outages could have many causes. Note that insofar as we can forecast the phenomena underlying the causes, this can be used to reduce, rather than simply forecast, power outages.

My impression is that when it comes to power outages, we are at Category 2 in forecasting. Load forecasting can identify seasons, times of the day, and special occasions when power demand will be high. Note that the infrastructure needs to built for peak capacity.

We can't quite be in Category 3, because in cases where we can forecast more finely, we could probably prevent the outage anyway.

What sort of preventive measures do people undertake with knowledge of the frequency of power outages? In places where power outages are more likely, people are more likely to have backup generators. People may be more likely to use battery-powered devices. If you know that a power outage is likely to happen in the next few days, you might take more care to charge the batteries on your devices.

#6: Server outagesIn our increasingly connected world, websites going down can have a huge effect on the functioning of the Internet and of the world economy. As with power infrastructure, the complexity of server infrastructure needed to increase uptime increases very quickly. The point is that routing around failures at different points in the infrastructure requires redundancy. For instance, if any one server fails 10% of the time, and the failures of different components are independent, you'd need two servers to get to a 1% failure rate. But in practice, the failures aren't independent. For instance, having loads of servers in a single datacenter covers the risk of any given server there crashing, but it doesn't cover the risk of the datacenter itself getting disconnected (e.g., losing electricity, or getting disconnected from the Internet, or catching fire). So now we need multiple datacenters. But multiple datacenters are far from each other, so that increases the time costs of synchronization. And so on. For more detailed discussions of the issues, see here and here.

My impression is that server outages are largely Category 1: we can use the probability of outages to determine the trade-off between the cost of having redundant infrastructure and the benefit of more uptime. There is an element of Category 2: in some cases, we have knowledge that traffic will be higher at specific times, and additional infrastructure can be brought to bear for those times. As with power infrastructure, server infrastructure needs to be built to handle peak capacity.

#7: Financial crisesThe forecasting of financial crises is a topic worthy of its own post. As with climate science, financial crisis forecasting has the potential for heavy politicization, given the huge stakes both of forecasting financial crises and of any remedial or preventative measures that may be undertaken. In fact, the politicization and ideology problem is probably substantially worse in financial crisis forecasting. At the same time, real-world feedback occurs faster, providing more opportunity for people to update their beliefs and less scope for people getting away with sloppiness because their predictions take too long to evaluate.

A literally taken strong efficient market hypothesis (EMH) (Wikipedia) would suggest that financial crises are almost impossible to forecast, while a weaker reading of the EMH would suggest that the financial

marketis efficient (Wikipedia): it's hard tomake moneyoff the business of forecasting financial crises (for instance, you may know that a financial crisis is imminent with high probability, but the element of uncertainty, particularly with regards to timing, can destroy your ability to leverage that information to make money). On the other hand, there are a lot of people, often subscribed to competing schools of economic thought, who successfully forecast the 2007-08 financial crisis, at least in broad strokes.Note that there are people who reject the EMH, yet claim that financial crises are very hard to forecast in a time-dependent fashion. Among them is Nassim Nicholas Taleb, as described here. Interestingly, Taleb's claim to fame appears to have been that he was able to forecast the 2007-08 financial crisis, albeit it was more of a time-independent forecast than a specific timed call. The irony was noted by by Jamie Whyte here in Standpoint Magazine.

I found a few sources of information for financial crises, that are discussed below.

Economic Predictions records predictions made by many prominent people and how they compared to what transpired. In particular, this page on their website notes how many of the top investors, economists, and bureaucrats missed the financial crisis, but also identifies some exceptions: Dean Baker, Med Jones, Peter Schiff, and Nouriel Roubini. The page also discusses other candidates who claim to have forecasted the crisis in advance, and reasons why they were not included. While I think they've put in a fair deal of effort into their project, I didn't see good evidence that they have a strong grasp of the underlying fundamental issues they are discussing.

An insightful

generaloverview of the financial crisis is found in Chapter 1 of Nate Silver'sThe Signal and the Noise, a book that I recommend you read in its entirety. Silver notes four levels of forecasting failure.Silver finds a common thread among all the failures (emphases in original):

While I find Silver's analysis plausible and generally convincing, I don't think I have enough of an inside-view understanding of the issue.

A few other resources that I found, but didn't get a chance to investigate, are listed below:

Financial Times).#8: PandemicsI haven't investigated this thoroughly, but here are a few of my impressions and findings:

preventglobal pandemics.The Signal and the Noise, titled "Role Models", discusses forecasting and prediction in the domain of epidemiology. The goal of epidemiologists is to obtain predictive models that have a level of accuracy and precision similar to those used for the weather. However, the greater complexity of human behavior, as well as the self-fulfilling and self-canceling nature of various predictions, makes the modeling problem harder. Silver notes that agent-based modeling (Wikipedia) is one of the commonly used tools. Silver cites a few examples from recent history where people were overly alarmed about possible pandemics, when the reality turned out to be considerably milder. However, the precautions taken due to the alarm may still have saved lives. Silver talks in particular of the 1976 swine flu outbreak (where the reaction turned out be grossly disproportional to the problem, and caused its own unintended consequences) and the 2009 flu pandemic.Financial Timeshere. I think Silver doesn't discuss this (which is a surprise, since it would have fit well with the theme of his chapter).#9: Near-earth object impactsI haven't looked into this category in sufficient detail. I'll list below the articles I read.

Earth and Planetary Science Letters, Volume 222, Issue 1, 15 May 2004, Pages 1–15.Natureafter the Chelyabinsk meteor.