Forecasting extreme outcomes

Zac Hatfield-Dodds

This is statistically neat, but I'd recommend Taleb's *Statistical Consequences of Fat Tails: Real World Preasymptotics, Epistemology, and Applications* - the most extreme cases are in practice always those for which you assumed the wrong distribution! For example, there are many cases where a system spends *most* of its time in a regime characterized by a normal distribution, and then *rarely* a different mechanism in the underlying dynamics shifts the whole thing wildly out of that - famously including the blowup of Long-Term Capital Management after just four years.

This document explores and develops methods for forecasting extreme outcomes, such as the maximum of a sample of *n* independent and identically distributed random variables. I was inspired to write this by Jaime Sevilla’s recent post with research ideas in forecasting and, in particular, his suggestion to write an accessible introduction to the Fisher–Tippett–Gnedenko Theorem.

I’m very grateful to Jaime Sevilla for proposing this idea and for providing great feedback on a draft of this document.

## Summary

The Fisher–Tippett–Gnedenko Theorem is similar to a central limit theorem, but for the maximum of random variables. Whereas central limit theorems tell us about what happens on average, the Fisher–Tippett–Gnedenko Theorem tells us what happens in extreme cases. This makes it especially useful in risk management, when we need to pay particular attention to worst case outcomes. It could be a useful tool for forecasting tail events.

This document introduces the theorem, describes the limiting probability distribution and provides a couple of examples to illustrate the use (and misuse!) of the Fisher–Tippett–Gnedenko Theorem for forecasting. In the process, I introduce a tool that computes the distribution of the maximum of *n* iid random variables that follow a normal distribution centrally but with an (optional) right Pareto tail.

Summary:

- If the maximum of *n* iid random variables (which is itself a random variable) converges as *n* grows to infinity, then it must converge to a generalised extreme value (GEV) distribution
- [...] *n* such random variables for large *n*, but this can give very bad results when our assumptions / judgements are wrong
- To forecast the maximum of *n* random variables based on the distribution of the underlying random variables, we need accurate judgements about the right tail of the underlying random variables because the maximum will very likely be drawn from the tail, especially as *n* gets large
- This simple tool might be good enough for forecasting purposes in many cases
  - It assumes a normal distribution up to *k* standard deviations above the mean and that there is a Pareto tail beyond this point
  - *n* (the number of samples of the underlying random variables)
  - *k* (the number of SDs above the mean at which the Pareto tail starts); set this high if you don’t want a Pareto tail
  - [...] *n* samples of the underlying random variables

I expect the time-poor reader to get most of the value from this document by reading the informal statement of the Fisher–Tippett–Gnedenko Theorem, the overview of the generalised extreme value distribution, and the shortest and tallest people in the world example, and then maybe making a copy and playing around with the tool for forecasting the maximum of *n* random variables that follow normal distributions with Pareto tails (consulting this as needed).
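To make the idea behind such a tool concrete, here is a minimal Python sketch of the underlying calculation. It is not the tool itself, and several choices are my own assumptions for illustration: the function names, the Pareto tail index `alpha` (the summary does not say how the tail exponent is set), and the requirement that the point where the tail starts is positive. It relies on the standard fact that the maximum of *n* iid variables with CDF F has CDF F(x)^n.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the body of the distribution


def underlying_cdf(x, mu, sigma, k, alpha):
    """CDF of one variable: normal up to mu + k*sigma, Pareto tail beyond.

    alpha is a hypothetical tail-index parameter (smaller = heavier tail).
    """
    threshold = mu + k * sigma
    if threshold <= 0:
        raise ValueError("this sketch needs a positive tail threshold")
    if x <= threshold:
        return STD_NORMAL.cdf((x - mu) / sigma)
    # Probability mass left above the threshold, spread out as a Pareto tail.
    tail_mass = 1.0 - STD_NORMAL.cdf(k)
    return 1.0 - tail_mass * (threshold / x) ** alpha


def max_cdf(x, n, mu, sigma, k, alpha):
    """CDF of the maximum of n iid draws: P(max <= x) = F(x) ** n."""
    return underlying_cdf(x, mu, sigma, k, alpha) ** n


def max_quantile(p, n, mu, sigma, k, alpha):
    """Quantile of the maximum: solve F(x) ** n = p for x, with 0 < p < 1."""
    threshold = mu + k * sigma
    if threshold <= 0:
        raise ValueError("this sketch needs a positive tail threshold")
    q = p ** (1.0 / n)  # required quantile level of a single draw
    body_mass = STD_NORMAL.cdf(k)
    if q <= body_mass:
        # Still inside the normal body.
        return mu + sigma * STD_NORMAL.inv_cdf(q)
    # Inside the Pareto tail: invert 1 - tail_mass * (threshold / x) ** alpha = q.
    return threshold * ((1.0 - body_mass) / (1.0 - q)) ** (1.0 / alpha)


if __name__ == "__main__":
    # Illustrative inputs only: mean 100, SD 15, tail starting 2 SDs above the mean.
    for n in (10, 1_000, 100_000):
        with_tail = max_quantile(0.5, n, mu=100, sigma=15, k=2, alpha=3)
        no_tail = max_quantile(0.5, n, mu=100, sigma=15, k=10, alpha=3)
        print(f"n={n}: median max with tail {with_tail:.1f}, without {no_tail:.1f}")
```

Setting `k` high makes the tail branch effectively unreachable, which mirrors the advice to set *k* high if you don’t want a Pareto tail. Comparing the two columns for growing `n` shows how strongly the forecast of the maximum depends on the assumed right tail.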