You are viewing a version of this post published on the 1st Mar 2023. This link will always display the most recent version of the post.

~ A Parable of Forecasting Under Model Uncertainty ~

You, the monarch, need to know when the rainy season will begin, in order to properly time the planting of the crops. You have two advisors, Pronto and Eternidad, who you trust exactly equally. 

You ask them both: "When will the next heavy rain occur?"

Pronto says, "Three weeks from today."

Eternidad says, "Ten years from today."

"Good," you say. "I will begin planting the crops in a little bit over five years, the average of your two predictions."

Pronto clears his throat. "If I may, Your Grace. If I am right, we should start preparing for the planting immediately. If Eternidad is right, we should expect an extreme drought, and will instead need to use the crown's resources to begin buying up food from our neighbors, for storage. These two predictions reflect totally different underlying world models, and demand two totally different and non-overlapping responses. Beginning the planting in five years is the wrong choice under either model, and guarantees that the nation will starve regardless of which of us is right."

Eternidad adds: "Indeed, Your Grace. From Pronto's point of view, waiting five years to prepare is just as bad as waiting ten years – the rains will be long past, by his model. From my perspective, likewise, we should take action now to prepare for drought. We must allocate resources today, one way or the other. What you face is not so much a problem of prediction but a decision problem with an important component of probability. Absolutely do not view our predictions as two point estimates to be averaged and aggregated – view them instead as two distinct and mutually exclusive futures that must be weighed separately to determine the best allocation of resources. Unfortunately, given the unrectifiable disagreement between Pronto and myself, the best course of action is that we do our best to make reasonable preparations for both possibilities. We should spend some fraction of our treasury on planting grain now, in case the rains arrive soon, and the remainder on purchasing food for long-term storage, in the case of prolonged drought."

You, the monarch, ponder this. You do not want to have to split your resources. Surely there must be some way of avoiding that? Finally you say: "It seems like what I need from you two is a probability distribution of rain likelihood going forward into the future. Then I can sample your distributions and get a more informative median date."

Pronto again clears his throat. "No, Your Grace. Let us take the example of the simplest distribution, and derive what conclusions we may, and thereby show that this approach doesn't actually help the situation. Let us assume, for the sake of argument, that I think the odds of rain on any given day are about 3% and Eternidad thinks that the odds of rain on any given day are about 0.02%. Under this simple model, each of us assumes a uniform chance of rain from one day to the next, and we differ only in the rate. The odds that it will not have rained by some given future day then follow an exponential decay process; the probability that it will have rained by t=3 weeks under my distribution of 3% probability of rain per day is ~50%. The probability that it will not have rained by t=10 years under Eternidad's distribution of 0.02% probability of rain per day is ~50%. Thus we arrive at the same median rain estimates as before, via an assumption of a uniform daily chance of rain."
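Pronto's two "~50%" figures can be checked with a few lines of arithmetic. The following is a minimal sketch that assumes each day is an independent trial with a fixed chance of rain; the function name `prob_rain_by` is hypothetical and chosen only for illustration.

```python
# Sketch: constant, independent daily chance of rain (the simplifying
# assumption from Pronto's example, not a claim about real weather).

def prob_rain_by(day: int, daily_chance: float) -> float:
    """Probability that it has rained at least once within `day` days."""
    # (1 - daily_chance) ** day is the chance of `day` dry days in a row.
    return 1.0 - (1.0 - daily_chance) ** day

# Pronto: 3% per day, evaluated at three weeks.
pronto_by_3_weeks = prob_rain_by(21, 0.03)

# Eternidad: 0.02% per day, evaluated at ten years (3650 days).
eternidad_by_10_years = prob_rain_by(3650, 0.0002)

print(f"Pronto, rain within 3 weeks:     {pronto_by_3_weeks:.2f}")
print(f"Eternidad, rain within 10 years: {eternidad_by_10_years:.2f}")
```

Both values land close to one half, so each advisor's stated date is roughly the median of his own distribution under this assumption.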

Eternidad interjects: "To be sure, Your Grace, neither of us actually believes that there's a uniform distribution of rain on any given day. Pronto is merely making an abstract point about how assumptions of distribution-shape influence subsequent conclusions."

Pronto continues. "Indeed. And observe, Your Grace: At the 5-year mark, the average of our two cumulative probability forecasts under a uniform distribution would be ~65%, not 50%. Which is interesting, is it not, Your Grace? And furthermore, if we take two cumulative probability distributions with 50% cumulative probability at 3 weeks and 10 years, respectively, and average these two curves, we compute a median 50% crossover point of 116 days from now! Not 5 years, as you had guessed before! The shape of the distributions matters tremendously in determining the weighted median of the two models. This is another reason why it would be a mistake to simply average 5 years with 3 weeks and call that your expectation date, without understanding the structure of the models that gave rise to those numbers."
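The ~65% and 116-day figures can be reproduced with a short calculation. This is a sketch under stated assumptions: each advisor's forecast is modeled as a continuous exponential curve calibrated to reach 50% at exactly 3 weeks or 10 years, a year is taken as 365 days, and the helper names (`rate_from_median`, `averaged_cdf`, `median_of_average`) are hypothetical.

```python
import math

DAYS_PER_YEAR = 365  # assumption: leap days ignored

def rate_from_median(median_days: float) -> float:
    """Exponential rate whose cumulative probability hits 50% at median_days."""
    return math.log(2) / median_days

def cdf(t_days: float, rate: float) -> float:
    """Probability that the first rain has fallen by day t_days."""
    return 1.0 - math.exp(-rate * t_days)

def averaged_cdf(t_days: float, rates: list[float]) -> float:
    """Equal-weight average of the advisors' cumulative forecasts."""
    return sum(cdf(t_days, r) for r in rates) / len(rates)

def median_of_average(rates: list[float], lo: float = 0.0, hi: float = 36500.0) -> float:
    """Bisection search for the day where the averaged curve crosses 50%."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if averaged_cdf(mid, rates) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

rates = [rate_from_median(21), rate_from_median(10 * DAYS_PER_YEAR)]

five_year_average = averaged_cdf(5 * DAYS_PER_YEAR, rates)
crossover_days = median_of_average(rates)

print(f"Averaged cumulative probability at 5 years: {five_year_average:.0%}")
print(f"Averaged curve crosses 50% at about day {crossover_days:.0f}")
```

The crossover sits near day 116 because Pronto's curve is already almost at 100% by then, so the average only needs a sliver of probability from Eternidad's curve to reach one half. That date is one neither advisor would pick.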

Pronto continues yet further:  "However, even if we make assumptions about the shapes of our probability distributions over time, it still doesn't help you choose the best 'median' date in a practical sense. Planting the seeds in expectation of rain in 116 days is still too late given my forecast model, and too early given Eternidad's. We could each be increasingly sophisticated in articulating our models, but the fact remains that they are wildly different models, and under the circumstances, they simply do not lend themselves to sampling down to a gross median. We could implicitly have normal distributions; we could have elephant-shaped distributions; it doesn't matter. There is no trick that we can do to render a single useful consensus date from these disparate models."

You say, annoyed: "But what if I have to simply make one, single decision, based on a median expectation date? What if I don't have the resources to 'plan for both', as you say?"

Eternidad says, "Then we're screwed, Your Grace."

You shout, "Curse you both! I just want the betting odds for what date to expect the rains to come by!"

Eternidad and Pronto look at each other thoughtfully.

Pronto offers, "Eternidad and I would both like to bet that the rains will fall in three weeks."

You splutter. "You changed your mind, Eternidad? Or is this some kind of collusion? Traitors!"

Eternidad: "No, Your Grace. But if we must choose one or the other, then we should go ahead and plant now. If the rains do come, we collectively won a coin flip, and our worries are over. If the rains don't come, we can desperately try some other scheme to feed the people, having wasted a large allotment of our resources. This would still be better than digging our own graves by refusing to do any planting at all."

You interrupt, "But what if we create a Market for Betting in the bazaar, and allow the citizenry at large to place bets on their own distributions for the date of first rainfall?"

Once again, your two advisors glance at each other. Pronto speaks first: "There are broadly two schools of thought on the question of rain. There are those like myself, who reason that the rainy season pretty much always starts at the same time of year, leading to a prediction of the rains likely starting a few weeks from now. There are those like Eternidad, who defer to the auguries of the priests and prophets and the consultation of omens and entrails - sometimes called "bio-anchors" due to its reliance on a deep understanding of the biology of chicken innards - and who thus anticipate a great cataclysmic drought in the near future. If we take the consensus of this Market for Betting that you propose, then we will likely end up with a consensus date that is somewhere in the vicinity of 1 year from now, and then we all subsequently starve to death due to not having prepared properly. No individual person in the kingdom actually thinks that the rains will fall one year from now. We are either facing a normal rainy season or a drought, not some hybrid of the two models."

You fume, "Foolish advisors. My understanding of probability distributions and betting odds is very sophisticated. I have used my skills and knowledge to reliably win millions of coins off of my fellow monarchs in games of chance. What's so different about this situation?"

Eternidad speaks: "Three reasons. Firstly, games of chance rarely, if ever, involve competing incompatible and mutually exclusive models of the world. Games tend to be closed systems that are fairly thoroughly understood, making them poor analogues for thinking about the complexity of the real world in many cases. Secondly, you usually play many iterations of these games of chance, and so the frequency of your victories converges gradually, over many iterations, to align with your betting odds. One-off high-variance situations like this one should not be treated as iterated games. And thirdly, this is not just a forecasting problem but a decision problem. You are, if I may be blunt, confused about which tools are appropriate to solve the problem. You may determine very solid and well-calibrated betting odds for a median date of first rainfall, and yet these betting odds are only useful for minimizing the amount of money that you would lose on a bet, and not at all useful for actually determining how to allocate our state resources. If you only care about betting odds, then feel free to average together mutually incompatible distributions reflecting mutually exclusive world-models. If you care about planning then you actually have to decide which model is right or else plan carefully for either outcome."
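Eternidad's distinction between a forecasting problem and a decision problem can be illustrated with a toy payoff table. Everything numeric below is a made-up assumption for illustration: the payoffs stand for the fraction of the population fed under each plan and each world, and the plan names are hypothetical. Only the equal 50/50 weighting comes from the parable.

```python
# Equal trust in both advisors, as in the parable.
WORLDS = {"rain_in_3_weeks": 0.5, "ten_year_drought": 0.5}

# HYPOTHETICAL payoffs: fraction of the population fed for each
# (plan, world) pair. These numbers are illustrative only.
PAYOFFS = {
    "plant_now":        {"rain_in_3_weeks": 1.0, "ten_year_drought": 0.3},
    "store_food_now":   {"rain_in_3_weeks": 0.4, "ten_year_drought": 0.8},
    "split_resources":  {"rain_in_3_weeks": 0.8, "ten_year_drought": 0.6},
    # Acting on the averaged date: wrong under both models.
    "plant_in_5_years": {"rain_in_3_weeks": 0.0, "ten_year_drought": 0.0},
}

def expected_payoff(plan: str) -> float:
    """Weigh each world separately, then average the payoffs, not the dates."""
    return sum(prob * PAYOFFS[plan][world] for world, prob in WORLDS.items())

ranked = sorted(PAYOFFS, key=expected_payoff, reverse=True)

for plan in ranked:
    print(f"{plan:18s} expected payoff {expected_payoff(plan):.2f}")
```

With these assumed numbers the hedge ranks first and the plan built on the averaged date ranks last. The ordering among the other plans depends entirely on the payoffs chosen, but the averaged-date plan stays last for any payoffs in which it fails under both models.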

You ponder this, and eventually decide that your advisors are correct. Unfortunately, you had already bet the entire treasury on a scheme involving J-shaped clay pegs stamped with pictograms of primates in various attitudes of repose. These monkey j-pegs did not appreciate in value as you expected, and the people of the land starved.

Meta: This was originally written for the ill-fated FTX Future Fund prize. In short, the entire approach of obtaining useful expectation-dates for future technology developments by averaging together wildly disparate world-models is, as I describe here, useful only for determining betting odds, and totally useless for planning and capital-allocation purposes.


 

22 comments

This is similar to a scenario described by Michael Lewis, in The Big Short. In Lewis' telling, Michael Burry noticed that there was a company (Liberty Interactive, if I remember correctly), that was in legal trouble. This legal trouble was fairly serious --- it might have resulted in the liquidation of the company. However, if the company came through the legal trouble, it had good cash flow and was a decent investment.

Burry noticed that the company was trading at a steep discount to what cash flow analysis would predict its share price to be. He realized that what was occurring was that there was one group of investors who were betting that the company would survive its legal troubles, and trade at a "high" price, and there was another group of investors who thought that the stock was going to go to zero because of the legal trouble the company found itself in. Burry read the legal filings himself, came to the conclusion that it was probable that the company would survive its brush with the law, and invested heavily in it. As it turned out, his prediction was proven correct, and he made a nice return.

Burry's position was a likely outcome. The short-sellers who thought that the stock would go to zero bet on another likely outcome. The only truly unlikely outcome is the one that the market, as a whole, was predicting when Burry made his investment. The price of the stock was an average of two viewpoints that, in a fundamental sense, could not be averaged. Either the company loses its court case, and the stock goes to zero. Or the company survives its court case (perhaps paying a fine in the process), and proceeds with business as usual. As a result, the current market price of the company is not a good guide to its long-term value, and it was possible, as Burry did, to beat the market.

I'm confused by this example. This seems exactly the kind of time where an averaged point estimate is the correct answer. Say there's a 50% chance the company survives and is worth $100 and a 50% chance it doesn't and is worth $0. In this case, I am happy to buy or sell the stock at $50.

Doing research to figure out it's actually an 80% chance of $100 means you can buy a bunch and make $30 in expected profit. This isn't anything special though - if you can do research and form better beliefs than the market, you should make money. The different world models don't seem relevant here to me?
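The arithmetic in this comment can be written out explicitly. This is a small sketch using only the commenter's own numbers ($100 or $0, 50% versus 80%); the function and variable names are hypothetical, and the profit is per share.

```python
def expected_value(outcomes: list[tuple[float, float]]) -> float:
    """Probability-weighted average of (probability, payoff) pairs."""
    return sum(prob * payoff for prob, payoff in outcomes)

# Market's view: 50% survive ($100), 50% liquidated ($0).
market_price = expected_value([(0.5, 100.0), (0.5, 0.0)])

# Researcher's view after reading the filings: 80% survive.
researched_value = expected_value([(0.8, 100.0), (0.2, 0.0)])

expected_profit_per_share = researched_value - market_price

print(f"Market price:              ${market_price:.0f}")
print(f"Researched expected value: ${researched_value:.0f}")
print(f"Expected profit per share: ${expected_profit_per_share:.0f}")
```

Note that the stock is never actually worth $50 or $80 in either outcome; those figures are useful as betting odds, which is the distinction the post draws.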

This is an example where the true distribution of future prices is bimodal (with the average between the modes). If all you can do is buy or sell stock, then you actually have to disagree with the market about the distribution to make money. 

Without having information about the probability of default, there might still be something to do based on the vol curve.

Someone disagree-voted on this and I'm curious to know why. (Concretely: if you have information contradicting this, I'd like to hear about it so I don't incorrectly update on it.)

"As a result, the current market price of the company is not a good guide to its long-term value, and it was possible, as Burry did, to beat the market."

That doesn't sound right. That tactic doesn't make you more (or less) likely to beat the market than any other tactic.

The current price isn't an accurate representation of its actual long-term value, but it's an accurate representation of the average of its possible long-term values weighted by probability (from the market's point of view).

So you might make a bet that wins more often than it loses, but when it loses it will lose a lot more than it wins, etc. You're only beating the market when you get lucky, not on average; unless, of course, you have better insights than the market, but that's not specific to this type of trade.

Shouldn't the king just make markets for "crop success if planted assuming three weeks" and "crop success if planted assuming ten years" and pick whichever is higher? Actually, shouldn't the king define some metric for kingdom well-being (death rate, for instance) and make betting markets for this metric under his possible roughly-primitive actions?

This fable just seems to suggest that you can draw wrong inferences from betting markets by naively aggregating. But this was never in doubt, and does not disprove that you can draw valuable inferences, even in the particular example problem.

These would be good ideas. I would remark that many people definitely do not understand what is happening when naively aggregating, or averaging together disparate distributions. Consider the simple example of the several Metaculus predictions for date of AGI, or any other future event. Consider the way that people tend to speak of the aggregated median dates. I would hazard most people using Metaculus, or referencing the bio-anchors paper, think the way the King does, and believe that the computed median dates are a good reflection of when things will probably happen.

Agreed.

It seems like the moral of this parable should be “don’t make foolish, incoherent hedges” — however, the final explanations given by Eternidad don’t touch on this at all. I would be more satisfied by this parable if the concluding explanations focused on the problems of naive data aggregation.

The “three reasons” given are useful ideas, but the king’s decision in this story is foolish even if this scenario was all three: a closed game, an iterated game, and only a betting situation. (Just imagine betting on a hundred coin flips that the coin will land on its edge every time.)

Curated. A parable explaining a probability lesson that many would benefit from – what's not to love? I like the format, I found the dialog/parable amusing rather than dry, and I think the point is valuable (and due to the format, memorable). I'll confess that I think this post will have me looking at blends of different forecasts more carefully, especially as regards actual decision-making (particularly regarding AI forecasts, which are feeling increasingly relevant to decision-making these days).

Thank you for this! I was trying to explain this idea to someone recently and couldn't come up with a good way to put it. Now I have something to point to that puts it nicely!

This parable seems to prove too much, since it suggests the same action, "Prepare Now!" to any question of a possible disaster, not just "When will the next heavy rain occur?" What am I missing?

It seems implied that the chance of a drought here is 50%. If there is a 50% chance of basically any major disaster in the foreseeable future, the correct action is "Prepare Now!".

Generally, you should hedge. Devote some resources toward planting and some resources toward drought preparedness, allocated according to your expectation. In the story, the King trusts the advisors equally, and should allocate toward each possibility equally, plus or minus some discounting. Just don't devote resources toward the fake "middle of the road" scenario that nobody actually expects.

If you are in a situation where you really can only do one thing or the other, with no capability to hedge, then I suppose it would depend on the details of the situation, but it would probably be best to "prepare now!" as you say.

The devil, as they say, is in the details. But worst case scenario is to flip a coin - don't be Buridan's Ass and starve to death because you can't decide which equidistant pile of food to eat.

A related metaphor that I like:

Suppose you are in a boat heading down a river, and there are rocks straight ahead. You might not be sure whether it is best to veer left or right, but you must pick one and put all your effort into it. Averaging the two choices is certain disaster.

(Source, as I recall, is Geoffrey Moore's book Crossing the Chasm.)


Thanks for the post, it's useful to be reminded of this every now and then. The first time I thought about it was when considering the statement by doctors that "somebody's life expectancy is 6 months". This actually means that there is a high chance of dying very soon, but if that doesn't happen they'll probably live on for many years.

Planning as if they would live for +/- 6 months is useless in that case.

I am surprised the advisors don't propose that the king follow the weighted average of decisions, rather than thinking about predictions and picking the associated decision.

This is intuitively the formal model underlying the obvious strategy of preparing for either outcome.

That is probably close to what they would suggest if this weren't mainly just a metaphor for the weird ways that I've seen people thinking about AI timelines.

It might be a bit more complex than a simple weighted average because of discounting, but that would be the basic shape of the proper hedge.

You may be interested in submitting this to the Open Philanthropy AI Worldviews Contest. (I have no connection whatsoever to the contest; just an interested observer here.)