Equity Premium Puzzles

From 1870 to 2015, stocks in the United States returned around 8.4% per year in real (inflation-adjusted) terms, while short-term government bonds (called Treasury bills in the US) returned only around 2.1% per year. Though the United States is something of an outlier in how well its equity markets performed in this period, the picture in other countries isn't much different: stocks have much higher returns than bonds. This fact is called the equity premium puzzle, and I'll get into why exactly it's so puzzling in this essay. I note here that while the discussion about the equity premium puzzle often focuses on broad stock indices, the puzzle is actually present in all asset markets: junk bonds, real estate, foreign exchange, et cetera.

If the difference between 8.4% and 2.1% per year looks small to you, remember that the logic of compounding works such that at these rates of return an initial investment into stocks (with cash from dividends, buybacks, etc. reinvested into the same portfolio) would double in value roughly every 8 to 9 years, while the same investment into bonds would double roughly every 33 years. If the time horizon is on the order of a few decades, the difference between the value of the two portfolios becomes enormous. This, and many other interesting findings, are available in The Rate of Return on Everything.
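As a rough check on the compounding arithmetic, the doubling time at a constant annual return $r$ is $\log 2 / \log(1 + r)$. The sketch below assumes constant real returns and full reinvestment; the function names are illustrative.

```python
import math


def doubling_time(annual_return: float) -> float:
    """Years needed for a fully reinvested portfolio to double at a constant annual return."""
    return math.log(2) / math.log(1 + annual_return)


def growth_multiple(annual_return: float, years: int) -> float:
    """Final value of 1 unit invested for `years` at a constant annual return."""
    return (1 + annual_return) ** years


stocks, bills = 0.084, 0.021  # real returns quoted in the essay
print(f"stocks double every {doubling_time(stocks):.1f} years")
print(f"bills double every {doubling_time(bills):.1f} years")

# Ratio of the two portfolio values after a few decades.
ratio = growth_multiple(stocks, 30) / growth_multiple(bills, 30)
print(f"after 30 years the stock portfolio is worth {ratio:.1f}x the bill portfolio")
```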

The most common explanation offered for this difference in returns is that stocks are riskier and therefore it's natural that they command a higher rate of return: people who buy stocks are compensated for the risk that they are taking on. To see why this explanation is by itself not sufficient to explain anything, it's enough to notice that a short position on the stock market is at least as risky as a long position (and, in fact, riskier), and yet it has a deeply negative mean return. Aside from being short the S&P 500, you can probably think of plenty of other risks you could take for which you would not be rewarded: if you jump from the top floor of a tall building and hope to make it safely to the ground, you take an enormous risk and nobody will reward you for doing so.

A better explanation is not that people are rewarded for taking on risk as such, but that they are rewarded for taking on risk that other people are willing to pay a premium to insure against. If you're buying insurance against your house burning down, you might not care that much if the purchase is of negative expected value, because in exchange for that you get to unload the risk of your house burning down on a counterparty. If the terms of the contract are good enough, that might well be a good decision from your point of view, since your house burning down would be very bad for you.

However, this still leaves us with a mystery: if the insurance industry is competitive, then even if people on the demand side are willing to pay large premiums to insure their houses against fires, supply competition should cut down premiums to some small markup over expected cost. This is because while it's risky for you as an individual to hold the risk of your house burning down, if a big insurance or reinsurance company holds a bundle of ten thousand similar contracts which aren't correlated with each other, they successfully diversify their portfolio and bring down the risk they take by a lot while maintaining the same expected return.

The only way that premiums can get to be much higher than expected cost is therefore if something goes wrong in the argument from the preceding paragraph. The obvious candidate is the assumption that it is possible at all to construct a portfolio of ten thousand different insurance claims which are uncorrelated. For example, a big forest fire or a heat wave likely increases the chance that any of the ten thousand houses burns down. How much risk we can diversify away depends on how much of the risk is idiosyncratic (specific to an individual) and how much of it is systemic (shared across all individuals). Only systemic risks can account for large differences between the premiums charged by insurance companies and the expected cost of the events that are being insured against.
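The limits of diversification can be made concrete with a small sketch. It assumes a stylized setup that is not in the essay: $n$ risks with identical standard deviation and a single common pairwise correlation. The variance of the average is then $\sigma^2 \left( 1/n + (1 - 1/n) c \right)$, where the first term is the idiosyncratic part that shrinks with $n$ and the second is the shared part that does not.

```python
import math


def portfolio_std(n: int, sigma: float, corr: float) -> float:
    """Std dev of the average of n risks with equal std `sigma` and pairwise correlation `corr`."""
    variance = sigma**2 * (1.0 / n + (1.0 - 1.0 / n) * corr)
    return math.sqrt(variance)


# Illustrative numbers: each contract has risk 1.0 in arbitrary units.
for corr in (0.0, 0.01, 0.05):
    print(corr, portfolio_std(10_000, 1.0, corr))
```

With zero correlation, pooling ten thousand contracts cuts the per-contract risk by a factor of one hundred. With even a small positive correlation, the risk stays close to $\sigma \sqrt{c}$ no matter how many contracts are added.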

If we apply this insight to the equity premium puzzle, we see that an equity premium makes sense if buying stocks means you take on some systemic risk which other people are willing to pay you a premium to take on. Again, the most obvious candidate here is the risk of economic recession. Stocks tend to do well when the economy is doing well and badly when the economy is doing badly, and we can think of "the state of the economy" as a systemic risk: there's no way to diversify away the risk that there will simply be less GDP to go around for everyone. So we can at least explain why stocks have a higher return than bonds in qualitative terms, which is encouraging.

Unfortunately for our simple story, further problems begin to crop up when we look at not just the existence of the equity premium but its magnitude, especially compared to the amount of risk that is being taken. The standard deviation of real stock returns in the US over the same period 1870-2015 was around 20% per year. If we try to square this amount of risk with the magnitude of the gap between equity and Treasury bill returns, we end up having to postulate absurdly high levels of risk aversion and these postulates mean our models fail to fit other findings about asset returns, for example the relative stability of riskless short-term rates of return.
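To give a feel for the magnitudes, the sketch below uses the textbook consumption-based approximation for power utility with lognormal returns, in which the equity premium is roughly risk aversion times the covariance of stock returns with consumption growth. This formula is not derived in the essay. The consumption volatility of 2% per year and the correlation values are illustrative assumptions, not figures from the source.

```python
def implied_risk_aversion(premium: float, stock_vol: float,
                          consumption_vol: float, correlation: float) -> float:
    """Relative risk aversion needed so that premium = gamma * corr * stock_vol * consumption_vol."""
    return premium / (correlation * stock_vol * consumption_vol)


premium = 0.084 - 0.021   # excess return quoted in the essay
stock_vol = 0.20          # annual std dev of real stock returns quoted in the essay
consumption_vol = 0.02    # ASSUMPTION: illustrative annual std dev of consumption growth

for correlation in (1.0, 0.5, 0.2):  # ASSUMPTION: illustrative correlations
    gamma = implied_risk_aversion(premium, stock_vol, consumption_vol, correlation)
    print(f"correlation {correlation:.1f}: implied risk aversion {gamma:.1f}")
```

Under the same assumed model the riskless rate moves by about $\gamma$ percentage points for every percentage point change in expected consumption growth, which is why large values of $\gamma$ sit badly with stable short-term rates.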

An important point here is that the equity premium is still not quite as well measured as we might like it to be. International stock returns over long horizons are highly correlated since long-run economic growth is shared across the world and plausibly dominates most of the variance in the forecasts, and as such the dozens of different stock markets we have access to actually don't give that much extra evidence over just looking at the S&P 500 about whether there is a long-run equity premium or not. The naive standard error estimate is easy to compute: we have a difference of $8.4 \% - 2.1 \% = 6.3 \%$ in annual returns of stocks versus bills, the standard deviation of stock returns is $20 \%$ per year, and our time window is $145$ years long. If we put all that together we end up with a standard error of $20 \%/\sqrt{145} = 1.66 \%$ on the excess return of stocks over bills, so if we take a two sigma confidence interval it's entirely plausible that the equity premium is only half of the naive estimate $6.3 \%$ . While the data provides strong evidence for the existence of the equity premium, the standard errors are large enough that our estimate of its magnitude can still plausibly range from $3 \%$ per year to $9.6 \%$ per year. Regardless of this fact, all "plausible" values of the premium are still much too large to be accounted for in quantitative terms.
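The standard error arithmetic above can be reproduced in a few lines. As in the text, this is the naive estimate: it treats annual returns as independent and ignores the (much smaller) variance of bill returns.

```python
import math


def naive_standard_error(annual_vol: float, years: int) -> float:
    """Standard error of a mean annual return, assuming independent yearly draws."""
    return annual_vol / math.sqrt(years)


premium = 0.084 - 0.021                  # stocks minus bills
se = naive_standard_error(0.20, 145)     # 20% annual volatility over 145 years
low, high = premium - 2 * se, premium + 2 * se

print(f"standard error: {se:.2%}")
print(f"two-sigma band: {low:.1%} to {high:.1%}")
```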

Things get even worse because it turns out that not only is the average return on the stock market much higher than it "should be", the expected return also varies over time much more than it should. We can forecast returns on the stock market with simple dividend yield regressions, especially if we are forecasting mean returns over a five to ten year horizon instead of annual returns, and these regressions tell us that the average return of 8.4% per year actually masks a lot of variation over the course of the business cycle. Expected stock returns are low in good economic times and high in bad economic times.

To illustrate this with a current example, the Metaculus community forecasts that the S&P 500 will realize an annual real return of only 5.4% per year from 2022 to 2031, considerably below the historical mean return of 8.4%.

Puzzles of excess volatility

The main obstacle to recovering the magnitude of the equity premium quantitatively, as well as the magnitude of its variation over the business cycle, is that our economy is not as risky as stocks make it look. Both consumption growth and GDP growth vary by much less than stock returns: the difference is approximately an order of magnitude. If we expect GDP growth to continue along a particular trend line and recessions to be just temporary falls in GDP, then stock returns should vary less than GDP, since the stock price of a company depends not only on its current cash flows but on its entire future stream of cash flows. In fact they vary by much more. This is the so-called "puzzle of excess volatility", and it's apparent not only in stock markets but also in other markets, most notably in foreign exchange.

In 1988 Campbell and Shiller came up with a way of formalizing all this discussion which until then had been up in the air. If you're not interested in the technical details you may skip ahead, but it's an important milestone in the history of asset pricing, so I cover it here to explain how we know some of the things we know on the subject.

We start with the definition of stock returns over a period: your returns equal price appreciation plus the dividends you earn on the stock, where we fold other cash transfers such as stock buybacks into the dividends for simplicity. Symbolically, we can express that as

$R_t = \frac{P_{t+1} + D_t}{P_t}$

where $R_t$ is the gross return in period $t$ , $P_t$ is the start-of-period price in period $t$ , and $D_t$ denotes the dividends paid out for this stock in period $t$ . If we take natural logarithms of both sides and let variables in lowercase denote the logarithms of the variables in uppercase, we get

$r_t = \log(R_t) = \log \left( \frac{P_{t+1} + D_t}{P_t} \right) = p_{t+1} - p_t + \log \left(1 + e^{d_t - p_{t+1}} \right) \approx p_{t+1} - p_t + e^{d_t - p_{t+1}}$

If dividend yields don't vary by too much, we can denote the exponential of the "average value" of $d_t - p_{t+1}$ by $\rho$ ( $\rho$ would typically be $\approx 0.04$ for the S&P 500), and then we can approximate this further to get the Campbell-Shiller one period return identity

$r_t \approx \rho + p_{t+1} - p_t + \rho (d_t - p_{t+1} - \log(\rho))$

Typically since we're only interested in variations in returns we discard the overall constants in this identity, so we can think of $r_t \approx p_{t+1} - p_t + \rho (d_t - p_{t+1})$ as holding up to a constant depending on $\rho$ .

Importantly, the only ingredient in this linearized identity is the definition of return. We've made no assumptions about how the stock market works beyond that.
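To see how tight the linearization is, the sketch below compares the exact log return with the one period approximation, including the constants that are usually discarded. The prices and dividend are made-up numbers chosen so that the dividend-price ratio is close to $\rho = 0.04$ ; the function names are illustrative.

```python
import math

RHO = 0.04  # dividend-price ratio around which the identity is linearized


def exact_log_return(p0: float, p1: float, d0: float) -> float:
    """log((P_{t+1} + D_t) / P_t), with inputs given as log price and log dividend."""
    return math.log((math.exp(p1) + math.exp(d0)) / math.exp(p0))


def linearized_log_return(p0: float, p1: float, d0: float, rho: float = RHO) -> float:
    """Campbell-Shiller one period approximation, constants included."""
    return rho + p1 - p0 + rho * (d0 - p1 - math.log(rho))


# Hypothetical example: price goes from 100 to 105 and the dividend is 4.
p0, p1, d0 = math.log(100.0), math.log(105.0), math.log(4.0)
exact = exact_log_return(p0, p1, d0)
approx = linearized_log_return(p0, p1, d0)
print(f"exact {exact:.5f}, linearized {approx:.5f}, error {approx - exact:.5f}")
```

The error is dominated by the step that replaces $\log(1 + x)$ with $x$ , so it is of order $\rho^2 / 2$ per period as long as the dividend-price ratio stays near $\rho$ .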

One thing we can now do is to put $p_t$ on the left hand side and iterate this identity forward up to some time $T$ . If we do that, we get

$p_t - d_t = (1 - \rho) (d_{t+1} - d_t) - r_t + (1 - \rho) (p_{t+1} - d_{t+1})= \ldots$

$= \sum_{k=t}^{T-1} (1 - \rho)^{k-t+1} \Delta d_k - \sum_{k=t}^{T-1} (1-\rho)^{k-t} r_k + (1 - \rho) ^{T - t} (p_T - d_T)$

Here $\Delta d_k = d_{k+1} - d_k$ denotes log dividend growth. This expression may seem scary, but what it expresses is actually very intuitive. If the price of a stock is high today relative to its current dividends, as a matter of definition there are only three things that can happen in the future:

1. Future dividend growth will be high.

2. Future returns on the stock will be low.

3. The price will be even higher compared to the dividends in the future.

Those three possibilities correspond to the three terms in the right hand side of that formula.
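Because the iterated formula is pure accounting, it can be checked numerically on arbitrary data. The sketch below draws random log prices and log dividends (not a model of any real market), defines returns through the linearized identity with constants dropped, and confirms that the three terms add up to $p_t - d_t$ . The helper names are illustrative.

```python
import random


def linearized_returns(p, d, rho):
    """r_k = p_{k+1} - p_k + rho * (d_k - p_{k+1}), constants dropped."""
    return [p[k + 1] - p[k] + rho * (d[k] - p[k + 1]) for k in range(len(p) - 1)]


def present_value_terms(p, d, rho, t, T):
    """Return the dividend growth, return, and terminal terms of the iterated identity."""
    r = linearized_returns(p, d, rho)
    growth = sum((1 - rho) ** (k - t + 1) * (d[k + 1] - d[k]) for k in range(t, T))
    returns = sum((1 - rho) ** (k - t) * r[k] for k in range(t, T))
    terminal = (1 - rho) ** (T - t) * (p[T] - d[T])
    return growth, returns, terminal


random.seed(0)
T = 50
p = [random.gauss(4.0, 0.3) for _ in range(T + 1)]  # arbitrary log prices
d = [random.gauss(1.0, 0.3) for _ in range(T + 1)]  # arbitrary log dividends

growth, returns, terminal = present_value_terms(p, d, rho=0.04, t=0, T=T)
gap = (p[0] - d[0]) - (growth - returns + terminal)
print(f"identity gap: {gap:.2e}")
```

The gap is zero up to floating point error for any input series, which is the sense in which the formula makes no assumptions about how the stock market works.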

The reason we go through this funny derivation is that we can now actually understand what the puzzle of excess volatility is about. As a matter of accounting, any volatility in $p_t - d_t = pd_t$ must come from volatility in one of the three terms in the right hand side. Since we in fact know that there's excess volatility, one of these terms must be the culprit. We can figure out which term is responsible by noting that taking covariances of both sides with $pd_t$ and dividing by the variance of $pd_t$ gives an identity $1 = \beta_d - \beta_r + \beta_p$ which certain regression coefficients must obey. We can then go and run these regressions to see which of the betas are contributing to the sum. What we find in the data is that $\beta_d, \beta_p$ are both approximately zero, while $\beta_r$ is approximately $-1$ and dominates most of the sum. In other words, for broad stock market indices (not for individual stocks!), high price dividend ratios forecast neither strong future dividend growth nor even higher future price dividend ratios; they merely forecast weak future returns.
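Spelled out, each beta is the slope from regressing one term of the present value identity on $pd_t$ . Writing the slopes as covariance ratios is just the usual definition of a univariate regression coefficient:

$\beta_d = \frac{\operatorname{Cov} \left( pd_t, \sum_{k=t}^{T-1} (1 - \rho)^{k-t+1} \Delta d_k \right)}{\operatorname{Var}(pd_t)}$

$\beta_r = \frac{\operatorname{Cov} \left( pd_t, \sum_{k=t}^{T-1} (1 - \rho)^{k-t} r_k \right)}{\operatorname{Var}(pd_t)}$

$\beta_p = \frac{\operatorname{Cov} \left( pd_t, (1 - \rho)^{T-t} pd_T \right)}{\operatorname{Var}(pd_t)}$

Since $\operatorname{Cov}(pd_t, pd_t) = \operatorname{Var}(pd_t)$ , taking the covariance of both sides of the identity with $pd_t$ gives

$1 = \beta_d - \beta_r + \beta_p$

With the empirical values quoted above this reads $1 \approx 0 - (-1) + 0$ .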

If we take expectations of the Campbell-Shiller present value identity at time $t$ , we see that the puzzle of excess volatility is the same puzzle as the puzzle of time-varying expected returns, which is the same puzzle as the time-varying equity premium! In some sense, there's "only one puzzle" about all these aberrant behaviors of asset markets, and "equity premium puzzle" is as good a name as any.

Remember that we already have a question about how much stock returns will be from 2022 to 2031. With the insight we get from the Campbell-Shiller present value formula, we might wonder what will contribute to the stock returns: will it be further increases in price-dividend ratios (meaning decreases in dividend yields), which have been trending upward for the past forty years; or will it be stronger growth in dividends? A simple way to operationalize a forecast of this is the following question:

If the Campbell-Shiller present value formula looks too complicated, one alternative way to understand the decline in dividend yields is to use the simple Gordon growth formula. This special case of Campbell-Shiller states that if a stock had a constant dividend growth of $g$ and a constant rate of return $r$ which went on forever, then its dividend yield would be equal to $r - g$ . In the past few decades we have seen an overall downward trend in various real rates of return that's been more pronounced than the decline in the rate of economic growth, so we might think that this explains the secular downward trend in dividend yields. However, if we actually look at real S&P 500 returns over the past 40 years, they've been rather high: around 9% per year on average.
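The Gordon growth relationship can be checked by brute force. The sketch below assumes annual compounding, a first dividend paid one year from now, and illustrative values of $r$ and $g$ that are not taken from the essay. It compares a long discounted sum of dividends with the closed form price, from which the dividend yield $r - g$ follows.

```python
def gordon_price(next_dividend: float, r: float, g: float) -> float:
    """Closed-form price of a dividend stream growing at g and discounted at r (requires r > g)."""
    if r <= g:
        raise ValueError("the discounted sum only converges when r > g")
    return next_dividend / (r - g)


def discounted_sum_price(next_dividend: float, r: float, g: float, horizon: int = 5000) -> float:
    """Price as an explicit sum of discounted dividends over a long but finite horizon."""
    return sum(next_dividend * (1 + g) ** k / (1 + r) ** (k + 1) for k in range(horizon))


r, g = 0.06, 0.02  # ASSUMPTION: illustrative rate of return and dividend growth
closed_form = gordon_price(1.0, r, g)
brute_force = discounted_sum_price(1.0, r, g)
print(f"closed form {closed_form:.4f}, discounted sum {brute_force:.4f}")
print(f"dividend yield {1.0 / closed_form:.4f} vs r - g = {r - g:.4f}")
```

Holding $g$ fixed, a lower $r$ raises the price relative to dividends, which is the mechanism behind the candidate explanation for falling dividend yields discussed above.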

While we see no evidence of it yet, it's theoretically also possible that the current low dividend yields correspond to high $g$ rather than low $r$ —in other words, to high anticipated future dividend growth. This would most likely have to come together with expectations of stronger future economic growth. Much like it was with the dotcom boom, the question is whether the currently low dividend yields will pay off in some form in the future, or whether we will simply have lower future expected returns going forward for the foreseeable future.

Explanations

While the finding that there are time-varying expected returns on asset markets and that they are responsible for "excess volatility" is ironclad, it's not at all clear where this variation is coming from. Shiller advocated a "behavioralist" explanation which attributed the variation to investor irrationality, but this explanation seems quite weak since the time variation corresponds quite well to the peaks and troughs of the business cycle. Moreover, irrationality can't account for the level of the equity premium unless we assume that people have been irrational in the same way and in roughly the same magnitude for a century and a half. With all these caveats, however, the primary scientific weakness of behavioralist explanations is that they don't actually make any predictions beyond the patterns we observe in the data, so they are of little predictive value.

The alternative explanation is that time-varying expected returns correspond to either time-varying risks or variation in people's willingness to bear risk over the course of the business cycle. 2009 was a great time to buy stocks, but you may be more scared of buying stocks when you're also afraid of getting a pay cut or losing your job. These ideas can indeed be pushed to produce models which can quantitatively reproduce the time-varying equity premium along with its magnitude, but all of the models we get in this way have other undesirable properties which make them not very convincing as resolutions of the puzzle.

Perhaps the class of models which have been most popular recently are the "long-run risk" or "recursive utility" models of the equity premium. If you've done reinforcement learning or economics before, you might have run into dynamic value functions of the kind

$V = \mathbb E \left[ \sum_{k=0}^{\infty} \beta^k u(c_k) \right]$

where $0 < \beta < 1$ is a time discounting factor, $u : \mathbb R \to \mathbb R$ is an instantaneous utility function with some usual properties and $c_k$ is how much the agent consumes in period $k$ . While this definition is convenient to work with, the central problem it has is that regardless of what we pick $u$ to be, we treat different times and different states of the world in an identical way. Both of them contribute as summands to $V$ : time $k$ is scaled by a factor $\beta^k$ and a particular state of the world is scaled by its probability of occurring, but beyond that, the way they contribute to total value $V$ is identical. This is a problem because we know that people are actually much more reluctant to spread out consumption over states of the world than they are to spread it out over time. In concrete terms, people are fairly willing to cut their consumption by half this year in order to consume twice as much next year, but considerably more reluctant to take a gamble which cuts their consumption this year by half with a probability of $50 \%$ and doubles it with a probability of $50 \%$ .

In economic terms, this form of the value function cannot separate risk aversion from the elasticity of intertemporal substitution, and this is really the mystery of the equity premium puzzle: risk-free rates of return are low and don't vary too much along with changes in consumption growth which implies a high elasticity of intertemporal substitution; while the equity premium is high, which implies high relative risk aversion. For $V$ of the type I wrote above, we have that these two quantities always multiply to 1 as a matter of definition, so the fact that both of them must be high gives a contradiction.
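For the common power utility case $u(c) = c^{1-\gamma}/(1-\gamma)$ (an assumed functional form; the essay leaves $u$ general), the reciprocal relationship can be checked in two steps. Relative risk aversion is

$-\frac{c \, u''(c)}{u'(c)} = \gamma$

and the deterministic first order condition $u'(c_0) = \beta R \, u'(c_1)$ for saving at a gross riskless return $R$ gives

$\log \frac{c_1}{c_0} = \frac{1}{\gamma} \left( \log \beta + \log R \right)$

so the elasticity of intertemporal substitution, $\partial \log(c_1 / c_0) / \partial \log R$ , equals $1/\gamma$ , and the product of the two quantities is 1.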

The way to fix this is to do aggregation over time using a CES function instead of a naive sum, and this is the origin of the long-run risk model. The reason it's called the "long-run risk model" is that it has a curious property: if the model holds, then stock prices respond to news about future consumption that isn't reflected in today's consumption. In fact, the only reason people are scared of recessions is what they signal about the long-term future of their consumption stream, rather than their immediate effect on current consumption. Whether this is a feature or a bug of the model is up for debate, but we can certainly create some questions to test this prediction.

Other popular explanations which have different properties from the long-run risk model when it comes to this kind of question include habit models and idiosyncratic risk models. Macro-Finance is a literature review on the subject which covers these models along with many other proposed explanations.
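To make the CES aggregation behind recursive utility concrete, here is a minimal two period sketch. It assumes the standard Epstein-Zin form, which the essay does not spell out: risky future consumption is first collapsed into a certainty equivalent using risk aversion $\gamma$ , and the result is then combined with current consumption by a CES function with elasticity of intertemporal substitution $\psi$ . The parameter values, the equal weights on the two periods, and the choice to put the gamble on next year's consumption are all illustrative assumptions.

```python
def certainty_equivalent(outcomes, probs, gamma):
    """Certainty equivalent of risky consumption under risk aversion gamma (gamma != 1)."""
    expected = sum(q * c ** (1 - gamma) for c, q in zip(outcomes, probs))
    return expected ** (1 / (1 - gamma))


def epstein_zin_value(c0, outcomes, probs, beta, gamma, psi):
    """Two period recursive utility: CES over c0 and the certainty equivalent of c1 (psi != 1)."""
    a = 1 - 1 / psi
    ce = certainty_equivalent(outcomes, probs, gamma)
    return ((1 - beta) * c0 ** a + beta * ce ** a) ** (1 / a)


beta, psi = 0.5, 2.0  # ASSUMPTION: equal weights on both periods, high EIS

# High risk aversion together with a high EIS, which the time-separable sum cannot deliver.
baseline = epstein_zin_value(1.0, [1.0], [1.0], beta, gamma=10.0, psi=psi)
trade = epstein_zin_value(0.5, [2.0], [1.0], beta, gamma=10.0, psi=psi)
gamble = epstein_zin_value(1.0, [0.5, 2.0], [0.5, 0.5], beta, gamma=10.0, psi=psi)
print(f"baseline {baseline:.3f}, intertemporal trade {trade:.3f}, gamble {gamble:.3f}")

# Setting gamma = 1 / psi recovers the time-separable expected utility ranking.
gamble_separable = epstein_zin_value(1.0, [0.5, 2.0], [0.5, 0.5], beta, gamma=1 / psi, psi=psi)
print(f"gamble when gamma = 1/psi: {gamble_separable:.3f}")
```

With these numbers the agent prefers the halve-now, double-later trade to the baseline but rejects the gamble, while in the special case $\gamma = 1/\psi$ the same agent would accept both.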

Conclusion

There are many different facets of the equity premium that I haven't been able to get into in this essay, as it's already running quite long. These include the surprising connection between foreign exchange volatility and the equity premium puzzle, cross-country correlations of stock returns, et cetera. Still, I hope I was able to give a good overview of the subject which poses the central questions related to the puzzle and goes over some of the directions that academic investigation of the subject has taken.
