All of SimonM's Comments + Replies

Is Metaculus Slow to Update?

No - I think probability is the thing supposed to be a martingale, but I might be being dumb here.

Just to confirm: Writing $p_t$, the probability of $A$ at time $t$, as $p_t = E[1_A \mid F_t]$ (here $F_t$ is the sigma-algebra at time $t$), we see that $p_t$ must be a martingale via the tower rule. The log-odds $x_t = \log\frac{p_t}{1-p_t}$ are not martingales unless $p_t \equiv \text{const}$ because Itô gives us

$$dx_t \overset{\text{def}}{=} d\log\frac{p_t}{1-p_t} \overset{\text{Itô}}{=} \underbrace{\frac{1}{p_t(1-p_t)}\,dp_t}_{\text{martingale part}} + \underbrace{\frac{1}{2}\left(\frac{1}{(1-p_t)^2} - \frac{1}{p_t^2}\right)d[p]_t}_{\text{drift part}}.$$

So unless $p_t$ is continuous and of bounded variation ($\Rightarrow d[p]_t = 0$, but this also implies that $p_t \equiv \text{const}$; the integrand of the drift part only vanishes if $p_t \equiv \frac{1}{2}$ for all $t$), the log-odds are not a martingale. Interesting analysis on log-odds might still be possible (just use $dp_t = p_{t+1} - p_t$ and $d[p]_t = (p_{t+1} - p_t)^2$ for discrete-time/jump processes as we naturally get when working with real data), but it's not obvious to me if this comes with any advantages over just working with $p_t$ directly.
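
A minimal discrete-time sketch of the point above. The one-step process is a made-up illustration (not from the thread): the probability starts at 0.7 and moves to 0.9 or 0.5 with equal chance, so it is a martingale, yet the expected log-odds goes up. The last quantity is only the second-order (Itô-style) approximation of the drift, so it need not match the exact gap.

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

# Hypothetical one-step process: p_0 = 0.7 moves to 0.9 or 0.5, each with probability 1/2.
p0 = 0.7
next_values = [0.9, 0.5]
weights = [0.5, 0.5]

# The probability itself is a martingale: its expected next value equals p0.
expected_p = sum(w * p for w, p in zip(weights, next_values))

# The log-odds are not: their expected next value exceeds logit(p0).
expected_logit = sum(w * logit(p) for w, p in zip(weights, next_values))
exact_drift = expected_logit - logit(p0)

# Second-order approximation of the drift: 1/2 * f''(p0) * E[(p_1 - p0)^2],
# with f''(p) = 1/(1-p)^2 - 1/p^2 as in the formula above.
second_derivative = 1.0 / (1.0 - p0) ** 2 - 1.0 / p0 ** 2
quadratic_variation = sum(w * (p - p0) ** 2 for w, p in zip(weights, next_values))
approx_drift = 0.5 * second_derivative * quadratic_variation

print(f"E[p_1] = {expected_p:.4f}")
print(f"logit(p_0) = {logit(p0):.4f}, E[logit(p_1)] = {expected_logit:.4f}")
print(f"exact drift = {exact_drift:.4f}, second-order approximation = {approx_drift:.4f}")
```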
Thoughts on the SPIES Forecasting Method?

So, what do you think? Does this method seem at all promising? I'm debating with myself whether I should begin using SPIES on Metaculus or elsewhere.

I'm not super impressed tbh. I don't see "give a 90% confidence interval for x" as a question which comes up frequently? (At least in the context of eliciting forecasts and estimates from humans - it comes up quite a bit in data analysis).

For example, I don't really understand how you'd use it as a method on Metaculus. Metaculus has 2 question types - binary and continuous. For binary you have to give the prob... (read more)

2022 ACX predictions: market prices

17. Unemployment below five percent in December: 73 (Kalshi said 92% that unemployment never goes above 6%; 49 from Manifold)

I'm not sure exactly how you're converting 92% unemployment < 6% to < 5%, but I'm not entirely convinced by your methodology?

15. The Fed ends up doing more than its currently forecast three interest rate hikes: None (couldn't find any markets)

Looking at the SOFR Dec-22 3M futures 99.25/99.125 put spread on the 14-Feb, I put this probability at ~84%. 

Thanks for doing this, I started doing it before I saw your competition an... (read more)

2Sam Marks5mo
Thanks for this feedback!

Re 17: You are right to be skeptical, because my methodology for this one was silly and ad hoc. I somewhat arbitrarily turned a 92% chance that unemployment never goes above 6% into an 80% chance that unemployment isn't above 5% in December. This is completely unprincipled, but I didn't have any better ideas, and the alternative was to ignore the Kalshi market completely and defer entirely to the 5 bettors on Manifold, which seemed worse. If you have a more reasonable way of getting a number here, I'll happily defer to it.

Re 15: Thanks! I'll edit that number in and point to your comment.

Thanks also for the work you put into doing this last year! That post (along with Zvi's re-predictions) led to me running a small prediction contest with a handful of friends. That went well, was a lot of fun, and straightforwardly grew into me asking Scott if he wanted me and Eric to run the same thing for the ACX community. So, making up some numbers and hoping I can use Shapley values correctly, I estimate that you get 40% of the credit for this year's prediction contest happening.
Capturing Uncertainty in Prediction Markets

And one way to accomplish that would be to bet on what percentage of bets are on "uncertainty" vs. a prediction.

How do you plan on incentivising people to bet on "uncertainty"? All the ways I can think of lead to people either gaming the index, or turning uncertainty into a KBC.

Capturing Uncertainty in Prediction Markets

The market and most of the indicators you mentioned would be dominated by the 60 that placed large bets

I disagree with this. Volatility, liquidity, # predictors, spread of forecasts will all be affected by the fact that 20 people aren't willing to get involved. I'm not sure what information you think is being lost by people stepping away? (I guess the difference between "the market is wrong" and "the market is uninteresting"?)

What is being lost is related to your intuition in the earlier comment: Without knowing how many people of the "I've studied this subject, and still don't think a reasonable prediction is possible" variety didn't participate in the market, it's very hard to place any trust in it being the "right" price.

This is similar to the "pundit" problem where you are only hearing from the most opinionated people. If 60 nutritionists are on TV and writing papers saying eating fats is bad, you may draw the "wrong" conclusion from that, because unknown to you, 40 nutritionists believe "we just don't know yet". And these 40 are provided no incentives to say so.

Take the Russia-Kiev question on Metaculus, which had a large number of participants. It hovered at 8% for a long time. If prediction markets are to be useful beyond just pure speculation, that market didn't tell me how many knowledgeable people thought an opinion was simply not possible. The ontological skepticism signal is missing: people saying there is no right or wrong that "exists", we just don't know. So be skeptical of what this market says.

As for KBC: most markets allow you to change/sell your bet before the event happens, especially for longer-term events. So my guess is that this is already happening. In fact, the uncertainty index would separate out much of the "What do other people think?" element into its own question.

For locked-in markets like ACX, where the suggestion is to leave your prediction blank if you don't know, imagine every question being paired with "What percentage of people will leave this prediction blank?"
Capturing Uncertainty in Prediction Markets

There are a bunch of different metrics which you could look at on a prediction market / prediction platform to gauge how "uncertain" the forecast is:

  • Volatility - if the forecast is moving around quite a bit, there are two reasons:
    • Lots of new information arriving and people updating efficiently
    • There is little conviction around "fair value" so traders can move the price with little capital
  • Liquidity - if the market is 49.9 / 50.1 in millions of dollars, then you can be fairly confident that 50% is the "right" price. If the market is 40 / 60 with $1 on t
... (read more)
All these indicators are definitely useful for a market observer. And betting on these indicators would make for an interesting derivatives market - especially on higher volume questions. The issue I was referring to is that all these indicators are still only based on traders who felt certain enough to bet on the market. Say 100 people who have researched East-Asian geopolitics saw the question "Will China invade Taiwan this year?". 20 did not feel confident enough to place a bet. Of the remaining 80 people, 20 bet small amounts because of their lack of certainty. The market and most of the indicators you mentioned would be dominated by the 60 that placed large bets. A LOT of information about uncertainty would be lost. And this would have been fairly useful information about an event. The goal would be to capture the uncertainty signal of the 40 that did not place bets, or placed small bets. One way to do that would be to make "uncertainty" itself a bettable property of the question. And one way to accomplish that would be to bet on what percentage of bets are on "uncertainty" vs. a prediction.
Prediction Markets are for Outcomes Beyond Our Control

Prediction markets function best when liquidity is high, but they break completely if the liquidity exceeds the price of influencing the outcome. Prediction markets function only in situations where outcomes are expensive to influence.


There are a ton of fun examples of this failing:

Money-generating environments vs. wealth-building environments (or "my thoughts on the stock market")

I don't know enough about how equities trade during earnings, but I do know a little about how some other products trade during data releases and while people are speaking.

In general, the vast, vast, vast majority of liquidity is withdrawn from the market before the release. There will be a few stale orders people have left by accident + a few orders left in at levels deemed ridiculously unlikely. As soon as the data is released, the fastest players will generally send quotes making a (fairly wide) market around their estimate of the fair price. Over time (a... (read more)

Use Normal Predictions

I agree identifying model failure is something people can be good at (although I find people often forget to consider it). Pricing it they are usually pretty bad at.

Use Normal Predictions

I'd personally be more interested in asking someone for their 95% CI than their 68% CI, if I had to ask them for exactly one of the two. (Although it might again depend on what exactly I plan to do with this estimate.)

I'm usually much more interested in a 68% CI (or a 50% CI) than a 95% CI because:

  1. People in general aren't super calibrated, especially at the tails
  2. You won't find out for a while how good their intervals are anyway
  3. What happens most often is usually the main interest. (Although in some scenarios the tails are all that matters, so again, depends
... (read more)
Oh okay. Maybe I just haven't yet understood what you do with a 68% CI. re 1: maybe we just have different intuitions - I somehow feel people are always better at qualitative stuff than quantitative - and identifying model failure is more qualitative.
1Jan Christian Refsgaard7mo
I agree with both points. If you are new to continuous predictions then you should focus on the 50% interval, as it gives you the most information about your calibration. If you are skilled and use, for example, a t-distribution, then you have σ for the trunk and ν for the tail; even then few predictions should land in the tails, so most data should provide more information about how to adjust σ than how to adjust ν.

Hot take: I think the focus on 95% is an artifact of us focusing on p<0.05 in frequentist statistics.
Use Normal Predictions

Under what assumption?

1/ In what you've written above, you aren't "[assuming] the errors are normally distributed" (a mixture of two normals isn't normal).

2/ If your assumption is that the errors are standard normal then yes, I agree the median of the squared error is ~0.45 (although 

from scipy import stats
stats.chi2.ppf(.5, df=1)
>>> 0.454936

would have been an easier way to illustrate your point). I think this is actually the assumption you're making. [Which is a horrible assumption, because if it were true, you would already be perfectly calibrated].

3/ I guess ... (read more)

1Jan Christian Refsgaard7mo
Our ability to talk past each other is impressive :) Yes this is almost the assumption I am making, the general point of this post is to assume that all your predictions follow a Normal distribution, with μ as "guessed" and with a σ that is different from what you guessed, and then use $\chi^2$ to get a point estimate for the counterfactual σ you should have used. And as you point out if (counterfactual) σ=1 then the point estimate suggests you are well calibrated. In the post counterfactual σ is $\hat{\sigma}_z$
The Unreasonable Feasibility Of Playing Chess Under The Influence

I think the controversy is mostly irrelevant at this point. Leela performed comparably to Stockfish in the latest TCEC season and is based on Alpha Zero. It has most of the "romantic" properties mentioned in the post.

Not just in the latest TCEC season, they've been neck-and-neck for quite a bit now
Use Normal Predictions

That isn't a "simple" observation.

Consider an error which is 0.5 22% of the time, 1.1 78% of the time. The squared errors are 0.25 and 1.21. The median error is 1.1 > 1. (The mean squared error is 1)
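
The arithmetic in this counterexample can be checked directly. This is a small sketch using only the numbers given above; `weighted_median` is a helper name introduced here for illustration.

```python
# Two-point error distribution from the comment above.
errors = [0.5, 1.1]
probs = [0.22, 0.78]

# Mean squared error: probability-weighted average of the squared errors.
mean_squared_error = sum(p * e ** 2 for e, p in zip(errors, probs))

def weighted_median(values, weights):
    """Smallest value whose cumulative probability reaches 0.5."""
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= 0.5:
            return value
    raise ValueError("weights must sum to at least 0.5")

median_error = weighted_median(errors, probs)

print(f"mean squared error = {mean_squared_error:.4f}")
print(f"median error = {median_error}")
```

So a mean squared error of (about) 1 is compatible with a median error above 1.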

1Jan Christian Refsgaard7mo
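Yes, you are right, but under the assumption that the errors are normally distributed, I am right. If

$p \sim \mathrm{Bern}(0.78)$, $\sigma = p \times N(0, 1.1) + (1 - p) \times N(0, 0.5)$,

then the median of $\sigma^2$ is ≈ 0.37, which is much less than 1. Proof:

import numpy as np
import pandas as pd
from scipy import stats

# 22% of the errors come from N(0, 0.5), 78% from N(0, 1.1)
x1 = stats.norm(0, 0.5).rvs(22 * 10000)
x2 = stats.norm(0, 1.1).rvs(78 * 10000)
x12 = pd.Series(np.concatenate([x1, x2]))
# median of the squared errors
print((x12 ** 2).median())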
Use Normal Predictions

Metaculus uses the cdf of the predicted distribution which is better If you have lots of predictions, my scheme gives an actionable number faster

You keep claiming this, but I don't understand why you think this

Use Normal Predictions

If you suck like me and get a prediction very close then I would probably say: that sometimes happens :) Note I assume the average squared error should be 1, which means most errors are less than 1, because $(0^2+2^2)/2=2>1$

I assume you're making some unspoken assumptions here, because  is not enough to say that. A naive application of Chebyshev's inequality would just say that .

To be more concrete, if you were very weird, and either end up forecasting 0.5 s.d. or 1.1 s.d. away, (still with mean 0 and average... (read more)

1Jan Christian Refsgaard7mo
I am making the simple observation that the median error is less than one because the mean squared error is one.
Use Normal Predictions

Go to your profile page. (Will be something like{some number}/). Then in the track record section, switch from Brier Score to "Log Score (continuous)"

I upvoted all comments in this thread for constructive criticism, response to it, and in the end even agreeing to review each other!
Two ominous charts on the financial markets

The 2000-2021 VIX has averaged 19.7, sp500 annualized vol 18.1.

I think you're trying to say something here like 18.1 <= 19.7, therefore VIX (and by extension options) are expensive. This is an error. I explain in more detail here, but in short you're comparing expected variance and expected volatility, which aren't the same thing.
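
The gap between expected variance and expected volatility can be sketched with a toy example. The two volatility scenarios are made up, and treating a VIX-style index as the square root of expected variance is a simplification assumed here.

```python
import math

# Hypothetical scenarios for next year's realised volatility (annualised), equally likely.
vol_scenarios = [0.10, 0.30]
probs = [0.5, 0.5]

expected_vol = sum(p * v for p, v in zip(probs, vol_scenarios))
expected_variance = sum(p * v ** 2 for p, v in zip(probs, vol_scenarios))

# A variance-based index quotes sqrt(expected variance), which by Jensen's
# inequality is never below expected volatility, even with no risk premium.
sqrt_expected_variance = math.sqrt(expected_variance)

print(f"expected volatility     = {expected_vol:.4f}")
print(f"sqrt(expected variance) = {sqrt_expected_variance:.4f}")
```

In this toy setup the variance-based number sits above the average volatility purely because of convexity, so a gap like 19.7 vs 18.1 is not by itself evidence that options are expensive.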

From a 2ndary source: "The mean of the realistic volatility risk premium since 2000 has been 11% of implied volatility, with a standard deviation of roughly 15%-points" from

... (read more)
Use Normal Predictions

I still think you're missing my point.

If you're making ~20 predictions a year, you shouldn't be doing any funky math to analyse your forecasts. Just go through each one after the fact and decide whether or not the forecast was sensible with the benefit of hindsight.

I am even explaining what a normal distribution is because I do not expect my audience to know...

I think this is exactly my point, if someone doesn't know what a normal distribution is, maybe they should be looking at their forecasts in a fuzzier way than trying to back fit some model to them.


... (read more)
8Jan Christian Refsgaard7mo
I would love you as a reviewer of my second post, as there I will try to justify why I think this approach is better. You can even super dislike it before I publish if you still feel like that when I present my strongest arguments, or maybe convince me that I am wrong so I don't publish part 2 and make a partial retraction for this post :). There is a decent chance you are right, as you are the stronger predictor of the two of us :)
Use Normal Predictions

I disagree with that characterisation of our disagreement, I think it's far more fundamental than that.

  1. I think you misrepresent the nature of forecasting (in its generality) versus modelling in some specifics
  2. I think your methodology is needlessly complicated
  3. I propose what I think is a better methodology

To expand on 1. I think (although I'm not certain, because I find your writing somewhat convoluted and unclear) that you're making an implicit assumption that the error distribution is consistent from forecast to forecast. Namely your errors when forecastin... (read more)

I am sorry if I have straw manned you, and I think your above post is generally correct. I think we are coming from two different worlds.

You are coming from Metaculus, where people make a lot of predictions, where having 50+ predictions is the norm, and thus looking at a U(0, 1) gives a lot of intuitive evidence of calibration.

I come from a world where people want to improve in all kinds of ways, and one of them is prediction; few people write more than 20 predictions down a year, and when they do they more or less ALWAYS make dichotomous predictions. I ... (read more)

Two ominous charts on the financial markets

d/ is actually completely consistent with the vol market (I point this out here), so it's not clear that's their recommendation.

I should have checked before posting, thanks for this.
Use Normal Predictions

If you think 2 data points are sufficient to update your methodology to 3 s.f. of precision I don't know what to tell you. I think if I have 2 data points and one of them is 0.99 then it's pretty clear I should make my intervals wider, but how much wider is still very uncertain with very little data. (It's also not clear if I should be making my intervals wider or changing my mean too)

3Jan Christian Refsgaard7mo
I don't know what s.f. is, but the interval around 1.73 is obviously huge, with 5-1-0 data points it's quite narrow if your predictions are drawn from N(1, 1.73), that is what my next post will be about. There might also be a smart way to do this using the Uniform, but I would be surprised if its dispersion is smaller than a chi^2 distribution :) (changing the mean is cheating, we are talking about calibration, so you can only change your dispersion)
Use Normal Predictions

you are missing the step where I am transforming arbitrary distribution to U(0, 1)

I am absolutely not missing that step. I am suggesting that should be the only step.

(I don't agree with your intuitions in your "explanation" but I'll let someone else deconstruct that if they want)

4Jan Christian Refsgaard7mo
Hard disagree. From two data points I calculate that my future intervals should be 1.73 times wider; converting these two data points to U(0,1) I get [0.99, 0.25]. How should I update my future predictions now?
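
For readers following the numbers: one way a figure near 1.73 could arise from the percentiles [0.99, 0.25] is sketched below. This is an assumption about the estimator (root mean square of the z-scores); the exact formula is defined in the original post, not in this thread.

```python
import math
from statistics import NormalDist

# The two percentiles quoted in the comment above.
percentiles = [0.99, 0.25]

# Map each percentile to a z-score via the inverse CDF of the standard normal.
standard_normal = NormalDist()
z_scores = [standard_normal.inv_cdf(u) for u in percentiles]

# Assumed estimator: root mean square of the z-scores, i.e. the factor by which
# the forecast standard deviations would need to be scaled.
sigma_hat = math.sqrt(sum(z ** 2 for z in z_scores) / len(z_scores))

print([round(z, 3) for z in z_scores])
print(round(sigma_hat, 2))
```

With only two data points, almost all of the estimate comes from the single 0.99 percentile, which is the concern raised earlier in the thread about how little such an estimate pins down.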
Use Normal Predictions

you need less data to check whether your squared errors are close to 1 than whether your inverse CDF look uniform

I don't understand why you think that's true. To rephrase what you've written:

"You need less data to check whether samples are approximately N(0,1) than if they are approximately U(0,1)"

It seems especially strange when you think that transforming your U(0,1) samples to N(0,1) makes the problem soluble.

3Jan Christian Refsgaard7mo
TLDR for our disagreement:
SimonM: Transforming to Uniform distribution works for any continuous variable and is what Metaculus uses for calibration.
Me: The variance trick to calculate $\sigma_z$ from this post is better if your variables are from a Normal distribution, or something close to a normal.
SimonM: Even for a Normal the Uniform is better.
2Jan Christian Refsgaard7mo
you are missing the step where I am transforming arbitrary distribution to U(0, 1)

Medium confident in this explanation: because the square of random variables from the same distribution follows a gamma distribution, and it's easier to see violations from a gamma than from a uniform. If the majority of your predictions are from weird distributions then you are correct, but if they are mostly from normal or unimodal ones, then I am right. I agree that my solution is a hack that would make no statistician proud :)

Edit: Intuition pump: a T(0, 1, 100) obviously looks very normal, so transforming to U(0, 1) and then to N(0, 1) will create basically the same distribution. The square of a bunch of normals is Chi^2, so the Chi^2 is the best distribution for detecting violations. Obviously there is a point where this approximation sucks and U(0, 1) still works.
Use Normal Predictions

(If this makes no sense, then ignore it): Using an arbitrary distribution for predictions, then use its CDF (Universality of the Uniform) to convert to $U(0, 1)$, and then transform to z-score using the inverse CDF (percentile point function) of the Unit Normal. Finally use this as  in when calculating your calibration.

Well, this makes some sense, but it would make even more sense to do only half of it.

Take your forecast, calculate its percentile. Then you can do all the traditional calibration stuff. All this stuff with z-scores is needless... (read more)
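
A rough sketch of the percentile approach described above. The forecasts and outcomes are made up, and normal forecast distributions are used only for convenience (any CDF would do); `coverage` is a helper name introduced here.

```python
from statistics import NormalDist

# Hypothetical forecasts as (mean, standard deviation) and the resolved outcomes.
forecasts = [(10.0, 2.0), (50.0, 10.0), (0.0, 1.0), (100.0, 25.0), (5.0, 0.5)]
outcomes = [11.0, 38.0, 0.3, 140.0, 5.1]

# Percentile of each outcome under its own forecast CDF.
# For a well-calibrated forecaster these should look like draws from U(0, 1).
percentiles = [NormalDist(mu, sigma).cdf(x) for (mu, sigma), x in zip(forecasts, outcomes)]

def coverage(percentiles, level):
    """Fraction of outcomes that landed inside the central `level` interval."""
    lower, upper = (1 - level) / 2, (1 + level) / 2
    inside = sum(lower <= u <= upper for u in percentiles)
    return inside / len(percentiles)

for level in (0.5, 0.9):
    print(f"central {level:.0%} interval: observed coverage {coverage(percentiles, level):.0%}")
```

With enough resolved questions, comparing observed coverage to the nominal level (or plotting a histogram of the percentiles) is the traditional calibration check; five data points, as here, are far too few to conclude anything.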

1Daniel S7mo
Where can I access this for my profile on Metaculus? I have everything unlocked but don't see it in the options.
4Jan Christian Refsgaard7mo
Can I use this image for my "part 2" posts, to explain how "pros" calibrate their continuous predictions, and how it stacks up against my approach? I will add you as a reviewer before publishing so you can make corrections in case I accidentally straw man or misunderstand you :) I will probably also make a part 3 titled "Try t predictions" :), which should address some of your other critiques about the normal being bad :)
4Jan Christian Refsgaard7mo
This is a good point, but you need less data to check whether your squared errors are close to 1 than whether your inverse CDF look uniform, so if the majority of predictions are normal I think my approach is better. The main advantage of SimonM/Metaculus is that it works for any continuous distribution.
Two ominous charts on the financial markets

I absolutely considered writing about the difference between risk-neutral probabilities and real-world probabilities in this context but decided against because: Over the course of a year, the difference is going to be small relative to the width of the forecast

I'd be interested to hear if you think the differences would be material to my point, i.e. that [-60%, +30%] isn't a ~90% range for stock returns next year and that his forecast is not materially different to what the market is forecasting.

1Clark Benham7mo
The 2000-2021 VIX has averaged 19.7, sp500 annualized vol 18.1.

From a 2ndary source: "The mean of the realistic volatility risk premium since 2000 has been 11% of implied volatility, with a standard deviation of roughly 15%-points". So 1/3 of the time the premium is outside [-4%, 26%], which swamps a lot of VIX info about true expected vol.

-60% would be the worst drawdown ever, so the prior should be <<1%. However, 8 years have been above 30% since 1928 (9%), so it seems you're using a non-symmetric CI.

The reasoning for why there'd be such a drawdown is backwards in OP: because real rates are so low, the returns for owning stocks have declined accordingly. If you expect 0% rates and no growth, stocks are priced reasonably, yielding 4%/year more than bonds. Thinking in the level of rates, not changes to rates, makes more sense, since investments are based on current projected rates. A discounted cash flow analysis works regardless of how rates change year to year. Currently the 30yr is trading at 2.11%, so real rates around the 0 bound is the consensus view.
Use Normal Predictions

I think you're advocating two things here:

  1. Make a continuous forecast when forecasting a continuous variable
  2. Use a normal distribution to approximate your continuous forecast

I think that 1. is an excellent tip in general for modelling. Here is Andrew Gelman making the same point

However, I don't think it's actually always good advice when eliciting forecasts. For example, fairly often people ask whether or not they should make a question on Metaculus continuous or binary. Almost always my answer is "make it binary". Binary questions get considerably more inte... (read more)

6Jan Christian Refsgaard7mo
Agreed 100% on 1), and with 2) I think my point is "start using the normal predictions as a gateway drug to over-dispersed and model-based predictions".

I stole the idea from Gelman and simplified it for the general community. I am mostly trying to raise the sanity waterline by spreading the gospel of predicting on the scale of the observed data. All your critiques of normal forecasts are spot on. Ideally everybody would use mixtures of over-dispersed distributions or models when making predictions to capture all sources of uncertainty.

It is my hope that by educating people in continuous prediction the Metaculus trade-off you mention will slowly start to favor the continuous predictions because people find it as easy as binary prediction... but this is probably a pipe dream, so I take your point.
Two ominous charts on the financial markets

That's not a chart of "real rates", that's the spread between a 10y rate and a spot inflation estimate. Real rates is (ideally) the rate paid on an inflation linked bond, or at least the k-year rate minus the k-year forecast inflation. The BoE have historic data here going back to '85 and the rally is several hundred basis points less than your chart implies.

Two ominous charts on the financial markets

Sorry, I should have made that more clear. I am talking about the period since the start of the interest rate decline (mid 1980s to today).

I think you're going to have to be more explicit about what time period you're forecasting your market collapse. (Or whatever it is you're forecasting, it's still not clear to me).

Let me try to rephrase that: I think we will be seeing a fundamental change in the financial markets due to an end to the 35year long reduction of real interest rates. And most of the actors have only known investing in an environment with mor

... (read more)
Two ominous charts on the financial markets

I'm not sure what you consider to be "neutral" to hold, but forward returns for holding cash don't look great either.

(I'm also not sure what you're trying to say about Warren Buffett, can you be more explicit)

Two ominous charts on the financial markets

tl;dr all your conclusions are equally consistent with equity returns being similar to the past going forward

a) To infer future average equity returns from the past decades seems to me quite dangerous.

Agreed, although what else do we have?

b) The valuation level (Shiller CAPE) and the elimination of the interest rate reduction effect for stock valuations indicate that expected future equity returns will probably be significantly lower than those of past decades.

Which decades are you looking at? As recently as the decade before last (2000s) we had negative r... (read more)

1Clark Benham7mo
Options nitpick: You can't use equity index* option prices as true probabilities, because adding hedges to a portfolio makes the whole portfolio more valuable. People then buy options based on their value when added to the portfolio, not as individual investments.

The first reason option hedges make your portfolio more valuable is preferences: people don't just want to maximize their expected return, but also to reduce the chance they go broke. People don't like risk and hedges reduce risk, ergo they pay more to get rid of risk. However, you can't just subtract X vols to adjust, as this "risk premia" isn't constant over time.

Secondly, hedges maximize long-term returns (or why you shouldn't sell options). You want to maximize your average geometric annual return, not your average annual return. You care about geometric averages because if for 3 years your returns were +75%, +75%, -100%, you don't have 50% more money than when you started, but 0. The average of annual returns was 10.7% over the past 30 years, but if you'd invested in 1992 you'd've only compounded at 8.5%/year till 2022. Geometric returns are the nth root of a product of n numbers and have the approximation: geometric return ≈ average annual return - (variance/2). If you could reduce variance and not reduce annual returns, your portfolio (market + hedges) would grow faster than the market.

These reasons are why, despite the worst annual return being -48% in 1931, you say there's a 5% chance of > -50% returns based on option markets.

*I'm specifically talking about index options because that's the portfolio investors have (or something similar) and the total is what they care about. If you were to use prices as true probabilities for, say, a merger going through, these reasons don't apply as much and the prices would be more accurate.

PS. I've referred to investors as all having the same portfolio because most people do have highly correlated index holdings, and it's at this level of generality that you can think about investors as a class.
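
The geometric-return argument above can be sketched numerically. The four-year return series is made up for illustration; the +75%, +75%, -100% sequence is the one from the comment.

```python
import math
from statistics import mean, pvariance

def geometric_return(returns):
    """Compound annual growth rate implied by a sequence of simple annual returns."""
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (1.0 / len(returns)) - 1.0

# Hypothetical annual returns, used only to illustrate the approximation.
returns = [0.10, -0.05, 0.20, 0.02]

arithmetic = mean(returns)
geometric = geometric_return(returns)
# Approximation quoted above: geometric ≈ arithmetic - variance / 2.
approximation = arithmetic - pvariance(returns) / 2

print(f"arithmetic = {arithmetic:.4f}, geometric = {geometric:.4f}, approximation = {approximation:.4f}")

# The example from the comment: a single -100% year wipes out everything,
# whatever the other years' returns were.
wipeout = [0.75, 0.75, -1.00]
final_wealth = math.prod(1.0 + r for r in wipeout)
print(f"final wealth multiple after wipeout sequence = {final_wealth}")
```

The approximation is only accurate when returns are small relative to 1; for the wipeout sequence it breaks down completely, which is part of why tail hedges can matter more than their effect on the average suggests.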
Sorry, I should have made that more clear. I am talking about the period since the start of the interest rate decline (mid 1980s to today).

Let me try to rephrase that: I think we will be seeing a fundamental change in the financial markets due to an end to the 35-year-long reduction of real interest rates. And most of the actors have only known investing in an environment with more or less constantly decreasing real interest rates. I believe this combination could lead to widespread panic in the markets once the people making investment decisions realize that they no longer know how the markets react in the new environment.

I believe that this rationale for investing in equities is quite widespread today. Obviously, there is an alternative (accepting secure negative real returns), but in order to avert guaranteed losses investors take on risk. I am not saying that this is necessarily the wrong strategy, but it poses an interesting question: How will investors with this motivation for holding stocks react in a downturn?
When required to be fully invested, this is true-ish. However, you can sit in cash while no appealing investments exist, and buy in size when prices become more appealing. inb4 "market timing is not possible": have a look at Warren Buffett's track record and the amount of cash he held in early 2000 and now.
Scott Alexander 2021 Predictions: Market Prices - Resolution

OK, so I am obviously biased but I'll look to see if I think this is fair.


Yeah, this is definitely my bad. I didn't ask you (or Scott) whether or not you were happy with me comparing your comments to market forecasts. I apologise. I also didn't intend to make this as normative as it sounds. (FWIW in the past I have gone to bat for your forecasting skills and given your forecast and a market forecast most of the time I would expect to update away from the market and towards you)

I'll let Simon decide what to do with the rest. I also find it super weird

... (read more)
Scott Alexander 2021 Predictions: Market Prices - Resolution

So I'm a little worried we've used different sources for your forecasts, but to explain where we differ:

  1. We agree
  2. We agree
  3. We agree
  4. Happy to change your number, although your forecast was: "Depending on what counts as ‘recalled’ this is either at least 10%, or it’s damn near 0%. I don’t see how you get 5%. Once you get an election going, anything can happen. Weird one, I’d need more research."  Which I averaged to 5%. Happy to change to 1%?
  5. We agree
  6. "It’s definitely a thing that can happen but there isn’t that much time involved, and the timing doesn’t seem
... (read more)
9Zvi8mo [] is the canonical version. Surprised the differences were this big. The struggle on knowing when to update all versions is real, especially now that there's 3x. Then beyond that your decisions seem fine. And no need to apologize for doing the exercise, it's good to check things, long as it's clear what's being done. When/if I do predictions for 2022 I'll see what I can do about also including explicit fairs (and ideally, where I'd call BS on a market, and where I wouldn't).
Scott Alexander 2021 Predictions: Market Prices - Resolution

If anyone can figure out how to format that table, I would appreciate it, thanks!

I had been trying to format tables on LW for a while, gave up, and started using images.
Retail Investor Advantages

I don't think you've found the most unbiased description of PFOF out there

Combining Forecasts

I realise you've been very careful to avoid mentioning any explicit average in your section on "Combining External Forecasts". I was wondering if you had any thoughts on mean-vs-median (links below)

I was also wondering if you had any thoughts on extremising the forecasts you're ensembling too. (The classic example of 4 people all forecasting 60% but all based on independent information)
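
A small sketch of the extremising example. The `factor` parameter and the choice to work in log-odds are assumptions made for illustration; with a 50% prior and fully independent evidence, summing log-odds corresponds to a factor equal to the number of forecasters.

```python
import math
from statistics import mean, median

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

def inv_logit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def extremised(forecasts, factor):
    """Average the forecasts in log-odds space, scale by `factor`, map back to a probability."""
    average_log_odds = mean([logit(p) for p in forecasts])
    return inv_logit(factor * average_log_odds)

# The classic example: 4 people all forecasting 60%.
forecasts = [0.6, 0.6, 0.6, 0.6]

simple_mean = mean(forecasts)
simple_median = median(forecasts)
no_extremising = extremised(forecasts, factor=1.0)

# Assumption: a 50% prior and fully independent information, so the log-odds add up.
fully_independent = extremised(forecasts, factor=len(forecasts))

print(f"mean = {simple_mean:.3f}, median = {simple_median:.3f}")
print(f"factor 1: {no_extremising:.3f}, factor {len(forecasts)}: {fully_independent:.3f}")
```

Mean and median both return 60% here, while the fully independent combination lands well above it; real forecasters share information, so a sensible factor would presumably sit somewhere between 1 and the number of forecasters.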

Retail Investor Advantages

I'm afraid you're confused about how PFOF works. It's absolutely not about "frontrunning trades"

1 ChristianKl 8mo
[] writes: To me, that sounds like leogao's description of PFOF as being due to misinformed traders is wrong.
Retail Investor Advantages

Okay, but your examples are now all the same as your "2." (which I don't disagree with). Size isn't the advantage here, it's being able to be involved in weird things. (I was disagreeing with your point "3.")

Fair enough. I suppose I'm having trouble coming up with examples of opportunities that are both not weird, and also not systematizable. (Though I do think evaluation of individual penny stocks counts.) I'm keeping that as separate from 2 though because I think that if you do find something like that, the retail trader is potentially advantaged. And in general, I think it's true on a spectrum — the more capacity a strategy has, the more you shouldn't expect to beat the market with it. I think of the market as like an ecosystem. If you look at a cubic meter of rainforest, there's a ton of activity going on at different levels, from bacterium up to tree. Different organisms are taking advantage of different metabolic opportunities of different sizes and types (and their activity provides opportunities to each other). Each creature has its niche. I think of the market as like that. You've got big long-term macro funds taking positions that last for months. And you've got little nimble HFT shops making money off of the big slow macro fund's predictable-on-short-timescales trading behavior. And I claim the retail trader can potentially find a niche here too. And part of what they should look for are opportunities that are not worth the time of bigger firms. (Though note that this might just mean that this retail trader is currently being undervalued, if they can find opportunities that are worth their while, but wouldn't be worth the while of an employee at a firm.)
Retail Investor Advantages

Small size means you can look for opportunities with a good return, but low capacity (e.g. some opportunity that could turn 10k into 20k, but couldn't turn 10M into 20M). I think this is a much bigger deal than the low slippage advantage that comes from small size.

I'm kinda curious as to what sort of opportunities people think these are (especially in developed markets)

The sorts of things which have low enough variance to be "good" trades without doing them systematically would require large, concrete mispricings. I struggle to see how the opportunity is li...

These opportunities are going to be especially not in developed markets. Or not things that a firm can do systematically. I agree that 10k (or much less) profit per trade is not too small for an HFT shop, if it's part of an overall strategy that does many trades. The capacity of a trading strategy isn't how much it makes per trade, but how much capital you can productively allocate to it over its lifetime. And a firm is not going to devote weeks of an employee's time to developing a strategy that's only ever going to make 10k. Instead, I'm thinking of weird, one-off things like:

  • taking advantage of credit card sign-up bonus arbitrages
  • deciding that a particular house for sale (not the whole housing sector) is undervalued
  • speculating on a rare book, or piece of art, or other collectible
  • betting on obscure cryptocurrencies that you've done some analysis of
  • taking advantage of DeFi yield farming schemes

These are generally going to be weird small things that a traditional firm can't easily trade in a systematic way, such that it's not worth their time to look into them. Note that it might not be worth the retail investor's time either, depending on how they value their time and their opportunity costs. But in some cases I think you can stumble upon knowledge that you can then take advantage of, without worrying that your knowledge / reasoning is mistaken because there shouldn't be a $20 bill on the sidewalk. If you stumble upon some information that makes you think the S&P500 is undervalued, you should be a lot more skeptical of that than of some analysis that suggests some obscure collectible / penny stock / cryptocurrency / NFT is undervalued.
Retail Investor Advantages

Two of those "advantages" aren't as much "advantages" as the market telling you that it thinks it knows better than you. The fact that you have lower trading costs and lower slippage (actually the same thing) is because the market doesn't respect you.

Re: information acquisition cost. Sure, you might have one small piece of information that BigTradingFirm doesn't have, but they have plenty of information you don't have. The relative value of the information is what matters.  

Suppose as a domain expert you highly suspect company X will fail within timeframe Y. This company is pretty small and there is a reasonable amount of irreducible uncertainty so you (or anyone else) could make a maximum of $10k off of this bet. It costs you ~nothing on the margin to take this opportunity, but it would cost BigFund more than $10k in opportunity cost to acquire this information and act on it, so it's not worth it to them to bother with it. Also, the market underestimating me is a good thing for my bottom line.
“Sure, you might have one small piece of information that BigTradingFirm doesn't have, but they have plenty of information you don't have. The relative value of the information is what matters.” As an example, let’s say you’re a scientist who works in the field of bioprinting. A new company IPOs, planning to make artificial tissue for transplant via bioprinting. You’ve been working with similar technology for 25 years, know the founder personally, and are certain that the tech won’t work and the founder’s a dishonest-yet-charismatic person with a history of exploiting others to make themselves look good. So you short the stock. A hedge fund doesn’t have your experience. But they do have lots of information about your industry, historical performance of companies in this sector, advisors (including from your peers), regulatory insight, and much more. They understand that the CEO can be replaced, the product can pivot, etc. They have better overall judgment about how to weigh and synthesize all the information about the company into a prediction about where the price will go.
5 Ege Erdil 8mo
To elaborate on the information acquisition cost point: small pieces of information won't be worth tying up a big amount of capital for. If you have a company worth $1 billion and you have very good insider info that a project of theirs that the market implicitly values at $10 million is going to flop, then if the only way you can express that opinion is to short the stock of the whole company, that's likely not even worth it. Even with 10% margin you'd be at best making a 10% return on capital over the time horizon that the market figures out the project is bad (maybe O(1) years), and that mean return would come with way more risk than just buying into the S&P 500, so your Sharpe would be much worse. In general this kind of trading is only worth it if your edge over the market is big enough. If you just know something the market doesn't know, that's not very useful unless you can find someone to bet on that exact thing rather than have to involve a ton of other variance in your trades, and even if you try to do that people can figure out what you're up to and refuse to take the other side of your trades anyway.
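
The arithmetic behind the "10% return on capital" figure can be sketched as follows. The function names are hypothetical, and the 30% annual stock volatility is a made-up number used only to illustrate why the Sharpe ratio ends up poor; it is not from the comment. Borrow costs and the risk-free rate are ignored.

```python
def short_trade_return(company_value, project_value, margin):
    """Best-case return on posted capital from shorting the whole company
    when only one project (worth project_value) goes to zero."""
    price_move = project_value / company_value  # fraction the stock should fall
    return price_move / margin                  # leverage is 1 / margin


def rough_sharpe(expected_return, stock_vol, margin):
    """Very rough Sharpe ratio: the stock's volatility is levered up by 1 / margin.
    stock_vol is an assumed input, not something the comment specifies."""
    vol_on_capital = stock_vol / margin
    return expected_return / vol_on_capital


# Numbers from the comment: $1bn company, $10m project, 10% margin.
ret = short_trade_return(company_value=1e9, project_value=1e7, margin=0.10)

# Hypothetical 30% annual volatility for the stock.
sharpe = rough_sharpe(ret, stock_vol=0.30, margin=0.10)

print(f"best-case return on capital: {ret:.1%}")
print(f"rough Sharpe under the assumed volatility: {sharpe:.3f}")
```

Margin cancels out of the ratio, so leverage raises the return without improving the risk-adjusted picture: the edge is a 1% move against whatever the whole stock's volatility is.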
Base Rates and Reference Classes

I did a similar calculation not just for the base rate of completing his term, but of being the next nominee and the next US President a while back

Average probabilities, not log odds

I think it would be perhaps helpful to link to a few people advocating averaging log-odds rather than averaging probabilities, eg: 

Personally, I see this question as being an empirical question. Which method works best?

In the cases I care about, both averaging log odds and taking a median far outperform taking a mean. (Fwiw Metaculus agrees that it's a very safe bet too)
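
Since this is an empirical question, one way to check it on your own data is to score each aggregation method against resolved questions. A minimal sketch: the forecasts and resolutions below are invented purely to show the mechanics, and the names `mean_log_odds` and `log_score` are my own, so the printed scores say nothing about which method actually wins in practice.

```python
import math
from statistics import mean, median


def logit(p):
    return math.log(p / (1 - p))


def inv_logit(x):
    return 1 / (1 + math.exp(-x))


def mean_log_odds(probs):
    """Average the forecasts in log-odds space, then map back to a probability."""
    return inv_logit(mean(logit(p) for p in probs))


def log_score(p, outcome):
    """Log score of a probability for a binary outcome (closer to 0 is better)."""
    return math.log(p if outcome else 1 - p)


# Invented data: (individual forecasts, did the event happen?)
questions = [
    ([0.90, 0.95, 0.60, 0.99], True),
    ([0.10, 0.05, 0.30, 0.02], False),
    ([0.70, 0.80, 0.40, 0.90], True),
]

methods = {"mean": mean, "median": median, "mean_log_odds": mean_log_odds}

scores = {
    name: mean(log_score(aggregate(probs), outcome) for probs, outcome in questions)
    for name, aggregate in methods.items()
}

for name, score in scores.items():
    print(f"{name:14s} average log score: {score:.4f}")
```

Swapping `questions` for a real set of resolved forecasts is the whole exercise; the comparison only means something with enough questions.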

In contrast, there are no conditions under which average log odds is the co

...
Thanks for the links! Contrived how? What additional structure do you imagine I added? In what sense do you claim that averaging log odds preserves additivity of probability for disjoint events in the face of an example showing that the straightforward interpretation of this claim is false? It isn't; you can tell because additivity of probability for disjoint events continues to hold after Bayesian updates. [Edit: Perhaps a better explanation for why it isn't a Bayesian update is that it isn't even the same type signature as a Bayesian update. A Bayesian update takes a probability distribution and some evidence, and returns a probability distribution. Averaging log-odds takes some finite set of probabilities, and returns a probability]. I'm curious what led you to believe this, though.
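
To illustrate the additivity point, here is a small sketch with made-up numbers: two forecasters assign probabilities to three mutually exclusive, exhaustive outcomes. Averaging log-odds outcome by outcome (with no renormalisation step, which is an assumption about what "averaging log odds" means here) gives numbers that no longer sum to 1, whereas averaging probabilities does.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def inv_logit(x):
    return 1 / (1 + math.exp(-x))


# Hypothetical forecasts over three disjoint, exhaustive outcomes.
forecaster_a = [0.8, 0.1, 0.1]
forecaster_b = [0.1, 0.8, 0.1]

# Pool each outcome separately by averaging log-odds.
pooled_log_odds = [
    inv_logit((logit(a) + logit(b)) / 2)
    for a, b in zip(forecaster_a, forecaster_b)
]

# Pool each outcome separately by averaging probabilities.
pooled_mean = [(a + b) / 2 for a, b in zip(forecaster_a, forecaster_b)]

print("log-odds pooling:", [round(p, 3) for p in pooled_log_odds],
      "sum =", round(sum(pooled_log_odds), 3))
print("mean pooling:    ", [round(p, 3) for p in pooled_mean],
      "sum =", round(sum(pooled_mean), 3))
```

In this example the log-odds pool assigns 0.4, 0.4 and 0.1, which sums to 0.9, so the pooled probability of "outcome 1 or outcome 2" depends on whether you pool before or after taking the union.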
Worth checking your stock trading skills

I figured that's the first thing someone would think of upon hearing "7x" which is why I mentioned "This was done using a variety of strategies across a large number of individual names" in the OP.

Right, I wasn't disagreeing with you, just explaining why 7x isn't strong evidence in my own words.

Can you please give some examples of such people? I wonder if there are any updates or lessons there for me.

Yes, but I don't think there's a huge amount of value in doing that. If you spend any time following stock touts on twitter / stock picking forums etc you wil...

The people I follow generally don't advertise their track record? For the hedge fund manager I mentioned, I had to certify that I'm an accredited investor and sign up for his fund letters to get his past returns. For the ones that do, e.g., paid services on SeekingAlpha that advertise past returns, it has not been my experience that they "then fail to do so out of sample" (at least the ones that passed my filter of being worth subscribing to). Personally, I wish I had seen a post like this 10 years ago. My guess is that there's at least 2 or 3 people on LW who could become good traders if they tried. Even if 10 times that many people try and don't succeed, that seems overall a win from my perspective, as the social/cultural/signaling and monetary gains from the winners more than offset the losses. In part I want LW to become a bigger cultural force, and clear success stories that can't be dismissed as "luck" seem very helpful for that. Pre-tax. Maybe try some of my tips, if you haven't already? :)
Worth checking your stock trading skills

Without even checking, I can think of a bunch of assets which 7x'ed since Jan 2020. (BTC/general crypto, TSLA, GME/AMC etc). So yes, I agree this depends on the portfolio you ran.

Personally, I have seen enough people claim to outperform but then fail to do so out of sample (I mean out of sample for me, not for them) to doubt any claim on the internet without a trading record.

Either way, I think it's very hard to convince me with just ~1.5 years of evidence that you have edge. I think if you showed me ~1k trades with some sensible risk parameters at all times, then I could be convinced. (Or if in another year and a half you have $300mm because you've managed to 7x your small HF AUM, I will be convinced).

I figured that's the first thing someone would think of upon hearing "7x" which is why I mentioned "This was done using a variety of strategies across a large number of individual names" in the OP. Just to further clarify, I have some exposure to crypto but I'm not counting it for this post, I bought some TSLA puts (forgot whether I made a profit overall), and didn't touch AMC. I had a 0.1% exposure to some GME calls which went to 1% of my portfolio and that's the only involvement there. Can you please give some examples of such people? I wonder if there are any updates or lessons there for me. I don't think I've done that many trades (depending on how you define a trade, e.g., presumably accumulating a position across different days doesn't count as separate trades). Maybe in the low hundreds? But why would you need ~1k trades to verify that I was not doing particularly high variance strategies? I guess this is mostly academic though, as it would take a lot of labor to parse my trade logs and understand the underlying market mechanics to figure out what I was doing and how much risk I was taking (e.g., some pair/arbitrage trades were spread across several brokers depending on where I could find borrow). I don't suppose you'd actually want to do this? (I also have some privacy concerns on my end, but maybe could be persuaded if the "value added" in doing this seems really high.) I'm definitely not expecting such high returns going forward. ("600% return" was meant to be Bayesian evidence to update on, not used to directly set expectations. I thought that went without saying around here...) Obviously there was a significant amount of luck involved, for example as I mentioned the market was particularly inefficient last year. One of the hedge fund managers I follow had returns similar to mine this year and last year, but not in the years before that. I'd guess 20-50% above market returns is a realistic expectation if market conditions stay similar to today's, and
Worth checking your stock trading skills

Everyone else has already pointed out that you misunderstood what EMH states, so I won't bother adding to their chorus. (Except to say I agree with them).

I will also disagree with:

at most one-in-five people [...] It should therefore probably update us nontrivially away from the possibility that the post author just got lucky.

1 in 5 isn't especially strong evidence. How many of the other 4 people would you expect to be publishing on the internet saying "You should trade stocks"?

(You're not wrong, but I wanted to flag: the way I read John's comment, the word "nontrivially" already admitted this. If he thought it was strong evidence I'd expect him to have used a stronger word. Nothing wrong with adding clarification, but I don't particularly think you're disagreeing with him on this point.)
I agree this isn't a very strong argument. I think theoretically we can probably get a much tighter probability bound than 20% by looking directly at the variance of my strategy, and concluding that given that variance, the probability of getting 600% return by chance (assuming expected return = market return) is <p for some smaller p. But in practice I'm not sure how to compute this variance. Intuitively I can point to the fact that my portfolios did not have very high leverage/beta, nor did I put everything into a single or very few highly volatile stocks or sectors, which are probably the two most common high variance strategies people use. (Part of the reason for me writing this post is that while LW does have a number of people who achieved very high investment returns, they all AFAIK did it by using one of these two methods, which makes it hard to cite them as exemplifying the power of rationality.) Assuming the above is still not very convincing, I wonder what kind of evidence would be...
Risk Premiums vs Prediction Markets

I don't really know how you incentivise people (seriously) in the non-real money prediction markets.

Non-money prediction markets have lots of difficulties to them:

  • How do you size your bet? (I.e. knowing a probability vs. just "higher" or "lower" than the market estimate)
  • Difficult to arbitrage (ie share information between markets)
  • How do you show your conviction (this is 50% and I'm certain it's a coin flip vs this is 50% because I don't understand the question)
Worth checking your stock trading skills

I don't know enough about the etiquette here, but I am having to fight the urge to post a bunch of memes along the lines of "It's not the bull market, I really am a genius".

I would strongly advise anyone who's considering following this to consider doing this with considerably less than their whole portfolio and with much lower expectations than 7x'ing your money.

A lot of people respond to things like this post by assuming that the author was lucky. This is usually correct, at least when applied to random claimants on the interwebs, but we can put some bounds on it: the higher the returns, the more luck required to achieve them, assuming an efficient market. The efficient markets model says that any strategy during this period had expected return of 50%. So, if the post author used a strategy with probability p of achieving 600% returns, and 1-p of losing everything, then efficient markets implies p*(100+600) + (1-p)*0 = (100+50), i.e. p = 0.21 (roughly). This is the highest probability any strategy could have of achieving 600% returns during this period, without exploiting any market inefficiency. In other words: at most one-in-five people could achieve returns that high without exploiting market inefficiency. It should therefore probably update us nontrivially away from the possibility that the post author just got lucky. (Though depending on your priors, you might still think the post author got lucky.)
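
The bound in the comment above can be checked in a couple of lines. This is a sketch of the all-or-nothing case it describes, with a hypothetical function name: a strategy that either hits the target return or loses everything, constrained to have the market's expected return.

```python
def max_lucky_probability(market_return, target_return):
    """Largest probability of hitting target_return for an all-or-nothing strategy
    whose expected return equals market_return.

    Solves p * (1 + target_return) + (1 - p) * 0 = 1 + market_return for p.
    """
    return (1 + market_return) / (1 + target_return)


# Numbers from the comment: 50% expected market return, 600% target return.
p = max_lucky_probability(market_return=0.50, target_return=6.00)
print(f"p = {p:.3f}")  # 150 / 700
```

As far as I can tell this is the same bound Markov's inequality gives for any strategy that cannot lose more than 100%, which is why the all-or-nothing strategy is the extreme case.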