Thoughts on the SPIES Forecasting Method?

So, what do you think? Does this method seem at all promising? I'm debating with myself whether I should begin using SPIES on Metaculus or elsewhere.

I'm not super impressed, tbh. I don't see "give a 90% confidence interval for x" as a question that comes up frequently. (At least in the context of eliciting forecasts and estimates from humans - it comes up quite a bit in data analysis.)

For example, I don't really understand how you'd use it as a method on Metaculus. Metaculus has 2 question types - binary and continuous. For binary you have to give the prob... (read more)

2022 ACX predictions: market prices

17. Unemployment below five percent in December:

73 (Kalshi said 92% that unemployment never goes above 6%; 49 from Manifold)

I'm not sure exactly how you're converting 92% unemployment < 6% to < 5%, but I'm not entirely convinced by your methodology?

15. The Fed ends up doing more than its currently forecast three interest rate hikes:

None (couldn't find any markets)

Looking at the SOFR Dec-22 3M futures 99.25/99.125 put spread on the 14-Feb, I put this probability at ~84%.
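For readers unfamiliar with the mechanics, here's a minimal sketch of how a tight vertical spread price maps to an implied probability. The numbers below are hypothetical, not the actual SOFR quotes:

```python
def spread_implied_prob(price_paid: float, strike_width: float) -> float:
    """A tight vertical spread pays out its full width if the underlying
    settles beyond both strikes (and ~nothing otherwise), so
    price / width approximates the risk-neutral probability of that outcome."""
    return price_paid / strike_width

# e.g. paying 0.105 for a put spread with strikes 0.125 apart
print(round(spread_implied_prob(0.105, 0.125), 2))  # 0.84
```

The approximation is better the tighter the spread, since the payoff then looks more like a digital option struck at the midpoint.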

Thanks for doing this, I started doing it before I saw your competition an... (read more)


Thanks for this feedback!
Re 17: You are right to be skeptical, because my methodology for this one was silly and ad hoc. I somewhat arbitrarily turned a 92% chance that unemployment never goes above 6% into an 80% chance that unemployment isn't above 5% in December. This is completely unprincipled, but I didn't have any better ideas, and the alternative was to ignore the Kalshi market completely and defer entirely to the 5 bettors on Manifold, which seemed worse. If you have a more reasonable way of getting a number here, I'll happily defer to it.
Re 15: Thanks! I'll edit that number in and point to your comment.
Thanks also for the work you put into doing this last year! That post (along
with Zvi's re-predictions) led to me running a small prediction contest with a
handful of friends. That went well, was a lot of fun, and straightforwardly grew
into me asking Scott if he wanted me and Eric to run the same thing for the ACX
community. So, making up some numbers and hoping I can use Shapley values
correctly, I estimate that you get 40% of the credit for this year's prediction
contest happening.

Capturing Uncertainty in Prediction Markets

And one way to accomplish that would be to bet on what percentage of bets are on "uncertainty" vs. a prediction.

How do you plan on incentivising people to bet on "uncertainty"? All the ways I can think of lead to people either gaming the index, or turning uncertainty into a KBC.

Capturing Uncertainty in Prediction Markets

The market and most of the indicators you mentioned would be dominated by the 60 that placed large bets

I disagree with this. Volatility, liquidity, # predictors, spread of forecasts will all be affected by the fact that 20 people aren't willing to get involved. I'm not sure what information you think is being lost by people stepping away? (I guess the difference between "the market is wrong" and "the market is uninteresting"?)


What is being lost is related to your intuition in the earlier comment:
Without knowing how many people of the "I've studied this subject, and still
don't think a reasonable prediction is possible" variety didn't participate in
the market, it's very hard to place any trust in it being the "right" price.
This is similar to the "pundit" problem where you are only hearing from the most opinionated people. If 60 nutritionists are on TV and writing papers saying eating fats is bad, you may draw the "wrong" conclusion from that, because unknown to you, 40 nutritionists believe "we just don't know yet". And these 40 are provided no incentives to say so.
Take the Russia-Kiev question
[https://www.metaculus.com/questions/9459/russian-troops-in-kiev-in-2022/] on
Metaculus which had a large number of participants. It hovered at 8% for a long time. If prediction markets are to be useful beyond just pure speculation, that market didn't tell me how many knowledgeable people thought an opinion was simply not possible.
The ontological skepticism signal is missing - people saying there is no right
or wrong that "exists" - we just don't know. So be skeptical of what this market
says.
As for KBC - most markets allow you to change/sell your bet before the event happens, especially for longer-term events. So my guess is that this is already happening. In fact, the uncertainty index would separate out much of the "What do other people think?" element into its own question.
For locked in markets like ACX where the suggestion is to leave your prediction
blank if you don't know, imagine every question being paired with "What
percentage of people will leave this prediction blank?"

Capturing Uncertainty in Prediction Markets

There are a bunch of different metrics which you could look at on a prediction market / prediction platform to gauge how "uncertain" the forecast is:

- Volatility - if the forecast is moving around quite a bit, there are two reasons:
- Lots of new information arriving and people updating efficiently
- There is little conviction around "fair value" so traders can move the price with little capital

- Liquidity - if the market is 49.9 / 50.1 in millions of dollars, then you can be fairly confident that 50% is the "right" price. If the market is 40 / 60 with $1 on t


All these indicators are definitely useful for a market observer. And betting on
these indicators would make for an interesting derivatives market - especially
on higher volume questions. The issue I was referring to is that all these
indicators are still only based on traders who felt certain enough to bet on the
market.
Say 100 people who have researched East-Asian geopolitics saw the question "Will
China invade Taiwan this year?". 20 did not feel confident enough to place a
bet. Of the remaining 80 people, 20 bet small amounts because of their lack of
certainty.
The market and most of the indicators you mentioned would be dominated by the 60
that placed large bets. A LOT of information about uncertainty would be lost.
And this would have been fairly useful information about an event.
The goal would be to capture the uncertainty signal of the 40 that did not place
bets, or placed small bets. One way to do that would be to make "uncertainty"
itself a bettable property of the question. And one way to accomplish that would
be to bet on what percentage of bets are on "uncertainty" vs. a prediction.

Prediction Markets are for Outcomes Beyond Our Control

Prediction markets function best when liquidity is high, but they break completely if the liquidity exceeds the price of influencing the outcome. Prediction markets function only in situations where outcomes are expensive to influence.

There are a ton of fun examples of this failing:

- Libor
- "Chicken Libor"
- Every sport, all the time
- Option expiries (I don't have a good single link for this)

Money-generating environments vs. wealth-building environments (or "my thoughts on the stock market")

I don't know enough about how equities trade during earnings, but I do know a little about how some other products trade during data releases and while people are speaking.

In general, the vast, vast, vast majority of liquidity is withdrawn from the market before the release. There will be a few stale orders people have left by accident + a few orders left in at levels deemed ridiculously unlikely. As soon as the data is released, the fastest players will generally send quotes making a (fairly wide) market around their estimate of the fair price. Over time (a... (read more)

Use Normal Predictions

I agree identifying model failure is something people can be good at (although I find people often forget to consider it). Pricing it is something they are usually pretty bad at.

Use Normal Predictions

I'd personally be more interested in asking someone for their 95% CI than their 68% CI, if I had to ask them for exactly one of the two. (Although it might again depend on what exactly I plan to do with this estimate.)

I'm usually *much* more interested in a 68% CI (or a 50% CI) than a 95% CI because:

- People in general aren't super calibrated, especially at the tails
- You won't find out for a while how good their intervals are anyway
- What happens most often is usually the main interest. (Although in some scenarios the tails are all that matters, so again, depends


Oh okay.
Maybe I just haven't yet understood what you do with a 68% CI.
re 1: maybe we just have different intuitions - I somehow feel people are always
better at qualitative stuff than quantitative - and identifying model failure is
more qualitative.


I agree with both points.
If you are new to continuous predictions then you should focus on the 50% interval, as it gives you the most information about your calibration. If you are skilled and use, for example, a t-distribution, then you have σ for the trunk and ν for the tail; even then, few predictions should land in the tails, so most data should provide more information about how to adjust σ than how to adjust ν.
Hot take: I think the focus on 95% is an artifact of us focusing on p<0.05 in frequentist statistics.
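A sketch of the t-distribution suggestion, with made-up parameters (scipy assumed): σ (the scale) sets the body of the interval and ν (the degrees of freedom) the tails, and the 50% interval can be checked directly against simulated outcomes:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Predictive t-distribution: sigma controls the body, nu the tails
mu, sigma, nu = 0.0, 1.0, 5

# Central 50% interval of the predictive distribution
lo, hi = stats.t.ppf([0.25, 0.75], df=nu, loc=mu, scale=sigma)

# If outcomes really follow this distribution, ~50% should land inside,
# so this interval gives calibration feedback on almost every prediction
draws = stats.t.rvs(df=nu, loc=mu, scale=sigma, size=100_000, random_state=rng)
coverage = np.mean((draws > lo) & (draws < hi))
print(round(coverage, 2))
```

Very few draws land far out in the tails, which is the point above: most of your data informs σ, not ν.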

Use Normal Predictions

Under what assumption?

1/ You aren't "[assuming] the errors are normally distributed" in what you've written above. (Since a mixture of two normals isn't normal.)

2/ If your assumption is z ∼ N(0, 1), then yes, I agree the median of z² is ~0.45 (although

```
from scipy import stats
stats.chi2.ppf(.5, df=1)  # 0.454936
```

would have been an easier way to illustrate your point). I think this is *actually* the assumption you're making. [Which is a horrible assumption, because if it were true, you would already be perfectly calibrated.]

3/ I guess ... (read more)


Our ability to talk past each other is impressive :)
Yes, this is almost the assumption I am making. The general point of this post is to assume that all your predictions follow a Normal distribution, with μ as "guessed" and with a σ that is different from what you guessed, and then use χ² to get a point estimate for the counterfactual σ you should have used. And as you point out, if the (counterfactual) σ=1 then the point estimate suggests you are well calibrated.
In the post, the counterfactual σ is written σ̂_z.
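A minimal sketch of that point estimate, with made-up forecasts (the σ̂_z of the post is, as I understand it, the square root of the mean squared z-score):

```python
import numpy as np

# Hypothetical forecasts: (guessed mean, guessed sd, observed outcome)
forecasts = [(10.0, 2.0, 13.0), (5.0, 1.0, 4.2), (0.0, 3.0, -4.5)]

# Standardize each outcome by the guessed mean and sd
z = np.array([(mu_i - 0 + x - mu_i) / sd for mu_i, sd, x in [(m, s, o) for m, s, o in forecasts]])
z = np.array([(x - mu) / sd for mu, sd, x in forecasts])

# Point estimate of the sigma you *should* have used:
# if you were calibrated, mean(z**2) ~ 1 and sigma_hat ~ 1
sigma_hat = np.sqrt(np.mean(z ** 2))
print(round(sigma_hat, 2))  # 1.31 -> intervals should have been ~1.31x wider
```

With these made-up numbers the suggestion would be to widen future intervals by about that factor.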

The Unreasonable Feasibility Of Playing Chess Under The Influence

I think the controversy is mostly irrelevant at this point. Leela performed comparably to Stockfish in the latest TCEC season and is based on Alpha Zero. It has most of the "romantic" properties mentioned in the post.


Not just in the latest TCEC season, they've been neck-and-neck for quite a bit
now

Use Normal Predictions

That isn't a "simple" observation.

Consider an error which is 0.5 22% of the time, 1.1 78% of the time. The squared errors are 0.25 and 1.21. The median error is 1.1 > 1. (The mean squared error is 1)
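A quick numeric check of the counterexample (same hypothetical error mixture as above):

```python
# Errors of 0.5 (22% of the time) and 1.1 (78% of the time)
p_small, p_big = 0.22, 0.78
mse = p_small * 0.5**2 + p_big * 1.1**2
print(round(mse, 3))  # 0.999, i.e. mean squared error ~ 1

# Yet the median |error| is 1.1 (> 1), since 78% of errors equal 1.1
```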


Yes, you are right, but under the assumption that the errors are normally distributed, then I am right:
If:
p ∼ Bern(0.78), ε = p·N(0, 1.1) + (1−p)·N(0, 0.5)
then median(ε²) ≈ 0.37, which is much less than 1.
proof:
import numpy as np
from scipy import stats
x1 = stats.norm(0, 0.5).rvs(22 * 10000)
x2 = stats.norm(0, 1.1).rvs(78 * 10000)
x12 = np.concatenate([x1, x2])
print(np.median(x12 ** 2))  # ~0.37

Use Normal Predictions

Metaculus uses the cdf of the predicted distribution, which is better if you have lots of predictions; my scheme gives an actionable number faster

You keep claiming this, but I don't understand why you think this

Use Normal Predictions

If you suck like me and get a prediction very close then I would probably say: that sometimes happens :) note I assume the average squared error should be 1, which means most errors are less than 1, because (0² + 2²)/2 = 2 > 1

I assume you're making some unspoken assumptions here, because E[ε²] = 1 is not enough to say that. A naive application of Chebyshev's inequality would just say that P(|ε| ≥ 1) ≤ 1.

To be more concrete, if you were very weird, and either end up forecasting 0.5 s.d. or 1.1 s.d. away, (still with mean 0 and average... (read more)


I am making the simple observation that the median error is less than one because the mean squared error is one.

Use Normal Predictions

Go to your profile page. (Will be something like https://www.metaculus.com/accounts/profile/{some number}/). Then in the track record section, switch from Brier Score to "Log Score (continuous)"

Use Normal Predictions

I'd be happy to.


I upvoted all comments in this thread for constructive criticism, response to
it, and in the end even agreeing to review each other!

Two ominous charts on the financial markets

The 2000-2021 VIX has averaged 19.7, S&P 500 annualized vol 18.1.

I think you're trying to say something here like 18.1 <= 19.7, therefore VIX (and by extension, options) are expensive. This is an error. I explain more in detail here, but in short you're comparing expected variance and expected volatility, which aren't the same thing.

... (read more) From a secondary source: "The mean of the realistic volatility risk premium since 2000 has been 11% of implied volatility, with a standard deviation of roughly 15%-points", from https://www.sr-sv.com/realistic-volatility-risk-premia

Use Normal Predictions

I still think you're missing my point.

If you're making ~20 predictions a year, you shouldn't be doing any funky math to analyse your forecasts. Just go through each one after the fact and decide whether or not the forecast was sensible with the benefit of hindsight.

I am even explaining what a normal distribution is because I do not expect my audience to know...

I think this is exactly my point, if someone doesn't know what a normal distribution is, maybe they should be looking at their forecasts in a fuzzier way than trying to back fit some model to them.

... (read more)


I would love you as a reviewer of my second post, as there I will try to justify why I think this approach is better. You can even super-dislike it before I publish, if you still feel like that when I present my strongest arguments, or maybe convince me that I am wrong, so I don't publish part 2 and make a partial retraction of this post :). There is a decent chance you are right, as you are the stronger predictor of the two of us :)

Use Normal Predictions

I disagree with that characterisation of our disagreement, I think it's far more fundamental than that.

- I think you misrepresent the nature of forecasting (in its generality) versus modelling in some specifics
- I think your methodology is needlessly complicated
- I propose what I think is a better methodology

To expand on 1. I think (although I'm not certain, because I find your writing somewhat convoluted and unclear) that you're making an implicit assumption that the error distribution is consistent from forecast to forecast. Namely your errors when forecastin... (read more)

I am sorry if I have straw-manned you, and I think your above post is generally correct. I think we are coming from two different worlds.

You are coming from Metaculus, where people make a lot of predictions. Where having 50+ predictions is the norm, and thus looking at a U(0, 1) gives a lot of intuitive evidence of calibration.

I come from a world where people want to improve in all kinds of ways, and one of them is prediction; few people write more than 20 predictions down a year, and when they do, they more or less ALWAYS make dichotomous predictions. I ... (read more)

Two ominous charts on the financial markets

d/ is actually completely consistent with the vol market (I point this out here), so it's not clear that's their recommendation.


I should have checked before posting, thanks for this.

Use Normal Predictions

If you think 2 data points are sufficient to update your methodology to 3 s.f. of precision, I don't know what to tell you. I think if I have 2 data points and one of them is 0.99, then it's pretty clear I should make my intervals wider, but how much wider is still very uncertain with very little data. (It's also not clear if I should be making my intervals wider or changing my mean too.)


I don't know what s.f. is, but the interval around 1.73 is obviously huge; with 5-10 data points it's quite narrow if your predictions are drawn from N(1, 1.73). That is what my next post will be about. There might also be a smart way to do this using the Uniform, but I would be surprised if its dispersion is smaller than a chi² distribution :) (Changing the mean is cheating; we are talking about calibration, so you can only change your dispersion.)

Use Normal Predictions

you are missing the step where I am transforming arbitrary distribution to U(0, 1)

I am absolutely not missing that step. I am suggesting that should be the *only* step.

(I don't agree with your intuitions in your "explanation" but I'll let someone else deconstruct that if they want)


Hard disagree. From two data points I calculate that my future intervals should be 1.73 times wider; converting these two data points to U(0, 1) I get [0.99, 0.25]. How should I update my future predictions now?
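For what it's worth, a sketch of that calculation (scipy assumed); it lands near the quoted 1.73, with small differences presumably from rounding of the original inputs:

```python
import numpy as np
from scipy import stats

# The two data points, expressed as percentiles of the forecasts
u = np.array([0.99, 0.25])

# Back to z-scores via the inverse normal CDF: ~[2.33, -0.67]
z = stats.norm.ppf(u)

# The "widen your intervals" factor: sqrt of mean squared z-score
factor = np.sqrt(np.mean(z ** 2))
print(round(factor, 2))  # ~1.71
```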

Use Normal Predictions

you need less data to check whether your squared errors are close to 1 than whether your inverse CDF look uniform

I don't understand why you think that's true. To rephrase what you've written:

"You need less data to check whether samples are approximately N(0,1) than if they are approximately U(0,1)"

It seems especially strange when you think that transforming your U(0,1) samples to N(0,1) makes the problem soluble.


TLDR for our disagreement:
SimonM: Transforming to Uniform distribution works for any continuous variable
and is what Metaculus uses for calibration
Me: the variance trick to calculate σ_z from this post is better if your variables are from a Normal distribution, or something close to a normal.
SimonM: Even for a Normal, the Uniform is better.


you are missing the step where I am transforming arbitrary distribution to U(0, 1)
Medium confident in this explanation: because the square of random variables from the same distribution follows a gamma distribution, and it's easier to see violations from a gamma than from a uniform. If the majority of your predictions are from weird distributions then you are correct, but if they are mostly from normal or unimodal ones, then I am right. I agree that my solution is a hack that would make no statistician proud :)
Edit: Intuition pump: a T(0, 1, 100) obviously looks very normal, so transforming to U(0, 1) and then to N(0, 1) will create basically the same distribution. The square of a bunch of normals is χ², so the χ² is the best distribution for detecting violations. Obviously there is a point where this approximation sucks and U(0, 1) still works.

Use Normal Predictions

(If this makes no sense, then ignore it): Use an arbitrary distribution for predictions, then use its CDF (Universality of the Uniform) to convert to U(0, 1), and then transform to a z-score using the inverse CDF (percentile point function) of the Unit Normal. Finally, use this as z when calculating your calibration.

Well, this makes some sense, but it would make even more sense to do only half of it.

Take your forecast, calculate its percentile. Then you can do all the traditional calibration stuff. All this stuff with z-scores is needless... (read more)
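A minimal sketch of the half both sides agree on - the probability-integral transform - assuming scipy (the t-distribution forecast is just an illustrative choice):

```python
from scipy import stats

# Suppose the forecast was a t-distribution, and the outcome is observed
forecast = stats.t(df=4, loc=10, scale=2)
outcome = 12.5

# Universality of the Uniform: the CDF of the outcome is U(0,1) if calibrated
u = forecast.cdf(outcome)

# Optional extra step from the post: map to a z-score via the unit normal
z = stats.norm.ppf(u)
print(round(u, 3), round(z, 3))
```

The disagreement above is only about whether the second step (u to z) adds anything over checking the u values for uniformity directly.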


Where can I access this for my profile on Metaculus? I have everything unlocked
but don't see it in the options.


Can I use this image for my "part 2" post, to explain how "pros" calibrate their continuous predictions and how it stacks up against my approach? I will add you as a reviewer before publishing so you can make corrections in case I accidentally straw-man or misunderstand you :)
I will probably also make a part 3 titled "Try t predictions" :), which should address some of your other critiques about the normal being bad :)


This is a good point, but you need less data to check whether your squared errors are close to 1 than whether your inverse-CDF values look uniform, so if the majority of predictions are normal, I think my approach is better.
The main advantage of SimonM/Metaculus is that it works for any continuous
distribution.

Two ominous charts on the financial markets

I absolutely considered writing about the difference between risk-neutral probabilities and real-world probabilities in this context but decided against because: **Over the course of a year, the difference is going to be small relative to the width of the forecast**

I'd be interested to hear if you think the differences would be material to my point, i.e. that [-60%, +30%] isn't a ~90% range for stock returns next year and that his forecast is not materially different from what the market is forecasting.


The 2000-2021 VIX has averaged 19.7, S&P 500 annualized vol 18.1.
From a secondary source: "The mean of the realistic volatility risk premium since 2000 has been 11% of implied volatility, with a standard deviation of roughly 15%-points", from https://www.sr-sv.com/realistic-volatility-risk-premia/. So 1/3 of the time the premium is outside [-4%, 26%], which swamps a lot of VIX info about true expected vol.
-60% would be the worst drawdown ever; the prior should be <<1%. However, 8 years have been above 30% since 1928 (9%), so it seems you're using a non-symmetric CI.
The reasoning for why there'd be such a drawdown is backwards in the OP: because real rates are so low, the returns for owning stocks have declined accordingly. If you expect 0% rates and no growth, stocks are priced reasonably, yielding 4%/year more than bonds. Thinking in the level of rates, not changes to rates, makes more sense, since investments are based on current projected rates. A discounted cash flow analysis works regardless of how rates change year to year. Currently the 30yr is trading at 2.11%, so real rates around the 0 bound is the consensus view.

Use Normal Predictions

I think you're advocating two things here:

- Make a continuous forecast when forecasting a continuous variable
- Use a normal distribution to approximate your continuous forecast

I think that 1. is an excellent tip in general for modelling. Here is Andrew Gelman making the same point

However, I don't think it's actually always good advice when eliciting forecasts. For example, fairly often people ask whether or not they should make a question on Metaculus continuous or binary. Almost always my answer is "make it binary". Binary questions get considerably more inte... (read more)


Agreed 100% on 1), and with 2) I think my point is "start using normal predictions as a gateway drug to over-dispersed and model-based predictions".
I stole the idea from Gelman and simplified it for the general community; I am mostly trying to raise the sanity waterline by spreading the gospel of predicting on the scale of the observed data. All your critiques of normal forecasts are spot on.
Ideally, everybody would use mixtures of over-dispersed distributions or models when making predictions, to capture all sources of uncertainty.
It is my hope that by educating people in continuous prediction, the Metaculus trade-off you mention will slowly start to favor continuous predictions because people find them as easy as binary predictions... but this is probably a pipe dream, so I take your point.

Two ominous charts on the financial markets

That's not a chart of "real rates", that's the spread between a 10y rate and a spot inflation estimate. Real rates is (ideally) the rate paid on an inflation linked bond, or at least the k-year rate minus the k-year forecast inflation. The BoE have historic data here going back to '85 and the rally is several hundred basis points less than your chart implies.

Two ominous charts on the financial markets

Sorry, I should have made that more clear. I am talking about the period since the start of the interest rate decline (mid 1980s to today).

I think you're going to have to be more explicit about what time period you're forecasting your market collapse. (Or whatever it is you're forecasting, it's still not clear to me).

... (read more) Let me try to rephrase that: I think we will be seeing a fundamental change in the financial markets due to an end to the 35-year-long reduction of real interest rates. And most of the actors have only known investing in an environment with mor

Two ominous charts on the financial markets

I'm not sure what you consider to be "neutral" to hold, but forward returns for holding cash don't look great either.

(I'm also not sure what you're trying to say about Warren Buffett, can you be more explicit)

Two ominous charts on the financial markets

tl;dr all your conclusions are equally consistent with equity returns being similar to the past going forward

a) To infer future average equity returns from the past decades seems to me quite dangerous.

Agreed, although what else do we have?

b) The valuation level (Shiller CAPE) and the elimination of the interest rate reduction effect for stock valuations indicate that expected future equity returns will probably be significantly lower than those of past decades.

Which decades are you looking at? As recently as the decade before last (2000s) we had negative r... (read more)


Options nitpick: You can't use equity index* option prices as true probabilities, because adding hedges to a portfolio makes the whole portfolio more valuable. People then buy options based on their value when added to the portfolio, not as individual investments.
The first reason option hedges make your portfolio more valuable is preferences: people don't just want to maximize their expected return, but also reduce the chance they go broke. People don't like risk and hedges reduce risk, ergo they pay more to get rid of risk. However, you can't just subtract X vols to adjust, as this "risk premium" isn't constant over time.
Secondly, hedges maximize long-term returns (or: why you shouldn't sell options). You want to maximize your average geometric annual return, not your average annual return. You care about geometric averages because if for 3 years your returns were +75%, +75%, -100%, you don't have 50% more money than when you started, but 0. The average of annual returns was 10.7% over the past 30 years, but if you'd invested in 1992 you'd've only compounded at 8.5%/year till 2022.
Geometric returns are the nth root of a product of n numbers and have the approximation: geometric return ≈ average annual return - variance/2. If you could reduce variance and not reduce annual returns, your portfolio (market + hedges) would grow faster than the market.
These reasons are why, despite the worst annual return being -48% in 1931, you can say there's a 5% chance of returns below -50% based on option markets.
*I'm specifically talking about index options because that's the portfolio investors have (or something similar) and the total is what they care about. If you were to use prices as true probabilities for, say, a merger going through, these reasons don't apply as much and prices would be more accurate.
PS. I've referred to investors as all having the same portfolio because most people do have highly correlated index holdings, and it's at this level of generality that you can think about investors as a class.


Sorry, I should have made that more clear. I am talking about the period since the start of the interest rate decline (mid 1980s to today).
Let me try to rephrase that: I think we will be seeing a fundamental change in the financial markets due to an end to the 35-year-long reduction of real interest rates. And most of the actors have only known investing in an environment with more or less constantly decreasing real interest rates. I believe this combination could lead to widespread panic in the markets once the people making investment decisions realize that they no longer know how the markets react in the new environment.
I believe that rationale for investing in equities is quite widespread today. Obviously, there is an alternative (accepting secure negative real returns), but in order to avert guaranteed losses, investors take on risk. I am not saying that this is necessarily the wrong strategy, but it poses an interesting question: how will investors with this motivation for holding stocks react in a downturn?


When required to be fully invested this is trueish.
However you can sit in cash while no appealing investments exist. And buy in
size when prices become more appealing.
inb4 market timing is not possible
Have a look at Warren Buffett's track record and the amount of cash he held in
early 2000 and now.

Scott Alexander 2021 Predictions: Market Prices - Resolution

Thanks for flagging, fixed

Scott Alexander 2021 Predictions: Market Prices - Resolution

I'll add that note

Scott Alexander 2021 Predictions: Market Prices - Resolution

OK, so I am obviously biased but I'll look to see if I think this is fair.

Yeah, this is definitely my bad. I didn't ask you (or Scott) whether or not you were happy with me comparing your comments to market forecasts. I apologise. I also didn't intend to make this as normative as it sounds. (FWIW, in the past I have gone to bat for your forecasting skills, and given your forecast and a market forecast, most of the time I would expect to update away from the market and towards you.)

... (read more) I'll let Simon decide what to do with the rest. I also find it super weird

Scott Alexander 2021 Predictions: Market Prices - Resolution

So I'm a little worried we've used different sources for your forecasts, but to explain where we differ:

- We agree
- We agree
- We agree
- Happy to change your number, although your forecast was: "Depending on what counts as ‘recalled’ this is either at least 10%, or it’s damn near 0%. I don’t see how you get 5%. Once you get an election going, anything can happen. Weird one, I’d need more research." Which I averaged to 5%. Happy to change to 1%?
- We agree
- "It’s definitely a thing that can happen but there isn’t that much time involved, and the timing doesn’t seem


https://thezvi.wordpress.com/2021/04/27/scott-alexander-2021-predictions-buy-sell-hold/
is the canonical version. Surprised the differences were this big. The struggle
on knowing when to update all versions is real, especially now that there's 3x.
Then beyond that your decisions seem fine.
And no need to apologize for doing the exercise, it's good to check things, long
as it's clear what's being done.
When/if I do predictions for 2022 I'll see what I can do about also including
explicit fairs (and ideally, where I'd call BS on a market, and where I
wouldn't).

Scott Alexander 2021 Predictions: Market Prices - Resolution

If anyone can figure out how to format that table, I would appreciate it, thanks!


I have been trying to format tables on LW for a while, gave up, and started using images.

Retail Investor Advantages

I don't think you've found the most unbiased description of PFOF out there

Combining Forecasts

I realise you've been very careful about avoiding mentioning any explicit average in your section on "Combining External Forecasts", I was wondering if you had any thoughts on mean-vs-median (links below)

- When pooling forecasts, use the geometric mean of the odds
- My current best guess on how to aggregate forecasts

I was also wondering if you had any thoughts on extremising the forecasts you're ensembling too. (The classic example of 4 people all forecasting 60% but all based on independent information)
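For concreteness, a sketch of geometric-mean-of-odds pooling with an extremizing exponent (the function name and the exponent value here are made up for illustration):

```python
import numpy as np

def pool_geo_odds(probs, extremize=1.0):
    """Pool binary forecasts via the geometric mean of odds.
    extremize > 1 pushes the pooled forecast away from 50%,
    which is what you'd want when forecasters hold independent evidence."""
    probs = np.asarray(probs, dtype=float)
    odds = probs / (1 - probs)
    pooled_odds = np.exp(np.mean(np.log(odds))) ** extremize
    return pooled_odds / (1 + pooled_odds)

# Four forecasters all at 60%
print(round(pool_geo_odds([0.6, 0.6, 0.6, 0.6]), 2))      # 0.6 -- plain pooling
print(round(pool_geo_odds([0.6] * 4, extremize=2.5), 2))  # pushed past 0.6
```

With no extremizing, four identical 60% forecasts pool to 60%; the exponent is what encodes the "independent information" intuition in the classic example.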

Retail Investor Advantages

I'm afraid you're confused about how PFOF works. It's absolutely not about "frontrunning trades"


https://public.com/learn/payment-for-order-flow-pfof-explained-and-why-it-matters
[https://public.com/learn/payment-for-order-flow-pfof-explained-and-why-it-matters]
writes:
To me it sounds like leogao's description of PFOF as being due to misinformed
traders is wrong.

Retail Investor Advantages

Okay, but your examples are now all the same as your "2." (which I don't disagree with). Size isn't the advantage here, it's being able to be involved in weird things. (I was disagreeing with your point "3.")

28mo

Fair enough. I suppose I'm having trouble coming up with examples of
opportunities that are both not weird, and also not systematizable. (Though I do
think evaluation of individual penny stocks counts.)
I'm keeping that as separate from 2 though because I think that if you do find
something like that, the retail trader is potentially advantaged. And in
general, I think it's true on a spectrum — the more capacity a strategy has, the
more you shouldn't expect to beat the market with it.
I think of the market as like an ecosystem. If you look at a cubic meter of
rainforest, there's a ton of activity going on at different levels, from
bacterium up to tree. Different organisms are taking advantage of different
metabolic opportunities of different sizes and types (and their activity
provides opportunities to each other). Each creature has its niche.
I think of the market as like that. You've got big long-term macro funds taking
positions that last for months. And you've got little nimble HFT shops making
money off of the big slow macro fund's predictable-on-short-timescales trading
behavior.
And I claim the retail trader can potentially find a niche here too. And part of
what they should look for are opportunities that are not worth the time of
bigger firms. (Though note that this might just mean that this retail trader is
currently being undervalued, if they can find opportunities that are worth their
while, but wouldn't be worth the while of an employee at a firm.)

Retail Investor Advantages

Small size means you can look for opportunities with a good return, but low capacity (e.g. some opportunity that could turn 10k into 20k, but couldn't turn 10M into 20M). I think this is a much bigger deal than the low slippage advantage that comes from small size.

I'm kinda curious as to what sort of opportunities people think these are (especially in developed markets)

The sorts of things which have low enough variance to be "good" trades without doing them systematically would require large, concrete mispricings. I struggle to see how the opportunity is li... (read more)

28mo

These opportunities are especially going to be found outside developed markets,
or in things that a firm can't do systematically.
I agree that 10k (or much less) profit per trade is not too small for an HFT
shop, if it's part of an overall strategy that does many trades. The capacity of
a trading strategy isn't how much it makes per trade, but how much capital you
can productively allocate to it over its lifetime. And a firm is not going to
devote weeks of an employee's time to developing a strategy that's only ever
going to make 10k.
Instead, I'm thinking of weird, one-off things like:
* taking advantage of credit card sign-up bonus arbitrages
* deciding that a particular house for sale (not the whole housing sector) is
undervalued
* speculating on a rare book, or piece of art, or other collectible
* betting on obscure cryptocurrencies that you've done some analysis of
* taking advantage of DeFi yield farming schemes
These are generally going to be weird small things that a traditional firm can't
easily trade in a systematic way, such that it's not worth their time to look
into them.
Note that it might not be worth the retail investor's time either, depending on
how they value their time and their opportunity costs. But in some cases I think
you can stumble upon knowledge that you can then take advantage of, without
worrying that your knowledge / reasoning is mistaken because there shouldn't be
a $20 bill on the sidewalk. If you stumble upon some information that makes you
think the S&P500 is undervalued, you should be a lot more skeptical of that than
of some analysis that suggests some obscure collectible / penny stock /
cryptocurrency / NFT is undervalued.

Retail Investor Advantages

Two of those "advantages" aren't as much "advantages" as the market telling you that it thinks it knows better than you. The fact that you have lower trading costs and lower slippage (actually the same thing) is because the market doesn't respect you.

Re: information acquisition cost. Sure, you might have one small piece of information that BigTradingFirm doesn't have, but they have plenty of information you don't have. The relative value of the information is what matters.

08mo

Suppose as a domain expert you highly suspect company X will fail within
timeframe Y. This company is pretty small and there is a reasonable amount of
irreducible uncertainty so you (or anyone else) could make a maximum of $10k off
of this bet. It costs you ~nothing on the margin to take this opportunity, but
it would cost BigFund more than $10k in opportunity cost to acquire this
information and act on it, so it's not worth it to them to bother with it.
Also, the market underestimating me is a good thing for my bottom line.

1[comment deleted]8mo

48mo

“Sure, you might have one small piece of information that BigTradingFirm doesn't
have, but they have plenty of information you don't have. The relative value of
the information is what matters.”
As an example, let’s say you’re a scientist who works in the field of
bioprinting. A new company IPOs, planning to make artificial tissue for
transplant via bioprinting. You’ve been working with similar technology for 25
years, know the founder personally, and are certain that the tech won’t work and
the founder’s a dishonest-yet-charismatic person with a history of exploiting
others to make themselves look good. So you short the stock.
A hedge fund doesn’t have your experience. But they do have lots of information
about your industry, historical performance of companies in this sector,
advisors (including from your peers), regulatory insight, and much more. They
understand that the CEO can be replaced, the product can pivot, etc. They have
better overall judgment about how to weigh and synthesize all the information
about the company into a prediction about where the price will go.

58mo

To elaborate on the information acquisition cost point; small pieces of
information won't be worth tying up a big amount of capital for.
If you have a company worth $1 billion and you have very good insider info that
a project of theirs that the market implicitly values at $10 million is going to
flop, if the only way you can express that opinion is to short the stock of the
whole company that's likely not even worth it. Even with 10% margin you'd be at
best making a 10% return on capital over the time horizon that the market
figures out the project is bad (maybe O(1) years), and that mean return would
come with way more risk than just buying into the S&P 500, so your Sharpe would
be much worse.
In general this kind of trading is only worth it if your edge over the market is
big enough. If you just know something the market doesn't know that's not very
useful unless you can find someone to bet on that exact thing rather than have
to involve a ton of other variance in your trades, and even if you try to do
that people can figure out what you're up to and refuse to take the other side
of your trades anyway.
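The capital-tie-up arithmetic in that example can be sketched as follows. The $1B company value, $10M project value, and 10% margin come from the comment; the notional shorted is an arbitrary assumption, since the return on capital is scale-free:

```python
# Numbers from the comment: $1B company, project the market values at $10M,
# 10% margin requirement. Notional is arbitrary (returns scale linearly).
company_value = 1_000_000_000
project_value = 10_000_000
margin_requirement = 0.10

notional = 1_000_000                                        # short $1M of the stock
profit_if_right = notional * project_value / company_value  # $10,000 when the project flops
capital_posted = notional * margin_requirement              # $100,000 tied up
return_on_capital = profit_if_right / capital_posted
print(return_on_capital)  # 0.1 -> ~10% over the O(1)-year horizon, with single-stock risk
```

A ~10% return over roughly a year, carrying idiosyncratic single-stock risk the whole time, is the comparison against simply holding the index that makes the Sharpe unattractive.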

Base Rates and Reference Classes

I did a similar calculation not just for the base rate of completing his term, but of being the next nominee and the next US President a while back

[linkpost] Acquisition of Chess Knowledge in AlphaZero

There's already some discussion here

Average probabilities, not log odds

I think it would be perhaps helpful to link to a few people advocating averaging log-odds rather than averaging probabilities, eg:

- When pooling forecasts, use the geometric mean of the odds
- My current best guess on how to aggregate forecasts

Personally, I see this question as being an empirical question. Which method works best?

In the cases I care about, both averaging log odds and taking a median *far* outperform taking a mean. (Fwiw Metaculus agrees that it's a very safe bet too)

... (read more) In contrast, there are no conditions under which average log odds is the co

39mo

Thanks for the links!
Contrived how? What additional structure do you imagine I added? In what sense
do you claim that averaging log odds preserves additivity of probability for
disjoint events in the face of an example showing that the straightforward
interpretation of this claim is false?
It isn't; you can tell because additivity of probability for disjoint events
continues to hold after Bayesian updates. [Edit: Perhaps a better explanation
for why it isn't a Bayesian update is that it isn't even the same type signature
as a Bayesian update. A Bayesian update takes a probability distribution and
some evidence, and returns a probability distribution. Averaging log-odds takes
some finite set of probabilities, and returns a probability]. I'm curious what
led you to believe this, though.
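The additivity point can be checked numerically: pooling each of three disjoint, exhaustive outcomes by averaging log odds yields probabilities that no longer sum to 1, while the arithmetic mean of probabilities does. The two forecasters' numbers below are made up purely for illustration:

```python
import math

def geo_odds_pool(ps):
    """Average log odds (equivalently, take the geometric mean of odds)."""
    mean = sum(math.log(p / (1 - p)) for p in ps) / len(ps)
    return 1 / (1 + math.exp(-mean))

# Two forecasters over three disjoint, exhaustive outcomes (each row sums to 1).
f1 = {"A": 0.1, "B": 0.3, "C": 0.6}
f2 = {"A": 0.4, "B": 0.4, "C": 0.2}

linear = {k: (f1[k] + f2[k]) / 2 for k in f1}
log_odds = {k: geo_odds_pool([f1[k], f2[k]]) for k in f1}

print(round(sum(linear.values()), 3))    # 1.0: averaging probabilities stays additive
print(round(sum(log_odds.values()), 3))  # 0.942: averaging log odds does not
```

Renormalizing the log-odds-pooled numbers to sum to 1 is possible, but that is an extra step on top of the pooling rule itself, which is the point being made above.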

Worth checking your stock trading skills

I figured that's the first thing someone would think of upon hearing "7x" which is why I mentioned "This was done using a variety of strategies across a large number of individual names" in the OP.

Right, I wasn't disagreeing with you, just explaining why 7x isn't strong evidence in my own words.

Can you please give some examples of such people? I wonder if there are any updates or lessons there for me.

Yes, but I don't think there's a huge amount of value in doing that. If you spend any time following stock touts on twitter / stock picking forums etc you wil... (read more)

19mo

The people I follow generally don't advertise their track record? For the hedge
fund manager I mentioned, I had to certify that I'm an accredited investor and
sign up for his fund letters to get his past returns. For the ones that do,
e.g., paid services on SeekingAlpha that advertise past returns, it has not been
my experience that they "then fail to do so out of sample" (at least the ones
that passed my filter of being worth subscribing to).
Personally, I wish I had seen a post like this 10 years ago. My guess is that
there's at least 2 or 3 people on LW who could become good traders if they
tried. Even if 10 times that many people try and don't succeed, that seems
overall a win from my perspective, as the social/cultural/signaling and monetary
gains from the winners more than offset the losses. In part I want LW to become
a bigger cultural force, and clear success stories that can't be dismissed as
"luck" seem very helpful for that.
Pre-tax.
Maybe try some of my tips, if you haven't already? :)

Worth checking your stock trading skills

Without even checking, I can think of a bunch of assets which 7x'ed since Jan 2020. (BTC/general crypto, TSLA, GME/AMC etc). So yes, I agree this depends on the portfolio you ran.

Personally, I have seen enough people claiming to outperform, but then fail to do so out of sample. (I mean, out of sample for me, not for them) for me to doubt any claim on the internet without a trading record.

Either way, I think it's very hard to convince me with just ~1.5 years of evidence that you have edge. I think if you showed me ~1k trades with some sensible risk parameters at all times, then I could be convinced. (Or if in another year and a half you have $300mm because you've managed to 7x your small HF AUM, I will be convinced).

69mo

I figured that's the first thing someone would think of upon hearing "7x" which
is why I mentioned "This was done using a variety of strategies across a large
number of individual names" in the OP. Just to further clarify, I have some
exposure to crypto but I'm not counting it for this post, I bought some TSLA
puts (forgot whether I made a profit overall), and didn't touch AMC. I had a
0.1% exposure to some GME calls which went to 1% of my portfolio and that's the
only involvement there.
Can you please give some examples of such people? I wonder if there are any
updates or lessons there for me.
I don't think I've done that many trades (depending on how you define a trade,
e.g., presumably accumulating a position across different days doesn't count as
separate trades). Maybe in the low hundreds? But why would you need ~1k trades
to verify that I was not doing particularly high variance strategies? I guess
this is mostly academic though, as it would take a lot of labor to parse my
trade logs and understand the underlying market mechanics to figure out what I
was doing and how much risk I was taking (e.g., some pair/arbitrage trades were
spread across several brokers depending on where I could find borrow). I don't
suppose you'd actually want to do this? (I also have some privacy concerns on
my end, but maybe could be persuaded if the "value added" in doing this seems
really high.)
I'm definitely not expecting such high returns going forward. ("600% return" was
meant to be Bayesian evidence to update on, not used to directly set
expectations. I thought that went without saying around here...) Obviously there
was a significant amount of luck involved, for example as I mentioned the market
was particularly inefficient last year. One of the hedge fund managers I follow
had returns similar to mine this year and last year, but not in the years before
that. I'd guess 20-50% above market returns is a realistic expectation if market
conditions stay similar to today's, and

Worth checking your stock trading skills

Everyone else has already pointed out that you misunderstood what EMH states, so I won't bother adding to their chorus. (Except to say I agree with them).

I will also disagree with:

at most one-in-five people [...] It should therefore probably update us nontrivially away from the possibility that the post author just got lucky.

1 in 5 isn't especially strong evidence. How many of those 5 people would you expect to be publishing on the internet saying "You should trade stocks"?

29mo

(You're not wrong, but I wanted to flag: the way I read John's comment, the word
"nontrivially" already admitted this. If he thought it was strong evidence I'd
expect him to have used a stronger word. Nothing wrong with adding
clarification, but I don't particularly think you're disagreeing with him on
this point.)

39mo

I agree this isn't a very strong argument. I think theoretically we can probably
get a much tighter probability bound than 20% by looking directly at the
variance of my strategy, and concluding that given that variance, the
probability of getting 600% return by chance (assuming expected return = market
return) is <p for some smaller p. But in practice I'm not sure how to compute
this variance. Intuitively I can point to the fact that my portfolios did not
have very high leverage/beta, nor did I put everything into a single or very few
highly volatile stocks or sectors, which are probably the two most common high
variance strategies people use. (Part of the reason for me writing this post is
that while LW does have a number of people who achieved very high investment
returns, they all AFAIK did it by using one of these two methods, which makes it
hard to cite them as exemplifying the power of rationality.)
Assuming the above is still not very convincing, I wonder what kind of evidence
would be...

Risk Premiums vs Prediction Markets

I don't really know how you incentivise people (seriously) in the non-real money prediction markets.

Non-money prediction markets have lots of difficulties to them:

- How do you size your bet? (i.e. knowing a probability vs. just "higher" or "lower" than the market estimate)
- Difficult to arbitrage (i.e. share information between markets)
- How do you show your conviction? (this is 50% and I'm certain it's a coin flip vs. this is 50% because I don't understand the question)
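On the bet-sizing point: with real money, conviction maps to stake via something like the Kelly criterion for a binary contract. This is a standard sketch rather than anything from the post, and `kelly_fraction` is my own helper name:

```python
def kelly_fraction(p, market_price):
    """Kelly-optimal fraction of bankroll for a binary contract.

    Buying at price q risks q per contract to win (1 - q), i.e. net odds
    b = (1 - q) / q, so the Kelly fraction (p*b - (1 - p)) / b simplifies
    to (p - q) / (1 - q). If p < q, buy the NO side at price (1 - q).
    """
    q = market_price
    if p >= q:
        return (p - q) / (1 - q)  # stake on YES
    return (q - p) / q            # stake on NO

print(round(kelly_fraction(0.7, 0.5), 3))  # 0.4 -> stake 40% of bankroll
print(round(kelly_fraction(0.5, 0.5), 3))  # 0.0 -> a true coin flip gets no stake
```

Note how this answers the conviction question too: "50% and certain it's a coin flip" and "50% because I don't understand the question" both produce a zero Kelly stake, which is exactly the information a points-based market fails to elicit.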

Worth checking your stock trading skills

I don't know enough about the etiquette here, but I am having to fight the urge to post a bunch of memes along the lines of "It's not the bull market, I really am a genius".

I would strongly advise anyone who's considering following this to consider doing this with considerably less than their whole portfolio and with much lower expectations than 7x'ing your money.

79mo

A lot of people respond to things like this post by assuming that the author was
lucky. This is usually correct, at least when applied to random claimants on the
interwebs, but we can put some bounds on it: the higher the returns, the more
luck required to achieve them, assuming an efficient market. The efficient
markets model says that any strategy during this period had expected return of
50%. So, if the post author used a strategy with probability p of achieving 600%
returns, and 1-p of losing everything, then efficient markets implies
p*(100+600) + (1-p)*0 = (100+50), i.e. p = 0.21 (roughly). This is the highest
probability any strategy could have of achieving 600% returns during this
period, without exploiting any market inefficiency.
In other words: at most one-in-five people could achieve returns that high
without exploiting market inefficiency. It should therefore probably update us
nontrivially away from the possibility that the post author just got lucky.
(Though depending on your priors, you might still think the post author got
lucky.)
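The bound in that comment is just one line of arithmetic, restated here as a sketch:

```python
# An efficient market gives every strategy an expected return of 50% over
# the period, so a strategy returning +600% with probability p (and -100%
# otherwise) satisfies p * (100 + 600) = (100 + 50), i.e. p * 700 = 150.
market_return = 0.5
strategy_return = 6.0
p_max = (1 + market_return) / (1 + strategy_return)
print(round(p_max, 3))  # 0.214 -> at most roughly one in five
```

This matches the p = 0.21 (roughly) quoted above; the bound is an upper limit on how likely any inefficiency-free strategy could be to deliver that return.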

No - I think it's the probability that's supposed to be a martingale, but I might be being dumb here.