PSA: Prediction markets often have very low liquidity; be careful citing them.

Eye You

I see people repeatedly make the mistake of referencing a very low liquidity prediction market and using it to make a nontrivial point. Usually the implication when a market is cited is that its number should be taken somewhat seriously, that it's giving us a highly informed probability. Sometimes a market is used to analyze some event that recently occurred; reasoning here looks like "the market on outcome O was trading at X%, then event E happened and the market quickly moved to Y%, thus event E made O less/more likely."

Who do I see make this mistake? Rationalists, both casually and (gasp) in blog posts. Scott Alexander and Zvi (and I really appreciate their work, seriously!) are guilty of this. I'll give a recent example from each of them.

From Scott's Mantic Monday post on March 2:

Having Your Own Government Try To Destroy You Is (At Least Temporarily) Good For Business
On Friday, the Pentagon declared AI company Anthropic a “supply chain risk”, a designation never before given to an American firm. This unprecedented move was seen as an attempt to punish, maybe destroy the company. How effective was it?
Anthropic isn’t publicly traded, so we turn to the prediction markets. Ventuals.com has a “perpetual future” on Anthropic stock, a complicated instrument attempting to track the company’s valuation, to be resolved at the IPO. Here’s what they’ve got:
Upon the “supply chain risk” designation, predicted value at IPO fell from about $550 billion to $475 billion - then, after a day or two, went back up to $550 billion. No effect!
A coarser yes-no Polymarket tells the same story:
The chance of Anthropic getting a $500 billion+ valuation in 2026 fell from 90% to 76%, before rebounding to 83%.
Why have the markets shrugged off this seemingly important event?
Partly it’s because Anthropic seems likely to win on appeal. Hegseth has said the government will keep using Anthropic for the next six months (undermining his case that they’re a national security risk) and has signed a substantially similar contract with OpenAI (undermining his case that their contract terms were unworkable). The prediction markets think the courts will be sympathetic:
[link to this Manifold Market]
But even in the 28% of timelines where the designation sticks, things don’t seem so bad...

(The first market that Scott quoted, the Ventuals future, is not a typical market that people reference -- I had never seen it before -- and is kind of complicated to analyze. I did an analysis of it but have decided not to include it in the main post as it brings the focus away from the specific point I want to make. I'll attach the analysis as a comment to this post.)

Let's take a look at the Polymarket market that Scott cites. Here's what its order book looks like when I'm writing this:

So, if I wanted to change the chance of Anthropic getting a $500B+ valuation from 90% to 75%, I'd have to spend (checks clipboard) $59. Okay, maybe we should add in the liquidity from the Yes side as well. In which case... $370. Someone could manipulate Scott Alexander and his tens of thousands of readers (some of whom are very powerful people who will be making important decisions based on their beliefs about Anthropic!) for a few hundred bucks.
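For anyone who wants to run this check themselves, here is a rough sketch of the arithmetic: walk one side of the order book and add up what it costs to eat every resting order short of your target price. The book below is made up for illustration (it is not the actual Polymarket book), and `cost_to_push_price` is a hypothetical helper, not part of any exchange API. It also ignores fees and anyone refilling the book while you trade.

```python
def cost_to_push_price(asks, target_cents):
    """Dollars needed to buy every resting ask priced below target_cents.

    asks: list of (price_cents, shares) on the side you are buying
          (e.g. No shares, if you want the Yes price to fall).
    After these fills, the best remaining ask is at or above target_cents.
    """
    total_cents = 0
    for price_cents, shares in sorted(asks):
        if price_cents >= target_cents:
            break  # everything from here on is at or past the target
        total_cents += price_cents * shares
    return total_cents / 100


# Hypothetical No-side asks: No at 10c corresponds to Yes at 90c.
no_asks = [(10, 100), (15, 120), (20, 80), (25, 60), (30, 500)]

# Pushing No up to 25c is the same as pushing Yes down to 75c.
print(cost_to_push_price(no_asks, 25))  # 44.0
```

Prices are kept in integer cents so the sum is exact. If the number that comes out is something you could pay with pocket change, the displayed probability is not telling you much.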

What about the Manifold market Scott cites? Well, first of all, Manifold is a play money market, which means we have little a priori reason to expect it to be accurate or efficient. The utility (or lack thereof) of play money markets is not what I want to talk about in this post, though. What I want to focus on here is the (lack of) activity in the market that Scott references. Let's look again at a chart of it.

This is not what an active or efficient market looks like. There has been ~0 activity from March 9 to March 15.

Let's look at an example from Zvi now. From his Feb 26 AI newsletter:

The prediction markets on this situation are highly inefficient. Kalshi as of this writing has bounced around to 37% chance of declaration of Supply Chain Risk, versus Polymarket at 22% for very close to the same question.
Another way to measure how likely things are to go very wrong is that Kalshi has a market on ‘Will Anthropic release Claude 5 this year?’ which is basically a proxy for ‘does the American government destroy Anthropic?’ and Polymarket has whether it will be released by April 30. The Kalshi market is down from 95% (which you should read as ~100%) to 90%. Polymarket’s with a shorter timeline is at 38%.

I looked at these markets on Feb 26 and found that they were not very liquid. From my notes: "$1k trade is gonna move the market 20% on Polymarket. Kalshi market is a joke, each side is like 40 cents wide." Zvi was also live tweeting about these markets.

When Zvi tweets "The @Polymarket for Hegseth 'ban Claude by March 31' has crashed to 15%", the implication is that this market is worth taking seriously, etc.

Zvi is correct that these markets aren't efficient, but wrong that there is alpha. There isn't money to be made in these markets because they're tiny. In fact, because of how wide the bid/ask spread was on the Kalshi market, its odds would fluctuate 20%+ just based on whether the last trade was at the bid or the offer.
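To make the last point concrete, here is a toy illustration of bid/ask bounce. The quotes (35c bid, 57c ask) are invented, and `last_trade_swing` is a hypothetical helper. It assumes the displayed odds are simply the last traded price, which is how I am reading the number in this case.

```python
def last_trade_swing(bid_cents, ask_cents, trade_sides):
    """Return the last-trade price after each trade and the largest swing.

    A "buy" lifts the ask and a "sell" hits the bid; the quotes never move.
    """
    prices = [ask_cents if side == "buy" else bid_cents for side in trade_sides]
    swing = max(prices) - min(prices)
    return prices, swing


prices, swing = last_trade_swing(35, 57, ["buy", "sell", "buy"])
print(prices, swing)  # [57, 35, 57] 22
```

Nothing about anyone's beliefs changed in that sequence, yet the headline number moved 22 points twice.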

So, PSA: Please check the liquidity/activity/volume/spread of a prediction market before you reference it!

There's a corollary to be made about how prediction markets are causing people to make predictable epistemic errors. (Do people want me to make a post on this?)

Manifold has calibration statistics about their markets. They seem well calibrated and the accuracy peaks around 10-20 traders. They even cite a paper from 2007 looking at small markets in 2006 that concludes that 16 traders should be sufficient. Polymarket plots Brier score vs volume with .09 up to $10k, then .08 up to $25k, .07 up to $100k, .05 up to $250k, .04 up to $500k, .03 up to $1M, .02 for $1M+. Brier score is mean squared error (e.g. if I say 70% and it happens I have a score of .09), so that suggests it's fairly good even at relatively low liquidities.
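For reference, a minimal sketch of the Brier score as described here, plus one rough way to read the bucket values: take the square root to get a "typical miss" in probability terms. The square-root reading is my own gloss, not something Polymarket or Manifold publish.

```python
import math


def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


# Saying 70% on something that happens:
print(round(brier_score([0.7], [1]), 4))  # 0.09

# Root of the Brier score for a few of the buckets quoted above.
for bucket in (0.09, 0.05, 0.02):
    print(bucket, round(math.sqrt(bucket), 2))
```

So the lowest-volume bucket scores about the same as forecasting 70% on things that always happen, and the $1M+ bucket is closer to forecasting 86%.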

The way I see it, it being cheap to swing the market isn't sufficient for manipulation. The question is how fast it gets corrected by other traders. Basically, if someone (or someone's trading bot) notices, they can make bank. The theory then goes that manipulators increase accuracy by incentivizing informed trading - though I think this is a long run thing.

On manipulation, Manifold cites Scott Alexander's manipulation experiment, which got corrected in an hour:

I tested this by manipulating a scandal market about Manifold CEO Austin Chen, with his permission. It was originally at 4%; at 4:50 PM California time, I spent M$200 in play money (=~ $2 in real money) to manipulate it up to 95%. By 5:30 California time, it was back down to 4%. This isn’t a great example, because real attackers might be more subtle, make multiple small bets, only try to push it a few percent, etc - but I think it’s a pretty good sign. (the embed says the market currently has 78 traders and M48k liquidity, which is pretty high for Manifold)

It also cites an old paper by Robin Hanson, which in the introduction cites a few other papers showing failed manipulation in both field and lab experiments. I don't know of bigger attempts now that prediction markets are mainstream, and would be interested in seeing data about Polymarket and PredictIt.

Prediction markets of course have problems, but they seem to not get much worse at mid-low liquidity.

I think you are correct to look at activity though. Will activity spike back up after manipulation? Perhaps someone should try noise trading a couple of inactive markets as an experiment.

Polymarket's Brier score, from their accuracy page, along with Manifold's data broken out by liquidity, as found in the comment they linked from their accuracy page.

Has anyone seen good research on how prediction market liquidity relates to accuracy / calibration / etc.?

On a quick search I found Tetlock 2009 which found that liquidity doesn't help:

I investigate the relationship between liquidity and market efficiency using data from short-horizon binary outcome securities listed on the TradeSports exchange. I find that liquidity does not reduce—and sometimes increases—deviations of prices from financial and sporting event outcomes.

And then a more recent blog post analysis of Polymarket data finding that it does matter:

Market microstructure analysis reveals significant accuracy improvements correlating with trading volume. Markets exceeding $100,000 in total volume achieve 84% accuracy, while those below $10,000 fall to 61%. This pattern reflects the efficient market hypothesis applied to prediction markets—adequate liquidity enables proper price discovery.

It seems intuitively obvious that it'd matter a bunch, but I'm wondering exactly how much liquidity I should be looking for in a market, and how much to trust small- or medium-sized markets.

'Accuracy' is not a great statistic imo (it just tells you about the direction, e.g. if you say 51% for something that happens you are 'accurate'). Thankfully Polymarket publishes Brier scores. You can see that while liquidity helps, it's already not bad for the lower liquidity buckets (.09 up to $10k, then .08 up to $25k, .07 up to $100k, .05 up to $250k, .04 up to $500k, .03 up to $1M, .02 for $1M+)
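A quick sketch of why the two statistics can disagree, using two made-up forecasters who both lean the right way on every question. `directional_accuracy` is my guess at what "accuracy" means in that blog post (forecast on the correct side of 50%); the post may define it differently.

```python
def directional_accuracy(forecasts, outcomes):
    """Fraction of forecasts on the correct side of 50%."""
    hits = sum((f > 0.5) == bool(o) for f, o in zip(forecasts, outcomes))
    return hits / len(forecasts)


def mean_squared_error(forecasts, outcomes):
    """Brier score: mean squared gap between forecast and 0/1 outcome."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


outcomes = [1, 1, 1, 1]        # every event happened
timid = [0.51] * 4             # barely leans the right way
confident = [0.95] * 4         # strongly leans the right way

for name, forecasts in (("timid", timid), ("confident", confident)):
    print(
        name,
        directional_accuracy(forecasts, outcomes),
        round(mean_squared_error(forecasts, outcomes), 4),
    )
# timid 1.0 0.2401
# confident 1.0 0.0025
```

Both forecasters are "100% accurate", but the Brier scores differ by about two orders of magnitude.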

Calibration City https://calibration.city/ is also a great resource for looking into questions about calibration of various prediction market platforms!

Appendix: Ventuals Market

The first market Scott references is a Ventuals market which purports to function as a future on Anthropic stock value. This is my first time hearing of Ventuals, and I'm going to ignore the question of whether the mechanism behind this product actually works. Let's just look at the liquidity of the market. I looked at the volume traded over the four-day period (Friday to Monday) that Scott is talking about. I found there was ~$600k in volume (which is honestly better than I expected).
What kind of trading would have produced the market behavior we saw -- $530 to $480 and back to $530 (a ~10% move down and back) in $600k of volume? Let's take a look at the order book to see how much this market will move based on trade size. (I don't have the historical order book available so I'm using the current order book.)

A $40k sell would move the market down ~10% here. The Friday-Monday price behavior could be explained by a single person selling $40k on the news and then changing their mind the next day and buying $40k back! Of course, this is just one possible scenario... but it illustrates that the numbers involved here are small enough that a single, not particularly rich person could single-handedly move these markets.
Another way to think about this is: how much money could a good trader realistically have made here? I looked more closely at the volume and price data (not pictured here) and found that ~$200k traded on Friday as the price went from $530 to $480 and stayed around $480. Given that there are significant up candles in this period and the expected price movement per $ that we found earlier, this matches up pretty well with something like $125k selling and $75k buying. Let's say you have good reason to believe that this news shouldn't affect the value of this product. Maybe you buy 30% of the selling volume at an average of $500, and close your position at an average of $525. Then you'd make (525-500) dollars per share with .3*125k/500 shares traded for a grand total of $1875.
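The same back-of-the-envelope calculation as a small function, in case anyone wants to plug in different assumptions. `fade_profit` and its parameter names are just labels I made up; the inputs are the estimates from the paragraph above, and fees and funding costs are ignored.

```python
def fade_profit(selling_volume_usd, share_captured, avg_entry, avg_exit):
    """Profit from buying a fraction of the selling flow and exiting later.

    selling_volume_usd: estimated dollars sold during the dip
    share_captured:     fraction of that selling you manage to buy
    avg_entry/avg_exit: average buy and sell prices per share
    """
    shares = share_captured * selling_volume_usd / avg_entry
    return shares * (avg_exit - avg_entry)


# 30% of ~$125k selling, bought at $500 and closed at $525.
print(round(fade_profit(125_000, 0.30, 500, 525), 2))  # 1875.0
```

Even if you captured all of the estimated selling instead of 30%, the same prices would give $6,250.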

Is $600k a lot of money for one future on a niche trading platform? I feel very uncertain about this fact.

It's hard to say because it depends on who's trading. If every trader is putting in $100 orders, then a market move probably represents a real probability change, not just a liquidity issue. But the market will pop if a single trader puts in a $100K buy order.

In the context of financial markets, $600k is extremely small. Here are some ~average daily volumes for context:

US 10yr treasury futures (extremely high volume): $200bil

MSFT (very high volume stock): $13bil

NWS (~smallest stock in SP500): $35mil

GME (GameStop): $150mil

In the context of a niche trading platform? Idk. I was surprised because I didn’t realize this market existed at all.

Using a snapshot of the order book doesn't give a good sense of the cost to manipulate a market. For instance, Kalshi has a market about Mars colonization by 2050 (https://kalshi.com/markets/kxcolonizemars/colonize-mars/kxcolonizemars-50). It would only cost $500 to push the probability up to 85% for a moment. It would probably cost millions to hold it there for more than a few hours.

I think the reason people cite Manifold is because it has a track record of being competitive and accurate. You wouldn't assume that an online chess tournament will be easy to win just because they are only playing for fun.

This was ~my thought process upon seeing the Anthropic Polymarket numbers 2 weeks ago (as documented on discord).

Wow, over 90%+ on Anthropic going over 500B.
At first I thought it might be a good idea for EAs to hedge on this market
Then I realized that the total size of this market is like in the single-digit thousands of dollars.
which in turn makes me more pessimistic on how much to update on prediction market probabilities in practice, at least for questions like this one.