Designing Prediction Markets

ToasterLightning

Prerequisite: basic familiarity with what a prediction market is

So you want to run a prediction market. You need a way for people to trade shares. What are your options?

CLOBs and Markets

If you were making a prediction market from scratch, you'd probably come up with a Central Limit Order Book (CLOB). Traders post BUY and SELL orders, stating what they're willing to buy and sell, and at what price, and you record these orders in your book.

Alice posts: "I'll buy 20 YES shares at $0.60 each"
Bob posts: "I'll sell 25 YES shares at $0.65 each"
Carol posts: "I'll buy 10 YES shares at $0.61 each"

When someone wants to trade, and doesn't want to wait for someone to fulfill their order, you match them with the best available offer. Pretty intuitive.

This system shows up directly in Hypixel Skyblock and other MMOs. The Bazaar lets you post orders and wait, or instantly fulfill existing orders. Have some Enchanted Iron to sell? You can list it at 540 coins and wait for a buyer, or instantly sell it by fulfilling the highest buy order at 470 coins.

The gap between the highest buy order ("bid") and the lowest sell order ("ask") is called the bid-ask spread.
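As a rough illustration, here is a minimal Python sketch of the three example orders above and the spread they imply. The `Order` class and the helper names are made up for this sketch and are not taken from any real exchange API.

```python
from dataclasses import dataclass

@dataclass
class Order:
    trader: str
    side: str       # "buy" or "sell"
    quantity: int   # number of YES shares
    price: float    # dollars per share

# The three orders from the example above.
book = [
    Order("Alice", "buy", 20, 0.60),
    Order("Bob", "sell", 25, 0.65),
    Order("Carol", "buy", 10, 0.61),
]

def best_bid(orders):
    """Highest price any buyer is currently offering."""
    return max(o.price for o in orders if o.side == "buy")

def best_ask(orders):
    """Lowest price any seller is currently asking."""
    return min(o.price for o in orders if o.side == "sell")

spread = best_ask(book) - best_bid(book)
print(f"bid={best_bid(book):.2f} ask={best_ask(book):.2f} spread={spread:.2f}")
```

With this book, someone who wants to sell instantly gets Carol's $0.61, and someone who wants to buy instantly pays Bob's $0.65.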

Enter the Market Maker

CLOBs work well, but they have a problem: they need people actively posting orders from both sides. If nobody's posting orders, the market can't function. The spread can also become very wide when few traders are active, making it expensive to trade.

This is where market makers come in. A market maker continuously posts both buy and sell orders, ensuring there's always someone to trade with.

Market makers profit by maintaining a gap in prices between their bid and asks. For example:

They buy YES shares at $0.60
Sell YES shares at $0.65
Whenever one person buys YES from them and another person sells it to them, they pocket the $0.05 difference

This is called capturing (or earning) the spread; the traders who take the market maker's prices are the ones crossing it. The market maker provides liquidity to the market and is compensated for it through the spread. In traditional finance, firms like Citadel Securities make billions doing exactly this. In Hypixel Skyblock, this strategy is called "bazaar flipping".

Pricing Shares

How do Market Makers price their shares? In an existing market, they can simply look at existing price charts to determine their prices, but that's not very feasible with prediction markets. Thus, we need some way of determining a fair price for shares.

For simplicity, let's ignore extracting profit. We'll assume someone's paying our market maker, whom we'll call Duncan, a flat fee to provide this service, and they're otherwise operating in a fully efficient market where the bid-ask spread is 0.

Duncan holds some inventory of YES and NO shares, and people can trade with him. How should Duncan price his shares? Examining the question, we can see some key constraints:

Prices should sum to $1: If a YES share pays out $1 when the market resolves YES, and a NO share pays out $1 when the market resolves NO, then the price of a YES share and a NO share together should be exactly $1, since the market can only resolve to one of these two options.
Price equals probability: If YES shares cost $0.70, then that means the market thinks there's a 70% chance of the outcome being YES, since that's how expected value works. This is the key mechanism by which prediction markets work and even if you don't know the details of the market implementation yet, you should know this already.

Creating New Shares

Duncan needs the ability to issue shares. Otherwise, he'll run out of them, and won't be able to trade anymore. (No, he can't just raise share prices in an inverse relationship with his supply: since he sells both YES and NO shares, this would violate the constraint that prices must sum to $1.)

Fortunately, it's very easy to issue new shares. Since YES and NO sum to 1, for every dollar Duncan receives from a trader, he can mint one YES share and one NO share as a pair. When the market resolves, he'll pay out $1 to holders of the winning share type, fully covering his obligation.
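To see why minting in pairs always leaves Duncan able to pay out, here is a small sketch. The function name and the deposit amounts are invented for illustration, and "outstanding" counts every minted share, whether a trader holds it or it is still in Duncan's inventory.

```python
def mint_pairs(cash, yes_outstanding, no_outstanding, dollars):
    """Take `dollars` in and create that many YES/NO pairs.

    Returns the updated (cash, yes_outstanding, no_outstanding).
    """
    return cash + dollars, yes_outstanding + dollars, no_outstanding + dollars

cash, yes_total, no_total = 0.0, 0.0, 0.0
for deposit in (50.0, 1.0, 12.5):   # arbitrary example deposits
    cash, yes_total, no_total = mint_pairs(cash, yes_total, no_total, deposit)

# Whichever way the market resolves, the payout owed equals the cash collected.
payout_if_yes = yes_total * 1.0
payout_if_no = no_total * 1.0
print(cash, payout_if_yes, payout_if_no)
```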

From this, we can infer that any valid formula must have certain properties: buying YES must raise P(YES), the probability must depend on inventory ratios (when Duncan holds a lot of NO, the probability is high because it means he's sold a lot of YES), and YES shares should always cost less than $1, except when the market is at 100%, and vice versa. Since 0 and 1 aren't probabilities, this should never happen.

A Natural Probability Formula

Given these constraints, you might come up with this formula for deriving the probability from Duncan's inventory (and thus the prices of YES and NO):

P = \frac{n}{y + n}

where $y$ is Duncan's YES inventory and $n$ is Duncan's NO inventory.

When $y = n$ (such as when the market is initialized), the probability is 50%
If Duncan fully runs out of YES shares, the probability is 1, meaning you can't profit from buying YES anymore and you can buy NO for free.
If Duncan fully runs out of NO shares, the probability is 0.

This formula seems to satisfy all of our desiderata, and is fairly intuitive. Since P(YES) is the price of yes, we now know how to price our shares.
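As a quick sanity check, here is the rule P(YES) = n / (y + n) as a small Python function, evaluated at the three cases listed above. The function name is just a label for this sketch.

```python
def probability_yes(yes_inventory: float, no_inventory: float) -> float:
    """P(YES) = n / (y + n), from Duncan's YES (y) and NO (n) inventory."""
    total = yes_inventory + no_inventory
    if total <= 0:
        raise ValueError("Duncan must hold at least some shares")
    return no_inventory / total

print(probability_yes(50, 50))    # equal inventory
print(probability_yes(0, 100))    # Duncan is out of YES
print(probability_yes(100, 0))    # Duncan is out of NO
```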

Discrete Shares

If Duncan has 50 YES and 50 NO shares, probability is 50%, so shares cost $0.50 each.

You give Duncan $1, and tell him you want to buy YES.

YES costs $0.50, so $1 buys 2 YES shares
He mints 1 YES + 1 NO (inventory: 51 YES, 51 NO)
Duncan gives you 2 YES shares in exchange (inventory: 49 YES, 51 NO)
New probability: 51/(49+51) = 51%

Another example. Duncan has 100 YES and 50 NO:

Probability: 50/150 = 33.33%
Price per YES: $0.33
Your $1 buys 3 YES shares
He mints $1 of shares (inventory: 101 YES, 51 NO)
He gives you back 3 YES: (inventory: 98 YES, 51 NO)
New probability: 51/149 = 34.23%

You might have noticed the problem already: Duncan isn't accounting for how the purchase itself affects the price.

When you buy multiple shares at once, you're getting them all at the initial price, but each share you buy should be more expensive than the last! You get a discount on bulk purchases!
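The size of that discount is easier to see with a larger trade. The sketch below applies the same fixed-price rule as the worked examples; the function name is hypothetical, and the $50 trade is an extra illustration that is not part of the original examples.

```python
def naive_buy_yes(yes_inv, no_inv, dollars):
    """Sell YES at a single fixed price, as in the worked examples.

    Returns (shares_sold, new_yes_inv, new_no_inv, new_probability).
    """
    price = no_inv / (yes_inv + no_inv)        # price = P(YES) before the trade
    shares_sold = dollars / price              # every share at the same price
    yes_inv = yes_inv + dollars - shares_sold  # mint pairs, then hand over YES
    no_inv = no_inv + dollars
    new_probability = no_inv / (yes_inv + no_inv)
    return shares_sold, yes_inv, no_inv, new_probability

# The $1 trade from the first example: 2 shares, probability moves to 51%.
print(naive_buy_yes(50, 50, 1))

# A $50 trade at the same fixed price: Duncan hands over all 100 of his
# YES shares at $0.50 each, and the probability jumps straight to 100%.
print(naive_buy_yes(50, 50, 50))
```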

Duncan could solve this by selling shares one at a time or even fractions of a share at a time, adjusting the price after each infinitesimal sale. But this is computationally expensive and assumes shares are discrete units rather than infinitely divisible.
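Here is a sketch of that slice-by-slice approach, under the assumption that each slice is priced at the probability right after its pairs are minted. The function name and step counts are arbitrary choices for this illustration.

```python
def buy_yes_in_steps(yes_inv, no_inv, dollars, steps):
    """Spend `dollars` on YES in `steps` equal slices, repricing after each.

    Returns the total number of YES shares received.
    """
    slice_dollars = dollars / steps
    total_sold = 0.0
    for _ in range(steps):
        # Mint one YES/NO pair per dollar in this slice.
        yes_inv += slice_dollars
        no_inv += slice_dollars
        # Price this slice at the current probability, then hand over YES.
        price = no_inv / (yes_inv + no_inv)
        sold = slice_dollars / price
        yes_inv -= sold
        total_sold += sold
    return total_sold

for steps in (1, 10, 1000):
    print(steps, buy_yes_in_steps(50, 50, 1, steps))
```

With a single slice the $1 trade yields the 2 shares from the example above. As the slices get finer, the total settles a little below that, at roughly 1.98 shares.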

For a continuous equation, we need to use calculus and solve a differential equation.

The Calculus of Market Making

(warning: differential equations)

Let's formalize this problem. Suppose Duncan starts with $y_{s}$ YES shares and $n_{s}$ NO shares. You deposit $m$ dollars and buy YES from Duncan.

After the trade:

Duncan has minted $m$ new shares of each type
NO inventory: $n = n_{s} + m$
YES inventory: $y = y_{s} + m - sold$

where "sold" is the quantity of YES shares Duncan gives to the trader. (In this context, s stands for "starting".)

The market probability at any point is:

P = \frac{n}{y + n}

Substituting our inventory formulas:

P = \frac{n_{s} + m}{(y_{s} + m - sold) + (n_{s} + m)} = \frac{n_{s} + m}{y_{s} + 2 m + n_{s} - sold}

Since we're obeying the constraint price equals probability, the rate at which Duncan sells you shares is determined by the current probability.

The trader deposits money at rate $d m$ and receives shares at rate $d (sold)$ . The price per marginal share is $\frac{d m}{d (sold)}$ . Since we want the price to be the probability, we get:

\frac{d m}{d (sold)} = \frac{n_{s} + m}{y_{s} + 2 m + n_{s} - sold}

Since we're taking money as our input, we take the reciprocal:

\frac{d (sold)}{d m} = \frac{y_{s} + 2 m + n_{s} - sold}{n_{s} + m}

This is our initial differential equation. I encourage you to try to solve it on your own, but if you don't know calculus or get stuck, the solution is enclosed below.

\frac{d (sold)}{d m} = \frac{y_{s} + 2 m + n_{s}}{n_{s} + m} - \frac{sold}{n_{s} + m}

\frac{sold}{n_{s} + m} + \frac{d (sold)}{d m} = \frac{y_{s} + 2 m + n_{s}}{n_{s} + m}

Multiply both sides by $n_{s} + m$ :

sold + \frac{d (sold)}{d m} \cdot (n_{s} + m) = y_{s} + 2 m + n_{s}

\int (sold + \frac{d (sold)}{d m} \cdot (n_{s} + m)) \cdot d m = \int (y_{s} + 2 m + n_{s}) \cdot d m

Observe that $\frac{d}{d m} (n_{s} + m) = 1$ , so by the product rule the integrand on the left is exactly $\frac{d}{d m} [sold \cdot (n_{s} + m)]$ . Integrating both sides, then:

sold \cdot (n_{s} + m) = m y_{s} + m^{2} + m n_{s} + C

sold = \frac{m (y_{s} + m + n_{s}) + C}{n_{s} + m}

sold = m + \frac{m y_{s} + C}{n_{s} + m}

$sold (0) = 0$ , since if you spend no money you don't get any shares. If you plug in $m = 0, sold = 0$ and solve for $C$ , you get $C = 0$ , so we can just drop that term.

sold = m + \frac{m y_{s}}{n_{s} + m}

Since $n$ is just $n_{s} + m$ and $y$ is $y_{s} + m - sold$ , we get:

y = y_{s} + m - m - \frac{m y_{s}}{n_{s} + m}

y = y_{s} - \frac{m y_{s}}{n_{s} + m}

You might notice the term $n_{s} + m$ shows up in the denominator of a term of $y$ , and is equivalent to $n$ . If you multiply $y$ and $n$ together, you get:

y n = y_{s} (n_{s} + m) - m y_{s} = y_{s} n_{s} + m y_{s} - m y_{s}

y n = y_{s} n_{s}

The product of Duncan's YES and NO shares remains constant, regardless of the trade!^[1]
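For readers who would rather check this numerically than follow the algebra, here is a small sketch. The starting inventory (100 YES, 50 NO) is borrowed from the earlier example, and the trade sizes and function names are arbitrary choices for this illustration.

```python
def shares_sold(y_s, n_s, m):
    """Closed-form solution: sold = m + m * y_s / (n_s + m)."""
    return m + m * y_s / (n_s + m)

def inventory_after(y_s, n_s, m):
    """Duncan's (YES, NO) inventory after a trader spends m dollars on YES."""
    sold = shares_sold(y_s, n_s, m)
    return y_s + m - sold, n_s + m

y_s, n_s = 100.0, 50.0
for m in (0.0, 1.0, 10.0, 250.0):
    y, n = inventory_after(y_s, n_s, m)
    # The product should match y_s * n_s = 5000 up to floating-point rounding.
    print(m, y * n)

# Finite-difference check that the marginal price dm/d(sold) equals n / (y + n).
m, h = 10.0, 1e-6
marginal_price = h / (shares_sold(y_s, n_s, m + h) - shares_sold(y_s, n_s, m))
y, n = inventory_after(y_s, n_s, m)
print(marginal_price, n / (y + n))
```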

Constant Product Market Maker

Thus, we've discovered the fundamental invariant:

y n = k

where $k$ is a constant determined by Duncan's initial inventory. Because YES * NO is always constant, we call this a Constant Product Market Maker (CPMM).

So Duncan, knowing this, has determined an algorithm for pricing shares:

Receive money from trader
Mint YES and NO shares
Give out exactly enough YES shares (or NO shares, depending on what the trader wants) to maintain the constant product $y \cdot n = k$

Here's an example of this in practice:

Duncan starts out by initializing a market with $50 of liquidity. (Initial inventory: 50 YES, 50 NO)
He solves for his constant product, which needs to remain invariant. $k = 50 \cdot 50 = 2500$
You bet $50 on YES. Duncan uses this to mint more shares. (Inventory: 100 YES, 100 NO)
He now needs to pay out enough YES shares that he reaches his constant product again. $y n = k$ , solve for $y$ .
$y = \frac{k}{n}$
Plug in NO and $k$ . $y = \frac{2500}{100} = 25$
He has 100 YES, and needs to have 25 YES, so he gives you 75 YES shares in exchange for your $50. (Inventory: 25 YES, 100 NO)
The new probability is $\frac{n}{y + n} = \frac{100}{25 + 100} = 80\%$ .
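The steps above can be sketched in a few lines of Python. This is a minimal illustration with no fees or liquidity parameters, not how any particular platform implements it; the class and method names are invented here.

```python
from dataclasses import dataclass

@dataclass
class CPMM:
    yes: float  # Duncan's YES inventory
    no: float   # Duncan's NO inventory

    @property
    def probability(self) -> float:
        return self.no / (self.yes + self.no)

    def buy_yes(self, dollars: float) -> float:
        """Spend `dollars` on YES; returns the YES shares paid out."""
        k = self.yes * self.no      # invariant before the trade
        self.yes += dollars         # mint one YES/NO pair per dollar
        self.no += dollars
        target_yes = k / self.no    # YES inventory that restores y * n = k
        paid_out = self.yes - target_yes
        self.yes = target_yes
        return paid_out

    def buy_no(self, dollars: float) -> float:
        """Mirror image of buy_yes for a trader who wants NO shares."""
        k = self.yes * self.no
        self.yes += dollars
        self.no += dollars
        target_no = k / self.yes
        paid_out = self.no - target_no
        self.no = target_no
        return paid_out

market = CPMM(yes=50, no=50)
shares = market.buy_yes(50)
print(shares, market.yes, market.no, market.probability)
```

Running this reproduces the example: 75 YES shares paid out, an inventory of 25 YES and 100 NO, and a probability of 80%.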

Meanwhile, if a trader wants to sell shares, it's similarly simple: Duncan adds the shares to his inventory, figures out how many YES + NO pairs he needs to give up in order to reach the constant product again, redeems those pairs for $1 each (removing them from circulation), and gives the cash to the trader. Alternatively, and perhaps more elegantly, the trader can simply buy the opposite share and then give pairs to Duncan in exchange for cash.
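Working out how many pairs to redeem means solving a quadratic. Here is a sketch of the first approach (selling directly back to the pool), with a hypothetical function name; it reverses the trade from the example above.

```python
import math

def sell_yes(yes_inv, no_inv, shares):
    """Trader sells `shares` YES back to the pool.

    Returns (cash_paid, new_yes_inv, new_no_inv). `cash_paid` is the number
    of YES/NO pairs redeemed at $1 each, chosen so that y * n is unchanged.
    """
    k = yes_inv * no_inv
    yes_after_deposit = yes_inv + shares
    # Solve (yes_after_deposit - c) * (no_inv - c) = k for the smaller root c.
    total = yes_after_deposit + no_inv
    c = (total - math.sqrt(total**2 - 4 * (yes_after_deposit * no_inv - k))) / 2
    return c, yes_after_deposit - c, no_inv - c

# Undo the example trade: sell the 75 YES shares back into (25 YES, 100 NO).
print(sell_yes(25, 100, 75))
```

Selling the 75 shares straight back returns the original $50 and restores the 50/50 inventory, as you would hope in a market with no fees.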

(Note that, since Duncan's inventory is inversely related to the market probability, that means Duncan pockets a lot of money from traders when the market resolves counter to expectations, and loses more of his initial liquidity the more confident a correct market is.)
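To put numbers on this, here is a small sketch using the example market above, assuming Duncan funded the initial $50 of liquidity out of his own pocket. The function name is hypothetical.

```python
def maker_profit(initial_liquidity, final_yes_inv, final_no_inv, resolves_yes):
    """Duncan's profit at resolution, assuming he funded the liquidity himself.

    All cash taken in backs minted pairs, so after paying traders Duncan is
    left with $1 for each winning share still in his own inventory.
    """
    winning_inventory = final_yes_inv if resolves_yes else final_no_inv
    return winning_inventory - initial_liquidity

# Example market after the $50 YES bet: Duncan holds 25 YES and 100 NO.
print(maker_profit(50, 25, 100, resolves_yes=True))    # market was right
print(maker_profit(50, 25, 100, resolves_yes=False))   # market was wrong
```

In this example Duncan loses $25 of his liquidity if the 80% market resolves YES, and gains $50 if it resolves NO.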

In fact, this process can be fully automated, creating an Automated Market Maker (AMM). This is the foundation of Uniswap, and many prediction market protocols.

Conclusion

Starting from basic constraints about prediction markets (prices sum to 1, price equals probability), we derived a unique solution. We didn't just arbitrarily choose the CPMM out of a list of options. It emerged, inexorably, from the requirements we placed.

When you properly formalize a problem with the right constraints, there's often exactly one correct answer. Independent researchers, solving similar problems with similar constraints, will converge on the same solution. When Newton and Leibniz invented calculus, they didn't get similar results because they shared their work, or because they were working on the same problem (they were working in very different fields). They got similar results because they were working on a class of problems with the same underlying structure, even if the similarities are not obvious at first.

The market itself does Bayesian updating—on expectation, as more people trade, the probability approaches the true likelihood, based on the accumulated knowledge of the traders. Our pricing mechanism had to respect this Bayesian structure. The constant product formula isn't arbitrary; it's what you get when you correctly formalize "each marginal share should cost the current probability" in continuous terms. While this isn't an empirical fact about the territory, the laws of probability nevertheless have carved out a unique shape in design space, and your map had better match it.^[2]

(This is especially obvious in the context of a prediction market (which is, in a certain sense, the purest form of market, separating the trading and aggregating of information from everything else), but it applies to markets and AMMs in full generality, being used in DeFi and Crypto space.)

^[1] If you don't know calculus, this is the important part.
^[2] OK, I'm completely overstating my case here, and these paragraphs are largely a joke. There are other solutions to this problem if you pick different probability functions matching these desiderata, or come at prediction market design from a different angle, and many of them have their own pros and cons; Hanson explicitly wrote about Constant Function Market Makers. It's just that this one is very intuitive and has useful properties for a purely probabilistic YES/NO market, which is why I wrote about it.

I've seen similar derivations before, but it's been a few years since I looked at AMMs in detail. I've spent some time recently looking at mechanism design for prediction markets, so this is a timely reminder!

Three questions --
Would you agree that this captures your main conclusion for a binary prediction market:
CPMM "price = probability"?

I seem to recall that CPMMs easily generalize to multiple assets. Instead of $a b = a' b'$ you have $a b c = a' b' c'$ and so on. Do you happen to know if that matches the generalization of your prediction market with binary outcome to one with categorical outcome?

You mention in your 2nd footnote that there are "different probability functions matching these desiderata". Care to say more on this?

If you could link me to these similar derivations I'd be interested to read them, I mostly wrote and worked through this because I couldn't find any existing ones from first principles and was sure it would be possible.

price = probability is a general rule for prediction markets, it's more that CPMMs can be derived from the probability function we described (no/(no+yes)).
The generalization I'm using in my current implementation is https://manifoldmarkets.notion.site/Multi-CPMM-62fe5b99013c4d5a87dfa84e0b8fa642, although I believe Manifold currently uses some bizarre auto arbitrage system between linked binary predictions for exclusive categories. Similarly, you can also extend CPMM by adding a parameter to allow market initialization at different probabilities, and also to allow users to inject liquidity without pushing probability towards 50%.
Regarding other probability functions, there is of course a whole family of constant function market makers that CPMM is a member of. As a trivial example, (no^2/(no^2+yes^2)) should also match our desiderata, I believe.
Additionally, starting from the angle of "We have a market maker with pools of shares, how do they calculate probability from these pools" is just one approach you can take.
There's also the LMSR (Logarithmic Market Scoring Rule) also developed by Robin Hanson, which approaches it from an entirely different angle, starting from asking how you can score predictors based on how well they performed, and then applying this to rewards and incentive alignment in a prediction market. This is actually more reflective of the Bayesian structure of the market than CPMM is, I was largely joking when I made that claim.
There's also DPM (Dynamic Parimutuel), which adapts existing parimutuel betting systems to prediction markets. It does have the disadvantage of not being able to know ahead of time how much money you'll receive from your bet, only how much money you'll receive in expectation from your bet, but it has some advantages of its own.
I have this paper saved to read through and think about, I don't really understand it yet but it also proposes a unique solution to this problem.
Largely, CPMM is the one I understand most intuitively out of all of these, which is part of why I'm using it in my personal prediction market implementation.

Thanks for the questions!

Edit: After encountering some problems I have since done more research. Multi-CPMM is dumb and bad and Multi-Binary (manifold's current implementation) is superior in every regard actually. I am signing the manifold markets apology form (reason for behavior: thought the decision was for architectural reasons, was repelled by the auto arbitrage nature). I will hereby respect Manifold Markets and I will NOT talk down on the greatest prediction markets platform of all time.

I will try to dig up some references for you. Sorry it really was a small side project and has been several years.

Ah so I can't imagine a probability function for that market that isn't . $\frac{y^{2}}{y^{2} + n^{2}}$ is a fine pricing function that doesn't appear to adhere to the rules of probability theory. If I try to compose two $\frac{y^{2}}{y^{2} + n^{2}}$ markets, one conditional on the other, then can I multiply their prices to find the joint probability? Does this violate "price=probability"?

"price=probability is a general rule for prediction markets" is a very interesting claim. Seems obvious, but then you have to ask yourself whose probability?

I'm familiar with some of those operations (I've skimmed the Boyd paper). Unfortunately, there are a lot of different ways of expressing the same constraints, so I can't immediately tell whether Manifold's implementation is equivalent to what I had imagined.

Thanks for your answers, I'll look into some of the other ideas you referenced.

What I mean is that if the AMM estimates the probability at .75, it should charge .75 for a marginal YES share, by the law of expected utility. I don't think a different probability function should alter the probability theory; it should just change the pricing curve.

If "price=probability", then changing the pricing curve is equivalent to changing how the AMM updates its probability estimates (on evidence of buy/sell orders).

Yes, but it just affects how liquidity is allocated, and it doesn't just affect how the AMM updates, it affects how users trade as well since they respond to that, either way they'd want to bet to their true probability. So changing the pricing curve is largely a matter of market dynamics and incentives, rather than actually affecting the probabilistic structure.
