Robin Hanson's Futarchy is a proposal to let prediction markets make governmental decisions. We can view an operating Futarchy as an agent, and ask if it is aligned with the interests of its constituents. I am aware of two main failures of alignment: (1) since predicting rare events is rewarded in proportion to their rareness, prediction markets heavily incentivise causing rare events to happen (I'll call this the *entropy-market problem*); (2) it seems prediction markets would not be able to assign probability to existential risk, since you can't collect on bets after everyone's dead (I'll call this the *existential risk problem*). I provide three formulations of (1) and solve two of them, and make some comments on (2). (Thanks to Scott for pointing out the second of these problems to me; I don't remember who originally told me about the first problem, but also thanks.)

## Futarchy and Agents

My motivation here is twofold. On the one hand, Futarchy is something which many people are enthusiastic about. There is a current effort to implement it for small-to-medium-size group coordination. Futarchy is just the idea of using prediction markets to make decisions that matter, and decision markets are already a thing. So, from one perspective, we're already on our way there. But there are alignment problems which have not been addressed. The misaligned Futarchy problem seems less bad than the misaligned superhuman AI problem. Arguably, we are already embedded in misaligned markets, and Futarchy seems like an improvement on those. Furthermore, Futarchy is made of people; if a Futarchy becomes too evil, the people can just stop, right?...

Well, one way or the other, it seems like it would be a good idea to go ahead and solve the Futarchy alignment problem before there's a problem.

My second reason for being interested in this is as a case-study for alignment. Maybe the Futarchy alignment problem is an easier instance, which could inform us about the AI alignment problem? Or perhaps it could be more directly relevant, informing logical-induction decision theories which look something like Futarchies?

More specifically, the entropy-market problem is an instance of "directionality" in decision theory, as discussed in The Foundations of Epistemic Decision Theory by Jason Konek and Ben Levinstein. The idea is that it's not always beneficial to enforce rationality constraints. A failure of a decision-theoretic constraint suggests some failure of rationality somewhere, but some ways of enforcing the constraint are "backward". For example, Aumann's Agreement Theorem does not imply that you should accept the Modesty Argument; doing so may harm your epistemology rather than improve it. It seems to me that getting directionality right would yield more convincing arguments for decision theory (relating to my puzzling at the end of this post), since it would explain when enforcing a constraint should be compelling. More importantly, perhaps, it may yield insight into the sort of directionality we need to extend causal decision theory to logical counterfactuals.

So, how does a prediction market get directionality wrong?

## Entropy-Market Problem

In a typical prediction market, if you can reliably predict an event which the market assigns probability $p$, you stand to make $1/p$ on your initial investment. This means potentially huge returns for making rare events happen. This seems like a bad thing by default; it is easier to commit unpredictable acts of destruction than of creation. Effectively, a prediction market is also a market for purchasing chaos and havoc in the world; the one who can make the unthinkable happen has a fat paycheck waiting for them.
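To make the asymmetry concrete, here is a minimal sketch. It assumes an unmodified market in which a share of an event at market probability `p` costs `p` and pays 1, so a trader who can force the event earns a net ROI of `(1 - p) / p`. The function names are illustrative and not part of any market API.

```python
def roi_force(p: float) -> float:
    """Net ROI of buying 'yes' at price p and then making the event happen."""
    return (1.0 - p) / p


def roi_prevent(p: float) -> float:
    """Net ROI of buying 'no' at price 1 - p and then preventing the event."""
    return p / (1.0 - p)


if __name__ == "__main__":
    # As p shrinks, forcing the event pays more and more; preventing it pays less and less.
    for p in (0.5, 0.1, 0.01):
        print(f"p={p}: force ROI={roi_force(p):.2f}, prevent ROI={roi_prevent(p):.4f}")
```

The two incentives only coincide at `p = 0.5`; everywhere else the market pays more for pushing toward whichever outcome it considers less likely.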

I offer three possible formalizations of the problem. All three rest on the idea of *asymmetric incentives to cause unlikely events*, but give different notions of "symmetric incentives" to aim for.

- The differential value of being able to make an event slightly more likely is the same as that of making the event slightly less likely.
- The value of being able to make an event happen is always the same as the value of preventing it.
- The value of being able to add $x$ to the log-odds of the event is the same as the value of subtracting $x$.

In all three cases, I am ignoring the question of how difficult the event is to cause or prevent. Instead, I am trying to equalize the ROI of manipulating the "true" probability of an event. This does not *eliminate* the value of betting on events which you can personally manipulate -- the "directionality" problem is not totally solved. However, both of my solutions at least don't *give arbitrarily high ROI* to manipulators. (As a consequence, they also avoid giving arbitrarily high ROI to legitimate prediction of events the market doesn't expect, which is unfortunate. The ROI can still be decent, though; the market is still incentivised to achieve calibrated probabilities overall.)

The first interpretation is based on the intuition that what's more important is the incentive with respect to shifting probabilities a little bit. We can never cause or prevent an event with *complete* certainty. So, perhaps it's better to be sure that little shifts in the probability of an event aren't incentivised one way or the other.

The second interpretation seems like the most obvious. It is based on the intuition that what we want to avoid incentivising is someone stepping in and forcing an unlikely event to happen. Put simply, we want a would-be assassin to have the same incentive as a would-be bodyguard, rather than infinitely more.

The third interpretation is somewhere between the two. It could be justified by the intuition that it's probably about as difficult to double the odds of an event as cut them in half. Unlike the first or second interpretation, it explicitly takes into account that it's likely a lot harder to move probabilities which are already near 1 or 0. I haven't figured out whether this interpretation is achievable yet, but I liked it enough that I wanted to include it.

### Solution 1: Equalized Differential ROI

How do we change the ROI of the market? The method I'll employ is to lock away some money with a transaction. For example, normally a share of an event which the market assigns probability 0.1 would cost 10¢ and pay out $1 on success or $0 on failure. We can add an artificial cost of 50¢; now, it costs 60¢, pays $1.50 on success, and pays back your 50¢ on failure. The ROI of these transactions goes down because the cost of initial investment goes up. Suppose you had assigned probability 0.2 to the event in our example. For the unmodified bet formula, the expected value we assign to the bet would be 20¢, at 10¢ cost, so ROI = 1. (ROI is net income / investment.) In the new scheme, the expected value is 70¢ -- but at a cost of 60¢. So, the new ROI is 1/6. I will call this extra money set aside a "reserve".
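The arithmetic in this example can be checked with a short sketch. The helper below is hypothetical (the name and signature are mine); it assumes the reserve is added to the purchase price and returned in full whether or not the event happens.

```python
def roi_with_reserve(market_p: float, believed_p: float, reserve: float = 0.0) -> float:
    """ROI of one share, as judged by a trader who believes the event has probability believed_p."""
    cost = market_p + reserve
    # The share pays 1 on success; the reserve comes back on success and on failure.
    expected_value = believed_p * (1.0 + reserve) + (1.0 - believed_p) * reserve
    return (expected_value - cost) / cost


if __name__ == "__main__":
    print(roi_with_reserve(0.1, 0.2))               # unmodified bet
    print(roi_with_reserve(0.1, 0.2, reserve=0.5))  # bet with a 50-cent reserve
```

The expected net income is 10 cents in both cases; only the amount of money tied up changes, which is what pulls the ROI down from 1 to 1/6.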

Now, suppose we add a reserve of $1-2p$ for $p < 0.5$, and $0$ otherwise.

If the market assigns probability $p$ to an event (say $p < 0.5$), and we can do something to make that event happen with probability $p+x$ ($x$ positive), then we can pay $1-p$ (the price $p$ plus the reserve $1-2p$) for a contract to which we assign expected worth $1-p+x$. The ROI is $\frac{x}{1-p}$.

On the other hand, if we can make the event *less* probable, we would take the other side of the bet: pay $1-p$ for a bet which pays off \$1 if the event doesn't happen. The expected value for us is $1-p+x$, for an ROI of $\frac{x}{1-p}$.

Obviously, this gets me the differential property I wanted; whether increasing the probability of the event or decreasing it, the limit of ROI/$x$ as $x$ gets very small is $\frac{1}{1-p}$. This is, of course, because the function is the same in both directions: we've got the much stronger property that adding or subtracting $x$ is always equally valuable. I didn't put this property on my list of problem formulations, though, since it doesn't strike me as intuitively appealing; adding and subtracting $x$ are not intuitively similarly difficult when $p$ is close to 1 or 0, *except* when $x$ is very small.
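As a sanity check on the symmetry claim, here is a minimal sketch. It assumes the reserve rule is "1 - 2p when a share is priced below 0.5, otherwise 0", applied separately to each side of the bet; the function names are illustrative.

```python
def reserve(price: float) -> float:
    """Assumed reserve rule: 1 - 2*price below 0.5, nothing otherwise."""
    return max(0.0, 1.0 - 2.0 * price)


def roi_shift(p: float, x: float, direction: str) -> float:
    """ROI for a trader who can move the true probability from p to p + x ("up") or p - x ("down")."""
    if direction == "up":
        price, win_prob = p, p + x                  # buy the event itself
    else:
        price, win_prob = 1.0 - p, 1.0 - p + x      # buy the opposite side
    r = reserve(price)
    cost = price + r
    expected_value = win_prob + r                   # 1 on a win, reserve returned either way
    return (expected_value - cost) / cost


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.7):
        print(p, roi_shift(p, 0.05, "up"), roi_shift(p, 0.05, "down"))
```

Under this rule both sides of the bet cost `max(p, 1 - p)`, so the ROI is `x / max(p, 1 - p)` in either direction.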

Another issue with this solution is that setting aside a reserve to modify the ROI is an artificial warping of the market, which can be easily corrected by external creditors. You will receive the reserve back with certainty; so, a lender might happily lend you up to the quantity you've got in reserve, for a small fee. This problem is somewhat corrected in the next proposed solution.

### Solution 2: Equalized Event-Forcing ROI

We've seen that we can artificially *decrease* ROI (fragile as the method is when external lenders arrive on the scene). Can we artificially *increase* ROI? Yes, but there is a cost. We can artificially decrease the cost of an investment by offering leverage. For example, suppose we offer leverage of half the cost. Where I would normally pay $p for a contract paying out $1 on an event, now I pay $p/2; the contract pays out $1-p/2, and if the event I bet on *doesn't* happen, I still need to pay back the $p/2.

We can extend the solution from the previous problem by interpreting negative reserve as leverage. Again, we use the formula $1-2p$ for the reserve (leverage when negative). The cost of a contract is now $1-p$ in all cases, and the payout is $2-2p$ if the event happens (and $1-2p$ if it doesn't). If we can make the event happen, the ROI is *always 100%*, no matter what probability the market assigns! Obviously, if we can *prevent* the event from happening, we just take the opposite side of the bet, which *also* gives an ROI of 100%.

Once again, I got more than I asked for: not only are the values of making an event happen or preventing it identical to each other, they're also the same for all values of $p$. This means there's no special incentive for manipulation as $p$ gets small (which might be concerning even if the manipulation were even-handed with respect to causing and preventing the event).
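A minimal sketch of this scheme, assuming the signed reserve is `1 - 2p` for a share priced at `p` (a positive value is locked away, a negative value is leverage that must be repaid on a loss). The function names are mine.

```python
def contract(price: float) -> tuple[float, float, float]:
    """Return (cost, payout_if_win, payout_if_lose) for a share priced at `price`."""
    r = 1.0 - 2.0 * price      # reserve if positive, leverage if negative
    cost = price + r           # always 1 - price
    payout_win = 1.0 + r       # always 2 - 2*price
    payout_lose = r            # 1 - 2*price; a debt when negative
    return cost, payout_win, payout_lose


def roi_force(p: float) -> float:
    """ROI of buying the event at market probability p and then making it happen."""
    cost, win, _ = contract(p)
    return (win - cost) / cost


def roi_prevent(p: float) -> float:
    """ROI of buying the opposite side (priced at 1 - p) and then preventing the event."""
    cost, win, _ = contract(1.0 - p)
    return (win - cost) / cost


if __name__ == "__main__":
    for p in (0.01, 0.3, 0.9):
        print(p, roi_force(p), roi_prevent(p))
```

The sketch avoids `p = 0` and `p = 1`, where one side of the bet would cost nothing and the ROI is undefined.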

An added bonus of this scheme is that we can dump the risk of leveraging bets on $p > 0.5$ on the people taking the $1-p$ side of the bet: the reserve of the other side of the bet is always the amount we need for leverage, so why not use that? It's not a very nice thing to do to the people betting on unlikely events, granted, but it solves the problem mentioned in the previous section: now the lenders will be more hesitant to loan money based on what's in reserve. This makes my warped market incentives more robust to external forces.

Half the people in the market will be unhappy comparing the situation to a normal betting market, but the other half will be happy about the free leverage and increased ROI. So, this market won't automatically be out-competed by alternate betting markets. Granted, any solution of the kind I'm exploring here works by blocking some contracts which people would be happier to make than the ones offered. Perhaps, for a betting market tied to a futarchy, people will be understanding of the need to avoid incentivising disorder.

Obviously, there are more formulas and desirable properties one could play around with. (The reserve formula is somewhat fun.) Since it *is* a futarchy, the system could even adapt to purposefully increase ROI of betting on events which we'd like to happen, and decrease the ROI of the reverse; I haven't explored this yet, but it seems even more promising.

## Forecasting Doom

What of the second problem I mentioned at the beginning, the existential risk problem? It would be a sad irony if the aspiring rationalist community worked hard to put a futarchy in place only to find that the new government *still* systematically under-allocated resources to deal with existential risk.

The inability of a prediction market to forecast existential risk is a special case of an inaccuracy in forecasting events correlated with changes in the value of money. If money will predictably be worth half as much in a certain outcome, the market underestimates the odds of that outcome by that very ratio, just as if the bet contracts failed to pay out half the time. (Observation due to Scott.) Actually, though, *that* problem is not so hard to solve. We *know* exactly how this effect skews predictions about the value of money itself; so, we can adjust the market-estimated probabilities accordingly. Similarly, we can adjust for the skew on an arbitrary event by looking at conditional bets on the value of money given that event. (If such conditional bets are not traded enough, we might assume no adjustment is needed, or perhaps do something more clever.) None of this changes how the market itself operates; it's all an "overlay" to let you see the true probability estimates implied by the market.

Unfortunately, this doesn't do much good for existential risk. Bets on the value of money can't be re-scaled appropriately when that value is zero.
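The overlay, and the way it breaks down, can be sketched as follows. This assumes a simple model in which the market price of each outcome is proportional to its true probability times the relative value of money in that outcome; the function name and the dictionary layout are illustrative only.

```python
def adjust_for_money_value(market_probs: dict[str, float],
                           money_value: dict[str, float]) -> dict[str, float]:
    """Undo the skew by dividing each price by the value of money in that outcome, then renormalising."""
    weights = {}
    for outcome, price in market_probs.items():
        value = money_value[outcome]
        if value <= 0:
            # Payouts are worthless in this outcome, so the price carries no information to rescale.
            raise ValueError(f"cannot rescale outcome {outcome!r}: money is worth nothing there")
        weights[outcome] = price / value
    total = sum(weights.values())
    return {outcome: w / total for outcome, w in weights.items()}


if __name__ == "__main__":
    # Money is worth half as much under "inflation", so the market shows 1/3 where the implied estimate is 1/2.
    market = {"inflation": 1 / 3, "normal": 2 / 3}
    value = {"inflation": 0.5, "normal": 1.0}
    print(adjust_for_money_value(market, value))
```

The explicit error for a zero value of money is the existential-risk case: there is nothing to divide by.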

However, I think this problem is also fixable. In fact, logical-inductor decision theories don't run into any problem like this. Granted, the agents there aren't using money for anything *but* the prediction market, so are easier to give purely epistemic concerns. However, I don't think that's the best thing to point to. A more important factor is that a logical induction market doesn't adjudicate its own destruction by waiting for it to happen. It adjudicates with proofs instead. It can see versions of itself die in many alternate worlds, and to the extent it knows which world it resides in, it may see itself die as well.

We can adjudicate doom before doom actually happens. We "simply" need a sufficiently trusted underlying adjudication system which is capable of estimating the death toll taken by nuclear missiles, plagues, asteroids, nanotech disasters, and AIs and all variety of conditional bets about the logic of such events.

I think many people think of prediction markets as the last word in knowledge aggregation, but the need for an underlying knowledge system which is slower but more trusted is nothing new. If you were to make bets on fundamental physics, you would not make a contract which hinges directly on a property of a particular particle. Instead, you would have to bet on the appearance of a peer-reviewed article in a sufficiently trusted venue with such-and-such a result, or something of that kind. A prediction market can only adjudicate bets on a direct fact in so far as the fact is accessible to it -- represented in data accessible by the computer. Beyond that, you need witnesses, data-entry, *something* to get the knowledge to the system in a trusted way. The advantage of the prediction market is that it can tell you what to expect *before* the trusted knowledge system does.

Adjudicating existential risk is hard. A Futarchy would likely need to leave the question of how to do it somewhat open, perhaps first installing a very minimal system and developing better options over time. Coming up with high-quality underlying knowledge systems to do the adjudication seems like the direction to go, though. Fortunately, in many cases, we can solve this well with an open market -- different bets can hinge on different knowledge systems, based on the whims of those making the contracts. When it comes to the critical bets for a Futarchy's decision-making, though, a top-down choice must be made as to how to adjudicate.

The property of futarchy that I really don't like is the fact that one person with a lot of money can bet on "Policy X will lead to bad outcome Y," causing policy X to never be tried in the first place, and all of that person's money to be refunded, allowing them to make the same bets next time.

This may or may not be a problem in practice, but I would really like to see a good fix for it in theory.

This problem is what causes the failure to take the 10 in the 5 and 10 problem described here. One trader in the logical inductor can say that taking the 10 will lead to 0 utility, and then get all his money back, because the markets conditional on taking the 10 never get resolved. (I often refer to this problem as "the futarchy exploit.")

The only way I see to get around this is:

It would be interesting to see this style of solution fleshed out, to see exactly how strong the assumptions have to be in order to avoid trouble.

The analog of EXP3 is to have investors put their money on policies (rather than predictions about policy outcomes), to pick each policy with probability proportional to the amount of money behind it, and then to take money away from the people who financed the chosen option based on how badly it performs relative to the best possible outcome (giving that money to the people who financed the non-chosen options). This prevents you from cheating the system in the way you describe, though it also means that investing is quite risky even if you know *exactly* what is going to happen.

In this analogy, futarchy corresponds to estimating Q values (with a regression loss defined by the market maker you use in the decision markets) and then picking the Q-maximizing action. This can have lower variance but has no guarantees of any kind.

I suspect the optimal thing is to run both kinds of markets in parallel, to use the policy market with the EXP3 rule for picking actions, and to use the decision markets only for variance reduction.
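The policy-market rule described above can be sketched roughly as follows. This follows the prose description only and is not a full EXP3 implementation (there is no importance weighting or learning rate). The `regret` score in [0, 1], the pro-rata redistribution, and all names are assumptions made for illustration.

```python
import random


def choose_policy(stakes: dict[str, dict[str, float]], rng: random.Random) -> str:
    """Pick a policy with probability proportional to the total money staked on it."""
    policies = list(stakes)
    totals = [sum(stakes[p].values()) for p in policies]
    return rng.choices(policies, weights=totals, k=1)[0]


def settle(stakes: dict[str, dict[str, float]], chosen: str, regret: float) -> dict[str, float]:
    """Funders of the chosen policy lose `regret` (0 = best possible, 1 = worst) of their stake.

    The forfeited money is shared pro rata among funders of the non-chosen policies (assumed rule).
    If nobody funded another policy, the forfeited money is simply not paid out in this sketch.
    """
    balances: dict[str, float] = {}
    pool = 0.0
    for investor, amount in stakes[chosen].items():
        loss = regret * amount
        pool += loss
        balances[investor] = balances.get(investor, 0.0) + amount - loss
    others = [(inv, amt) for pol, funders in stakes.items() if pol != chosen
              for inv, amt in funders.items()]
    others_total = sum(amt for _, amt in others)
    for investor, amount in others:
        share = pool * amount / others_total if others_total > 0 else 0.0
        balances[investor] = balances.get(investor, 0.0) + amount + share
    return balances


if __name__ == "__main__":
    stakes = {"A": {"alice": 60.0}, "B": {"bob": 30.0, "carol": 10.0}}
    chosen = choose_policy(stakes, random.Random(0))
    print(chosen, settle(stakes, chosen, regret=0.5))
```

Because money only moves once a policy has actually been tried and scored, there is no way to get a stake refunded by making sure the policy is never chosen.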

I have thought about this a little bit in the context of online learning, and suspect that we can prove an optimality theorem along these lines. It would be nice to see the analogous claim with markets, and the market version would probably be more relevant to alignment. A clear and convincing exposition would also likely be of interest to researchers in RL.

(As usual, this comment is not intended as a land grab, if anyone executes on this idea and it works out it's all theirs.)

In my current way of thinking about futarchy, it seems like the right way to do this is through good adjudication. It passes the buck, just like my assumption in a recent post that a logical inductor had a correct logical counterfactual in its underlying deductive system. But for a futarchy, the situation isn't quite as bad. We could rely on human judgement somehow.

But another alternative for an underlying adjudication system occurred to me today. Maybe the market could be adjudicated via *models*. My intuition is that a claim of existential risk (if made in the underlying adjudication system rather than as a bet) must be accompanied by a plausible model - a relatively short computer program which fits the data so far. A counter-claim would have to give an alternative plausible model which shows no risk. These models would lead to payouts.

This could address your problem as well, since a counterfactual claim of doom could be (partially) adjudicated as false by giving a causal model. (I don't intend this proposal to help for logical counterfactuals; it just allows regular causal counterfactuals, described in some given formalism.) But I haven't thought through how this would work yet.

Now, whether a perfect market should pick up an existential risk signature is different from whether a real market would. The behaviour of the Dow around the Cuban missile crisis isn't encouraging in that regard:

It seems that a rational market should implicitly price existential risk. Imagine a world with no chance of disaster, and a functioning futures market that prices everything relative to everything else.

Then add in a 10% chance of immediate extinction at some specific time t, and assume the market participants know this. Then futures contracts for perishable consumption goods after t should go down 10% (relative to before t), while futures contract for durable consumption goods will change values as a function of their depreciation rate.

This kind of signature should be possible to pick up from the futures prices.

You make a good point. I wonder if there's a nice way to exploit this to make *conditional* risk estimates, consumable by futarchy.

The problem is that this is rational behaviour for a market, but, if it fails, it's not really exploitable.

As in, the only way to profit is if a disaster happens, and then you've just profited a little bit by having a more rational consumption profile, rather than profited arbitrarily.

When modeling the incentives to change probabilities of events, it probably makes sense to model the payoff of changing probabilities of events and the cost of changing probabilities of events separately. You'd expect someone to alter the probabilities if they gain more in expectation from the bets than the cost to them of altering the probabilities. If someone bets on an event and changes the probability that it occurs from $p$ to $q$, then their expected payoff is $\frac{q}{p} - 1$ times their investment. So, in a prediction market in which there are $n$ possible outcomes, the expected payoff you can get from changing the probability distribution from $(p_1, \dots, p_n)$ to $(q_1, \dots, q_n)$ is proportional to $\max_i \left(\frac{q_i}{p_i}\right) - 1$.

The cost of changing a probability distribution seems harder to model, but the Fisher information metric might be a good crude estimate of how difficult you should expect it to be to change the probability distribution over outcomes from one distribution to another.
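Both quantities are easy to compute for a finite set of outcomes. The sketch below uses the Fisher-Rao (geodesic) distance, $2\arccos\left(\sum_i \sqrt{p_i q_i}\right)$, as one way to turn the Fisher information metric into a single number; that choice, and the function names, are my assumptions rather than something specified above.

```python
import math


def manipulation_payoff(p: list[float], q: list[float]) -> float:
    """Best expected payoff per unit invested from moving the distribution p to q: max_i(q_i / p_i) - 1."""
    return max(qi / pi for pi, qi in zip(p, q)) - 1.0


def fisher_rao_distance(p: list[float], q: list[float]) -> float:
    """Geodesic distance between two categorical distributions under the Fisher information metric."""
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return 2.0 * math.acos(min(1.0, bc))  # clamp to guard against rounding above 1


if __name__ == "__main__":
    p = [0.9, 0.1]
    q = [0.8, 0.2]
    print("payoff per unit invested:", manipulation_payoff(p, q))
    print("crude difficulty estimate:", fisher_rao_distance(p, q))
```

Comparing the two numbers for a candidate manipulation gives a rough sense of whether the betting gain is likely to outweigh the effort of moving the distribution.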

I found a paper addressing the entropy-market issue, probably more thoroughly than I do (but I haven't read it yet): http://www.cs.duke.edu/csed/honors/pengshi.pdf