Voting Theory Introduction

Scott Garrabrant

Sequence Introduction

This is the first post in a sequence in which I will propose a new voting system!

In this post, I introduce the framework and notation, and give some background on voting theory.

In the next post, I will show you the best voting system you've probably never heard of, maximal lotteries. (Seriously, it's really good.)

After that, I will make it even better, and propose a new system: maximal lottery-lotteries.

Then comes the bad news: I can't prove that maximal lottery-lotteries exist! (Or alternatively, good news: You can try to solve a cool new open problem in voting theory!)

Thanks to Jessica Taylor for first introducing me to maximal lotteries, and Sam Eisenstat for spending many hours with me trying to prove the existence of maximal lottery-lotteries.

Generalizing Voting Theory

A voting system is a function that takes in a distribution on utility functions on a set of candidates, and produces a distribution on that set of candidates.

This is not what a voting theorist will tell you a voting system is. That is because they like to make a bunch of extra assumptions:

The set of candidates is finite.
The function only uses the preorders on candidates implied by the utility functions.

The output distribution assigns probability 1 to a single candidate.
The input distribution is the uniform distribution on some finite set.

I am going to play along with some of these assumptions, but I want my types and notation to treat them as explicit assumptions, so I will use notation that will make it easy to remove these assumptions as needed.

Technically, I also make one assumption that voting theorists usually don't make. That is that the voting system is homogeneous. Homogeneous voting systems are only a function of what proportion of voters have each preference, not on the absolute number of voters that have each preference. Thus, when I said the input was a distribution on utility functions, rather than a multi-set of utility functions, I threw out the information that would have allowed for non-homogeneous voting systems. I don't know of any seriously proposed voting systems that are non-homogeneous, so this is not a significant extra assumption. Throughout the sequence, I will take for granted that all voting systems are homogeneous.

Assumption 2 is most likely to be violated by voting theorists, as in approval voting and range voting. Next is 3, and there is a small subset of voting theorists who think about non-determinism. Assumption 4 is not very important, and while it is usually made, it also usually does not matter. Assumption 1 is almost never violated, or at least when it is, the field is called something other than voting theory.

A utility function on a set is just a function from $S \to [0, 1]$ . I will write $Δ (S)$ for the set of distributions on the set $S$ . (I'm not going to worry about the sigma algebras; usually it will be obvious/not matter.)

If $f$ is a voting system, $C$ is a set of candidates, and $V \in Δ (C \to [0, 1])$ is a distribution on utility functions on C, I will write $f_{C} (V)$ for the output of the voting system on $V$ , so $f_{C} (V) \in Δ (C)$ .

Some Important Criteria

To get more comfortable with this formalism, we will translate three important voting criteria.

Condorcet Criterion

If a candidate would defeat all others in one-on-one elections, that candidate should win.

Translated to our formalism, $f$ satisfies the Condorcet criterion if whenever there exists a $c \in C$ such that for all $d \in C$ , we have $P_{v \sim V} (v (c) > v (d)) > \frac{1}{2}$ , we have $f_{C} (V) (c) = 1$ .

We can think of $P_{v \sim V} (v (c) > v (d))$ as saying when we randomly choose a voter $v$ , what is the probability that $v$ prefers $c$ to $d$ . If this probability is greater than $\frac{1}{2}$ , then a majority of voters prefer $c$ to $d$ .

Consistency Criterion

If two disjoint electorates would produce the same result, then combined, they should also produce that result.

Translated to our formalism, $f$ is consistent if whenever $f_{C} (V_{0}) = f_{C} (V_{1}) = D$ , we have $f_{C} (p V_{0} + (1 - p) V_{1}) = D$ for all $p \in [0, 1]$ .

Participation Criterion

No voter should be made worse off by voting compared to staying home.

Translated to our formalism, $f$ satisfies the participation criterion if for any $V \in Δ (C \to [0, 1])$ , $v : C \to [0, 1]$ , and $p \in [0, 1]$ , we have $v (f_{C} (p v + (1 - p) V)) \geq v (f_{C} (V))$ .

(We abused notation in two minor ways above. First, we gave $v$ a distribution on $C$ as input. This is to be interpreted in the obvious way, by taking the expected utility. Second, we combined $V$ , a distribution on utility functions with $v$ , a single utility function. Here, we are identifying $v$ with the probability distribution that is deterministically $v$ with probability 1.)

Clone Independence

If a set of candidates are clones (meaning they are clumped together in all voters preference orderings), the result of the election should be the same as if you replaced the clones with a single candidate.

Translated to our formalism, let $C = X ⊔ {y} ⊔ Z$ , and assume that for all $x \in X$ and $z \in Z$ and $v \in C \to [0, 1]$ with $V (v) > 0$ , we have $v (x) > v (z) \Leftrightarrow v (x) > v (y)$ and $v (x) < v (z) \Leftrightarrow v (x) < v (y)$ . Then, $f$ satisfies clone independence if for all inputs as of this form, $f_{C} (V) (x) = f_{C ∖ Z} (V) (x)$ for all $x \in X$ . (Where in $f_{C ∖ Z} (V) (x)$ , $V$ is restricted to the domain $C ∖ Z$ in the obvious way.)

Here, $Z$ is a set of candidates that are all clones of $y$ . If you remove all of the clones, every candidate other than $y$ will have their probability of winning unchanged. As a consequence, after we we remove the clones, $y$ 's probability of winning will increase by the probability that one of the clones would have won.

Brief Summary of Voting Theory

Voting theory the field is pretty into the four assumptions I give above, rarely violating any of them. The below discussion is also assuming the four assumptions given above. (I am not an expert, so I might be wrong about what voting theorists think.)

First there is Arrow's Impossibility theorem, which basically says there is always a possibility that introducing a third candidate can change the winner between two other candidates. That is the first sign that we are not going to get to have nice things.

Condorcet vs Consistency

Then, there is a conflict between the Condorcet criterion and the Consistency criterion. Most prefer the Condorcet criterion, and bite the bullet on consistency.

This is a bigger bullet to bite than you might think. This is because consistency is highly entangled with participation (which is also incompatible with Condorcet). While neither criterion technically implies the other, basically every voting system that is seriously proposed satisfies consistency if and only if it satisfies participation (even among proposals that do not satisfy the four above assumptions).

Violations of participation are also referred to as the no-show paradox. Imagine you are going to vote in an election and your favorite candidate is $c$ . Also there are all the other voters, and let's say that if you don't vote, c will win anyway. so $V_{0}$ is you, who prefers $c$ . $V_{1}$ is everyone else, which also prefers $c$ . Then you vote, and the result is some other candidate $d$ ! That shouldn't happen! You shouldn't make your favorite candidate lose by voting for them!

If you want to go the other direction, the Condorcet criterion is not an especially fun bullet to bite either. If one candidate is preferred in all one on one elections, that candidate really should win! (There are also a bunch of proposals that fail both criteria.)

Consistent methods

The most well known systems that satisfy consistency are plurality and Borda count. Plurality is the standard default voting system. Everyone votes for their favorite, and whichever candidate gets the most votes wins. Borda count is just where each voter ranks all the candidates and gives 0 votes to their least favorite, 1 vote to their second least favorite, 2 votes to their third least favorite and so on. Indeed, the only way to satisfy consistency is to rank all the candidates and have some function from position in the ranking to number of points, and give the win to the candidate with the most points. (Plurality is a special case of this where only the first choice gets one point.)

(Approval voting, range voting, and random dictatorship also satisfy consistency, but are not counted in this discussion.)

Condorcet Methods

There are a bunch of systems that satisfy the Condorcet criterion. If there is a Condorcet winner (which is often the case, especially with a small number of candidates), then all Condorcet voting systems agree, so the difference between different Condorcet systems does not matter that much.

They all have different ways of deciding the winner when no Condorcet winner exists. Most of the good Condorcet systems also satisfy clone independence, but there are a bunch of options and no clear winner. They are all kind of ugly and do things like manipulate the weighted directed graph of how much each candidate wins or loses to each other candidate one-on-one, but many think that the Condorcet criterion is important enough to justify the aesthetic issues.

Clone Invariant Methods

The main two Condorcet methods that are also independent of clones are ranked pairs and the Schulze method. Instant runoff voting, also know as "Ranked Choice," which is the primary voting system reform that is actually making traction in the United States, is also independent of clones, but this (and simplicity) is basically all it has going for it. It is not very good, and it is not Condorcet or consistent, but is still better than plurality. Besides that, all seriously considered clone independent voting systems (approval voting, range voting, random dictatorship) violate one of the four assumptions above in addition to the Condorcet criterion.

Fortunately, things will get better in the next post, where we will remove the assumption that the output must be deterministic.

[-]Ben Pace2y70

The function only uses the preorders on candidates implied by the utility functions.

I didn't know what this meant, so I asked someone around me, and now I believe that this means "Your voting function is only allowed to use the information in voters' ballots, and isn't allowed to use any information that is in their utility function but that they didn't write down on their ballot".

For example if someone is voting strategically, you aren't allowed to check that by comparing it to their actual preferences; you aren't allowed to use info that's only stored in their utility function.

[-]Scott Garrabrant2y70

Yeah, that is correct, although note that the ballot could just ask the voters for their utility function. The assumption means the ballot asks for a ranking of candidates, possibly with ties, and no other information.

[-]ESRogs2y20

The assumption means the ballot asks for a ranking of candidates, possibly with ties, and no other information.

Note that this is only true for ranked methods, and not scored methods, like Approval Voting, Star Voting, etc.

[-]Ben Pace2y40

Checking my understanding here: is this assumption ruling out quadratic voting, which asks for a weighted-ranking of candidates?

[-]Scott Garrabrant2y20

Yes, although I bet that Condorcet and consistent are still incompatible for deterministic voting systems, even if you allow taking in the entire utility functions

[-]Quinn2y10

Why is a codomain of [0,1] more general than a preorder?

The (implicit, as OP says "obvious/not matter") measurability of a compact set seems like more structure than a preorder, to me, and I'm not thinking of "generalization" as imposing more structure.

[-]Scott Garrabrant2y30

Yeah, preorder is misleading. I was only trying to say with as few characters as possible that they are only considering a ranking of candidates possibly with ties. (Which is a preorder, but is less general.)

LESSWRONG
LW