LESSWRONG
is fundraising!
LW

(Geometrically) Maximal Lottery-Lotteries Are Probably Not Unique — LessWrong

epistemic/ontological status: almost certainly all of the following -

a careful research-grade writeup of some results I arrived at a genuinely kinda shiny open(?) question in theoretical psephology that we are near-certainly never going to get to put into practice for any serious non-cooked-up purpose let alone at scale without already not actually needing it, like, not even a little;
utterly dependent on definitions I have come up with and theorems I have personally proven partially using them, which I have done with a professional mathematician's care; some friends and strangers have also checked them over;
my attempt to follow up on proving that something that can reasonably be called a maximal lottery-lottery exists, to characterize it better, and to sketch a construction;
my attempt to to craft the last couple of missing pieces, to sinter the result together, and to display it working;
definitely not a 15-minute read;
(EDIT) Currently, not quite actually about the same thing as the Maximal Lottery-Lottery sequence
the second half of something complete for now

Maximal Lottery-Lotteries: At This Point, Hopefully Not Quite As Much As You Want To Know

"...and so my definition of the modularity criterion - which replaces both participation and consistency - will have to wait for the next post, as will my construction of maximal lottery-lotteries." -Lorxus

This post is a lot of entirely new work in the same vein as the Maximal Lottery-Lotteries Sequence. It's the second of two posts in a sequence, the first of which is linked here.

As per my usual policy, if I've managed to misunderstand anything, write anything up unclearly or incorrectly, or you have any questions about what I've written, please comment below or at https://www.admonymous.co/lorxus and I'll fix it when I can.

If you're reading this post for the first time, you might want to keep the notation reference on hand. If you're relatively inexperienced with reading text which treats dense mathematical notation on par with the English language, slow down and make sure you understand what each mathematical expression really means (or refers to) before continuing. Mathematical notation is extremely compact and precise; if I tried to write all the below in prose, it'd be three times as long and a tenth as clear - but all compression comes with compute costs, and math notation is no exception.

When we last left off, I'd just finished tying maximal lottery-lotteries to weighted geometric means as motivated by the Nash solution to cooperative bargaining, and then used direct utility data to rework an old lemma into a new existence proof. Then I promised I'd define a new property, modularity, which would replace consistency and participation, and use it to help motivate the construction of a maximal lottery-lottery. Time to make good on promises.

Replacing Consistency and Participation

"Game loss, every time. But it's really consistent." -Alex Steacy

For reasons Scott never made fully public, he came to believe at the end that maximal lottery-lotteries were not consistent, not least because the combination of consistency, Condorcetness, and dupe-resistance together uniquely determine a maximal lottery up to ties. I agree, but I think the problem is much worse than that for consistency, primarily because of the connection between failures of consistency and geometric rationality/the AM-GM boundary. As I pointed out when discussing the improved dominance criterion for lottery-lotteries, we should treasure some kinds of minor failures of consistency, because those are where an AM-GM boundary cashes out most clearly as "the majority may not further tyrannize this minority".

Figuring out what special property we want a maximal lottery-lottery to have that's stronger than participation, weaker than consistency, and lying in the same conceptual space as both - let's call it modularity - requires a little care, even before starting. The trap to avoid is to assume that just because whatever modularity is must be weaker than consistency and stronger than participation, it must also be most straightforward to define modularity as some modification of one or the other. Instead, the tool we'll use to help us strike the right balance between the Scylla^[1] of majoritarian tyranny and the Charybdis^[2] of single new voters possibly causing electoral outcomes to swing wildly on joining is called the geometric mean. Like I described in the previous post, a geometric mean is like an arithmetic mean (or average), except that we multiply instead of adding and we take the nth root instead of dividing out by n. In particular, we'll use our old friend weighted geometric means again to represent the fact that the electorates we want to join together are not generally of precisely equal size or significance.

Here's a desirable property a voting system could have that uses a weighted-geometric-mean construction. It's generally stronger than participation, weaker than consistency, and centrally employs an AM-GM boundary construction for its intended purposes in the definition. In it and in the following text, we'll always tacitly assume that we pick appropriately to reflect a power/population-proportional representation in the joined electorate:

( $p$ -)Modularity: The outcome of an election run over a joined pair of subelectorates should always be at least as good, as measured by the $p$ -weighted geometric mean of the aggregated utility functions of the two electorates, as the result obtained if we were to - before the election started - grant dictatorship to either of the two electorates, or if we were to employ $p$ -weighted random dictatorship.
Equivalently, let $V, W \in Δ V_{C}$ be electorates and write $U = V ⊔ W$ . Then for discretionary choice of intermediacy parameter $0 < p \leq 1$ , a ( $p$ -)modular $f$ satisfies all three of $G_{p} (V (f_{C} (V)), W (f_{C} (V)))$ ; $G_{p} (V (f_{C} (W)), W (f_{C} (W)))$ ; $G_{p} (V (f_{C} (p \cdot V + (1 - p) \cdot W)), W (f_{C} (p \cdot V + (1 - p) \cdot W)))$ $\leq G_{p} (V (f_{C} (U)), W (f_{C} (U)))$ .

This sounds like a complicated property, but that's because prose and math are both badly suited to describing this. All modularity says is that your voting system does at least as graceful a job of handling the joins of conflicting electorates as the first two nitwit ideas you might come up with - arbitrarily picking an electorate to be the only ones to get to vote, and picking a dictator-electorate through the natural coin-flip. So it's a natural one: if your electoral system can't consistently do at least as well handling this very important task as a very simple algorithm would, it's not worth implementing. It has the right shape, too, being related to but generally weaker than consistency on one end and strictly stronger than participation on the other, under the usual limiting relaxation where we let one of $V, W$ put all its measure on one element from $Δ C$ and take $p$ as determined by $| V |, | W |$ .

Described a bit more formally and more simply than the math: we want for a voting system to handle joins of electorates at least a little gracefully, no matter the finite-multiple discrepancy in their importances. And we're not obsessed with making sure that each electorate ends up strictly better off than they were before being joined - that's not possible in general even in principle. Instead, we'll be happy if once we take the expected voter satisfaction over each electorate and then take the $p$ -weighted geometric mean over the two results, our voting system does at least as well by that measure as if we'd let one or the other electorate decide, and it also does as least as well as if we'd picked the next least bad overly-simple option and used a $p$ -weighted lottery to decide which electorate we wanted to let decide for everyone. If our voting system does at least as well over the two joined voterbases as all three of those canonical possible plans for handling the join, we call it a modular voting system.

The question remains of how precisely modularity fits with the other two "fair-joining" criteria, consistency and participation.

Proposition: (Modularity $⊥$ consistency) To see this, it suffices to note that consistency demands linearity of joined outcomes on the nose, while modularity demands an outcome at least as good as some triple of reasonable outcomes calculated nonlinearly.
Lemma: (Modularity $⪰$ participation) Call the singleton voter $w \in V_{C}$ and the voterbase $V \in Δ V_{C}$ , with $U = (p \cdot {w} + (1 - p) \cdot V)$ for some choice of $p$ . Then by assumption of modularity, $G_{p} (w (f (U)), V (f (U))) \geq G_{p} (w (f (V)), V (f (V)))$ so that $w (f (U))^{p} \cdot V (f (U))^{1 - p} \geq w (f (V))^{p} \cdot V (f (V))^{1 - p}$ . Now, certainly $V (f (V)) \geq V (f (U))$ , so that $w (f (U))^{p} \cdot V (f (U))^{1 - p} \cdot \frac{1}{V (f (U))^{1 - p}} \geq w (f (V))^{p} \cdot V (f (V))^{1 - p} \cdot \frac{1}{V (f (V))^{1 - p}}$ and $w (f (U))^{p} \geq w (f (V))^{p}$ . Therefore, $w (f (U)) = w (f (p \cdot {w} + (1 - p) \cdot V)) \geq w (f (V))$ , and $f$ is participatory.

So overall, as far as how the three "fair-joining" properties match up against each other:

For how much is required of the electoral outcome - and by extension, how well voters and electorates are protected in the average case, assuming they're protected at all:
Consistency > Modularity >> Participation
For which properties logically entail which:
[Consistency $⊥$ Modularity] $⪰$ Participation

Exact linearity is a much stricter condition to have be true than requiring the inequalities about outcomes be satisfied, but those inequalities in turn are a stronger guarantee than merely guaranteeing that any individual do at least as well by participating as not.

Constructing Maximal Lottery-Lotteries

"You must construct additional l~~otteries~~ pylons." -Paul "Aldaris" Eiding

All of this suggests three possible ways of explicitly constructing a maximal lottery-lottery which turn out in short order to be equivalent.^[3]

First, we might think to build up a maximal lottery-lottery by treating all the voters as electorates with a size of 1. Then, construct a full binary tree whose leaves are each labelled with one of the voters and their utility data. Label all non-leaf nodes with lottery-lotteries corresponding to whatever mix of candidate-lotteries argmaxes the population-weighted geometric mean of the expected utilities of the two subelectorates; that is, the stem lottery-lottery is given by $L = {a r g m a x}_{K \in Δ^{2} C} G_{p} (E_{v \sim V} v (K), E_{w \sim W} w (K))$ , where $V, W$ are the child nodes' electorates. And because a continuous function on a compact space always attains its maximum, we don't even need to worry about whether this lottery-lottery exists.

Alternatively, we might completely skip that entire process and immediately find a lottery-lottery that geometrically maximizes over the electorate's outcomes, such that $M := {a r g m a x}_{L \in Δ^{2} C} (Π_{v \in V} E_{c \sim\sim L} v (c))^{\frac{1}{| V |}} = {a r g m a x}_{L \in Δ^{2} C} G_{v \in V} E_{c \sim\sim L} v (c)$ ^[4]. Thankfully, taking population-weighted geometric means in this way is associative, in the way that taking population-weighted arithmetic means would be and that taking unweighted geometric means would not be, so these two definitions are equivalent - one inductive and constructive, the other with explicit formula and maximally simple procedure.

Finally, we could do the appealingly geometric-rationality-flavored thing, and start seeking to instantiate the Nash bargaining solution which happens to be both an egalitarian and utilitarian solution by rescaling all $v \in V$ such that $w_{v} (c) := \frac{1}{E_{c \sim\sim L} v (c)} v (c)$ , so that $E_{c \sim\sim L} w_{v} (c) = 1$ . After that, we geometrically maximize, calculating $M := {a r g m a x}_{L \in Δ^{2} C} G_{v \in V} E_{c \sim\sim L} w_{v} (c)$ , and by the argument found in the above, we should get that by looking for lottery-lotteries that give every voter precisely $E_{c \sim\sim L} w_{v} (c) = 1$ . Except wait - the rescaling process just amounts to a shift by a constant in log-expectation, as that link even tells us! This construction is also the same construction as the two constructions above.

It gets even better. Because this construction is also the same construction as the two constructions above, and is also the exact same solution as the one that instantiates the Nash bargaining solution to cooperative bargaining (up to constant shifts in log-expectation and a need to suitably handle voterbases with different weights), we have no need to spend additional time or effort proving that the abstract solution we proved to exist is equivalent to one (and thus all) of these - it already is one!

Fine, then - actually rolling up our sleeves and constructing a maximal lottery-lottery turned out to be startlingly straightforward, once we put ourselves in the right frame of mind. But how about characterizing one? I won't provide airtight proofs for some of these, but rather just sketches.

Proposition: (Binary tree constructions of MLLs are modular) We'll do the hardest cases of voter + voter, voter + finite electorate, and finite electorate + finite electorate, so that the proof for all finite electorates - the hardest case - follows by induction.
Let $v, w \in V$ be voters. Then the lottery-lottery for their join is given by $L = {a r g m a x}_{K \in Δ^{2} C} \sqrt{v (K) \cdot w (K)}$ , and certainly ${m a x}_{K \in Δ^{2} C} \sqrt{v (K) \cdot w (K)}$ is at least as large as $\sqrt{v (f (v)) \cdot w (f (v))}$ , $\sqrt{v (f (w)) \cdot w (f (w))}$ , and $\sqrt{v (f (p \cdot v + (1 - p) \cdot w)) \cdot w (f (p \cdot v + (1 - p) \cdot w))}$ , since all three of the lottery-lottery outcomes $f (v), f (w), f (p \cdot v + (1 - p) \cdot w)$ are available to be argmaxed^[5] over.
Now let $W$ be an electorate of finite size. Then the lottery-lottery for their join is given by $L = {a r g m a x}_{K \in Δ^{2} C} v (K)^{p} \cdot W (K)^{1 - p}$ , and certainly ${m a x}_{K \in Δ^{2} C} v (K)^{p} \cdot W (K)^{1 - p}$ is at least as large as $v (f (v))^{p} \cdot W (f (v))^{1 - p}$ , $v (f (W))^{p} \cdot W (f (W))^{1 - p}$ , and $v (f (p \cdot v + (1 - p) \cdot W))^{p} \cdot W (f (p \cdot v + (1 - p) \cdot W))^{1 - p}$ , since all three of the lottery-lottery outcomes $f (v), f (W), f (p \cdot v + (1 - p) \cdot W)$ are available to be argmaxed over.
Finally, let $V$ be an electorate of finite size. Then the lottery-lottery for the join $X = V ⊔ W$ is given by $L = {a r g m a x}_{K \in Δ^{2} C} V (K)^{p} \cdot W (K)^{1 - p}$ , and certainly ${m a x}_{K \in Δ^{2} C} V (K)^{p} \cdot W (K)^{1 - p}$ is at least as large as $V (f (V))^{p} \cdot W (f (V))^{1 - p}$ , $V (f (W))^{p} \cdot W (f (W))^{1 - p}$ , and $V (f (p \cdot V + (1 - p) \cdot W))^{p} \cdot W (f (p \cdot V + (1 - p) \cdot W))^{1 - p}$ , since all three of the lottery-lottery outcomes $f (V), f (W), f (p \cdot v + (1 - p) \cdot W)$ are available to be argmaxed over.^[6]

Proposition: (Argmax constructions of MLLs result in something lottery-Smith) Let $L \in Δ^{2} C$ be the result of the described argmax construction process. It suffices to show that those candidate lotteries are in fact over $Δ S$ , such that $L \in Δ^{2} S$ . But because argmax picks lotteries to maximize based on global geometric maximization, the families of candidate-lotteries that it picks are exactly those that correspond to utility-maximal partitions of unity, and all of those are the lottery-Smith ones.

Proposition: (Argmax constructions of MLLs are lottery-dupe-resistant) Let $D = {d_{i}}$ be a lottery-dupe set, and for each $d_{i} \in D$ , let $C_{i} \in Δ C$ be a candidate-lottery such that all voters $v \in V$ are indifferent between $d_{i}, C_{i}$ . Let $L \in Δ (C \cup D)$ be the output of either argmax construction when applied to $C \cup D$ , and let $M \in Δ C$ be the output of the same argmax construction when applied to $C$ , making all the same choices, except that we replace every instance of $d_{i}$ with $C_{i}$ . $M$ is utility-maximal by construction, so since voters don't have meaningfully different preferences on $C \cup D$ as compared to $C$ , $L$ must also be utility-maximal. Without loss of generality, since the argmax construction results in something lottery-Smith, we may pass to the case where some $d_{i}$ would belong in the lottery-Smith set if eligible; otherwise, none of the dupes could ever have made it into the argmax construction at all. Then the corresponding $C_{i}$ is already in the lottery-Smith set. Because $M$ is utility-maximal, it must already have the "correct amount" of $C_{i}$ , and because the only time the argmax could have made different choices is in trading off between $C_{i}, d_{i}$ , $L$ must assign the same probability to $(C_{i} + d_{i})$ as $M$ does to $C_{i}$ alone.

Lemma: (All maximal lottery-lotteries are modular, dupe-trenchcoat-resistant, and lottery-Smith.) Let $M \in Δ^{2} C$ be a maximal lottery-lottery. From the previous three lemmas, $M$ is modular, lottery-Smith, and dupe-trenchcoat resistant.

Unfortunately, for now this is where it ends. As far as I can tell, while all maximal lottery-lotteries satisfy modularity, dupe-trenchcoat-resistance, and the lottery-Smith criterion, they aren't unique at all - they're roughly characterizable as any of the countably many geometric-utility-maximal lotteries (with the same maximin value for every voter) over a basket of lotteries which satisfy the Smith condition, or equally well as some lottery-lottery in the convex hull of some effectively calculable generating set of maximal lottery-lotteries - though I do expect that in general and in practice, this generating set, whose cardinality is trivially at most the number of candidates, will in fact be strictly smaller. I'm not totally sure, but I doubt that maximal lottery-lotteries can even be uniquely characterized as lottery-Smith, lottery-dupe-resistant, and modular, even if the unique can be up to ties in geometric-expectation of voter utility. Maybe I haven't made modularity strong enough? Or maybe being lottery-Smith is too broad? I don't know, and I'm happy to leave that for someone else.

Scraps and Leftovers

If we can say anything at all, we must say it clearly; and on those topics which we cannot speak clearly, we must fall silent. -Wittgenstein

This was the unfinished part; the part where all the scraps congealed together before I wove them into a definition or theorem or sometimes a whole section to go above. At one point this entire section used to live in the previous post. This part thus very nearly didn't make it into the final writeup. You're reading this because I figured out what more to do with it and because I specifically want for you to; I am personally thanking you for having read all the way through to here and I am asking for your feedback.

To make up for the lack of a uniqueness theorem above, here's three sketches, each of one last thing that occurred to me - and likely to many of you at some point along the way - and which I thus wanted to devote at least a little space to discussing, in spite of their being off the obvious path we took in generalizing voting lotteries to voting lottery-lotteries.^[7]

What about weighting voters differently? If we can pull that off, we could implement some flavor of Demarchistic/futarchy-lite arrangement where voting in favor of measures that the populace endorses the effects of in hindsight means you get a touch more Vote Power than your average Jove.

Essentially - what if before we begin the constructive procedures above, we apply some exponent to some of the voters in order to $p$ -weight them? Or equivalently, when we take the geometric mean over all voters' outcomes for argmaxxing, we raise a voter's expected value to some power slightly multiplicatively larger or smaller than 1, before we take products and roots? I expect that basically all of the desirable properties (like voters never having expected utility constant-0, and approximating a Nash bargaining solution under natural relaxations) that hold of the unweighted maximal lottery-lottery process will hold of this weighted maximal lottery-lottery process as well, assuming that we consistently modify expected values according to the voter weights the whole way through. (I will not spell out the folly of setting some voter weights to 0 or to $\infty$ , or even to particularly multiplicatively large or small numbers.)
What's up with the fractal structure Scott posited?

For this one, there's actually a handful of factors I think might be contributing to it. First off, there's the two I mentioned as asides in the previous post - the fact that one of the ways I defined for constructing maximal lottery-lotteries is straight-up fractal - the binary-tree one can be made so fairly easily. Addditionally, any convex combinations of maximal lottery-lotteries itself a maximal lottery-lottery, so any time you can pinpoint some additional maximal lottery such that the voterbase is indifferent between it and the basket of maximal lotteries they already like, you now have an entire family of maximal lottery-lotteries for each one you knew you had before. Finally, a chaos game-style argument reveals a Sierpinski gasket-esque structure within - or possibly simply on - the set of maximal lottery-lotteries, given the latter's natural simplicial-ish structure as a convex hull.
Is there literally anything else at all anyone could do with this?

On a friend's recommendation, I've actually started mulling over mechanism design and thus begun to wonder whether there's some very thorny mutual auctioning problem in search of an isomorphic solution to maximal lottery-lotteries. No idea whether it'd work for anything yet, or how; and if I did know I wouldn't say.

Notation Reference

"First. You have to understand the problem... Introduce suitable notation. Separate the various parts of the condition. Can you write them down?" -Pólya György

Throughout this post, I consistently use the following font conventions for mathematical objects:

Lowercase letters in normal Computer Modern (like $a, b$ ) are always candidates or voters.
Uppercase letters in normal Computer Modern (like $A, B$ ) are always lotteries or voterbases.
Blackboard bold is only ever used as an object for the voterspace $V_{C}$ . It also gets reserved for use as the expected value symbol $E$ and the geometric mean symbol $G$ .
Fraktur is only ever used for the candidate set $C$ and the dregs set $D$ . I would also have used it for the Smith set $S$ , but \frak{S} is famously bad. I thought it was a G for years until grad school because it used to be standard for the symmetry group on n letters. Seriously, just look at it: $S$ .
The calligraphic font is only ever used for lottery-lotteries (like $A, B$ ) and for the Smith set $S$ .

Here's a guide to the standard mathematical notation I use extensively here:

Morphism/Hom-sets - $H o m (A, B)$ - the set of functions from $A$ to $B$
Disjoint union - $A ⊔ B$ - the union of the sets $A$ and $B$ , which are either implicitly assumed to be disjoint or else elements in the union are tagged with their set of origin; sometimes used to implicitly partition another set (e.g. $S = A ⊔ B ⊔ C$ ; $C = S ⊔ D$ (the partition of the candidate set into Smith and dregs))
Sampling/drawing from a probability distribution - $x \sim X$ - $x$ is a random draw from $X$
Sampling/drawing from a lottery-lottery - $X \sim X$ ; $x \sim\sim X$ - $X$ is a randomly drawn lottery from $X$ ; $x$ is a random draw from a randomly drawn lottery from $X$
Probability notation - $P_{x \sim X} e (x)$ - the probability of $e (x)$ happening, where $x$ is drawn from $X$
Expected value notation - $E_{x \sim X} f (x)$ - the expected value of $f (x)$ , where $x$ is drawn from $X$
Geometric mean notation - $G_{x \in X} f (x)$ - the geometric mean of the values of $f (x)$ , where $x$ ranges over $X$
Weighted geometric mean notation - $G_{p} (x, y) := x^{p} y^{1 - p}$ - the $p$ -weighted geometric mean of $x, y$
Candidate sets - $c \in C$ - an set of arbitrary elements (candidates) and possibly lotteries on those candidates
Smith sets and dregs - $C = S ⊔ D$ - the candidate set is partitionable into a Smith set (guaranteed to be nonempty) and a dregs set (about which there are no guarantees)
Candidate lotteries - $Δ C$ - the set of candidate-tagged partitions of unity; the set of probability distributions of the set of candidates
Utility-functions - $V_{C} ≅ H o m (C, [0, 1])$ - the set of functions from the set of candidates to the unit interval^[8]
Voterbases/electorates - $V \in Δ V_{C}$ - vote shares (or probability distributions) over the set of utility functions on candidates
Candidate lottery-lotteries/distribution-distributions - $Δ^{2} C$ - the set of candidate-tagged-partition-of-unity-tagged partitions of unity; the set of probability distributions of probability distributions of the set of candidates^[9]
Domination - $A ⪰ B$ ; $A ⪰ B$ - object A dominates object B; as an abuse of notation, property A logically entails property B.

^{^}
A rock shoal; a six-headed sea monster; renowned in myth for snatching up a few sailors from ships passing too close to it.
^{^}
A whirlpool; a vast sea-monster of the deep; renowned in myth for thrice-daily drinking the sea and belching it out to create a whirlpool to drown entire ships trying too hard to avoid the rocks.
^{^}
Always a good sign when your three strong guesses turn out to be the same guess! That means it's thrice as strong, that's just math.
^{^}
Treat them all like individual electorates of size 1 and use the same cooperative-bargaining-fair joining process, and because the joining process is fair, we could also just join them all at once to instantiate the Nash bargaining solution - put that way, it almost sounds obvious.
^{^}
"Argmaxed"? "Argmaxxed"? "Argmax'd"? " $a r g m a x$ -ed"? " $a r g m a x e d$ "? Wait, I've got it! "Argmaxxing" for the present tense in accordance with standard English morphophonetics but "argmaxed" for the past tense by analogy to "axed".
^{^}
"Argmaxxing"? "Argmaxing"? "Argmax'ing"? " $a r g m a x$ -ing"? " $a r g m a x i n g$ "? Whatever. "Argmax" doesn't even look like a real word anymore and hasn't for most of the last week.
^{^}
"It is well indeed that we are so foolish, else what freedom we have would be wasted on us. Thus is it written in the Book of Cold Rain that one must never take the shortest path between two points."
^{^}
Scott sees no need for a footnote here and thus suppresses it in his notation.
^{^}
Scott notates this one more literally as $Δ Δ C$ .

LESSWRONG
is fundraising!
LW

LESSWRONG
is fundraising!
LW

16

(Geometrically) Maximal Lottery-Lotteries Are Probably Not Unique

16

16

Maximal Lottery-Lotteries: At This Point, Hopefully Not Quite As Much As You Want To Know

Replacing Consistency and Participation

Constructing Maximal Lottery-Lotteries

Scraps and Leftovers

Notation Reference