In the previous post, I presented the concept of a maximal lottery-lottery as follows:

We start with a finite set of candidates  and a distribution  on utility functions on . Given lottery-lotteries , we say that  dominates  if . A maximal lottery-lottery is an  such that for all other  dominates .

Conjecture: For any  and , there always exists a maximal lottery-lottery.

We discussed how we can construct a powerful and elegant new voting system, if only we can guarantee that a maximal lottery-lottery exists. However it is an open problem whether or not they exist. In this post, I will discuss a few different angles of attacking this conjecture.

An Infinite Game

First, consider the following partial result:

Theorem: Let  be any nonempty finite subset of . Then, there exists an  such that for all other  dominates .

Proof: Consider the following symmetric two-player zero-sum game between Alice and Bob: Alice chooses an , and Bob chooses a . Alice gets utility equal to , and Bob gets the negative of this utility. This game must have a Nash equilibrium, and in this Nash equilibrium, Alice must play a mixed strategy . Since this is a Nash equilibrium in a symmetric zero-sum game, Alice must get at least 0 utility in expectation, no matter what  Bob plays. Alice getting at least 0 utility in expectation is exactly saying that  dominates 

This proof is valid (you can check the details), so why doesn't it generalize to infinite subsets , and in particular ? The problem is that games with infinitely many options need not have Nash equilibria. For example consider the game where Alice and Bob each choose a natural number, and whoever chooses the larger number wins.

Fortunately, often in cases like this, you can get past finiteness assumptions by instead using compactness assumptions, and  is compact. Indeed, games with a compact option space and continuous utility functions always have mixed strategy equilibria. 

Unfortunately, the game we constructed does not have a continuous utility function! This is easy to see, because if  assigns probability 1 to a single utility function , if , then Alice wins and gets utility 1, but as we continuously move around , we can discontinuously jump to the game being a tie, and then to Bob winning.

Fortunately, there are a bunch of papers analyzing situations like this. Discontinuities like this show up a bunch in auctions, which also have an infinite compact option space and discontinuous utilities, resulting from discontinuities in who wins the auction.

Unfortunately, Sam Eisenstat and I spent a while looking through a bunch of papers like this, and failed to find anything that we could apply. Several things looked like they might be able to work, but everything we tried ultimately turned out to be a dead end.

A Limit of Finite Games

The above partial result is actually pretty practically useful on its own. Even if maximal lottery-lotteries do not exist, you can always just take  to be the set of all lotteries that assign each candidate a probability that is a multiple of for some large natural number . Because of this, practically speaking, we can use the maximal lottery-lotteries voting system today, without ever proving the conjecture!

Further, we can consider what happens as we send  to infinity. If there is an odd number of voters that each have a utility function in general position, we will always get a unique distribution on  that dominates all other distributions on , and we can look at what happens to  as  goes to infinity. Hopefully this converges, but even if it doesn't, we can take any limit point, and ask whether that limit point is a maximal lottery-lottery. My guess is that the sequence does in fact converge, and that the (unique) limit point is a maximal lottery-lottery, but I have not had any luck yet proving either of these.

However, this enables us to get some empirical results! We can just compute some of these finite approximations. Even with three candidates, we have to solve a game with  different options, and I have not yet optimized it beyond using an off the shelf linear programming solver in Haskell, running for a few minutes, so the highest I got was . (I am sure we can do much better.)

I took three candidates and three voters, and gave each voter a utility function in general position. Here is a picture of . The blank space means that 0 (or near 0) utility is put there. The numbers show the negative of the natural logarithm of the probability that that point is chosen. (Remember, this is a distribution on distributions on a three element set, or equivalently a distribution on the triangle.)

______________________4.4.5.5.5.3.4.4.5.5.5.5.4.5.5.__________
 __________________________7.____________4.6.5.8.5.4.________
  __________________4.5.5.______6.5.__5.__5.9.5.4.4.4.______
   ________________4.5.5.4.______5.____5.__5.__4.4.5.5.____
    ______________5.5.5.5.7.____5.4.5.7.5.108.6.5.4.4.4.5.
     ______________5.________________________5.4.5.____7.
      ____________7.5.4.5.____________6.5.6.__4.__5.6.__
       ______5.4.5.8.__5.8.____6.5.__6.__7.__6.4.6.4.5.
        ____6.7.5.5.5.5.__5.6.8.6.5.7.5.____6.5.5.__4.
         6.6.5.4.5.4.4.5.6.__6.__6.5.__6.____5.6.7.4.
          __5.6.5.________5.__5.6.________5.6.8.5.5.
           4.5.5.__4.6.__5.6.7.5.6.6.____6.4.5.7.4.
            5.4.5.5.5.__5.5.5.7.5.________5.6.4.4.
             6.5.5.5.6.__5.__5.5.6.____7.8.4.7.7.
              5.5.5.5.5.____5.5.5.5.__5.6.7.__4.
               7.4.5.6.__6.5.5.4.6.____4.4.5.5.
                4.__8.5.__5.7.__5.5.5.__6.____
                 5.6.6.6.5.6.5.______________
                  4.________________________
                   ________________________
                    ______________________
                     ____________________
                      __________________
                       ________________
                        ______________
                         ____________
                          __________
                           ________
                            ______
                             ____
                              __

 for the same candidates and voters looks similar, but with less detail, and some minor changes. This is a lot of data, but since we are going to use it to sample and then sample from the sample, for the purpose of maximal lottery-lotteries, we only really care about the probability each candidate is elected. For the above example, we get (0.262, 0.415, 0.323) for  and (0.263,0.417,0.320) for , which seems like some slight empirical evidence towards converging. Obviously there is a lot of room for a much more careful and larger experimental analysis. 

Fractals!?!

I believe that maximal lottery-lotteries exist, and are the limit of the above finite approximations, and (when they don't put all probability mass on the distribution that puts all probability mass on a single candidate) usually have some fractal structure. I think the complexity of the above picture is evidence of this, but I also have some mathematical intuition for why this makes sense (which feels hard to put into words):

Being a Nash equilibrium involves equalizing a bunch of constraints. These constraints are combined with the fact that the distribution is supported entirely within the triangle, and so where probability mass would naturally want to be placed outside the triangle, it gets cut off on the boundary. When it is cut off like this, something has to change inside the triangle to fix new constraints that become broken, and this causes the constraints to sort of reflect off the boundary of the triangle, somewhat like ripples. (Some of this intuition is coming from seeing similar things happen when studying nim.) Sorry that I don't have a more articulate description, but the point is, I think we are getting fractals.

Unfortunately, I think this is going to make it harder to show that maximal lottery-lotteries exist, since we will likely not have simple closed forms of what the maximal lottery-lotteries are.

A Generalization of Colonel Blotto

Another way to look at maximal lottery-lotteries is as a generalization of the Colonel Blotto game. 

In the the continuous Colonel Blotto game, there are two opposing officers that each have some resources, that they can distribute between some battlefields. We will assume that they each have 1 total unit of resources. Whoever sends the most resources to any given battlefield wins that battlefield. (If they send the same amount, they each win half the battlefield.) Each officer gets utility equal to the number of battlefields that they win. This game and its variants have been analyzed quite a bit over the last century.

Finding maximal lotteries is like solving a variant of the Colonel Blotto game. In the original game, you choose how to distribute resources from the simplex where each battlefield gets non-negative resources, and the total of all the resources is 1, In this variant, you instead have some other convex polytope representing the allowable ways to distribute resources.

To reduce finding maximal lottery-lotteries to a Colonel Blotto game, we consider each voter to be a battlefield. (We will only discuss the case where there is a finite set of equal weight voters here.) Each candidate can be thought of as assigning a utility to each voter, which we can also interpret as sending a number of resources equal to that utility. Thus, the polytope of allowable ways to send resources to battlefields is just the convex hull of these points that come from the candidates. Each point in this convex hull corresponds to (at least) one distribution on candidates, and a solution to this Colonel Blotto game corresponds to a maximal lottery-lottery. 

This old RAND paper analyzing Colonel Blotto, seems like some further evidence that we might be looking at fractals.

A Call for Help

I think that proving the existence of maximal lottery-lotteries would actually be a substantial advance for voting theory. (Although it might take some work to get people interested in something that challenges this many of the assumptions of the field.) However, I am not intending to work on this anymore. If you want to take a shot at proving they exist,  and have questions, get in touch with me. 

Without a proof of existence, I am probably not going to bother writing this up as more than these blogposts, but if anyone manages to prove it, I would be happy to write a paper together. Also if anyone feels very excited about working on this and getting it published in spite of not having a proof, I could maybe be up for collaborating on something like that, especially if someone collects a bunch of experimental evidence and better fractal pictures and is willing to do a lot of the writing.

New to LessWrong?

New Comment
16 comments, sorted by Click to highlight new comments since: Today at 5:22 PM

(I no longer thing this strategy works, see the thread here; sorry for wasting your time Scott!)

Here's an approach which I think could work, though there are definitely some finicky details (including a potentially-fatal question about convergence at the end, which I couldn't deal with---and which I think may not be the weakest link given that the other steps are also a bit shaky).

For lottery-lotteries  and , write:

We say that B strictly dominates A if .

Lemma: if  strictly dominates , then there is a continuous  that also strictly dominates , i.e. such that  has a density with respect to Lebesgue measure and .

(ETA: this lemma is false, see comment from Scott below. I think it's fixed but am not sure.)

Proof: (I'll assume for convenience that there is a finite set of  voters although I think this is easy to relax. We could assume that  is a point mass if we wanted though obviously  won't be. I think this could be greatly simplified.)

Suppose that . For each utility function  in the support of the electorate  we'll define  as the space of values for utility function  that occur with positive probability under . Note that  is countable and hence that there is a finite set  which has at least  of the probability mass. Let  be the space of lotteries that have a utility in . Let  be the union of  over all the utility functions .

The idea is that  is just a finite union of hyperplanes, and so it divides the space of lotteries into a finite set of open regions. Within each of these regions, continuously shifting 's probability between nearby lotteries results in continuous changes in , plus a discontinuous term whose magnitude is at most  (corresponding to the positive-probability utilities in  that aren't in ).

If  assigns probability  to  then it is trivial to construct  by spreading out  continuously within each of these regions: there is guaranteed to be some spreading that changes  by strictly less than  (since we can make the continuous part of the change arbitrarily small and the discontinuous part is at most ) and so we can find some way of spreading out mass that results in a continuous  with .

If  assigns positive probability to  we need one more step. The idea is that for each point in  we can find a "safe" direction to shift the probability from  so that it's no longer in : if we pick a random direction each voter prefers to move one way or the other, and so one direction must have majority approval (note that the argument would break at this point if  depended on the probability that  was at least as good as , rather than being defined symmetrically). Once we've picked such a direction for every point on , we can associate all of the probability from  into one of the adjacent open regions and proceed as before. (ETA: this step fails if you are on the edge of the simplex.)

(This lemma might be enough to prove the result when combined with standard fixed point theorems. I'll finish by using standard tools from CS but that may just be because it's what I'm familiar with. I also can't easily do the convergence analysis at the end so that might not work and/or be much simpler from a mathematical perspective.)

Proving the theorem given the lemma: We'll describe a sequence of distributions  such that  implies that  has  entropy.

We'd like the sequence to be convergent  in the sense that for any continuous . If this were the case then we'd infer that we can't have  for a continuous , since that would give us an absolute  which  for all sufficiently large , implying that  has  entropy. By the main lemma, we'd conclude that no  (continuous or otherwise) can strictly dominate . For now I'll present the algorithm to generate , in the next section I'll discuss convergence.

Our construction is follow-the-regularized leader with a standard analysis for playing 2-player games. It doesn't converge to a Nash equilibrium but I think it will still be good enough for us.

Let  be the entropy of a lottery-lottery (relative to Lebesgue measure on the simplex) and take  for a sequence  (the exact decay rate of  isn't important as long as it goes to zero but the sum diverges). Note that . and that . As a result, the distributions  change very slowly, and we have  for any lottery-lottery .

Now suppose that  . Since  diverges, we can find arbitrarily large  such that . Then:

  • Since  was picked as the maximizer: 
  • Then we can split off the last term from that sum: 
  • Then we can use the fact that  was itself a maximizer, to replace every  except the last with , and obtain: 
  • Continuing in this way, we get 
  • But now we use the fact that each summand , since  is symmetric and .
  • Thus 
  • Thus 

(Note that we used the finite number of candidates at this step to get . Effectively we use finite candidates to show that there is only finitely much space, and use the continuity of  to argue that we only need to learn up to finite precision, and therefore this is no worse than a finite learning problem.)

Convergence: we want to argue that there exists a distribution  such that  for any continuous . I don't remember my topology very well, but I think we can do this as long as  converges to a limit for every open set . I can just take  to be that limit, and under compactness I get a probability distribution.

(ETA: I think we can actually use any limit point and don't need convergence, see below. So I think this is fine but am not sure.)

So we need to rule out the situation where there is some open set  and real numbers  such that  and  each happen infinitely often. This seems like a really messed up situation that shouldn't happen, but I don't have a clean proof tonight.

Some notes that make me think we should be fine:

  • We don't actually need the  to converge to a limit, we could define  if that limit exists. And if we want we can take  or  and the argument above still goes through.
  • If  but  then the two distributions must be fairly far apart and so the average of the distributions has significantly higher entropy than the average of their entropies. That's a strange situation, since we picked each  to maximize entropy (plus the linear function ).
  • By sequential compactness, we can extract a pair of lottery-lotteries  and  at a significant distance from one another such that the iterates get arbitrarily close to each of them infinitely often. That seems even more messed up.
  • If we really have a problem here, it seems like we can extend the algorithm above to make it converge more nicely by adding traders who push  back into the interval  (such that if you go out of the interval infinitely often they make an infinite profit and eventually enforce a hard constraint on ).

I'll probably revisit this tomorrow, but overall non-convergence seems weird up enough that I have a higher probability of something going wrong at some other step in the argument. I think that the main issues with convergence are really the ones in the first lemma that let us focus on continuous .

[This comment is no longer endorsed by its author]Reply

If  assigns positive probability to  we need one more step. The idea is that for each point in  we can find a "safe" direction to shift the probability from  so that it's no longer in : if we pick a random direction each voter prefers to move one way or the other, and so one direction must have majority approval (note that the argument would break at this point if  depended on the probability that  was at least as good as , rather than being defined symmetrically). Once we've picked such a direction for every point on , we can associate all of the probability from  into one of the adjacent open regions and proceed as before.

Haven't ready through everything yet, but I am skeptical here with respect to points on the boundary of the simplex. 

I agree that's a bug in the proof (and the lemma obviously can't be true as written given that e.g. if 90% of voters are single-issue voters who hate candidate X, then no continuous lottery-lottery can dominate a lottery that puts 0 probability on X).

I haven't thought about how serious this is. It feels like we could hope to fix things by considering a game where you pick (i) which candidates to give probability 0, which is a discrete choice from a set of size and hence relatively unproblematic, (ii) what distribution to play within the corresponding simplex, where we can make a similar continuity argument to handle the infinitely many options.

I don't know if that plays nicely, I'll probably think about it tomorrow.

At face value it looks like it will be OK: we just define the same algorithm, but now we define a probability distribution over  simplices and define conditional entropy as the sum of the discrete entropy of the outer choice + the continuous entropy over the chosen simplex. (But there are a bunch of subtleties.)

ETA: you can't just take the topology generated by the open sets of every face and then run my argument in exactly the same way, because that isn't compact and so we don't get a limit point . But we can define our algorithm over the disjoint union of all the faces which is compact (note that a given lottery-lottery is no longer uniquely represented but that's not a problem). The lemma still shows that if any  strictly dominates  then there must be a continuous  (in the disjoint union) that dominates . And we can still find arbitrarily large  for which  , which leads to a contradiction just as before. So overall I think that this fix works.

Ok, here are some questions to help me understand/poke holes in this proof. (Don't think too hard on these questions. If the answers are not obvious to you, then I am asking the wrong questions.

  1. Does the argument (or a simple refactorization of the argument you also believe) decompose through "If  strictly dominates , then there is a  that also strictly dominates  such that the probability of any voter being indifferent between something sampled from  and something sampled from  is 0 (or negligable)."
  2. If Yes to 1, do you believe the above lemma is also true for an arbitrary ?
  3. If Yes to 1, do you believe the above lemma is true if we replace "strictly dominates" with "" for some fixed .
  4. If No to 1, is there some minor modification that will give me a similar looking lemma the argument does decompose through?

#1: It doesn't. The previous version implied that there was a  for which the probability of ties was arbitrarily low, but the new version can have lots of voters who are indifferent. If B puts its mass in the interior of a face F, then we redistribute probability mass within the interior of F, but some voters assign the same utility to everything in F.

#4: The current lemma is:

If B strictly dominates A, then there is a face F of the simplex and a B' which is continuous over F such that B' strictly dominates A.

I still haven't understood all of your argument, but have you missed the fact that some faces are entirely contained in ?

(Your arguments look similar to stuff we did when trying to apply this paper.)

I think this is OK (though still lots of room for subtleties). Spelling this aspect out in more detail:

  • Fix some arbitrary A which is strictly dominated by B.
  • We claim that there exists a face F and a continuous B' over F such that B' also dominates A.
  • Sample some lottery from B to obtain a concrete lottery b that strictly dominates A.
  • If b is a vertex we are done. Otherwise, let F be the face such that b lies in the interior of F.
  • For each voter, their level sets are either hyperplanes in F or else they are all of F.
  • We can ignore the voters who are indifferent within all of F, because any B' supported in F will be the same as b from those voters' perspectives.
  • Now define  as before, but restricting to the voters who have preferences within F.
  • We obtain a continuous distribution B' for which  if we ignore the voters who were indifferent. But  for the voters who are indifferent, so we have  overall.
  • (Of course this just goes through the existence of an open set of lotteries all of which strictly dominate A, we can just take B' uniform over that set.) 

This lemma is what we need, because we will run follow the leader over the space of pairs (F, A) where F is a face and A is a distribution over that face. So we conclude that the limit is not dominated by any pair (F, B') where F is a face and B' is a continuous distribution.

Ok, I believe this version of the Lemma, and am moving on to trying to get the rest of the argument.

Actually we could take  to be any limit point of  (in the sense that  is a limit point of  for any continuous ) and then get the same conclusion. I think compactness guarantees the existence of such a limit point (e.g. choose some countable basis for the topology and then restrict the sequence so that one open set after another has a limit), and so the convergence worries are resolved.

Hm, I'm now pretty skeptical about the limit step. In particular, if  converges to a limit for every open set , we can't take  to be the limit of those probabilities (since it doesn't satisfy countable additivity even though the simplex is compact). In general the space of probability distributions is not compact in the relevant topology, and so I think we can't possibly have  based only on the fact that  is the expectation of a bounded function.

This seems like it's plausibly the same topological problem that would break fixed-point theorems, and so I think it's the most likely candidate for where the whole thing breaks and why this strategy doesn't make any progress.

There are various ways to try to route around the problem, but it feels like it may just be the same thing you've been working on, so it seems probably easier to start with looking for an examples where this breaks my algorithm and then confirming that it's the same problem the other approaches run into.

To really break the strategy, what we'd want is a sequence  and a lottery  such that  but there is no way to take the limit  (e.g. in earth mover distance) for which . If we found that then I could still try to argue that such sequences are never produced by the algorithm, but it would at least show I need a different proof strategy and likely indicate that the strategy didn't make progress.

ETA here is easy counterexample:

  • There are three candidates, X, Y, Z. There is a voter who only likes X and a voter who only likes Y.
  • B puts 1/2 of its probability mass on Z, and the other half spread uniformly over X/Y lotteries.
  •  puts its probability mass on lotteries between X and Y where X is almost certain to win (converging to certainty as ).
  • In the limit,  always wins the X voters, and it wins Y voters half of the time (whenever B picks Z) and so it has a 3/4 win probability.
  • It's clear that  which puts all of its mass on X. So it always wins the X voters, and ties on the Y voters half of the time (when B samples Z), so it has a 5/8 win probability.
  • If we add a Z voter with half weight or something, then we'll have  but .

Still would take work to turn it into a counterexample for the original algorithm. Would be curious to do that and see if I actually believe that game ought to have an equilibrium (or to understand why it can't be done).

And here's a significantly worse counterexample, showing that there need not be any  near  such that :

  • There are three candidates X, Y, Z and there are three voters: one only likes X, one has u(X) = 2 and u(Y) = 3, one has u(X) = 2 and u(Z) = 3.
  • B puts all of its mass on X.
  •  puts 1/3 of its mass on X, 1/3 of its mass on a lottery between Y and Z with the probability of Y approaching 2/3 from above as , and 1/3 on a lottery between Y and Z with the probability of Y approaching 1/3 from below.

Under these conditions:

  •  puts 1/3 of its mass on X, 1/3 on (2/3 Y, 1/3 Z), and 1/3 on (1/3 Y, 2/3 Z).
  • We can compute that  never beats , and loses 4/9 of the time.
  • On the other hand, every  beats  2/9 of the time (and still loses 4/9 of the time).
  • If  puts any mass on gambles where X doesn't get 100% of the probability, it loses the X-loving voter.
  • As a result, any  near  must lose against  at least 2/9 of the time, and can win at most 5/9 of the time.
  • So there's no way for it to match .

In light of that example I think the basic proof strategy is probably no good. It may be tough to construct a concrete example where the algorithm fails, but if it works it would have to be due to some property other than the fact that no continuous  can beat infinitely many , which is all I really wanted to do.

(I'll retract the original proposal.)

Epistemic status: rough sketch, there might be holes but I don't see any atm.

I think you need to slightly weaken the definition of an MLL and then you'll have an existence theorem.

Let's start with general discontinuous games. We have a game between two players, with pure strategy spaces and that we assume to be compact Polish spaces and (discontinuous) utility functions . Then we can take the closure of the graphs of and get upper hemicontinuous multivalued utility functions. I claim that for such multivalued games, Nash equilibria (in some sense) exist.

Let be a compact Polish space and a multivalued upper hemicontinuos function on it. What does it mean for to be a "maximum" of ? We can define it as follows: for any , the maximal possible value of is greater or equal to the lowest possible value of . It is not hard to see that maxima form a non-empty closed set. This leads to a corresponding notion of "best response" in multivalued games, and a corresponding notion of Nash equilibrium.

Notice that since we only care about the maximal and minimal possible values of , we might as well require that takes convex multivalues (i.e. closed intervals), since otherwise we can always take pointwise convex hull. Expected values of intervals can be defined by separately taking the expected value of the upper and lower ends of the interval.

I think that the existence of Nash equilibria follows from the Kakutani theorem in the usual way. The key observation is, for an upper hemicontinuous function, the maximal possible value is a (single-valued) upper semicontinuous function and the lowest possible value is a (single-valued) lower semicontinuous function. This implies that the best-response mapping is upper hemicontinuous.

Applying it to MLL, what I think we get is a notion of "weak" dominance where we only require that , and otherwise the definition is the same.

Yeah, I believe this works, and that it feels too weak.

For example, if there is a unanimous winner, you only have to pick them half the time, and can do whatever you want the other half of the time.

Yes, this is a good point. Maybe we can strengthen the "weak-MLL" criterion in other ways while preserving existence. For example, we can consider the "-dominance" condition and look for an LL that is "weak -maximal" for the highest possible . The function on the LHS is lower-semincontinuous, hence there exists a maximal for which a weak -maximal LL exists.

I have a reduction of this problem to a (hopefully) simpler problem. First up, establish the notation used.

[n] refers to the set  is the number of candidates. Use  as an abbreviation for the space , it's the space of probability distributions over the candidates. View  as embedded in , and set the origin at the center of .

At this point, we can note that we can biject the following: 
1: Functions of type  
2: Affine functions of type  
3: Functions of the form , where , and and , and everything's suitably set so that these functions are bounded in  over . (basically, we extend our affine function to the entire space with the Hahn-Banach theorem, and use that every affine function can be written as a linear function plus a constant) We can reexpress our distribution  over utility functions as a distribution over these normal vectors .

Now, we can reexpress the conjecture as follows. Is it the case that there exists a  s.t. for all , we have 

 Where  is the function that's -1 if the quantity is negative, 0 if 0, and 1 if the quantity is positive. To see the equivalence to the original formulation, we can rewrite things as 

 Where the bold 1 is an indicator function. And split up the expectation and realize that this is a probability, so we get 

 

 And this then rephrases as 

 Which was the original formulation of the problem.

Abbreviating the function  as , then a necessary condition to have a  that dominates everything is that 

 If you have this property, then you might not necessarily have an optimal  that dominates everything, but there are  that get a worst-case expectation arbitrarily close to 0. Namely, even if the worst possible  is selected, then the violation of the defining domination inequality happens with arbitrarily small magnitude. There might not be an optimal lottery-lottery, but there are lottery-lotteries arbitrarily close to optimal where this closeness-to-optimality is uniform over every foe. Which seems good enough to me. So I'll be focused on proving this slightly easier statement and glossing over the subtle distinction between that, and the existence of truly optimal lottery-lotteries.

As it turns out, this slightly easier statement (that sup inf is 0 or higher) can be outright proven assuming the following conjecture.

Stably-Good-Response Conjecture: For every , and , there exists a  and a  s.t. 

Pretty much, for any desired level of suckage and any foe , there's a probability distribution  you can pick which isn't just a good response (this always exists, just pick  itself), but a stably good response, in the sense that there's some nonzero level of perturbation to the foe where  remains a good response no matter how the foe is perturbed.

Theorem 1 Assuming the Stably-Good-Response Conjecture, .

I'll derive a contradiction from the assumption that . Accordingly, assume the strict inequality.

In such a case, there is some  s.t. . Let the set . Now, every  lies in the interior of  for some , by the Stably-Good-Response Conjecture. Since  is a compact set, we can isolate a finite subcover and get some finite set  of probability distributions  s.t. .

Now, let the set . Since , this family of sets manages to cover all of  (convex hull of our finite set.) Further, for any fixed  is a continuous function  (a bit nonobvious, but true nontheless because there's only finitely many vertices to worry about). Due to continuity, all the sets  will be open. Since we have an open cover of , which is a finite simplex (and thus compact), we can isolate a finite subcover, to get a finite set  of  s.t. . And now we can go 

 The first strict inequality was from how all  had some  which made  get a bad score. The  was from expanding the set of options. The  was from how  is a continuous linear function when restricted to , both of which are compact convex sets, so the minimax theorem can be applied. Then the next  was from restricting the set of options, and the  was from how every  had some  that'd make  get a good score, by construction of  (and compactness to make the inequality a strict one).

But wait, we just showed , that's a contradiction. Therefore, our original assumption must have been wrong. Said original assumption was that , so negating it, we've proved that 

 As desired.