Mixed strategy Nash equilibrium

by Meni_Rosenfeld 4 min read16th Oct 201047 comments


Inspired by: Swords and Armor: A Game Theory Thought Experiment

Recently, nick012000 has posted Swords and Armor: A Game Theory Thought Experiment. I was disappointed to see many confused replies to this post, even after a complete solution was given by Steve_Rayhawk. I thought someone really ought to post an explanation about mixed strategy Nash equilibria. Then I figured that that someone may as well be me.

I will assume readers are familiar with the concepts of a game (a setting with several players, each having a choice of strategies to take and a payoff which depends on the strategies taken by all players) and of a Nash equilibrium (an "optimal" assignment of strategies such that, if everyone plays their assigned strategy, no player will have a reason to switch to a different strategy). Some games, like the famous prisoner's dilemma, have a Nash equilibrium in so-called "pure strategies" (as opposed to mixed strategies, to be introduced later). Consider, however, the following variant of the matching pennies game:

Player 1 is a general leading an attacking army, and player 2 is the general of the defending army. The attacker can attack from the east or west, and the defender can concentrate his defenses on the east or west. By the time each side learns the strategy of its enemy, it is too late to switch strategies. Attacking where the defenses aren't concentrated gives a great advantage; additionally, due to unspecified tactical circumstances, attacking from the east gives a slight advantage. The sides have no interest in cooperating, so this is a zero-sum game (what one side wins, the other loses).

This elaborate description can be summarized in the following payoff matrix (these payoffs are for the attacker; the defender's payoffs are their negatives):

  2: East 2: West
1: East -1 2
1: West 1 -2

What strategy should each side play? The attacker can think, "Overall, going east is advantageous. So I'll go east." The defender, anticipating this, might say, "Surely they will go east, so that's where I'll wait for them." But after some deliberation, the attacker will realize this, and say "They will expect me from the east! I'll surprise them and go west." You can see where this is going - the defender will think "they'll try to be smart and go west; I'll be smarter and be ready for them", the attacker will think "I was right the first time, east is the way to go" and so on, ad infinitum.

Indeed, looking at the table does not reveal any obvious Nash equilibrium. Every assignment of strategies, represented as a square in the table, will leave one side preferring the alternative. So, are the sides deadlocked in an infinite recursion of trying to outsmart each other? No. They have the option of choosing a strategy randomly.

Suppose the attacker decides to toss a coin, and go east if it lands heads, and west for tails. Suppose also that the defender, with his mastery of psychology, manages to accurately predict the attacker's bizarre behavior. What can he do with this knowledge? He cannot predict the outcome of a coin toss, so he makes do with calculating the expected outcome for each of his (the defender's) strategies. And it can be easily seen that no matter what the defender does, the expectation is 0.

Similarly, suppose the defender consults his preferred (P)RNG so that he defends east with probability 2/3, west with probability 1/3, and suppose that the attacker anticipates this. Again, either of the attacker's strategies will give an expected gain of 0.

This randomized behavior, which is a combination of strategies from which one is randomly chosen with specified probabilities, is called a "mixed strategy". They are typically denoted as a vector listing the probability for choosing each pure strategy, so the defender's suggested strategy is (2/3, 1/3).

What we have seen here, is that by a clever choice of mixed strategy, each side can make sure they cannot be outsmarted! By constraining ourselves to rely on randomness for deciding on an action, we denied our opponent the ability to predict it and counter it. If yourself you don't know what you'll do, there's no way your opponent will know. We conclude that sometimes, acting randomly is the rational action to take.

As we've seen, for each player we have that if he uses his suggested mixed strategy, then it doesn't matter what the other player will do. The corollary is that if the attacker plays (1/2, 1/2) and the defender plays (2/3, 1/3), then no player will have a reason to switch to a different strategy - this is a Nash equilibrium!

Some more points of interest: "Just" randomizing won't do - you have to pick the right probabilities. The exact probabilities of the mixed strategy Nash equilibria, and the resulting payoff, depend on the specifics of the payoff matrix. In my example, the defender needs a high probability of defending east to prevent the attacker from exercising his advantage, but the symmetry is such that the attacker chooses with even odds. In games with more strategies per player, an equilibrium mixed strategy may be supported on (have positive probability for) all, several, or one of the pure strategies. If all players apply a Nash equilibrium, then any player can switch to any one of the pure strategies supporting his mixed strategy without changing his expected payoff; switching to any other strategy either decreases the payoff or leaves it unchanged.

Perhaps most interesting of all is Nash's theorem, saying that every finite game has a mixed strategy Nash equilibrium! Our original problem, that some games have no equilibrium, is solved completely once we move to mixed strategies.

One thing should be kept in mind. A Nash equilibrium strategy, much like a minimax strategy, is "safe". It makes sure your expected payoff won't be too low no matter how clever your opponent is. But what if you don't want to be safe - what if you want to win? If you have good reason to believe you are smarter than your opponent, that he will play a non-equilibrium strategy you'll be able to predict, then go ahead and counter that strategy. Nash equilibria are for smart people facing smarter people.

In fact, it is possible that some of the comments I called confused earlier were fully aware of these issues and coming from this "post-Nash" view. To them I apologize.  

Examples where mixed strategies are crucial are plentiful. I'll give one more - the volunteer's dilemma. A group of people are about to suffer greatly from some misfortune, unless at least one of them volunteers to take a slightly inconvenient action. They cannot communicate to coordinate the volunteering. If everyone uses the same deterministic strategy, then either all or none will volunteer, neither of which is stable. But they have a mixed strategy equilibrium which, by carefully balancing the risks of needlessly volunteering and having nobody volunteer, leaves everyone with an expected penalty equal to just the inconvenience of volunteering. Not as good as having just one person volunteer, but at least it's stable.

For further reading, Wikipedia has many articles about game theory and Nash equilibria. Also, Perplexed made this related comment recently.