A Nash equilibrium is an outcome in which neither player is willing to unilaterally change her strategy. Nash equilibria are often applied to games in which both players move simultaneously and where decision trees are less useful.

Suppose my girlfriend and I have both lost our cell phones and cannot contact each other. Both of us would really like to spend more time at home with each other (utility 3). But both of us also have a slight preference in favor of working late and earning some overtime (utility 2). If I go home and my girlfriend's there and I can spend time with her, great. If I stay at work and make some money, that would be pretty okay too. But if I go home and my girlfriend's not there and I have to sit around alone all night, that would be the worst possible outcome (utility 1). Meanwhile, my girlfriend has the same set of preferences: she wants to spend time with me, she'd be okay with working late, but she doesn't want to sit at home alone.

This “game” has two Nash equilibria. If we both go home, neither of us regrets it: we can spend time with each other and we've both got our highest utility. If we both stay at work, again, neither of us regrets it: since my girlfriend is at work, I am glad I stayed at work instead of going home, and since I am at work, my girlfriend is glad she stayed at work instead of going home. Although we both may wish that we had both gone home, neither of us specifically regrets our own choice, given our knowledge of how the other acted.
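The "no regrets" test can be checked mechanically. The sketch below is only an illustration: the function names are made up for this example, and the payoffs are the ones from the story (3 for being home together, 2 for working late whatever the other person does, 1 for sitting at home alone).

```python
from itertools import product

ACTIONS = ("home", "work")

def utility(mine, other):
    """Utility to one player: 3 if both go home, 2 for working late
    (whatever the partner does), 1 for sitting at home alone."""
    if mine == "work":
        return 2
    return 3 if other == "home" else 1

def pure_nash_equilibria():
    equilibria = []
    for me, her in product(ACTIONS, repeat=2):
        # A cell is an equilibrium if neither player can gain by switching
        # while the other player's choice is held fixed.
        i_stay = all(utility(me, her) >= utility(alt, her) for alt in ACTIONS)
        she_stays = all(utility(her, me) >= utility(alt, me) for alt in ACTIONS)
        if i_stay and she_stays:
            equilibria.append((me, her))
    return equilibria

print(pure_nash_equilibria())  # [('home', 'home'), ('work', 'work')]
```

The two mismatched outcomes fail the test because whoever went home alone would rather have stayed at work.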

When all players in a game are reasonable, the (apparently) rational choice will be to go for a Nash equilibrium (why would you want to make a choice you'll regret when you know what the other player chose?) And since John Nash (remember that movie *A Beautiful Mind*?) proved that every game has at least one, all games between well-informed rationalists (who are not also being superrational in a sense to be discussed later) should end in one of these.

What if the game seems specifically designed to thwart Nash equilibria? Suppose you are a general invading an enemy country's heartland. You can attack one of two targets, East City or West City (you declared war on them because you were offended by their uncreative toponyms). The enemy general only has enough troops to defend one of the two cities. If you attack an undefended city, you can capture it easily, but if you attack the city with the enemy army, they will successfully fight you off.

Here there is no Nash equilibrium without introducing randomness. If both you and your enemy choose to go to East City, you will regret your choice - you should have gone to West and taken it undefended. If you go to East and he goes to West, he will regret his choice - he should have gone East and stopped you in your tracks. Reverse the names, and the same is true of the branches where you go to West City. So every option has someone regretting their choice, and there is no simple Nash equilibrium. What do you do?

Here the answer should be obvious: it doesn't matter. Flip a coin. If you flip a coin, and your opponent flips a coin, neither of you will regret your choice. Here we see a "mixed Nash equilibrium", an equilibrium reached with the help of randomness.
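To make this concrete, here is a small sketch. The story gives no numbers for this game, so the payoffs are an assumption: the attacker gets 1 for capturing an undefended city and 0 for being fought off, and the defender loses whatever the attacker gains. The function names are likewise just for illustration.

```python
from fractions import Fraction
from itertools import product

CITIES = ("East", "West")

def attacker_payoff(attack, defend):
    # Assumed numbers: 1 for taking an undefended city, 0 for being repelled.
    return 0 if attack == defend else 1

def defender_payoff(attack, defend):
    # Assumed zero-sum: the defender loses whatever the attacker gains.
    return -attacker_payoff(attack, defend)

def pure_equilibria():
    found = []
    for attack, defend in product(CITIES, repeat=2):
        attacker_ok = all(attacker_payoff(attack, defend) >= attacker_payoff(a, defend)
                          for a in CITIES)
        defender_ok = all(defender_payoff(attack, defend) >= defender_payoff(attack, d)
                          for d in CITIES)
        if attacker_ok and defender_ok:
            found.append((attack, defend))
    return found

def attacker_value_vs_coin(attack):
    """Expected attacker payoff for a fixed target when the defender flips a fair coin."""
    half = Fraction(1, 2)
    return sum(half * attacker_payoff(attack, d) for d in CITIES)

print(pure_equilibria())                            # [] : someone always regrets
print([attacker_value_vs_coin(c) for c in CITIES])  # both targets are worth 1/2
```

Against a coin-flipping defender both targets are worth the same, so the attacker has nothing to regret whichever city the coin picks. By symmetry the same holds for the defender.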

We can formalize this further. Suppose you are attacking a different country with two new potential targets: Metropolis and Podunk. Metropolis is a rich and strategically important city (utility: 10); Podunk is an out of the way hamlet barely worth the trouble of capturing it (utility: 1).

A so-called first-level player thinks: “Well, Metropolis is a better prize, so I might as well attack that one. That way, if I win I get 10 utility instead of 1.”

A second-level player thinks: “Obviously Metropolis is a better prize, so my enemy expects me to attack that one. So if I attack Podunk, he'll never see it coming and I can take the city undefended.”

A third-level player thinks: “Obviously Metropolis is a better prize, so anyone clever would never do something as obvious as attack there. They'd attack Podunk instead. But my opponent knows that, so, seeking to stay one step ahead of me, he has defended Podunk. He will never expect me to attack Metropolis, because that would be too obvious. Therefore, the city will actually be undefended, so I should take Metropolis.”

And so on ad infinitum, until you become hopelessly confused and have no choice but to spend years developing a resistance to iocane powder.

But surprisingly, there is a single best solution to this problem, even if you are playing against an opponent who, like Professor Quirrell, plays “one level higher than you.”

When the two cities were equally valuable, we solved our problem by flipping a coin. That won't be the best choice this time. Suppose we flipped a coin and attacked Metropolis when we got heads, and Podunk when we got tails. Since my opponent can predict my strategy, he would defend Metropolis every time; I am equally likely to attack Podunk and Metropolis, but my taking Metropolis would cost him much more utility. My total expected utility from flipping the coin is 0.5: half the time I successfully take Podunk and gain 1 utility, and half the time I am defeated at Metropolis and gain 0. And this is not a Nash equilibrium: if I had known my opponent's strategy was to defend Metropolis every time, I would have skipped the coin flip and gone straight for Podunk.
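The arithmetic is easy to check. In this sketch (the helper name is hypothetical) the attacker collects a city's value only when it is undefended and gets 0 when fought off, as in the story.

```python
from fractions import Fraction

VALUE = {"Metropolis": 10, "Podunk": 1}

def expected_gain(p_attack_metropolis, defended):
    """Attacker's expected utility when attacking Metropolis with the given
    probability (Podunk otherwise) against a defender who always guards `defended`."""
    p = Fraction(p_attack_metropolis)
    attack_probs = {"Metropolis": p, "Podunk": 1 - p}
    # Only attacks on the undefended city pay anything.
    return sum(prob * VALUE[city] for city, prob in attack_probs.items() if city != defended)

print(expected_gain(Fraction(1, 2), "Metropolis"))  # fair coin: 1/2
print(expected_gain(0, "Metropolis"))               # always attack Podunk: 1
```

Once I know the defender always guards Metropolis, switching to "always Podunk" doubles my payoff, which is exactly why the fair coin is not an equilibrium.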

So how can I find a Nash equilibrium? In a Nash equilibrium, I don't regret my strategy when I learn my opponent's action. If I can come up with a strategy that pays exactly the same utility whether my opponent defends Podunk or Metropolis, it will have this useful property. We'll start by supposing I am flipping a *biased* coin that lands on Metropolis a fraction x of the time, and therefore on Podunk a fraction (1-x) of the time. To be truly indifferent which city my opponent defends, 10x (the utility my strategy earns when my opponent leaves Metropolis undefended) should equal 1(1-x) (the utility my strategy earns when my opponent leaves Podunk undefended). Some quick algebra finds that 10x = 1(1-x) is satisfied by x = 1/11. So I should attack Metropolis 1/11 of the time and Podunk 10/11 of the time.
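Spelled out, with x the probability of attacking Metropolis, the quick algebra is:

$$
\begin{aligned}
10x &= 1\,(1 - x) \\
10x + x &= 1 \\
11x &= 1 \\
x &= \tfrac{1}{11}, \qquad 1 - x = \tfrac{10}{11}.
\end{aligned}
$$

Both sides of the first line then equal 10/11, which is what this strategy earns whichever city is left undefended.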

My opponent, going through a similar process, comes up with the suspiciously similar result that he should defend Metropolis 10/11 of the time, and Podunk 1/11 of the time.
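One way to read that "similar process" is sketched here; the symbol q is introduced only for this illustration. Let q be the probability that he defends Metropolis. He wants me to earn the same whichever city I attack: attacking Metropolis pays me 10 only when it is undefended, and attacking Podunk pays me 1 only when he is away guarding Metropolis.

$$
\begin{aligned}
10\,(1 - q) &= 1 \cdot q \\
10 &= 11q \\
q &= \tfrac{10}{11}, \qquad 1 - q = \tfrac{1}{11}.
\end{aligned}
$$

Note that the 10 multiplies (1 - q), the chance Metropolis is left open, not q. Putting it on the other term gives the reversed answer of 1/11 for Metropolis.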

If we both pursue our chosen strategies, I gain an average of 0.9090... utility each round, soundly beating my previous record of 0.5, and my opponent suspiciously loses an average of 0.9090... utility. It turns out there is no other strategy I can use to consistently do better than this when my opponent is playing optimally, and that even if I knew my opponent's strategy I would not be able to come up with a better strategy to beat it. It also turns out that there is no other strategy my opponent can use to consistently do better than this if I am playing optimally, and that my opponent, upon learning my strategy, doesn't regret his strategy either.
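These claims can be verified with exact fractions. The sketch below uses made-up helper names and assumes, as in the story, that the attacker gains a city's value when it is undefended and 0 otherwise.

```python
from fractions import Fraction

VALUE = {"Metropolis": 10, "Podunk": 1}
CITIES = tuple(VALUE)

def attacker_payoff(attack_mix, defend_mix):
    """Attacker's expected utility; each mix maps a city to a probability.
    The attacker collects a city's value only if it is left undefended."""
    return sum(
        attack_mix[a] * defend_mix[d] * VALUE[a]
        for a in CITIES for d in CITIES if a != d
    )

def pure(city):
    """The 'mix' that picks one city with certainty."""
    return {c: Fraction(int(c == city)) for c in CITIES}

attack = {"Metropolis": Fraction(1, 11), "Podunk": Fraction(10, 11)}
defend = {"Metropolis": Fraction(10, 11), "Podunk": Fraction(1, 11)}

value = attacker_payoff(attack, defend)
# What the attacker would earn by abandoning the mix for a fixed target.
attacker_deviations = [attacker_payoff(pure(c), defend) for c in CITIES]
# What the attacker would earn if the defender abandoned his mix instead.
defender_deviations = [attacker_payoff(attack, pure(c)) for c in CITIES]

print(value)                # 10/11
print(attacker_deviations)  # [Fraction(10, 11), Fraction(10, 11)]
print(defender_deviations)  # [Fraction(10, 11), Fraction(10, 11)]
```

Every unilateral switch to a fixed city leaves the payoff at 10/11, and since any other mix is an average of those two fixed choices, neither side can improve by deviating.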

In *The Art of Strategy*, Dixit and Nalebuff cite a real-life application of the same principle in, of all things, penalty kicks in soccer. A right-footed kicker has a better chance of success if he kicks to the right, but a smart goalie can predict that and will defend to the right; a player expecting this can accept a less spectacular kick to the left if he thinks the left will be undefended, but a very smart goalie can predict this too, and so on. Economist Ignacio Palacios-Huerta laboriously analyzed the success rates of various kickers and goalies on the field, and found that they actually pursued a mixed strategy generally within 2% of the game theoretic ideal, proving that people are pretty good at doing these kinds of calculations unconsciously.

So every game really does have at least one Nash equilibrium, even if it's only a mixed strategy. But some games can have many, many more. Recall the situation between me and my girlfriend:

There are two Nash equilibria: both of us working late, and both of us going home. If there were only one equilibrium, and we were both confident in each other's rationality, we could choose that one and there would be no further problem. But in fact this game does present a problem: intuitively it seems like we might still make a mistake and end up in different places.

Here we might be tempted to just leave it to chance; after all, there's a 50% probability we'll both end up choosing the same activity. But other games might have thousands or millions of possible equilibria and so will require a more refined approach.

*Art of Strategy* describes a game show in which two strangers were separately taken to random places in New York and promised a prize if they could successfully meet up; they had no communication with one another and no clues about how such a meeting was to take place. Here there are a nearly infinite number of possible choices: they could both meet at the corner of First Street and First Avenue at 1 PM, they could both meet at First Street and Second Avenue at 1:05 PM, etc. Since neither party would regret their actions (if I went to First and First at 1 and found you there, I would be thrilled) these are all Nash equilibria.

Despite this mind-boggling array of possibilities, in fact all six episodes of this particular game ended with the two contestants meeting successfully after only a few days. The most popular meeting site was the Empire State Building at noon.

How did they do it? The world-famous Empire State Building is what game theorists call focal: it stands out as a natural and obvious target for coordination. Likewise noon, classically considered the very middle of the day, is a focal point in time. These focal points, also called Schelling points after theorist Thomas Schelling who discovered them, provide an obvious target for coordination attempts.

What makes a Schelling point? The most important factor is that it be special. The Empire State Building, depending on when the show took place, may have been the tallest building in New York; noon is the only time that fits the criteria of “exactly in the middle of the day”, except maybe midnight when people would be expected to be too sleepy to meet up properly.

Of course, specialness, like beauty, is in the eye of the beholder. David Friedman writes:

Two people are separately confronted with the list of numbers [2, 5, 9, 25, 69, 73, 82, 96, 100, 126, 150 ] and offered a reward if they independently choose the same number. If the two are mathematicians, it is likely that they will both choose 2—the only even prime. Non-mathematicians are likely to choose 100—a number which seems, to the mathematicians, no more unique than the other two exact squares. Illiterates might agree on 69, because of its peculiar symmetry—as would, for a different reason, those whose interest in numbers is more prurient than mathematical.

A recent open thread comment pointed out that you can justify anything with “for decision-theoretic reasons” or “due to meta-level concerns”. I humbly propose adding “as a Schelling point” to this list, except that the list is tongue-in-cheek and Schelling points really do explain almost everything - stock markets, national borders, marriages, private property, religions, fashion, political parties, peace treaties, social networks, software platforms and languages all involve or are based upon Schelling points. In fact, whenever something has “symbolic value” a Schelling point is likely to be involved in some way. I hope to expand on this point a bit more later.

Sequential games can include one more method of choosing between Nash equilibria: the idea of a subgame-perfect equilibrium, a special kind of Nash equilibrium that remains a Nash equilibrium for every subgame of the original game. In more intuitive terms, this equilibrium means that even in a long multiple-move game no one at any point makes a decision that goes against their best interests (remember the example from the last post, where we crossed out the branches in which Clinton made implausible choices that failed to maximize his utility?) Some games have multiple Nash equilibria but only one subgame-perfect one; we'll examine this idea further when we get to the iterated prisoners' dilemma and ultimatum game.

In conclusion, every game has at least one Nash equilibrium, a point at which neither player regrets her strategy even when she knows the other player's strategy. Some equilibria are simple choices; others involve plans to make choices randomly according to certain criteria. Purely rational players will always end up at a Nash equilibrium, but many games will have multiple possible equilibria. If players are trying to coordinate, they may land at a Schelling point, an equilibrium that stands out as special in some way.

The actual equilibria can seem truly mind boggling at first glance. Consider this famous example:

There are 5 rational pirates, A, B, C, D and E. They find 100 gold coins. They must decide how to distribute them.

The pirates have a strict order of seniority: A is superior to B, who is superior to C, who is superior to D, who is superior to E.

The pirate world's rules of distribution are thus: the most senior pirate proposes a distribution of coins. The pirates, including the proposer, then vote on whether to accept this distribution. If the proposed allocation is approved by a majority or a tie vote, it happens. If not, the proposer is thrown overboard from the pirate ship and dies, and the next most senior pirate makes a new proposal to begin the system again.

Pirates base their decisions on three factors.

1) Each pirate wants to survive.

2) Given survival, each pirate wants to maximize the number of gold coins he receives.

3) Each pirate would prefer to throw another overboard, if all other results would otherwise be equal.

The pirates do not trust each other, and will neither make nor honor any promises between pirates apart from the main proposal.
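For readers who want to work the puzzle through, the rules above lend themselves to backward induction: settle the one-pirate case, then add one more senior pirate at a time. The sketch below is one possible encoding, not a definitive solver. The function name is hypothetical, coins are assumed indivisible, and ties between equally cheap votes are broken arbitrarily, which does not matter for five pirates but could for larger crews.

```python
def pirate_split(num_pirates=5, coins=100):
    """Backward induction for the pirate puzzle as stated above.
    Returns the senior-most pirate's proposal, ordered from most to least senior."""
    # With a single pirate left, he simply keeps everything.
    outcome = [coins]
    for n in range(2, num_pirates + 1):
        # `outcome` is what the n-1 junior pirates get if this proposer goes overboard.
        fallback = outcome
        # The proposer votes for himself and a tie passes, so he needs
        # ceil(n/2) votes in total, i.e. this many from others:
        votes_to_buy = (n + 1) // 2 - 1
        # A pirate who enjoys throwing people overboard must be offered strictly
        # more than his fallback, so his vote costs fallback + 1.
        prices = sorted((fallback[i] + 1, i) for i in range(n - 1))
        cheapest = prices[:votes_to_buy]
        cost = sum(price for price, _ in cheapest)
        if cost > coins:
            raise ValueError("proposer cannot afford enough votes; not handled in this sketch")
        juniors = [0] * (n - 1)
        for price, i in cheapest:
            juniors[i] = price
        outcome = [coins - cost] + juniors
    return outcome

print(pirate_split())  # [98, 0, 1, 0, 1]
```

With two pirates left, the senior one keeps all 100 coins on a tie vote, and each earlier proposer only has to outbid by one coin what the cheapest voters would get after his death.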

It might be expected intuiti...

It's amazing, the results people come up with when they don't use TDT (or some other formalism that doesn't defect in the Prisoner's Dilemma - though so far as I know, the concept of the Blackmail Equation is unique to TDT.)

(Because the base case of the pirate scenario is, essentially, the Ultimatum game, where the only reason the other person offers you $1 instead of $5 is that they *model you* as accepting a $1 offer, which is a very stupid answer to compute if it results in you getting only $1 - only someone who two-boxed on Newcomb's Problem would contemplate such a thing.)

At some point you proposed to solve the problem of blackmail by responding to offers but not to threats. Do you have a more precise version of that proposal? What logical facts about you and your opponent indicate that the situation is an offer or a threat? I had problems trying to figure that out.

I have a possible idea for this, but I think I need help working out more of the rules for the logical scenario as well. All I have are examples (and it's not like examples of a threat are that tricky to imagine).

Person A makes situations that involve some form of request (an offer, a series of offers, a threat, etc.). Person B may either Accept, Decline, or Revoke Person A's requests. Revoking a request blocks requests from occurring at all, at a cost.

Person A might say "Give me 1 dollar and I'll give you a Frozen Pizza." And Person B might "Accept" if Frozen Pizza grants more utility than a dollar would.

Person A might say "Give me 100 dollars and I'll give you a Frozen Pizza." Person B would "Decline" the offer, since Frozen Pizza probably wouldn't be worth more than 100 dollars, but he probably wouldn't bother to revoke it. Maybe Person A's next situation will be more reasonable.

Or Person A might say "Give me 500 dollars or I'll kill you." And Person B will pick "Revoke" because he doesn't want that situation to occur at all. The fact that there is a choice between death or minus 500 dollars is not a good situation. He m...

The doctor walks in, face ashen. "I'm sorry- it's likely we'll lose her or the baby. She's unconscious now, and so the choice falls to you: should we try to save her or the child?"

The husband calmly replies, "Revoke!"

In non-story format: how do you formalize the difference between someone telling you bad news and someone causing you to be in a worse situation? How do you formalize the difference between accidental harm and intentional harm? How do you determine the value for having a particular resistance to blackmail, such that you can distinguish between blackmail you should and shouldn't give in to?

Remember, this isn't any old pirate crew. This is a crew with a *particular set of rules* that gives different pirates *different powers*. There's no "hang the rules!" here, and since it's an artificial problem there's an artificial solution.

D has the power to walk away with all of the gold, and A, B, and C dead. He has no incentive to agree to anything less, because if enough votes fail he'll be in the best possible position. This determines E's default result, which is what he makes decisions based off of. Building out the game tree helps here.

If C and E - and I'd say all 4 of them really, at least regarding a 98 0 1 0 1 solution - were inclined to be outraged as I suggest, and A knew this, they would walk away with more money. For me, that trumps any possible math and logic you could put forward.

And just in case A is stupid:

"But look, C and E, this is the optimal solution, if you don't listen to me you'll get less gold!"

"Nice try, smartass. Overboard you go."

B watched on, starting to sweat...

EDIT: Ooops, I notice that I missed the fact that B doesn't need to sweat since he just needs D. Still, my main point isn't about B, but A.

Also I wanna make it 100% clear: I don't claim that the proof is incorrect, given all the assumptions of the problem, including the ones about how the agents work. I'm just not impressed with the agents, with their ability to achieve their goals. Leave A unchanged and toss in 4 reasonably bright real humans as B C D E, at least some of them will leave with more money.

Looking at the problem, I believe there is a third equilibrium, a mixed one. Both you and your girlfriend toss a coin, and choose to go home with probability one half, or stay at work with probability one half. This gives you both an expected utility of 2. If you are playing that strategy, then it doesn't matter to your girlfriend whether she stays at work (definite utility of 2) or goes home (50% probability of 1, 50% probability of 3), so she can't do better than tossing a coin.
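The indifference claim is quick to check. This is a minimal sketch with a hypothetical helper name, using the payoffs from the post (3 together at home, 2 for working late, 1 for home alone).

```python
from fractions import Fraction

def expected_utility(my_action, p_partner_home):
    """Expected utility of a pure action when the partner goes home
    with probability `p_partner_home`."""
    p = Fraction(p_partner_home)
    if my_action == "work":
        return Fraction(2)          # working late pays 2 regardless
    return 3 * p + 1 * (1 - p)      # home: 3 if she is there, 1 if not

half = Fraction(1, 2)
print(expected_utility("home", half), expected_utility("work", half))  # 2 2
print(expected_utility("home", Fraction(3, 4)))                        # 5/2
```

At exactly one half both actions are worth 2, so a coin toss is a best response. If the partner leaned even slightly towards home, going home would become strictly better and the mix would unravel towards the both-home equilibrium.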

Incidentally, this is expected from Wilson's oddness theorem (1971) - almost all finite games have an odd number of equilibria.

This is a minor quibble, but while reading I got stuck at this point:

followed by a description of a game that didn't seem to have a Nash equilibrium and confirming text "Here there is no pure Nash equilibrium." and "So every option has someone regretting their choice, and there is no simple Nash equilibrium. What do you do?"

I kept re-reading this section, trying to work out how to reconcile these statements since it seemed like you had just offered an irrefutable counterexample to John Nash's theorem. It could use a bit of clarification (maybe something like "This game does have a Nash equilibrium, but one that is a little more subtle" or something similar).

Other than that I'm finding this sequence excellent so far.

There is no pure equilibrium, but there is a *mixed* equilibrium.

A pure strategy is a single move played *ad infinitum*.

A mixed strategy is a set of moves, with each turn's move randomly selected from this set.

A pure equilibrium is one where every player follows a pure strategy, and a mixed equilibrium is one where at least some players follow a mixed strategy.

Both pure equilibriums and mixed equilibriums are Nash equilibriums. Nash's proof that every game has an equilibrium rests on his previous work where he and von Neumann invented the concept of a *mixed* equilibrium and proved that it satisfies the criteria.

So this game has no pure equilibrium, but it does have a mixed one. Yvain goes on to describe how you calculate and determine that mixed equilibrium, and shows that it is the attacker playing Metropolis 1/11th the time, and Podunk 10/11th the time.

EDIT: The post explains this at the end:

Yvain: I would strongly recommend including a quick explanation of mixed and pure strategies, and defining equilibriums as either mixed or pure, as a clarification. At the least, move this line up to near the top. Excellent post and excellent sequence.

This is a trivial point, but as a student of mathematics, I feel compelled to point out that while I think he is correct that most mathematicians would choose 2, his reasoning for why is wrong. Mathematicians would pick 2 because there is a convention in mathematics that, when you have to make an arbitrary choice (but want to specify it anyway), you pick the smallest (if this makes sense in context).

Excellent post, excellent sequence. Advanced game theory is *definitely* something rationalists should have in their toolbox, since so much real-world decision-making involves other peoples' decisions.

An interesting analysis when you have "nearly infinite" Nash equilibria, is the ...

Could someone explain how the defending general in the Metropolis - Podunk example calculates his strategy?

I understand why the expected loss stays the same only if he defends Metropolis 10/11 of the time and Podunk 1/11 of the time, but when I tried to calculate his decisions with the suggested procedure I ended up with the (wrong) result that he should defend Podunk 10/11 of the time and Metropolis 1/11, and I don't understand what I'm doing wrong, so I'd likely botch the numbers when trying to calculate what to do for a more difficult decision.

Nice post, thanks for writing it up. It was interesting and very easy to read because principles were all first motivated by problems.

That said, I still feel confused. Let me try to re-frame your post in a decision theory setting: We have to make a decision, so the infinite regress of dependencies from rational players modelling each other, which doesn't generally settle to a fixed set of actions for all players who model each other naively (as shown by the penny matching example), has to be truncated somehow. To do that, the game theory people decided th...