The continued misuse of the Prisoner's Dilemma

by SilasBarta, 23rd Oct 2009


Related to: The True Prisoner's Dilemma, Newcomb's Problem and Regret of Rationality

In The True Prisoner's Dilemma, Eliezer Yudkowsky pointed out a critical problem with the way the Prisoner's Dilemma is taught: the distinction between utility and avoided-jail-time is not made clear.  The payoff matrix is supposed to represent the former, even as its numerical values happen to coincidentally match the latter.  And worse, people don't naturally assign utility as per the standard payoff matrix: their compassion for the friend in the "accomplice" role means they wouldn't feel quite so good about a "successful" backstabbing, nor quite so bad about being backstabbed.  ("Hey, at least I didn't rat out a friend.")

For that reason, you rarely encounter a true Prisoner's Dilemma, even an iterated one.  The above complications prevent real-world payoff matrices from working out that way.

Which brings us to another unfortunate example of this misunderstanding being taught.

On the New York Times's "Freakonomics" blog, Professor Daniel Hamermesh recently and gleefully recounted an experiment (one he says he runs often) that he performed on students in his intro economics course.  The experiment is basically the same as the Prisoner's Dilemma (henceforth, PD).

Now, before going further, let me make clear that Hamermesh is no small player.  Just take a look at all the accolades and accomplishments listed on his Wikipedia page or his university page CV.  This is the teaching of a professor at the top of his field, so it's only with hesitation that I proceed to allege that he's Doing It Wrong.

Hamermesh's variant of the PD is to pick eight students and auction off a $20 bill to them, with the money split evenly across the winners if there are multiple highest bids.  Here, cooperation corresponds to adhering to a conspiracy where everyone agrees to make the same low bid and thus a big profit.  Defecting corresponds to breaking the agreement and making a slightly higher bid so you can take everything for yourself.  If the others continue to cooperate, their bid is lower and they get nothing.
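To make the payoffs concrete, here is a minimal sketch of the auction in Python. It assumes that the highest bidders split the $20 evenly, that each winner pays their own bid, and that losers pay nothing; that reading reproduces the $2.49 and $19.95 figures in Hamermesh's account, but the rule is otherwise an assumption, and the names `PRIZE_CENTS` and `payoffs` are made up for illustration.

```python
from fractions import Fraction

PRIZE_CENTS = 2000  # the $20 bill, in cents (hypothetical constant name)


def payoffs(bids_cents):
    """Net payoff in cents for each bidder.

    Assumed rule: the highest bidders split the prize evenly,
    each winner pays their own bid, and losers pay nothing.
    """
    top = max(bids_cents)
    winners = bids_cents.count(top)
    share = Fraction(PRIZE_CENTS, winners)
    return [share - b if b == top else Fraction(0) for b in bids_cents]


# Everyone sticks to the one-cent agreement: 249 cents ($2.49) each.
collude = payoffs([1] * 8)

# Seven cooperate, one bids five cents: 1995 cents ($19.95) for the
# defector, nothing for the rest.
defect = payoffs([1] * 7 + [5])

print(collude[0], defect[-1], defect[0])
```

Under this reading, a single five-cent bid turns seven payoffs of $2.49 into zero, which is the sense in which bidding higher plays the role of defection.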

Here is how Hamermesh describes the result (italics mine, bold in the original):

Today seven of the students stuck to the collusive agreement, and each bid $.01. They figured they would split the $20 eight ways, netting $2.49 each. Ashley, bless her heart, broke the agreement, bid $0.05, and collected $19.95. The other 7 students booed her, but I got the class to join me in applauding her, as she was the only one who understood the game.

The game?  Which game?  There's more than one game going on here!  There's the neat little well-defined, artificial setup that Professor Hamermesh has laid out.  On top of that, there's the game we better know as "life", in which the later consequences of superficially PD-like scenarios cause us to assign different utilities to successful backstabbing (defecting when others cooperate).  There's also the game of becoming the high-status professor's Wunderkind.  And while Ashley (whose name he bolded for some reason) may have won the narrow, artificial game, she also told everyone there that, "Trusting me isn't such a good idea."  In other words, the kind of consequence we normally worry about in our everyday lives.

For this reason, I left the following comment:

No, she learned how to game a very narrow instance of that type of scenario, and got lucky that someone else didn’t bid $0.06.

Try that kind of thing in real life, and you’ll get the social equivalent of a horse’s head in your bed.

Incidentally, how many friends did Ashley make out of this event?

I probably came off as more "anticapitalist" or "collectivist" than I really am, but the point is important: betraying your partners has long-term consequences which aren't apparent when you only look at the narrow version of this game.

Hamermesh's point was actually to show the difficulty of collusion in a free market.  However, to the extent that markets can pose barriers to collusion, it's certainly not because going back on your word will consistently work out in just the right way as to divert a huge amount of utility to yourself -- which happens to be the very reason Ashley "succeeded" (with the professor's applause) in this scenario.  Rather, it's because the incentives for making such agreements fundamentally change; you are still better off maintaining a good reputation.

Ultimately, the students learned the wrong lesson from an unrealistic game.

70 comments

EDIT: as suggested I have turned most of this comment, somewhat expanded, into a post.

The entire point of experiential learning, which is what you set up to happen when you have students play a game - as opposed to telling them about a game - is that there is no "right" or "wrong" lesson to be taken from it.

...see linked post for fuller argument...

If Hamermesh is to be faulted for something, it is for (apparently) imposing on the students his own conclusions from a given outcome, as opposed to letting the students figure out for themselves what the outcome means.

9cousin_it11yAgree completely. I wonder why your comment isn't upvoted to +10. Applauding the defector in PD is a weird thing to do for a professor anyway. Possibly related is Nassim Nicholas Taleb's concept of the ludic fallacy [http://en.wikipedia.org/wiki/Ludic_fallacy]: "the person who is assuming a tightly-constrained game will emerge as the loser".
3Morendil11yI'm considering a top-level post (my first) on experiential games and a little background on how they might be worthwhile for LWers - there have been a few reports of experiences, such as the estimation/calibration game at one meetup, but I'm feeling that a little detail on the constructivist approach and practical advice on how to set up such games might be useful. I use experiential games quite a bit; one that I remember fondly from a few years ago was adapted from Dietrich Doerner's /The Logic of Failure/ - the one where you are to control a thermostat. Doerner's account of people actually playing the game is enlightening, with many irrational reactions on display. But reading about it is one thing, and actually playing the game quite another, so I got a group of colleagues together and we gave it a try. By the reports of all involved it was one of the most effective learning experiences they'd had. An experiential learning game focusing on the basics of Bayesian reasoning might be a valuable design goal for this community - one I'd definitely have an interest in playing.
2cousin_it11yBy all means write it, this stuff sounds very interesting. Possibly related are the PCT demo games [http://www.mindreadings.com/demos.htm] mentioned on LW before. I imagine a Bayesian learning game to be similar in spirit (better implement it in Flash rather than Java, though). Also tangentially related are the cognitive testing games [http://cognitivefun.net/].

Wait, you mean he let them conspire and they didn't set up explicit [monetary] penalties for breaking the agreement? Everybody fails.

I have been in Ashley's situation - roped in to play a similar parlour game to demonstrate game theory in action.

In my case it was in a work setting: part of a two day brainstorming / team building boondoggle.

In my game there were five tables each with eight people, all playing the same, iterated game.

In four out of five tables every single person cooperated in every single iteration - including the first and last one. On the fifth table they got confused about the rules.

The reason for the behaviour was clear - the purpose of the game was to demonstrate that cooperation increased the total size of the pot (the game was structured that way). In a workplace setting the prize was to win the approbation of the trainers and managers, by demonstrating that we were team players, and certainly NOT to be the asshole who cheated his tablemates and walked off with $50.

On the fifth table they managed to confuse themselves such that on the first iteration two of them unwittingly defected. Their table therefore ended up with the least money, but the two individuals of course ended up the richest in the room - they were hideously embarrassed.

I was left wondering what amount of money it would have taken to change behaviour. Would people defect if there was $1000 at stake? In that setting, I think still not. $10,000? $100,000 ?

Practical game-theory experiments would be quite expensive to run, I think.

Pretending to not understand the game and acting embarrassed in order to defect without social consequences seems like a pretty good strategy to me.

8Chronos11yI'm reminded of a real-world similar example: World of Warcraft loot ninjas. Background: when a good item drops in a dungeon, each group member is presented with two buttons, a die icon ("need") and a pile-of-gold icon ("greed"). If one or more people click "need", the server rolls a random 100-sided die for each player who clicked "need", and the player with the highest roll wins the item. If no one in the group clicked "need", then the server rolls dice for everyone in the group. Usually players enter dungeons in the hopes of obtaining items that directly improve their combat effectiveness, but many items can also be sold at the in-game auction house, sometimes for a substantial amount of gold, so that a character can still benefit indirectly even if the item itself has no immediate worth. As you can imagine, "pick-up groups" (i.e. four random strangers you might never party with again) often suffer from loot ninjas: people who intentionally click on the "need" button to vastly improve their odds of obtaining items, even when the item holds no direct value for themselves but does hold direct value for another party member. And, indeed, a common loot ninja strategy is to feign ignorance of the "need versus greed" loot roll system (which, to be fair, has legitimately confusing icons) and to use every other possible trick to elicit sympathy, such as feigning bad spelling and grammar, for as long as possible before being booted from the party and forcibly expelled from the dungeon.

Ultimately, the students learned the wrong lesson from an unrealistic game.

Alternately, they learned more about finding the balance between maintaining peer alliances and gaining the favour of the ruler (guessing the teacher's password). This is the very essence of the courtier. Even if the students don't fully comprehend that message I am confident that their intuitions are lapping it up.

On a note more explicitly related to economics they gained insight into using anti-competitive practices needed to get ahead (in this case public shaming) while avoiding crossing the line that triggers adverse social sanction (professorial intervention.)

Whichever way you look at it, Ashley won this game. Of course, many other students with less aggressive or less aware professors have lost by taking the same actions. Including some in classes which I have attended!

I think it's odd that he would say that only Ashley understood the game, not because she may actually be the loser in the wider scheme of things, but because the relevance of the Prisoner's Dilemma is that it is actually supposed to be a dilemma. His saying only her action showed understanding suggests he doesn't think it's a real dilemma at all. He thinks it's a question with an answer: defect.

3wedrifid11yIt isn't the Prisoner's Dilemma and Hamermesh did not describe it as such. It is similar to the Prisoner's Dilemma in as much as, well, it is to do with game theory and people could cooperate. The title of this post is a misuse of 'Prisoner's Dilemma'.
1Douglas_Knight11yIt is completely standard to refer to a wide class of problems as PD. This example is much closer than most examples.
4wedrifid11yIt is a completely standard mistake to refer to just about anything game theoretic as 'Prisoner's Dilemma'. In this instance, there are several elements that are neither newcomblike nor Prisoner's Dilemmaish. When one adds all the necessary assumptions and limitations to this problem to make the decision one particular agent faces analogous to a Prisoner's Dilemma one does not find that $0.05 is equivalent to 'defect'. The judgement required to reach that decision requires far more insight than a defection. When Hamermesh said Ashley understood the game he was not saying "Ashley chose to defect which is the correct response to the Prisoner's not-dilemma". Mind you, Neil makes a good point. He just happens to be making false claims about what a Professor believes because he has been fed a false premise. I don't like being misrepresented and I particularly don't like it when this misrepresentation makes me look naive. If we go around saying things that are not true out of negligence then this is what we can expect to happen.
1SilasBarta11yIt doesn't need to be. The mapping to the PD here is that defection is continuous rather than binary. It generalizes the concept of defection in the canonical PD so that you can choose a level of defection, and the most "defective" (!) person, if they aren't equal, diverts utility to him/herself at the expense of the other players. Just like how in the standard PD, a defection when the other player doesn't will divert utility to yourself.
0wedrifid11yIn the PD increasing defection level from 0 to 1 never lowers utility. In this game increasing what you call the continuous measure of defection always lowers utility except when your defection is the largest.
0SilasBarta11yThere's a deeper similarity to the PD and I explained it in the original post.
0wedrifid11yWe cannot draw conclusions about whether Hamermesh believes the Prisoner's Dilemma is a dilemma just because one element of the game he described is the potential for collusion.
0Neil11ySince a bid's winningness is contingent on other bids you can't use winning as a proxy for understanding. If they all thought and acted like Ashley and broke the pact with 5 cent bids would they all have got a round of applause for their great insight in bidding 5 cents?
0prase11yWin isn't an answer. It's like somebody asking "where's the Central station?" getting the answer "just find it".
1wedrifid11yNo, it's like saying "Alison found the Central Station! Well done!"

As a group, they'd get more money appointing just one person to bid $0.01, and splitting it after the fact.

0timtyler11yThe rules of the game forbid that.
-2billswift11yYou mean as a group they would have gotten the exact same amount of money as they in fact did. Maybe we should call this the "socialist fallacy" - confusing a group's total benefit with the "equality" of the outcome for the group's members.
3Cyan11yNo, I don't think that's what he means. There's an ambiguity in what is meant by "joint bid" in: If seven people bid $0.01, does the prof take $0.01 for his $20, or does he take $0.07?
2SilasBarta11yI noticed early on that the problem was ambiguous in this respect. Fortunately, for the point made, about the gains from cooperation and defection, it doesn't matter: all you need is that it's possible to share in larger gains by cooperating, unless someone defects, and the professor's reaction to what happened.
0[anonymous]11yNo, that's not what he means. From the Freakonomics blog post: So as a group, if seven people bid $0.01, they split $19.93, whereas under JamesAndrix's scheme, they'd net $19.99.

You're calling out the professor for not addressing the larger game of "life" but this post itself seems to be denying Ashley the opportunity to play the larger game of life at all.

For example, Ashley's demonstration also surely had some gross, if not necessarily net, benefit to her reputation - she showed everyone she is clever, that she can get the approval of the professor, etc.

Ashley may have left class and spent $17.45 neutralizing those hits to her reputation (distribute the money after class, buy everyone a beer, etc.). She would have ne...

1SilasBarta11yDid you miss the part about it being a 500-student class?
0bgrah44911yI meant the 7 people she "beat" at the game. Besides, have some faith in Ashley's ability to find a really, really good price-to-performance ratio on reputational gains!
2SilasBarta11yBut if she redistributed enough to her friends to make up for what she took, that would have been the original agreement they had made in the first place!
2bgrah44911yMy entire point was that I think it's possible for Ashley to use her gains from defecting in the PD to more than offset her real-life reputational costs. Do you disagree with my statement that it is possible to do that?
6SilasBarta11yYes, I do disagree, because any later redistribution to them will either be a) less than what they would have gotten in the original deal, or b) the same as what it would have been if she had just stuck to the agreement. Plus, undoing damage to your reputation is much harder than doing the damage.

This professor, who has no doubt debated game theory with many other professors and countless students making all kinds of objections, gets three paragraphs in this article to make a point. Based on this, you figure that the very simple objection that you're making is news to him?

One thing that concerns me about LW is that it often seems to operate in a vacuum, disconnected from mainstream discourse.

Yes, like I said, given Hamermesh's credentials, I didn't want to jump to any hasty conclusion.

However, professional game theorists do in fact get deceived by the supposed textbook correctness of their conclusions. That's why I linked the previous Regret of Rationality, which goes over why being "reasonable" and winning so sharply diverge. It's also part of why no one ever wins the "guess a third of the average guess" game by guessing zero, despite its correctness proof.

If Hamermesh did have some understanding of the issues I raised, it would have taken him very little -- even within the bounds of three paragraphs -- to make it clear. Just a simple "But Ashley may not get invited to many parties after this" would have sufficed.

But not only did Hamermesh not make such an acknowledgement, you can see from his tone that he quite clearly believes there is "a" correct way to play for that scenario, irrespective of what metagames it might be embedded in.

The other 7 students booed her, but I got the class to join me in applauding her, as she was the only one who understood the game.

The fact that students booed doesn't seem to have registered as ...

4Technologos11yAs soon as he said "she was the only one who understood the game," I wondered whether he really understood the game, broadly construed. Especially if we imagine some reasonable distribution of other players' actions, breaking from cooperation only benefits the player when 1) you bid the highest and therefore receive the money, or 2) everybody (or a majority) breaks cooperation and you don't look bad for doing so. Even then, 1) is only good if the money outweighs the social costs. Heck, there might be future financial costs, if they play cooperation games in the future in that class!
1Technologos11yBtw, they bold every name on the Freakonomics blog, at least the first time they are said in a particular post.
0SilasBarta11yOh. Oops. Guess I laid it on a little too thick there :-P
0Technologos11yDon't worry, I still voted you up, even after such an egregious error ;)
4billswift11yThis is one of the purest examples I have seen in a while of argument from authority, congratulations!
3Technologos11yIf he's really on top of the situation, why did he say the equilibrium was $17.50? Obviously this isn't an equilibrium, since anybody wins by defecting to $17.51. The equilibria are $19.99 and $20.00.
1Vladimir_Nesov11ySeconded.
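Technologos's claim above, that $17.50 is not an equilibrium while $19.99 and $20.00 are, can be checked by brute force. This is only a sketch under one reading of the ambiguous rules: tied winners make a single joint payment of the bid and split what is left of the $20, bids are whole cents, and only symmetric profiles (all eight bidding the same) are tested. The function names are hypothetical.

```python
from fractions import Fraction

PRIZE = 2000  # the $20 bill, in cents
N = 8         # number of bidders


def payoff_vs_symmetric(my_bid, others_bid):
    """My net payoff in cents when the other N - 1 players all bid others_bid.

    Assumed rule: tied winners pay the bid once between them and
    split the remainder of the prize; losers pay nothing.
    """
    if my_bid > others_bid:
        return Fraction(PRIZE - my_bid)      # sole winner
    if my_bid < others_bid:
        return Fraction(0)                   # outbid, gets nothing
    return Fraction(PRIZE - my_bid, N)       # N-way tie


def is_symmetric_equilibrium(bid, max_bid=2100):
    """True if no unilateral deviation in 0..max_bid cents beats bidding `bid`."""
    current = payoff_vs_symmetric(bid, bid)
    return all(payoff_vs_symmetric(d, bid) <= current
               for d in range(max_bid + 1))


equilibria = [b for b in range(2101) if is_symmetric_equilibrium(b)]
print(equilibria)  # [1999, 2000]
```

At any common bid below $19.99, raising by one cent captures almost the whole bill, so the search leaves only $19.99 and $20.00. If each winner instead pays their own bid in full, the answer changes, so the result depends on the assumed payment rule.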

I probably came off as more "anticapitalist" or "collectivist" than I really am, but the point is important: betraying your partners has long-term consequences which aren't apparent when you only look at the narrow version of this game.

This is actually the real meaning of "selfishness." It is in my own best interest to do things for the community.

The mantras of collectivists and anti-capitalists seem to either not realize or ignore the fact that greedy people aren't really doing things in their own best interest if they are making enemies in the process.

I had my intro Ethics students play an anonymous Prisoner's Dilemma with candy earlier this week - two one-shot, one iterated thrice. Although they didn't know who their own partners were, I had no good way to conceal who got no candy because someone had defected to their cooperation, who walked away with ten pieces looking smug, who had to settle for two, and who got five for mutual cooperation. This didn't appear to influence their behavior at all - actually, apart from the one star student who chose to attend that day and some of the people who manage...

1cousin_it11yDo you mean that they played randomly, or that they defected without articulating why?
5Alicorn11ySome of them seemed to be playing randomly. Some of them decided that they didn't like the game (too hard to understand, they weren't getting enough candy, whatever) and cooperated in spite of partner defection as a way of checking out of the game. One guy didn't even want to know what his partner had done last time during the iteration, he just defected every time - I guess that could be called a strategy, especially since he wound up with a randomly-playing partner that time.
3cousin_it11yThanks. So they saw the game as another nuisance that the teachers thought up... As my game theory book says, "there's really no point in playing poker except for sums of money it would really hurt to lose".
3Alicorn11yI didn't think that asking them all to put up cash would have gone over well, or I might have tried it. Besides, I got reimbursed for the candy and got to keep the leftovers.

I read the article, also. The description of the game was a bit short and somewhat ambiguous.

The game is designed to show people who participate why it is hard to maintain collusion or price fixing amongst oligopolies: secret agreements are not enough. It was a good demonstration of the difficulties in maintaining a secret deal, far better than simply reading about it.

A number of theorists think that price fixing is a mystery, because the economics of it should make any agreements disappear.

However, there are price-fixing schemes in the real world, and they are regularly prosecuted. So, how are the Ashleys dealt with by those groups?

8gwern11yFortunately, I just finished reading Our customers are the enemy [http://books.google.com/books?id=7M8n4UN23WsC], a study of cartels in the '80s and '90s, so I can tell you! Cartels have a number of ways, but the illegal ones have the most problems. One of the most effective methods was used by Archer Daniels Midland in its lysine cartel: it built lysine plants of grotesque overcapacity, something like 30% of the global market, but only sold part of its peak production; its threat to defectors like Sewon (the Korean manufacturer) was that if they cheated on their quotas, it would unleash a price war that would drive it into trouble (apparently Sewon was very heavily in debt from financing its expansion, and like the 2 Japanese companies, it had minimal non-lysine business) or outright bankruptcy. This is similar to what De Beers & OPEC have sometimes done, IIRC. Another method in other cartels is that the companies shared their internal financial data (whose veracity would be guaranteed by third-party auditors), pooled all the revenues, and then divided accordingly. Obviously this makes it harder to cheat as well, and reduces any incentive. An approximation to this would be market surveys, and if the surveys showed that any cartelist's share had fallen at the expense of another's, the offender would sell at-cost the product to the damaged party (one of the lysine mechanisms). Some cartels just hold together because the corporate managers running the cartel have a collective interest in driving up the price & their division's profits, but not much of one to engage in price wars for market share. (Such as the multiple vitamin cartels lasting many decades.) Others, like the big German conglomerates or Japanese zaibatsu, have been aided by government complicity or active aid.
And then there often can be legal punishments for defectors - going back to the vitamin cartel, we can read in Wikipedia: (And the whistleblower on the lysine cartel, ironically, wound up staying

"I probably came off as more "anticapitalist" or "collectivist" than I really am"

There is most certainly nothing anti-capitalist about creating and maintaining a reputation for cooperation. Who would loan you money or send you goods without payment if you have a reputation for defecting?

0korin4311yAnd what's the point of making money if everyone hates you?
1tut11ySurvival. If everyone loves you you might be able to live without money. But if everyone hates you then you need to give full payment for everything that you want from them. So then you really need money.
4RichardKennaway11yIf everyone really hates you they won't deal with you at all. And if they really really hate you they'll burn you in your house. Money is a great way to transact with strangers. Enemies and friends, it gets complicated.
0SilasBarta11yUm ... decrease the money supply? =-)

Silas, well said. I note that Bob Murphy has linked to this post.

BTW, while I agree that Hamermesh's experiment showed the difficulty of collusion in a free market, I doubt that was his intended "point".

Regards,

Tom

1SilasBarta11yThanks, TokyoTom. And now a commenter on Scott Sumner's blog mentioned [http://blogsandwikis.bentley.edu/themoneyillusion/?p=2698#comment-8908] this. I'm gonna be famous! :-P (Btw, any plans for allowing more profile information so we post our email and websites if we want?) I was saying the opposite: in his post, Hamermesh is saying that this game shows why collusion is difficult, but it doesn't capture the mechanism by which it makes collusion difficult.
0wedrifid11yThere are wiki user pages. Unfortunately, they are not linked.

You aren't saying anything here that Hamermesh isn't well aware of. He is teaching models, and models are simplifications of the world.

5SilasBarta11yHe's aware that the mechanism by which Ashley won (being a lucky liar) is not the reason markets prevent collusion? Then why is he teaching that as a demonstration of why markets prevent collusion? Kind of a strange way to go about it, don't you think?

The celebratory tone of the Freakonomics post is also pretty inexplicable. Why is he so happy that one student out of eight bid $0.05, when the model that he's teaching supposedly predicts that everyone bid $17.50? Either his model is horribly wrong, or the students haven't learned anything, or both...

Maybe this professor just doesn't spend much effort on his blog posts. Take a look at http://freakonomics.blogs.nytimes.com/2009/09/21/why-my-students-dont-get-rebates where he uses the phrase "Pareto improvement" in a completely wrong way. Anyone who doesn't already know what it means will be misled, and those who do will be confused.

0Jayson_Virissimo11yRobin, when do you go from "using a model" to committing the ludic fallacy? I would really be interested in a post that attempts to better define where this line is.

I read the article, and thought much the same thing. Ashley may be up financially, but down socially.

1Technologos11yDefinitely--how much would you (the abstract "you") pay to avoid the whole class seeing you as a jerk?
2wedrifid11y< -$20

When you write "If the others continue to cooperate, their bid is lower and they get nothing" you imply an iterated game. It seems clear from Hamermesh's account that players were only allowed to submit one bid.

Ashley won, but she didn't maximize her win. The smartest thing to do would be to agree to collude, bid higher, and then divide the winnings equally anyway. Everyone gets the same payout, but only Ashley would get the satisfaction of winning. And if someone else bids higher, she's no longer the sole defector, which is socially significant. And, of course, $20 is really not a significant enough sum to play hardball for.

0SilasBarta11ySorry for the poor phrasing. I didn't read it as an iterated game at all. That statement should instead read, "If the others nevertheless cooperate, ... " Should I update it? How do you do the strikeout/line-through thing.

This is not Prisoner’s Dilemma. The original has no reputation effects. http://en.wikipedia.org/wiki/Prisoner’s_dilemma

This was a game in a game theory class. As such, the teacher is trying to teach things like strategy domination, etc. In this case I believe he was applauding Ashley because she understood that a bid of .01 was weakly dominated by all other bids; that all other bids yield as good or better results.

Was it a bad idea for her to show herself as a “selfish git”? I don’t know; that depends on the social situation. My guess is that folks in a g...

0[anonymous]11yI don't understand - if you both do just as well both cheating as you do when you both act honestly, why is there any reason whatsoever to be "honest"?

On the topic of "utilities in the prisoner dilemma coinciding with jailtime" I quote one of my guest blog posts: http://phd.kt.pri.ee/2009/01/27/the-real-prisoner-dilemma/

Two hardened criminals are taken to interrogation in separate cells. They are offered the usual deal: If neither confesses, both get one year probation. If both confess, both do 5 years in jail. If one confesses, he goes free but the other does 10 years hard time.

Here’s what actually goes through their minds: “Okay, if neither of us confesses, we have to go back to the rea...
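The "usual deal" quoted above can be written out as a small table to show why the textbook analysis recommends confessing. This sketch assumes that each prisoner cares only about minimizing his own sentence, which is exactly the assumption that the post and this comment question. Scoring a year of probation as 1 and going free as 0 is also an assumption, and the names `YEARS` and `best_reply` are hypothetical.

```python
# Punishment for (my_move, other_move); lower is better for me.
# Assumption: one year of probation is scored as 1, going free as 0.
YEARS = {
    ("silent", "silent"): 1,
    ("silent", "confess"): 10,
    ("confess", "silent"): 0,
    ("confess", "confess"): 5,
}

MOVES = ("silent", "confess")


def best_reply(other_move):
    """The move that minimizes my own punishment, given the other's move."""
    return min(MOVES, key=lambda me: YEARS[(me, other_move)])


# Confessing is the best reply whatever the other prisoner does...
dominant = {other: best_reply(other) for other in MOVES}

# ...even though mutual silence is better for both than mutual confession.
both_silent = YEARS[("silent", "silent")]
both_confess = YEARS[("confess", "confess")]

print(dominant, both_silent, both_confess)
```

Once the numbers stand for anything other than bare jail time, such as what awaits each prisoner outside, the table changes and the dominance argument need not hold.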