Prisoner's Dilemma as a Game Theory Laboratory

25th Aug 2011

15RobertLumley

2Randaly

3RobertLumley

2prase

1RobertLumley

2Vaniver

1RobertLumley

0Vaniver

9lessdazed

7lessdazed

4vespiacic

0prase

2Eugine_Nier

2malthrin

0prase

5malthrin

4prase

3benelliott

1prase

3red75

2jimrandomh

0prase

9jimrandomh

0prase

2Caerbannog

1Solvent

1prase

0Solvent

0red75

0prase

2red75

0Eugine_Nier

2prase

0Nic_Smith

0prase

0Eugine_Nier

0lessdazed

0Eugine_Nier

0ArisKatsaris

0DuncanS

0[anonymous]

0Vaniver

2benelliott

3torekp

1benelliott

1jhuffman

0Jem

New Comment

47 comments, sorted by Click to highlight new comments since: Today at 8:12 AM

I wish I had been around when the initial Diplomacy game was organized. Diplomacy is a hobby of mine. And that project would have been much better organized (and recorded) on the website where I play, WebDiplomacy.

Furthermore, if you wanted a more easily analyzed metric, you can play "gunboat" diplomacy, where you're not allowed to communicate between countries, and the only communication are by moves - you can use support holds to indicate a request for an alliance, etc. I particularly enjoy this form of diplomacy, as it's far more like seven player chess.

A lot of the people on WebDiplomacy are math/science people; the two communities (if LessWrong could get over the HUGE number of trolls on WebDip) would actually get along really well. There are a number of metrics that they collect, but below is a link to a project that one person did, wherein, among other things, he analyzed the most statistically successful openings for the various countries. Also, below that, I have a link to my personal statistics, if you'd find those interesting.

http://tinyurl.com/DiplomacyReport

http://tinyurl.com/SmileysDipStats

I figure I've promoted LessWrong on WebDip enough, I ought to promote WebDip here too... :-)

There was another game, mostly including the NY members, which took place on WebDiplomacy. The game is here; Zvi Mowshowitz won.

Errrrr.... Points Per Supply Center...

Certainly excusable for LWers, but any serious play on WebDip is WTA. :-P

This is a topic highly debated on WebDiplomacy, (I could go on) but Babak said it best when he said (paraphrasing) "Playing for a strong second in PPSC in diplomacy is like shooting it in your own goal in soccer just because you want to score a goal."

And that project would have been much better organized (and recorded) on the website where I play, WebDiplomacy.

Perhaps it would, but Yvain's moderation and reports after each turn gave it a unique flavour which standard WebDip games lack. Anyway, it may be easy to organise another game for LWers on WebDip.

There will definitely be tons of interest. Since I claimed I was willing to set up another game after winning one of the LW games, I probably ought to go ahead and make this happen.

I'm happy to help if you want. I was considering just making a post myself, but I wanted to gauge whether or not there was any interest at all.

And here I was thinking I might take a bit of a break from diplomacy after my games finish up so I can focus more on my school work...

Thanks for the offer! I'm planning on mostly doing the administrative stuff of sorting people, as I've never used WebDiplomacy before. (I imagine it's pretty easy to use, but experience generally helps). The previous post has been updated with a link, if you'd like to sign up.

This is a fantastic idea.

My strategy is to build up an immunity to iocane powder. If you need help coding that just let me know.

I suggest the submission guidelines be modified to encourage naming the strategies. That would be fun.

I'm envisioning:

- A textual interface where people can enter strategies in psuedocode or possibly a programming language that I can sandbox.
- A scheduled job to run games along the lines you've described above every day? hour?
- A page to display the outcome of previous tournaments.

This would be an interesting result, since the Nash equilibrium of fixed length prisoner's dilemma is defectbot, and defectbots have not had a great rate of success in past tournaments.

Yes. It would depend on concrete implementation and number of players. The numbers needed for defectbots to actually thrive are very big and it would take a long time to converge into Nash, so I was perhaps a bit overconfident with my prediction.

Moreover, simulations I ran using your rules for evolutionary tournament show that one strategy quickly dominates and others go extinct. Defectbot is among strategies which are fastest to go extinct (even in presence of cooperatebot) as it feeds off overaltruist strategies, which in turn fail to compete with tit-for-tat. So I doubt that at least evolutionary tournament will converge into Nash.

I predict that strategy that tit-for-tats 99 turns and defects on 100-th one will win in evolutionary tournament, given that tit-for-tat is also in the population.

ETA: I've sent another strategy.

The strategy can be described in any comprehensible language. If I find problems in understanding, which will probably happen if the strategy is described in Lisp or Basque, but can happen even if it is written in English, I will ask.

You're going to be translating these all into computer programs so you can run them, right? You should specify which language it's going to be, so we can save you some work.

I have already a functional code written in Wolfram Mathematica which simulates the tournaments.

If the number of strategies will not be too big, it is easier for me to code them than to write down the instructions needed for the readers to know how I represent the data, not to speak about help for those unfamiliar with Mathematica's syntax. Simple strategies are usually one-liners, e.g. tit-for-tat has 29 characters.

In the meantime I've run my own simulation, studying a group of strategies, which perform as tit-for-tat except that at specific turn they defect and then they use result of this turn to switch to defect stone or continue tit-for-tatting. Thus they recognize copies of itself and cooperate with them. Such strategy can be exploited by switching to defect stone before it does, or by mimicking its behavior (second defect check after first. This case I didn't analyze).

It leads to interesting results in evolutionary tournament. Second fairest (second longest period of tit-for-tatting) strategy wins. It outperforms less fair strategies by longest cooperation with fairest strategy. And it outperforms fairest strategy by exploiting it.

Define strategy *S[n]* as TfT until turn *n* and defect ever since. In the limit of infinite population having non-zero initial number of *S[n]* for each *n*, *S[0]*, i.e. DefectBot, eventually dominates. Starting with equal subpopulations, initially most successful is *S[99]* which preys on *S[100]* and finally drives it to extinction. But then, *S[98]* gains advantage over *S[99]* and so on.

With not so big population however, the more defectorish strategies die out sooner than the environment becomes suitable for them. (I have done it with population of 2000 strategies and the lowest surviving after several hundred generations was *S[80]* or so).

Try another strategy. I[n] - TfT until turn n, defect on turn n, on later turns check if result on turn n was (defect,defect) and play TfT, otherwise defect. Idea is selfcooperation.

So, anyone want to bet on what proportion of strategies will be tit-for-tat? I'm fairly confident it'll be more than half.

(I should be clear, I mean the number of submitted strategies, as duplicate strategies are treated as one strategy.)

In a tournament with a fixed number of rounds and a known random strategy, it would be a bit weird to just use vanilla tit-for-tat, but yeah. It will be a bit boring if all the strategies are nice.

If you expect to play against tit-for-tat strategies and a random strategy then you would know that tit-for-tat cannot be expected to win. I'd be surprised if lots of people submit strategies that have almost no chance of winning.

Last year Yvain had organised a Diplomacy game between LessWrong users to test how well we perform in practical application of game theory. At least two games had been played, but as far as I know no analysis was made afterwards. One reason is probably that few games involving complex interactions between players constitute at most anecdotal evidence for whatever hypothesis one may test. The second one is lack of comparison to outside players. Although the games were fun, their value as a game theory experiment remains rather low. Could we test our game theoretic skills in a statistically more significant way?

Only recently I have learned about Robert Axelrod's experiment in which he run a competition of different strategies playing

iteraded prisoner's dilemma, and got an idea to replicate it. I have already run a similar experiment with five contestants (all being my friends) and now a second run is being prepared, with at least nine strategies in the pool. I am interested in a third run, this time with strategies nominated by LessWrongers. The contestants of the second run which has identical rules are readers of my blog and neither of them is probably familiar with specific LW ideas. Therefore, they would serve as a fairly good control group to test LW's applied rationality skills (or a subset of). After matching the strategies in both groups separately, I plan to put all of them together and see who wins.So, if you want to participate in this contest, feel free to send me your strategy. The rules are following.

strategyI mean a program sent by a contestant or coded according to his/her instructions. The strategies compete in iterated prisoner's dilemmas. A single iteration I will call aturn. In each turn each strategy has to choose betweencooperatinganddefecting. The payoffs are:matchI mean a series of 100 iterations between the same opponents.round-robin tournamentand theevolutionary tournament. Two separate final standings will be made, one for each tournament. Any received strategy has to participate in both.generationpool, the strategies will be represented in numbers proportional to the total number of points won by all its copies in the present generation. The total population will be maintained at a constant level (probably 2,000 strategies, up to rounding errors). Strategies may go extinct. The strategy with the highest number of copies after the 100th generation will be considered a winner.nothave access to results of other matches (including its own previous matches), the number of current generation, population sizes, number of points won by any strategy in any phase of the tournament (except the number of points already won in the current match which can be calculated from the previous turns results) and its opponent's identity (namely it doesn't know whether it plays against a copy of itself). The strategies obviously can't read their opponent's source code.The simulation will probably not be run before at least eight strategies are collected and before the beginning of September.

The competition is closed, no new strategies are accepted at this moment. 21 different strategies were accepted, their implementations are now being tested. Results will be probably posted on Sunday 4th September.

[Edit: Found inconsistency in using words

roundandturnto denote the same thing. Nowturnis used everywhere.]