"Extortionate" strategy beats tit-for-tat in iterated Prisoner's Dilemma

26th Jun 2012

Less Wrong had a Prisoner's Dilemma contest some time back, whose results I've forgotten. Perhaps it should be rerun with William H. Press & Freeman Dyson's proposed extortionate strategies.

I hope Pinker gives a response at Edge.org, since the Prisoner's Dilemma played a significant role in his book "The Better Angels of Our Nature" as a source of morality embedded in the nature of logic/reality.

Hat-tip to Marginal Revolution.

[-]Grognor14y190

The linked article is a complete waste of time as the authors don't bother to explain what the extortionate strategy is, only insist that it turns the game into an ultimatum. And the title must be a lie, since halfway through, it explicitly says TFT gets the same score as its opponent. (In other words, it doesn't get "beat" by anything.) So the parts of the article that are true are useless, the parts that are supposedly interesting are asserted, unexplained, and the title is certainly false. Downvoted.

[-]Kindly14y180

There was a previous post about this topic that actually linked to the paper, which I think you'll be happier with.

In particular, what the extortionate strategy does is the following: if player 2 accepts that player 1 will play the extortionate strategy, and there's nothing to be done about that, then there is a linear relation between their scores, and he can only maximize his score by giving an even higher score to player 1. In particular, if player 2 plays TFT (which is also an extortionate strategy, in a degenerate sense, with extortion factor 1) then the two players eventually end up in the (Defect, Defect) state, and get 0 points per turn, which satisfies both relations.
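For reference, the linear relation mentioned above can be written out. This is a sketch in what I take to be the paper's notation, so treat the symbols as assumptions: s_X and s_Y are the long-run average scores of the extortioner (player 1) and of player 2, P is the mutual-defection payoff, and χ ≥ 1 is the extortion factor. The "0 points per turn" figure then corresponds to measuring scores relative to P.

```latex
% Relation enforced by an extortionate strategy with factor \chi:
s_X - P = \chi\,(s_Y - P), \qquad \chi \ge 1 .

% TFT enforces equal scores (\chi = 1). If X extorts with \chi > 1 and Y plays TFT,
% both relations must hold at once:
s_X - P = \chi\,(s_Y - P) \quad\text{and}\quad s_X = s_Y
\;\Longrightarrow\; (\chi - 1)\,(s_Y - P) = 0
\;\Longrightarrow\; s_X = s_Y = P .
```

So the only outcome consistent with both relations is the mutual-defection payoff for both players, and the larger χ is, the more of any surplus above P goes to player 1 against other opponents.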

[-]Peter Wildeford14y10

How does this actually get implemented in code?

[-]Kindly14y20

All of the "ZD strategies" are described by 4-tuples of probabilities: the probabilities of cooperation given the outcome of the previous turn, which can be one of (CC, CD, DC, DD). In comments to the previous post I calculated two examples, and the paper contains the general formulas in equations [8] and [12].
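To make the 4-tuple description concrete, here is a minimal Python sketch of how such a strategy could be played. Everything named here (`next_move`, `play`, the constants) is hypothetical and not taken from the paper or the earlier post. The 4-tuple says nothing about the first round, so the sketch simply assumes both players open with cooperation. `EXTORT_3` is, as far as I can tell, the extortion-factor-3 example for payoffs (T, R, P, S) = (5, 3, 1, 0); treat that value as an assumption.

```python
import random

# Previous-round outcomes from the acting player's point of view,
# written as (my move, opponent's move), in the order used above.
OUTCOMES = ("CC", "CD", "DC", "DD")

def next_move(strategy, my_last, their_last, rng):
    """Cooperate with the probability the 4-tuple assigns to the last outcome."""
    p_cooperate = strategy[OUTCOMES.index(my_last + their_last)]
    return "C" if rng.random() < p_cooperate else "D"

def play(strategy_x, strategy_y, rounds, rng, first_round=("C", "C")):
    """Simulate `rounds` rounds and return the list of (x_move, y_move) pairs.

    The opening round is an assumption of this sketch, not part of the 4-tuple.
    """
    history = [first_round]
    for _ in range(rounds - 1):
        x_last, y_last = history[-1]
        x_next = next_move(strategy_x, x_last, y_last, rng)
        # Y looks at the same round from its own side, so the moves are swapped.
        y_next = next_move(strategy_y, y_last, x_last, rng)
        history.append((x_next, y_next))
    return history

TIT_FOR_TAT = (1.0, 0.0, 1.0, 0.0)       # copy the opponent's last move
ALWAYS_DEFECT = (0.0, 0.0, 0.0, 0.0)
EXTORT_3 = (11 / 13, 1 / 2, 7 / 26, 0.0)  # assumed example, extortion factor 3

if __name__ == "__main__":
    rng = random.Random(0)
    print(play(TIT_FOR_TAT, ALWAYS_DEFECT, 5, rng))
    print(play(EXTORT_3, TIT_FOR_TAT, 20, rng))
```

Against strict tit-for-tat the extortioner's occasional defections get echoed back, and because its fourth probability is 0 the pair cannot leave (Defect, Defect) once they land there.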

[-]shokwave14y10

Ah, thank you. Made that much clearer for me; I had the slightly incorrect impression that a ZD strategy was any strategy that could be described by such a 4-tuple, but I didn't make the connection that the evolution could apply directly to the probabilities instead of the strategy that generated the probabilities.

[-]shokwave14y170

Perhaps it should be rerun

It is, in fact. I'm finalising the code and the announcement post. Expect it in a couple of days.

William H. Press & Freeman Dyson's proposed extortionate strategies.

As I understand them (which is admittedly not much; the paper is surprisingly dense and metaphorical), they can take advantage of 'evolutionary' strategies but not strategies with a 'theory of mind', and it appears that tit-for-tat falls under the 'theory of mind' category. Although I would state once more that I'm not sure I understand the paper.

[-]Andreas_Giger13y10

How's your progress? You said you were "finalising the code and the announcement post", so I'm wondering whether you ran into any unexpected problems since then?

[-]shokwave13y10

Unexpected, yes; problems, no. I'm currently re-implementing the entire thing in Clojure, as it gives me a really elegant, simple, "simulate my opponent" function (which is exciting!) and makes everything else much neater to boot. I've also gotten a friend interested in this project; he's probably going to help me build a semi-permanent results / interaction page for the project - something like this and this put together. On that second page, about shuffle algorithms, you can enter your own custom shuffling method. We are excited about having something like that for bots - visualise the data, and enter your own custom strategy to see how it performs. This lets us have a sort of long-running 'informal' tournament. But it is still on its way!

[-]Username13y10

Fantastic! I'm very much interested in trying my hand at developing a strategy. Thank you for putting this work together; I'm sure you will get a lot of positive feedback once you finish and release it.

[-]Vaniver14y00

Press and Dyson's setup had two areas where 'strategies' come into play.

The first area is the set of four probabilities you provide to the game, which determine your score when combined with the other player's set of four probabilities. Tit-for-tat is one particular choice of four probabilities (and, based on the nature of the game, should actually be represented as "slightly forgiving tit-for-tat", which cooperates with probability epsilon in the defect-defect or defect-cooperate case, so that when playing against itself all states will terminate with cooperate-cooperate).
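The way two sets of four probabilities determine the scores can be sketched numerically. The Python below builds the 4-state Markov chain over (CC, CD, DC, DD) and iterates it to its long-run distribution. This is plain power iteration, swapped in for the paper's closed-form approach, and it starts from a uniform distribution over states. The payoffs (T, R, P, S) = (5, 3, 1, 0), the value epsilon = 0.05, the reading of "slightly forgiving" as forgiving whenever the opponent defected, and all function names are assumptions of this sketch.

```python
# States are (X's move, Y's move) in the order CC, CD, DC, DD.
PAYOFF_X = (3.0, 0.0, 5.0, 1.0)  # assumed payoffs: R, S, T, P for X
PAYOFF_Y = (3.0, 5.0, 0.0, 1.0)  # same states seen from Y's side
Y_VIEW = (0, 2, 1, 3)            # Y sees state CD as "DC" and vice versa

def transition_matrix(p, q):
    """Row i holds the probabilities of moving from state i to CC, CD, DC, DD."""
    rows = []
    for state in range(4):
        px = p[state]          # chance that X cooperates next
        qy = q[Y_VIEW[state]]  # chance that Y cooperates next
        rows.append((px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)))
    return rows

def long_run_scores(p, q, steps=5000):
    """Approximate long-run average scores (s_X, s_Y) by power iteration."""
    m = transition_matrix(p, q)
    v = [0.25, 0.25, 0.25, 0.25]  # uniform start, an assumption of this sketch
    for _ in range(steps):
        v = [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]
    s_x = sum(prob * pay for prob, pay in zip(v, PAYOFF_X))
    s_y = sum(prob * pay for prob, pay in zip(v, PAYOFF_Y))
    return s_x, s_y

EPSILON = 0.05
FORGIVING_TFT = (1.0, EPSILON, 1.0, EPSILON)  # forgive a defection with prob. epsilon
EXTORT_3 = (11 / 13, 1 / 2, 7 / 26, 0.0)      # assumed example, extortion factor 3
ALWAYS_COOPERATE = (1.0, 1.0, 1.0, 1.0)

if __name__ == "__main__":
    print(long_run_scores(FORGIVING_TFT, FORGIVING_TFT))
    print(long_run_scores(EXTORT_3, ALWAYS_COOPERATE))
```

With strict tit-for-tat on both sides the chain has several absorbing outcomes and the result depends on the starting state, which is why the epsilon matters: with it, self-play settles on cooperate-cooperate. Against an unconditional cooperator, `EXTORT_3` should come out at about 41/11 versus 21/11 under these payoffs, which fits the factor-3 relation measured from the mutual-defection payoff of 1.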

The second area is how the players modify those sets over time. Here, 'theory of mind' is relevant: players both with and without a theory of mind can play any particular 4-probability set, like tit-for-tat. Players with a theory of mind think (at least) two steps ahead: when I change my probabilities, how will my opponent change their probabilities? Players without a theory of mind think only one step ahead: given my opponent's probabilities, which play maximizes my score?

[-]Oscar_Cunningham14y130

Had it.

[-]Kindly14y50

I am skeptical about the more complex meta-strategies discussed in the interview. If you play a ZD strategy, but switch to a different ZD strategy if after 100 moves you're not doing as well as you should be, then you're not playing a ZD strategy: you're playing some different strategy that has a 100-move memory. The extortion arguments only go through if you set a strategy at the beginning of time and never touch it again, in which case there is no place for bargaining.
