Sometimes I see new ideas that, without offering any new information, offer a new perspective on old information and a new way of thinking about an old problem. So it is with this lecture and the prisoner's dilemma.

Now, I've worked a lot with the prisoner's dilemma, with superrationality, negotiations, fairness, retaliation, Rawlsian veils of ignorance, etc. I've studied the problem, and its possible resolutions, extensively. But the perspective of that lecture was refreshing and new to me:

The prisoner's dilemma is resolved only when the off-diagonal outcomes of the dilemma are known to be impossible.

The "off-diagonal outcomes" are the "(Defect, Cooperate)" and the "(Cooperate, Defect)" squares where one person walks away with all the benefit and the other has none:

(Baron, Countess)

                Countess C   Countess D
    Baron C       (3,3)        (0,5)
    Baron D       (5,0)        (1,1)


Facing an identical (or near identical) copy of yourself? Then the off-diagonal outcomes are impossible, because you're going to choose the same thing. Facing Tit-for-tat in an iterated prisoner's dilemma? Well, the off-diagonal squares cannot be reached consistently. Is the other prisoner a Mafia don? Then the off-diagonal outcomes don't exist as written: there's a hidden negative term (you being horribly murdered) that isn't taken into account in that matrix. Various agents with open code are essentially publicly declaring the conditions under which they will not reach for the off-diagonal. The point of many contracts and agreements is to make the off-diagonal outcome impossible or expensive.
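As a minimal sketch (using the payoff numbers above, with the "identical copy" case as the assumption), here is what blocking off the off-diagonal does to the choice:

```python
# Standard PD payoffs: (my move, their move) -> my payoff.
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def payoff_vs_copy(my_move):
    # Against an exact copy, their move always equals mine, so the
    # off-diagonal squares (C,D) and (D,C) are unreachable.
    return PAYOFF[(my_move, my_move)]

# With the off-diagonals blocked off, cooperation strictly wins:
print(payoff_vs_copy("C"), payoff_vs_copy("D"))  # 3 1
```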

As I said, nothing fundamentally new, but I find the perspective interesting. To my mind, it suggests that when resolving the prisoner's dilemma with probabilistic outcomes allowed, I should be thinking "blocking off possible outcomes", rather than "reaching agreement".


I had a thought related to this (apologies if this has been brought up on LW before, but I haven't seen it exactly).

Let's say you're playing a game theory game, and you have access to an oracle that either answers correctly or refrains from answering.

Obviously in a lot of games it makes sense to ask what move your opponent will make, so you can then optimise your own move. But in the Prisoner's Dilemma, the opponent's move is useless information - whatever you learn, your optimal move is to defect.

What you want to ask is not "what move will he make?" but "will he make the same move as me?". Learning which row or column you're in doesn't help; learning whether you're on-diagonal or off-diagonal does.

This is approximated by the strategy called Pavlov: "Cooperate on the first turn; Cooperate if our previous-turn moves were the same; Defect if they were different."
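A minimal sketch of that Pavlov rule (the move names and history format are my own):

```python
def pavlov(my_history, their_history):
    """Cooperate on turn one; thereafter cooperate iff last turn's
    moves matched (i.e. the previous outcome was on the diagonal)."""
    if not my_history:
        return "C"
    return "C" if my_history[-1] == their_history[-1] else "D"

print(pavlov([], []))        # "C": first turn
print(pavlov(["C"], ["D"]))  # "D": moves differed
print(pavlov(["D"], ["D"]))  # "C": both defected, shift back
```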

This is a really nice way of looking at it.

They don't have to be known to be impossible, just unlikely. If you're facing someone similar to yourself, it might be that choosing to defect makes it more likely that they defect, enough so to cancel out any gain you'd have, but you still don't know that they'll defect.

Came here to say that, see it's been said. If your actions don't approach the choice you would make given impossibility as the probability of something approaches (but does not reach) zero, then you must either be assigning infinite utility to something or not be maximizing expected utility.

When you say that choosing to defect might make it more likely that they defect, do you mean that choosing to defect may cause the probability that the other person will defect to go up, or do you mean that the probability of the other player defecting, given that you defected, may be greater than the probability given that you cooperated?

To quote Douglas Adams, "The impossible often has a kind of integrity to it which the merely improbable lacks." If it is impossible to have off-diagonal results, that is a much stronger argument for cooperating than having it be improbable, even if the probability of an on-diagonal result is 99.99%; as long as the possibility exists, one should take it into consideration.

If it is impossible to have off-diagonal results, that is a much stronger argument for cooperating than having it be improbable

If the probability is epsilon, then having the probability be zero is only an epsilon stronger argument. If you doubt this let epsilon equal 1/googolplex.
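A quick numeric check of that claim, assuming the usual payoff numbers: shrinking the off-diagonal probability from epsilon to zero changes the expected payoff of cooperating by only an epsilon-sized amount.

```python
def expected_coop(p_same):
    # E[payoff | cooperate] = 3 * P(same move) + 0 * P(different move)
    return 3 * p_same

eps = 1e-12  # a stand-in for "improbable but possible"
gap = expected_coop(1.0) - expected_coop(1.0 - eps)
print(gap)  # on the order of 3e-12: an epsilon-sized difference
```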

I mean the second one. Also, if I said the first one, I would mean the second one. They're the same by the definitions I use. The second one is more clear.

If it is impossible to have off-diagonal results, that is a much stronger argument for cooperating than having it be improbable, even if the probability of an on-diagonal result is 99.99%; as long as the possibility exists, one should take it into consideration.

If the probability of an on-diagonal result is sufficiently high, and the benefit of an off-diagonal one is sufficiently low, that is all that's necessary for it to be worth while to cooperate.

Yes. I model "unlikely" as "I likely live in a universe where these outcomes are impossible", but that's just an unimportant difference in perspective.

I likely live in a universe where these outcomes are impossible

What do you mean by "impossible"? If you mean highly unlikely, then you're using recursive probability, which doesn't make a whole lot of sense. If you mean against the laws of physics, then it's false. If you mean that it won't happen, then it's just a longer way of saying that those outcomes are unlikely.

it's just a longer way of saying that those outcomes are unlikely.


What if you are playing with someone and their decision on the current round does not affect your decision in the current round?

If you are known to cooperate, then your opponent (who is defined as 'similar to yourself') knows he is choosing between 3 points and 5 points. Being like you, he chooses 3 points.

If you are playing against someone whose decision you determine (or influence), then you choose the square; if the nature of your control prevents you from choosing 5 or 0 points (or makes those very unlikely) but allows you to choose 3 or 1 (or makes one of those very likely), choose 3. However, there is only one player in that game.

I don't care which way the causal chain points. All I care about is if the decisions correlate.

Also, I'm not sure of most of what you're saying.

Given the choice between 0 points and 1 point, you would prefer 1 point; given the choice between 3 points and 5 points, you would prefer 3 points. (Consider the case where you are playing a cooperatebot; the choice which correlates is cooperation; against a defectbot, the choice which correlates is defection. There are no other strategies in single PD without the ability to communicate beforehand.)

Why would you prefer three points to five points? Aren't points just a way of specifying utility? Five points is better than three points by definition.

Right- which means defectbot is the optimal strategy. However, when playing against someone who is defined to be using the same strategy as you, you get more points by using the other strategy.

It should not be the case that two players independently using the optimal option would score more if the optimal option were different.

If the off-diagonals are impossible, then it's not the Prisoners' Dilemma, it's just Cake or Death. If you're facing an identical copy of yourself, then it's really the Newcomb Paradox. Open code (or, at least, unilateral open code) only works in the Ultimatum Game. One study found that knowing how the other person chose actually increases defection; the probability that Player Two defects given that Player One defects > probability given Player One cooperates > probability given Player One's action is unknown. Furthermore, open code only makes sense if you have a way of committing to following the code, and part of the Prisoners' Dilemma is that there is no way for either player to commit to a course of action. And another part of the whole concept of the one-off Prisoners' dilemma is that there is no way to retaliate. If the players can sue each other for reneging on agreements, then it's basically an iterated Prisoners' Dilemma (and if there's a defined endpoint, then you have the Backwards Induction Paradox).

If the off-diagonals are impossible, then it's not the Prisoners' Dilemma, it's just Cake or Death.

Yes; the goal is to replace the PD with Cake or Death, in which case Cake is chosen. Various strategies are available that do this in various situations, and all involve modifying the game or incentive structure.

Open code (or, at least, unilateral open code)

The easiest way to turn unilateral open code into bilateral open code is for your code to be "defect against people who have not given you their decision-making source code, and cooperate with those who post open code that cooperates with this program".
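A toy sketch of that rule, with source-string equality standing in for real program analysis (the names are mine, and actually verifying "code that cooperates with this program" is much harder than an equality check):

```python
def open_code_bot(my_source, opponent_source):
    # Defect against anyone who shared no source; cooperate only with
    # opponents whose published code matches ours exactly (a crude
    # stand-in for "open code that cooperates with this program").
    if opponent_source is None:
        return "D"
    return "C" if opponent_source == my_source else "D"

SRC = "open_code_bot"  # placeholder for the program's own source text
print(open_code_bot(SRC, None))   # "D": no source shared
print(open_code_bot(SRC, SRC))    # "C": matching open code
print(open_code_bot(SRC, "spy"))  # "D": non-matching code
```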

Why not put some figures on the 'identicality' of the players and see what comes out?

A simple way is to consider the probability P that both players will play the same move. That's a simple measure of how similar the two players are.

Remember I am not stating that there is any causal dependency between players (it's forbidden by the rules):

  • A and B could be twins raised in a close-knit family

  • A and B could be one person asked to play against several unknown opponents, not knowing she is playing against herself (experimental psychologists can be quite perverse)

  • A and B could be two instances of one computer program

  • A and B could even be not-so-similar persons who merely play alike two times out of three. That's already enough correlation.

  • A and B could be imagined to be so different as to always play the opposite move from one another, given the same initial conditions (but I guess in this case I can't imagine how they could both be rational)...

  • etc.

That gives us an inequality in this parameter, with the result depending on the values inside the PD matrix.


Player A move is x, move can be: x=C (cooperation) or x=D (defection)

Player B move is y, move can be: y=C (cooperation) or y=D (defection)

P(E) denotes probability of event E

G(E) denotes the expected (probabilistic) payoff if event E occurs.

We also assume a stable definition of rationality. That means something like what physicists call gauge invariance: you should be able to exchange the roles of x and y without changing the equations. Gauge invariance gives us some basic properties:

We can assume P(y=C) = P(x=C) = P(C) ; P(y=D) = P(x=D) = P(D).

It follows:

P(y=C | x=C) = P(y=D | x=D) = P(x=y)

P(y=D | x=C) = P(y=C | x=D) = P(x!=y) = 1 - P(x=y)

Now, keeping in mind these properties, let's find the expected payoff for playing C, Gx(C), and for playing D, Gx(D).

Gx(C) = Gx(x=C and y=C) P(y=C | x=C) + Gx(x=C and y=D) P(y=D | x=C)

Gx(D) = Gx(x=D and y=D) P(y=D | x=D) + Gx(x=D and y=C) P(y=C | x=D)

Gx(C) = Gx(x=C and y=C) P(x=y) + Gx(x=C and y=D) P(x!=y)

Gx(D) = Gx(x=D and y=D) P(x=y) + Gx(x=D and y=C) P(x!=y)

Gx(C) = (Gx(x=C and y=C) - Gx(x=C and y=D)) * P(x=y) + Gx(x=C and y=D)

Gx(D) = (Gx(x=D and y=D) - Gx(x=D and y=C)) * P(x=y) + Gx(x=D and y=C)

The rational choice for x will be C if Gx(C) > Gx(D);

conversely, the rational choice for x will be D if Gx(C) < Gx(D);

if Gx(C) = Gx(D), there is no obvious reason to choose one behavior or the other (random choice?).

The above expressions are quite simple to understand if we treat P(x=y) as a variable: each expected payoff is linear in P(x=y), so we get the equations of two lines. Whichever line lies above the other at a given P(x=y) gives the rational move.

The mirror argument matches the case where P(x=y) = 1,

Then we have Gx(C) = Gx(x=C and y=C), Gx(D) = Gx(x=D and y=D),

with usual parameters where Gx(x=C and y=C) > Gx(x=D and y=D),

C is rational for identical players.

The most interesting point is where the two lines meet.

At that point Gx(C) = Gx(D)

It yields :

P(x=y) = (Gx(x=D and y=C) - Gx(x=C and y=D))/(Gx(x=C and y=C) - Gx(x=C and y=D) - Gx(x=D and y=D) + Gx(x=D and y=C))

The PD payoff constraints are such that this is always a positive value.

With the usual values we get :

P(x=y) = (5-0)/(3-0+5-1) = 5/7 = 0.71

It simply means that if the probability of identical behavior is above 71%, it is rational to cooperate in a non-iterated Prisoner's Dilemma.

My point is really that if both players are told that the other one is a (mostly) rational being, that is enough for me to believe the other will behave the same as me with a probability above 71%.

You should understand that a 50% probability of identical behavior is what you get when the other player is random. As I understand it, defectors are just evaluating the probability of identical behavior as somewhere between 50% and 71%. That is a bit too random for my taste.
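The whole derivation above can be checked in a few lines, using the usual payoff numbers (CC=3, CD=0, DC=5, DD=1):

```python
def g_coop(p):    # Gx(C) as a function of p = P(x=y)
    return 3 * p + 0 * (1 - p)

def g_defect(p):  # Gx(D) as a function of p = P(x=y)
    return 1 * p + 5 * (1 - p)

threshold = (5 - 0) / (3 - 0 - 1 + 5)  # = 5/7, about 0.714
print(abs(g_coop(threshold) - g_defect(threshold)) < 1e-9)  # lines cross
print(g_coop(0.5) < g_defect(0.5))   # independent play: defect wins
print(g_coop(0.9) > g_defect(0.9))   # strong correlation: cooperate wins
```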

What I also find interesting is that my small figures match results I remember seeing in real-life experiments (3 in 5 cooperating, 2 in 5 defecting). [I remember a paper about "quasi-magical reasoning" from the 90s, but I lost the pointer to it.] It no longer implies that some of these people are rational and others are misled, just a divergence in their raw evaluation of the probability that other human players will do the same as them.

As an afterword, I should also say something about the Dominance Argument, because this argument is the basis of the current belief of most academics that 'D = rational'.

It goes like this:

What should A play?

If B chooses C, A should choose D, because Gx(x=D and y=C) > Gx(x=C and y=C).

If B chooses D, A should also choose D, because Gx(x=D and y=D) > Gx(x=C and y=D).

Hence A should choose D whatever B plays. Right?


The above is only true if x and y are independent variables. Basically, that is what you get when P(x=y) = 50%.

The equations are above, easy to check.

Mathematically, saying that x and y are independent variables means the behavior of y is random relative to x.

This is a much stronger property than merely stating there is no causal relationship between x and y. And not exactly a realistic one...

That is like stating that because two traders do not communicate or agree with each other, they won't choose to buy or sell the same stocks on the marketplace. Or that phone operators' pricing won't converge if the operators do not communicate with each other before publishing new package prices. I'm not pretending they will perfectly agree, or that convergence cannot be improved through communication, just that the same cause/environment/education usually gives the same effects, and some correlation is to be expected. True random independence between variables only exists in the mathematical world.


Suppose that you were playing Omega in single PD; Omega is presumed to be able to accurately predict your moves in advance, as in Newcomb's problem.

Omega's strategy will maximize its score; your goal is to maximize your score.

What would Omega do? Should its strategy be dependent on yours?

Your strategy should be to defect unless you have good enough evidence that cooperating will cause Omega to also cooperate.

Omega will predict this, and will give you good enough evidence. Whether or not this actually leads to Omega cooperating depends on the strength of evidence given. Money placed in escrow would be enough.

Of course, if you can't receive any communication from Omega before placing your decision, Omega is going to defect on you (since Omega's decisions can't affect what you do). This is still assuming it's the only prisoner's dilemma Omega plays in your light cone, however.


Unless otherwise stated, the game-theory game of Prisoner's Dilemma takes place as the only event in the hypothetical universe; in this example, prior communication and credible precommitment are not permitted.

Instead of generating a strategy to use against a copy of yourself, consider what the best strategy would be against the optimized player who knows what your strategy is.

TDT permits acting as though one had precommitted, the result being that one never wishes one's opportunities to precommit were different. Consider a perfectly reasoning person and a perfectly reasoning Omega with the added ability to know what the person's move will be before making its own move (and not mind-reading; only knowing the move).

If the human knows TDT is optimal, then he knows Omega will use it; if the human knows that TDT would make the true but non-credible precommittment, then the human knows that Omega has chosen the optimal precommitment.

If the ideal silent precommitment strategy is the diagonal, then we get C-C as the result. If any other precommitment is ideal, then Omega would do no worse than 3 points using it against a perfectly rational non-precog.

If the human is a cooperate-bot, then the ideal strategy is to defect. Therefore committing to the diagonal is suboptimal, because it results in 3 instead of 5 points in that one case. However, the human here is either going to cooperate or defect without regard to Omega's actual strategy (only the ideal strategy), meaning that the human is choosing between 0 and 1 if the ideal strategy is defectbot.

Either there's a potential precommit that I haven't considered, TDT is not optimal in this context, or I've missed something else. Evidence that I've missed something would be really nice.

Either you haven't read this, and are not talking about TDT as I know it, or I don't understand you at all.

... My line of thought is unchanged if Omega simply learns your decision after you decide but before Omega decides. The game is now not symmetrical.

Currently, I have concluded that if it is best to cooperate if the other player has cooperated, it is best for the first player to cooperate against a rational opponent. (3 instead of 1). However, it is better to cooperate with >1/3 chance, and that still provides a higher expected result to the first player.

If, given the choice between 3 points and 5 points, 5 points is better, then it is best for the first player to defect (1 instead of 0).

In the end, the first player has 2 possible strategies, and the second player has 4 possible strategies, for a total of 8 possibilities:

         Player 1 \ Player 2:  {c:C;d:C}   {c:C;d:D}   {c:D;d:C}   {c:D;d:D}
         c                     c:C 3/3     c:C 3/3     c:D 0/5     c:D 0/5
         d                     d:C 5/0     d:D 1/1     d:C 5/0     d:D 1/1

My problem is that if quid pro quo {c:C;d:D} is the optimum strategy, two optimal players end up cooperating. But quid pro quo is a strictly worse strategy than defectbot {c:D;d:D}. However, if defectbot is the best strategy for player 2, then the best strategy for player 1 is to defect; if quid pro quo is the best strategy for player 2, then the best strategy for player 1 is to cooperate.

I have trouble understanding how the optimal strategy can be strictly worse than a competing strategy.

IFF quid pro quo is optimal, then optimal players score 3 points each.
However, iff quid pro quo is the optimal strategy, then defectbot scores more against an optimal player 1; the optimal player 1 strategy is to defect, and optimal players score 1 point each.
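The eight cases in the table above can be enumerated directly (a sketch using the OP's payoff numbers; the policy names follow the comment's usage):

```python
# (player 1 move, player 2 move) -> player 1's payoff; the game is
# symmetric, so player 2's payoff is the same lookup with moves swapped.
P1 = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# Player 2 sees player 1's move and answers with a fixed policy.
policies = {
    "cooperatebot": {"C": "C", "D": "C"},
    "quid pro quo": {"C": "C", "D": "D"},
    "reverse qpq":  {"C": "D", "D": "C"},
    "defectbot":    {"C": "D", "D": "D"},
}

for name, pol in policies.items():
    for m1 in ("C", "D"):
        m2 = pol[m1]
        s1, s2 = P1[(m1, m2)], P1[(m2, m1)]
        print(f"{name:13} p1={m1} p2={m2}  {s1}/{s2}")
```

Note that the enumeration reproduces the circularity described above: player 2's defectbot weakly dominates quid pro quo in every cell, yet player 1's best response to quid pro quo is C (3 points) while the best response to defectbot is D (1 point).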

Please stop using the words "rational" and "optimal", and give me some sign that you've read the linked post on counterfactuals rather than asking counterfactual questions whose assumptions you refuse to spell out.

The only difficult question here concerns the imbalance in knowledge between Omega and a human, per comment by shminux. Because of this, I don't actually know what TDT does here (much less 'rationality').

Assumptions: The game uses the payout matrix described OP, and the second player learns of the first player's move before making his move. Both players know that both players are trying to win and will not use a strategy which does not result in them winning.

My conclusion is that both players defect. My problem is that it would be better for player 2 if player 2 did not have the option to defect if player 1 cooperated.

I've thrown out cooperatebot and reverse quid pro quo as candidates for best strategy.

FYI: I'm using this as my reference, and this hinges on reflexive inconsistency. I can't find a reflexively consistent strategy even with only two options available. (Note that defectbot consistently equals or outperforms quid pro quo in all cases)

Again, you don't sound like you've read this post here. Let's say that, in fact, "it would be better for player 2 if player 2 did not have the option to defect if player 1 cooperated" - though I'm not at all sure of that, when player 2 is Omega - and let's say Omega uses TDT. Then it will ask counterfactual questions about what "would" happen if Omega's own abstract decision procedure gave various answers. Because of the nature of the counterfactuals, these will screen off any actions by player 1 that depend on said answers, even 'known' actions.

You're postulating away the hard part, namely the question of whether the human player's actions depend on Omega's real thought processes or if Omega can just fool us!


Which strategy is best does not depend on what any given agent decides the ideal strategy is.

I'm assuming only that both the human player and Omega are capable of considering a total of six strategies for a simple payoff matrix and determining which ones are best. In particular, I'm calling Löb'shit on the line of thought "If I can prove that it is best to cooperate, other actors will concur that it is best to cooperate" when used as part of the proof that cooperation is best.

I'm using TDT instead of CDT because I wish to refuse to allow precommitment to become necessary or beneficial, and CDT has trouble explaining why to one-box if the boxes are transparent.

Not sure how this is relevant to the OP, but clearly Omega would defect while making you cooperate, e.g. by convincing you that he is your PD-clone.

Superior (or even infinite) computing power does not imply he can make you be persuaded to cooperate, only that he knows whether you will. If there exist any words he could say to convince you to cooperate, he will say them and defect. However, if you cooperate only upon seeing omega cooperate (or prove that he will), he will cooperate.


The relevance is that the rational strategy is to defect. Always, unless your choice has a sufficiently high probability of changing your opponent's decision. If possible, maximize the chance that your opponent will cooperate.

The relevance is that the rational strategy is to defect.

... unless your opponent also defects, and only in one-shot PD. Or maybe we are using different definitions of "rational", given how two players who both "rationally" defect both lose.

Given that one's opponent defects, you get more points for defecting.