The bottom line is that adding “contrition” to TFT makes it considerably better: it keeps pace with Pavlov in exploiting TFTs, while doing better than Pavlov at exploiting Defectors.

This is no longer true if we add noise in the perception of good or bad standing; contrite strategies, like TFT, can get stuck defecting against each other if they erroneously perceive bad standing.

So cTFT moves TFT's weakness to noise somewhere else. Where can we find real robustness?

From page 2 of the paper:

"cTFT is not the only evolutionarily stable rule which is Pareto-optimal (and hence yields the maximal pay-off if the whole population adopts it)."

We discuss cTFT, PAVLOV and REMORSE with analytical methods and numerical simulations, embedding them in a large class of stochastic strategies. Finally, we show that by replacing the conventions concerning the “standing” by another set (which is even easier to implement, and only depends on an “internal variable”) one is led to a PRUDENT PAVLOV strategy which is an ESS and immune against errors both in implementing and in perceiving moves.

That sounds very useful for a population to have.

The general problem, if you're fond of strategies that "have short memories" but track similar statistics instead:

Page 11, on being careful about bias:

In principle, one could apply other rules of “standing”. To start with, we should replace this term by a more neutral one, in order not to get trapped by its connotations, and think only of an arbitrary “tagging” of the states without specifying which is “good” or “bad”. A strategy is now specified by the probability to cooperate and/or change the standing in the next round, depending on the current state (including the current standing) of both opponents. It is plausible that we can obtain some evolutionarily stable strategies for many such codes.
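To make the “tagging” idea concrete, here is a minimal sketch of how such a strategy could be written down as a lookup table. The encoding is an assumption for illustration, not the paper's parameterization: the names `STATES`, `make_strategy` and `step` are hypothetical, and the next move and next tag are sampled independently for simplicity.

```python
import itertools
import random

MOVES = ("C", "D")
TAGS = (0, 1)  # neutral labels; neither is "good" or "bad" a priori

# Assumed state: (my last move, their last move, my tag, their tag).
STATES = list(itertools.product(MOVES, MOVES, TAGS, TAGS))

def make_strategy(rule):
    """Build a table: state -> (P(cooperate), P(my next tag is 1))."""
    return {state: rule(*state) for state in STATES}

def step(strategy, state, rng):
    """Sample my next move and my next tag from the table."""
    p_coop, p_tag1 = strategy[state]
    move = "C" if rng.random() < p_coop else "D"
    tag = 1 if rng.random() < p_tag1 else 0
    return move, tag

# Example: plain TFT ignores the tags and copies the opponent's last move.
tft = make_strategy(
    lambda mine, theirs, my_tag, their_tag: (1.0 if theirs == "C" else 0.0, 0.0)
)
```

With two moves and two tags per player this gives 16 states, and each state carries two probabilities, which hints at how quickly the space of possible codes grows.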

Page 12, after pPavlov's implementation is explained:

It seems highly plausible that there exists a wide variety of workable “taggings” which yield interesting ESS’s. The question is whether an evolution based on mutation and selection would tend to lead to one form of “tagging” rather than another. This could ultimately shed light on why humans developed a sense of fairness, feelings of guilt, and highly effective social norms [see also Sugden (1986) and Young (1993) on the evolution of conventions]. The sheer combinatorial complexity of encompassing all conceivable codes, or taggings, is enormous, and the costs (in fitness) for reckoning with these “tags” seem difficult to evaluate. But it is a tempting problem.

ESRogs (7y)

Remorse cooperates only if it is in bad standing, or if both players cooperated in the previous round. In other words, Remorse is more aggressive; unlike cTFT, it can attack cooperators.

Against the strategy “always cooperate”, cTFT always cooperates but Remorse alternates cooperating and defecting:

C/C -> C/D -> C/C -> C/D …

Shouldn't the second one of these be C/C, since one player is "always cooperate" and the other player cooperates "if both players cooperated in the previous round"?

selylindi (7y)

Yes. Page 287 of the paper affirms your interpretation: "REMORSE does not exploit suckers, i.e. AllC players, whereas PAVLOV does."

The OP has a mistake:

"Remorse is more aggressive; unlike cTFT, it can attack cooperators"

Neither Remorse nor cTFT will attack cooperators.
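A quick way to check this is to simulate both rules against “always cooperate” with no errors. The sketch below is only an illustration under stated assumptions: the standing rule (cooperating restores good standing; defecting keeps good standing only when provoked) and the cTFT rule are my reading of the paper rather than something quoted in this thread, and the function names are hypothetical.

```python
def update_standing(my_move, my_good, their_good):
    """Assumed standing rule: cooperating gives good standing; defecting keeps
    good standing only if I was in good standing and the opponent was not."""
    if my_move == "C":
        return True
    return my_good and not their_good

def ctft(my_good, their_good, last):
    # Assumed cTFT rule: cooperate unless I am good and the opponent is bad.
    return "D" if (my_good and not their_good) else "C"

def remorse(my_good, their_good, last):
    # Cooperate only if in bad standing, or if both cooperated last round.
    return "C" if (not my_good or last == ("C", "C")) else "D"

def all_c(my_good, their_good, last):
    return "C"

def play(strat_a, strat_b, rounds=6):
    """Return the move history as 'A/B' strings, with no noise."""
    a_good = b_good = True
    last = ("C", "C")  # assumption: the game starts as if after mutual cooperation
    history = []
    for _ in range(rounds):
        a = strat_a(a_good, b_good, last)
        b = strat_b(b_good, a_good, (last[1], last[0]))
        # Both standings are updated from the standings held before this round.
        a_good, b_good = (update_standing(a, a_good, b_good),
                          update_standing(b, b_good, a_good))
        last = (a, b)
        history.append(f"{a}/{b}")
    return history

print(" -> ".join(play(all_c, remorse)))
print(" -> ".join(play(all_c, ctft)))
```

Under these assumptions both runs stay at C/C in every round, since Remorse's "both cooperated in the previous round" condition keeps holding against AllC.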

Zack_M_Davis (6y), Nomination for 2018 Review

Standards! We should have them! We should repent when we fail to live up to them!

(Looks like there may be a technical error to fix, though?)

TheWakalix (7y)

I don't quite understand the conclusion, so this question might be wrong, but - is a line really necessary? Do we need a discrete "acceptable/unacceptable" judgment assigned to each action, or is it the universal agreement that's most active in causing the effect you're talking about?

Contrite Strategies and The Need For Standards
