Decision Theory Paradox: PD with Three Implies Chaos?

[-]JGWeissman14y290

If each TDT agent cares about its number of descendants, then they will instrumentally care about having a greater proportion of TDT agents to mutually cooperate with, and will therefore defect when one of the three players is a CDT agent, like the clique bot.

They only behave as you describe if they consider only the immediate, but not long term, consequences.

[-]Will_Sawin14y150

Seems correct to me. Judging agents with a different utility function than they use to decide will lead to them looking stupid.

[-]orthonormal14y20

I mentioned this to JGWeissman privately at the time, but I want to confirm now that this was exactly the solution I intended.

[-]Nominull14y140

The TDT agents are maximizing their number of descendants, you have no right to criticize them for failing to maximize their share of the population. The whole point of the prisoner's dilemma is that it's not a zero sum game, but if you count the population of Defectbots against the TDT agents, you are treating it as a zero sum game.

[-]benelliott14y60

And yet if they all switched to being cliquebots they would not only drive out the defectbots, but would have far more children in the long run.

[-][anonymous]14y190

They would not have far more children in the long run. Their descendants would have more children, but their utility function doesn't care about that.

Edit: and if they do start caring about grandchildren the problem stops being a straightforward Prisoners' Dilemma.

[-]Will_Sawin14y20

So can we show that TDT would play perfectly in such a scenario? I think yes.

Your decisions interact in complex ways with your peers' decisions, your descendants' decisions, etc. There might be a nice simple formula that classifies this, but:

These are TDT agents. So they all behave the same. At least, the ones during the same timestep behave the same. The ones during different time steps are in different situations, but have the same values. Since they (overall) have the same values as past generations, they will together implement a coherent strategy.

[-]wedrifid14y40

And yet if they all switched to being cliquebots they would not only drive out the defectbots, but would have far more children in the long run.

And if they all switched to being paperclip maximisers they would make more paperclips. Neither of these are going to help them maximise their actual utility function.

[-]benelliott14y50

Fair point. I failed to spot initially that the trick lay in equivocating between "maximise direct descendants" and "maximise all descendants".

[-]jimrandomh14y120

There is an easy proof that no strategy can win against every field of opponents: just make two bots, CliqueBotA and CliqueBotB, which each cooperate with copies of themselves and defect otherwise. No strategy can possibly win against both. Environments containing only one or the other are evolutionarily stable equilibria.
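A minimal sketch of the two-clique argument, assuming that "is a copy of myself" can be stood in for by comparing a tag (the names `make_clique_bot` and `what_newcomer_faces` are made up for illustration, and real source-code inspection is not modelled):

```python
COOPERATE, DEFECT = "C", "D"

def make_clique_bot(tag):
    """Return a bot that cooperates only if every other player carries its own tag."""
    def bot(other_tags):
        return COOPERATE if all(t == tag for t in other_tags) else DEFECT
    bot.tag = tag
    return bot

clique_a = make_clique_bot("A")
clique_b = make_clique_bot("B")

def what_newcomer_faces(resident, newcomer_tag, group_size=3):
    """Move played by a resident when one newcomer joins an otherwise pure group."""
    # From the resident's point of view, the other players are its fellow
    # residents plus the single newcomer.
    others = [resident.tag] * (group_size - 2) + [newcomer_tag]
    return resident(others)

for newcomer in ("A", "B", "SomethingElse"):
    print(newcomer,
          what_newcomer_faces(clique_a, newcomer),
          what_newcomer_faces(clique_b, newcomer))
```

Whatever tag the newcomer carries, at least one of the two resident populations defects against it, which is the sense in which no single strategy does well in both fields.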

It gets weirder. It's also possible to construct an environment such that there is no optimal solution for that particular environment. Design an agent LargeNumberRewarder, which generates a random number from a geometric distribution, inspects its opponent's source code, picks out the largest integer, and cooperates iff the number it found was greater than the random number. There is no optimal agent against this field, because there is no largest number. Hybridize this with Cliquebot, and you get an evolutionarily stable population except that the agents contain a numeric parameter that increases with every generation. Play with the parameters a bit and you can make the derivative of that numeric parameter fall off quickly so that it converges to an almost-optimal equilibrium.
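A rough sketch of LargeNumberRewarder under some assumptions that are not in the description above: the opponent's source is a plain string, "largest integer" means the largest decimal literal in that string, and the geometric distribution has support 1, 2, 3, ... with success probability `p`. All function names are hypothetical.

```python
import random
import re

def geometric(p, rng):
    """Number of trials up to and including the first success (1, 2, 3, ...)."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n

def largest_integer_in(source):
    """Largest decimal integer literal in the source text, or 0 if there is none."""
    return max((int(m) for m in re.findall(r"\d+", source)), default=0)

def large_number_rewarder(opponent_source, p=0.5, rng=random):
    """Cooperate iff the opponent's largest integer exceeds a fresh geometric draw."""
    threshold = geometric(p, rng)
    return "C" if largest_integer_in(opponent_source) > threshold else "D"

def cooperation_probability(k, p=0.5):
    """P(draw < k) = 1 - (1 - p)**(k - 1) for k >= 1, and 0 for k = 0."""
    return 1 - (1 - p) ** max(k - 1, 0)
```

Under these assumptions the cooperation probability rises with every increase in the embedded integer but, in exact arithmetic, never reaches 1, so any candidate agent is beaten by one containing a larger number.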

[-]orthonormal14y40

As I said in the first footnote, sometimes the best that an agent can do is tie with another strategy (ETA: Or, more precisely, go to war with the other strategy, at symmetric odds.)

Wouldn't the LNRs die out fairly quickly against any kind of CliqueBot, or TDT, or even DefectBot?

[-]wedrifid14y00

Wouldn't the LNRs die out fairly quickly against any kind of CliqueBot, or TDT, or even DefectBot?

I think jim was hybridizing this with CliqueBot. So it plays as a CliqueBot except that among other CliqueBot hybrids it does the large number thing. So it (roughly speaking, with the right details) first just dominates then it plays silly games. :)

[-]Jack14y110

Assume maximal selfishness: each agent is motivated solely to maximize its own number of children (the agent itself doesn't get returned!), and doesn't care about the other agents using the same decision theory, or even about its other "relatives" in the simulation.

...

Problem: The setup looks perfectly fair for TDT agents. So why do they lose? (Difficulty: 2+3i stars.)

Um, they don't lose. What the TDT agents care about is the number of children they have. If they cared about the total number of descendants they have the cost of cooperating with DefectBots or CliqueBots would be exponentially higher (and TDT would act as CliqueBots).

Edit: Come to think of it, they wouldn't act like CliqueBots. Unlike CliqueBots they would also cooperate with CooperateBots. And if you include CooperateBots in the simulation TDT would beat CliqueBots.

[-]wedrifid14y30

Edit: Come to think of it, they wouldn't act like CliqueBots. Unlike CliqueBots they would also cooperate with CooperateBots. And if you include CooperateBots in the simulation TDT would beat CliqueBots.

Oh, good point. CliqueBot is a fail strategy! CliqueBot++ would need to allow exceptions such that it cooperates with (CliqueBot++ && anything that it does not consider a serious rival).

[-]orthonormal14y00

TDT wouldn't cooperate with CooperateBot- that would be throwing away utility.

[-]Jack14y70

Er, it cooperates with two TDTs and one CooperateBot, it defects with two CooperateBots and one TDT. That TDT cooperates even when one of the other agents is a CooperateBot and not TDT gives it an edge over CliqueBot in a population that includes all three.

[-]orthonormal14y20

Sorry, I was thinking 2-player games. Your solution doesn't work, though; the CooperateBots vanish first, followed by the TDTs.

[-]Jack14y80

Maybe it wasn't clear: I'm proposing that if the TDT agents cared about their total descendants the strategy they would adopt would be the same as the CliqueBot except that they would cooperate with another TDT and a CooperateBot. Once the CooperateBots disappear TDT and CliqueBots would be using the same strategy except that there would be more TDTs (since they cooperated with the CooperateBots while they were around).
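The proposed rule can be sketched as a decision table over the other two players in a triple. This is only an illustration: agent kinds are plain strings, and the two-CooperateBots case is taken from the earlier comment ("it defects with two CooperateBots and one TDT") rather than from this one.

```python
def clique_bot(others):
    """Plain CliqueBot: cooperate only when both other players are CliqueBots."""
    return "C" if all(o == "CliqueBot" for o in others) else "D"

def descendant_caring_tdt(others):
    """Hypothetical TDT that cares about all descendants, per the rule above."""
    kinds = sorted(others)
    if kinds == ["TDT", "TDT"]:
        return "C"  # mutual cooperation among TDTs
    if kinds == ["CooperateBot", "TDT"]:
        return "C"  # the one exception CliqueBot does not make
    return "D"      # DefectBots, CliqueBots, or two CooperateBots

for others in (["TDT", "TDT"],
               ["TDT", "CooperateBot"],
               ["CooperateBot", "CooperateBot"],
               ["TDT", "DefectBot"]):
    print(others, descendant_caring_tdt(others))
```

The only difference from CliqueBot is the mixed TDT/CooperateBot triple, which is where the extra children come from while CooperateBots are still around.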

[-]orthonormal14y20

Ah! I hadn't thought of that wrinkle before- it makes my analogy (for next time) even stronger than I'd thought. Thanks!

[-]gjm14y80

I like the pun in the title.

[-][anonymous]14y70

We all know that in a world of one-shot Prisoner's Dilemmas with read-access to the other player's source code, it's good to be Timeless Decision Theory.

I don't doubt it, but when was this proven?

[-]Vladimir_Nesov14y7-1

I would go as far as to actually doubt it. TDT seems to be insufficiently well-specified for this to clearly follow one way or the other, and in some cases I expect that TDT should be designed so that it won't unconditionally cooperate with other TDTs, picking other mixed strategies on the Pareto frontier instead.

(I assume for me to discuss stuff about this post not engaging the solution to its riddles directly is fine; I haven't even read it yet.)

[-]orthonormal14y10

(I assume for me to discuss stuff about this post not engaging the solution to its riddles directly is fine; I haven't even read it yet.)

Of course.

TDT seems to be insufficiently well-specified for this to clearly follow one way or the other, and in some cases I expect that TDT should be designed so that it won't unconditionally cooperate with other TDTs, picking other mixed strategies on the Pareto frontier instead.

In the simplest possible case, where two agents in a one-shot PD have isomorphic TDT algorithms maximizing different utility functions, they should cooperate. Let me know if I overstated the case in my first paragraph or Footnote 1.

[-]Vladimir_Nesov14y10

The question is of course in what counts as "isomorphic TDT algorithms" and how do the agents figure out if that's the case. However this post appears conclusively free of these problems.

[-]orthonormal14y20

However this post appears conclusively free of these problems.

Uh, do you mean "this post wrongly sweeps these problems under the rug", or "this post sweeps these problems under the rug, and that's OK"?

Anyway, although we don't have a coding implementation of any of these decision theories, Eliezer's description of TDT seems to keep the utility function separate from the causal network.

[-]Vladimir_Nesov14y20

These problems don't affect this post, so far as we assume the TDT agents to be suitably identical, since the games you consider are all symmetrical with respect to permutations of TDT agents, so superrationality (that TDT agents know how to apply) does the trick.

Anyway, although we don't have a coding implementation of any of these decision theories, Eliezer's description of TDT seems to keep the utility function separate from the causal network.

(Don't understand what you intended to communicate by this remark.)

[-]orthonormal14y30

Ah, good.

In retrospect, that remark doesn't apply to multiplayer games; I was thinking of the way that in Newcomb's Problem, the Predictor only cares what you choose and doesn't care about your utility function, so that the only place a TDT agent's utility function enters into its calculation there is at the very last stage, when summing over outcomes. But that's not the case for the Prisoner's Dilemma, it seems.

[-]Vladimir_Nesov14y30

Right, for TDT agents to expect each other acting identically from the symmetry argument, we need to be able to permute not just places of TDT agents in the game, but also simultaneously places of TDT agents in TDT agents' utility functions without changing the game, which accommodates the difference in agents' utility functions.

[-]wedrifid14y00

and in some cases I expect that TDT should be designed so that it won't unconditionally cooperate with other TDTs, picking other mixed strategies on the Pareto frontier instead.

I assume you are not expecting these cases to include the simple one shot prisoner's dilemma with full code access? I would be skeptical.

[-][anonymous]14y00

If you doubt it, then I doubt it as well. I thought I saw a formal specification a while back, but perhaps that was UDT.

[-]Vladimir_Nesov14y40

In a simple special case where everything is symmetric, they will cooperate if the problem is formalized in the spirit of TDT, but this is basically good old superrationality, not something TDT-specific. The doubt I expressed is about the case where the TDT agents are not exactly symmetric, so that each of them can't automagically assume that the other will do exactly the same thing. In the context of this post, this assumption may be necessary.

[-]Douglas_Knight14y00

I think it is unfair to TDT to say that it is just Hofstadter's superrationality. If TDT is an actual algorithm to which Hofstadter's argument applies, even just in the purely symmetric version, that is a great advance. I would definitely say that about UDT.

Yes, TDT is underspecified. But is it a class of fully specified algorithms, all of which cooperate with pure clones, or is it not clear if there is any way of specifying which logical counterfactuals it can consider?

Two relevant links: Gary Drescher on a problem with (a specification of?) TDT; you on underspecification.

[-]wedrifid14y00

The doubt I expressed is about the case where the TDT agents are not exactly symmetric, so that each of them can't automagically assume that the other will do exactly the same thing. In the context of this post, this assumption may be necessary.

The assumption of symmetry is not necessary in the context of this post. The ability to read the code and know that if they read your code and find that you would cooperate if they do (etc) is all that is necessary. Being the same as you isn't privileged at all. It's just convenient.

Code access is basically just better than knowledge they will do exactly the same thing.

[-]orthonormal14y00

I think Vladimir is saying that TDT agents with a superior bargaining position might extract further concessions from TDTs with an inferior bargaining position- or, rather, that we can't yet rigorously show that they wouldn't do such things. In the world of one-shot PDs, numerical superiority of one kind of TDT agent over another might be such a bargaining advantage.

[-]wedrifid14y00

In the world of one-shot PDs, numerical superiority of one kind of TDT agent over another might be such a bargaining advantage.

I had been considering a whole population of agents doing lots of prisoner's dilemmas among themselves to not be a one shot prisoner's dilemma. It does make sense for all sorts of other plays to be made when the situation becomes political.

[-]orthonormal14y00

Omega can wipe their memories of past interactions with other particular agents, as in the example I made up. That would make each interaction a one-shot, and it wouldn't prevent the sort of leverage we're talking about.

[-]wedrifid14y00

Omega can wipe their memories of past interactions with other particular agents, as in the example I made up. That would make each interaction a one-shot

I wouldn't call a game one shot just because memory constraints are applied. What matters is that the game that is being played is so much bigger than one prisoner's dilemma. Again, I don't dispute that there are all sorts of potential considerations that can be made, even if very little evidence about the external political environment is available to the agents, as in this case. Given this it seems likely that I don't disagree with Vlad significantly.

[-]Vladimir_Nesov14y20

I thought I saw a formal specification a while back, but perhaps that was UDT.

You're probably thinking of cousin_it's proof sketch of cooperation in PD. That was ADT/UDT. TDT talking about formal proofs is not part of its theory that was discussed anywhere that I know of.

[-]wedrifid14y30

Problem: The setup looks perfectly fair for TDT agents. So why do they lose? (Difficulty: 2+3i stars.)

V yvxr ubj lbh engrq gur qvssvphygl bs guvf dhrfgvba. Gung vf, vg'f zbfgyl vzntvanel. Gur GQG ntragf qba'g ybfr. Lbh tnir gurz fbzrguvat gb znkvzvfr naq gurl znkvzvfrq vg. Gurl qvq abg znkvzvfr na ragveryl qvssrerag ceboyrzf bs znxvat gurve qrfpraqnagf unir gur uvturfg cebcbegvba bs gur cbchyngvba be znxvat gurve trareny pynff bs qrpvfvba gurbevfg unir gur uvturfg cebcbegvba. Vs gubfr jrer gur tbnyf gura gur npghny hgvyvgl shapgvbaf tvira gb gur ntragf ercerfrag ybfg checbfrf. Gur 'cnenqbk', nf vf fb bsgra gur pnfr jvgu fhpu cnenqbkrf, vf bar bs ireony naq pbaprcghny fyrvtug bs unaq.

I'm also asking Eliezer, Vladimir Nesov, Wei Dai, cousin_it, and other decision-theory heavyweights to avoid posting spoilers on the main problem for now, if they agree with my reasons.

For the purpose of this problem I may as well be so I'll go as far as rot13.

[-]lessdazed14y00

You should distinguish between problems 3 and 4. Your answer would confuse someone who only understood one of those.

EDIT: like me.

[-]wedrifid14y00

Errr... neither? I was answering the 'problem' part, as quoted. It has its own rating and everything! I will go along with whatever association with 3 and 4 that ortho intended 'Problem' to have. It applies to 3 in as much as it involves the question "why do they only draw?" but far more interestingly to the variant in 4 when they get thrashed.

[-]lessdazed14y-20

"Gur GQG ntragf qba'g ybfr. Lbh tnir gurz fbzrguvat gb znkvzvfr naq gurl znkvzvfrq vg. Gurl qvq abg znkvzvfr na ragveryl qvssrerag ceboyrzf bs znxvat gurve qrfpraqnagf unir gur uvturfg cebcbegvba bs gur cbchyngvba be znxvat gurve trareny pynff bs qrpvfvba gurbevfg unir gur uvturfg cebcbegvba." is the natural answer to 3, and question 4 is distinguished from question 3 and I think elicits a better answer that adds to the discussion in a way the answer to question 3 wouldn't if one were to say "that which was wrong with question 3, is also wrong with question 4". An autistic reading of that answer applies to question 4, and isn't wrong, but there's more to say.

"Gurl qvqa'g ybbx nurnq" is something you can't say about 3, that you can say about 4, for had they done that in 3, it wouldn't increase their score, while they need to do that in 4 to increase their score (above zero).

[-]orthonormal14y40

Gung'f abg dhvgr evtug, yrffqnmrq; va gur fvghngvba jvgu GQGf naq QrsrpgObgf, gur GQGf jbhyq vapernfr gurve cbchyngvba snfgre va gur ybat eha vs gurl ershfrq gb nyybj QrsrpgObgf gb serr-evqr. Nyfb, vs gurl qvq fb, gurve cebcbegvba jbhyq evfr gb bar vafgrnq bs fgvpx ng bar unys.

[-]lessdazed14y00

Good point. I missed that since the number goes to infinity as the trials do regardless.

If you could devise a problem where the obvious flaw in "GQG fubhyq orng gur fbpxf bss bs QrsrpgObgf va nal snve svtug." alone would be implicated, that would be good, since it really is a different one than snvyvat gb ybbx nurnq.

[-]wedrifid14y00

Lrf, V qvq erpragyl qvfnterr jvgu lbh ba n qvssrerag guernq. Ubjrire zl pbzzrag urer fgvyy nccyvrf gb "Ceboyrz: ", abg "Rkrepvfr 3" be "Rkrepvfr 4". Gur eryngvbafuvc bs zl ercyl gb rvgure bs gubfr bgure rkrepvfrf vf jungrire eryngvbafuvc vg vaurevgrq sebz "Ceboyrz: " vgfrys naq jungrire ryfr unccraf gb nevfr cheryl ol pbvapvqrapr. Vs V jnagrq gb nyfb gnyx nobhg Rkrepvfr 3 naq 4 gurer ner znal bgure guvatf V pbhyq jevgr nobhg gurz!

[-]lessdazed14y10

The OP appears to conflate them, so I did too, see: "(Easy to verify if you've done Problem 2.)", but there is no problem 2, just an exercise 2. My comment "distinguish between problems 3 and 4" should really have read "distinguish between exercises 3 and 4", sorry.

I hadn't realized that for some number less than infinite iterations (and every higher number) in problem 3, snvyher gb ybbx nurnq would be important, so my comment is based on that error on my part. I overlooked it because for all iterations less than that high number, it isn't at all important. It is a legitimate thing to say from one with that specific wrong thought, so...whence "Lrf, V qvq erpragyl qvfnterr jvgu lbh ba n qvssrerag guernq"? I was going to say "you conflated the cases, I think you are wrong" but soft-pedaled it as much as I could, to be "you should distinguish" rather than "You are wrong" or "I think you are wrong". It turns out I was wrong, so...?

[-]wedrifid14y00

It is a legitimate thing to say from one with that specific wrong thought, so...whence "Lrf, V qvq erpragyl qvfnterr jvgu lbh ba n qvssrerag guernq"?

I could not understand why you had been continuing to target which of 3 and 4 I was responding to when I was actually targeting "P: ". It especially didn't make sense to me since "Problem: " seemed to be relevant to both 3 and 4 with 4 (just) being a somewhat exaggerated version of 3 that really emphasized the 'paradox'. It did not occur to me that you could just be wrong about 3. Take that as flattery of your intellect!

I also didn't pick up that you were trying to say that I was wrong in a nice way. Because the possibility that I was wrong about something so obvious just wasn't primed as a likely intent. ;)

It turns out I was wrong, so...?

That is sad for past you but doubly good for present you because you updated when ortho explained!

[-]lessdazed14y30

This should already perplex the reader who believes that rationalists should win

But see:

Assume maximal selfishness: each agent is motivated solely to maximize its own number of children (the agent itself doesn't get returned!), and doesn't care about the other agents using the same decision theory

If that wasn't stipulated, one could look ahead and see by acting like a defectbot you could tie, and self-modify into a defectbot/imitate a defectbot. Rationalists win provided rationalists don't mind others doing even better.

Likewise:

The setup looks perfectly fair for TDT agents. So why do they lose?

Because they only think one step ahead.

[-]Oscar_Cunningham14y20

Upvoted for title.

[-]Stuart_Armstrong14y00

Assume maximal selfishness: each agent is motivated solely to maximize its own number of children (the agent itself doesn't get returned!), and doesn't care about the other agents using the same decision theory, or even about its other "relatives" in the simulation

As I argued here, that is precisely the behaviour you don't want for your copies/descendants.

[-]Douglas_Knight14y00

The problem is underspecified. I think the post is really operating under the assumption that each generation Omega takes the population, divides it in triples, kills the remaining 0-2 without issue, runs the PD, and takes the result as the population for the next generation.* But if (as the post seems to say) Omega replaces each triple by its children before picking the next triple, exercise 1 is considerably more difficult than the others. In the first version, the agents care only about their children and not the children of other agents or of defectbot. But in the second case, an agent might encounter someone else's child when it plays the game, so it does care about the proportion of population. Then the situation is not a 1-shot PD and cannot be analyzed so simply. In particular, the answer depends on the utility function. It is not enough to say that the agent prefers more children to fewer; the answer depends on how much.
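The first (generation-by-generation) reading can be sketched as follows. The payoff rule is deliberately left as a parameter: `children_of_triple` is a hypothetical callable that takes three agents and returns the list of their children, since the actual payoffs live in the original post and are not reproduced here.

```python
import random

def next_generation(population, children_of_triple, rng=random):
    """One generation: shuffle, split into triples, drop the 0-2 stragglers,
    and return the combined children of all triples."""
    shuffled = list(population)
    rng.shuffle(shuffled)
    usable = len(shuffled) - len(shuffled) % 3  # stragglers die without issue
    offspring = []
    for i in range(0, usable, 3):
        triple = shuffled[i:i + 3]
        offspring.extend(children_of_triple(triple))
    return offspring

# Toy payoff rule (an assumption, not the post's): every agent has two children.
print(len(next_generation(["TDT"] * 8, lambda triple: triple + triple)))
```

In this reading every agent plays at most once, so each triple really can be analysed in isolation.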

Consider the modified situation in which I claim that a homogeneous population of TDT/UDT agents defect. In this version, Omega replaces a triple with its children before picking the new triples. The payoffs depend on the number of triples Omega has run: defection is always 1 copy, but cooperation by the N-th triple is worth N copies. If everyone cooperates, the population grows quadratically. There is a chance that an agent will never get picked. A population of sufficiently risk-averse agents will prefer to defect and get 1 child with probability 1 over cooperating and risking having no children. (There are some anthropic issues here, but I don't think that they matter.)
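A quick check of the quadratic-growth claim, under one reading of the payoffs that is an assumption here: each of the three agents in the N-th cooperating triple is replaced by N copies, so that triple changes the population by 3N - 3.

```python
def population_after(triples_run, initial_population):
    """Population after the given number of all-cooperate triples,
    where the n-th triple removes 3 agents and adds 3 * n children."""
    population = initial_population
    for n in range(1, triples_run + 1):
        population += 3 * n - 3
    return population

def closed_form(triples_run, initial_population):
    """Same quantity in closed form: P0 + 3 * T * (T - 1) / 2."""
    return initial_population + 3 * triples_run * (triples_run - 1) // 2

for t in (1, 4, 10, 100):
    print(t, population_after(t, 9), closed_form(t, 9))
```

The closed form is quadratic in the number of triples run, while the number of agents picked so far is only linear in it, which is why an individual agent's chance of ever being picked becomes a real concern.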

I had to change a lot of details to get there, but that means that the answer to Exercise 1 must depend on those details. If Omega replaces population every trial, then it cannot be analyzed like a triple in isolation, but must consider the population and utility function explicitly. If Omega runs it generation by generation, then exercise 1 does reduce to an isolated triple. But since the agents only care about their children, they only care about what happens the first generation, and don't care whether Omega runs any more generations, making the other questions about later generations seem out of place.

* If you change the payoffs from 2/3 to 6/9, then there are no stragglers to kill. Or you could let the stragglers survive to play next generation; if the population is large enough, this will not affect strategy.

[-]orthonormal14y00

But in the second case, an agent might encounter someone else's child when it plays the game, so it does care about the proportion of population.

I don't understand what you mean by this.

Consider the modified situation in which I claim that a homogeneous population of TDT/UDT agents defect...

Here, I believe you're saying that you can pick a rate of growth in the payoffs such that, with probability 1, if everyone cooperates, then eventually one particular lineage (chosen randomly) comes to dominate to the extent that nobody else even gets selected to reproduce, and if the starting utility functions are all sufficiently long-term and risk-averse, they might prefer not to undergo that lottery. Is that right?

[-]Douglas_Knight14y00

Your restatement of my example is correct, though the utility functions being long-term is not supposed to distinguish my example from yours. In a generational model where all the children happen at a predetermined time, discounting is not an issue. Your example seems to say that the agents care about the number of children (not descendants) that they have, regardless of when they occur. If they care about their infinite descendants they need discounting to get a finite number. If they care about exponentially discounted children, even with your payoffs and linear utility, they might choose to defect if the discount rate is high enough.

But in the second case, an agent might encounter someone else's child when it plays the game, so it does care about the proportion of population.

I don't understand what you mean by this.

My version is exaggerated so that the question is whether an agent gets to play the game at all. With the original constant payoffs, every agent gets put in some triple (with probability 1), so a homogeneous population of TDT agents cooperate. Now go back to your payoffs and consider a mix of TDT and defectors with Omega doing the triples one at a time. The only interesting strategy question for the TDT agent is whether to cooperate against 1 defector or to act cliquey. To decide the expected utility of a strategy, the agent must compute the probability of the different triples it might encounter. These depend on the population when it gets picked, which depend on the strategy it chose. In order to answer exercise 1, it must do exercises 2-4 and more. It is unlikely to encounter the asymptotic population, but it is also unlikely to encounter the original population.

Here's a rough calculation. Let's start with 1 defector. Let's assume that it encounters 2 of the original agents, not their descendants. Let's further assume its immediate children encounter original agents, not descendants (a quite unreasonable assumption); and that they don't get picked together, but face 2 TDT agents. Then the difference between cooperating and acting cliquey is 3 encounters with a defector against 9 such encounters. The number of children lost to such an encounter (compared to 3 TDTs) is 5 or 6, depending on cooperating or acting cliquey, which is quite large compared to the 1 that is gained by cooperating rather than acting cliquey. The assumption is ridiculous and the answer with linear utility is probably to cooperate, but it requires calculation and depends on the utility function (and maybe the initial population).

[-]wedrifid14y00

This is the most interesting (to me) post to appear on lesswrong in ages and I have downvoted it until such time as the "exercises" have worked solutions provided.

[-]orthonormal14y00

I'll write up my explanations and post them to Discussion this weekend.

[-]Manfred14y00

Interesting post, but I think the answer to the paradox being missing made it worse.
