As far as I understand, the main thing that is missing is a solid theory of logical counterfactuals.
The main question is: in the counterfactual scenario in which TDT recommends action X to agent A, what would another agent B do?
How does the thought process of A correlate with the thought process of B?

There are some games mentioned in the FDT and TDT papers which clearly involve multiple TDT agents.
The FDT paper mentions that TDTs "form successful voting coalitions in elections",
and the TDT paper mentions that TDTs cooperate in the Prisoner's Dilemma.
In those games we can easily tell how the thought process of one agent correlates with that of other agents,
because in those games there is an obvious symmetry between all agents, so that all agents will always do the same thing.
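
To make the symmetry point concrete, here is a minimal sketch of a two-player Prisoner's Dilemma in which both agents are assumed to run the same decision procedure, so only the outcomes where both choose the same action are reachable. The payoff numbers and the name `best_symmetric_action` are my own illustrative assumptions, not something taken from either paper.

```python
# Hypothetical Prisoner's Dilemma payoffs for "my" side, for illustration only.
# Keys are (my_action, other_action); "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def best_symmetric_action() -> str:
    """Pick the best action under the assumption that the other agent
    always ends up choosing the same action as I do."""
    # Only the diagonal outcomes (C, C) and (D, D) are considered reachable.
    return max(("C", "D"), key=lambda action: PAYOFF[(action, action)])


print(best_symmetric_action())
```

Under this assumption the comparison is just 3 versus 1, so cooperation wins. The hard part in asymmetric games like blackmail is that there is no such obvious restriction on which outcomes are reachable.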

In a blackmail scenario it's not so obvious, but I do think there is a certain symmetry between rejecting all blackmail and sending all blackmail.
Suppose the blackmailer B writes a letter saying "Hand over your utility point or we will both get -∞ utility" and thinks about whether to send it or not.
Then the victim A comes along and says "I reject all blackmail. This means: Do not send your letter, or we will both get -∞ utility". Now B has been blackmailed by A. So there is a symmetry here. If A rejects all blackmail regardless of causal consequences, then B will also reject A's "blackmail" that tries to extort him into not sending his letter, and will send his letter to A anyway, regardless of causal consequences.

So I no longer believe the claim that TDT agents simply avoid all negative-sum trades.


Yeah, my argument here is not contradicting the paper,
because the case of a TDT agent blackmailing a TDT agent is not discussed.
I just wanted to know whether the resistance to extortion still applies in this case,
because I think it doesn't.

But in some situations the logic can absolutely be applied to "normal causal blackmail".
If a CDT agent sends a completely normal blackmail letter to a TDT agent,
and if the CDT agent is capable of perfectly predicting the TDT agent,
then that is precisely the situation in which resisting the extortion makes sense.
In this situation, if the TDT agent resists the extortion, then the CDT agent will be able to predict that,
and since he is a CDT agent he will just do a simple EV calculation and not send the blackmail.
So I think it does apply to completely normal blackmail scenarios, as long as the CDT agent is insanely intelligent.
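
The EV calculation I have in mind can be sketched roughly like this. The payoff numbers are made up, the large negative number is a finite stand-in for the "-∞" outcome, and I am assuming that a sent letter commits the blackmailer to carrying out the threat if the victim refuses. The names `victim_pays` and `cdt_blackmailer_sends` are hypothetical.

```python
# Hypothetical payoffs for the blackmailer, chosen only for illustration.
GAIN_IF_PAID = 1         # the victim hands over the utility point
LOSS_IF_REFUSED = -100   # the threat is carried out (finite stand-in for -inf)
PAYOFF_NO_LETTER = 0     # nothing happens


def victim_pays(victim_policy: str) -> bool:
    """The victim follows a fixed policy: 'pay_all' or 'reject_all'."""
    return victim_policy == "pay_all"


def cdt_blackmailer_sends(victim_policy: str) -> bool:
    """A CDT blackmailer who predicts the victim perfectly compares the
    payoff of sending the letter with the payoff of not sending it."""
    ev_send = GAIN_IF_PAID if victim_pays(victim_policy) else LOSS_IF_REFUSED
    return ev_send > PAYOFF_NO_LETTER


for policy in ("pay_all", "reject_all"):
    print(policy, cdt_blackmailer_sends(policy))
```

With these numbers the letter only goes out to a victim who is predicted to pay, which is exactly why the policy of rejecting everything works against this kind of blackmailer.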


Would a TDT agent also just always send all possible blackmail to other agents, independently of whether they think it gets accepted or not, and just live with the consequences?
They might want to do that, because if they did, they would encounter fewer universes in which their blackmail gets rejected, since it would be known that rejecting their blackmail doesn't disincentivize them from sending it.
Like, I don't believe TDT actually recommends that, but it's the same logic that justifies rejecting all blackmail.

In any case, the decision theory of the blackmailer does slightly matter.
For example, if you have a genuinely stupid blackmailer who just always sends blackmail no matter what happens, then there is really no reason to reject the blackmail. Rejecting the blackmail of a genuinely stupid agent doesn't reduce the number of universes in which you receive blackmail.
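
Here is a rough sketch of that comparison from the victim's side. The two blackmailer types, the payoff numbers (again with a finite stand-in for "-∞") and the function names are all illustrative assumptions of mine.

```python
# Hypothetical payoffs for the victim, chosen only for illustration.
PAY_COST = -1        # the victim hands over the utility point
THREAT_COST = -100   # the threat is carried out (finite stand-in for -inf)
NO_LETTER = 0        # no blackmail is received


def letter_is_sent(blackmailer: str, victim_policy: str) -> bool:
    """'always_send' ignores the victim's policy; 'predictor' sends
    only when it foresees that the victim will pay."""
    if blackmailer == "always_send":
        return True
    return victim_policy == "pay_all"


def victim_utility(blackmailer: str, victim_policy: str) -> int:
    """Victim's payoff for a fixed policy against a given blackmailer type."""
    if not letter_is_sent(blackmailer, victim_policy):
        return NO_LETTER
    return PAY_COST if victim_policy == "pay_all" else THREAT_COST


for blackmailer in ("predictor", "always_send"):
    for policy in ("pay_all", "reject_all"):
        print(blackmailer, policy, victim_utility(blackmailer, policy))
```

Against the predictor, rejecting everything means no letter arrives at all. Against the always-sender, the letter arrives either way, so rejecting only adds the cost of the executed threat.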