In the interest of making decision theory problems more relevant, I thought I'd propose a real-life version of counterfactual mugging. This is discussed in Drescher's Good and Real, and many places before. I will call it the Hazing Problem by comparison to this practice (possibly NSFW – this is hazing, folks, not Disneyland).
The problem involves a timewise sequence of agents who each decide whether to "haze" (abuse) the next agent. (They cannot impose any penalty on the previous agent.) For every agent n, here is the preference ranking:
1) not be hazed by n-1
2) be hazed by n-1, and haze n+1
3) be hazed by n-1, do NOT haze n+1
or, less formally:
1) not be hazed
2) haze and be hazed
3) be hazed, but stop the practice
The problem is: you have been hazed by n-1. Should you haze n+1?
As in counterfactual mugging, the average agent ends up with lower utility if agents condition their choice on having been hazed, no matter how big the utility difference between 2) and 3) is. It also requires you to make a choice from within a "losing" part of the "branching", and that choice has implications for the other branches.
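To make the averaging claim concrete, here is a toy sketch rather than a definitive model. Everything numeric in it is my assumption: the utilities 2, 1 and 0 are chosen only to respect the ranking above, the first agent is assumed to be hazed by someone outside the chain, unhazed agents are assumed never to start hazing, and `average_utility` is a hypothetical helper name.

```python
from fractions import Fraction

# Illustrative utilities (assumptions; the post only fixes their ordering).
U_NOT_HAZED = 2        # outcome 1: not hazed
U_HAZED_AND_HAZE = 1   # outcome 2: hazed, and hazes the next agent
U_HAZED_AND_STOP = 0   # outcome 3: hazed, but stops the practice


def average_utility(hazes_if_hazed: bool, n_agents: int) -> Fraction:
    """Average utility over a chain in which every agent follows the same policy."""
    hazed = True  # assumption: agent 0 is hazed by someone outside the chain
    total = 0
    for _ in range(n_agents):
        if hazed:
            hazes_next = hazes_if_hazed
            total += U_HAZED_AND_HAZE if hazes_next else U_HAZED_AND_STOP
        else:
            hazes_next = False  # assumption: unhazed agents never start hazing
            total += U_NOT_HAZED
        hazed = hazes_next
    return Fraction(total, n_agents)


print(average_utility(True, 10))   # everyone passes the hazing on
print(average_utility(False, 10))  # the first hazed agent stops it
```

With these made-up numbers and a chain of 10, the hazing policy averages 1 and the non-hazing policy averages 9/5. In this toy model the gap between outcomes 2) and 3) is paid only once, by the first agent, so its weight in the average shrinks as the chain grows.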
You might object that the choice of whether to haze is not random, as Omega’s coinflip is in CM; however, there are deterministic phrasings of CM, and your own epistemic limits blur the distinction.
UDT sees optimality in returning not-haze unconditionally. CDT reasons that its having been hazed is fixed, and so hazes. I *think* EDT would choose to haze because it would prefer to learn that, having been hazed, it hazed n+1, but I'm not sure about that.
I also think that TDT chooses not-haze, although this is questionable since I'm claiming this is isomorphic to CM. I would think TDT reasons that, "If every agent n regarded it as optimal not to haze despite having been hazed, then I would not be in a position of having been hazed, so I zero out the disutility of choosing not-haze."
Thoughts on the similarity and usefulness of the comparison?