Precisely. My argument was just that, depending on one's stance on anthropic reasoning, the fact that an agent is contemplating RB in the first place might already be evidence that it is in a simulation/being blackmailed in this way.
As I said, that argument is the most commonly presented one. However, there is a causal chain that could benefit an agent and lead it to adopt a Basilisk strategy: namely, if it thinks it is itself a simulation and will be punished otherwise.
Interesting post. Could the same argument not be used against the Simulation argument?
Simplify the model by assuming there are two equally likely possibilities: a universe in which I, the observer, am one of many, many observers in an ancestor simulation run by some future civilization, and a universe in which I am a biological human naturally created by evolution on Earth. Again, we can imagine running the universe many, many times. But no matter how many people are in the considered universe, I can only have the experience of being one at a time. So, asking "What is the probability that I am in a simulation?":
The answer to that question converges to 1/2, as well. But if every observer reasoned like this when asked whether they are in a simulation, most would get the wrong answer (assuming there are more simulated than real observers)! How can we deal with this apparent inconsistency? Of course, as you say, different answers to different questions. But which one should we consider to be valid, when both seem intuitively to make sense?
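The gap between the two answers can be made concrete with a toy Monte Carlo (the observer counts here are made-up illustration values, not anything from the thread): sample many world-realizations, each equally likely to be the simulation universe (with many observers) or the real universe (with one), then compare the per-world frequency of "simulation" with the per-observer frequency of "I am simulated".

```python
import random

def run(trials=20_000, n_sim=1_000):
    """Compare per-world vs per-observer answers to 'am I simulated?'.

    Each trial is one realization of the universe: with probability 1/2 it
    is the simulation world containing n_sim observers, otherwise the real
    world containing exactly one observer.
    """
    sim_worlds = 0      # realizations that turned out to be simulations
    sim_observers = 0   # observers who live in a simulation
    all_observers = 0   # observers across all realizations
    for _ in range(trials):
        if random.random() < 0.5:        # simulation world
            sim_worlds += 1
            sim_observers += n_sim
            all_observers += n_sim
        else:                            # real world, single observer
            all_observers += 1
    # (fraction of worlds that are simulations, fraction of observers simulated)
    return sim_worlds / trials, sim_observers / all_observers

random.seed(0)
p_world, p_observer = run()
print(p_world)     # close to 1/2: half of the realizations are simulations
print(p_observer)  # close to 1: almost every observer is simulated
```

The per-world answer converges to 1/2, while the per-observer answer converges to n_sim/(n_sim + 1), which is nearly 1, matching the observation that most observers who answered "1/2" would be wrong.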
Thank you for your answer. I agree that human nature is a reason to believe that an RB-like scenario (especially one based on acausal blackmail) is less likely to happen. However, I was thinking more of a degenerate scenario similar to the one proposed in this comment. Just exchange the message coming from a text terminal with the fact that you are thinking about a Basilisk situation: a future superintelligence might have created many observers, some of whom think very much like you, but are less prone to believing in human laziness and more likely to support RB. Thus, if you consider answering no to Q1 (in other words, dismissing the Basilisk), you could take this as evidence that H1 may still be true, and that you are just (unluckily) one of the simulated observers who will be punished. By this logic, it would be very advisable to actually answer yes (assuming you care more about your own utility than about that of a copy of you).
Actually though, my anthropic argument might be flawed. If we think about it as in this post by Stuart Armstrong, we see that in both H0 and H1 there is exactly one observer that is me, personally (i.e. having experiences that I identify with); thus, the probability of my being in an RB scenario should not be higher than that of being in the real world (or any other simulation) after all. But which way of thinking about anthropic probability is correct in this case?
That strategy might work as deterrence, although actually implementing it would still be ethically... suboptimal, since you would still need to harm simulated observers. Sure, they would be rogue AIs instead of poor innocent humans, but in the end you would be doing something rather similar to what you blame them for in the first place: creating intelligent observers with the explicit purpose of punishing them if they act the wrong way.