"Without the step where an extortion ASI actually gets built, this seems closely analogous to Pascal's wager (not mugging). " The problem is, I expect it to be built, and I expect being built to be something instrumentally valuable to it in a way which cannot be inverted without making it much less likely, whereas the idea of a god who would punish those who don't think it exists can be inverted.
"And, this is basically just not possible. " I hope not.
"You do not have anywhere remotely high enough fidelity model of the superintelligence to tell the difference between "it can tell that it needs to actually torture you in the future in order to actually get the extra paperclips" vs "pretend it's going to it <in your simulation>, and then just not actually burn the resources because it knows you couldn't tell the difference."
My concern is that I might not need high fidelity.
"If you haven't done anything that looked like doing math (as opposed to handwavy philosophy), you aren't anywhere close, and the AI knows this, and knows it doesn't actually have to spend any resources to extract value from you because you can't tell the difference."
I hope you're correct about that, but I would like to know why you are confident about it. Eliezer Yudkowsky suggested that it would be rational to cooperate with a paperclip maximizer[1] from another universe in a one-shot prisoners' dilemma. This tells me that someone really intelligent (for a human) thinks that fidelity on its own is not enough to preclude acausal trade, so why should it preclude acausal blackmail?
His comment was 'I didn't say you should defect', if I remember correctly.
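To make the trade intuition concrete, here is a toy one-shot prisoners' dilemma with conventional payoff numbers I've picked purely for illustration (they're not from Yudkowsky's example). Against a fixed opponent, defection dominates, which is the usual causal argument; but if the two decision procedures are correlated closely enough to mirror each other, mutual cooperation is the reachable outcome that pays more, and the same correlation structure is what a blackmail scenario would exploit.

```python
# Toy one-shot prisoners' dilemma; payoff numbers are illustrative placeholders.
# Entries are (my_payoff, their_payoff) indexed by (my_action, their_action).
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}

# Against any fixed opponent action, defecting pays me more (5 > 3 and 1 > 0),
# which is the causal-decision-theory case for defecting.
# If our decision procedures are correlated (we effectively mirror each other),
# only (C, C) and (D, D) are reachable, and (C, C) pays more:
print(PAYOFFS[("cooperate", "cooperate")][0])  # 3
print(PAYOFFS[("defect", "defect")][0])        # 1
```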
"So in your mind the counterpart to lethal misalignment ASI by default is s-risk extortion ASI by default. " Possibly.
"I don't think the arguments about decision making here depend on talking about s-risk as opposed to more mundane worse outcomes."
I agree. It seems like you are not aware of the main reason to expect acausal coordination here. Maybe I shouldn't tell you about it...
You're imagining a very different scenario than I am. I worry that:
From a purely amoral point of view, it's worth its while to simulate a vast number of possible minds which might, in some informationally adjacent regions of a 'mathematical universe', be in a position to create it. This means it doesn't need to simulate them exactly, only to the level of fidelity at which they can't tell whether they're being simulated (and in any case, I don't have the same level of certainty that it couldn't gather enough information about me to simulate me exactly). Maybe I'm an imperfect simulation of another person. I wouldn't know, because I'm not that person.
"the imagined god is angry at my specific actions (or lack thereof) enough to torture me rather than any other value it could get from the simulation." I don't think it needs to be angry, or a god. It just needs to understand the (I fear sound) logic involved, which Eliezer yudkowsky took semi-seriously.
"4) the imagined god has a decision process that includes anger or some other non-goal-directed motivation for torturing someone who can no longer have any effect on the universe."
It wouldn't need to be non-goal-directed.
"no other gods have better things to do with the resources, and stop the angry one from wasting time." What if there are no 'other gods'? This seems likely in the small region of the 'logical/platonic universe containing this physical one.
I agree, but I worry that there won't be that many agents which weren't created by a process that makes basiliskoid minds disproportionately probable in the slice of possible worlds which contains our physical universe. In other words, I mostly agree with the Acausal normalcy idea, but certain idiosyncratic features of our situation, namely that humans are producing potentially the only ASI in this physical universe, seem to mean that things like the basilisk are still a concern.
Maybe there will be an acausal 'bubble' within which blackmail can take place, kind of like the way humans tend to find it moral to allow some animals to prey on others because we treat the 'ecosystem' as a moral bubble.
"I suspect a fairly high neuroticism and irrational failure to limit the sum of their probabilities to 1 of anyone who thinks it's significant." Why? What justifies your infinitesimal value?
I find it very difficult to estimate probabilities like this, but I expect the difference between the probability of something significant happening if I do something in response to the basilisk and the probability of it happening if I don't is almost certainly in excess of 1/1000, or even 1/100. This is within the range where I think it makes sense to take it seriously. (And this is why I asked this question.)
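To illustrate why I treat a probability difference of that size as worth acting on, here is a toy expected-value comparison. The harm and cost figures below are placeholders I've made up purely for illustration, not estimates of anything real:

```python
# Toy expected-value check; all numbers are illustrative placeholders.
p_diff = 1 / 1000           # assumed change in probability of the bad outcome if I act
harm_magnitude = 1e6        # placeholder size of the harm, in arbitrary utility units
cost_of_acting = 10         # placeholder cost of doing something in response

expected_harm_avoided = p_diff * harm_magnitude        # 1000.0
net_benefit_of_acting = expected_harm_avoided - cost_of_acting
print(net_benefit_of_acting)  # 990.0: positive under these made-up numbers, so not negligible
```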
"TDT and FDT are distict from CTD, but they're not actually acausal, just more inclusive of causality of decisions." I agree that the term 'acausal' is misleading; I take it to refer to anything which takes the possibility of being instantiated in different parts of a 'platonic /mathematical universe' into account. That CDT as it's usually referred to does not is the main reason why I find it problematic and why it doesn't allow an agent to profit in Newcomb's problem.
"Moreover so much more that what could exist does." Why would that be?
"For every Basilisk, there could be as likely an angel." I don't think I agree with this. There are reasons to think a basilisk would be more likely than a benevolent intelligence.
"The value of being tortured is negative and large, but finite: there are things that are worth enduring torture." That would depend on the torture in question, and I don't want to consider it.
"If they are threatening to cause harm if you don’t comply, that’s their fault, not yours." Yes, but that doesn't mean they can't cause said harm anyway.
I don't think I can prevent it from being created. But I do have some ability to influence whether it has an acausal incentive to hurt me (if in fact it has one).