Benya

Benya's Comments

Continuity axiom of vNM

I don't follow. Could you make this example more formal, giving a set of outcomes, a set of lotteries over these outcomes, and a preference relation on these that corresponds to "I will act so that, at some point, there will have been a chance of me becoming a heavy-weight champion of the world", and which fails Continuity but satisfies all other VNM axioms? (Intuitively this sounds more like it's violating Independence, but I may well be misunderstanding what you're trying to do since I don't know how to do the above formalization of your argument.)

Harry Potter and the Methods of Rationality discussion thread, July 2014, chapter 102

Also, Magical Britain keeps Muggles out, going so far as to enforce this by not even allowing Muggles to know that Magical Britain exists. I highly doubt that Muggle Britain would do that to potential illegal immigrants even if it did have the technology...

A Parable of Elites and Takeoffs

Incidentally, the same argument also applies to Governor Earl Warren's statement quoted in Absence of evidence is evidence of absence: He can be seen as arguing that there are at least three possibilities, (1) there is no fifth column, (2) there is a fifth column and it is supposed to do sabotage independent of an invasion, (3) there is a fifth column and it is supposed to aid a Japanese invasion of the West Coast. In case (2), you would expect to have seen sabotage; in cases (1) and (3), you wouldn't, because if the fifth column were known to exist by the time of the invasion, it would be much less effective. Thus, while observing no sabotage is evidence against the fifth column existing, it is evidence in favor of a fifth column existing and being intended to support an invasion. I recently heard Eliezer claim that this was giving Warren too much credit when someone pointed out an interpretation similar to this, but I'm pretty sure this argument was represented in Warren's brain (if not in explicit words) when he made his statement, even if it's pretty plausible that his choice of words was influenced by wanting to make it sound as if the absence of sabotage was actually supporting the contention that there was a fifth column.

In particular, Warren doesn't say that the lack of subversive activity convinces him that there is a fifth column, he says that it convinces him "that the sabotage we are to get, the Fifth Column activities are to get, are timed just like Pearl Harbor was timed". Moreover, in the full transcript, he claims that there are reasons to think (1) very unlikely, namely that, he alleges, the Axis powers all use fifth column activities everywhere else:

To assume that the enemy has not planned fifth column activities for us in a wave of sabotage is simply to live in a fool's paradise. These activities, whether you call them "fifth column activities" or "sabotage" or "war behind the lines upon civilians," or whatever you may call it, are just as much an integral part of Axis warfare as any of their military and naval operations. When I say that I refer to all of the Axis powers with which we are at war. [...] Those activities are now being used actively in the war in the Pacific, in every field of operations about which I have read. They have unquestionably, gentlemen, planned such activities for California. For us to believe to the contrary is just not realistic.

I.e., he claims that (1) would be highly anomalous given the Axis powers' behavior elsewhere. On the other hand, he suggests that (3) fits a pattern of surprise attacks:

[...] It convinces me more than perhaps any other factor that the sabotage that we are to get, the fifth column activities that we are to get, are timed just like Pearl Harbor was timed and just like the invasion of France, and of Denmark, and of Norway, and all of those other countries.

And later, he explicitly argues that you wouldn't expect to have seen sabotage in case (3):

If there were sporadic sabotage at this time or if there had been for the last 2 months, the people of California or the Federal authorities would be on the alert to such an extent that they could not possibly have any real fifth column activities when the M-day comes.

So he has the pieces there for a correct Bayesian argument that a fifth column still has high posterior probability after seeing no sabotage, and that a fifth column intended to support an invasion has higher posterior than prior probability: Low prior probability of (1); (comparatively) high prior probability of (3); and an argument that (3) predicts the evidence nearly as well as (1) does. I'm not saying his premises are true, just that the fact that he claims all of them suggests that his brain did in fact represent the correct argument. The fact that he doesn't say that this argument convinces him "more than anything" that there is a fifth column, but rather says that it convinces him that the sabotage will be timed like Pearl Harbor (and France, Denmark and Norway), further supports this -- though, as noted above, while I think that his brain did represent the correct argument, it does seem plausible that his words were chosen so as to suggest the alternative interpretation as well.
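To make the shape of that argument concrete, here is a minimal sketch of the update over the three hypotheses. All numbers and names below are hypothetical, picked only to match the qualitative premises (low prior on (1), comparatively high prior on (3), and (3) predicting "no sabotage" nearly as well as (1)); none of them come from Warren's testimony.

```python
# Illustrative Bayesian update over the three hypotheses discussed above.
# Every number here is an assumption made up for the example.

# Prior probabilities: (1) is low, (3) is comparatively high.
priors = {
    "no_fifth_column": 0.10,        # hypothesis (1)
    "independent_sabotage": 0.30,   # hypothesis (2)
    "invasion_support": 0.60,       # hypothesis (3)
}

# Probability of observing no sabotage so far under each hypothesis.
likelihood_no_sabotage = {
    "no_fifth_column": 1.00,        # nothing to observe
    "independent_sabotage": 0.10,   # sabotage would be expected by now
    "invasion_support": 0.90,       # the fifth column stays hidden until the invasion
}


def posterior(priors, likelihoods):
    """Apply Bayes' rule: multiply prior by likelihood, then normalize."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}


post = posterior(priors, likelihood_no_sabotage)

# Probability that some fifth column exists, before and after the observation.
prior_fifth_column = priors["independent_sabotage"] + priors["invasion_support"]
post_fifth_column = post["independent_sabotage"] + post["invasion_support"]

for h in priors:
    print(f"{h}: prior {priors[h]:.2f} -> posterior {post[h]:.2f}")
print(f"any fifth column: prior {prior_fifth_column:.2f} -> posterior {post_fifth_column:.2f}")
```

With these made-up inputs, the probability that any fifth column exists goes down a little, while the probability of (3) specifically goes up, which is the pattern described above. Different inputs satisfying the same premises give the same direction of change, though the sizes will of course differ.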

Bostrom versus Transcendence

The true message of the first video is even more subliminal: The whiteboard behind him shows some math recently developed by MIRI, along with a (rather boring) diagram of Botworld :-)

The sin of updating when you can change whether you exist

Sorry about that; I've had limited time to spend on this, and have mostly come down on the side of trying to get more of my previous thinking out there rather than replying to comments. (It's a tradeoff where neither of the options is good, but I'll try to at least improve my number of replies.) I've replied there. (Actually, now that I spent some time writing that reply, I realize that I should probably just have pointed to Coscott's existing reply in this thread.)

L-zombies! (L-zombies?)

I'm not sure which of the following two questions you meant to ask (though I guess probably the second one), so I'll answer both:

(a) "Under what circumstances is something (either an l-zombie or conscious)?" I am not saying that something is an l-zombie only if someone has actually written out the code of the program; for the purposes of this post, I assume that all natural numbers exist as platonic objects, and therefore all observers in programs that someone could in principle write and run exist at least as l-zombies.

(b) "When is a program an l-zombie, and when is it conscious?" The naive view would be that the program has to be actually run in the physical world; if you've written a program and then deleted the source without running it, it wouldn't be conscious. But as to what exactly the rule is that you can use to look at, say, a cellular automaton (as a model of physical reality) and ask whether the conscious experience inside a given Turing machine is "instantiated inside" that automaton, I don't have one to propose. I do think that's a weak point of the l-zombies view, and one reason that I'd assign measureless Tegmark IV a higher a priori probability.

The sin of updating when you can change whether you exist

Thank you for the feedback, and sorry for causing you distress! I genuinely did not take into consideration that this choice could cause distress, and it could have occurred to me, and I apologize.

On how I came to think that it might be a good idea (as opposed to missing that it might be a bad idea): While there's math in this post, the point is really the philosophy rather than the math (whose role is just to help thinking more clearly about the philosophy, e.g. to see that PBDT fails in the same way as NBDT on this example). The original counterfactual mugging was phrased in terms of dollars, and one thing I wondered about in the early discussions was whether thinking in terms of these low stakes made people think differently than they would if something really important was at stake. I'm reconstructing, it's been a while, but I believe that's what made me rephrase it in terms of the whole world being at stake. Later, I chose the torture as something that, on a scale I'd reflectively endorse (as opposed, I acknowledge, to actual psychology), is much less important than the fate of the world, but still important. But I entirely agree that for the purposes of this post, "paying $1" (any small negative effect) would have made the point just as well.

SUDT: A toy decision theory for updateless anthropics

In short, I don't think SUDT (or UDT) by itself solves the problem of counterfactual mugging. [...] Perhaps SUDT also needs to specify a rule for selecting utility functions (e.g. some sort of disinterested "veil of ignorance" on the decider's identity, or an equivalent ban on utilities which sneak it in a selfish or self-interested term).

I'll first give an answer to a relatively literal reading of your comment, and then one to what IMO you are "really" getting at.

Answer to a literal reading: I believe that what you value is part of the problem definition, it's not the decision theory's job to constrain that. For example, if you prefer DOOM to FOOM, (S)UDT doesn't say that your utilities are wrong, it just says you should choose (H). And if we postulate that someone doesn't care whether there's a positive intelligence explosion if they don't get to take part in it (not counting near-copies), then they should choose (H) as well.

But I disagree that this means that (S)UDT doesn't solve the counterfactual mugging. It's not like the copy-selfless utility function I discuss in the post automatically makes clear whether we should choose (H) or (T): If we went with the usual intuition that you should update on your evidence and then use the resulting probabilities in your expected utility calculation, then even if you are completely selfless, you will choose (H) in order to do the best for the world. But (S)UDT says that if you have these utilities, you should choose (T). So it would seem that the version of the counterfactual mugging discussed in the post exhibits the problem, and (S)UDT comes down squarely on the side of one of the potential solutions.

Answer to the "real" point: But of course, what I read you as "really" saying is that we could re-interpret our intuition that we should use updated probabilities as meaning that our actual utility function is not the one we would write down naively, but a version where the utilities of all outcomes in which the observer-moment making the decision isn't consciously experienced are replaced by a constant. In the case of the counterfactual mugging, this transformation gives exactly the same result as if we had updated our probabilities. So in a sense, when I say that SUDT comes down on the side of one of the solutions, I am implicitly using a rule for how to go from "naive" utilities to utilities-to-use-in-SUDT: namely, the rule "just use the naive utilities". And when I use my arguments about l-zombies to argue that choosing (T) is the right solution to the counterfactual mugging, I need to argue why this rule is correct.

In terms of clarity of meaning, I have to say that I don't feel too bad about not spelling out that the utility function is just what you would normally call your utility function, but in terms of the strength of my arguments, I agree that the possibility of re-interpreting updating in terms of utility functions is something that needs to be addressed for my argument from l-zombies to be compelling. It just happens to be one of the many things I haven't managed to address in my updateless anthropics posts so far.

In brief, my reasons are twofold: First, I've asked myself, suppose that it actually were the case that I were an l-zombie, but could influence what happens in the real world; what would my actual values be then? And the answer is, I definitely don't completely stop caring. And second, there's the part where this transformation doesn't just give back exactly what you would have gotten if you updated in all anthropic problems, which makes the case for it suspect. The situations I have in mind are when your decision determines whether you are a conscious observer: In this case, how you decide depends on the utility you assign to outcomes in which you don't exist, something that doesn't have any interpretation in terms of updating. If the only reason I adopt these utilities is to somehow implement my intuitions about updating, it seems very odd to suddenly have this new number influencing my decisions.

I like simplicity, but not THAT much

It's priors over logical states of affairs. Consider the following sentence: "There is a cellular automaton that can be described in at most 10 KB in programming language X, plus a computable function f() which can be described in another 10 KB in the same programming language, such that f() returns a space/time location within the cellular automaton corresponding to Earth as we know it in early 2014." This could be false even if Tegmark IV is true, and prior probability (i.e., probability without trying to do an anthropic update of the form "I observe this, so it's probably simple") says it's probably false.
