From the last thread:

From Costanza's original thread (entire text):

"This is for anyone in the LessWrong community who has made at least some effort to read the sequences and follow along, but is still confused on some point, and is perhaps feeling a bit embarrassed. Here, newbies and not-so-newbies are free to ask very basic but still relevant questions with the understanding that the answers are probably somewhere in the sequences. Similarly, LessWrong tends to presume a rather high threshold for understanding science and technology. Relevant questions in those areas are welcome as well.  Anyone who chooses to respond should respectfully guide the questioner to a helpful resource, and questioners should be appropriately grateful. Good faith should be presumed on both sides, unless and until it is shown to be absent.  If a questioner is not sure whether a question is relevant, ask it, and also ask if it's relevant."


  • How often should these be made? I think one every three months is the correct frequency.
  • Costanza made the original thread, but I am OpenThreadGuy. I am therefore not only entitled but required to post this in his stead. But I got his permission anyway.



  • I still haven't figured out a satisfactory answer to the previous meta question, how often these should be made. It was requested that I make a new one, so I did.
  • I promise I won't quote the entire previous threads from now on. Blockquoting in articles only goes one level deep, anyway.



Um... Where's da Less Wrong IRC at?

(Edit: I hope I'm not getting upvoted for my goofing around over there last night!)

#lesswrong at

If zero and one aren't probabilities, how does Bayesian conditioning work? My understanding is that a Bayesian has to be certain of the truth of whatever proposition that she conditions on when updating.

Zero and one are probabilities. The apparent opposite claim is a hyperbole intended to communicate something else, but people on LessWrong persistently make the mistake of taking it literally. For examples of 0 and 1 appearing unavoidably in the theory of probability, consider P(A|A) = 1 and P(A|~A) = 0. If someone disputes either of these formulae, the onus is on them to rebuild probability theory in a way that avoids them. As far as I know, no one has even attempted this.

But P(A|B) = P(A&B)/P(B) for any positive value of P(B). You can condition on evidence all day without ever needing to assert a certainty about anything. Your conclusions will all be hypothetical, of the form "if this is the prior over A and this B is the evidence, this is the posterior over A". If the evidence is uncertain, this can be incorporated into the calculation, giving conclusions of the form "given this prior over A and this probability distribution over possible evidence B, this is the posterior over A."

If you are uncertain even of the probability distribution over B, then a hard-core Bayesian will say that that uncertainty is modelled by a distribution over distributions of B, which can be...

This isn't necessary. In many circumstances, you can approximate the probability of an observation you're updating on to 1, such as an observation that a coin came up heads. An observation never literally has a probability of 1 (you could be hallucinating, or be a brain in a jar, etc.) Sometimes observations are uncertain enough that you can't approximate them to 1, but you can still do the math to update on them ("Did I really see a mouse? I might have imagined it. Update on .7 probability observation of mouse.")

Yeah, but if your observation does not have a probability of 1 then Bayesian conditionalization is the wrong update rule. I take it this was Alex's point. If you updated on a 0.7 probability observation using Bayesian conditionalization, you would be vulnerable to a Dutch book. The correct update rule in this circumstance is Jeffrey conditionalization. If P1 is your distribution prior to the observation and P2 is the distribution after the observation, the update rule for a hypothesis H given evidence E is:

P2(H) = P1(H | E) P2(E) + P1(H | ~E) P2(~E)

If P2(E) is sufficiently close to 1, the contribution of the second term in the sum is negligible and Bayesian conditionalization is a fine approximation.
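A minimal sketch of the update rule quoted above, to show how the result approaches ordinary conditionalization as P2(E) approaches 1. The function name and the numbers (a 0.9 / 0.1 pair of conditional probabilities for the mouse example) are illustrative assumptions, not values from the thread.

```python
def jeffrey_update(p1_h_given_e, p1_h_given_not_e, p2_e):
    """Jeffrey conditionalization:
    P2(H) = P1(H|E) * P2(E) + P1(H|~E) * P2(~E)
    """
    return p1_h_given_e * p2_e + p1_h_given_not_e * (1.0 - p2_e)


# Hypothetical numbers: H = "there is a mouse in the house",
# E = "I saw a mouse". These conditionals are assumed for illustration.
P1_H_GIVEN_E = 0.9
P1_H_GIVEN_NOT_E = 0.1

for p2_e in (0.7, 0.99, 1.0):
    posterior = jeffrey_update(P1_H_GIVEN_E, P1_H_GIVEN_NOT_E, p2_e)
    print(f"P2(E) = {p2_e:.2f} -> P2(H) = {posterior:.3f}")
```

With P2(E) = 1 the second term vanishes and the result is just P1(H|E), which is the Bayesian conditionalization case.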

This is a strange distinction, Jeffrey conditionalization. A little google searching shows that someone got their name added to conditioning on E and ~E. To me that's just a straight application of probability theory. It's not like I just fell off the turnip truck, but I've never heard anyone give this a name before. To get a marginal, you condition on what you know, and sum across the other things you don't. I dislike the endless multiplication of terms for special cases where the general form is clear enough.
I don't know. I like having names for things. Makes it easier to refer to them. And to be fair to Jeffrey, while the update rule itself is a trivial consequence of probability theory (assuming the conditional probabilities are invariant), his reason for explicitly advocating it was the important epistemological point that absolute certainty (probability 1) is a sort of degenerate epistemic state. Think of his name being attached to the rule as recognition not of some new piece of math but of an insight into the nature of knowledge and learning.
If you observe X then the thing you update on is "I observed X" and not just "X". Just because you observed something doesn't mean it was necessarily the case (you could be hallucinating etc.). So while you don't assign probability 1 to "X" you do assign probability 1 to "I observed X", which is fine.

Doesn't the paper cited here on acausal romance imply that gains from acausal trade are incoherent?

The fact that I can imagine someone who can imagine exactly me doesn't seem like it implies that I can make material gains by acting in reference to that inaccessible other.

What am I misunderstanding?

That's the joke. The whole paper is a "Modest Proposal" style satire. It's designed to tear down modal realism by taking it to the most absurd extreme. I also detected hints of playing on the Ontological Argument for the existence of God.
Thanks for that explanation of the paper. Is acausal trade supposed to rely on modal realism, or are they distinct?

People here seem confident that there exists a decision theory immune to blackmail. I see a large amount of discussion of how to make an AI immune to blackmail, but I've never seen it established (or even argued for) that doing so is possible. I think I missed something vital to these discussions somewhere. Could someone point me to it, or explain here?

I'm not aware of a satisfactory treatment of blackmail (in the context of reflective decision theory). The main problem appears to be that it's not clear what "blackmail" is, exactly, how to formally distinguish blackmail from trade.

I think blackmail can be taken as a form of threat. To use Schelling's definition (Strategy of Conflict), a threat has the property that after the person being threatened fails to perform the specified action, the threatener does not want to carry out the threat any more. In other words, the threatener has a credibility problem: he has to convince the target of the threat that he will carry it out once he desires not to. This requires some form of pre-commitment, or an iterated game, or a successful bluff, or something along those lines. What do you see as the distinguishing difference between blackmail and a threat? (I assume that blackmail is a subset of threats, but I suppose that might not be universally agreed.)
Counterexample: A man seduces a female movie star into a one night stand and secretly records a sex tape. He would prefer to blackmail the movie star for lots of money, but if that fails he would rather release the tape to the press for a smaller amount of money + prestige than just do nothing. The movie star's preference ordering is for nothing to happen, for her to pay out, then lastly for the press to find out. The optimal choice is for her to pay out, because if she pre-commits to not give in to blackmail, she will receive the worst possible outcome. This seems to fall squarely under blackmail, yet requires no pre-commitment, iteration, or bluffing.
It does, making 'blackmail' the wrong term to use when considering game theory scenarios. Some are 'threats', some are simply trade.
I would say that blackmail is the intersection of {things that are indistinguishable from threats} and {things that are indistinguishable from trade}. And yes, from the perspective of the blackmailers that includes both some things that are trade and some that are threats.
Yep. Even without the payoff, the blackmail-immune agent is going to interact poorly with a stupid blackmailer agent which simply doesn't understand that the blackmail-immune agent won't pay. Or with an evil blackmailer agent that just derives positive utility from your misfortune, which is the case with most blackmailers in practice. The winning strategy depends on the ecosystem. The winning strategy among defectors is to defect, but in a mixed population tit-for-tat works great. The decision systems just tend to converge to a sort of self-fulfilling-prophecy solution. Edit: i.e. there's a rock-paper-scissors situation between different agents; they are not well ordered in 'betterness'.
Actually, I'm not sure this does fall squarely under blackmail. Consider the case where someone has a tape I don't want shown to the press, and sells that tape to the press for money + prestige, and never gives me any choice in the matter. That's clearly not blackmail. I'm not sure it becomes blackmail when they give me a choice to pay them instead, though the case could be made. Or consider the case where it turns out I don't mind having the tape shown (I want the publicity, say), and so the person sells the tape to the press, and everyone gets what they want. Also not blackmail. Not even clearly attempted blackmail, though the case could be made. My point being that it seems to me that for me to legitimately call something "blackmail" it needs to be something the blackmailer threatens to do only because it makes me suffer more than paying them, not something that the blackmailer wants to do anyway for his own reasons that just happens to make me suffer.
I disagree that the essential element to blackmail is it must be done only to make me suffer. To this end I offer a scenario. (I've made it a little more like a story just for giggles). I argue that both of them were attempting to blackmail and Julia's desire to follow through with it anyways doesn't change anything. The actions would both feel like blackmail to me if I were on the receiving end, and the police would treat both of them as blackmail as well. Blackmail is just an attempt to get money in exchange for not releasing information; the mindset of the blackmailer does not affect it. This is why I agree with Vladimir Nesov, that blackmail exists in a blurry spot on the continuum of trade. If you don't classify Julia's actions as blackmail, I would be curious what you do call it.
I classify Julia's actions as inconsistent, mostly. At time T1, Julia prefers to date me rather than end our relationship and tell my wife. At time T2, Julia prefers to end our relationship and tell my wife. The transition between T1 and T2 evidently has something to do with the transient belief that her silence was worth $4k/week, but what exactly it has to do with that belief is unclear, since by Julia's own account the truth or falsehood of that belief is irrelevant. If I take her account as definitive, I'm pretty clear that what Julia is doing is not blackmail... it reduces to "Hey, I've decided to tell your wife about us, and there's nothing you can do to stop me." It isn't even a threat, it's just early warning of the intent to harm me. If I assume she's lying about her motives, either consciously or with some degree of self-delusion, it might be blackmail. For example, if she believes I really can afford to pay her, and am just claiming poverty as a negotiating tactic, which she is countering by claiming not to care about the money, then it follows that she's blackmailing me. If I assume that she doesn't really have relevant motives anymore, she just precommitted to reveal the information if I don't pay her and now she's following through on her previous precommitment, and the fact that the precommitment was made based on one set of beliefs about the world and she now knows those beliefs were false at the time doesn't change the fact that the precommitment was made ("often wrong, never uncertain"), then she clearly blackmailed me once, and I guess it follows that she's still blackmailing me... maybe? It seems that if she set up a mechanical device that posts the secret to Facebook unless fed $4k in quarters once a week, and then changed her mind and decided she'd rather just keep dating me, but was unable to turn the device off, we could in the same sense say she was still blackmailing me, albeit against her own will. That is at best a problematic sense o
Sorry. I think I communicated unclearly, which is the danger of using stories instead of examples and is my fault entirely. At the very start of the story, Julia learns about your wife at the same time she learns about the lottery. She had previously thought you were single and the new information shifted her preference ordering. Regarding the example you used (oil company & energy), I also hold it is not blackmail. If I use the previous definition of Blackmail being the act of making an attempt to get money in exchange for not revealing information, then the attempt is the crucial part in this case (whether it succeeds or not). The oil company offering me money is okay; me trying to get money out of the oil company is blackmail.
And also sometimes okay. The distinction isn't "okay" vs blackmail. It is blackmail vs not-blackmail and "okay" vs not-okay.
(nods) As noted elsewhere, I missed this and was entirely mistaken about Julia's motives. I stand corrected. You were perfectly clear, I just wasn't reading attentively enough. Re: blackmail... OK. So, if I develop the technology and I approach the oil company and say "I have this technology, I'll guarantee you exclusive rights to it for $N/week," that's blackmail?
I'd say it's much closer to blackmail than the original oil company scenario.
I suppose I agree with that, but I wouldn't call either of them blackmail. Would you?
I think in Xachariah's story Julia did not know prior to seeing you on TV that you have a wife. So there was no time at which she had the preference you describe here.
Ah! Good point, I forgot about that. You're absolutely right... throughout, she presumably prefers to break up with me than date me if I'm married. My error.
Something about this confused me.
Not sure if this is what confused you or not, but it has since been pointed out to me that I was wrong; Julia does not necessarily (and ought not be understood to) have this preference, as she did not know about my wife at T1.
No, it was just you talking about your wife in first person! :)
Ah. Well, my husband had one once, I suppose I might feel left out.
If we avoid the overloaded term "blackmail" and talk of threats vs. trade, Angela is threatening you whereas Julia is offering a trade. I agree that this example shows that "makes you suffer" is not the distinguishing element. It's also interesting that you may not know whether the situation is a threat or a trade (you may not know whether the mistress wants to tell your wife anyway).
I'm not sure how threats and trade are a real dichotomy rather than two fuzzy categories. Suppose I buy food. That's basic trade. But at the same time a monopoly could raise the price of food a lot, and I would still have to buy it, and now it is the threat of starvation. I can go fancy(N) and say: I won't pay more than X for food, I would rather starve to death, and then they get no more of my money. If I can make that credible, and if the monopoly reasons in a fancy(N-1) manner, they won't raise the price above X because I won't pay. But if the monopoly reasons in the fancy(N) manner, it does the exact same reasoning and concludes that it should ignore my threat to starve myself to death rather than pay. Most human agents seem to play tit for tat and mirror whatever you are doing, so if you are reasoning "I'll just starve myself to death rather than pay", they reason "I'll just raise the price regardless, and to hell with whether he pays". The blackmail-resistant agent is also blackmail-resistance resistant.
This is my position as well, blackmail probably doesn't need to be considered as a separate case, reasonable behavior in such cases will probably just fall out from a sufficiently savvy bargaining algorithm.
I agree with this, incidentally.
Good point; haggling is a good example of a fuzzy boundary between threats and trade. If A is willing to sell a widget for any price above $10, and B is willing to buy a widget for any price below $20, and there are no other buyers or sellers, then for any price X strictly between $10 and $20, A saying "I won't sell for less than X" and B saying "I won't buy for more than X" are both threats under my model. Which means that agents that "naively" precommit to never respond to any threats (the way I understand them) will not reach an agreement when haggling. They'll also fail at the Ultimatum game. So there needs to be a better model for threats, possibly one that takes Schelling points into account; or maybe there should be a special category for "the kind of threats it's beneficial to precommit to ignore".
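A rough sketch of the haggling point above, using the $10 and $20 reserve prices from the comment. The function names, the `ignore_threats` flag, and the sequence of seller demands are hypothetical; the "threat" test is just the comment's own model (any insisted-on price strictly inside the bargaining range), not a settled definition.

```python
SELLER_MIN = 10  # A sells at any price above $10 (from the comment)
BUYER_MAX = 20   # B buys at any price below $20 (from the comment)


def is_threat(price):
    """Under the comment's model, insisting on any price strictly
    between the two reserve prices counts as a threat."""
    return SELLER_MIN < price < BUYER_MAX


def buyer_accepts(price, ignore_threats):
    """Buyer accepts a profitable price, unless it has precommitted
    to ignore threats and the demand counts as one."""
    if ignore_threats and is_threat(price):
        return False
    return price < BUYER_MAX


def first_deal(seller_demands, ignore_threats):
    """Return the first demanded price both sides accept, or None."""
    for price in seller_demands:
        if price > SELLER_MIN and buyer_accepts(price, ignore_threats):
            return price
    return None


demands = [19, 17, 15, 13, 11]  # hypothetical descending demands from A
print(first_deal(demands, ignore_threats=False))  # 19
print(first_deal(demands, ignore_threats=True))   # None
```

Every price the seller would actually offer lies inside the range, so the "naive" buyer rejects all of them and no trade happens.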
Hmm, the pre-commitment to ignore would depend on other agents and their pre-pre-commitment to ignore pre-commitments. It just goes recursive like Sherlock Holmes vs Moriarty, and when you go meta and try to look for the 'limit' of the recursion, it goes recursive again... I have a feeling that it is inherently a rock-paper-scissors situation where you can't cheat like this robot. (I.e. I would suggest, at that point, trying to make a bunch of proofs of impossibility to narrow expectations down somewhat.)
It's not possible to coordinate in general against arbitrary opponents, like it's impossible to predict what an arbitrary program does, but it's advantageous for players to eventually coordinate their decisions (on some meta-level of precommitment). On one hand, players want to set prices their way, but on the other they want to close the trade eventually, and this tradeoff keeps the outcome from both extremes ("unfair" prices and impossibility of trade). Players have an incentive to set up some kind of Loebian cooperation (as in these posts), which stops the go-meta regress, although each will try to set the point where cooperation happens in their favor.
I was thinking rather of a Halting Problem-like impossibility, along with a rock-paper-scissors situation that prevents declaring any one strategy, even the cooperative one, as the 'best'.
If difficulty of selecting and implementing a strategy is part of the tradeoff (so that more complicated strategies count as "worse" because of their difficulty, even if they promise an otherwise superior outcome), maybe there are "best" strategies in some sense, like there is a biggest natural number that you can actually write down in 30 seconds. (Such things would of course have the character of particular decisions, not of decision theory.)
There is not a biggest natural number that you can actually write down in thirty seconds -- that's equivalent to Berry's paradox.
Huh? Just start writing. The rule wasn't "the number you can define in 30 seconds", but simply "the number you can write down in 30 seconds". Like the number of strawberries you can eat in 30 seconds, no paradox there!
I was reading "write down" more generally than "write down each digit of in base ten," but I guess that's not how you meant it.
Hmm if it was a programming contest I would expect non-transitive 'betterness'.
Given a fixed state of knowledge about possible opponents and finite number of feasible options for your decision, there will be maximal decisions, even if in an iterated contest the players could cycle their decisions against updated opponents indefinitely.
That's a promise, not a threat, by Schelling's terminology. Once the movie star upholds her end of the bargain, the man has no incentive to keep his promise, and every incentive to break it. Is there something game-theoretic about blackmail that makes it an identifiable subset of the group threats + promises? Note that Schelling also describes a negotiating position that is neither threat nor promise, but the combination of the two. And that's not exactly blackmail either. I suspect you could come up with a blackmail scenario that fits any of the three groupings.
That's not blackmail at all. It seems like blackmail because of the questionable morality of selling secretly recorded sex tapes, but giving the movie star the chance to buy the tape first doesn't make the whole thing any less moral than it would be without that chance, and unlike real blackmail the movie star being known not to respond to blackmail doesn't help in any way. Consider this variation: Instead of a secret tape the movie star voluntarily participated in an amateur porno that was intended to be publicly released from the beginning, but held up for some reason, and all that happened before the movie star became famous in the first place. The producer knows that releasing the tape will hurt her career and offers her to buy the tape to prevent it from being released. This doesn't seem like blackmail at all, and the only change was to the moral (and legal) status of releasing the tape, not to the trade.
I still classify it as blackmail. Something similar to this happened to Cameron Diaz, although the rights to resell the photos were questionable. She posed topless in some bondage shots for a magazine, but they were never printed. The photographer kept the shots and the recording of the photo shoot for ten years until one of the Charlie's Angels films was about to come out. He offered them to her for a couple of million or he would sell them to the highest bidder. The courts didn't buy that he was just offering her first right of refusal and sentenced him for attempted grand theft (blackmail), forgery, and perjury (for modifying release forms and lying about it). Link
Are you sure you aren't just pattern matching to similarity to known types of blackmail? Do you think it would be useful for an AI to classify it the same way (which was the starting point of this thread)? Your link doesn't go into much detail, but it seems like he was convicted because he was lying and making up the negative consequences he threatened her with, and like he was going out of his way to make the consequences of selling to someone else as bad as possible rather than maximizing revenue (or at least making her believe so). That would qualify this case as blackmail under the definition above, unlike either of our hypothetical examples.
But you're forgetting the man's best option: get lots of money from the movie star and get a smaller amount from the press. Edit: Ah, not lump-sum payment. I can see how that would work then.
And be a lot more vulnerable to criminal charges for the blackmail.
I'm assuming that the movie star is at least reasonably smart. The first thing that comes to mind is periodic payments that decrease over time, with the value hovering just above what magazines are willing to pay + bragging rights, since people are less impressed with a 10 year old sex tape than a brand new one. Eventually the payments would be stopped when either the man has more to lose from releasing the tape than staying quiet (eg, he's settled down and married now) or the movie star values money more than the loss of prestige from a scandal (eg, another scandal breaks or she stops getting roles anyway). I'm sure there are other ways to solve the problem as well, but regardless it's a technical hurdle rather than an absolute one.
Continuum behaviors are discussed in some detail by Schelling, and interestingly they can be used by both parties. Here they make the blackmail more effective. If the payment is lump-sum, the blackmailer can't be trusted, and so the movie star won't pay. The continuous payment option gives her a way to pay the blackmailer and expect him to stay quiet, which makes her more vulnerable to blackmail in the first place. Continuous options can also be used to derail threats, when the person being threatened can act incrementally and there is no bright line to force the action (assuming the threat is to carry out a single action).
Really? I hope not. Sounds like a very silly idea to me. Does that just mean unresponsive to coercion? Pretending like you never heard it? Always refusing to give in to a threat? I suppose one could program that into an AI, and when someone says to the AI, "your money or your life", we'll get a dead AI. Why not just program your AI to self-destruct when threatened? Sometimes, knuckling under is the right call.
...and sometimes choosing to die is.
Sometimes. As long as it's only sometimes, maybe you're better off not always driving off the cliff. I see some people recommending broadcasting your precommitment to not knuckle under to blackmail. Not so good when you run into a blackmailer with a precommitment to always follow through. And not so good when the other guy isn't certain about your precommitment. Or if he is just a spiteful prick. Never underestimate the power of spite.
Book suggestion: Schelling's Strategy of Conflict. Immunity to blackmail comes from the potential blackmailers knowing that you wouldn't give in to their demands, so they don't do it. Many decision-making procedures can be constructed that have this property - for example, you could just make a "don't give in bot" that never gives in to blackmail, and distribute its source code to would-be blackmailers.
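A toy sketch of the "don't give in bot" idea, under loudly stated assumptions: the payoff numbers are made up, the blackmailer can inspect the target's policy by calling it, and a blackmailer who threatens a non-payer always follows through. The "committed" blackmailer is included because that case is raised elsewhere in this thread; none of these names come from Schelling.

```python
PAYMENT = 10  # hypothetical cost to the target of giving in
HARM = 50     # hypothetical cost to the target if the threat is carried out


def dont_give_in_bot(threatened):
    """Never pays, whatever happens."""
    return False


def always_pay_bot(threatened):
    """Always pays when threatened."""
    return threatened


def predicting_blackmailer(target_policy):
    """Reads the target's 'source code' and threatens only if it would pay."""
    return target_policy(True)


def committed_blackmailer(target_policy):
    """Threatens everyone, regardless of their policy."""
    return True


def target_outcome(target_policy, blackmailer):
    """Target's payoff: 0 if never threatened, -PAYMENT if it pays,
    -HARM if it refuses (assuming the threat is then carried out)."""
    if not blackmailer(target_policy):
        return 0
    return -PAYMENT if target_policy(True) else -HARM


for bot in (dont_give_in_bot, always_pay_bot):
    for blackmailer in (predicting_blackmailer, committed_blackmailer):
        print(bot.__name__, blackmailer.__name__, target_outcome(bot, blackmailer))
```

In this toy setup the refusing bot does best against a blackmailer who predicts it and worst against one who is committed, which is the disagreement in the surrounding comments.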
I suspect this is wishful thinking. It would be nice to use it (if it exists), so assume it exists and go looking for it: either you can't find it, and so it doesn't matter whether it existed or not in the first place, or you found it, and now you know it existed in the first place and what it is.
Blackmail and Pascal's Mugging are not the same. The goal is to make an AI immune to Pascal's Mugging, and most humans are inherently immune to Pascal's Mugging/Wager. There is no decision theory that makes you immune to all forms of blackmail, since it is rational to give in in many cases depending on how the payouts are set up.
On this theme, there is a lot of talk about making decision theories or utility functions that are immune to Pascal's mugging, but isn't the whole "FAI scenario, give us money" pitch basically a Pascal's mugging?

Yeah, it is if you completely ignore the unique and defining feature of all Pascal's mugging, the conditionality of the reward on your assessed probability... ಠ_ಠ

I don't understand this. In the original Pascal's wager, it is suggested that you become Christian, start going through the motions, and this will eventually change your belief so that you think God's existence is likely. But this is not a feature of the generalized Pascal's mugging, at least not as described on the wiki page. (Also, given that this is the Stupid Questions Thread, I feel your comment would be improved by more explanation and less disapproving emoticons...)
No, in the original Pascal's wager you are advised to believe in God, as God would judge you based on your beliefs (i.e. your assessed probability of existence). However, that doesn't seem to be the form of the Pascal's mugging, which is also discussed quite a bit on this site. The conditionality of reward or punishment on subjective probability estimates doesn't seem to be the point at which decision theories break down, but rather they seem to break down with very small probabilities of very large effects.
Actually, Pascal did advise "going through the motions" as a solution to being unable to simply will oneself into belief. The wager might not be strong apologetics, but I give Pascal some credit for his grasp of cognitive dissonance.
I stand corrected.
Yes, this is what I was trying to say. I see how the phrase "conditionality of the reward on your assessed probability" could describe Pascal's Wager, but not how it could describe Pascal's Mugging.
More concisely than the original/gwern: The algorithm used by the mugger is roughly:
  • Find your assessed probability of the mugger being able to deliver whatever reward, being careful to specify the size of the reward in the conditions for the probability.
  • Offer an exchange such that U(payment to mugger) < U(reward) * P(reward).
This is an issue for AI design because if you use a prior based on Kolmogorov complexity then it's relatively straightforward to find such a reward, because even very large numbers have relatively low complexity, and therefore relatively high prior probabilities.
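A toy sketch of the mugger's search described above. Everything numeric here is an assumption: utility is taken to be linear in reward size, rewards have the form 2**(2**k), and the "complexity" of a reward is crudely approximated as a fixed overhead plus the number of bits needed to write k. Real Kolmogorov complexity is uncomputable, so this only illustrates why a short description of a huge number can outrun its prior penalty.

```python
import math

# Hypothetical cost, in bits, of describing "a mugger who really can pay out".
OVERHEAD_BITS = 100


def log2_reward(k):
    """log2 of the promised reward 2**(2**k)."""
    return 2 ** k


def log2_prior(k):
    """log2 of a crude complexity prior: 2**-(overhead + bits to write k)."""
    return -(OVERHEAD_BITS + k.bit_length())


def smallest_winning_offer(payment):
    """Smallest k with U(payment) < U(reward) * P(reward),
    compared in log2 space and assuming utility linear in size."""
    k = 1
    while math.log2(payment) >= log2_reward(k) + log2_prior(k):
        k += 1
    return k


print(smallest_winning_offer(5))  # 7
```

The reward's log grows like 2**k while the prior's penalty grows only like log k, so the search terminates quickly even with a large overhead.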
When you have a bunch of other data, you should not be interested in the Kolmogorov complexity of the number alone; you are interested in the Kolmogorov complexity of the other data concatenated with that number. E.g. you should not assign a higher probability to Bill Gates having made precisely $100,000,000,000 than to some random-looking value: given the other sensory input you got (from which you derived your world model), there are random-looking values that give an even lower Kolmogorov complexity of the total sensory input, but you wouldn't be able to find those because Kolmogorov complexity is uncomputable. You end up mis-estimating Kolmogorov complexity when you don't have it given to you on a platter, pre-made. Actually, what you should use is algorithmic (Solomonoff) probability, like AIXI does, on the history of sensory input, to take a weighted sum among the world models that present you with the marketing spiel of the mugger. The shortest ones simply have the mugger make it up; then there will be the models where the mugger will torture beings if you pay and not torture if you don't. It's unclear what's going to come out of this and how it will pan out because, again, it is uncomputable. In the human approximation, you take what the mugger says as a privileged model, which is strictly speaking an invalid update (the probability jumps from effectively zero, for never having thought about it, to nonzero), and invalid updates come with the cost of being prone to losing money. Constructing the model directly from what the mugger says the model should be is a hack; at that point anything goes, and you can have another hack, of the strategic kind: do not apply this string->model hack to ultra-extraordinary claims without evidence. Edit: I meant weighted sum, not 'select'.
The mugging is defined as having conditionality; just read Bostrom's paper or Baumann's reply! That Eliezer did not explicitly state the mugger's simple algorithm, but instead implied it in his discussion of complexity and size of numbers, does not obviate this point.

I kind of wish talk about Newcomb's problem was presented in terms of source code and AI rather than the more common presentation, since I think it's much more obvious what is being aimed at when you think about it this way. Is there a reason people prefer the original version?

Most people aren't AIs or even programmers (though the latter are fairly common on LW).
Most people also aren't presented with Omega situations. The reason it's important to solve Newcomb's problem is so that we can make an AI that will respond to the incentives we give it to self-modify in ways we want it to.
Most people find the verbal descriptions easier to handle.
Most people are much more easily misled via verbal descriptions.
Maybe the right thing to do is to mix and match different presentations of the problem? I.e. one person might be all like "huh??" or "this is stupid" whenever Newcomb's problem is discussed, but be like "oh NOW I get it" when it's presented in terms of AI and source code. Somebody else might be the opposite.
Orthonormal does a pretty good job of source-code-esque considerations. Helped me out.

If I'm a moral anti-realist, do I necessarily believe that provably Friendly AI is impossible? When defining friendly, consider Archimedes' Chronophone, which suggests that friendly AI would (should?) be friendly to just about any human who ever lived.

moral anti-realism - there are no (or insufficient) moral facts to resolve all moral disputes an agent faces.

People can hold different moral views. Sometimes these views are opposed and any compromise would be called immoral by at least one of them. Any AI that enforced such a compromise, would be called unFriendly by at least one of them. Even for a moral realist (and I don't think well of that position), the above remains true, because people demonstrably have irreconcilably different moral views. If you're a moral realist, you have the choice of: 1. Implement objective moral truth however defined, and ignore everyone's actual moral feelings. In which case FAI is irrelevant - if the moral truth tells you to be unFriendly, you do it. 2. Implement some pre-chosen function - your own morals, or many people's morals like CEV, or some other thing that does not depend on moral truth. If you're a moral anti-realist, you can only choose 2, because no moral truth exists. That's the only difference stemming from being a moral realist or anti-realist. Does this mean that Friendly-to-everyone AI is impossible in moral anti-realism? Certainly, because people have fundamental moral disagreements. But moral realism doesn't help! It just adds the option of following some "moral facts" which some or all humans disagree with, which is no better in terms of Friendliness than existing options. (If all humans agreed with some set of purported moral facts, people wouldn't have needed to invent the concept of moral facts in the first place.)
The existence of moral disagreement, standing alone, is not enough to show moral realism is false. After all, scientific disagreement doesn't show physical realism is false. Further, I am confused by your portrayal of moral realists. Presumably, the reality of moral facts would show that people acting contrary to those facts were making a mistake, much like people who thought "Objects in motion will tend to come to a stop" were making a mistake. It seems strange to call correcting that mistake "ignoring everyone's actual scientific feelings." Likewise, if I am unknowingly doing wrong, and you can prove it, I would not view that correction as ignoring my moral feelings - I want to do right, not just think I am doing right. In short, I think that the position you are labeling "moral realist" is just a very confused version of moral anti-realism. Moral realists can and should reject the idea that the mere existence at any particular moment of moral disagreement is useful evidence of whether there is one right answer. In other words, a distinction should be made between the existence of moral disagreement and the long-term persistence of moral disagreement.
I didn't say that it was. Rather I pointed out the difference between morality and Friendliness. For an AI to be able to be Friendly towards everyone requires not moral realism, but "friendliness realism" - which is basically the idea that a single behavior of the AI can satisfy everyone. This is clearly false if "everyone" means "all intelligences including aliens, other AIs, etc." It may be true if we restrict ourselves to "all humans" (and stop humans from diversifying too much, and don't include hypothetical or far-past humans). I, personally, believe the burden of proof is on those who believe this to be possible to demonstrate it. My prior for "all humans" says they are a very diverse and selfish bunch and not going to be satisfied by any one arrangement of the universe. Regardless, moral realism and friendliness realism are different. If you built an objectively moral but unFriendly AI, that's the scenario I discussed in my previous comment - and people would be unhappy. OTOH, if you think a Friendly AI is by logical necessity a moral one (under moral realism), that's a very strong claim about objective morals - a claim that people would perceive an AI implementing objective morals as Friendly. This is a far stronger claim than that people who are sufficiently educated and exposed to the right knowledge will come to agree with certain universal objective morals. A Friendly AI means one that is Friendly to people as they really are, here and now. (As I said, to me it seems very likely that an AI cannot in fact be Friendly to everyone at once.)
I think we are simply having a definitional dispute. As the term is used generally, moral realism doesn't mean that each agent has a morality, but that there are facts about morality that are external to the agent (i.e. objective). Now, "objective" is not identical to "universal," but in practice, objective facts tend to cause convergence of beliefs. So I think what I am calling "moral realism" is something like what you are calling "Friendliness realism." Lengthening the inferential distance further is that realism is a two place word. As you noted, there is a distinction between realism(Friendliness, agents) and realism(Friendliness, humans). That said, I do think that "people would perceive an AI implementing objective morals as Friendly" if I believed that objective morals exist. I'm not sure why you think that's a stronger claim than "people who are sufficiently educated and exposed to the right knowledge will come to agree with certain universal objective morals." If you believed that there were objective moral facts and knew the content of those facts, wouldn't you try to adjust your beliefs and actions to conform to those facts, in the same way that you would adjust your physical-world beliefs to conform to objective physical facts?
That seems likely. If moral realists think that morality is a one-place word, and anti-realists think it's a two-place word, we would be better served by using two distinct words. It is somewhat unclear to me what moral realists are thinking of, or claiming, about whatever it is they call morality. (Even after taking into account that different people identified as moral realists do not all agree on the subject.) I defined 'Friendliness (to X)' as 'behaving towards X in the way that is best for X in some implied sense'. Obviously there is no Friendliness towards everyone, but there might be Friendliness towards humans: then "Friendliness realism" (my coining) is the belief that there is a single Friendly-towards-humans behavior that will in fact be Friendly towards all humans. Whereas Friendliness anti-realism is the belief that no one behavior would satisfy all humans, and it would inevitably be unFriendly towards some of them. Clearly this discussion assumes many givens. Most importantly, 1) what exactly counts as being Friendly towards someone (are we utilitarian? what kind? must we agree with the target human as to what is Friendly towards them? If we influence them to come to like us, when is that allowed?). 2) what is the set of 'all humans'? Do past, distant, future expected, or entirely hypothetical people count? What is the value of creating new people? Etc. My position is that: 1) for most common assumed answers to these questions, I am a "Friendliness anti-realist"; I do not believe any one behavior by a superpowerful universe-optimizing AI would count as Friendliness towards all humans at once. And 2), insofar as I have seen moral realism explained, it seems to me to be incompatible with Friendliness realism. But it's possible some people mean something entirely different by "morals" and by "moral realism" than what I've read. That's a tautology: yes I would. But, the assumption is not valid. Even if you assume there exist objective moral facts (whatever
No. FAI is supposed to implement an extrapolated version of mankind's combined values, not search for an objectively defined moral code to implement. Also: Eliezer has argued that even from its programmers' perspective, some elements of a FAI's moral code (Coherent Extrapolated Volition) will probably look deeply immoral. (But will actually be OK.)
Why does the moral anti-realist think "an extrapolated version of mankind's combined values" exists or is capable of being created? For the moral realists, the answer is easy - the existence of objective moral facts shows that, in principle, some moral system that all humans could endorse could be discovered/articulated. As an aside, CEV is a proposed method for finding what an FAI would implement. I think that one could think FAI is possible even if CEV were the wrong track for finding what FAI should do. In short, CEV is not necessarily part of the definition of Friendly.
Well, to assert that "an extrapolated version of mankind's combined values can be created" doesn't really assert much, in and of itself... just that some algorithm can be implemented that takes mankind's values as input and generates a set of values as output. It seems pretty likely that a large number of such algorithms exist. Of course, what CEV proponents want to say, additionally, is that some of these algorithms are such that their output is guaranteed to be something that humans ought to endorse. (Which is not to say that humans actually would endorse it.) It's not even clear to me that moral realists should believe this. That is, even if I posit that objective moral facts exist, it doesn't follow that they can be derived from any algorithm applied to the contents of human minds. But I agree with you that it's still less clear why moral anti-realists should believe it.
No. I mean, I'm unsure about the possibility of provably Friendly AI but it's not obvious that anti-realism makes it impossible. Moral realism, were it the case, might make things easier but it's hard for me to imagine what that world looks like.
Let us define a morality function F() as taking as input x=the factual circumstances an agent faces in making a decision, and outputting y=the decision the agent makes. It is fairly apparent that practically every agent has an F(). So ELIEZER(x) is the function that describes what Eliezer would choose in situation x. Next, define GROUP{} as the set of morality functions run by all the members of that group. Let us define CEV() as the function that takes as input a morality function or set of morality functions and outputs a morality function that is improved/made consistent/extrapolated from the input. I'm not asserting the actual CEV formulation will do that, but it is a gesture towards the goal that CEV() is supposed to solve. For clarity, let the output of CEV(F()) = CEV.F(). Thus, CEV.ELIEZER() is the extrapolated morality from the morality Eliezer is running. In parallel, CEV.AMERICA() (which is the output of CEV(AMERICA{})) is the single moral function that is the extrapolated morality of everyone in the United States. If CEV() exists, an AI considering/implementing CEV.JOHNDOE() is Friendly to John Doe. Likewise, CEV.GROUP() leads to an AI that is Friendly to every member of the group. For FAI to be possible, CEV() must produce an output for (A) any morality function or (B) any set of morality functions. Further, for provable FAI, it must be possible to (C) mathematically show the output of CEV() before turning on the AI. If moral realism is false, why is there reason to think (A), (B), or (C) are true?
Any set? Why not just require that CEV.HUMANITY() be possible? It seems like there are some sets of morality functions G that would be impossible (G={x, ~x}?). Human value is really complex, so it's a difficult thing to a) model it and b) prove the model. Obviously I don't know how to do that; no one does yet. If moral realism were true and morality were simple and knowable I suppose that would make the job a lot easier... but that doesn't seem like a world that is still possible. Conversely, morality could be both real and unknowable and impossibly complicated, and then we'd be in even worse shape because learning about human values wouldn't even tell us how to do Friendly AI! Maybe if you gave me some idea of what your alternative to anti-realism would look like I could answer better. In short: Friendliness is really hard; part of the reason it seems so hard to me might have to do with my moral anti-realism, but I have trouble imagining plausible realist worlds where things are easier.
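The worry about sets like G={x, ~x} can be made concrete with a toy sketch. This is emphatically not CEV(): the aggregator below (the hypothetical name `unanimous_choice`) just returns a decision when every morality function in the group agrees and gives up otherwise, which is enough to show why a group containing both P and ~P is a problem for any rule that must satisfy every member. All names and the string-valued decisions are assumptions made for illustration.

```python
from typing import Callable, Hashable, Iterable, Optional

# A "morality function" in the thread's sense: situation in, decision out.
Situation = Hashable
Decision = Hashable
MoralityFn = Callable[[Situation], Decision]


def unanimous_choice(group: Iterable[MoralityFn],
                     situation: Situation) -> Optional[Decision]:
    """Toy stand-in for an aggregator (NOT CEV): return the decision that
    every function in the group outputs, or None if they disagree."""
    decisions = {f(situation) for f in group}
    return decisions.pop() if len(decisions) == 1 else None


# Two hypothetical agents with directly opposed outputs.
def always_p(situation: Situation) -> Decision:
    return "P"


def always_not_p(situation: Situation) -> Decision:
    return "not P"


agreeing = unanimous_choice([always_p, always_p], "q")
conflicting = unanimous_choice([always_p, always_not_p], "q")
```

Here `agreeing` is "P" and `conflicting` is None. A real extrapolation procedure would presumably do something much subtler than demand unanimity, but it still has to say what happens in the second case.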
First, a terminology point: CEV.HUMANITYCURRENTLYALIVE() != CEV.ALLHUMANITYEVER(). For the anti-realist, CEV.HUMANITYCURRENTLYALIVE() is massively more plausible, and CEV.LONDON() is more plausible than that - but my sense is that this sentence depends on the anti-realist accepting some flavor of moral relativism. Second, it seems likely that fairly large groups (e.g. the population of London) already have some {P, ~P}. That's one reason to think making CEV() is really hard. I don't understand what proving the model means in this context. I don't understand why you talk about possibility. "Morality is true, simple, and knowable" seems like an empirical proposition: it just turns out to be false. It isn't obvious to me that simple moral realism is necessarily false in the way that 2+5=7 is necessarily true. How does the world look different if morality is real and inaccessible vs. not real? Pace certain issues about human appetites as objective things, I am an anti-realist - in case that wasn't clear.
Sure sure. But CEV.ALLHUMANITYEVER is also not the same as all CEV.ALLPOSSIBLEAGENTS. Some subroutines are probably inverted but there probably aren't people with fully negated utility functions from other people. Trade-offs needn't mean irreconcilable differences. Like I doubt there is anyone in the world who cares as much as you do about the exact opposite of everything you care about. Show with some confidence that it doesn't lead to terrible outcomes if implemented. I'm not sure that it is. But when I said "still" possible I meant that we have more than enough evidence to rule out the possibility that we are living in such a world. I didn't mean to imply any beliefs about necessity. That said I am pretty confused about what it would mean for there to be objective facts about right and wrong. Usually I think true beliefs are supposed to constrain anticipated experience. Since moral judgments don't do that... I'm not quite sure I know what moral realism would really mean. I imagine it wouldn't look different but since there is no obvious way of proving a morality logically or empirically I can't see how moral realists would be able to rule it out. Oh I understand that. I just meant that when you ask: I'm wondering "Opposed to what?". I'm having trouble imagining the person for whom the prospects of Friendly AI are much brighter because they are a moral realist.
It seems to me that moral realists have more reason to be optimistic about provably friendly AI than anti-realists. The steps to completion are relatively straightforward: (1) Rigorously describe the moral truths that make up the true morality. (2) Build an AGI that maximizes what the true morality says to maximize. I think Alice, a unitary moral realist, believes she is justified in saying: "Anyone whose morality function does not output Q in situation q is a defective human, roughly analogous to the way any human who never feels hungry is defective in some way." Bob, a pluralist moral realist, would say: "Anyone whose morality function does not output from the set {Q1, Q2, Q3} in situation q is a defective human." Charlie, a moral anti-realist, would say Alice and Bob's statements are both misleading, being historically contingent, or incapable of being evaluated for truth, or some other problem. Consider the following statement: "Every (moral) decision a human will face has a single choice that is most consistent with human nature." To me, that position implies that moral realism is true. If you disagree, could you explain why? What is at stake in the distinction? A set of facts that cannot have causal effect might as well not exist. Compare error theorists to inaccessibility moral realists - the former say value statements cannot be evaluated for truth, the latter say value statements could be true, but in principle, we will never know. For any actual problem, both schools of thought recommend the same stance, right?
Is step 1 even necessary? Presumably in that universe one could just build an AGI that was smart enough to infer those moral truths and implement them, and turn it on secure in the knowledge that even if it immediately started disassembling all available matter to make prime-numbered piles of paperclips, it would be doing the right thing. No?
That's an interesting point. I suppose it depends on whether a moral realist can think something can be morally right for one class of agents and morally wrong for another class. I think such a position is consistent with moral realism. If that is a moral realist position, then the AI programmer should be worried that an unconstrained AI would naturally develop a morality function different than CEV.HUMANITY(). In other words, when we say moral realist, are we using a two part word with unfortunate ambiguity between realism(morality, agent) and realism(morality, humans)? Wow, I never considered whether this was part of the inferential distance in these types of discussions.
Well, to start with, I would say that CEV is beside the point here. In a universe where there exist moral truths that make up the true morality, if what I want is to do the right thing, there's no particular reason for me to care about anyone's volition, extrapolated or otherwise. What I ought to care about is discerning those moral truths. Maybe I can discern them by analyzing human psychology, maybe by analyzing the human genome, maybe by analyzing the physical structure of carbon atoms, maybe by analyzing the formal properties of certain kinds of computations, I dunno... but whatever lets me figure out those moral truths, that is what I ought to be attending to in such a universe, and if humanity's volition conflicts with those truths, so much the worse for humanity. So the fact that an unconstrained AI might -- or even is guaranteed to -- develop a morality function different than CEV.HUMANITY() is not, in that universe, a reason not to build an unconstrained AI. (Well, not a moral reason, anyway. I can certainly choose to forego doing the right thing in that universe if it turns out to be something I personally dislike, but only at the cost of behaving immorally.) But that's beside your main point, that even in that universe the moral truths of the universe might be such that different behaviors are most right for different agents. I agree with this completely. Another way of saying it is that total rightness is potentially maximized when different agents are doing (specific) different things. (This might be true in a non-moral-realist universe as well.) Actually, it may be useful here to be explicit about what we think a moral truth is in that universe. That is, is it a fact about the correct state of the world? Is it a fact about the correct behavior of an agent in a given situation, independent of consequences? Is it a fact about the correct way to be, regardless of behavior or consequences? Is it something else?

Is there a better way to read Less Wrong?

I know I can put the sequences on my kindle, but I would like to find a way to browse Discussion and Main in a more useable interface (or at least something that I can customize). I really like the threading organization of newsgroups, and I read all of my .rss feeds and mail through Gnus in emacs. I sometimes use the Less Wrong .rss feed in Gnus, but this doesn't allow me to read the comments. Any suggestions?

Also, if any other emacs users are interested, I would love to make a lesswrong-mode package. I'm not a very good lisp hacker, but I think it would be a fun project.

How do I put the sequences on a Kindle?
Thumbs up from me for lesswrong-mode!
I attempted this today but without an API (LW's fork of the reddit codebase looks pretty old) I don't think I can get very far.

Could someone please explain to me exactly, precisely, what a utility function is? I have seen it called a perfectly well-defined mathematical object as well as not-vague, but as far as I can tell, no one has ever explained what one is, ever.

The words "positive affine transformation" have been used, but they fly over my head. So the For Dummies version, please.

Given an agent with some set X of choices, a utility function u maps from the set X to the real numbers R. The mapping is such that the agent prefers x1 to x2 if and only if u(x1) > u(x2). This completes the definition of an ordinal utility function.

A cardinal utility function satisfies additional conditions which allow easy consideration of probabilities. One way to state these conditions is that probabilities defined on X are required to be linear over u. This means that we can now consider probabilistic mixes of choices from X (with probabilities summing to 1). For example, one valid mix would be 0.25 probability of x1 with 0.75 probability of x2, and a second valid mix would be 0.8 probability of x3 with 0.2 probability of x4. A cardinal utility function must satisfy the condition that the agent prefers the first mix to the second mix if and only if 0.25u(x1) + 0.75u(x2) > 0.8u(x3) + 0.2u(x4).

Cardinal utility functions can also be formalized in other ways. E.g., another way to put it is that the relative differences between utilities must be meaningful. For instance, if u(x1) - u(x2) > u(x3) - u(x4), then the agent prefers x1 to x2 more than it prefers x3 to x4. (This property need not hold for ordinal utility functions.)

Other notes:

  • In my experience, ordinal utility functions are normally found in economics, whereas cardinal utility functions are found in game theory (where they are essential for any discussion of mixed strategies). Most, if not all, discussions on LW use cardinal utility functions.

  • The VNM theorem is an incredibly important result on cardinal utility functions. Basically, it shows that any agent satisfying a few basic axioms of 'rationality' has a cardinal utility function. (However, we know that humans don't satisfy these axioms. To model human behavior, one should instead use the descriptive prospect theory.)

  • Beware of erroneous straw characterizations of utility functions (recent example). Remember the VNM theorem—very fru
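Wiktionary seems to have a decent definition. It boils down to listing all possible outcomes and ordering them according to your preferences. The words "positive affine transformation" reflect the fact that rescaling a utility function as u' = a*u + b with a > 0 leaves the agent's preferences unchanged, including preferences over probabilistic mixes. If only the ordering of sure outcomes matters (the ordinal case), any assignment of numbers that results in the same ordering is equivalent.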
Isn't there a Wikipedia article on them?
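For the "For Dummies" request, the cardinal-utility condition described above can be sketched in a few lines of Python. This is only a toy illustration: the outcome names match the x1..x4 used in the explanation, but the utility numbers are made up, and `expected_utility` and `prefers` are hypothetical helper names. It shows that a positive affine transformation (u' = 3u + 5) keeps the preference between the two mixes, while an order-preserving but non-affine transformation (square root) keeps the ordering of sure outcomes yet flips the preference between mixes.

```python
import math


def expected_utility(mix, u):
    """mix: dict outcome -> probability (summing to 1); u: dict outcome -> utility."""
    return sum(p * u[x] for x, p in mix.items())


def prefers(mix_a, mix_b, u):
    """True if an expected-utility maximizer with utilities u prefers mix_a."""
    return expected_utility(mix_a, u) > expected_utility(mix_b, u)


# Hypothetical utilities for the four choices.
u = {"x1": 0.0, "x2": 10.0, "x3": 6.0, "x4": 9.0}

first_mix = {"x1": 0.25, "x2": 0.75}   # 0.25 u(x1) + 0.75 u(x2)
second_mix = {"x3": 0.8, "x4": 0.2}    # 0.8 u(x3) + 0.2 u(x4)

# Positive affine transformation: same preferences over mixes.
u_affine = {x: 3.0 * v + 5.0 for x, v in u.items()}

# Order-preserving but not affine: same ranking of sure outcomes,
# but preferences over mixes can change.
u_sqrt = {x: math.sqrt(v) for x, v in u.items()}

same_ranking = sorted(u, key=u.get) == sorted(u_sqrt, key=u_sqrt.get)
original_pref = prefers(first_mix, second_mix, u)
affine_pref = prefers(first_mix, second_mix, u_affine)
sqrt_pref = prefers(first_mix, second_mix, u_sqrt)
```

With these numbers `original_pref` and `affine_pref` are both True, `same_ranking` is True, and `sqrt_pref` is False. That is the sense in which a cardinal utility function is only defined "up to positive affine transformation", whereas an ordinal one survives any order-preserving relabeling.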

So, in the spirit of stupid (but nagging) questions:

The sequences present a convincing case (to me at least) that MWI is the right view of things, and that it is the best conclusion of our understanding of physics. Yet I don't believe it, because it seems to be in direct conflict with the fact of ethics: if all I can do is push the badness out of my path, and into some other path, then I can't see how doing good things matters. I can't change the fundamental amount of goodness, I can just push it around. Yet it matters that I'm good and not bad.

The 'keep y...

because it seems to be in direct conflict with the fact of ethics

Actual answers aside, as a rationalist, this phrase should cause you to panic.

What do you mean by in conflict? Believing one says nothing about the other. You're not "pushing" anything around. If you act good in one set of universes, that is a set of universes made better by your actions. If you act bad in another, the same thing. Acting good does not cause other universes to become bad.

People making decisions are not quantum events. When a photon could either end up in a detector or not, there are branches where it does and branches where it doesn't. But when you decide whether or not to do something good, this decision is being carried out by neurons, which are big enough that quantum events do not influence them much. This means that if you decide to do something good, you probably also decided to do the same good thing in the overwhelming majority of Everett branches that diverge from when you started considering the decision.

This may be true, but I don't think anyone knows for sure, and it seems likely to me that the brain has the property of sensitivity to initial conditions, meaning that it's likely to do different stuff in different Everett branches. Yvain recently asked about this on his blog -- he tends to agree with you: More on-topic for the grandparent: Greg Egan's novella Oracle talks about the ethical issue of bad stuff happening in other Everett branches.

The fact that I can reliably multiply numbers shows that at least some of my decisions are deterministic.

To the extent that I make ethical decisions based on some partially deterministic reasoning process, my ethical decisions are not chaotic.

If, due to chaos, I have a probability p of slapping my friends instead of hugging them, then Laplace's law of succession tells me that p is less than 1%.
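The arithmetic behind that bound can be checked with a short sketch. It assumes the standard rule of succession, (s + 1) / (n + 2) after s occurrences in n trials, and assumes (this is not stated in the comment) that the track record is zero slaps in n observed greetings. The function names are hypothetical.

```python
from fractions import Fraction


def rule_of_succession(occurrences, trials):
    """Laplace's estimate of the probability of the event on the next trial."""
    return Fraction(occurrences + 1, trials + 2)


def min_trials_for_bound(bound):
    """Smallest n such that zero occurrences in n trials gives an estimate
    strictly below `bound`."""
    n = 0
    while rule_of_succession(0, n) >= bound:
        n += 1
    return n


# Assumed scenario: never slapped a friend in n greetings.
needed = min_trials_for_bound(Fraction(1, 100))
estimate_after_needed = rule_of_succession(0, needed)
```

Under these assumptions `needed` is 99 and `estimate_after_needed` is 1/101, so anyone with roughly a hundred or more slap-free greetings behind them gets an estimate below 1%.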

There must be chaotic amplification of quantum events going on. Any macroscopic system at finite temperature will be full of quantum events, like a molecule in an excited state returning to its ground state. The quantum randomness is a constant source of "noise" which normally averages out, but sometimes there will be fluctuations away from the mean, and sometimes they will be amplified into mesoscopic and macroscopic differences. This must be true, but it would be best to have a mathematical demonstration, e.g. that the impact of quantum fluctuations on the transfer of heat through an atmosphere will amplify into macroscopically different weather patterns on a certain timescale.
I have taken lots of decisions based on random bits from Fourmilab or (especially before finding LessWrong -- nowadays I only do that when deciding which password to use and stuff like that).

The sequences present a convincing case (to me at least) that MWI is the right view of things, and that it is the best conclusion of our understanding of physics.

Just a caution, here. The sequences only really talk about non-relativistic quantum mechanics (NRQM), and I agree that MWI is the best interpretation of this theory. However, NRQM is false, so it doesn't follow that MWI is the "right view of things" in the general sense. Quantum field theory (QFT) is closer to the truth, but there are a number of barriers to a straightforward importation of MWI into the language of QFT. I'm reasonably confident that an MWI-like interpretation of QFT can be constructed, but it does not exist in any rigorous form as of yet (as far as I am aware, at least). You should be aware of this before committing yourself to the claim that MWI is an accurate description of the world, rather than just the best way of conceptualizing the world as described by NRQM.

This is important if true, and I would like to know more. What are the barriers? On the other hand, my understanding is that QFT itself doesn't exist in a rigorous form yet, either.

This article (PDF) gives a nice (and fairly accessible) summary of some of the issues involved in extending MWI to QFT. See sections 4 and 8 in particular. Their focus in the paper is wavefunction realism, but given that MWI (at least the version advocated in the Sequences) is committed to wavefunction realism, their arguments apply. They offer a suggestion of the kind of theory that they think can replace MWI in the relativistic context, but the view is insufficiently developed (at least in that paper) for me to fully evaluate it.

A quick summary of the issues raised in the paper:

  • In NRQM, the wave function lives in configuration space, but there is no well-defined particle configuration space in QFT since particle number is not conserved and particles are emergent entities without precisely defined physical properties.

  • A move to field configuration space is unsatisfactory because quantum field theories admit of equivalent descriptions using many different choices of field observable. Unlike NRQM, where there are solid dynamical reasons for choosing the position basis as fundamental, there seems to be no natural or dynamically preferred choice in QFT, so a choice of a particular f

...
Thanks for that; it's quite an interesting article, and I'm still trying to absorb it. However, one thing that seems pretty clear to me is that for EY's intended philosophical purposes, there really is no important distinction between "wavefunction realism" (in the context of NRQM) and "spacetime state realism" (in the context of QFT). Especially since I consider this post to be mostly wrong: locality in configuration space is what matters, and configuration space is a vector space (specifically a Hilbert space) -- there is no preferred (orthonormal) basis. If the "problem" is merely that certain integrals are divergent, then I agree. No one says that the fact that diverges shows a lack of rigor in real analysis! What concerns me is whether any actual mathematical lies are being told -- such as integrals being assumed to converge when they haven't yet been proved to do so. Or something like the early history of the Dirac delta, when physicists unashamedly spoke of a "function" with properties that a function cannot, in fact, have. If QFT is merely a physical lie -- i.e., "not a completely accurate description of the universe" -- and not a mathematical one, then that's a different matter, and I wouldn't call it an issue of "rigor".
I'm a little unclear about what EY's intended philosophical purposes are in this context, so this might well be true. One possible problem worth pointing out is that spacetime state realism involves an abandonment of a particular form of reductionism. Whether or not EY is committed to this form of reductionism somebody more familiar with the sequences than I would have to judge. According to spacetime state realism, the physical state of a spacetime region is not supervenient on the physical states of its subregions, i.e. the physical state of a spacetime region could be different without any of its subregions being in different states. This is because subregions can be entangled with one another in different ways without altering their local states. This is not true of wavefunction realism set in configuration space. There, the only way a region of configuration space could have different physical properties is if some of its subregions had different properties. Also, I think it's possible that the fact that the different "worlds" in spacetime state realism are spatially overlapping (as opposed to wavefunction realism, where they are separated in configuration space) might lead to interesting conceptual differences between the two interpretations. I haven't thought about this enough to give specific reasons for this suspicion, though. I'm not sure exactly what you're saying here, but if you're rejecting the claim that MWI privileges a particular basis, I think you're wrong. Of course, you could treat configuration space itself as if it had no preferred basis, but this would still amount to privileging position over momentum. You can't go from position space to momentum space by a change of coordinates in configuration space. Configuration space is always a space of possible particle position configurations, no matter how you transform the coordinates. I think you might be conflating configuration space with the Hilbert space of wavefunctions on configuration sp
As I read him, he mainly wants to make the point that "simplicity" is not the same as "intuitiveness", and the former trumps the latter. It may seem more "humanly natural" for there to be some magical process causing wavefunction collapse than for there to be a proliferation of "worlds", but because the latter doesn't require any additions to the equations, it is strictly simpler and thus favored by Occam's Razor. Yes, sorry. What I actually meant by "configuration space" was "the Hilbert space that wavefunctions are elements of". That space, whatever you call it ("state space"?), is the one that matters in the context of "wavefunction realism". (This explains an otherwise puzzling passage in the article you linked, which contrasts the "configuration space" and "Hilbert space" formalisms; but on the other hand, it reduces my credence that EY knows what he's talking about in the QM sequence, since he doesn't seem to talk about the space-that-wavefunctions-are-elements-of much at all.) This is contrary to my understanding. I was under the impression that classical mechanics, general relativity, and NRQM had all by now been given rigorous mathematical formulations (in terms of symplectic geometry, Lorentzian geometry, and the theory of operators on Hilbert space respectively). The mathematician's standards are what interests me, and are what I mean by "rigor". I don't consider it a virtue on the part of physicists that they are unaware of or uninterested in the mathematical foundations of physics, even if they are able to get away with being so uninterested. There is a reason mathematicians have the standards of rigor they do. (And it should of course be said that some physicists are interested in rigorous mathematics.)
This is a very good post, but I wonder: one of the authors of the paper you cite is David Wallace, perhaps the most prominent proponent of the modern Everettian interpretation. He just published a new book called "The Emergent Multiverse", and he claims there is no problem unifying MWI with QFT because interactions within worlds are local and only states are nonlocal. I have yet to hear him mention any need for serious reformulation of anything in terms of MWI. You said you suspect this is necessary, but that you hope we can recover a similar MWI. Isn't it more reasonable to expect that at the Planck scale something else will explain the quantum weirdness? After all, if MWI fails on both probability and relativity, then there is no good reason to suspect that this interpretation is correct. Have you given any thought to Gerard 't Hooft's idea of cellular automata, which he claims salvage determinism, locality and realism?
When I talk about recovering MWI, I really just mean absorbing the lesson that our theory does not need to deliver determinate measurement results, and ad hoc tools for satisfying this constraint (such as collapse or hidden variables) are otiose. Of course, the foundations of our eventual theory of quantum gravity might be different enough from those of quantum theory that the interpretational options don't translate. How different the foundations will be depends on which program ends up working out, I suspect. If something like canonical quantum gravity or loop quantum gravity turns out to be the way to go, then I think a lot of the conceptual work done in interpreting NRQM and QFT will carry over. If string theory turns out to be on the right track, then maybe a more radical interpretational revision will be required. The foundations of string theory are now thought to lie in M-theory, and the nature of this theory is still pretty conceptually opaque. It's worth noting though that Bousso and Susskind have actually suggested that string theory provides a solid foundation for MWI, and that the worlds in the string theory landscape are the same thing as the worlds in MWI. See here for more on this. The paper has been on my "to read" list for a while, but I haven't gotten around to it yet. I'm skeptical but interested. I know of 't Hooft's cellular automata stuff, but I don't know much about it. Speaking from a position of admitted ignorance, I'm skeptical. I suspect the only way to construct a genuinely deterministic local realist theory that reproduces quantum statistics is to embrace superdeterminism in some form, i.e. to place constraints on the boundary conditions of the universe that make the statistics work out by hand. This move doesn't seem like good physics practice to me. Do you know if 't Hooft's strategy relies on some similar move?
't Hooft's latest paper is the first in which he maps a full QFT to a CA, and the QFT in question is a free field theory. So I think that in this case he evades Bell's theorem, quantum complexity theorems, etc, by working in a theory where physical detectors, quantum computers, etc don't exist, because interactions don't exist. It's like how you can evade the incompleteness theorems if your arithmetic only has addition but not multiplication. Elsewhere he does appeal to superselection / cosmological initial conditions as a way to avoid cat states (macroscopic superpositions), but I don't see that playing a role here. The mapping itself has something to do with focusing on the fractional part of particle momentum as finite, and avoiding divergences by focusing on a particular subspace. It's not a trivial result. But extending it to interacting field theory will require new ideas, e.g. making the state space of each individual cell in the CA into a Fock space, or permitting CTCs in the CA grid. Surely you need radical ingredients like that in order to recover the full quantum state space...
Aha, I see. So you do not share EY's view that MWI is "correct" and that the only problem it faces is recovering the Born Rule? I agree that obviously what will end up working will depend on what the foundations are :) I remember that paper by Bousso and Susskind; I even remember sending a mail to Susskind about it, while at the same time asking him for his opinion of 't Hooft's work. If I remember correctly, the paper was discussed at some length over at (can't remember the post) and it seemed that the consensus was that the authors had misinterpreted decoherence in some way. I don't remember the details, but the fact that the paper itself has not been mentioned or cited in any article I have read since then indicates to me that there must have been some serious error in it. Also, Susskind's answer regarding 't Hooft's work was illuminating. To paraphrase, he said he felt that 't Hooft might be correct, but since there are no predictions it was hard to hold a strong opinion either way on the matter. So it seems Susskind was not very sold on his own idea. Gerard 't Hooft actually does rely on what people call "superdeterminism", which I just call "full determinism", which I think is also a term 't Hooft likes more. At least that is what his papers indicate. He discusses this some in an article from 2008 in response to Simon Kochen and John Conway's Free Will Theorem. You might want to read the article: After that you might want to head on over to arXiv; 't Hooft has published 3 papers in the last 6 months on this issue and he seems more and more certain of it. He also addresses the objections in some notes in those papers. Link:
Depends on what you mean by rigorous. (OTOH, it's not fully compatible with general relativity, so we know it doesn't exactly describe the world -- or that GR doesn't, or that neither does.)
If you bug physicists enough, they will admit that the standard model has some problems, like the Landau pole. However, there are toy QFTs in 2 spatial dimensions that have models rigorous enough for mathematicians. That should be adequate for philosophical purposes.
I don't think the Landau pole can be characterized as an actual problem. It was considered a problem for strong interactions, but we now know that quantum chromodynamics is asymptotically free, so it does not have a Landau pole. The Landau pole for quantum electrodynamics is at an energy scale much, much higher than the Planck energy. We already know that we need new physics at the Planck scale, so the lack of asymptotic freedom in the Standard Model is not a real practical (or even conceptual) problem.
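To give a rough sense of the scales involved, here is a minimal sketch of the textbook one-loop estimate for the QED Landau pole. It assumes only the electron contributes to the running of the coupling; including the other charged particles lowers the estimate, so this is an illustration rather than a Standard Model calculation. The function name and constant names are made up for the example.

```python
import math

ALPHA = 1 / 137.035999   # fine-structure constant near the electron mass scale
M_E_EV = 0.51099895e6    # electron mass in eV
PLANCK_EV = 1.220890e28  # Planck energy in eV


def landau_pole_log10_ev(alpha=ALPHA, mu0_ev=M_E_EV):
    """log10 of the one-loop QED Landau pole energy in eV (electron loop only)."""
    # One-loop running: 1/alpha(mu) = 1/alpha(mu0) - (2 / (3*pi)) * ln(mu / mu0).
    # The coupling diverges where the right-hand side reaches zero,
    # i.e. at ln(mu / mu0) = 3*pi / (2*alpha).
    ln_ratio = 3 * math.pi / (2 * alpha)
    # Work in log10 throughout to keep the numbers manageable.
    return math.log10(mu0_ev) + ln_ratio / math.log(10)


pole = landau_pole_log10_ev()
planck = math.log10(PLANCK_EV)
print(f"log10(Landau pole / eV)   ~ {pole:.1f}")
print(f"log10(Planck energy / eV) ~ {planck:.1f}")
```

Under these assumptions the pole sits hundreds of orders of magnitude above the Planck energy, which is the sense in which it lies in a regime where QFT is already expected to fail.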
The Landau pole for QED goes away when coupled with QCD, but I believe another one appears with the Higgs field. If you don't like the question I'm answering, complain to Komponisto, not me. But what would you count as a conceptual problem?
I wasn't complaining to anyone. And I don't dislike the question. I was just adding some relevant information. Anyway, I did reply directly to komponisto as well. See the end of my long comment above. If we did not have independent evidence that QFT breaks down at the Planck scale (since gravity is not renormalizable), I might have considered the Landau pole a conceptual problem for QFT. But since it is only a problem in a domain where we already know QFT doesn't work, I don't see it that way.
I don't think that's the normal use of "conceptual problem." If physicists believe, as their verbiage seems to indicate, that QED is a real theory that is an approximation to reality, and they compute approximations to the numbers in QED, while QED is actually inconsistent, I would say that is an error and a paradigmatic example of a conceptual error. What does it mean to interpret an inconsistent theory?
There is the standard MWI advocacy that matches Eliezer's views. This is a critique of this advocacy, point by point. See especially Q14, re QFT. This gives a reason why MWI is not a useful object of study.
The first critique seems to criticize something different from what Eliezer says. It seems like the person quoted by the author did not express themselves clearly, and the critique attacks a wrong explanation. For example this part: For me, Eliezer's explanation of "blobs of amplitude" makes sense. There is a set of possible configurations, which at the beginning are all very similar, but because some interactions make the differences grow, the set gradually separates into smaller subsets. When exactly? Well, in theory the parts are connected forever, but the connection only has epsilon size relative to the subsets, so it can be ignored. But asking when exactly is like asking "what exactly is the largest number that can be considered 'almost zero'?". If you want to be exact, only zero is exactly zero. On the other hand, 1/3^^^3 is for all practical purposes zero. I would feel uncomfortable picking one number and saying "ok, this X is 'almost zero', but 1.000001 X is not 'almost zero'". The quoted person seems to say something similar, just less clearly, which allows the critic to use the word "subjective" and jump to the wrong conclusion that the author is saying that mathematics is observer-dependent. (Analogously, just because you and I can have different interpretations of 'almost zero', that does not mean mathematics is subjective and observer-dependent. It just means that 'almost zero' is not exactly defined, but in real life we care whether e.g. the water we drink contains 'almost zero' poison.) So generally for me it means that once someone famous gives a wrong (or just ambiguous) explanation of MWI, that explanation will be forever used as an argument against anything similar to MWI.
Well, not quite. Someone ought to be thinking about this sort of stuff, and the claim that link makes is that MWI isn't worth considering because it goes against the "scientific ethos." The reason I would tell people why MWI is not a useful object of study (for them) is because until you make it a disagreement about the territory, disagreeing about maps cashes out as squabbling. How you interpret QM should not matter, so don't waste time on it.
Tell that to EY.
MWI doesn't say anything like that. Nothing in physics says anything about "badness" or "goodness".
Well, except insofar as humans run on physics, and as such can be described by physics.
Wrong (even when assuming there is an exact definition of goodness). You can't fix all branches of the universe, because (1) in most branches you don't exist, and (2) in a very few branches totally random events may prevent your actions. But this does not mean that your actions don't increase the amount of goodness. First, you are responsible only for the branches where you existed, so let's just remove the other branches from our moral equation. Second, the exceptionally random events happen only in an exceptionally small proportion of branches. So even if some kind of Maxwell's demon can ruin your actions in 0.000 ... ... ... 001 of branches, there are still 0.999 ... ... ... 999 of branches where your actions worked normally. And improving such a majority of branches is a good thing. More info here:
Well, let's say we posit some starting condition, say the condition of the universe on the day I turned 17. I am down one path from that initial condition, and a great many other worlds exist in which things went a little differently. I take it that it's not (unfortunately) a physical or logical impossibility that in one or more of those branches, I have ten years down the line committed a murder. Now, there are a finite number of murder-paths, and a finite number of non-murder-paths, and my path is identical to one of them. But it seems to me that whether or not I murder someone, the total number of murder-paths and the total number of non-murder-paths is the same? Is this totally off base? I hope that it is. Anyway, if that's true, then by not murdering, all I've done is put myself off of a murder-path. There's one less murder in my world, but not one less murder absolutely. So, fine, live in my world and don't worry about the others. But whence that rule? That seems arbitrary, and I'm not allowed to apply it in order to localize my ethical considerations in any other case.
On a macro level, a Many Worlds model should be mathematically equal to a One World + Probabilities model. Being unhappy that in 0.01% of Many Worlds you are a murderer, is like being unhappy that with probability 0.01% you are a murderer in One World. The difference is that in One World you can later say "I was lucky" or "I was unlucky", while in the Many Worlds model you can just say "this is a lucky branch" or "this is an unlucky branch". At this point it seems to me that you are mixing a Many Worlds model with a naive determinism, and the problem is with the naive determinism. Imagine saying this: "on the day I turned 17, there is one fixed path towards the future, where I either commit a murder or don't, and the result is the same whatever I do". Is this right, or wrong, or confused, or...? Because this is what you are saying, just adding Many Worlds. The difference is that in the One World model, if you say "I will flip a coin, and based on the result I will kill him or not" and you mean it, then you are a murderer with probability 50%, while in Many Worlds you are a murderer in 50% of branches. (Of course with the naive determinism the probability is also only in the mind -- you were already determined to throw the coin with a given direction and speed.) Simply speaking, in the Many Worlds model all probabilities happen, but higher probabilities happen "more" and lower probabilities happen "less". You don't want to be a murderer? Then behave so that your probability of murdering someone is as small as possible! This is equally valid advice for One World and Many Worlds. Because you can't influence what happens in the other branches. However, if you did something that could lead with some probability to another person's death (e.g. shooting at them and missing them), you should understand that it was a bad thing which made you (in some other branch) a murderer, so you should not do that again (but neither should you do that again in One World). On the other hand, if you did s
That doesn't seem plausible. If there's a 0.01% probability that I'm a murderer (and there is only one world) then if I'm not in fact a murderer, I have committed no murders. If there are many worlds, then I have committed no murders in this world, but the 'me' in another world (whose path approximates mine to the extent that I would call that person 'me') in fact is a murderer. It seems like a difference between some murders and no murders. I'm saying that depending on what I do, I end up in a non-murder path or a murder path. But nothing I do can change the number of non-murder or murder paths. So it's not deterministic as regards my position in this selection, just deterministic as regards the selection itself. I can't causally interact with other worlds, so my not murdering in one world has no effect on any other worlds. If there are five murder worlds branching off from myself at 17, then there are five no matter what. Maybe I can adjust that number prior to the day I turn 17, but there's still a fixed number of murder worlds extending from the day I was born, and there's nothing I can do to change that. Is that a faulty case of determinism? That's a good point. Would you be willing to commit to an a priori ethical principle such that ought implies can?
That's equivalent to saying "if at the moment of my 17th birthday there is a probability 5% that I will murder someone, then in that moment there is a probability 5% that I will murder someone no matter what". I agree with this. That's equivalent to saying "if at the day I was born there is an X% chance that I will become a murderer, there is nothing I can do to change that probability on that day". True; you can't travel back in time and create a counterfactual universe. It is explained here, without the Many Worlds. Short summary: You are mixing together two different views -- the timeful and the timeless view. In the timeful view you can say "today at 12:00 I decided to kill my neighbor", and it makes sense. Then you switch to the position of a ceiling cat, an independent observer outside of our universe, outside of our time, and say "I cannot change the fact that today at 12:00 I killed my neighbor". Yes, it also makes sense; if something happened, it cannot non-happen. But we are confusing two narrators here: the real you, and the ceiling cat. You decided to kill your neighbor. The ceiling cat cannot decide that you didn't, because the ceiling cat does not live in this universe; it can only observe what you did. The reason you killed your neighbor is that you, existing in this universe, have decided to do so. You are the cause. The ceiling cat sees your action as determined, because it is outside of the universe. If we apply it to the Many Worlds hypothesis, there are 100 different yous, and one ceiling cat. From those, 5 yous commit murder (because they decided to do so), and 95 don't (because they decided otherwise, or just failed to murder successfully). Inside the universes, the 5 yous are murderers, the 95 are not. The ceiling cat may decide to blame those 95 for the actions of those 5, but that's the ceiling cat's decision. It should at least give you credit for keeping the ratio 5:95 instead of e.g. 50:50. That's tricky. In some sense, we can't do anything unless the atoms
This sounds right to me, and I think your subsequent analysis is on target. So we have two views, the timeless view and the timeful view and we can't (at least directly) translate ethical principles like 'minimize evils' across the views. So say we grant this and move on from here. Maybe my question is just that the timeless view is one in which ethics seems to make no sense (or at least not the same kind of sense), and the timeful view is a view in which it is a pressing concern. Would you object to that?
I didn't fully realize that previously, but yes -- in the timeless view there is no time, no change, no choice. Ethics is all about choices. Ethical reasoning only makes sense in time, because the process of ethical reasoning is moving the particles in your brain, and the physical consequence of that can be a good or evil action. Ethics can have an influence on universe only if it is a part of the universe. The whole universe is determined only by its laws and its contents. The only way ethics can act is through the brains of people who contemplate it. Ethics is a human product (though we can discuss how much freedom did we have in creating this product; whether it would be different if we had a different history or biology) and it makes sense only on the human level, not on the level of particles.
Eliezer Yudkowsky
I just stick with the timeless view and don't have any trouble with ethics in it, but that's because I've got all the phenomena of time fully embedded in the timeless view, including choice and morality. :)
I'm happy with the idea that ethics is a human product (since this doesn't imply that it's arbitrary or illusory or anything like that). I take this to mean, basically, that ethics concerns the relation of some subsystems with others. There's no ethical language which makes sense from the 'top-down' or from a global perspective. But there's also nothing to prevent (this is Eliezer's meaning, I guess) a non-global perspective from being worked out in which ethical language does make sense. And this perspective isn't arbitrary, because the subsystems working it out have always occupied that perspective as subsystems. To see an algorithm from the inside is to see the world as a whole by seeing it as potentially involved in this algorithm. And this is what leads to the confusion between the global, timeless view and the (no less global, in some sense) timeful inside-an-algorithm view. If that's all passably normal (as skeptical as I am of the coherence of the idea of 'adding up to normality') then the question that remains is what I should do with my idea of things mattering ethically. Maybe the answer here is to see ethical agents as ontologically fundamental or something, though that sounds dangerously anthropocentric. But I don't know how to justify the idea that physically-fundamental = ontologically-fundamental either.
I'm not Viliam Bur, but I wouldn't quite agree with this, in that time matters. It's not incoherent to talk about a system that can't do X, could have done X, and ought to have done X, for example. It's similarly not incoherent to talk about a system that can't do X now but ought to have acted in the past so as to be able to do X now. But yes, in general I would say the purpose of ethics is to determine right action. If we're talking about the ethical status of a system with respect to actions we are virtually certain the system could not have taken, can not take, and will not be able to take, then we're no longer talking about ethics in any straightforward sense.
Okay, so let's adopt 'ought implies can' then, and restrict it to the same tense: if I ought to do X, I can do X. If I could have done (but can no longer do) X, then I ought to have done (but no longer ought to do) X. How does this, in connection with MW, interact with consequentialism? The consequences of my actions can't determine how much murdering I do (in the big world sense), just whether or not I fall on a murder-path. In the big world sense, I can't (and therefore ought not) change the number of murder-paths. The consequence at which I should aim is the nature of the path I inhabit, because that's what I can change. Maybe this is right, but if it is, it seems to me to be an oddly subjective form of consequentialism. I'm not sure if this captures my thought, but it seems that it's not as if I'm making the world a better place, I'm just putting myself in a better world.
It seems like you are not making the world a better place because you think about a fixed probability of becoming a murderer, which your decisions cannot change. But the probability of you becoming a murderer is a result of your decisions. You have reversed the causality, because you imagine the probability of you ever being a murderer as something that existed sooner, and your decisions about murdering as something that happens later. You treat the probability of something happening in the future as a fact that happened in the past. (Which is a common error. When humans talk about "outside of time", they always imagine it in the past. No, the past is not outside of time; it is a part of time.)
I'm not at all convinced that I endorse what you are doing with the word "I" here. If we want to say that there exists some entity I, such that I commit murders on multiple branches, then to also talk about "the nature of the path I inhabit" seems entirely incoherent. There is no single path I inhabit, I (as defined here) inhabits all paths. Conversely, if we want to say that there exists a single path that I inhabit (a much more conventional way of speaking), then murders committed on other branches are not murders I commit. I'm not sure if that affects your point or not, but I have trouble refactoring your point to eliminate that confusion, so it seems relevant.
True, good point. That seems to be salt on the wound though. What I meant by 'I' is this: say I'm in path A. I have a parallel 'I' in path B if the configuration of something in B is such that, were it in A at some time past or future, I would consider it to be a (perhaps surprising) continuation of my existence in A. If the Ai and the Bi are the same person, then I'm ethically responsible for the behavior of Bi for the same reasons I'm ethically responsible for myself (Ai). If Ai and Bi are not the same person (even if they're very similar people) then I'm not responsible for Bi at all, but I'm also no longer de-coherent: there is always only one world with me in it. I take it neither of these options is true, and that some middle ground is to be preferred: Bi is not the same person as me, but something like a counterpart. Am I not responsible for the actions of my counterparts? That's a hard question to answer, but say I get uploaded and copied a bunch of times. A year later, some large percentage of my copies have become serial killers, while others have not. Are the peaceful copies morally responsible for the serial killing? If we say 'no' then it seems like we're committed to at least some kind of libertarianism as regards free will. I understood the compatibilist view around here to be that you are responsible for your actions by way of being constituted in such and such a way. But my peaceful copies are constituted in largely the same way as the killer copies are. We only count them as numerically different on the basis of seemingly trivial distinctions like the fact that they're embodied in different hardware.
Well, OK. We are, of course, free to consider any entity we like an extension of our own identity in the sense you describe here. (I might similarly consider some other entity in my own path to be a "parallel me" if I wish. Heck, I might consider you a parallel me.) It is not at all clear that I know what the reasons are that I'm ethically responsible for myself, if I am the sort of complex mostly-ignorant-of-its-own-activities entity scattered across multiple branches that you are positing I am. Again, transplanting an ethical intuition (like "I am ethically responsible for my actions") unexamined from one context to a vastly different one is rarely justified. So a good place to start might be to ask why I'm ethically responsible for myself, and why it matters. Can you say more about that preference? I don't share it, myself. I would say, rather, that I have some degree of confidence in the claim "Ai and Bi are the same person" and some degree of confidence that "Ai and Bi are different people," and that multiple observers can have different degrees of confidence in these claims about a given (Ai, Bi) pair, and there's no fact of the matter. Say I belong to a group of distinct individuals, who are born and raised in the usual way, with no copying involved. A year later, some large percentage of the individuals in my group become serial killers, while others do not. Are the peaceful individuals morally responsible for the serial killing? Almost all of the relevant factors governing my answer to your example seem to apply to mine as well. (My own answer to both questions is "Yes, within limits," those limits largely being a function of the degree to which observations of Ai can serve as evidence about Bi.
Good news! It is totally off base. There is nothing in quantum mechanics requiring that the number of branches corresponding to an arbitrary macroscopic event and its negation must be equal.
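A minimal sketch of why counting branches is not the relevant bookkeeping: in the standard formalism each outcome carries a weight equal to the squared magnitude of its amplitude (the Born rule), and those weights need not be equal. The two outcome labels and the amplitude values below are hypothetical and chosen only for illustration.

```python
import math


def branch_weights(amplitudes):
    """Return normalized Born-rule weights |a|^2 for a list of amplitudes."""
    squared = [abs(a) ** 2 for a in amplitudes]
    total = sum(squared)
    return [s / total for s in squared]


# Hypothetical two-outcome example; labels and numbers are illustrative only.
amplitudes = {"event": math.sqrt(0.05), "no event": math.sqrt(0.95)}
weights = dict(zip(amplitudes, branch_weights(list(amplitudes.values()))))

for label, weight in weights.items():
    print(f"{label}: weight {weight:.2f}")
```

Two outcomes here do not mean a 50:50 split; what the formalism tracks is the weight attached to each outcome, not a tally of paths.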
Aww, you had my hopes up. There's nothing in my set-up that requires them to be equal either, just that the numbers be fixed.
That feeling of arbitrariness is, IMHO, worth exploring more carefully. Suppose, for example, it turns out that we don't live in a Big World... that this is all there is, and that events either happen in this world or they don't happen at all. Suppose you somehow were to receive confirmation of this. Big relief, right? Now you really can reduce the total amount of whatever in all of existence everywhere, so actions have meaning again. But then you meet someone who says "But what about hypothetical people? No matter how many people I don't actually murder, there's still countless hypothetical people being hypothetically murdered! And, sure, you can tell me to just worry about actual people and don't worry about the other, but whence that rule? It seems arbitrary." Would you find their position reasonable? What would you say to them, if not?
Well put. This actually does come up in a philosophical view known as modal realism. Roughly, if we can make true or false claims about possible worlds, then those worlds must be actual in order to be truth-makers. So all possible worlds are actual. If someone said what you said he said, suppose I ask this in reply:
E: "Wait, are those hypothetical people being hypothetically murdered? Is that true?"
S: "Yes! And there's nothing you can do!"
E: "And there's some reality to which this part of the map, the hypothetical-people-being-murdered, corresponds? Such that the hypothetical murder of these people is a real part of our world?"
S: "Well, sure."
E: "Okay, well if we're going to venture into modal realism then this just conflicts in the same way."
S: "Suppose we're not modal realists then. Suppose there's just not really a fact of the matter about whether or not hypothetical, and therefore non-existent, people are being murdered."
E: "No problem. I'm just interested in reducing real evils."
S: "Isn't that an arbitrary determination?"
E: "No, it's the exact opposite of arbitrary. I also don't take non-existent evidence as evidence, I don't eat non-existent fruit, etc. If we call this arbitrary, then what isn't?"
I would certainly say you're justified in not caring about hypothetical murders. I would also say you're justified in not caring about murders in other MW branches. What you seem to want to say here is that because murders in other MW branches are "actual", you care about them, but since murders in my imagination are not "actual", you don't. I have no idea what the word "actual" could possibly refer to so as to do the work you want it to do here. There are certainly clusters of consistent experience to which a hypothetical murder of a hypothetical person corresponds. Those clusters might, for example, take the form of certain patterns of neural activation in my brain... that's how I usually model it, anyway. I'm happy to say that those are "actual" patterns of neural activation. I would not say that they are "actual" murdered human beings. That said, I'm not really sure it matters if they are. I mean, if they are, then... hold on, let me visualize... there: I just "actually" resurrected them and they are now "actually" extremely happy. Was their former murder still evil? At best, it seems all of my preconceived notions about murder (e.g., that it's a permanent state change of some kind) have just been thrown out the window, and I should give some serious thought to why I think murder is evil in the first place. It seems something similar is true about existence in a Big World... if I want to incorporate that into my thinking, it seems I ought to rethink all of my assumptions. Transplanting a moral intuition about murder derived in a small world into a big world without any alteration seems like a recipe for walking off conceptual cliffs.
Right, exactly. I'm taking this sense of 'actual' (not literally) from the sequences. This is from 'On being Decoherent': Later on in this post EY says that the Big World is already at issue in spatial terms: somewhere far away, there is another Esar (or someone enough like me to count as me). The implication is that existing in another world is analogous to existing in another place. And I certainly don't think I'm allowed to apply the 'keep your own corner clean' principle to spatial zones. In 'Living in Many Worlds', EY says: I take him to mean that there are really, actually many other people who exist (just in different worlds) and that I'm responsible for the quality of life for some sub-set of those people. And that there really are, actually, many people in other worlds who have discovered or know things I might take myself to have discovered or be the first to know. Such that it's a small but real overturning of normality that I can't really be the first to know something. (That, I assume, is what an ethical implication of MW amounts to: some overturning of some ethical normality.) If you modeled it to the point that you fully modeled a human being in your brain, and then murdered them, it seems obvious that you did actually kill someone. Hypothetical murders (but considered) fail to be murders because they fail to be good enough models. Yes...obviously!
Ordinarily, I would describe someone who is uncertain about obvious things as a fool. It's not clear to me that I'm a fool, but it is also not at all clear to me that murder as you've defined it in this conversation is evil. If you could explain that obvious truth to me, I might learn something.
I didn't mean to call you a fool, only I don't think the disruption of your intuitions is a disruption of your ethical intuitions. It's unintuitive to think of a human-being as something fully emulated within another human being's brain, but if this is actually possible, it's not unintuitive that ending this neural activity would be murder (if it weren't some other form of killing-a-human-being). My point was just that the distinction in hardware can't make a difference to the question of whether or not ending a neural activity is killing, and given a set of constants, murder. Since I don't think we're any longer talking about my original question, I think I'll tap out.
It all adds up to normality.
How do you know it all adds up to normality? What should I anticipate if it does, and what should I anticipate if it doesn't? Or is this an a priori principle?
Which means that your ethics should not depend on the potential existence of other worlds we have no way of interacting with. In other words, while it might well be simpler (for some people) to reason about your ethics using the many worlds paradigm, the outcome of this reasoning should not depend on the number of worlds.
So, I've been thinking about this, and say I and everyone I know believes that it's possible to be the first one, absolutely, to whistle a tune. This is, for our strange culture, an important ethical belief. That belief is part of what I would call 'normality'. Now, some jerk comes along and proves MW, and so I learn that for any tune I would consider novel, odds are that it's been whistled before in another world (I'm taking this example from EY in the sequences). So, depending on my normal, MW may add up to normality, and it may not. In a much more obvious sense, if my normal is Newtonian physics, MW doesn't add up to normality either. So what does adding up to normal mean? Consider that my other stupid question. Egan's law seems to go un-argued for and unexplained. If it just means what the paragraph you cite says, then MW may well abolish or come into conflict with our ethical ideas, since apparently it comes into conflict with all kinds of other ideas (like false physical theories) and none of this requires the destruction of the solar system or flying apples.
It means that if you do not observe pink unicorns daily, no new weird and wonderful theory should claim that you should have. Or, as EY puts it, "apples didn't stop falling, planets didn't swerve into the Sun". Another name for this is the correspondence principle. If your ethics requires you to be the first tune whistler in the multiverse, not just in this world, it's not a useful ethics.
The usefulness of the ethics (if that's the right standard to apply to an ethical idea) is not relevant to the example. That is, unless you want to posit (and we should be super, super clear about this) that there is an a priori principle that any ethics capable of being contradicted by a true physical theory is not useful. But I very much doubt you want to say that. I think modern physics pretty obviously doesn't add up to normality in a number of cases. Long debates about cryonics took place because part of many people's normal understanding of personal identity (an ethical category if there ever was one) involved a conception of material constituents like atoms such that there can be my atoms versus your atoms. This just turned out to be nonsense, as we discovered through investigation of physics. The fact that atoms no more have identities qua particular instances than do numbers overturned some element of normality. Given cases like that, how does one actually argue for Egan's law? It's not enough to just state it.
It means that if in your branch you are the first one to whistle the tune, there is no one else in your branch to contradict you. (Just as you would expect in One World.) In some other branch someone else was first, and in that branch you don't think that you were the first, so again no conflict. Then "adding up to normal" means that even when Einstein ruins your model, all things will behave the same way as they always did. Things that within given precision obeyed the Newtonian physics, will continue to do it. You will only see exceptions in unusual situations, such as GPS satellites. (But if you had GPS satellites before Einstein invented his theory, you would have seen those exceptions too. You just didn't know that would happen.) In case of morality it means that if you had a rule "X is good" because it usually has good consequences (or because it follows the rules, or whatever), then "X is good" even with Many Worlds. The exception is if you try to apply moral significance to a photon moving through a double slit. An explanation may change: for example it was immoral to say "if the coin ends this side up, I will kill you", and it is still immoral to do so, but the previous explanation was that "it is bad to kill people with 50% probability" and the new explanation is "it is bad to kill people in 50% of branches" (which means killing them with 50% probability in a random branch).
Okay, so on reflection, I think the idea that it all adds up to normality is just junk. It doesn't mean anything. I'll try to explain: A: MW comes into conflict with this ethical principle. B: It can't come into conflict. Physics always adds up to normality. A: Really? Suppose I see an apple falling, and you discover that there's no such thing as an apple, but that what we called apples are actually a sub-species of blueberries. Now I've learned that I've in fact never seen an apple fall, since by 'apple' I meant the fruit of an independent species of plant. So, normality overturned. B: No, that's not an overturning of normality, that's just a change of explanation. What you saw was this greenish round thing falling, and you explained this as an 'apple'. Now your explanation is different, but the thing you observed is the same. A: Ah, but let's say science discovers that the green round thing I saw isn't green at all. In fact, green is just the color that bounces off the thing. If it's any color, it's the color of the wavelengths of light it absorbs. Normality overturned. B: But that's just what being 'green' now means. What you saw was some light hitting your receptors in a way that varied over time, and you explained this as a green thing moving. The observation, the light hitting your eye over time, is the same. The explanation has shifted. A: Now say that it turns out that (bear with me) there is no motion or time. What I thought was some light hitting my retina over time is just my own brain co-evolving with a broader wave-function. Now that's overturning normality. B: No, what you experienced qualitatively is the same, but the explanation has changed. A: What did I experience qualitatively? B: If you're willing to go into plausible but hypothetical discoveries, I can't give it any description that is basic enough that it can't be 'overturned'. Even 'experience' is probably overturnable. A: That's why 'it all adds up to normality' is junk. By that standard, not
Your decisions aren't random! If you decide to do something then the vast majority of other selves you have will decide the same thing. When you do good you do indeed do good in all universes branching from this one. (But what if what I just said wasn't the case? Would you let your sense of ethics override the physical evidence? Look at the causal history of your morality: it comes from evolution. Do you think that if MW was true then evolution would be forced to happen differently, in order to give you different morals?)
This is a good question, but I think it's important to understand that it's a good question. Evidence from the physical sciences doesn't have some fixed priority over other kinds of evidence. One could argue that it's an unusually good source of evidence, of course, but I'm not sure how to make the comparison in this case.
Wei Dai
This question used to worry me a lot too, and at one point I also considered the idea that we can't "change the fundamental amount of goodness" but just choose a path through the branching worlds. The view that's currently prevalent among LWers who study decision theory is that you should think of yourself as being able to change mathematical facts, because decisions are themselves mathematical facts and by making decisions you determine other mathematical facts via logical implication. So for example the amount of goodness in a deterministic universe like MWI, given some initial conditions, is a mathematical fact that you can change through your decisions.
Hmm, I don't think I understand that at all: how can one change a mathematical fact? Aren't mathematical facts fixed? Is there something you could point me to, which explains this?
Wei Dai
Try Towards a New Decision Theory and Controlling Constant Programs. Also, I used the word "change" in my comment since you were asking the question in terms of "change", but perhaps a better term is "control", which is what Nesov uses.
The issue is that the MWI does not address the phenomenon of a single path being empirically special (your path). The theories, as in the code that you would have when you use Solomonoff induction on your sensory input, have to address this phenomenon: they predict (or guess) sensory input, rather than producing something which merely contains the sensory input somewhere in the middle of an enormous stream of alternatives. [putting aside for the moment that Solomonoff induction with a Turing machine would have trouble with rotational and other symmetries] That is true of physics in general: it is by design concerned with predicting our sensory input, NOT 'explaining it away' by producing an enormous body of things within which the input can be found, and this is why MWI, the way it is now, is seen as unsatisfactory, and why having the un-physical collapse of CI is acceptable. The goal is to guess the sensory input the best, and thus the choice of path, even if made randomly, has to be part of the theory. Furthermore, if one is to seek the shortest 'explanatory' theory which contains you and your input somewhere within it, but doesn't have to include the 'guess where you are' part, the MWI is not the winner; a program that iterates over all theories of physics and simulates them is, and you get another sort of multiverse. edit: On a more general note, one shouldn't be convinced simply because one can't see a simpler alternative. It's very hard to see alternatives in physics. Here is a good article about the issue.

How would I build a website that worked the same way as Lesswrong?

I don't know the details, but LessWrong is forked from the Reddit source code - doing something similar might be a good start. In more general terms, you need to learn to program, and write a webserver program, generally by using a Web Application Framework, and put that webserver program on a server that you either rent from any number of places or set up yourself. A reasonable way to do this (speaking from a small amount of experience making a site much less complicated than LessWrong) is to use Python with webapp2 and Google App Engine. Beyond that, you're going to have to be more specific about your experiences and goals. LessWrong is not a simple website. Don't expect to be able to write from scratch anything of this magnitude in less than several man years (depending on what you count as "from scratch"). Building it by, say, using an existing system such as WordPress would be much, much less work.
You could use the LW code itself.
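To make the "write a webserver program" advice a bit more concrete, here is a minimal sketch of the core idea behind a site like this: a tree of comments, each with a score, rendered as nested HTML with the best-scored replies first. It uses only the Python standard library rather than webapp2 or the actual LW/Reddit code, and every name in it (Comment, render, THREAD, app) is made up for illustration. A real site would also need storage, accounts, voting endpoints and much more.

```python
# Minimal sketch of a threaded, scored comment page (standard library only).
# All names here are hypothetical; this is not how LessWrong or Reddit do it.
from dataclasses import dataclass, field
from html import escape


@dataclass
class Comment:
    author: str
    text: str
    score: int = 0
    replies: list = field(default_factory=list)


def render(comment, depth=0):
    """Render a comment and its replies as nested HTML, best-scored first."""
    indent = depth * 20  # pixels of left margin per nesting level
    html = (
        f'<div style="margin-left:{indent}px">'
        f"<b>{escape(comment.author)}</b> ({comment.score} points)"
        f"<p>{escape(comment.text)}</p></div>"
    )
    for reply in sorted(comment.replies, key=lambda c: c.score, reverse=True):
        html += render(reply, depth + 1)
    return html


# In-memory example data standing in for a database.
THREAD = Comment("asker", "How would I build a site like this?", 3, [
    Comment("replier_a", "Fork an existing codebase.", 5),
    Comment("replier_b", "Use a web framework.", 8),
])


def app(environ, start_response):
    """WSGI entry point: every path serves the same in-memory thread."""
    body = ("<html><body>" + render(THREAD) + "</body></html>").encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, something like `from wsgiref.simple_server import make_server; make_server("", 8000, app).serve_forever()` should serve the page at http://localhost:8000. A framework mostly replaces the hand-written `app` function with routing, templates and request parsing; the comment tree is the part you would still design yourself.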

Does anyone else find weapons fascinating? Swords, guns, maces and axes?

I really want a set of fully functional weapons, as objects of art and power. Costly, though.

How much do other people spend on stuff they hang on the wall and occasionally take down to admire?

I've had this sort of impulse before. Lately I try to minimize "stuff to hang on the wall and occasionally take down to admire". It brings me little benefit with all sorts of strange hidden costs. But I tend towards clutter so that may just be in my case. And my wife is an artist, so I expect to have more objects-to-be-admired than I know what to do with, without spending money on some.
I'm near the end of a major decluttering phase, which has opened up a lot of room in my small apartment, and shows off my rather spartan walls. Also, I made a list. That seems to be the primary precursor to buying lots of things: making a list, ranking and comparing items. It happened with a few of my previous projects, and I'm already halfway through the new list, though I managed to keep myself to the cheap and small stuff so far.

I've read the majority of the Less Wrong articles on metaethics, and I'm still very very confused. Is this normal or have I missed something important? Is there any sort of consensus on metaethics beyond the ruling out of the very obviously wrong?

Your response is normal - the metaethics sequence is quite opaque, especially compared to others, like the "Mysterious Questions" sequence. I'm doubtful there is much consensus on the correct metaethics in this community - anecdotal evidence is that there isn't even consensus on the meaning of the technical vocabulary we use. For an intermediate look into some of the issues, I suggest the Stanford Encyclopedia of Philosophy entries on moral realism and moral anti-realism. Also, I recently realized in discussions that realism is a two place word, not a one place word. Thus, some inferential distance in these types of discussions is between those who use the label "moral realism" to refer to realism(morality, agent) and those who use it to refer to realism(morality, humanity).
Thanks for this. I'm already aware of all of the definitions you've mentioned, and in fact I don't like to use the word realism because of the ambiguity. Is there an obvious next step, once you realise what the options are and what the questions are, or are there only the hard questions left?
It depends on what you think the next questions are. Uncertainty about the truth of moral realism1, moral realism2, or anti-realism leads the inquiry in one direction. If one is satisfied with the meta-ethical issue, object level moral questions predominate - and probably must be approached differently depending on one's meta-ethical commitments. If you are unsure about what the next step is, I might recommend reading Camus' "The Stranger" and examining what you think the main character is doing wrong - that should help you focus your interest on object-level ethics or meta-ethics.

Has Yudkowsky fully written up exactly why he believes what he believes about AI? The disclaimer at the top of this page troubles me somewhat.

Isn't it almost certain that super-optimizing AI will result in unintended consequences? I think it's almost certain that super-optimizing AI will have to deal with their own unintended consequences. Isn't the expectation of encountering intelligence so advanced, that it's perfect and infallible essentially the expectation of encountering God?

Isn't the expectation of encountering intelligence so advanced, that it's perfect and infallible essentially the expectation of encountering God?

Which god? If by "God" you mean "something essentially perfect and infallible," then yes. If by "God" you mean "that entity that killed a bunch of Egyptian kids" or "that entity that's responsible for lightning" or "that guy that annoyed the Roman empire 2 millennia ago," then no.

Also, essentially infallible to us isn't necessarily essentially infallible to it (though I suspect that any attempt at AGI will have enough hacks and shortcuts that we can see faults too).

That one. Big man in sky invented by shepherds doesn't interest me much. Just because I'm a better optimizer of resources in certain contexts than an amoeba doesn't make me perfect and infallible. Just because X is orders of magnitude a better optimizer than Y doesn't make X perfect and infallible. Just because X can rapidly optimize itself doesn't make it infallible either. Yet when people talk about the post-singularity super-optimizers, they seem to be talking about some sort of Sci-Fi God.
Y'know, I'm not really sure where that idea comes from. The optimization power of even a moderately transhuman AI would be quite incredible, but I've never seen a convincing argument that intelligence scales with optimization power (though the argument that optimization power scales with intelligence seems sound).
"optimization power" is more-or-less equivalent to "intelligence", in local parlance. Do you have a different definition of intelligence in mind?
One that doesn't classify evolution as intelligent.
So the nonapples theory of intelligence, then?
More generally, a theory that requires modeling of the future for something to be intelligent.
What are unintended consequences? An imperfect ability to predict the future? Read strictly, any finite entity's ability to predict the future is going to be imperfect.
What if the AIs are as far advanced over us as we are over cockroaches, and the superintelligent AIs find us just as annoying, disgusting, and hard to kill?
What reason is there to expect such a thing? (Not to mention that, proverbs notwithstanding, humans can and do kill cockroaches easily; I wouldn't want the tables to be reversed.)
Reason: Cockroaches and the behavior of humans. We can and do kill individuals and specific groups of individuals. We can't kill all of them, however. If humans can get into space, the lightspeed barrier might let far-flung tribes of "human fundamentalists," to borrow a term from Charles Stross, survive, though individuals would often be killed and would never stand a chance in a direct conflict with a super AI.
In itself that doesn't seem to be relevant evidence. "There exist species that humans cannot eradicate without major coordinated effort". It follows neither that the same would hold for far more powerful AIs, nor that we should model the AI-human relationship on humans-cockroaches rather than humans-kittens or humans-smallpox. It's easy to imagine specific scenarios, especially when generalizing from fictional evidence. In fact we don't have evidence sufficient to even raise any scenario as concrete as yours to the level of awareness. I could as easily reply that an AI that wanted to kill fleeing humans could do so with powerful enough directed lasers, which will overtake any STL ship. But this is a contrived scenario. There really is no reason to discuss it specifically. (For one thing, there's still no evidence human space colonization or even solar system colonization will happen anytime soon. And unlike AI it's not going to happen suddenly, without lots of advance notice.)
A summary of your points is that: while conceivable, there's no reason to think it's at all likely. Ok. How about, "Because it's fun to think about?" Actually, lasers might not be practical against maneuverable targets because of the diffraction limit and the lightspeed limit. In order to focus a laser at very great distances, one would need very large lenses. (Perhaps planet sized, depending on distance and frequency.) Targets could respond by moving out of the beam, and the lightspeed limit would preclude immediate retargeting. Compensating for this by making the beam wider would be very expensive.
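For a rough sense of scale on the diffraction point, here is a back-of-the-envelope sketch. It assumes an idealized circular aperture and the Rayleigh criterion (angular radius of the central spot about 1.22 × wavelength / aperture diameter). The specific numbers are arbitrary illustrations chosen for this example (green light, a target one light-year away, a 100 m spot radius), not figures from the discussion above.

```python
# Back-of-the-envelope diffraction estimate. All inputs are illustrative
# assumptions, and the optics are idealized (perfect circular aperture).
LIGHT_YEAR_M = 9.4607e15  # metres in one light-year


def aperture_for_spot(wavelength_m, distance_m, spot_radius_m):
    """Aperture diameter needed so that the central (Airy) spot at the
    target has the given radius, using theta ~ 1.22 * wavelength / D."""
    return 1.22 * wavelength_m * distance_m / spot_radius_m


def retarget_delay_years(distance_ly):
    """Minimum lag before a corrected beam can arrive: light from the
    target's new position reaches the attacker, then the beam flies back."""
    return 2 * distance_ly


aperture = aperture_for_spot(500e-9, LIGHT_YEAR_M, 100.0)
print(f"aperture needed: {aperture / 1000:.0f} km")
print(f"retarget delay: {retarget_delay_years(1.0):.0f} years")
```

Under these assumptions the required aperture comes out at tens of thousands of kilometres, a few times Earth's diameter, and the attacker is always aiming at where the target was at least two years earlier. Widening the spot relaxes the aperture requirement but spreads the same power over an area that grows with the square of the radius.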
Regarding lasers: I could list things the attackers might do to succeed. But I don't want to discuss it because we'd be speculating on practically zero evidence. I'll merely say that I would rather that my hopes for the future do not depend on a failure of imagination on part of an enemy superintelligent AI.
You're assuming that there's always an answer for the more intelligent actor. Only happens that way in the movies. Sometimes you get the bear, and sometimes the bear gets you. Sometimes one can pin their hopes on the laws of physics in the face of a more intelligent foe.
It's more fun to me to think about pleasant extremely improbable futures than unpleasant ones. To each their own.
There's lots of scope for great adventure stories in dystopian futures.

Meta: One problem with this thread is that it immediately frames all questions as "stupid". I'm not sure questions should be approached from the perspective of "This point must be wrong since the Sequences are right. How is this point wrong?" Some of the questions might be insightful. Can we take "stupid" out of it?

I think taking the stupid out would make it worse. Making it a stupid questions thread makes it a safe space to ask questions that FEEL stupid to the asker. The point of this thread isn't to enable important critiques of the sequences, it's to make it easier to ask questions when it feels like everyone else already knows the answer. There can be other venues for actual critiques or serious questions about how accurate the sequences are.

"Basic questions", "background questions", "simple questions" or even "Explain Like I'm Five" would all get the point across without "stupid".
"Simple question" makes you wonder whether your question is simple.
"Uh...stupid question?" is a fairly common way of introducing a question that doesn't seem to have been addressed, but to which everyone else seems to take the answer for granted. It doesn't presuppose the correctness of the material being questioned. It's a diplomatic way of saying "either you've not explained this properly, or I've missed something". I would rather keep it as "stupid questions" (a known turn of phrase referring to this specific circumstance, which is how it's used), rather than something more awkward yet marginally less ambiguous.