Psy-Kosh: I was using the example of pure baby eater values and conscious babies to illustrate the post Nick Tarleton linked to rather than apply it to this one.
Michael: if it's "inevitable" that they will encounter aliens then it's inevitable that each fragment will in turn encounter aliens, unless they do some ongoing pre-emptive fragmentation, no? But even then, if exponential growth is the norm among even some alien species (which one would expect) the universe should eventually become saturated with civilizations. In the long run, the only e... (read more)
Nick, note that he treats the pebblesorters in parallel with the humans. The pebblesorters' values lead them to seek primeness and Eliezer optimistically supposes that human values lead humans to seek an analogous rightness.
What Eliezer is trying to say in that post, I think, is that he would not consider it right to eat babies even conditional on humanity being changed by the babyeaters to have their values.
But the choice to seek rightness instead of rightness' depends on humans having values that lead to rightness instead of rightness'.
It's evidence of my values which are evidence of typical human values. Also, I invite other people to really think if they are so different.
Eliezer tries to derive his morality from human values, rather than simply assuming that it is an objective morality, or asserting it as an arbitrary personal choice. It can therefore be undermined in principle by evidence of actual human values.
Also, I think I would prefer blowing up the nova instead. The babyeaters' children's suffering is unfortunate no doubt but hey, I spend money on ice cream instead of saving starving children in Africa. The superhappies' degrading of their own, more important, civilization is another consideration.
(you may correctly protest about the ineffectiveness of aid - but would you really avoid ice cream to spend on aid, if it were effective and somehow they weren't saved already?)
If blowing up Huygens could be effective, why did it even occur to you to blow up Earth before you thought of this?
Sure it's a story, but one with an implicit idea of human terminal values and such.
I'm actually inclined to agree with Faré that they should count the desire to avoid a few relatively minor modifications over the eternal holocaust and suffering of baby-eater children.
I originally thought Eliezer was a utilitarian, but changed my mind due to his morality series.
(Though I still thought he was defending something that was fairly similar to utilitarianism. But he wasn't taking additivity as a given but attempting to derive it from human terminal values themse... (read more)
...with babyeater values.
Actually, I'm not sure if that's what I thought about their intentions towards the babyeaters, but I at least didn't originally expect them to still intend to modify themselves and humanity.
No, they are simply implementing the original plan by force.
When I originally read part 5, I jumped to the same conclusion you did, based presumably on my prior expectations of what a reasonable being would do. But then I read nyu2's comment which assumed the opposite and went back to look at what the text actually said, and it seemed to support that interpretation.
It seems we are at a disadvantage relative to Eliezer in thinking of alternative endings, since he has a background notion of what things are possible and what aren't, and we have to guess from the story.
How quickly can you go from star to star?
Does the greater advancement of the superhappies translate into higher travel speed, or is this constrained by physics?
Can information be sent from star to star without couriering it with a ship, and arrive in a reasonable time?
How long will the lines connected to the novaing star remain open?
Can inf... (read more)
... but relative to simply cooperating, it seems a clear win. Unless the superhappies have thought of it and planned a response.
Of course, the corollary for the real world would seem to be: those people who think that most people would not converge if "extrapolated" by Eliezer's CEV ought to exterminate other people who they disagree with on moral questions before the AI is strong enough to stop them, if Eliezer has not programmed the AI to do something to punish that sort of thing.
Hmm. That doesn't seem so intuitively nice. I wonder if it's just... (read more)
If the humans know how to find the babyeaters' star,
and if the babyeater civilization can be destroyed by blowing up one star,
then I would like to suggest that they kill off the babyeaters.
Not for the sake of the babyeaters (I consider the proposed modifications to them better than annihilation from humanity's perspective)
but to prevent the super-happies from making even watered down modifications adding baby-eater values -
not so much to humans, since this can also be (at least temporarily) prevented by destroying Huygens -
but to themselves, as they ar... (read more)
James Andrix: I don't claim that the aliens would prefer modification over death, only that it is more consonant with my conception of human values to modify them than exterminate them, notwithstanding that the aliens may prefer the latter.
Akon claims this is a "true" prisoner's dilemma situation, and then tries to add more values to one side of the scale. If he adds enough values to make cooperation higher value than defecting, then he was wrong to say it was a true prisoner's dilemma. But the story has made it clear that the aliens appear to be not smart enough to accurately anticipate human behaviour (or vice versa for that matter), so this is not a situation where it is rational to cooperate in a true prisoner's dilemma. If it really is a true prisoner's dilemma, they should ju... (read more)
Strong enough to disrupt personal identity, if taken in one shot? That's a difficult question to answer, especially since I don't know what experiment to perform to test any hypotheses. On one hand, billions of neurons in my visual cortex undergo massive changes of activation every time my eyes squeeze shut when I sneeze - the raw number of flipped bits is not the key thing in personal identity. But we are already talking about serious changes of information, on the order of going to sleep, dreaming, forgetting your dreams, and waking up the next morning.
Eliezer, whenever you start thinking about people who are completely causally unconnected with us as morally relevant, alarm bells should go off.
What's worse though, is that if your opinion on this is driven by a desire to justify not agreeing with the "repugnant conclusion", it may signify problems with your morality that could annihilate humanity if you give your morality to an AI. The repugnant conclusion requires valuing the bringing into existence of hypothetical people with total utility x by as much as reducing the utility of existing peop... (read more)
I think what he means by "calibrated" is something like it not being possible for someone else to systematically improve the probabilities you give for the possible answers to a question just from knowing what values you've assigned (and your biases), without looking at what the question is.
I suppose the improvement would indeed be measured in terms of relative entropy of the "correct" guess with respect to the guess given.
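To make the calibration point concrete, here is a small illustrative sketch (the numbers and the scenario are made up, not from the comment): a forecaster who always says "90%" but is right only 60% of the time can be systematically improved by someone who sees only the stated probability, never the question, and the size of the improvement in average log score is exactly the relative entropy between the empirical frequency and the stated probability.

```python
import math

def log_loss(probs, outcomes):
    # Average negative log-likelihood of the outcomes under the stated probabilities.
    return -sum(math.log(p if o else 1 - p) for p, o in zip(probs, outcomes)) / len(probs)

# Hypothetical miscalibrated forecaster: says 90%, right 6 times out of 10.
stated = [0.9] * 10
outcomes = [True] * 6 + [False] * 4

# An adjuster who knows only "this person's 90% means 60%" does better,
# without ever looking at the questions themselves:
recalibrated = [0.6] * 10

assert log_loss(recalibrated, outcomes) < log_loss(stated, outcomes)

# The improvement equals the relative entropy KL(0.6 || 0.9):
kl = 0.6 * math.log(0.6 / 0.9) + 0.4 * math.log(0.4 / 0.1)
assert abs((log_loss(stated, outcomes) - log_loss(recalibrated, outcomes)) - kl) < 1e-12
```

A perfectly calibrated forecaster has no such exploitable gap: recalibration maps other than the identity can only worsen the expected score.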
Responding to Gaffa (I kind of intended to respond right after the comment, but got sidetracked):
When approaching a scientific or mathematical problem, I often find myself trying hard to avoid having to calculate and reason, and instead try to reach for an "intuitive" understanding in the back of my mind, but that understanding, if I can even find it, is rarely sufficient when dealing with actual problems.
I would advise you to embrace calculation and reason, but just make sure you think about what you are doing and why. Use the tools, but try... (read more)
It might make an awesome movie, but if it were expected behaviour, it would defeat the point of the injunction. In fact if rationalists were expected to look for workarounds of any kind it would defeat the point of the injunction. So the injunction would have to be, not merely to be silent, but not to attempt to use the knowledge divulged to thwart the one making the confession in any way except by non-coercive persuasion.
Or alternatively, not to ever act in a way such that if the person making the confession had expected it they would have avoided making the confession.
To the extent that a commitment to ethics is externally verifiable, it would encourage other people to cooperate, just as a tendency to anger (a visible commitment to retribution) is a disincentive to doing harm.
Also, even if it is not verifiable, a person who at least announces their intention to hold to an ethical standard has raised the impact their failure to do so will have on their reputation, and thus the announcement itself should have some impact on the expectation that they will behave ethically.
Just for the sake of devil's advocacy:
4) You want to attribute good things to your ethics, and thus find a way to interpret events that enables you to do so.
Miguel: it doesn't seem to be a reference to something, but just a word for some experience an alien might have had that is incomprehensible to us humans, analogous to humour for the alien.
Psy-Kosh, my argument that Boltzmann brains go poof is a theoretical argument, not an anthropic one. Also, if we want to maximize our correct beliefs in the long run, we should commit to ignore the possibility that we are a brain with beliefs not causally affected by the decision to make that commitment (such as a brain that randomly pops into existence and goes poof). This also is not an anthropic argument.
With regard to longer-lived brains, if you expect there to be enough of them that even the ones with your experience are more common than minds in a re... (read more)
Nick, do you use the normal definition of a Boltzmann brain?
It's supposed to be a mind which comes into existence by sheer random chance. Additional complexity - such as would be required for some support structure (e.g. an actual brain), or additional thinking without a support structure - comes with an exponential probability penalty. As such, a Boltzmann brain would normally be very short lived.
In principle, though, there could be so much space uninhabitable for regular civilizations that even long-lived Boltzmann brains which coincidentally have experi... (read more)
Let's suppose, purely for the sake of argument of course, that the scientists are superrational.
The first scientist chose the most probable theory given the 10 experiments. If the predictions are 100% certain then it will still be the most probable after 10 more successful experiments. So, since the second scientist chose a different theory, there is uncertainty and the other theory assigned an even higher probability to these outcomes.
In reality people are bad at assessing priors (hindsight bias), leading to overfitting. But these scientists are assumed t... (read more)
It may be that most minds with your thoughts do in fact disappear after an instant. Of course if that is the case there will be vastly more with chaotic or jumbled thoughts. But the fact that we observe order is no evidence against the existence of additional minds observing chaos, unless you don't accept self-indication.
So, your experience of order is not good evidence for your belief that more of you are non-Boltzmann than Boltzmann. But as I said, in the long term your expected accuracy will rise if you commit to not believing you are a Boltzmann brain,... (read more)
Nick and Psy-Kosh: here's a thought on Boltzmann brains.
Let's suppose the universe has vast spaces uninhabited by anything except Boltzmann brains which briefly form and then disappear, and that any given state of mind has vastly more instantiations in the Boltzmann-brain only spaces than in regular civilizations such as ours.
Does it then follow that one should believe one is a Boltzmann brain? In the short run perhaps, but in the long run you'd be more accurate if you simply committed to not believing it. After all, if you are a Boltzmann brain, that commitment will cease to be relevant soon enough as you disintegrate, but if you are not, the commitment will guide you well for a potentially long time.
And by elementary I mean the 8 different ways W, F, and the comet hit/non hit can turn out.
Err... I actually did the math a silly way, by writing out a table of elementary outcomes... not that that's silly itself, but it's silly to get input from the table to apply to Bayes' theorem instead of just reading off the answer. Not that it's incorrect of course.
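The table-of-elementary-outcomes method can be sketched in a few lines (the particular numbers for the priors are made up; x and y here are comet-miss probabilities, a hypothetical parameterization of the setup described below): enumerate the 8 ways W, F, and the comet hit can turn out, then condition on {LHC failed, we survived}.

```python
from itertools import product

def posterior_W_given_F_and_survival(w, z, x, y):
    """Enumerate the 8 elementary outcomes over (W, F, comet-hit) and
    condition on {the LHC failed, we survived}.
    w = prior P(W); z = P(F), independent of W a priori;
    x, y = probability the comet MISSES given not-W and W respectively."""
    num = den = 0.0
    for W, F, hit in product([True, False], repeat=3):
        p = (w if W else 1 - w) * (z if F else 1 - z)
        miss = y if W else x
        p *= miss if not hit else (1 - miss)
        survived = (F or not W) and not hit
        if F and survived:
            den += p
            if W:
                num += p
    return num / den

# If the comet threat is independent of W (x == y), conditioning on
# failure-plus-survival just gives the prior back:
assert abs(posterior_W_given_F_and_survival(0.01, 0.1, 0.9, 0.9) - 0.01) < 1e-12
# If W also makes the comet more dangerous (y < x), survival really is
# evidence about W, as intended by putting that change in:
assert posterior_W_given_F_and_survival(0.01, 0.1, 0.999, 0.5) < 0.01
```

Reading the answer off the table and plugging the table's entries back into Bayes' theorem give the same number, as noted; the enumeration just makes it mechanical.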
Richard, obviously if F does not imply S due to other dangers, then one must use method 2:
P(W|F,S) = P(F|W,S)P(W|S)/P(F|S)
Let's do the math.
A comet is going to annihilate us with a probability of (1-x) (outside view) if the LHC would not destroy the Earth, but if the LHC would destroy the Earth, the probability is (1-y) (I put this change in so that it would actually have an effect on the final probability)
The LHC has an outside-view probability of failure of z, whether or not W is true
The universe has a prior probability w of being such that the LHC if i... (read more)
You have another inconsistency as well. As you should have noticed in the "How many" thread, the assumptions that lead you to believe that failures of the LHC are evidence that it would destroy Earth are the same ones that lead you to believe that annihilational threats are irrelevant (after all, if P(W|S) = P(W), then Bayes' rule leads to P(S|W) = P(S)).
Thus, given that you believe that failures are evidence of the LHC being dangerous, you shouldn't care. Unless you've changed to a new set of incorrect assumptions, of course.
I might add, for the benefit of others, that self-sampling forbids playing favourites among which observers to believe that you are in a single universe (beyond what is actually justified by the evidence available), and self-indication forbids the same across possible universes.
Nominull: It's a bad habit of some people to say that reality depends on, or is relative to observers in some way. But even though observers are not a special part of reality, we are observers and the data about the universe that we have is the experience of observers, not an outsid... (read more)
Why do you reject self-indication? As far as I can recall the only argument Bostrom gave against it was that he found it unintuitive that universes with many observers should be more likely, with absolutely no justification as to why one would expect that intuition to reflect reality. That's a very poor argument considering the severe problems you get without it.
I suppose you might be worried about universes with many unmangled worlds being made more likely, but I don't see what makes that bullet so hard to bite either.
Whoops, I didn't notice that you did specifically claim that P(W|S)=P(W).
Do you arrive at this incorrect claim via Bostrom's approach, or another one?
Not particularly. I use 4 but with P(W|S) = P(W) which renders it valid. (We're not talking about two side-by-side universes, but about prior probabilities on physical law plus a presumption of survival.)
You mean you use method 2. Except you don't, or you would come to the same conclusion that I do. Are you claiming that P(W|S)= P(W)? Ok, I suspect you may be applying Nick Bostrom's version of observer selection: hold the probability of each possible version of the universe fixed independent of the number of observers, then divide that probability equally ... (read more)
Eliezer, I used "=>" (intending logical implication), not ">=".
I would suggest you read my post above on this second page, and see if that changes your mind.
Also, in a previous post in this thread I argued that one should be surprised by externally improbable survival, at least in the sense that it should make one increase the probability assigned to alternative explanations of the world that do not make survival so unlikely.
Sorry Richard, well of course they aren't necessarily independent. I wasn't quite sure what you were criticising. But I pointed out already that, for example, a new physical law might in principle both cause the LHC to fail and cause it to destroy the world if it did not fail. But I pointed out that this was not what people were arguing, and assuming that such a relation is not the case then the failure of the LHC provides no information about the chance that a success would destroy the world. (And a small relation would lead to a small amount of information, etc.)
While I'm happy to have had the confidence of Richard, I thought my last comment could use a little improvement.
What we want to know is P(W|F,S)
As I pointed out, F => S, so P(W|F,S) = P(W|F)
We can legitimately calculate P(W|F,S) in at least two ways:
1. P(W|F,S) = P(W|F) = P(F|W)P(W)/P(F) <- the easy way
2. P(W|F,S) = P(F|W,S)P(W|S)/P(F|S) <- harder, but still works
there are also ways you can get it wrong, such as:
3. P(W|F,S) != P(F|W,S)P(W)/P(F) <- what I said other people were doing last post
4. P(W|F,S) != P(F|W,S)P(W)/P(F|S) <... (read more)
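A quick numerical check of the two legitimate methods against one of the invalid ones (the priors are made-up illustration values; F is taken to be independent of W a priori, which is the assumption at issue):

```python
# Joint distribution over (W, F); survival S = F or not W.
w, z = 0.01, 0.1   # made-up values for P(W) and P(F)

P = {(W, F): (w if W else 1 - w) * (z if F else 1 - z)
     for W in (True, False) for F in (True, False)}

def p(pred):
    # Probability of an event given as a predicate on (W, F).
    return sum(q for (W, F), q in P.items() if pred(W, F))

S = lambda W, F: F or not W

# Method 1: P(W|F) = P(F|W) P(W) / P(F)
m1 = (p(lambda W, F: W and F) / p(lambda W, F: W)) * w / z

# Method 2: P(W|F,S) = P(F|W,S) P(W|S) / P(F|S)
pS = p(S)
m2 = (p(lambda W, F: F and W and S(W, F)) / p(lambda W, F: W and S(W, F))) \
     * (p(lambda W, F: W and S(W, F)) / pS) \
     / (p(lambda W, F: F and S(W, F)) / pS)

# Invalid method 3: P(F|W,S) P(W) / P(F) -- mixes conditioned and
# unconditioned terms.
m3 = (p(lambda W, F: F and W and S(W, F)) / p(lambda W, F: W and S(W, F))) * w / z

assert abs(m1 - w) < 1e-12 and abs(m2 - w) < 1e-12  # both give the prior back
assert m3 > w  # the invalid method inflates the estimate
```

With F independent of W, both legitimate methods return the prior P(W), while the mixed formula produces a spuriously inflated answer.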
I'm going to try another explanation that I hope isn't too redundant with Benja's.
Consider the events
W = The LHC would destroy Earth
F = the LHC fails to operate
S = we survive (= F OR not W)
We want to know P(W|F) or P(W|F,S), so let's apply Bayes.
First thing to note is that since F => S, we have P(W|F) = P(W|F,S), so we can just work out P(W|F)
P(W|F) = P(F|W)P(W)/P(F)
Note that none of these probabilities are conditional on survival. So unless in the absence of any selection effects the probability of failure still depends on whether the LHC would... (read more)
Robinson, I could try to nitpick all the things wrong with your post, but it's probably better to try to guess at what is leading your intuition (and the intuition of others) astray.
Here's what I think you think:
Allan, I am of course aware of that (actually, it would probably take time, but even if the annihilation were instantaneous the argument would not be affected).
There are 4 possibilities:
The fact that conditional on survival possibility 2 must not have happened has no effect on the relative probabilities of possibility 1 and possibility 3.
To clarify, I mean failures should not lead to a change of probability away from the prior probability; of course they do result in a different probability estimate than if the LHC succeeded and we survived.
Actually, failures of the LHC should never have any effect at all on our estimate of the probability that if it did not fail it would destroy Earth.
This is because the ex ante probability of failure of the LHC is independent of whether or not if it turned on it would destroy Earth. A simple application of Bayes' rule.
Now, the reason you come to a wrong conclusion is not because you wrongly applied the anthropic principle, but because you failed to apply it (or applied it selectively). You realized that the probability of failure given survival is higher un... (read more)
As previously mentioned, there are tricky aspects to this. You can't say: "You see those humans over there? Whatever desire is represented in their brains, is therefore right." This, from a moral perspective, is wrong - wanting something doesn't make it right - and the conjugate failure of the AI is that it will reprogram your brains to want things that are easily obtained in great quantity. If the humans are PA, then we want the AI to be PA+1, not Self-PA... metaphorically speaking.
Before reading this post, if I had been programming a frien... (read more)
It's not clear to me where you are going with it.
To argue that a proof is being made concluding ◻C using the assumption ◻(◻C -> C) given the theory PA, to which proof we can apply the deduction theorem to get (PA |- "◻(◻C -> C) -> ◻C") (i.e. my interpretation of Löb's Theorem)
We use 10 steps, 9 of which are proofs inside of PA
But the proof uses an additional assumption which is the antecedent of an implication, and comes to a conclusion which is the consequent of the implication. To get the implication, we must use the deduction theorem ... (read more)
Hmm. I was thinking that Löb's Theorem was a theorem in PA, in which case the step going from
PA + ◻(◻C -> C) |- ◻(◻L -> C)
PA + ◻(◻C -> C) |- ◻(◻(◻L -> C))
seems legitimate given
PA |- (◻X -> ◻(◻X))
which we ought to be able to use since PA is part of the theory before the |- symbol.
If we don't have PA on the left, can we use all the "ingredients" without adding additional assumptions?
In any case, if we do not use the deduction theorem to derive the implication in Löb's Theorem, what do we use?
We don't have PA + X proving anything for any X.
It seems to me that we do have (PA + "◻(◻C -> C)" |- "◻C")
from which the deduction theorem gives: (PA |- "◻(◻C -> C) -> ◻C") which is Löb's Theorem itself.
The hypothesis was that PA proves that "if PA proves C, then C"
This enabled it to be proved that "PA proves C"
So I think what we actually get applying the deduction theorem is ◻((◻C) -> C) -> ◻C
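For reference, a sketch of the standard textbook framing (not from the thread itself) of where the deduction theorem sits relative to the external and internal forms of Löb's Theorem:

```latex
% Hilbert–Bernays derivability conditions for PA's provability predicate \Box:
%   (D1) if PA \vdash A, then PA \vdash \Box A
%   (D2) PA \vdash \Box(A \to B) \to (\Box A \to \Box B)
%   (D3) PA \vdash \Box A \to \Box\Box A

% External form of Löb's Theorem:
%   if PA \vdash \Box C \to C, then PA \vdash C.

% Internal (formalized) form, itself provable in PA:
%   PA \vdash \Box(\Box C \to C) \to \Box C

% The deduction theorem links a proof in the extended theory to the internal
% implication (valid here since \Box(\Box C \to C) is a closed sentence):
%   PA + \Box(\Box C \to C) \vdash \Box C
%   \iff
%   PA \vdash \Box(\Box C \to C) \to \Box C
```

Note the external form has provability of C as its conclusion, while the internal form is a single implication inside PA; the deduction theorem is what moves the assumption from the left of |- into the implication.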
I don't think you're talking about my sort of view* when you say "morality-as-preference", but:
Why do people seem to mean different things by "I want the pie" and "It is right that I should get the pie"? Why are the two propositions argued in different ways?
A commitment to drive a hard bargain makes it more costly for other people to try to get you to agree to something else. Obviously an even division is a Schelling point as well (which makes a commitment to it more credible than a commitment to an arbitrary division).
When a... (read more)
"From a utilitarian perspective", where does the desire to do things better than can be done with the continued existence of humans come from? If it comes from humans, should not the desire to continue to exist also be given weight?
Also, if AI researchers anchor their expectations for AI on the characteristics of the average human then we could be in big trouble.