All of Allan_Crossman's Comments + Replies

... and stuns Akon (or everyone). He then opens a channel to the Superhappies, and threatens to detonate the star - thus preventing the Superhappies from "fixing" the Babyeaters, their highest priority. He uses this to blackmail them into fixing the Babyeaters while leaving humanity untouched.

No, he says "you're the first person who etc..."

Is this a "failed utopia" because human relationships are too sacred to break up, or is it a "failed utopia" because the AI knows what it should really have done but hasn't been programmed to do it?

I think it's a failed utopia because it involves the AI modifying the humans' desires wholesale - the fact that it does so by proxy doesn't change that it's doing that. (This may not be the only reason it's a failed utopia.)
I don't see how those are mutually exclusive.

that can support the idea that the much greater incidence of men committing acts of violence is "natural male aggression" that we can't ever eliminate.

The whole point of civilisation is to defeat nature and all its evils.

I don't think trying to "defeat nature" is a very constructive way of thinking; rather, we should be working with nature to improve all life.

... how isn't atheism a religion? It has to be accepted on faith, because we can't prove there isn't a magical space god that created everything.

I think there's a post somewhere on this site that makes the reasonable point that "is atheism a religion?" is not an interesting question. The interesting question is "what reasons do we have to believe or disbelieve in the supernatural?"

My issue with this is that we don't, actually, have a philosophical/rational/scientific vision of capital-T Truth yet, despite all of our efforts. (Descartes, Spinoza, Kant, etc.)

Truth is whatever describes the world the way it is.

Even the capital-T Truth believers will admit that we don't know how to achieve an understanding of that truth, they'll just say that it's possible because there really is this kind of truth.

Do you mean an understanding of the way the world is, or an understanding of what truth is?

Isn't it the case, then that your embracing this ... (read more)

Paul, that's a good point.

Eliezer: If all I want is money, then I will one-box on Newcomb's Problem.

Mmm. Newcomb's Problem features the rather weird case where the relevant agent can predict your behaviour with 100% accuracy. I'm not sure what lessons can be learned from it for the more normal cases where this isn't true.

If a serial killer comes to a confessional, and confesses that he's killed six people and plans to kill more, should the priest turn him in? I would answer, "No." If not for the seal of the confessional, the serial killer would never have come to the priest in the first place.

It's important to distinguish two ways this argument might work. The first is that the consequences of turning him in are bad, because future killers will be (or might be) less likely to seek advice from priests. That's a fairly straightforward utilitarian argument.

But the... (read more)

Benja: But it doesn't follow that you should conclude that the other people are getting shot, does it?

I'm honestly not sure. It's not obvious to me that you shouldn't draw this conclusion if you already believe in MWI.

(Clearly you learned nothing about that, because whether or not they get shot does not affect anything you're able to observe.)

It seems like it does. If people are getting shot then you're not able to observe any decision by the guards that results in you getting taken away. (Or at least, you don't get to observe it for long - I don't think the slight time lag matters much to the argument.)

Benja: Allan, you are right that if the LHC would destroy the world, and you're a surviving observer, you will find yourself in a branch where LHC has failed, and that if the LHC would not destroy the world and you're a surviving observer, this is much less likely. But contrary to mostly everybody's naive intuition, it doesn't follow that if you're a surviving observer, LHC has probably failed.

I don't believe that's what I've been saying; the question is whether the LHC failing is evidence for the LHC being dangerous, not whether surviving is evidence for the LHC having failed.

Allan: your intuition is wrong here too. Notice that if Zeus were to have independently created a zillion people in a green room, it would change your estimate of the probability, despite being completely unrelated.

I don't see how, unless you're told you could also be one of those people.

Simon: As I say above, I'm out of my league when it comes to actual probabilities and maths, but:

P(W|F) = P(F|W)P(W)/P(F)

Note that none of these probabilities are conditional on survival.

Is that correct? If the LHC is dangerous and MWI is true, then the probability of observing failure is 1, since that's the only thing that gets observed.
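The disagreement above comes down to which value of P(F|W) goes into Bayes' theorem. A minimal sketch of the two readings follows; the prior and the failure rate are made-up numbers chosen purely for illustration, and the function and variable names are hypothetical rather than anything from the thread.

```python
def posterior_dangerous(prior_w, p_f_given_w, p_f_given_safe):
    """Bayes' theorem: P(W|F) = P(F|W) P(W) / P(F)."""
    # Total probability of observing a failure, over both hypotheses.
    p_f = p_f_given_w * prior_w + p_f_given_safe * (1 - prior_w)
    return p_f_given_w * prior_w / p_f


PRIOR_W = 0.001  # assumed prior that the LHC is dangerous (illustrative)
P_FAIL = 0.01    # assumed ex ante chance of a failure (illustrative)

# Reading 1: nothing is conditioned on survival, so a failure is
# equally likely whether or not the LHC is dangerous.
unconditioned = posterior_dangerous(PRIOR_W, P_FAIL, P_FAIL)

# Reading 2: if the LHC is dangerous, failure is the only outcome
# that ever gets observed, so P(F|W) is taken to be 1.
conditioned = posterior_dangerous(PRIOR_W, 1.0, P_FAIL)

print(unconditioned, conditioned)
```

Under the first reading the posterior equals the prior, so a failure is no evidence at all; under the second it rises well above the prior. Which reading is the right one is exactly what is in dispute here.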

An analogy I would give is:

You're created by God, who tells you that he has just created 10 people who are each in a red room, and depending on a coin flip God made, either 0 or 10,000,000 people who are each in a blue roo... (read more)

Benja, I'm not really smart enough to parse the maths, but I can comment on the intuition:

The very small number of Everett branches that have the LHC non-working due to a string of random failures is the same in both cases [of LHC dangerous vs. LHC safe]

I see that. But if the LHC is dangerous, then you can only find yourself in a world where lots of failures have occurred; whereas if the LHC is safe, it's extremely unlikely that you'll find yourself in such a world.

Thus, if all you know is that you are in an Everett branch in which the LHC is non-working due ... (read more)

Simon: the ex ante probability of failure of the LHC is independent of whether or not if it turned on it would destroy Earth.

But - if the LHC was Earth-fatal - the probability of observing a world in which the LHC was brought fully online would be zero.

(Applying anthropic reasoning here probably makes more sense if you assume MWI, though I suspect there are other big-world cosmologies where the logic could also work.)

Oh God I need to read Eliezer's posts more carefully, since my last comment was totally redundant.

First collisions aren't scheduled to have happened yet, are they? In which case, the failure can't be seen as anthropic evidence yet, since we might as well be in a world where it hasn't failed, since such a world wouldn't have been destroyed yet in any case.

But if I'm not mistaken, even old failures will become evidence retrospectively once first collisions are overdue, since (assuming the unlikely case of the LHC actually being dangerous) all observers still alive would be in a world where the LHC failed; when it failed being irrelevant.

As much as the AP fascinates me, it does my head in. :)

From your perspective, you should chalk this up to the anthropic principle: if I'd fallen into a true dead end, you probably wouldn't be hearing from me on this blog.

I'm not sure that can properly be called anthropic reasoning; I think you mean a selection effect. To count as anthropic, my existence would have to depend upon your intellectual development; which it doesn't, yet. :)

(Although I suppose my existence as Allan-the-OB-reader probably does so depend... but that's an odd way of looking at it.)

I'm interested in the inconsistency of those who accept defection as the rational equilibrium in the one-shot PD, but find excuses to reject it in the finitely iterated known-horizon PD.

[...] What if neither party to the IPD thinks there's a realistic chance that the other party is stupid - if they're both superintelligences, say?

It's never worthwhile to cooperate in the one shot case, unless the two players' actions are linked in some Newcomb-esque way.

In the iterated case, if there's even a fairly small chance that the other player will try to establish... (read more)

Carl - good point.

I shouldn't have conflated perfectly rational agents (if there are such things) with classical game-theorists. Presumably, a perfectly rational agent could make this move for precisely this reason.

Probably the best situation would be if we were so transparently naive that the maximizer could actually verify that we were playing naive tit-for-tat, including on the last round. That way, it would cooperate for 99 rounds. But with it in another universe, I don't see how it can verify anything of the sort.

(By the way, Eliezer, how much communi... (read more)

Vladimir: In case of prisoner's dilemma, you are penalized by ending up with (D,D) instead of better (C,C) for deciding to defect

Only if you have reason to believe that the other player will do whatever you do. While that's the case in Simpleton's example, it's not the case in Eliezer's.

If it's actually common knowledge that both players are "perfectly rational" then they must do whatever game theory says.

But if the paperclip maximizer knows that we're not perfectly rational (or falsely believes that we're not) it will try and achieve a better score than it could get if we were in fact perfectly rational. It will do this by cooperating, at least for a time.

I think correct strategy gets profoundly complicated when one side believes the other side is not fully rational.

Chris: Sorry Allan, that you won't be able to reply. But you did raise the question before bowing out...

I didn't bow out, I just had a lot of comments made recently. :)

I don't like the idea that we should cooperate if it cooperates. No, we should defect if it cooperates. There are benefits and no costs to defecting.

But if there are reasons for the other to have habits that are formed by similar forces

In light of what I just wrote, I don't see that it matters; but anyway, I wouldn't expect a paperclip maximizer to have habits so ingrained that it can't ever... (read more)

Psy-Kosh: They don't have to believe they have such causal powers over each other. Simply that they are in certain ways similar to each other.

I agree that this is definitely related to Newcomb's Problem.

Simpleton: I earlier dismissed your idea, but you might be on to something. My apologies. If they were genuinely perfectly rational, or both irrational in precisely the same way, and could verify that fact in each other...

Then they might be able to know that they will both do the same thing. Hmm.

Anyway, my 3 comments are up. Nothing more from me for a while.

[D,C] will happen only if the other player assumes that the first player bets on cooperation

No, it won't happen in any case. If the paperclip maximizer assumes I'll cooperate, it'll defect. If it assumes I'll defect, it'll defect.

I debug my model of decision-making policies [...] by requiring the outcome to be stable even if I assume that we both know which policy is used by another player

I don't see that "stability" is relevant here: this is a one-off interaction.

Anyway, let's say you cooperate. What exactly is preventing the paperclip maximizer from defecting?

simpleton: won't each side choose to cooperate, after correctly concluding that it will defect iff the other does?

Only if they believe that their decision somehow causes the other to make the same decision.

CarlJ: How about placing a bomb on two piles of substance S and giving the remote for the human pile to the clipmaximizer and the remote for its pile to the humans?

It's kind of standard in philosophy that you aren't allowed solutions like this. The reason is that Eliezer can restate his example to disallow this and force you to confront the real dilemma.... (read more)

Michael: This is not a prisoner's dilemma. The Nash equilibrium (C,C) is not dominated by a Pareto optimal point in this game.

I don't believe this is correct. Isn't the Nash equilibrium here (D,D)? That's the point at which neither player can gain by unilaterally changing strategy.
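The claim can be checked mechanically. The sketch below assumes a payoff table with the usual prisoner's dilemma ordering; the specific numbers (billions of lives for the humans, paperclips for the maximizer) are illustrative assumptions, and `PAYOFFS` and `is_nash` are hypothetical names.

```python
from itertools import product

# Assumed payoffs: (human payoff in billions of lives,
#                   maximizer payoff in paperclips).
# Keys are (human move, maximizer move).
PAYOFFS = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),
    ("D", "C"): (3, 0),
    ("D", "D"): (1, 1),
}
MOVES = ("C", "D")


def is_nash(human, clippy):
    """True if neither player can gain by unilaterally switching moves."""
    h, c = PAYOFFS[(human, clippy)]
    human_cannot_gain = all(PAYOFFS[(alt, clippy)][0] <= h for alt in MOVES)
    clippy_cannot_gain = all(PAYOFFS[(human, alt)][1] <= c for alt in MOVES)
    return human_cannot_gain and clippy_cannot_gain


equilibria = [cell for cell in product(MOVES, MOVES) if is_nash(*cell)]
print(equilibria)
```

With these assumed payoffs the only cell that survives the check is (D,D), even though both players would prefer (C,C) to it.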

Prase, Chris, I don't understand. Eliezer's example is set up in such a way that, regardless of what the paperclip maximizer does, defecting gains one billion lives and loses two paperclips.

Basically, we're being asked to choose between a billion lives and two paperclips (paperclips in another universe, no less, so we can't even put them to good use).

The only argument for cooperating would be if we had reason to believe that the paperclip maximizer will somehow do whatever we do. But I can't imagine how that could be true. Being a paperclip maximizer, it's... (read more)

What you're missing is the idea that we should be optimizing our policies rather than our individual actions, because (among other alleged advantages) this leads to better results when there are lots of agents interacting with one another. In a world full of action-optimizers in which "true prisoners' dilemmas" happen often, everyone ends up on (D,D) and hence (one life, one paperclip). In an otherwise similar world full of policy-optimizers who choose cooperation when they think their opponents are similar policy-optimizers, everyone ends up on (C,C) and hence (two lives, two paperclips). Everyone is better off, even though it's also true that everyone could (individually) do better if they were allowed to switch while everyone else had to leave their choice unaltered.
One thing I can't understand. Considering we've built Clippy, we gave it a set of values and we've asked it to maximise paperclips, how can it possibly imagine we would be unhappy about its actions? I can't help thinking that from Clippy's point of view, there's no dilemma: we should always agree with its plan and therefore give it carte blanche. What am I getting wrong?
7 years late, but you're missing the fact that (C,C) is universally better than (D,D). Thus whatever logic is being used must have a flaw somewhere because it works out worse for everyone - a reasoning process that successfully gets both parties to cooperate is a WIN. (However, in this setup it is the case that actually winning would be either (D,C) or (C,D), both of which are presumably impossible if we're equally rational.)

Damnit, Eliezer nitpicked my nitpicking. :)

I agree: Defect!

Clearly the paperclip maximizer should just let us have all of substance S; but a paperclip maximizer doesn't do what it should, it just maximizes paperclips.

I sometimes feel that nitpicking is the only contribution I'm competent to make around here, so... here you endorsed Steven's formulation of what "should" means; a formulation which doesn't allow you to apply the word to paperclip maximizers.

Plato had a concept of "forms". Forms are ideal shapes or abstractions: every dog is an imperfect instantiation of the "dog" form that exists only in our brains.

Mmm. I believe Plato saw the forms as being real things existing "in heaven" rather than merely in our brains. It wasn't a stupid theory for its day; in particular, a living thing growing into the right shape or form must have seemed utterly mysterious, and so the idea that some sort of blueprint was laid out in heaven must have had a lot of appeal.

But anyway, forms as... (read more)

Boiling it down to essentials, it looks to me like the key move is this:

  • If we can prove X, then we can prove Y.
  • Therefore, if X is true, then we can prove Y.

But this doesn't follow - X could be true but not provable.

Is that right? It's ages since I did logic, and never to a deep level, so excuse me if this is way off.
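One way to write the step down, as a sketch only: the box notation for "is provable" is an assumption brought in here, not something used in the thread (in LaTeX, `\Box` needs the `amssymb` package).

```latex
% \Box X reads "X is provable in the system under discussion".
\[
  (\Box X \rightarrow \Box Y) \;\not\Rightarrow\; (X \rightarrow \Box Y)
\]
% Getting from the left side to the right side would need
%   X \rightarrow \Box X   ("everything true is provable"),
% which is the step that fails: soundness only gives the converse,
%   \Box X \rightarrow X.
```

For a consistent system strong enough to express arithmetic, Gödel's first incompleteness theorem supplies sentences that are true but not provable, so the missing premise cannot simply be assumed.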

Eliezer, I think I kind-of understand by now why you don't call yourself a relativist. Would you say that it's the "psychological unity of mankind" that distinguishes you from relativists?

A relativist would stress that humans in different cultures all have different - though perhaps related - ideas about "good" and "right" and so on. I believe your position is that the bulk of human minds are similar enough that they would arrive at the same conclusions given enough time and access to enough facts; and therefore, that it's an ... (read more)

It's a datum (which any adequate metaethical theory must account for) that there can be substantive moral disagreement. When Bob says "Abortion is wrong", and Sally says, "No it isn't", they are disagreeing with each other.

I wonder though: is this any more mysterious than a case where two children are arguing over whether strawberry or chocolate ice cream is better?

In that case, we would happily say that the disagreement comes from their false belief that it's a deep fact about the universe which ice cream is better. If Eliezer is right (I'm still agnostic about this), wouldn't moral disagreements be explained in an analogous way?

This post hits me far more strongly than the previous ones on this subject.

I think your main point is that it's positively dangerous to believe in an objective account of morality, if you're trying to build an AI. Because you will then falsely believe that a sufficiently intelligent AI will be able to determine the correct morality - so you don't have to worry about programming it to be friendly (or Friendly).

I'm sure you've mentioned this before, but this is more forceful, at least to me. Thanks.

Personally, even though I've mentioned that I thought there ... (read more)

And I may not know what this question is, actually; I may not be able to print out my current guess nor my surrounding framework; but I know, as all non-moral-relativists instinctively know, that the question surely is not just "How can I do whatever I want?"

I'm not sure you've done enough to get away from being a "moral relativist", which is not the same as being an egoist who only cares about his own desires. "Moral relativism" just means this (Wikipedia):

In philosophy, moral relativism is the position that moral or ethical ... (read more)

It's fairly clear that, at least according to EY, the blobs are universal across all humans.

Myself: I can't help but wonder about anthropic effects here. It might be the case that nuclear-armed species annihilate themselves with high probability (say 50% per decade), but of course, all surviving observers live on planets where it hasn't happened through sheer chance.

Just to expand on this (someone please stop me if this sort of speculative post is irritating):

Imagine there are a hundred Earths (maybe because of MWI, or because the universe is infinite, or whatever). Let's say there's a 90% chance of nuclear war before 2008, and such a war would re... (read more)

Time has passed, and we still haven't blown up our world, despite a close call or two.

I can't help but wonder about anthropic effects here. It might be the case that nuclear-armed species annihilate themselves with high probability (say 50% per decade), but of course, all surviving observers live on planets where it hasn't happened through sheer chance.

(Though on the other hand, if an all-out nuclear war is survivable for a species like ours, then this line of thought wouldn't work.)
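A rough sketch of how the selection effect depends on survivability. The 50% per decade figure is the one suggested above; the six-decade span and the survivor shares are assumptions for illustration, and the function name is hypothetical.

```python
def share_of_observers_without_war(p_war_per_decade, decades, survivor_share):
    """Fraction of living observers whose world has had no nuclear war.

    survivor_share is the assumed fraction of a world's observers still
    alive after a war (0.0 means an all-out war leaves no observers).
    """
    p_peace = (1 - p_war_per_decade) ** decades
    p_war = 1 - p_peace
    observers_in_peace = p_peace            # everyone in a peaceful world survives
    observers_in_war = p_war * survivor_share
    return observers_in_peace / (observers_in_peace + observers_in_war)


# War leaves no observers: every living observer sees an unbroken peace,
# however unlikely that peace was beforehand.
unsurvivable = share_of_observers_without_war(0.5, 6, 0.0)

# War leaves half the observers alive: most living observers remember a war,
# so an unbroken peace does tell us something about the risk.
survivable = share_of_observers_without_war(0.5, 6, 0.5)

print(unsurvivable, survivable)
```

In the first case the observed peace carries no information about the per-decade risk; in the second, observers in peaceful worlds are a small minority, which is why the argument stops working if such a war is survivable.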

Poke, can you expand a little on what you're driving at?

Also, Steven, how on Earth is that statement true under MWI? :)

We do not know very well how the human mind does anything at all. But that the human mind comes to have preferences that it did not have initially, cannot be doubted.

I believe Eliezer is trying to create "fully recursive self-modifying agents that retain stable preferences while rewriting their source code". Like Sebastian says, getting the "stable preferences" bit right is presumably necessary for Friendly AI, as Eliezer sees it.

(This clause "as Eliezer sees it" isn't meant to indicate dissent, but merely my total incompetence to judge whether this condition is strictly necessary for friendly AI.)

I am assuming [the AI] acts, and therefore makes choices, and therefore has preferences, and therefore can have preferences which conflict with the preferences of other minds (including human minds).

An AI can indeed have preferences that conflict with human preferences, but if it doesn't start out with such preferences, it's unclear how it comes to have them later.

On the other hand, if it starts out with dubious preferences, we're in trouble from the outset.

Eliezer [in response to me]: This just amounts to defining should as an abstract computation, and then excluding all minds that calculate a different rule-of-action as "choosing based on something other than morality". In what sense is the morality objective, besides the several senses I've already defined, if it doesn't persuade a paperclip maximizer?

I think my position is this:

If there really was such a thing as an objective morality, it would be the case that only a subset of possible minds could actually discover or be persuaded of that fact.

Presumably, for any objective fact, there are possible minds who could never be convinced of that fact.

Eliezer: It's because when I say right, I am referring to a 1-place function

Like many others, I fall over at this point. I understand that Morality_8472 has a definite meaning, and therefore it's a matter of objective fact whether any act is right or wrong according to that morality. The problem is why we should choose it over Morality_11283.

Of course you can say, "according to Morality_8472, Morality_8472 is correct" but that's hardly helpful.

Ultimately, I think you've given us another type of anti-realist relativism.

Eliezer: But if you were ste... (read more)

I hope the priests of Baal checked that it was indeed water, and not some sort of accelerant.

Seeing as this was on a mountain top (Mt Carmel) subject to all kinds of electrical weirdness, the water was probably to act as a lightning rod.

In more detail, suppose there was in fact no conspiracy and Oswald was a lone, self-motivated individual. It might still turn out that the simplest way to imagine what would have happened if Oswald had not killed Kennedy, would be to imagine that there was in fact a conspiracy, and they found someone else, who did the job in the same way. That would arguably be the change which would minimize total forward and backward alterations to the timeline.

Hal: what you describe is called "backtracking" in the philosophical literature. It's not usually see... (read more)

Hmm, the second bit I just wrote isn't going to work, I suppose, since your knowledge of what came after the event will affect whether you believe in a conspiracy or not...

Oh, and to talk about "the probability that John F. Kennedy was shot, given that Lee Harvey Oswald didn't shoot him", we write:

P(Kennedy_shot|Oswald_not)

If I've understood you, this is supposed to be a high value near 1. I'm just a noob at Bayesian analysis or Bayesian anything, so this was confusing me until I realised I also had to include all the other information I know: i.e. all the reports I've heard that Kennedy actually was shot, that someone else became president, and so on.

It seems like this would be a case where it's genuinely helpful ... (read more)

None of us can say what our descendants will or will not do, but there is no reason to believe that any particular part of human nature will be worthy in their eyes. [emphasis mine]

I can see one possible reason: we might have some influence over what they think.

there is no such thing as a probability that isn't in any mind.

Hmm. Doesn't quantum mechanics (especially if we're forgetting about MWI) give us genuine, objective probabilities?

Once you unwind past evolution and true morality isn't likely to contain [...]

I think either a word has been missed out here, or "and" should be "then".

If I recall correctly, I did ask myself that, and sort of waved my hands mentally and said, "It just seems like one of the best guesses - I mean, I don't know that people are valuable, but I can't think of what else could be."

I find this fairly ominous, since that handwaved belief happens to be my current belief: that conscious states are the only things of (intrinsic) value: since only those conscious states can contain affirmations or denials that whatever they're experiencing has value.

ES: for the puzzle to make sense we have to assume that the islanders have no memory of exactly when they came to be on the island.

I don't see what difference that makes. All that matters is that everyone is present before it's announced that someone has blue eyes, and everyone has made an accurate count of how many other people have blue eyes, and nobody knows their own eye colour.
