


Correction: I don't think it helps Bostrom's position to overload the concept of "friendly" with the connotations of close friendship.


Carl: "This point is elementary. A “friend” who seeks to transform himself into somebody who wants to hurt you, is not your friend."

The switch from "friendly" (having kindly interest and goodwill; not hostile) to a "friend" (one attached to another by affection or esteem) is problematic. To me it radically distorts the meaning of FAI and makes this pithy little sound-bite irrelevant. I don't think it helps Bostrom's position to overload the concept of friendship with the connotations of close friendship.

Exactly how much human bias and irrationality is needed to sustain our human concept of "friend", and is that a level of irrationality that we'd want in a superintelligence? Can the human concept of friendship (involving extreme loyalty and trust in someone we've happened to have known for some time and perhaps profitably exchanged favours with) be applied to the relationship between a computer and a whole species?

I can cope with the concept of "friendly" AI (kind to humans and non-hostile), but I have difficulty applying the distinct English word "friend" to an AI.

Suggested listening: Tim Minchin - If I Didn't Have You


Kip Werking: "P2. But, all we have to prove that giving to charity, etc., is right, is that everyone thinks it is"

You're stating that there exists no other way to prove that giving to charity is right. That's an omniscient claim.

Still, it's unlikely to be defeated in the space of a comment thread, simply because your sweeping generalization about the goodness of charity is far from being universally accepted. A very general claim like that, with no concrete scenario, no background information on where it is to be applied, makes relativism a foregone conclusion.

I'd like to hear your arguments for something a little more fundamental. Would you apply the same reasoning to the goodness of life? Would you be prepared to claim that
"all we have to prove that your life is better than your death, is that everyone thinks it is"?

And with regard to the joy/sorrow question, would you be prepared to claim that
"all we have to prove that your not suffering is better than your suffering, is that everyone thinks it is"?


komponisto: "I'm really having trouble understanding how this isn't tantamount to moral relativism"

I think I see an element of confusion here in the definition of moral relativism. A moral relativist holds that "no universal standard exists by which to assess an ethical proposition's truth". However, the word universal in this context (moral philosophy) is only expected to apply to all possible humans, not all conceivable intelligent beings. (Of all the famous moral relativist philosophers, how many have addressed the morals of general non-human intelligences?)

So we can ask two different questions:

#1. Is there a standard by which we can assess an ethical proposition's truth that applies to all humans?

#2. Is there a standard by which we can assess an ethical proposition's truth that applies to all conceivable intelligent beings?

I expect that Eliezer would answer yes to #1 and no to #2.

If you interpret universal in the broader sense (#2), then Eliezer would indeed be a moral relativist, but I think that distorts the concept of moral relativism, since the philosophy was developed with only humans of different cultures in mind.


Larry D'Anna: "And it doesn't do any good to say that they aren't defective. They aren't defective from a human, moral point of view, but that's not the point. From evolutions view, there's hardly anything more defective, except perhaps a fox that voluntarily restrains it's own breeding."

Why is it "not the point"? In this discussion we are talking about differences in moral computation as implemented within individual humans. That the blind idiot's global optimization strategy defines homosexuality as a defect is of no relevance.

Larry D'Anna: "I'm not sure if I see where the complex adaptation is here. Some people have more empathy, some less. Even if the difference is supposed to be genetic, there seem to be a lot of these flexible parameters in our genome."

I wasn't claiming a complex adaptation. I was claiming "other computations that could exhibit a superficial unity, but with a broad spread."

I think we are already in substantial agreement, and having seen Eliezer's last comment, I see that much of what I've been rambling on about comes from reading more than was warranted into the last paragraphs of his blog entry.


Eliezer: "The basic ev-bio necessity behind the psychological unity of human brains is not widely understood."

I agree. And I think you've over-emphasized the unity and ignored evidence of diversity, explaining it away as defects.

Eliezer: "And even more importantly, the portion of our values that we regard as transpersonal, the portion we would intervene to enforce against others, is not all of our values; it's not going to include a taste for pepperoni pizza, or in my case, it's not going to include a notion of heterosexuality or homosexuality."

I think I failed to make my point clearly on the idea of a sexual orientation pill. I didn't want to present homosexuality as a moral issue for your judgment, but as an example of the psychological non-unity of human brains. Many people have made the mistake of assuming that heterosexuality is "normal" and that homosexuals need re-education or "fixing" (defect removal). I hold that sexuality is distributed over a spectrum. The modes of that distribution represent heterosexual males and females--an evolutionarily stable pattern. The other regions of that distribution remain non-zero despite selection pressure.

Clearly we do not have a psychological unity of human sexual preference. People with various levels of homosexual preference are not merely defective or badly educated/informed.

Sexuality is a complex feature arising in a complex brain. Because of the required mutual compatibility of brain-construction genetics, we can be sure that we all have extremely similar machinery, but our sexual dimorphism requires that a single set of genes can code flexibly for either male or female. Since that flexibility doesn't implement a pure binary male/female switch, we find various in-between states in both physical and mental machinery. The selection pressure from sexual dimorphism means we should expect far more non-unity of sexual preference than in other areas of our psychology.

But the fact that our genes can code for that level of flexibility, yet still remain biologically compatible, tells us that there are likely to be many other computations that could exhibit a superficial unity, but with a broad spread. The spread of propensities to submit to authority and override empathy observed in the Stanford Prison experiment gives good reason to question the supposed unity. (Yes, I know you could shoehorn that diversity into your model as an effect of lack of moral training.)

Now let's reconsider psychopathy or the broader diagnosis of antisocial personality disorder. What should we do with those humans whose combination of narcissism and poor social cognition is beyond some particular limit? Lock them up, or elect them to govern? From my limited understanding of the subject, it seems that the "condition" is considered untreatable.

It's an easy path to stick to the psychological unity of humans and declare those in the tails of the distribution to be defective. But is it statistically justified? Does your unity model actually fit the data or just give a better model than the tabula rasa model that Tooby and Cosmides reacted against?

That's why the idea of some idealised human moral computation, that everyone would agree to if they knew enough and weren't defective, seems like question begging. That's why I was asking for the empirical data that have led you to update your beliefs. Then I could update mine from them and maybe we'd be in agreement.

I'm open to the idea that we can identify some best-fit human morality--a compromise that minimizes the distance/disagreement to our population of (veridically informed) humans. That seems to me to be the best we can do.


Eliezer: "But this would be an extreme position to take with respect to your fellow humans, and I recommend against doing so. Even a psychopath would still be in a common moral reference frame with you, if, fully informed, they would decide to take a pill that would make them non-psychopaths. If you told me that my ability to care about other people was neurologically damaged, and you offered me a pill to fix it, I would take it."

How sure are you that most human moral disagreements are attributable to

  • lack of veridical information, or
  • lack of ability/tools to work through that information, or
  • defects?

You talk freely about psychopaths and non-psychopaths as though these were distinct categories of non-defective and defective humans. I know you know this is not so. The arguments about psychological unity of humankind only extend so far. e.g., would you be prepared to tell a homosexual that, if they were fully informed, they would decide to take a pill to change their orientation? There are demonstrable differences where our fellow humans really are "different optimization processes". Why should we ignore the spread of differences in moral computations?

I've been enjoying your OB posts and your thought experiments are powerful, but I'm curious as to the empirical data that have led you to update your beliefs so strongly in favour of psychological unity and so strongly against differences in computation. Your arguments that mention psychopaths smack a little of a "no true Scotsman" definition of human morality.


It seems that the Pebblesorting People had no problems with variations in spelling of their names. (Biko=Boki)

Good parable though, Eliezer.


Imagine the year 2100

AI Prac Class

Task: (a) design and implement a smarter-than-human AI using only open source components; (b) ask it to write up your prac report.

Time allotted: 4 hours

Bonus points: disconnect your AI host from all communications devices; place your host in a Faraday cage; disable your AI's morality module; find a way to shut down the AI without resorting to triggering the failsafe host self-destruct.

sophiesdad, since a human today could not design a modern microprocessor (without using the already-developed plethora of design tools), your assertion that a human will never design a smarter-than-human machine is safe but uninformative. Humans use smart tools to make smarter tools. It's reasonable to predict that smarter-than-human machines will be made only by a collaboration of humans and existing smart machines.

Speculation on whether "smart enough to self improve" comes before or after the smart-as-a-human mark on some undefined 1-dimensional smartness scale is fruitless. By the look of what you seem to endorse by quoting your unnamed correspondent, your definition of "smart" makes comparison with human intelligence impossible.


HA: "I aspire not to care about rescuing toddlers from burning orphanages. There seems to be good evidence they're not even conscious, self-reflective entities yet."

HA, do you think that only the burning toddler matters? Don't the carers from the orphanage have feelings? Will they not suffer on hearing about the death of someone they've cared for?

Overcoming bias does not mean discarding empathy. If you aspire to jettison your emotions, I wonder how you'll make an unbiased selection of which ones you don't need.
