Note: The following post is a cross between humor and seriousness.

After reading another reference to an AI failure, it seems to me that almost every "The AI is an unfriendly failure" story begins with "The Humans are wasting too many resources, which I can more efficiently use for something else."

I felt like I should also consider potential solutions that look at the next type of failure. My initial reasoning is: Assuming that a bunch of AI researchers are determined to avoid that particular failure mode and only that one, they're probably going to run into other failure modes as they attempt (and probably fail) to bypass that.

For instance: AI Researchers build an AI that gains utility roughly equivalent to Square Root(Median Human Profligacy) times Human Population times Time, is dumb about Metaphysics, and has a fixed utility function.

It's not happier if the top Human doubles his energy consumption. (Note: Median Human Profligacy.)

It's happier, but not twice as happy, when Humans are using twice as many petawatt-hours per year. (Note: Square Root. This also helps prevent one Human killing all other Humans from space and setting the Earth on fire from being a good use of energy. That skyrockets the Median, but it does not skyrocket the Square Root of the Median nearly as much.)

It's five times as happy if there are five times as many Humans, and ten times as happy when Humans use the same amount of energy per year for 10 years as opposed to just 1.
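
A minimal sketch of that utility function, to make the four properties above concrete. The function name, the energy units, and the sample population are all made up for illustration; nothing here is meant as more than a toy reading of the formula.

```python
import math
from statistics import median

def utility(energy_per_person, years=1.0):
    """Toy reading of: sqrt(median per-person energy use) * population * time.

    energy_per_person: one yearly energy figure per human (hypothetical units).
    """
    if not energy_per_person:
        return 0.0
    return math.sqrt(median(energy_per_person)) * len(energy_per_person) * years

# Hypothetical population: 99 people using 4 units each, one using 1000.
base = [4.0] * 99 + [1000.0]
u_base = utility(base)

# The top Human doubles his consumption: the median does not move.
u_top_doubles = utility([4.0] * 99 + [2000.0])

# Everyone doubles: utility grows by sqrt(2), not by 2.
u_all_double = utility([2 * e for e in base])

# Five times as many Humans, and ten years instead of one: both linear.
u_five_times_people = utility(base * 5)
u_ten_years = utility(base, years=10.0)

print(u_base, u_top_doubles, u_all_double / u_base)
print(u_five_times_people / u_base, u_ten_years / u_base)
```

The median and the square root only dampen the energy term. Population and time still enter linearly, so those are the cheapest levers for this AI to push on.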

Dumb about metaphysics is a reference to the following type of AI failure: "I'm not CERTAIN that there are actually billions of Humans, we might be in the matrix, and if I don't know that, I don't know if I'm getting utility, so let me computronium up earth really quick just to run some calculations to be sure of what's going on." Assume the AI just disregards those kinds of skeptical hypotheses, because it's dumb about metaphysics. Also assume it can't change its utility function, because that's just too easy to combust.

As I stated, this AI has bunches of failure modes. My question is not "Does it Fail?" but "Does it even sound like it avoids having 'eat humans, make computronium' be the most plausible failure? If so, what sounds like a plausible failure?"

Example Hypothetical Plausible Failure: The AI starts murdering environmentalists because it fears that environmentalists will cause an overall degradation in Median human energy use, which will lower overall AI utility. Environmentalists also encourage less population growth, which further degrades AI utility. The AI does value the environmentalists' own energy consumption, which boosts utility, but they're environmentalists, so they have a small energy footprint, and it doesn't value not murdering people in and of itself.

After considering that kind of solution, I went back up and changed 'my reasoning' to 'my initial reasoning', because at some point I realized I was just having fun considering this kind of AI failure analysis and had stopped actually trying to make a point. Also, as Failed Utopia 4-2 points out in http://lesswrong.com/lw/xu/failed_utopia_42/ designing more interesting failures can be fun.

Edit for clarity: I AM NOT IMPLYING THE ABOVE AI IS OR WILL CAUSE A UTOPIA. I don't think it could be read that way, but just in case there are inferential gaps, I should close them.

it seems to me that almost every "The AI is an unfriendly failure" story begins with "The Humans are wasting too many resources, which I can more efficiently use for something else."

Really? I think the one I see most is "I am supposed to make humans happy, but they fight with each other and make themselves unhappy, so I must kill/enslave all of them". At least in Hollywood. You may be looking in more interesting places.

Per your AI, does it have an obvious incentive to help people below the median energy level?

Armok_GoB, 7y: Well, even if it turned out to do exactly what its designers were thinking (which it won't), it'd still be unfriendly for the simple fact that no remotely optimal future likely involves humans with big energy consumptions. The FAI almost certainly should eat all humans for computronium; the only difference is that the friendly one will scan their brains first and make emulations.

Open thread, July 16-22, 2013

by David_Gerard, 1 min read, 15th Jul 2013, 305 comments


If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Given the discussion thread about these, let's try calling this a one-week thread, and see if anyone bothers starting one next Monday.