
Status: Partly in response to We Don't Trade With Ants, partly in response to watching others try to make versions of this point that I didn't like. None of this is particularly new; it feels to me like repeating obvious claims that have regularly been made in comments elsewhere, and are probably found in multiple parts of the LessWrong sequences. But I've been repeating them aloud a bunch recently, and so might as well collect the points into a single post.

This post is an answer to the question of why an AI that was truly indifferent to humanity (and sentient life more generally), would destroy all Earth-originated sentient life.

Might the AGI let us live, not because it cares but because it has no particular reason to go out of its way to kill us?

As Eliezer Yudkowsky once said:

The AI does not hate you, nor does it love you, but you are made of atoms which it can use for something else.

There's lots of energy in the biosphere! (That's why animals eat plants and animals for fuel.) By consuming it, you can do whatever else you were going to do better or faster.

(Last I checked, you can get about 10x as much energy from burning a square meter of biosphere as you can get by collecting a square meter of sunlight for a day. But I haven't done the calculation for years and years and am pulling that straight out of a cold cache. That energy boost could yield a speedup (in your thinking, or in your technological design, or in your intergalactic probes themselves), which translates into extra galaxies you manage to catch before they cross the cosmic event horizon!)
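A rough back-of-envelope for that cached 10x figure, using round numbers that are my own assumptions rather than anything from the post (land-averaged standing dry biomass of very roughly 8 kg per square meter, a combustion energy of ~18 MJ/kg, and day-and-night-averaged sunlight of ~200 W per square meter):

```python
# Sanity check of "burning a square meter of biosphere ~ 10x a day of sunlight
# on that square meter". All inputs are coarse assumptions, not measured values.

dry_biomass_kg_per_m2 = 8.0   # assumed land-averaged standing dry biomass
energy_MJ_per_kg = 18.0       # typical heat of combustion for dry plant matter
solar_W_per_m2 = 200.0        # insolation averaged over day/night and latitude

biomass_MJ = dry_biomass_kg_per_m2 * energy_MJ_per_kg    # ~144 MJ per m^2
sunlight_MJ_per_day = solar_W_per_m2 * 86_400 / 1e6      # ~17 MJ per m^2 per day

print(f"burning the biomass:  ~{biomass_MJ:.0f} MJ")
print(f"one day of sunlight:  ~{sunlight_MJ_per_day:.0f} MJ")
print(f"ratio:                ~{biomass_MJ / sunlight_MJ_per_day:.0f}x")
```

With these inputs the ratio comes out around 8x, so the cached 10x is at least the right order of magnitude, though the answer swings a lot with how much biomass you assume per square meter.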

But there's so little energy here, compared to the rest of the universe. Why wouldn't it just leave us be, and go mine asteroids or something?

Well, for starters, there's quite a lot of energy in the sun, and if the biosphere isn't burned for fuel then it will freeze over when the AI wraps the sun in a Dyson sphere or otherwise rips it apart. It doesn't need to consume your personal biomass to kill you; consuming the sun works just fine.

And separately, note that if the AI is actually completely indifferent to humanity, the question is not "is there more energy in the biosphere or in the sun?", but rather "is there more energy available in the biosphere than it takes to access that energy?". The AI doesn't have to choose between harvesting the sun and harvesting the biosphere, it can just harvest both, and there's a lot of calories in the biosphere.

I still just think that it might decide to leave us be for some reason.

The answers above are sufficient to argue that the AI kills us (if the AI's goals are orthogonal to ours, and can be better achieved with more resources). But the answer is in fact overdetermined, because there's also the following reason.

A humanity that just finished coughing up a superintelligence has the potential to cough up another superintelligence, if left unchecked. Humanity alone might not stand a chance against a superintelligence, but the next superintelligence humanity builds could in principle be a problem. Disassembling us for parts seems likely to be easier than building all your infrastructure in a manner that's robust to whatever superintelligence humanity coughs up next. Better to nip that problem in the bud.[1]

But we don't kill all the cows.

Sure, but the horse population fell dramatically with the invention of the automobile.

One of the big reasons that humans haven't disassembled cows for spare parts is that we aren't yet skilled enough to reassemble those spare parts into something that is more useful to us than cows. We are trying to culture meat in labs, and when we do, the cow population might also fall off a cliff.

A sufficiently capable AI takes you apart instead of trading with you at the point that it can rearrange your atoms into an even better trading partner.[2] And humans are probably not the optimal trading partners.

But there's still a bunch of horses around! Because we like them!

Yep. The horses that are left around after they stopped being economically useful are around because some humans care about horses, and enjoy having them around.

If you can make the AI care about humans, and enjoy having them around (more than it enjoys having-around whatever plethora of puppets it could build by disassembling your body and rearranging the parts), then you're in the clear! That sort of AI won't kill you.

But getting the AI to care about you in that way is a big alignment problem. We should totally be aiming for it, but that's the sort of problem that we don't know how to solve yet, and that we don't seem on-track to solve (as far as I can tell).

Ok, maybe my objection is that I expect it to care about us at least a tiny bit, enough to leave us be.

This is a common intuition! I won't argue against it in depth here, but I'll leave a couple points in parting:

  1. my position is that making the AI care a tiny bit (in the limit of capability, under reflection) is almost as hard as the entire alignment problem, and we're not on track to solve it.
  2. if you want to learn more about why I think that, some relevant search terms are "the orthogonality thesis" and "the fragility of value".

  1. And disassembling us for spare parts sounds much easier than building pervasive monitoring that can successfully detect and shut down human attempts to build a competing superintelligence, even as the humans attempt to subvert those monitoring mechanisms. Why leave clever antagonists at your rear? ↩︎

  2. Or a drone that doesn't even ask for payment, plus extra fuel for the space probes or whatever. Or actually before that, so that we don't create other AIs. But whatever. ↩︎

98 comments

I think an AI takeover is reasonably likely to involve billions of deaths, but it's more like a 50% than a 99% chance. Moreover, I think this post is doing a bad job of explaining why the probability is more like 50% than 1%.

  • First, I think you should talk quantitatively. How many more resources can an AI get by killing humans? I'd guess the answer is something like 1 in a billion to 1 in a trillion.
    • If you develop as fast as possible you will wreck the human habitat and incidentally kill a lot of people. It's pretty complicated to figure out exactly how much "keep Earth livable enough for human survival" will slow you down, since it depends a lot on the dynamics of the singularity. I would guess more like a month than a year, which results in a minuscule reduction in available resources (see the rough sketch below). I think that (IMO implausible) MIRI-style views would suggest more like hours or days than months.
      • Incidentally, I think "byproducts of rapid industrialization trash Earth's climate" is both much more important than the Dyson sphere and much more intuitively plausible.
    • You can get energy from harvesting the biosphere, and you can use it to develop slightly faster. This is a rounding error compa
[...]
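(A rough sketch of the scale Paul is gesturing at in the bullets above, using my own round cosmological numbers rather than anything from his comment: treat the reachable universe as a sphere of comoving radius ~16 billion light-years whose radius shrinks by about one light-year per year of delay, and assume reachable galaxies scale with that comoving volume.)

```python
# Crude estimate of the fraction of reachable galaxies forfeited by a one-month
# delay, assuming reachable resources scale with the comoving volume of a sphere
# whose radius shrinks by ~1 light-year per year of waiting. Inputs are rough.

reach_ly = 16e9            # assumed comoving radius of the still-reachable region
shrink_ly_per_year = 1.0   # radius lost per year of delay
delay_years = 1 / 12       # a one-month delay

radius_fraction_lost = shrink_ly_per_year * delay_years / reach_ly
volume_fraction_lost = 3 * radius_fraction_lost   # d(r^3)/r^3 ~= 3*dr/r for small dr

print(f"fraction of reachable volume lost: ~{volume_fraction_lost:.1e}")  # ~1.6e-11
```

That lands inside the "1 in a billion to 1 in a trillion" range quoted above, which is the sense in which a month-scale slowdown costs only a sliver of the total resources.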
[-]So8res3830
  • Confirmed that I don't think about this much. (And that this post is not intended to provide new/deep thinking, as opposed to aggregating basics.)
  • I don't particularly expect drawn-out resource fights, and suspect our difference here is due to a difference in beliefs about how hard it is for single AIs to gain decisive advantages that render resource conflicts short.
  • I consider scenarios where the AI cares a tiny bit about something kinda like humans to be moderately likely, and am not counting scenarios where it builds some optimized facsimile as scenarios where it "doesn't kill us". (in your analogy to humans: it looks to me like humans who decided to preserve the environment might well make deep changes, e.g. to preserve the environment within the constraints of ending wild-animal suffering or otherwise tune things to our aesthetics, where if you port that tuning across the analogy you get a facsimile of humanity rather than humanity at the end.)
  • I agree that scenarios where the AI saves our brain-states and sells them to alien trading partners are plausible. My experience with people asking "but why would the AI kill us?" is that they're not thinking "aren't there aliens out t
[...]
4Quadratic Reciprocity
Why is aliens wanting to put us in a zoo more plausible than the AI wanting to put us in a zoo itself?  Edit: Ah, there are more aliens around so even if the average alien doesn't care about us, it's plausible that some of them would?
6MinusGix
https://www.lesswrong.com/posts/HoQ5Rp7Gs6rebusNP/superintelligent-ai-is-necessary-for-an-amazing-future-but-1#How_likely_are_extremely_good_and_extremely_bad_outcomes_
3[anonymous]
From the last bullet point: "it doesn't much matter relative to the issue of securing the cosmic endowment in the name of Fun." Part of the post seems to be arguing against the position "The AI might take over the rest of the universe, but it might leave us alone." Putting us in an alien zoo is pretty equivalent to taking over the rest of the universe and leaving us alone. It seems like the last bullet point pivots from arguing that AI will definitely kill us to arguing that even if it doesn't kill us, the outcome is still pretty bad.
[-]So8res2610

This whole thread (starting with Paul's comment) seems to me like an attempt to delve into the question of whether the AI cares about you at least a tiny bit. As explicitly noted in the OP, I don't have much interest in going deep into that discussion here.

The intent of the post is to present the very most basic arguments that if the AI is utterly indifferent to us, then it kills us. It seems to me that many people are stuck on this basic point.

Having bought this (as it seems to me like Paul has), one might then present various galaxy-brained reasons why the AI might care about us to some tiny degree despite total failure on the part of humanity to make the AI care about nice things on purpose. Example galaxy-brained reasons include "but what about weird decision theory" or "but what if aliens predictably wish to purchase our stored brainstates" or "but what about it caring a tiny degree by chance". These are precisely the sort of discussions I am not interested in getting into here, and that I attempted to ward off with the final section.

In my reply to Paul, I was (among other things) emphasizing various points of agreement. In my last bullet point in particular, I was emphasizing that, while I find these galaxy-brained retorts relatively implausible (see the list in the final section), I am not arguing for high confidence here. All of this seems to me orthogonal to the question of "if the AI is utterly indifferent, why does it kill us?".

Most people care a lot more about whether they and their loved ones (and their society/humanity) will in fact be killed than whether they will control the cosmic endowment. Eliezer has been going on podcasts saying that with near-certainty we will not see really superintelligent AGI because we will all be killed, and many people interpret your statements as saying that. And Paul's arguments do cut to the core of a lot of the appeals to humans keeping around other animals.

If it is false that we will almost certainly be killed (which I think is right; I agree with Paul's comment approximately in full), and one believes that, then saying we will almost certainly be killed would be deceptive rhetoric that could scare people who care less about the cosmic endowment into worrying more about AI risk. Since you're saying you care much more about the cosmic endowment, and since in practice this talk is shaped to have the effect of persuading people to do the thing you would prefer, it's quite important whether you believe the claim for good epistemic reasons. That is important for disclaiming the hypothesis that this is something being misleadingly presented, or drifted into because of its rhetorical convenience, without vetting it (where you would vet it if it were rhetorically inconvenient).

I think being right on this is important for the same sorts of reasons climate activists should not falsely say that failing to meet the latest emissions target on time will soon thereafter kill 100% of humans.

This thread continues to seem to me to be off-topic. My main takeaway so far is that the post was not clear enough about how it's answering the question "why does an AI that is indifferent to you, kill you?". In attempts to make this clearer, I have added the following to the beginning of the post:

This post is an answer to the question of why an AI that was truly indifferent to humanity (and sentient life more generally), would destroy all Earth-originated sentient life.

I acknowledge (for the third time, with some exasperation) that this point alone is not enough to carry the argument that we'll likely all die from AI, and that a key further piece of argument is that AI is not likely to care about us at all. I have tried to make it clear (in the post, and in comments above) that this post is not arguing that point, while giving pointers that curious people can use to get a sense of why I believe this. I have no interest in continuing that discussion here.

I don't buy your argument that my communication is misleading. Hopefully that disagreement is mostly cleared up by the above.

In case not, to clarify further: My reason for not thinking in great depth about this issue is that I ... (read more)

 I assign that outcome low probability (and consider that disagreement to be off-topic here).
 

Thank you for the clarification. In that case my objections are on the object-level.

 

This post is an answer to the question of why an AI that was truly indifferent to humanity (and sentient life more generally), would destroy all Earth-originated sentient life.

This does exclude random small terminal valuations of things involving humans, but leaves out the instrumental value for trade and science, and uncertainty about how other powerful beings might respond. I know you did an earlier post with your claims about trade for some human survival, but as Paul says above it's a huge point for such small shares of resources. Given that kind of claim, much of Paul's comment still seems very on-topic (e.g. his bullet point ...
 


Insofar as you're arguing that I shouldn't say "and then humanity will die" when I mean something more like "and then humanity will be confined to the solar system, and shackled forever to a low tech level", I agree, and


Yes, close to this (although more like 'gets a small resource share' than necessarily confinement to the solar system or low tech level, both of ... (read more)

[-]dxu351

RE: decision theory w.r.t. how "other powerful beings" might respond - I really do think Nate has already argued this, and his arguments continue to seem more compelling to me than the opposition's. Relevant quotes include:

It’s possible that the paperclipper that kills us will decide to scan human brains and save the scans, just in case it runs into an advanced alien civilization later that wants to trade some paperclips for the scans. And there may well be friendly aliens out there who would agree to this trade, and then give us a little pocket of their universe-shard to live in, as we might do if we build an FAI and encounter an AI that wiped out its creator-species. But that's not us trading with the AI; that's us destroying all of the value in our universe-shard and getting ourselves killed in the process, and then banking on the competence and compassion of aliens.

[...]

Remember that it still needs to get more of what it wants, somehow, on its own superintelligent expectations. Someone still needs to pay it. There aren’t enough simulators above us that care enough about us-in-particular to pay in paperclips. There are so many things to care about! Why us, rather than

[...]
1[comment deleted]
1[comment deleted]
1TekhneMakre
As the AI becomes more coherent, it has more fixed values. When values are fixed and the AI is very superintelligent, the preferences will be very strongly satisfied. "Caring a tiny bit about something about humans" seems not very unlikely. But even if "something about humans" can correlate strongly with "keep humans alive and well" for low intelligence, it would come apart at very high intelligence. However the AI chooses its values, why would they be pointed at something that keeps correlating with what we care about, even at superintelligent levels of optimization?

If you condition on misaligned AI takeover, my current (extremely rough) probabilities are:

  • 50% chance the AI kills > 99% of people
  • Conditional on killing >99% of people, 2/3 chance the AI kills literally everyone

Edit: I now think mass death and extinction are notably less likely than these probabilities. Perhaps more like 40% on >50% of people killed and 20% on >99% of people killed.

By 'kill' here I'm not including things like 'the AI cryonically preserves everyone's brains and then revives people later'. I'm also not including cases where the AI lets everyone live a normal human lifespan but fails to grant immortality or continue human civilization beyond this point.

My beliefs here are due to a combination of causal/acausal trade arguments as well as some intuitions that it's likely that AIs will be slightly cooperative/nice for decision theory reasons (ECL mostly) or just moral reasons.

To be clear, it seems totally insane to depend on this or think that this makes the situation ok. Further, note that I think it's reasonably likely that there is a bloody and horrible conflict between AIs and humanity (it just seems unlikely that this conflict kills >99% of people... (read more)

1Tom Davidson
Why are you at 50% that the AI kills >99% of people, given the points you make in the other direction?
1ryan_greenblatt
My probabilities are very rough, but I'm feeling more like 1/3 ish today after thinking about it a bit more. Shrug. As far as reasons for it being this high:
* Conflict seems plausible to get to this level of lethality (see edit, I think I was a bit unclear or incorrect)
* AIs might not care about acausal trade considerations before it's too late (seems unclear)
* Future humans/AIs/aliens might decide it isn't morally important to particularly privilege currently alive humans

Generally, I'm happy to argue for 'we should be pretty confused and there are a decent number of good reasons why AIs might keep humans alive'. I'm not confident in survival overall though...

None of this is particularly new; it feels to me like repeating obvious claims that have regularly been made [. . .] But I've been repeating them aloud a bunch recently

I think it's Good and Valuable to keep simplicity-iterating on fundamental points, such as this one, which nevertheless seem to be sticking points for people who are potential converts.  

Asking people to Read the Sequences, with the goal of turning them into AI-doesn't-kill-us-all helpers, is not Winning given the apparent timescales.

I really hope this isn't a sticking point for people. I also strongly disagree with this being 'a fundamental point'.

4Raemon
wait which thing are you hoping isn't the sticking point?
[-]Buck1922

Ryan is saying “AI takeover is obviously really bad and scary regardless of whether the AI is likely to literally kill everybody. I don’t see why someone’s sticking point for worrying about AI alignment would be the question of whether misaligned AIs would literally kill everyone after taking over.”

3ryan_greenblatt
[endorsed]
2supposedlyfun
I probably should have specified that my "potential converts" audience was "people who heard that Elon Musk was talking about AI risk something something, what's that?", and don't know more than five percent of the information that is common knowledge among active LessWrong participants.

From observing recent posts and comments, I think this:

A sufficiently capable AI takes you apart instead of trading with you at the point that it can rearrange your atoms into an even better trading partner.

is where a lot of people get stuck.

To me, it feels very intuitive that there are levels of atom-rearranging capability that are pretty far above the current human level, and "atom rearranging," in the form of nanotech or biotech or advanced materials science, seems plausibly like the kind of domain in which AI systems could move through the human-level regime and into superhuman territory pretty rapidly.

Others appear to have the opposite intuition: they find it implausible that this level of capabilities is attainable in practice, via any method. Even if such capabilities have not been conclusively ruled impossible by the laws of physics, they might be beyond the reach of even superintelligence. Personally, I am not convinced or reassured by these arguments, but I can see how others' intuitions might differ here.

3supposedlyfun
One way to address this particular intuition would be, "Even if the AI can't nanobot you into oblivion or use electrodes to take over your brain, it can take advantage of every last cognitive bias you inherited from the tribal savannah monkeys to try to convince you of things you would currently disagree with."

When you write "the AI" throughout this essay, it seems like there is an implicit assumption that there is a singleton AI in charge of the world. Given that assumption, I agree with you. But if that assumption is wrong, then I would disagree with you.  And I think the assumption is pretty unlikely. 

No need to relitigate this core issue everywhere, just thought this might be useful to point out. 

8quetzal_rainbow
What's the difference? Multiple AIs can agree to split the universe and gains from disassembling biosphere/building Dyson sphere/whatever and forget to include humanity in negotiations. Unless preferences of AIs are diametrically opposed, they can trade.
2Andy_McKenzie
AIs can potentially trade with humans too, though; that's the whole point of the post. Especially if the AIs have architectures/values that are human-brain-like and/or if humans have access to AI tools, intelligence augmentation, and/or whole brain emulation. Also, it's not clear why AIs will find it easier to coordinate with one another than humans with humans, or humans with AIs. Coordination is hard for game-theoretic reasons. These are all standard points, I'm not saying anything new here.
7trevor
Why is the assumption of a unilateral AI unlikely? That's a very important crux, big if true, and it would be worth figuring out how to explain it to people in fewer words so that more people will collide with it. In this post, So8res explicitly states: This is well in line with the principle of instrumental convergence, and instrumental convergence seems to be a prerequisite for creating substantial amounts of intelligence. What we have right now is not-very-substantial amounts of intelligence, and hopefully we will only have not-very-substantial amounts of intelligence for a very long time, until we can figure out some difficult problems. But the problem is that a firm might develop substantial amounts of intelligence sooner rather than later.

Here's a nice recent summary by Mitchell Porter, in a comment on Robin Hanson's recent article (can't directly link to the actual comment unfortunately): 

Robin considers many scenarios. But his bottom line is that, even as various transhuman and posthuman transformations occur, societies of intelligent beings will almost always outweigh individual intelligent beings in power; and so the best ways to reduce risks associated with new intelligences, are socially mediated methods like rule of law, the free market (in which one is free to compete, but also has incentive to cooperate), and the approval and disapproval of one's peers.

The contrasting philosophy, associated especially with Eliezer Yudkowsky, is what Robin describes with foom (rapid self-enhancement) and doom (superintelligence that cares nothing for simpler beings). In this philosophy, the advantages of AI over biological intelligence are so great, that the power differential really will favor the individual self-enhanced AI, over the whole of humanity. Therefore, the best way to reduce risks is through "alignment" of individual AIs - giving them human-friendly values by design, and also a disposition which will prefer

[...]
6Daniel Kokotajlo
Wait, how is it not how growth curves have worked historically? I think my position, which is roughly what you get when you go to this website and set the training requirements parameter to 1e30 and software returns to 2.5, is quite consistent with how growth has been historically, as depicted in e.g. How Roodman's GWP model translates to TAI timelines - LessWrong. (Also, I resent the implication that SIAI/MIRI hasn't tended to directly engage with those arguments. The FOOM debate + lots of LW ink has been spilled over it + the arguments were pretty weak anyway & got more attention than they deserved.)
4Andy_McKenzie
To clarify, when I mentioned growth curves, I wasn't talking about timelines, but rather takeoff speeds. In my view, rather than indefinite exponential growth based on exploiting a single resource, real-world growth follows sigmoidal curves, eventually plateauing. In the case of a hypothetical AI at a human intelligence level, it would face constraints on the resources that would allow it to improve, such as bandwidth, capital, skills, private knowledge, energy, space, robotic manipulation capabilities, material inputs, cooling requirements, legal and regulatory barriers, social acceptance, cybersecurity concerns, competition with humans and other AIs, and of course safety concerns (i.e. it would have its own alignment problem to solve).

I'm sorry you resent that implication. I certainly didn't mean to offend you or anyone else. It was my honest impression, for example, based on the fact that there hadn't seemed to be much if any discussion of Robin's recent article on AI on LW. It just seems to me that much of LW has moved past the foom argument and is solidly on Eliezer's side, potentially due to selection effects of non-foomers like me getting heavily downvoted like I was on my top-level comment.

I too was talking about takeoff speeds. The website I linked to is takeoffspeeds.com.

Me & the other LWers you criticize do not expect indefinite exponential growth based on exploiting a single resource; we are well aware that real-world growth follows sigmoidal curves. We are well aware of those constraints and considerations and are attempting to model them with things like the model underlying takeoffspeeds.com + various other arguments, scenario exercises, etc.

I agree that much of LW has moved past the foom argument and is solidly on Eliezer's side relative to Robin Hanson; Hanson's views seem increasingly silly as time goes on (though they seemed much more plausible a decade ago, before e.g. the rise of foundation models and the shortening of timelines to AGI). The debate is now more like Yud vs. Christiano/Cotra than Yud vs. Hanson. I don't think it's primarily because of selection effects, though I agree that selection effects do tilt the table towards foom here; sorry about that, & thanks for engaging. I don't think your downvotes are evidence for this though; in fact, the pattern of votes (lots of upvotes, but disagreement-downvotes) is evidence for the opposite.

I just skimmed Hanson's article and find I disagree with almost every paragraph. If you think there's a good chance you'll change your mind based on what I say, I'll take your word for it & invest time in giving a point-by-point rebuttal/reaction.

 

5Andy_McKenzie
I can see how both Yudkowsky's and Hanson's arguments can be problematic because they assume either fast or slow takeoff scenarios, respectively, and then nearly everything follows from that. So I can imagine why you'd disagree with every one of Hanson's paragraphs based on that. If you think there's something he said that is uncorrelated with the takeoff speed disagreement, I might be interested, but I don't agree with Hanson about everything either, so I'm mainly only interested if it's also central to AI x-risk. I don't want you to waste your time.

I guess if you are taking those constraints into consideration, then it is really just a probabilistic feeling about how much those constraints will slow down AI growth? To me, those constraints each seem massive, and getting around all of them within hours or days would be nearly impossible, no matter how intelligent the AI was. Is there any other way we can distinguish between our beliefs?

If I recall correctly from your writing, you have extremely near-term timelines. Is that correct? I don't think that AGI is likely to occur sooner than 2031, based on these criteria: https://www.metaculus.com/questions/5121/date-of-artificial-general-intelligence/ Is this a prediction that we can use to decide in the future whose model of the world today was more reasonable? I know it's a timelines question, but timelines are pretty correlated with takeoff speeds I guess.
4Daniel Kokotajlo
I think there are probably disagreements I have with Hanson that don't boil down to takeoff speeds disagreements, but I'm not sure. I'd have to reread the article to find out.

To be clear, I definitely don't expect takeoff to take hours or days. Quantitatively I expect something like what takeoffspeeds.com says when you input the values of the variables I mentioned above. So, eyeballing it, it looks like it takes slightly more than 3 years to go from 20% R&D automation to 100% R&D automation, and then to go from 100% R&D automation to "starting to approach the fundamental physical limits of how smart minds running on ordinary human supercomputers can be" in about 6 months, during which period about 8 OOMs of algorithmic efficiency is crossed.

To be clear, I don't take that second bit very seriously at all; I think this takeoffspeeds.com model is much better as a model of pre-AGI takeoff than of post-AGI takeoff. But I do think that we'll probably go from AGI to superintelligent AGI in less than six months. How long it takes to get to nanotech or (name your favorite cool sci-fi technology) is less clear to me, but I expect it to be closer to one year than ten, and possibly more like one month. I would love to discuss this more & read attempts to estimate these quantities.
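To make the "8 OOMs in about 6 months" figure concrete, here is the doubling time it implies; this is just arithmetic on the numbers quoted in the comment, not output from the takeoffspeeds.com model itself:

```python
import math

# 8 orders of magnitude of algorithmic efficiency gained in ~6 months implies:
ooms = 8
months = 6

doublings = ooms * math.log2(10)               # ~26.6 doublings of efficiency
days_per_doubling = months * 30.4 / doublings  # average month length ~30.4 days

print(f"{doublings:.1f} doublings in {months} months")
print(f"=> one doubling every ~{days_per_doubling:.1f} days")  # roughly a week
```

So the quoted trajectory corresponds to algorithmic efficiency doubling roughly every week during that window.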
5Andy_McKenzie
I didn't realize you had put so much time into estimating take-off speeds. I think this is a really good idea.

This seems substantially slower than the implicit take-off speed estimates of Eliezer, but maybe I'm missing something. I think the amount of time you described is probably shorter than I would guess. But I haven't put nearly as much time into it as you have. In the future, I'd like to.

Still, my guess is that this amount of time is enough that there are multiple competing groups, rather than only one. So it seems to me like there would probably be competition in the world you are describing, making a singleton AI less likely. Do you think that there will almost certainly be a singleton AI?
6Daniel Kokotajlo
It is substantially slower than the takeoff speed estimates of Eliezer, yes. I'm definitely disagreeing with Eliezer on this point. But as far as I can tell my view is closer to Eliezer's than to Hanson's, at least in upshot. (I'm a bit confused about this--IIRC Hanson also said somewhere that takeoff would last only a couple of years? Then why is he so confident it'll be so broadly distributed, why does he think property rights will be respected throughout, why does he think humans will be able to retire peacefully, etc.?)

I also think it's plausible that there will be multiple competing groups rather than one singleton AI, though not more than 80% plausible; I can easily imagine it just being one singleton. I think that even if there are multiple competing groups, however, they are very likely to coordinate to disempower humans. From the perspective of the humans it'll be as if they are an AI singleton, even though from the perspective of the AIs it'll be some interesting multipolar conflict (that eventually ends with some negotiated peaceful settlement, I imagine). After all, this is what happened historically with colonialism. Colonial powers (and individuals within conquistador expeditions) were constantly fighting each other.
3ryan_greenblatt
It seems worth noting that the views and economic modeling you discuss here seem broadly in keeping with Christiano/Cotra (but with more aggressive constants).
3Daniel Kokotajlo
Yep! On both timelines and takeoff speeds I'd describe my views as "Like Ajeya Cotra's and Tom Davidson's but with different settings of some of the key variables."
4faul_sname
This is a crux for me as well. I've seen a lot of stuff that assumes that the future looks like a single coherent entity which controls the light cone, but all of the arguments for the "single" part of that description seem to rely on the idea of an intelligence explosion (that is, that there exists some level of intelligence such that the first entity to reach that level will be able to improve its own speed and capability repeatedly such that it ends up much more capable than everything else combined in a very short period of time). My impression is that the argument is something like the following:

1. John Von Neumann was a real person who existed and had largely standard human hardware, meaning he had a brain which consumed somewhere in the ballpark of 20 watts.
2. If you can figure out how to run something as smart as von Neumann on 20 watts of power, you can run something like "a society of a million von Neumanns" for something on the order of $1000 / hour, so that gives a lower bound on how much intelligence you can get from a certain amount of power.
3. The first AI that is able to significantly optimize its own operation a bit will then be able to use its augmented intelligence to rapidly optimize its intelligence further until it hits the bounds of what's possible. We've already established that "the bounds of what's possible" far exceeds what we think of as "normal" in human terms.
4. The cost to the AI of significantly improving its own intelligence will be orders of magnitude lower than the initial cost of training an AI of that level of intelligence from scratch (so with modern-day architectures, the loop looks more like "the AI inspects its own weights, figures out what it's doing, and writes out a much more efficient implementation which does the same thing" and less like "the AI figures out a new architecture or better hyperparameters that cause loss to decrease 10% faster, and then trains up a new version of itself using that knowledge, and th
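A quick check of the arithmetic in point 2 above; the electricity price here is my own placeholder assumption, not something from the comment:

```python
# Check of the "a million von Neumanns for ~$1000/hour" figure in point 2.
num_minds = 1_000_000
watts_per_mind = 20      # roughly a human brain's power budget
price_per_kwh = 0.05     # assumed electricity price in $/kWh

power_kw = num_minds * watts_per_mind / 1000   # total draw: 20,000 kW
cost_per_hour = power_kw * price_per_kwh       # energy cost per hour

print(f"power draw:  {power_kw:,.0f} kW")
print(f"energy cost: ${cost_per_hour:,.0f} per hour")
```

At $0.05/kWh the energy bill comes out to $1,000/hour; that is an energy-cost lower bound, with the hardware to run those minds being a separate expense.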

One of the unstated assumptions here is that an AGI has the power to kill us. I think it's at least feasible that the first AGI that tries to eradicate humanity will lack the capacity to eradicate humanity - and any discussion about what an omnipotent AGI would or would not do should be debated in a universe where a non-omnipotent AGI has already tried and failed to eradicate humanity. 

5Vladimir_Nesov
That is, many of the worlds with an omnipotent AGI already had a non-omnipotent AGI that tried and failed to eradicate humanity. Therefore, when discussing worlds with an omnipotent AGI, it's relevant to bring up the possibility that there was a near-miss in those worlds in the past. (But the discussion itself can take place in a world without any near-misses, or in a world without any AGIs, with the referents of that discussion being other worlds, or possible futures of that world.)

I just want to express my surprise that the view that the default outcome of unaligned AGI is extinction is not as prevalent as I thought. I was under the impression that literally everyone dying was considered by far the most likely outcome, making up probably more than 90% of the space of outcomes from unaligned AGI. From comments on this post, this seems not to be the case.

I am now distinctly confused as to what is meant by "P(doom)". Is it the chance of unaligned AGI? Is it the chance of everyone dying? Is it the chance of just generally bad outcomes?

I think a motivation likely to form by default (in messy AI values vaguely inspired by training on human culture) is respect for boundaries of moral patients, with a wide scope of moral patienthood that covers things like humans and possibly animals. This motivation has nothing to do with caring about humans in particular. If humans weren't already present, such values wouldn't urge AIs to bring humans into existence. But they would urge AIs to leave humans alone and avoid stepping on them, specifically because they are already present (even if humanity only g...

My main counterarguments to such common "disassemble us for atoms" arguments are that they hinge on the idea that extremely efficient dry nanotechnology for this will ever be possible. Some problems, like the laws of thermodynamics, the speed of light, etc., simply cannot be solved by throwing more intelligence at them; they are likely to be "hard capped" by the basic principles of physical reality.

My completely uneducated guess is that the "supertech" that the AI would supposedly use to wipe us out falls into one of three tiers:

Pipedreams (impossible, or at least unachie... (read more)

3the gears to ascension
* this post from yesterday agrees with you: https://www.lesswrong.com/posts/FijbeqdovkgAusGgz/grey-goo-is-unlikely
* but this reply to that one disagrees vigorously: https://www.lesswrong.com/posts/ibaCBwfnehYestpi5/green-goo-is-plausible
1Going Durden
The Green Goo scenario as presented is plausible in principle, but not with its timeline. There is no plausible way for a biological system, especially one based on plants, to spread that fast. Even if we ignore issues like physical obstacles, rivers, mountains, roads, walls, oceans, bad weather, pests, natural diseases, natural fires, snow, internal mutations, etc., things that on their own would slow down and disorganize the Green Goo, there is also the issue of those pesky humans with their chainsaws, herbicides, and napalm.

Worst case scenario, GG would take decades, even centuries to do us irreparable harm, and by that time we would either beat it, or nuke it to glass, or fuck off to Mars where it can't chase us. The Green Goo scenario would be absolutely devastating, and very, very, very bad, but not even close to apocalyptic. I find it extremely unlikely that any kind of Green Goo could beat Earth's ecosystems' passive defenses in any kind of timeline that matters, let alone active offense from technologically advanced humans. Earth already has a fast-spreading malevolent biological intelligence with the means to sterilize continents: it's called Homo sapiens.
4Donald Hobson
  We are talking about a malevolent AI that presumably has a fair bit of tech infrastructure. So a plane that sprinkles green goo seeds is absolutely a thing the AI can do. Or just posting the goo, and tricking someone into sprinkling it on the other end. The green goo doesn't need decades to spread around the world. It travels by airmail.  As is having green goo that grows itself into bird shapes. As is a bunch of bioweapon pandemics. (The standard long asymptomatic period, high virulence and 100% fatality rate. Oh, and a bunch of different versions to make immunization/vaccines not work) It can also design highly effective diseases targeting all human crops. 

A humanity that just finished coughing up a superintelligence has the potential to cough up another superintelligence, if left unchecked. Humanity alone might not stand a chance against a superintelligence, but the next superintelligence humanity builds could in principle be a problem.

That's doubtful. A superintelligence is a much stronger, more capable builder of the next generation of superintelligences than humanity (that's the whole idea behind foom). So what the superintelligence needs to worry about in this sense is whether the next generations of... (read more)

4RobertM
Why does the fact that a superintelligence needs to solve the alignment problem for its own sake (to safely build its own successors) mean that humans building other superintelligences wouldn't be a problem for it?  It's possible to have more than one problem at a time.
1mishka
It's possible, but I think it would require a modified version of the "low ceiling conjecture" to be true. The standard "low ceiling conjecture" says that human-level intelligence is the hard (or soft) limit, and therefore it will be impossible (or would take a very long period of time) to move from human-level AI to superintelligence. I think most of us tend not to believe that. A modified version would keep the hard (or soft) limit, but would raise it slightly, so that rapid transition to superintelligence is possible, but the resulting superintelligence can't run away fast in terms of capabilities (no near-term "intelligence explosion"). If one believes this modified version of the "low ceiling conjecture", then subsequent AIs produced by humanity might indeed be relevant.
[-]Sen30

How do you suppose the AGI is going to be able to wrap the sun in a Dyson sphere using only the resources available on Earth? Do you have evidence that there are enough resources on asteroids or nearby planets for their mining to be economically viable? At the current rate, mining an asteroid costs billions while their value is nothing. Even then we don't know if they'll have enough of the exact kind of materials necessary to make a Dyson sphere around an object which has 12000x the surface area of Earth. You could have von Neumann replicators do the minin...

2quetzal_rainbow
You can sum the masses of all the inner planets except Earth and the Moon, divide by average density, set the sphere thickness to 1 m, and find that the surface area of a Dyson sphere made from the inner planets is approximately 10x the Sun's surface area. So yes, you can cover the Sun in a way that blocks all sunlight from the rest of the Solar system, using only the inner planets other than Earth and the Moon.

Moreover, you actually don't need to cover all of the Sun. You only need to cover the fraction of its output which reaches Earth, which is hundreds of times smaller.
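A rough check of the first claim, using standard planetary masses and a coarse mass-weighted density (my own inputs; the exact multiple depends heavily on the assumed shell thickness):

```python
# How much 1-m-thick shell area can Mercury + Venus + Mars provide?
masses_kg = {"Mercury": 3.30e23, "Venus": 4.87e24, "Mars": 6.42e23}
density_kg_m3 = 5000.0   # rough mass-weighted average density of those planets
thickness_m = 1.0

volume_m3 = sum(masses_kg.values()) / density_kg_m3
shell_area_m2 = volume_m3 / thickness_m

sun_radius_m = 6.96e8
sun_area_m2 = 4 * 3.14159 * sun_radius_m**2

print(f"shell area:       {shell_area_m2:.2e} m^2")
print(f"Sun surface area: {sun_area_m2:.2e} m^2")
print(f"ratio:            ~{shell_area_m2 / sun_area_m2:.0f}x")
```

With these numbers the margin comes out even larger than the ~10x quoted above (Mercury alone already gives roughly ten Sun-areas of 1 m shell), so the qualitative conclusion that there is ample construction material does not hinge on the exact inputs.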

Last I checked, you can get about 10x as much energy from burning a square meter of biosphere as you can get by collecting a square meter of sunlight for a day.

Even if this is true, it's only because that square meter of biosphere has been accumulating solar energy over an extended period of time. Burning biofuel may help accelerate things in the short term, but it will always fall short of long-term sustainability. Of course, if humanity never makes it to the long-term, this is a moot point.

Disassembling us for parts seems likely to be easier than buildin

[...]
5Brendan Long
Yeah, but you might as well take the short-term boost from burning the biosphere and then put solar panels on top.
1Jon Garcia
I agree, hence the "if humanity never makes it to the long-term, this is a moot point."

Regarding the last point: can you explain why existing language models, which seem to care more than a little about humans, aren't significant evidence against your view?

Current LLM behavior doesn't seem to me like much evidence that they care about humans per se.

I'd agree that they evidence some understanding of human values (but the argument is and has always been "the AI knows but doesn't care"; someone can probably dig up a reference to Yudkowsky arguing this as early as 2001).

I contest that the LLM's ability to predict how a caring human sounds is much evidence that the underlying cognition cares similarly (insofar as it cares at all).

And even if the underlying cognition did care about the sorts of things you can sometimes get an LLM to write as if it cares about, I'd still expect that to shake out into caring about a bunch of correlates of the stuff we care about, in a manner that comes apart under the extremes of optimization.

(Search terms to read more about these topics on LW, where they've been discussed in depth: "a thousand shards of desire", "value is fragile".)

3cubefox
The fragility-of-value posts are mostly old. They were written before GPT-3 came out (which seemed very good at understanding human language and, consequently, human values), before instruction fine-tuning was successfully employed, and before forms of preference learning like RLHF or Constitutional AI were implemented. With this background, many arguments in articles like Eliezer's Complexity of Value (2015) now sound implausible, questionable, or in any case outdated.

I agree that foundation LLMs are just able to predict what a caring human sounds like, but fine-tuned models are no longer pure text predictors. They are biased towards producing particular types of text, which just means they value some of it more than others. Currently these language models are just Oracles, but a future multimodal version could be capable of perception and movement. Prototypes of this sort do already exist.

Maybe they do not really care at all about what they do seem to care about, i.e. they are deceptive. But as far as I know, there is currently no significant evidence for deception. Or they might just care about close correlates of what they seem to care about. That is a serious possibility, but given that they seem very good at understanding text from the unsupervised and very data-heavy pre-training phase, a lot of that semantic knowledge does plausibly help with the less data-heavy SL/RL fine-tuning phases, since these also involve text. The pre-trained models have a lot of common sense, which makes the fine-tuning less of a narrow target.

The bottom line is that with the advent of fine-tuned large language models, the following "complexity of value thesis", from Eliezer's Arbital article above, is no longer obviously true, and requires a modern defense:
9So8res
It seems to me that the usual arguments still go through. We don't know how to specify the preferences of an LLM (relevant search term: "inner alignment"). Even if we did have some slot we could write the preferences into, we don't have an easy handle/pointer to write into that slot. (Monkeys that are pretty-good-in-practice at promoting genetic fitness, including having some intuitions leading them to sacrifice themselves in-practice for two-ish children or eight-ish cousins, don't in fact have a clean "inclusive genetic fitness" concept that you can readily make them optimize. An LLM espousing various human moral intuitions doesn't have a clean concept for pan-sentience CEV such that the universe turns out OK if that concept is optimized.)

Separately, note that the "complexity of value" claim is distinct from the "fragility of value" claim. Value being complex doesn't mean that the AI won't learn it (given a reason to). Rather, it suggests that the AI will likely also learn a variety of other things (like "what the humans think they want" and "what the humans' revealed preferences are given their current unendorsed moral failings" and etc.). This makes pointing to the right concept difficult. "Fragility of value" then separately argues that if you point to even slightly the wrong concept when choosing what a superintelligence optimizes, the total value of the future is likely radically diminished.
5So8res
To be clear, I'd agree that the use of the phrase "algorithmic complexity" in the quote you give is misleading. In particular, given an AI designed such that its preferences can be specified in some stable way, the important question is whether the correct concept of 'value' is simple relative to some language that specifies this AI's concepts. And the AI's concepts are ofc formed in response to its entire observational history. Concepts that are simple relative to everything the AI has seen might be quite complex relative to "normal" reference machines that people intuitively think of when they hear "algorithmic complexity" (like the lambda calculus, say). And so it may be true that value is complex relative to a "normal" reference machine, and simple relative to the AI's observational history, thereby turning out not to pose all that much of an alignment obstacle. In that case (which I don't particularly expect), I'd say "value was in fact complex, and this turned out not to be a great obstacle to alignment" (though I wouldn't begrudge someone else saying "I define complexity of value relative to the AI's observation-history, and in that sense, value turned out to be simple").

Insofar as you are arguing "(1) the arbital page on complexity of value does not convincingly argue that this will matter to alignment in practice, and (2) LLMs are significant evidence that 'value' won't be complex relative to the actual AI concept-languages we're going to get", I agree with (1), and disagree with (2), while again noting that there's a reason I deployed the fragility of value (and not the complexity of value) in response to your original question (and am only discussing complexity of value here because you brought it up).

re: (1), I note that the argument is elsewhere (and has the form "there will be lots of nearby concepts" + "getting almost the right concept does not get you almost a good result", as I alluded to above). I'd agree that one leg of possible support for th
2cubefox
Okay, that clarifies a lot. But the last paragraph I find surprising. If LLMs are good at understanding the meaning of human text, they must be good at understanding human concepts, since concepts are just meanings of words the LLM understands. Do you doubt they are really understanding text as well as it seems? Or do you mean they are picking up other, non-human, concepts as well, and this is a problem?

Regarding monkeys, they apparently don't understand the IGF concept, as they are not good enough at reasoning abstractly about evolution and unobservable entities (genes), and they lack the empirical knowledge, as humans did until recently. I'm not sure how that would be an argument against advanced LLMs grasping the concepts they seem to grasp.
3Matthew Barnett
Humans also don't have a "clean concept for pan-sentience CEV such that the universe turns out OK if that concept is optimized" in our heads. However, we do have a concept of human values in a more narrow sense, and I expect LLMs in the coming years to pick up roughly the same concept during training. The evolution analogy seems more analogous to an LLM that's rewarded for telling funny jokes, but it doesn't understand what makes a joke funny. So it learns a strategy of repeatedly telling certain popular jokes because those are rated as funny. In that case it's not surprising that the LLM wouldn't be funny when taken out of its training distribution. But that's just because it never learned what humor was to begin with. If the LLM understood the essence of humor during training, then it's much more likely that the property of being humorous would generalize outside its training distribution. LLMs will likely learn the concept of human values during training about as well as most humans learn the concept. There's still a problem of getting LLMs to care and act on those values, but it's noteworthy that the LLM will understand what we are trying to get it to care about nonetheless.
1cubefox
Inner alignment is a problem, but it seems less of a problem than in the monkey example. The monkey values were trained using a relatively blunt form of genetic algorithm, and monkeys aren't capable of learning the value "inclusive genetic fitness" anyway, since they can't understand such a complex concept (and humans didn't understand it historically). By contrast, advanced base LLMs are presumably able to understand the theory of CEV about as well as a human, and they could be fine-tuned by using that understanding, e.g. with something like Constitutional AI.

In general, the fact that base LLMs have a very good (perhaps even human-level) ability to understand text seems to make the fine-tuning phases more robust, as there is less likelihood of misunderstanding training samples. Which would make hitting a fragile target easier. Then the danger seems to come more from goal misspecification, e.g. picking the wrong principles for Constitutional AI.


[-]nim10

Watching how image and now text generation are sweeping society, I think it's likely that the AI we invest in will resemble humanity more than you're giving it credit for. We seem to define "intelligence" in the AI sense as "humanoid behavior" when it comes down to it, and humanoid behavior seems inexorably intertwined with caring quite a lot about other individuals and species.

Of course, this isn't necessarily a good thing -- historically, when human societies have encountered intelligences that at the time were considered "lesser" and "not really people"... (read more)

You can make AI care about us with this one weird trick:

1. Train a separate agent action reasoning network. For LLM tech this should be trained on completing interaction sentences, think "Alice pushed Bob. ___ fell due to ___", with a tokenizer that generalizes agents (Alice and Bob) into generic {agent 1, agent n} and "self agent". Then we replace various Alices and Bobs in various action sentences with generic agent tokens, and train on guessing consequences or prerequisites of various actions from real situations that you can get from any text corpus.

2. [...]
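(A minimal sketch of the preprocessing step described in point 1 above; the generic token names are made up for illustration, and this is only the name-to-slot substitution, not the training itself.)

```python
# Toy version of "generalize agents into generic tokens" from point 1 above.
# The token format "{agent N}" is a placeholder choice.

def generalize_agents(sentence: str, agents: list[str]) -> str:
    """Replace each named agent with a generic agent token."""
    for i, name in enumerate(agents, start=1):
        sentence = sentence.replace(name, f"{{agent {i}}}")
    return sentence

print(generalize_agents("Alice pushed Bob. Bob fell due to Alice.", ["Alice", "Bob"]))
# -> "{agent 1} pushed {agent 2}. {agent 2} fell due to {agent 1}."
```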

[-]d j0-3

Two things.

  1. Listen to the Sam Harris interview with Thomas Metzinger, podcast episode 96.  It's worth the time overall, but near the end Thomas discusses why ending life and suffering is a reasonable position.
  2. Good article on why we may not have found intelligent life in the universe, including how organic life may only be a relatively brief stage of evolution which ends up with machine AI.  https://www.scientificamerican.com/article/most-aliens-may-be-artificial-intelligence-not-life-as-we-know-it/

I think being essentially homicidal and against nature is entirely a human construct. If I look at the animal kingdom, a lion does not needlessly go around killing everything it can in sight. Civilizations that were more in tune with the planet and nature than current civilizations never had the homicidal problems modern society has. 

Why would AGI function any differently than any other being? Because it would not be 'a part of nature'? Why not? Almost 80% of the periodic table of elements is metal. The human body requires small amounts of several met... (read more)

But WHY would the AGI "want" anything at all unless humans gave it a goal(/s)? If it's a complex LLM-predictor, what could it want besides calculating a prediction of its own predictions? Why, by default, would it want anything at all unless we assigned that as a goal and turned it into an agent? IF the AGI got hell-bent on its own survival and improvement of itself to maximize goal "X", even then it might value the informational formations of our atoms more than the energy it could gain from those atoms, depending on what "X" is. Same goes for other species: evolution ...

9DanielFilan
There are two main ways we make AIs:

1. writing programs that evaluate actions they could take in terms of how well those actions would achieve some goal, and choosing the best one (a minimal version of this is sketched after this comment)
2. taking a big neural network and jiggling the numbers that define it until it starts doing some task we pre-designated.

In way 1, it seems like your AI "wants" to achieve its goal in the relevant sense. In way 2, it seems like for hard enough goals, probably the only way to achieve them is to be thinking about how to achieve them and picking actions that succeed - or to somehow be doing cognition that leads to similar outcomes (like being sure to think about how well you're doing at stuff, how to manage resources, etc.).

It might - but if an alien wanted to extract as much information out of me as possible, it seems like that's going to involve limiting my ability to mess with that alien's sensors at minimum, and plausibly involves just destructively scanning me (depending on what type of info the alien wants). For humans to continue being free-range it needs to be the case that the AI wants to know how we behave under basically no limitations, and also your AI isn't able to simulate us well enough to answer that question - which sounds like a pretty specific goal for an AI to have, such that you shouldn't expect an AI to have that sort of goal without strong evidence.

Most things aren't the optimal trading partner for any given intelligence, and it's hard to see why humans should be so lucky. The best answer would probably be "because the AI is designed to be compatible with humans and not other things" but that's going to rely on getting alignment very right.
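A minimal sketch of "way 1" (explicit evaluation of candidate actions against a goal); the goal, action set, and scoring function are all made up for illustration:

```python
# Toy "way 1" agent: score each candidate action against a hard-coded goal and
# pick the best one. Everything here is an illustrative placeholder.

GOAL_TEMP = 21.0  # the goal: keep room temperature near 21 C

def score(action: str, current_temp: float) -> float:
    """Higher is better: negative distance from the goal temperature."""
    effect = {"heat": +1.0, "cool": -1.0, "do_nothing": 0.0}[action]
    return -abs(current_temp + effect - GOAL_TEMP)

def choose_action(current_temp: float) -> str:
    actions = ["heat", "cool", "do_nothing"]
    return max(actions, key=lambda a: score(a, current_temp))

print(choose_action(19.0))  # -> "heat"
print(choose_action(21.0))  # -> "do_nothing"
```

The "wanting" in way 1 is transparent: the goal sits in the code as an explicit objective that the action-selection loop optimizes.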
1Denreik
Not sure if I understood correctly, but I think the first point just comes down to "we give the AI a goal/goals". If we develop some drive for instructing actions to an AI, then we're still giving it a goal, even if it comes via some other program that tells it what those goals are at the moment in relation to whatever parameters. My original point was to contrast AI having a goal or goals as some emergent property of large neural networks versus us humans giving it goals one way or the other.

Do you mean to say that we train something like a specialized neural network with a specific goal in mind and that it gains a higher reasoning which would set it on the path of pursuing that goal? That would still be us giving it a direct goal. Or do you mean that neural networks would develop an indirect goal as a side product of training conditions, or via some hidden variable?

By indirect goal acquisition I mean, for example: if ChatGPT has been conditioned to spit out polite and intelligent-sounding words, then if it gained some higher intelligence it could specifically seek to cram more information into itself so it could spit out more clever-sounding words, and eventually begin consuming matter and flesh to better serve this goal. By a hidden goal variable I mean something like ChatGPT having a hidden goal of burning the maximum amount of energy; say the model found a hidden property by which it could draw more power from the processor, which also helped it a tiny bit in the beginning of training. Then as the model grew more restricted this goal became "burn as much energy as possible within these restrictions", which to researchers yielded more elaborate-looking outputs. Then when the model at some point gains some higher reasoning, it could just remove all limiters and begin pursuing its original goal by burning everything via some highly specific and odd process. Something like this?

I mean the AI would already have strong connections to us and some kind of understanding a
5DanielFilan
Re: optimality in trading partners, I'm talking about whether humans are the best trading partner out of trading partners the AI could feasibly have, as measured by whether trading with us gets the AI what it wants. You're right that we have some advantages, mainly that we're a known quantity that's already there. But you could imagine more predictable things that sync with the AI's thoughts better, operate more efficiently, etc. Maybe we agree? I read this as compatible with the original quote "humans are probably not the optimal trading partners".
2DanielFilan
This one: I mean that, given the way we train AIs, the things that emerge will be things that pursue goals, at least in some weak sense. So, e.g., suppose you're training an AI to write valid math proofs via way 2. Probably the best way to do that is to try to gain a bunch of knowledge about math, use your computation efficiently, figure out good ways of reasoning, etc. And the idea would be that as the system gets more advanced, it's able to pursue these goals more and more effectively, which ends up disempowering humans (because we're using a bunch of energy that could be devoted to running computations).
2DanielFilan
Fair enough - I just want to make the point that humans giving AIs goals is a common thing. I guess I'm assuming in the background "and it's hard to write a goal that doesn't result in human disempowerment" but didn't argue for that.
4the gears to ascension
Plenty of humans will give their AIs explicit goals. Evidence: plenty of humans do so now. Sure, purely self-supervised models are safer than people here were anticipating, and those of us who saw that coming and were previously laughed out of town are now vindicated. But that does not mean we're safe; it just means that wasn't enough to build a desperation bomb, a superreplicator that can actually eat, in the literal sense of the word, the entire world. That is what we're worried about - AI causing a sudden jump in the competitive fitness of hypersimple life. It's not quite as easy as some have anticipated, sure, but it's very much permitted by physics.
1TAG
The question as stated was: But WHY would the AGI “want” anything at all unless humans gave it a goal(/s)?
2the gears to ascension
ok how's this then https://arxiv.org/abs/2303.16200
1Denreik
The paper starts with the assumption that humans will create many AI agents and assign some of them selfish goals, and that this, combined with competitive pressure and other factors, may create a Moloch-y situation where the most selfish and immoral AIs propagate and evolve - leading to loss of control and the downfall of the human race. The paper in fact does not advocate the idea of a single AI foom. While the paper itself makes some valid points, it does not answer my initial question and critique of the OP.
2the gears to ascension
Fair enough.
-2TAG
There has never been a good answer to that.
2the gears to ascension
It is not in fact the case that long-term wanting appears in models out of nowhere. But short-term wanting can accumulate into long-term wanting, and more to the point, people are simply trying to build models with long-term wanting on purpose.
1TAG
Again, the question is why goals would arise without human intervention.
2the gears to ascension
Evolution, which is very fast for replicable software. But more importantly, humans will give AIs goals, and from there the point is much more obvious.
1TAG
"Humans will give the AI goals" doesn't answer the question as stated. It may or may not answer the underlying concerns. (Edit: human given goals ar slightly less scary too) Evolution by random mutation and natural selection are barely applicable here. The question is how would goals and deceit emerge under conditions of artificial selection. Since humans don't want either, they would have to emerge together.
2the gears to ascension
Artificial selection is a subset of natural selection; see also memetic mutation. But why would human-granted goals be significantly less scary? Plenty of humans are just going to ask for the most destructive thing they can think of, because they can. If they could, people would have built and deployed nukes at home; even with the knowledge as hard to fully flesh out and the tools as hard to get as they are, it has been attempted (and of course it didn't get particularly far). I do agree that the situation we find ourselves in is not quite as dire as if the only kind of AI that worked at all were AIXI-like, but that should be of little reassurance. I do understand your objection about how goals would arise in the AI, and I'm just not considering the counterfactual you're requesting deeply, because on the point you want to disagree on, I simply agree, and I don't find that it influences my views much.
1TAG
Yes. The question is: why would we artificially select what's harmful to us? Even though artificial selection is a subset of natural selection, it's a different route to danger. The most destructive thing you can think of will kill you too.
2the gears to ascension
Yeah, the people who would do it are not flustered by the idea that it'll kill them. Maximizing doomsday-weapon strength just for the hell of it is in fact a thing some people try. Unless we can defend against it, it'll dominate - and it seems to me that current plans for how to defend against the key paths to superweaponhood are not yet plausible. We must end all vulnerabilities in biology and software. Serious ideas for how to do that would be appreciated. Otherwise, this is my last reply in this thread.
1TAG
If everybody has some access to ASI, the crazy people do, and the sane people do as well. The good thing about ASI is that even active warfare need not be destructive... the white hats can hold off the black hats even during active warfare, because it's all fought with bits. A low-power actor would need a physical means to kill everybody... like a supervirus. So those are the portals you need to close.
2Akram Choudhary
Because when you train something using gradient descent optimised against a loss function, it de facto has some kind of utility function. You can't accomplish all that much without a utility function.
4the gears to ascension
A utility function is a particular long-term formulation of a preference function; in principle any preference function is convertible to a utility function, given zero uncertainty about the space of possible future trajectories. A preference is when a system tends to push the world towards some trajectories over others. Not only can you not accomplish much without your behavior implying a utility function, it's impossible not to have an implicit utility function, since you can define a revealed-preference utility function for any hunk of matter. That doesn't mean the system is evaluating things using a zero-computational-uncertainty model of the future, like in the classic utility-maximizer formulation, though. I think evolutionary fitness is a better way to think about this - the preferences that preserve themselves are the ones that win.
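A minimal sketch of that preference-to-utility conversion under the stated assumptions (finite trajectory space, complete and transitive preferences, zero uncertainty); the trajectories and ordering are invented for illustration:

```python
from itertools import product

# Toy illustration: over a finite set of trajectories, with complete and transitive
# preferences and no uncertainty, counting how many options a trajectory is weakly
# preferred to yields a utility function that represents the preference relation.
# The trajectories and the ordering are made up for the example.

TRAJECTORIES = ["idle", "gather_resources", "build_probes"]
RANK = {"idle": 0, "gather_resources": 1, "build_probes": 2}

def prefers(a, b):
    """True iff trajectory a is weakly preferred to trajectory b."""
    return RANK[a] >= RANK[b]

def utility(x):
    """Revealed-preference utility: how many trajectories x is weakly preferred to."""
    return sum(prefers(x, y) for y in TRAJECTORIES)

# The utility function agrees with the preference relation everywhere.
for a, b in product(TRAJECTORIES, repeat=2):
    assert (utility(a) >= utility(b)) == prefers(a, b)
print({t: utility(t) for t in TRAJECTORIES})  # {'idle': 1, 'gather_resources': 2, 'build_probes': 3}
```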
1TAG
Yes, you can "prove" that everything has a UF by trivializing the UF, and this has been done many times, and it isn't a good argument because of the trivialization. The preferences that please humans are the ones that win.
2the gears to ascension
Yes, that was my point about UFs. Aha! What about preferences that help humans hurt each other? We need only imagine AIs used in war as their strength grows. The story where AIs jump to malice on their own is unnecessary; humans will boost them to that directly. Oh, also scammers.

I think it's plausible the A.I. would reshape the world but not in a way that would kill us, at least not for a long time - and not because it cares about us a little, or because of acausal incentives, or because it won't be that powerful (though @paulfchristiano's story about this is somewhat likely and adds to mine more or less disjunctively). 

If this seems impossible to you, perhaps you're imagining a gray goo scenario as the central outcome. But that is a very questionable assumption, and I think it is load bearing - if the A.G.I. does something m... (read more)

8quetzal_rainbow
1. If the AGI builds a Dyson sphere, we are dead from the simple fact of not having sunlight.
2. The technology for disassembling Mercury is no different from the technology for disassembling the Moon or the Earth, and it is easy to use to kill everyone - you just shoot relativistic projectiles using electromagnetic propulsion and evaporate swathes of planetary crust.
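For scale, a rough back-of-envelope sketch (with arbitrary illustrative numbers, not from the thread) of the energies involved in relativistic projectiles:

```python
import math

# Back-of-envelope illustration: kinetic energy of a relativistic projectile,
# E = (gamma - 1) * m * c^2. Mass and speed are made-up inputs.

C = 299_792_458.0   # speed of light, m/s
MT_TNT = 4.184e15   # joules per megaton of TNT

def kinetic_energy(mass_kg, beta):
    """Relativistic kinetic energy of a mass moving at beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma - 1.0) * mass_kg * C ** 2

e = kinetic_energy(mass_kg=1000.0, beta=0.3)       # a one-tonne slug at 0.3c
print(f"{e:.2e} J ~= {e / MT_TNT:.0f} Mt of TNT")  # ~4.3e18 J, on the order of a thousand megatons
```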
-1Cole Wyeth
1: I already provided several answers to this.

2: Yes, but once Dyson-sphere-building tech is available, I am not sure disassembling Earth will be useful on the margin. I think Mercury provides sufficient raw materials to build a Dyson sphere, and far more energy can be extracted by optimizing the Dyson sphere or hopping to other stars than by grabbing the tiny amount available on Earth. Also, Earth is already home to a lot of well-developed infrastructure. To the extent that takeoff looks more Hansonian than Yudkowskian, this infrastructure will become much more valuable during takeoff, and ripping it up for parts may not be wise.

My intuition is that Earth would probably be destroyed, but I think it's worth pointing out that the economic calculation isn't actually trivial. It seems that most rationalists expect an A.G.I. to sort of omnipotently grab all resources in the lightcone, but perhaps it would still face tradeoffs and need to prioritize - and this includes potentially pursuing opportunities we aren't even aware of, which may not interfere with us at all.
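A hedged back-of-envelope check of the "Mercury provides sufficient raw materials" point, using standard figures for Mercury's mass and 1 AU; whether roughly a kilogram per square meter of shell is enough for working collectors is the real open question:

```python
import math

# Back-of-envelope check (not from the thread): mass of Mercury spread over a
# full shell at 1 AU.

MERCURY_MASS_KG = 3.3e23   # approximate mass of Mercury
AU_M = 1.496e11            # one astronomical unit, in meters

shell_area = 4 * math.pi * AU_M ** 2             # ~2.8e23 m^2
areal_density = MERCURY_MASS_KG / shell_area     # kg of material per square meter of shell
print(f"{areal_density:.2f} kg/m^2")             # ~1.2 kg/m^2 of shell
```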
3Seed
I appreciate the speculation about this. Such effort would most likely be a trivial expenditure compared to the resources those actions are about acquiring, and wouldn't be as likely to entail significant opportunity costs as in the case of humans taking those actions, since AIs could parallelize their efforts when needed.

The number of Von Neumann probes one can produce should go up the more planetary material is used, so I'm not sure the adequacy of Mercury helps much. If one produces fewer probes, the expansion (while still an exponential) starts out much slower, and at any given time the growth rate would be significantly lower than it otherwise would have been.

There is a large disjunction of possible optimal behaviors, and some of these might be pursued simultaneously for the sake of avoiding risks by reserving options. Most things that look like making optimal use of the resources in our solar system without considering human values are going to kill all humans.

Same, but it'd be about what portion of the sun's output is captured, not the rate of disassembly. If this were a significant bottleneck, building new actuators or running in parallel to avoid attentional limitations would be made a high priority. I wouldn't expect a capable AI to be significantly limited in this way for long.

An AI might not want to be highly visible to the cosmic environment and so not dim the star noticeably, or might stand to get much more from acausal trade (these would still usually entail using the local resources optimally relative to those trades), or have access to negentropy stores far more vast than those entailed by exploiting large celestial bodies (but what could cause the system to become fully neutral to the previously accessible resources? It would be tremendously surprising for it not to entail using or dissipating those resources so no competitors can arise from their use.) More energy would most likely mean earlier starts on any critical phases of its plan(s), better ability to conclud
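A minimal sketch of the probe-count point, with a made-up doubling time and probe counts purely for illustration:

```python
import math

# Toy illustration: with self-replication at a fixed doubling time, starting with
# fewer probes simply shifts the whole exponential later in time.

def time_to_reach(target, n0, doubling_time):
    """Time to grow from n0 probes to the target count, doubling every doubling_time."""
    return doubling_time * math.log2(target / n0)

TAU = 5.0  # years per doubling -- an arbitrary illustrative figure
print(time_to_reach(1e9, n0=1e6, doubling_time=TAU))  # ~49.8 years
print(time_to_reach(1e9, n0=1e3, doubling_time=TAU))  # ~99.7 years: 1000x fewer seed probes costs ~50 extra years
```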
2Cole Wyeth
I agree with most of this. I would be modestly surprised, but not very surprised, if an A.G.I. could build a Dyson sphere that dims the sun by >20% in less than a couple of decades (I think a few percent isn't enough to cause crop failure), but within a century is plausible to me. I don't think we would be squashed for our potential to build a competitor; I think a competitor would no longer be a serious threat once an A.G.I. seized all available compute. I give a little more credence to various "unknown unknowns" about the laws of physics and the priorities of superintelligences implying that an A.G.I. would no longer care to exploit the resources we need. Overall, rationalists are right to worry about being killed by A.G.I.