The Utility of Human Atoms for the Paperclip Maximizer

by avturchin · 2 min read · 2nd Feb 2018 · 19 comments


Paperclip Maximizer, Instrumental Convergence

TL;DR: using humans’ atoms would increase the AI’s expected utility by about 2×10^-16 of total U.

The iconic example of the existential risk of superintelligence is the so-called paperclip maximizer, that is, a system which maximizes some random goal not aligned with human values. While a real paperclip maximizer is unlikely, the convergent instrumental subgoals suggested by Omohundro imply that almost any non-aligned AI will try to acquire as much computational power as possible, as well as other material resources. Thus, it will need the atoms of humans.

In this article, I will try to estimate the marginal utility of human atoms, U′, compared with the total utility of all atoms, U, for the AI.

I assume that:

A) A superintelligent AI that is interested in human atoms is limited to the Solar System. The reason for this is that it may need human atoms to build spaceships to travel to the nearest stars, and once it reaches the stars, most of the value of human atoms will be lost.

B) The superintelligent AI has powerful nanotechnology, which can be used to do whatever it wants with atoms, such as building large spaceships, deconstructing planets down to their last atoms, etc.

C) The only resource that AI is interested in is atoms—not energy, not information etc.

D) The utility U of atoms increases linearly with the available mass.

The mass of the Solar System’s planets is 0.135% of the Sun’s mass, which is 1.98855×10^30 kg, so the mass of all planets is roughly 2.6×10^27 kg, most of which is Jupiter. Earth’s mass is 5.97237×10^24 kg.

The total mass of human beings on Earth is 70 kg × 7.5×10^9 people = 5.25×10^11 kg.

In that case, human atoms constitute a marginal utility of U′ = 5.25×10^11 kg / 2.6×10^27 kg ≈ 2×10^-16 U. That is, using humans’ atoms would increase the AI’s utility by about 2 parts in 10^16 of total U.
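The arithmetic above can be checked with a minimal sketch. It assumes, as in assumption D, that utility is linear in mass, so the share of U is just a mass ratio; the constant and function names are illustrative and not from the original post.

```python
# Masses in kilograms, taken from the figures quoted in the text.
PLANETS_MASS_KG = 2.6e27        # approximate mass of all Solar System planets
HUMAN_BODY_MASS_KG = 70.0       # assumed average body mass
POPULATION = 7.5e9              # assumed number of people


def humanity_mass_kg() -> float:
    """Total mass of all human bodies."""
    return HUMAN_BODY_MASS_KG * POPULATION


def utility_share(extra_mass_kg: float, total_mass_kg: float) -> float:
    """Share of total utility U contributed by extra_mass_kg,
    assuming utility is linear in mass (assumption D)."""
    return extra_mass_kg / total_mass_kg


share = utility_share(humanity_mass_kg(), PLANETS_MASS_KG)
print(f"Humanity mass: {humanity_mass_kg():.3e} kg")
print(f"Share of U:    {share:.2e}")
```

With these inputs the share comes out close to 2×10^-16, matching the estimate in the text.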

But this is obviously flawed, as humans can’t survive in a vacuum, they need the whole Earth!

Imagine that the AI wants to deconstruct the Earth for its atoms, but has also decided to preserve human lives. It has two options:

1) Upload everybody into smaller computers. Depending on various estimates of the Landauer limit and of the computational capacity of the human brain, the size of these computers will vary, but they could easily be 1000 times less massive than human bodies.

2) The AI decides not to upload humans, but to build a space station where humans can live approximately the same life as they do currently. As the typical mass of a human house is something like 10 tons, and assuming very effective nanotech, such a space station may require 1000 kg of hardware for every kg of human (or perhaps even less). It will have a mass of 5.25×10^14 kg.

Option (1) is a million times more economical than option (2) for the AI. But even in the case of option (2), the marginal utility of human atoms is U′ = 2×10^-13 U.
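A short sketch comparing the two options, using the mass ratios given in the text (1/1000 of body mass for uploads, 1000 times body mass for a space station). The names are illustrative, and the linear-utility assumption D is carried over.

```python
PLANETS_MASS_KG = 2.6e27      # approximate mass of all planets (from the text)
HUMANITY_MASS_KG = 5.25e11    # 70 kg x 7.5e9 people (from the text)

# Mass needed to preserve humans, per kg of human body mass.
UPLOAD_MASS_RATIO = 1.0 / 1000.0   # option 1: uploads on small computers
STATION_MASS_RATIO = 1000.0        # option 2: space station habitat


def preservation_mass_kg(mass_ratio: float) -> float:
    """Mass the AI must set aside to preserve humanity under a given option."""
    return HUMANITY_MASS_KG * mass_ratio


def utility_share(mass_kg: float) -> float:
    """Share of total utility U given up, assuming utility is linear in mass."""
    return mass_kg / PLANETS_MASS_KG


upload_mass = preservation_mass_kg(UPLOAD_MASS_RATIO)
station_mass = preservation_mass_kg(STATION_MASS_RATIO)

print(f"Option 1 (upload):  {upload_mass:.3e} kg, share {utility_share(upload_mass):.1e}")
print(f"Option 2 (station): {station_mass:.3e} kg, share {utility_share(station_mass):.1e}")
print(f"Station / upload mass ratio: {station_mass / upload_mass:.1e}")
```

Under these assumptions the station costs about 5.25×10^14 kg, a share of roughly 2×10^-13 of U, and is a factor of about 10^6 more expensive than uploading.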

If the number of atoms translates linearly into the speed of the start of galactic colonization (Armstrong & Sandberg, 2013), and the AI needs 1 billion seconds (about 30 years) to convert all the Solar System planets into spaceships, the delay caused by preserving humans on a space station will be around 0.2 milliseconds.
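The delay estimate can be reproduced with a minimal sketch. It assumes that losing a fraction of the available mass delays the start of colonization by the same fraction of the total conversion time; the names are illustrative.

```python
CONVERSION_TIME_S = 1e9          # time to convert all planets (from the text)
STATION_UTILITY_SHARE = 2e-13    # mass share kept for the space station (from the text)
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def delay_seconds(total_time_s: float, mass_share: float) -> float:
    """Delay caused by withholding mass_share of the atoms,
    assuming launch time scales linearly with available mass."""
    return total_time_s * mass_share


delay_ms = delay_seconds(CONVERSION_TIME_S, STATION_UTILITY_SHARE) * 1000.0
conversion_years = CONVERSION_TIME_S / SECONDS_PER_YEAR

print(f"Conversion time: {conversion_years:.1f} years")
print(f"Delay: {delay_ms:.2f} ms")
```

One billion seconds is a little under 32 years, and the resulting delay is about 0.2 milliseconds.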

Given Bostrom’s astronomical waste idea (Bostrom, 2003), that may not be small after all: at the speed of light, 0.2 milliseconds corresponds to about 60 km of additional radius for the sphere of the AI’s reach, and after billions of years this corresponds to a very large volume. Assuming the size of the universe is around 10^21 light-milliseconds and the number of stars in it is around 10^24, a saving of 0.2 milliseconds could mean a gain on the order of hundreds to a thousand stars, equal to hundreds of solar masses.
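A rough sketch of this estimate follows. It rests on assumptions not stated in the post: that the "size" of the universe is a radius, that stars are uniformly distributed, and that the gained region is a thin shell at the edge of the reachable sphere, so its relative volume is about 3·Δr/r.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
DELAY_MS = 0.2                      # delay from preserving humans (from the text)
UNIVERSE_RADIUS_LIGHT_MS = 1e21     # size of the universe in light-milliseconds (from the text)
STARS_IN_UNIVERSE = 1e24            # number of stars (from the text)


def extra_radius_km(delay_ms: float) -> float:
    """Distance light travels during the delay."""
    return SPEED_OF_LIGHT_KM_PER_S * delay_ms / 1000.0


def stars_in_thin_shell(delay_ms: float, radius_light_ms: float, total_stars: float) -> float:
    """Stars in a thin shell of thickness delay_ms at the edge of a sphere.
    Volume scales as r**3, so dV/V is about 3*dr/r when dr is much smaller than r."""
    return 3.0 * (delay_ms / radius_light_ms) * total_stars


radius_gain = extra_radius_km(DELAY_MS)
stars_gained = stars_in_thin_shell(DELAY_MS, UNIVERSE_RADIUS_LIGHT_MS, STARS_IN_UNIVERSE)

print(f"Extra radius: {radius_gain:.1f} km")
print(f"Stars gained: {stars_gained:.0f}")
```

Under these assumptions the sketch gives roughly 60 km of extra radius and several hundred stars; different assumptions about geometry or star counts would shift the result by small factors.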

Even Friendly AI may deconstruct humans for their atoms in the AI’s early stages, as such a sacrifice will translate into a higher total number of sentient beings in the universe in the end.

In another work (Turchin, 2017), I suggested that the “price” of human atoms for an AI is so small that it will not kill humans for their atoms if it has any argument, however small, to preserve them. Here I have given a more detailed calculation.

This argument may fail if we account for the changing utility of human atoms over time. For an early AI, Earth’s surface is the most available source of atoms, and organic matter is the best source of carbon and energy (Freitas, 2000). Such an AI may bootstrap its nanotech infrastructure more quickly if it does not care about humans. However, the AI could start uploading humans, or at least freezing their brains in some form of temporary cryostasis, even in the very early stages of its development. In that case, the AI may acquire most human atoms without killing them.

Armstrong, S., & Sandberg, A. (2013). Eternity in six hours: intergalactic spreading of intelligent life and sharpening the Fermi paradox. Acta Astronautica, 89, 1–13.

Bostrom, N. (2003). Astronomical waste: The opportunity cost of delayed technological development. Utilitas, 15(3), 308–314.

Freitas, R. (2000). Some Limits to Global Ecophagy by Biovorous Nanoreplicators, with Public Policy Recommendations. Foresight Institute Technical Report.

Turchin, A. (2017). Messaging future AI. Retrieved from


19 comments

I know you're not arguing here that using our atoms for something else is the only (or most likely?) reason for a superintelligence to harm us. But just in case some reader gets this impression, here's another reason:

If the utility of the superintelligence is not perfectly aligned with "our utility", then at some point we'll probably want to switch the superintelligence off. So from the perspective of the superintelligence, the current configuration of our atoms might be very net negative. Suppose the superintelligence is boxed and can only affect us by sending us, say, 1 kb of text. It might be the case that killing us all is the only way for 1 kb of text to reliably stop us from switching the superintelligence off.

Yes, true. I estimated that there are many scenarios where AI may kill us, something like 50. I posted them here: agifailures modesand levels/

D. Denkenberger and I wrote an article with a full list of the ways an AI catastrophe could happen, and it is under review now. Kaj Sotala has another classification of such catastrophe types.

Hm, I noticed that your link showed up quite wonky. Here's a fixed version:

Even Friendly AI may deconstruct humans for their atoms in the AI’s early stages, as such a sacrifice will translate into a higher total number of sentient beings in the universe in the end.

“You keep using that word, I do not think it means what you think it means.”

What word do you mean? Friendly AI? It's a term (I'm hardly an expert, but I guess wikipedia should be okay for that intelligence )

I think they're referring to the fact that they wouldn't expect a Friendly AI to deconstruct them.

Also, for some reason, the link is wonky, likely because LessWrong 2.0 parses text contained in underscores as italics. Here's the fixed link:

Which word do you mean?

"Friendly", presumably.

I think there is an interesting calculation to be done here, but the atoms seem obviously irrelevant.

I'd be interested in (a) how large is the slowdown from uploading humans before the information is destroyed? (b) how large is the slowdown from moving biological humans off Earth before rendering it uninhabitable? (c) how large is the slowdown from leaving part of Earth inhabitable to humans indefinitely? (d) how large is the slowdown from leaving most of Earth inhabitable to humans?

I expect it's possible to estimate these to within an order of magnitude or two. I don't think any of them depend on atoms. Relevant calculations include: how early would default economic development render Earth uninhabitable, how hard is it to upload or move humans, how important is manufacturing on Earth?

I think that answering these questions will be part of future work. Without calculations, I think that if the AI decides not to touch Earth at all, but to use only other planets and asteroids for its space engineering, the slowdown will be around 1 year, plus or minus an order of magnitude, which translates into losing something like one billionth of its total utility.

1 year sounds plausible to me and is similar to my default guess, though I could also imagine it being shorter (getting lots of stuff off Earth is quite hard, and getting a small amount of heavy technology off Earth is quite easy).

I am thinking of creating something like a chart where the marginal utility of human atoms is plotted against time (that is, against the AI's capabilities); it will be a quickly diminishing curve.

One reason that Earth may still be attractive for an AI is that it has many high-quality enriched ores created by natural processes involving water and life, while other planets (except maybe Mars) have a very mixed elemental composition. But even here, using just the Pacific Ocean and all its underwater mineral resources may be enough for the AI to bootstrap, without much damage to the biosphere.

On the other hand, Earth is a huge gravity well, and everything will be simpler in the asteroid belt.

Could I ask what the motivation behind this post was?

Another reason was to quantify the popular narrative that the Paperclipper is only interested in your atoms: "The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else".

Makes sense.

This is part of a larger work in which it was buried, so I decided to take it out as a standalone result. The work is about the question: is it possible that a non-aligned AI will not kill humans, and what could we do to make that more likely? The draft is here:

Thanks! I think it makes sense to link it at the start, so new readers can get context for what you're trying to do.

I already posted that content, but it was met rather skeptically.

Ok, so why not rip a page from nature? Throw a replication counter or some other hard limit in an AI's terminal values.

The very concept of a paperclip maximizer is an agent that is stuck maximizing some silly parameter and unable to change. If it's unable to change its terminal values, then a second terminal value that limits how far it can expand will hold.

I don't think this is a counter-argument, to be honest. The most probable scenario if we lose control of AI is that we create some type of evolutionary environment where many types of agents and many terminal values can compete for resources. And it is self-evident that the ultimate winner of that competition is an agent that values copying itself as accurately, rapidly, and ruthlessly as possible. Such agents are paperclip maximizers, in the same way all life is.