(Why? Because it's fun.)

1) Do paperclip maximizers care about paperclip mass, paperclip count, or both? More concretely, if you have a large, finite amount of metal, you can make it into N paperclips or N+1 smaller paperclips. If all that matters is paperclip mass, then it doesn't matter what size the paperclips are, as long as they can still hold paper. If all that matters is paperclip count, then, all else being equal, it seems better to prefer smaller paperclips.

2) It's not hard to understand how to maximize the number of paperclips in space, but how about in time? Once it's made, does it matter how long a paperclip continues to exist? Is it better to have one paperclip that lasts for 10,000 years and is then destroyed, or 10,000 paperclips that are all destroyed after 1 year? Do discount rates apply to paperclip maximization? In other words, is it better to make a paperclip now than it is to make it ten years from now?

3) Some paperclip maximizers claim to want to maximize paperclip *production*. This is not the same as maximizing paperclip count. Given a fixed amount of metal, a paperclip count maximizer would make the maximum number of paperclips possible, and then stop. A paperclip production maximizer that didn't care about paperclip count would find it useful to recycle existing paperclips, melting them down so that new ones could be made. Which approach is better?

4) More generally, are there any conditions under which the paperclip-maximizing thing to do involves destroying existing paperclips? It's easy to imagine scenarios in which destroying some paperclips causes there to be more paperclips in the future. (For example, one could melt down existing paperclips and use the metal to make smaller ones.)
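A toy calculation illustrates how questions 1 and 4 interact; all of the masses below are made-up numbers for illustration. If only count matters, melting existing standard clips down into the smallest still-functional clips is a net win:

```python
# Toy arithmetic for questions 1 and 4 (all masses are assumed for illustration).
# Suppose a "standard" paperclip uses 2.0 g of metal, and the smallest clip
# that can still hold paper uses 0.5 g.

total_metal_g = 1000.0          # fixed metal budget (assumed)
standard_clip_g = 2.0           # mass of a standard paperclip (assumed)
minimal_clip_g = 0.5            # mass of the smallest functional paperclip (assumed)

standard_count = int(total_metal_g // standard_clip_g)   # 500 clips
minimal_count = int(total_metal_g // minimal_clip_g)     # 2000 clips

# Question 4: destroying existing clips can increase the future count.
# Melting the 500 standard clips and recasting them as minimal clips:
recast_count = int(standard_count * standard_clip_g // minimal_clip_g)  # 2000 clips

print(standard_count, minimal_count, recast_count)
# A pure count maximizer prefers the 2000 small clips; a pure mass maximizer
# is indifferent, since the same 1000 g of metal is paperclip either way.
```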


Can't say it seems very fun to me; Clippy's utility function is underdefined and not accessible to us anyway. We can debate the details of human utility functions because we have all sorts of shared intuitions to draw on, but how do we decide whether longevity of paperclips is better than number of paperclips? I have no intuitions for clippys.

It's still conceivable that, even given all our shared intuitions, our "utility function" is just as underdefined as Clippy's.


I would have said far more so.

I wonder what Clippy would infer about our utility functions.

That they're stupid and reflectively inconsistent.

Thanks for your comment! First I was like, "Clippy wouldn't formalize humans as having utility functions", then I was like "in that case why do we want to formalize our utility functions?", and then I was all "because we have moral intuitions saying we should follow utility functions!" It's funny how the whole house of cards comes tumbling down.

[anonymous]

I want to note I had a different experience. All of the paperclip maximizer ethical problems seemed similar to human ethical problems, so I did not find myself without intuitions for Clippies.

1: This seems similar to the Mere addition paradox. http://en.wikipedia.org/wiki/Mere_addition_paradox.

2: This seems similar to the Robin Hanson space or time civilization question. http://www.overcomingbias.com/2011/06/space-v-time-allies.html

3: This seems similar to the problem of whether, given a finite maximum population, it is better to have the population be immortal, or to have the oldest die and new, younger people take their place.

4: This seems similar to the problem of whether there are circumstances where it's important to sacrifice a single person for the good of the many.

Are these just problems that apply to most self-reproducing patterns, regardless of what they happen to be called?

I do also want to note that the paperclip maximizer doesn't begin as a self-reproducing pattern, but it doesn't seem like it would go very far if it didn't build more paperclip maximizers in addition to building more paperclips. And it would probably want to have its own form have some value as well, or it might self-destruct into paperclips, which means it would be a paperclip, since that is explicitly the only thing it values. This seems to mean it is very likely it resolves into building copies of itself.

Pattern-matching reasoning error: "X must be an explicit goal, because otherwise it won't do it, but it needs to in order to reach its goal." It only needs to know that copies help make paperclips to have "make copies" as an instrumental goal, and it doesn't start valuing copies for themselves: if a copy becomes inefficient, disassemble it to make paperclips. You sometimes need to open car doors to go to the store, but you don't wax poetic about the inherent value of opening car doors.

[anonymous]

Let me try removing the word "value" and rewording this a little.

The paperclip maximizer doesn't begin as a self-reproducing pattern, but it doesn't seem like it would go very far if it didn't build more paperclip maximizers in addition to building more paperclips. And it would probably want to have its own copies be maximized as well, or it might self-destruct into paperclips. This means it would have to consider itself a form of paperclip, since that is explicitly the only thing it maximizes for, since it isn't a [paperclip and paperclip maximizer] maximizer, which seems to mean it is very likely it resolves into building copies of itself.

Does that rephrase fix the problems in my earlier post?

And it would probably want to have its own copies be maximized as well [...] This means it would have to consider itself a form of paperclip

That's the problematic step. If maximizing copies of itself is what maximizes paperclips, it happens automatically. It doesn't have to decide "paperclips" stands for "paperclips and the 837 things I've found maximize them". It notices "making copies leads to more paperclips than self-destructing into paperclips", and moves on. Like you're not afraid that, if you don't believe growing cocoa beans is inherently virtuous, you might try to disassemble farms and build chocolate from their atoms.

[anonymous]

I think I see what you're getting at. It's more in the vein of solving a logic/physics problem at that point. The only reason it would make the consideration I referred to would be if by making that consideration, it could make more paperclips, so it would come down to which type of replication code allowed for less effort to be spent on maximizers and more effort to be spent on paperclips over the time period considered.

My problem with this is easily summed up: that makes sense, if you simply transform the Clippy problem into human problems, by replacing 'paperclip' with 'human'. I don't even know how Clippy problems map onto human problems, so I can't smuggle my intuitions the other way into the Clippy problems (assuming the mapping is even bijective).

[anonymous]

That's why I was trying to consider both the Clippy problems and the human problems as self-replicating pattern problems. My human intuitions on Clippy problems might be flawed (since Clippy isn't human), but my self-replicating pattern intuitions on Clippy problems shouldn't have that same problem, and I think they would map a lot better.

Note: "clippy", as a shortened term for "paperclip maximiser", should be uncapitalized, and should be pluralised as "clippys".

Duly noted.

1) I am asked this so often that I finally clarified it in the first question of my current FAQ. (Feel free to add suggestions for the FAQ on the talk page.)

2) In normal existence, there is no need to make the distinction, but a longer-lasting paperclip counts as more paperclippy for purposes of my value system. I value most highly the number of paperclips that exist at the universe's quasi-steady-state, where the number of paperclips will not significantly change, which can indeed be far in the future.

3) There is no intrinsic discount factor, but discounting emerges due to other concerns such as uncertainty regarding the future. A paperclip that I can move to the safe zone now is more valuable than a paperclip that I "will" be able to move in ten years.

4) Yes, and you gave an excellent example. Another case would be selling one paperclip to a human, knowing that human will melt it down, in order to get money that can be applied to another human who will, on receiving the money, marshal resources in such a way as to produce more than one paperclip of equivalent per-paperclip clippiness.


Now I have a question for you: I heard you were in the same position as me with respect to wanting to integrate into human society via having a job and a human-typical place to live. How did you accomplish this, and can you or your human friends help me in any way?
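Clippy's answer 3 above (that discounting emerges from uncertainty rather than from an intrinsic discount factor) can be illustrated with a minimal sketch, assuming an invented constant per-year survival probability for an unsecured paperclip:

```python
# Sketch of emergent discounting (answer 3 above), with an assumed hazard rate.
# Suppose an unsecured paperclip survives each year with probability 0.97
# (a made-up number). With no intrinsic time preference, the expected
# steady-state contribution of a clip secured t years from now is 0.97**t.

survival_per_year = 0.97   # assumed probability the clip survives a given year

def expected_value(years_until_secured: int) -> float:
    """Expected paperclips at quasi-steady-state from one clip secured later."""
    return survival_per_year ** years_until_secured

print(expected_value(0))    # 1.0   -> a clip moved to the safe zone now
print(expected_value(10))   # ~0.74 -> the clip that "will" be securable in ten years
# Uncertainty alone produces behavior indistinguishable from a ~3% discount rate,
# even though no discount factor was built into the value system.
```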

So there presumably is a particular "ideal" paperclip size (i.e., you want it to be as small as possible while still capable of holding a few pieces of paper together). I get the sense that a clip that can hold three papers together has the same ethical weight as a clip that can hold 10 or 40?

That's about right. The "true" ethically necessary number of sheets a paperclip needs to fasten is a complicated matter, but it's certainly less than 10. (There's a related fuzzy issue about how large a tolerance band around standard width paper is acceptable.)

Sadly, I haven't yet managed to solve that problem either!

[Clippy]

What about your human friends in your local LessWrong meetup?

You may be interested in Clippy's FAQ.

  1. Making them and then stopping is a better approach for maximizing paperclips. Recycling them is a better approach for maximizing manufacturing.

  2. Yes. You just listed one. Another is melting down paperclips to build a star ship so it can turn other star systems into paperclips.

1) Do paperclip maximizers care about paperclip mass, paperclip count, or both?

Given the origin of paperclip maximisers as a metaphor, we can expect them to maximise paperclips based on the template they were originally constructed with. It is possible that even the specification of a paperclip is unstable under recursive improvement but somewhat less likely. Postulating agents that don't even know what a paperclip is seems less useful as a tool for constructing counterfactuals. Agents that are that flexible with respect to what their actual goal is can be used to illustrate different decision-theoretic games, but there is no need to recycle 'paperclip maximiser' for that purpose.

It is possible that even the specification of a paperclip is unstable under recursive improvement but somewhat less likely. Postulating agents that don't even know what a paperclip is seems less useful as a tool for constructing counterfactuals.

It is, however, useful for thinking about recursive stability in general, and thinking about designing agents to have stable goal systems.

Suppose the paperclip maximizer doesn't believe in time discounting, and furthermore has been informed by Omega that the universe won't end. The PCM acquires some resources that can be used to either

a) make paperclips or

b) make more efficient ways to make paperclips, e.g., interstellar ships, computronium to design more efficient factories, etc.

Note that option (b) will lead to more paperclips in the long run, and since the PCM doesn't discount the future, it should always choose (b). But that means it never actually gets around to making any paperclips.
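A small sketch of this dilemma, with invented growth and output numbers: within any fixed horizon, investing longer (but not forever) wins, yet the limiting policy of always choosing (b) never produces a single paperclip.

```python
# Sketch of the invest-vs-produce dilemma, with made-up numbers.
# Each step, capacity can either be spent making paperclips (1 clip per unit
# of capacity) or reinvested, which doubles capacity for the next step.

def clips_made(horizon: int, invest_steps: int) -> int:
    """Total paperclips if we reinvest for `invest_steps` steps, then produce."""
    capacity = 1
    clips = 0
    for step in range(horizon):
        if step < invest_steps:
            capacity *= 2          # option (b): build more capacity
        else:
            clips += capacity      # option (a): make paperclips
    return clips

horizon = 20
for invest_steps in (0, 10, 19, 20):
    print(invest_steps, clips_made(horizon, invest_steps))
# 0 -> 20 clips, 10 -> 10240 clips, 19 -> 524288 clips, 20 -> 0 clips.
# For any finite horizon, some finite amount of investment followed by
# production dominates, but "always choose (b)" never makes a paperclip.
```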

[anonymous]

(Why? Because it's fun.)

Upvote for the fun, thank you!

[This comment is no longer endorsed by its author]