A more mundane example:

The Roomba cleaning robot is scarcely an agent. While running, it does not build up a model of the world; it only responds to immediate stimuli (collisions, cliff detection, etc.) and generates a range of preset behaviors, some of them random.

It has some senses about itself — it can detect a jammed wheel, and the "smarter" ones will return to dock to recharge if the battery is low, then resume cleaning. But it does not have a variable anywhere in its memory that indicates how clean it believes the room is — an explicit representation of a utility function of cleanliness, or "how well it has done at its job". It does, however, have a sensor for how dirty the carpet immediately below it is, and it will spend extra time on cleaning especially dirty patches.

Because it does not have beliefs about how clean the room is, it can't have erroneous beliefs about that either — it can't become falsely convinced that it has finished its job when it hasn't. It just keeps sweeping until it runs out of power. (We can imagine a paperclip-robot that doesn't think about paperclips; it just goes around finding wire and folding it. It cannot be satisfied, because it doesn't even have a term for "enough paperclips"!)
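To make "responds only to immediate stimuli" concrete, here is a minimal sketch of a purely reactive controller. It is not iRobot's firmware; the sensor fields, behavior names, and the 0.5 dirt threshold are all hypothetical. The thing to notice is what is missing: there is no map, no estimate of how clean the room is, and no "done" condition other than the battery running out.

```python
import random
from dataclasses import dataclass


@dataclass
class Sensors:
    """One instantaneous reading. All field names are hypothetical."""
    bumped: bool = False
    cliff: bool = False
    dirt_level: float = 0.0  # dirt directly under the robot, 0.0 to 1.0
    battery: float = 1.0     # remaining charge, 0.0 to 1.0


def choose_behavior(s: Sensors, rng: random.Random) -> str:
    """Map the current reading to a preset behavior.

    The function is stateless: nothing is remembered between calls,
    so there is no belief about the room that could be wrong.
    """
    if s.battery <= 0.0:
        return "stop"
    if s.cliff:
        return "back_up"
    if s.bumped:
        # Some preset responses are random, as described above.
        return rng.choice(["turn_left", "turn_right"])
    if s.dirt_level > 0.5:  # assumed threshold for "especially dirty"
        return "spot_clean"
    return "drive_forward"
```

Calling `choose_behavior(Sensors(dirt_level=0.9), random.Random(0))` returns `"spot_clean"`, and the next call knows nothing about it.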

It is scarcely an agent. To me it seems even less "agenty" than an arbitrage daemon, but that probably has more to do with the fact that it's not designed to interact with other agents. But you can set it on the floor and push the go button, and in an hour come back to a cleaner floor. It doesn't think it's optimizing anything, but its behavior has the result of being useful for optimizing something.

Whether an entity builds up a model of the world, or is self-aware or self-protecting, is to some extent an implementation detail, which is different from the question of whether we want to live around the consequences of that entity's actions.

The agent/tool distinction is in the map, not the territory — it's a matter of adopting the intentional stance toward whatever entity we're talking about. To some extent, saying "agent" means treating the entity as a black box with a utility function printed on the outside: "the print spooler wants to send all the documents to the printer" — or "this Puppet config is trying to put the servers in such-and-so state ..."

It's very hard to avoid apparent teleology when speaking in English. (This is particularly troublesome when talking about evolution by natural selection, where the assumption of teleology is the number one barrier to comprehending how it actually works.)

DaveX: My Roomba does not just keep sweeping until it runs out of power. It terminates quickly in a small space and more slowly in a large space. To terminate, it must somehow sense the size of the space it is working in and compare it to some register of how long it has operated. Roombas try to build up a (very limited) model of how big the room is from the longest uninterrupted traversal they can sense. See "Can you tell me more about the cleaning algorithm that the Roomba uses?" in http://www.botjunkie.com/2010/05/17/botjunkie-interview-nancy-dussault-smith-on-irobots-roomba/
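A rough sketch of the kind of termination rule described here, assuming nothing about the real firmware: the class name, the method names, and the `minutes_per_metre` scaling are all made up for illustration. The only "model" it keeps is the longest uninterrupted run seen so far, which it compares against a register of elapsed time.

```python
class RoomSizeEstimator:
    """Hypothetical stop rule: cleaning budget scales with the longest clear run."""

    def __init__(self, minutes_per_metre: float = 5.0):
        self.minutes_per_metre = minutes_per_metre  # assumed scaling factor
        self.longest_run_m = 0.0   # the (very limited) model of room size
        self.current_run_m = 0.0   # distance since the last bump
        self.elapsed_min = 0.0     # register of how long it has operated

    def travelled(self, metres: float, minutes: float) -> None:
        """Record uninterrupted travel and the time it took."""
        self.current_run_m += metres
        self.elapsed_min += minutes
        self.longest_run_m = max(self.longest_run_m, self.current_run_m)

    def bumped(self) -> None:
        """A collision ends the current uninterrupted run."""
        self.current_run_m = 0.0

    def should_stop(self) -> bool:
        """Stop once elapsed time reaches the budget implied by the room estimate."""
        if self.longest_run_m == 0.0:
            return False  # no estimate yet
        return self.elapsed_min >= self.longest_run_m * self.minutes_per_metre
```

With these assumed numbers, a robot whose longest run is 2 m stops after 10 minutes, while one whose longest run is 6 m keeps going until 30 minutes.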

Tool for maximizing paperclips vs a paperclip maximizer

by private_messaging, 12th May 2012


To clarify a point that is being discussed in several threads here, the tool vs. intentional agent distinction:

A tool for maximizing paperclips would - for efficiency purposes - have a world-model of which it has a god's-eye view (not accessing it through embedded sensors like eyes), and would implement/define a counter of paperclips within this model. The output of this counter is what is being maximized by the problem-solving portion of the tool, not the real-world paperclips.

No real-world intentionality exists in this tool for maximizing paperclips; the paperclip-making problem solver would maximize the output of the counter, not real-world paperclips. Such a tool can be hooked up to actuators and sensors and made to affect the world without a human intermediary, but it won't implement real-world intentionality.

An intentional agent for maximizing paperclips is the familiar 'paperclip maximizer', which truly loves real-world paperclips and wants to maximize them, and would try to improve its understanding of the world to know whether its paperclip-making efforts are successful.

Real-world intentionality is ontologically basic in human language, and consequently there is a very strong bias toward describing the former as the latter.

The distinction: wireheading (either direct or through manipulation of inputs) is a valid solution to the problem being solved by the former, but not by the latter. Of course, one could rationalize and postulate a tool that is not general-purpose enough to wirehead, forgetting that the issue being feared is a tool that is general-purpose enough to design a better tool or to self-improve. That is an incredibly frustrating feature of rationalization: aspects of the problem are forgotten when thinking backwards.
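A toy illustration of the distinction, under loudly stated assumptions: the state fields, the three actions, and the brute-force planner are all hypothetical, and the second objective cheats by reading the simulated "real" paperclip count directly, which is exactly the access an embedded system does not have. It only shows why the same solver treats wireheading as a valid answer under one objective and not under the other.

```python
from itertools import product


def step(state: dict, action: str) -> dict:
    """Apply one action to a toy world state and return the new state."""
    s = dict(state)
    if action == "bend_wire" and s["wire"] > 0:
        s["wire"] -= 1
        s["real_clips"] += 1
        s["counter"] += 1      # the counter tracks reality if left alone
    elif action == "buy_wire":
        s["wire"] += 2
    elif action == "overwrite_counter":
        s["counter"] += 1000   # wireheading: the number moves, the world does not
    return s


def best_plan(objective, actions, horizon=3):
    """Brute-force search for the action sequence that maximizes the objective."""
    start = {"wire": 1, "real_clips": 0, "counter": 0}

    def score(plan):
        s = start
        for a in plan:
            s = step(s, a)
        return objective(s)

    return max(product(actions, repeat=horizon), key=score)


ACTIONS = ("bend_wire", "buy_wire", "overwrite_counter")

# The tool: maximizes the output of the counter.
tool_plan = best_plan(lambda s: s["counter"], ACTIONS)

# The idealized intentional agent: maximizes paperclips themselves.
agent_plan = best_plan(lambda s: s["real_clips"], ACTIONS)
```

Here `tool_plan` is three `overwrite_counter` actions in a row and produces no paperclips at all, while `agent_plan` never touches the counter. Removing `overwrite_counter` from `ACTIONS` makes the tool look well behaved, which is the "not general-purpose enough to wirehead" move.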

The issues with the latter: we do not know whether humans actually implement real-world intentionality in such a way that it is not destroyed under full ability to self-modify (and we can observe that we very much like to manipulate our own inputs; see art, porn, fiction, etc.). We do not have a single certain example of such stable real-world intentionality, and we do not know how to implement it (it may well be impossible). We are also prone to assuming that two unsolved problems in AI - general problem solving and this real-world intentionality - are a single problem, or are necessarily solved together. A map compression issue.