My standard answer to the kinds of points you're making here is in A Simple Toy Coherence Theorem, specifically this section.
Coherence is not about whether a system "can be well-modeled as a utility maximizer" for some utility function over anything at all, it's about whether a system can be well-modeled as a utility maximizer for utility over some specific stuff.
The utility in the toy coherence theorem in this post is very explicitly over final states, and the theorem says nontrivial things mainly when the agent is making decisions at earlier times in order to influence that final state - i.e. the agent is optimizing the state "far away" (in time) from its current decision. That's the prototypical picture in my head when I think of coherence. Insofar as an incoherent system can be well-modeled as a utility maximizer, its optimization efforts must be dominated by relatively short-range, myopic objectives. Coherence arguments kick in when optimization for long-range objectives dominates.
My understanding based on this is that your definition of “reasonable” as per my post is “non-myopic” or “concerned with some future world state”?
Yes.
In my head I usually think of it as non-myopic in spacetime (as opposed to just time), but the version which is (somewhat) justified by the Toy Coherence Theorem is non-myopia over time.
For example, if I appear to have very different preferences at different points in time (e.g. I prefer to hold a red apple on odd hours and a green apple on even hours), you can extract money from me, and that seems "irrational" to us.
This is everyday rational behaviour, disguised by the details of the example. Sometimes I want something to eat and sometimes I do not. Are the managers of restaurants and food shops money-pumping me by selling me sustenance over and over? If I take the train to work in the morning, and take the train home in the evening, is the train operator money-pumping me? If I have a bed for when I prefer to sleep, and chairs for when I prefer to wake, is the furniture store money-pumping me? No, I am gaining from each purchase and have no cause to regret any of them.
Another classic example: someone who mostly has consistent preferences, which can be simply described by a utility function, but also prefers apples to bananas, bananas to oranges, and oranges to apples.
At the moment there are cherries, plums, and oranges in my fruit bowl. (There are! I am not making this up!) Which of them I choose, when I do choose, is negatively correlated with my most recent choices. So which of them do I prefer? The ontology of the question is wrong. My preference function, if there is one, is not over "cherries", "plums", "oranges", or any other types of fruit. In the longer term, almost every fruit I see in the shops will rotate through my fruit bowl on occasion.
If someone's preferences look incoherent to me, maybe I am mistaken about what sorts of things their preferences are over. Behaviour reveals nothing, without the motive.
"Utility Maximizer" exists in the map, not the territory. It's something we can apply to model the behaviour of things in the territory. As in all cases, models make a trade-off between simplicity and accuracy.
Some entities are particularly well modelled (by me) as carrying out a strategy of "Maximize [X]" where [X] is a short description of some outcome.
(The classic example of "Stockfish" being well modelled by "Achieve wins in chess" comes to mind. Someone might well model a company as executing a strategy of "Maximize your profits" or a politician as executing a strategy of "Maximize your popularity".)
This isn't perfect, obviously. We might need to add some extra information. For example, we can describe a chess player as executing "Win chess" but with an extra variable of "ELO = 1950" which describes the power of that utility maximizer. Likewise, you might model a doctor as executing a strategy of "Cure patients" but subject to a limited set of knowledge. This isn't really what people mean by "irrational" though, since these are mostly just limitations.
What really makes an entity "irrational" is when your model of it contains pretty much any other kind of behaviour. For example, those Go bots whose behaviour is well modelled as "Win at Go, ELO = Superhuman, EXCEPT behave as if you think cyclic patterns are unbeatable". In that case, the Go engine is behaving irrationally under our model of the world.
(Another classic example: someone who mostly has consistent preferences, which can be simply described by a utility function, but also prefers apples to bananas, bananas to oranges, and oranges to apples. This puts an epicycle in our model if we have to model their fruit-swapping behaviour.)
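To make the "epicycle" idea concrete, here is a minimal toy sketch in Python (my own illustration, not anything from the original comments; the utility numbers and the override table are made up). The agent's behaviour is mostly captured by a simple utility table, but the cyclic fruit choices force us to bolt on a special-case override, and counting such overrides is one crude way to cash out "degree of irrationality under this model":

```python
# Toy sketch of an "epicycle": a simple utility table plus a special-case override
# that no single assignment of utilities to fruits could reproduce.

base_utility = {"money": 1.0, "leisure": 0.8, "apple": 0.5, "banana": 0.5, "orange": 0.5}

# The epicycle: a cyclic override for pairwise fruit choices.
cyclic_override = {
    frozenset({"apple", "banana"}): "apple",   # prefers apples to bananas
    frozenset({"banana", "orange"}): "banana", # prefers bananas to oranges
    frozenset({"orange", "apple"}): "orange",  # prefers oranges to apples (!)
}

def choose(option_a: str, option_b: str) -> str:
    """Predict the agent's choice: apply the override if it covers this pair, else maximise base utility."""
    winner = cyclic_override.get(frozenset({option_a, option_b}))
    if winner is not None:
        return winner
    return option_a if base_utility[option_a] >= base_utility[option_b] else option_b

print(choose("apple", "banana"), choose("banana", "orange"), choose("orange", "apple"))
# -> apple banana orange: the three answers form a cycle, so no utility numbers over
#    fruits alone can reproduce them; the override table is the extra epicycle the model needs.
```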
An entity is irrational under some model to the degree that modelling its behaviour requires epicycles. A person might appear like a utility maximizer (and thus, very rational) to a much stupider person (who would not be able to model their behaviour in any other way), but very unlike a utility maximizer to a superintelligent AI. Since most humans don't vary in intelligence by that much, most of the time we're working under similar models, so we can just talk about entities being irrational.
Caveat:
We might want to talk about which agents are more or less rational in general, which then means we're making a claim that our models reflect some aspect of reality. A more (or less) rational agent is then one which is overall considered more (or less) rational under a wide variety of high-accuracy low-complexity models.
AI risk arguments often gesture at smarter AIs being “more rational”/“closer to a perfect utility maximizer" (and hence being more dangerous) but what does this mean, concretely? Almost anything can be modeled as a maximizer of some utility function.
The only way I can see to salvage this line of reasoning is to restrict the class of utility functions an agent can have, such that the agent's best-fit utility function cannot be maximized until it gets very capable. The restriction may be justified on the basis of which kinds of agents are unstable under real-world conditions or will get outcompeted by other agents.
With people, you can appeal to the notion of reasonable vs. unreasonable utility functions, and hence look at their divergence from a maximizer of the best-fit "reasonable" utility function. For example, if I appear to have very different preferences at different points in time (e.g. I prefer to hold a red apple on odd hours and a green apple on even hours), you can extract money from me, and that seems "irrational" to us. But it’s only truly irrational if you require that I’m not indifferent to money and that I don’t prefer different fruit depending on the current time.
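To spell out the extraction step, here is a minimal money-pump sketch in Python (my own illustration, relying on exactly those two assumptions: the agent cares about money, and its apple preference really does flip by the hour; the fee size is made up):

```python
# Minimal money-pump sketch for the odd/even-hour apple preference: the trader swaps
# the same two apples back and forth, charging a small fee for each swap the agent
# is happy to make, so the fees accumulate while the agent never ends up with more fruit.

FEE = 0.05  # assumed: the agent will pay up to this for the colour it currently prefers

def preferred_colour(hour: int) -> str:
    return "red" if hour % 2 == 1 else "green"

held, extracted = "green", 0.0   # start of the day, holding the green apple
for hour in range(24):
    if preferred_colour(hour) != held:
        held = preferred_colour(hour)  # trader hands over the other apple...
        extracted += FEE               # ...and pockets the fee

print(round(extracted, 2))  # 1.15 extracted over one day; the agent still holds exactly one apple
```

Relax either assumption (let the agent be indifferent to money, or allow genuinely time-indexed preferences) and the "extraction" stops being a loss by the agent's own lights, which is the caveat above.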
You can also informally constrain the set of "reasonable" utility functions by what people say they want. Like if I say "I want to win a chess tournament" you might consider me irrational if I get drunk on the day of the tournament. In any particular real-world situation where people discuss rationality and preferences, we can use a rough situation-specific model of "what kinds of things can one have preferences over" and this allows us to constrain the set of valid utility functions.
My point here is that it's easy to model something as an EU-maximizer if you allow "unreasonable"-seeming utility functions. If you're saying it's only rational when the utility function you're maximizing is the one you "want" to maximize, how do you define "want" here in a non-circular way?
Simplicity per se doesn't make sense as the criterion: e.g. if I want to maximize the value of my coins and you want to maximize the total size of all your coins, it doesn't seem relevant which of those goals is simpler.
There are only four axioms, so this doesn't provide a lot of differences in degree.
A narrow type of "reasonableness" restriction might be that you're not allowed to prefer different things depending on the current time, otherwise over time you'll bleed real-world resources (like energy or money) and get outcompeted by other agents that don't have time-varying preferences.
However, such a restriction seems insufficient. For example, I could say that Deep Blue is just as rational/EU-maximizing as Stockfish if each engine's utility function is just a lookup table of which move it prefers to take in each board state.
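To make the lookup-table move explicit, here's a small sketch (my own illustration with made-up names, not anyone's actual engine): given any deterministic policy at all, you can write down a utility function over (state, move) pairs that the policy maximises exactly, which is why "can be modelled as an EU maximizer" does no work on its own.

```python
from typing import Callable, Dict, List

def lookup_table_utility(policy: Callable[[str], str]) -> Callable[[str, str], float]:
    """Utility over (state, move) that the given policy maximises: 1 for the chosen move, 0 otherwise."""
    return lambda state, move: 1.0 if move == policy(state) else 0.0

def as_maximizer(utility: Callable[[str, str], float],
                 legal_moves: Dict[str, List[str]]) -> Callable[[str], str]:
    """Recover a policy by picking the utility-maximising move in each state."""
    return lambda state: max(legal_moves[state], key=lambda move: utility(state, move))

# A deliberately silly 'engine' that always plays the alphabetically first legal move.
legal = {"start": ["e4", "a4", "h4"]}
silly_policy = lambda state: sorted(legal[state])[0]

u = lookup_table_utility(silly_policy)
recovered = as_maximizer(u, legal)
print(recovered("start"), silly_policy("start"))  # a4 a4 -- the silly policy is a perfect maximiser of u
```

The utility function exists, but it is exactly as complicated as the behaviour it "explains", so it does no explanatory work.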
If you prefer different things depending on extremely subtle changes, you're not robust to noise in perception and computation. Therefore you're unlikely to be able to fulfil your preferences under real-world constraints.
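A toy way to see this (my own construction, with arbitrary numbers): compare an agent whose preference tracks a coarse feature of the options with one whose preference hinges on a very fine-grained feature, and add a little perception noise; the latter's revealed choices flip from one presentation to the next, which is exactly the kind of across-time inconsistency discussed above.

```python
import random

# Toy sketch: small perception noise makes a preference that hinges on a subtle
# feature flip from one look to the next, while a coarse preference stays stable.

random.seed(1)
NOISE = 0.01  # assumed perceptual noise

def perceive(x: float) -> float:
    return x + random.gauss(0.0, NOISE)

def coarse_choice(a: float, b: float) -> str:
    return "A" if perceive(a) > perceive(b) else "B"  # prefers the clearly larger value

def fragile_choice(a: float, b: float) -> str:
    # preference hinges on the parity of the 4th decimal digit of the perceived difference
    return "A" if int(abs(perceive(a) - perceive(b)) * 10_000) % 2 == 0 else "B"

a, b = 7.30, 5.10                                      # a clearly better and a clearly worse option
for chooser in (coarse_choice, fragile_choice):
    picks = [chooser(a, b) for _ in range(1_000)]      # present the same pair 1,000 times
    print(chooser.__name__, picks.count("A") / len(picks))
# coarse_choice picks A essentially every time; fragile_choice comes out roughly 50/50,
# so across time its behaviour looks just like the inconsistent preferences above.
```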
I think a combination of A4 and A5 is the way to go here; when people discuss "approximation of a utility maximizer" what they really mean is "approximation of a utility maximizer whose preferences are consistent over time and under small perturbations".