This post is motivated by a recent post of Stuart Armstrong on going from preferences to a utility function. It was originally planned as a comment, but seems to have developed a bit of a life of its own. The ideas here came up in a discussion with Owen Biesel; all mistakes in this exposition are mine. I'm not very good with the typesetting engine here, so apologies for latex and other problems.

The basic ideas is as follows. Suppose we have a set of objects, and we are given some information on which objects are preferred to which other objects. Then we are interested in whether and in how many ways this data can be captured by a utility function. Our key innovation is that we assume not only the direction of preferences is given, but also some information on the strength of the preferences, in a manner which we will make precise below (weak preferences).

Basic on orders vs utility functions

We refer to the Order Theory page on Wikipedia for the definitions of reflexive, anti-symmetric, transitive and connexive binary relations. If is a set and is a function (`utility'), this induces a reflexive, transitive and connected binary relation (not anti-symmetric in general, unless is injective).

Conversely, any reflexive, transitive, antisymmetric and connexive binary relation (a.k.a. total order) on a countable set , this is induced by a utility function taking values in the rational numbers (link to proof); there is a more general discussion here.

Strength of preferences

In what follows, we fix a totally ordered abelian group . To express a preference between to objects , of our set , one should give an element of which expresses how strongly is preferred to . The most natural example is to take , then:

Saying is preferred to with strength 1 means we slightly prefer to ;

Saying s1 is preferred to s2 with strength 0 means have no preference between s1 and s2;

Saying s1 is preferred to s2 with strength 2 means we prefer s1 to s2 more strongly;

Saying s1 is preferred to s2 with strength -1 means we slightly prefer s2 to s1.

Expressions of preference

We consider three ways of describing preferences among objects in a set :

1. Weak Preference

Definition. A weak preference among elements of consists of a collection of triples with and .

A triple (s1,s2,g) is interpreted as meaning that is preferred to with strength . This is the kind of data one might be provided with in practise; for example, someone tells you that they slightly prefer cats to dogs, but strongly prefers dogs to snakes (assigning numbers/elements of to the strengths of the preferences).

2. Categorical preference

Definition. To give a categorical preference for elements of is to give a category with objects , together with an enrichment over

This definition may seem a bit cryptic, but is close to a standard way of thinking of an order as an enrichment. For every two objects we assign an element of (which we think of as telling us the strength of the preference for over ), subject to a bunch of compatibility conditions (for example it will imply that the preference for over is given by the unit of ). More details and expansion can be found in Lawvere's Taking Categories Seriously.

3. Utility function

This is just a function , and is interpreted in the usual way.

Goal

In practise, information is likely to be supplied in the form of a weak preference. Ultimately one wants a utility function to feed to some optimisation procedure. We will describe how to pass naturally from a weak to a categorical preference, and then see (under some hypotheses) that a categorial preferences induces an essentially-unique utility function.

We suppose throughout that S and G are fixed, with finite for simplicity.

Weak preferences to categorial preferences

First, given a categorical preference, choosing a subset of arrows yields a weak preference. On the other hand, given a weak preference, we are interested in

Q1. whether this weak preference can arise from a categorical preference, and

Q2. if so, then from how many distinct categorical preferences can this weak preference arise.

Suppose we are given a weak preference . First, for every triple , we adjoin to the triple ; call the resulting weak prefernce A cycle in W' is a finite ordered sequence of triples , and such a cycle is closed if

Claim 1: The weak preference can be extended to a categorical preference if and only if every cycle is closed.

Sketch of proof. Suppose we are given and , and want to assign an element of If there is no path from to we can assign any element we want (we will use this observation later). If there is a path, then the group law in determines a value of the preference of over . The only way a problem might arise is if there are two (or more) paths from to , but then the cycles are closed condition means that these paths will induce the same preference for over . QED

We move on the the uniqueness question. Assume that satisfies the `cycles are closed' condition, and write for the set of connected components of .

Claim 2. The set of categorical preferences restricting to is naturally in bijection with

In particular, if is connected then there is exactly one way to associate a categorical preference.

Sketch of proof. In the proof of claim 1, the only choice we made was when there was no path from to In that case we chose an element of . QED

From categorical preferences to utility functions

Suppose we are given a categorical preference on , i.e. the structure of a category enriched over with object set . Analogously to the above, we are interested in whether and in how many ways this can be induced by a utility function.

From now on, we assume that is Archimedean , equivalently that it is isomorphic (as an ordered group) to a subgroup of . This isomorphism is then necessarily unique up to scaling (Holder's theorem).

Suppose first that we have a utility function . Let be a subgroup containing for every . Then by assigning to the pair the element we give a categorical preference structure, with enrichment over .

Again, we are interested in two questions:

Q1. given a categorical preference on , does it arise from some utility function in the above fashion?

Q2. If the answer to Q1 is `yes', then from` how many` utility functions can out categorical preference arise?

The answer to Q1 is always yes. First choose an embedding of in as a totally ordered group. Then choose some element , and define The values of on the other elements of are then uniquely determined by the enriched category structure.

Q2 is almost as easy. The embedding of in is unique up to scaling, and the choice of is just a translation. Hence the utility function is unique up to translation and scaling.

Conclusion

In practise, we might be given the data of a weak preference. We have seen an easy way to check whether it can be extended to a categorical preference, and a simple description of all the resulting possible categorical preferences. A categorical preference always comes from a utility function, in a way which is unique up to translation and scaling.

In actual practise, it is not unlikely that the weak preference data will not come from a categorical preference (and thus not from a utility function). Then we should probably look for the `most reasonable' associated categorical preference, how to do this is not so clear to me yet.

I find the fact that utility functions are only unique up to translation and scaling a bit awkward; maybe this notion of categorical preference captures the important data in a more canonical fashion?!



New to LessWrong?

New Comment
6 comments, sorted by Click to highlight new comments since: Today at 6:48 AM

I think the most "native" representation of utility functions is actually as a function from ordered triples of outcomes to real numbers. Rather than having an arbitrary (affine symmetry breaking) scale for strength of preference, set the scale of a preference by comparing to a third possible outcome.

The function is the "how much better?" function. Given possible outcomes A, B, and X, how many times better is A (relative to X) than B (relative to X).

If A is chocolate cake, and B is ice cream, and X is going hungry, maybe the chocolate cake preference is 1.25 times stronger, so the function Betterness(chocolate cake, ice cream, going hungry) = 1.25.

This is the sort of preference that you would elicit from a gamble (at least from a rational agent, not necessarily from a human). If I am indifferent to a gamble with a probability 1 of ice cream, and a probability 0.8 of chocolate cake and 0.2 of going hungry, this tells you that betterness-value above.

Anyhow, interesting post, I'm just idly commenting.

Thanks for the comment Charlie.

If I am indifferent to a gamble with a probability of ice cream, and a probability 0.8 of chocolate cake and 0.2 of going hungry

To check I understand correctly, you mean the agent is indifferent between the gambles (probability of ice cream) and (probability 0.8 of chocolate cake, probability 0.2 of going hungry)?

If I understand correctly, you're describing a variant of Von Neumann–Morgenstern where instead of giving preferences among all lotteries, you're specifying a certain collection of special type of pairs of lotteries between which the agent is indifferent, together with a sign to say in which `direction' things become preferred? It seems then likely to me that the data you give can be used to reconstruct preferences between all lotteries...

If one is given information in the form you propose but only for an incomplete' set of special triples (c.f.weak preferences' above), then one can again ask whether and in how many ways it can be extended to a complete set of preferences. It feels to me as if there is an extra ambiguity coming in with your description, for example if the set of possible outcomes has 6 elements and I am given the value of the Betterness function on two disjoint triples, then to generate a utility function I have to not only choose a `translation' between the two triples, but also a scaling. But maybe this is better/more realistic!

. By `special types', I mean indifference between pairs of gambles of the form
(probability of A) vs (probability of B and probability of C)
for some , and possible outcomes A, B, C. Then the sign says that I prefer higher probability of B (say).

I think a reasonable question to ask is what value having relative strength of the preference ordering matters to questions we want to answer. That is, I suspect the reason we normally only consider a preference ordering and not a preference measure is that it's not relevant to observed behavior, since in that case we presume the most preferred possible action will always be taken, thus more information than the order is not relevant to the decision.

I can imagine we might care about measure in cases like modeling how human preferences are actually manifested where they might have relative weights and can be updated, although personally I prefer the idea of avoiding this by making updating appear immutable by conditioning each preference on the entire causal history that came prior to its realization, although this has its own problems.

Sure, in the end we only really care about what comes top, as that's the thing we choose. My feeling is that information on (relative) strengths of preferences is often available, and when it is available it seems to make sense to use it (e.g. allowing circumvention of Arrow's theorem).

In particular, I worry that, when we only have ordinal preferences, the outcome of attempts to combine various preferences will depend heavily on how finely we divide up the world; by using information on strengths of preferences we can mitigate this.

Neat!

Though I should mention that my current version of partial preferences does not assume all cycles are closed - the constrained optimisation can be seen as trying to get "as close as possible" to that, given non-closed cycles.

Thanks! I like the way your optimisation problem handles non-closed cycles.

I think I'm less comfortable with how it treats disconnected components - as I understand it you just translate each separately to have `centre of mass' at 0. If one wants to get a utility function out at the end one has to make some kind of choice in this situation, and the choice you make is probably the best one, so in that sense it seems very good.

But for example it seems vulnerable to creating 'virtual copies' of worlds in order to shift the centre of mass and push connected components one way or the other. That was what started me thinking about including strength of preference - if one adds to your setup a bunch of virtual copies of a world between which one is `almost indifferent' then it seems it will shift the centre of mass, and thus the utility relative to come other chain. Of course, if one is actually indifferent then the 'virtual copies' will be collapsed to a single point in your , but if they are just extremely close then it seems it will affect the utility relative to some other chain. I'll try to explain this more clearly in a comment to your post.