It seems like a good portion of the whole "maximizing utility" strategy that a sovereign might use relies on actually being able to consolidate human preferences into utilities. I think there are a few stages here, each of which may present obstacles. I'm not sure what the current state of the art is with regard to overcoming these, and I'm curious to find out.

First, here are a few assumptions that I'm using just to make the problem a bit more navigable (dealing with one or two hard problems instead of a bunch at once). I'll need to go back later, do away with each of these (and each combination thereof), and see what additional problems result.

  1. The sovereign has infinite computing power (and to shorten the list of assumptions, can do 2-6 below)
  2. We're maximizing across the preferences of a single human (Alice for convenience). To the extent that Alice cares about others, we're accounting for their preferences, too. But we're not dealing with aggregating preferences across different sentient beings, yet. I think this is a separate hard problem.
  3. Alice has infinite computing power.
  4. We're assuming that Alice's preferences do not change and cannot change, ever, no matter what happens. So as Alice experiences different things in her life, she has the exact same preferences. No matter what she learns or concludes about the world, she has the exact same preferences. To be explicit, this includes preferences regarding the relative weightings of present and future worldstates. (And in CEV terms, no spread, no distance.)
  5. We're assuming that Alice (and the sovereign) can deductively conclude the future from the present, given a particular course of action by the sovereign. Picture a single history of the universe from its beginning to now, and a bunch of worldlines running into the future depending on what action the sovereign takes. To clarify: if you ask Alice about any single little detail across any of the future worldlines, she can tell you that detail.
  6. Alice can read the minds and preferences of other humans and sentient beings (implied by 5, but I'm trying to be explicit).

So Alice can conclude anything and everything, pretty much (and so can our sovereign.) The sovereign is faced with the problem of figuring out what action to take to maximize across Alice's preferences. However, Alice is basically a sack of meat that has certain emotions in response to certain experiences or certain conclusions about the world, and it doesn't seem obvious how to get the preference ordering of the different worldlines out of these emotions. Some difficulties:

  1. The sovereign notices that Alice experiences different feelings in response to different stimuli. How does the sovereign determine which types of feelings to maximize, and which to minimize? There are a bunch of ways to deal with this, but most of them seem to have a chance of error (and the cumulative probability of at least one error, across all the times the sovereign will need to do this, approaches 1). For example, it could train off an existing data set, or it could simulate other humans with access to Alice's feelings and cognition and have a simulated committee discuss and reach a decision on each one, etc. But all of these bootstrap off of the assumed ability of humans to determine which feelings to maximize (just with amped-up computing power) - this doesn't strike me as a satisfactory solution.
  2. Assume 1 is solved. The sovereign knows which feelings to maximize. However, it has ended up with a bunch of axes. How does it determine the appropriate trade-offs to make? (Or, to put it another way, how does it determine the relative value of a position along one axis against positions along the other axes? See the toy sketch just after this list.)
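To make difficulty 2 concrete, here is the toy sketch mentioned above (hypothetical numbers, purely for illustration): three candidate worldlines scored along two feeling-axes the sovereign has decided to maximize. Which worldline wins depends entirely on how the axes are traded off against each other, and nothing in the axes themselves says what those weights should be.

```python
# Toy illustration of the trade-off problem (made-up numbers): the ranking of
# worldlines flips depending on the weights chosen for each feeling-axis.
worldlines = {
    "W1": {"contentment": 9.0, "excitement": 2.0},
    "W2": {"contentment": 5.0, "excitement": 6.0},
    "W3": {"contentment": 2.0, "excitement": 9.0},
}

def rank(weights):
    """Order worldlines by a weighted sum of the feeling-axes."""
    def score(name):
        return sum(weights[axis] * value for axis, value in worldlines[name].items())
    return sorted(worldlines, key=score, reverse=True)

print(rank({"contentment": 1.0, "excitement": 0.2}))  # ['W1', 'W2', 'W3']
print(rank({"contentment": 0.2, "excitement": 1.0}))  # ['W3', 'W2', 'W1']
```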

So, to rehash my actual request: what's the state of the art with regard to these difficulties, and how confident are we that we've reached a satisfactory answer?


Humans don't. "Utility" is part of the map, not part of the territory. We make choices, but utility theory is only a modeling language used to describe choice-making processes.

One area of research you may want to investigate is "Revealed Preference", a concept developed primarily by Paul Samuelson.

"Revealed Preference" has issues, because of things like circular preferences - although it's a mistake to conclude that circular preferences are proof that humans are irrational. Rather, it demonstrates that utility theory in general is just a model, and an incomplete one.

The fundamental issue is that utility, as a model, attempts to compress a topography of many dimensions - human preferences - into a topography of exactly one - a utility value for each potential choice. Impossible Objects - "contradictions", such as circular preferences - are to be expected in the abbreviated topography.
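To make the "impossible object" point concrete, here's a tiny sketch (my own toy example, not anything from the revealed-preference literature): no assignment of real-valued utilities can reproduce a strict cycle A > B > C > A, which is exactly the kind of structure that gets flattened away.

```python
# Check every possible strict utility ranking of three options against a
# circular set of strict preferences; none of them fits.
from itertools import permutations

options = ["A", "B", "C"]
strict_prefs = [("A", "B"), ("B", "C"), ("C", "A")]  # the circular preferences

def consistent(utility):
    """True if every observed strict preference x > y has utility[x] > utility[y]."""
    return all(utility[x] > utility[y] for x, y in strict_prefs)

# Only the ordering of the utility values matters, so checking every strict
# ranking covers all real-valued assignments (ties can never satisfy a strict
# preference, so they don't need checking).
fits = any(consistent(dict(zip(perm, (3, 2, 1)))) for perm in permutations(options))
print("Some utility assignment reproduces the cycle:", fits)  # False
```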

[anonymous]

Why is self-reference expected when reducing the dimensions? Is it because these dimensions might influence each other in a circular way?

Circles are valid two-dimensional objects. What mapping do you use to represent a circle in one dimension?

[anonymous]

Well, there is one case in which naive utility theory makes perfect sense: when the utility function is just measuring the value of some real-number random variable inside the epistemic model (ie: when reading a number off your map tells you the utility of the territory). Since utility theory was invented to deal with economics, in which such a random variable exists and is called "money", nobody ever bothered to ask what happened when you didn't have such a convenient real-valued, assumed-monotonic random variable.

True. Although I think most utility theorists would be somewhat horrified if you suggested that money was the only thing worth measuring, when measuring utility.

[anonymous]

Well of course, because they conceived of utility theory as giving value to money. They also invented a utility theory that only really applies to measuring money. It was a kind of doublethink: if real human preferences don't fit a model constructed to deal with money, economists conclude that humans are Irrational (in a capital-letter ideological sense) rather than trying to come up with a model of evaluative reasoning that actually explains the data gained from real people.

[anonymous]

And utility starts to become absurd when discussing multi-agent systems. For instance, if every person assigns utility preferences to every other person, at different levels of confidence depending on their mentalising ability... Voting theory constructs some solutions to this problem, but if I've ever come across someone who's familiar enough with voting theory for us to interact in optimal ways, I've never known it. I would also recommend looking up the social and cognitive determinants of altruistic behaviour and kindness to supplement the dry economic blurb you'll get by looking at the research on revealed preference.

[This comment is no longer endorsed by its author]

This general problem has been studied by Stuart Russell, Andrew Ng, and others. It's called "Inverse Reinforcement Learning" (IRL), and the general idea is to infer an approximation of agent A's utility function from training data that includes A's actual decisions, and then use that learned utility function in an RL agent B, where B can satisfy A's goals, perhaps better than A itself (by thinking faster and/or predicting the future better).

You need to start with some sensible priors over A's utility function for the problem to be well-formed, but after that it becomes a machine learning problem.
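For a sense of what that looks like, here's a toy sketch (my own, with made-up numbers, not the actual Ng/Russell algorithm): assume A's utility is linear in some features of each option, assume A chooses softmax-rationally, and recover the weights by penalized maximum likelihood from A's observed choices.

```python
# Toy IRL sketch: infer a hidden linear utility from observed choices.
import numpy as np

rng = np.random.default_rng(0)
n_situations, n_options, n_features = 500, 4, 3
true_w = np.array([2.0, -1.0, 0.5])          # A's hidden utility weights

# Each situation presents A with a few options described by feature vectors.
options = rng.normal(size=(n_situations, n_options, n_features))

def choice_logprobs(w):
    """Log probability of choosing each option, under softmax rationality."""
    u = options @ w                            # (situations, options) utilities
    u = u - u.max(axis=1, keepdims=True)
    return u - np.log(np.exp(u).sum(axis=1, keepdims=True))

# Training data: A's actual decisions in each situation.
choices = np.array([rng.choice(n_options, p=np.exp(lp))
                    for lp in choice_logprobs(true_w)])

def loss(w):
    """Mean negative log likelihood of A's choices, plus a weak quadratic
    penalty standing in for the 'sensible priors' mentioned above."""
    lp = choice_logprobs(w)
    return -lp[np.arange(n_situations), choices].mean() + 1e-3 * (w @ w)

# Fit by plain gradient descent with finite-difference gradients (kept
# dependency-free on purpose; a real implementation would use scipy or autodiff).
w, eps, lr = np.zeros(n_features), 1e-5, 0.5
for _ in range(300):
    grad = np.array([(loss(w + eps * e) - loss(w - eps * e)) / (2 * eps)
                     for e in np.eye(n_features)])
    w -= lr * grad

print("true weights:     ", true_w)
print("recovered weights:", np.round(w, 2))   # roughly recovers true_w
```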

What does this method produce if there is no utility function that accurately models the agent's decisions?

I'm not sure, but I'd guess it wouldn't produce much. For example, if the agent is just making random decisions, well you won't be able to learn from that.

The IRL research so far has used training data provided by humans, and can infer human-goal-shaped utility functions for at least the fairly simple problem domains tested so far. Most of this research was done almost a decade ago and hasn't been as active recently. In particular, if you scaled it up with modern tech, I bet that IRL techniques could learn the score function of Atari games just from watching human play, for example.

One relevant idea is that there is a duality between assigning utilities to actions (or equivalently, being able to pick your favorite option out of a probabilistic mix of actions), and assigning utilities to outcomes. Acting consistently in one way implies that you are also acting consistently in the other.

Since humans are much better at picking actions than we are at evaluating entire world-states, this is pretty handy (though it comes nowhere near solving the entire problem). Paul Christiano has a writeup of what a naive-ish application of this idea would look like here.
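To make the duality a bit more concrete, here's a small sketch (my own illustration, not from Christiano's writeup): if an agent has consistent preferences over probabilistic mixes of outcomes, you can back out a utility for any outcome B by finding the probability p at which the agent is indifferent between "B for sure" and a lottery giving the best outcome with probability p and the worst otherwise. On a scale with u(worst) = 0 and u(best) = 1, that indifference point is u(B). The `prefers_lottery` oracle below is a hypothetical stand-in for asking the agent.

```python
# Recover outcome utilities purely from choices over lotteries.
HIDDEN_UTILITY = {"worst": 0.0, "ok": 0.3, "good": 0.8, "best": 1.0}

def prefers_lottery(outcome, p):
    """Stand-in for the agent: lottery (p: best, 1-p: worst) vs. outcome for sure?"""
    return p * 1.0 + (1 - p) * 0.0 > HIDDEN_UTILITY[outcome]

def elicit_utility(outcome, tol=1e-6):
    """Binary-search the indifference probability; only choices are observed."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (lo, mid) if prefers_lottery(outcome, mid) else (mid, hi)
    return (lo + hi) / 2

for outcome in HIDDEN_UTILITY:
    print(outcome, round(elicit_utility(outcome), 3))
# Recovers 0.0, 0.3, 0.8, 1.0 from choice behavior alone.
```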

So if I understand this correctly, Alice and the sovereign are identically omniscient, and the sovereign additionally has some power and influence upon the world that Alice does not. In the case where Alice herself is the sovereign, the problem is solved, right? The sovereign just has to figure out what she prefers and do that. The solution, then, is to simulate the scenario where Alice has the power to make the decision herself and then match Alice's decision. This solves both 1 and 2.

My short answer to the broader "How do we know what sacks of meat / circuits / whatever prefer" question is "you look at the behavioral output". Here, if Alice can make the decision herself, the decision represents her behavioral output.

(I'm about halfway through writing about how to make this idea more workable without resorting to omniscient things with consistent preferences; if I still like the idea after writing it out, I'll cross-post it on LW.)