How Not to be Stupid: Brewing a Nice Cup of Utilitea

[-]conchis17y20

I'm a little puzzled by what it is you're trying to do here. It feels as though you're reinventing the wheel, but I have no clear sense of whether that's what you see yourself as doing, and if not, why you think your wheel is different from existing ones.

(This may just be a communication issue, but it might be worth clarifying.)

[-]Psy-Kosh17y00

Basically I'm trying to derive decision theory as the approximately unique solution to "don't automatically lose."

It occurred to me that someone should be doing something like this on OB or LW, so... I'm making a go of it.

[-]conchis17y30

I'm afraid that's still a little too vague for me to make much sense of. What decision theory are you trying to derive? How does this particular decision theory differ (if at all) from other decision theories out there? If you're deriving it from different premises/axioms than other people already have, how do these relate to existing axiomatizations?

Perhaps most importantly, why does this need to be done from scratch on OB/LW? I could understand the value of summarizing existing work in a concise and intuitive fashion, but that doesn't seem to be what you're doing.

[-]Psy-Kosh17y00

Seems reasonable to me that it would be useful to have somewhere on LW a derivation of decision theory, an answer to "why this math rather than some other?"

I wanted to base it on dutch book/vulnerability arguments, but then I kept finding things I wanted to generalize and so on. So I decided to do a derivation in that spirit, but with all the things filled in that I felt I had needed to fill in for myself. It's more "here's what I needed to think through to really satisfy myself with this." But yeah, I'm just going for ordinary Bayesian decision theory and epistemic probabilities. That's all. I'm not trying to do anything really novel here.

I'm not so much axiomatizing as much as working from the "don't automatically lose" rule.

[-]conchis17y10

I wanted to base it on dutch book/vulnerability arguments, but then I kept finding things I wanted to generalize and so on. So I decided to do a derivation in that spirit, but with all the things filled in that I felt I had needed to fill in for myself. It's more "here's what I needed to think through to really satisfy myself with this."

Just a thought, but I wonder whether it might work better to:

start with the dutch book arguments;
explicitly state the preconditions necessary for them to work;
gradually build backwards, filling in the requirements for the most problematic preconditions first.

This would still need to be done well, but it has the advantage that it's much clearer where you're going with everything, and what exactly it would be that you're trying to show at each stage.
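To make the starting point concrete, here is a minimal sketch of the kind of dutch book argument referred to above. It assumes, purely for illustration, an agent who will pay its stated probability times the stake for a bet that pays the stake if it wins; the function name and the numbers are hypothetical and not taken from the posts.

```python
def dutch_book_profit(p_event: float, p_complement: float, stake: float = 1.0) -> float:
    """Bookie's guaranteed profit from selling the agent two bets.

    Assumed setup (illustrative only): the agent buys one bet on an event
    and one on its complement, each priced at the agent's own stated
    probability times the stake. Exactly one of the two bets pays out,
    so the agent always gets back exactly `stake`.
    """
    total_paid = (p_event + p_complement) * stake
    # A positive result means the agent loses that amount however things turn out.
    return total_paid - stake


# Stated probabilities that sum to more than 1 are exploitable this way.
incoherent = dutch_book_profit(0.6, 0.6)
# Stated probabilities that sum to exactly 1 leave nothing to extract.
coherent = dutch_book_profit(0.6, 0.4)
print(incoherent, coherent)
```

One precondition is already visible in the sketch: the agent has to be willing to buy any bet at its stated price, which is the sort of assumption that would need to be stated explicitly.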

At the moment, for example, I'm having difficulty evaluating your claim to have shown that utility "indices actually corresponds in a meaningful way to how much you prefer one thing to another". One reason for that is that the claim is ambiguous. There's an interpretation on which it might be true, and at least one interpretation on which I think it's likely to be false. There may be other interpretations that I'm entirely missing. If I knew what you were going to try to do with it next, it would be much easier to see what version you need.

Taking this approach would also mean that you can focus on the highest value material first, without getting bogged down in potentially less relevant details.

[-]Vladimir_Nesov17y10

Seems reasonable to me that it would be useful to have somewhere on LW a derivation of decision theory, an answer to "why this math rather than some other?"

That's only if the derivation is good. I warned you that you are going to shoot your feet off, if you are not really prepared. Even the classical axiomatizations have some problems with convincing people to trust in them.

[-]Psychohistorian17y00

It's somewhat less constructive than reinventing the wheel, actually. It's axiomatic, not empirical.

If A2-A1 > B2-B1, and A1 = B1, then A2 + B1 > A1 + B2 is about as insightful as 4+2 > 2+3, or 2n +2 > 2n +1. Once you set your definitions, the meaning of ">" does pretty much all your work for you.
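Spelled out, the step is a one-line rearrangement: add A1 + B1 to both sides of the premise. (As far as I can tell, this does not even use A1 = B1.)

```latex
\begin{align*}
A_2 - A_1 &> B_2 - B_1 \\
(A_2 - A_1) + (A_1 + B_1) &> (B_2 - B_1) + (A_1 + B_1) \\
A_2 + B_1 &> A_1 + B_2
\end{align*}
```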

As I understand it, your goal is to derive some way of assigning utilities to different outcomes such that they maintain preference ranking. I believe this could be done about as well with:

"Assume all outcomes can be mapped to discrete util values, and all utils are measured on the same scale."

This gives you all of the properties you've described and argued for in the last several posts, I believe, and it takes rather less effort. I believe you've assumed this implicitly, though you haven't actually admitted as much. Your system follows from that statement if it is true. If it's false, your system cannot stand. It's also rather easier to understand than these long chains of reasoning to cross very small gaps.

[-]conchis17y10

It's somewhat less constructive than reinventing the wheel, actually. It's axiomatic, not empirical.

The whole von Neumann-Morgenstern edifice (which is roughly what Psy-Kosh seems to be reconstructing in a roundabout way) is axiomatic. That doesn't make it worthless.

assigning utilities to different outcomes such that they maintain preference ranking ...could be done about as well with [the assumption that] all outcomes can be mapped to discrete util values.

Well, yes. You can derive X from the assumption that X is true, but that doesn't seem very productive. (I didn't think Psy-Kosh claimed, or needs to claim (yet), that all utils are measured on the same scale, but I could be wrong. Not least because that statement could mean a variety of different things, and I'm not sure which one you intend.)

Only some preference orderings can be represented by a real-valued utility function. Lexicographic preferences, for example, cannot. Nor can preferences which are, in a particular sense, inconsistent (e.g. cyclic preferences).
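A rough illustration of the lexicographic case, assuming two-good bundles written as (x, y) tuples. The sketch only shows that one obvious candidate, a weighted sum x + weight * y, misorders some pair for every positive weight; it is not a proof of the general result, which needs a separate argument. All names here are made up for the example.

```python
def lex_prefers(a, b):
    """True if bundle a is strictly preferred to bundle b lexicographically."""
    # Python compares tuples lexicographically: first components first,
    # later components only as tie-breakers.
    return a > b


def weighted_utility(bundle, weight):
    """A candidate real-valued utility: first good plus a small weight on the second."""
    x, y = bundle
    return x + weight * y


def counterexample(weight):
    """Return (a, b) where a is lexicographically better but the weighted sum ranks b higher."""
    a = (1.0, 0.0)           # more of the first good
    b = (0.0, 2.0 / weight)  # none of the first good, lots of the second
    return a, b


for w in (1.0, 1e-3, 1e-9):
    a, b = counterexample(w)
    print(w, lex_prefers(a, b), weighted_utility(b, w) > weighted_utility(a, w))
```

However small the weight, enough of the second good eventually outweighs a full unit of the first, which is exactly what a lexicographic ordering forbids.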

My sense is that Psy-Kosh is trying to establish something like a cardinally measurable utility function, on the basis of preferences over gambles. This is basically what vNM did, but (a) as far as I can tell, they imposed more structure on the preferences; (b) they didn't manage to do it without using probabilities; and (c) there's a debate about the precise nature of the "cardinality" they established. The standard position, as I understand it, is that they actually didn't establish cardinality, just something that looks kind of like it.

Intuitively, the problem with the claim that utility "indices actually correspond[] in a meaningful way to how much you prefer one thing to another" is that you could be risk-averse, or risk-loving with respect to welfare, and that would break the correspondence. (Put another way: the indices correspond to how much you prefer one thing to another adjusted for risk - not how much you prefer one thing to another simpliciter.)
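A small sketch of that point, assuming (purely for illustration) a square-root index over welfare for the risk-averse agent and a linear one for the risk-neutral agent. Both agents rank sure outcomes identically, yet they disagree about a gamble, so the index differences are carrying risk attitude as well as strength of preference. The names and numbers are hypothetical.

```python
import math


def expected_utility(lottery, utility):
    """Expected utility of a lottery given as (probability, welfare) pairs."""
    return sum(p * utility(w) for p, w in lottery)


def risk_neutral(w):
    return w             # linear in welfare


def risk_averse(w):
    return math.sqrt(w)  # concave in welfare


gamble = [(0.5, 0.0), (0.5, 100.0)]  # coin flip between welfare 0 and welfare 100
sure_thing = [(1.0, 50.0)]           # welfare 50 for certain

for name, u in (("risk-neutral", risk_neutral), ("risk-averse", risk_averse)):
    print(name, expected_utility(gamble, u), expected_utility(sure_thing, u))
```

The risk-neutral agent is indifferent between the two, while the risk-averse agent strictly prefers the sure thing, even though both agree that more welfare is better.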

[-]Psy-Kosh17y10

Yeah. Here I'm trying to actually justify the existence of a numbering scheme that has the property that "increase of five points of utility is increase of five points of utility (in some set of utility units)", no matter what the state that you're starting and increasing from is.

I need to do this so that I then have a currency I can use in a dutch book style argument to build up the rest of it.

As for lexicographic preferences, I had to look that up. Thanks, that's interesting. Maybe doable with hyperreals or such?

As for risk aversion, um... unless I misunderstand your meaning, that should be easily doable. Simply have increasingly huge steps of disutility as one goes down the preference chain, so that even a slight possibility of a low-ranked outcome would be extremely unpreferred?

[-]conchis17y00

I'm afraid all of this is still a bit vague for me, sorry.

Are you familiar with the standard preference representation results in economics (e.g. the sort you'd find in a decent graduate level textbook)? The reason I ask is that the inability to represent lexicographic preferences is pretty well-known, and the fact that you weren't aware of it makes me suspect even more strongly than before that you may be trying to do something that's already been done to death without realizing it.

I think we're talking past each other on the risk aversion front. Probably my fault, as my comment was somewhat vague. (Maybe also an issue of inferential distance.)

[-]Psy-Kosh17y10

The more I think about it, though, it seems like hyperreals, now that I know of them, would let one do a utility function for lexicographic preferences, no?

And nothing for you to apologize for. I mean, if there's this much confusion about what I'm writing, it seems likely that the problem is at my end. (And I fully admit, there's much basic material I'm unfamiliar with.)

[-]Psychohistorian17y00

My criticism may be more of the writing than the concept. Once you establish that utilities obey a >=< relationship with one another, all these properties seem to flow rather cleanly and easily. If there's one thing I've learned from philosophy, it's that you should always be wary of someone who uses a thousand words when a hundred will do.

The properties are interesting and useful, it just seems that the explanation of them is being dragged out to make the process look both more complex and more "objective" than it really is, and that's what I'm wary of.

[-]Psy-Kosh17y00

Well, right at the start, I said we could assign numbers that maintain preference rankings. Here I was trying to establish a meaningful scale, though. A numbering in which the sizes of the steps actually meant something.

"on the same scale"? I first need to explicitly specify what the heck that actually means. I'm doing this partly because when I've looked at some arguments, I'd basically go "but wait! what if...? and what about...? And you're just plain assuming that..." etc. So I'm trying to fill in all those gaps.

Further, ultimately what I'm trying to appeal to is not a blob of arbitrary axioms, but the notion of "don't automatically lose" plus some intuitive notions of "reasonableness."

Obviously, I'm being far less clear than I should be. Sorry about that.

[-]Richard_Kennaway17y00

Can you summarise this series of posts as a straightforward mathematical theorem? As someone with a mathematical background, I would find that a lot easier to grasp than this expanse of text. At the moment, I can't tell whether you are writing an exposition of the concepts, hypotheses and reasoning of the Utility Theorem, or doing something different.

[-]Psy-Kosh17y00

Well, I'm not really doing it from an axiomatic perspective as such, but basically I'm arguing that "avoiding being stupid, that is, avoiding automatically losing and such, more or less uniquely leads to Bayesian decision theory."

I.e., "if an agent isn't acting in accordance with decision theory, they're going to be doing something stupid sooner or later." I'm trying to construct decision theory from this perspective. The basic notions I'm working with are things like Dutch Book arguments and Stephen Omohundro's vulnerability arguments. But I'm filling in bits that I personally had to struggle with, had to go "but wait, what about...?" until I eventually worked out what I felt to be the missing bits.

That's my basic overall intent here: "Intro to decision theory, or, intro to why decision theory is the Right Way." Unfortunately, it sure seems like I'm not doing that good a job of it presentation-wise, but at least it may be useful as reference material to someone.

[-]orthonormal17y00

Psy-Kosh, you should also use the "summary break" button with posts of this length.

[-]Psy-Kosh17y00

Done. :)
