
A lot of rationalist thinking about ethics and economics assumes we have very well-defined utility functions - knowing exactly our preferences between states and events, not only being able to compare them (I prefer X to Y), but assigning precise numbers to every combination of them (a p% chance of X equals a q% chance of Y). Because everyone wants more money, you should theoretically even be able to assign exact numerical values to positive outcomes in your life.

I did a small experiment of making a list of things I wanted and giving them point values. I must say this experiment ended in failure - thinking "If I had X, would I take Y instead?" and "If I had Y, would I take X instead?" very often resulted in a pair of "No"s. Even thinking about multiple Xs/Ys for one Y/X usually led me to decide they're really incomparable. Outcomes related to the same subject were relatively comparable; those in different areas of life usually were not.

I finally decided on some vague numbers and evaluated the results two months later. I had succeeded greatly in some areas and not at all in others, and the only thing that was clear was that the numbers I had assigned were completely wrong.

This leads me to two possible conclusions:

• I don't know how to draw utility functions, but they are a good model of my preferences, and I could learn how to do it.
• Utility functions are a really bad match for human preferences, and one of the major premises we accept is wrong.

Has anybody else tried assigning numeric values to different outcomes outside a very narrow subject matter? Have you succeeded and want to share some pointers? Or failed and want to share some thoughts on that?

I understand that details of many utility functions will be highly personal, but if you can share your successful ones, that would be great.



> Utility functions are a really bad match for human preferences, and one of the major premises we accept is wrong.

They may be a bad descriptive match. But in prescriptive terms, how do you "help" someone without a utility function?

9Wei_Dai12yTo help someone, you don't need him to have a utility function, just preferences. Those preferences do have to have some internal consistency. But the consistency criteria you need in order to help someone seem strictly weaker than the ones needed to establish a utility function. Among the von Neumann-Morgenstern axioms, maybe only completeness and transitivity are needed. For example, suppose I know someone who currently faces choices A and B, and I know that if I also offer him choice C, his preferences will remain complete and transitive. Then I'd be helping him, or at least not hurting him, if I offered him choice C, without knowing anything else about his beliefs or values. Or did you have some other notion of "help" in mind?
2MichaelBishop12yFurthermore, utility functions actually aren't too bad as a descriptive match when you are primarily concerned about aggregate outcomes. They may be almost useless when you try to write one that describes your own choices and preferences perfectly, but they are a good enough approximation that they are useful for understanding how the choices of individuals aggregate: see the discipline of economics. This is a good place for the George Box quote: "All models are wrong, but some are useful."
0[anonymous]12yIsn't "helping" a situation where the prescription is derived from the description? Are you suggesting we lie about others' desires so we can more easily claim to help satisfy them? Helping others can be very tricky. I like to wait until someone has picked a specific, short term goal. Then I decide whether to help them with that goal, and how much.
0conchis12yNot necessarily. There are lots of plausible moral theories under which individuals' desires don't determine their well-being.
0MichaelBishop12yI think Eliezer is simply saying: "I can't do everything, therefore I must decide where I think the marginal benefits are greatest. This is equivalent to attempting to maximize some utility function."
0Vladimir_Nesov12yDerivation of prescription from description isn't trivial. That's the difference between finding the best plan, and settling for a suboptimal plan because you ran out of thought.
0[anonymous]12yI agree with both those statements, but I'm not completely sure how you're relating them to what I wrote. Do you mean that the difficulty of going from a full description to a prescription justifies using this particular simpler description instead? It might. I doubt it because utility functions seem so different in spirit from the reality, but it might. Just remember it's not the only choice.
0Vladimir_Nesov12yA simple utility function can be descriptive in simple economic models, but taken as descriptive, such function doesn't form a valid foundation for the (accurate) prescriptive model. On the other hand, when you start from an accurate description of human behavior, it's not easy to extract from it a prescriptive model that could be used as a criterion for improvement, but utility function (plus prior) seems to be a reasonable format for such a prescriptive model if you manage to construct it somehow.
0[anonymous]12yIn that case, we disagree about whether the format seems reasonable (for this purpose).

You want a neuron dump? I don't have a utility function, I embody one, and I don't have read access to my coding.

3k3nt12yI'm not sure I embody one! I'm not sure that I don't just do whatever seems like the next thing to do at the time, based on a bunch of old habits and tendencies that I've rarely or never examined carefully. I get up in the morning. I go to work. I come home. I spend more time reading the internets (both at work and at home) than I probably should -- on occasion I spend most of the day reading the internets, one way or another, and while I'm doing so have a vague but very real thought that I would prefer to be doing something else, and yet I continue reading the internets. I eat more or less the same breakfast and the same lunch most days, just out of habit. Do I enjoy these meals more than other options? Almost certainly not. It's just habit, it's easy, I do it without thinking. Does this mean that I have a utility function that values what's easy and habitual over what would be enjoyable? Or does it mean that I'm not living in accord with my utility function? In other words, is the sentence "I embody a utility function" intended to be tautological, in that by definition, any person's way of living reveals/embodies their utility function (a la "revealed preferences" in economics), or is it supposed to be something more than that, something to aspire to that many people fail at embodying? If "I embody a utility function" is aspirational rather than tautological -- something one can fail at -- how many people reading this believe they have succeeded or are succeeding in embodying their utility function?

I've put a bit of thought into this over the years, and don't have a believable theory yet. I have learned quite a bit from the exercise, though.

1) I have many utility functions. Different parts of my identity or different frames of thought engage different preference orders, and there is no consistent winner. I bite this bullet: personal identity is a lie - I am a collective of many distinct algorithms. I also accept that Arrow’s impossibility theorem applies to my own decisions.

2) There are at least three dimensions (time, intensity, and risk) to my... (read more)
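Point 1)'s appeal to Arrow's impossibility theorem can be made concrete with a toy model (the sub-agents and outcomes below are invented for illustration): three internally consistent "algorithms", each with a perfectly transitive ranking, produce a cyclic collective preference under majority vote, so there is no consistent winner.

```python
# Three hypothetical sub-agents, each with a transitive ranking
# over the same three outcomes (best first).
rankings = [
    ["career", "leisure", "health"],   # sub-agent 1
    ["leisure", "health", "career"],   # sub-agent 2
    ["health", "career", "leisure"],   # sub-agent 3
]

def majority_prefers(a, b):
    """True if a majority of sub-agents rank outcome a above outcome b."""
    votes = sum(r.index(a) < r.index(b) for r in rankings)
    return votes > len(rankings) / 2

# Every pairwise vote is decisive, yet together they form a cycle:
assert majority_prefers("career", "leisure")
assert majority_prefers("leisure", "health")
assert majority_prefers("health", "career")   # the cycle closes
```

This is the classic Condorcet cycle: each internal "voter" is rational on its own, but the collective has no transitive preference order, hence no single utility function.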

1Roko12yWhilst this is true, it is in the interest of each of those algorithms to reciprocally unify with others, as opposed to continually struggling for control of the person in question. Very good point, though.
0Dagon12yIt's not clear to me that my subpersonal algorithms have the ability to enforce reciprocity well enough, or to reflectively alter themselves with enough control to even make an attempt at unification. Certainly parts of me attempt to modify other parts in an attempt to do so, but that's really more conquest than reciprocity (a conquest "I" pursue, but still clearly conquest). Unification is a nice theory, but is there any reason to think it's possible for subpersonal evaluation mechanisms any more than it is for interpersonal resource sharing?
0Vladimir_Nesov12yIt is in the interest of each and every agent to unify (coordinate) more with other agents, so this glosses over the concept of the individual.
1Roko12yI don't understand the point you're making here. Can you spell it out for me in more detail? Thanks. My point is simply that it is better for each facet of a person if all the facets agree to unify with each other more, to the point where the person is fully unified and never in conflict with itself.
1Cyan12yThis misses the mark, I think. Here's a mutation: "It is in the interest of each and every cell to unify (coordinate) more with other cells, so this glosses over the concept of the organism." The coordination of cells is what allows us to speak of an organism as a whole. I won't go so far as to declare that co-ordination of agents justifies the concept of the individual, but I do think the idea expressed in the parent is more wrong than right.

Here's one data point. Some guidelines have been helpful for me when thinking about my utility curve over dollars. This has been helpful to me in business and medical decisions. It would also work, I think, for things that you can treat as equivalent to money (e.g. willingness-to-pay or willingness-to-be-paid).

1. Over a small range, I am approximately risk neutral. For example, a 50-50 shot at $1 is worth just about $0.50, since the range we are talking about is only between $0 and $1. One way to think about this is that, over a small enough range, there is

1AndrewKemendo12yHow have you come to these conclusions? For example: Is that because there have been points in time when you have made 200K and 400K respectively and found that your preferences didn't change much. Or is that simply expected utility?
0bill12yFor the specific quote: I know that, for a small enough change in wealth, I don't need to re-evaluate all the deals I own. They all remain pretty much the same. For example, if you told me I had $100 more in my bank account, I would be happy, but it wouldn't significantly change any of my decisions involving risk. For a utility curve over money, you can prove that that implies an exponential curve. Intuitively, some range of my utility curve can be approximated by an exponential curve. Now that I know it is exponential over some range, I needed to figure out which exponential it is and over what range it applies. I assessed for myself that I am indifferent between having and not having a deal with a 50-50 chance of winning $400K and losing $200K. The way I thought about that was how I thought about decisions around job hunting and whether I should take or not take job offers that had different salaries. If that is true, you can combine it with the above and show that the exponential curve should look like u(x) = 1 - exp(-x/400K). Testing it against my intuitions, I find it an okay approximation between $400K and minus $200K. Outside that range, I need better approximations (e.g. if you try it out on a 50-50 shot of $10M, it gives ridiculous answers). Does this make sense?
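Bill's two claims can be checked directly. A minimal sketch, assuming the u(x) = 1 - exp(-x/400K) curve he states (function names are mine): compute a gamble's expected utility, then invert u to get the certainty equivalent, i.e. the sure amount worth the same as the gamble.

```python
import math

R = 400_000  # risk tolerance implied by the 50-50 +$400K / -$200K indifference

def u(x, r=R):
    """Exponential (CARA) utility, normalized so u(0) = 0."""
    return 1 - math.exp(-x / r)

def certainty_equivalent(outcomes, probs, r=R):
    """Sure amount with the same utility as the gamble: u^-1(E[u])."""
    eu = sum(p * u(x, r) for x, p in zip(outcomes, probs))
    return -r * math.log(1 - eu)

# The stated indifference deal comes out close to zero relative to the
# stakes (about -$3,300; the round-number R = 400K is an approximation):
print(certainty_equivalent([400_000, -200_000], [0.5, 0.5]))

# Small deals are nearly risk neutral: a 50-50 shot at $1 is worth
# just under $0.50, matching guideline 1 above.
print(certainty_equivalent([1, 0], [0.5, 0.5]))
```

The inversion works because u is strictly increasing, so 1 - E[u] stays positive whenever the gamble's expected utility is achievable by some sure amount.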
0AndrewKemendo12yIt makes sense however you mention that you test it against your intuitions. My first reaction would be to say that this is introducing a biased variable which is not based on a reasonable calculation. That may not be the case as you may have done so many complicated calculations such that your unconscious "intuitions" may give your conscious mind the right answer. However from the millionaires' biographies I have read and rich people I have talked to, a better representation of money and utility according to them is logarithmic rather than exponential. This would indicate to me that the relationship between utility and money would be counter-intuitive for those who have not experienced those levels which are being compared. I have not had the fortune to experience anything more than a 5 figure income so I cannot reasonably say how my preferences would be modeled. I can reasonably believe that I would be better off at 500K than 50K through simple comparison of lifestyle between myself and a millionaire. I cannot make an accurate enough estimation of my utility and as a result I would not be prepared to make an estimation of what model would best represent it, because the probability of that being accurate is likely the same as coin flipping. Ed: I had a much better written post but an errant click lost the whole thing - time didn't allow the repetition of the better post.
0bill12yAs I said in my original post, for larger ranges, I like logarithmic-type u-curves better than exponential, esp. for gains. The problem with e.g. u(x)=ln(x) where x is your total wealth is that you must be indifferent between your current wealth and a 50-50 shot of doubling vs. halving your wealth. I don't like that deal, so I must not have that curve. Note that a logarithmic curve can be approximated by a straight line for some small range around your current wealth. It can also be approximated by an exponential for a larger range. So even if I were purely logarithmic, I would still act risk neutral for small deals and would act exponential for somewhat larger deals. Only for very large deals indeed would you be able to identify that I was really logarithmic.
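The double-or-halve indifference bill objects to follows from log utility in one line (the wealth figure below is invented for illustration): under u(x) = ln(x) the gamble's expected utility exactly equals the utility of standing pat, since ln(2w) + ln(w/2) = 2 ln(w).

```python
import math

wealth = 100_000  # hypothetical current total wealth

# Expected log utility of a 50-50 shot at doubling vs halving total wealth:
eu_gamble = 0.5 * math.log(2 * wealth) + 0.5 * math.log(wealth / 2)

# Exact indifference, at any wealth level -- the ln(2) terms cancel:
print(math.isclose(eu_gamble, math.log(wealth)))  # True
```

So anyone who would refuse that coin flip (as bill does) cannot have u(x) = ln(total wealth), which is his argument in a nutshell.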
1conchis12yFurther to this, it's also worth pointing out that, to the extent that Andrew's biographies and rich acquaintances are talking about a logarithmic experienced utility function that maps wealth into a mind state something like "satisfaction", this doesn't directly imply anything about the shape of the decision utility function they should use to represent their preferences over gambles. It's only if they're also risk neutral with respect to experienced utility that the implied decision utility function needs to be log(x). If they're risk averse with respect to experienced utility then their decision utility function will be a concave function of log(x), while if they're risk loving it will be a convex function of it. P.S. For more on the distinction between experienced and decision utility (which I seem constantly to be harping on about) see: Kahneman, Wakker and Sarin (1997) "Back to Bentham? Explorations of Experienced Utility [http://people.few.eur.nl/wakker/pdfspubld/97.1kwsqje.pdf]"
0AndrewKemendo12yI am curious how this would look in terms of decisions under experience. Does this imply that they are expecting to change their risk assessment once they are experienced?
0conchis12yI'm afraid I have no idea what you mean, perhaps because I failed to adequately explain the distinction between experienced utility and decision utility, and you've taken it to mean something else entirely. Roughly: experienced utility is something you experience or feel (e.g. positive emotions); decision utility is an abstract function that describes the decisions you make, without necessarily corresponding to anything you actually experience. Follow the link I gave, or see my earlier comment here [http://lesswrong.com/lw/zv/post_your_utility_function/s7i] (experienced utility is 1., decision utility is 2.) Apologies if I'm failing to understand you for some other reason, such as not having slept. ;)
0AndrewKemendo12yUnfortunately the better parts of my post were lost - or rather more of the main point. I posit that the utility valuation is an impossibility currently. I was not really challenging whether your function was exponential or logarithmic - but questioning how you came to the conclusion; how you decide, for instance where exactly the function changes especially having not experienced the second state. The "logarithmic" point I was making was designed to demonstrate that true utility may differ significantly from expected utility once you are actually at point 2 and thus may not be truly representative. Mainly I am curious as to what value you place on "intuition" and why.
1bill12yIf you wanted to, we could assess at least a part of your u-curve. That might show you why it isn't an impossibility, and show what it means to test it by intuitions. Would you, right now, accept a deal with a 50-50 chance of winning $100 versus losing $50? If you answer yes, then we know something about your u-curve. For example, over a range at least as large as (100, -50), it can be approximated by an exponential curve with a risk tolerance parameter of greater than 100 (if it were less than 100, then you wouldn't accept the above deal). Here, I have assessed something about your u-curve by asking you a question that it seems fairly easy to answer. That's all I mean by "testing against intuitions." By asking a series of similar questions I can assess your u-curve over whatever range you would like. You also might want to do calculations: for example, $10K per year forever is worth around $300K or so. Thinking about losing or gaining $10K per year for the rest of your life might be easier than thinking about gaining or losing $200-300K.
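The "greater than 100" threshold can be checked numerically. A sketch, assuming the CARA form u(x) = 1 - exp(-x/r) from bill's earlier comments: expected utility of the deal rises with risk tolerance r, so we can bisect for the break-even value.

```python
import math

def expected_u(r):
    """Expected CARA utility, u(x) = 1 - exp(-x/r), of the 50-50 +$100/-$50 deal."""
    return 0.5 * (1 - math.exp(-100 / r)) + 0.5 * (1 - math.exp(50 / r))

# Bisect for the zero crossing: accept the deal iff your tolerance is higher.
lo, hi = 10.0, 1_000.0
for _ in range(60):
    mid = (lo + hi) / 2
    if expected_u(mid) > 0:
        hi = mid  # deal already attractive here: break-even tolerance is lower
    else:
        lo = mid

print(round(lo))  # ~104
```

The exact break-even tolerance is just under $104, so the comment's round "100" is a good rule of thumb: accepting a 50-50 shot at +X vs. -X/2 reveals a risk tolerance of roughly X.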
2AndrewKemendo12yI think this greatly oversimplifies the issue. Whatever my response to the query is, it is only an estimation as to my preferences. It also assumes that my predicted risk will, upon the enactment of an actual deal, stay the same, if only for the life of the deal. A model like this, even if correct for right now, could be significantly different tomorrow or the next day. It could be argued that some risk measurements do not change at intervals so fast as would technically prohibit recalculation. Giving a fixed metric puts absolutes on behaviors which are not fixed, or which unpredictably change. Today, because I have lots of money in my account, I might agree to your deal. Tomorrow I may not. This is what I mean by intuitions - I may think I want the deal but I may in reality be significantly underestimating the chance of -50, or any other number of factors may skew my perception. I know of quite a few examples of people getting stuck in high-load mutual funds or other investments because their risk preferences changed over a much shorter time period than they expected: they really didn't want to take that much risk in their portfolio but could not cognitively comprehend the probability, as most people cannot. This in no way advocates going further to correcting for these mistakes after the fact - however the tendency of economists and policy makers is to suggest modeling such as this. In fact most consequentialists make the case that modeling this way is accurate, however I have yet to see a true epistemic study of a model which reliably demonstrates accurate "utility" or valuation. The closest to accurate models I have seen take stated and revealed preferences together and work towards a micro estimation which still has moderate error variability where not observed (http://ideas.repec.org/a/wly/hlthec/v13y2004i6p563-573.html).
Even with observed behavior applied it is still
0conchis12yJust to be clear, you know that an exponential utility function [http://en.wikipedia.org/wiki/Exponential_utility] (somewhat misleadingly ) doesn't actually imply that utility is exponential in wealth, right? Bill's claimed utility function doesn't exhibit increasing marginal utility, if that's what you're intuitively objecting to. It's 1-exp(-x), not exp(x). Many people do find the constant absolute risk aversion implied by exponential utility functions unappealing, and prefer isoelastic utility functions that exhibit constant relative risk aversion, but it does have the advantage of tractability, and may be reasonable over some ranges.
0bill12yExample of the "unappealingness" of constant absolute risk aversion. Say my u-curve were u(x) = 1-exp(-x/400K) over all ranges. What is my value for a 50-50 shot at $10M? Answer: around $277K. (Note that it is the same for a 50-50 shot at $100M.) Given the choice, I would certainly choose a 50-50 shot at $10M over $277K. This is why over larger ranges, I don't use an exponential u-curve. However, it is a good approximation over a range that contains almost all the decisions I have to make. Only for huge decisions do I need to drag out a more complicated u-curve, and they are rare.
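The "ridiculous answer" is easy to reproduce (a sketch using the same u(x) = 1 - exp(-x/400K); the helper name is mine). For prizes far above the risk tolerance, the exponential saturates, so the certainty equivalent is pinned near R·ln 2 ≈ $277K no matter how large the prize.

```python
import math

R = 400_000  # risk tolerance from the comments above

def ce_of_5050_win(x, r=R):
    """Certainty equivalent of a 50-50 shot at $x (else nothing) under u = 1 - exp(-x/r)."""
    eu = 0.5 * (1 - math.exp(-x / r))
    return -r * math.log(1 - eu)

print(round(ce_of_5050_win(10_000_000)))   # ~277259: bill's "around $277K"
print(round(ce_of_5050_win(100_000_000)))  # the same value, which is the absurdity
```

Once x >> R, u(x) is effectively 1, so every such gamble has expected utility 0.5 and the same certainty equivalent R·ln 2, which is why constant absolute risk aversion breaks down for very large deals.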
0[anonymous]12yJust to be clear, you know that he means negative exponential, right? His claimed utility function doesn't exhibit increasing marginal utility, if that's what you're intuitively objecting to. (If that's not what you're intuitively objecting to, then is there a specific aspect of the negative exponential that you find unappealing?)

> This leads me to two possible conclusions

A third possibility: Humans aren't in general capable of accurately reflecting on their preferences.

> Utility functions are a really bad match for human preferences, and one of the major premises we accept is wrong.

If utility functions are a bad match for human preferences, that would seem to imply that humans simply tend not to have very consistent preferences. What major premise does this invalidate?

8SoullessAutomaton12yHumans are obviously capable of perceiving their own preferences at some level, otherwise they'd be unable to act on them. I assume what you propose here is that conscious introspection is unable to access those preferences? In that case, utility functions could potentially be deduced by the individual placing themselves into situations that require real action based on relevant preferences, recording their choices, and attempting to deduce a consistent basis that explains those choices. I'm pretty sure that someone with a bit of math background who spent a few days taking or refusing various bets could deduce the nonlinearity and approximate shape of their utility function for money without any introspection, for instance.
0taw12yThree is pretty much like one. If utility functions work, there must be some way of figuring them out, I hoped someone figured it out already. Utilitarian model being wrong doesn't necessarily mean that a different model based on different assumptions doesn't exist. I don't know which assumptions need to be broken.
2StanR12yThe general premise in the mind sciences is that there are different selves, somehow coordinated through the cortical midline structures. Plenty of different terms have been used, and hypotheses suggested, but the two "selves" I use for shorthand come from Daniel Gilbert: Socrates and the dog. Socrates is the narrative self, the dog is the experiencing self. If you want something a bit more technical, I suggest the lectures about well-being (lecture 3) here [http://mbb.harvard.edu/resources/kahneman08.php], and to get really technical, this [http://www.med.uni-magdeburg.de/fme/znh/kpsy/northoff/download/self_referential_processing_in_our_brain.pdf] paper on cognitive science exploring the self.

> thinking "If I had X, would I take Y instead?" and "If I had Y, would I take X instead?" very often resulted in a pair of "No"s

It's a well-known result that losing something produces roughly twice the disutility that gaining the same thing would produce in utility. (I.e., we "irrationally" prefer what we already have.)

1conchis12yThis may depend what you mean by (dis)utility. Kermer et al. (one of the alii is Dan Gilbert) argue that "Loss aversion is an affective forecasting error [http://people.virginia.edu/~tdw/psych.science.2006.pdf]", caused by a tendency to systematically overestimate negative emotional responses.
0taw12yI thought I could trivially counter it by thinking about "X vs N Ys" and "Y vs M Xs", and geometrically averaging N with 1/M, but it didn't really work, and N/M values were often much larger than 2.
0SoullessAutomaton12yAre you assuming your utility function for Xs and Ys is linear? If X is "houses" and Y is "cars", and someone starts with one of each, how many people would gain utility from trading their only house for more cars or their only car for more houses?
0Philo12yI'd trade my car for another house: virtually any house would be worth more than my old car; I could sell the house and buy a better car, with something left over!
2SoullessAutomaton12yBut why would you sell the house and buy a car? Because you place higher utility on having one of each, which is precisely my point. The fact that two houses can be converted into more fungible resources than two cars is true, but is, as thomblake said, missing the point.
1Alicorn12yIn this housing market? You'd be without a car for months waiting for the house to sell - would that be worth the vehicle upgrade and the leftover money, even assuming the house did eventually sell?
0Larks12yOnly if he tried to sell at the current market price (or what passes for one at the moment). I suspect if he tried to sell his house for something just above the price of a car, it would sell easily. On the other hand, SoullessAutomaron's response is sound.
0thomblake12yBy "this housing market" did you mean the current one in the real world, or in the thought experiment where everyone already has exactly one house and car? Either way it seems like an apt point, though it seems to miss the point of Philo's response (which in turn seemed to miss the point of SoullessAutomaton's question)
2Alicorn12yI meant the real one, although the hypothetical universe where everybody has one house and one car would be hard on real estate sales too.
0taw12yAnother obvious trick of thinking about lotteries was even worse - I cannot get myself to do any kind of high-value one-off lottery thinking because of risk aversion, and low-value many-attempts lotteries are just like having EN Xs with some noise.

I feel some people here are trying to define their utility functions via linear combinations of sub-functions which only depend on small parts of the world state.

Example: If I own X, that'll give me a utility of 5, if I own Y that'll give me a utility of 3, if I own Z, that'll give me a utility of 1.

Problem: Choose any two of {X, Y, Z}

Apparent Solution: {X, Y} for a total utility of 8.

But human utility functions are not a linear combination of such sub-functions, but functions from global World states into the real numbers. Think about the above example wi... (read more)
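The world-state point can be sketched with invented numbers: a utility defined on whole bundles (say, X and Y are redundant, while X and Z are complements) can rank the options differently from any sum of per-item scores.

```python
# Additive per-item scores, as in the example above:
item_value = {"X": 5, "Y": 3, "Z": 1}

def additive_u(bundle):
    """Utility as a linear combination of independent sub-functions."""
    return sum(item_value[i] for i in bundle)

# A utility over whole world states, with redundancy and complementarity:
bundle_u = {
    frozenset("XY"): 6,   # two redundant goods: less than 5 + 3
    frozenset("XZ"): 9,   # complements: more than 5 + 1
    frozenset("YZ"): 4,
}

best_additive = max(bundle_u, key=additive_u)   # picks {X, Y}
best_global = max(bundle_u, key=bundle_u.get)   # picks {X, Z}
print(sorted(best_additive), sorted(best_global))
```

The additive model confidently recommends {X, Y} for a "total utility of 8", yet the global valuation prefers {X, Z}: the linear decomposition threw away exactly the interaction information that mattered.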

This is a good exercise, I'll see what I can do for MY utility function.

First of all, a utility function is a function

f: X --> R

Where X is some set. What should that set be? Certainly it shouldn't be the set of states of the universe, because then you can't say that you enjoy certain processes (such as bringing up a child, as opposed to the child just appearing). Perhaps the set of possible histories of the universe is a better candidate. Even if we identify histories that are microscopically different but macroscopically identical, and apply some cru... (read more)

1Cyan12yPart of the problem is that X is necessarily based in the map, not the territory. There will always be the chance that one will learn something that radically changes the map, so it seems like an explicit statement of f will have to involve all possible maps that one might have.
0conchis12yNecessarily? If I place value on e.g. "my friends actually liking and respecting me", rather than just "the subjective sense that my friends like and respect me" then my utility function seems to be responding directly to the territory rather than the map. (It also means that I won't ever really know my utility, but that's true of lots of things.) Some people argue that things can't affect one's well-being unless one somehow experiences them, but that's a contentious position. Am I missing your intended meaning?
0Cyan12yYou're right -- I should have written "necessarily based in the map, not just the territory." My intended meaning has to do with fundamental shifts in one's understanding of how reality works that make some previous apparent question of fact become a "wrong question" or category error or similar non-issue.
-3pjeby12yIn practice, your definition of what "liking and respecting me" means -- i.e., what evidence you expect to see in the world of that -- is part of the map, not the territory. Suppose, for example, that your friends really and truly like and respect you... but they have to beat you up and call you names, for some other reason. Does that match what you actually value? It's out there in the territory, after all. That is, is merely knowing that they "like and respect you" enough? Or is that phrase really just a shorthand in your map for a set of behaviors and non-behaviors that you actually value? Note that if you argue that, "if they really liked and respected me, they wouldn't do that", then you are now back to talking about your map of what that phrase means, as opposed to what someone else's map is. System 2 thinking is very tricky this way -- it's prone to manipulating symbols as if they were the things they're merely pointing at, as though the map were the territory... when the only things that exist in its perceptual sphere are the labels on the map. Most of the time, when we think we're talking about the territory, we're talking about the shapes on the map, but words aren't even the shapes on the map!
8orthonormal12yWe have only access to our current map to tell us about the territory, yes. But we have strong intuitions about how we would act if we could explicitly choose that our future map permanently diverge from our current map (which we currently see as the territory). If we (again by our current map) believe that this divergence would conform less to the territory (as opposed to a new map created by learning information), many of us would oppose that change even against pretty high stakes. I mean, if Omega told me that I had to choose between * (A) my sister on Mars being well but cut off from all contact with me, or * (B) my sister being killed but a nonsentient chatbot impersonating her to me in happy weekly chats, and that in either case my memory of this choice would be wiped when I made it, I would choose (A) without hesitation. I understand that calling our current map "the territory" looks like a categorical error, but rejecting conchis' point entirely is the wrong response. There's a very real and valid sense in which our minds oppose what they calculate (by the current map) to be divergences between the future map and the territory.
2Cyan12yI affirm this, but it does not follow that: Just because the events that occur are not the proximate cause of an experience or preference does not mean that these things have nothing to do with external reality. This whole line of argument ignores the fact that our experience of life is entangled with the territory, albeit as mediated by our maps.
1saturn12yAnd what if he did ask?
0Vladimir_Nesov12yIs human knowledge also not just in the map, but exclusively of the map? If not, what's the difference?
1pjeby12yAny knowledge about the actual territory can in principle be reduced to mechanical form without the presence of a human being in the system. To put it another way, a preference is not a procedure, process, or product. The very use of the word "preference" is a mind projection - mechanical systems do not have "preferences" - they just have behavior. The only reason we even think we have preferences in the first place (let alone that they're about the territory!) is because we have inbuilt mind projection. The very idea of having preferences is hardwired into the model we use for thinking about other animals and people.
0pjeby12yYou said, "if not, what's the difference", and I gave you the difference. I.e., we can have "knowledge" of the territory.
0Vladimir_Nesov12ySo, knowledge exists in the structure of map and is about the territory, while preference can't be implemented in natural artifacts. Preference is a magical property of subjective experience, and it is over maps, or about subjective experience, but not, for example, about the brain. Saying that preference exists in the structure of map or that it is about the territory is a confusion, that you call "mind projection" Does that summarize your position? What are the specific errors in this account?
1pjeby12yNo, "preference" is an illusory magical property projected by brains onto reality, which contains only behaviors. Our brains infer "preferences" as a way of modeling expected behaviors of other agents: humans, animals, and anything else we perceive as having agency (e.g. gods, spirits, monsters). When a thing has a behavior, our brains conclude that the thing "prefers" to have either the behavior or the outcome of the behavior, in a particular circumstance. In other words, "preference" is a label attached to a clump of behavior-tendency observations and predictions in the brain -- not a statement about the nature of the thing being observed. Thus, presuming that these "preferences" actually exist in the territory is supernaturalism, i.e., acting as though basic mental entities exist. My original point had more to do with the types of delusion that occur when we reason on the basis of preferences actually existing, rather than the idea simply being a projection of our own minds. However, the above will do for a start, as I believe my other conclusions can be easily reached from this point.
0Vladimir_Nesov12yDo you think someone is advocating the position that goodness of properties of the territory is an inherent property of territory (that sounds like a kind of moral realism)? This looks like the lack of distinction between 1-place and 2-place words [http://lesswrong.com/lw/ro/2place_and_1place_words/]. You could analogize preference (and knowledge) as a relation between the mind and the (possible states of the) territory, that is neither a property of the mind alone, nor of the territory alone, but a property of them being involved in a certain interaction.
1pjeby12yNo, I assume that everybody who's been seriously participating has at least got that part straight. Now you're getting close to what I'm saying, but on the wrong logical level. What I'm saying is that the logical error is that you can't express a 2-place relationship between a map, and the territory covered by that map, within that same map, as that amounts to claiming the territory is embedded within that map. If I assert that my preferences are "about" the real world, I am making a category error because my preferences are relationships between portions of my map, some of which I have labeled as representing the territory. The fact that there is a limited isomorphism between that portion of my map, and the actual territory, does not make my preferences "about" the territory, unless you represent that idea in another map. That is, I can represent the idea that "your" preferences are about the territory in my map... in that I can posit a relationship between the part of my map referring to "you", and the part of my map referring to "the territory". But that "aboutness" relationship is only contained in my map; it doesn't exist in reality either. That's why it's always a mind projection fallacy to assert that preferences are "about" territory: one cannot assert it of one's own preferences, because that implies the territory is inside the map. And if one asserts it of another person's preferences, then that one is projecting their own map onto the territory. I initially only picked on the specific case of self-applied projection, because understanding that case can be very practically useful for mind hacking. In particular, it helps to dissolve certain irrational fears that changing one's preferences will necessarily result in undesirable futures. (That is, these fears are worrying that the gnomes and fairies will be destroyed by the truth, when in fact they were never there to start with.)
-1Vladimir_Nesov12yHow's that? You can write Newton's law of universal gravitation describing the orbit of the Earth around the Sun on a piece of paper located on the surface of a table standing in a house on the surface of the Earth. Where does this analogy break from your point of view? "...but, you can't fold up the territory and put it in your glove compartment"
1pjeby12yThe "aboutness" relationship between the written version of Newton's law and the actual instances of it is something that lives in the map in your head. IOW, the aboutness is not on the piece of paper. Nor does it exist in some supernatural link between the piece of paper and the objects acting in accordance with the expressed law.
0pjeby12yAnd this helps your position how?
1pjeby12yNo, your head is rotating around the Sun, and it contains a description relating the ideas of "head" and "Sun". You are confusing head 1 (the real head) with head 2 (the "head" pictured inside head 1), as well as Sun 1 (the real Sun) and Sun 2 (the "Sun" pictured inside head 1).
1Vladimir_Nesov12yNo, I'm not confusing them. They are different things. Yet the model simulates the real thing, which means the following (instead of magical aboutness): By examining the model it's possible to discover new properties of its real counterpart, that were not apparent when the model was being constructed, and that can't be observed directly (or it's just harder to do), yet can be computed from the model.
1pjeby12yIndeed. Although more precisely, examining the model merely suggests or predicts these "new" (rather, previously undiscovered, unnoticed, or unobservable) properties. That is what I mean by isomorphism between model and territory. The common usage of "about", however, projects an intention onto this isomorphism - a link that can only exist in the mind of the observer, not the similarity of shapes between one physical process and another.
0Vladimir_Nesov12ySince the agent's possible actions are one of the things in the territory captured by the model, it's possible to use the model to select an action leading to a preferable outcome, and to perform the thus-selected action, determining the territory to conform with the plan. The correspondence between the preferred state of the world in the mind and the real world is ensured by this mechanism for turning plans into actuality. Pathologies aside, of course.
1pjeby12yI don't disagree with anything you've just said, but it does nothing to support the idea of an isomorphism inherently meaning that one thing is "about" another. If I come across a near-spherical rock that resembles the moon, does this make the rock "about" the moon? If I find another rock that is shaped the same, does that mean it is about the moon? The first rock? Something else entirely? The "aboutness" of a thing can't be in the thing, and that applies equally to thermostats and humans. The (external) aboutness of a thermostat's actions doesn't reside in the thermostat's map, and humans are deluded when they project that the (external) aboutness of their own actions actually resides within the same map they're using to decide those actions. It is merely a sometimes-useful (but often harmful) fiction.
-1Vladimir_Nesov12yTaboo "aboutness" already. However unfathomably confused the philosophic and folk usage of this word is, that doesn't interest me much. What I mean by this word I described in these [http://lesswrong.com/lw/zv/post_your_utility_function/swb] comments [http://lesswrong.com/lw/zv/post_your_utility_function/swq], and this usage seems reasonably close to the usual one, which justifies hijacking the word for the semi-technical meaning rather than inventing a new one. This is also the way meaning/aboutness is developed in formal theories of semantics.
0Vladimir_Nesov12yActually, the point is that most of the other usages of these words are meaningless confusion, and the argument is that this particular semi-technical sense is what the word actually means, when you get the nonsense out of it. It's not how it's used, but it's the only meaningful thing that fits the idea. Since you don't just describe the usage of the word, but argue for the confusion behind it, we have a disagreement. Presenting a clear definition is the easy part. Showing that ten volumes of the Encyclopedia of Astrology is utter nonsense is harder, and arguing with each point made in its chapters is a wrong approach. It should be debunked on meta-level, with an argument that doesn't require the object-level details, but that requires the understanding of the general shape of the confusion.
1pjeby12yYes, but ones which most people do not understand to be confusion, and the only reason I started this discussion in the first place was because I was trying to clear up one point in that confusion. I am arguing against the confusion, not for the confusion. So, as far as I can tell, there should be no disagreement. In practice, however, you have been making arguments that sound like you are still confusing map and territory in your own thinking, despite seeming to agree with my reasoning on the surface. You are consistently treating "about" as a 2-way relationship, when to be minimally cohesive, it requires 3 entities: the 2 entities that have an isomorphism, and the third entity whose map ascribes some significance to this isomorphism. You've consistently omitted the presence of the third entity, making it sound as though you do not believe it to be required, and thereby committing the mind projection fallacy.
-2[anonymous]12ySo you are saying that my definition with which you've just agreed is unreasonable. Pick something tangible. (Also, please stop using "mind projection fallacy", you are misapplying the term.)
-1Vladimir_Nesov12yThere are natural categories, like "tigers", that don't require much of a mind to define. It's not mind projection fallacy to say that something is a tiger. P.S. I'm correcting self-censoring threshold, so expect silence where before I'd say something for the fifth time.
0Vladimir_Nesov12yFor the record, I thought it obvious that my argument above implied that I claim aboutness to be a natural category (although I'm not perfectly sure it's a sound argument). I deleted my comment because I deemed it low-quality, before knowing you responded to it.
1pjeby12yIt's not. First, the only way it can be one is if "natural category" has the reductionist meaning of "a category based on distinctions that humans are biased towards using as discriminators", rather than "a category that 'naturally' exists in the territory". (Categories are abstractions, not physical entities, after all.) And second, even if you do use the reductionist meaning of "natural category", then this does not in any way undermine the conclusion that "aboutness" is mind projection when you omit the entity mapping that aboutness from the description. In other words, this argument appears to result in only one of two possibilities: either "aboutness" is not a natural category per the reductionist definition, and thus inherently a mind projection when the attribution source is omitted, or "aboutness" is a natural category per the reductionist definition... in which case the attribution source has to be a human brain (i.e., in another map). Finally, if we entirely reject the reductionist definition of "natural category", then "natural category" is itself an instance of the mind projection fallacy, since the description omits any definition of for whom the category is "natural". In short, QED: the argument is not sound. (I just didn't want to bother typing all this if you were going to retreat to a claim this was never your argument.)
0thomblake12yIndeed. If this didn't work then there wouldn't be any practical point in modeling physics!
0Cyan12yTo the (unknowable*) extent that the portion of my map labelled "territory" is an accurate reflection of the relevant portion of the territory, do I get to say that my preferences are "about" the territory (implicitly including disclaimers like "as mediated by the map")? * due at the very least to Matrix/simulation scenarios
0thomblake12yThat's one way of writing. Another is to edit what you intend to post before you click 'comment'.
0orthonormal12yI feel your frustration, but throwing the word "magical" in there is just picking a fight, IMO. Anyway, I too would like to see P.J. Eby summarize his position in this format.
0Vladimir_Nesov12yI have a certain technical notion of magic [http://wiki.lesswrong.com/wiki/Magic] in mind. This particular comment wasn't about frustration (some of the others were), I'm trying out something different of which I might write a post later.
0orthonormal12yBe charitable in your interpretation, and remember the Least Convenient Possible World [http://lesswrong.com/lw/2k/the_least_convenient_possible_world/] principle. I was presuming that the setup was such that being alive on Mars wouldn't be a 'fate worse than death' for her; if it were, I'd choose differently. If you prefer, take the same hypothetical but with me on Mars, choosing whether she stayed alive on Earth; or let choice B include subjecting her to an awful fate rather than death. I would say rather that my reaction is my evaluation of an imagined future world. The essence of many decision algorithms is to model possible futures and compare them to some criteria [http://lesswrong.com/lw/rb/possibility_and_couldness/]. In this case, I have complicated unconscious affective criteria for imagined futures (which dovetail well with my affective criteria for states of affairs I directly experience), and my affective reaction generally determines my actions. To the extent this is true (as in the sense of my previous sentence), it is a tautology. I understand what you're arguing against: the notion that what we actually execute matches a rational consequentialist calculus of our conscious ideals. I am not asserting this; I believe that our affective algorithms do often operate under more selfish and basic criteria, and that they fixate on the most salient possibilities instead of weighing probabilities properly, among other things. However, these affective algorithms do appear to respond more strongly to certain facets of "how I expect the world to be" than to facets of "how I expect to think the world is" when the two conflict (with an added penalty for the expectation of being deceived), and I don't find that problematic on any level.
-2pjeby12yAs I said, it's still going to be about your experience during the moments until your memory is erased. I took that as a given, actually. ;-) What I'm really arguing against is the naive self-applied mind projection fallacy that causes people to see themselves as decision-making agents -- i.e., beings with "souls", if you will. Asserting that your preferences are "about" the territory is the same sort of error as saying that the thermostat "wants" it to be a certain temperature. The "wanting" is not in the thermostat, it's in the thermostat's maker. Of course, it makes for convenient language to say it wants, but we should not confuse this with thinking the thermostat can really "want" anything but for its input and setting to match. And the same goes for humans. (This is not a mere fine point of tautological philosophy; human preferences in general suffer from high degrees of subgoal stomp, chaotic loops, and other undesirable consequences arising as a direct result of this erroneous projection. Understanding the actual nature of preferences makes it easier to dissolve these confusions.)
0Alicorn12yI wish I could upvote this two or three times. Thank you.
0Vladimir_Nesov12yWhat features of that comment made it communicate something new to you? What was it that got communicated? The comment restated a claim that a certain relationship is desirable as a claim that given that it's desirable, there is a process that establishes it to be true. It's interesting how this restatement could pierce inferential distance: is preference less trustworthy than a fact, and so demonstrating the conversion of preference into a fact strengthens the case?
0orthonormal12yGiven the length of the thread I branched from, it looks like you and P.J. Eby ended up talking past each other to some extent, and I think that you both failed to distinguish explicitly between the current map (which is what you calculate the territory to be) and a hypothetical future map. P.J. Eby was (correctly) insisting that your utility function is only in contact with your current map, not the territory directly. You were (correctly) insisting that your utility function cares about (what it calculates to be) the future territory, and not just the future map. Is that a fair statement of the key points?
0Vladimir_Nesov12yA utility function is no more "in contact" with your current map than the actual truth of 2+2=4 is in contact with the display of a calculator that displays the statement. A utility function may care about past territory (and even counterfactual territory) as well as future territory, with the map being part of it. Keeping a map in good health is instrumentally a very strong move: just by injecting an agent with your preferences somewhere in the territory you improve it immensely.
0orthonormal12yWhile there might exist some abstracted idealized dynamic [http://lesswrong.com/lw/t0/abstracted_idealized_dynamics/] that is a mathematical object independent of your map, any feasible heuristic for calculating your utility function (including, of course, any calculation you actually do) will depend on your map. If Omega came through tomorrow and made all pigs conscious with human-like thoughts and emotions, my moral views on pig farming wouldn't be instantly changed; only when information about this development gets to me and my map gets altered will I start assigning a much higher disutility to factory farming of pigs. Or, to put it another way, a decision algorithm refers directly to the possible worlds in the territory (and their probabilities, etc), but it evaluates these referents by looking at the corresponding objects in its current map. I think that, since we're talking about practical purposes, this is a relevant point. Agree completely. Of the worlds where my future map looks to diverge from the territory, though, I'm generally more repulsed by the ones in which my map says it's fine where it's not than by the opposite.
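orthonormal's pig-farming point (the decision algorithm refers to the territory, but evaluates it via the current map) can be sketched in code. The state names and disutility numbers below are purely illustrative, not anything from the thread:

```python
# Sketch: a decision calculation runs on the current map, so the result
# tracks the map, not the territory, until new information arrives.
territory = {"pigs_conscious": True}      # Omega's change to the world
current_map = {"pigs_conscious": False}   # the news hasn't reached me yet

def disutility_of_pig_farming(beliefs):
    """Disutility as computed from a given set of beliefs (arbitrary numbers)."""
    return 100 if beliefs["pigs_conscious"] else 1

# Any calculation I actually perform uses the map, not the territory:
print(disutility_of_pig_farming(current_map))  # -> 1
print(disutility_of_pig_farming(territory))    # -> 100
```

Only when the map is updated to match the territory do the two calculations agree, which is the sense in which the heuristic "depends on your map".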
1conchis12yThis is something of a nitpick, but this isn't strictly true. If others are trying to calculate your utility function (in order to help you), this will depend on their maps rather than yours (though probably including their map of your map). The difference becomes important if their maps are more accurate than yours in some respect (or if they can affect how accurate your map is). For example, if you know that I value not being deceived (and not merely the subjective experience of not being deceived), and you care about my welfare, then I think that you should not deceive me, even if you know that I might perceive my welfare to be higher if you did.
0orthonormal12yOh, good point. I should have restricted it to "any calculation you personally do", in which case I believe it holds.
0Vladimir_Nesov12yAt which point it becomes trivial: any calculation that is done on your map is done using your map, just Markovity of computation... A related point is that you can create tools that make decisions themselves, in situations only of possibility of which you are aware.
0orthonormal12yRight. It's trivial, but relevant when discussing in what sense our decision algorithms refer to territory versus map. I can't parse this. What do you mean?
0Vladimir_Nesov12yIf you install an alarm system that uses a video camera to recognize movement and calls the police if it's armed, you are delegating some of the map-making and decision-making to the alarm system. You are neither aware of the exact nature of possible intruders, nor making a decision regarding calling the police before any intrusion actually occurs. The system decides what to do by itself, according to the aspect of your values it implements. Your map is not involved.
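A minimal sketch of the delegated decision rule in this alarm analogy (the inputs and actions are stand-ins, not a real alarm API); whether such an input-to-behavior mapping counts as a "preference" is exactly what is disputed downthread:

```python
# The installed system maps inputs to behavior with no map of its own
# beyond what the installer built in.
def alarm_controller(armed, motion_detected):
    """Return the action the system takes, given its sensor inputs."""
    if armed and motion_detected:
        return "call_police"
    return "do_nothing"

print(alarm_controller(armed=True, motion_detected=True))   # -> call_police
print(alarm_controller(armed=False, motion_detected=True))  # -> do_nothing
```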
1orthonormal12yYes, but your decision to install it (as well as your decision to arm it) comes from your map. You would not install it if you thought you had virtually no chance of being burglarized, or if you thought that it would have a false alarm every five minutes when the train went past. We can make choices that cause other (human, mechanical, etc) agents to act in particular ways, as one of the manners in which we affect possible futures. But these sorts of choices are evaluated by us in the same way as others. I fear we've resorted to arguing about the semantics of "map" versus "territory", as I don't see a scenario where we'd predict or decide differently from each other on account of this disagreement. As such, I'm willing to drop it for now unless you see such a scenario. (My disagreement with Mr. Eby, on the other hand, appears to be more substantive.)
0pjeby12yAnd does this alarm system have "preferences" that are "about" reality? Or does it merely generate outputs in response to inputs, according to the "values it implements"? My argument is simply that humans are no different than this hypothetical alarm system; the things we call preferences are no different than variables in the alarm system's controller - an implementation of values that are not our own. If there are any "preferences about reality" in the system, they belong to the maker of the alarm system, as it is merely an implementation of the maker's values. By analogy, if our preferences are the implementation of any values, they are the "values" of natural selection, not our own. If now you say that natural selection doesn't have any preferences or values, then we are left with no preferences anywhere -- merely isomorphism between control systems and their environments. Saying this isomorphism is "about" something is saying that a mental entity (the "about" relationship) exists in the real world, i.e., supernaturalism. In short, what I'm saying is that anybody who argues human preferences are "about" reality is anthropomorphizing the alarm system. However, if you say that the alarm system does have preferences by some reductionistic definition of "preference", and you assert that human preference is exactly the same, then we are still left to determine the manner in which these preferences are "about" reality. If nobody made the alarm system, but it just happened to be formed by a spontaneous jumbling of parts, can it still be said to have preferences? Are its "preferences" still "about" reality in that case?
-1Vladimir_Nesov12yBoth. You are now trying to explain away the rainbow [http://lesswrong.com/lw/or/joy_in_the_merely_real/], by insisting that it consists of atoms [http://lesswrong.com/lw/p2/hand_vs_fingers/], which can't in themselves possess the properties of a rainbow [http://lesswrong.com/lw/p3/angry_atoms/].
1pjeby12ySo an alarm system has preferences? That is not most people's understanding of the word "preference", which requires a degree of agency that most rationalists wouldn't attribute to an alarm system. Nonetheless, let us say an alarm system has preferences. You didn't answer any of my follow-on questions for that case. As for explaining away the rainbow, you seem to have me confused with an anti-reductionist. See Explaining vs. Explaining Away [http://lesswrong.com/lw/oo/explaining_vs_explaining_away/], in particular: At this point, I am attempting to show that the very concept of a "preference" existing in the first place is something projected onto the world by an inbuilt bias in human perception. Reality does not have preferences, it has behaviors. This is not erasing the rainbow from the world, it's attempting to erase the projection of a mind-modeling variable ("preference") from the world, in much the same way as Eliezer broke down the idea of "possible" actions in one of his series. So, if you are claiming that preference actually exists, please give your definition of a preference, such that alarm systems and humans both have them.
-2Vladimir_Nesov12yA good reply, if only you approached the discussion this constructively more often. Note that probability is also in the mind [http://lesswrong.com/lw/oj/probability_is_in_the_mind/], but yet you see all the facts [http://wiki.lesswrong.com/wiki/Absolute_certainty] through it, and you can't ever revoke it; each mind is locked in its subjectively objective [http://lesswrong.com/lw/s6/probability_is_subjectively_objective/] character. What do you think of that?
0pjeby12yI think that those things have already been very well explained by Eliezer -- so much so that I assumed that you (and the others participating in this discussion) would have already internalized them to the same degree as I have, such that asserting "preferences" to be "about" things would be a blatantly obvious instance of the mind projection fallacy. That's why, early on, I tended to just speak as though it was bloody obvious, and why I haven't been painstakingly breaking it all out piece by piece, and why I've been baffled by the argument, confusion, and downvoting from people for whom this sort of basic reductionism ought to be a bloody simple matter. Oh, and finally, I think that you still haven't given your definition of "preference", such that humans and alarm systems both have it, so that we can then discuss how it can then be "about" something... and whether that "aboutness" exists in the thing having the preference, or merely in your mental model of the thing.
-2Vladimir_Nesov12yThat in reply to a comment full of links to Eliezer's articles. You also didn't answer my comment, but wrote some text that doesn't help me in our argument. I wasn't even talking about preference.
1pjeby12yI know. That's the problem. See this comment [http://lesswrong.com/lw/zv/post_your_utility_function/sv9] and this one [http://lesswrong.com/lw/zv/post_your_utility_function/su7], where I asked for your definition of preference, which you still haven't given. That's because you also "didn't answer my comment [http://lesswrong.com/lw/zv/post_your_utility_function/sv9], but wrote some text [http://lesswrong.com/lw/zv/post_your_utility_function/svg] that doesn't help me in our argument." I was attempting to redirect you to answering the question which you've now ducked twice in a row.
-1Vladimir_Nesov12yWriting text that doesn't help is pointless and mildly destructive. I don't see how me answering your questions would help this situation. Maybe you have the same sentiment towards answering my questions, but that's separate from reciprocation. I'm currently trying to understand your position in terms of my position, not to explain to you my position.
-1Vladimir_Nesov12yThat's why philosophy is such a bog, and why it's necessary to arrive at however insignificant but technical conclusions in order to move forward reliably. I chose the articles in the comment above because they were in surface-match with what you are talking about, as a potential point on establishing understanding. I asked basically how you can characterize your agreement/disagreement with them, and how it carries over to the preference debate.
0pjeby12yAnd I answered [http://lesswrong.com/lw/zv/post_your_utility_function/svi] that I agree with them, and that I considered it foundational material to what I'm talking about. Indeed, which is why I'd now like to have the answer to my question, please. What definition of "preferences" are you using, such that an alarm system, thermostat, and human all have them? (Since this is not the common, non-metaphorical usage of "preference".)
0Vladimir_Nesov12yPreference is order on the lotteries of possible worlds (ideally established by expected utility), usually with agent a part of the world. Computations about this structure are normally performed by a mind inside the mind. The agent tries to find actions that determine [http://lesswrong.com/lw/r0/thou_art_physics/] the world to be as high as possible on the preference order, given the knowledge about it. Now, does it really help?
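Vladimir_Nesov's semi-technical definition (preference as an ordering over lotteries of possible worlds, induced by expected utility, with the agent selecting the highest-ranked action) can be sketched as follows. The world states, probabilities, and utility values are invented for illustration:

```python
# A "lottery" is a probability distribution over possible worlds.
# A utility function over worlds induces an ordering on lotteries via
# expected utility; the agent picks the action whose lottery ranks highest.

def expected_utility(lottery, utility):
    """Expected utility of a lottery given as a dict: world -> probability."""
    return sum(p * utility[world] for world, p in lottery.items())

utility = {"sunny": 10, "rainy": 2}  # hypothetical utilities over worlds

# Each available action determines a lottery over worlds.
actions = {
    "picnic":    {"sunny": 0.7, "rainy": 0.3},  # EU = 7.6
    "stay_home": {"sunny": 0.0, "rainy": 1.0},  # EU = 2.0
}

# The agent selects the action highest on the induced preference order.
best = max(actions, key=lambda a: expected_utility(actions[a], utility))
print(best)  # -> picnic
```

This makes the structure explicit: the preference order lives over lotteries, while the selected action is what "determines the world" to sit high in that order.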
1pjeby12yYes, as it makes clear that what you're talking about is a useful reduction of "preference", unrelated to the common, "felt" meaning of "preference". That alleviates the need to further discuss that portion of the reduction. The next step of reduction would be to unpack your phrase "determine the world"... because that's where you're begging the question that the agent is determining the world, rather than determining the thing it models as "the world". So far, I have seen no-one explain how an agent can go beyond its own model of the world, except as perceived by another agent modeling the relationship between that agent and the world. It is simply repeatedly asserted (as you have effectively just done) as an obvious fact. But if it is an obvious fact, it should be reducible, as "preference" is reducible, should it not?
0Vladimir_Nesov12yHmm... Okay, this should've been easier if the possibility of this agreement was apparent to you. This thread is thereby merged here [http://lesswrong.com/lw/zv/post_your_utility_function/swq].
0Alicorn12yI'd been following this topic and getting frustrated with my inability to put my opinion on the whole preferences-about-the-territory thing into words, and I thought that orthonomal's comment accomplished it very nicely. I don't think I understand your other question.
-1pjeby12yI am only saying that the entire stack of concepts you have just mentioned exists only in your map. Permit me to translate: supposing utility is not about the (portion of map labeled) territory but about the (portion of map labeled) map, we get people who want nothing more than to sabotage their own mapmaking capabilities. Does that make it any clearer what I'm saying? This is a "does the tree make a sound" argument, and I'm on the, "no it doesn't" side, due to using a definition of "sound" that means "the representation of audio waves within a human nervous system". You are on the "of course it makes a sound" side, because your definition of sound is "pressure waves in the air." Make sense?
1saturn12yAs far as I can tell, you're saying that there is no territory, or that the territory is irrelevant. In other words, solipsism. You've overcome the naive map/territory confusion, but only to wind up with a more sophisticated form of confusion. This isn't a "does the tree make a sound" argument. It's more like a "dude... how do we even really know reality is really real" argument. Rationality is entirely pointless if all we're doing is manipulating completely arbitrary map-symbols. But in that case, why not leave us poor, deluded believers in reality to define the words "map", "territory", and "utility" the way we have always done?
0pjeby12yNo, general semantics. There's a difference.
2saturn12yCan you point out the difference? Even though "this is not a pipe", the form of a depiction of a pipe is nevertheless highly constrained by the physical properties of actual pipes. Do you deny that? If not, how do you explain it?
1conchis12yI've been trying to be on the "it depends on your definition and my definition sits within the realm of acceptable definitions" side. Unfortunately, whether this is what you intend or not, most of your comments come across as though you're on the "it depends on the definition, and my (PJ's) defintion is right and yours is wrong" side, which is what seems to be getting people's backs up.
0Vladimir_Nesov12yThis confusion is dissolved in the post Disputing Definitions [http://lesswrong.com/lw/np/disputing_definitions/].
0conchis12yWhich confusion? I didn't think I was confused. Now I'm confused about whether I'm confused. ;)
0Vladimir_Nesov12yYou mentioned this confusion as possibly playing a role in you and Eby talking past each other, the ambiguous use of the word "utility".
1conchis12yOK, cool. Now, given that we've already identified that, what does Disputing Definitions [http://lesswrong.com/lw/np/disputing_definitions/] tell us that we don't already know?
2Cyan12yThere's two things to say in response to this: first, I can define "liking and respecting me" as "experiencing analogous brain states to mine when I like and respect someone else". That's in the territory (modulo some assumptions about the cognitive unity of humankind): I could verify it in principle, although not in practice. The second thing is that even if we grant that the example was poor, the point was still valid. For example, one might prefer that one's spouse never cheat to one's spouse cheating but never being aware of that fact. (ETA: but maybe you weren't arguing against the point, only the example.)
-1pjeby12yBut what if they experience that state, and still, say, beat you up and treat you like jerks, because that's what their map says you should do when you feel that way? This isn't about the example being poor, it's about people thinking things in the map actually exist in the territory. Everything you perceive is mediated by your maps, even if only in the minimal sense of being reduced to human sensory-pattern recognition symbols first, let alone all the judgments about the symbols that we add on top. How about the case where you absolutely believe the spouse is cheating, but they really aren't? This is certainly a better example, in that it's easier to show that it's not reality that you value, but the part of your map that you label "reality". If you really truly believe the spouse is cheating, then you will feel exactly the same as if they really are. IOW, when you say that you value something "in the territory", all you are ever really talking about is the part of your map that you label "the territory", whether that portion of the map actually corresponds to the territory or not. This is not some sort of hypothetical word-argument, btw. (I have no use for them, which is why I mostly avoid the Omega discussions.) This is a practical point for minimizing one's suffering and unwanted automatic responses to events in the world. To the extent you believe that your map is the territory, you will suffer when it is out-of-sync.
2Cyan12yIt's still possible to prefer this state of affairs to one where they are beating me because they are contemptuous of me. Remember, we're talking about a function from some set X to the real numbers, and we're trying to figure out what sorts of things are members of X. In general, people do have preferences about the way things actually are. But my spouse won't, and I have preferences about that fact. All other things being equal, my preference ordering is "my spouse never cheats and I believe my spouse never cheats" > "my spouse cheats and I find out" > "my spouse cheats and I believe my spouse never cheats" > "my spouse never cheats but I believe she does". If a utility function exists that captures this preference, it will be a function that takes both reality and my map as arguments.
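Cyan's ordering can be written down directly as a function over (reality, belief) pairs. A minimal Python sketch; the numeric values are invented purely to encode the stated ordering, nothing more:

```python
# A hypothetical utility function whose domain is pairs of
# (what is actually true, what I believe), matching Cyan's ordering:
# (no cheat, believe no cheat) > (cheat, find out) >
# (cheat, believe no cheat) > (no cheat, believe cheat)

def utility(spouse_cheats: bool, i_believe_cheating: bool) -> int:
    """Maps a (territory, map) pair to a number; values are illustrative."""
    table = {
        (False, False): 3,  # faithful, and I know it
        (True,  True):  2,  # cheating, and I find out
        (True,  False): 1,  # cheating behind my back
        (False, True):  0,  # faithful, but I wrongly believe otherwise
    }
    return table[(spouse_cheats, i_believe_cheating)]

# The ordering holds only because reality itself is an argument:
assert utility(False, False) > utility(True, True) \
    > utility(True, False) > utility(False, True)
```

Note that no function of the belief argument alone can reproduce this ordering: rows two and three share the same belief but get different utilities, which is the whole point of the example.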
-3pjeby12yRight, which is where this veers off into "hypothetical word-arguments" for me, because the entire point I'm making is that all your preferences are still about the map, no matter how many times you point to a region of the map marked, "the way things actually are", and distinguish it from another part of the map labeled, "my map". You've read "Godel, Escher, Bach", right? This is not a pipe. The hand that's doing the drawing is in the drawing, no matter how realistically drawn it is. ;-)
2Cyan12yCan I get your analysis of my "spouse cheating and what I know about it" example? I've understood your position on utility functions as stated in other branches; I'm curious as to how you would interpret my claim to have preferences over both states of reality and beliefs about reality in this fairly concrete and non-hypothetical case.
-4pjeby12yMy point is that your entire argument consists of pointing to the map and claiming it's the territory. In the cases where reality and your belief conflict, you won't know that's the case. Your behavior will be exactly the same, either way, so the distinction is moot. When you are trying to imagine, "my spouse is cheating and I think she isn't", you aren't imagining that situation... you are actually imagining yourself perceiving that to be the case. That is, your map contains the idea of being deceived, and that this is an example of being deceived, and it is therefore bad. None of that had anything to do with the reality over which you claim to be expressing a preference, because if it were the reality, you would not know you were being deceived. This is just one neat little example of systemic bias in the systems we use to represent and reflect on preferences. They are designed to react to perceived circumstances, rather than to produce consistent reasoning about how things ought to be. So if you ever imagine that they are "about" reality, outside the relatively-narrow range of the here-and-now moment, you are on the path to error. And just as errors accumulate in Newtonian physics as you approach the speed of light, so too do reasoning errors tend to accumulate as you turn your reasoning towards (abstract) self-reflection.
2Cyan12ySure. No. My imagination encompasses the fact that if it were the reality, I would not know I was being deceived. I know what my emotional state would be -- it would be the same as it is now. That's easy to get. What it really comes down to is that saying that my preference ordering is "my spouse cheats and I find out" > "my spouse cheats and I believe my spouse never cheats" is equivalent to saying that if I am not sure (say 50% probability), I will seek out more information. I'm not sure how such a decision could be justified without considering that my preferences are over both the map and the territory. ETA: Reading over what you've written in other branches, I'd like to point out that a preference for not being deceived, even one whose violation you will never detect, isn't an error -- it's prima facie evidence that human preferences are over the territory as well as the map. That may not be the most useful way of thinking about it from a mindhacking perspective, but I don't think it's actually wrong.
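The equivalence Cyan asserts between the preference ordering and information-seeking can be checked with a toy expected-utility calculation. The utility numbers are invented; only their ordering matters:

```python
# Hypothetical utilities over (spouse cheats?, I believe cheating?) pairs,
# consistent with Cyan's stated ordering.
def u(cheats, believe_cheats):
    return {(False, False): 3, (True, True): 2,
            (True, False): 1, (False, True): 0}[(cheats, believe_cheats)]

p = 0.5  # subjective probability that the spouse is cheating

# Option A: don't investigate; keep believing "not cheating" either way.
eu_ignore = (1 - p) * u(False, False) + p * u(True, False)  # 0.5*3 + 0.5*1

# Option B: investigate; belief ends up matching reality.
eu_check = (1 - p) * u(False, False) + p * u(True, True)    # 0.5*3 + 0.5*2

assert eu_check > eu_ignore  # seeking information wins in expectation
```

If utility depended only on the belief column, the two options would tie whenever the resulting belief is the same, and investigating could never be strictly preferred; the gap comes entirely from the reality argument.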
-1pjeby12yThat preference is not universal, which to me makes it absolutely part of the map. And it's not just the fictional evidence of Cypher wanting to go back in the Matrix and forget; guys routinely pay women for various forms of fantasy fulfillment, willingly suspending disbelief in order to be deceived. Not enough? How about the experimental philosophers who re-ran the virtual world thought experiment until they found that people's decision about living in a fantasy world that they'd think was real, was heavily dependent upon whether they 1) had already been living in the fantasy, 2) whether their experience of life would significantly change, and 3) whether their friends and loved ones were also in the fantasy world. If anything, those stats should be quite convincing that it's philosophers and extreme rationalists who have a pathological fear of deception, rather than an inbuilt human preference for actually knowing the truth... and that most likely, if we have an inbuilt preference against deception, it's probably aimed at obtaining social consensus rather than finding truth. All that having been said, I will concede that perhaps you could find some irreducible microkernel of "map" that actually corresponds to "territory". I just don't think it makes sense (on the understanding-people side) to worry about it. If you're trying to understand what people want or how they'll behave, the territory is absolutely the LAST place you should be looking. (Since the distinctions they're using, and the meanings they attach to those distinctions, are 100% in the map.)
1conchis12yI don't see how it supposed to follow from the fact that not everyone prefers not-being-deceived, that those who claim to prefer not-being-deceived must be wrong about their own preferences. Could you explain why you seem to think it does? The claim others are defending here (as I understand it) is not that everyone's preferences are really over the territory; merely that some people's are. Pointing out that some people's preferences aren't about the territory isn't a counterargument to that claim.
0Vladimir_Nesov12y1) Why would people differ so much? Even concrete preferences don't get reversed, magical [http://wiki.lesswrong.com/wiki/Magic] mutants don't exist. 2) Even if you only care about your map, you still care about your map as a part of the territory, otherwise you make the next step and declare that you don't care about state of your brain either, you only care about caring itself, at which point you disappear in a "puff!" of metaphysical confusion. It's pretty much inevitable.
3Cyan12yThere's a distinction to be made between the fact that our knowledge about whether our preferences are satisfied is map-bound and the assertion that our preferences only take the map into account.
-1pjeby12yI'm saying that the preferences point to the map because your entire experience of reality is in the map - you can't experience reality directly. The comments about people's differences in not-being-deceived were just making the point that that preference is more about consensus reality than reality itself. In truth, we all care about our model of reality, which we labeled reality and think is reality, but is actually not.
1conchis12yI'm afraid I have no idea what this is supposed to mean. It seems to me like you're just repeating your conclusion over and over again using different words, which unfortunately doesn't constitute an argument. Maybe to you it seems like we're doing the same thing, I don't know. Alternatively, maybe we're still talking past each other for the reasons suggested here [http://lesswrong.com/lw/zv/post_your_utility_function/sd2] (which everyone seemed to agree with at the time.) In which case, I wonder why we're still having this conversation at all, and apologise for my part in pointlessly extending it. ;)
1pjeby12yIt's probably because I replied to an unclosed subthread, causing an unintended resurrection. Also, at one point Vladimir Nesov did some resurrection too, and there have also been comments by Cyan and Saturn that kept things going. Anyway, yes, as you said, we already agreed we are talking about different things, so let's stop now. ;-)
1Vladimir_Nesov12yIf you agree that you are just talking about a different thing, and given that "utility" is a term understood to mean a different thing from what you were talking about, kindly stop using that term for your separate concept, to avoid unnecessary confusion, and stop arguing about the sound of the fallen tree.
1conchis12yYay!
0Vladimir_Nesov12yYou've just ignored Cyan's counterexample, and presented a few of your own that support your point of view.
1pjeby12yI answered it here [http://lesswrong.com/lw/zv/post_your_utility_function/sdn], actually.
2Vladimir_Nesov12yWrite up your argument, make a top post, refer to it if it's convincing. But guerilla arguing is evil: many words and low signal-to-noise.
-1pjeby12yI don't understand you.
1Vladimir_Nesov12yIf you are operating under an assumption that nobody agrees with, you are wasting everyone's time (assumption: the map is about the map, territory be damned), as the argument never goes anywhere. As a compromise, compose your best argument as a top-level post (but only if you expect to convince at least someone).
0pjeby12yThat's why I don't understand you - I dropped this particular subthread for that very reason, but Cyan asked a second time for a reply. Otherwise, I'd have not said anything else in this particular subthread.
0Vladimir_Nesov12yYou could still write a meta-reply, taking that problem into account. The root of the disagreement can be stated in one line, and a succinct statement of an as-yet-unresolvable disagreement is itself a resolution of the argument.
1Vladimir_Nesov12yI'm just trying to be decisive in identifying the potential flaming patterns in the discussion. I could debate the specifics, but given my prior experience in debating stuff with you, and given the topics that could be debated in these last instances, I predict that the discussion won't lead anywhere, and so I skip the debate and simply state my position, to avoid unnecessary text. One way of stopping recurring thematic or person-driven flame wars (that kill Internet communities [http://lesswrong.com/lw/c1/wellkept_gardens_die_by_pacifism/]) is to require the sides to implement decent write-ups of their positions: even without reaching agreement, at some point there remains nothing to be said, and so the endless cycle of active mutual misunderstanding gets successfully broken.
-1pjeby12yI don't understand how that's supposed to work. If you don't expect it to lead anywhere, why bother saying anything at all?
0Vladimir_Nesov12yI'm registering the disagreement, and inviting you to sort the issue out for yourself, through reconsidering your position in response to apparent disagreement, or through engaging into a more constructive form of discussion.
-1pjeby12yThis appears to be a one-way street. If applied consistently, it would seem that your first step would be to reconsider your position in response to apparent disagreement... or that I should reply by registering my disagreement -- which implicitly I'd have already done. Or, better yet, you would begin (as other people usually do) by starting the "more constructive form of discussion", i.e., raising specific objections or asking specific questions to determine where the differences in our maps lie.
2Vladimir_Nesov12yPreferences are computed by the map, but they are NOT about the map.
-4pjeby12yOh, right. It says so right here... on the map! ;-)
0Vladimir_Nesov12yDon't take it lightly, it's a well-vetted and well-understood position, extensively discussed and agreed upon. You should take such claims as strong evidence that you may have missed something crucial, that you need to go back and reread the standard texts.
-1pjeby12yIt's extensively discussed and agreed upon, that that is how we (for certain definitions of "we") would like it to be, and it certainly has desirable properties for say, building Friendly AI, or any AI that doesn't wirehead. And it is certainly a property of the human brain that it orients its preferences towards what it believes is the outside world - again, it has good consequences for preventing wireheading. But that doesn't make it actually true, just useful. It's also pretty well established as a tenet of e.g., General Semantics, that the "outside world" is unknowable, since all we can ever consciously perceive is our map. The whole point of discussing biases is that our maps are systematically biased -- and this includes our preferences, which are being applied to our biased views of the world, rather than the actual world. I am being descriptive here, not prescriptive. When we say we prefer a certain set of things to actually be true, we can only mean that we want the world to not dispute a certain map, because otherwise we are making the supernaturalist error of assuming that a thing could be true independent of the components that make it so. To put it another way, if I say, "I prefer that the wings of this plane not fall off", I am speaking about the map, since "wings" do not exist in the territory. IOW, our statements about reality are about the intersection of some portion of "observable" reality and our particular mapping (division and labeling) of it. And it cannot be otherwise, since to even talk about it, we have to carve up and label the "reality" we are discussing.
0loqi12yIt's funny that you talk of wordplay a few comments back, as it seems that you're the one making a technically-correct-but-not-practically-meaningful argument here. If I may attempt to explore your position: Suppose someone claims a preference for "blue skies". The wirehead version of this that you endorse is "I prefer experiences that include the perception I label 'blue sky'". The "anti-wirehead" version you seem to be arguing against is "I prefer actual states of the world where the sky is actually blue". You seem to be saying that since the preference is really about the experience of blue skies, it makes no sense to talk about the sky actually being blue. Chasing after external definitions involving photons and atmospheric scattering is beside the point, because the actual preference wasn't formed in terms of them. This becomes another example of the general rule that it's impossible to form preferences directly about reality, because "reality" is just another label on our subjective map. As far as specifics go, I think the point you make is sound: Most (all?) of our preferences can't just be about the territory, because they're phrased in terms of things that themselves don't exist in the territory, but at best simply point at the slice of experience labeled "the territory". That said, I think this perspective grossly downplays the practical importance of that label. It has very distinct subjective features connecting in special ways to other important concepts. For the non-solipsists among us, perhaps the most important role it plays is establishing a connection between our subjective reality and someone else's. We have reason to believe that it mediates experiences we label as "physical interactions" in a manner causally unaffected by our state of mind alone. 
When I say "I prefer the galaxy not to be tiled by paperclips", I understand that, technically, the only building blocks I have for that preference are labeled experiences and concepts that aren't
1Vladimir_Nesov12yThink of what difference is there between "referring directly" to the outside reality and "referring directly" to the brain. Not much, methinks. There is no homunculus whose hands are only so long to reach the brain, but not long enough to touch your nose.
0loqi12yAgreed, as the brain is a physical object. Referring "directly" to subjective experiences is a different story though.
0timtyler12yWhether your preferences refer to your state, or to the rest of the world is indeed a wirehead-related issue. The problem with the idea that they refer to your state is that that idea tends to cause wirehead behaviour - surgery on your own brain to produce the desired state. So - it seems desirable to construct agents that believe that there is a real world, and that their preferences relate to it.
-1pjeby12yI agree - that's probably why humans appear to be constructed that way. The problem comes in when you expect the system to also be able to accurately reflect its preferences, as opposed to just executing them. This does not preclude the possibility of creating systems that can; it's just that they're purely hypothetical. To the greatest extent practical, I try to write here only about what I know about the practical effects of the hardware we actually run on today, if for no other reason than if I got into entirely-theoretical discussions I'd post WAY more than I already do. ;-)
0timtyler12yPresumably, if you asked such an agent to reflect on its own purposes, it would claim that they related to the external world (unless it's aim was to deceive you about its purposes for signalling reasons, of course). For example, it might claim that its aim was to save the whales - rather than to feel good about saving the whales. It could do the latter by taking drugs or via hypnotherapy - and that is not how it actually acts.
0pjeby12yActually, if signaling was its true purpose, it would claim the same thing. And if it were hacked together by evolution to be convincing, it might even do so by genuinely believing that its reflections were accurate. ;-) Indeed. But in the case of humans, note first that many people do in fact take drugs to feel good, and second, that we tend to dislike being deceived. When we try to imagine getting hypnotized into believing the whales are safe, we react as we would to being deceived, not as we would if we truly believed the whales were safe. It is this error in the map that gives us a degree of feed-forward consistency, in that it prevents us from certain classes of wireheading. However, it's also a source of other errors, because in the case of self-fulfilling beliefs, it leads to erroneous conclusions about our need for the belief. For example, if you think your fear of being fired is the only thing getting you to work at all, then you will be reluctant to give up that fear, even if it's really the existence of the fear that is suppressing, say, the creativity or ambition that would replace the fear. In each case, the error is the same: System 2 projection of the future implicitly relies on the current contents of System 1's map, and does not take into account how that map would be different in the projected future. (This is why, by the way, The Work's fourth question is "who would you be without that thought?" The question is a trick to force System 1 to do a projection using the presupposition that the belief is already gone.)
1conchis12yIt's possible that the confusion here is (yet again) due to people using the word utility to mean different things. If PJ is using utility to refer to, e.g. an emotional or cognitive state, then he's right that our utility cannot respond directly to the territory. But broader notions of utility, well-being, and preference are possible, and nothing PJ has said is especially relevant to whether they are coherent or not.
0Cyan12yAh, right. Good call.
-1pjeby12yRight. I don't dabble in discussing those broader notions, though, since they can't be empirically grounded. How can you test a concept of utility that's not grounded in human perception and emotion? What good can it ever do you if you can't connect it back to actual living people? I consider such discussions to be much more irrational than, say, talk of "The Secret", which at least offers an empirical procedure that can be tested. ;-) (In fairness, I do consider such discussions here on LW to be far less annoying than most discussions of the Secret and suchlike!)
0conchis12yThese notions are about what it means for something to be good for "actual living people". They're difficult, if not impossible to "test" (about the best testing procedures we've come up with is thought experiments, which as discussed elsewhere are riddled with all sorts of problems). But it's not like you can "test" the idea that positive emotions are good for you either.
-1pjeby12yI thought this was well established scientifically, if by "good for you", you mean health, persistence, or success in general. (see e.g. Seligman)
0conchis12yThe argument is precisely about what "good for you" means, so this would be assuming the conclusion that needs to be established.
0pjeby12yOw. That makes my head hurt. (See, that's why I try not to get into these discussions!) (I'm hard pressed, though, to conceive of a moral philosophy where improved health would not be considered "good for you".)
0conchis12yPreference utilitarianism applied to someone who thinks that it is only through suffering that life can achieve meaning. To be clear, I don't subscribe to such a view myself, but it's conceivable. I agree with you that health is good for people. My point is just that this agreement owes more to shared intuition than conclusive empirical testing.
-1pjeby12yYes, but now we're back to concrete feelings of actual people again. ;-) Right, which is one reason why, when we're talking about this particular tiny (but important) domain (that at least partially overlaps with Eliezer's notion of Fun Theory), conclusive empirical testing is a bit of a red herring, since the matter is subjective from the get-go. We can objectively predict certain classes of subjective events, but the subjectivity itself seems to be beyond that. At some point, you have to make an essentially arbitrary decision of what to value.

So, we're just listing how much we'd buy things for? I don't see why it's supposed to be hard.

I guess it gets a bit complicated when you consider combinations of things, rather than just their marginal value. For example, once I have a computer with an internet connection, I care for little else. Still, I just have to figure out what would be about neutral, and decide how much I'd pay an hour (or need to be paid an hour) to go from that to something else.

Playing a vaguely interesting game on the computer = 0.

Doing something interesting = 1-3.

Talking to a ...

I realize that my utility function is inscrutable and I trust the unconscious part of me to make accurate judgments of what I want. When I've determined what I want, I use the conscious part of me to determine how I'll achieve it.

4Vladimir_Nesov12yDon't trust the subconscious too much in determining what you want either. Interrogate it relentlessly, ask related questions, find incoherent claims and force the truth about your preference to the surface.
1CannibalSmith12yYou seem to mistakenly think that my subconscious is not me.
3Cyan12yI'd say rather that it seems he doesn't trust one's subconscious to be self-consistent -- and that doesn't seem mistaken to me.
1Vladimir_Nesov12yCorrect. I wasn't very careful in implying that the truth extracted from the subconscious is to be accepted: the criteria for acceptance, or trust, are also a component of your values (which suggests an additional limit of reflective consistency), closing on itself, to be elicited as well.

"Utility functions are really bad match for human preferences, and one of the major premises we accept is wrong." Given the sheer messiness and mechanical influence involved in human brains, it's not even clear we have real 'values' which could be examined on a utility function, rather than simple dominant-interestedness that happens for largely unconscious and semi-arbitrary reasons.

Interesting exercise. After trying for a while I completely failed; I ended up with terms that are completely vague (e.g. "comfort"), and actually didn't even begin to scratch the surface of a real (hypothesized) utility function. If it exists, it is either extremely complicated (too complicated to write down, perhaps) or needs "scientific" breakthroughs to uncover its simple form.

The result was also laughably self-serving, more like "here's roughly what I'd like the result to be" than an accurate depiction of what I do.

What counts as a "successful" utility function?

In general terms there are two, conflicting, ways to come up with utility functions, and these seem to imply different metrics of success.

1. The first assumes that "utility" corresponds to something real in the world, such as some sort of emotional or cognitive state. On this view, the goal, when specifying your utility function, is to get numbers that reflect this reality as closely as possible. You say "I think x will give me 2 emotilons", and "I think y will give me 3 emot

1bill12yWhen I teach decision analysis, I don't use the word "utility" for exactly this reason. I separate the "value model" from the "u-curve." The value model is what translates all the possible outcomes of the world into a number representing value. For example, a business decision analysis might have inputs like volume, price, margin, development costs, etc., and the value model would translate all of those into NPV. You only use the u-curve when uncertainty is involved. For example, distributions on the inputs lead to a distribution on NPV, and the u-curve would determine how to assign a value that represents the distribution. Some companies are more risk averse than others, so they would value the same distribution on NPV differently. Without a u-curve, you can't make decisions under uncertainty. If all you have is a value model, then you can't decide e.g. if you would like a deal with a 50-50 shot at winning $100 vs losing $50. That depends on risk aversion, which is encoded into a u-curve, not a value model. Does this make sense?
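bill's value-model/u-curve split can be sketched numerically on his own 50-50 example. The exponential u-curve and the risk-tolerance figures below are my assumptions (a common textbook choice in decision analysis), not anything bill specified:

```python
import math

# Value model: outcomes are already in dollars here, so it's the identity.
# U-curve: exponential utility with risk tolerance R (larger R = less
# risk-averse). Both the functional form and the R values are illustrative.

def u(x, R):
    return 1 - math.exp(-x / R)

def inv_u(y, R):
    return -R * math.log(1 - y)

def certain_equivalent(lottery, R):
    """lottery: list of (probability, dollar outcome) pairs.
    Returns the sure amount valued the same as the gamble."""
    eu = sum(p * u(x, R) for p, x in lottery)
    return inv_u(eu, R)

deal = [(0.5, 100), (0.5, -50)]  # 50-50 shot at winning $100 vs losing $50

print(certain_equivalent(deal, R=100))     # risk-averse: slightly negative, declines the deal
print(certain_equivalent(deal, R=10_000))  # near risk-neutral: close to the $25 expected value
```

The value model alone gives the same $25 expected value to everyone; only the u-curve (via R) distinguishes the company that takes the deal from the one that declines it, which is bill's point.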
0conchis12yTotally. ;)

Utility functions are really bad match for human preferences, and one of the major premises we accept is wrong.

Human utility functions are relative, contextual, and include semi-independent positive-negative axes. You can't model all that crap with one number.

The study of affective synchrony shows that humans have simultaneously-active positive and negative affect systems. At extreme levels in either system, the other is shut down, but the rest of the time, they can support or oppose each other. (And in positions of opposition, we experience conflict...

2conchis12yI don't really see why not (at least without further argument). 1. Relativity and contextuality introduce additional arguments into the utility function, they don't imply that the output can't be scalar. Lots of people include relativity and contextual concerns into scalar utility all the time. 2. Semi-independent positive and negative axes only prevent you from using scalar utility if you think they're incommensurable. If you can assign weights to the positive and negative axes, then you can aggregate them into a single utility index. (How accurately you can do this is a separate question.) Of course, if you do think there are fundamentally incommensurable values, then scalar utility runs into trouble.* Amartya Sen and others have done interesting work looking at plural/vector utility and how one might go about using it. (I guess if we're sufficiently bad at aggregating different types of value, such methods might even work better in practice than scalar utility.) * I'm sceptical; though less sceptical than I used to be. Most claims of incommensurability strike me as stemming from unwillingness to make trade-offs rather than inability to make trade-offs, but maybe there are some things that really are fundamentally incomparable.
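conchis's two points can be illustrated in a few lines: weights make two affect axes commensurable and collapse them to a scalar, while refusing to weight them leaves only a Sen-style vector comparison that can return "incomparable". All weights and axis values here are invented:

```python
# (1) Commensurable case: positive and negative affect are weighted
# into one scalar index. The weights are arbitrary illustrations.
def scalar_utility(positive_affect, negative_affect, w_pos=1.0, w_neg=1.5):
    return w_pos * positive_affect - w_neg * negative_affect

# (2) Incommensurable case: compare utility vectors componentwise
# (Pareto dominance); some pairs simply have no ranking.
def pareto_compare(a, b):
    if a == b:
        return "equal"
    if all(x >= y for x, y in zip(a, b)):
        return "a > b"
    if all(x <= y for x, y in zip(a, b)):
        return "a < b"
    return "incomparable"

print(scalar_utility(5, 2))              # one number: 5*1.0 - 2*1.5 = 2.0
print(pareto_compare((5, -2), (3, -1)))  # more positive but more negative affect
```

The second call returns "incomparable": option a is better on the first axis and worse on the second, so without weights there is no fact of the matter about which wins, which is exactly the scalar-vs-vector dispute in the comment above.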
2taw12yI was pretty convinced of commensurability and thought cognitive biases would just introduce noise, but lack of success by me, and apparently by everyone else in this thread, changed my mind quite significantly.
0conchis12yNot knowing how to commensurate things doesn't imply they're incommensurable (though obviously, the fact that people have difficulty with this sort of thing is interesting in its own right). As a (slight) aside, I'm still unclear about what you think would count as " success [http://lesswrong.com/lw/zv/post_your_utility_function/s7i]" here.
0taw12yIt's not a hard implication, but it's a pretty strong evidence against existence of traditional utility functions. A success would be a list of events or states of reality and their weights, such that you're pretty convinced that your preferences are reasonably consistent with this list, so that you know how many hours of standing in queues is losing 5kg worth and how much money is having one thousand extra readers of your blog worth. It doesn't sound like much, but I completely fail as soon as it goes out of very narrow domain, I'm surprised by this failure, and I'm surprised that others fail at this too.
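taw's success criterion amounts to a consistency check: a weight list "succeeds" if the values it computes reproduce your independently elicited gut judgments between bundles of outcomes. A minimal sketch using his own examples; every number below is hypothetical:

```python
# Hypothetical point weights per unit of each outcome, in taw's domains.
weights = {
    "hour_in_queue": -2,   # standing in queues is bad
    "kg_lost": 30,         # losing weight is good
    "blog_reader": 0.1,    # extra blog readers are mildly good
}

def value(bundle):
    """Score a bundle of outcomes, e.g. {"kg_lost": 5, "hour_in_queue": 30}."""
    return sum(weights[k] * qty for k, qty in bundle.items())

def consistent(judgments):
    """Each judgment is (preferred bundle, rejected bundle) from gut feeling;
    the weight list succeeds only if it agrees with every one."""
    return all(value(a) > value(b) for a, b in judgments)

judgments = [
    # gut call: "I'd stand 30 hours in queues to lose 5kg" (vs doing nothing)
    ({"kg_lost": 5, "hour_in_queue": 30}, {}),
    # gut call: "1000 extra readers beats losing 1kg"
    ({"blog_reader": 1000}, {"kg_lost": 1}),
]

print(consistent(judgments))
```

taw's report is that once the bundles cross domain boundaries, no weight assignment he tried survives this check against his actual gut judgments, which is the failure the post describes.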
0Cyan12yI'm surprised at your surprise. Even granting that humans could possibly be innately reflectively self-consistent, there's a huge curse of dimensionality [http://en.wikipedia.org/wiki/Curse_of_dimensionality] problem in specifying the damn thing. ETA: The problem with the dimensionality is that interactions between the dimensions abound; ceteris paribus assumptions can't get you very far at all.
0taw12yI was expecting noise, and maybe a few iterations before reaching satisfying results, but it seems we cannot even get that much, and it surprises me.
4conchis12yOK, there's a lot of food for thought in there, and I can't possibly hope to clarify everything I'd ideally like to, but what I think you're saying is: 1. it's theoretically possible to think about utility as a single number; but 2. it's nonetheless a bad idea to do so, because (a) we're not very good at it, and (b) thinking about things mathematically means we won't "own" the decision, and therefore leads to akrasia problems (FWIW, I was only claiming 1.) I'm fairly sympathetic to 2(a), although I would have thought we could get better at it with the right training. I can see how 2(b) could be a problem, but I guess I'm not really sure (i) that akrasia is always an issue, and (ii) why (assuming we could overcome 2(a)) we couldn't decide mathematically, and then figure out how to "own" the decision afterwards. (This seems to have worked for me, at least; and stopping to do the math has at sometimes stopped me "owning" the wrong decision, which can be worse than half-heartedly following through on the right one.) P.S. I didn't think anyone was suggesting Omega should make the trade-off. I certainly wasn't.
2pjeby12yTo own it, you'd need to not mathematically decide; the math could only ever be a factor in your decision. There's an enormous gap between "the math says do this, so I guess I'll do that", and "after considering the math, I have decided to do this." The felt-experience of those two things is very different, and it's not merely an issue of using different words. Regarding getting better at making decisions off of mathematics, I think perhaps you miss my point. For humans, the process by which decision-making is done has consequences for how it's implemented, and for the person's experience and satisfaction regarding the decision itself. See more below... I'd like to see an actual, non-contrived example of that. Mostly, my experience is that people are generally better off with a 50% plan executed 100% than a 100% plan executed 50%. It's a bit of a cliche -- one that I also used to be skeptical/cynical about -- but it's a cliche because it's true. (Note also that in the absence of catastrophic failure, the primary downside of a bad plan is that you learn something, and you still usually make some progress towards your goals.) It's one of those places where in theory there's no difference between theory and practice, but in practice there is. We just think differently when we're considering something from when we're committed to it -- our brains just highlight different perceptions and memories for our attention, so much so that it seems like all sorts of fortunate coincidences are coming your way. Our conscious thought process in System 2 is unchanged, but something on the System 1 level operates differently with respect to a decision that's passed through the full process. I used to be skeptical about this, before I grasped the system 1/system 2 distinction (which I used to call the "you" (S2) vs. "yourself" (S1) distinction). I assumed that I could make a better plan before deciding to do something or taking any action, and refused to believe otherwise.
1conchis12ySure. I don't think this is inconsistent with what I was suggesting, which was really just that that the math could start the process off. All of which I agree with; but again, I don't see how this rules out learning to use math better. Fair enough. The examples I'm thinking of typically involve "owned" decisions that are more accurately characterised as 0% plans (i.e. do nothing) or -X% plans (i.e. do things that are actively counterproductive). 1. How do you decide what to get S1 to buy in to? 2. What do you do in situations where feedback comes too late (long term investments with distant payoffs) or never (e.g. ethical decisions where the world will never let you know whether you're right or not). P.S. Yes, I'm avoiding the concrete example request. I actually have a few, but they'd take longer to write up than I have time available at the moment, and involve things I'm not sure I'm entirely comfortable sharing.
0pjeby12yI already explained: you select options by comparing their positive traits. The devil is in the details, of course, but as you might imagine I do entire training CDs on this stuff. I've also written a few blog articles about this in the past. I don't understand the question. If you're asking how I'd know whether I made the best possible decision, I wouldn't. Maximizers do very badly at long-term happiness, so I've taught myself to be a satisficer. I assume that the decision to invest something for the long term is better than investing nothing, and that regarding an ethical decision I will know by the consequences and my regrets or lack thereof whether I've done the "right thing"... and I probably won't have to wait very long for that feedback.
1Cyan12yOne can imagine a person who has committed emotionally to the maxim "shut up and multiply (when at all possible)" and made it an integral part of their identity. For such an individual, the commitment precedes the act of doing the math, and the enormous gap referred to above does not exist.
2pjeby12yIf such an individual existed, they would still have the same problem of shifting decisions, unless they also included a commitment to not recalculate before a certain point. Consider, e.g. Newcomb's problem. If you do the calculation before, you should one-box. But doing the calculation at the actual time, means you should two-box. So, to stick to their commitments, human beings need to precommit to not revisiting the math, which is a big part of my point here. Your hypothetical committed-to-the-math person is not committed to their "decisions", they are committed to doing what the math says to do. This algorithm will not produce the same results as actual commitment will, when run on human hardware. To put it more specifically, this person will not get the perceptual benefits of a committed decision for decisions which are not processed through the machinery I described earlier. They will be perceptually tuned to the math, not the situation, for example, and will not have the same level of motivation, due to a lack of personal stake in their decision. In theory there's no difference between theory and practice, but in practice there is. This is because System 2 is very bad at intuitively predicting System 1's behavior, as we don't have a built-in reflective model of our own decision-making and motivation machinery. Thus, we don't know (and can't tell) how bad our theories are without comparing decision-making strategies across different people.
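The two calculations being contrasted here can be made explicit. A sketch, using the standard payoffs of $1,000,000 and $1,000 and an assumed predictor accuracy of 99% (the accuracy figure is invented for illustration):

```python
# "Before" (evidential) calculation: your choice correlates with the
# prediction, so one-boxing has the higher expected value.
p = 0.99  # assumed predictor accuracy

ev_one_box = p * 1_000_000                        # opaque box full iff one-boxing predicted
ev_two_box = (1 - p) * (1_000_000 + 1_000) + p * 1_000

print(ev_one_box > ev_two_box)  # True: beforehand, one-boxing wins

# "At the time" (causal/dominance) calculation: the boxes are already
# filled, so taking both adds exactly $1000 whatever the opaque box holds.
for opaque_contents in (0, 1_000_000):
    assert (opaque_contents + 1_000) - opaque_contents == 1_000
```

This is the gap pjeby points to: the same person, applying the math at two different moments, gets opposite recommendations unless they precommit to one of the two framings.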
1Vladimir_Nesov12yThis is incorrect. You are doing something very wrong if changing the time when you perform a calculation changes the result. That's an important issue in decision theory being reflectively consistent.
2pjeby12yThat's the major point I'm making: that humans are NOT reflectively consistent without precommitment... and that the precommitment in question must be concretely specified, with the degree of concreteness and specificity required being proportional to the degree of "temptation" involved.
1Vladimir_Nesov12yThat may usually be the case, but this is not a law. Certain people could conceivably precommit to being reflectively consistent, to follow the results of calculations whenever the calculations are available.
2pjeby12yOf course they could. And they would not get as good results from either an experiential or practical perspective as the person who explicitly committed to actual, concrete results, for the reasons previously explained. The brain makes happen what you decide to have happen, at the level of abstraction you specify. If you decide in the abstract to be a good person, you will only be a good person in the abstract. In the same way, if you "precommit to reflective consistency", then reflective consistency is all that you will get. It is more useful to commit to obtaining specific, concrete, desired results, since you will then obtain specific, concrete assistance from your brain for achieving those results, rather than merely abstract, general assistance. Edit to add: In particular, note that a precommitment to reflective consistency does not rule out the possibility of one's exercising selective attention and rationalization as to which calculations to perform or observe. This sort of "commit to being a certain kind of person" thing tends to produce hypocrisy in practice, when used in the abstract. So much so, in fact, that it seems to be an "intentionally" evolved mechanism for self-deception and hypocrisy. (Which is why I consider it a particularly heinous form of error to try to use it to escape the need for concrete commitments -- the only thing I know of that saves one from hypocrisy!)
3pjeby12yA person who decides to be "a good person" will selectively perceive those acts that make them a "good person", and largely fail to perceive those that do not, regardless of the proportions of these events, or whether these events are actually good in their effects. They will also be more likely to perceive as good anything that they already want to do or which benefits them, and they will find ways to consider it a higher good to refrain from doing anything they'd rather not do in the first place. Similarly, a person who decides to be "reflectively consistent" will not only selectively perceive their acts of reflective consistency, they will also fail to observe the lopsided way in which they apply the concept, nor will they notice how their "reflective consistency" is not, in itself, achieving any other results or benefits for themselves or others. Brains operate on the level of abstraction you give them, so the more abstract the goal, the less connected to reality the results will be, and the more wiggle room there will be for motivated reasoning and selective perception. So in theory you can precommit to reflective consistency, but in practice you will only get an illusion of reflective consistency. (Edit to add: If you're still confused by this, it's probably because you're thinking about thinking, and I'm talking about actual behavior.)
3conchis12yI can't speak for Vladimir, but from my perspective, this is much clearer now. Thanks! (ETA: FWIW, while most of your comments on this post leave me with a sense that you have useful information to share, I've also found them somewhat frustrating, in that I really struggle to figure out exactly what it is. I don't know if this is your writing style, my slow-wittedness, or just the fact that there's a lot of inferential distance between us; but I just thought it might be useful for you to know.)
2pjeby12ySince I'm trying to rapidly summarize a segment of what Robert Fritz took a couple of books to get across to me ("The Path of Least Resistance" and "Creating"), inferential distance is likely a factor. It's mostly his model of decisionmaking and commitment that I'm describing, with a few added twists of mine regarding the ranking bit, and the "worst that could happen" part, as well as links from it to the System 1/2 model. (And of course I've been talking about Fritz's idea of the ideal-belief-reality-conflict in other threads, and that relates here as well.)
-1Vladimir_Nesov12yBasically, our conversation went like this: You: People can't be reflectively consistent. Me: Yes they can, sometimes. You: Of course they can. Me: I'm confused. You: Of course people can be reflectively consistent. But only in the dreamland. If you are still confused, it's probably because you are still thinking about the dreamland, while I'm talking about reality.
3AdeleneDawner12yI think pjeby's point was that reflective consistency is a way of thinking - so if you commit to thinking in a reflectively consistent way, you will think in that way when you think, but you may still wind up not acting according to that kind of thoughts every time you would want to, because you're not entirely likely to notice that you need to think them in the first place.
0Vladimir_Nesov12yReflective consistency is not about a way of thinking. Decision theory, considered in the simplest case, talks about properties of actions, including future actions, while ignoring properties of the algorithm generating the actions.
1pjeby12yNo, it went like this: Me: People can't be reflectively consistent You: But they can precommit to be Me: But that won't *actually make them so* You: But they could precommit to acting as if they were Me: Of course they can, but it still won't actually make them so. See also Abraham Lincoln's, "If you call a tail a leg, how many legs does a dog have? Four, because calling a tail a leg doesn't make it so."
0conchis12yThis is a diversion, but it has always struck me as a stupid answer to an even stupider question. I don't really understand why people think it's supposed to reveal some deep wisdom.
-1pjeby12yThat's Zen for you. ;-) Seriously, the point (for me, anyhow) is that System 2 thinking routinely tries to call a tail a leg, and I think there's a strong argument to be made that it's an important part of what system 2 reasoning "evolved for".
0Vladimir_Nesov12yHuh? Reflective consistency is a property of behavior. If you behave as if you are reflectively consistent, you are.
1pjeby12yAnd I am saying that a single precommitment to behaving in a reflectively consistent way, will not result in you actually behaving in the same way as you would if you individually committed to all of the specific decisions recommended by your abstract decision theory. Your perceptions and motivation will differ, and therefore your actual actions will differ. People try to precommit in this fashion all the time, by adopting time management or organizational systems that purport to provide them with a consistent decision theory over some subdomain of decisions. They hope to then simply commit to that system, and thereby somehow escape the need for making (and committing to) the individual decisions. This doesn't usually work very well, for reasons that have nothing to do with which decision theory they are attempting to adopt.
0Vladimir_Nesov12yIn my original comment, I specified that I only consider the situations "where the calculations are available", that is you know (theoretically!) exactly what to do to be reflectively consistent in such situations and don't need to achieve great artistic feats to pull that off. You need to qualify what you are asserting, otherwise everything looks gray [http://wiki.lesswrong.com/wiki/Fallacy_of_gray].
2pjeby12yI'm asserting that people don't actually do what they "decide" to do on the abstract level of System 2, unless certain System 1 processes are engaged with respect to the concrete, "near" aspects of the situation where the behavior is to be executed, and that merely precommitting to follow a certain decision theory is not a substitute for the actual, concrete, System 1 commitment processes involved. Now, could you commit to following a certain behavior under certain circumstances, that included the steps needed to also obtain System 1 commitment for the decision? That I do not know. I think maybe you could. It would depend, I think, on how concretely you could define the circumstances when these steps would be taken... and doing that in a way that was both concrete and comprehensive would likely be difficult, which is why I'm not so sure about its feasibility.
0Vladimir_Nesov12yYour model of human behavior doesn't look in the least realistic to me, with its prohibition of reason, and requirements for difficult rituals of baptising reason into action.
1pjeby12yWell, I suppose all the experiments that have been done on construal theory, and how concrete vs. abstract construal affects action and procrastination must be unrealistic, too, since that is a major piece of what I'm talking about here. (If people were generally good at turning their reasoning into action, akrasia wouldn't be such a hot topic here and in the rest of the world.)
0Vladimir_Nesov12yAkrasia happens, but it's not a universal mode. I object to you implying that akrasia is inevitable.
1pjeby12yI never said it was inevitable. I said it happens when there are conflicts, and you haven't really decided what to do about those conflicts, with enough detail and specificity for System 1 to automatically make the "right" choice in context. If you want different results, it's up to you to specify them for yourself.
1Cyan12yNewcomb's problem is a bad example to use here, because it depends on which math the person has committed to, e.g., Eliezer claims to have worked out a general analysis that justifies one-boxing... The personal stake I envision is defending their concept of their own identity. "I will do this because that's the kind of person I am."
3Cyan12yThank you for this interesting discussion. Although I posed the "emotionally committed to math" case as a specific hypothetical, many of the things you've written in response apply more generally, so I've got a lot more material to incorporate into my understanding of the pjeby model of cognition. (I know that's a misnomer, but since you're my main source for this material, that's how I think of it.) I'm going to have to go over this exchange more thoroughly after I get some sleep.
0conchis12yOf course, there are presumably situations where one's decision should change with the conditions. (I do get that there's a trade-off between retaining the ability to change with the right conditions and opening yourself up to changing with the wrong conditions though.)
0pjeby12yThe trade-off optimum is usually in making decisions aimed at producing concrete results, while leaving one's self largely free to determine how to achieve those results. But again, the level of required specificity is determined by the degree of conflict you can expect to arise (temptations and frustrations).
2billswift12yThis is similar to one problem Austrians have with conventional economics. They think the details of transactions are extremely important and that too much information is lost when they are aggregated in GDP and the like; more information than the weak utility of the aggregates can justify.
0timtyler12yRe: Human utility functions are relative, contextual, and include semi-independent positive-negative axes. You can't model all that crap with one number. That is not a coherent criticism of utilitarianism. Do you understand what it is that you are criticising?
2pjeby12yYes, I do... and it's not utilitarianism. ;-) What I'm criticizing is the built-in System 2 motivation-comprehending model whose function is predicting the actions of others, but which usually fails when applied to self, because it doesn't model all of the relevant System 1 features. If you try to build a human-values-friendly AI, or decide what would be of benefit to a person (or people), and you base it on System 2's model, you will get mistakes, because System 2's map of System 1 is flawed, in the same way that Newtonian physics is flawed for predicting near-light-speed mechanics: it leaves out important terms.
0SoullessAutomaton12yOf course you can. It just won't be a very good model. What do you think would work better as a simplified model of utility, then? It seems you think that having orthogonal utility and disutility values would be a start.
0loqi12yIt's been a while since I looked at CEV, but I thought the "coherent" part was meant to account for this. It assumes we have some relatively widespread, fairly unambiguous preferences, which may be easier to see in the light of that tired old example, paperclipping the light cone. If CEV outputs a null utility function, that would seem to imply that human preferences are completely symmetrically distributed, which seems hard to believe.
2pjeby12yIf by "null utility function", you mean one that says, "don't DO anything", then do note that it would not require that we all have balanced preferences, depending on how you do the combination. A global utility function that creates more pleasure for me by creating pain for you would probably not be very useful. Heck, a function that creates pleasure for me by creating pain for me might not be useful. Pain and pleasure are not readily subtractable from each other on real human hardware, and when one is required to subtract them by forces outside one's individual control, there is an additional disutility incurred. These things being the case, a truly "Friendly" AI might well decide to limit itself to squashing unfriendly AIs and otherwise refusing to meddle in human affairs.
1loqi12yI wouldn't be particularly surprised by this outcome.

Your observation is interesting. Note that I can't write down my wave function, either, but that doesn't mean I don't have one.

3orthonormal12yAnd without being able to calculate it exactly, you can approximate it usefully, and thus derive some of its most relevant properties for practical purposes.
1DanielLC12yHe didn't say it did. It's just one possible reason for it.

For a thread entitled "Post Your Utility Function" remarkably few people have actually posted what they think their utility function is.

Are people naturally secretive about what they value? If so, why might that be?

Do people not know what their utility function is? That seems strange for such a basic issue.

Do people find their utility function hard to express? Why might that be?

0ShardPhoenix12yI assume that specifying your utility function is difficult for the same reasons that specifying the precise behaviour of any other aspect of your brain is difficult. If you're talking about a conscious, explicitly evaluated utility function then I doubt even most rationalists have such a thing.
0timtyler12yI wasn't expecting the ten significant figures. To be able to say what your utility function is, you have to be conscious of it. It has to be evaluated - or else it isn't your utility function. However, I am not sure that I know what you mean by "explicitly evaluated". In many cases, behaviour is produced unconsciously. The idea is more that a utility function should be consistent with most goal-oriented behaviour. If you claim that your goal in life is to convert people to Christianity, then you should show signs of actually trying to do that, as best you are able.
[-][anonymous]10y 0

Suppose individuals have several incommensurable utility functions: would this present a problem for decision theory? If you were presented with Newcomb's problem, but were at the same time worried about accepting money you didn't earn, would these sorts of considerations have to be incorporated into a single algorithm?

If not, how do we understand such ethical concerns as being involved in decisions? If so, how do we incorporate such concerns?

[-][anonymous]10y 0

I think I see some other purpose to thinking that you have a numerically well-defined utility function. It's a pet theory of mine, but here we go:

It pays off to do reasoning with the "mathematical" reasoning. This "mathematical" reasoning is the one that kicks in when I ask you what 67 + 49 is, it is the thing that kicks in when i say "if x < y and y < z is x < z?" Even putting your decision problem into just a vague algebraic structure will let you reason comparatively about them, even if you cannot for the life of y... (read more)

[This comment is no longer endorsed by its author]Reply

Maximize the # of 'people' who can do 'much' with their lives.

Some of the difficulty might be because the availability heuristic is causing us to focus on things which are relatively small factors in our global preferences, and ignore larger but more banal factors; e.g. being accepted within a social group, being treated with contempt, receiving positive and negative "strokes", demonstrating our moral superiority, etc.

Another problem is that although we seem to be finely attuned to small changes in social standing, as far as I know there have been no attempts to quantify this.

I vote for the first possibility - that utility functions are not a particularly good match for human preferences, for the following reasons: 1) I have never seen one, at least not one valid outside a very narrow subject matter. That implies that people are not good at drawing these functions, which may be caused by the fact that these functions could in reality be very complicated, if they even exist. So even if my preferences are consistent with some utility function, any practical application would apply some strongly simplified model of the function, which could diffe... (read more)

Because everyone wants more money,

Why do people keep saying things like this? Intuition suggests, and research confirms, that there's a major diminishing returns factor involved with money, and acquiring lots of it can actually make people unhappy.

I want more money only to a degree, then I wouldn't want more. My utility function does not assign a positive, set value to money.

4conchis12yI was with you on diminishing returns, but that doesn't contradict the original claim. I haven't seen reliable research suggesting that more money actively makes people unhappy (in a broad-ish, ceteris paribus sense, so that e.g. having to work more to get it doesn't count as "money making you unhappy"). Could you point me to what you're referring to?
0taw12yTheoretically you should be able to assign marginal values, like your cat getting sick is worth $X, good weather for your barbecue party is worth $Y, and so on - these being marginal values. As long as the numbers are pretty small, diminishing utilities shouldn't cause any problems.
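taw's point about small marginal values can be illustrated with any diminishing-returns utility of wealth. A sketch using logarithmic utility (the wealth and stake figures are invented for illustration):

```python
import math

# Logarithmic utility of wealth: a standard diminishing-returns model.
def u(wealth: float) -> float:
    return math.log(wealth)

base = 50_000.0   # assumed current wealth
small = 100.0     # a small stake

# Marginal utility of $100 at two nearby wealth levels:
m1 = u(base + small) - u(base)
m2 = u(base + 2 * small) - u(base + small)

# For small stakes the marginal value barely changes (< 1% here),
# so dollar-denominated weights are approximately well-defined.
print(abs(m1 - m2) / m1 < 0.01)  # True
```

Diminishing returns only start to distort the bookkeeping when the stakes become a noticeable fraction of total wealth.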

Do people care more about money's absolute value, or more about its relative value to what other people have? Does our utility function have a term for other people in it which is in conflict with other people's utility functions?

0taw12yEven without bothering about the rest of the world, just things that directly affect me, I totally failed, so I see no reason to think about such complications yet.