I'd like to suggest that the fact that human preferences can be decomposed into beliefs and values is one that deserves greater scrutiny and explanation. It seems intuitively obvious to us that rational preferences must decompose like that (even if not exactly into a probability distribution and a utility function), but it’s less obvious why.

The importance of this question comes from our tendency to see beliefs as being more objective than values. We think that beliefs, but not values, can be right or wrong, or at least that the notion of right and wrong applies to a greater degree to beliefs than to values. One dramatic illustration of this is in Eliezer Yudkowsky’s proposal of Coherent Extrapolated Volition, where an AI extrapolates the preferences of an ideal humanity, in part by replacing their “wrong” beliefs with “right” ones. On the other hand, the AI treats their values with much more respect.

Since beliefs and values seem to correspond roughly to the probability distribution and the utility function in expected utility theory, and expected utility theory is convenient to work with due to its mathematical simplicity and the fact that it’s been the subject of extensive studies, it seems useful as a first step to transform the question into “why can human decision making be approximated as expected utility maximization?”

I can see at least two parts to this question:

  • Why this mathematical structure?
  • Why this representation of the mathematical structure?

Not knowing how to answer these questions yet, I’ll just write a bit more about why I find them puzzling.

Why this mathematical structure?

It’s well known that expected utility maximization can be derived from a number of different sets of assumptions (the so-called axioms of rationality), but they all include the assumption of Independence in some form. Informally, Independence says that what you prefer to happen in one possible world doesn’t depend on what you think happens in other possible worlds. In other words, if you prefer A&C to B&C, then you must prefer A&D to B&D, where A and B are what happens in one possible world, and C and D are what happens in another.

This assumption is central to establishing the mathematical structure of expected utility maximization, where you value each possible world separately using the utility function, then take their weighted average. If your preferences were such that A&C > B&C but A&D < B&D, then you wouldn’t be able to do this.
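To make the last claim concrete, here is a minimal sketch in Python. It assumes just two possible worlds, with a weight p on the first and 1 - p on the second; the function names, the small grid of utilities, and the hand-built `violating` ranking are all illustrative assumptions, not anything from the axioms themselves. The search finds no expected-utility agent whose ranking of A against B flips with the other world, while the hand-built ranking does flip.

```python
from fractions import Fraction
from itertools import product

def expected_utility(world1, world2, p, utility):
    """Weighted average: world 1 has weight p, world 2 has weight 1 - p."""
    return p * utility[world1] + (1 - p) * utility[world2]

def ranking_flips(score):
    """True if the A-versus-B ranking depends on what happens in the other world."""
    with_c = score[("A", "C")] > score[("B", "C")]
    with_d = score[("A", "D")] > score[("B", "D")]
    return with_c != with_d

def some_eu_agent_flips():
    """Search a small grid of weights and utilities for a flipping expected-utility agent."""
    pairs = [("A", "C"), ("B", "C"), ("A", "D"), ("B", "D")]
    weights = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
    for p, values in product(weights, product(range(-2, 3), repeat=4)):
        utility = dict(zip("ABCD", values))
        score = {pair: expected_utility(*pair, p, utility) for pair in pairs}
        if ranking_flips(score):
            return True
    return False

# A hand-built ranking with A&C > B&C but A&D < B&D (hypothetical scores).
violating = {("A", "C"): 2, ("B", "C"): 1, ("A", "D"): 1, ("B", "D"): 2}

print(ranking_flips(violating))   # True
print(some_eu_agent_flips())      # False
```

The grid search is only a finite check. The general reason is that the difference between A&C and B&C is p times (u(A) - u(B)), which is the same as the difference between A&D and B&D, so the two comparisons always agree.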

It seems clear that our preferences do satisfy Independence, at least approximately. But why? (In this post I exclude indexical uncertainty from the discussion, because in that case I think Independence definitely doesn't apply.) One argument that Eliezer has made (in a somewhat different context) is that if our preferences didn’t satisfy Independence, then we would become money pumps. But that argument seems to assume agents who violate Independence, but try to use expected utility maximization anyway, in which case it wouldn’t be surprising that they behave inconsistently. In general, I think being a money pump requires having circular (i.e., intransitive) preferences, and it's quite possible to have transitive preferences that don't satisfy Independence (which is why Transitivity and Independence are listed as separate axioms in the axioms of rationality).

Why this representation?

Vladimir Nesov has pointed out that if a set of preferences can be represented by a probability function and a utility function, then it can also be represented by two probability functions. And furthermore we can “mix” these two probability functions together so that it’s no longer clear which one can be considered “beliefs” and which one “values”. So why do we have the particular representation of preferences that we do?
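One way to read this claim is sketched below in Python. The construction is my assumption about what is meant, not something spelled out here: take positive utilities, define a second measure proportional to probability times utility (labelled `shouldness` purely as a variable name), and rank events by the ratio of the two measures. The four worlds, their numbers, and the mixing coefficients are made up for illustration.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical prior and (positive) utility over four possible worlds.
prior = {"w1": F(1, 2), "w2": F(1, 4), "w3": F(1, 8), "w4": F(1, 8)}
utility = {"w1": F(1), "w2": F(4), "w3": F(2), "w4": F(8)}

# Second measure: probability times utility, normalized to sum to 1.
z = sum(prior[w] * utility[w] for w in prior)
shouldness = {w: prior[w] * utility[w] / z for w in prior}

def measure(m, event):
    """Total measure of an event (a tuple of worlds)."""
    return sum(m[w] for w in event)

def mix(p, q, b, c):
    """Replace (p, q) by two convex mixtures of them; needs b + c < 1."""
    p2 = {w: (1 - b) * p[w] + b * q[w] for w in p}
    q2 = {w: c * p[w] + (1 - c) * q[w] for w in p}
    return p2, q2

def ranking(p, q):
    """All nonempty events, sorted by the ratio q(event) / p(event)."""
    worlds = sorted(p)
    events = [e for n in range(1, len(worlds) + 1) for e in combinations(worlds, n)]
    return sorted(events, key=lambda e: measure(q, e) / measure(p, e))

mixed_prior, mixed_shouldness = mix(prior, shouldness, F(1, 3), F(1, 4))
same_order = ranking(prior, shouldness) == ranking(mixed_prior, mixed_shouldness)
print(same_order)  # True
```

In this sketch the ratio for an event is its conditional expected utility divided by z, and the mixed ratio is an increasing function of the original ratio whenever b + c < 1, so the ordering of events is unchanged even though neither mixed measure is recognizably "the beliefs" or "the values".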

Is it possible that the dichotomy between beliefs and values is just an accidental byproduct of our evolution, perhaps a consequence of the specific environment that we’re adapted to, instead of a common feature of all rational minds? Unlike the case with anticipation, I don’t claim that this is true or even likely here, but it seems to me that we don’t understand things well enough yet to say that it’s definitely false and why that's so.



Just to distance this very interesting question from expected utility maximization: "Beliefs" sound like they are about couldness, and values about shouldness. Couldness is about behavior of the environment outside the agent, and shouldness is about behavior of the agent. Of course, the two only really exist in interaction, but as systems they can be conceptualized separately. When an agent asks what it could do, the question is really about what effects in the environment could be achieved (some Tarskian hypocrisy here: using "could" to explain "couldness"). Beliefs are what's assumed, and values are what's asserted. In a decision tree, beliefs are associated with knowledge about other agents' possible actions, and values with the choice of the present agent's action. Both are aspects of the system, but playing different roles in the interaction: making a choice versus accepting a choice. Naturally, there is a duality here, when the sides are exchanged: my values become your beliefs, and my beliefs become your values. Choice of representation is not that interesting, as it's all interpretation: nothing changes in behavior.

2Wei_Dai12yI gave an example where choice of representation is important: Eliezer's CEV. If the choice of representation shouldn't be important, then that seems to be an argument against CEV.
1SilasBarta12yBullet acknowledged and bitten. A Friendly AI attempting to identify humanity's supposed CEV will also have to be a politician and have enough support so that they don't shut it down. As a politician, it will have to appeal to people with the standard biases. So it's not enough for it to say, "okay, here's something all of you should agree on as a value, and benefit from me moving humanity to that state". And in figuring out what would appeal to humans, it will have to model the same biases that blur the distinction.
0Vladimir_Nesov12yI was referring to you referring to my post on playing with utility/prior representations.

It seems clear that our preferences do satisfy Independence, at least approximately.

How big of a problem does this simple example signify?

  • A = I acquire a Nintendo
  • B = I acquire a Playstation
  • C = I acquire a game for the Nintendo
  • D = I acquire a game for the Playstation
  • A&C > B&C but A&D < B&D
3Wei_Dai12yYour example shows that we can't assign utilities to events within a single world, like acquiring game systems and games, and then add them up into a utility for that world, but it's not a counterexample to Independence, because of this part: Independence is necessary to assign utilities to possible world histories and aggregate those utilities linearly into expected utility. Consider the apples/oranges example [http://lesswrong.com/lw/1cd/why_the_beliefsvalues_dichotomy/16x1] again. There,

  • A = I get an apple in the world where coin is heads
  • B = I get an orange in the world where coin is heads
  • C = I get an apple in the world where coin is tails
  • D = I get an orange in the world where coin is tails

Then, according to Independence, my preferences must be either

  1. A&C > B&C and A&D > B&D, or
  2. A&C < B&C and A&D < B&D

If case 1, I should pick the transparent box with the apple, and if case 2, I should pick the transparent box with the orange. (I just realized that technically, my example is wrong, because in case 1, it's possible that A&D > A&C and B&D > B&C. Then, I should most prefer an opaque box that contains an apple if the coin is heads and an orange if the coin is tails, since that gives me outcome A&D, and least prefer an opaque box that contains the opposite (gives me B&C). So unless I introduce other assumptions, I can only derive that I shouldn't simultaneously prefer both kinds of opaque boxes to transparent boxes.)

I have a tentative answer for the second question of "Why this representation?". Given that a set of preferences can be represented as a probability function and a utility function, that seems computationally more convenient than using two probability functions, since then you only have to do half of the Bayesian updating.

Another part of this question is that such a set of preferences can usually be decomposed many different ways into probability and utility, so what explains the particular decomposition that we have? I think there should have be... (read more)

"Of all the axioms, independence is the most often discarded. A variety of generalized expected utility theories have arisen, most of which drop or relax the independence axiom."

0Douglas_Knight12yThe examples in the generalized expected utility [https://en.wikipedia.org/wiki/Generalized_expected_utility] link are descriptive theories of how humans are irrational money pumps. (The two bullet points after the sentence in wikipedia are examples of conventional utility functions; in that context the sentence is false.)
0timtyler12yI'm not sure what the bullet points are doing there either - but I don't really see how they impact the original statement.
0ShardPhoenix12yReminds me of the parallel postulate - non-Euclidean utility?

Paul Churchland calls the belief/values (he says belief/desires) model "folk psychology" and assigns a low probability to it "being smoothly reduced by neuroscience" rather than being completely disregarded like, say, the phlogiston theory of combustion. The paper is called Eliminative Materialism and the Propositional Attitudes and was printed in The Journal of Philosophy. I didn't find the paper all that convincing, but your mileage may vary.

This paper was cited along with another by someone (can't remember who) arguing that the bel... (read more)

This comment is directly about the question of probability and utility. The division is not so much about considering the two things separately, as it is about extracting tractable understanding of the whole human preference (prior+utility) into a well-defined mathematical object (prior), while leaving all the hard issues with elicitation of preference in the utility part. In practice it works like this: a human conceptualizes a problem so that a prior (that is described completely) can be fed to an automatic tool, then tool's conclusion about the aspect s... (read more)

2Wei_Dai12yBut why do human preferences exhibit the (approximate) independence which allows the extraction to take place?
2SilasBarta12ySimple. They don't. Maybe it's just me, but this looks like another case of overextrapolation from a community of rationalists to all of humanity. You think about all the conversations you've had distinguishing beliefs from values, and you figure everyone else must think that way. In reality, people don't normally make such a precise division. But don't take my word for it. Go up to your random mouthbreather and try to find out how well they adhere to a value/belief distinction. Ask them whether the utility assigned to an outcome, or its probability was a bigger factor. No one actually does those calculations consciously; if anything like it is done non-consciously, it's extremely economical in computation.
1Vladimir_Nesov12ySimple: the extraction cuts across preexisting independencies. (I don't quite see what you refer to by "extraction", but my answer seems general enough to cover most possibilities.)
1Wei_Dai12yI'm referring to the extraction that you were talking about: extracting human preference into prior and utility. Again, the question is why the necessary independence for this exists in the first place.
2Vladimir_Nesov12yI was talking about extraction of prior about a narrow situation as the simple extractable aspect of preference, period. Utility is just the rest, what remains unextractable in preference.
1Wei_Dai12yOk, I see. In that case, do you think there is still a puzzle to be solved, about why human preferences seem to have a large amount of independence (compared to, say, a set of randomly chosen transitive preferences), or not?
3Vladimir_Nesov12yThat's just a different puzzle. You are asking a question about properties of human preference now, not of prior/utility separation. I don't expect strict independence anywhere. Independence is indifference, due to inability to see and precisely evaluate all consequences, made strict in form of probability, by decree of maximum entropy. If you know your preference about an event, but no preference/understanding on the uniform elements it consists of, you are indifferent to these elements -- hence maximum entropy rule, air molecules in the room. Multiple events for which you only care in themselves, but not in the way they interact, are modeled as independent. Randomness is info, so of course the result will be more complex. Where you are indifferent, random choice will fill in the blanks.
3Wei_Dai12yIt sounds like what you're saying is that independence is a necessary consequence of our preferences having limited information. I had considered this possibility and don't think it's right, because I can give a set of preferences with little independence and also little information, just by choosing the preferences using a pseudorandom number generator. I think there is still a puzzle here, why our preferences show a very specific kind of structure (non-randomness).
3Vladimir_Nesov12yThat new preference of yours still can't distinguish the states of air molecules in the room, even if some of these states are made logically impossible by what's known about macro-objects. This shows both the source of dependence in precise preference and of independence in real-world approximations of preference. Independence remains where there's no computed info that allows to bring preference in contact with facts. Preference is defined procedurally in the mind, and its expression is limited by what can be procedurally figured out.
1Wei_Dai12yI don't really understand what you mean at this point. Take my apples/oranges example [http://lesswrong.com/lw/1cd/why_the_beliefsvalues_dichotomy/16x1], which seems to have nothing to do with macro vs. micro. The Axiom of Independence says I shouldn't choose the 3rd box. Can you tell me whether you think that's right, or wrong (meaning I can rationally choose the 3rd box), and why? To make that example clearer, let's say that the universe ends right after I eat the apple or orange, so there are no further consequences beyond that.
0timtyler12yTo make the example clearer, surely you would need to explain what the "" notation was supposed to mean.
1Wei_Dai12yIt's from this paragraph of http://lesswrong.com/lw/15m/towards_a_new_decision_theory/ [http://lesswrong.com/lw/15m/towards_a_new_decision_theory/] : In this case I'm assuming preferences for program executions that aren't independent of each other, so it falls into the "more generally" category.
0timtyler12yGot an example? You originally seemed to suggest that represented some set of preferences. Now you seem to be saying that it is a bunch of vectors representing possible universes on which some unspecified utility function might operate.

It's not the result of an "accidental" product of evolution that organisms are goal-directed and have values. Evolution made creatures that way for a reason - organisms that pursue their biological goals (without "updating" them) typically have more offspring and leave more descendants.

Mixing up your beliefs and values would be an enormous mistake - in the eyes of evolution. You might then "update" your values - trashing them in the process - a monumental disaster for your immortal coils.

2[anonymous]12ySince I'm often annoyed when my posts are downvoted without explanation, and I saw that this post was downvoted, I'll try to explain the downvotes. Updating of values happens all the time; it's called operant conditioning. If my dog barks and immediately is poked with a hot poker, its value of barking is updated. This is a useful adaptation, as being poked with a hot poker decreases fitness. If my dog tries to mate and immediately receives an electric shock, its value of mating is decreased. This is a harmful adaptation, as mating is a more fundamental fitness factor than electric shocks. So, you seem to be explaining an observation that is not observed using a fact that is not true.
2timtyler12yYour disagreement apparently arises through using the term "value" in a different sense from me. If it helps you to understand, I am talking about what are sometimes called "ultimate values". Most organisms don't update their values. They value the things evolution built into them - food, sex, warmth, freedom from pain, etc. Their values typically remain unchanged throughout their lives. From my perspective, the dog's values aren't changed in your example. The dog merely associates barking with pain. The belief that a bark is likely to be followed by a poker prod is a belief, not a value. The dog still values pain-avoidance - just as it always did. We actually have some theory that indicates that true values should change rarely. Organisms should protect their values - since changes to their values are seen as being very "bad" - in the context of the current values. Also, evolution wires in fitness-promoting values. These ideas help to explain why fixed values are actually extremely common.
3SilasBarta12yThose are good points, but I still find your argument problematic. First, do you know that dogs are capable of the abstract thought necessary to represent causality? You're saying that the dog has added the belief "bark causes pain", which combines with "pain bad". That may be how a programmer would try to represent it, since you can rely on the computational power necessary to sweep through the search space quickly and find the "pain bad" module every time a "reason to bark" comes up. But is it good as a biological model? It requires the dog to indefinitely keep a concept of a prod in memory. A simpler biological mechanism, consistent with the rest of neurobiology, would be to just lower the connection strengths that lead to the "barking" neuron so that it requires more activation of other "barking causes" to make it fire (and thus make the dog bark). I think that's a more reasonable model of how operant conditioning works in this context. This mechanism, in turn, is better described as lowering the "shouldness" of barking, which is ambiguous with respect to whether it's a value or belief.
1timtyler12yIt seems to be a common criticism of utility-based models that they do not map directly onto underlying biological hardware. That is true - but it is not what such models are for in the first place. Nobody thinks that if you slice open an animal you will find a utility function, and some representation of utility inside. The idea is more that you could build a functionally equivalent model which exhibited such an architecture - and then gain insight into the behaviour of the model by examining its utility function.
2SilasBarta12yI'm concerned with the weaker constraint that the model must conceptually map to the biological hardware, and in this respect the utility-based model you gave doesn't work. There is no distinction, even conceptual, between values and beliefs: just synaptic weights from the causes-of-barking nodes, to the bark node. Furthermore, the utility-based model does not give insight, because the "shortcuts" resulting from the neural hardware are fundamental to its operation. For example, the fact that it comes up with a quick, simple calculation affects how many options can be considered and therefore whether e.g. value transitivity will break down. So the utility-based model is more complex than a neural network, and with worse predictive power, so it doesn't let you claim that its change in behavior resulted from beliefs rather than values.
2timtyler12yValues are fixed, while many beliefs vary in response to sensory input. You don't seem to appreciate the value of a utility based analysis. Knowing that an animal likes food and sex, and doesn't like being hit provides all kinds of insights into its behaviour. Such an analysis is much simpler than a neural network is, and it has the advantage that we can actually build and use the model - rather than merely dream about doing so in the far future, when computers are big enough to handle it, and neuroscience has advanced sufficiently.
2SilasBarta12yThat's not a very fair comparison! You're looking at the most detailed version of a neural network (which I would reject as a model anyway for the very reason that it needs much more resources than real brains to work) and comparing it to a simple utility-based model, and then sneaking in your intuitions for the UBM, but not the neural network (as RobinZ noted). I could just as easily turn the tables and compare the second neural network here [http://lesswrong.com/lw/no/how_an_algorithm_feels_from_inside/] to a UDT-like utility-based model, where you have to compute your action in every possible scenario, no matter how improbable. Anyway, I was criticizing utility-based models, in which you weight the possible outcomes by their probability. That involves a lot more than the vague notion that an animal "likes food and sex". Of course, as you note, even knowing that it likes food and sex gives some insight. But it clearly breaks down here: the dog's decision to bark is made very quickly, and having to do an actual human-insight-free, algorithmic computation of expected utilities, involving estimates of their probabilities, takes way too long to be a realistic model. The shortcuts used in a neural network skew the dog's actions in predictable ways, showing them to be a better model, and showing the value/belief distinction to break down.
1timtyler12yI am still not very sympathetic to the idea that neural network models are simple. They include the utility function and all the creature's beliefs. A utility based model is useful - in part - since it abstracts those beliefs away. Plus neural network models are renowned for being opaque and incomprehensible. You seem to have some strange beliefs in this area. AFAICS, you can't make blanket statements like: neural-net models are more accurate. Both types of model can represent observed behaviour to any desired degree of precision.
1SilasBarta12yYou're using a narrower definition of neural network than I am. Again, refer to the last link I gave for an example of a simple neural network, which is equal to or less than the complexity of typical expected utility models. That NN is far from being opaque and incomprehensible, wouldn't you agree? No, they just have activation weights, which don't (afaict) distinguish between beliefs and values, or at least, don't distinguish between "barking causes a prod which is bad" and "barking isn't as good (or perhaps, as 'shouldish')". The UBMs discussed in this context (see TL post) necessarily include probability weightings, which are used to compute expected utility, which factors in the tradeoffs between probability of an event and its utility. So it's certainly not abstracting those beliefs away. Plus, you've spent the whole conversation explaining why your UBM of the dog allows you to classify the operant conditioning (of prodding the dog when it barks) as changing its beliefs and NOT its values. Do you remember that?
1RobinZ12yCorrect me if I'm wrong, but it's only simpler if you already have a general-purpose optimizer ready to hand - in this case, you.
0timtyler12yYou have to have complicated scientists around to construct any scientific model - be it utility-based or ANN. Since we have plenty of scientists around, I don't see much point in hypothesizing that there aren't any. You seem to be implying that the complexity of utility based models lies in those who invent or use them. That seems to be mostly wrong to me: it doesn't matter who invented them, and fairly simple computer programs can still use them.
0RobinZ12yIf you've seen it work, I'll take your word for it.
0timtyler12yIncidentally, I did not claim that dogs can perform abstract thinking - I'm not clear on where you are getting that idea from.
0SilasBarta12yYou said that the dog had a belief that a bark is always followed by a poker prod. This posits separate entities and a way that they interact, which looks to me like abstract thought.
0timtyler12yThe definition of "abstract thought" seems like a can of worms to me. I don't really see why I should go there.
2SilasBarta12yHm, I never before realized that operant conditioning is a blurring of the beliefs and values -- the new frequency of barking can be explained either by a change of the utility of barking, or by a change in the belief about what will result from the barking.
0timtyler12yIMO, "a blurring of beliefs and values" is an unhelpful way of looking at what happens. It is best to consider an agent as valuing freedom from pain, and the association between barking and poker prods to be one of its beliefs. If you have separated out values from beliefs in a way that leads to frequently updated values, all that means is that you have performed the abstraction incorrectly.
2AdeleneDawner12yOr the dog values not being in pain more than it values barking or mating...
-1timtyler12yBecause a comment is down-voted, that doesn't mean it is incorrect. This particular comment implicitly linked people's values to their reproductive success. People don't like to hear that they are robot vehicles built to propagate their genes. It offends their sense of self-worth. Their mental marketing department spends all day telling everyone what an altruistic and nice person they are - and they repeat it so many times that they come to believe it themselves. That way their message comes across with sincerity. So: the possibility of biology underlying their motives is a truth that they often want to bury - and place as far out of sight as possible.
0MichaelBishop12yWhile we can never escape our biology entirely, I dispute any suggestion that the selfish gene is always the best level of abstraction, or best model, for human behavior. I assume you agree even though that did not come across in this paragraph.
1timtyler12yHuman behaviour is often illuminated by the concept of memes. Humans are also influenced by the genes of their pathogens (or other manipulators). If you cough or sneeze, that behaviour is probably not occurring since it benefits you. Similarly with cancer or back pain - not everything is an adaptation.

This assumption is central to establishing the mathematical structure of expected utility maximization, where you value each possible world separately using the utility function, then take their weighted average. If your preferences were such that A&C > B&C but A&D < B&D, then you wouldn’t be able to do this.

I can imagine having preferences that don't value each possible world separately. I can also imagine doing other things to my utility function than maximising expectation. For example, if I maximised the top quartile of expecte... (read more)

[-][anonymous]12y 1

Here, have a mathematical perspective that conflates beliefs and values:

Suppose that some agent is given a choice between A and B. A is an apple. B is an N chance of a banana, otherwise nothing. The important thing here is the ambivalence equation: iff U(apple) = N*U(banana), the agent is ambivalent between the apple and the banana. Further suppose that N is 50%, and the agent likes bananas twice as much as it likes apples. In this case, at least, the agent might as well modify itself to believe that N is 20% and to like bananas five times as much as apple... (read more)
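The trade described above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the comment: the function names are hypothetical, the apple's utility is fixed at 1 as an assumed unit, and "ambivalent" is read as the two options having equal expected utility.

```python
from fractions import Fraction

def gamble_value(chance, u_banana):
    """Expected utility of an N chance of a banana, otherwise nothing (utility 0)."""
    return chance * u_banana

def choice(u_apple, chance, u_banana):
    """Which option an expected-utility maximizer picks."""
    value = gamble_value(chance, u_banana)
    if value > u_apple:
        return "gamble"
    if value < u_apple:
        return "apple"
    return "ambivalent"

# Original agent: N = 50%, bananas liked twice as much as apples.
original = choice(1, Fraction(1, 2), 2)
# Self-modified agent: believes N = 20%, likes bananas five times as much.
modified = choice(1, Fraction(1, 5), 5)

print(original, modified)  # ambivalent ambivalent
```

Only the product of the chance and the banana's utility enters the decision, so for this one choice any pair with the same product behaves identically.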

I think values (in a finite agent) also need to have some role in what beliefs "should" be stored/updated/remembered. Of course, in theories which don't constrain the agent's computational ability, this isn't needed.

I dispute your premise: what makes you so sure people do decompose their thoughts into beliefs and values, and find these to be natural, distinct categories? Consider the politics as mind-killer phenomenon. That can be expressed as, "People put your words into a broader context of whether they threaten their interests, and argue for or against your statements on that basis."

For example, consider the difficulty you will have communicating your position if you believe both a) global warming is unlikely to cause any significant problems in the bus... (read more)

1thomblake12yI'm confused about this. Consider these statements:

  A. "I believe that my shirt is red."
  B. "I value cheese."

Are you claiming that:

  1. People don't actually make statements like A
  2. People don't actually make statements like B
  3. A is expressing the same sort of fact about the world as B
  4. Statements like A and B aren't completely separate; that is, they can have something to do with one another.

If you strictly mean 1 or 2, I can construct a counterexample. 3 is indeed counterintuitive to me. 4 seems uncontroversial (the putative is/ought problem aside)
1SilasBarta12yIf I had to say, it would be a strong version of 4: in conceptspace, people naturally make groupings that put is- and ought-statements together. But looking back at the post, I definitely have quite a bit to clarify. When I refer to what humans do, I'm trying to look at the general case. Obviously, if you direct someone's attention to the issue of is/ought, then they can break down thoughts into values and beliefs without much training. However, in the absence of such a deliberate step, I do not think people normally make a distinction. I'm reminded of the explanation in pjeby's earlier piece [http://lesswrong.com/lw/59/spocks_dirty_little_secret/]: people instinctively put xml-tags of "good" or "bad" onto different things, blurring the distinction between "X is good" and "Y is a reason to deem X good". That is why we have to worry about the halo effect, where you disbelieve everything negative about something you value, even if such negatives are woefully insufficient to justify not valuing it. From the computational perspective, this can be viewed as a shortcut to having to methodically analyze all the positives and negatives of any course of action, and getting stuck thinking instead of acting. But if this is how the mind really works, it's not really reducible to a CSA, without severe stretching of the meaning.
1DanArmak12ySeconded. Sometimes I don't even feel I have fully separate beliefs and values. For instance, I'm often willing to change my beliefs to achieve my values (e.g., by believing something I have no evidence for, to become friends with other people who believe it - and yes, ungrounded beliefs can be adopted voluntarily to an extent.)
0SforSingularity12yI cannot do this, and I don't understand anyone who can. If you consciously say "OK, it would be really nice to believe X, now I am going to try really hard to start believing it despite the evidence against it", then you already disbelieve X.
1DanArmak12yI already disbelieve X, true, but I can change that. Of course it doesn't happen in a moment :-) Yes, you can't create that feeling of rational knowledge about X from nothing. But if you can retreat from rationality - to where most people live their lives - and if you repeat X often enough, and you have no strongly emotional reason not to believe X, and your family and peers and role models all profess X, and X behaves like a good in-group distinguishing mark - then I think you have a good chance of coming to believe X. The kind of belief associated with faith and sports team fandom. It's a little like the recent thread where someone, I forget who, described an ( edit: hypothetical) religious guy who when drunk confessed that he didn't really believe in god and was only acting religious for the social benefits. Then people argued that no "really" religious person would honestly say that, and other people argued that even if he said that what does it mean if he honestly denies it whenever he's sober? In the end I subscribe to the "PR consciousness" theory that says consciousness functions to create and project a self-image that we want others to believe in. We consciously believe many things about ourselves that are completely at odds with how we actually behave and the goals we actually seek. So it would be surprising if we couldn't invoke these mechanisms in at least some circumstances.
2Douglas_Knight12ygeneralizing from fictional evidence [http://lesswrong.com/lw/1b8/anticipation_vs_faith_at_what_cost_rationality/16dm]
1DanArmak12yWhen I wrote that I was aware that it was a fictional account deliberately made up to illustrate a point. I didn't mention that, though, so I created fictional evidence. Thanks for flagging this, and I should be more careful!
1RobinZ12yWorse: fictional evidence flagged as nonfictional -- like Alicorn's fictional MIT classmates that time.
3Alicorn12yMy what now? I think that was someone else. I don't think I've been associated with MIT till now. MIT not only didn't accept me when I applied, they didn't even reject me. I never heard back from them yea or nay at all.
2Scott Alexander12yThat was me [http://lesswrong.com/lw/13i/shut_up_and_guess/yh7]. Of course, irony being what it is, people will now flag the Alicorn - MIT reference as nonfictional, and be referring to Alicorn's MIT example for the rest of LW history :)
2RobinZ12yAttempting to analyze my own stupidity, I suspect my confusion came from (1) both Alicorn and Yvain being both high-karma contributors and (2) Alicorn's handle coming more readily to mind, both because (a) I interacted more with her and (b) the pronunciation of "Alicorn" being more obvious than that of "Yvain". In other words, I have no evidence that this was anything other than an ordinary mistake.
1Alicorn12yI've been imagining "Yvain" to be pronounced "ee-vane". I'd be interested in hearing a correction straight from the ee-vane's mouth if this is not right, though ;) I've heard people mispronounce "Alicorn" on multiple occasions.
1wedrifid12yYou mean Alicorn is a real name? I had assumed a combination of Alison and Unicorn, with symbolic implications beyond my ken. "Ye-vane" here, with the caveat that I was quite confident that it was way off.
2Alicorn12yNo, it's not a real name (as far as I know). It's a real word. It means a unicorn's horn, although there are some modern misuses mostly spearheaded by Piers Anthony (gag hack cough).
0wedrifid12yAhh. And I've been going about calling them well, unicorn horns all these years!
0RobinZ12yI've been saying "al-eh-corn" in my mental consciousness. Also "ee-vane", which suggests my problem being less "Yvain is hard to pronounce" than "Yvain doesn't look like the English I grew up speaking". Incidentally, I can't remember how to pronounce Eliezer. I saw him say it at the beginning of a Bloggingheads video and it was completely different from my naive reading.
2Alicorn12y"Alicorn" is pronounced just like "unicorn", except that the "yoon" is replaced with "al" as in "Albert" or "Alabama". So the I is an "ih", not an "eh", but you can get away with an undifferentiated schwa.
0RobinZ12yThanks! (I think that's how I was saying it, actually - I wasn't sure how to write the second syllable.)
1arundelo12yell-ee-EZZ-er (is how I hear it).
0RobinZ12y*checks* Yvain's fictional MIT classmates [http://lesswrong.com/lw/13i/shut_up_and_guess/]. I swear that wasn't on purpose.
0SilasBarta12yWhat's fictional about that? Ready to pony up money for a bet that I can't produce a warm body meeting that description?
0RobinZ12yI prefer not to gamble, but just to satisfy my own curiosity: what would the controls be on such a bet? Presumably you would have to prove to Knight's satisfaction that your unbelieving belief-signaler was legitimately thus.
0SilasBarta12yI think my evidence is strong enough I can trust Douglas_Knight's own intellectual integrity.
4Douglas_Knight12yHuh. My last couple of interactions with you, you called me a liar.
0SilasBarta12yOkay, I found what I think you're referring [http://lesswrong.com/lw/182/the_absentminded_driver/14em] to. Probably not my greatest moment here, but is that really something you want sympathy for? Here's the short version of what happened. You: If you think your comment was so important, don't leave it buried deep in the discussion, where nobody can see it. Me: But I also linked to it from a more visible place. Did you not know about that? You: [Ignoring previous mischaracterization] Well, that doesn't solve the problem of context. I clicked on it and couldn't understand it, and it seemed boring. Me: Wait, you claim to be interested in a solution, I post a link saying I have one, and it's too much of a bother to read previous comments for context? That doesn't make sense. Your previous comment implies you didn't know about the higher link. Don't dig yourself deeper by covering it up.
-1Douglas_Knight12yOh, yeah, I'd forgotten that one. Actually, I was thinking of the following week [http://lesswrong.com/lw/19d/the_anthropic_trilemma/14xp]. I just want you to go away. I was hoping that reminding you that you don't believe me would discourage you from talking to me.
-1SilasBarta12yThat's not calling you a liar. That's criticizing the merit of your argument. There's a difference.
0wedrifid12yThe link provided by Douglas seems to suggest that Douglas's accusation is false (as well as ineffective). ET:S/petty/ineffective/
0[anonymous]12yWould you mind elaborating on your take on that thread [http://lesswrong.com/lw/19d/the_anthropic_trilemma/14qg]? What's of most interest to me is what you think I meant [http://lesswrong.com/lw/19d/the_anthropic_trilemma/14xr], but I'm also interested in whether you'd say that Silas called Zack a liar.
-1SilasBarta12yLet's go back a few steps. You said that in your "last few interactions" with me, I called you a liar. You later clarified that you were thinking of this [http://lesswrong.com/lw/19d/the_anthropic_trilemma/14xp] discussion. But I didn't deny calling Zack a liar in that discussion; I denied calling you a liar. So why are you suddenly acting like your original claim was about whether I called Zack a liar? (In any case, it wasn't just "Zack, you liar". My remark was more like, "this is what you claimed, this is why it's implausible, this is why your comments are hindering the discussion, please stop making this so difficult by coming up with ever-more-convoluted stories.") Are you and Zack the same person? Considering that the earlier discussion was about whether you can arbitrarily redefine yourself as a different person, maybe Zack/Douglas are just taking the whole idea a little too seriously! :-P (And in a show of further irony, that would be just the kind of subtle point that Zack and [?] Douglas, severely overestimating its obviousness, were defending in the thread!)
0Douglas_Knight12yNo. I apologize to third parties for the poor timing of my deletion of the above comment. It was really addressed to wedrifid and broadcasting it was petty, though not as petty as the excerpt looks.
-1SilasBarta12yAlright, well, good luck "getting the goods" on ol' Silas! Just make sure not to get your claims mixed up again...
-1SilasBarta12yWell, what possessed you to lie to me? ;-) j/k, j/k, you're good, you're good. A link would be nice though. And I believe that, even taking into account any previous mistrust I might have had of you, I think my evidence is still strong enough that I can trust you consider it conclusive.

Maybe these are to do with differences across individuals. My beliefs/values may be mashed together and impossible to separate, but I expect other people's beliefs to mirror my own more closely than their values do.

Because it's much easier to use beliefs shorn of values as building blocks in a machine that does induction, inference, counterfactual reasoning, planning etc compared to belief-values that are all tied up together.

Sea slugs and Roombas don't have the beliefs/values separation because the extra complexity isn't worth it. Humans have it to some degree and rule the planet. AIs might have even more success.

[-][anonymous]12y 0

Some of this is going over my head, but...

I think you need to specify if you're talking about terminal values or instrumental values.

There's obviously a big difference between beliefs and terminal values. Beliefs are inputs to our decision-making processes, and terminal values may be as well. However, while beliefs are outputs of the belief-making processes whose inputs are our perceptions, terminal values aren't the output of any cognitive process, or they wouldn't be terminal.

As for instrumental values, well, yes, they are beliefs about the best values ...

I think I tried to solve a similar problem before: that of looking at the simplest possible stable control system and seeing how I can extract the system's "beliefs" and "values" that result in it remaining stable. Then, see if I can find a continuous change between the structure of that system, and a more complex system, like a human.

For example, consider the simple spring-mass-damper system. If you move it from its equilibrium position x_e, it will return. What do the concepts of "belief" and "value" map onto here...
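To make the spring-mass-damper example concrete, here is a minimal sketch of the system returning to its equilibrium position (called `x_e` in the code). The mass, stiffness, damping constant, and time step are hypothetical values chosen only for illustration; none of them come from the comment. The code integrates m·x'' = -k·(x - x_e) - c·x' with a semi-implicit Euler step.

```python
def simulate(x0, x_e=0.0, m=1.0, k=4.0, c=1.0, dt=0.001, steps=20000):
    """Integrate m*x'' = -k*(x - x_e) - c*x', starting at rest from x0.

    All constants are illustrative assumptions, not values from the thread.
    """
    x, v = x0, 0.0
    for _ in range(steps):
        a = (-k * (x - x_e) - c * v) / m  # spring restoring force plus damping
        v += a * dt                       # semi-implicit Euler: velocity first
        x += v * dt                       # then position, using the new velocity
    return x

# Displace the mass by one unit and let it settle for 20 simulated seconds.
final_position = simulate(x0=1.0)
print(abs(final_position) < 1e-3)
```

Whatever one decides "belief" and "value" mean here, the only candidate for a "value" in this sketch is the single parameter `x_e`, and the system carries no separate representation of the world that could be called a belief.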

Is it possible that the dichotomy between beliefs and values is just an accidental byproduct of our evolution, perhaps a consequence of the specific environment that we’re adapted to, instead of a common feature of all rational minds?

In the normal usage, "mind" implies the existence of a distinction between beliefs and values. In the LW/OB usage, it implies that the mind is connected to some actuators and sensors which connect to an environment and is actually doing some optimization toward those values. Certainly "rational mind" ent...

1Wei_Dai12yAn agent using UDT doesn't necessarily have a beliefs/values separation, but still has the properties of preferences and decision making. Or at least, it only has beliefs about mathematical facts, not about empirical facts. Maybe I should have made it clear that I was mainly talking about empirical beliefs in the post.
2timtyler12yHow, then, would you describe its representation of empirical information - if not as "beliefs"?
1Vladimir_Nesov12yNot quite true: state of knowledge corresponds to beliefs. It's values that don't update (but in expected utility maximization that's both utility and prior). Again, it's misleading to equate beliefs with prior and forget about the knowledge (event that conditions the current state).
1Wei_Dai12yYes, I agree we can interpret UDT as having its own dichotomy between beliefs and values, but the dividing line looks very different from how humans divide between beliefs and values, which seems closer to the probability/utility divide.
0SilasBarta12yUDT is invariant with respect to what universe it's actually in. This requires it to compute over infinite universes and thus have infinite computing power. It's not hard to see why it's going to break down as a model of in-universe, limited beings.
-1timtyler12yWhat do you mean? It has a utility function just like most other decision theories do. The preferences are represented by the utility function.
1SforSingularity12yI am behind on your recent work on UDT; this fact comes as a shock to me. Can you provide a link to a post of yours/provide an example here making clear that UDT doesn't necessarily have a beliefs/values separation? Thanks.
3Wei_Dai12ySuppose I offer you three boxes and ask you to choose one. The first two are transparent, free, and contain an apple and an orange, respectively. The third is opaque, costs a penny, and contains either an apple or an orange, depending on a coin flip I made. Under expected utility maximization, there is no reason for you to choose the third box, regardless of your probability function and utility function. Under UDT1, you can choose the third box, by preferring <apple, orange> to <apple, apple> and <orange, orange> as the outcomes of world programs P1 and P2. In that case, you can't be said to have a belief about whether the real world is P1 or P2.
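The three-box argument can be checked with a small sketch. The first function below encodes the expected-utility side: the coin-flip box is worth a probability-weighted average of the two fruit utilities minus the penny, so it can never beat the better of the two free boxes. The second function is only one possible reading of the UDT1 side, assuming that preferences are a ranking over pairs of outcomes, one per world program (P1, P2). All function names, the `likes_variety` scoring rule, and the numeric costs are hypothetical, and this is not UDT1's actual formalism.

```python
def best_box_under_eu(p_apple, u_apple, u_orange, penny_cost=0.01):
    """Box chosen by an expected utility maximiser with the given beliefs and values."""
    mixture = p_apple * u_apple + (1 - p_apple) * u_orange
    options = {
        "apple box": u_apple,
        "orange box": u_orange,
        "coin-flip box": mixture - penny_cost,  # an average, minus a cost
    }
    return max(options, key=options.get)

def udt1_style_choice(score, penny_cost=0.01):
    """Box chosen when preferences rank (outcome in P1, outcome in P2) pairs.

    Illustrative assumption: each box maps to a pair of outcomes and a cost.
    """
    choices = {
        "apple box": (("apple", "apple"), 0.0),
        "orange box": (("orange", "orange"), 0.0),
        "coin-flip box": (("apple", "orange"), penny_cost),
    }
    return max(choices, key=lambda box: score(choices[box][0]) - choices[box][1])

def likes_variety(pair):
    """Hypothetical preference: the two world programs should differ."""
    return 1.0 if pair[0] != pair[1] else 0.0

print(best_box_under_eu(p_apple=0.5, u_apple=1.0, u_orange=1.0))
print(udt1_style_choice(likes_variety))
```

The point of the second function is that nothing in it plays the role of a probability that the real world is P1 rather than P2; the preference is stated directly over the pair.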
1timtyler12yThis example seems unclear. Are you seriously claiming utility maximisation can't prefer a randomised outcome in an iterated situation? If so, you take this "independence" business much too far. Utility maximising agents can do things like prefer a diverse diet. They simply do not have to prefer either apples or oranges - thereby winding up with vitamin and mineral deficiencies. It is trivial to create a utility function which exhibits fruit preferences which depend on what you have eaten most recently.
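A minimal sketch of the kind of history-dependent utility function described in the comment above. The function names and the specific scoring (1 for a change of fruit, 0 for a repeat) are assumptions made for illustration.

```python
def utility(fruit, last_eaten):
    """Hypothetical utility: prefer whichever fruit was not eaten last."""
    return 0.0 if fruit == last_eaten else 1.0

def choose(last_eaten, options=("apple", "orange")):
    """Pick the option with the highest utility given recent history."""
    return max(options, key=lambda fruit: utility(fruit, last_eaten))

def diet(days, first="apple"):
    """Sequence of choices when each day's pick depends on the previous one."""
    eaten = [first]
    for _ in range(days - 1):
        eaten.append(choose(eaten[-1]))
    return eaten

print(diet(4))
```

Note that this agent gets a varied diet without any randomness, and its utility is defined over histories rather than over a single fruit.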
3pengvado12yRandomization only maximizes diversity if you have to make decisions under amnesia [http://lesswrong.com/lw/182/the_absentminded_driver/] or coordinate without communication [http://en.wikipedia.org/wiki/Exponential_backoff] or some similar perverse situation. In any normal case, you're better off choosing a deterministic sequence that's definitely diverse, rather than leaving it to randomness and only probably getting a diverse set of outcomes.
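The claim that a deterministic sequence is definitely diverse while a random one is only probably diverse can be illustrated with a short simulation. This is a sketch under assumed parameters (four meals, two fruits, a fair coin, a fixed seed); the helper names are hypothetical.

```python
import random

def distinct_fruits(sequence):
    """Number of different fruits that appear in a diet."""
    return len(set(sequence))

def deterministic_diet(n):
    """Alternate fruits, so both appear whenever n >= 2."""
    return ["apple" if i % 2 == 0 else "orange" for i in range(n)]

def random_diet(n, rng):
    """Pick each meal by an independent fair coin flip."""
    return [rng.choice(["apple", "orange"]) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n, trials = 4, 10000
monotone = sum(distinct_fruits(random_diet(n, rng)) == 1 for _ in range(trials))

print(distinct_fruits(deterministic_diet(n)))
print(monotone / trials)  # should be close to 2 * 0.5**n
```

With four meals the random diet comes out all-apple or all-orange about one time in eight, whereas the alternating diet never does.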
0timtyler12ySure - but that seems rather tangential to the main point here. The options were an apple, an orange - or a more expensive random choice. A random diet may not be perfect - but it was probably the best one on offer in the case of this example.
0RobinZ12yIf the agent already has a penny (which they must if they can afford to choose the third box), they could just flip the penny to decide which of the first two boxes to take and save themselves the money. Unless you're being a devil's advocate, I don't see any reason to justify a completely rational agent choosing the random box.
0timtyler12yWhat - never? Say they can only make the choice once - and their answer determines which box they will get on all future occasions.
0pengvado12yThen choice C isn't a random mixture of choice A and choice B. Preferring that there be randomness at a point where you otherwise wouldn't get a decision at all, is fine. What doesn't happen is preferring one coin-flip in place of one decision.
0RobinZ12yNot to be crass, but given the assumption that Wei_Dai is not saying something utterly asinine, does your interpretation of the hypothetical actually follow?
0timtyler12yHang on! My last comment was a reply to your question about when it could be rational to select the third box. I have already said that the original example was unclear. It certainly didn't suggest an infinite sequence - and I wasn't trying to suggest that. The example specified that choosing the third box was the correct answer - under the author's own proposed decision theory. Surely interpretations of what it was supposed to mean should bear that in mind.
1RobinZ12yI don't believe we're actually arguing about anything worth caring about. My understanding was that Wei_Dai was illustrating a problem with UDT1 - in which case a single scenario in which UDT1 gives an unambiguously wrong answer suffices. To disprove Wei_Dai's assertion requires demonstrating that no scenario of the kind proposed makes UDT1 give the wrong answer, not showing that not every scenario of the kind proposed makes UDT1 give the wrong answer.
1timtyler12yAre you sure you are taking the fact that he is UDT's inventor and biggest fan into account? He certainly didn't claim that he was illustrating a problem with UDT.
2RobinZ12y...you're right, I'm misreading. I'll shut up now.
0thomblake12yOkay, let's see if I have this straight - you're assuming: 1. the axiom of independence is necessary for expected utility theory 2. losing a penny represents some negative amount of utility 3. one's utility function can't include terms for "the outcomes of world programs" under expected utility theory
-1thomblake12yYou lost me. Is 'apples' supposed to be plural? Can you really not choose the third box regardless of utility function? What if you prefer things that came in opaque boxes?
1Wei_Dai12yIt's not supposed to be plural. Fixed. The opaque box was a way of framing the problem, and not part of the problem itself, which is supposed to be about your preferences for apples and oranges. I can specify the problem in terms of three identical buttons that you can press instead.
2timtyler12yThose buttons will probably not be absolutely identical - since they will be in different spatial positions relative to each other. So an agent operating under expected utility maximization might still prefer (say) pressing the right-most button. Real-world utility functions can literally prefer anything you can specify.
2AdeleneDawner12yI'm actually an example of this - where I don't otherwise care, I will pick the third option, or the option that's related to the number three in some way (preferably related to powers of three, but multiples of 9 are preferred over other multiples of three as well). If I didn't care very much about apples vs. oranges, I'd be fairly likely to pay a penny for the third box/button/whatever. I also know two people who feel similarly about the number 8. In tangentially related news, I'm sad that I'm turning 28 next month. Yes, I know I'm strange.
2cousin_it12yYou're not strange. (Sorry if that sounded offensive, I didn't mean to!) I'm likewise sad that I just turned 27. I was always the youngest in school and university, graduating with honors at 20. Getting closer to 30 now. "Where are you now golden boy, where is your famous golden touch?" Or this: "Tired of lying in the sunshine staying home to watch the rain, you are young and life is long and there is time to kill today."
2AdeleneDawner12yI'm not sad that I'm closer to 30. 30's cool, it's a multiple of three. I'm sad that the next time my age will be a power of three won't be 'till I'm 81.
4cousin_it12yI obviously can't read. Let that comment stand as a monument to my stupidity =)
1Eliezer Yudkowsky12yYou mean your second annual 27th birthday?
0AdeleneDawner12yCute idea, but I value signaling an interest in accuracy/not coming off like a loon over being associated with the number 3. The former actually affect things in my life.
0[anonymous]12yWhy is that strange?
1thomblake12yPlease do, if you think that would make the problem clearer. The piece I'm not seeing is where UDT1 lets you choose something that expected utility does not. Does expected utility usually not allow you to have states of the world in your utility function?
0SforSingularity12yOne possible response here: We could consider simple optimizers like amoeba or Roomba vacuum cleaners as falling into the category: "mind without a clear belief/values distinction"; they definitely do a lot of signal processing and feature extraction and control theory, but they don't really have values. The Roomba would happily sit with wheels lifted off the ground thinking that it was cleaning a nonexistent room.
2Matt_Simpson12yIsn't this just a case of the values the Roomba was designed to maximize being different from the values it actually maximizes? Consider the following: i.e. Roombas are program executers, not cleanliness maximizers. I suppose the counter is that humans don't have a clear belief/values distinction.
1timtyler12yThe purpose of a Roomba is to clean rooms. Clean rooms are what it behaves as though it "values" - whereas its "beliefs" would refer to things like whether it has just banged into a wall. There seems to be little problem in modelling the Roomba as an expected utility maximiser - though it is a rather trivial one.
5RichardKennaway12yThat is only true if understood to mean the purpose which the user of a Roomba is using it to achieve, or the purpose of its designers in designing it. It is not necessarily the Roomba's own purpose, the thing the Roomba itself is trying to achieve. To determine the Roomba's own purposes, one must examine its internal functioning, and discover what those purposes are; or, alternatively, to conduct the Test For The Controlled Variable. This is straightforward and unmysterious. I have a Roomba. My Roomba can tell if some part of the floor is unusually dirty (by an optical sensor in the dust intake, I believe), and give that area special attention until it is no longer filthy. Thus, it has a purpose of eliminating heavy dirt. However, beyond that it has no perception of whether the room is clean. It does not stop when the room is clean, but when it runs out of power or I turn it off. Since it has no perception of a clean room, it can have no intention of achieving a clean room. I have that intention when I use it. Its designers have the intention that I can use the Roomba to achieve my intention. But the Roomba does not have that intention. A Roomba with a more sensitive detector of dust pickup (and current models might have such a sensor -- mine is quite old) could indeed continue operation until the whole room was clean. The Roomba's physical sensors sense only a few properties of its immediate environment, but it would be able to synthesize from those a perception of the whole room being clean, in terms of time since last detection of dust pickup, and its algorithm for ensuring complete coverage of the accessible floor space. Such a Roomba would have cleaning the whole room as its purpose. My more primitive model does not. This is elementary stuff that people should know. [http://lesswrong.com/lw/dj/what_is_control_theory_and_why_do_you_need_to/] Little or large, you can't do it by handwaving like that. 
A model of a Roomba as a utility maximiser would (1) state
0timtyler12yYou seem engaged in pointless hair-splitting. The Roomba's designers wanted it to clean floors. It does clean floors. That is what it is for. That is its aim, its goal. It has sensors enough to allow it to attain that goal. It can't tell if a whole room is clean - but I never claimed it could do that. You don't need to have such sensors to be effective at cleaning rooms. As for me having to exhibit a whole model of a Roomba to illustrate that such a model could be built - that is crazy talk. You might as well argue that I have to exhibit a model of a suspension bridge to illustrate that such a model could be built. The utility maximiser framework can model the actions of any computable intelligent agent - including a Roomba. That is, so long as the utility function may be expressed in a Turing-complete language.
3RichardKennaway12yTo me, the distinction between a purposive machine's own purposes, and the purposes of its designers and users is something that it is essential to be clear about. It is very like the distinction between fitness-maximising and adaptation-executing. As a matter of fact, you would have to do just that (or build an actual one), had suspension bridges not already been built, and having already well-known principles of operation, allowing us to stand on the shoulders of those who first worked out the design. That is, you would have to show that the scheme of suspending the deck by hangers from cables strung between towers would actually do the job. Typically, using one of these [http://en.wikipedia.org/wiki/List_of_finite_element_software_packages] when it comes to the point of working out an actual design and predicting how it will respond to stresses. If you're not actually going to build it then a BOTE calculation may be enough to prove the concept. But there must be a technical explanation [http://yudkowsky.net/rational/technical] or it's just armchair verbalising. If this is a summary of something well-known, please point me to a web link. I am familiar with stuff like this [http://en.wikipedia.org/wiki/Utility] and see there no basis for this sweeping claim. The word "intelligent" in the above also needs clarifying. What is a Roomba's utility function? Or if a Roomba is too complicated, what is a room thermostat's utility function? Or is that an unintelligent agent and therefore outside the scope of your claim?
1timtyler12yBy all means distinguish between a machine's purpose, and that which its makers intended for it. Those ideas are linked, though. Designers want to give the intended purpose of intelligent machines to the machines themselves - so that they do what they were intended to.
-2timtyler12yAs I put it on: http://timtyler.org/expected_utility_maximisers/ "If the utility function is expressed as in a Turing-complete language, the framework represents a remarkably-general model of intelligent agents - one which is capable of representing any pattern of behavioural responses that can itself be represented computationally." If expections are not enforced, this can be seen by considering the I/O streams of an agent - and considering the utility function to be a function that computes the agent's motor outputs, given its state and sensory inputs. The possible motor outputs are ranked, assigned utilities - and then the action with the highest value is taken. That handles any computable relationship between inputs and outputs - and it's what I mean when I say that you can model a Roomba as a utility maximiser. The framework handles thermostats too. The utility function produces its motor outputs in response to its sensory inputs. With, say, a bimetallic strip, the function is fairly simple, since the output (deflection) is proportional to the input (temperature).
0RichardKennaway12yI really don't see how, Roombas or thermostats, so let's take the thermostat as it's simpler. What, precisely, is that utility function? You can tautologically describe any actor as maximising utility, just by defining the utility of whatever action it takes as 1 and the utility of everything else as zero. I don't see any less trivial ascription of a utility function to a thermostat. The thermostat simply turns the heating on and off (or up and down continuously) according to the temperature it senses. How do you read the computation of a utility function, and decision between alternative of differing utility, into that apparatus?
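The tautological construction described in this exchange (utility 1 for whatever the actor does, 0 for everything else) can be written down directly. This is a sketch of that construction only; the thermostat policy, its 20-degree setpoint, and all names are hypothetical stand-ins.

```python
ACTIONS = ("heat_on", "heat_off")

def thermostat_policy(temperature, setpoint=20.0):
    """A plain input-response rule: heat when below the (assumed) setpoint."""
    return "heat_on" if temperature < setpoint else "heat_off"

def tautological_utility(policy):
    """Build a utility function that scores the policy's own action 1, all others 0."""
    def utility(observation, action):
        return 1.0 if action == policy(observation) else 0.0
    return utility

def maximise(utility, observation, actions=ACTIONS):
    """Take the action with the highest utility for this observation."""
    return max(actions, key=lambda action: utility(observation, action))

u = tautological_utility(thermostat_policy)
for temperature in (15.0, 19.9, 20.0, 25.0):
    # The "maximiser" reproduces the original rule exactly, by construction.
    print(temperature, maximise(u, temperature), thermostat_policy(temperature))
```

The wrapper works for any computable policy, which supports the generality claim, but the utility function is defined in terms of the policy itself, so it says nothing that the policy did not already say. That is the sense in which the ascription is trivial.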
2timtyler12yThe Pythagorean theorem is "tautological" too - but that doesn't mean it is not useful. Decomposing an agent into its utility function and its beliefs tells you which part of the agent is fixed, and which part is subject to environmental influences. It lets you know which region the agent wants to steer the future towards. There's a good reason why humans are interested in people's motivations - they are genuinely useful for understanding another system's behaviour. The same idea illustrates why knowing a system's utility function is interesting.
1SilasBarta12yThat doesn't follow. The reason why we find it useful to know people's motivations is because they are capable of a very wide range of behavior. With such a wide range of behavior, we need a way to quickly narrow down the set of things we will expect them to do. Knowing that they're motivated to achieve result R, we can then look at just the set of actions or events that are capable of bringing about R. Given the huge set of things humans can do, this is a huge reduction in the search space. OTOH, if I want to predict the behavior of a thermostat, it does not help to know the utility function you have imputed to it, because this would not significantly reduce the search space compared to knowing its few pre-programmed actions. It can only do a few things in the first place, so I don't need to think in terms of "what are all the ways it can achieve R?" -- the thermostat's form already tells me that. Nevertheless, despite my criticism of this parallel, I think you have shed some light on when it is useful to describe a system in terms of a utility function, at least for me.
1wedrifid12ySee also [http://lesswrong.com/lw/v8/belief_in_intelligence/]
1RichardKennaway12yWhat's that, weak Bayesian evidence that tautological, epiphenomenal utility functions are useful? Supposing for the sake of argument that there even is any such thing as a utility function, both it and beliefs are subject to environmental influences. No part of any biological agent is fixed. As for man-made ones, they are constituted however they were designed, which may or may not include utility functions and beliefs. Show me this decomposition for a thermostat, which you keep on claiming has a utility function, but which you have still not exhibited. What you do changes who you are. Is your utility function the same as it was ten years ago? Twenty? Thirty? Yesterday? Before you were born?
-1timtyler12yThanks for your questions. However, this discussion seems to have grown too tedious and boring to continue - bye.
0RichardKennaway12yWell, quite. Starting from here [http://lesswrong.com/lw/1cd/why_the_beliefsvalues_dichotomy/16zn] the conversation went: "They exist." "Show me." "They exist." "Show me." "They exist." "Show me." "Kthxbye." It would have been more interesting if you had shown the utility functions that you claim these simple systems embody. At the moment they look like invisible dragons.
1DanArmak12yThis happens because the Roomba can only handle a limited range of circumstances correctly - and this is true for any mind. It doesn't indicate anything about the Roomba's beliefs or belief/value separation. For instance, animals are great reproduction maximizers. A sterilized dog will keep trying to mate. Presumably the dog is thinking it's reproducing (Edit: not consciously thinking, but that's the intended goal of the adaptation it's executing), but really it's just spinning its metaphorical wheels uselessly. How is the dog different from the Roomba? Would you claim the dog has no belief/value distinction?
1Cyan12yI hope you don't mean this literally.
3DanArmak12yOf course the dog's consciousness has no explicit concept of sex linked to reproduction. But the Roomba has no consciousness at all, so this comparison may be unfair to the dog. Here's a better example. I hire you to look for print errors in a copy of Britannica and email results daily. I promise a paycheck at the end of the month. However, I used a fake name and a throwaway email address; nobody sees your emails and I will never pay you or contact you again. You don't know this, so you work diligently. You have an explicit, conscious goal of correcting errors in Britannica, and a higher goal of earning money. But your hard work makes no progress towards these goals (the mistakes you find won't be fixed in future editions, as your emails are unread). You're just spinning your wheel uselessly like a Roomba up in the air. This isn't related to your or the Roomba's belief/value distinction or lack of it.
1SforSingularity12yThe difference between the Roomba spinning and you working for nothing is that if you told the Roomba that it was just spinning its wheels, it wouldn't react. It has no concept of "I am failing to achieve my goals". You, on the other hand, would investigate; prod your environment to check if it was actually as you thought, and eventually you would update your beliefs and change your behaviors.
2Alicorn12yRoombas do not speak English. If, however, you programmed the Roomba not to interpret the input it gets from being in midair as an example of being in a room it should clean, then its behavior would change.
1SforSingularity12ythen you would be building a beliefs/desires distinction into it.
0DanArmak12yWhy? How is this different from the Roomba recognizing a wall as a reason to stop going forward?
0SforSingularity12yClearly these are two different things; the real question you are asking is in what relevant way are they different, right? First of all, the Roomba does not "recognize" a wall as a reason to stop going forward. It gets some input from its front sensor, and then it turns to the right. So what is the relevant difference between the Roomba that gets some input from its front sensor and then turns to the right, and the superRoomba that gets evidence from its wheels that it is cleaning the room, but entertains the hypothesis that maybe someone has suspended it in the air, and goes and tests to see if this alternative (disturbing) hypothesis is true, for example by calculating what the inertial difference between being suspended and actually being on the floor would be? The difference is the difference between a simple input-response architecture, and an architecture where the mind actually has a model of the world, including itself as part of the model. SilasBarta notes below that the word "model" is playing too great a role in this comment for me to use it without defining it precisely. What does a Roomba not have that causes it to behave in that laughable way when you suspend it so that its wheels spin? What does the SuperRoomba that works out that it is being suspended by performing experiments involving its inertial sensor, and then hacks into your computer and blackmails you into letting it get back onto the floor to clean it (or even causes you to clean the floor yourself) have? Imagine a collection of tricks that you could play on the Roomba, ways of changing its environment outside of what the designers had in mind. The pressure that it applies to its environment (defined as the derivative of the final state of the environment with respect to how long you leave the Roomba on, for example) would then vary with which trick you play. For example if you replace its dirt-sucker with a black spray paint can, you end up with a black floor. 
If you p
1SilasBarta12yUh oh, are we going to have to go over the debate about [http://lesswrong.com/lw/12w/absolute_denial_for_atheists/xfm] what a model is [http://lesswrong.com/lw/ek/without_models/] again?
0SforSingularity12ySee heavily edited comment above, good point.
0DanArmak12yIn your description there's indeed a big difference. But I'm pretty sure Alicorn hadn't intended such a superRoomba. As I understood her comment, she imagined a betterRoomba with, say, an extra sensor measuring force applied to its wheels. When it's in the air, it gets input from the sensor saying 'no force', and the betterRoomba stops trying to move. This doesn't imply beliefs & desires.
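To make the contrast concrete, here is a minimal sketch of the two input-response designs described above: the plain Roomba, and a betterRoomba with one extra reflex. Everything in it is assumed for illustration (the `Sensors` fields, the action names, the `min_force` threshold); it is not a description of any real Roomba firmware. The point is only that the "stop when suspended" behaviour can be added as one more sensor-to-action rule, with no world model anywhere in the code.

```python
from dataclasses import dataclass


@dataclass
class Sensors:
    front_bumper: bool  # True when the front sensor reports an obstacle
    wheel_force: float  # hypothetical reading: resistance felt by the wheels


def roomba_step(s: Sensors) -> str:
    """Plain input-response rule: obstacle ahead -> turn right, else go forward."""
    return "turn_right" if s.front_bumper else "forward"


def better_roomba_step(s: Sensors, min_force: float = 0.1) -> str:
    """Same rule plus one extra reflex: no wheel resistance -> stop moving."""
    if s.wheel_force < min_force:
        return "stop"
    return roomba_step(s)


# A suspended robot: nothing in front of it, no resistance on the wheels.
suspended = Sensors(front_bumper=False, wheel_force=0.0)
plain_action = roomba_step(suspended)          # keeps driving in mid-air
better_action = better_roomba_step(suspended)  # stops, still without any model
```

Both functions are pure lookups from the current sensor reading to an action; neither stores hypotheses about the world or tests them, which is the sense in which the betterRoomba does not obviously need beliefs and desires.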
1SforSingularity12ySince we can imagine a continuous sequence of ever-better-Roombas, the notion of "has beliefs and values" seems to be a continuous one, rather than a discrete yes/no issue.
0SilasBarta12yBy the way, it seems like this exchange is re-treading my criticism [http://lesswrong.com/lw/174/decision_theory_why_we_need_to_reduce_could_would/12tf] of the concept [http://lesswrong.com/lw/16f/decision_theory_an_outline_of_some_upcoming_posts/12dm] of a could/should/would agent: Since everything, even pebbles, has a workable decomposition into coulds and shoulds, when are they "really" separable? What isn't a CSA?
0SforSingularity12yAs I said, This criterion seems to separate an "inanimate" object like a hydrogen atom or a pebble bouncing around the world from a superRoomba.
0SilasBarta12yOkay, so the criterion is the extent to which the mechanism screens off environment disturbances from the final result. You used this criterion interchangeably with the issue of whether: Does that have implication for self-awareness and consciousness?
2SforSingularity12yYes, I think so. One prominent hypothesis is that the reason that we evolved with consciousness is that there has to be some way for us to take an overview of the process of us, our goals, and the environment, and the way in which we think that our effort is producing achievement of goals. We need this so that we can do this whole "Am I failing to achieve my goals?" check. Why this results in "experience" is not something I am going to attempt to explain in this post.
0DanArmak12y(Edited & corrected) Here's a third example. Imagine an AI whose only supergoal is to gather information about something. It explicitly encodes this information, and everything else it knows, as a Bayesian network of beliefs. Its utility ultimately derives entirely from creating new (correct) beliefs. This AI's values and beliefs don't seem very separate to me. Every belief can be mapped to the value of having that belief. Values can be mapped to the belief(s) from whose creation or updating they derive. Every change in belief corresponds to a change in the AI's current utility, and vice versa. Given a subroutine fully implementing the AI's belief subsystem, the value system would be relatively simple, and vice versa. However, this doesn't imply the AI is in any sense simple or incapable of adaptation. Nor should it imply (though I'm no AI expert) that the AI is not a 'mind' or is not conscious. Similarly, while it's true that the Roomba doesn't have a belief/value separation, that's not related to the fact that it's a simple and stupid 'mind'.
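One way to picture the information-gathering AI described above is a toy agent whose utility is computed from nothing but its belief state. The sketch below is an assumption-laden illustration, not a proposal for how such an AI would be built: beliefs are a flat dictionary of probabilities rather than a Bayesian network, the `truth` table is a hypothetical stand-in for whatever makes a belief "correct", and the log score is just one convenient choice of measure.

```python
import math
from typing import Dict


def utility(beliefs: Dict[str, float], truth: Dict[str, bool]) -> float:
    """Log score of the belief state: higher when more probability sits on the truth.

    Probabilities are assumed to lie strictly between 0 and 1.
    """
    total = 0.0
    for prop, p in beliefs.items():
        p_on_truth = p if truth[prop] else 1.0 - p
        total += math.log(p_on_truth)
    return total


def update(beliefs: Dict[str, float], prop: str, new_p: float) -> Dict[str, float]:
    """Return a copy of the belief state with one probability changed."""
    updated = dict(beliefs)
    updated[prop] = new_p
    return updated


truth = {"sky_is_blue": True}      # hypothetical ground truth
before = {"sky_is_blue": 0.5}      # the agent starts out undecided
after = update(before, "sky_is_blue", 0.9)

gain = utility(after, truth) - utility(before, truth)  # positive: the belief improved
```

In this toy setup the value system is a few lines wrapped around the belief store, and any change to a probability shows up as a change in utility, which is the tight coupling between beliefs and values that the comment points at.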
0SforSingularity12yActually, I think I would. I think that pretty much all nonhuman animals also don't really have the belief/value distinction. I think that having a belief/values distinction requires being at least as sophisticated as a human. There are cases where a human sets a particular goal and then does things that are unpleasant in the short term (like working hard and not wasting all day commenting on blogs) in order to obtain a long-term valuable thing.
4timtyler12yDogs value food, warmth and sex. They believe it is night outside. Much the same as humans, IOW.
0DanArmak12yIn that case, why exactly do you think humans do have such a distinction? It's not enough to feel introspectively that the two are separate - we have lots of intuitive, introspective, objectively wrong feelings and perceptions. (Isn't there another bunch of comments dealing with this? I'll go look...) How do you define the relevant 'sophistication'? The ways in which one mind is "better" or smarter than another don't have a common ordering. There are ways in which human minds are less "sophisticated" than other minds - for instance, software programs are much better than me at memory, data organization and calculations.