Morality is not about willpower

Most people believe the way to lose weight is through willpower.  My own successful experience losing weight suggests this is not the case.  You will lose weight if you want to, meaning you effectively believe[0] that the utility you will gain from losing weight, even time-discounted, will outweigh the utility from yummy food now.  In LW terms, you will lose weight if your utility function tells you to.  This is the basis of cognitive behavioral therapy (the effective kind of therapy), which tries to change people's behavior by examining their beliefs and changing their thinking habits.

Similarly, most people believe behaving ethically is a matter of willpower; and I believe this even less.  Your ethics is part of your utility function.  Acting morally is, technically, a choice; but not the difficult kind that holds up a stop sign and says "Choose wisely!"  We notice difficult moral choices more than easy moral choices; but most moral choices are easy, like choosing a ten dollar bill over a five.  Immorality is not a continual temptation we must resist; it's just a kind of stupidity.

This post can be summarized as:

  1. Each normal human has an instinctive personal morality.
  2. This morality consists of inputs into that human's decision-making system.  There is no need to propose separate moral and selfish decision-making systems.
  3. Acknowledging that all decisions are made by a single decision-making system, and that the moral elements enter it in the same manner as other preferences, results in many changes to how we encourage social behavior.

Many people have commented that humans don't make decisions based on utility functions.  This is a surprising attitude to find on LessWrong, given that Eliezer has often cast rationality and moral reasoning in terms of computing expected utility.  It also demonstrates a misunderstanding of what utility functions are.  Values, and utility functions, are models we construct to explain why we do what we do.  You can construct a set of values and a utility function to fit your observed behavior, no matter how your brain produces that behavior.  You can fit this model to the data arbitrarily well by adding parameters.  It will always have some error, as you are running on stochastic hardware.  Behavior is not a product of the utility function; the utility function is a product of (and predictor of) the behavior.  If your behavior can't be modelled with values and a utility function, you shouldn't bother reading LessWrong, because "being less wrong" means behaving in a way that is closer to the predictions of some model of rationality.  If you are a mysterious black box with inscrutable motives that makes unpredictable actions, no one can say you are "wrong" about anything.
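
To make the "fit a utility function to behavior" point concrete, here is a minimal sketch (my illustration, not anything from the post): observed pairwise choices are treated as noisy comparisons under a linear utility over made-up features, and the weights are recovered by gradient ascent on the likelihood.  Adding features or interaction terms is the "adding parameters" step that lets the fit get arbitrarily good.

```python
# A minimal sketch (my illustration, nothing from the post): treat observed
# pairwise choices as noisy comparisons under a linear utility over made-up
# features, and recover the weights by gradient ascent on the likelihood.
import numpy as np

rng = np.random.default_rng(0)

# Each option is described by invented features, e.g. [tastiness, healthiness, price].
options = rng.normal(size=(50, 3))
true_w = np.array([1.0, 2.0, -1.0])              # hidden "values" generating the behavior
pairs = rng.integers(0, 50, size=(200, 2))       # observed choices between pairs of options
noise = rng.logistic(size=200)                   # the "stochastic hardware"
chose_first = (options[pairs[:, 0]] - options[pairs[:, 1]]) @ true_w + noise > 0

diff = options[pairs[:, 0]] - options[pairs[:, 1]]       # feature differences per choice
w = np.zeros(3)                                          # fitted utility weights
for _ in range(2000):
    p = 1 / (1 + np.exp(-diff @ w))                      # P(choose first | current weights)
    w += 0.5 * diff.T @ (chose_first - p) / len(pairs)   # ascend the log-likelihood

print("recovered weights:", w)                           # roughly proportional to true_w
```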

If you still insist that I shouldn't talk about utility functions, though - it doesn't matter!  This post is about morality, not about utility functions.  I use utility functions just as a way of saying "what you want to do".  Substitute your own model of behavior.  The bottom line here is that moral behavior is not a qualitatively separate type of behavior and does not require a separate model of behavior.

My view isn't new.  It derives from ancient Greek ethics, Nietzsche, Ayn Rand, B.F. Skinner, and comments on LessWrong.  I thought it was the dominant view on LW, but the comments and votes indicate it is held at best by a weak majority.

Relevant EY posts include "What would you do without morality?", "The gift we give to tomorrow", "Changing your meta-ethics", and "The meaning of right"; and particularly the statement, "Maybe that which you would do even if there were no morality, is your morality."  I was surprised that no comments mentioned any of the many points of contact between this post and Eliezer's longest sequence.  (Did anyone even read the entire meta-ethics sequence?)  The view I'm presenting is, as far as I can tell, the same as that given in EY's meta-ethics sequence up through "The meaning of right"[1]; but I am talking about what it is that people are doing when they act in a way we recognize as ethical, whereas Eliezer was talking about where people get their notions of what is ethical.

Ethics as willpower

Society's main story is that behaving morally means constantly making tough decisions and doing things you don't want to do.  You have desires; other people have other desires; and ethics is a referee that helps us mutually satisfy these desires, or at least not kill each other.  There is one true ethics; society tries to discover and encode it; and the moral choice is to follow that code.

This story has implications that usually go together:

  • Ethics is about when people's desires conflict.  Thus, ethics is only concerned with interpersonal relations.
  • There is a single, Platonic, correct ethical system for a given X. (X used to be a social class but not a context or society.  Nowadays it can be a society or context but not a social class.)
  • Your desires and feelings are anti-correlated with ethical behavior.  Humans are naturally unethical.  Being ethical is a continual, lifelong struggle.
  • The main purpose of ethics is to stop people from doing what they naturally want to do, so "thou shalt not" is more important than "thou shalt".
  • The key to being ethical is having the willpower not to follow your own utility function.
  • Social ethics are encouraged by teaching people to "be good", where "good" is the whole social ethical code.  Sometimes this is done without explaining what "good" is, since it is considered obvious, or perhaps more convenient to the priesthood to leave it unspecified. (Read the Koran for an extreme example.)
  • The key contrast is between "good" people who will do the moral thing, and "evil" people who do just the opposite.
  • Turning an evil person into a good person can be done by reasoning with them, teaching them willpower, or convincing them they will be punished for being evil.
  • Ethical judgements are different from utility judgements.  Utility is a tool of reason, and reason only tells you how to get what you want, whereas ethics tells you what you ought to want.  Therefore utilitarians are unethical.
  • Human society requires spiritual guidance and physical force to stop people from using reason to seek their own utility.
    • Religion is necessary even if it is false.
    • Reason must be strictly subordinated to spiritual authority.
    • Smart people are less moral than dumb people, because reason maximizes personal utility.
  • Since ethics are desirable, and yet contrary to human reason, they prove that human values transcend logic, biology, and the material world, and derive from a spiritual plane of existence.
  • If there is no God, and no spiritual world, then there is no such thing as good.
    • Sartre: "There can no longer be any good a priori, since there is no infinite and perfect consciousness to think it."
  • A person's ethicality is a single dimension, determined by the degree to which a person has willpower and subordinates their own utility to social utility.  Each person has a level of ethicality that is the same in all domains.  You can be a good person, an evil person, or somewhere in between - but that's it.  You should not expect someone who cheats at cards to be courageous in battle, unless they really enjoy battle.

People do choose whether to follow the ethics society promulgates.  And they must weigh their personal satisfaction against the satisfaction of others; and those weights are probably relatively constant across domains for a given person.  So there is some truth in the standard view.  I want to point out errors; but I mostly want to change the focus.  The standard view focuses on a person struggling to implement an ethical system, and obliterates distinctions between the ethics of that person, the ethics of society, and "true" ethics (whatever they may be).  I will call these "personal ethics", "social ethics", and "normative ethics" (although the last encompasses all of the usual meaning of "ethics", including meta-ethics).  I want to increase the emphasis on personal ethics, or ethical intuitions.  Mostly just to insist that they exist.  (A surprising number of people simultaneously claim to have strong moral feelings, and that people naturally have no moral feelings.)

The conventional story denies that the first two of these exist:  Ethics is what is good; society tries to figure out what is good; and a person is more or less ethical to the degree that they act in accordance with ethics.

The chief error of the standard view is that it explains ethics as a war between the physical and the spiritual.  If a person is struggling between doing the "selfish" thing and the "right" thing, that proves that they want both about equally.  The standard view instead supposes that they have a physical nature that wants only the "selfish" thing, and some internal or external spiritual force pulling them towards the "right" thing.  It isn't interested in the detailed differences between the ethical beliefs of different cultures, or different individuals.  It may quantify those differences, to show they are few, and thus "prove" that ethics are (and should be) universal.  Or it may average them together, to give a closer estimate of the one true ethics.  It thus hinders people from asking questions about the evolutionary stability of different ethical systems, or how society might use something analogous to a balance of power or a parliamentary system to produce order out of a variety of ethical systems.  The quest for perfection dismisses all natural ethics as unsatisfactory.  ("Natural law" is not the study of natural ethics, but the attempt to find the one true ethics in nature.)  And it emphasizes some ways of influencing a person's ethics to the exclusion of other ways.

You could recast this as the conscious mind taking the place of the spiritual nature, and the subconscious mind taking the place of the physical nature; willpower is then the exertion of control over the subconscious by the conscious.  (Suggested by my misinterpretation of Matt's comment.)  But to use that to defend the "ethics as willpower" view, you must assume that the subconscious usually wants to do immoral things, while the conscious mind is the source of morality.  And I have no evidence that my subconscious is less likely to propose moral actions than my conscious mind is.  My subconscious mind usually wants to be nice to people; and my conscious mind sometimes comes up with evil plans that my subconscious responds to with disgust.

... but being evil is harder than being good

At times, I've rationally convinced myself that I was being held back from my goals by my personal ethics, and I determined to act less ethically.  Sometimes I succeeded.  But more often, I did not.  Even when I did, I had to first build up a complex structure of rationalizations, and exert a lot of willpower to carry through.  I have never been able (or wanted) to say, "Now I will be evil" (by my personal ethics) and succeed.

If being good takes willpower, why does it take more willpower to be evil?

Ethics as innate

One theory that can explain why being evil is hard says that people are noble savages by birth, and would enact the true ethics if only their inclinations were not crushed by society.[2]  If you have friends who have raised their children by this theory, I probably need say no more.

A non-absolutist version of this theory would be almost the same as my theory; but my theory is informed by evolutionary theory, and human child-rearing is also part of our evolutionary history.  Our genes can't be guaranteed to be adaptive if that powerful environmental influence is removed.  This means that the "ethics as innate" theory has vastly different consequences for child-rearing.

Ethics as taste

Try this on instead:  Think of the intuitions underlying your personal morality as the same sort of thing as your personal taste in food.  It's hard to learn not to like ice cream.  But it's easy to learn to like Brussels sprouts, Lima beans, beer, cigarettes, coffee, Cabernet, or other "acquired tastes".  You can learn to like, or not to hate, some outcomes by acclimatizing yourself to them.

We may use the word "morality" only for interactions with other agents; but we can appreciate that it is just part of a broader underlying cognitive activity.  This activity is the cognitive end of taste, where you can't just try everything out and see what you like best; you need to stop and figure out in advance what actions will usually produce the most pleasing outcomes.  This often involves interactions with other people, because people are complicated.  But it can also involve other complex decisions, such as whether to increase or decrease taxes, pass or punt on the fourth down, or use emacs or vim.  It's not surprising that our emotions about these decisions often resemble our emotions about moral decisions.

The important advantage, to a rationalist, is that rationality and morals are no longer separate magisteria.  We don't need separate models of rational behavior and moral behavior, and a way of resolving conflicts between them.  If you are using utility functions, you only need one model; values of all types go in, and a single utility comes out.  (If you are not using utility functions, use whatever it is you use to predict human behavior.  The point is that you only need one of them.)  It's true that we have separate neural systems that respond to different classes of situation; but no one has ever protested against a utility-based theory of rationality by pointing out that there are separate neural systems responding to images and sounds, and so we must have separate image-values and sound-values and some way of resolving conflicts between image-utility and sound-utility.  The division of utility into moral values and all other values may even have a neural basis; but modelling that difference has, historically, caused much greater problems than it has solved.

The problem for this theory is:  If ethics is just taste, why are we nice to each other?  The answer comes from evolutionary theory.  Exactly how it does this is controversial, but it is no longer a deep mystery.  One plausible answer is that reproductive success is proportional to inclusive fitness.[3]  It is important to know how much of our moral intuition is innate, and how much is conditioned; but I have no strong opinion on this, other than that it is probably some of each.

This theory has different implications than the standard story:

  • Behaving morally feels good.
  • Social morals are encouraged by creating conditions that bring personal morals into line with social morals.
  • A person can have personal morals similar to society's in one domain, and very different in another domain.
  • A person learns their personal morals when they are young.
  • Being smarter enables you to be more ethical.
  • A person will come to feel that an action is ethical if it leads to something pleasant shortly after doing it, and unethical if it leads to displeasure.
  • A person can extinguish a moral intuition by violating it many times without consequences - whether they do this of their own free will, or under duress.
  • It may be easier to learn to enjoy new ethical behaviors (thou shalts), than to dislike enjoyable behaviors (thou shalt nots).
  • The key contrast is between "good" people who want to do the moral thing, and "bad" people who are apathetic about it.
  • Turning a (socially) bad person into a good person is done one behavior at a time.
  • Society can reason about what ethics they would like to encourage under current conditions.

As I said, this is nothing new.  The standard story makes concessions to it, as social conservatives believe morals should be taught to children using behaviorist principles ("Spare the rod and spoil the child").  This is the theory of ethics endorsed by "Walden Two" and warned against by "A Clockwork Orange".  And it is the theory of ethics so badly abused by the former Soviet Union, among other tyrannical governments.  More on this, hopefully, in a later post.

Does that mean I can have all the pie?

No.

Eliezer addressed something that sounds like the "ethics as taste" theory in his post "Is morality preference?", and rejected it.  However, the position he rejected was the straw-man position that acting to immediately gratify your desires is moral behavior.  (The position he ultimately promoted, in "The meaning of right", seems to be the same one I am promoting here:  that we have ethical intuitions because we evolved to judge as preferable those actions that maximized our inclusive fitness.)

Maximizing expected utility is not done by greedily grabbing everything within reach that has utility to you.  You may rationally leave your money in a 401K for 30 years, even though you don't know what you're going to do with it in 30 years and you do know that you'd really like a Maserati right now.  Wanting the Maserati does not make buying the Maserati rational.  Similarly, wanting all of the pie does not make taking all of the pie moral.

More importantly, I would never want all of the pie.  It would make me unhappy to make other people go hungry.  But what about people who really do want all of the pie?  I could argue that they refrain because they reason that taking all the pie would incur social penalties.  But that would result in morals that vanish when no one is looking.  And that's not the kind of morals normal people have.

Normal people don't calculate the penalties they will incur from taking all the pie.  Sociopaths do that.  Unlike the "ethics as willpower" theorists, I am not going to construct a theory of ethics that takes sociopaths as normal.[4]  They are diseased, and my theory of ethical behavior does not have to explain their behavior, any more than a theory of rationality has to explain the behavior of schizophrenics.  Now that we have a theory of evolution that can explain how altruism could evolve, we don't have to come up with a theory of ethics that assumes people are not altruistic.

Why would you want to change your utility function?

Many LWers will reason like this:  "I should never want to change my utility function.  Therefore, I have no interest in effective means of changing my tastes or my ethics."

Reasoning this way makes the distinction between ethics as willpower and ethics as taste less interesting.  In fact, it makes the study of ethics in general less interesting - there is little motivation other than to figure out what your ethics are, and to use ethics to manipulate others into optimizing your values.

You don't have to contemplate changing your utility function for this distinction to be somewhat interesting.  We are usually talking about society collectively deciding how to change each others' utility functions.  The standard LessWrongian view is compatible with this:  you treat ethics as a social game in which you should act deceptively, trying to foist your utility function on other people while avoiding having yours changed.

But I think we can contemplate changing our utility functions.  The short answer is that you may choose to change your future utility function when doing so will have the counter-intuitive effect of better-fulfilling your current utility function (as some humans do in one ending of Eliezer's story about babyeating aliens).  This can usually be described as a group of people all conspiring to choose utility functions that collectively solve prisoners' dilemmas, or (as in the case just cited) as a rational response to a threatened cost that your current utility function is likely to trigger.  (You might model this as a pre-commitment, like one-boxing, rather than as changing your utility function.  The results should be the same.  Consciously trying to change your behavior via pre-commitment, however, may be more difficult, and may be interpreted by others as deception and punished.)

(There are several longer, more frequently-applicable answers; but they require a separate post.)

Fuzzies and utilons

Eliezer's post, Purchase fuzzies and utilons separately, on the surface appears to say that you should not try to optimize your utility function, but that you should instead satisfy two separate utility functions:  a selfish utility function, and an altruistic utility function.

But remember what a utility function is.  It's a way of adding up all your different preferences and coming up with a single number.  Coming up with a single number is important, so that all possible outcomes can be ordered.  That's what you need, and ordering is what numbers do.  Having two utility functions is like having no utility function at all, because you don't have an ordering of preferences.
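
As a toy illustration of the ordering point (mine, not Eliezer's; the outcomes and weights are made up): two separate scores don't order the outcomes at all, and the moment you combine them into one number you are back to a single utility function.

```python
# A toy illustration (made-up outcomes and weights): one scalar utility totally
# orders outcomes; two incomparable scores do not, until you collapse them into
# one number again - at which point you have a single utility function.
outcomes = {
    "keep money":         {"selfish": 0.9, "altruistic": 0.1},
    "donate to charity":  {"selfish": 0.2, "altruistic": 0.8},
    "buy a friend lunch": {"selfish": 0.5, "altruistic": 0.5},
}

# sorted(outcomes, key=lambda o: outcomes[o])   # TypeError: two scores give no ordering

def utility(scores, w_selfish=0.4, w_altruistic=0.6):
    """Collapse the two scores into one number (the weights are invented)."""
    return w_selfish * scores["selfish"] + w_altruistic * scores["altruistic"]

ranked = sorted(outcomes, key=lambda o: utility(outcomes[o]), reverse=True)
print(ranked)   # a total ordering of outcomes, best first
```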

The "selfish utility function" and the "altruistic utility function" are different natural categories of human preferences.  Eliezer is getting indirectly at the fact that the altruistic utility function (which gives output in "fuzzies") is indexical.  That is, its values have the word "I" in them.  The altruistic utility function cares whether you help an old lady across the street, or some person you hired in Portland helps an old lady across the street.  If you aren't aware of this, you may say, "It is more cost-effective to hire boy scouts (who work for less than minimum wage) to help old ladies across the street and achieve my goal of old ladies having been helped across the street."  But your real utility function prefers that you helped them across the street; and so this doesn't work.

Conclusion

The old religious view of ethics as supernatural and contrary to human nature is dysfunctional and based on false assumptions.  Many religious people claim that evolutionary theory leads to the destruction of ethics, by teaching us that we are "just" animals.  But ironically, it is evolutionary theory that provides us with the understanding we need to build ethical societies.  Now that we have this explanation, the "ethics as taste" theory deserves to be evaluated again, to see whether it isn't more sensible and more productive than the "ethics as willpower" theory.

 

[0]  I use the phrase "effectively believe" to mean both having a belief, and having habits of thought that cause you to also believe the logical consequences of that belief.

[1]  We have disagreements, such as the possibility of dividing values into terminal and instrumental, the relation of the values of the mind to the values of its organism, and whether having a value implies that propagating that value is also a value of yours (I say no).  But they don't come into play here.

[2]  Contrary to popular opinion, this was not Rousseau's theory of human nature.  (It may be a bastard popularization of Rousseau's theory that eventually killed and supplanted its parent.)

[3]  For more details, see Eliezer's meta-ethics sequence.

[4]  Also, I do not take Gandhi as morally normal.  Not all brains develop as their genes planned; and we should expect as many humans to be pathologically good as are pathologically evil.  (A biographical comparison between Gandhi and Hitler shows a remarkable number of similarities.)

Comments

So, when I agonize over whether to torrent an expensive album instead of paying for it, and about half the time I end up torrenting it and feeling bad, and about half the time I pay for it but don't enjoy doing so ... what exactly am I doing in the latter case if not employing willpower?

I mean, I know willpower probably isn't a real thing on the deepest levels of the brain, but it's fake in the same way centrifugal force is fake, not in the way Bigfoot is fake. It sure feels like I'm using willpower when I make moral decisions about pirating, and I don't understand how your model above interprets that.

Granted, there are many other moral decisions I make that don't require willpower and do conform to your model above, and if I had to choose, black-and-white, between ethics-as-willpower and ethics-as-choice I'd take the latter; your model just doesn't seem complete.

My interpretation of the post in this case is: it's not that you're not employing willpower; it's that you're not employing personal morality.  So, while TORRENT vs. BUY fits into the societal ethics view, it does not fit into your personal morality.

From the personal morality perspective, the bad feeling you get is the thing you need willpower to fight against/suppress. You probably also need willpower to fight against/suppress the bad feeling you might be getting from buying the album. These need not be mutually exclusive. Personal morality can be both against torrenting and against spending money unduly.

Let me rephrase my objection, then.

I feel a certain sense of mental struggle when considering whether to torrent music.  I don't feel this same sense of mental struggle when considering whether or not to murder or steal or cheat.  Although both of these are situations that call on my personal morality, the torrenting situation seems to be an interesting special case.

We need a word to define the way in which the torrenting situation is a special case and not just another case where I don't murder or steal or cheat because I'm not that kind of person. The majority of the English-speaking world seems to use "willpower". As far as I know there's no other definition of willpower, where we could say "Oh, that's real willpower, this torrenting thing is something else." If we didn't have the word "willpower", we'd have to make up a different word, like "conscious-alignment in mental struggle" or something.

So why not use the word "willpower" here?

Suppose that you have one extra ticket to the Grand Galloping Gala, and you have several friends who each want it desperately. You can give it to only one of them. Doesn't the agonizing over that decision feel a lot like the agonizing over whether to buy or torrent? Yet we don't think of that as involving willpower.

At the risk of totally reducing this to unsupportable subjective intuitions...no, the two decisions wouldn't feel the same at all.

I can think of some cases in which it would feel similar. If one of the ticket-seekers was my best friend whom I'd known forever, and another was a girl I was trying to impress, and I had to decide between loyalty to my best friend or personal gain from impressing the girl. Or if one of the ticket-seekers had an incurable disease and this was her last chance to enjoy herself, and the other was a much better friend and much more fun to be around. But both of these are, in some way, moral issues.

In the simple ticket-seeker case without any of these complications, there would be a tough decision, but it would be symmetrical: there would be certain reasons for choosing Friend A, and certain others for choosing Friend B, and I could just decide between them. In the torrenting case, and the complicated ticket-seeker cases, it feels asymmetrical: like I have a better nature tending toward one side, and temptation tending toward the other side. This asymmetry seems to be the uniting factor behind my feeling of needing "willpower" for some decisions.

Mm.

So, OK, to establish some context first: one (ridiculously oversimplified) way of modeling situations like this is to say that in both cases I have two valuation functions, F1 and F2, which give different results when comparing the expected value of the two choices, because they weight the relevant factors differently (for example, the relative merits of being aligned with my better nature and giving in to temptation, or the relative merits of choosing friend A and friend B).  In the first (simple) case the two functions are well-integrated, and I can therefore easily calculate the weighted average of them; in the second (complicated) case the two functions are poorly integrated, and averaging their results is therefore more difficult.  So by the time the results become available to consciousness in the first case, I've already made the decision, and I feel like I can "just decide"; whereas in the second case, I haven't yet, and therefore feel like I have a difficult decision to make.  The difference is really one of how aware I am of the decision.  (There are a lot of situations like this, where an operation that can be performed without conscious monitoring "feels easy.")

So. In both cases the decision is "asymmetrical," in that F1 and F2 are different functions, but in the torrenting case (and the complicated ticket case), the difference between F1 and F2 is associated with a moral judgment (leading to words like "better nature" and "temptation"). Which feels very significant, because we're wired to attribute significance to moral judgments.

I wonder how far one could get by treating it like any other decision procedure, though. For example, if I decide explicitly that I weight "giving in to temptation" and "following my better nature" with a ratio of 1:3, and I flip coins accordingly to determine whether to torrent or not (and adjust my weighting over time if I'm not liking the overall results)... do I still need so much "willpower"?
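
A sketch of the procedure being proposed (the 1:3 weighting comes from the comment; the adjustment rule and everything else is my guess at one possible implementation):

```python
# A sketch of the proposed procedure.  The 1:3 weighting is from the comment;
# the adjustment rule and the rest are my guess at one possible implementation.
import random

weights = {"give in to temptation": 1, "follow better nature": 3}

def decide():
    total = sum(weights.values())
    if random.uniform(0, total) < weights["give in to temptation"]:
        return "torrent"
    return "buy"

def adjust(liked_overall_results):
    """Nudge the ratio over time if you dislike how things are going."""
    if not liked_overall_results:
        weights["follow better nature"] += 1

print([decide() for _ in range(10)])   # roughly 1 torrent for every 3 buys
```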

I love the idea of the coin-flip diet. Although it can be gamed by proposing to eat things more often. Maybe you could roll a 6-sided die for each meal. 1 = oatmeal and prune juice, 2-3 = lentil soup, 4-5 = teriyaki chicken, 6 = Big Mac or ice cream.

If you know the weight, and you have a way of sorting the things you would flip a coin for, you can use the sorting order instead. For instance, I typically buy rather than torrent if the artist is in the bottom half of artists sorted by income.

I diet more or less this way. Not a coinflip, but a distribution that seems sustainable in the long term. Lentil soup twice a week, Big Mac and ice cream once a week, so to speak.

Or, if I wanted to choose between a car with good gas mileage and one with good performance, that could seem moral. Or if I were choosing between a food high in sugar, or one high in protein. Or one high in potassium, or one high in calcium.

What's an example of an amoral choice?

Choosing between two cars with equally good gas mileage and performance, one which has more trunk space and one which has a roof rack.

It all depends on why you decide to torrent/not torrent:

Are you more likely to torrent if the album is very expensive, or if it is very cheap? If you expect it to be of high quality, or of low quality? If the store you could buy the album at is far away, or very close? If you like the band that made it, or if you don't like them? Longer albums or shorter? Would you torrent less if the punishment for doing so was increased? Would you torrent more if it was harder to get caught? What if you were much richer, or much poorer?

I'm confident that if you were to analyze when you torrent vs. when you buy, you'd notice trends that, with a bit of effort, could be translated into a fairly reasonable "Will I Torrent or Buy?" function that predicts whether you'll torrent or not with much better accuracy than random.
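
For illustration only, such a "Will I Torrent or Buy?" function might end up looking something like this once the trends were extracted; every feature and weight here is invented, and the real ones would have to come from analyzing one's own history:

```python
# An invented example of what an extracted "Will I Torrent or Buy?" function
# could look like.  Every feature and weight is made up for illustration; the
# real ones would come from analyzing one's own torrent/buy history.
def will_i_torrent(album):
    score = 0.0
    score += 0.8 * (album["price"] > 15)        # expensive albums get torrented more
    score += 0.5 * (album["store_km"] > 10)     # inconvenient to buy in person
    score -= 1.0 * album["like_the_band"]       # more inclined to support bands I like
    score -= 0.7 * album["rich_this_month"]     # money isn't tight right now
    return score > 0

album = {"price": 20, "store_km": 2, "like_the_band": True, "rich_this_month": False}
print("torrent" if will_i_torrent(album) else "buy")
```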

I'm confident that if you were to analyze when you torrent vs. when you buy, you'd notice trends that, with a bit of effort, could be translated into a fairly reasonable "Will I Torrent or Buy?" function that predicts whether you'll torrent or not with much better accuracy than random.

Yes, but the function might well include terms for things like how rude Yvain's co-workers were to Yvain that day, what mood Yvain was in that day, and whether Yvain was hungry at the moment - i.e., stuff a reasonably behaved utility function shouldn't have terms for, but which the outcome of a willpower-based struggle very well might.

I'm sure that's true, but what relevance does that have to the current discussion?

IAWYC, but...

Many people have commented that humans don't make decisions based on utility functions. This is a surprising attitude to find on LessWrong, given that Eliezer has often cast rationality and moral reasoning in terms of computing expected utility. It also demonstrates a misunderstanding of what utility functions are.

The issue is not that people wouldn't understand what utility functions are.  Yes, you can define arbitrarily complicated utility functions to represent all of a human's preferences, we know that.  There's an infinite number of valid methods by which you could model a human's preferences, utility functions being one of them.  The question is which model is the most useful, and which models have the fewest underlying assumptions that will lead your intuitions astray.  Utility functions are sometimes an appropriate model and sometimes not.

To expand on this...

Tim van Gelder has an elegant, although somewhat lengthy, example of this. He presents us with a problem engineers working with early steam engines had: how to translate the oscillating action of the steam piston into the rotating motion of a flywheel?

(Note: it's going to take a while before the relationship between this and utility functions becomes clear. Bear with me.)

High-quality spinning and weaving required, however, that the source of power be highly uniform, that is, there should be little or no variation in the speed of revolution of the main driving flywheel. This is a problem, since the speed of the flywheel is affected both by the pressure of the steam from the boilers, and by the total workload being placed on the engine, and these are constantly fluctuating.

It was clear enough how the speed of the flywheel had to be regulated. In the pipe carrying steam from the boiler to the piston there was a throttle valve. The pressure in the piston, and so the speed of the wheel, could be adjusted by turning this valve. To keep engine speed uniform, the throttle valve would have to be turned, at just the right time and by just the right amount, to cope with changes in boiler pressure and workload. How was this to be done?

An obvious-to-us approach would be to break down the task into a number of subproblems. We might think of a device capable of solving this problem as implementing the following algorithm:

  1. Measure the speed of the flywheel.
  2. Compare the actual speed against the desired speed.
  3. If there is no discrepancy, return to step 1. Otherwise,
    a. measure the current steam pressure;
    b. calculate the desired alteration in steam pressure;
    c. calculate the necessary throttle valve adjustment.
  4. Make the throttle valve adjustment.
    Return to step 1.

There must be some physical device capable of actually carrying out each of these subtasks, and so we can think of the governor as incorporating a tachometer (for measuring the speed of the wheel); a device for calculating the speed discrepancy; a steam pressure meter; a device for calculating the throttle valve adjustment; a throttle valve adjuster; and some kind of central executive to handle sequencing of operations.
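
As a rough sketch of the algorithm just described (every function named below is a stand-in for one of the physical devices listed above - tachometer, pressure meter, valve adjuster - not a real API, and the gains are made up):

```python
# A rough sketch of the control loop described above.  Every function passed in
# is a placeholder for one of the physical devices just listed (tachometer,
# steam pressure meter, throttle valve adjuster); none of them is a real API,
# and the numerical gains are made up.
def computational_governor(desired_speed, read_flywheel_speed,
                           read_steam_pressure, adjust_throttle_valve):
    while True:
        speed = read_flywheel_speed()                       # 1. measure flywheel speed
        discrepancy = desired_speed - speed                 # 2. compare with desired speed
        if discrepancy == 0:
            continue                                        # 3. no discrepancy: back to step 1
        pressure = read_steam_pressure()                    # 3a. measure current steam pressure
        desired_change = 0.1 * discrepancy                  # 3b. desired alteration in pressure
        valve_delta = desired_change / max(pressure, 1e-6)  # 3c. necessary valve adjustment
        adjust_throttle_valve(valve_delta)                  # 4. make the adjustment, loop again
```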

Let's call this approach the computational governor. As obvious as this approach might seem, it was not the way by which the problem was actually solved. After all, they didn't have computers back then. Instead, what they came up with was a centrifugal governor.

It consisted of a vertical spindle geared into the main flywheel so that it rotated at a speed directly dependent upon that of the flywheel itself (see figure 1). Attached to the spindle by hinges were two arms, and on the end of each arm was a metal ball. As the spindle turned, centrifugal force drove the balls outward and hence upward. By a clever arrangement, this arm motion was linked directly to the throttle valve. The result was that as the speed of the main wheel increased, the arms raised, closing the valve and restricting the flow of steam; as the speed decreased, the arms fell, opening the valve and allowing more steam to flow. The engine adopted a constant speed, maintained with extraordinary swiftness and smoothness in the presence of large fluctuations in pressure and load.

Now let's compare the computational governor and the centrifugal governor. Are they two entirely different ways of achieving the same goal, or are they on some level fundamentally the same?

One of the defining characteristics in the computational governor is that it relies on representations of its environment: it has a symbolic representation for the current speed of the flywheel, a symbolic representation for the current steam pressure, and so on. It manipulates these representations, and the results of those manipulations tell it whether to adjust the throttle valve or not. Does the centrifugal governor represent aspects of the environment in a similar fashion?

There is a common and initially quite attractive intuition to the effect that the angle at which the arms are swinging is a representation of the current speed of the engine, and it is because the arms are related in this way to engine speed that the governor is able to control that speed. This intuition is misleading, however; arm angle and engine speed are of course intimately related, but the relationship is not representational. [...]

A noteworthy fact about standard explanations of how the centrifugal governor works is, however, that they never talk about representations. This was true for the informal description given above, which apparently suffices for most readers; more importantly, it has been true of the much more detailed descriptions offered by those who have actually been in the business of constructing centrifugal governors or analyzing their behavior. Thus, for example, a mechanics manual for construction of governors from the middle of last century, Maxwell's original dynamical analysis (see below), and contemporary mathematical treatments all describe the arm angle and its role in the operation of the governor in nonrepresentational terms. The reason, one might reasonably conclude, is that the governor contains no representations. [...]

The temptation to treat the arm angle as a representation comes from the informal observation that there is some kind of correlation between arm angle and engine speed; when the engine rotates at a certain speed, the arms will swing at a given angle. [...]

For a start, notice that the correlation at issue only obtains when the total system has reached its stable equilibrium point, and is immediately disturbed whenever there is some sudden change in, for example, the workload on the engine. At such times, the speed of the engine quickly drops for a short period, while the angle of the arms adjusts only at the relatively slow pace dictated by gravitational acceleration. Yet, even as the arms are falling, more steam is entering the piston, and hence the device is already working; indeed, these are exactly the times when it is most crucial that the governor work effectively. Consequently, no simple correlation between arm angle and engine speed can be the basis of the operation of the governor.

[...] For notice that, because the arms are directly linked to the throttle valve, the angle of the arms is at all times determining the amount of steam entering the piston, and hence at all times the speed of the engine depends in some interesting way on the angle of the arms. Thus, arm angle and engine speed are at all times both determined by, and determining, each other's behavior. As we shall see below, there is nothing mysterious about this relationship; it is quite amenable to mathematical description. Yet it is much more subtle and complex than the standard concept of representation can handle, even when construed as broadly as is done here. In order to describe the relationship between arm angle and engine speed, we need a more powerful conceptual framework than mere talk of representations. That framework is the mathematical language of dynamics, and in that language, the two quantities are said to be coupled. The real problem with describing the governor as a representational device, then, is that the relation of representing - something standing in for some other state of affairs - is too simple to capture the actual interaction between the governor and the engine.

(comment split into two, since LW says it's too long)

(part two)

van Gelder holds that an algorithmic approach is simply unsuitable for understanding the centrifugal governor. It just doesn't work, and there's no reason to even try. To understand the behavior of the centrifugal governor, the appropriate tools are differential equations that describe its behavior as a dynamic system where the properties of various parts depend on each other.

Changing a parameter of a dynamical system changes its total dynamics (that is, the way its state variables change their values depending on their current values, across the full range of values they may take). Thus, any change in engine speed, no matter how small, changes not the state of the governor directly, but rather the way the state of the governor changes, and any change in arm angle changes the way the state of the engine changes. Again, however, the overall system (coupled engine and governor) settles quickly into a point attractor, that is, engine speed and arm angle remain constant.
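
For reference, the dynamical description being alluded to is roughly of this form (a schematic rendering of the Watt-governor equation Maxwell analyzed, as presented in van Gelder's paper; not an exact quotation):

\[
\ddot{\theta} = (n\omega)^2 \cos\theta \,\sin\theta - \frac{g}{l}\sin\theta - r\,\dot{\theta},
\qquad
\dot{\omega} = F(\theta, \text{load})
\]

where θ is the arm angle, ω the engine speed, n a gearing constant, g gravity, l the arm length, and r a friction coefficient; the second equation simply stands in for however the throttle opening (a function of θ) and the current load determine engine speed.  The point is that θ and ω each appear in the other's equation of motion - they are coupled.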

Now we finally get into utility functions. van Gelder holds that all the various utility theories, no matter how complex, remain subject to specific drawbacks:

(1) They do not incorporate any account of the underlying motivations that give rise to the utility that an object or outcome holds at a given time.

(2) They conceive of the utilities themselves as static values, and can offer no good account of how and why they might change over time, and why preferences are often inconsistent and inconstant.

(3) They offer no serious account of the deliberation process, with its attendant vacillations, inconsistencies, and distress; and they have nothing to say about the relationships that have been uncovered between time spent deliberating and the choices eventually made.

Curiously, these drawbacks appear to have a common theme; they all concern, one way or another, temporal aspects of decision making. It is worth asking whether they arise because of some deep structural feature inherent in the whole framework which conceptualizes decision-making behavior in terms of calculating expected utilities.

Notice that utility-theory based accounts of human decision making ("utility theories") are deeply akin to the computational solution to the governing task. That is, if we take such accounts as not just describing the outcome of decision-making behavior, but also as a guide to the structures and processes that underlie such behavior, then there are basic structural similarities to the computational governor. Thus, utility theories are straightforwardly computational; they are based on static representations of options, utilities, probabilities, and so on, and processing is the algorithmically specifiable internal manipulation of these representations to obtain a final representation of the choice to be made. Consequently, utility theories are strictly sequential; they presuppose some initial temporal stage at which the relevant information about options, likelihoods, and so on, is acquired; a second stage in which expected utilities are calculated; and a third stage at which the choice is effected in actual behavior. And, like the computational governor, they are essentially atemporal; there are no inherent constraints on the timing of the various internal operations with respect to each other or change in the environment.

What we have, in other words, is a model of human cognition which, on one hand, instantiates the same deep structure as the computational governor, and on the other, seems structurally incapable of accounting for certain essentially temporal dimensions of decision-making behavior. At this stage, we might ask: What kind of model of decision-making behavior we would get if, rather, we took the centrifugal governor as a prototype? It would be a model with a relatively small number of continuous variables influencing each other in real time. It would be governed by nonlinear differential equations. And it would be a model in which the agent and the choice environment, like the governor and the engine, are tightly interlocked.

It would, in short, be rather like the motivational oscillatory theory (MOT) modeling framework described by mathematical psychologist James Townsend. MOT enables modeling of various qualitative properties of the kind of cyclical behaviors that occur when circumstances offer the possibility of satiation of desires arising from more or less permanent motivations; an obvious example is regular eating in response to recurrent natural hunger. It is built around the idea that in such situations, your underlying motivation, transitory desires with regard to the object, distance from the object, and consumption of it are continuously evolving and affecting each other in real time; for example, if your desire for food is high and you are far from it, you will move toward it (that is, z changes), which influences your satiation and so your desire. The framework thus includes variables for the current state of motivation, satiation, preference, and action (movement), and a set of differential equations describe how these variables change over time as a function of the current state of the system.

Kaj may be too humble to self-link the relevant top level post, so I'll do it for him.

I actually didn't link to it, because I felt that those comments ended up conveying the same point better than the post did.

van Gelder holds that an algorithmic approach is simply unsuitable for understanding the centrifugal governor. It just doesn't work, and there's no reason to even try. To understand the behavior of the centrifugal governor, the appropriate tools are differential equations that describe its behavior as a dynamic system where the properties of various parts depend on each other.

A set of differential equations that describe its behavior as a dynamic system where the properties of various parts depend on each other, would still be an algorithm. van Gelder appears not to have heard of universal computation.

(1) They do not incorporate any account of the underlying motivations that give rise to the utility that an object or outcome holds at a given time.

I would say that the selection and representation of values is exactly this account.

(2) They conceive of the utilities themselves as static values, and can offer no good account of how and why they might change over time

False.

and why preferences are often inconsistent and inconstant.

Perceived preferences are often inconsistent and inconstant. So you try to find underlying preferences.

(3) They offer no serious account of the deliberation process, with its attendant vacillations, inconsistencies, and distress; and they have nothing to say about the relationships that have been uncovered between time spent deliberating and the choices eventually made.

Also false.  The utility function itself is precisely a model of the deliberation process.  It isn't going to be an equation that fits on a single line.  And it is going to have some computational complexity, which will determine the relationship between time spent deliberating and the choice eventually made.

I hope - because this is the most charitable interpretation I can make - that all these people complaining about utility functions are just forgetting that the term uses the word "function".  Not "arithmetic function", or "regular expression".  Any computable function.  If an output can't be modelled with a utility function, it is non-computable.  If humans can't be modelled with utility functions, that is a proof that a computer program can't be intelligent.  I'm not concerned with whether this is a good model.  I just want to be able to say, theoretically, that the question of what a human should do in response to a situation is something that can be said to have right answers and wrong answers, given that human's values/preferences/morals.

All this harping about whether utility functions can model humans is not very relevant to my post. I bring up utility functions only to communicate, to a LW audience, that you are only doing what you want to do when you behave morally. If you have some other meaningful way of stating this - of saying what it means to "do what you want to do" - by all means do so!

(If you want to work with meta-ethics, and ask why some things are right and some things are wrong, you do have to work with utility functions, if you believe anything like the account in the meta-ethics sequence; for the same reason that evolution needs to talk about fitness.  If you just want to talk about what humans do - which is what I'm doing here - you don't have to talk about utility functions unless you want to be able to evaluate whether a particular human is behaving morally or immorally.  To make such a judgement, you have to have an algorithm that computes a judgement on an action in a situation.  And that algorithm computes a utility function.)

A set of differential equations that describe its behavior as a dynamic system where the properties of various parts depend on each other, would still be an algorithm.

Sure, but that's not the sense of "algorithm" that was being used here.

If an output can't be modelled with a utility function, it is non-computable. If humans can't be modelled with utility functions, that is a proof that a computer program can't be intelligent. I'm not concerned with whether this is a good model. I just want to be able to say, theoretically, that the question of what a human should do in response to a situation is something that can be said to have right answers and wrong answers, given that human's values/preferences/morals.

None of this is being questioned. You said that you're not concerned with whether this is a good model, and that's fine, but whether or not it is a good model was the whole point of my comment. Neither I nor van Gelder claimed that utility functions couldn't be used as models in principle.

All this harping about whether utility functions can model humans is not very relevant to my post. I bring up utility functions only to communicate, to a LW audience, that you are only doing what you want to do when you behave morally.

My comments did not question the conclusions of your post (which I agreed with and upvoted).  I was only addressing the particular paragraph which I quoted in my initial comment.  (I should probably have mentioned that IAWYC in that one. I'll edit that in now.)

Sorry.  I'm getting very touchy about references to utility functions now.  When I write a post, I want to feel like I'm discussing a topic.  On this post, I feel like I'm trying to compile C++ code and the comments are syntax error messages.  I'm pretty much worn out on the subject for now, and probably getting sloppy, even though the post could still use a lot of clarification.

No problem - I could have expressed myself more clearly, as well.

Take it positively: if people only mostly nitpick on your utility function bit, then that implies that they agree with the rest of what you wrote. I didn't have much disagreement with the actual content of your post, either.

You can construct a set of values and a utility function to fit your observed behavior, no matter how your brain produces that behavior.

I'm deeply hesitant to jump into a debate that I don't know the history of, but...

Isn't it pretty generally understood that this is not true? The Utility Theory folks showed that behavior of an agent can be captured by a numerical utility function iff the agent's preferences conform to certain axioms, and Allais and others have shown that human behavior emphatically does not.

Seems to me that if human behavior were in general able to be captured by a utility function, we wouldn't need this website. We'd be making the best choices we could, given the information we had, to maximize our utility, by definition. In other words, "instrumental rationality" would be easy and automatic for everyone. It's not, and it seems to me a big part of what we can do to become more rational is try and wrestle our decision-making algorithms around until the choices they make are captured by some utility function. In the meantime, the fact that we're puzzled by things like moral dilemmas looks like a symptom of irrationality.

The Utility Theory folks showed that behavior of an agent can be captured by a numerical utility function iff the agent's preferences conform to certain axioms, and Allais and others have shown that human behavior emphatically does not.

A person's behavior can always be understood as optimizing a utility function; it's just that if they are irrational (as in the Allais paradox) the utility functions start to look ridiculously complex. If all else fails, a utility function can be used that has a strong dependency on time in whatever way is required to match the observed behavior of the subject. "The subject had a strong preference for sneezing at 3:15:03pm October 8, 2011."
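
For concreteness, a sketch of the degenerate construction being described - what the thread below calls a "Texas Sharpshooter Utility Function" (my illustration only):

```python
# A sketch of the degenerate construction: record what the subject actually did
# at each moment, then define a "utility function" that assigns 1 to exactly
# that action and 0 to everything else.  It fits the observed behavior
# perfectly and predicts nothing.
observed = {"2011-10-08 15:15:03": "sneeze",
            "2011-10-08 15:15:04": "scratch nose"}

def texas_sharpshooter_utility(time, action):
    return 1.0 if observed.get(time) == action else 0.0

print(texas_sharpshooter_utility("2011-10-08 15:15:03", "sneeze"))      # 1.0
print(texas_sharpshooter_utility("2011-10-08 15:15:03", "solve FAI"))   # 0.0
```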

From the point of view of someone who wants to get FAI to work, the important question is, if the FAI does obey the axioms required by utility theory, and you don't obey those axioms for any simple utility function, are you better off if:

  • the FAI ascribes to you some mixture of possible complex utility functions and helps you to achieve that, or

  • the FAI uses a better explanation of your behavior, perhaps one of those alternative theories listed in the wikipedia article, and helps you to achieve some component of that explanation?

I don't understand the alternative theories well enough to know if the latter option even makes sense.

A person's behavior can always be understood as optimizing a utility function; it's just that if they are irrational (as in the Allais paradox) the utility functions start to look ridiculously complex. If all else fails, a utility function can be used that has a strong dependency on time in whatever way is required to match the observed behavior of the subject. "The subject had a strong preference for sneezing at 3:15:03pm October 8, 2011."

This is the Texas Sharpshooter fallacy again. Labelling what a system does with 1 and what it does not with 0 tells you nothing about the system. It makes no predictions. It does not constrain expectation in any way. It is woo.

Woo need not look like talk of chakras and crystals and angels. It can just as easily be dressed in the clothes of science and mathematics.

This is the Texas Sharpshooter fallacy again. Labelling what a system does with 1 and what it does not with 0 tells you nothing about the system.

You say "again", but in the cited link it's called the "Texas Sharpshooter Utility Function". The word "fallacy" does not appear. If you're going to claim there's a fallacy here, you should support that statement. Where's the fallacy?

It makes no predictions. It does not constrain expectation in any way. It is woo.

The original claim was that human behavior does not conform to optimizing a utility function, and I offered the trivial counterexample. You're talking like you disagree with me, but you aren't actually doing so.

If the only goal is to predict human behavior, you can probably do it better without using a utility function. If the goal is to help someone get what they want, so far as I can tell you have to model them as though they want something, and unless there's something relevant in that Wikipedia article about the Allais paradox that I don't understand yet, that requires modeling them as though they have a utility function.

You'll surely want a prior distribution over utility functions. Since they are computable functions, the usual Universal Prior works fine here, so far as I can tell. With this prior, TSUF-like utility functions aren't going to dominate the set of utility functions consistent with the person's behavior, but mentioning them makes it obvious that the set is not empty.

You'll surely want a prior distribution over utility functions. Since they are computable functions, the usual Universal Prior works fine here, so far as I can tell. With this prior, TSUF-like utility functions aren't going to dominate the set of utility functions consistent with the person's behavior

How do you know this? If that's true, it can only be true by being a mathematical theorem, which will require defining mathematically what makes a UF a TSUF. I expect this is possible, but I'll have to think about it.

With [the universal] prior, TSUF-like utility functions aren't going to dominate the set of utility functions consistent with the person's behavior

How do you know this? If that's true, it can only be true by being a mathematical theorem...

No, it's true in the same sense that the statement "I have hands" is true. That is, it's an informal empirical statement about the world. People can be vaguely understood as having purposeful behavior. When you put them in strange situations, this breaks down a bit, and if you wish to understand them as having purposeful behavior you have to contrive the utility function a bit, but for the most part people do things for a comprehensible purpose. If TSUFs were the simplest utility functions that described humans, then human behavior would be random, which it isn't. Thus the simplest utility functions that describe humans aren't going to be TSUF-like.
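
As a toy illustration of that simplicity intuition (the candidate hypotheses and their description lengths in bits are invented, and the real universal prior is of course uncomputable):

    # Weight candidate utility-function hypotheses by description length,
    # as a crude stand-in for the universal prior.  Candidates and bit
    # counts are invented for illustration.

    candidates = {
        "simple purposeful model (food, comfort, status)": 50,
        "richer model with occasionally conflicting wants": 200,
        "TSUF: memorize every observed (time, action) pair": 1000000,
    }

    weights = {name: 2.0 ** (-bits) for name, bits in candidates.items()}
    total = sum(weights.values())
    posterior = {name: w / total for name, w in weights.items()}

    # The TSUF-like hypothesis gets negligible weight; the simple purposeful
    # models dominate, matching the informal claim above.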

You say "again", but in the cited link it's called the "Texas Sharpshooter Utility Function". The word "fallacy" does not appear. If you're going to claim there's a fallacy here, you should support that statement. Where's the fallacy?

I was referring to the same fallacy in both cases. Perhaps I should have written out TSUF in full this time. The fallacy is the one I just described: attaching a utility function post hoc to what the system does and does not do.

The original claim was that human behavior does not conform to optimizing a utility function, and I offered the trivial counterexample. You're talking like you disagree with me, but you aren't actually doing so.

I am disagreeing, by saying that the triviality of the counterexample is so great as to vitiate it entirely. The TSUF is not a utility function. One might as well say that a rock has a utility of 1 for just lying there and 0 for leaping into the air.

If the goal is to help someone get what they want, so far as I can tell you have to model them as though they want something

You have to model them as if they want many things, some of them being from time to time in conflict with each other. The reason for this is that they do want many things, some of them being from time to time in conflict with each other. Members of LessWrong regularly make personal posts on such matters, generally under the heading of "akrasia", so it's not as if I was proposing here some strange new idea of human nature. The problem of dealing with such conflicts is a regular topic here. And yet there is still a (not universal but pervasive) assumption that acting according to a utility function is the pinnacle of rational behaviour. Responding to that conundrum with TSUFs is pretty much isomorphic to the parable of the Heartstone.

I know the von Neumann-Morgenstern theorem on utility functions, but since it begins by assuming a total preference ordering on states of the world, it would be begging the question to cite it in support of human utility functions.

The fallacy is the one I just described: attaching a utility function post hoc to what the system does and does not do.

A fallacy is a false statement. (Not all false statements are fallacies; a fallacy must also be plausible enough that someone is at risk of being deceived by it, but that doesn't matter here.) "Attaching a utility function post hoc to what the system does and does not do" is an activity. It is not a statement, so it cannot be false, and it cannot be a fallacy. You'll have to try again if you want to make sense here.

The TSUF is not a utility function.

It is a function that maps world-states to utilities, so it is a utility function. You'll have to try again if you want to make sense here too.

We're nearly at the point where it's not worth my while to listen to you because you don't speak carefully enough. Can you do something to improve, please? Perhaps get a friend to review your posts, or write things one day and reread them the next before posting, or simply make an effort not to say things that are obviously false.

A fallacy is a false statement

Not a pattern of an invalid argument?

A fallacy is a false statement.

It is a function that maps world-states to utilities, so it is a utility function.

As lessdazed has said, that is simply not what the word "fallacy" means. Neither is a utility function, in the sense of VNM, merely a function from world states to numbers; it is a function from lotteries over outcomes to numbers that satisfies their axioms. The TSUF does not satisfy those axioms. No function whose range includes 0, 1, and nothing in between can satisfy the VNM axioms. The range of a VNM utility function must be an interval of real numbers.
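
For reference, the argument behind that last claim is just the standard mixture property of a VNM utility function:

    u(\alpha p + (1 - \alpha) q) = \alpha u(p) + (1 - \alpha) u(q),   for all lotteries p, q and \alpha \in [0, 1].

So if some lottery has utility 0 and another has utility 1, then for any c in (0, 1) the mixture with \alpha = 1 - c has utility exactly c; the range of u is convex, hence an interval.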

We're nearly at the point where it's not worth my while to listen to you because you

Ignored.

We're nearly at the point where it's not worth my while to listen to you because you don't speak carefully enough.

Perhaps you are not reading carefully enough.

A person's behavior can always be understood as optimizing a utility function

Models relying on expected utility make extremely strong assumptions about the treatment of probabilities, with expected utility being strictly linear in probability, and these assumptions can very easily be demonstrated to be wrong.

They also assume that many situations are equivalent (e.g. paying $50 for a 50% chance to win $100 vs. accepting $50 against a 50% chance of losing $100), where all experiments show otherwise.

Utility theory without these assumptions predicts nothing whatsoever.
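
To make the claimed equivalence above explicit, write w for current wealth. Both gambles give +$50 with probability 0.5 and -$50 with probability 0.5, so expected utility assigns each the same value:

    EU = 0.5 u(w + 50) + 0.5 u(w - 50).

An expected-utility maximizer must therefore treat the two framings identically, yet experimental subjects reliably do not.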

Seems to me we've got a gen-u-ine semantic misunderstanding on our hands here, Tim :)

My understanding of these ideas is mostly taken from reinforcement learning theory in AI (a la Sutton & Barto 1998). In general, an agent is characterized by a policy pi that gives the probability that the agent will take a particular action in a particular state, P = pi(s,a). In the most general case, pi can also depend on time, and is typically quite complicated, though usually not complex ;).
Any computable agent operating over any possible state and action space can be represented by some function pi, though typically folks in this field deal in Markov Decision Processes since they're computationally tractable. More on that in the book, or in a longer post if folks are interested. It seems to me that when you say "utility function", you're thinking of something a lot like pi. If I'm wrong about that, please let me know.

When folks in the RL field talk about "utility functions", generally they've got something a little different in mind. Some agents, but not all of them, determine their actions entirely using a time-invariant scalar function U(s) over the state space. U takes in a future state of the world and outputs the reward that the agent can expect to receive upon reaching that state (loosely, "how much the agent likes s"). Since each action in general leads to a range of different future states with different probabilities, you can use U(s) to get an expected utility U'(a,s):

U'(a,s) = sum over s' of p(s,a,s') * U(s'),

where s is the state you're in, a is the action you take, s' are the possible future states, and p is the probability that action a taken in state s will lead to state s'. Once your agent has a U', some simple decision rule over that is enough to determine the agent's policy. There are a bunch of cool things about agents that do this, one of which (not the most important) is that their behavior is much easier to predict. This is because behavior is determined entirely by U, a function over just the state space, whereas pi is a function over the conjunction of the state and action spaces. From a limited sample of behavior, you can get a good estimate of U(s), and use this to predict future behavior, including in regions of state and action space that you've never actually observed. If your agent doesn't use this cool U(s) scheme, the only general way to learn pi is to actually watch the thing behave in every possible region of action and state space. This, I think, is why von Neumann was so interested in specifying exactly when an agent could and could not be treated as a utility-maximizer.
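
Here's a minimal sketch of that scheme in Python; the states, actions, transition probabilities, and utilities are all invented for illustration.

    # From a time-invariant U(s) and a transition model p(s,a,s'), compute
    # expected utilities U'(a,s) and derive a simple greedy policy.

    states = ["hungry", "fed"]
    actions = ["cook", "nap"]

    # p[(s, a)] maps each possible next state s' to p(s, a, s')
    p = {
        ("hungry", "cook"): {"fed": 0.9, "hungry": 0.1},
        ("hungry", "nap"):  {"hungry": 1.0},
        ("fed", "cook"):    {"fed": 1.0},
        ("fed", "nap"):     {"fed": 1.0},
    }

    U = {"hungry": 0.0, "fed": 1.0}  # how much the agent "likes" each state

    def expected_utility(a, s):
        """U'(a,s) = sum over s' of p(s,a,s') * U(s')."""
        return sum(prob * U[s_next] for s_next, prob in p[(s, a)].items())

    def greedy_policy(s):
        """A simple decision rule over U': pick the action with highest U'."""
        return max(actions, key=lambda a: expected_utility(a, s))

    print(greedy_policy("hungry"))  # -> cook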

Hopefully that makes some sense, and doesn't just look like an incomprehensible jargon-filled snow job. If folks are interested in this stuff I can write a longer article about it that'll (hopefully) be a lot more clear.

Some agents, but not all of them, determine their actions entirely using a time-invariant scalar function U(s) over the state space.

If we're talking about ascribing utility functions to humans, then the state space is the universe, right? (That is, the same universe the astronomers talk about.) In that case, the state space contains clocks, so there's no problem with having a time-dependent utility function, since the time is already present in the domain of the utility function.

Thus, I don't see the semantic misunderstanding -- human behavior is consistent with at least one utility function even in the formalism you have in mind.

(Maybe the state space is the part of the universe outside of the decision-making apparatus of the subject. No matter, that state space contains clocks too.)

The interesting question here for me is whether any of those alternatives to having a utility function mentioned in the Allais paradox Wikipedia article are actually useful if you're trying to help the subject get what they want. Can someone give me a clue how to raise the level of discourse enough so it's possible to talk about that, instead of wading through trivialities? PM'ing me would be fine if you have a suggestion here but don't want it to generate responses that will be more trivialities to wade through.

Allais did more than point out that human behavior disobeys utility theory, specifically the "Sure Thing Principle" or "Independence Axiom". He also argued - to my mind, successfully - that there needn't be anything irrational about violating the axiom.

I've said things that sound like this before but I want to distance myself from your position here.^

But remember what a utility function is. It's a way of adding up all your different preferences and coming up with a single number. Coming up with a single number is important, so that all possible outcomes can be ordered. That's what you need, and ordering is what numbers do. Having two utility functions is like having no utility function at all, because you don't have an ordering of preferences.

This is all true. But humans do not have utility functions.

Humans are not the coherent, consistent agents you make them out to be. We are capricious, confused, paradoxical and likely insane. When you choose to buy a plasma screen instead of donating to charity, you are not acting on an algorithm that weighed the utility of the plasma screen against the utility of giving to charity. You ran one algorithm, call it the 'far algorithm', that returned 'give to charity', and another algorithm, call it the 'near algorithm', that returned 'buy the plasma screen'. Then you ran a third which somehow returned 'indulge myself', and then you bought a plasma screen. Change a few inputs to the third algorithm - like, say, putting you in a room full of guilt-riddled rich people talking about how much they give to charity - and you'd have done something else. In fact, it is useful to call the far algorithm your far-self and the near algorithm your near-self. Though this is a gross simplification - you are likely more than just a near-self, far-self and over-self.

Fighting for willpower is basically your far-self trying to wrest control of your behavior from your near-self. And both these selves correspond to aspects of society - some aspects tell you to buy a TV, others tell you to give to charity. Most of the discussion on Less Wrong about akrasia occurs between different far-selves; it is something of a conspiracy to overthrow our near-selves.

*Being evil is harder than being good because it requires fighting your far-self while being good does not. But being evil is not the same as being lazy or apathetic.

*Of course, ethics is in part innate and in part a matter of taste. Your bit about supernatural beings is a red herring to me, maybe it had more significance to others.

*People don't have utility functions but often what they mean when they talk about changing them is giving one self more control over the other selves. Note, this isn't always about giving the far-self more power. There are people who work too hard, stress out too much and need to modify themselves to have more fun.

*It makes sense to purchase fuzzies and utils separately as something of a truce between your near and far-self. Often, both algorithms are better satisfied that way.

Finally, 'meta-ethics' already means something. Do not overload terms unnecessarily and especially don't do it by misusing prefixes.

^I disagree with what I wrote in February 2010, to some extent.

This is all true. But humans do not have utility functions... Humans are not the coherent, consistent agents you make them out to be.

If you think that's relevant, you should also go write the same comment on Eliezer's post on utilons and fuzzies. Having two coherent, consistent utility functions is no more realistic than having one.

If you want to be rational, you need to try to figure out what your values are, and what your utility function is. Humans don't act consistently. Whether their preferences can be described by a utility function is a more subtle question whose answer is unknown. But in either case, in order to be more rational, you need to be able to approximate your preferences with a utility function.

Fighting for willpower is basically your far-self trying to wrest control of your behavior from your near-self.

You can alternately describe this as the place where the part of your utility function that you call your far self, and the part of your utility function that you call your near self, sum to zero and provide no net information on what to do. You can choose to describe the resultant emotional confusion as "fighting for willpower". But this leads to the erroneous conclusions I described under the "ethics as willpower" section.

Just to clarify, I am not, not, not defending the willpower model you described - I just don't think willpower, properly understood as a conflict between near and far modes, can be left out of an account of human decision-making processes. I think the situation is both more complicated and more troubling than either model allows, and I don't think it is rational to force the square peg that is human values into the round hole that is 'the utility function'.

I'll agree that willpower may be a useful concept. I'm not providing a full model, though - mostly I want to dismiss the close tie that folk psychology draws between willpower and morals.

If you want to be rational, you need to try to figure out what your values are, and what your utility function is. Humans don't act consistently. Whether their preferences can be described by a utility function is a more subtle question whose answer is unknown. But in either case, in order to be more rational, you need to be able to approximate your preferences with a utility function.

This is neither here nor there. I have no doubt it can help to approximate your preferences with a utility function. But simply erasing complication by reducing all your preference-like stuff to a utility function decreases the accuracy of your model. You're ignoring what is really going on inside. So yes, if you try to model humans as holders of single utility functions... morality has nothing to do with willpower! Congrats! But my point is that such a model is far too simple.

You can alternately describe this as the place where the part of your utility function that you call your far self, and the part of your utility function that you call your near self, sum to zero and provide no net information on what to do. You can choose to describe the resultant emotional confusion as "fighting for willpower".

Well, you can do that - it doesn't seem at all representative of the way choices are made, though.

But this leads to the erroneous conclusions I described under the "ethics as willpower" section.

What erroneous conclusions? What does it predict that is not so?

Having two coherent, consistent utility functions is no more realistic than having one.

He never said these "utility functions" are coherent. In fact a large part of the problem is that the "fuzzies" utility function is extremely incoherent.

You keep using that word. I do not think it means what you think it means. A utility function that is incoherent is not a utility function.

If it is acceptable for Eliezer to talk about having two utility functions, one that measures utilons and one that measures fuzzies, then it is equally acceptable to talk about having a single utility function, with respect to the question of whether humans are capable of having utility functions.

A utility function that is incoherent is not a utility function.

I was using the same not-quite strict definition of "utility function" that you seemed to be using in your post. In any case, I don't believe Eliezer ever called fuzzies a utility function.

I think there are two "mistakes" in the article.

The first is claiming (or at least assuming) that ethics is "monolithic" - that either it comes from willpower alone, or it doesn't come from willpower at all. Willpower does play a role in ethics, every time your ethical system contradicts the instinctive, or unconscious, part of your mind - be it resisting the temptation of a beautiful member of the opposite (or same, depending on your tastes) sex, overcoming the fear of spiders, or withstanding torture so as not to betray your friends. I would say that ethics requires willpower, and that someone with weak willpower will act less ethically, even by his own standards, because he'll more easily let the subconscious part of his mind override the conscious one. But willpower does not define ethics. Someone with strong willpower can be unethical; Hitler and Stalin probably had very strong willpower. So willpower is required to act ethically, but is not enough to make us act ethically. That's an important distinction.

The second point is that "choice" and "willpower" mean different things. Choice is how an algorithm feels from inside. Choice is the mental process of examining what you could do, and deciding from that what you should do. It doesn't necessarily involve willpower. It only does so when the subconscious part of your mind really pushes one way, but your ethical system, in your conscious mind, says the opposite.

I do not need any willpower to answer when asked "what time is it" or to give directions. I do need a bit of willpower to help a woman carrying a stroller up the stairs at the train station on her own, because, well, it's heavy and it makes my arms hurt a bit (yeah, I'm not very strong physically...). I need much more willpower to act as a human shield protecting "illegal" immigrants from armed cops, and to withstand the tear gas, as I did once. But in none of those cases is willpower the source of my morality.

And sometimes it becomes a bit tricky: I need more willpower to not give a coin to a beggar than to give one. But I don't often give coins to beggars, because my constructed ethical system tells me it's usually more efficient to give to a trusted charity than to give to a beggar (I may be wrong about that, but it doesn't matter here). So here I need willpower to not be ethical in a given situation, so that I can be more ethical (or at least I think so) later on. That does correspond to the end of the article, with the old lady and the boy scouts.

"true" ethics (whatever they may be). I call [this] ... "meta-ethics".

This is a bad choice of name, given that 'Metaethics' already means something (though people on LW often conflate it with Normative Ethics).

Perhaps I should use "normative ethics" instead.

Having two utility functions is like having no utility function at all, because you don't have an ordering of preferences.

The only kind of model that needs a global utility function is an optimization process. Obviously, after considering each alternative, there needs to be a way to decide which one to choose... assuming that we do things like considering alternatives and choosing one of them (using an ordering that is represented by the one utility function).

For example, evolution has a global utility function (inclusive genetic fitness). Of course, it may be described in parts (endurance, running speed, attractiveness to mates etc), but in the end it gets summed up (described by whether the genes are multiplied or not).
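
A toy sketch of that summing-up; the traits, weights, and candidates below are invented for illustration.

    # Component "values" combined into one global score, which is all an
    # optimization process needs in order to rank alternatives.

    def fitness(organism):
        return (0.4 * organism["endurance"]
                + 0.3 * organism["speed"]
                + 0.3 * organism["attractiveness"])

    candidates = [
        {"name": "A", "endurance": 0.9, "speed": 0.2, "attractiveness": 0.5},
        {"name": "B", "endurance": 0.4, "speed": 0.8, "attractiveness": 0.7},
    ]

    # The single number induces an ordering, so "selection" can pick one.
    best = max(candidates, key=fitness)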

That said, there are things (such as Kaj's centrifugal governor or human beings) that aren't best modelled as optimization processes. Reflexes, for example, don't optimize, they just work (and a significant proportion of our brains does likewise). The fact that our conscious thinking is a (slightly buggy) implementation of an optimization process (with a more or less consistent utility function) might suggest that whole humans can also be modelled well that way...

Humans don't make decisions based primarily on utility functions. To the extent that the Wise Master presented that as a descriptive fact rather than a prescriptive exhortation, he was just wrong on the facts. You can model behavior with a set of values and a utility function, but that model will not fully capture human behavior, or else will be so overfit that it ceases to be descriptive at all (e.g. "I have utility infinity for doing the stuff I do and utility zero for everything else" technically predicts your actions but is practically useless.)

You say that if humans don't implement utility functions there's no point to reading Less Wrong. I disagree, but in any case, that doesn't seem like an argument that humans implement utility functions. This argument seems more like an appeal to emotion: we are Less Wrongers who have some fraction of our identity connected to this site, so you want us to reject this proposition because of the emotional cost of the conclusions it brings about. Logically, though, it makes little sense to take the meaningfulness of Less Wrong as given and use that to reason about human cognition. That's begging the question.

Nobody said that humans implement utility functions. Since I already said this, all I can do is say it again: Values, and utility functions, are both models we construct to explain why we do what we do. Whether or not any mechanism inside your brain does computations homomorphic to utility computations is irrelevant. [New edit uses different wording.]

Saying that humans don't implement utility functions is like saying that the ocean doesn't simulate fluid flow, or that a satellite doesn't compute a trajectory.

It's more like saying a pane of glass doesn't simulate fluid flow, or an electron doesn't compute a trajectory.

So how would you define rationality? What are you trying to do, when you're trying to behave rationally?

Values, and utility functions, are both models we construct to explain why we do what we do.

Indeed, and a model which treats fuzzies and utils as exchangeable is a poor one.

You could equally well analyze the utils and the fuzzies, and find subcategories of those, and say they are not exchangeable.

The task of modeling a utility function is the task of finding how these different things are exchangeable. We know they are exchangeable, because people have preferences between situations. They eventually do one thing or the other.
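
As a toy sketch of what finding that exchange rate amounts to - the outcomes, scores, and weight below are invented for illustration:

    # Two kinds of value only yield a decision once some exchange rate
    # combines them into a single number.

    fuzzies = {"donate to charity": 0.9, "buy plasma screen": 0.3}
    utilons = {"donate to charity": 0.4, "buy plasma screen": 0.8}

    def combined(outcome, weight=0.5):
        return weight * fuzzies[outcome] + (1 - weight) * utilons[outcome]

    choice = max(fuzzies, key=combined)
    # The two component rankings conflict; observed choices reveal the
    # exchange rate, i.e. whatever weight best predicts what the person does.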

There is plenty of room for willpower in ethics-as-taste once you have a sufficiently complicated model of human psychology in mind. Humans are not monolithic decision makers (let alone do they have a coherent utility function, as others have mentioned).

Consider the "elephant and rider" model of consciousness (I thought Yvain wrote a post about this but I couldn't find it; in any case I'm not referring to this post by lukeprog, which is talking about something else). In this model, we divide the mind into two parts - we'll say my mind just for concreteness. The first part represents my conscious mind. The second part represents my unconscious mind.

In the elephant rider metaphor, my conscious mind is the rider and my unconscious mind is the elephant. The rider is in "control" of the elephant in the sense that he sits on top of the elephant and tells it where to go and the elephant listens, by and large. However, the rider doesn't make the elephant do anything too objectionable. If the elephant really wanted to throw the rider off its back and escape, there's nothing the rider could do to stop it. The relationship between the conscious mind and the unconscious mind is similar. My conscious mind acts as if it has full control of me, but it's typically doing things that my unconscious mind doesn't object to much or even heartily endorses. However, if my unconscious mind decides that I just need to punch this guy, consequences be damned, and floods my conscious mind with overwhelming emotions of anger and offence, there's little my conscious mind can do to regain control.

Now, this is obviously a gross simplification of the actual psychology of humans. It may make more sense to think of the rider as the collection of agents/programs/modules that make up my conscious mind and the elephant as everything else - but then again, it's not even clear that the relevant distinction is conscious vs. unconscious, or even if there's a hard line at all.

In any case, willpower to do the right thing falls out of this simple model precisely because it doesn't view humans as having monolithic minds. Assume the rider really wants the elephant to go somewhere and do something that the elephant objects strongly to. As long as the elephant doesn't object too much, the rider can do some things to get the elephant to comply, though the rider can't do them too often without completely losing control of the elephant. For example, the rider might smack the elephant on the head with a stick every time it goes the wrong direction. The elephant may comply at first, but enough smacking and that elephant just might revolt, and anyway it requires considerably more effort on the rider's part just to keep the elephant moving in the right direction. Analogously, when my conscious mind wants my unconscious mind to comply with something it doesn't want, it requires effort from my conscious mind to keep my unconscious mind from derailing the choice. For example, it took considerable effort from me last night to study for my microeconomics exam on Friday instead of watching the NLDS.

This effort on the conscious mind's part is exactly what willpower is. Suppose my unconscious mind thinks I should take a large sum of money from known genocidal dictators in exchange for weapons but my conscious mind thinks this would be a downright evil thing to do. It may take considerable effort on the part of my conscious mind in order to keep my unconscious mind from taking full control of me and collecting the dividends.

Purchasing fuzzies and utilons separately also makes sense in this context. Fuzzies satisfy the elephant - my unconscious mind - so that the rider can maintain control. My unconscious mind wants to feel like I did the right thing, but my conscious mind wants to actually do the right thing. I can't ignore my unconscious mind and purchase no fuzzies because it will eventually assert full control and get what it wants - all of those fuzzies - at the expense of everything that my conscious mind wants. So in order to keep this from happening, Eliezer suggests that I consciously purchase fuzzies in a cost-effective manner to keep my unconscious mind in check (to purchase willpower, essentially), and then separately purchase utilons using cold calculation. I (i.e. my conscious mind) need to purchase fuzzies to stay in control, but it's all in the service of getting utilons. The idea isn't to maximize two utility functions, but to purchase willpower to continue maximizing one - keep that unconscious mind in check! (or rather those unconscious modules)

You could respond to all of this glibly by arguing that the preferences at stake here are the preferences of the entire me, not just my conscious mind or any other subset you come up with - and you've taken that tack before. I think you're wrong, but I'm not entirely sure how to argue against that position - other than asserting that given my experience, "I" feel like only a subset of "Matt_Simpson." In any case, that's where the disagreement is at.

I'll accept that willpower means something like the conscious mind trying to rein in the subconscious. But when you use that to defend the "ethics as willpower" view, you're assuming that the subconscious usually wants to do immoral things, and the conscious mind is the source of morality.

On the contrary, my subconscious is at least as likely to propose moral actions as my conscious. My subconscious mind wants to be nice to people. If anything, it's my conscious mind that comes up with evil plans; and my subconscious that kicks back.

I think there's a connection with the mythology of the werewolf. Bear with me. Humans have a tradition at least 2000 years long of saying that humans are better than animals because they're rational. We characterize beasts as bestial; and humans as humane. So we have the legend of the werewolf, in which a rational man is overcome by his animal (subconscious) nature and does horrible things.

Yet if you study wolves, you find they are often better parents and more devoted partners than humans are. Being more rational may let you be more effective at being moral; but it doesn't appear to give you new moral values.

(I once wrote a story about a wolf that was cursed with becoming human under the full moon, and did horrible things to become the pack alpha that it never could have conceived of as a wolf. It wasn't very good.)

In one of Terry Pratchett's novels (I think it is The Fifth Elephant) he writes that werewolves face as much hostility among wolves as among humans, because the wolves are well aware which of us is actually the more brutal animal.

On the contrary, my subconscious is at least as likely to propose moral actions as my conscious. My subconscious mind wants to be nice to people. If anything, it's my conscious mind that comes up with evil plans; and my subconscious that kicks back.

What do you call the part of your mind that judges whether proposed actions are good or evil?

I would need evidence that there is a part of my mind that specializes in judging whether proposed actions are good or evil.

You referred to some plans as good and some plans as evil; therefore, something in your mind must be making those judgements (I never said anything about specializing).

In that case, I call that part of my mind "my mind".

The post could be summarized as arguing that the division of decisions into moral and amoral components, if it is even neurally real, is not notably more important than the division of decisions into near and far components, or sensory and abstract components, or visual and auditory components, etc.

Notice I said mind not brain. So I'm not arguing that it necessarily always takes place in the same part of the brain.

Oh yes, I should probably state my position. I want to call the judgement about whether a particular action is good or evil the "moral component", and everything else the "amoral" component. Thus ethics amounts to two things:

1) making the judgement about whether the action is good or evil as accurate as possible (this is the "wisdom" part)

2) acting in accordance with this judgement, i.e., performing good actions and not performing evil actions (this is the "willpower" part)

Why do you want to split things up that way? As opposed to splitting them up into the part requiring a quick answer and the part you can think about a long time (certainly practical), or the part related to short-term outcome versus the part related to long-term outcome, or other ways of categorizing decisions?

I'll accept that willpower means something like the conscious mind trying to rein in the subconscious. But when you use that to defend the "ethics as willpower" view, you're assuming that the subconscious usually wants to do immoral things, and the conscious mind is the source of morality.

On the contrary, my subconscious is at least as likely to propose moral actions as my conscious. My subconscious mind wants to be nice to people. If anything, it's my conscious mind that comes up with evil plans; and my subconscious that kicks back.

I agree. I'm not sure if you're accusing me of holding the position or not so just to be clear, I wasn't defending ethics as willpower - I was carving out a spot for willpower in ethics as taste. I'm not sure whether the conscious or unconscious is more likely to propose evil plans; only that both do sometimes (and thus the simple conscious/unconscious distinction is too simple).

What about the "memes=good, genes=evil" model? The one meant literally, where feudalism or lolcats are "good" and loving your siblings or enjoying tasty food is "evil".

...Did you really just index your footnotes from zero?

Of course. Indices should always start at zero. It saves one CPU instruction, allows one more possible footnote, and helps avoid fencepost errors.

(I indexed my footnotes from 1, then wanted to add a footnote at the beginning.)

No, it only looks that way on your computer.

Look at the HTML; it contains a literal zero.

Something the conventional story about ethics gets right, with which you seem to disagree, is that ethics is a society-level affair. That is, to justify an action as ethically correct is implicitly to claim that a rational inquiry by society would deem the action acceptable.

Another point convention gets right, and here again you seem to differ, is motivational externalism. That is, a person can judge that X is right without necessarily being motivated to do X. Of course, you've given good evolutionary-biological reasons why most of the time moral judgments do motivate, but, I claim, there is no necessary connection. Morality and motivation can diverge, even for psychologically normal people.

I may be misreading you, please correct me if I have.

Something the conventional story about ethics gets right, with which you seem to disagree, is that ethics is a society-level affair. That is, to justify an action as ethically correct is implicitly to claim that a rational inquiry by society would deem the action acceptable.

You're just redefining "ethics" as what I called "social ethics", and ignoring the other levels. That's treating ethics as a platonic ideal rather than as a product of evolution.

Of course, you've given good evolutionary-biological reasons why most of the time moral judgments do motivate, but, I claim, there is no necessary connection.

In the view I'm presenting here, judgements by a person's personal ethics do always motivate action, by definition. Moral judgements computed using society's ethics don't directly motivate; the motivation is mediated through the person's motivation to accept society's ethics.