Ethics as a black box function

by Kaj_Sotala, 22nd Sep 2009 (1 min read, 32 comments)

(Edited to add: See also this addendum.)

I commented on Facebook that I think our ethics is three-tiered. There are the things we imagine we consider right, the things we consider right, and the things we actually do. I was then asked to elaborate on the difference between the first two.

For the first one, I was primarily thinking about people following any idealized, formal ethical theories. People considering themselves act utilitarians, for instance. Yet when presented with real-life situations, they may often reply that the right course of action is different than what the purely act utilitarian framework would imply, taking into account things such as keeping promises and so on. Of course, a rule utilitarian would avoid that particular trap, but in general nobody is a pure follower of any formal ethical theory.

Now, people who don't even try to follow any formal ethical systems probably have a closer match between their first and second categories. But I recently came to view our moral intuitions as a function that takes the circumstances of the situation as an input and gives a moral judgement as an output. We do not have access to the inner workings of that function, though we can and do try to build models that attempt to capture its inner workings. Still, as our understanding of the function is incomplete, our models are bound to sometimes produce mistaken predictions.

Based on our model, we imagine (if not thinking about the situations too much) that in certain kinds of situations we would arrive at a specific judgement, but a closer examination of them reveals that the function outputs the opposite value. For instance, we might think that maximizing total welfare is always for the best, but then realize that we don't actually want to maximize total welfare if the people we consider our friends would be hurt. This might happen even if you weren't explicitly following any formal theory of ethics. And if *actually* faced with that situation, we might end up acting selfishly instead.
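To make the metaphor concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the `Situation` fields, the threshold of 100, and the rule inside `moral_intuition` are assumptions, not claims about how anyone's intuitions actually work. The point is only that an explicit model ("maximizing total welfare is always for the best") can agree with the black box on most inputs and still mispredict it on others.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    welfare_gain: int    # net change in total welfare if we act (made-up units)
    harms_friend: bool   # whether acting hurts someone we consider a friend


def moral_intuition(s: Situation) -> bool:
    """Stand-in for the black box. In reality we cannot read this code,
    only observe its outputs. The rule and threshold are hypothetical."""
    if s.harms_friend:
        return s.welfare_gain > 100
    return s.welfare_gain > 0


def welfare_model(s: Situation) -> bool:
    """Our explicit model: act whenever total welfare increases."""
    return s.welfare_gain > 0


def mispredictions(cases):
    """Situations where the model and the black box disagree."""
    return [s for s in cases if welfare_model(s) != moral_intuition(s)]


cases = [
    Situation(10, False),
    Situation(-5, False),
    Situation(10, True),
    Situation(500, True),
]
wrong = mispredictions(cases)
```

With these made-up cases the model fails only on the situation where a small welfare gain would come at a friend's expense, which is the kind of case that a closer examination turns up.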

This implies that people pick the moral frameworks which are best at justifying the ethical intuitions they already had. Of course, we knew that much already (even if we sometimes fail to apply it - I was previously puzzled over why so many smart people reject all forms of utilitarianism, as ultimately everyone has to perform some sort of expected utility calculations in order to make moral decisions at all, but then realized it had little to do with utilitarianism's merits as such). Some of us attempt to reprogram their moral intuitions, by taking those models and following them even when they fail to predict the correct response of the moral function. With enough practice, our intuitions may be shifted towards the consciously held stance, which may be a good or bad thing.

32 comments

Addendum: a response to a person who asked what, in this theory, makes ethics different from any other kind of preference.

I consider ideologies to be a belief structure that lies somewhere halfway between ethics and empirical beliefs, heavily blending in parts of both. In an ideology, empirical beliefs are promoted to a level where they gain a moral worth by themselves.

To answer your actual point, I would say that ethics really are just a special case of ordinary preferences. Normatively, there's no reason why a preference for a hamburger would be more important than a preference for not killing. Of course, ethics-related preferences tend to be much stronger than others, giving them extra worth.

What makes ethics special is their functional role for the organism. (From now on, I'll reference the original moral intuitions as "morals", and the theoretical structure an organism builds to explain them as "ethics".) Morals tend to be rather strongly experienced preferences, driving behavior quite strongly. In order to better plan for the future, an organism needs to know how it will react in different situations, so over time it observes its moral reactions in a variety of circumstances and builds an ethical model that best fits the data. (This is basically a variant of the "the self is a self-model" idea from philosophy of mind, applied to ethics: see e.g. http://xuenay.livejournal.com/318670.html )

Of course, we humans tend to confuse models for the real thing. "I experience moral repugnance at this situation, which could be explained if my moral intuitions thought that killing was wrong" becomes "killing is objectively wrong". Eventually we forget that the model was a model at all, and it becomes an ideology - a system where empirical beliefs about the nature of our morals have taken a moral value by themselves. Our morals aren't entirely untouchable black boxes, of course, and this kind of confusion may serve to actually shift our morals in the direction of the theory. And I'm not saying that the models must be mistaken - they may very well be correct.

How is it possible to discuss ethics in such a scenario? Well, it needs to be noted that there's also an additional reason why ethics is likely to have evolved. Building ethical models that predict moral behavior is useful not only for predicting your own behavior, but also that of others. I suspect that part of the instinctive dislike many people feel towards hypocrites is the fact that inconsistencies between theory and behavior mean the hypocrites' behavior is harder to predict, thus making alliances with them less safe. This drives people towards adopting ethical theories which are more consistent internally, or that at least appear such to others. (This theory is closely related to Robin Hanson's theory of identity: http://www.overcomingbias.com/2009/08/a-theory-of-identity.html ) And since ethical theories also take a moral worth for the individuals themselves, this provides another method for how discussion can actually modify our ethical systems.

[-][anonymous]12y 5

Most humans optimize their morality for their own well-being. A big reason why personal morality is subject to change is that an individual's morality will be workable for one social environment, and then when the environment changes, the individual's morality will suddenly be "out of sync" with the moralities of relevant people in the environment. Since incompatible moralities lead to difficulties in cooperation, this is a huge problem. At the same time, there are good reasons to have a reputation for stability. Thus the conflict of having an "incorrect" morality and not being able to change it drastically. I'd engage in more rambling about how the Stanford Prison Experiment supports this notion, but the point of this post is not to speculate on the interaction between identity and morality.

It would be desirable, then, to simply find what the "best" morality would be in terms of being compatible with various likely future environments, and stick with it. It seems like many of the candidates for these "optimal" all-purpose moralities are the broad moral theories under discussion.

Notice that most theoretical moral frameworks contain an element of fairness.

When morality is left to personal whim, the individual tends to modify their morality to favor themselves and their in-groups. Thus when they encounter others, especially ones outside their original groups, the individual's original morality comes into conflict with the new tribe, whose members have optimized their moralities for themselves and their own in-groups. Golden-rule-type ideas correct this, by having individuals consider the viewpoint of others (and thus, adjusting their morality to be more compatible with these viewpoints.)

If what I said is true, then moral "progress" tends to occur when social groups who have previously maintained prejudicial moralities find it necessary to work with groups they have previously discriminated against.

And if this is the case, the implications for moral philosophy are this: it would be profitable to put more focus on finding the reasons why morality differs between individuals, rather than get caught up in quandaries about our suboptimally egoistic moral intuitions.

Those "three tiers" sound a little bit like another classification I found useful, in Baron's "Thinking and Deciding". These are the normative, prescriptive and descriptive questions about thinking and decision making.

Descriptive models account for how people actually decide. Experimental results on biases fit in there. Normative theories are about how we should think; standards by which actual decisions can be evaluated. Expected utility fits in there.

Prescriptive models bridge the gap between the two: they are rules about how we can improve everyday thinking by bringing our decisions closer to what normative theories would advise. They account for the practical concerns of decision making, e.g. in some cases our resources (time, brainpower, etc.) are too limited for an exact computation according to the normative theory.

"Pick[ing] the moral frameworks which are best at justifying [our] ethical intuitions" is discussed at length in Rawls' Theory of Justice under the term "reflective equilibrium". It doesn't require that we hold our "ethical intuitions" as a fixed point; the process of working out (in advance and at leisure) consequences of normative models and comparing them with our intuitions may very well lead us to revise our intuitions.

Reflective equilibrium is desirable from a prescriptive standpoint. When we are "in the thick of things" we usually will not have time to work out our moral positions on pen and paper, and will fall back on intuitions and heuristics. It is better to have trained those to yield the decisions we would wish ourselves to make if we could consider the situation in advance.

I hold that moral intuitions are nothing but learned prejudices. Historic examples from slavery to the divine right of kings to tortured confessions of witchcraft or Judaism to the subjugation of women to genocide all point to the fallibility of these 'moral intuitions'. There is absolutely no sense to the claim that its conclusions are to be adopted before those of a reasoned argument.

-Alonzo Fyfe

Fallible relative to what?

Skimming around his site, it's interesting, but I think he made a basic mistake.

From here:

Act utilitarianism not only requires no desire for alcohol, it requires no desire for anything other than to maximize utility. If the agent likes the taste of steak better than hamburger, then there will be an instance in which he will sacrifice maximum utility for a steak. If he has a strong preference, it will have the same effect as a strong preference for alcohol. If he has an aversion to pain, a desire for sex, a particular interest in the well being of his children, there are instances in which she will sacrifice her desire to maximize utility to obtain fulfillment of any of these other desires.

I hold that a moral commandment to act as an act-utilitarian is no different than a commandment to alter the gravitational constant to a number that maximizes utility, or a commandment to move the Earth to an orbit that would produce a more pleasing climate. If it cannot be done, there is no sense in saying that it ought to be done.

Of course, the definition of my utility function will include a term for steaks, or alcohol, or whatever intrinsic value they help me achieve. Maximizing utility is not, therefore, contradictory to valuing a steak. My desire to maximize utility includes my desire to eat steak (or whatever intrinsic value it helps me attain).

This seems like a real simple mistake, so maybe I am simply misunderstanding him. Anyone who knows his work better care to comment (at least before I have more time to poke around his site some more)?

Fyfe annoys me sometimes because he continuously ignores my requests to express concepts in mathematical language.

I didn't read any more of his site, but just from the excerpt you gave, her [1] point seems to be that if you value total utility, then you will have to deprive yourself to benefit people in general, which people can't do -- they inevitably act as if their own utility carries more weight than that of others.

[1] Hey, if he can use pronouns confusingly and inconsistently, so can we!

"Reasoned argument", it says.

And how does that help if the premises in your "reasoned argument" are arrived at via intuition?

For instance, we might think that maximizing total welfare is always for the best, but then realize that we don't actually want to maximize total welfare if the people we consider our friends would be hurt.

Well, you have to understand what such a decision would actually look like. In order for a decision to truly maximize total welfare over all people, even as it "stabs your friends in the back", it would have to really increase total welfare, because this utility gain would have to at least cancel out the degradation of the value of friendship.

That is, if I expect my friendship with someone not to mean that they weight me higher than a random person in their utility function, friendship becomes less valuable, and an entire set of socially-beneficial activity enabled by friendship (e.g. lower cost of monitoring for cheating) contracts.

I think your hypothetical here has the same problem that presenting the true Prisoner's Dilemma has; in the true PD, it's hard to intuitively imagine a circumstance where utilities in the payoff matrix account for my compassion for my accomplice. Just the same, in the tradeoff you presented, it's hard to intuitively understand what kind of social gain could outweigh general degradation of friendship.

ETA: Okay, it's not that hard, but like with the true PD, such situations are rare: for example, if I were presented with the choice of "My twenty closest friends/loved ones die" vs. "All of humanity except me and my twenty closest die". But even then, if e.g. my friends have children not in the set of 20, it's still not clear that all of the twenty would prefer the second option!

But even then, if e.g. my friends have children not in the set of 20, it's still not clear that all of the twenty would prefer the second option!

Wow, you really don't search very hard for hypotheticals. It's not actually very hard to come up with situations that have this sort of conflict. E.g. a general sending a specialized squad (including several friends) on an extremely risky mission that only they could carry out, if the alternatives would cause much more risk to the army as a whole. (Not an entirely fabricated situation, although that example doesn't fit perfectly.)

Okay, fair point; I was interpreting the situation as being one in which you betray a friend for the benefit of others; in the example you gave, the sacrifice asked of them is part of the duties they signed up for and not an abrogation of friendship.

But I don't think your example works either: it benefited Americans at the expense of Japanese. That's not trading "friends' utilities for higher other utilities"; it's trading "friends' utilities for some higher and some lower other utilities".

Now, if you want to introduce some paperclip maximizers who value a few more paperclips to a billion human lives...

But I don't think your example works either: it benefited Americans at the expense of Japanese. That's not trading "friends' utilities for higher other utilities"; it's trading "friends' utilities for some higher and some lower other utilities".

When estimated by humans, utilities aren't objective. I'm pretty sure that if you asked Col. Doolittle in those terms, he'd be of the opinion that U(US winning Pacific Theater) >> U(Japan winning Pacific Theater), taking the whole world into account; thus he probably experienced conflict between his loyalty to friends and his calculation of optimal action. (Of course he's apt to be biased in said calculation, but that's beside the point. There exists some possible conflict in which a similar calculation is unambiguously justified by the evidence.)

Of course he's apt to be biased in said calculation, but that's beside the point. There exists some possible conflict in which a similar calculation is unambiguously justified by the evidence.

Then I'm sure you can cite that instead. If it's hard to find, well, that's my point exactly.

I'm not sure I'm understanding properly. You talk as if my action would drastically affect society's views of friendship. I doubt this is true for any action I could take.

Well, all my point really requires is that it moves society in that direction. The fraction of "total elimination of friendship" that my decision causes must be weighed against the supposed net social gain (other people's gain minus that of my friends), and it's not at all obvious when one is greater than the other.

Plus, Eliezer_Yudkowsky's Timeless Decision Theory assumes that your decisions do have implications for everyone else's decisions!

This implies that people pick the moral frameworks which are best at justifying the ethical intuitions they already had.

The previous paragraph seemed to be arguing that people pick the moral frameworks which are best at describing the ethical intuitions they already had. Why do you choose this different interpretation?

I was previously puzzled over why so many smart people reject all forms of utilitarianism, as ultimately everyone has to perform some sort of expected utility calculations in order to make moral decisions at all

I don't see the necessity. Can you expand on that?

With enough practice, our intuitions may be shifted towards the consciously held stance, which may be a good or bad thing.

Quite. Changing the theory to fit the data seems to me preferable to the reverse.

I don't see the necessity. Can you expand on that?

I think you're right not to see it. Valuing happiness is a relatively recent development in human thought. Much of ethics prior to the enlightenment dealt more with duties and following rules. In fact, seeking pleasure or happiness (particularly from food, sex, etc.) was generally looked down on or actively disapproved of. People may generally do what they calculate to be best, but best need not mean maximizing anything related to happiness.

Ultra-orthodox adherence to religion is probably the most obvious example of this principle, particularly Judaism, since there's no infinitely-good-heaven to obfuscate the matter. You don't follow the rules because they'll make you or others happy, you follow them because you believe it's the right thing to do.

Valuing happiness is a relatively recent development in human thought. Much of ethics prior to the enlightenment dealt more with duties and following rules.

This just isn't true at all. Duty-based morality was mostly a Kantian invention. Kant was a contemporary of Bentham's and died a few years before Mill was born. Pre-enlightenment ethics was dominated by Aristotelian virtue theory, which put happiness in a really important position (it might be wrong to consider happiness the reason for acting virtuously, but it is certainly coincident with eudaimonia).

Edit to say I'm interpreting ethics as "the study of morality"; if by ethics you mean the actual rules governing the practices of people throughout the world, your comment makes more sense. For most people throughout history (maybe including right now), doing what is right means doing what someone tells you to do. Considered that way your comment makes more sense.

The Ancient Greek concept of happiness was significantly different from the modern concept of happiness. It tended to be rather teleological and prescriptive rather than being individualistic. You achieved "true happiness" because you lived correctly; the correct way to live was not defined based on the happiness you got out of it. There were some philosophers who touched on beliefs closer to utilitarianism, but it was never close to mainstream. Epicurus, for example, but his concept was a long way from Bentham's utilitarianism. The idea that happiness was the ultimate human good and that more total happiness was unequivocally and absolutely better was not even close to a mainstream concept until the enlightenment.

Oh, some of Socrates' fake debate opponents did argue pleasure as the ultimate good. This was generally answered with the argument that true pleasure would require certain things, so that the pursuit of pure pleasure actually didn't give one the most amount of pleasure. This concept has objectionable objective, teleological, and non-falsifiable properties; it is a very long way from the utilitarian advocacy of the pursuit of pleasure, because its definition of pleasure is so constrained.

Much of ethics prior to the enlightenment dealt more with duties and following rules.

Virtue ethics was generally about following rules. Duty was not the primary motivator, but if you did not do things you were obliged to do, like obey your liege, your father, the church, etc., you were not virtuous. Most of society was, and in many ways still is, run by people slavishly adhering to social customs irrespective of their individual or collective utilitarian value.

I did not claim that everyone operated explicitly off of a Kantian belief that duty was the ultimate good. I am simply pointing out that most people's ethical systems were, in practice, simply based on obeying those society said should be obeyed and following rules society said they should follow. I don't think this is particularly controversial, and that people can operate off of such systems shows that one need not be utilitarian to make moral judgements.

As I added in my edit, I find it plausible (though not certainly the case) that the ethical systems of individuals have often amounted to merely obeying social rules. Indeed, for the most part they continue to do so. I don't think we disagree.

That said, as far as the scholarly examination of morality goes, there wasn't any kind of paradigm shift away from duty-based theories to "happiness"-based theories. Either "theories that dealt with duty and following rules" means something like Kantian ethics or Divine Rule, in which case the Enlightenment saw an increase in such theories, or "duty-based theory" just refers to any theory which generates rules and duties, in which case utilitarianism is just as much a duty-based theory as anything else (as it entails a duty to maximize utility).

Virtue ethics "was generally about following rules" only in this second sense. Obviously virtue ethics dealt with happiness in a different way than utilitarianism, since, you know, they're not the same thing. I agree that the Ancient Greek word that gets translated as happiness in the Nicomachean Ethics means something different from what we mean by happiness. I like "flourishing". But it certainly includes happiness and is far more central to virtue ethics (for Aristotle, eudaimonia is the purpose of your existence) than duty is.

Bentham and Mill were definitely innovators, I'm not disputing that. But I think their innovation had more to do with their consequentialism than their hedonism. What seems crucially new, to me, is that actions are evaluated exclusively by the effect they have on the world. Previous ethical theories are theories for the powerless. When you don't know how you affect the world, it doesn't make any sense to judge actions by the effect. The scientific revolution, and in particular the resulting British empiricism, were crucial for making this sort of innovation possible.

It's also true that certain kinds of pleasure came to be looked down upon less than before, but I think this has less to do with the theoretical innovations of utilitarianism than with economic and social changes leading to changes in what counts as virtue, which Hume noted. After all, Mill felt the need to distinguish between higher pleasures (art, friendship, etc.) and lower pleasures (sex, food, drink), the former of which couldn't be traded for any amount of the latter and were vastly more valuable.

Anyway, I definitely agree that you don't have to be a utilitarian to make moral judgments. I was just replying to the notion that pre-utilitarian theories were best understood as being A) About duty and B) Not about happiness.

My reading of that sentence was that Kaj_Sotala focused not on the happiness part of utilitarianism, but on the expected utility calculation part. That is, that everyone needs to make an expected utility calculation to make moral decisions. I don't think any particular type of utility was meant to be implied as necessary.

Well, there was Epicurus...

The previous paragraph seemed to be arguing that people pick the moral frameworks which are best at describing the ethical intuitions they already had. Why do you choose this different interpretation?

Ah, you're right, I left out a few inferential steps. The important point is that over time, the frameworks take on a moral importance of their own - they cease to be mere models, instead becoming axioms. (More about this in my addendum.) That also makes the meaning of "models that best explain intuitions" and "models that best justify intuitions" blend together, especially since a consistent ethical framework is also good for your external image.

I don't see the necessity. Can you expand on that?

To put it briefly: by "all forms of utilitarianism", I wasn't referring to the classical meaning of utilitarianism as maximizing the happiness of everyone, but instead the meaning it seems to have taken in common parlance: any theory where decisions are made by maximizing expected total utility. Nobody (that I know of) has principles that are entirely absolute: they are always weighted against other principles and possible consequences, implying that they must have different weightings that are compared to find the combination that produces the best result (interpretable as the one that produces the highest utility). I suppose you could reject this and say that people just have this insanely huge preference ordering for different outcomes, but that sounds more than a bit implausible. (Not to mention that you can construct a utility function for any given preference ordering, anyway.) Of course, it looks politically better to claim that your principles are absolute and not subject to negotiation, so people want to instinctively reject any such thoughts.
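As a small illustration of the parenthetical remark about constructing a utility function from a preference ordering, here is a sketch in Python. It assumes a finite list of outcomes ranked strictly from best to worst; the outcome names are made up. Note that this only yields an ordinal utility (the numbers encode order and nothing more), so it does not by itself say anything about expected-utility calculations over uncertain outcomes.

```python
def utility_from_ordering(ranked_best_to_worst):
    """Assign each outcome a number so that better outcomes get higher values.

    Assumes a finite, strict ordering with no ties or duplicates.
    """
    n = len(ranked_best_to_worst)
    return {outcome: n - i for i, outcome in enumerate(ranked_best_to_worst)}


# Hypothetical ordering, for illustration only.
ordering = ["keep promise", "tell white lie", "break promise"]
u = utility_from_ordering(ordering)


def prefers(a, b):
    """True if outcome a is ranked above outcome b."""
    return u[a] > u[b]
```

Any strictly increasing relabeling of these numbers would represent the same ordering equally well, which is one way of seeing how little the bare existence of such a function establishes.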

Nobody (that I know of) has principles that are entirely absolute: they are always weighted against other principles and possible consequences, implying that they must have different weightings that are compared to find the combination that produces the best result (interpretable as the one that produces the highest utility). I suppose you could reject this and say that people just have this insanely huge preference ordering for different outcomes, but that sounds more than a bit implausible. (Not to mention that you can construct a utility function for any given preference ordering, anyway.)

I reject both it, and the straw alternative you offer. I see no reason to believe that people have utility functions, that people have global preferences satisfying the requirements of the utility function theorem, or that people have global preferences at all. People do not make decisions by weighing up the "utility" of all the alternatives and choosing the maximum. That's an introspective fairy tale. You can ask people to compare any two things you like, but there's no guarantee that the answers will mean anything. If you get cyclic answers, you haven't found a money pump unless the alternatives are ones you can actually offer.

An Etruscan column or Bach's cantata 148?

Three badgers or half a pallet of bricks? (One brick? A whole pallet?)

You might as well ask Feathers or lead? Whatever answer you get will be wrong.

Observing that people prefer some things to others and arriving at utility functions as the normative standard of rationality looks rather similar to the process you described of going from moral intuitions to attaching moral value to generalisations about them.

Whether an ideal rational agent would have a global utility function is a separate question. You can make it true by definition, but that just moves the question: why would one aspire to be such an agent? And what would one's global utility function be? Defining them as "autonomous programs that are capable of goal directed behavior" (from the same Wiki article) severs the connection with utility functions. You can put it back in: "a rational agent should select an action that is expected to maximize its performance measure" (Russell & Norvig), but that leaves the problem of defining its performance measure. However you slide these blocks around, they never fill the hole.

Huh. Reading this comment again, I realize I've shifted considerably closer to your view, while forgetting that we ever had this discussion in the first place.

Having non-global or circular preferences doesn't mean a utility function doesn't exist - it just means it's far more complex.

Having non-global or circular preferences doesn't mean a utility function doesn't exist - it just means it's far more complex.

Can you expand on that? I can't find any description on the web of utility functions that aren't intimately bound to global preferences. Well-behaved global preferences give you utility functions by the Utility Theorem; utility functions directly give you global preferences.
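The link between well-behaved preferences and utility functions can be sketched for the simplest case: a finite set of strict pairwise preferences. The Python sketch below uses the standard-library `graphlib` module (Python 3.9+) and is only an illustration under those assumptions; the function name and the example items are hypothetical. If the pairs contain no cycle, a numbering consistent with them exists; if they contain a cycle, no assignment of real numbers can satisfy all of them, since it would require u(A) > u(B) > u(C) > u(A).

```python
from graphlib import TopologicalSorter, CycleError


def utility_if_consistent(strict_prefs):
    """Take (better, worse) pairs and return a dict of utilities that
    respects every pair, or None if the pairs contain a cycle."""
    graph = {}
    for better, worse in strict_prefs:
        # TopologicalSorter expects node -> predecessors; predecessors are
        # emitted first, so worse items end up with lower ranks.
        graph.setdefault(better, set()).add(worse)
        graph.setdefault(worse, set())
    try:
        order = list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None
    return {item: rank for rank, item in enumerate(order)}


acyclic = [("A", "B"), ("B", "C")]
cyclic = [("A", "B"), ("B", "C"), ("C", "A")]

u_ok = utility_if_consistent(acyclic)
u_bad = utility_if_consistent(cyclic)
```

When the pairs leave some items uncompared, the topological order picks one arbitrary completion among many, so the resulting numbers should not be read as saying anything about the comparisons that were never elicited.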

Someone recently remarked (in a comment I haven't been able to find again) that circular preferences really mean a preference for running around in circles, but this is a redefinition of "preference". A preference is what you were observing when you presented someone with pairs of alternatives and asked them to choose one from each. If, on eliciting a cyclic set of preferences, you ask them whether they prefer running around in circles or not, and they say not, then there you are, they've told you another preference. Are you going to then say they have a preference for contradicting themselves?

wasn't referring to the classical meaning of utilitarianism as maximizing the happiness of everyone, but instead the meaning it seems to have taken in common parlance: any theory where decisions are made by maximizing expected total utility.

I don't think that's the common usage. Maybe the same etymology means that any difference must erode, but I think it's worth fighting. A related distinction I think is important is consequentialism vs utilitarianism. I think that the modern meaning of consequentialism is using "good" purely in an ordinal sense and purely based on consequences, but I'm not sure what Anscombe meant. Decision theory says that coherent consequentialism is equivalent to maximizing a utility function.

[-][anonymous]12y 0

I was unsure of whether or not to post this, as its contents seemed obvious in retrospect, especially for this crowd. Still, formulating the set of our ethical intuitions as a black box function was a new and useful metaphor for me when I came up with it, so I thought I'd post this and see whether it got any upvotes.