The central argument of this post is that human values are only complex because all the obvious constraints and goals are easily fulfilled. The resulting post-optimization world is deeply confusing, and leads to noise as the primary driver of human values. This has worrying implications for any kind of world-optimizing. (This isn't a particularly new idea, but I am taking it a bit farther and/or in a different direction than this post by Scott Alexander, and I think it is worth making clear, given the previously noted connection to value alignment and effective altruism.)

First, it seems clear that formerly simple human values are now complex. "Help and protect relatives, babies, and friends" as a way to ensure group fitness and survival is mostly accomplished, so we find complex ethical dilemmas about the relative values of different behaviors. "Don't hurt other people" as a tool for ensuring reciprocity has turned into compassion for humanity, animals, and perhaps other forms of suffering. These values are more complex than anything that could have been expressed in the ancestral environment, given its restricted resources. It's worth looking at what changed, and how.

In the ancestral environment, humans had three basic desires: food, fighting, and fornication. Food is now relatively abundant, leading to people's complex preferences about exactly which flavors they like most. These preferences differ because the base drive for food is already over-satisfied. Fighting was competition between people for resources - and since we all have plenty, it turns into status-seeking in ways that aren't particularly meaningful outside of human social competition. The varieties of signalling and counter-signalling are the result. And fornication was originally for procreation, but we're adaptation executers, not fitness maximizers, so we've short-circuited that with birth control and pornography, leading to an explosion of sexual variety-seeking and individual kinks.

Past the point where maximizing the function has a meaningful impact on the intended result, we see the tails come apart. The goal-seeking part of human nature, however, still needs some direction in which to push the optimization process. The implication is that humanity finds diverging goals because we are past the point where the basic desires run out. As Randall Munroe points out in an xkcd comic, this leads to increasingly complex and divergent preferences for ever less meaningful results. That comic would be funnier if it weren't a huge problem for aligning group decision-making and avoiding longer-term problems.
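The "tails come apart" effect can be sketched numerically. In this toy simulation (the setup and numbers are my own illustration, not anything from the statistics literature beyond the standard phenomenon), a proxy that correlates decently with the underlying goal across the whole population correlates far more weakly among the top scorers on the proxy - once you select hard enough, what separates the selected individuals is mostly noise:

```python
import random

rng = random.Random(42)
n = 100_000

# A latent "skill" drives both the true goal and the proxy we select on;
# each also gets independent noise, so they correlate but imperfectly.
skill = [rng.gauss(0, 1) for _ in range(n)]
goal = [s + rng.gauss(0, 1) for s in skill]
proxy = [s + rng.gauss(0, 1) for s in skill]

def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# Select the top 1% on the proxy, i.e. push hard on the measurable thing.
top = sorted(zip(proxy, goal))[-1000:]

pop_r = corr(proxy, goal)
tail_r = corr([p for p, _ in top], [g for _, g in top])
print(pop_r, tail_r)  # the proxy-goal correlation collapses in the tail
```

With this setup the whole-population correlation is about 0.5, while within the top 1% it falls to a small fraction of that: optimization past the meaningful range mostly selects on the noise terms.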

If this is correct, the key takeaway is that as humans find ever fewer things to need, they inevitably find ever more things to disagree about. Even though we expect convergent goals related to dominating resources, narrowly implying that we want to increase the pool of resources to reduce conflict, human values might be divergent as the pool of such resources grows.


It is not obvious that "value" is synonymous with "moral value", and it is not clear that divergence in values is a moral issue, or necessarily leads to conflict. Food preferences are the classic example of preferences that can be satisfied in a non-exclusionary way.

This is a good point, but I'll lay out the argument against it.

To start, I'm personally skeptical of the claim that preferences and moral values can be clearly distinguished, especially given the variety of value systems that people have preferred over time, or even today.

Even if this is false, we seem to see the same phenomenon occur with moral values. I think the example of obvious differences in the relative preference for saving dogs, the elderly, or criminals points to actual differences in values - but as I argued above, I think this is a heavily optimized subspace of a moral intuition towards valuing life which is now largely selecting on noise. But the difference in moral conclusions that follows from assigning animal lives exactly zero value versus a smaller-than-human but nonzero value is huge.


To start, I’m personally skeptical of the claim that preferences and moral values can be clearly distinguished, especially given the variety of value systems that people have preferred over time, or even today.

There is a clear distinction within each system: if you violate a moral value, you are a bad person; if you violate a non-moral value, you are something else -- irrational or foolish, maybe.

Also, you have not excluded levelling down -- nothing is a moral value -- as an option.

If you want a scale of moral value that is objective, universal, and timeless, you are going to have problems. But it is a strange thing to want in the first place, because value is not an objective physical property. Different people value different things. Where those values or preferences can be satisfied individually, there is no moral concern. Where there are trade-offs, or potential for conflict, there is a need for -- in the sense that a group is better off with -- publicly agreed and known standards and rules. Societies with food scarcity have rules about who is allowed to eat what; societies with food abundance don't. Morality is an adaptation; it isn't, and should not be, the same everywhere.

The EA framing -- where morality is considered in terms of making improvements, and of individual, voluntary actions -- makes it quite hard to understand morality in general, because morality in general is about groups, obligations, and prohibitions.

My favorite analogy is that of an automated control system, like a radio receiver. The main goal is to lock onto the station and maximize signal fidelity, but once that's done, there is always the residual noise of the locking algorithm steering around the optimal point. The drive to optimize doesn't disappear; it ends up cranking up the gain on this residual noise and trying to find something else to optimize. In psychology this is phenomenologically described as Maslow's hierarchy. This (misapplied) drive to optimize creates the complexity, because it is trying to work on its own internal optimization algorithms, leading to a lot of non-linearity and spurious chaotic behavior. Hmm, wonder how hard it would be to model this mathematically or numerically.
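A minimal numerical sketch of that analogy (the gain, noise level, and step sizes here are arbitrary assumptions, not measurements of any real receiver): a hill-climbing tuner converges quickly while far from the peak, but near the peak the true gradient vanishes and the same loop gain just amplifies measurement noise into wander around the lock point.

```python
import random

def signal(x):
    # Signal strength peaks at x = 0; quadratic near the peak.
    return 1.0 - x ** 2

def tune(steps=2000, gain=0.05, noise=0.01, seed=0):
    """Hill-climb on noisy signal readings, like a receiver's lock loop."""
    rng = random.Random(seed)
    x = 2.0  # start far from the station
    history = []
    eps = 0.05  # probe offset for the finite-difference gradient estimate
    for _ in range(steps):
        up = signal(x + eps) + rng.gauss(0, noise)
        down = signal(x - eps) + rng.gauss(0, noise)
        x += gain * (up - down) / (2 * eps)
        history.append(x)
    return history

hist = tune()
# Far from the optimum, motion is driven by the true gradient;
# near it, the loop keeps steering, but only on measurement noise.
drift_early = max(hist[:100]) - min(hist[:100])
wander_late = max(hist[-500:]) - min(hist[-500:])
print(drift_early, wander_late)
```

The early samples cover a large deterministic sweep toward the peak; the late samples never settle to a fixed point but dither in a small noise-driven band, which is the "residual optimization noise" the analogy points at.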

Yeah, if you fulfill people's material desires, their remaining unfulfilled desires will be more conflicting. But the picture you paint - that "basic desires run out", and what remains is noise and irrelevance - doesn't look right to me.

Rather, what remains is mostly positional (status) desires. These are just as basic and universal as wanting food or comfort. And they're still relevant to the original goal, increasing the relative (zero-sum) frequency of your genes. That's why these desires are hard to fulfill for everyone, not because of noise or irrelevance. This old post by Phil Goetz explains the idea and takes it further.

I agree that positional goods are important even in the extreme, but:

1) I don't think that sexual desires or food preferences fit in this mold.

2) I don't think that which things are selected as positional goods (perhaps other than wealth and political power) is dictated by anything other than noise and path dependence - the best tennis player, the best DOTA player, or the most cited researcher are all positional goods, and all can absorb arbitrary levels of effort, but the form they take and the relative prestige they get is based on noise.

Agreed that the amount of status you can get from tennis or DOTA is path-dependent. But why do you count that as part of complexity of value? To me it looks more like we have one simple value (status) that we try to grab where we can.

In part, I think the implication of zero-sum versus non-zero-sum status is critical. Non-zero-sum status is "I'm the best left-handed minor league pitcher by allowed runs," while zero-sum status is "by total wealth/power, I'm 1,352,235,363rd in the world." Saying we have only one positional value for status seemingly assumes the zero-sum model.

The ability to admit these non-zero sum status signals has huge implications for whether we can fulfill values. If people can mostly find relatively high-position niches, the room for selection on noise and path-dependent value grows.

This also relates to TAG's point about whether we care about "value" or "moral value" - and I'd suggest there might be moral value in fulfilling preferences only if they are not zero-sum positional ones.

Non-zero sum status is “I’m the best left-handed minor league pitcher by allowed runs”

How is that non-zero-sum?

There are real examples of "non-zero-sum status" - e.g. you might feel inferior to an anime character - but the examples you give aren't that. The sum of many small zero-sum games is still zero-sum.

It would be non-zero-sum if, for example, you're the only person in the world who cares about “I’m the best left-handed minor league pitcher by allowed runs”. By thinking up new status positions that only you care about, or that you care about more than others, you can gain more value than other people lose.

It may be sensible, after all, to describe this sort of status as “non-zero-sum”, if such contexts satisfy these criteria:

  1. Status within some context subjectively matters as much as status outside the context would, in the absence of the context (or in lack of participation in it)
  2. There is no limit on the number of such independent contexts that may be created

What I am describing here is discussed in detail in another gwern classic: “The Melancholy of Subculture Society”.

Have you read the Blue-Minimizing Robot? Early Homo sapiens was in the simple environment where it seemed like they were "minimizing blue," i.e. maximizing genetic fitness. Now, you might say, it seems like our behavior indicates preferences for happiness, meaning, validation, etc, but really that's just an epiphenomenon no more meaningful than our previous apparent preference for genetic fitness.

However, there is an important difference between us and the blue-minimizing robot, which is that we have a much better model of the world, and within that model of the world we do a much better job than the robot at making plans. What kind of plans? The thing that motivates our plans is, from a purely functional perspective, our preferences. And this thing isn't all that different in modern humans versus hunter-gatherers. We know, we've talked to them. There have been some alterations due to biology and culture, but not as much as there could have been. Hunter-gatherers still like happiness, meaning, validation, etc.

What seems to have happened is that evolution stumbled upon a set of instincts that produced human planning, and that in the ancestral environment this correlated well with genetic fitness, but in the modern environment this diverges even though the planning process itself hasn't changed all that much. There are certain futuristic scenarios that could seriously disrupt the picture of human values I've given, but I don't think it's the default, particularly if there aren't any optimization processes much stronger than humans running around.

Yes, and that's closely related to the point I made about "we're adaptation executers, not fitness maximizers."

My point is a step further, I think - I'm asking what decides which things we plan to do? It's obviously our "preferences," but if we've already destroyed everything blue, the next priority is very underspecified.