Ethics & Morality, Human Values, World Optimization

Human Values ≠ Goodness

by johnswentworth
2nd Nov 2025

30 comments, sorted by top scoring

Steven Byrnes (9d, karma 48, agreement 13)

(I say this all the time, but I think that [the thing you call “values”] is a closer match to the everyday usage of the word “desires” than the word “values”.)

I think we should distinguish three things: (A) societal norms that you have internalized, (B) societal norms that you have not internalized, (C) desires that you hold independent of [or even despite] societal norms.

For example:

  • A 12-year-old girl might feel very strongly that some style of dress is cool, and some other style is cringe. She internalized this from people she thinks of as good and important—older teens, her favorite celebrities, the kids she looks up to, etc. This is (A).
  • Meanwhile, her lame annoying parents tell her that kindness is a virtue, and she rolls her eyes. This is (B).
  • She has a certain way that she likes to arrange her pillows in bed at night before falling asleep. Very cozy. She has never told anyone about this, and has no idea how anyone else arranges their pillows. This is (C).

Anyway, the OP says: “our shared concept of Goodness is comprised of whatever messages people spread about what other people should value. … which sure is a different thing from what people do value, when they introspect on what feels yummy.”

I think that’s kinda treating the dichotomy as (B) versus (C), while denying the existence of (A).

If that 12yo girl “introspects on what feels yummy”, her introspection will say “myself wearing a crop-top with giant sweatpants feels yummy”. This obviously has memetic origins but the girl is very deeply enthusiastic about it, and will be insulted if you tell her she only likes that because she’s copying memes.

By the way, this is unrelated to “feeling of deep loving connection”. The 12yo girl does not have a “feeling of deep loving connection” to the tiktok influencers, high schoolers, etc., who have planted the idea in her head that crop-tops and giant sweatpants look super chic and awesome. I think you’re wayyy overstating the importance of “feeling of deep loving connection” for the average person’s “values”, and correspondingly wayyy understating the importance of this kind of norm-following thing. I have a draft post with much more about the norm-following thing, should be out soon :)

johnswentworth (9d, karma 12, agreement 2)

Good points as usual! On a meta note, I thought when writing this "Steve will probably say something like he usually says, and I still haven't fully incorporated it into my models, hopefully I'll absorb some more this time".

Anyway, I don't think I want to deny the existence of (A). I want to say that "style X is cool" is a true part of the girl's values insofar as style X summons up yummy/yearning/completeness/etc feelings on its own, and is not a true part of her values insofar as the feelings involved are mostly social anxiety or a yearning to be liked. (The desire to be liked would then be a part of her values, insofar as the prospect of being liked is what actually triggers the yearning.)

I do want to say that stuff is a true part of one's values once it triggers those feelings, regardless of whether memes were involved in installing the values along the way. I want to distinguish that from the case where people "tie themselves in knots", trying to act like they value something or telling themselves that they value something when the feelings are not in fact there, because they've been told (or logically convinced themselves) they "should" value the thing.

cousin_it (9d, karma 18, agreement 4)

I agree that the distinction is important. However, my view is that a lot of what you call "goodness" is part of society's mechanism to ensure cooperate/cooperate. It helps other people get yummy stuff, not just you.

You can of course free yourself from that mechanism, and explicitly strategize how to get the most "yumminess" for yourself without ending up broke/addicted/imprisoned/etc. If the rest of society still follows "goodness", that leads to defect/cooperate, and indeed you end up better off. But there's a flaw in this plan.

johnswentworth (9d, karma 15, agreement 12)

Part of the point I intended to convey with the post is that society pushing for cooperate/cooperate is one way that Goodness-claims can go memetic, but there are multiple other ways memeticity can be achieved which are not so well aligned with the Values of Humans (either one's own values or others'). Thus this part:

Albert has relatively low innate empathy, and throws out all the Goodness stuff about following the rules and spirit of high-trust communities. Albert just generally hits the “defect” button whenever it’s convenient. Then Albert goes all pikachu surprise face when he’s excluded from high trust communities.

The message is definitely not to go hammering the defect button all the time, that's stupid. Yet somehow every time someone suggests that Goodness is maybe not all it's cracked up to be, lots of onlookers immediately round this to "you should go around hammering the defect button all the time!" (some with positive affect, some with negative) and man I really wish people could stop rounding that off and absorb the actual point.

cousin_it (9d*, karma 14, agreement 0)

Hmm. In all your examples, Albert goes against "goodness" and ends up with less "yumminess" as a result. But my point was about a different kind of situation: some hypothetical Albert goes against "goodness" and actually ends up with more "yumminess", but someone else ends up with less. What do you think about such situations?

johnswentworth (9d, karma 8, agreement 4)

I would ask Albert: do you generally find it yummy when other people get more yumminess? Do you usually feel like shit when you screw over someone else? For most people, the answers to these are "yes". Most people do not actually like screwing over other people, most of the time (though there are of course exceptions).

Insofar as Albert is a sociopath, or is in one of those moods where he really does want to screw over someone else... I would usually say "Look man, I want you to pursue your best life and fulfill your values, so I wish you luck. But also I'm going to try to stop you, because I want the same for other people too, and I want higher-order nice things like high trust communities." One does not argue against the utility function, as the saying goes.

cousin_it (9d, karma 13, agreement 6)

Most people do not actually like screwing over other people

I think this is very culturally dependent. For example, wars of conquest were considered glorious in most places and times, and that's pretty much the ultimate form of screwing over other people. Or for another example, the first orphanages were built by early Christians; before that, orphans were usually disposed of. Or recall how common slavery and serfdom have been throughout history.

Basically my view is that human nature without indoctrination into "goodness" is quite nasty by default. Empathy is indeed a feeling we have, and we can feel it deeply (...sometimes). But we ended up with this feeling mainly due to indoctrination into "goodness" over generations. We wouldn't have nearly as much empathy if that indoctrination hadn't happened, and it probably wouldn't stay long term if that indoctrination went away.

johnswentworth (9d, karma 5, agreement 0)

I do want to say that stuff is a true part of one's values once it triggers the feelings of yumminess/yearning/etc, regardless of whether memes were involved in installing the values along the way. I want to distinguish that from the case where people "tie themselves in knots", trying to act like they value something or telling themselves that they value something when the feelings are not in fact there, because they've been told they "should" value the thing.

So yeah, some of our actual values are installed culturally/memetically, and that doesn't automatically make them bad or fake values. I'm on board with that, so long as the underlying feelings of yumminess/yearning/etc actually show up.

We can throw out the other junk of memetic egregore Goodness, without abandoning the stuff people actually feel good about.

cousin_it (9d*, karma 5, agreement 0)

But why do you think that people's feelings of "yumminess" track the reality of whether an action is cooperate/cooperate? I've explained that it hasn't been true throughout most of history: people have been able to feel "yummy" about very defecting actions. Maybe today the two coincide unusually well, but then that demands an explanation.

I think it's just not true. There are too many ways to defect and end up better off, and people are too good at rationalizing why it's ok for them specifically to take one of those ways. That's why we need an evolving mechanism of social indoctrination, "goodness", to make people choose the cooperative action even when it doesn't feel "yummy" to them in the moment.

johnswentworth (9d, karma 2, agreement 0)

But why do you think that people's feelings of "yumminess" track the reality of whether an action is cooperate/cooperate?

I don't think that's the right question here?

Let me turn it around: you say "That's why we need an evolving mechanism of social indoctrination, "goodness", to make people choose the cooperative action even when it doesn't feel "yummy" to them in the moment.". But, like, the memetic egregore "Goodness" clearly does not track that in a robust generalizable way, any more than people's feelings of yumminess do. The egregore is under lots of different selection pressures besides just "get people to not defect", and the egregore has indoctrinated people in different things over time. So why are you attached to the whole egregore, rather than wanting to jettison the bulk of the egregore and focus directly on getting people to not defect? Why do you think that the memetic egregore Goodness tracks the reality of whether an action is cooperate/cooperate?

cousin_it (9d, karma 7, agreement 0)

But, like, the memetic egregore “Goodness” clearly does not track that in a robust generalizable way, any more than people’s feelings of yumminess do.

I feel you're overstating the "any more" part, or at least it doesn't match my experience. My feelings of "goodness" often track what would be good for other people, while my feelings of "yumminess" mostly track what would be good for me. Though of course there are exceptions to both.

So why are you attached to the whole egregore, rather than wanting to jettison the bulk of the egregore and focus directly on getting people to not defect?

This can be understood two ways. 1) A moral argument: "We shouldn't have so much extra stuff in the morality we're blasting in everyone's ears, it should focus more on the golden rule / unselfishness". That's fine, everyone can propose changes to morality, go for it. 2) "Everyone should stop listening to morality radio and follow their feels instead". Ok, but if nobody listens to the radio, by what mechanism do you get other people to not defect? Plenty of people are happy to defect by feels, I feel I've proved that sufficiently. Do you use police? Money? The radio was pretty useful for that actually, so I'm not with you on this.

Garrett Baker (9d*, karma 4, agreement 0)

Insofar as Albert is a sociopath, or is in one of those moods where he really does want to screw over someone else... I would usually say "Look man, I want you to pursue your best life and fulfill your values, so I wish you luck. But also I'm going to try to stop you, because I want the same for other people too, and I want higher-order nice things like high trust communities." One does not argue against the utility function, as the saying goes.

This seems incoherent to me? I'd like it if all the sociopaths are duped by society into not pursuing their values, that's great for my values, and because they're evil I'd rather them not pursue their best life. However I still support distinguishing between goodness and human values for the same general-purpose reasons why often, even if it's possible in principle to use some piece of information for evil, it's still often better to spread & talk about that information than not.

More generally I think people are too quick to use the phrase "One does not argue against the utility function, as the saying goes." Yes, you can't argue against the utility function, but if someone has a bad utility function and is unaware what that utility function is, I'm not going to dissuade them from that (unless I think they'll be happy to cooperate with me on bettering both our goals if I do, but sociopaths are not known for such behavior). That's part of stopping them.

johnswentworth (8d, karma 9, agreement 4)

I'm quite confident my preferences are coherent here, it's one of the parts of my values I'm most familiar with.

There's both an instrumentalish and a terminalish component. The terminalish component is roughly a really strong preference to not try to mislead people about their own values; that in particular is just incredibly deeply wrong for me to do according to my own values. The instrumentalish component is... very similar to the thing where people are like "well we need to be a little hyperbolic or misleading or conceal our true intent in order to spread our political message successfully" and then over and over again that type of reasoning leads people to metaphorically smack themselves in the face, it's a massive own goal, it just does not work.

Noosphere89 (9d, karma 5, agreement 0)

Indeed, you could make a very reasonable argument that the entire reason AI might be dangerous is because once it's able to automate away the entire economy, as an example, defection no longer has any cost and has massive benefits (at least conditional on no alignment in values).

The basic reason why you can't defect easily and gain massive amounts of utility from social systems is a combo of humans not being able to evade enforcement reliably, due to logistics issues, combined with people being able to reliably detect defection in small groups due to reputation/honor systems, and combined with the fact that humans as individuals are far, far less powerful even selfishly as individuals than as cooperators.

This of course breaks once AGI/ASI is invented, but John Wentworth's post doesn't need to apply to post-AGI/ASI worlds.

Raemon (9d, karma 4, agreement 0)

I think that could probably also stand to be a short post, with a 5-word title encapsulating it.

Wei Dai (9d, karma 17, agreement 3)
  1. How does this carry into the future, when we'll be able to modify our brains/minds?
    1. Are our Values the real-world things that trigger our feelings, or the feelings themselves? (If the latter, we'll be able to artificially trigger them at negligible cost and with no negative side effects, unlike today.)
    2. "We Don’t Get To Choose Our Own Values" will be false, so that part will be irrelevant. How does this affect your arguments/conclusions?
  2. Even today, Goodness-as-memetic-egregore can (and does) heavily influence our Values, through the kind of mechanism described in Morality is Scary. (Think of the Communists who yearned for communism so much that they were willing to endure extreme hardship and even torture for it.) This seems like a crucial part of the picture that you didn't mention, and which complicates any effort to draw conclusions from it.
  3. My own perspective is that what you call Human Values and Goodness are both potential sources (along with others) of "My Real Values", which I'll only be able to really figure out after doing or learning a lot more philosophy (e.g., to figure out which ones I really want to, or should, keep or discard, or how to answer questions like the above). In the meantime, my main goals are to preserve/optimize my option values and ability to eventually do/learn such philosophy, and to avoid doing anything that might turn out to be really bad according to "My Real Values" (like denying some strong short-term desire, or committing a potential moral atrocity), using something like Bostrom and Ord's Moral Parliament model for handling moral uncertainty.

johnswentworth (9d, karma 4, agreement 0)

Main answer: this post is aimed at a lower level than you are at, and I intentionally did not unpack some of the more advanced questions, because that would have involved long sections which lower-level readers would find either hard to follow or unmotivated.

That said, the way I'd think about your points is in Values Are Real Like Harry Potter and We Don't Know Our Own Values.

Wei Dai (3h, karma 5, agreement 0)

I've now read your linked posts, but can't derive from them how you would answer my questions. Do you want to take a direct shot at answering them? And also the following question/counter-argument?

Think about the consequences, what will actually happen down the line and how well your Values will actually be satisfied long-term, not just about what feels yummy in the moment.

Suppose I'm a sadist who derives a lot of pleasure/reward from torturing animals, but also my parents and everyone else in society taught me that torturing animals is wrong. According to your posts, this implies that my Values = "torturing animals has high value", and Goodness = "don't torture animals", and I shouldn't follow Goodness unless it actually lets me better satisfy my values long-term, in other words allows me to torture more animals in the long run. Am I understanding your ideas correctly?

(Edit: It looks like @Johannes C. Mayer made a similar point under one of your previous posts, which you never answered.)

Assuming I am understanding you correctly, this would be a controversial position to say the least, and counter to many people's intuitions or metaethical beliefs. I think metaethics is a hard problem, and I probably can't easily convince you that you're wrong. But maybe I can at least convince you that you shouldn't be as confident in these ideas as you appear to be, nor present them to "lower-level readers" without indicating how controversial / counterintuitive-to-many the implications of your ideas are.

Wei Dai (8d, karma 5, agreement 0)

Main answer: this post is aimed at a lower level than you are at, and I intentionally did not unpack some of the more advanced questions

I wish there was some kind of disclaimer or hint near the beginning of the text that this is the case, so I would know to read it with this in mind (or skip it altogether as not written for me).

Nina Panickssery (9d, karma 12, agreement 6)

I think the confusion here is that "Goodness" means different things depending on whether you're a moral realist or anti-realist.

If you're a moral realist, Goodness is an objective quality that doesn't depend on your feelings/mental state. What is Good may or may not overlap with what you like/prefer/find yummy, but it doesn't have to.

If you're a moral anti-realist, either:

  • "Goodness" is meaningless.
  • "Goodness" is a shorthand for something like:
    • "My fundamental, least changeable preferences/likes/wants"
    • "The subset of my preferences/likes/wants that many other people share"
    • "The subset of my preferences/likes/wants that it's socially acceptable to talk a lot about/encourage others to adopt"
    • "The subset of my preferences/likes/wants that I want others to adopt"

I think "Human Values" is a very poor phrase because:

  • If you're a moral realist, you can just say "Goodness" instead of "Human Values".
  • If you're a moral anti-realist, you can just talk about your preferences, or a particular subset of your preferences (e.g. any of the options listed above).

Instead, people referring to "Human Values" obscure whether they are moral realists or anti-realists, which causes a lot of confusion when determining the implications and logical consistency of their views.

Vanessa Kosoy (9d, karma 9, agreement 0)

I mostly agree with this, the part which feels off is

I’d like to say here “screw memetic egregores, follow the actual values of actual humans”

Humans already follow their actual Values[1], and always will, because their Values are the reason they do anything at all. They also construct narratives about themselves that involve Goodness, and sometimes deny the distinction between Goodness and Values altogether. This act of (self-)deception is in itself motivated by the Values, at least instrumentally.

I do have a version of the “screw memetic egregores” attitude, which is, stop self-deceiving. Because, deception distorts epistemics, and we cannot afford distorted epistemics right now. It's not necessarily correct advice for everyone, but I believe it's correct advice for everyone who is seriously trying to save the world, at least.

Another nuance is that, in addition to empathy and naive tit-for-tat, there is also acausal tit-for-tat. This further pushes the Value-recommended strategy in the direction of something Goodness-like (in certain respects), even though ofc it doesn't coincide with the Goodness of any particular culture in any particular historical period.

[1] As Steven Byrnes wrote, "values" might not be the best term, but I will keep it here.

Linda Linsefors (6d, karma 6, agreement 8)

To some extent "goodness" is some ever-moving negotiated set of norms of how one should behave.

I notice that when I use the word "good" (or invoke this concept using other words such as "should"), I don't use it to point to the existing norms, but as a bid for what I think these norms should be. This sometimes overlaps with the existing norms and sometimes not.

E.g. I might say that it's good to allow lots of different subcultures to co-exist. This is a vote for a norm where people who don't share my subculture leave me and my friends alone, in exchange for us leaving them alone. This is not unrelated to me getting what is yummy to me, but it is at least one step removed.

"Good" is the set of norms we use to coordinate cooperation. If most people don't like when you pick your nose in public, then it's good to make an effort not to do so, and similar for a lot of other values. Even if you don't care about the nose picking, you probably care about some of the other things "good" coordinates around. For most people it's probably worth supporting the package deal. But I also think you "should" use your voice to help improve the notion of what is "good".

skolemizer (9d, karma 5, agreement 2)
  • Our Values are (roughly) the yumminess or yearning we feel when imagining something.
  • Goodness is (roughly) whatever stuff the memes say one should value.

I do not think this matches my usage of the words "Human Values" or (especially) "Goodness" (nor the usage of the rare intelligent people whose ethical judgement I trust). The concept of yumminess/yearning is relevant; the concept of popular assertions of what one ought to yearn for is relevant. But I object to both of these rough definitions on the grounds that they miss many central aspects.

Concretely: consider a heroin addict, in a memetic environment that strongly disapproves of heroin usage. Because of their addiction, by far the greatest yumminess they feel when imagining things is more heroin (and things which may have brought their past-self feelings of yumminess no longer have that feeling, because it cannot compete). In your framework, getting more heroin is part of their Values, but not part of their culture's Goodness.

So far so good — but now compare to your example of a gay man in a memetic environment that strongly disapproves of gay romance and sex. As far as I can tell, your analytic framework treats these cases exactly identically: it's a conflict between Values and Goodness, maybe with the man repeatedly tying himself up in knots to try and fail to crush his Values in the name of Goodness. But I claim this is wrong: an accurate account of Values and Goodness should be able to distinguish these two scenarios. (Lest you think I'm letting my own biases slip in: replace "gay romance and sex" with one of the sexual fetishes I personally disapprove of and think should be socially stigmatized. The distinction I'm getting at here is different.)

I challenge you to articulate the relevant difference between those two scenarios in your analytical framework. I claim any framework which can't is flinching away from a hard part of describing the type signatures and natures of Values and Goodness. This is the sense in which I meant that your rough definitions miss central concepts.

(Unless you assert that the two cases aren't different, in which case we might just have a more object-level disagreement, as opposed to you being wrong about your word usage.)

As for what central concepts your framework is missing — this deserves a longer response, but in lieu of that I will briefly gesture at one concept. There is the curious but well-known phenomenon whereby there is a difference between what a human wants (in the sense of revealed preference) and what he or she wants to want (in a particular complicated sense I'm only gesturing at). As you understand well, a man can have false beliefs about what he wants. For the same reason, he can have false beliefs about what he wants-to-want. (In particular, verbal descriptions of what one wants-to-want are not identical to what one actually wants-to-want.)

I claim the self-hating socially-stigmatized heroin-addict has correct beliefs about what he wants-to-want, whereas the self-hating socially-stigmatized sexual-deviant has false beliefs thereof. This distinction is not one of yumminess-upon-imagining (each feels yummy upon imagining using heroin and having deviant sex), and it is not one of memetic pressure (each's behavior is disapproved of by society, and by me personally). But the distinction is central to understanding Human Values and Goodness.

Kaarel (9d*, karma 5, agreement 0)

This post doesn't seem to provide reasons to have one's actions be determined by one's feelings of yumminess/yearning, or reasons to think that what one should do is in some sense ultimately specified/defined by one's feelings of yumminess/yearning, over e.g. what you call "Goodness"? I want to state an opposing position, admittedly also basically without argument: that it is right to have one's actions be determined by a whole mess of things together importantly including e.g. linguistic goodness-reasoning, object-level ethical principles stated in language or not really stated in language, meta-principles stated in language or not really stated in language, various feelings, laws, commitments to various (grand and small, shared and individual) projects, assigned duties, debate, democracy, moral advice, various other processes involving (and in particular "running on") other people, etc. These things in their present state are of course quite poor determiners of action compared to what is possible, and they will need to be critiqued and improved — but I think it is right to improve them from basically "the standpoint they themselves create".[1]

The distinction you're trying to make also strikes me as bizarre given that in almost all people, feelings of yumminess/yearning are determined largely by all these other (at least naively, but imo genuinely and duly) value-carrying things anyway. Are you advocating for a return to following some more primitively determined yumminess/yearning? (If I imagine doing this myself, I imagine ending up with some completely primitively retarded thing as "My Values", and then I feel like saying "no I'm not going to be guided by this lmao — fuck these "My Values"".) Or maybe you aren't saying one should undo the yumminess/yearning-shaping done by all this other stuff in the past, but are still advising one to avoid any further shaping in the future? It'd surprise me if ≈any philosophically serious person would really agree to abstain from e.g. using goodness-talk in this role going forward.

The distinction also strikes me as bizarre given that in ordinary action-determination, feelings of yumminess/yearning are often not directly applied to some low-level givens, but e.g. to principles stated in language, and so only become fully operational in conjunction with, e.g., minimally something like internal partly-linguistic debate. So if one were to get rid of the role of goodness-talk in one's action-determination, even one's existing feelings of yumminess/yearning could no longer remotely be "fully themselves".


[1] If you ask me "but how does the meaning of "I should X" ultimately get specified/defined", then: I don't particularly feel a need to ultimately reduce shoulds to some other thing at all, kinda along the lines of https://en.wikipedia.org/wiki/Tarski's_undefinability_theorem and https://en.wikipedia.org/wiki/G._E._Moore#Open-question_argument .

the gears to ascension (9d, karma 4, agreement 0)

I mostly don't seem to have anything new to say in response to this at the moment, but I figured mentioning my comment from a few weeks ago on hunches about origins of caring-for-others was in order, so there it is.

Lukas Finnveden (9d, karma 4, agreement 0)
  • Goodness is (roughly) whatever stuff the memes say one should value.

    Looking at that first one, the second might seem kind of silly. After all, we mostly don’t get to choose what triggers yumminess or yearning.

A lot of goodness is about what you should do rather than what you should feel yearning for. There’s less conflict there. Even if you can’t change what you feel yearning for, you can change what you do. 

julius vidal (9d, karma 3, agreement 0)

One (over)optimistic hope I have is that something like a really good scale-free theory of intelligent agency would define a way to construct a notion of goodness that was actually aligned with the values of the members of a society to the best extent possible.

julius vidal (9d, karma 3, agreement 0)

Is there a distinction to be made between different kinds of social imperatives?


E.g. I think a lot of people might feel the memetic egregore tells them they should try to look good more than it tells them to be humble, but they might still associate the latter with 'goodness' more, because when they are told to do it, it is in the context of morality or virtue.
 

williawa (9d, karma 3, agreement 0)

I agree there is an important distinction, but I think the social memetic aspect of "Goodness" is not central. The central distinction is that we have access to yumminess directly, it is the only thing we "truly care about" in some sense, but as bounded and not even perfectly coherent agents, we're unable to roll our predictions forward over all possible action paths and maximize yumminess.

Instead we need to form a compact/abstracted representation of our values/yumminess to 1) make them legible to ourselves, 2) make plans to attain them, 3) communicate them, and 4) make them more coherent.

Jesper L. (9d, karma 3, agreement 0)

I update my moral values based on my ontology. I try to factor in epistemic uncertainty. I do not attribute goodness to human values, because I do not center my world view around humans only. What an odd thing to do.

Ethics to me is an epistemic project. I read literature, poetry, the Upanishads, the Gita, the Gospels, Meditations, the sequences... More obscure things. I think and I update.


There is a temptation to simply define Goodness as Human Values, or vice versa.

Alas, we do not get to choose the definitions of commonly used words; our attempted definitions will simply be wrong. Unless we stick to mathematics, we will end up sneaking in intuitions which do not follow from our so-called definitions, and thereby mislead ourselves. People who claim that they use some standard word or phrase according to their own definition are, in nearly all cases outside of mathematics, wrong about their own usage patterns.[1]

If we want to know what words mean, we need to look at e.g. how they’re used and where the concepts come from and what mental pictures they summon. And when we look at those things for Goodness and Human Values… they don’t match. And I don’t mean that we shouldn’t pursue Human Values; I mean that the stuff people usually refer to as Goodness is a coherent thing which does not match the actual values of actual humans all that well.

The Yumminess You Feel When Imagining Things Measures Your Values

There’s this mental picture where a mind has some sort of goals inside it, stuff it wants, stuff it values, stuff which from-the-inside feels worth doing things for. In old-school AI we’d usually represent that stuff as a utility function, but we wanted some terminology for a more general kind of “values” which doesn’t commit so hard to the mathematical framework (and often-confused conceptual baggage outside the math) of utility functions. The phrase “human values” caught on.

We don’t really know what human values are, or what shape they are, or even whether they’re A Thing at all. We don’t have trivial introspective access to our own values; sometimes we think we value a thing a lot, but realize in hindsight that we value it only a little. But insofar as the mental picture is pointing to a real thing at all, it does tell us how to go look for our values within our own minds.

How do we go look for our own values?

Well, we’re looking for some sort of goals, stuff which our minds want or value, stuff which drives us, etc. What does that feel like from the inside? Think of the stuff that, when you imagine it, feels really yummy. It induces yearning and longing. It feels like you’d be more complete with it. That’s the feeling of stuff that you value a lot. Lesser versions of the same feeling come when imagining things you value less (but still positively).

Personally… I get that feeling of yumminess and yearning when I imagine having a principled mathematical framework for understanding the internal structures of minds, which actually works on e.g. image generators.[2] I also get that feeling of yumminess and yearning when I imagine a really great night of dancing, or particularly great sex, or physically fighting with friends, or my favorite immersive theater shows, or some of my favorite foods at specific restaurants. Sometimes I get a weaker version of the yumminess and yearning feeling when I imagine hanging out around a fire with friends, or just sitting out on my balcony alone at night and watching the city, or dealing with the sort of emergency which is important enough that I drop everything else from my mind and just focus.

Those are my values. That’s what human values look like, and how to probe for yours.

“Goodness” Is A Memetic Egregore

I did not first learn about Goodness by imagining things and checking how yummy they felt. I first learned about Goodness by my parents and teachers and religious figures and books and movies and so forth telling me that it’s Good to not steal things, Good to do unto others what I’d have them do unto me, Good to follow rules and authority figures, Good to clean up after myself, Good to share things with other kids, Good to not pick my nose, etc, etc.

In other words, I learned about Goodness mostly memetically, absorbing messages from others about what’s Good.

Some of those messages systematically follow from some general principles. Things like “don’t steal” are social rules which help build a high-trust society, making it easier for everyone to get what they want insofar as everyone else follows the rules. We want other people to follow those rules, so we teach other people the rules. Other aspects of Goodness, especially about cleanliness, seem to mostly follow humans’ purity instincts, and are memetically spread mainly by people with relatively-strong purity instincts in an attempt to get people with relatively-weaker purity instincts to be less gross (think nose picking). Still other aspects of Goodness seem rather suspiciously optimized for getting kids to be easier for their parents and teachers to manage - think following rules or respecting one’s elders. Then there are aspects of Goodness which seem to be largely political, driven by the usual political memetic forces.

The main unifying theme here is that Goodness is a memetic egregore; in practice, our shared concept of Goodness is comprised of whatever messages people spread about what other people should value.

… which sure is a different thing from what people do value, when they introspect on what feels yummy.

Aside: Loving Connection

One thing to flag at this point: you know the feeling of deep loving connection, like a parent-child bond or spousal bond or the feeling you get (to some degree) when deeply empathizing with someone or the feeling of loving connection to God or the universe which people sometimes get from religious experiences? I.e. oxytocin?

For many (most?) people, that feeling is a REALLY big chunk of their Values. It is the thing which feels yummiest, often by such a large margin that it overwhelms everything else. If that’s you, then it’s probably worth stopping to notice that there are other things you value. It is quite possible to hyperoptimize for that one particular yumminess, then burn out and later realize that one values other things too - as many a parent learns when the midlife crisis hits.

That feeling of deep loving connection is also a major component of the memetic egregore Goodness, to such an extent that people often say that Goodness just is that kind of love. Think of the songs or hippies or whoever saying that all the world’s problems would be solved if only we had more love. As with values, it is worth stopping to notice that loving connection is not the entirety of Goodness, as the term is typically used. The people saying that Goodness just is loving connection (or something along those lines) are making the same move as someone trying to define a word; in most cases their usage probably doesn’t even match their own definition on closer inspection.

It is true that deep loving connection is both an especially large chunk of Human Values and an especially large chunk of Goodness, and within that overlap Human Values and Goodness do match. But that’s not the entirety of either Human Values or Goodness, and losing track of the rest is a good way to shoot oneself in the foot eventually.

We Don’t Get To Choose Our Own Values (Mostly)

To summarize so far:

  • Our Values are (roughly) the yumminess or yearning we feel when imagining something.
  • Goodness is (roughly) whatever stuff the memes say one should value.

Looking at that first one, the second might seem kind of silly. After all, we mostly don’t get to choose what triggers yumminess or yearning. There are some loopholes - e.g. sometimes we can learn to like things, or intentionally build new associations - but mostly the yumminess is not within conscious control. So it’s kind of silly for the memetic egregore to tell us what we should find yummy.

A central example: gay men mostly don’t seem to have much control over their attraction to men; that yumminess is not under their control. In many times and places the memetic egregore Goodness said that men shouldn’t be sexually attracted to men (those darn purity instincts!), which… usually isn’t all that effective at changing the underlying yumminess or yearning.

What does often happen, when the memetic egregore Goodness dictates something in conflict with actual Humans’ actual Values, is that the humans “tie themselves in knots” internally. The gay man’s attraction to men is still there, but maybe that attraction also triggers a feeling of shame or social anxiety or something. Or maybe the guy just hides his feelings, and then feels alone and stressed because he doesn’t feel safe being open with other people.

Sex and especially BDSM is a ripe area for this sort of thing. An awful lot of people, probably a majority of the population, sure do feel deep yearning to either inflict or receive pain, to take total control over another or give total control to another, to take or be taken by force, to abandon propriety and just be a total slut, to give or receive humiliation, etc. And man, the memetic egregore Goodness sure does not generally approve of those things. And then people tie themselves in knots, with the things that turn them on most also triggering anxiety or insecurity.

So What Do?

I’d like to say here “screw memetic egregores, follow the actual values of actual humans”, but then many people will be complete fucking idiots about it. So first let’s go over what not to do.

There’s a certain type of person… let’s call him Albert. Albert realizes that Goodness is a memetic egregore, and that the memetic egregore is not particularly well aligned with Albert’s own values. And so Albert throws out all that Goodness crap, and just queries his own feelings of yumminess in-the-moment when making decisions.

This goes badly in a few different ways.

Sometimes Albert has relatively low innate empathy, and throws out all the Goodness stuff about following the rules and spirit of high-trust communities. Albert just generally hits the “defect” button whenever it’s convenient. Then Albert goes all pikachu surprise face when he’s excluded from high-trust communities.

Other times Albert is just bad at thinking far into the future, and jumps on whatever feels yummy in-the-moment without really thinking ahead. A few years down the line Albert is broke.

Or maybe Albert rejects memetic Goodness, ignores authority a little too much, and winds up unemployed or in prison. Or ignores purity instincts a little too much and winds up very sick.

Point is: there’s a Chesterton’s fence here. Don’t be an idiot. Goodness is not very well aligned with actual Humans’ actual Values, but it has been memetically selected for a long time and you probably shouldn’t just jettison the whole thing without checking the pieces for usefulness. In particular, a nontrivial chunk of the memetic egregore Goodness needs to be complied with in order to satisfy your actual Values long term (which usually involves other people), even when it conflicts with your Values short term. Think about the consequences, what will actually happen down the line and how well your Values will actually be satisfied long-term, not just about what feels yummy in the moment.

… and then jettison the memetic egregore and pay attention to your and others' actual Values. Don’t make the opposite mistake of motivatedly looking for clever reasons to not jettison the egregore just because it’s scary.

  1. ^

    You can quick-check this in individual cases by replacing the defined word with some made-up word wherever the person uses it - e.g. replace “Goodness” with “Bixness”.

  2. ^

    … actually when I first try to imagine that I get a mild “ugh” because I’ve tried and failed to make such a thing before. But when I set that aside and actually imagine the end product, then I get the yummy feeling.