your terminal values are complex and not objective

Tamsin Leake


13th Mar 2023


a lot of people seem to want terminal (aka intrinsic aka axiomatic) values (aka ethics aka morality aka preferences aka goals) to be simple and elegant, and to be objective and canonical. this carries over from epistemology, where we do favor simplicity and elegance.

we have uncertainty about our values, and it is true that our model of our values should, as per epistemology, generally tend to follow a simplicity prior. but that doesn't mean that our values themselves are simple; they're evidently complex enough that thinking about them even a little should make you realize they're much more complex than the kind of simple model people often come up with.

both for modeling the world and for modeling your values, you should favor simplicity as a prior and then update by filtering for hypotheses that match evidence, because the actual territory is big and complex.
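as a toy illustration of "simplicity as a prior, then filter by evidence", here is a minimal sketch. everything in it is an assumption made up for the example: the three candidate hypotheses, their description lengths in bits, and the choice of a 2^-length prior. it is not a claim about how anyone's values are actually represented.

```python
from fractions import Fraction

# hypothetical toy hypotheses about "what i value": each has a made-up
# description length in bits (shorter = simpler) and the set of things
# it says i value.
HYPOTHESES = {
    "only pleasure": (2, {"pleasure"}),
    "pleasure and freedom": (4, {"pleasure", "freedom"}),
    "pleasure, freedom, art, love": (8, {"pleasure", "freedom", "art", "love"}),
}

def simplicity_prior(hypotheses):
    """weight each hypothesis by 2^-length, then normalize to sum to 1."""
    weights = {name: Fraction(1, 2 ** bits) for name, (bits, _) in hypotheses.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def update(prior, hypotheses, observed_values):
    """keep only hypotheses that account for everything observed, then renormalize."""
    kept = {name: p for name, p in prior.items()
            if observed_values <= hypotheses[name][1]}
    if not kept:
        raise ValueError("no hypothesis matches the evidence")
    total = sum(kept.values())
    return {name: p / total for name, p in kept.items()}

prior = simplicity_prior(HYPOTHESES)
# evidence: on reflection, i notice i care about both pleasure and art.
posterior = update(prior, HYPOTHESES, {"pleasure", "art"})

print("prior favors:", max(prior, key=prior.get))
print("posterior favors:", max(posterior, key=posterior.get))
```

the simplest hypothesis starts out with the most weight, but a single observation it can't account for removes it; what survives is the more complex model, even though the prior penalized it.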

there is no objectively correct universal metaethics. there's just a large, complex, tangled mess of stuff that is hard to categorize and contains not just human notions but also culturally local notions of love, happiness, culture, freedom, friendship, art, comfort, diversity, etc. and yes, these are terminal values; there is no simple process that re-derives those values. i believe there is nothing for which i merely instrumentally value love or art, such that if you presented me with something else that does that thing better, i would happily give up on love/art. i value those things intrinsically.

if you talk of "a giant cosmopolitan value handshake between everyone", then picking that rather than paperclips, while intuitive to you (because you have your values) and even to other humans, doesn't particularly track anything universally canonical.

even within the set of people who claim to have cosmopolitan values, how conflicts are resolved and what "everyone" means and many other implementation details of cosmopolitanism will differ from person to person, and again there is no canonical unique choice. your notion of cosmopolitanism is a very complex object, laden with not just human concepts but also cultural concepts you've been exposed to, which many other humans, across both time and space, don't share.

there is no "metaethics ladder" which you can climb up in order to resolve this in an objective way for everyone, not even for all humans: which ladder you pick and how you climb it is still a complex subjective object laden with human concepts and concepts from your culture, and there is no such thing as a "pure" you or a "pure" person without those.

some people say "simply detect all agents in the cosmos and do a giant value handshake between those"; but on top of the previous problems for implementation details, this has the added issue that the things whose values we want to be satisfied aren't agents but moral patients. those don't necessarily match — superintelligent grabby agents shouldn't get undue amounts of power in the value handshake.

some people see the simplicity of paperclips as the problem, and declare that complexity or negentropy or something like that is the ultimate good. but a superintelligence maximizing for that would just fill the universe with maximally random noise, as opposed to preserving the things you like. turns out, "i want whatever is complex" is not sufficient to get our values; they're not just anything complex or complexity itself, they're an extremely specific complex set of things, as opposed to other equally complex sets of things.
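a rough way to see why "maximize complexity" lands on noise: a minimal sketch using compressed size as a crude stand-in for complexity. that proxy is my assumption for the example (it only upper-bounds anything like kolmogorov complexity), and the three samples are made up for illustration.

```python
import random
import zlib

def complexity_proxy(data: bytes) -> int:
    """compressed size in bytes: a crude, assumed stand-in for complexity."""
    return len(zlib.compress(data, 9))

# a sample of text about things people value (taken from this post).
values_text = (
    "there's just a large, complex, tangled mess of stuff that is hard to "
    "categorize and contains not just human notions but also culturally "
    "local notions of love, happiness, culture, freedom, friendship, art, "
    "comfort, diversity"
).encode("utf-8")
n = len(values_text)

# same length, maximally repetitive: the "paperclips" end of the scale.
paperclips = (b"paperclip " * n)[:n]

# same length, pseudo-random bytes (fixed seed so the run is repeatable).
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(n))

scores = {
    "paperclips": complexity_proxy(paperclips),
    "values text": complexity_proxy(values_text),
    "random noise": complexity_proxy(noise),
}

for name, score in scores.items():
    print(f"{name}: {score} bytes compressed (from {n})")
```

under this proxy the noise scores highest, so an optimizer told to push the number up would move toward noise and away from the text. the measure says nothing about which complex thing you have, only how incompressible it is.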

entropy just doesn't have much to do with terminal values whatsoever. sure, it has a lot to do with instrumental values: negentropy is the resource we have to allocate to the various things we want. but that's secondary to what it is we want to begin with.

as for myself, i love cosmopolitanism! i would like an egalitarian utopia where everyone has freedom and my personal lifestyle preferences aren't particularly imposed on anyone else. but make no mistake: this cosmopolitanism is my very specific view of it, and other people have different views of cosmopolitanism, when they're even cosmopolitan at all.



romeostevensit (1y)

Worldviews that contain more symmetries/invariances (i.e., don't need to update based on where/when you are standing) are generally considered better. Values are plausibly the same.

Vladimir_Nesov (1y)

Freedom is not necessarily freedom to get eaten by a recursively self-improving AI you built that then proceeds to eat everyone else. But freedom under negotiated norms that are mostly about placing boundaries on concerns that each party is primarily in control of might look like cosmopolitanism, not very sensitive to details of individual terminal values, with no need for merging of values in private spaces.

Tamsin Leake (1y)

sounds maybe kinda like a utopia design i've previously come up with, where you get your private computational garden and all interactions are voluntary.

that said, some values need to reach into people's gardens: you can't create arbitrarily suffering moral patients, you might have to be stopped in some way from partaking in some molochianisms, etc.

Vladimir_Nesov (1y)

Hence the misaligned ASI example: the private spaces shouldn't be allowed arbitrary computation; they still need to respect some norms. Individual control is only retained over what the norms leave out, which might allow almost all details of individual terminal values to be expressed, as long as they don't step on what the norms proscribe. The boundaries set by norms/commitments are about possible computations/behaviors, not particularly about physical location.

Petr Andreev (1y)

We have many objective values that result from cultural history, such as mythology, concepts, and other "legacy" things built upon them. When we say these values are objective, we mean that we receive them as they are, and we cannot change them too much. In general, they are kind of infinite mythologies with many rules that "help" people do something right "like in the past" and achieve their goals "after all."

We also have some objectively programmed values: our biological nature, our genes that work for reproduction.

When something really scary happens, like bombings, wars, or other threats to survival, simple values (whether they are biological, religious, or national) take charge. These observations confirm a certain hierarchy of values and needs.

Many of the values we talk about reflect our altruistic cosmopolitan hopes for the future, and they are not real values for most people. That's kind of a philosophical illusion that people usually talk about only after succeeding at other values, such as biological, religious, or national ones. It's an illusion that every smart person can understand basic philosophical or ethical constructions. For many tech-savvy people, it's easier to wear a comfortable political and social point of view, and they don't have time to learn about complex concepts like "should not do to another what he does not want another to do to him" or "treat humanity, both in your own person and in the person of everyone else, as an end, and you would never have treated it only as a means."

These concepts are too complex for most people, even tech-savvy ones with big egos. People from the outskirts of humanity who might also build AI may not understand such complex notions as philosophy, terminal, axiomatic, epistemology, and other terms. For a basic utilitarian brain, these could be just words to explain why you think you should get his goods or betray the ideas of his nation for your own.

Many people live lives in which violence, nepotism, and elitism are the basis of the existence of society, and judging by the stability of these regimes, this is not without some basic foundation. People in highly competitive areas may not have time for learning humanitarian sciences, they may not have enough information, and they may have basic "ideology blocks." In other words, it's like choosing comfortable shoes for them that fit well.

Suppose you were to ask people, "Okay, you have a button to kill someone you don't know. Nobody will know it was you, and you will get one million dollars. Will you press it?" For many of them, from 10% to 50%, the answer will be yes, or maybe even "How many times could I press it?" Many AI creators could be blind to cosmopolitan needs and values. They may not understand the dilemma of creating such buttons if they only do a small part of its creation or only part of the instruction to press it.

Maybe it is necessary to input moral and value monitoring inside products so that people use them in fervor not to harm others (maybe even in open source, so they could be so advanced that AI constructors should not use other sources). Some defense in the opportunity to create such things for themselves could be made. If someone could create a big graphical cluster or something like that, then they would have to seek help from advanced AI developers who apply basic precautions against existential threats. Some kind of red map needs to be drawn up so that the creators of the AI, or those who see its creation, can accurately see the signs that something is going completely wrong.

Of course, we cannot know what to do about solving GAI because we do not know what to expect, but maybe we could find something that will, with some probability, be good and identify what is completely wrong. Could we at least have a red map? What could everyone do to be less wrong in it?

LVSN (1y)

You can't say values "aren't objective" without some semantic sense of objectivity that they are failing to fulfill.

If you can communicate such a sense to me, I can give you values to match. That doesn't mean your sense of objectivity will have been perfect and unarbitrary; perhaps I will want to reconcile with you about our different notions of objectivity.

Still, I'm damn going to try to be objectively good.

It just so happens that my values connote all of your values, minus the part about being culturally local; funny how that works.

If you explicitly tell me that your terminal values require culturally local connotations, then I can infer you would have been equally happy with different values had you been born in a different time or place. I would like to think that my conscience is like that of Sejong the Great and Benjamin Lay: relatively less dependent on my culture's sticks and carrots.
