Upcoming stability of values

15th Mar 2018


What would you say to someone old who hadn't changed their values since they were five years old?

What would you say to anyone old who hadn't changed their values since they were eighteen years old?

You'd probably have cause to pity the second and seriously worry about the first. The processes of learning and ageing inevitably reshape our values and preferences, and we have well-worn narratives about how life circumstances change people.

But we may be entering a whole new era. Human values are malleable, and super-powered AIs may become adept at manipulating them, possibly at the behest of other humans.

Conversely, when we start becoming able to fine-tune our own values, people will start to stabilise their own values, preventing value drift. Especially if human lifespan increases, there will be a strong case for keeping your values close, and not allowing a random walk until it hits an attractor. The more we can self-modify, the more the argument about convergent instrumental goals will apply to us - including stability of terminal goals.

So, assuming human survival, I expect that we can look forward to much greater stability of values in the future, with humans making their values fixed, if only to protect themselves against manipulation.

Possible Consequences

In such a world, the whole narrative of human development will change, with "stages of life" marked more by information, wealth, or position, than by changes in values. Nick Bostrom once discussed "super-babies" - entities that preserved the values of babies but had the intelligence of adults. Indeed, many pre-adolescents would object to going through adolescence, and this is unlikely to be forced on all of them. So we may end up with perpetual pre-adolescents, babies created with adult values, or a completely different maturation process, which didn't involve value changes.

Thus, unlike today, creators/parents will be able to fix the values of their offspring with little risk that these values would change. There are many ways this could go wrong, the most obvious being the eternal conservation of pernicious values and the potential splintering of humanity into incompatible factions - or strict regulations on the creation of new entities, to prevent that happening.

In contrast, interactions between different groups may become more relaxed than previously. Change of values through argumentation or social pressure would no longer be options (and I presume these future humans would be able to cure themselves of patterns of interactions they dislike, such as feeling the need to respond to incendiary comments). So interactions would be between beings that know for a fact they could never convince each other of moral facts, thus removing any need to convince, shame, or proselytise in conversation.

It's also possible that divergent stable values may have fewer consequences than we think. This is partly because there are instrumental reasons for people to compromise on values when interacting with each other, and partly because it's not clear how much of human value divergence is actually divergence in factual understanding, or mere tribalism. Factual divergences are much harder to sustain artificially, and tribalism is likely to transform into various long-term contracts.

On a personal note, if I had full design capabilities over my own values, I'd want to allow for some slack and moral progress, but constrain my values not to wander too far from their point of origin.

Tags: Human Values, Philosophy, Robust Agents, Value Drift


15 comments

Qiaochu_Yuan (8y)

> You'd probably have cause to pity the second and seriously worry about the first.

Not really, seems fine to me. As far as I know Duncan would self-describe as not having changed his values since he was twelve and I deeply admire him for that.

norswap (8y)

That's a nitpick. He said you'd **probably** have cause to pity him, and indeed, except in rare ubermensch I think that would be the case.

Qiaochu_Yuan (8y)

Okay, so one thing I actually want here is for Stuart to clarify what he means by "values." One thing you might mean by a person's values changing as they get older is something like, I used to value eating as much ice cream as possible, and now I value reading books, or something. But this is pretty far on the instrumental side of instrumental vs. terminal values. One thing you might mean by a person's values staying the same as they get older, more in the terminal values direction, is something like, I used to value having fun and now I still value having fun.

Instrumental values can change all the time in response to learning more about how the world works and what sorts of strategies do or do not get you your terminal values, but that's orthogonal to the question of whether your terminal values are drifting (to the extent that it even makes sense to ask this question of a human) and whether that's good or not.

johnswentworth (8y)

Personally, I usually like the values of five-year-olds better than the values of adults. The five-year-olds haven't had the ambition beaten out of them yet, they at least still have their sights aimed high. They want to be astronauts or whatever. You talk to the average adult over thirty, and their life goals amount to "impress friends/family, raise the kids well, prep for retirement, have some fun".

Side note: I remember lying in bed worrying about this back in sixth grade. I promised myself I wouldn't abandon my ambitions when I got older. Turns out I broke that promise; I decided my childhood ambitions weren't ambitious enough. It just never occurred to me until high school that "don't die at all" could be on the table.

norswap (8y)

By and large, 5-year-olds don't have such lofty values (well, lofty for you at least). And they are incredibly cruel - maybe you weren't, but then you were an outlier. I suspect you were probably less kind and more selfish than you are now, even though you probably didn't realize it at the time (and probably couldn't, and that's precisely why children tend to be like this: it's incredibly hard for them to course correct without outside intervention).

Pattern (7y)

This invites the question - "why do we change our values" or "when is it good to change values". (While that seems to depend on the definition of "values", it seems worth engaging with this question, as is.)

> What would you say to someone old who hadn't changed their values since they were five years old?
>
> What would you say to anyone old who hadn't changed their values since they were eighteen years old?

Perhaps my answer depends on their age (or their values). If someone is 5 years old, and a day, how much change should we expect? 18 years and a day?

Maybe the key factor is information. While we don't expect every day to be (very) life changing, we expect a life changing day, to have an effect. In this sense value stability isn't valued. That is, as we acquire more information*, values should change (if the information is relevant, and different, or suggests change). So what we want might be more information (which is true). On the other hand, would you want to have a lot of life changing days, one after another? To some degree, stability and resources enable adjustments. Beliefs and habits may take time to change, and major life changes can be stressful. It is one thing to seek information, it would be another to live in an externally imposed (sonic) deluge of information.

It is worth noting both that 1) as time goes on, and specifically as one acquires more information, the evidence that should be needed to shift beliefs changes, namely increases, 2) No change means no growth. To have one's values frozen at the age of 100, and to still be the same at 200, seems a terrible thing.

(Meta-values might change less than lower level values, if there are fewer things that affect them, or that might be a result of meta-value change precipitating lower level value change, so delta L <= delta M because delta M -> delta L. How things might work in the other direction isn't as clear - would lots of value change cause change in the level above it?)

It is tricky to account for manipulation though. When is disseminating true information manipulative? (Cherry picking?)

Another factor might be something like exploration or 'preventing boredom'. Nutrition aside, while we might have a favorite food, eating too much of it, for too many days in a row may be unappealing (in advance, or in hindsight). Perhaps you have a desire to travel, to see new things; or to change in certain ways - a new skill you want to learn, a new habit to make, or new goals to achieve. (Still sounds like growth, though things we create which outlast us can also be about growing something other than ourselves.)

*No double counting, etc. On the other hand, if we've learned more/grown/changed, we might explore the new implications of "old" information. This isn't easy to model, outside of noticing recurring failure modes.

Stuart_Armstrong (7y)

Information can (and should) change your behaviour, even if it doesn't change your values. Becoming a parent should change your attitude to various things whose purpose you didn't see till then! And values can prefer a variety of experiences, if we cash out our boredom properly.

The problem is that humans mix information and values together in highly complicated, non-rational ways.

ESRogs (8y)

> Conversely, when we start becoming able to fine-tune our own values, people will start to stabilise their own values, preventing value drift.

When I read this I assume you have in mind a point in the future when we're uploads, or have broad control over biology. Which makes me surprised to then read this part:

> So we may end up with perpetual pre-adolescents, babies created with adult values, or a completely different maturation process, which didn't involve value changes.

Are you imagining a future where we both can engineer our bodies / minds, and also still go through (or at least start to go through) the normal human physical maturation process from baby to adult (complete with all the hormonal changes and growth spurts, etc.)?

This seems sort of incongruous (anachronistic?) to me.

Stuart_Armstrong (8y)

That seems still plausible to me under the biology scenario (at least early on), not so much for uploads.

But stability of value is one of the reasons I expect we'll design a different maturation process, sooner than we otherwise would.

Richard_Kennaway (7y)

> Especially if human lifespan increases, there will be a strong case for keeping your values close, and not allowing a random walk until it hits an attractor.

In other words, be an attractor for your current values already. But at what age should one decide that here, at last, is where I am going to fix myself like a sea squirt on the landscape of values?

avturchin (8y)

I think a situation is possible where meta-values are stable but lower-level values are changing. Meta-values are values about how my values should change. For example, I prefer that they not be changed against my will by some secret method of hypnosis. But I also prefer that they change according to their natural evolution, as I don't want to get stuck in some repetitive behaviour.

Vladimir_Nesov (7y)

These "meta-values" you mention are just values applied to appraisal of values. So in these terms it's possible to have values about meta-values and to value change in meta-values. Value drift becomes instrumentally undesirable with greater power over the things you value, and this argument against value drift is not particularly sensitive to your values (or "meta-values"). Even if you prefer for your values to change according to their "natural evolution", it's still useful for them not to change. For change in values to be a good decision, you need to see value drift as more terminally valuable than its opportunity cost (decrease in value of the future according to your present values, in the event that your present values undergo value drift).

Stuart_Armstrong (7y)

Yep. But I note that many people seem to value letting their values drift somewhat, so that needs to be taken into account.

Stuart_Armstrong (8y)

What would you want your values to do if you lived two thousand+ years? And did you have a firm answer to that question before I asked it? Would most people?

avturchin (8y)

I would prefer to still have the meta-meta-value of being alive. If it holds and works, I can enjoy the play of all possible values and their combinations on lower levels. I have thought about this before, though maybe not in exactly the same wording.

Most people never think in this way, but their preferences may be learned from experiments - for example, it is well known how much people would pay to preserve their life for 1 year.

A meta-analysis of 42 studies estimated that people are ready to pay between $100K and $400K per QALY, that is, for only one year (Hirth, Chernew, Miller, Fendrick, & Weissert, 2000), while the median household income at the time of the study was only $37,000 in non-inflation-adjusted dollars (US Census Bureau, 1997).
