Ben Goertzel:

I doubt human value is particularly fragile. Human value has evolved and morphed over time and will continue to do so. It already takes multiple different forms. It will likely evolve in future in coordination with AGI and other technology. I think it's fairly robust.

Robin Hanson:

Like Ben, I think it is ok (if not ideal) if our descendants' values deviate from ours, as ours have from our ancestors. The risks of attempting a world government anytime soon to prevent this outcome seem worse overall.

We all know the problem with deathism: a strong belief that death is almost impossible to avoid, clashing with undesirability of the outcome, leads people to rationalize either the illusory nature of death (afterlife memes), or desirability of death (deathism proper). But of course the claims are separate, and shouldn't influence each other.

Change in values of the future agents, however sudden or gradual, means that the Future (the whole freakin' Future!) won't be optimized according to our values, won't be anywhere near as good as it could've been otherwise. It's easier to see a sudden change as morally relevant, and easier to rationalize gradual development as morally "business as usual", but if we look at the end result, the risks of value drift are the same. And it is difficult to make it so that the future is optimized: to stop uncontrolled "evolution" of value (value drift) or recover more of astronomical waste.

Regardless of the difficulty of the challenge, it's NOT OK to lose the Future. The loss might prove impossible to avert, but it's still not OK: the value judgment cares not for the feasibility of its desire. Let's not succumb to the deathist pattern and lose the battle before it's done. Have the courage and rationality to admit that the loss is real, even if it's too great for mere human emotions to express.


Anyone who predicts that some decision may result in the world being optimized according to something other than their own values, and is okay with that, is probably not thinking about terminal values. More likely, they're thinking that humanity (or its successor) will clarify its terminal values and/or get better at reasoning from them to instrumental values to concrete decisions, and that their understanding of their own values will follow that. Of course, when people are considering whether it's a good idea to create a certain kind of mind, that kind of thinking probably means they're presuming that Friendliness comes mostly automatically. It's hard for the idea of an agent with different terminal values to really sink in; I've had a little bit of experience with trying to explain to people the idea of minds with really fundamentally different values, and they still often try to understand it in terms of justifications that are compelling (or at least comprehensible) to them personally. Like, imagining that a paperclip maximizer is just like a quirky highly-intelligent human who happens to love paperclips, or is under the mistaken impression that maximizing paperclips is the rig...

Ok. Well done. You have managed to frighten me. Frightened me enough to make me ask the question: "Just why do we want to build a powerful optimizer, anyways?" Oh, yeah. Now I remember. The reason we want to build a powerful optimizer is because some people think that "Is death okay/good?" is not a confusing question but that the question "Is it okay/good to risk the future of the Earth by building an amoral agent much more powerful than ourselves?" is confusing.

Ok. Well done. You have managed to frighten me. Frightened me enough to make me ask the question: "Just why do we want to build a powerful optimizer, anyways?"

I feel like I remember trying to answer the same question (asked by you) before, but essentially, the answer is that (1) eventually (assuming humanity survives long enough) someone is probably going to build one anyway, probably without being extremely careful about understanding what kind of optimizer it's going to be, and getting FAI before then will probably be the only way to prevent it; (2) there are many reasons why humanity might not survive long enough for that to happen (it's likely that humanity's technological progress over the next century will continuously lower the amount of skill, intelligence, and resources needed to accidentally or intentionally do terrible things), and getting FAI before then may be the best long-term solution to that; (3) given that pursuing FAI is likely necessary to avert other huge risks, and is therefore less risky than doing nothing, it's an especially good cause considering that it subsumes all other humanitarian causes (if executed successfully).

Perhaps you did. This time, my question was mostly rhetorical, but since you gave a thoughtful response, it seems a shame to waste it. Uh. Prevent it how. I'm curious how that particular sausage will be made. More sausage. How does the FAI solve that problem? It seemed that you said the root cause of the problem was technological progress, but perhaps I misunderstood. Hmmm. Amnesty International, Doctors without Borders, and the Humane Society are three humanitarian causes that come to mind. FAI subsumes these ... how, exactly? Again, my questions are somewhat rhetorical. If I really wanted to engage in this particular dialog, I should probably do so in a top-level posting. So please do not feel obligated to respond. It is just that if Ben Goertzel is so confused as to hope that any sufficiently intelligent entity will automatically empathize with humans, then how much confusion exists here regarding just how much humans will automatically accept the idea of sharing a planet with an FAI? Smart people can have amazing blind spots.

If I knew how that sausage will be made, I'd make it myself. The point of FAI is to do a massive amount of good that we're not smart enough to figure out how to do on our own.

Hmmm. Amnesty International, Doctors without Borders, and the Humane Society are three humanitarian causes that come to mind. FAI subsumes these ... how, exactly?

If humanity's extrapolated volition largely agrees that those causes are working on important problems, problems urgent enough that we're okay with giving up the chance to solve them ourselves if they can be solved faster and better by superintelligence, then it'll do so. Doctors Without Borders? We shouldn't be needing doctors (or borders) anymore. Saying how that happens is explicitly not our job — as I said, that's the whole point of making something massively smarter than we are. Don't underestimate something potentially hundreds or thousands or billions of times smarter than every human put together.

I actually think we know how to do the major 'trauma care for civilization' without FAI at this point. FAI looks much cheaper and possibly faster though, so in the process of doing the "trauma care" we should obviously fund it as a top priority. I basically see it as the largest "victory point" option in a strategy game.
When answering questions like this, it's important to make the following disclaimer: I do not know what the best solution is. If a genuine FAI considers these questions, ve will probably come up with something much better. I'm proposing ideas solely to show that some options exist which are strictly preferable to human extinction, dystopias, and the status quo. It's pretty clear that (1) we don't want to be exterminated by a rogue AI, or nanotech, or plague, or nukes, (2) we want to have aging and disease fixed for us (at least for long enough to sit back and clearly think about what we want of the future), and (3) we don't want an FAI to strip us of all autonomy and growth in order to protect us. There are plenty of ways to avoid both these possibilities. For one, the FAI could basically act as a good Deist god should have: fix the most important aspects of aging, disease and dysfunction, make murder (and construction of superweapons/unsafe AIs) impossible via occasional miraculous interventions, but otherwise hang back and let us do our growing up. (If at some point humanity decides we've outgrown its help, it should fade out at our request.) None of this is technically that difficult, given nanotech. Personally, I think a FAI could do much better than this scenario, but if I talked about that we'd get lost arguing the weird points. I just want to ask, is there a sense in which this lower bound would really seem like a dystopia to you? (If so, please think for a few minutes about possible fixes first.)
No, not at all. It sounds pretty good. However, my opinion of what you describe is not the issue. The issue is what ordinary, average, stupid, paranoid, and conservative people think about the prospect of a powerful AI totally changing their lives when they have only your self-admittedly ill-informed assurances regarding how good it is going to be.
Please don't move the goalposts. I'd much rather know whether I'm convincing you than whether I'm convincing a hypothetical average Joe. Figuring out a political case for FAI is important, but secondary to figuring out whether it's actually possible and desirable.
Ok, I don't mean to be unfairly moving goalposts around. But I will point out that gaining my assent to a hypothetical is not the same as gaining my agreement regarding the course that ought to be followed into an uncertain future.
That's fair enough. The choice of course depends on whether FAI is even possible, and whether any group could be trusted to build it. But conditional on those factors, we can at least agree that such a thing is desirable.
I'd really appreciate your attempting to write up some SIAI literature to communicate these points to the audiences you are talking about. It is hard.
What is questionable is not the possibility of fundamentally different values but that they could accidentally be implemented. What you are suggesting is that some intelligence is able to evolve a vast repertoire of heuristics, acquire vast amounts of knowledge about the universe, dramatically improve its cognitive flexibility and yet never evolve its values but keep its volition at the level of a washing machine. I think this idea is flawed, or at least not sufficiently backed up to take it seriously right now. I believe that such an incentive, or any incentive, will have to be deliberately and carefully hardcoded or evolved. Otherwise we are merely talking about grey goo scenarios.
"Is death absolutely bad or not?" is a somewhat confusing question. If, at an emotional level, you can't phrase questions but can only choose between them, it can become "Is death okay/good?" by pattern matching.

Change in values of the future agents, however sudden or gradual, means that the Future (the whole freakin' Future!) won't be optimized according to our values, won't be anywhere near as good as it could've been otherwise.

That really depends on what you mean by "our values":

1) The values of modern, western, educated humans? (as opposed to those of the ancient Greek, or of Confucius, or of medieval Islam), or

2) The "core" human values common to all human civilizations so far? ("stabbing someone who just saved your life is a bit of a dick move", "It would be a shame if humanity was exterminated in order to pave the universe with paperclips", etc.)

Both of those are quite fuzzy and I would find it hard to describe either of them precisely enough that even a computer could understand them.

When Eliezer talks of Friendly AI having human value, I think he's mostly talking about the second set (as in The Psychological Unity of Mankind). But when Ben or Robin talk about how it isn't such a big deal if values change, because they've already changed in the past, they seem to be referring to the first kind of value.

I would agree with Ben and Robin that it isn...

I find this distinction useful. According to the OP, I'd be considered a proponent of values-deathism proper, but only in terms of the values you place in the first set; I consider the exploration of values-space to be one of the values in the second set, and a significant part of my objection to the idea of tiling the universe with paperclips is that it would stop that process.
Your values are at least something that on reflection you'd be glad happened, which doesn't apply to acting on human explicit beliefs that are often imprecise or wrong. More generally, any heuristic for good decisions you know doesn't qualify. "Don't kill people" doesn't qualify. Values are a single criterion that doesn't tolerate exceptions and status quo assumptions. See magical categories for further discussion.
But that may not be what Ben implied when saying (I read it as "our ancestors" meaning "the ancient Greeks", not, "early primates" but I may be wrong)
In a certain sense "primordial single-celled replicator" may be an even more relevant comparison than either. Left free to deviate, Nash would weed out those pesky 'general primate values'.
Spelling notice (bold added):
Fixed, thanks.
I don't think 2 accurately reflects Eliezer's Preservation target. CEV doesn't ensure beforehand that any of those core values aren't thrown out. What's important is the process by which we decide to adopt or reject values, how that process changes when we learn more, and things like that. That is also one thing that could now change as the direct result of choices we make, through brain modification, or genetic engineering, or AIs with whole new value-adoption systems. Our intuition tends to treat this as stable even when we know we're dealing with 'different' cultures.

How about ruler-of-the-universe deathism? Wouldn't it be great if I were sole undisputed ruler of the universe? And yet thinking that rather unlikely, I don't even try to achieve it. I even think trying to achieve it would be counter-productive. How freakin' defeatist is that?

That you won't try incorporates feasibility (and can well be a correct decision, just as expecting defeat may well be correct), but value judgment doesn't, and shouldn't be updated on lack of said feasibility. It's not OK to not take over the world. There is no value in trying.
I think that if I took over the world it might cause me to go Unfriendly; that is, there's a nontrivial chance that the values of a DSimon that rules the world would diverge from my current values sharply and somewhat quickly. Basically, I just don't think I'm immune to corruption, so I don't personally want to rule the world. However, I do wish that the world had an effective ruler that shared my current values.
See this comment. The intended meaning is managing to get your values to successfully optimize the world, not for your fallible human mind to issue orders. Your actions are pretty "Unfriendly" even now, to the extent they don't further your values because of poor knowledge of what you actually want and poor ability to form efficient plans.
I don't think you know what "OK" means.
Yes, that was some rhetoric applause-lighting on my part with little care about whether you meant what my post seemed to assume you meant. I think the point is worth making (with deathist interpretation of "OK"), even if it doesn't actually apply to yours or Ben's positions.
Unless you know you're kind of a git or, more generally, your value system itself doesn't rate 'you taking over the world' highly. I agree with your position though. It is interesting to note that Robin's comment is all valid when considered independently. The error he makes is that he presents it as a reply to your argument. "Should" is not determined by "probably will".
It's an instrumental goal, it doesn't have to be valuable in itself. If you don't want for your "personal attitude" to apply to the world as a whole, it reflects the fact that your values disagree with your personal attitude, and you prefer for the world to be controlled by your values rather than personal attitude. Taking over the world as a human ruler is certainly not what I meant, and I expect is a bad idea with bad expected consequences (apart from independent reasons like being in a position to better manage existential risks).
The point being that it can be a terminal anti-goal. People could (and some of them probably do) value not-taking-over-the-world very highly. Similarly there are people who actually do want to die after the normal allotted years, completely independently of sour grapes updating. I think they are silly, but it is their values that matter to them, not my evaluation thereof.
This is a statement about valuation of states of the world, a valuation that is best satisfied by some form of taking over the world (probably much more subtle than what gets classified so by the valuation itself). It's still your evaluation of their situation that says whether you should consider their opinion on the matter of their values, or know what they value better than they do. What is the epistemic content of your thinking they are silly?
I do not agree.

Voted up. However, I disagree with "it's not OK". Everything is always OK. OK is a feature of the map. From a psychological perspective, that's important. If an OK state of the map can't be generated by changing the territory, it WILL be generated by cheating and directly manipulating the map.

That said, we have preferences, rank orderings of outcomes. The value of futures with our values is high.

OK, fine, literally speaking, value drift is bad.

But if I live to see the Future, then my values will predictably be updated based on future events, and it is part of my current value system that they do so. I affirmatively value future decisions being made by entities that have taken a look at what the future is actually like and reflected on the data they gain.

Why should this change if it turns out that I don't live to see the future? I would like future-me to be one of the entities that help make future decisions, but failing that, my second-best option...

That's not our value system changing, that's your assessment of how to best achieve your values changing (edit: or your best guess of what your values actually are changing). You're using the term 'value' in a different sense from the original post.
Read the comments ([1], [2]).
OK, fine. That's a perfectly reasonable way to use the word "values." If that's what you mean, though, then I don't think any of us should get worked up about value drift. We can't even specify most of our top-node values with any kind of precision or accuracy -- why should we care if (a) they change or (b) a world that we personally do not live in becomes optimized for other values?
Where you don't have any preference, you have indifference, and you are not indifferent all around. There is plenty of content to your values. Uncertainty and indifference are no foes to accuracy, they can be captured as precisely as any other concept. Whether "you don't personally live" in the future is one property of the future to consider: would you like that property to hold? An uncaring future won't have you living in it, but a future that holds your values will try to arrange something at least as good, or rather much better. Also see Belief in the Implied Invisible. What you can't observe is still there, and still has moral weight.
As Poincaré said, "Every definition implies an axiom, since it asserts the existence of the object defined." You can call a value a "single criterion that doesn't tolerate exceptions and status quo assumptions" -- but it's not clear to me that I even have values, in that sense. Of course, I will believe in the invisible, provided that it is implied. But why is it, in this case? You also speak of the irrelevance (in this context) of the fact that these values might not even be feasibly computable. Or, even if we can identify them, there may be no feasible way to preserve them. But you're talking about moral significance. Maybe we differ, but to me there is no moral significance attached to the destruction of an uncomputable preference by a course of events that I can't control. It might be sad/horrible to live to see such days (if only by definition -- as above, if one can't compute their top-node values then it's possible that one can't compute how horrible it would be), as you say. It also might not. Although I can't speak personally for the values of a Stoic, they might be happy to... well, be happy.

Saying a certain future is "not ok", and saying gradual value drift is "business as usual", are both value judgments. I don't understand why you dismiss one of them but not the other, and call it "courageous and rational".

I don't understand your comment enough to reply ("business as usual"? - that was more of a fallacy pattern). Maybe if you write in more detail?
"Business as usual" can't be a "fallacy pattern" because it's not a statement about facts, it's a value statement that says we're okay with value drift (as long as it's some sort of "honest drift", I assume). As you see from the others' comments, people do really subscribe to this, so you're not allowed to dismiss it the way you do.
(Gradual) value drift -> Future not optimized -> Future worse than if it's optimized -> Bad. People subscribe to many crazy things. This one is incorrect pretty much by definition. (Note that a singleton can retain optimized future while allowing different-preference or changing-preference agents to roam in its domain without disturbing the overall optimality of the setup and indeed contributing to it, if that's preferable. This would not be an example of value drift.)
Value drift is a real-world phenomenon, something that happens to humans. No real-world phenomenon can be bad "by definition" - our morality is part of reality and cannot be affected by the sympathetic magic of dictionaries. Maybe you're using "value drift" in a non-traditional sense, as referring to some part of your math formalism, instead of the thing that happens to humans?
By definition, as in the property is logically deduced (given some necessary and not unreasonable assumptions). Consider a bucket with 20 apples of at least 100 grams each. Such buckets exist, they are "real-world phenomena". Its weight is at least 2 kg "by definition" in the same sense as a misoptimized future is worse than an optimized future.
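Spelled out, the analogy might look like the sketch below. The symbols $w_i$ (weight of apple $i$), $W$ (total weight of the bucket), $U$ (a value function over futures), and $x^*$ (an optimized future) are notation introduced here for illustration, not the commenter's own, and the second line assumes that one fixed value function $U$ is used to rank every future.

```latex
% Bucket: each of the 20 apples weighs at least 100 g, and the bucket itself
% has non-negative weight, so the total is bounded below.
W \;\ge\; \sum_{i=1}^{20} w_i \;\ge\; 20 \times 100\,\mathrm{g} \;=\; 2000\,\mathrm{g} \;=\; 2\,\mathrm{kg}

% Futures: if x^* maximizes U, then no other future scores higher under U.
x^* \in \arg\max_{x} U(x) \quad\Longrightarrow\quad U(x) \;\le\; U(x^*) \ \text{ for every future } x
```

Both conclusions follow from the stated premises alone. Whether value drift actually produces a future other than $x^*$, and whether a single fixed $U$ is the right yardstick, is exactly what the rest of the thread disputes.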
I think I figured out a way to incorporate value drift into your framework: "I value the kind of future that is well-liked by the people actually living in it, as long as they arrived at their likes and dislikes by honest value drift starting from us". Do you think anyone making such a statement is wrong?
It's a start. But note that it is trivially satisfied by a world that has no one living in it.
If what you value happens to be what's valued by future people, then future people are simultaneously stipulated to have the same values as you do. You don't need the disclaimers about "honest value drift", and there is actually no value drift. If there is genuine value drift, then after long enough you won't like the same situations as the future people. If you postulate that you only care about the pattern of future people liking their situation, and not other properties of that situation, you are embracing a fake simplified preference, similarly to people who claim that they only value happiness or lack of suffering.
What's "honest value drift" and what's good about it? Normally "value drift" makes me think of our axiology randomly losing information over time; the kind of future-with-different-values I'd value is the kind that has different instrumental values from me (because I'm not omniscient and my terminal values may not be completely consistent) but is more optimized according to my actual terminal values, presumably because that future society will have gotten better at unmuddling its terminal values, knowing enough to agree more on instrumental values, and negotiating any remaining disagreements about terminal values.
value drift -> Future not optimized to original values -> Future more aligned with new values -> Bad from original viewpoint, better from new viewpoint, not optimal from either viewpoint, but what can you do? Use of word "optimized" without specifying the value system against which the optimization took place --> variant of mind projection fallacy. ETA: There is a very real sense in which it is axiomatic both that our value system is superior to the value system of our ancestors and that our values are superior to those of our descendants. This is not at all paradoxical - our values are better simply because they are ours, and therefore of course we see them as superior to anyone else's values. Where the paradox arises is in jumping from this understanding to the mistaken belief that we ought not to ever change our values.
A compelling moral argument may change our values, but not our moral frame of reference. The moral frame of reference is like a forking bush of possible future value systems stemming from a current human morality; it represents human morality's ability to modify itself upon hearing moral arguments. The notion of moral argument and moral progress is meaningful within my moral frame of reference, but not meaningful relative to a paperclipper utility function. A paperclipper will not ever switch to stapler maximization on any moral argument; a consistent paperclipper does not think that it will possibly modify its utility function upon acquiring new information. In contrast, I think that I will possibly modify my morality for the better, it's just that I don't yet know the argument that will compel me, because if I knew it I would have already changed my mind. It is not impossible that paperclipping is the endpoint to all moral progress, and there exists a perfectly compelling chain of reasoning that converts all humans to paperclippers. It is "just" vanishingly unlikely. We cannot, of course, observe our moral frame of reference from an outside omniscient vantage point but we're able to muse about it. If we do assume omniscience for a second, then there is a space of values that humans would never willingly modify themselves into. Value drift means drifting into such space rather than a modification of values in general. If our ancestors and our descendants are in the same moral frame of reference then you could possibly convert your ancestors or most of your ancestors to your morality and be converted to future morality by future people. Of course it is not easy to say which means of conversion are valid; on the most basic level I'd say that rearranging your brain's atoms to a paperclipper breaks out of the frame of reference while verbal education and arguments generally don't.
Rather (in your terminology), value drift is change in the moral frame of reference, even if (current instrumental) morality stays the same.
I agree, it seems a more general way of putting it. Anyway, now that you mention it I'm intrigued and slightly freaked out by a scenario in which my frame of reference changes without my current values changing. First, is it even knowable when it happens? All our reasoning is based on current values. If an alien race comes and modifies us in a way that our future moral progress changes but not our current values, we could never know the change happened at all. It is a type of value loss that preserves reflective consistency. I mean, we wouldn't agree to be changed to paperclippers but on what basis could we refuse an unspecified change to our moral frame of reference (leaving current values intact)?
I'm not sure I understand this talk of "moral frames of reference" vs simply "values". But would an analogy to frame change be theory change? As when we replace Newton's theory of gravity with Einstein's theory, leaving the vast majority of theoretical predictions intact? In this analogy, we might make the change (in theory or moral frame) because we encounter new information (new astronomical or moral facts) that impel the change. Or, we might change for the same reason we might change from the Copenhagen interpretation to MWI - it seems to work just as well, but has greater elegance.
By analogy, take a complicated program as "frame of reference", and state of knowledge about what it outputs "current values". As you learn more, "current values" change, but frame of reference, defining the subject matter, stays the same and determines the direction of discovering more precise "current values". Note that the exact output may well be unknowable in its explicit form, but "frame of reference" says precisely what it is. Compare with infinite mathematical structures that can never be seen "explicitly", but with the laws of correct reasoning about them perfectly defined.
Is there potential divergence of "current values" in this analogy (or in your model of morality)?
Moral frame of reference determines the direction in exploration of values, but you can't explicitly know this direction, even if you know its definition, because otherwise you'd already be there. It's like with definition of action in ambient control. When definition is changed, you have no reason to expect that the defined thing remains the same, even though at that very moment your state of knowledge about the previous definition might happen to coincide with your state of knowledge about the new definition. And then the states of knowledge go their different ways.

And it is difficult to make it so that the future is optimized: to stop uncontrolled "evolution" of value (value drift) or recover more of astronomical waste.

I still find it shocking and terrifying every time someone compares the morphing of human values with the death of the universe. Even though I saw another FAI-inspired person do it yesterday.

If all intelligent life held your view about the importance of their own values, then life in the universe would be doomed. The outcome of that view is that intelligent life greatly increases its ac...

Agreed. Yes, this (value drift -> death of the universe) belief needs to be excised.

Goertzel: Human value has evolved and morphed over time and will continue to do so. It already takes multiple different forms. It will likely evolve in future in coordination with AGI and other technology.

Agree, but the multiple different current forms of human values are the source of much conflict.

Hanson: Like Ben, I think it is ok (if not ideal) if our descendants' values deviate from ours, as ours have from our ancestors.

Agree again. And in honor of Robin's profession, I will point out that the multiple current forms of human values are the dr...

timtyler: The idea is not really that you care equally about future events - but rather that you don't care about them to the extent that you are uncertain about them; that you are likely to be unable to influence them; that you will be older when they happen - and so on. It is like in chess: future moves are given less consideration - but only because they are currently indistinct low probability events - and not because of some kind of other intrinsic temporal discounting of value.
How do you know this? It feels this way, but there is no way to be certain. That we probably can't have something doesn't imply we shouldn't have it. That we expect something to happen doesn't imply it's desirable that it happens. It's very difficult to arrange so that change in values is good. I expect you'd need oversight from a singleton for that to become possible (and in that case, "changing values" won't adequately describe what happens, as there is probably better stuff to make than different-valued agents). Preference is not about "rights". It's merely game theory for coordination of satisfaction of preference. God does not care about our mathematical difficulties. --Einstein.
Alright. I shouldn't have said "we". I care more about the short term. And I am quite certain. WAY! Huh? What is it that you are not convinced we shouldn't have? Control over the distant future? Well, if that is what you mean, then I have to disagree. We are completely unqualified to exercise that kind of control. We don't know enough. But there is reason to think that our descendants and/or future selves will be better informed. Then let's make sure not to hire the guy as an FAI programmer.
I believe you know my answer to that. You are not licensed to have absolute knowledge about yourself. There are no human or property rights on truth. How do you know that you care more about the short term? You can have beliefs or emotions that suggest this, but you can't know what all the stuff you believe and all the moral arguments you respond to cash out into on reflection. We only ever know approximate answers, and given the complexity of the human decision problem and the sheer inadequacy of human brains, any approximate answers we do presume to know are highly suspect. That we aren't qualified doesn't mean that we shouldn't have that control. Exercising this control through decisions made with human brains is probably not it of course, we'd have to use finer tools, such as FAI or upload bureaucracies. Don't joke, it's serious business. What do you believe on the matter?
I am not the person who initiated this joke. Why did you mention God? If you don't care for discounting, what is your solution to the very standard puzzles regarding unbounded utilities and infinitely remote planning horizons?
Einstein mentioned God, as a stand-in for Nature. I didn't say I don't care for discounting. I said that I believe that we must be uncertain about this question. That I don't have solutions doesn't mean I must discard the questions as answered negatively.
Yes. So, for "our values", read "our extrapolated volition". It's not clear to me how much you and Nesov actually disagree about "changing" values, vs. you meaning by "change" the sort of reflective refinement that CEV is supposed to incorporate, while Nesov uses it to mean non-reflectively-guided (random, evolutionary, or whatever) change.
I do not mean "reflective refinement" if that refinement is expected to take place during a FOOM that happens within the next century or two. I expect values to change after the first superhuman AI comes into existence. They will inevitably change by some small epsilon each time a new physical human is born or an uploaded human is cloned. I want them to change. The "values of mankind" are something like the musical tastes of mankind or the genome of mankind. It is a collage of divergent things, and the set of participants in that collage continues to change. VN and I are in real disagreement, as far as I can tell.
This is not a disagreement, but a failure of communication. There is no one relevant sentence in this dispute which we both agree that we understand in the same sense, and whose truth value we assign differently.
It is a complete failure of communication if you are under the impression that the dispute has anything to do with the truth values of sentences. I am under the impression that we are in dispute because we have different values - different aspirations for the future.
Any adequate disagreement must be about different assignment of truth values to the same meaning. For example, I disagree with the truth of the statement that we don't converge on agreement because of differences in our values, given both your and my preferred interpretations of "values". But explaining the reason for this condition not being the source of our disagreement requires me to explain to you my sense of "values", the normative and not factual one, which I fail to accomplish.
I think we are probably in agreement that we ought to mean the same thing by the words we use before our disagreement has any substance. But your mention of "truth values" here may be driving us into a diversion from the main issue. Because I maintain that simple "ought" sentences do not have truth values. Only "is" sentences can be analyzed as true or false in Tarskian semantics. But that is a diversion. I look forward to your explanation of your sense of the word "value" - a sense which has the curious property (as I understand it) that it would be a tragedy if mankind does not (with AI assistance) soon choose one point (out of a "value space" of rather high dimensionality) and then fix that point for all time as the one true goal of mankind and its creations.
I gave up on the main issue, and so described my understanding of the reasons that justify giving up. Yes, and this is the core of our disagreement. Since your position is that something is meaningless, and mine is that there is a sense behind that, this is a failure of communication and not a true disagreement, as I didn't manage to communicate to you the sense I see. At this point, I can only refer you to the "metaethics sequence", which I know is not very helpful. One last attempt, using an intuition/analogy dump not carefully explained. Where do the objective conclusions about "is" statements come from? Roughly, you encounter new evidence, including logical evidence, and then you look back and decide that your previous understanding could be improved upon. This is the cognitive origin of anything normative: you have a sense of improvement, and an expectation of potential improvement. Looking at the same situation from the past, you know that there is a future process that can suggest improvements, you just haven't experienced this process yet. And so you can reason about the truth without having it immediately available. If you understand the way the previous paragraph explains the truth of "is" questions, you can apply exactly the same explanation to "ought" questions. You can decide in the moment what you prefer, what you choose, which action you perform. But in the future, when you learn more, experience more, you can look back and see that you should've chosen differently, that your decision could've been improved. This anticipation of possible improvement generates semantics of preference over the decisions that is not logically transparent. You don't know what you ought to choose, but you know that there is a sense in which some action is preferable to some other action, and you don't know which is which.
Sorry. I missed that subtext. Giving up may well be the best course. But my position is not that something (specifically an 'ought' statement) is meaningless. I only maintain that the meaning is not attained by assigning "truth value conditions". Your attempt was a step in the right direction, but IMO still leaves a large gap in understanding. You seem to think that anyone who thinks carefully enough will agree with you that there is some set of core meta-ethical principles that acts as an attractor in a dynamic process of reflective updating. I disagree with this. There is no core attractor, and the dynamic process is not one of better and better thinking as time goes on. Instead, the dynamics I am talking about is the biological evolutionary process which results in a change over time in the typical human brain. That plus the technological change over time which is likely to bring uploaded humans, AIs, aliens, and "uplifted" non-human animals into our collective social contract.
How can we know whether that is true or not? If we had access to multiple mature alien races, and could examine their moral systems, that might be a reasonable conclusion - if they were all very different. However, until then, the moral systems we can see are primitive - and any such conclusions would seem to be premature.
I'm sorry. I don't know which statement you mean to designate by "that". Nor do I know which conclusions you worry might be premature. To the best of my knowledge, I did not draw any conclusions.
We do seem to have an example of systematic positive change in values - the history of the last thousand years. No doubt some will argue that our values only look "good" because they are closest to our current values - but I don't think that is true. Another possible explanation is that material wealth lets us show off our more positive values more frequently. That's a harder charge to defend against, but wealth-driven value changes are surely still value changes. Systematic, positive changes in values tend to suggest a bright future. Go, cultural evolution!
Too much discounting runs into problems with screwing the future up, to enjoy short-term benefits. With 5-year political horizons, that problem seems far more immediate and pressing than the problems posed by discounting too little. From the point of view of those fighting the evils that too much temporal discounting represents, arguments about mathematical infinity seem ridiculous and useless. Since such arguments are so feeble, why even bother mentioning them?
I agree, but be careful with "We expect our values to change. Change can be good." Dutifully explain that you are not talking about value change in the mathematical sense, but about value creation, i.e. extending valuation to novel situations that is guided by values of a meta-level with respect to values casually applied to remotely similar familiar situations.
I beseech you, in the bowels of Christ, think it possible your fundamental values may be mistaken. I think that we need to be able to change our minds about fundamental values, just as we need to be able to change our minds about fundamental beliefs. Even if we don't currently know how to handle this kind of upheaval mathematically. If that is seen as a problem, then we better get started working on building better mathematics.
OK. I've been sympathetic with your view from the beginning, but haven't really thought through (so, thanks,) the formalization that puts values on epistemic level: distribution of beliefs over propositions "my-value (H, X)" where H is my history up to now and X is a preference (order over world states, which include me and my actions). But note that people here will call the very logic you use to derive such distributions your value system. ETA: obviously, distribution "my-value (H1, X[H2])", where "X[H2]" is the subset of worlds where my history turns out to be "H2", can differ greatly from "my-value (H2, X[H2])", due to all sorts of things, but primarily due to computational constraints (i.e. I think the formalism would see it as computational constraints). ETA P.S.: let's say for clarity, that I meant "X[H2]" is the subset of world-histories where my history has prefix "H2".
What we may need more urgently is the maths for agents who have "got religion" - because we may want to build that type of agent - to help to ensure that we continue to receive their prayers and supplications.

Hmm. I wonder if it helps with gathering energy to fight the views of others if you label their views as being "deathist".

Do you really? The choice of the label was certainly not optimized for this purpose; it was a pattern I saw in the two citations and used to communicate the idea of this post (it happens to have an existing label). Fighting the views of others is the wrong attitude; you communicate (if you expect your arguments to be accepted) and let people decide, not extinguish what others believe.
The first quote says human values have changed and that the core of our values is robust to radical, catastrophic change. The second quote says that human values have changed and that some future changes would be okay, and it states that there are greater risks to human values in accepting a global entity responsible for protecting against value changes due to AGI. Glossing those two quotes as being analogous to, and as equally irrational as, standard deathism seems like a deliberate misreading, and using the name 'value deathism' seems pretty suspect to me, for whatever that's worth.

Direct question. I cannot infer the answer from your posts. If human values do not exist in closed form (i.e. do include updates on future observations including observations which in fact aren't possible in our universe), then is it better to have FAI operating on some closed form of values instead?

I don't understand the question. Unpack closed form/no closed form, and where updating comes in. (I probably won't be able to answer, since this deals with observations, which I don't understand still.)
Then it seems better to demonstrate it on a toy model, as I've done for no closed form already. One way I can think of to describe the closed/no-closed distinction is that the latter requires an unknown amount of input to be able to compute a final/unchanging ordering over (internal representations of) world-states, while the former requires no input at all, or a predictable amount of input, to do the same. Another way to think about value with no closed form is that it gradually incorporates terms/algorithms acquired/constructed from the environment.
I understand the grandparent comment now. Open/closed distinction can in principle be extracted from values, so that values of the original agent only specify what kind of program the agent should self-improve into, while that program is left to deal with any potential observations. (It's not better to forget some component of values.)
I'm not sure I understand you. Values of the original agent specify a class of programs it can become. Which program of this class should deal with observations? Forget? Is it about "too smart to optimize"? This meaning I didn't intend. When a computer encounters the borders of the universe, it will have an incentive to explore every possibility that it is not the true border of the universe, such as: active deception by an adversary, different rules of the game's "physics" for the rest of the universe, the possibility that its universe is simulated, and so on. I don't see why it is rational for it to ever stop checking those hypotheses and begin to optimize the universe.

"But of course the claims are separate, and shouldn't influence each other."

No, they are not separate, and they should influence each other.

Suppose your terminal value is squaring the circle using Euclidean geometry. When you find out that this is impossible, you should stop trying. You should go and do something else. You should even stop wanting to square the circle with Euclidean geometry.

What is possible directly influences what you ought to do, and what you ought to desire.

Change in values of the future agents, however sudden of gradual, means that the Future (the whole freackin' Future!) won't be optimized according to our values, won't be anywhere as good as it could've been otherwise.

Even at the personal level our values change with time, somewhat dramatically as we grow in maturity from children to adolescents to adults, and more generally just as a process of learning modifying our belief networks.

Hanson's analogy to new generations of humans is especially important - value drift occurs naturally. A fully general ar...

Preserving morality doesn't mean that we have nothing more to decide, that there are no new ideas to create. New ideas (and new normative considerations) can be created while keeping morality. Value drift is destruction of morality, but lack of value drift doesn't preclude development, especially since morality endorses development (in the ways it does). Every improvement is a change, but not every change is an improvement. Current normative considerations that we have (i.e. morality) describe what kinds of change we should consider improvement. Forgetting what changes we consider to be improvement (i.e. value drift) will result in future change that is not moral (i.e. not improvement from the current standpoint).
I find myself in agreement with all this, except perhaps: Perhaps we are using the term 'value drift' differently. Would you consider the change from our ancestors' values to modern western values to be 'value drift'? What about the change from the values of the age-10 version of yourself to those of the age-20 or the current version? The current world is far from an idealization of my ancestors' values, and my current self is far from an idealized extrapolation of its future according to its values at the time. I don't find this type of 'value drift' to be equivalent to "forgetting what changes we consider to be improvement". At each step we are making changes to ourselves that we do believe are improvements at the time according to our values at the time, and over time this process can lead to significant changes in our values. Even so, this does not imply that the future changes are not moral (not improvement from the current standpoint). Change in values implies that the future will not be ideal from the perspective of current values, but from this it does not follow that all such futures are worse from the perspective of our current values (because all else is not equal).

Azathoth already killed you once, at puberty. Are you significantly worse off now that you value sex? Enough to eliminate that value?

There are two things wrong with this analogy. One is that this isn't a real value change. You gained the ability to appreciate sex as a source of pleasures, both lower and higher. As a child you already valued physical pleasure and social connections. Likewise, just as the invention of pianos allowed us to develop appreciations for things our ancestors never could, future technology will allow our descendants, we imagine, to find new sources of pleasure, higher and lower. And few people think this is bad in itself. The more fundamental problem with the analogy - or rather, the question that followed it - is that it asks the question from the perspective of the adult rather than the child. Of course if our descendants have radically different values, then unless their ability to alter their environment has been drastically reduced, the world they create will be much better, from their perspective, than ours is. But the perspective we care about is our own - in the future generations case, we're the child. Consider a young law student at a prestigious school who, like many young law students, is passionate about social justice and public service. She looks at the legal profession and notices that a great many lawyers, especially elite lawyers, don't seem to care about this to the extent that her age-mates do - after all, there are just as many of them fighting for the bad guys as the good guys, and so on and so on. Suppose for the sake of argument that she's right to conclude that as they get socialized into the legal profession, and start earning very high incomes, and begin to hobnob with the rich and powerful, their values start changing such that they care about protecting privilege rather than challenging it, and earning gobs of money rather than fighting what she would see as the good fight. And suppose further that she sees, accurately, that these corrupted lawyers are quite happy - they genuinely do enjoy their work, and have changed their politics
By even considering how lawyers observably change, it would seem that our idealistic young law student is already infected with the memeplex of perspective. Nevertheless, nigh tautologically, the future offers new perspectives as yet unappreciated. After all, as for any question of memetic self preservation of integrity, you have it simply given as premise, that the greedy future lawyers are entirely honest with themselves. Actually, memes only exist in context of their medium, being culture, an ongoing conscious and social phenomenon that governs even Axiology.
I would consider valuing sex a real value change. Just as I would consider valuing heroin more than just a new source of pleasure. I used to smoke, and the quitting process has convinced me that it wasn't just another source of pleasure, it was a fundamental value shift. I know this wasn't clear in my original comment, but my question was not entirely rhetorical. It was during the quitting ordeal that this thought first occurred to me, and occasionally I still puzzle over it. It is not obvious to me that I'm better off liking sex than I was back when I was a child and uninterested. And arguments to the contrary seem too much like rationalization.
But in answer to your question - it seems rational on the surface. If her present and future selves are in direct competition for existence, she should obviously spend resources in support of her present self. If nothing else, at least her future self doesn't yet have any desires that can be thwarted. I was more trying to say that I suspect complete value ossification is not a good thing.

The future is lost when I cease to exist.

"Now I will destroy the whole world... - What a Bokononist says before committing suicide." *
Yep. That's the philosophy that the semi-wise pattern-recognizing pro-death are trying to oppose.
Again, see Belief in the Implied Invisible. What you can't observe is still there, and still has moral weight.
I do not anticipate any experience after I die. It is indistinguishable to me whether anything afterward exists or not. I'm not going to make arguments against any laws of thermodynamics, but if there is nothing to distinguish two states, no fact about them enters into my calculations. They are like fictional characters. And I assign moral weight based on ass-kicking. I suppose there might be an issue if denizens of the future are able to travel back in time.

There's no way you'd agree to receive $1000 a year before your death on condition that your family members will be tortured a minute after it. This is an example of what Vladimir means.

We are confused because a human individual does not possess her own morality, but rather the morality of a "virtual agent" comprising that human's genetic lineage.
To rephrase this in a less negative-sounding way: It makes no sense to ask what my utility function says should happen after I die.

Is it more important to you that people of the future share your values, or that your values are actually fulfilled? Do you want to share your values, so that other (future?) people can make the world better, or are you going to roll up your sleeves and do it yourself? After all, if everyone relies on other people to get work done, nothing will happen. It's not Pareto efficient.

I think your deathism metaphor is flawed, but in your terms: Why do you assume "living for as long as I want" has a positive utility in my values system? It's not Pareto e...

Doesn't matter whether you're afraid to admit it, what matters is what you're planning to do about it.
Sorry, a failed attempt at sarcasm there.
latter >= former by logical deduction.
Yes. That was my intended point, sorry for being unclear.
Are you sure your intended point wasn't "values - values about others' values > values about others' values"? That point is hard to express neatly but it is a more important intuitive point and one that seems to be well supported by your argument. (By the way. I was the one who had downvoted your earlier comment, but that was actually in response to "I'm cowardly, irrational, and shallow, and I'm not afraid to admit it!" which doesn't fit well as a response to that particular exhortation. But I removed the downvote because I decided there was no point being grumpy if I wasn't going to be grumpy and specific. ;))
Effort required to achieve your goal directly < effort required to convince others to achieve your goal for you. ...and I've just spotted the glaring hole in my argument, so the reason that it was unclear is probably that it was wrong. I assume that people who share your values will act similarly to you. Before, I only considered the possibility that you would work alone (number of people contributing: 1), or that everyone you convinced would do as you did and convince more people (number of people doing work other than marketing: 0). I concluded incorrectly that the best strategy was to work alone; in fact the best strategy is probably a mixed strategy of some sort. TLDR: I was wrong and you were right. Ignore my previous posts. (And I'm fine with being downvoted as long as I know why. I can make good use of constructive criticism.)

The problem with this logic is that my values are better than those of my ancestors. Of course I would say that, but it's not just a matter of subjective judgment; I have better information on which to base my values. For example, my ancestors disapproved of lending money at interest, but if they could see how well loans work in the modern economy, I believe they'd change their minds.

It's easy to see how concepts like MWI or cognitive computationalism affect one's values when accepted. It's likely bordering on certain that transhumans will have more ins...

Your values are what they are. They talk about how good certain possible future-configurations are, compared to other possible future-configurations. Other concepts that happen to also be termed "values", such as your ancestors' values, don't say anything more about comparative goodness of the future-configurations, and if they do, then that is also part of your values. If you'd like for future people to be different in given respects from how people exist now, that is also a value judgment. For future people to feel different about their condition than you feel about their condition would make them disagree with your values (and dually).
I'm having difficulty understanding the relevance of this sentence. It sounds like you think I'm treating "my ancestors' values" as a term in my own set of values, instead of a separate set of values that overlaps with mine in some respects. My ancestors tried to steer their future away from economic systems that included money loaned at interest. They were unsuccessful, and that turned out to be fortunate; loaning money turned out to be economically valuable. If they had known in advance that loaning money would work out in everyone's best interest, they would have updated their values (future-configuration preferences). Of course, you could argue that neither of us really cared about loaning at interest; what we really cared about was a higher-level goal like a healthy economy. It would be convenient if we could restate our values in a well-organized hierarchy, with a node at the top that was invariant on available information. But even if that could be done, which I doubt, it would still leave a role for available information in deciding something as concrete as a preferred future-configuration.
That's closer to the sense I wanted to convey with this word. Distinction is between a formal criterion of preference and computationally feasible algorithms for estimation of preference between specific plans. The concept relevant for this discussion is the former one.
I haven't yet been convinced by this argument that my values are any better than the values of my ancestors. Yes, if I look at history, people generally tend to move towards my own current values (with periods of detours). But this would also be true if I looked back at my travelled path after doing a random walk. Sure, there are things like knowledge changing proxy values (I would, like my ancestors, favour punishing witches if it turned out that they factually did use demonically gifted powers to hurt others), but there has also been just plain old value drift. There are plenty of things our ancestors would never approve of even if they had all the knowledge we had.
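The random-walk point above can be checked with a small simulation. This is only a sketch under assumptions that are not in the comment: "values" are modelled as a point in a few dimensions, each step adds independent Gaussian noise, and the function name `mean_distance_to_endpoint` and its parameters are hypothetical. It measures, at each time, how far the walker is from where it eventually ends up.

```python
import random


def mean_distance_to_endpoint(steps=200, dims=5, trials=500, seed=0):
    """For each time t, return the average Euclidean distance between the
    walker's position at time t and its final position, over many
    independent unbiased random walks."""
    rng = random.Random(seed)
    totals = [0.0] * (steps + 1)
    for _ in range(trials):
        pos = [0.0] * dims
        path = [tuple(pos)]
        for _ in range(steps):
            # Pure drift: every coordinate takes an unbiased Gaussian step.
            for d in range(dims):
                pos[d] += rng.gauss(0.0, 1.0)
            path.append(tuple(pos))
        end = path[-1]
        for t, p in enumerate(path):
            totals[t] += sum((a - b) ** 2 for a, b in zip(p, end)) ** 0.5
    return [total / trials for total in totals]


profile = mean_distance_to_endpoint()
for t in (0, 50, 100, 150, 200):
    print(t, round(profile[t], 2))
```

Viewed from the endpoint, the averaged history looks like steady convergence on the final position even though no step was aimed at it. So an apparent historical trend toward present values is, on its own, what undirected drift would also produce.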