Complexity of Value

Complexity of value is the thesis that human values have high Kolmogorov complexity; that our preferences, the things we care about, cannot be summed up by a few simple rules, or compressed. Fragility of value is the thesis that losing even a small part of the rules that make up our values could lead to results that most of us would now consider unacceptable (just as dialing nine out of ten phone digits correctly does not connect you to a person 90% similar to your friend). For example, all of our values except novelty might yield a future full of individuals replaying only one optimal experience through all eternity.

Related: Ethics & Metaethics, Fun Theory, Preference, Wireheading

Many human choices can be compressed by representing them with simple rules - the desire to survive produces innumerable actions and subgoals as we fulfill that desire. But people don't just want to survive - although you can compress many human activities to that desire, you cannot compress all of human existence into it. The human equivalents of a utility function, our terminal values, contain many different elements that are not strictly reducible to one another.

William Frankena offered this list of things which many cultures and people seem to value (for their own sake rather than strictly for their external consequences):

"Life, consciousness, and activity; health and strength; pleasures and satisfactions of all or certain kinds; happiness, beatitude, contentment, etc.; truth; knowledge and true opinions of various kinds, understanding, wisdom; beauty, harmony, proportion in objects contemplated; aesthetic experience; morally good dispositions or virtues; mutual affection, love, friendship, cooperation; just distribution of goods and evils; harmony and proportion in one's own life; power and experiences of achievement; self-expression; freedom; peace, security; adventure and novelty; and good reputation, honor, esteem, etc."

The "etc." at the end is the tricky part, because there may be a great many values not included on this list.
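
The fragility point can be made concrete with a deliberately tiny toy model. The sketch below is purely illustrative - the experiences, their scores, and the brute-force search are made-up assumptions, not a model anyone has proposed - and it only shows how deleting a single term (here, a novelty bonus) from a multi-term utility function collapses the optimum from a varied future to one repeated experience, echoing the example in the opening paragraph.

```python
# Toy sketch of fragility of value: score candidate "futures" with a utility
# that sums several terms, then drop the novelty term and watch the optimum
# collapse. All experiences and numbers are made up for illustration.

from itertools import product

# Hypothetical per-experience enjoyment scores.
EXPERIENCES = {"peak_bliss": 1.0, "conversation": 0.9, "discovery": 0.8}

def utility(future, value_novelty=True):
    """Score a sequence of experiences: summed enjoyment, plus an optional
    bonus for how many distinct experiences the future contains."""
    score = sum(EXPERIENCES[e] for e in future)
    if value_novelty:
        score += 0.5 * len(set(future))  # reward variety
    return score

def best_future(value_novelty, length=4):
    """Brute-force the highest-utility future of a given length."""
    candidates = product(EXPERIENCES, repeat=length)
    return max(candidates, key=lambda f: utility(f, value_novelty))

print(best_future(value_novelty=True))   # expects a mix of experiences
print(best_future(value_novelty=False))  # expects 'peak_bliss' repeated
```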

Since natural selection reifies selection pressures as psychological drives which then continue to execute independently of any consequentialist reasoning in the organism - or that organism explicitly representing, let alone caring about, the original evolutionary context - we have no reason to expect these terminal values to be reducible to any one thing, or to each other.

Taken in conjunction with another LessWrong claim, that all values are morally relevant, this would suggest that philosophers are mistaken in trying to find cognitively tractable overarching principles of ethics. However, it is coherent to suppose that not all values are morally relevant, and that the morally relevant ones form a tractable subset.

Complexity of value also runs into underappreciation in the presence of bad metaethics. The local flavor of metaethics could be characterized as cognitivist, without implying "thick" notions of instrumental rationality; in other words, moral discourse can be about a coherent subject matter without all possible minds and agents necessarily finding truths about that subject matter to be psychologically compelling. An expected paperclip maximizer doesn't disagree with you about morality any more than you disagree with it about "which action leads to the greatest number of expected paperclips"; it is just constructed to find the latter subject matter psychologically compelling but not the former. Failing to appreciate that "But it's just paperclips! What a dumb goal! No sufficiently intelligent agent would pick such a dumb goal!" is a judgment carried out on a local brain - one that evaluates paperclips as inherently low in its preference ordering - leads someone to expect all moral judgments to be automatically reproduced in a sufficiently intelligent agent, since, after all, such an agent would not lack the intelligence to see that paperclips are so obviously low in the preference ordering. This is a particularly subtle species of anthropomorphism and mind projection fallacy.

Because the human brain very often fails to grasp all these difficulties involving our values, we tend to think building an awesome future is much less problematic than it really is. Fragility of value is relevant for building Friendly AI, because an AGI which does not respect human values is likely to create a world that we would consider devoid of value - not necessarily full of explicit attempts to be evil, but perhaps just a dull, boring loss.

As values are orthogonal to intelligence, they can vary freely no matter how intelligent and efficient an AGI is [1]. Since human / humane values have high Kolmogorov complexity, a random AGI is highly unlikely to maximize human / humane values. The fragility of value thesis implies that a poorly constructed AGI might, e.g., turn us into blobs of perpetual orgasm. Because of this relevance, the complexity and fragility of value are a major theme of Eliezer Yudkowsky's writings.
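
The "highly unlikely" claim can be given a rough quantitative gloss. Assuming the value specification can be treated as a finite bit string V, and reading "random AGI" as a goal system drawn from a universal prior over programs (an idealization, not something this article commits to), the coding theorem of algorithmic information theory bounds the total prior weight m(V) on programs that output V:

2^{-K(V)} ≤ m(V) ≤ 2^{-K(V)+c}

where K(V) is the Kolmogorov complexity of V and c is a machine-dependent constant. If K(V) is large, as the complexity of value thesis asserts, m(V) is astronomically small.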

Wrongly designing the future because we wrongly encoded human values is a serious and difficult-to-assess type of Existential risk. "Touch too hard in the wrong dimension, and the physical representation of those values will shatter - and not come back, for there will be nothing left to want to bring it back. And the referent of those values - a worthwhile universe - would no longer have any physical reason to come into being. Let go of the steering wheel, and the Future crashes." [2]

Complexity of Value and AI

Complexity of value poses a problem for AI alignment. If you can't easily compress what humans want into a simple function that can be fed into a computer, it isn't easy to make a powerful AI that does things humans want and doesn't do things humans don't want. Value Learning aims to address this problem.
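
Value learning itself is a large topic; the following is only a minimal sketch of the underlying idea under entirely hypothetical assumptions (the features, the preference data, and the perceptron-style update are invented for illustration and are not drawn from any particular proposal): instead of hand-writing the utility function, infer weights over value-relevant features from observed human choices.

```python
# Toy value-learning sketch: fit weights over value-relevant features
# from observed pairwise human choices. All data is made up.

# Each option is described by hypothetical (pleasure, novelty, friendship) features.
choices = [
    # (option_a, option_b, human_picked_a)
    ((0.9, 0.1, 0.0), (0.4, 0.8, 0.6), False),
    ((0.2, 0.9, 0.1), (0.3, 0.2, 0.9), False),
    ((0.8, 0.7, 0.2), (0.9, 0.0, 0.0), True),
]

weights = [0.0, 0.0, 0.0]

def score(option):
    """Learned utility: weighted sum of the option's features."""
    return sum(w * x for w, x in zip(weights, option))

# Perceptron-style updates: nudge weights toward the option the human chose
# whenever the current weights fail to rank it above the rejected option.
for _ in range(100):
    for a, b, picked_a in choices:
        chosen, rejected = (a, b) if picked_a else (b, a)
        if score(chosen) <= score(rejected):
            weights = [w + 0.1 * (c - r) for w, c, r in zip(weights, chosen, rejected)]

print(weights)                 # learned (pleasure, novelty, friendship) weights
print(score((0.4, 0.8, 0.6)))  # use the learned utility to score a new option
```

The point is only that the complexity then lives in the data about human choices rather than in a short hand-written rule; real proposals must additionally handle preference data that is noisy, inconsistent, and context-dependent, which this toy ignores.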

Major posts

Other posts

See also
