Shard theory would suggest that happiness/flourishing is a "basket of goods" (if you'll excuse the pun).
Do you think the "basket of goods" (love the pun) could be looked at as instrumental values that derive from the terminal value (desiring happiness/flourishing)?
I don't understand shard theory well enough to critique it, but is there a distinction between terminal and instrumental within shard theory? Or are these concepts incompatible with shard theory?
(Maybe some examples from the "basket of goods" would help.)
Shard theory values are terminal for the being that has them — but if that being is evolved, they're almost always values that would be instrumental if you had the terminal value of maximizing evolutionary fitness in the creature's native environment. So from "evolution's point of view", they're "instrumental values of the terminal value: maximize the creature's evolutionary fitness".
Some concrete examples: maintain blood levels of water, salt, glucose, and a few other basics within certain bounds. Keep body temperature within a narrow band, without excessive metabolic cost. Get enough sleep. Have flowers around. Have some trees around (preferably climbable ones), but not too many. Get to see healthy young adult members of the (normally opposite) sex happy and not very clad. Have other members of the tribe seem to like you and be happy to help when you need it. There's a whole long list.
Biologically, almost all of this seems to be implemented in the older parts of the brain: the brainstem and all the little fiddly bits, i.e. the parts that look like they're mostly a lot of smallish custom circuits whose specifics are under heavy genetic control. Human values are moderately complex, but one description of them fits in ~4GB of DNA.
(I believe the etymology of "shard theory" traces back to evolution's "godshatter", a term originally taken from Vernor Vinge and repurposed, I think, by MIRI.)
Thanks, and yes, evolution is the source of many values for sure... I think the terminal vs. instrumental question leads in interesting directions. Please let me know how this sits with you!
Though I am an evolved being, none of your examples seem to be terminal values for me, the whole organism. Certainly there are many systems within me, and perhaps we could describe them as having their own terminal values, which in part come from evolution as you describe. My metabolic system's terminal value surely has a lot to do with regulating glucose. My reproductive system's terminal value likely involves sex/procreation. (But maybe even these can drift: when a cell becomes cancerous, its terminal value seems to change.)
But to me as a whole, these values (to the extent that I hold them at all) are instrumental. Sure, I want homeostasis, but I want it because I want to live (another instrumental value), and I want to live because I want to be able to pursue my terminal value of happiness/flourishing. Other values that my parts exhibit (like reproduction), I the whole might reject even as instrumental values; heck, I might even subvert the mechanisms afforded by my reproductive system for my own happiness/flourishing.
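To make that "I want X because I want Y" chain concrete, here's a minimal toy sketch in Python. This is my own illustration, not anything from shard theory; the specific value names and the `serves` mapping are assumptions chosen purely for the example.

```python
# Toy sketch: each instrumental value "serves" another value,
# and the chain bottoms out in a single terminal value.
# The particular values and links below are illustrative assumptions only.

serves = {
    "regulate blood glucose": "maintain homeostasis",  # instrumental
    "maintain homeostasis": "stay alive",              # instrumental
    "stay alive": "happiness/flourishing",             # instrumental
    "happiness/flourishing": None,                     # terminal: serves nothing further
}

def bottoms_out_in(value):
    """Follow the 'I want X because I want Y' chain until it stops."""
    while serves[value] is not None:
        value = serves[value]
    return value

print(bottoms_out_in("regulate blood glucose"))  # -> happiness/flourishing
```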
Also, as for my terminal value of happiness/flourishing, did that come from evolution? Did it start out as survival/reproduction and drift a bit? Or is there something special about systems like me (systems conscious of pleasure, pain, etc.) such that by their very nature they desire happiness/flourishing, the way 2+2=4 or the way a triangle has three sides? Or... something else?
And lastly, does any of this port to non-evolved beings like AIs?
Terminal values are discussed here:
https://www.lesswrong.com/s/3HyeNiEpvbQQaqeoH/p/n5ucT5ZbPdhfGNLtP
and https://www.lesswrong.com/posts/zqwWicCLNBSA5Ssmn/by-which-it-may-be-judged
And Yudkowsky references Frankena's terminal values... but are these actually terminal?
Do terminal values "reduce" or "bottom out"?
Frankena's first two are Life and Consciousness. As terminal as these may seem, I contend that they're actually instrumental. I want life and consciousness so I can experience happiness/flourishing. I certainly don't want life and consciousness if existence is just pain and misery.
I posit (I think in agreement with Aristotle) that all values bottom out in the terminal value of happiness/flourishing. Actually, maybe it's better formalized as the most flourishing, happy world outcome (as the agent judges it). Even the mom who sacrifices herself for her son does so not because the action feels right, nor because her son's survival is a terminal value weighed against other terminal values like her own survival, but because she judges the outcome (world state) in which her son lives and she dies to save him as better (read: "more flourishing") than the alternative, even though she knows she will no longer be there to experience it. It's not the act she values, nor her experience of the outcome (there will be none); it's the outcome itself.
On the negative side, one could judge death a "more flourishing" outcome than living a predominantly painful life (though hopefully those are not the only choices one faces).
On the even more negative side, I think even a sociopath's values bottom out like this. They just prefer outcomes most people don't (potentially including some that most people find abominable).
TL;DR: we all just wanna be happy, and we each have our ideas about which world outcomes are "better" and "worse." EVERY value derived from this terminal value is... instrumental.
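To make the outcome-vs-experience distinction from the mom example concrete, here's a minimal toy sketch in Python. Again, this is my own illustration under stated assumptions: the `Outcome` fields and the numbers in `flourishing_score` are made up for the example; the only point is that the score depends on the world state, not on the agent's experience of it.

```python
# Toy sketch: preferences over world outcomes, not over experiences.
# The fields and scores are illustrative assumptions only.

from dataclasses import dataclass

@dataclass(frozen=True)
class Outcome:
    """A possible world state, described only by facts about the world."""
    son_survives: bool
    mom_survives: bool

def flourishing_score(outcome: Outcome) -> float:
    """The mom's judgment of how flourishing a world outcome is.
    Note: it depends only on the world state, not on whether she is
    around to experience it."""
    score = 0.0
    if outcome.son_survives:
        score += 10.0
    if outcome.mom_survives:
        score += 5.0
    return score

# The two outcomes she can bring about:
sacrifice = Outcome(son_survives=True, mom_survives=False)     # she dies, he lives
no_sacrifice = Outcome(son_survives=False, mom_survives=True)  # he dies, she lives

# She picks the action whose *outcome* she judges most flourishing,
# even though she won't be there to experience the sacrifice outcome.
chosen = max([sacrifice, no_sacrifice], key=flourishing_score)
print(chosen)  # Outcome(son_survives=True, mom_survives=False)
```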
Ok, so maybe terminal values bottom out. So what?
Well, if terminal values "bottom out," what happens with AIs? It's easy to see in a paperclip maximizer, if that terminal value "sticks." But assuming some AIs' values drift and they develop a terminal value beyond the hard-coded ones, what might that terminal value become?
Values are complex, of course, and answering the "bottoming out" question doesn't imply that we can then derive all instrumental values precisely. But if values bottom out: