Milan Cvitkovic's Comments

References that treat human values as units of selection?

Thanks, Charlie!

Modifying yourself to want bad things is wrong in the same sense that the bad things are wrong in the first place...

I definitely agree with this, and have even written about it previously. Maybe my problem is that I feel like "find the best values to pursue" is itself a human value, and then the right-or-wrong-value question becomes the what-values-win question.

References that treat human values as units of selection?

Thank you! Carl Shulman's post still seems written from the some-values-are-just-better-than-others perspective that's troubling me, but your 2009 comment is very relevant. (Despite future-you having issues with it.)

The question "which values, if any, we should try to preserve" you wrote in that comment is, I think, the crux of my issue. I'm having trouble thinking about it, and questions like it, given my "you can't talk about what's right, only what wins" assumption. I can (try to) think about whether a paperclip maximizer or The Culture is more likely to overrun the galaxy, but I don't know how say that one scenario is "better", except from the perspective of my own values.

Is value drift net-positive, net-negative, or neither?

I would argue that the concept of value drift (meaning "a change in human values from whatever they are currently") isn't really sensible to talk about. Here's a reductio argument to that effect: "Avoiding bad value drift is as important as solving value alignment" (https://docs.google.com/document/d/1TDA9vHBT7kN9oJ69-MEtbXZ_GiAGk3aRhLXFgkLv8XM/edit?usp=sharing).

It's hard to compare values on their "goodness". I prefer to think of them as phenotypes and compare them on their adaptive benefits to agents that hold them. After all, it doesn't really matter what's right: it matters what wins.
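
To make that selection framing concrete, here's a toy replicator-dynamics sketch (my own illustration; the value labels and fitness numbers are made up): whichever value confers the highest adaptive benefit takes over the population, whether or not we'd call it good.

```python
# Toy replicator dynamics: treat "values" as phenotypes competing on
# adaptive benefit. The value labels and fitness numbers below are
# invented for illustration -- nothing here is an empirical claim.

values = {
    "cooperate": 1.0,            # hypothetical per-generation fitness
    "maximize_paperclips": 1.3,  # hypothetical: most adaptive, not "best"
    "wirehead": 0.7,             # hypothetical: least adaptive
}
shares = {v: 1 / len(values) for v in values}  # equal starting shares

for _ in range(50):
    # Mean fitness of the current population mix.
    mean_fitness = sum(shares[v] * f for v, f in values.items())
    # Replicator update: each value's share grows in proportion to
    # its fitness relative to the population mean.
    shares = {v: shares[v] * values[v] / mean_fitness for v in values}

for v, s in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{v}: {s:.3f}")
# After enough generations, the highest-fitness value holds nearly the
# whole population, regardless of how "good" we might judge it to be.
```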