Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Note: I'm working on a research agenda, hence the large number of small individual posts: these give me things to link to in the main documents.

I've been working on a way of synthesising the preferences of a single given human.

After this is done, there is the business of combining these utilities into a single utility function for the whole human population. One obvious way of doing this is to weigh the different utilities by some individual measure of intensity, then add them together, and maximise the sum.
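The weighted-sum approach above can be sketched in a few lines. This is a toy illustration, not the post's actual proposal: the individuals, their utility functions, and the equal weights are all assumptions made up for the example.

```python
# Toy sketch of intensity-weighted utility aggregation:
# weigh each individual's utility, sum them, and maximise the sum.
# All names, utilities, and weights here are illustrative placeholders.

def aggregate(utilities, weights):
    """Return a social utility function: the weighted sum of
    individual utility functions over outcomes."""
    def social_utility(outcome):
        return sum(w * u(outcome) for u, w in zip(utilities, weights))
    return social_utility

# Two illustrative individuals with preferences over a scalar outcome x:
u_alice = lambda x: -(x - 2) ** 2   # Alice prefers x near 2
u_bob = lambda x: -(x - 6) ** 2     # Bob prefers x near 6

social = aggregate([u_alice, u_bob], weights=[1.0, 1.0])

# Maximise the sum over a coarse grid of candidate outcomes:
best = max(range(10), key=social)
print(best)  # with equal weights, the optimum lies midway, at x = 4
```

Changing the weights shifts the optimum toward the more heavily weighted individual, which is exactly why the choice of "individual measure of intensity" matters so much in this scheme.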

But there are some arguments against this. For example, many people's preferences are over the behaviour or experiences of other people. This includes things like promoting religious or cultural beliefs (and it can also cover preferences like wanting others to have basic human rights or to be free from oppressive situations).

Apart from simple summing, which may give excessive strength to a majority, there are two other obvious solutions. The first is to remove preferences over other people entirely; note that this would also remove most preferences over social and cultural systems. The second is to remove any anti-altruistic components from the system: preferences over the suffering of others would no longer be valid (this would still allow punishment-as-deterrence or punishment-as-retribution-to-a-victim, but not punishment-as-abstract-justice). This may be tricky to define - what is the clear difference between wanting to be of higher status and wanting others to be of lower status? - but it might be a desirable compromise. It might, indeed, be the kind of compromise that different people would "negotiate to" in some sort of "moral parliament", since people often tend to prefer their own individual desires over desires to control other people, so this solution might have majority support.
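The second solution can be sketched mechanically: tag each preference component with a category and drop the anti-altruistic ones before aggregation. The categories, example preferences, and weights below are hypothetical; as the post notes, actually drawing the line between categories is the hard part, which this sketch simply assumes away.

```python
# Toy sketch: remove anti-altruistic preference components before summing.
# The "kind" labels and the example preferences are hypothetical illustrations;
# classifying real preferences this cleanly is the unsolved part.

from dataclasses import dataclass

@dataclass
class Preference:
    description: str
    kind: str      # assumed categories: "self", "altruistic", "anti-altruistic"
    weight: float

def filter_anti_altruistic(prefs):
    """Keep only components whose satisfaction does not require others to suffer."""
    return [p for p in prefs if p.kind != "anti-altruistic"]

prefs = [
    Preference("wants higher status for self", "self", 1.0),
    Preference("wants others to have basic rights", "altruistic", 0.8),
    Preference("wants a rival to be humiliated", "anti-altruistic", 0.5),
]

kept = filter_anti_altruistic(prefs)
print([p.description for p in kept])  # the rival-humiliation component is dropped
```

Note that the first and third preferences in the example are exactly the status pair the post flags as hard to distinguish: a real classifier would need some principled way to tell "I want to be higher" apart from "I want them to be lower".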


I just joined LW and this is my first comment. It all feels fairly intimidating as I've only skimmed AI to Zombies, but I like to think of myself as a fairly smart kid, though I may be in over my head here. Anyway, I just think this is a really cool idea. I've thought about this a lot in terms of a general unifying economic model. I tend to be a big-picture thinker, so I don't have as much to say about the technical stuff. I think such a combination of utilities, and the consideration of these utilities, should rely very heavily on psychological components, and might best be combined in a dynamic system. As a start it may be best to think of this in terms of a small society on a deserted island and map out some of the resources, psychological profiles, goals, etc.

Also, I took a class on socialism in college when I was thinking about this stuff. I was curious about the Cambridge Capital Controversy and asked my professor if anyone had a definite answer to it. He referred me to a book by the economist Ian Steedman, "Consumption Takes Time: Implications for Economic Theory," which at first glance seems possibly relevant and at the very least interesting. I haven't gotten the chance to read it but would still like to.

Anyway, that's just me spitballing. I know I can be a bit hard to follow and all over the place, but my hope is getting involved on a forum like this might help me with that. Very cool topic though!
