Relative exchange rate between preferences

by Stuart_Armstrong, 1 min read, 29th Mar 2019, 1 comment



Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Note: I am working on a research agenda, hence the large number of small individual posts, written so that the main documents have things to link to.

In my initial post on synthesising human preferences, I imagined a linear combination of partial preferences. Later, when talking about identity preferences, I proposed a smoothmin instead (which can also be seen as strong diminishing returns to any one partial preference).

I was trying to formalise how humans seem to trade off their various preferences in different circumstances. However, the ideal is not to decide a priori how humans trade off their preferences, but instead to copy how humans actually do trade them off.

To do that, we need to imagine the human in situations quite distant from their current ones - situations where some of their partial preferences are more or less fulfilled. This brings up the problem of modelling preferences in distant situations. Assuming some acceptable resolution to that problem, the AI would have an exchange rate between different preferences, in different situations and with different levels of preference fulfilment.
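To make the idea of a situation-dependent exchange rate concrete, here is a minimal sketch comparing the two hand-picked combination rules mentioned above. It rests on several assumptions that are not in the original posts: partial preferences are represented as plain numbers (fulfilment levels), the smoothmin is taken to be the log-sum-exp soft minimum, the exchange rate is read as a ratio of marginal utilities, and the names `linear_combination`, `smoothmin`, `exchange_rate` and the weights are purely illustrative.

```python
import math

def linear_combination(values, weights):
    """Weighted sum of partial-preference fulfilment levels."""
    return sum(w * v for w, v in zip(weights, values))

def smoothmin(values, sharpness=1.0):
    """Soft minimum, assumed here to be -(1/k) * log(sum(exp(-k * v)))."""
    m = min(values)  # subtract the minimum for numerical stability
    total = sum(math.exp(-sharpness * (v - m)) for v in values)
    return m - math.log(total) / sharpness

def exchange_rate(utility, values, i, j, eps=1e-6):
    """Units of preference j worth one marginal unit of preference i.

    Computed as (dU/dx_i) / (dU/dx_j) using central finite differences.
    """
    def partial(k):
        up, down = list(values), list(values)
        up[k] += eps
        down[k] -= eps
        return (utility(up) - utility(down)) / (2 * eps)
    return partial(i) / partial(j)

weights = [2.0, 1.0]  # hypothetical weights, chosen only for illustration
linear = lambda v: linear_combination(v, weights)

# Linear combination: the rate is the same in both situations.
print(exchange_rate(linear, [0.0, 1.0], 0, 1))
print(exchange_rate(linear, [5.0, 1.0], 0, 1))

# Smoothmin: the rate depends on how fulfilled each preference already is.
print(exchange_rate(smoothmin, [0.0, 1.0], 0, 1))
print(exchange_rate(smoothmin, [1.0, 1.0], 0, 1))
```

Under the linear rule the rate stays at the ratio of the weights whatever the situation. Under this smoothmin the rate works out to exp(-k (x_i - x_j)), so the less fulfilled preference is worth more. Both are still a priori choices; the point of the post is that the AI should learn the actual rates from the human rather than pick either rule.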

Meta-preferences may constrain the preferences in very distant situations. Most paradoxes of population ethics are in situations very different from today, so population ethics preferences can be seen in this way. Universal moral principles act in a similar way, giving limits on what can happen in all situations, including extreme ones - though note there are arguments to avoid some unusual situations entirely.

Then the AI's task is to come up with a general formula for the exchange rate between different preferences, one that extends to all situations and respects the constraints of the meta-preferences. It will probably smooth out quite a bit of "noise" in the exchange rates between different preferences, while respecting the general trends.
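As a toy illustration of smoothing noisy exchange rates into a general formula, the sketch below fits a single parameter to observed rates. Everything here is an assumption made for illustration: the one-parameter family rate = exp(-k (x_i - x_j)) (the form a log-sum-exp soft minimum would imply), the least-squares fit in log space, and the names `fit_sharpness` and `predicted_rate`. It does not model meta-preference constraints, and a real synthesis process would need a far richer family of formulas.

```python
import math

def fit_sharpness(observations):
    """Least-squares fit of k in log(rate) = -k * (x_i - x_j).

    observations: list of (x_i, x_j, observed_rate) with observed_rate > 0,
    where x_i and x_j are hypothetical fulfilment levels of two preferences.
    """
    num = 0.0
    den = 0.0
    for x_i, x_j, rate in observations:
        d = x_i - x_j
        num += d * math.log(rate)
        den += d * d
    if den == 0.0:
        raise ValueError("need at least one observation with x_i != x_j")
    return -num / den

def predicted_rate(k, x_i, x_j):
    """Smoothed exchange rate implied by the fitted parameter k."""
    return math.exp(-k * (x_i - x_j))

# Hypothetical noisy observations: (fulfilment of i, fulfilment of j, rate).
noisy = [
    (0.0, 1.0, 7.9),
    (1.0, 0.0, 0.12),
    (0.5, 1.0, 2.6),
    (2.0, 1.0, 0.15),
]
k = fit_sharpness(noisy)
print(k)
print(predicted_rate(k, 0.0, 3.0))  # extrapolation to a more distant situation
```

The fitted formula ignores the scatter in individual observations while keeping the overall trend, and it can be evaluated in situations that were never observed. That extrapolation is exactly where constraints from meta-preferences would have to come in.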

1 comment

The question of combining values has also been explored in existing moral philosophy. Carter, in his article “A plurality of values”, discusses how to combine two values in section 6, starting from this quote:

"Brian Barry argues in his early work that we could model trade-offs between principles such as equity and efficiency in a manner that parallels the way in which micro-economists employ indifference-curves to model how we might swap grapes for potatoes."