In January of last year, Nick Bostrom wrote a post on Overcoming Bias about his and Toby Ord’s proposed method of handling moral uncertainty. To abstract away a bit from their specific proposal, the general approach was to convert a problem involving moral uncertainty into a game of negotiation, with each player’s bargaining power determined by your confidence in the moral philosophy that player represents.

Robin Hanson suggested in his comments to Nick’s post that moral uncertainty should be handled the same way we're supposed to handle ordinary uncertainty, by using standard decision theory (i.e., expected utility maximization). Nick’s reply was that many ethical systems don’t fit into the standard decision theory framework, so it’s hard to see how to combine them that way.

In this post, I suggest we look into the seemingly easier problem of value uncertainty, in which we fix a consequentialist ethical system, and just try to deal with uncertainty about values (i.e., utility function). Value uncertainty can be considered a special case of moral uncertainty in which there is no apparent obstacle to applying Robin’s suggestion. I’ll consider a specific example of a decision problem involving value uncertainty, and work out how Nick and Toby’s negotiation approach differs in its treatment of the problem from standard decision theory. Besides showing the difference in the approaches, I think the specific problem is also quite important in its own right.

The problem I want to consider is this: suppose we believe that a singleton scenario is very unlikely, but *may* have very high utility if it were realized; should we focus most of our attention and effort on trying to increase its probability and/or improve its outcome? The main issue here is (putting aside uncertainty about what will happen after a singleton scenario is realized) uncertainty about how much we value what is likely to happen.

Let’s say there is a 1% chance that a singleton scenario does occur, and conditional on it, you will have expected utility that is equivalent to a 1 in 5 billion chance of controlling the entire universe. If a singleton scenario does not occur, you will have a 1/5 billionth share of the resources of the solar system, and the rest of the universe will be taken over by beings like the ones described in Robin’s The Rapacious Hardscrapple Frontier. There are two projects that you can work on. Project A increases the probability of a singleton scenario to 1.001%. Project B increases the wealth you will have in the non-singleton scenario by a factor of a million (so you’ll have a 1/5 thousandth share of the solar system). The decision you have to make is which project to work on. (The numbers I picked are meant to be stacked in favor of project B.)

Unfortunately, you’re not sure how much utility to assign to these scenarios. Let’s say that you think there is a 99% probability that your utility (U1) scales logarithmically with the amount of negentropy you will have control over, and 1% probability that your utility (U2) scales as the square root of negentropy. (I assume that you’re an ethical egoist and do not care much about what other people do with their resources. And these numbers are again deliberately stacked in favor of project B, since the better your utility function scales, the more attractive project A is.)

Let’s compute the expected U1 and U2 of Project A and Project B. Let N_{U}=10^{120} be the negentropy (in bits) of the universe, and N_{S}=10^{77} be the negentropy of the solar system (logarithms below are base 10), then:

- EU1(status quo) = .01 * log(N_{U})/5e9 + .99 * log(N_{S}/5e9)
- EU1(A) = .01001 * log(N_{U})/5e9 + .98999 * log(N_{S}/5e9) ≈ 66.6
- EU1(B) = .01 * log(N_{U})/5e9 + .99 * log(N_{S}/5e3) ≈ 72.6

EU2 is computed similarly, except with log replaced by sqrt:

- EU2(A) = .01001 * sqrt(N_{U})/5e9 + .98999 * sqrt(N_{S}/5e9) ≈ 2.002e48
- EU2(B) = .01 * sqrt(N_{U})/5e9 + .99 * sqrt(N_{S}/5e3) ≈ 2.000e48

Under Robin’s approach to value uncertainty, we would (I presume) combine these two utility functions into one linearly, by weighing each with its probability, so we get EU(x) = 0.99 EU1(x) + 0.01 EU2(x):

- EU(A) ≈ 0.99 * 66.6 + 0.01 * 2.002e48 ≈ 2.002e46
- EU(B) ≈ 0.99 * 72.6 + 0.01 * 2.000e48 ≈ 2.000e46
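As a sanity check on the arithmetic, here is a short script reproducing these numbers (base-10 logarithms; the constants are the ones stated above):

```python
import math

N_U, N_S = 1e120, 1e77   # negentropy (in bits) of the universe / solar system

def eu(u, p_singleton, share):
    """Expected utility under utility function u: a 1-in-5e9 lottery over the
    whole universe if the singleton occurs, else a `share` of the solar system."""
    return p_singleton * u(N_U) / 5e9 + (1 - p_singleton) * u(N_S * share)

EU1_A = eu(math.log10, 0.01001, 1 / 5e9)   # ~ 66.6
EU1_B = eu(math.log10, 0.01,    1 / 5e3)   # ~ 72.6
EU2_A = eu(math.sqrt,  0.01001, 1 / 5e9)   # ~ 2.002e48
EU2_B = eu(math.sqrt,  0.01,    1 / 5e3)   # ~ 2.000e48

EU_A = 0.99 * EU1_A + 0.01 * EU2_A         # ~ 2.002e46
EU_B = 0.99 * EU1_B + 0.01 * EU2_B         # ~ 2.000e46
# U1 alone prefers B (72.6 > 66.6), but the linear mix prefers A.
```

(The status-quo value is omitted since only the A/B comparison matters for the decision.)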

This suggests that we should focus our attention and efforts on the singleton scenario. In fact, even if Project A had a much, much smaller probability of success, like 10^{-10} instead of 10^{-5}, or you had a much lower confidence that your utility scales as well as the square root of negentropy, it would still be the case that EU(A)>EU(B). (This is contrary to Robin's position that we pay too much attention to the singleton scenario, and I would be interested to know where his calculation differs from mine.)

What about Nick and Toby’s approach? In their scheme, delegate 1, representing U1, would vote for project B, while delegate 2, representing U2, would vote for project A. Since delegate 1 has 99 votes to delegate 2’s one vote, the obvious outcome is that we should work on project B. The details of the negotiation process don't seem to matter much, given the large advantage in bargaining power that delegate 1 has over delegate 2.
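The vote can be sketched in a few lines (my own toy model; Nick and Toby's actual scheme allows richer bargaining than a single weighted vote):

```python
# Each delegate backs the project its utility function prefers; its voting
# weight is the credence assigned to that utility function.
eu1 = {"A": 66.6, "B": 72.6}           # EU1 figures from the post (log utility)
eu2 = {"A": 2.002e48, "B": 2.000e48}   # EU2 figures from the post (sqrt utility)

votes = {"A": 0.0, "B": 0.0}
for credence, eu in [(0.99, eu1), (0.01, eu2)]:
    preferred = max(eu, key=eu.get)    # project with higher EU for this delegate
    votes[preferred] += credence

winner = max(votes, key=votes.get)     # "B": delegate 1's 0.99 weight dominates
```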

Each of these approaches to value uncertainty seems intuitively attractive on its own, but together they give conflicting advice on this important practical problem. Which is the right approach, or is there a better third choice? I think this is perhaps one of the most important open questions that an aspiring rationalist can work on.

The negotiation approach doesn't value information. This is a big problem with that approach.

If you're uncertain about which values are correct, it's very important for you to get whatever information you need to reduce your uncertainty. But by Conservation of Expected Evidence, none of those value systems would advocate doing so.

To use the parliament analogy, imagine that you have a button that, when pressed, will randomly increase the vote share of some members of parliament, and decrease the vote share of others. Since this button represents gaining information that you don't already have, no member of parliament can expect to increase his or her vote share by pressing the button. Maybe parliament is indifferent to pressing it, or maybe due to some quirk of the negotiation system they would vote to press it, but they certainly wouldn't expend vast resources to press it. But figuring out your morality is actually worth spending vast resources on!

Hmm, maybe the way to fix this is to have each agent in the parliament believe that future experiments will validate its position. More precisely, the agent's own predictions condition on its value system being correct. Then the parliament would vote to expend resources on information about the value system.

Is it possible to enforce that? It seems like specifying a bottom line to me.

It would be specifying a bottom line if each sub-agent could look at any result and say afterwards that this result supports its position. That's not what I'm suggesting. I'm saying that each sub-agent should make predictions as if its own value system is correct, rather than having each sub-agent use the same set of predictions generated by the super-agent.

Quick dive into the concrete: I think that legalization of marijuana would be a good thing ... but that evaluation is based on my current state of knowledge, including several places where my knowledge is ambiguous. By Bayes' Rule, I can't possibly have a nonzero expectation for the change in my evaluation based on the discovery of new data. Am I misunderstanding the situation you hypothesize?

You can have a nonzero expectation for the change in someone else's evaluation, which is what I was talking about. The super-agent and the sub-agent have different beliefs.

I see - that is sensible.

Upvoted for significant insight, would like to do so more than once.

(would upvote more than once)

(would upvote several times)

You say you have stacked assumptions in favor of project B, but then you make two quite bizarre assumptions:

1. A singleton has a one in 5 billion chance of giving you control of the entire visible universe.

2. In the non-singleton scenario, there is zero probability that any of the universe outside the solar system will have any utility.

To make these assumptions is essentially to prejudge the matter in favor of project A, for obvious reasons; but that says nothing about what the outcome would be given more plausible assumptions.

I already addressed the rationale for this assumption. Why do you think this assumption favors project A?

It's hard to see how, in a non-singleton scenario, one might get more resources than 1/5000 share of the solar system. Perhaps what other people do with their resources does matter somewhat to me, but in the expected utility computation I think it would count for very little in the presence of other large values, so for simplicity I set it to 0. If you disagree, let me know what you think a more realistic assumption is, and I can redo the calculations.

As Tim Tyler pointed out, the fact that a singleton government physically could choose a random person and appoint him dictator of the universe is irrelevant; we know very well it isn't going to. This assumption favored project A because almost all your calculated utility derived from the hope of becoming dictator of the universe; when we accept this is not going to happen, all that fictional utility evaporates.

To take the total utility of the rest of the universe as approximately zero for the purpose of this calculation would require that we value other people in general less than we value ourselves by a factor on the order of 10^32. Some discount factor is reasonable -- we do behave as though we value ourselves more highly than random other people. But if you agree that you wouldn't save your own life at the cost of letting a million other people die, then you agree the discount factor should not be as high as 10^6, let alone 10^32.

To answer 1, the reason that a singleton government won't choose a random person and let him be dictator is that it has an improvement upon that. For example, if people's utilities are less than linear in negentropy, then it would do better to give everyone an equal share of negentropy. So why shouldn't I assume that in the singleton scenario my utility would be at least as large as if I have a random chance to be dictator?

For 2, I don't think a typical egoist would have a constant discount factor for other people, and certainly not the kind described in Robin's The Rapacious Hardscrapple Frontier. He might be willing to value the entire rest of the universe combined at, say, a billion times his own life, but that's not nearly enough to make EU(B)>EU(A). An altruist would have a completely different kind of utility function, but I think it would still be the case that EU(A)>EU(B).

Okay, so now the assumptions seem to be that a singleton government will give you exclusive personal title to a trillion galaxies, that we should otherwise behave as though the future universe were going to imitate a particular work of early 21st century dystopian science fiction, and that one discounts the value of other people compared to oneself by a factor of perhaps 10^23. I stand by my claim that the only effect of whipping out the calculator here is obfuscation; the real source of the bizarre conclusions is the bizarre set of assumptions.

I think you misunderstand Wei Dai's assumptions, but that may be his fault for adding too many irrelevant details to a simple problem.

Perhaps you would care to say more about how you think Wei_Dai's assumptions differ from what rwallace described?

Thanks for the post Wei, I have a couple of comments.

Firstly, the dichotomy between Robin's approach and Nick's and mine is not right. Nick and I have always been tempted to treat moral and descriptive uncertainty in exactly the same way insofar as this is possible. However, there are cases where this appears to be ill-defined (e.g., how much happiness for utilitarians is worth breaking a promise for Kantians?), and to deal with these cases Nick and I consider methods that are more generally applicable. We don't consider the bargaining/voting/market approach to be very plausible as a contender for a unique canonical answer, but as an approach that at least gets the hard cases mostly right instead of remaining silent about them.

In the case you consider (which I find rather odd...) Nick and I would simply multiply it out. However, even if you looked at what our bargaining solution would do, it is not quite what you say. One thing we know is that simple majoritarianism doesn't work (it is equivalent to picking the theory with the highest credence in two-theory cases). We would prefer to use a random dictator model, or allow bargaining over future situations too, or all conceivable situations, such that the proponent of the square-root view would be willing to offer to capitulate in most future votes in order to win this one.

Thanks for the clarifications. It looks like I might be more of a proponent for the bargaining approach than you and Nick are at this point.

I think bargaining, or some of the ideas in bargaining theory (or improvements upon them), could be contenders for the canonical way of merging values (if not moral philosophies).

Why? (And why do you find it odd, BTW?)

I was implicitly assuming that this is the only decision (there are no future decisions), in which case the solution Nick described in his Overcoming Bias post does pick project B with certainty, I think. I know this glosses over some subtleties in your ideas, but my main goal was to highlight the difference between bargaining and linearly combining utility functions.

ETA: Also, if we make the probability of the sqrt utility function much smaller, like 10^-10, then the sqrt representative has very little chance of offering enough concessions on future decisions to get its way on this one, but it would still be the case that EU(A)>EU(B).

Whenever you deviate from maximizing expected value (in contexts where this is possible) you can normally find examples where this behaviour looks incorrect. For example, we might be value-pumped or something.

For one thing, negentropy may well be one of the most generally useful resources, but it seems somewhat unlikely to be intrinsically good (more likely it matters what you do with it). Thus, the question looks like one of descriptive uncertainty, just as if you had asked about money: uncertainty about whether you value that according to a particular function is descriptive uncertainty for all plausible theories. Also, while evaluative uncertainty does arise in self-interested cases, this example is a strange case of self-interest for reasons others have pointed out.

Can a bargaining solution be value-pumped? My intuition says if it can, then the delegates would choose a different solution. (This seems like an interesting question to look into in more detail though.) But doesn't your answer also argue against using the bargaining solution in moral uncertainty, and in favor of just sticking with expected utility maximization (and throwing away other incompatible moral philosophies that might be value-pumped)?

But what I do with negentropy largely depends on what I value, which I don't know at this point...

You seem to be making this a more complicated example than it has to be. All that has to be done to show the difference between these resolution strategies is to have the following setup:

You're unsure of how your values treat the utility of an action A; you give 99.9% odds that it has expected utility 0 and 0.1% odds that it has utility 1,000,000. You're sure that an action B has expected utility 1 by your values.
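In numbers (my own sketch of the setup just described):

```python
# Action A has a 0.1% chance of being worth 1,000,000 under your true values
# (and is worth 0 otherwise); action B is worth 1 for sure.
p_high = 0.001
ev_a = p_high * 1_000_000          # expected utility of A: 1000
ev_b = 1.0                         # utility of B is certain

# Expected-utility maximization picks A (1000 > 1), but a parliament whose
# 99.9%-weight delegate thinks A is worthless outvotes the 0.1%-weight
# delegate and picks B.
best_by_eu = "A" if ev_a > ev_b else "B"
best_by_vote = "A" if p_high > 0.5 else "B"   # weighted majority of delegates
```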

But in this format, it looks silly to do anything but treat it as the simple expected-utility problem à la Robin. Is this somehow not a correct version of your dilemma, or is there some reason we should be horrified at the result of choosing A?

The reason for the complexity in my example was to make it relevant to the real-world decision of whether we should concentrate our attention and efforts on the singleton scenario or not. Should I have introduced a simple example first, and then the more complex example separately?

Not exactly horrified, but it's not quite intuitive either, especially if you make the numbers more extreme, like in my example.

Those interested in the topic of this post might be interested to note the recent critical article in the general area - by James Hughes:

"Liberal Democracy vs. Technocratic Absolutism"

You can guess from the title which side of the debate he is on.

The difference between the Hanson and Bostrom-Ord proposals is very like the difference between the mean and the median. The former lets you take more notice of probably-wrong data, at the cost of being more sensitive to disruption by distant/low-probability outliers.

I'm inclined to think that in the presence of uncertainty about one's values (still more about the shape of one's overall ethical system) the more cautious/robust/insensitive approach is preferable. But I'm by no means certain. Well, let's say I'm 80% sure of the former and 20% of the latter, and do an expected-utility calculation or a vote ... :-)

Seriously, I think we're a long way from having a good handle on how to deal with the limitations, unreliability, and nontransparency of our own hardware, and the tendency for attempts to do so to end up in this sort of "but what if we're uncertain about the uncertainty in our own uncertainty" recursion is a symptom of this. I suspect that the specific open question Wei_Dai has identified here is best viewed as a special case of that higher-level problem.

It's also not clear which affine transformations of EU1 and EU2 should be considered relevant. If the question of 'what fraction of achievable utility will we get?' enters as a consideration (for example, as part of a strategy to bound your utility function to avoid Pascal's mugging), then EU2 will get squashed more than EU1.

The idea that - if one big organism arises - it has a small chance of "being you" seems very strange. ISTM that the most likely chance of such a situation arising is if we get a powerful world government.

The idea is that if one big organism arises, you get at least as much utility as if it has a small chance of being you (but probably more), because if the big organism can't figure out how to merge everyone's values, it can always pick someone randomly and use his or her values. See also this related comment.

It still doesn't make much sense to me. The "if" statement seems simply silly: governments simply do not pick someone at random and devote their country's resources to fulfilling their wishes. They don't try and "merge everyone's values" either. The whole thing seems like an idea from another planet to me.

Looking at http://en.wikipedia.org/wiki/Globalization a world government seems highly unpopular idea among many people. Looking at the relevant history suggests that basic things - like a single currency - are still some distance away.

Also, if one forms, it is unlikely to be all-powerful for a considerable period of time - a fair quantity of the resulting dynamics will arise from selection effects and self-organization dynamics - rather than its wishes.

I'm not convinced that the numbers are stacked the way you say they are. Specifically, conditional on ethical egoism, I am not at all uncertain whether my utility curve is logarithmic or square-root; the question is whether it's log, log-log, or something slower growing and perhaps even bounded. So from my perspective the single biggest bit of "stacking" goes in the direction that favours A over B.

Note 1: The non-egoist case is different since it's very plausible prima facie that the utilities of others should be combined additively. Note 2: I suppose "not at all uncertain" is, as usual, an overstatement, but I think that if I'm wrong on this point then my understanding of my own preferences is so badly wrong that "all bets are off" and focusing on the particular possibility you have in mind here is privileging the hypothesis you happen to prefer. For instance, I think an egoist with apparent-to-self preferences that broadly resemble mine should give at least as much weight to the possibility that s/he isn't really an egoist as to the possibility that his/her utility function is as rapidly growing as you suggest. Note that, e.g., conditional on non-egoism one should probably give non-negligible probability to various egalitarian principles, either as axioms or as consequences of rapidly-diminishing individual utility functions, and these will tend to give strong reason for preferring B over A.