Infinite ethics comparisons


Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Work done with Amanda Askell; the errors are mine.

It's very difficult to compare utilities across worlds with infinite populations. For instance, it seems clear that world is better than , if the numbers indicate the utilities of various agents:

However, up to relabelling of the agents, these two worlds are actually identical. For this post, we'll only care about countable infinities of agents, and we'll assume that all utilities must occupy a certain finite range. This means that the $\limsup$ and $\liminf$ of utilities in a world are defined and finite, independently of the ordering of the agents in that world. For a world $w$, label these as $s(w)$ and $i(w)$.
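As a minimal sketch of these two order-independent quantities, assume a world has only finitely many distinct utility levels, each held by finitely or infinitely many agents. That restriction, the dictionary representation and the names `INF` and `limsup_liminf` are assumptions made for illustration. Under them, only the levels held by infinitely many agents matter:

```python
import math

INF = math.inf  # marks a utility level held by infinitely many agents


def limsup_liminf(world):
    """Return (limsup, liminf) for a world given as {utility: agent_count}.

    Assumes finitely many distinct utility levels, so the only possible
    accumulation points are the levels held by infinitely many agents.
    """
    recurring = [u for u, count in world.items() if count == INF]
    if not recurring:
        raise ValueError("an infinite world needs a level with infinitely many agents")
    return max(recurring), min(recurring)


# Hypothetical world: three agents at utility 5, infinitely many at 0 and at 1.
world = {5: 3, 0: INF, 1: INF}
print(limsup_liminf(world))  # (1, 0)
```

The three agents at utility 5 do not affect either value, and because a dictionary carries no agent ordering, relabelling the agents cannot change the result.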

Unambiguous gains and losses

Then compare the following worlds, where means there are infinitely many agents with utility :

It seems that is better than , because the middle category is higher. But this is deceptive, as we'll see.

Let's restrict ourselves to actions that change the utilities of agents in worlds, without creating or removing any agents.

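Given any such action $a$, call $(s(a), i(a))$ the signature of $a$ if $s(a)$ and $i(a)$ are the $\limsup$ and $\liminf$ of all utility changes caused by $a$. If $s(a) > 0$ and $i(a) \geq 0$, we'll call $a$ an unambiguous gain; if $s(a) \leq 0$ and $i(a) < 0$, we'll call $a$ an unambiguous loss.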

Then consider the action that transforms by moving all the agents at utility to utility . This is certainly an unambiguous gain. But now consider the action that sends all agents at utility down to utility , and sends infinitely many agents at utility down to utility (while leaving infinitely many at utility ). This is certainly an unambiguous loss.

However, both actions will send to . So it's not clear at all which of these worlds is better than the other.
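The following sketch replays this argument on made-up numbers. It reads an unambiguous gain as "limsup of changes positive, liminf non-negative" and an unambiguous loss as "limsup non-positive, liminf negative". The utility levels 0, 1, 2 and 3 and all function names are hypothetical and are not taken from the post.

```python
import math

INF = math.inf  # "infinitely many agents"


def signature(moves):
    """moves: list of (old_utility, new_utility, agent_count).

    Returns (limsup, liminf) of the utility changes, assuming finitely many
    distinct moves, so only moves applied to infinitely many agents count.
    """
    changes = [new - old for old, new, count in moves if count == INF]
    return max(changes), min(changes)


def classify(moves):
    s, i = signature(moves)
    if s > 0 and i >= 0:
        return "unambiguous gain"
    if s <= 0 and i < 0:
        return "unambiguous loss"
    return "neither"


def resulting_world(moves):
    """World reached after the moves, as {utility: agent_count}."""
    world = {}
    for _, new, count in moves:
        world[new] = world.get(new, 0) + count
    return world


# Hypothetical start world: infinitely many agents at each of 0, 1 and 3.
raise_middle = [(0, 0, INF), (1, 2, INF), (3, 3, INF)]
lower_some = [(0, 0, INF), (1, 0, INF), (3, 2, INF), (3, 3, INF)]

print(classify(raise_middle), resulting_world(raise_middle))
print(classify(lower_some), resulting_world(lower_some))
```

Both actions end in the same world (infinitely many agents at each of 0, 2 and 3), yet the first is classified as an unambiguous gain and the second as an unambiguous loss.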

Comparing infinite worlds

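Define $m(w)$ as $(s(w) + i(w))/2$, the average of the $\limsup$ and $\liminf$ of $w$. Then here are five ways of comparing worlds, which allow finer and finer comparisons.

1. If $s(w) < i(w')$, then $w <_1 w'$.
2. If $s(w) \leq s(w')$, $i(w) \leq i(w')$, and one of these inequalities is strict, then $w <_2 w'$.
3. If $m(w) < i(w')$, then $w <_3 w'$.
4. If $s(w) < m(w')$, then $w <_4 w'$.
5. If $m(w) < m(w')$, then $w <_5 w'$.

All these are transitive preorders, with $<_5$ being a total preorder. The $<_2$ is a refinement of $<_1$ (in that if $w <_1 w'$, then $w <_2 w'$), as are the $<_3$ and $<_4$. The $<_5$ is a refinement of all of $<_2$, $<_3$, $<_4$, but none of these three are refinements of each other.

In fact, $<_5$ is a minimal refinement of $<_3$ and $<_4$. To see this, assume $w <_5 w'$, hence $m(w) < m(w')$, and introduce $w''$, a world where everyone has a single utility at a value between $m(w)$ and $m(w')$. Then $w <_3 w'' <_4 w'$.

However, $<_5$ is not a minimal refinement of $<_2$ and $<_3$ (or of $<_2$ and $<_4$). To see this, assume $w <_2 w' <_3 w''$; then $w <_3 w''$, and similarly $w <_3 w' <_2 w''$ implies $w <_3 w''$. Therefore the minimal refinement of $<_2$ and $<_3$ is simply the union of $<_2$ and $<_3$ (the same goes for $<_2$ and $<_4$).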
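To make the five comparisons concrete, here is a small sketch that summarises a world by its liminf and limsup only. The exact inequalities used for the third and fourth comparisons (midpoint of one world against the liminf or limsup of the other) are an assumed reading, and the `World` type, the `lt1` to `lt5` names and the example numbers are all hypothetical.

```python
from typing import NamedTuple


class World(NamedTuple):
    i: float  # liminf of the utilities
    s: float  # limsup of the utilities

    @property
    def m(self) -> float:
        """Midpoint of liminf and limsup."""
        return (self.s + self.i) / 2


def lt1(w, v):
    return w.s < v.i


def lt2(w, v):
    return w.s <= v.s and w.i <= v.i and (w.s < v.s or w.i < v.i)


def lt3(w, v):  # assumed reading: midpoint of w below liminf of v
    return w.m < v.i


def lt4(w, v):  # assumed reading: limsup of w below midpoint of v
    return w.s < v.m


def lt5(w, v):
    return w.m < v.m


w = World(i=0, s=10)   # midpoint 5
v = World(i=4, s=8)    # midpoint 6
between = World(i=5.5, s=5.5)  # everyone at one utility between the midpoints

print(lt5(w, v), lt2(w, v), lt3(w, v), lt4(w, v))  # True False False False
print(lt3(w, between), lt4(between, v))            # True True
```

Here only the fifth comparison ranks the two worlds directly, while the constant world `between` links them through the third and fourth comparisons, which mirrors the minimal-refinement argument.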

Correspondence with actions and signatures

Those five methods of world comparisons correspond neatly to features of actions mapping between worlds. In particular:

1. If $w <_1 w'$, then every action that changes $w$ into $w'$ is an unambiguous gain.
2. If $w <_2 w'$, then there is an unambiguous gain that changes $w$ into $w'$, and no such action that is an unambiguous loss.
3. If $w <_3 w'$, then , and if $a$ is an action transforming $w$ into $w'$ of signature $(s(a), i(a))$, then .
4. If $w <_4 w'$, then , and if $a$ is an action transforming $w$ into $w'$ of signature $(s(a), i(a))$, then .
5. If $w <_5 w'$, and $a$ is an action that changes $w$ into $w'$, of signature $(s(a), i(a))$, and and are the minimums of all such $s(a)$ and $i(a)$ for different $a$, then .

Note that $<_3$ and $<_4$ can be rephrased in terms of $a$: in the first case, $a$ must be an improvement for infinitely many of the lowest agents in $w$; in the second case, infinitely many of the highest agents in $w'$ must have been improved by $a$.

These results can be displayed graphically here, with the blue line being the interval $[i(w), s(w)]$ for a world $w$. The examples for $<_1$ and $<_2$ are pretty clear:

Here are $<_3$ and $<_4$:

For $<_5$, it helps to consider separately cases where the interval for $w'$ is smaller or larger than the interval for $w$:

Finer comparisons and unbounded utilities

We may be able to get finer comparisons than $<_5$, for instance by looking at the fine structure of the utilities around $s(w)$ and $i(w)$.

Now let's allow unbounded utilities for individual agents. If the utilities are unbounded in one direction only, then all five comparisons are still defined, as long as we allow infinity (or minus infinity) to be a valid value for $s(w)$ (or $i(w)$), and take the average of an infinite value and a finite value to be equal to that infinite value.

If we allow unbounded utilities in both directions, then both $s(w)$ and $i(w)$ can become infinite. The $<_1$ and $<_2$ remain well defined, but not $<_3$, $<_4$ or $<_5$, since $m(w)$ is not always defined. If we arbitrarily set $m(w)$ to some fixed value when $s(w) = \infty$, $i(w) = -\infty$, then we can define all five comparisons.
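A short sketch of that convention, using floating-point infinities. The function name `midpoint` and the default `fallback=0.0` are arbitrary choices for illustration, since the text only asks for some fixed value.

```python
import math


def midpoint(s, i, fallback=0.0):
    """Average of limsup s and liminf i, allowing infinite values.

    `fallback` is the arbitrary fixed value used when s = +inf and i = -inf,
    the one case where the plain average is undefined.
    """
    if s == math.inf and i == -math.inf:
        return fallback
    # An infinite value averaged with a finite one stays infinite.
    return (s + i) / 2


print(midpoint(math.inf, 2.0))        # inf
print(midpoint(5.0, -math.inf))       # -inf
print(midpoint(math.inf, -math.inf))  # 0.0
```

Without the explicit check, `math.inf + (-math.inf)` evaluates to NaN in Python, and every comparison against NaN is false, so the midpoint-based comparisons would silently stop ranking such worlds.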