A dozen years ago, Eliezer Yudkowsky asked us which was less (wrong?) bad:
- 3^^^3 people each getting a dust speck in their eyes,
- 1 person getting horribly tortured continually for 50 years.
He cheekily ended the post with "I think the answer is obvious. How about you?"
To this day, I do not have a clue which answer he thinks is obviously true, but I strongly believe that the dust specks are preferable. Namely, I don't think there's any number we can increase 3^^^3 to that would change the answer, because I think the disutility of dust specks is fundamentally incomparable with that of torture. If you disagree with me, I don't care--that's not actually the point of this post. Below, I'll introduce a generalized type of utility function which allows for making this kind of statement with mathematical rigor.
Linearity and Archimedes
A utility function is called linear if it respects scaling--5 people getting dust specked is 5 times as bad as a single dust specker. This is a natural and common assumption for what utility functions should look like.
An ordered algebraic structure (number system) is called Archimedean if for any two positive values $x$, $y$, there exist natural numbers $n$, $m$ such that $nx > y$ and $my > x$. In other words, adding either value to itself a finite number of times will eventually outweigh the other. Any pair $x$, $y$ for which this property holds of $|x|$ and $|y|$ is called comparable. Note that we're taking the absolute value because we care about magnitude and not sign here: of course any positive number of people enjoying a sunset is better than any positive number of people being tortured, but the question remains whether there is some number of sunset-views which would outweigh a year of torture (in the sense that a world with $n$ additional sunset watchers and 1 additional year of torture would be preferable to a world with no more of either).
Another way to think about comparability is that both values are relevant to an analysis of world state preference: a duster does not need to know how many dust specks are in world 1 or world 2 if they know that world 2 has more torture than world 1, because the preference is already decided by the torture differential.
Convicted dusters now face a dilemma: a linear, Archimedean utility function necessarily yields some number of dust specks to which torture is preferable. One resolution is to reject linearity: very smart people have discussed this at length. There is another option though which, as far as the author can tell, hardly ever gets proper attention: rejecting Archimedean utilities. This is what I'll explore in the post.
Comparability defines an equivalence relation on world states (via their utilities):
- Any world state is comparable to itself (reflexivity),
- if state 1 is comparable to state 2, then state 2 is comparable to state 1 (symmetry), and
- if state 1 is comparable to state 2 and state 2 is comparable to state 3, then state 1 is comparable to state 3 (transitivity).
These properties follow immediately from the definition of comparability provided above. Now we have a partition of world states into comparability classes. Note that the comparability classes do not form a vector space: 1 torture + 1 dust speck is comparable to one torture, but their difference, 1 dust speck, is not comparable to either.
As I said before, I don't care if you disagree about dust specks. If you think there is any pair of utilities which are in some way qualitatively distinct or otherwise incomparable (take your pick from: universe destruction, death, having corn stuck between one's teeth, the simple joy of seeing a friend, etc--normative claims about which utilities are comparable are outside the scope of this post), then there is more than one comparability class and the utility function needs a non-Archimedean co-domain to reflect this.
If you staunchly believe that every pair of utilities is comparable, that's fine--there will be only one comparability class containing all world states, and the construction below will boringly return your existing Archimedean utility function.
Pseudograded Vector Spaces
We can assign a total ordering (severity) to our collection $I$ of comparability classes: $i < j$ means that, for any world states $x_i$, $x_j$ in classes $i$, $j$ respectively, $n\,|U(x_i)| < |U(x_j)|$ for all $n \in \mathbb{N}$ (writing $U$ for the utility function). It is clear from the construction of comparability classes that this does not depend on which representatives are chosen. Note that severity does not distinguish between severe positive and severe negative utilities: the classes are defined by magnitude and not sign. Thus, the framework is agnostic on such matters as negative utilitarianism: if one believes that reducing disutility of various forms takes priority over expanding positive utility, then that will be reflected in the severity levels of these comparability classes--perhaps a large portion of the most severe classes will be disutility-reduction based, and only after that do we encounter classes prioritizing positive utility world states.
Then we can think of utilities as being vectors, where each component records the amount of utility of a given severity level. We'll write the components with most severe first (since those are the most significant ones for comparing world states). Thus, if an agent believes that there are exactly two comparability classes:
- a severe class which contains e.g. 50 years of torture, and
- a milder class containing e.g. being hit with a dust speck,
then world A with 1 person being tortured and 100 dust-speckers would have utility $(-1, -100)$. Choosing between this and world B with no torture and 3^^^3 dust speckers is now easy: world B has utility $(0, -3\uparrow\uparrow\uparrow 3)$, and since the first component is greater in world B, it is preferable no matter what the second component is. If this seems unreasonable to the reader, it is not a fault of the mathematics but the normative assumption that dust specks are incomparable to torture--if comparability classes are constructed correctly, then this sort of lexicographic ordering on utilities necessarily follows.
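Here's a minimal sketch of that comparison in Python, assuming the two-class setup above with components written most severe first. The name `HUGE` is my own stand-in, since 3^^^3 is far too large to actually compute, and `prefer` is just an illustrative helper.

```python
# Sketch: lexicographic comparison of two-level utility vectors.
# Components are ordered most severe first: (torture, dust specks).
# HUGE is an assumed placeholder; 3^^^3 itself cannot be computed.
HUGE = 10**100

world_a = (-1, -100)   # 1 person tortured, 100 dust specks
world_b = (0, -HUGE)   # no torture, HUGE dust specks


def prefer(u, v):
    """Return the lexicographically greater (preferred) utility vector."""
    # Python tuples compare lexicographically: the first differing
    # component decides, and later components are never consulted.
    return u if u >= v else v


print(prefer(world_a, world_b) == world_b)
```

However large `HUGE` is made, the result doesn't change, because the second component is only consulted when the first components tie.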
This is what we refer to as a pseudograded vector space: the utility space consists of $|I|$-dimensional vectors, which are $I$-pseudograded by severity, meaning that for any vector we can identify its severity as the comparability class of the first non-zero component. Equivalently, utility values are functions from $I$ to $\mathbb{R}$, and thus utility functions have type $W \to (I \to \mathbb{R})$, where $W$ is the world state space (this interpretation becomes important if we would like to consider the possibility of infinitely many comparability classes).
Formally, the utility vector space $V$ admits a grading function $g : V \setminus \{0\} \to I$ which gives the index of the first (most severe) non-zero coordinate of the input vector. In our example (writing $U$ for the utility function), $g(U(A))$ is the torture class and $g(U(B))$ is the dust speck class. In general, if we're comparing two world states with different severity gradings, we need only check which world has greater severity, and then consult the sign of $u(g(u))$ for that world's utility vector $u$ (that's a function evaluation, not a product!) to determine if it's preferable to the alternative (the sign will be non-zero by definition of $g$). More generally, we compare worlds with utility vectors $u$, $v$ by checking the sign of $(u - v)(g(u - v))$. Note that two utility vectors satisfy $u > v$ exactly when $u - v > 0$, so the lexicographic ordering is determined fully by which utilities are preferable to 0, the utility of an empty world. We will refer to such utility vectors as positive, keeping in mind that this only means that the most severe non-zero coefficient is positive and not that the whole vector is.
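To make the grading-based comparison concrete, here is a small sketch for finitely many severity levels. As an assumption for illustration, it uses list positions as stand-ins for the comparability classes (position 0 is the most severe), and the names `grading` and `compare` are mine.

```python
from typing import Optional, Sequence


def grading(u: Sequence[int]) -> Optional[int]:
    """Position of the first (most severe) non-zero component.

    Returns None for the zero vector, which has no grading.
    """
    for position, component in enumerate(u):
        if component != 0:
            return position
    return None


def compare(u: Sequence[int], v: Sequence[int]) -> int:
    """Return 1 if u is preferred to v, -1 if v is preferred, 0 if equal."""
    diff = [a - b for a, b in zip(u, v)]
    g = grading(diff)
    if g is None:          # u - v is the zero vector
        return 0
    # Evaluate the difference at its own grading and read off the sign.
    return 1 if diff[g] > 0 else -1


world_a = (-1, -100)
world_b = (0, -10**100)    # placeholder magnitude standing in for 3^^^3
print(compare(world_a, world_b))
```

Here `compare` returns `-1`: the difference first becomes non-zero at the torture level, where it is negative, so world A is the worse of the two.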
Expected utility computations work exactly as in Archimedean utility functions: averages are performed component-wise. Thus, existing work with utilities and decision theory should import easily to this generalization.
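As a rough illustration of component-wise averaging, the sketch below computes expected utility vectors for two lotteries and then compares them lexicographically. The lotteries and their numbers are invented for the example, and `expected_utility` is a hypothetical helper; exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction


def expected_utility(lottery):
    """Component-wise expectation of a lottery.

    lottery: list of (probability, utility_vector) pairs whose
    probabilities are assumed to sum to 1.
    """
    dims = len(lottery[0][1])
    return tuple(sum(p * u[k] for p, u in lottery) for k in range(dims))


# Components are (torture, dust specks), most severe first.
# A 1-in-1000 chance of one torture, otherwise nothing happens.
risky = [(Fraction(1, 1000), (-1, 0)), (Fraction(999, 1000), (0, 0))]
# A certain million dust specks.
safe = [(Fraction(1), (0, -1_000_000))]

print(expected_utility(risky))
print(expected_utility(safe) > expected_utility(risky))
```

Note what this implies under the two-class assumption: any non-zero probability of torture outweighs any certain quantity of dust specks, since the expected torture component is compared first.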
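Observe that $g$ satisfies ultrametric-like conditions (recalling those found in $p$-adic norms), where the comparisons below are taken in the severity ordering on $I$:
- For any non-zero $c \in \mathbb{R}$ and utility vector $u$, $g(cu) = g(u)$.
- For any pair of utility vectors $u$, $v$, we have $g(u + v) \leq \max(g(u), g(v))$, with equality if $g(u) \neq g(v)$.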
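These two conditions can be spot-checked by brute force on small integer vectors. This is only a sanity check under my own encoding, not a proof: `severity` ranks a vector by how early its first non-zero component appears (larger means more severe), and I reserve rank 0 for the zero vector so that sums which cancel completely are still covered.

```python
import itertools


def severity(u):
    """Severity rank of u: larger is more severe, 0 for the zero vector."""
    for position, component in enumerate(u):
        if component != 0:
            return len(u) - position
    return 0


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


def scale(c, u):
    return tuple(c * a for a in u)


# All 3-component vectors with entries drawn from a small sample set.
vectors = list(itertools.product([-2, 0, 1], repeat=3))

# Scaling by a non-zero constant never changes the severity.
scaling_ok = all(
    severity(scale(c, u)) == severity(u) for c in (-3, 2) for u in vectors
)

# The severity of a sum is at most the larger severity of the summands,
# with equality whenever the two severities differ.
ultrametric_ok = all(
    severity(add(u, v)) <= max(severity(u), severity(v))
    and (
        severity(u) == severity(v)
        or severity(add(u, v)) == max(severity(u), severity(v))
    )
    for u in vectors
    for v in vectors
)

print(scaling_ok, ultrametric_ok)
```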
One may reasonably be concerned about the well-definedness of $g$ if they believe $|I| = \infty$: isn't it possible that $I$ is not well-ordered under severity, meaning that some vectors may not have a first non-zero element? [Technically we are considering the well-orderedness of the reverse ordering on $I$, since we want to find the most severe non-zero utility and not the least severe].
Consider for instance the function $f(x) = \sin(1/x)$ defined on the positive reals: for all its smoothness, it has no least input with a non-zero output, and what's worse the output oscillates signs infinitely many times approaching $0$, such that we aren't prepared to weigh a world state with such a utility against an empty world with utility 0. Luckily, our setting is actually a special case of the Hahn Embedding Theorem for Abelian Ordered Groups, which guarantees that any utility vector can only have non-zero entries on a well-ordered set of indices (thanks to Sam Eisenstat for pointing this out to me), and thus $g$ is necessarily well-defined. We'll refer to this vector space as $V$: the space of all functions $I \to \mathbb{R}$ which each only have non-zero values on a well-ordered subset of $I$, and thus in particular all have a first non-zero value.
Working with vector spaces, we often think of the vectors as lists of (or functions from the index set to) numbers, but this is a convenient lie--in doing so, we are fixing a basis with which the vector space was not intrinsically equipped. Abstractly, a vector space is coordinate-free, and we merely pick bases for ease of visualization and computation. However, it is important for our purposes here that the lexicographic ordering on utility functions is "basis-independent", for a suitable notion of basis (otherwise, our world state preference would depend on arbitrary choices made in representations of the utility space). In fact, classical graded vector spaces have distinguished homogeneous elements of grading $i$ (e.g. polynomials which only have degree $i$ monomials, rather than mixed polynomials with max degree $i$, in the graded vector space of polynomials), but this is too much structure and would be unnatural to apply to our setting, hence pseudograded vector spaces.
The standard definition of basis (a linearly independent set of vectors which spans the space) isn't sufficient here, since we have extra structure we'd like to encode in our bases--namely, the severity grading. Instead, we'll define a graded basis to be a choice of one positive utility vector for each grading (i.e., a map $b : I \to V$ such that $g(b(i)) = i$ for every $i \in I$--in categorical terms, $b$ is a section of $g$).
Then suppose Alice and Bob agree that there are two severity levels, and even agree on which world states fall into each, but have picked different graded bases. Letting $X$ = 50 years of torture and $Y$ = 1 dust speck in someone's eye, perhaps Alice's basis takes $X$ and $Y$ themselves as the units for their respective severity levels, but for some reason Bob thinks the sensible choice of representatives is a different pair of world states. He agrees with Alice about the utility of every world state, but factors the world state differently (while Bob may seem slightly ridiculous here, there are certainly world states with multiple natural factorizations).
Then Alice's utility for world A in her basis is $(-1, -100)$ and her utility for world B is $(0, -3\uparrow\uparrow\uparrow 3)$, while Bob writes different coordinates for the same two utilities. Note that they will both prefer world B, since the change in graded basis did not affect the lexicographic ordering on utilities. Specifically, the transformation from Alice's coordinates to Bob's is induced by left multiplication by a change-of-basis matrix, which we'll call $T$.
We observe two key facts about the change-of-basis matrix $T$:
- $T$ is lower triangular (all entries above the main diagonal are 0), and
- $T$ has positive diagonal entries.
These will be true of any transformations between graded bases of the same utility space: lower triangularity follows from the ultrametric property of the grading $g$, and positive diagonals follow from our insistence that the graded basis vectors all have positive utilities. In general (when $|I| = \infty$), it won't make sense to view the transformation $T$ as a matrix, but we will have a linear transformation operator with the same properties.
Recalling that the ordering on utilities is determined by the positive cone of utilities (since $u > v$ iff $u - v > 0$), we simply observe that lower triangular linear transformations with positive diagonal entries will preserve the positive cone of the utility space $V$: if the first non-zero entry of $u$ is positive, then for such a transformation $T$, $Tu$ will also have 0 for all entries before index $g(u)$ (since lower triangularity means changes only propagate downstream) and will scale $u(g(u))$ by the positive diagonal element, returning a positive utility. Thus, our ordering is independent of graded basis and depends only upon the severity classes and comparisons within them.
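The basis-independence argument can be checked numerically on a small case. In the sketch below, the matrix `T` is a hypothetical lower triangular matrix with positive diagonal that I made up for illustration (it is not a particular basis choice from this post), and `MIXING` is an upper triangular matrix included to show what goes wrong without lower triangularity.

```python
import itertools

# Hypothetical change of graded basis: lower triangular, positive diagonal.
T = ((2, 0),
     (-3, 5))

# Not a valid graded change of basis: the less severe coordinate
# leaks into the more severe one.
MIXING = ((1, 1),
          (0, 1))


def apply(matrix, u):
    """Left-multiply the column vector u by matrix."""
    return tuple(sum(row[k] * u[k] for k in range(len(u))) for row in matrix)


HUGE = 10**100  # placeholder magnitude standing in for 3^^^3
alice_a, alice_b = (-1, -100), (0, -HUGE)
bob_a, bob_b = apply(T, alice_a), apply(T, alice_b)

# Both coordinate systems rank world B above world A.
same_preference = (alice_b > alice_a) and (bob_b > bob_a)

# T preserves the lexicographic order on every pair in a small grid.
grid = list(itertools.product(range(-2, 3), repeat=2))
order_preserved = all(
    (u > v) == (apply(T, u) > apply(T, v)) for u in grid for v in grid
)

# MIXING flips a preference: (1, -5) is positive, but its image is not.
flipped = (1, -5) > (0, 0) and apply(MIXING, (1, -5)) < (0, 0)

print(same_preference, order_preserved, flipped)
```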
What's the point?
I think many LW folk (and almost all non-philosophers) are firm dusters and have felt dismayed at their inability to justify this within Archimedean utility theory, or have thrown up their hands and given up linearity believing it to be the only option. While we will never encounter a world with 3^^^3 people in it to be dusted, it's important to understand the subtleties of how our utility functions actually work, especially if we would like to use them to align AIs that will have vastly more ability than us to consider many small utilities and aggregate them in ways that we don't necessarily believe they should.
Moreover, the framework is very practical in terms of computational efficiency. It means that choosing the best course of action or comparing world states requires one to only have strong information on the most severe utility levels at stake, rather than losing their mind (or overclocking their CPU) trying to consider all the tiny utilities which might add up. Indeed, this is how most people operate in daily life; while some may counter that this is a failure of rationality, perhaps it is in fact because people understand that there are some tiny impacts which can never accumulate to more importance than the big stuff, and it can't all be chalked up to scope neglect.
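One way to picture the efficiency claim is a comparison that evaluates severity levels lazily and stops at the first level where the two worlds differ. This is only a sketch under my own assumptions: each world is represented as a list of zero-argument functions (most severe first), and `compare_lazily` and `expensive_dust_count` are hypothetical names.

```python
def compare_lazily(levels_u, levels_v):
    """Compare two worlds level by level, most severe first.

    levels_u, levels_v: lists of zero-argument callables, each returning
    the utility at one severity level. Returns (result, evaluated), where
    result is 1, -1 or 0 and evaluated counts the levels inspected.
    """
    evaluated = 0
    for level_u, level_v in zip(levels_u, levels_v):
        evaluated += 1
        a, b = level_u(), level_v()
        if a != b:
            # The first differing level decides; milder levels are skipped.
            return (1 if a > b else -1), evaluated
    return 0, evaluated


def expensive_dust_count():
    # Stands in for a costly tally of tiny utilities.
    raise RuntimeError("never needed when the torture level already differs")


world_a = [lambda: -1, expensive_dust_count]   # one person tortured
world_b = [lambda: 0, expensive_dust_count]    # nobody tortured

result, evaluated = compare_lazily(world_a, world_b)
print(result, evaluated)
```

The dust speck tally is never run here, because the torture level already settles the comparison.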
Finally, completely aside from all ethical questions, I think there's some interesting math going on and I wanted people to look at it. Thanks for looking at it!
This is my first blog post and I look forward to feedback and discussion (I'm sure there are many issues here, and I hope that I will not be alone in trying to solve them). Thanks to everyone at MSFP with whom I talked about this--even if you don't think you gave me any ideas, being organic rubber ducks was very productive for my ability to formulate these thoughts in semi-coherent ways.