Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

The fundamentals of Bayesian thinking have been justified in many ways over the years. Most people here have heard of the VNM axioms and Dutch Book arguments. Far fewer, I think, have heard of the Complete Class Theorems (CCT).

Here, I explain why I think of CCT as a more purely consequentialist foundation for decision theory. I also show how complete-class style arguments play a role in social choice theory, justifying utilitarianism and a version of futarchy. This means CCT acts as a bridging analogy between single-agent decisions and collective decisions, therefore shedding some light on how a pile of agent-like pieces can come together and act like one agent. To me, this suggests a potentially rich vein of intellectual ore.

I have some ideas about modifying CCT to be more interesting for MIRI-style decision theory, but I'll only do a little of that here, mostly gesturing at the problems with CCT which could motivate such modifications.

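This post is a continuation of what I started in Generalizing Foundations of Decision Theory and Generalizing Foundations of Decision Theory II. The core motivation is to understand the justification for existing decision theory very well, see which assumptions are weakest, and see what happens when we remove them.

There is also a secondary motivation in human (ir)rationality: to the extent foundational arguments are real reasons why rational behavior is better than irrational behavior, one might expect these arguments to be helpful in teaching or training rationality. This is related to my criterion of consequentialism: the argument in favor of Bayesian decision theory should directly point to why it matters.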

With respect to this second quest, CCT is interesting because Dutch Book and money-pump arguments point out irrationality in agents by exploiting the irrational agent. CCT is more amenable to a model in which you point out irrationality by helping the irrational agent. I am working on a more thorough expansion of that view with some co-authors.

Other Foundations

(Skip this section if you just want to know about CCT, and not why I claim it is better than alternatives.)

I give an overview of many proposed foundational arguments for Bayesianism in the first post in this series. I called out Dutch Book and money-pump arguments as the most promising, in terms of motivating decision theory only from "winning". The second post in the series attempted to motivate all of decision theory from only those two arguments (extending work of Stuart Armstrong along those lines), and succeeded. However, the resulting argument was in itself not very satisfying. If you look at the structure of the argument, it justifies constraints on decisions via problems which would occur in hypothetical games involving money. Many philosophers have argued that the Dutch Book argument is in fact a way of illustrating inconsistency in belief, rather than truly an argument that you must be consistent or else. I think this is right. I now think this is a serious flaw behind both Dutch Book and money-pump arguments. There is no pure consequentialist reason to constrain decisions based on consistency relationships with thought experiments.

The position I'm defending in the current post has much in common with the paper Actualist Rationality by C. Manski. My disagreement with him lies in his dismissal of CCT as yet another bad argument. In my view, CCT seems to address his concerns almost precisely!

Caveat --

Dutch Book arguments are fairly practical. Betting with people, or asking them to consider hypothetical bets, is a useful tool. It may even be what convinces someone to use probabilities to represent degrees of belief. However, the argument falls apart if you examine it too closely, or at least requires extra assumptions which you have to argue in a different way. Simply put, belief is not literally the same thing as willingness to bet. Consequentialist decision theories are in the business of relating beliefs to actions, not relating beliefs to betting behavior.

Similarly, money-pump arguments can sometimes be extremely practical. The resource you're pumped of doesn't need to be money -- it can simply be the cost of thinking longer. If you spin forever between different options because you prefer strawberry ice cream to chocolate and chocolate to vanilla and vanilla to strawberry, you will not get any ice cream. However, the money-pump setup assumes that you will not notice this happening; whatever the extra cost of indecision is, it is placed outside of the considerations which can influence your decision.
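To make the ice cream cycle concrete, here is a minimal Python sketch. The function name `money_pump`, the fee, and the rule that the agent accepts every trade it prefers are illustrative assumptions, not part of any standard formalization. The fee is deliberately invisible to the agent's choice, which is exactly the assumption criticized above.

```python
# Cyclic preferences from the text: each pair (a, b) means "a is preferred to b".
PREFERS = {
    ("strawberry", "chocolate"),
    ("chocolate", "vanilla"),
    ("vanilla", "strawberry"),
}

def money_pump(start, fee, rounds):
    """Each round, offer the flavor the agent prefers to its current one, for a fee.

    Assumption: the agent always accepts, because the fee is kept outside
    the considerations that influence its decision.
    """
    holding, paid = start, 0
    for _ in range(rounds):
        # Exactly one flavor is preferred to the current holding in this cycle.
        better = next(a for a, b in PREFERS if b == holding)
        holding, paid = better, paid + fee
    return holding, paid

holding, paid = money_pump("vanilla", fee=1, rounds=3)
print(holding, paid)
```

After three trades the agent is back to holding vanilla and has paid three fees, so nothing has been gained for the cost.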

So, Dutch Book "defines" belief as willingness-to-bet, and money-pump "defines" preference as willingness-to-pay; in doing so, both arguments put the justification of decision theory into hypothetical exploitation scenarios which are not quite the same as the actual decisions we face. If these were the best justifications for consequentialism we could muster, I would be somewhat dissatisfied, but would likely leave it alone. Fortunately, a better alternative exists: complete class theorems.

Four Complete Class Theorems

For a thorough introduction to complete class theorems, I recommend Peter Hoff's course notes. I'm going to walk through four complete class theorems dealing with what I think are particularly interesting cases. Here's a map:

[Figure: map of the four complete class theorems covered in this post.]

In words: first we'll look at the standard setup, which assumes likelihood functions. Then we will remove the assumption of likelihood functions, since we want to argue for probability theory from scratch. Then, we will switch from talking about decision theory to social choice theory, and use CCT to derive a variant of Harsanyi's utilitarian theorem, AKA Harsanyi's social aggregation theorem, which tells us about cooperation between agents with common beliefs (but different utility functions). Finally, we'll add likelihoods back in. This gets us a version of Critch's multi-objective learning framework, which tells us about cooperation between agents with different beliefs and different utility functions.

I think of Harsanyi's utilitarianism theorem as the best justification for utilitarianism, in much the same way that I think of CCT as the best justification for Bayesian decision theory. It is not an argument that your personal values are necessarily utilitarian-altruism. However, it is a strong argument for utilitarian altruism as the most coherent way to care about others; and furthermore, to the extent that groups can make rational decisions, I think it is an extremely strong argument that the group decision should be utilitarian. AlexMennen discusses the theorem and implications for CEV here.

I somewhat jokingly think of Critch's variation as "Critch's Futarchy theorem" -- in the same way that Harsanyi shows that utilitarianism is the unique way to make rational collective decisions when everyone agrees about the facts on the ground, Critch shows that rational collective decisions when there is disagreement must involve a betting market. However, Critch's conclusion is not quite Futarchy. It is more extreme: in Critch's framework, agents bet their voting stake rather than money! The more bets you win, the more control you have over the system; the more bets you lose, the less your preferences will be taken into account. This is, perhaps, rather harsh in comparison to governance systems we would want to implement. However, rational agents of the classical Bayesian variety are happy to make this trade.

Without further ado, let's dive into the theorems.

Basic CCT

We set up decision problems like this:

Θ is the set of possible states of the external world.

X is the set of possible observations.

A is the set of actions which the agent can take.

F(x|θ) is a likelihood function, giving the probability of an observation x∈X under a particular world-state θ∈Θ.

D is a set of decision rules. For δ∈D, δ(x) outputs an action. Stochastic decision rules are allowed, though, in which case we should really think of δ(x) as outputting a probability distribution over actions.

L(θ,a), the loss function, takes a world θ∈Θ and an action a∈A and returns a real-valued "loss". L encodes preferences: the lower the loss, the better. One way of thinking about this is that the agent knows how its actions play out in each possible world; the agent is only uncertain about consequences because it doesn't know which possible world is the case.

In this post, I'm only going to deal with cases where Θ and X are finite. This is not a minor theoretical convenience -- things get significantly more complicated with unbounded sets, and the justification for Bayesianism in particular is weaker. So, it's potentially quite interesting. However, there's only so much I want to deal with in one post.

Some more definitions:

The risk of a policy in a particular true world-state: R(θ,δ) = E_{x∼F(x|θ)}[L(θ,δ(x))].
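As a concrete illustration of the risk definition, here is a small Python sketch for the finite case. The numbers in `F` and `L` are made up for the example, and representing a stochastic decision rule as a table `delta[x][a]` of action probabilities is my own encoding choice.

```python
# Hypothetical toy problem: 2 world-states, 2 observations, 2 actions.
F = [[0.8, 0.2],   # F[theta][x] = probability of observation x in world theta
     [0.3, 0.7]]
L = [[0.0, 1.0],   # L[theta][a] = loss of action a in world theta
     [1.0, 0.0]]

def risk(delta, F, L):
    """Return [R(theta, delta) for each theta], where delta[x][a] = P(a | x)."""
    risks = []
    for theta in range(len(F)):
        total = 0.0
        for x, p_x in enumerate(F[theta]):
            # Expected loss of the (possibly stochastic) action taken on seeing x.
            expected_loss = sum(p_a * L[theta][a] for a, p_a in enumerate(delta[x]))
            total += p_x * expected_loss
        risks.append(total)
    return risks

follow_observation = [[1.0, 0.0], [0.0, 1.0]]  # take action 0 on x=0, action 1 on x=1
always_action_0 = [[1.0, 0.0], [1.0, 0.0]]     # ignore the observation
coin_flip = [[0.5, 0.5], [0.5, 0.5]]           # a stochastic rule

for name, delta in [("follow", follow_observation),
                    ("always 0", always_action_0),
                    ("coin flip", coin_flip)]:
    print(name, risk(delta, F, L))
```

With these numbers, following the observation has risk of about 0.2 in the first world and 0.3 in the second, while always taking action 0 has risk 0 in the first world and 1 in the second. Neither rule is better in every world, which is the situation the next definitions are built to handle.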

A decision rule δ∗ is a pareto improvement over another rule δ if and only if R(θ,δ)≥R(θ,δ∗) for all θ, and strictly > for at least one. This is typically called dominance in treatments of CCT, but it's exactly parallel to the idea of pareto-improvement from economics and game theory: everyone is at least as well off, and at least one person is better off. An improvement which harms no one. The only difference here is that it's with respect to possible states, rather than people.

A decision rule δ is admissible if and only if there is no pareto improvement over it. The idea is that there should be no reason not to take pareto improvements, since you're only doing better no matter what state the world turns out to be in. (We could also call this pareto-optimal.)
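The two definitions above can be checked mechanically in a small finite problem. The sketch below uses made-up values for `F` and `L` and only enumerates deterministic rules, so it tests "not dominated by any deterministic rule", which is weaker than full admissibility (a stochastic rule could in principle still be a pareto improvement).

```python
from itertools import product

# Hypothetical toy problem: 2 world-states, 2 observations, 2 actions.
F = [[0.8, 0.2],   # F[theta][x]
     [0.3, 0.7]]
L = [[0.0, 1.0],   # L[theta][a]
     [1.0, 0.0]]

def risk(rule, F, L):
    """Risk vector of a deterministic rule, where rule[x] is the action taken on x."""
    return [sum(F[t][x] * L[t][rule[x]] for x in range(len(rule)))
            for t in range(len(F))]

def dominates(r_new, r_old, tol=1e-12):
    """True if risk vector r_new is a pareto improvement over r_old."""
    no_worse = all(a <= b + tol for a, b in zip(r_new, r_old))
    strictly_better = any(a < b - tol for a, b in zip(r_new, r_old))
    return no_worse and strictly_better

# All deterministic rules: one action for each of the two observations.
rules = list(product(range(2), repeat=2))
risks = {rule: risk(rule, F, L) for rule in rules}

undominated = [r for r in rules
               if not any(dominates(risks[o], risks[r]) for o in rules)]
print(undominated)
```

In this example the rule that does the opposite of what the observation suggests, `(1, 0)`, is dominated by the rule that follows the observation, `(0, 1)`; the other three rules survive.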

A class C of decision rules is a complete class if and only if for any rule not in C, δ∉C, there exists a rule δ∗ in C which is a pareto improvement. Note, not every rule in a complete class will be admissible itself. In particular, the set of all decision rules is a complete class. So, the complete class is a device for proving a weaker result than admissibility. This will actually be a bit silly for the finite case, because we can characterize the set of admissible decision rules. However, it is the namesake of complete class theorems in general; so, I figured that it would be confusing not to include it here.

Given a probability distribution π(θ) on world-states, the Bayes risk r(π,δ) is the expected risk over worlds, i.e., r(π,δ) = E_{π(θ)}[R(θ,δ)].

A probability distribution π is non-dogmatic when π(θ)>0 for all θ.

A decision rule is bayes-optimal with respect to a distribution π if it minimizes Bayes risk with respect to π. (This is usually called a Bayes rule with respect to π, but that seems fairly confusing, since it sounds like "Bayes' rule" aka Bayes' theorem.)
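Here is a minimal Python sketch of Bayes risk and bayes-optimality for a finite problem. The values of `F` and `L` and the helper names are assumptions made for illustration. The search runs over deterministic rules only; that loses nothing here, because Bayes risk is linear in the mixing probabilities, so some deterministic rule always attains the minimum.

```python
from itertools import product

# Hypothetical toy problem: 2 world-states, 2 observations, 2 actions.
F = [[0.8, 0.2],   # F[theta][x]
     [0.3, 0.7]]
L = [[0.0, 1.0],   # L[theta][a]
     [1.0, 0.0]]

def risk(rule, F, L):
    """Risk vector of a deterministic rule, where rule[x] is the action taken on x."""
    return [sum(F[t][x] * L[t][rule[x]] for x in range(len(rule)))
            for t in range(len(F))]

def bayes_risk(prior, rule, F, L):
    """r(pi, delta): the prior-weighted average of the risk over world-states."""
    return sum(p * r for p, r in zip(prior, risk(rule, F, L)))

def bayes_optimal(prior, F, L, n_obs=2, n_actions=2):
    """A deterministic rule minimizing Bayes risk (ties broken by enumeration order)."""
    rules = product(range(n_actions), repeat=n_obs)
    return min(rules, key=lambda rule: bayes_risk(prior, rule, F, L))

for prior in ([0.5, 0.5], [0.9, 0.1], [0.1, 0.9]):
    print(prior, bayes_optimal(prior, F, L))
```

Under the uniform prior the best rule follows the observation; once the prior leans heavily enough toward one world, the best rule ignores the observation and always takes the action suited to that world.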

THEOREM: When Θ and A are finite, decision rules which are bayes-optimal with respect to a non-dogmatic π are admissible.

PROOF: If δ is Bayes-optimal with respect to a non-dogmatic π, it minimizes the expectation E_{π(θ)}[R(θ,δ)]. Since π(θ)>0 for each world, any pareto improvement δ′ (which must be strictly better in some world, and not worse in any) would have a strictly lower expectation. That contradicts δ minimizing the expectation, so no pareto improvement exists, and δ is admissible. □
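The role of the non-dogmatic assumption can be seen in a small numerical check. The sketch below is only an illustration under made-up numbers: the loss table `L_tie` is chosen so that the action does not matter in world 0, and dominance is checked against deterministic rules only.

```python
import random
from itertools import product

F = [[0.8, 0.2],        # F[theta][x]
     [0.3, 0.7]]
L_tie = [[0.0, 0.0],    # in world 0 both actions have the same loss
         [1.0, 0.0]]    # in world 1 action 1 is better

def risk(rule, F, L):
    """Risk vector of a deterministic rule, where rule[x] is the action taken on x."""
    return [sum(F[t][x] * L[t][rule[x]] for x in range(len(rule)))
            for t in range(len(F))]

def dominates(r_new, r_old, tol=1e-12):
    """True if risk vector r_new is a pareto improvement over r_old."""
    return (all(a <= b + tol for a, b in zip(r_new, r_old))
            and any(a < b - tol for a, b in zip(r_new, r_old)))

def bayes_optimal_rules(prior, rules, F, L, tol=1e-12):
    """All deterministic rules that minimize Bayes risk under the prior."""
    scores = {rule: sum(p * r for p, r in zip(prior, risk(rule, F, L)))
              for rule in rules}
    best = min(scores.values())
    return [rule for rule in rules if scores[rule] <= best + tol]

def is_undominated(rule, rules, F, L):
    return not any(dominates(risk(o, F, L), risk(rule, F, L)) for o in rules)

rules = list(product(range(2), repeat=2))

# Non-dogmatic priors: every Bayes-optimal rule should be undominated.
rng = random.Random(0)
violations = []
for _ in range(1000):
    p = rng.uniform(0.01, 0.99)
    prior = [p, 1.0 - p]
    for rule in bayes_optimal_rules(prior, rules, F, L_tie):
        if not is_undominated(rule, rules, F, L_tie):
            violations.append((prior, rule))

# Dogmatic prior: all weight on world 0, where every rule looks equally good.
dogmatic = [1.0, 0.0]
dominated_optima = [rule for rule in bayes_optimal_rules(dogmatic, rules, F, L_tie)
                    if not is_undominated(rule, rules, F, L_tie)]
print(violations, dominated_optima)
```

No violations show up for the non-dogmatic priors, as the theorem predicts. Under the dogmatic prior, every rule ties at zero Bayes risk, yet three of the four are dominated by the rule that always takes action 1, because the prior cannot see the losses in world 1.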

THEOREM: (basic CCT) When Θ and A are finite, a decision rule δ is admissible if and only if it is Bayes-optimal with respect to some prior π.

PROOF: If δ is admissible, we wish to show that it is Bayes-optimal with respect to some π.

A decision rule has a risk in each world; think of this as a vector in ℝ^{|Θ|}. The set R of achievable risk vectors in ℝ^{|Θ|} (given by all δ) is convex, since we can make mixed strategies between any two decision rules. It is also closed, since A and X are finite. Consider a risk vector s as a point in this space (not necessarily achievable by any δ). Define the lower quadrant Q(s) to be the set of points which would be pareto improvements over s if they were achievable by a decision rule. Note that for an admissible decision rule with risk vector s, Q(s) and R are disjoint. By the hyperplane separation theorem, there is a separating hyperplane H between Q(s) and R. We can define π(θ) by taking a vector normal to the hyperplane and normalizing it to sum to one. This is a prior for which
