Donald Regan's masterful (1980) Utilitarianism and Co-operation raises a problem for traditional moral theories, which conceive of agents as choosing between external options like 'push' or 'not-push' (options that are specifiable independently of the motive from which they are performed). He proves that no such traditional theory T is adaptable, in the sense that "the agents who satisfy T, whoever and however numerous they may be, are guaranteed to produce the best consequences possible [from among their options] as a group, given the behaviour of everyone else." (p.6) It's easy to see that various forms of rule or collective consequentialism fail when you're the only agent satisfying the theory -- doing what would be best if everyone played their part is not necessarily to do what's actually best. What's more interesting is that even Act Utilitarianism can fail to beat co-ordination problems like the following:

                     Poof
                push    not-push
Whiff  push      10        0
       not-push   0        6

Here the best result is obviously for Whiff and Poof to both push. But this isn't guaranteed by the mere fact that each agent does as AU says they ought. Why not? Well, what each ought to do depends on what the other does. If Poof doesn't push then neither should Whiff (that way he can at least secure 6 utils, which is better than 0). And vice versa. So, if Whiff and Poof both happen to not-push, then both have satisfied AU. Each, considered individually, has picked the best option available. But clearly this is insufficient: the two of them together have fallen into a bad equilibrium, and hence not done as well as they (collectively) could have.
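The reasoning above can be checked mechanically. What follows is only a rough sketch, assuming the payoffs given above are a single shared total that both agents aim to maximize; the names `PAYOFF`, `whiff_satisfies_au`, `poof_satisfies_au` and `au_profiles` are made up for illustration and are not Regan's.

```python
from itertools import product

ACTIONS = ("push", "not-push")

# Shared payoff in utils for each (Whiff's act, Poof's act), as given above.
PAYOFF = {
    ("push", "push"): 10,
    ("push", "not-push"): 0,
    ("not-push", "push"): 0,
    ("not-push", "not-push"): 6,
}


def whiff_satisfies_au(whiff, poof):
    """Whiff's act is at least as good as any alternative, holding Poof's act fixed."""
    return PAYOFF[(whiff, poof)] == max(PAYOFF[(a, poof)] for a in ACTIONS)


def poof_satisfies_au(whiff, poof):
    """Poof's act is at least as good as any alternative, holding Whiff's act fixed."""
    return PAYOFF[(whiff, poof)] == max(PAYOFF[(whiff, a)] for a in ACTIONS)


# Every pattern of behaviour in which both agents satisfy AU.
au_profiles = [
    (w, p)
    for w, p in product(ACTIONS, repeat=2)
    if whiff_satisfies_au(w, p) and poof_satisfies_au(w, p)
]

# The pattern of behaviour that is actually best for the pair.
best_profile = max(PAYOFF, key=PAYOFF.get)

print(au_profiles)
print(best_profile)
```

Two patterns pass the AU check, and one of them is the both-not-push pattern worth only 6, so universal satisfaction of AU does not guarantee the outcome worth 10.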

Regan's solution is to build a certain decision-procedure into the objective requirements of the theory:

The basic idea is that each agent should proceed in two steps: First he should identify the other agents who are willing and able to co-operate in the production of the best possible consequences. Then he should do his part in the best plan of behaviour for the group consisting of himself and the others so identified, in view of the behaviour of non-members of the group. (p.x)


This theory, which Regan calls 'Co-operative Utilitarianism', secures the property of adaptability.  (You can read Regan for the technical details; here I'm simply aiming to convey the rough idea.)  To illustrate with our previous example: suppose Poof is a non-cooperator, and so decides on outside grounds to not-push. Then Whiff should (i) determine that Poof is not available to cooperate, and hence (ii) make the best of a bad situation by likewise not-pushing. In this case, only Whiff satisfies CU, and hence the agents who satisfy the theory (namely, Whiff alone) collectively achieve the best results available to them in the circumstances.

If both agents satisfied the theory, then they would first recognize the other as a cooperator, and then each would push, as that is what is required for them to "do their part" to achieve the best outcome available to the actual cooperators.
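The two cases just described can be sketched in code. This is a toy illustration rather than Regan's actual procedure: it assumes that step one (working out who the cooperators are) has already been done and is simply handed in as input, whereas Regan's full account of that step is more subtle. The function name `cu_plan` and the `fixed_acts` parameter are hypothetical.

```python
from itertools import product

AGENTS = ("Whiff", "Poof")
ACTIONS = ("push", "not-push")

# Shared payoff in utils for each (Whiff's act, Poof's act).
PAYOFF = {
    ("push", "push"): 10,
    ("push", "not-push"): 0,
    ("not-push", "push"): 0,
    ("not-push", "not-push"): 6,
}


def cu_plan(fixed_acts):
    """Return the best joint plan for the cooperators and its value.

    fixed_acts maps each non-cooperator to the act they will perform anyway.
    Every agent not listed there is assumed (for this sketch) to be a
    willing and able cooperator.
    """
    cooperators = [a for a in AGENTS if a not in fixed_acts]
    best_plan, best_value = None, None
    # Try every joint assignment of acts to the cooperators.
    for acts in product(ACTIONS, repeat=len(cooperators)):
        profile = dict(fixed_acts)
        profile.update(zip(cooperators, acts))
        value = PAYOFF[(profile["Whiff"], profile["Poof"])]
        if best_value is None or value > best_value:
            best_plan, best_value = dict(zip(cooperators, acts)), value
    return best_plan, best_value


# Poof is a non-cooperator who will not push: Whiff's part is to not-push.
print(cu_plan({"Poof": "not-push"}))

# Both are cooperators: each one's part in the best plan is to push.
print(cu_plan({}))
```

The point of the design is that the optimization runs over the cooperators' joint options, taking non-cooperators' behaviour as fixed, rather than over each agent's options one at a time.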

* * *

[Originally posted to Philosophy, etc.  Reproduced here as an experiment of sorts: despite discussing philosophical topics, LW doesn't tend to engage much with the extant philosophical literature, which seems like a lost opportunity.  I chose this piece because of the possible connections between Regan's view of cooperative games and the dominant LW view of competitive games: that one should be disposed to co-operate if and only if dealing with another co-operator.  In any case, I'll be interested to see whether others find this at all helpful or interesting -- naturally that'll influence whether I attempt this sort of thing again.]


9 comments

One thing you can do is look for cabal equilibria. A cabal equilibrium is a strategy profile that is Pareto-optimal for each subset of players. In other words, no set of players can improve their outcome by simultaneously changing strategies. Cabal equilibria in elections are pretty interesting.

He proves that no such traditional theory T is adaptable, in the sense that "the agents who satisfy T, whoever and however numerous they may be, are guaranteed to produce the best consequences possible [from among their options] as a group, given the behaviour of everyone else."

What makes a decision theory "traditional"? Presumably, TDT and UDT are not "traditional" in this sense.

Defined in the previous sentence. A traditional or 'exclusively act-oriented' theory is concerned with which option an agent chooses, where the 'options' are understood as actions like pushing or not-pushing -- acts which can be specified independently of the motive from which they're performed.

For an agent to satisfy the requirements of a traditional theory like Act Consequentialism, they simply need to perform the right action. The contrast is with theories like CU which are not exclusively act-oriented, but require agents to actually use a particular decision procedure (and not simply act in the way that would be recommended by the procedure).

How do you expect agents to systematically act in the way recommended by a particular procedure without actually using that procedure?

Not sure what gave you the impression that I have any such expectation. People may satisfy a moral theory just occasionally. On those occasions, we would expect the group who satisfy the theory to do as well as was possible, if the theory advertises itself as a consequentialist one. Surprisingly, it turns out that this is not so for the aforementioned class of such theories.

It is not surprising at all that other agents' counterfactual expectations of your behavior affect their behavior, which in turn can affect you.

You'll need to explain how that relates to my previous comment. I get the sense that we're talking past each other (or at least starting from very different places).

The point is, judging a decision theory based on the results for agents that happen to do what it recommends, rather than for agents that systematically do what it recommends because they actually compute what it recommends and then do that, is not a good way to judge decision theories, if you are judging them with the purpose of choosing one to systematically follow. In particular, a big problem with using the results for agents that merely happen to do what the decision theory recommends is that you don't expect other agents to expect the agent you are considering to follow the decision theory in the counterfactual computations they make to inform their own decisions, which in turn affect the outcome for the agent under consideration.

Thanks, that's helpful. I'm actually not "judging them with the purpose of choosing one to systematically follow" -- my interests are more theoretical than that (e.g., I'm interested in what sort of moral theory best represents the core idea of Consequentialism).

Having said that, I agree about the importance of counterfactuals here, and hence the importance of agents following a theory rather than merely conforming their behaviour to it -- indeed, that's precisely the point I was wanting to highlight from Regan's classic work. (Note that this is actually a distinct point from systematicity across time: we can imagine a case where agents have reliable knowledge that just this once the other person is following the CU decision procedure.)
