Deontology and virtue ethics as "effective theories" of consequentialist ethics

Jan_Kulveit

Note: this was written months ago, before the recent FTX events - so the post can’t be read as a specific (over-)reaction to the FTX fiasco. I do think there is a weak connection: some blogs and comments by the FTX leadership provide some evidence that the sort of bounded consequentialism / local consequentialism criticised in this post was part of what went wrong at FTX, or at least part of the rationalisations. This makes me a bit more confident that if people are making updates from the FTX events, the arguments in this post are at least directionally sane.

This is a rough attempt to present deontology and virtue ethics as approximate, effective theories ^[1] of consequentialist morality, for agents severely bounded in their ability to determine the consequences of their actions, and delegating actions to future selves. I think this view could help effective altruists to avoid some problems caused by what we may call bounded consequentialism or local consequentialism. None of this is very original: you can get some of the same intuitions from the works of Derek Parfit and Toby Ord among others, but it seemed worth writing down a short, less formal synthesis of my current views.

The main claims I want to make are:

Because of boundedness, deontology and virtue ethics act as effective theories of consequentialism
Where attempts at consequentialist calculations diverge from virtue ethical and deontic intuitions, it's often because the consequentialist calculation is mistaken
If you're strongly consequentialist, because of boundedness and delegation, you should take deontic and virtue ethical intuitions pretty seriously - as that will lead to better consequences

In the post, I:

Set out how deontology and virtue ethics can be seen as effective approximations of idealised, unbounded consequentialism, given we are just humans
Offer some metaphors from physics for understanding the relationship between different ethical theories
Give some real-life examples of what I see as bounded consequentialism in effective altruism, relating to fire alarms, demandingness, outreach to young people, and optimization of social interactions

Consequentialism

Consequentialism is the view that the right thing to do is always the action that will produce the best consequences.

Act consequentialism - where an actor directly chooses actions based on an assessment of the consequences - works straightforwardly in situations where it is possible to compute the consequences of actions, and we are able to estimate the value of the consequences. A textbook example would be selecting between a small number of lotteries, which have different probabilities of leading to different amounts of money. Another example would be a case of triage, where a paramedic has to choose between several casualties with different survival probabilities.

The direct act-based decision procedure becomes less tractable when the action space is large, when we lack complete information about the world, and when our ability to predict consequences is limited. Note that this is usually the situation we find ourselves in in practice.

Deontology

Classically, deontology is the view that some actions are intrinsically right or wrong, regardless of the consequences. It holds that there are certain things that we ought never to do, even if doing them would lead to good consequences. For example, it is always wrong to torture an innocent person, regardless of whether or not doing so would lead to some greater good.

It has been well argued already that some versions of rule-based consequentialism - where instead of assessing the consequences of individual actions, actors follow rules which are selected for their average consequences - are equivalent to deontology, in the sense that both theories will recommend the same actions.

In my view, the deontological approach to ethics as a whole can often be seen as an approximate effective consequentialist theory in the following way:

The consequences of individual actions are often hard to compute, and hard to experimentally verify.
The consequences of a large number of applications of a rule are easier to compute, and easier to experimentally verify.
So deontology can be seen as a method of reducing the space of possible actions by eliminating some actions which are likely to have bad consequences, either on the basis of past data, or on the basis of considerations about the consequences of widespread application of the rules.

Within a given ethical system, it is possible that some deontological rules will have worse consequences in some cases than a direct consequentialist approach. But, in general, deontological rules are much more likely to have good consequences than the actions they rule out.

The normative consequentialist claim is something like this: because decision making based on rules is easier to compute and the effects of rules are easier to verify, a deontic decision procedure is often the optimal computation to run under some combination of bounds on computation, information, rationality and cluelessness.
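
The claim above can be illustrated with a toy simulation. This is only a sketch under made-up assumptions, not a model taken from any source: the hypothetical function `average_value_of_case_by_case_violation`, the distribution of true values (breaking the rule is bad on average, but sometimes good) and the noise levels are all invented for illustration. An agent who follows the rule always scores 0 here. An agent who evaluates each case separately breaks the rule whenever their noisy estimate of the consequences looks positive.

```python
import random


def average_value_of_case_by_case_violation(noise_sd, n_decisions=20000, seed=0):
    """Average true value per decision for an agent who breaks a rule
    whenever their noisy estimate says that breaking it is net positive.

    All parameters are illustrative assumptions. A pure rule follower
    never breaks the rule and therefore always scores 0 in this model.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_decisions):
        # Assumed: breaking the rule is harmful on average (mean -1),
        # but genuinely beneficial in a minority of cases.
        true_value = rng.gauss(-1.0, 1.0)
        # The bounded agent only sees the true value through noise.
        estimate = true_value + rng.gauss(0.0, noise_sd)
        if estimate > 0:
            total += true_value
    return total / n_decisions


if __name__ == "__main__":
    for noise_sd in (0.1, 5.0):
        value = average_value_of_case_by_case_violation(noise_sd)
        print(f"noise_sd={noise_sd}: case-by-case agent scores {value:.3f}, rule follower scores 0")
```

Under these assumed parameters, the nearly unbounded agent (small noise) should do somewhat better than the rule follower, while the heavily bounded agent (large noise) should do clearly worse. The sketch only shows that the ranking of decision procedures can flip with the amount of noise; it says nothing about how noisy real moral estimates are.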

Briefly: something like non-fanatical deontology seems like an effective first order approximation of consequentialism in the regime of bounded rationality at human level and moderate cluelessness.

Very informal paraphrase: in practice, if you are violating some clearly sensible deontic constraint, such as 'don't lie', and justifying it with some sort of clever consequentialist argument, it's likely you are more confused and less smart than you think.

Virtue ethics

Virtue ethics is the view that our actions should be motivated by the virtues and habits of character that promote the good life. Virtues are not intrinsically right or wrong; they are attributes that we should strive to cultivate. In classical virtue ethics, the virtuous life is seen as the purpose of humans, and of intrinsic value.

Taking honesty as an example virtue, we should strive to be honest, even if being dishonest would lead to some greater good.

In my view, virtue ethics can also often be seen as an approximate effective theory of consequentialism in the following way:

Realistically, the action we take most of the time with our most important decisions is to delegate them to other agents. This is particularly true if you think about your future selves as different ("successor") agents. From this perspective, actions usually have two different types of consequences - in the external world, and in "creating a successor agent, namely future you".

Virtue ethics tracks the consideration that the self-modification part is often the larger part of the consequences.

In all sorts of delegations, often the sensible thing to do is to try to characterise what sort of agent we delegate to, or what sort of computation we want the agent to run, rather than trying to specify what actions the agent should take. ^[2] As you are delegating the rest of your life to your future selves, it makes sense to pay attention to who they are.

Once we are focusing on what sort of person it would be good for us to be, what sort of person would make good decisions for the benefit of others as well as themselves, what sort of computations to select, we are close to virtue ethics. (Or at least to what I would call virtue ethics.)

Relations between the approximate theories

Often, people try to understand the relationships between the theories with the help of "credences", or of philosophical arguments. Here I’m going to offer a different approach: using physics as a metaphor for understanding the relationship between theories. The short version: theories often exist alongside one another.

Pareto-optimality under computational constraints

In contrast to the oversimplified view sometimes presented in introductory history of science courses, it is often not the case that newer theories or paradigms wholly replace older theories. Older theories often stay pareto-optimal under particular computational constraints.

For example, even in predicting the motion of molecules and atoms, quantum mechanics has not fully replaced Newtonian mechanics for practical applications. While QM provides deeper understanding, in practice so-called "force field" simulations are common, and are basically Newtonian mechanics. Why do we use them, when we have had QM for more than a century? Because these simulations are pareto-optimal on some combination of system size, time, precision, effects studied, and available computational budget.
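
A minimal sketch of what "pareto-optimal" means here, assuming each theory can be summarised by just two numbers: the computational cost of using it and the error it makes. The `Theory` type, the candidate names and every number below are hypothetical and unitless, chosen only to illustrate the definition. A theory stays on the frontier if no other theory is at least as cheap and at least as accurate, and strictly better on one of the two.

```python
from typing import List, NamedTuple


class Theory(NamedTuple):
    name: str
    cost: float   # assumed computational cost (illustrative, unitless)
    error: float  # assumed prediction error (illustrative, unitless)


def dominates(a: Theory, b: Theory) -> bool:
    """True if `a` is no worse than `b` on both axes and strictly better on one."""
    no_worse = a.cost <= b.cost and a.error <= b.error
    strictly_better = a.cost < b.cost or a.error < b.error
    return no_worse and strictly_better


def pareto_front(theories: List[Theory]) -> List[str]:
    """Names of the theories that no other theory in the list dominates."""
    return [
        t.name
        for t in theories
        if not any(dominates(other, t) for other in theories)
    ]


# Made-up numbers: cheaper theories are less accurate, except for one
# deliberately bad candidate that is both costly and inaccurate.
CANDIDATES = [
    Theory("Newtonian force field", cost=1, error=10),
    Theory("hybrid quantum/classical model", cost=100, error=3),
    Theory("full quantum mechanics", cost=10000, error=1),
    Theory("hypothetical clumsy model", cost=500, error=12),
]

if __name__ == "__main__":
    print(pareto_front(CANDIDATES))
```

In this toy setup the older, cheaper theory stays on the frontier even though a more accurate theory exists, because nothing beats it at its own cost. Only the deliberately clumsy candidate drops out.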

Perturbative expansions

Another common relation between theories in physics is perturbative expansion: the harder-to-compute theory is used to generate some modifications or approximations, which are then added on top of the easier-to-compute theory in borderline cases. For example, general relativity is used to compute small deviations from Newton's law, and this modified Newtonian theory is then used to do calculations.

Limiting cases

Sometimes, the relation between theories in physics is that one theory can be understood to be a special or limited case of another theory, for example a low-energy limit, high-energy limit, or classical limit. In these cases, the theory covering more of the territory can sometimes tell us when we can safely use the simpler, limiting case theory.

Using the physics metaphor to reconcile ethical theories

I suggest attempting to understand the relations between deontology and virtue ethics on the one hand, and consequentialism on the other, in a similar way.

In the same way that different theories in physics are pareto-optimal under different computational constraints, system sizes and complexities, deontology and virtue ethics can be pareto-optimal under different computational constraints, decision problem sizes and environmental complexities.

A lot of philosophising about ethics deals with extreme cases, improbable thought experiments, and situations that "stretch ethical theories to the limits of their validity." This is often seen as a way to refute a theory or point out its weaknesses. From the perspective of the physics metaphor, we can see these edge cases not as refuting moral theories, but as either suggesting modifications (as when harder-to-compute theories in physics are used to generate modifications or approximations which are added on top of the easier-to-compute theory), or illustrating the limits of a moral theory’s validity (as when a new theory in physics shows that an older theory is actually a limit case, and suggests where the old theory still applies).

Applying this metaphor to virtue ethics, deontology and consequentialism puts consequentialism in the role of the newer/harder-to-compute/more complete theory.

But note the other side of this view: in practice, even when you assume that consequentialism is never outside of its domain of theoretical validity, the normative theory often suggests that bounded and clueless agents should be doing something other than directly attempting to predict and compare consequences. To take another physics example: if you want to direct gunfire using a pocket calculator, your best bet is Newtonian mechanics. In practice, attempts to do the numerical calculations directly from the quantum level would fail horribly. Consequentialism understood correctly actually warns you against this sort of calculation.

This view also points to typical mistakes that bounded or local consequentialists are prone to:

Mistakes from the shallow depth of evaluating plans.
Mistakes from not understanding the action space: when deliberative resources are spent on comparing just a few options from a huge action space.
Delegation failures: mistakes when instead of trying to specify virtues of a delegate, the naive consequentialist attempts to specify what actions should be taken.

A conflict between local consequentialist intuitions and virtue ethics and/or deontic rules often points to the fact that the local consequentialist calculation is actually mistaken, and that a different decision procedure should be used.

Examples in practice

How does this relate to a range of practical problems in current effective altruism?

Fire alarms prompting people to suggest rule-violating and unvirtuous acts

Multiple recent advances of machine learning, coupled with sceptical takes on AI alignment progress by Eliezer Yudkowsky, prompted some people to suggest various extreme actions which would be both unvirtuous, and rule-violating under most deontic considerations. While I won't link to specific public proposals, you can, for example, imagine actions in the space of "sabotaging the work of others".

In multiple cases here, the conflict between the alleged consequentialist value of the suggested acts, and virtuous or rule-abiding behaviour, is resolved just by reflecting for a longer time. When a less bounded consequentialist starts to consider the impacts of such non-virtuous actions on coordination, epistemics, responses from other actors in the space… the actions seem clearly bad bets. In these cases, the fact that a local bounded calculation comes apart from virtuous or deontic intuitions should be taken as an indication that the calculations may be incorrect; and just acting upon virtuous or deontic intuitions in the first place would be a better approximation of consequentialist action.

(It's probably worth noting that Eliezer Yudkowsky himself understands this and warned readers of his post against non virtuous, confused consequentialist actions in strong words.)

Demandingness objection

In a recent criticism of effective altruism, Michael Nielsen eloquently raises a concern about the principle of "doing the most good" leading to unhealthy outcomes when taken very seriously, such as people considering whether to give up having children, converting ice cream purchases to lives saved, or working in an inadequate environment.

Again, to me, these unhealthy outcomes seem like problems of local consequentialism. There are at least two things going on here:

Often the action space is huge and there's lots of uncertainty about which actions are best. Even if you are willing to sacrifice your happiness for the greater good, in practice it's likely that there will be many actions in the intersection between 'could have really good consequences' and 'makes me happy'. Choosing actions from the 'makes me unhappy' part of the space often seems confused given large amounts of uncertainty, and likely negative second-order effects on your future selves.
We don't control our minds. Ignoring the things you actually want, care about, feel motivated by (like children or friends or art) seems empirically to make people miserable, which in turn seems to lead to bad consequences overall.

Noticing the tension between these demanding lifestyles and virtues like moderation or wholesomeness can be read as a warning that there may be mistakes in the naive consequentialist calculation.

Naive over-optimization of social interactions

A local consequentialist group organiser may be prone to thinking about people as means to their own impact, and optimising by, for example, allocating their time based on the estimated increase in the likelihood that the person will become a 'highly engaged EA' per amount of effort spent on persuasion.

There are various problems with this approach from a deontological perspective, but a clear one is that the approach is not stable upon becoming public: some of the people you would want to talk to the most would not agree to such a conversation if they understood that this was the motivation.

From a virtue perspective, the group organiser should be worried if they are on the path to turning themselves from a thinker into an idea salesperson.

Outreach to young people

Some EA-motivated efforts targeting young people seem to me to be based on local consequentialist calculations. Based on an intuition like "we will need more ML engineers in 10 years", the aim seems to be to motivate young people to move to such careers early.

This seems problematic from the perspective of virtue ethics, and also conflicts with deontic constraints such as "not considering humans as just means to your impact".

What might this tension point to? Plausibly, again, a problem with cluelessness. I don’t know what will happen in 10 years’ time. If someone asked me to propose concrete effective altruistic actions for them to take in 2032, I would be pretty stuck. Naive consequentialist calculations may make approaches like this seem valuable - but the world is a sufficiently complex system that it seems hubristic to actually act upon them.

Approaching people in a more virtue ethical way seems much more robust to me:

I would be much more confident in advising someone about which virtues it would be good for them to cultivate - for example scale-sensitivity, ability to think clearly, getting better at noticing inadequacies…
Virtue-based approaches have a much better track record: decisions made by scale-sensitive people attempting to think clearly now seem much better than their speculations about specific actions from 10 years ago.

Personal addendum

This part was written post FTX events.

In my personal experience, what's often hard about this is that the confused bounded consequentialist actions don't feel like that from the inside. I was most prone to them when my internal feeling was closest to "consequentialism is clearly the only sensible theory here".

For this reason, I also dislike the philosophical vocabulary using the term naive consequentialism for all related problems: the word naive subtly suggests that if you are smart, and spent a few hours reasoning on something, surely you are not naive? The reality is much worse - in some sense, all consequentialism running on human brains is quite bounded, and only rarely do humans see the contours of what the true, non-naive consequentialism would imply.

So is dropping direct consequentialism and embracing deontology and virtue ethics the answer? Unfortunately, while in places this post advocates giving them a lot of weight, these theories are also not the answer. We are in a state where everything has serious limits or is too hard to compute.

Also: we don't have some nice, simple and easy to explain meta-theory even for the effective theories of physics, and the current state of theorizing about moral uncertainty is much less developed than that. In physics, we at least have many links explaining some effective theories as limit cases of stronger theories - however, in practice, physicists also rely on a lot of implicit reasoning in decisions about what you can ignore, and what models to use, and this is also likely the current best option for ethics.

Thanks Rose and Gavin for help with editing various versions of this text.

^[1]
I'm using effective theory as in physics: an effective theory is a tool used to handle situations where full knowledge of a phenomenon is not available, but where partial knowledge is sufficient to make useful predictions. An effective theory may be thought of as an approximate model of the underlying full theory. It is usually simpler than the full theory, and contains only the degrees of freedom and interactions that are relevant to the problem at hand.
^[2]
Let's take a simple example of a car repair shop. If I arrive there with the problem "the car has started making strange noises", I probably have the best chance of a good result if I can specify the virtues of the mechanic.
Since I don't understand the car, I can't describe what actions the mechanic should take.
I can't even describe well the "goal" of the repair: for example, the specification "remove strange noises" could be met by a reckless mechanic by removing the part of the brake that is causing the noise.

Jonathan Moregård (2y)

I really enjoyed your "successor agent" framing of virtue ethics! There are some parts of the section that could use clarification:

Virtue ethics is the view that our actions should be motivated by the virtues and habits of character that promote the good life

This sentence doesn't make sense to me. Do you mean something like "Virtue ethics is the view that our actions should be motivated by the virtues and habits of character they promote" or "Virtue ethics is the view that our actions should reinforce virtues and habits of character that promote the good life"? It looks like two sentences got mixed up.

"Virtues are not intrinsically right or wrong;"

I get confused by this statement. I think of virtue ethics as putting all moral value onto the way you are training yourself to act. Virtue is the sole Good etc. Can you clarify what you mean here?

"Taking honesty as an example virtue, we should strive to be honest, even if being dishonest would lead to some greater good"

I guess you mean "lead to consequences that would be better according to a consequentialist perspective". When discussing different views on ethics the term "good" gets overloaded.

Jan_Kulveit (2y)

Virtue ethics is the view that our actions should be motivated by the virtues and habits of character that promote the good life
This sentence doesn't make sense to me. Do you mean something like "Virtue ethics is the view that our actions should be motivated by the virtues and habits of character they promote" or "Virtue ethics is the view that our actions should reinforce virtues and habits of character that promote the good life"? It looks like two sentences got mixed up

Sorry for the confusion. I tried to paraphrase what classical virtue ethicists believe, in my view.

For clarity, this is how I interpret it in a computationalist way: virtue ethics focuses on the properties of decision procedures leading to actions, and takes them as the central object of theory. "Action is good so far as it was produced by a good(=virtuous) computational procedure + reinforces the good computations". Where the focus is on the computations.

The philosophy encyclopedia states .... virtue ethicists will resist the attempt to define virtues in terms of some other concept that is taken to be more fundamental. Rather, virtues and vices will be foundational for virtue ethical theories and other normative notions will be grounded in them.

"Virtues are not intrinsically right or wrong;"
I get confused by this statement. I think of virtue ethics as putting all moral value onto the way you are training yourself to act. Virtue is the sole Good etc. Can you clarify what you mean here?

Again, it's me trying to paraphrase what I believe classical virtue ethicists believe.

My interpretation of the claim is this: in the previously described computationalist paraphrase, you may be left wondering how you decide which properties of the computations make them good. You have an easy option to ground it in outcomes, consequentialist style. But as I understand it, the classical claim is that you try to motivate it purely "intrinsically": your goal is to design the best possible successor agent ... and that's it. You evaluate the properties of the computations using that. All other forms of "good", such as good outcomes, will follow.

My personal take is this leaves virtue ethics partially under-defined.

"Taking honesty as an example virtue, we should strive to be honest, even if being dishonest would lead to some greater good"
I guess you mean "lead to consequences that would be better according to a consequentialist perspective". When discussing different views on ethics the term "good" gets overloaded.

Yes.

Jan_Kulveit (10mo) - Review for 2022 Review

My current view is this post is decent at explaining something which is "2nd type of obvious" in a limited space, using a physics metaphor. What is there to see is basically given in the title: you can get a nuanced understanding of the relations between deontology, virtue ethics and consequentialism using the frame of "effective theory" originating in physics, and using "bounded rationality" from econ.

There are many other ways to get this: for example, you can read hundreds of pages of moral philosophy, or do a degree in it. The advantage of this text is that you can take a shortcut and get the same using the physics metaphorical map. The disadvantage is that understanding how effective theories work in physics is a prerequisite, which considerably narrows the range of people to whom this is useful, and limits the broad appeal.
