Two critiques of Rethink Priorities’ Moral Weights project

Bill Jackson

Roughly speaking, Rethink Priorities’ Moral Weight Project tries to estimate how intense suffering is in different animals, relative to humans. A moral weight of 1.0 means it is exactly as intense as in humans.

It’s notoriously animal-friendly, e.g. it holds that 14 bees = 1 human. Here are some of the results:

The calculation uses a weighted factor model:

Empirical proxies (60% weight): The animal is evaluated for the presence or absence of a set of cognitive proxies (e.g. object permanence, responses to novelty) and affective proxies (e.g. depression-like behaviour, disgust-like behaviour). The contribution here is essentially the fraction of proxies that are present, where having 100% of them gives a moral weight of 1.0.
Neurophysiological model (30% weight): Uses neuron counts and other neurophysiological data.
Equality model (10% weight): Deliberately assumes equal welfare ranges.
There is also a “probability of sentience” multiplier applied.

It is the “Empirical proxies” that substantively produce the animal-friendly results. “Probability of sentience” and “equality model” are essentially subjective researcher judgements baked into the model. The “Neurophysiological model” does weight large animals highly and small animals lowly, but the model is additive and any moderately small animal gets a neurophysiological weight of ~0, so the effect is just to apply a ~30% discount to any small animal.
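To make the structure concrete, here is a minimal sketch of the weighted factor model as summarized in this post. It is not RP's actual code: the function name, the argument names, and the assumption that each component is a score between 0 and 1 are mine.

```python
def welfare_range(proxy_fraction, neuro_score, p_sentience=1.0):
    """Weighted factor model as summarized in this post (not RP's code).

    proxy_fraction: fraction of empirical proxies judged present (0 to 1)
    neuro_score:    neurophysiological score relative to humans (0 to 1)
    p_sentience:    probability-of-sentience multiplier (0 to 1)
    """
    w_proxies, w_neuro, w_equality = 0.6, 0.3, 0.1
    equality_score = 1.0  # the equality model assumes the human welfare range
    combined = (w_proxies * proxy_fraction
                + w_neuro * neuro_score
                + w_equality * equality_score)
    return p_sentience * combined


# Hypothetical small animal: every proxy present, neurophysiological score ~0.
# The additive form caps it at 0.6 + 0.1 = 0.7 before the sentience multiplier.
small_animal_max = welfare_range(proxy_fraction=1.0, neuro_score=0.0)
```

Because the components are added rather than multiplied, a near-zero neurophysiological score removes only its own 30% share. It cannot pull the other 70% down.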

This post covers two critiques of the “empirical proxies”, each pointing to a way in which they push the results to be overly animal-friendly.

1. Functional analogues: double counting

The whole logic behind using these empirical proxies is the idea of “functional analogues”: if a human shows “depression-like behaviour”, and a chicken shows “depression-like behaviour”, then these are analogous, and the chicken’s behaviour is evidence that it has something like the experience of depression. This is fair enough as far as it goes.

The problem is that the model treats each proxy as independent evidence. A pig scores “Likely Yes” on anxiety-like behaviour, fear-like behaviour, depression-like behaviour, panic-like behaviour, and flexible self-protective behaviour. These are counted as five separate hits. But they’re clearly not independent: they’re five ways of asking “does this animal display negative-valence-indicating behaviours?” A pig that shows fear almost certainly also shows anxiety and panic. Counting each separately inflates the score.

This matters because the model is basically: welfare range = fraction of proxies scored positive. If half your proxies are correlated rewordings of each other, then ticking 30 out of 46 boxes is a lot less impressive than it sounds.
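A toy calculation shows how much the counting rule matters. The five negative-valence proxies are the pig examples above. The cluster labels, the two “cognition” rows, and the averaging rule are hypothetical choices of mine and are not taken from RP's model.

```python
from collections import defaultdict

# (proxy name, cluster, present?)
# Cluster labels and the two "cognition" rows are hypothetical.
PROXIES = [
    ("anxiety-like behaviour", "negative valence", True),
    ("fear-like behaviour", "negative valence", True),
    ("depression-like behaviour", "negative valence", True),
    ("panic-like behaviour", "negative valence", True),
    ("flexible self-protective behaviour", "negative valence", True),
    ("hypothetical cognitive proxy A", "cognition", False),
    ("hypothetical cognitive proxy B", "cognition", False),
]


def flat_score(proxies):
    """Fraction of proxies present, each counted as independent evidence."""
    return sum(present for _, _, present in proxies) / len(proxies)


def grouped_score(proxies):
    """Average of per-cluster fractions, so each cluster counts once."""
    clusters = defaultdict(list)
    for _, cluster, present in proxies:
        clusters[cluster].append(present)
    per_cluster = [sum(hits) / len(hits) for hits in clusters.values()]
    return sum(per_cluster) / len(per_cluster)


flat = flat_score(PROXIES)        # 5 of 7 proxies present
grouped = grouped_score(PROXIES)  # mean of 1.0 and 0.0
```

In this made-up example the flat count gives 5/7 while the clustered count gives 1/2. The more correlated rewordings a cluster contains, the wider that gap becomes.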

But there’s a deeper version of the problem. ALL of the proxies, not just the correlated clusters, load on a single underlying uncertain claim: “behavioural and cognitive functions predict the intensity of subjective experience, even if the process that brings them about varies (e.g. 1000x fewer neurons involved)”. If this claim is wrong, if a bee can show “anxiety-like behaviour” through simple neural circuits with no subjective experience at all, then scoring well on 30 proxies provides no more evidence of welfare capacity than scoring well on 1. This claim is vulnerable to simple reductios, e.g. you could say this box shows “depression-like” behaviour:

RP actually built a “Grouped Proxy Model” that clusters related proxies together, which would partially address within-group correlation. But they excluded it from their final estimates. In any case, the functionalism-at-all argument still applies.

2. Bayesian critique wrt high moral weights in small animals

Black soldier flies have roughly 100,000 neurons vs humans’ 86 billion. And yet, black soldier flies score positively on 12 out of 46 proxies, including communication, personality, cognitive bias, cross-modal learning, depression-like behaviour, fear-like behaviour, and hyperalgesia.

One reaction to this is “wow, even flies might be conscious, we should take their welfare seriously”, i.e. “Don’t Balk at Animal-friendly Results”.

Another reaction is “wow, even flies score highly on these proxies, they must not be very good proxies”.

This second reaction is completely legitimate, and is just a fair application of Bayes’ theorem. If you start with priors on:

Depression-like behaviour predicts sentience (say, 20% chance)
Black soldier flies are sentient (say, 0.01% chance)

Then observing that black soldier flies show depression-like behaviour should update you both towards a higher chance of black soldier flies being sentient, and a lower chance of depression-like behaviour being predictive (there is a key free variable: how likely is an organism to show depression-like behaviour for non-consciousness reasons).
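The two-way update can be sketched numerically. The priors below are the ones stated above. Everything else is an assumption for illustration: that the two priors are independent, and the three likelihood values, including `p_d_if_invalid`, which stands in for the free variable (how often the behaviour appears for non-consciousness reasons).

```python
def joint_update(p_valid, p_sentient,
                 p_d_if_sentient=0.9,       # assumed: proxy valid, animal sentient
                 p_d_if_not_sentient=0.05,  # assumed: proxy valid, animal not sentient
                 p_d_if_invalid=0.5):       # assumed: proxy not predictive at all
    """Return (P(proxy valid | D), P(sentient | D)) after observing behaviour D."""
    joint = {}
    for valid in (True, False):
        for sentient in (True, False):
            prior = ((p_valid if valid else 1 - p_valid)
                     * (p_sentient if sentient else 1 - p_sentient))
            if not valid:
                likelihood = p_d_if_invalid
            elif sentient:
                likelihood = p_d_if_sentient
            else:
                likelihood = p_d_if_not_sentient
            joint[(valid, sentient)] = prior * likelihood
    evidence = sum(joint.values())
    post_valid = (joint[(True, True)] + joint[(True, False)]) / evidence
    post_sentient = (joint[(True, True)] + joint[(False, True)]) / evidence
    return post_valid, post_sentient


# Priors from the text: 20% that the proxy is predictive, 0.01% that flies are sentient.
post_valid, post_sentient = joint_update(p_valid=0.20, p_sentient=0.0001)
```

With these assumed likelihoods, the probability of sentience rises slightly above its prior while the probability that the proxy is predictive falls well below 20%. The direction of the second update depends on `p_d_if_invalid`: if the behaviour were very unlikely to arise for non-consciousness reasons, the observation would instead support the proxy.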

In my view, the fact that very small animals get such high moral weights in the model should be taken as strong evidence that it’s over-weighting these empirical proxies. This combines with the point above: I don’t believe it’s fair to say “but can 30 proxies really be wrong?”, because the 30 proxies are generally loading on the same claim, that “behavioural and cognitive functions predict the intensity of subjective experience, even if the process that brings them about varies (e.g. 1000x fewer neurons involved)”.

Hi Bill. Thanks for engaging with our work. I agree with you that the project had lots of limitations. And as I've said many times, no one should put much stock in the specific estimates we offered. A few additional thoughts:

We don't say that 14 bees = 1 human. We say that if the one and only thing you care about is the intensity of valenced experience (which is not the only thing that most people care about), then, probably, you should think of the value of a life year of bee suffering as being within an order of magnitude of the value of a life year of human suffering. Of course, there are lots of reasons why we might be wrong about that more modest claim. Still, just wanted to clarify.
Totally fair about there being a double-counting problem, which we discuss in section 3.1 here. It's a tough problem to tackle given the functional approach we chose. It's also one reason why, in the work I'm doing now on this topic, I'm interested in aggregating over very different kinds of estimation strategies rather than developing a functional approach in more detail. That being said, I do think it's pretty interesting if it turns out that there's a lot of clustering of intensity-relevant traits. That would be some evidence that welfare ranges don't differ that much, at least in my view.
Fair point re: the impacts of your priors. Of course, it's an open question whether you ought to have such a low prior in insect sentience. FWIW, I think 0.01% is probably overconfident, given how poorly we understand consciousness generally and sentience specifically. (For some additional thoughts on this, see the "Studying sentience" section of this article.) I'll also mention that there's a methodological disagreement here about how to approach this kind of problem. I say just a bit about it in this comment.

Third, I’ve realized that my gut endorses some vague argument like this: Insects just don’t matter. But if they were sentient, they would. So, they must not be sentient.
That, of course, is a bad line of reasoning. We don’t learn facts by consulting our ethical intuitions. And it’s helpful — for me, anyway — to call that out explicitly. When I detach the idea of insect sentience from its moral significance — that is, when I consider the possibility completely isolated from any level of concern for nonhuman pain — it seems much more plausible to me that insects can hurt. And if so, I shouldn’t shy away from that conclusion just because of its possible moral consequences.

I disagree with this, and strongly disagree with your claim that it is obvious.

You are assuming that sentience/ability to suffer is a factual question, when it's actually a moral question and it's perfectly fine to apply moral evidence to moral questions.

The exact argument depends on the definition. If you define suffering as only applicable to moral patients, then the question of whether something is suffering requires determining whether it is a moral patient, which is therefore a moral question. If you define suffering more broadly, then whether it is bad depends on whether it's happening to a moral patient, which again is a moral question.
