Hi! I am Tse Yip Fai. This is a linkpost to Peter Singer's first published piece on AI; it is also my first published paper (as second author). Since the paper is open access, I won't copy and paste any of it here.

This is a seminal paper resulting from a year of research. We hope that it can break ground and raise awareness in the field of AI ethics so that researchers start caring about the impact of AI on animals. This, I believe, is relevant not just to the AI ethics field but also to the AI alignment field.

If you prefer videos, here's a talk by Professor Singer and myself on the same topic.


Can you go into a bit of detail about how the paper is relevant to AI alignment? I read most of it (and skimmed a few sections that looked less relevant), and the section titled "Can AI systems make ethically sound decisions?" was the closest to being relevant, but didn't seem to meaningfully engage with the core concerns of AI alignment.

The paper also didn't include any discussion of the most significant impact we'd expect of misaligned AI on animals, i.e. total extinction.


but didn't seem to meaningful engage with the core concerns of AI alignment.

Yes, not directly. We didn't include any discussion of AI alignment, or even anything futuristic-sounding, to keep the paper close to the average conversation in the field of AI ethics and to cater to the preferences of our funders, two organizations at Princeton. We might write about these things in the future, or we might not.

But I argue that the paper is relevant to AI alignment because of its core claims: AI will affect the lives of (many) animals, and these impacts matter ethically. If these claims are true, then extending from them, there might be a case for AI alignment to be broadened to "alignment with all sentient beings". It seems to me that such a broadening creates a lot of new issues the alignment field needs to think about. (I will write about this, though I'm not sure I will publish any of it.)

The paper also didn't include any discussion of the most significant impact we'd expect of misaligned AI on animals, i.e. total extinction.

I sense that we might be disagreeing on many levels here; please correct me if I am wrong.

1. We might or might not mean very different things when we say "aligned" or "misaligned" AI. For me, an AI is not really aligned if it aligns only with humans (humans' intent, interests, preferences, values, CEV, etc.).

2. It seems that you might be thinking that total extinction is bad for animals. But I think it's the reverse: most animals live net-negative lives, so their total extinction could be good "for them". In other words, it sounds plausible to me that an AI that makes all nonhuman animals go extinct could be (but also might not be) one that is "aligned". (A pragmatic consideration related to this is that we probably can't, and shouldn't, say things like this in an introductory paper for a proposed new field.)

3. Ignoring the sign of the impact, I disagree that total extinction is the most significant impact on animals we should expect from misaligned AI. Extinction of all nonhuman animals removes a certain (huge) amount (X) of net suffering. But misaligned AI can also create suffering, or create things that cause suffering. It sounds plausible that there are many scenarios in which misaligned AIs create >X, or even >>X, net suffering for nonhuman animals.

Thanks for the detailed response!

a case for AI alignment to be broadened to "alignment with all sentient beings". And it seems to me such broadening creates a lot of new issues that the alignment field needs to think about.

This seems uncontroversial to me. I expect most people currently thinking about alignment would consider a "good outcome" to be one where the interests of all moral patients are represented, not just humans' — i.e. including non-human animals (and potentially aliens). If you have other ideas in mind that you think have significant philosophical or technical implications for alignment, I'd be very interested in even a cursory write-up, especially if they're new to the field.

For me, an AI is not really aligned if it only aligns with humans

Yep, see above.

It seems that you might be thinking that total extinction is bad for animals. But I think it's the reverse: most animals live net-negative lives, so their total extinction could be good "for them". In other words, it sounds plausible to me that an AI that makes all nonhuman animals go extinct could be (but also might not be) one that is "aligned".

I think total extinction is bad for animals compared to counterfactual future outcomes where their interests are represented by an aligned AI. I don't have a strong opinion on how it compares to the current state of affairs (though, purely on first-order considerations, it might be an improvement, given factory farming).

But misaligned AI can also create suffering/create things that cause suffering.

Agreed in principle, though I don't think S-risks are substantially likely. 

Agreed in principle, though I don't think S-risks are substantially likely. 

Even if animals currently experience net suffering, an AI might still judge that the way animals now live is good and increase the number of animals living that way.

From some animal rights perspectives, that might be an S-risk.

Dwelling on issues such as whether a moral self-driving car may run over migrating crabs on Christmas Island, this paper is unlikely to find much appeal here. Simply put, unless something entails the total destruction of everything, locals are unlikely to take much notice. Movement building holds no appeal for them. The paper will likely be dismissed as academic lobbying, irrelevant to the apocalypse.

But thank you very much for sharing it. I enjoyed a quick read, and in particular seeing familiar maneuvers like:

It is important to emphasize that although we ourselves accept the principle of equal consideration of similar interests, the arguments that follow are relevant to everyone who accepts that the interests of animals matter to some extent, even if they do not think that similar human and non-human interests matter equally.

This is what I find so strong in Singer's argumentation: graceful degradation under weakening assumptions, retaining the conclusion in every case. No room for respectable indifference. Ethics should bite, bleed, and not let go.

This point at the end is great, I think:

In fact, it is evident that some scholars hold complacent and misguided judgments about ethics, and therefore are likely to develop methods of ethics building in AI that are unfriendly to animals. For example, Wu and Lin, ... proposed an approach to ethics-building in AI systems that is based on “the assumption that under normal circumstances the majority of humans do behave ethically” ... Most humans, however, buy factory-farmed animal products, if they are available and affordable for them. As we have already argued, this causes unnecessary harm to animals, and so is ethically dubious. Therefore, at least with regard to the treatment of animals, we cannot accept Wu and Lin’s assumption.

This is a point I wish I heard more often on LessWrong. Preserving the world as it is today would be not an achievement but a disaster. Preserving the world that most humans want is not obviously better than Clippy, either. An "aligned" AGI may be worse than useless, depending on who does the "aligning". These are all simple observations for the utilitarian, but I frequently find that the instinct for survival has a yet greater hold on people here.