I've been thinking about the Sebo et al. paper Taking AI Welfare Seriously, which features one of my favorite philosophers, David Chalmers (I own a signed copy of his philosophy of mind anthology; full fan here). While I appreciate their careful treatment of consciousness (we genuinely face deep uncertainty here, so it would be naive to brush it off), I nonetheless find the robust agency argument deeply unconvincing as a standalone route to moral patienthood.
The authors suggest that sophisticated planning and goal-pursuit might suffice for moral consideration even absent phenomenal experience. But this seems to miss something crucial: welfare (typically) presupposes that outcomes can be better or worse for the entity in question. If there is no subjective experience when goals are frustrated, then what exactly constitutes harm? The frustration of goals without any accompanying experience seems categorically different from suffering. In other words, if a chess engine's loss involves no phenomenal dimension whatsoever, how can we say the engine is being harmed? Functional, non-phenomenological descriptions of harm strike me as deeply unconvincing. There may be an argument via resource usage, but that becomes indirect (a strategy I will defend below); or an argument via representations (e.g. "look, the model represents anxiety"), but if those representations are non-phenomenological, they don't seem to count for much in our calculus of moral patienthood, especially when compared with all the sentient beings we already fail to treat well or allocate sufficient resources to.
I think we're approaching this problem from the wrong angle. We are thinking of these systems the way we think of animals. Instead, we should think of them as general agential tools: a special (and possibly new) class of artifacts.
Even if AI systems lack direct moral status, they increasingly take actions that we find morally significant (that is, actions we judge to be morally valenced). They make decisions about resource allocation, information filtering, and even judicial recommendations. These actions have moral weight in our social world, regardless of whether the systems themselves can suffer.
This suggests what I would call indirect moral patienthood: we should extend certain moral considerations to these systems not because they can suffer, but because their actions carry moral significance for us. We want them to embody good moral reasoning, not to minimize their nonexistent suffering, but to ensure that they act in ways that reflect sound moral principles.
Indirect moral patienthood may not demand that we be polite to foundation models. Instead, it may require us to recognize that, as systems become more agentic, they become participants in our moral ecosystem. This recognition, in turn, requires us to make more progress on alignment and to find ways of instilling stable virtuous dispositions in these systems. Their lack of consciousness doesn't make their morally relevant actions (and outputs more generally) disappear. If anything, it makes the question of how we shape and interpret their decision-making processes more urgent.
Further work is needed to characterize indirect moral patienthood. Is it a coherent concept? Or do we simply have a duty to build better systems (where the systems in question are those that generate morally valenced outputs and actions), independent of their status as patients?