I know this is quite an old post - prescient let's call it - but it had a lot of resonance for me...I've been worrying about AI ethics on a much more micro scale. More how do we at least consider the risk of doing harm while experimenting with AI systems (some thoughts here if you're interested: https://medium.com/@thorn-thee-valiant/the-persona-engine-a-registered-philosophical-simulation-study-94b09fa32f98).
Your concern about suffering echoes that of Thomas Metzinger and the nod towards animal rights and slavery seems relevant. I had always thought that the core message of films like Blade Runner, Westworld and their ilk was that the irony is baked in. The directors, authors etc take it as read that these entities are worthy of moral concern but they're operating in a world that doesn't treat them that way. But many modern debates about AI seem so essentialist that it's redolent of old ideas towards animals or slaves - that they can never be worthy of moral concern by definition. So it's important that at least some people are raising the question of (a) what would it mean if AI systems could suffer, and (b) how do we ameliorate that risk.
Interesting. I did something not dissimilar recently (https://medium.com/@thorn-thee-valiant/can-i-teach-an-llm-to-do-psychotherapy-with-a-json-and-a-txt-file-db443fa08e47) but from a more clinical perspective. Thinking about it I think there are probably two different risks here - the "AI psychosis" concept I'm still a bit sceptical of, that is the idea that LLM interactions cause or are exacerbating psychosis and this is where the concerns about collusion seem to fit in - but there's also the simple risk question, does the LLM adopt the user's frame to encourage or not challenge really dangerous stuff. Obviously there are lots of caveats about using AI generated patients and AI generated scorers in this kind of work. I was amused that you also found GPT-OSS:20B pushed back hard against delusions - as I also found this - but the AI patients didn't respond well to it (which is why clinicians don't usually go all out to directly challenge as it threatens the therapeutic frame - it's slightly different when diagnosing psychosis as you have to test the fixety of the delusions so challenge is needed).