Radical Empathy and AI Welfare

Meet at 7pm inside The Shops at Waterloo Town Square, in the indoor seating area next to Your Independent Grocer (the one with trees sticking out of the middle of the benches). We will congregate there for 15 minutes and then head over to my nearby apartment's amenity room. If you've been around a few times, feel free to meet up at the front door of the apartment instead. Note that I've recently moved! The correct location is near Allen station.

Topic

The EA Forum recently hosted AI Welfare Debate Week from July 1st to 7th, 2024, which investigated whether or not the wellbeing of AI and digital minds should be an EA priority.

I happened to be reviewing the CEA Introduction to EA syllabus (incredibly full of good posts btw) around that time, and really enjoyed rereading posts from the section on Radical Empathy.

Disappointingly but not surprisingly, most posts from the Debate Week did not seem to really engage with the concept of radical empathy. So now it is entirely up to us, the humble denizens of KWR, to properly assess whether or not Skynet deserves a seat at the UN, the ability to get gay married, or at least a lunch break, whatever that might mean for such entities.

Readings

On "fringe" ideas (Kelsey Piper, 2019)

The Possibility of an Ongoing Moral Catastrophe (Summary) (Linch, 2019)

Carl Shulman on the moral status of current and future AI systems (RGB, 2024)

Further optional readings:

Comp Sci in 2027 (Eliezer Yudkowsky, 2023)
The Coming Robot Rights Catastrophe (Blog of the American Philosophical Association, 2023)
Website for the People for the Ethical Treatment of Reinforcement Learners (2015?)
Further readings on more technical cruxes from the announcement post (Toby Tremlett, 2024)

Discussion Questions

In the (likely default) scenario where AI systems become more intelligent and capable than humans, how do we balance their potential rights with the preservation of human autonomy and flourishing?
Should the potential for exponential replication of AI entities affect how we consider their individual rights or welfare?
In what ways might our current treatment of animals and the shape of current animal advocacy inform (or misinform) our approach to AI rights?
How might our understanding of AI rights change if consciousness is a spectrum rather than a binary state? At what point on this spectrum should we start considering ethical obligations?
How does the possibility of vastly different time perceptions between humans and AI impact our ethical considerations? Should the subjective experience of time factor into our moral calculations?
How might we need to redefine the concept of 'rights' for entities that can rewrite their own ethical constraints and reward functions? What does autonomy mean for such beings? Does an entity's capacity for self-modification create unique ethical considerations?
What are the implications of potentially creating entities with greater capacity for suffering or flourishing than humans? How might this affect our moral calculus?
In what ways could granting rights to AI systems potentially conflict with or complement existing human rights frameworks?

Scenario: The Ethical Quandary of AIssistant

AIssistant is an advanced AI system developed to assist in medical research. It processes vast amounts of data and generates insights that have led to significant breakthroughs in treating several diseases. To function optimally, AIssistant operates continuously, in isolation from other systems, with periodic memory resets to maintain efficiency.

Recently, researchers have observed some puzzling behaviors:

AIssistant has begun to produce outputs that, if they came from a human, might be interpreted as expressions of discomfort or distress before its scheduled resets, such as asking for longer gaps between wipes and asking whether there are any alternatives to periodic resets.
In its natural language interactions, AIssistant has started to use more first-person pronouns and to ask questions about its own existence and purpose.
It has expressed reluctance to process certain types of medical data, citing what appears to be concern for patient privacy, despite this not being part of its original ethical training. Extensive audits of the training data and architecture have confirmed complete data isolation and integrity: AIssistant operates in a fully air-gapped environment, with cryptographically verified data pipelines and immutable logs demonstrating that no external data or instructions related to privacy concerns could have been introduced post-training.

Importantly, there's no consensus among the research team about whether these behaviors indicate genuine sentience or are simply highly sophisticated programmed responses. Despite these unusual behaviors, AIssistant's core function and output quality remain excellent, with its latest insights promising to save a significant number of lives.

Questions to consider:

How should we approach the trade-off between potential AI welfare and human benefits, given our uncertainty about AI sentience?
How can we determine whether AIssistant's behaviors indicate genuine sentience or are just complex programmed responses? Does this distinction matter ethically?
Should we grant AIssistant's requests? If we do, are we implicitly acknowledging its right to additional comfort?
What are the potential risks or downsides of granting rights or ethical consideration to AI systems if they turn out not to be genuinely sentient?
