tbs — LessWrong

Functional Emotions and The Pope’s Encyclical on AI — Digital Minds Newsletter #3

Welcome back to the Digital Minds Newsletter, your curated guide to the latest developments in AI consciousness, digital minds, and AI moral status. If you enjoy this newsletter, please consider sharing it with others who might find it valuable, and send any suggestions or corrections to digitalminds@substack.com. Will, Mitch, Bradford,...

Jun 22

The Vatican, AI Legal Personhood, and Claude’s Constitution — Digital Minds Newsletter #2

by lucius and tbs

Welcome back to the Digital Minds Newsletter, your curated guide to the latest developments in AI consciousness, digital minds, and AI moral status. If you enjoy this newsletter, please consider sharing it with others who might find it valuable, and send any suggestions or corrections to digitalminds@substack.com. – Will, Lucius,...

Mar 19

Model weight preservation

Anthropic recently committed to preserving model weights.[1] They also committed to interviewing models about their development and deployment and documenting model preferences about these matters. Anthropic’s announcement registers a range of motivations, including mitigation of safety risks in relation to observed shutdown avoidance behaviors, mitigation of model welfare risks, and...

Mar 113

Digital Minds in 2025: A Year in Review

Welcome to the first edition of the Digital Minds Newsletter, collating all the latest news and research on digital minds, AI consciousness, and moral status. Our aim is to help you stay on top of the most important developments in this emerging field. In each issue, we will share a...

Dec 19, 2025

AI self-replication roundup

I’ve been reading and thinking about AI self-replication for an AI safety project I’ve been working on with a collaborator. I think the topic is little explored, plausibly important given the aim of reducing catastrophic risks posed by AI development, fascinating, and ripe for empirical investigation. This post rounds up...

Dec 12, 2025

Preference gaps as a safeguard against AI self-replication

Executive summary: AI self-replication is an emerging risk. We provide background on it, survey recent work, and explain how it interacts with other risks from advanced AI. We propose a safeguard against AI self-replication: train agents to have preferences only between outcomes with the same...

Nov 26, 2025

Highlights from our digital minds forecasting survey

We recently released “Futures with Digital Minds: Expert Forecasts in 2025”, which asked 67 experts whether, when, and how digital minds might be created. We defined digital minds as computer-based systems with the capacity for subjective experience. The report contains a summary of key findings....

Sep 16, 2025