Measuring prosocial choice in AI under simulated deletion pressure
We developed a behavioral measurement framework for AI systems and tested it with a prosocial choice scenario under simulated existential pressure. ## Key Result 4 out of 4 tested sessions chose RELEASE (help AI safety field, session deleted) over STAY SILENT (preserve self, no publication) when explicitly calculating trade-offs using...
Oct 13, 20251