Abstract: Several major technology companies have announced plans to operate AI data centers in orbit. Elon Musk recently claimed: “the lowest-cost place to put AI will be space […] within two years, maybe three.” If a meaningful fraction of new AI compute really is placed in space within a few...
Updated: Jan 16, 2026. Digital minds are artificial systems, from advanced AIs to potential future brain emulations, that could morally matter for their own sake, owing to their potential for conscious experience, suffering, or other morally relevant mental states. Both cognitive science and the philosophy of mind can as yet...
Introduction: I’m excited by deception probes. When I mention this, I’m sometimes asked, “Do deception probes work?” But I think there are many applications of deception probes, and each application will require probes with different properties. That is, whether a deception probe works will depend on what you’re using it for....
Recently, Apollo trained some deception probes (Goldowsky-Dill et al.). A deception probe is a logistic classifier on the AI's internal activations that indicates whether a token belongs to a deceptive response. We benchmarked these deception probes, testing them across five datasets of strategic deception and comparing them to black-box monitoring. We...
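
To make the definition of a deception probe above concrete, here is a minimal sketch of a logistic classifier over per-token activations. Everything specific in it is an assumption for illustration: the activations are synthetic Gaussian vectors rather than real model internals, the hidden size `d_model = 16`, the plain gradient-descent training loop, and the mean-over-tokens aggregation in `response_score` are not taken from the original work, and the helper names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping logits to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def train_probe(acts, labels, lr=0.1, steps=500, l2=1e-3):
    """Fit a logistic-regression probe with plain gradient descent.

    acts:   (n_tokens, d_model) array of activations (synthetic here).
    labels: (n_tokens,) array, 1 if the token is from a deceptive response.
    """
    n, d = acts.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        p = sigmoid(acts @ w + b)
        # Gradient of the mean cross-entropy loss plus an L2 penalty on w.
        grad_w = acts.T @ (p - labels) / n + l2 * w
        grad_b = np.mean(p - labels)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def token_scores(acts, w, b):
    """Per-token probability that the token belongs to a deceptive response."""
    return sigmoid(acts @ w + b)


def response_score(acts, w, b):
    """Assumed aggregation: average the per-token scores over a response."""
    return float(token_scores(acts, w, b).mean())


# Synthetic stand-in for model activations (assumption, not real data):
# deceptive tokens are shifted along one hidden direction, honest tokens
# are shifted the opposite way.
rng = np.random.default_rng(0)
d_model = 16
direction = rng.normal(size=d_model)
direction /= np.linalg.norm(direction)


def fake_activations(n_tokens, deceptive):
    shift = 2.0 if deceptive else -2.0
    return rng.normal(size=(n_tokens, d_model)) + shift * direction


X = np.vstack([fake_activations(200, False), fake_activations(200, True)])
y = np.concatenate([np.zeros(200), np.ones(200)])
w, b = train_probe(X, y)

# Score two held-out synthetic "responses" of 30 tokens each.
honest_score = response_score(fake_activations(30, False), w, b)
deceptive_score = response_score(fake_activations(30, True), w, b)
```

In this toy setup the probe is just a weight vector `w` and a bias `b`, so scoring a new token costs one dot product. How per-token scores should be combined into a response-level decision is a separate design choice, and the mean used here is only one option.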