
CAIS Philosophy Fellowship Midpoint Deliverables

Jun 07, 2023 by Dan H

Conceptual AI safety researchers aim to help orient the broader field of AI safety, but in doing so, they must wrestle with imprecise, nebulous, hard-to-define problems. Philosophers specialize in dealing with problems like these. The CAIS Philosophy Fellowship supports PhD students, postdocs, and professors of philosophy in producing novel conceptual AI safety research.

This sequence collects draft papers written by the CAIS Philosophy Fellows, shared here to elicit feedback.

- Instrumental Convergence? [Draft] by J. Dmitri Gallow
- The Polarity Problem [Draft] by Dan H, cdkg, Simon Goldstein
- Shutdown-Seeking AI by Simon Goldstein
- Is Deontological AI Safe? [Feedback Draft] by Dan H, William D'Alessandro
- There are no coherence theorems by Dan H, EJT
- Aggregating Utilities for Corrigible AI [Feedback Draft] by Dan H, Simon Goldstein
- AI Will Not Want to Self-Improve by petersalib
- Group Prioritarianism: Why AI Should Not Replace Humanity [draft] by fsh
- Language Agents Reduce the Risk of Existential Catastrophe by cdkg, Simon Goldstein