Safety researchers should take a public stance
[Co-written by Mateusz Bagiński and Samuel Buteau (Ishual)]

TL;DR

- Many X-risk-concerned people who join AI capabilities labs with the intent to contribute to existential safety think that the labs are currently engaging in a race that is unacceptably likely to lead to human disempowerment and/or extinction, and would prefer an AGI ban[1] over the current path.
- This post makes the case that such people should speak out publicly[2] against the current AI R&D regime and in favor of an AGI ban[3]. They should explicitly communicate that a saner world would coordinate not to build existentially dangerous intelligences, at least until we know how to do so in a principled, safe way. They could choose to preserve their political capital by not calling the current AI R&D regime insane, or lean into the valid persona of "we will either cooperate (if enough others cooperate) or win the competition in style (otherwise)".
- X-risk-concerned people who have some influence within AI capabilities labs should additionally advocate internally for the lab to let its employees speak out publicly, as described above, without any official retaliation, and should truthfully state publicly whether it does. If they are unable to get a lab to adopt this policy, they should say so publicly.
- X-risk-concerned people in our communities should enforce the norm of praising the heroism of those who [join AI capabilities labs while speaking out publicly against the current mad race] and of being deeply skeptical of the motives of those who [join without publicly speaking out].
- Not being public about one's views on this hinders the development of common knowledge, nearly guarantees that the corrupting influence of working inside a lab (which does not depend on whether one speaks out publicly) partially reshapes one into a worse version of oneself, and gives an alibi[4] to people who want to join labs for other reasons that their community would otherwise condemn.

Quotes

> Liron
I think this is a plausibly fruitful direction of investigation, but I also believe that a mature ontology/theory of ~value/agency will cut across the categories of consequentialism, deontology, virtue ethics, etc., inherited from moral philosophy, and so a proper solution grounded in such a theory will cut across those categories as well.