[Linkpost] Community polls on alignment controversies

Jasmine Brazilek; MilesTS; jonahmattwoodward

This is a linkpost for https://forum.effectivealtruism.org/posts/MkcseQrXPkeFDjFuA/community-polls-on-alignment-controversies

Planning where we focus at CaML requires forming views on many controversial questions. People we've talked to often have wildly different perceptions of where the balance of opinion lies, so we thought polling the community would be a good way for us and others to find out where we're out of step on these issues. Also feel free to tell us if you think any of the questions are ambiguous or embed false assumptions.

The questions (voting in comments):

Q1: Robust alignment requires alignment-relevant intervention during pretraining
Q2: AI alignment to humans will in practice avoid moral catastrophes to animals
Q3: AI alignment to humans will in practice avoid moral catastrophes to digital minds
Q4: Research into digital mind suffering is sufficiently tractable to work on
Q5: Partially aligned transformative AIs are likely to be stable under reflection
Q6: Alignment to specific values is underrated in research relative to control
Q7: Multipolar worlds will compete away >90% of net value that would otherwise be preserved
