I'd like AI researchers to establish a consensus (>=70%) opinion on this question: What properties would a hypothetical AI system need to demonstrate for you to agree that we should completely halt AI development?
I see at least two ways in which agreement among researchers on this would be useful:
1. Labs could then attempt to build non-lethal, non-AGI 'toy' systems that demonstrate the agreed-upon properties. If a sufficiently powerful AI system would be lethal to humans, this work could potentially prove it before such a system is built.
2. If it turns out there is a need to completely shut down AI development, the only way this will happen is with government help. We would likely need a consensus opinion that we are in the process of building a system lethal to humans if we want to compel governments to shut it all down.
Building this kind of consensus takes time, and I think we should start trying now!
I wrote in a previous post that I would like to prepare governments for responding quickly and decisively to AGI warning shots. A consensus on this question would both clarify what those warning shots might look like and, if alignment is failing catastrophically, give us a better chance of noticing them before it's too late.