Interesting. Your crux seems good; I think it's a crux for us. I expect things to play out more like Eliezer predicts here:

I also predict that there will be types of failure we will not notice, or will misinterpret. It seems fairly likely to me that proto-AGI (i.e. AI that could autonomously learn to become AGI within <~10 years of acting in the real world) is deployed and creates proto-AGI subagents, some of which ...

