3.1: What are the current major open problems in Paul’s agenda?
The most important open problems in Paul’s agenda, according to Paul:
(See: Two guarantees, The informed oversight problem, Corrigibility, and the “Low Bandwidth Overseer” section of William Saunder's post Understanding Iterated Distillation and Amplification: Claims and Oversight.)
3.2: How close to completion is Paul’s research agenda?
Not very close. For all we know, these problems might be extraordinarily difficult. For example, a subproblem of “transparent cognition” is “how can humans understand what goes on inside neural nets”, which is a broad open question in ML. Subproblems of “worst-case guarantees” include ensuring that ML systems are robust to distributional shift and adversarial inputs, which are also broad open questions in ML, and which might require substantial progress on MIRI-style research to articulate and prove formal bounds. And getting a formalization of corrigibility might require formalizing aspects of good reasoning (like calibration about uncertainty), which might in turn require substantial progress on MIRI-style research.
I think people commonly conflate “Paul has a safety agenda he feels optimistic about” with “Paul thinks he has a solution to AI alignment”. Paul in fact feels optimistic about these problems getting solved well enough for his agenda to work, but does notconsider his research agenda anything close to complete.
(See: Universality and security amplification, search “MIRI”)