This is a linkpost for https://www.fhi.ox.ac.uk/wp-content/uploads/AI-Governance_-A-Research-Agenda.pdf
Allan Dafoe, head of the AI Governance team at FHI, has written up a research agenda. It's a very large document, and I haven't gotten around to reading it all, but one of the things I would be most excited about is people in the comments quoting the most important snippets that give a big-picture overview of the agenda's content.
I thought the following was quite a good operationalisation of how hard aligning advanced AI systems might be, taken from the conclusion of the overview of the technical landscape. (All of what follows is a direct quote, but it's not formatted as a quote because the editor can't do that at the same time as bullet points.)
---
There are a broad range of implicit views about how technically hard it will be to make safe advanced AI systems. They differ on the technical difficulty of safe advanced AI systems, as well as risks of catastrophe, and rationality of regulatory systems. We might characterize them as follows:
[85] This assumes that recoverable accidents occur with sufficient probability before non-recoverable accidents.
[86] Yudkowsky, Eliezer. “So Far: Unfriendly AI Edition.” EconLog | Library of Economics and Liberty, 2016. http://econlog.econlib.org/archives/2016/03/so_far_unfriend.html.
There's an interesting appendix listing desiderata for good AI forecasting, which includes the catchy phrase "epistemically temporally fractal" (a phrase I feel compelled to find a place to use in my life). The first three points are reminiscent of Zvi's recent post.
Most of the sections are just lists of interesting questions for further research, and those lists seem fairly comprehensive. The section on avoiding arms races does more to conceptually break up the space; in particular, the third and fourth paragraphs distill the basic models around these topics in a way I found useful. My guess is that this section is most representative of Allan Dafoe's future work.