Question: MIRI Corrigibility Agenda
MIRI's reading list on corrigibility seems outdated, and I can't find a centralised list. Does anyone have, or know of, one? As a side note, has MIRI stopped updating their reading list? It seems like that's the case. EDIT: Links given in the comment section to do with corrigibility....

I am kind of surprised you didn't reference causal inference here, if only to gesture at the task in which we "figure out which variables are directly relevant - i.e. which variables mediate the influence of everything else". Are you pointing to a different sort of idea, or do you feel causal inference isn't adequate for describing this task?
Also, scenarios 1 and 2 seem fairly close to the "linear" and "non-linear" models of innovation Jason Crawford described in his talk "The Non-Linear Model of Innovation." To be honest, I preferred his description of the models, though he didn't cover how miraculous it is that the model can work at all: that, to a good approximation, the universe is simple and local.