I can think of a few reasons someone might think AI Control research should receive very high priority, apart from what is mentioned in the post or in Buck's comment:
I agree with basically everything in the post but put enough probability on these points to think that control research has really high expected value anyway.
Interesting!
I thought of a couple of things and was wondering whether you have considered them.
It seems to me that when examining mutual information between two objects, a lot of that mutual information might be unusable by the agent. For example, there is a lot of mutual information between my present self and me in 10 minutes, but most of it is information about myself that I am not aware of and so cannot use for decision making.
Also, if you examine an object that is fairly constant, would you not get high mutual information for the object at different times, even though it is not very agentic? Can you differentiate between autonomy and a stable object?
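To make the stable-object worry concrete, here is a minimal toy sketch (my own illustration, not the post's definition). It assumes a discrete state with 8 possible values and uses a simple plug-in estimate of mutual information. The names `mutual_information`, `stable` and `memoryless` are hypothetical. An inert object whose state is set once and then never changes shows close to the maximum 3 bits of mutual information between two times, while an object whose state is redrawn at random shows close to zero.

```python
import math
import random
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(X;Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in joint.items():
        # Term p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Toy "stable object": the state is drawn once and is identical at both times.
stable = [(s, s) for s in (rng.randrange(8) for _ in range(10_000))]

# Toy "memoryless object": the state is redrawn independently at the later time.
memoryless = [(rng.randrange(8), rng.randrange(8)) for _ in range(10_000)]

mi_stable = mutual_information(stable)
mi_memoryless = mutual_information(memoryless)

print(f"stable object:     {mi_stable:.3f} bits")
print(f"memoryless object: {mi_memoryless:.3f} bits")
```

In this toy setup the high value for the stable object comes entirely from persistence, with no decision making involved, which is the case I am unsure the measure can separate from autonomy.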
I think my default response when I learn about [trait X] is almost the opposite of how it is described in the post, at least if I learn that someone I know has it.
My mind reflexively tries to explain how [trait X] is not that bad, or is even good in a certain context. I have had to force myself not to defend it automatically in my head. I might signal (consciously or unconsciously) dislike for the trait in general, but not when I am confronted with someone I know having it. There are probably exceptions to this though, maybe for more extreme traits. I hope I wouldn't automatically try to internally defend rape, for example, even if it was reflexive and only for one or two seconds.
I just wanted to note that people like me exist too, and in certain cultures it might be fairly common (though I'm just speculating here).
My apologies, when I started on the post I searched for the word "memorization", and there were not many results. I forgot to change the statement when I realised there were more posts than I first thought.
Although, I still think there is too little discussion about memorization, perhaps with the exception of spaced repetition.
Thank you for pointing out the error.
I think my points argue more that control research might have higher expected value than some other approaches that don't address delegation at all or are much less tractable. But I agree: if slop is the major problem, then most current control research doesn't address it, though it's nice to see that this might change if Buck is right.
And my point about formal verification was to work around the slop problem by verifying the safety approach to a high degree of certainty. I don't know if it's feasible, but some seem to think so. Why do you think it's a bad idea?