I am extremely interested in these sorts of questions myself (message me if you would like to chat more about them). On the relation between accuracy and calibration, I think you can see some of it in Open Philanthropy's report on the quality of their predictions. In footnote 10, I believe they decompose the Brier score into a term for miscalibration, a term for resolution, and a term for entropy.
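To make the decomposition concrete, here is a minimal sketch of the standard Murphy decomposition, Brier = reliability - resolution + uncertainty. I am assuming this is the decomposition the footnote has in mind, and that the "entropy" term corresponds to the base-rate variance term; I have not checked it against the report. The function name and the toy data are made up for illustration.

```python
from collections import defaultdict


def brier_decomposition(forecasts, outcomes):
    """Return (brier, reliability, resolution, uncertainty).

    Forecasts are grouped by their exact value, so the identity
    brier = reliability - resolution + uncertainty holds exactly
    (up to floating-point error). Outcomes are 0 or 1.
    """
    n = len(forecasts)
    base_rate = sum(outcomes) / n

    # Collect the outcomes observed under each distinct forecast value.
    groups = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        groups[f].append(o)

    # Reliability (miscalibration): gap between each forecast and the
    # observed frequency within its group.
    reliability = sum(
        len(obs) * (f - sum(obs) / len(obs)) ** 2 for f, obs in groups.items()
    ) / n
    # Resolution: how far the group frequencies move away from the base rate.
    resolution = sum(
        len(obs) * (sum(obs) / len(obs) - base_rate) ** 2 for obs in groups.values()
    ) / n
    # Uncertainty: variance of the outcome, independent of the forecaster.
    uncertainty = base_rate * (1 - base_rate)

    brier = sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / n
    return brier, reliability, resolution, uncertainty


# Hypothetical toy data: ten yes/no questions.
forecasts = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.5, 0.5]
outcomes = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

brier, rel, res, unc = brier_decomposition(forecasts, outcomes)
print(brier, rel, res, unc)

# A forecaster who always states the base rate is perfectly calibrated
# (reliability 0) but has no resolution, so their Brier score equals
# the uncertainty term.
print(brier_decomposition([0.5] * 10, outcomes))
```

The second call illustrates why calibration alone does not imply accuracy: the miscalibration term can be zero while the Brier score stays at the uncertainty term, because nothing in the forecasts separates the cases.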
Also, would you be able to explain a bit how it would be possible for someone who is perfectly calibrated at predicting rain to pre...
I think this is a very useful post that talks about many of the right things. One question, though: isn't it only worth focusing on the worlds where iterative design fails for alignment to the extent that progress can still be made towards mitigating those worlds? It appears to me that progress in technical fields is usually accomplished through iterative design, so it makes sense to have a high prior on non-iterative approaches being less effective. Depending on your specific numbers here, it seems like it could be worth it to pay attentio...
Really good post. Based on this, it seems extremely valuable to me to test the assumption that we already have animal-level AIs. I understand that this is difficult due to built-in brain structure in animals, different training distributions, and the difficulty of creating a simulation as complex as real life. It still seems like we could test this assumption by doing something along the lines of training a neural network to perform as well as a cat's visual cortex on image recognition. I predict that if this were done in a way that accounted for the flexib...