I like this story. It touches on certainty bounds relative to superintelligence, which I think is an underexplored concept. That is: if you're an AGI that knows every bit of information relevant to a certain plan, and has the "cognitive" power to make use of this info, what would you estimate your plan's chance of success to be? Can it ever reach 100%?
To be fair, I don't think answers to this question are very tractable right now; afaict we don't have detailed enough physics to usefully estimate our world's level of determinism. But this feature of the universe seems especially relevant to how likely an AGI is to attempt complex plans (both aligned and unaligned) which carry some probability of it being turned off forever.
If anyone knows of existing works which discuss this in more depth I'd love some recommendations!
Insight volume/quality doesn't seem meaningfully correlated with hours worked (see June Huh for an extreme example); high-insight people tend to have work schedules optimized for their mental comfort. I don't think encouraging someone who's producing insights at 35 hours per week to work 60 hours per week will result in more alignment progress, and I also doubt that the only people producing insight are those working 60 hours per week.
EDIT: this of course relies on the prior belief that more insights are what we need for alignment right now.
This seems to support Reward is Enough.
DM simulates a lower-fidelity version of real-world physics -> Applies real-world AI methods -> Achieves generalised AI performance.
This is a pretty concrete demonstration that current AI methods are sufficient to achieve generality; we just need more real-world data to match the more complex physics of reality.