I found this post to be a clear and reasonable-sounding articulation of one of the main arguments for there being catastrophic risk from AI development. It helped me with my own thinking to an extent. I think it has a lot of shareability value.
I think this is basically correct and I'm glad to see someone saying it clearly.
I agree with this post. However, I think it's common amongst ML enthusiasts to eschew specification and defer to statistics on everything. (Or datapoints trying to capture an "I know it when I see it" "specification".)
This is one of the answers: https://www.alignmentforum.org/posts/FWvzwCDRgcjb9sigb/why-agent-foundations-an-overly-abstract-explanation
The trick is that for some of the optimisations, a mind is not necessary. There is a sense perhaps in which the whole history of the universe (or life on earth, or evolution, or whatever is appropriate) will become implicated for some questions, though.
I think https://www.alignmentforum.org/posts/TATWqHvxKEpL34yKz/intelligence-or-evolution is somewhat related in case you haven't seen it.
I'll add $500 to the pot.
Interesting - it's not so obvious to me that it's safe. Maybe it is because avoiding POUDA is such a low bar. But the sped up human can do the reflection thing, and plausibly with enough speed up can be superintelligent wrt everyone else.
A possibly helpful - because starker - hypothetical training approach you could try for thinking about these arguments is make an instance of the imitatee that has all their (at least cognitive) actions sped up by some large factor (e.g. 100x), e.g., via brain emulation (or just "by magic" for the purpose of the hypothetical).
It means f(x) = 1 is true for some particular x's, e.g., f(x_1) = 1 and f(x_2) = 1, there are distinct mechanisms for why f(x_1) = 1 compared to why f(x_2) = 1, and there's no efficient discriminator that can take two instances f(x_1) = 1 and f(x_2) = 1 and tell you whether they are due to the same mechanism or not.