I think this came up in the previous discussion as well: that an AI able to competently design a nanofactory might also have the capability to manipulate humans at a high level. For example:
Then when the system generalizes well enough to solve domains like "build a nanosystem" - which, I strongly suspect, can't be solved without imaginative reasoning because we can't afford to simulate that domain perfectly and do a trillion gradient descent updates on simulated attempts - the kind of actions or thoughts you can detect as bad, that might have provided earlier warning, were trained out of the system by gradient descent; leaving actions and thoughts you can't detect as bad.
Even among humans, there seem to be people, e.g. on the autistic spectrum, whom I can imagine having the imaginative reasoning and creativity required to design something like a nanofactory (say, 2-3 SD above the human average) while also being 2-3 SD below the human average at manipulating other humans. At the least, this points to those two capabilities maybe not being the same general-purpose cognition, or not drawing on the same "core of generality".
While this separation is not guaranteed by default in the first nanosystem-design-capable AI system, it seems like it shouldn't be impossible to achieve with more research.
I read The Vital Question by Nick Lane a while ago, and it was the most persuasive argument I've seen on abiogenesis; it may be of interest to you. The argument was that abiogenesis could be fairly common, based on proton gradients.
Thanks! This was a super informative read, it helped me grok a few things I didn't before.
The MCTS naming is surprising. Seems strange they're sticking with that name.
BTW, does anyone know of any write-ups like this one, but on transformers?
Were you surprised by the direction of the change or the amount?
I still plan to be there this Friday, so in case anyone reading this is interested - I encourage you to show up.