Thanks a lot for your detailed reply and sorry for my slow response (I had to take some exams!).
Regarding terminal goals, the only compelling one I have come across is coherent extrapolated volition, as outlined in Superintelligence. How to actually program this into code is of course problematic, and I haven't followed the literature closely since then for rebuttals or better ideas.
I enjoyed your piece on Steered Optimizers, and I think it gave me examples of how algorithmic design and inductive biases can affect how controllable our system is. It also brings to mind this piece, which I suspect you may really enjoy: https://www.gwern.net/Backstop.
I am quite a believer in fast takeoff scenarios, so I am unsure to what extent we can control a full AGI, but until it reaches criticality the tools we have to test and control it will indeed be crucial.
One concern I have that you might be able to address is that evolution did not optimize for interpretability! While DNNs are certainly quite black-box, they remain more interpretable than the brain. I assign some prior probability to the same gap in interpretability holding between DNN-based and neocortex-based AGI.
Another concern is with the human morals that you mentioned. This should certainly be investigated further, but I think almost no human has an internally consistent set of morals. In addition, I think that the morals we have were selected by the selfish gene, and even if we could re-simulate them through an evolution-like process we would get the good with the bad. https://slatestarcodex.com/2019/06/04/book-review-the-secret-of-our-success/ and a few other evolutionary biology books have shaped my thinking on this.
Hi Steve, thanks for all of your posts.
It is unclear to me how this investigation into brain-like AGI will aid in safety research.
Can you provide some examples of what discoveries would indicate that this is an AGI route that is very dangerous or safe?
Without having thought about this much, it seems to me that the control/alignment problem depends on the terminal goals we provide the AGI rather than on the substrate and algorithms it runs on to reach AGI-level intelligence.
Thank you for the kind words and for flagging some terms to look out for in societal change approaches.
Fair enough, but for it to be that powerful and used as part of our immune system, we may be free of parasites because we are all dead xD.
Thanks for the informative comments. You make great points. I think the population structure of bats may have something to do with their unique immune response to these infections, but I definitely want to look at the bat immune system more.