As a neuroscientist-turned-machine-learning-engineer, I have been thinking about this situation in a way very similar to the one described in this article. One possible difference is that I think there are a fair number of algorithms/architectures that could successfully produce an agentive general learner sufficient for AGI. I think that a human-brain-like algorithm might be the first one developed, because of its fairly good efficiency and because we have a working model to study (albeit with difficulty). On the other hand, I think it's probable that deep learning, scaled up enough, will stumble across a surprisingly effective algorithm all of a sudden with little warning (loosely in the spirit of the lottery ticket hypothesis), risking an accidental hard take-off scenario. I kinda hope the human-brain-like algorithm actually does turn out to be the first breakthrough, since I feel like we'd have a better chance of understanding and controlling it, and of noticing/measuring when we'd gotten quite close.
With the blind groping into unknown solution spaces that deep learning represents, we might find more than we'd bargained for with no warning at all. Just a sudden jump from an awkward, semi-competent statistical machine to a powerful, deceitful, alien-minded agent.
I also studied neuroscience for several years, and Jeff's first book was a major inspiration for me beginning that journey. I agree very much with the points you make in this review and in https://www.lesswrong.com/posts/W6wBmQheDiFmfJqZy/brain-inspired-agi-and-the-lifetime-anchor
Since we seem to be more on the same page than most other people I've talked to about this, perhaps a collaboration between us could be fruitful. I'm not sure on what exactly, but I've been thinking about how to transition into direct work on AGI safety ever since I updated, over the past couple of years, toward AGI being potentially even closer than I'd thought.
As for the brain division, I also think of the neocortex and basal ganglia as working together as a subsystem. I actually strengthened my belief in their tight coupling in my last year of grad school, when I learned more about the striatum gating thoughts (not just motor actions) and the cerebellum smoothing abstract thoughts (not just motor actions). So now I envision the brain more as thousands of mostly repetitive loops of little neocortex region -> little basal ganglia region -> little hindbrain region -> little cerebellum region -> same little neocortex region, where the loops also communicate sideways a bit in each region, but mostly in the neocortex. With this understanding, I feel like I can't at all get behind Jeff H's idea of safely separating out the neocortical functions from the mid/hind brain functions. I think that an effective AGI general learning algorithm will likely need at least some aspects of those little loops, with striatal gating, cerebellar smoothing, and hippocampal memory linkages. I do think that the available data in neuroscience is very close to, if not already, sufficient for describing the necessary algorithm, and that it's just a question of a bit more focused work on sorting out the necessary parts from the unneeded complexity. I pulled back from actively trying to do just that once I realized that gaining that knowledge without sufficient safety preparation could be a bad thing for humanity.
Hey so, this tickled my curiosity and I went exploring for similar projects just to see what's out there. I came across a couple of YouTube videos I enjoyed enough to feel they were worth passing on. So just in case you are also fascinated by similar sorts of sims...