Background / Context
Some people, including me, think that it will be very hard to write and run Artificial General Intelligence (AGI) code without risking catastrophic accidents, up to and including human extinction (see for example my post here).
If that’s true, one option that’s sometimes brought up is a particular Differential Technological Development strategy, wherein we specifically try to get the technology for Whole Brain Emulation (WBE) before we have the technology for writing AGI source code.
Would that actually help solve the problem? I mean, other things equal, if flesh-and-blood humans have probability P of accidentally creating catastrophically-out-of-control AGIs, well, emulated human brains would do the exact same thing with the exact same probability…. Right? Well, maybe. Or it might be more complicated than that. There’s a nice discussion about this in the report from the 2011 “Singularity Summit”. That's from a decade ago, but AFAICT not much progress has been made since then towards clarifying this particular strategic question.
Anyway, when considering whether or not we should strategically try to differentially accelerate WBE technology, one important aspect is whether it would be feasible to get WBE without incidentally first understanding brain algorithms well enough to code an AGI from scratch using similar algorithms. So that brings us to the point of this post:
Randal Koene is apparently very big in WBE circles—he coined the term “WBE”, he’s the co-founder of The Carboncopies Foundation, he’s a PhD computational neuroscientist and neuroengineer, etc. etc. Anyway, in a recent interview he seems to come down firmly on the side of “we will understand brain algorithms before WBE”. Here’s his reasoning:
Interviewer (Paul Middlebrooks): Engineering and science-for-understanding are not at odds with each other, necessarily, but they are two different things, and I know this is an open question, but in your opinion how much do we need to understand brains, how much do we need to understand minds, and what is a mind, and how brains and minds are related, how much is understanding part of this picture? Randal, let’s start with you.
Interviewee (Randal Koene): I think I may have shifted my views on that a bit over time, as I understand more about the practical problem of how would you get from where we are to where you can do WBE, and just looking at it in that sense. Y’know, in the past I might have emphasized more that the idea behind WBE is precisely that you don’t need to know everything about the brain as long as you know how the underlying mechanisms work, if you can scan enough, and you can put those mechanisms together, then you’re gonna end up with a working brain. That’s a bit naïve because it presumes that we collect data correctly, that we collect the right data, that we know how to transform that data to the parameters we use in the model, that we’re using the right model, all this kind of stuff, right? And all these questions that I just mentioned, they all require testing. And so validation is a huge issue, and that’s where the understanding of the brain comes in, because if you want to validate that at least the model you’ve built works like a human hippocampus, then you need to have a fairly good understanding of how a human hippocampus works, then you can see whether your system even fits within those boundaries before you can even say “Is this Steve’s hippocampus?” So I would still say that the thing that WBE holds as a tenet is that we don’t need to understand everything about Steve to be able to make a WBE of Steve. We need to understand a heck of a lot about human brains so that we can build a testable model of a human brain that will then house Steve. But we can collect the data about Steve that makes that personalized and tuned to be Steve. So we need to understand a lot about the brain, but in the context of how brains work, not how Steve’s brain works, that’s where you would then be taking the data, and of course you need to know a lot about that transformation of what makes it Steve’s brain in this particular case. (Source: Brain Inspired podcast, 1:15:00)
I just thought this was an interesting perspective and wanted to put it out there.
(For my part, I happen to agree with the conclusion that it's probably infeasible to do successful WBE without first understanding brain algorithms well enough to make AGI, but for kinda different (non-mutually-exclusive) reasons—basically I think the former is just way harder than the latter. See Building brain-inspired AGI is infinitely easier than understanding the brain. OK, well, I was talking about something slightly different in that post—"understanding" is not the same as "emulating"—but it would be mostly the same arguments and examples.)
IMO there's a fair chance that it's much easier to do alignment given WBE, since it gives you a formal specification of the entire human policy instead of just some samples. For example, we might be able to go from policy to utility function using the AIT definition of goal-directedness. So, there is some case for doing WBE before TAI if that's feasible.
I think it’s plausible we’ll be able to use deep learning to model a brain well before we understand how the brain works.
Congratulations, you now have an emulated human. No need to understand any brain algorithms. You just need tons of brain + behaviour data and compute. I think this will be possible before non brain-based AGI because current AI research indicates it’s easier to train a model by distilling/imitating an already trained model than it is to train from scratch, e.g., DistilBERT: https://arxiv.org/abs/1910.01108v4
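As a rough illustration of the distillation idea mentioned above, here is a minimal sketch of a soft-target loss: the student is trained to match the teacher's temperature-softened output distribution. The function names, the temperature value, and the toy logits are assumptions made for illustration. This is not DistilBERT's actual training code, which, as far as I know, combines a loss of this kind with other terms.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Hypothetical helper: the loss is zero when the student reproduces
    the teacher's distribution and grows as the two diverge.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]  # toy "already trained" model outputs
    print(distillation_loss([1.0, 2.0, 3.0], teacher))  # student matches teacher
    print(distillation_loss([3.0, 2.0, 1.0], teacher))  # student disagrees
```

In the brain-imitation framing, the "teacher" outputs would be recorded brain and behaviour data rather than another network's logits, which is a much noisier and more partial target than this toy example suggests.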
I've been trying to brand this paradigm as "brain imitation learning" but it hasn't caught on. The research still continues and we're seeing exponential increases in neuron recording capabilities and DL models are doing ever better in cracking open the human brain's neural code*, but this in-between approach is still mostly ignored.
* so IMO the only reason to be less interested in it than a few years ago is if you think pure DL scaling/progress has gone so fast that it's outpacing even that, which is reasonable but given the imponderables here and the potential for sudden plateaus in scaling or pure DL progress, I think people should still be keeping more of an eye on brain imitation learning than they do.
I don't think the thing you're talking about is "an emulated human", at least not in the WBE sense of the term.
I think the two reasons people are interested in WBE are:
What you're talking about wouldn't have either of those benefits, or at least not much.
I wasn't recording my brain when Pat kissed me in fourth grade, and I haven't recalled that memory since then, so there's no way that an emulation could have access to that memory just based on a database of real-time brain recording. The only way to get that memory is to slice up my brain and look at the synapses under a microscope. (Made-up example of course, nobody in fourth grade would have dreamed of kissing me.)
Also, I believe that human motivation—so important for safety—heavily involves autonomic inputs and outputs (pain, hunger, circulating hormone levels, vasoconstriction, etc. etc.)—and in this domain your proposed system wouldn't be able to measure most of the inputs, and wouldn't be able to measure most of the outputs, and probably wouldn't be able to measure most of the brain processing that goes on between the inputs and outputs either! (Well, it depends on exactly what the brain-computer interface type is, but autonomic processing tends to happen in deeply-buried hard-to-measure brain areas like the insular and cingulate cortex, brainstem, and even inside the spinal cord). Maybe you'll say "that's fine, we'll measure a subset of inputs and a subset of outputs and a subset of brain processing, and then we'll fill in the gaps by learning". And, well, that's not unreasonable. I mean, by the same token, GPT-3 had only a tiny subset of human inputs and outputs, and zero direct measurements of brain processing, and yet GPT-3 arguably learned an implicit model of brain processing. Not a perfect one by any means, but something.
So anyway, one can make an argument that there are safety benefits of human imitation learning (versus, say, training by pure RL in a virtual environment), and then one can add that there are additional safety benefits when we go to "human imitation learning which is souped-up via throwing EEG data or whatever into the model prediction target". I'm open-minded to that kind of argument and have talked about vaguely similar things myself. But I still think that's a different sort of argument than the WBE safety argument above, the argument that the WBE of a trustworthy human is automatically trustworthy because it's the same person. In particular, the imitation-learning safety argument is much less airtight I think. It requires additional careful thought about distributional shifts and so on.
So my point is: I don't think what you're talking about should be called "emulations", and even if you're right, I don't think it would undermine the point of this post, which is that WBE is unlikely to happen before non-WBE AGI even if we wanted it to.
So now we move on to whether I believe your scenario. Well it's hard to be confident, but I don't currently put much weight on it. I figure, option 1 is: "deep neural nets do in fact scale to AGI". In that case, your argument is that EEG data or whatever will reduce training time/data because it's like model distillation. I would say "sure, maybe model distillation helps, other things equal … but on the other hand we have 100,000 years of YouTube videos to train on, and a comparatively very expensive and infinitesimal amount of EEG data". So I expect that all things considered, future engineers would just go with the YouTube option. Option 2 is: "deep neural nets do not in fact scale to AGI"—they're the wrong kind of algorithm for AGI. (I've made this argument, although I mean who knows, I don't feel that strongly.) In that case adding EEG data as an additional prediction target wouldn't help.
1. Accidentally creating AGI seems unlikely.
2. If you only have one emulated brain, it's less likely to do so than humans (base rate).
3. Emulating brains in order to increase capability is currently...an idea. Even if you could run it 'at the same speed' (to the extent that such a thing makes sense - and remember we interact with the world through bodies, a brain can't see, etc.), running faster would take correspondingly more effort and be even more expensive. (I've heard brains are remarkably power efficient, physically. The cost of running a supercomputer to simulate a human brain seems high, though working out better software might improve this by some amount.)
Practically, progress requires doing both, i.e. better equipment to create and measure electricity is needed to understand it better, which helps understand how to direct, contain, and generate it better, etc.
Sorry if I was unclear; my intended parsing was "accidentally (creating catastrophically-out-of-control AGIs)". In other words, I don't expect that people will try to create catastrophically-out-of-control AGIs. Therefore, if they create catastrophically-out-of-control AGIs, it would be by accident.
I think you're overly confident that WBE would be irrelevant to the timeline of AGI capabilities research, but I think it's a moot point anyway, since I don't expect WBE before AGI, so I'm not really interested in arguing about it. :-P
I do in fact agree with you, but I think it's not as clear-cut as you make it out to be in the WBE case; I think it takes a more detailed argument where reasonable people could disagree. In particular, there's an argument on the other side that says "implementation-level understanding" is a different thing from "algorithm-level understanding", and you only need the first one for WBE, not the second one.
So for example, if I give you a binary executable "factor.exe" that solves hard factorization problems, you would be able to run it on a computer much more easily than you could decompile it and understand how the algorithm works.
This example goes through because we have perfect implementation-level understanding about running executables on CPUs. In the brain case, Randal is arguing (and I agree) that we don't have perfect implementation-level understanding, and we won't get it by just studying the implementation level. The implementation level is just very complicated, much more complicated than "dendrites are inputs, axons are outputs" etc. And it could involve subtle things that we won't actually go measure and simulate unless we know that we need to go looking for them. So in practice, the only way to make up for our expected deficiencies in implementation-level understanding is to also have good algorithm-level understanding.
Ah, I wrote this around the same time as another comment responding to something about 'alignment work is a good idea even if the particular alignment method won't work for a super intelligence'. (A positive utility argument is not a max utility argument.)
So, I wasn't thinking about the timeline (and how relevant it would be) when I wrote that, just that it seems far out to me.
would be feasible to get WBE without incidentally first understanding brain algorithms well enough to code an AGI from scratch using similar algorithms.
I should have just responded to something like this (above).
I can see this being right (similar understanding required for both), although the idea that one must be easier than the other, I'm less sure of. Mostly in the sense that: I don't know how small an AGI can be. Yes brains are big (and complicated), but I don't know how much that can be avoided. So I think a working, understood, digital mind is a sufficiently large task that:
*an alternative would be that we'll get an answer to this question before we get AGI:
As we start understanding minds, our view of the brain starts to recognize the difficulty. Like 'we know something like X is required, but since evolution was proceeding in a very random/greedy way, in order to get something like X, a lot of unneeded complexity is added (because continuous improvement is a difficult constraint to fulfill, and relentless focus on improvement through that view is a path that does more 'visiting every local maximum along the way' than 'going for the global maximum'), and figuring this out will be a lot harder than figuring out (how to make) a digital** mind'.
**I don't know how much 'custom/optimized hardware for architectures/etc.' addresses the difficulties I mentioned. This might make your point about AGI before WBE a lot stronger - if the brain is an architecture optimized for minimizing power consumption in ways that make it way harder to emulate, timewise, than 'AGI more optimized for 'computers'', then that could be a reason WBE would take longer.
I'd have thought that the main reason WBE would come up would be 'understandability' or 'alignment' rather than speed, though I can see why at first glance people would say 'reverse engineering the brain (which exists) seems easier than making something new' (even if that is wrong).
Strong agree, WBE seems far out to me too, like 100 years, although really who knows. By contrast "understanding the brain and building AGI using similar algorithms" does not seem far out to me—well, it won't happen in 5 years, but I certainly wouldn't rule it out in 20 or 30 years.
I think the bigness and complicatedness of brains consists in large part of things that will not be necessary to include in the source code of a future AGI algorithm. See the first half of my post here, for example, for why I think that.
There's a normative question of whether it's (A) good or (B) bad to have WBE before AGI, and there's a forecasting question of whether WBE-before-AGI is (1) likely by default, vs (2) possible with advocacy/targeted funding/etc., vs (3) very unlikely even with advocacy/targeted funding/etc. For my part, I vote for (3), and therefore I'm not thinking very hard about whether (A) or (B) is right.
I'm not sure whether this part of your comment is referring to the normative question or the forecasting question.
I'm also not sure if when you say "reverse engineering the brain" you're referring to WBE. For my part, I would say that "reverse-engineering the brain" = "understanding the brain at the algorithm level" = brain-inspired AGI, not WBE. I think the only way to get WBE before brain-inspired AGI is to not understand the brain at the algorithm level.
Normative. Say that aligning something much smarter/more complicated than you (or your processing**) is difficult. The obvious fix would be: can we make people smarter?* If digital enhancement is easier (which seems like it could be likely, at least for crude ways - more computers more processing (though this may be more difficult than it sounds - serial has to be finished, parallel has to be communicated, etc.).)
This might help with analysis, or something like bandwidth. (Being able to process more information might make it easier to analyze the output of a process - if we wanted to evaluate some GPT-like thing's ability to generate poetry, and it can generate 'poetry' faster than we can read or rate it, then we're the bottleneck.)
*Or algorithms simpler/easier to understand.
**Better ways of analyzing chess might help someone understand a (current) position in a chess game better (or just allow phones to compete with humans (though I don't know how much of this is better algorithms, versus more powerful phones)).
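The caveat above that "serial has to be finished" is roughly what Amdahl's law captures: if only a fraction of a task can be parallelized, adding more computers gives diminishing returns. A minimal sketch follows; the function name and the 90% figure are illustrative assumptions, and the formula ignores communication overhead entirely, so real speedups would be lower still.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup under Amdahl's law, ignoring communication costs.

    parallel_fraction: share of the work that can be split across workers (0..1).
    workers: number of processors working in parallel (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical task where 90% of the work parallelizes.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

With 10% of the work stuck in serial, the speedup can never reach 10x no matter how many machines are added.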
Yeah, I'm not sure this approach is fruitful either. We haven't even managed to emulate a C. elegans - which has a mere 302 neurons - and they've been trying for two decades now. Apparently there's still no way to actually find out the synaptic weights at each neuron. Even if we had the entire connectome of the human brain mapped out it might not be useful to us.
My Google-fu is failing me as to the latest news on the OpenWorm project (most importantly, whether there are any new approaches to figuring out the synaptic weights). If anyone is up to date on this stuff, I'd love to know.
I should be