I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.
Maybe it will do that using GPUs, or maybe it will do that using some more neuromorphic design, but I really don't understand why energy density matters very much. The vast majority of energy that current humans produce is of course not spent on running human brains, and there are easily 10-30 OOMs of improvement lying around without going into density (just using the energy output of a single power plant under your model would produce something that would likely be easily capable of disempowering humanity).
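To make "the energy output of a single power plant" concrete, here is a rough back-of-envelope sketch; the watts-per-brain-equivalent figures are my own illustrative assumptions, not numbers from either model:

```python
# Rough illustration: how many brain-equivalents a single large power plant could
# supply, under a few assumed hardware costs per brain-equivalent.
plant_output_w = 1e9  # ~1 GW, a typical large power plant
for watts_per_brain_equiv in (20, 1e3, 1e4):
    n = plant_output_w / watts_per_brain_equiv
    print(f"{watts_per_brain_equiv:>8.0f} W per brain-equivalent -> {n:,.0f} brain-equivalents")
```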
More broadly, you list these three "assumptions" of Eliezer's worldview:
...The brain inefficiency assumption: The human brain is inefficient in multiple dimensions/ways/metrics that translate into intelligence per dollar; inefficient as a hardware platform in key metrics such as thermodynamic efficiency.
The mind inefficiency or human incompetence assumption: In terms of software he describes the brain as an inefficient complex "kludgy mess of spaghetti-code". He deri
I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.
I'm going to expand on this.
Jacob's conclusion to the speed section of his post on brain efficiency is this:
The brain is a million times slower than digital computers, but its slow speed is probably efficient for its given energy budget, as it allows for a full utilization of an enormous memory capacity and memory bandwidth. As a consequence of being very slow, brains are enormously circuit cycle efficient. Thus even some hypothetical superintelligence, running on non-exotic hardware, will not be able to think much faster than an artificial brain running on equivalent hardware at the same clock rate.
Let's accept all Jacob's analysis about the tradeoffs of clock speed, memory capacity and bandwidth.
The force of his conclusion depends on the superintelligence "running on equivalent hardware." Obviously, core to Eliezer's superintelligence argument, and habryka's comment here, is the point that the hardware underpinning AI can be made large and expanded upon in a way that is not possible...
First, he needs to explain why any efficiency constraints can't be overcome by just throwing a lot of material and energy resources into building and powering inefficient or as-efficient-as-human-brains GPUs. If energy is not a taut constraint for AGI, and it's also expected to be an increasing fraction of costs over time, then that sounds like an argument that we can overcome any efficiency limits with increasing expenditures to achieve superhuman performance.
If Jake claims to disagree with the claim that AI can starkly surpass humans [now disproven - he has made more explicit that it can], I'd roll my eyes at him. He is doing a significant amount of work based on the premise that this AI can surpass humans. His claims about safety must therefore not rely on AI being limited in capability; if his claims had relied on AI being naturally capability-bounded I'd have rolled to disbelieve [edit: his claims do not rely on it]. I don't think his claims rely on it, as I currently think his views on safety are damn close to simply being a lower resolution version of mine held overconfidently [this is intended to be a pointer to stalking both our profiles]; it's possible he actually disa...
The bioweapons point is something of a tangent, but I felt compelled to mention it because every time I've pointed out that strong nanotech can't have any core thermodynamic efficiency advantage over biology, someone has to mention superviruses or something, even though that isn't part of EY's model - he talks about diamond nanobots. But sure, that paragraph is something of a tangent.
EY's model requires slightly-smarter-than-us AGI running on normal hardware to start a FOOM cycle of recursive self improvement resulting in many OOM intelligence improvement in a short amount of time. That requires some combination of 1.) many OOM software improvement on current hardware, 2.) many OOM hardware improvement with current foundry tech, or 3.) completely new foundry tech with many OOM improvement over current - ie nanotech woo. The viability of any of this is entirely dependent on near-term engineering practicality.
I'm confused because you describe an "argument specifically that you are dispatching with your efficiency arguments", and the first paragraph sounds like an EY argument, but the 2nd more like my argument. (And 'dispatching' is ambiguous)
Also "being already superintelligent" presumes the conclusion at the onset.
So let's restart:
EY and I agree on 1 but diverge past that. Point 2 is partly a matter of software efficiency but not entirely. Recall that I correctly predicted in advance that AGI requires brain-like massive training compute, which largely defeats EY's view of 2 where it's just a modest "rewrite of its own source code". The efficiency considerations matter for both 2 and 3, as they determine how effectively it can quickly turn resources (energy/materials/money/etc) into bigger better training runs to upgrade its intelligence.
Thanks Jacob for talking me through your model. I agree with you that this is a model that EY and others associated with him have put forth. I've looked back through Eliezer's old posts, and he is consistently against the idea that LLMs are the path to superintelligence (not just that they're not the only path, but he outright denies that superintelligence could come from neural nets).
My update, based on your arguments here, is that any future claim about a mechanism for iterative self-improvement that happens suddenly, on the training hardware and involves > 2 OOMs of improvement, needs to first deal with the objections you are raising here to be a meaningful way of moving the conversation forward.
I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.
I am genuinely curious and confused as to what exactly you concretely imagine this supposed 'superintelligence' to be, such that it is not already the size of a factory, such that you mention "size of a factory" as if that is something actually worth mentioning - at all. Please show at least your first pass Fermi estimates for the compute requirements. By that I mean - what are the compute requirements for the initial SI - and then the later presumably more powerful 'factory'?
Maybe it will do that using GPUs, or maybe it will do that using some more neuromorphic design, but I really don't understand why energy density matters very much.
I would suggest reading more about advanced GPU/accelerator design, and then about datacenter design and the thermodynamic/cooling considerations therein.
...The vast majority of energy that current humans produce is of course not spent on running human brains, and there are easily 10-30 OOMs of improvement lying around without going into density (j
I am genuinely curious and confused as to what exactly you concretely imagine this supposed 'superintelligence' to be, such that it is not already the size of a factory, such that you mention "size of a factory" as if that is something actually worth mentioning - at all. Please show at least your first pass Fermi estimates for the compute requirements.
Despite your claim to be "genuinely curious and confused," the overarching tone and content of this bit does not strike me as projecting curiosity or confusion, but instead confident and sharp-toned burden-of-proof-shifting to habryka. That's merely a stylistic note, not impacting the content of your claims.
It sounds here like you are agreeing with him that you can deal with any ops/mm^3 limits by simply building a bigger computer. It's therefore hard for me to see why these arguments about efficiency limitations matter very much for AI's ability to be superintelligent and exhibit superhuman takeover capabilities.
I can see why maybe human brains, being efficient according to certain metrics, might be a useful tool for the AI to keep around, but I don't see why we ought to feel at all reassured by that. I don't really want...
Despite your claim to be "genuinely curious and confused," the overarching tone and content of this bit does not strike me as projecting curiosity or confusion, but instead confident and sharp-toned burden-of-proof-shifting to habryka
I see how that tone could come off as rude, but really I don't understand habryka's model when he says "a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily."
So it takes 1% of a single power plant output to train GPT4. If GPT-4 got put on chips and distributed, wouldn't it take only a very small amount of power comparatively to actually run it once trained? Why are we talking about training costs rather than primarily about the cost to operate models once they have been trained?
The transformer arch is fully parallelizable only during training; for inference on GPUs/accelerators it is roughly as inefficient as RNNs, or more so. The inference costs of GPT4 are of course an OpenAI/Microsoft secret, but it is not a cheap model. Also human-level AGI, let alone superintelligence, will likely require continual learning/training.
I guess by "put on chips" you mean ...
Re Yudkowsky, I don’t think his entire argument rests on efficiency, and the pieces that don’t can’t be dispatched by arguing about efficiency.
Regarding “alien mindspace,” what I mean is that the physical form of AI, and whatever awareness the AI has of that, makes it alien. Like, if I knew I could potentially transmit my consciousness with perfect precision over the internet and create self-clones almost effortlessly, I would think very differently than I do now.
His argument entirely depends on efficiency. He claims that near future AGI somewhat smarter than us creates even smarter AGI and so on, recursively bottoming out in something that is many many OOM more intelligent than us without using unrealistic amounts of energy, and all of this happens very quickly.
So that's entirely an argument that boils down to practical computational engineering efficiency considerations. Additionally he needs the AGI to be unaligned by default, and that argument is also faulty.
Sure. We will probably get enormous hardware progress over the next few decades, so that's not really an obstacle.
As we get more hardware and slow, mostly-aligned AGI/AI progress, this further raises the bar for foom.
It seems to me your argument is "smarter than human intelligence cannot make enormous hardware or software progress in a relatively short amount of time", but this has nothing to do with "efficiency arguments".
That is actually an efficiency argument, and in my brain efficiency post I discuss multiple sub components of net efficiency that translate into intelligence/$.
The bottleneck is not energy, the bottleneck is algorithmic improvements and improvements to GPU production, neither of which is remotely bottlenecked on energy consumption.
Ahh I see - energy efficiency is tightly coupled to other circuit efficiency metrics as they are all primarily driven by shrinkage. As you increasingly bottom out hardware improvements energy then becomes an increasingly more direct constraint. This is already happening with GPUs where power consumption is roughly doubling with each generation, and could soon dominate operating costs.
See here where I line the Roodman model up...
To reiterate, the model of EY that I am critiquing is one where an AGI rapidly fooms through many OOM of efficiency improvements. All the key required improvements are efficiency improvements - it needs to improve its world modelling/planning per unit compute, and/or improve compute per dollar, and/or compute per joule, etc.
In EY's model there are some perhaps many OOM software improvements over the initial NN arch/algorithms, perhaps then continued with more OOM hardware improvements. I don't believe "buying more GPUs" is a key part of his model - it is far far too slow to provide even one OOM upgrade. Renting/hacking your way to even one OOM more GPUs is also largely unrealistic (I run one of the larger GPU compute markets and talk to many suppliers, I have inside knowledge here).
Both scenarios (going big, in that you just use whole power-plant levels of energy, or going down, in that you improve the efficiency of chips) require changing semiconductor manufacturing, which is unlikely to be one of the first things a nascent AI does, unless it does successfully develop and deploy drexlerian nanotech.
Right, so I have arguments against drexlerian nanotech (Moore room at the bot...
Biology is incredibly efficient, and generally seems to be near pareto-optimal.
This seems really implausible. I'd like to see a debate about this. E.g. why can't I improve on heat by having super-cooled fluid pumped throughout my artificial brain; doesn't having no skull-size limit help a lot; doesn't metal help; doesn't it help to not have to worry about immune system stuff; doesn't it help to be able to maintain full neuroplasticity; etc.
Biology is incredibly efficient at certain things that happen at the cell level. To me, it seems like OP is extrapolating this observation rather too broadly. Human brains are quite inefficient at things they haven't faced selective pressure to be good at, like matrix multiplication.
Claiming that human brains are near Pareto-optimal efficiency for general intelligence seems like a huge stretch to me. Even assuming that's true, I'm much more worried about absolute levels of general intelligence rather than intelligence per Watt. Conventional nuclear bombs are dangerous even though they aren't anywhere near the efficiency of a theoretical antimatter bomb. AI "brains" need not be constrained by the size and energy constraints of a human brain.
Your instinct is right. The Landauer limit says that it takes at least kT·ln(2) of energy to erase 1 bit of information, which is necessary to run a function which outputs 1 bit (to erase the output bit). The important thing to note is that it scales with temperature (measured on an absolute scale). Human brains operate at 310 Kelvin. Ordinary chips can already operate down to around ~230 Kelvin, and there is even a recently developed chip which operates at ~0.02 Kelvin.
So human brains being near the thermodynamic limit in this case means very little about what sort of efficiencies are possible in practice.
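For concreteness, a quick sketch of the temperature scaling described above, using the three temperatures mentioned (just the kT·ln(2) formula; nothing else is assumed):

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

# Landauer limit: minimum energy to erase one bit is k_B * T * ln(2),
# so it scales linearly with absolute temperature.
for label, T in [("human brain", 310), ("cold silicon", 230), ("cryogenic chip", 0.02)]:
    print(f"{label:>14} ({T:g} K): {k_B * T * math.log(2):.3e} J/bit")
```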
Your point about skull-sizes [being bounded by childbirth death risk] seems very strong for evolutionary reasons, and to which I would also add the fact that bird brains seem to do similar amounts of cognition (to smallish mammals) in a much more compact volume without having substantially higher body temperatures (~315 Kelvin).
Cooling the computer doesn't let you get around the Landauer limit! The savings in energy you get by erasing bits at low temperature are offset by the energy you need to dissipate to keep your computer cold. (Erasing a bit at low temperature still generates some heat, and when you work out how much energy your refrigerator has to use to get rid of that heat, it turns out that you must dissipate the same amount as the Landauer limit says you'd have to if you just erased the bit at ambient temperatures.) To get real savings, you have to actually put your computer in an environment that is naturally colder. For example, if you could put a computer in deep space, that would work.
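A minimal numerical sketch of this point, assuming an ideal Carnot refrigerator rejecting heat to a ~310 K ambient bath:

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

def total_erasure_cost(T_cold, T_ambient=310.0):
    """Energy to erase one bit at T_cold, plus the ideal (Carnot) refrigerator
    work needed to pump that dissipated heat back out to the ambient bath."""
    q_cold = k_B * T_cold * math.log(2)                # heat dumped at the cold side
    w_fridge = q_cold * (T_ambient - T_cold) / T_cold  # Carnot work to reject it
    return q_cold + w_fridge

for T_cold in (310.0, 230.0, 4.0):
    print(f"T_cold = {T_cold:>5} K -> total {total_erasure_cost(T_cold):.3e} J/bit")
# All three print ~2.97e-21 J, i.e. k_B * 310 K * ln(2): cooling buys nothing
# against the Landauer bill when the ambient bath stays at body temperature.
```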
On the other hand, there might also be other good reasons to keep a computer cold, for example if you want to lower the voltage needed to represent a bit, then keeping your computer cold would plausibly help with that. It just won't reduce your Landauer-limit-imposed power bill.
None of this is to say that I agree with the rest of Jacob's analysis of thermodynamic efficiency, I believe he's made a couple of shaky assumptions and one actual mistake. Since this is getting a lot of attention, I might write a post on it.
Ordinary chips can already operate down to around ~230 Kelvin, and there is even a recently developed chip which operates at ~0.02 Kelvin.
In a room temp bath this always costs more energy - there is no free lunch in cooling. However in the depths of outer space this may become relevant.
I don't think heat dissipation is actually a limiting factor for humans as things stand right now. Looking at the heat dissipation capabilities of a human brain from three perspectives (maximum possible heat dissipation by sweat glands across the whole body, maximum actual amount of sustained power output by a human in practice, maximum heat transfer from the brain to arterial blood with current-human levels of arterial bloodflow), none of them look to me to be close to the ~20 W the human brain consumes.
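As a rough illustration of the arterial-blood perspective, here is a back-of-envelope version with my own assumed physiological figures (cerebral blood flow, blood heat capacity, allowed temperature rise), not the parent's exact numbers:

```python
# Assumptions: cerebral blood flow ~750 mL/min, blood density ~1.05 kg/L,
# blood specific heat ~3.6 kJ/(kg*K), allowed brain-blood temperature rise ~1 K.
flow_l_per_s = 0.75 / 60
mass_flow_kg_per_s = flow_l_per_s * 1.05
specific_heat = 3600.0   # J/(kg*K)
delta_T = 1.0            # K

heat_removed_w = mass_flow_kg_per_s * specific_heat * delta_T
print(f"~{heat_removed_w:.0f} W removable per 1 K of blood warming")  # ~47 W >> 20 W
```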
That's if you're counting the cerebellum, which doesn't seem to contribute much to intelligence, but is important for controlling the complicated musculature of a trunk and large body.
By cortical neuron count, humans have about 18 billion, while elephants have fewer than 6 billion, comparable to a chimpanzee. (source)
Elephants are undeniably intelligent as animals go, but not at human level.
Even blue whales barely approach human level by cortical neuron count, although some cetaceans (notably orcas) exceed it.
Except that is exactly what is happening. TSMC is approaching the limits of circuit miniaturization, and it is increasingly obvious that fully closing the (now not so large) gap with the brain will require more directly mimicking it through neuromorphic computing[2].
You mention a few times that you seem confident about Moore's law ending very soon. I am confused where this confidence comes from (though you might have looked into this more than I have).
In general, the transistor-density aspect of Moore's law always seemed pretty contingent to me. The economic pressures care about flops/$, not about transistor density, which has just historically been the best way to get flops/$. Also for forecasting AI dynamics, flops/$ seems like it matters a lot more, since in the near future AI seems unlikely to have to care much about transistor density, given that there are easily 10-20 OOMs of energy and materials to be used on earth's surface for some kind of semiconductor or neuromorphic compute production.
And in the space of flops/$, Moore's law seems to be going strong. The last report from AI Impacts I remember reading suggests things were going strong until at least 2020:
ht...
Jensen Huang/Nvidia is almost unarguably one of TSMC's most important clients and probably has some insight into/access to their roadmaps, and I don't particularly suspect he is lying when he claims Moore's Law is dead; it matches my own analysis of TSMC's public roadmap, as well as my analysis of the industry research/chatter/gossip/analysis. Moore's Law was a long recursive miniaturization optimization process which was always naturally destined to bottom out somewhat before new on-Moore's-law leading foundries cost sizable fractions of world GDP and features approach minimal sizes (well predicted in advance).
This obviously isn't the end of technological progress in computing! It's just the end of the easy era. Neuromorphic computing is much harder for comparatively small gains. Reversible computing seems almost impossibly difficult, such that many envision just jumping straight to quantum computing, which itself is no panacea and still very far off.
And this 2022 analysis suggests things were also going quite strong very recently,
As were chip clock frequencies under Dennard scaling, until that suddenly ended. I have uncertainty over how far we are from minimal viable switch energies, but it is not multiple OOM. There are more architectural tricks in the pipeline in the nature of lower precision tensorcores, but not many of those left either.
Want to take a bet? $1000, even odds.
I predict flops/$ to continue improving at between a factor of 2x every 2 years and 2x every 3 years. Happy to have someone else be a referee on whether it holds up.
[Edit: Actually, to avoid having to condition on a fast takeoff itself, let's say "improving faster than a factor of 2x every 3 years for the next 6 years"]
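For reference, the simple arithmetic of what those thresholds mean over the 6-year window:

```python
# What the proposed bet thresholds cash out to over the 6-year window.
years = 6
for doubling_time in (2, 3):
    factor = 2 ** (years / doubling_time)
    print(f"2x every {doubling_time} years -> {factor:.0f}x improvement in flops/$ over {years} years")
# 2x/2yr -> 8x, 2x/3yr -> 4x; the amended bet resolves on whether the
# realized improvement exceeds ~4x.
```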
I may be up for that but we need to first define 'flops', acceptable GPUs/products, how to calculate prices (preferably some standard rental price with power cost), and finally the bet implementation.
He pooh-poohed neural networks, and in fact actively bet against them in actions by hiring researchers trained in abstract math/philosophy, ignoring neuroscience and early DL, etc.
This seems to assume that those researchers were meant to work out how to create AI. But the goal of that research was rather to formalize and study some of the challenges in AI alignment in crisp language to make them as clear as possible. The intent was not to study the question of "how do we build AI" but rather "what would we want from an AI and what would prevent us from getting that, assuming that we could build one". That approach doesn't make any assumptions of how the AI would be built, it could be neural nets or anything else.
Eliezer makes that explicit in e.g. this SSC comment:
...MIRI doesn’t assume all AIs will be logical and I really need to write a long long screed about this at some point if I can stop myself from banging the keyboard so hard that the keys break. We worked on problems involving logic, because when you are confused about a *really big* thing, one of the ways to proceed is to try to list out all the really deep obstacles. And then, instead of the usual practice of trying to
EY's belief distribution about NNs and early DL from over a decade ago and how that reflects on his predictive track record has already been extensively litigated in other recent threads like here. I mostly agree that EY 2008 and later is somewhat cautious/circumspect about making explicitly future-disprovable predictions, but he surely did seem to exude skepticism which complements my interpretation of his actions.
That being said I also largely agree that MIRI's research path was chosen specifically to try and be more generic than any viable route to AGI. But one could consider that also as something of a failure or missed opportunity vs investing more in studying neural networks, the neuroscience of human alignment, etc.
But I've always said (perhaps not in public, but nonetheless) that I thought MIRI had a very small chance of success, but it was still a reasonable bet for at least one team to make, just in case the connectionists were all wrong about this DL thing.
Thus my model (or the systems/cybernetic model in general) correctly predicted - well in advance - that LLMs would have anthropomorphic cognition: mirroring much of our seemingly idiosyncratic cognitive biases, quirks, and limitations.
This seems false. LLMs are trained to predict, which often results in them mimicking certain kinds of human errors. Mimicking errors doesn't mean that the underlying cognition which produced those errors is similar.
This is exactly how we train modern large ANNs, and LLMs specifically: by training them on the internet, we are training them on human thoughts and thus (partially) distilling human minds.
In what sense is predicting internet text "training [LLMs] on human thoughts"? Human thoughts are causally upstream of some internet text, so learning to predict human thoughts is one way of being good at predicting, but it's certainly not the only one. More on this general point here.
One thing that I have observed, working with LLMs, is that when they're predicting the next token in a Python REPL they also make kinda similar mistakes to the ones that a human who wasn't paying that much attention would make. For example, consider the following:
>>> a, b = 3, 5 # input
>>> a + b # input
8 # input
>>> a, b = b, a # input
>>> a * b # input
15 # prediction (text-davinci-003, temperature=0, correct)
>>> a / b # input
1.0 # prediction (text-davinci-003, temperature=0, incorrect but understandable mistake)
>>> a # input
5 # prediction (text-davinci-003, temperature=0, correct)
>>> a / b # input
1.0 # prediction (text-davinci-003, temperature=0, incorrect but understandable mistake)
>>> a # input
5 # prediction (text-davinci-003, temperature=0, correct)
>>> a / b # input
1.0 # prediction (text-davinci-003, temperature=0, incorrect but understandable mistake)
>>> b # input
3 # prediction (text-davinci-003, temperature=0, correct)
... Even if every one of your object level objections is likely to be right, this wouldn't shift me much in terms of policies I think we should pursue because the downside risks from TAI are astronomically large even at small probabilities (unless you discount all future and non-human life to 0). I see Eliezer as making arguments about the worst ways things could go wrong and why it's not guaranteed that they won't go that way. We could get lucky, but we shouldn't count on luck, so even if Eliezer is wrong he's wrong in ways that, if we adopt policies that account for his arguments, better protect us from existential catastrophe at the cost of getting to TAI a few decades later, which is a small price to pay to offset very large risks that exist even at small probabilities.
I am reasonably sympathetic to this argument, and I agree that the difference between EY's p(doom) > 50% and my p(doom) of perhaps 5% to 10% doesn't obviously cash out into major policy differences.
I of course fully agree with EY/bostrom/others that AI is the dominant risk, we should be appropriately cautious, etc. This is more about why I find EY's specific classic doom argument to be uncompelling.
My own doom scenario is somewhat different and more subtle, but mostly beyond scope of this (fairly quick) summary essay.
This is interesting but would benefit from more citations for claims and fewer personal attacks on Eliezer.
I had the same impression at first, but in the areas where I most wanted these, I realized that Jacob linked to additional posts where he has defended specific claims at length.
Here is one example:
EY is just completely out of his depth here: he doesn't seem to understand how the Landauer limit actually works, doesn't seem to understand that synapses are analog MACs which minimally require OOMs more energy than simple binary switches, doesn't seem to understand that interconnect dominates energy usage regardless, etc.
I usually find Tyler Cowenesque (and heck, Yudkowskian) phrases like this irritating, and usually they're pretty hard to interrogate, but Jacob helpfully links to an entire factpost he wrote on this specific point, elaborating on this claim in detail.
He does something similar here:
...EY derived many of his negative beliefs about the human mind from the cognitive biases and ev psych literature, and especially Tooby and Cosmides' influential evolved modularity hypothesis. The primary competitor to evolved modularity was/is the universal learning hypothesis and associated scaling hypothesis, and there was already sufficient evidence to rule out evolved modularity back i
Humanity is generating and consuming enormous amounts of power - why is the power budget even relevant? And even if it was, energy for running brains ultimately comes from the Sun - if you include the agriculture energy chain, and "grade" the energy efficiency of brains by the amount of solar energy it ultimately takes to power a brain, AI definitely has the potential to be more efficient. And even if a single human brain is fairly efficient, the human civilization is clearly not. With AI, you can quickly scale up the amount of compute you use, whereas for humans, scaling beyond a single brain is very inefficient.
If the optimal AGI design running on GPUs takes about 10 GPUs and 10 kW to rival one human-brain power, and superintelligence which kills humanity a la the foom model requires 10 billion human brain power and thus 100 billion GPUs and a 100 terawatt power plant - that is just not something that is possible in any near term.
In EY's model there is supposedly 6 OOM improvement from nanotech, so you could get the 10 billion human brainpower with a much more feasible 100 MW power plant and 100 thousand GPUs ish (equivalent).
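A quick sanity check of the arithmetic in the two paragraphs above, taking the stated per-brain figures at face value:

```python
# Per-brain figures from above: 10 GPUs and 10 kW per human-brain-power.
brains = 10e9                    # 10 billion human brain power
gpus = brains * 10               # -> 1e11 GPUs ("100 billion")
power_w = brains * 10e3          # -> 1e14 W = 100 TW
print(f"no nanotech: {gpus:.0e} GPUs, {power_w/1e12:.0f} TW")

oom_gain = 1e6                   # the supposed ~6 OOM nanotech improvement
print(f"with 6 OOM:  {gpus/oom_gain:.0e} GPU-equivalents, {power_w/oom_gain/1e6:.0f} MW")
```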
You're assuming sublinear scaling. Why wouldn't it be superlinear post-training? It certainly seems like it is now. It need not be sharply superlinear, like Yud expected, to still be superlinear.
You will likely die, but probably not because of a nanotech holocaust initiated by a god-like machine superintelligence.
This I agree with and always assumed, but it is also largely irrelevant if the end conclusion is that AGI still destroys us all. To most people, I'd say, the specific method of death doesn't matter as much as the substance. It's a special kind of academic argument, one where we can endlessly debate how precisely the end will come about through making this thing, while we all mostly agree that this thing we are making, and that we could stop making, will likely end us all. Sane people (and civilizations) just... don't make the deadly thing.
I haven't gone through the numbers yet, so I'll give it a try, but out of the box it feels to me like your arguments about biology's computational efficiency aren't the end of it. I actually mentioned the topic as one possible point of interest here: https://www.lesswrong.com/posts/76n4pMcoDBTdXHTLY/ideas-for-studies-on-agi-risk. My impression is that biology can come up with some spectacularly efficient trade-offs, but that's only within the rules of biology. For example, biology can produce very fast animals with very good legs, b...
There are some other assumptions that go into Eliezer's model that are required for doom. I can think of one very clearly which is:
5. The transition to that god-AGI will be so quick that other entities won't have time to also reach superhuman capabilities. There are no "intermediate" AGIs that can be used to work on alignment-related problems or even as a defence from unaligned AGIs.
This is the first contra AI doom case I've read which felt like it was addressing some of the core questions, rather than nitpicking on some irrelevant point, or just completely failing to understand the AI doom argument.
So, whilst I still think some of your points need fleshing out/further arguments, thank you very much for this post!
The little pockets of cognitive science that I've geeked out about - usually in the predictive processing camp - have featured researchers who are usually quite surprised by or are going to great lengths to double underline the importance of language and culture in our embodied / extended / enacted cognition.
A simple version of the story I have in my head is this: We have physical brains thanks to evolution, and then by being an embodied predictive perception/action loop out in the world, we started transforming our world into affordances for new perceptio...
Actually I think the shoggoth mask framing is somewhat correct, but it also applies to humans. We don't have a single fixed personality, we are also mask-wearers.
Hm, neuron impulses travel at around 200 m/s, electric signals travel at around 2e8 m/s, so I think electronics have an advantage there. (I agree that you may have a point with "That Alien Mindspace".)
I agree that the human brain is roughly at a local optimum. But think about what could be done just with adding a fiber optic connection between two brains (I think there are some ethical issues here so this is a thought experiment, not something I recommend). The two brains could be a kilometer apart, and the signal between them on the fiber optic link takes less time than a signal takes to get from one side to the other of a regular brain. So these two brains could think together (probably with some (a lot?) neural rewiring) as fast as a regular brain thinks individually. Repeat with some more brains.
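A rough latency comparison for this thought experiment, using the ~200 m/s conduction speed mentioned upthread and an assumed ~0.15 m path across a single brain:

```python
# Signal travel times: across one brain at neural conduction speed,
# versus 1 km of fiber at roughly the speed of light in glass.
within_brain_s = 0.15 / 200        # ~0.75 ms across one brain
fiber_1km_s = 1000 / 2e8           # ~5 microseconds over 1 km of fiber
print(f"across one brain: {within_brain_s*1e3:.2f} ms")
print(f"1 km of fiber:    {fiber_1km_s*1e6:.1f} us")
```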
Or imagine if myelination was under conscious control. If you need to learn a new language, demyelinate the right parts of the brain, learn the language quickly, and then remyelinate it.
So I think even without changing things much neurons could be used in ways that provide faster thinking and faster learning.
As for energy efficiency, there is no reason that a superintelligence has to be limited to the approximately 20 watts that a human brain has access to. Gaming computers can have 1000 W power supplies, which is 50 times more power. I think 50 brains thinking together really ...
"These GPUs cost $1M and use 10x the energy of a human for the same work" is still a pretty bad deal for any workers that have to compete with that. And I don't expect economic gains to go to displaced workers.
Even if an AI is more expensive per computational capacity than humans, it being much faster and immortal would still be a threat. I could imagine a single immortal human genius becoming world-emperor eventually. Now imagine them operating 10^6 or even 10^3 faster than ordinary humans.
This post raised some interesting points, and stimulated a bunch of interesting discussion in the comments. I updated a little bit away from foom-like scenarios and towards slow-takeoff scenarios. Thanks. For that, I'd like to upvote this post.
On the other hand: I think direct/non-polite/uncompromising argumentation against other arguments, models, or beliefs is (usually) fine and good. And I think it's especially important to counter-argue possible inaccuracies in key models that lots of people have about AI/ML/alignment. However, in many places, the post...
A very naive question for Jacob. A few years ago the fact that bird brains are about 10x more computationally dense than human brains was mentioned on SlateStarCodex and by Diana Fleischman. This is something I would not expect to be true if there were not some significant "room at the bottom."
Is this false? Does this not imply what I think it should? Am I just wrong in thinking this is of any relevance?
https://slatestarcodex.com/2019/03/25/neurons-and-intelligence-a-birdbrained-perspective/
I don't understand the physics, so this is just me not...
If we look at the game of Go, AI managed to be vastly better than humans. An AI that can outcompete humans at any task the way that AlphaGo can outcompete humans at Go is a serious problem even if it's not capable of directly figuring out how to build nanobots.
This is exactly how we train modern large ANNs, and LLMs specifically: by training them on the internet, we are training them on human thoughts and thus (partially) distilling human minds.
While it's true that currently most of the training data we put into LLMs seems human-created, I don't th...
Thus my model (or the systems/cybernetic model in general) correctly predicted - well in advance - that LLMs would have anthropomorphic cognition: mirroring much of our seemingly idiosyncratic cognitive biases, quirks, and limitations. Thus we have AGI that can write poems and code (like humans) but struggles with multiplying numbers (like humans), generally exhibits human like psychology, is susceptible to flattery, priming, the Jungian "shadow self" effect, etc.
Is this comment the best example of your model predicting anthropomorphic cognition? I r...
The universal learning/scaling model was largely correct - as tested by openAI scaling up GPT to proto-AGI.
I don't understand how OpenAIs success at scaling GPT proves the universal learning model. Couldn't there be an as yet undiscovered algorithm for intelligence that is more efficient?
Humans suck at arithmetic. Really suck. From a comparison of current GPUs to a human trying and failing to multiply 10-digit numbers in their head, we can conclude that something about humans, hardware or software, is incredibly inefficient.
Almost all humans have roughly the same sized brain.
So even if Einstein's brain was operating at 100% efficiency, the brain of the average human is operating at a lot less.
ie intelligence is easy - it just takes enormous amounts of compute for training.
Making a technology work at all is generally easier than m...
mirroring much of our seemingly idiosyncratic cognitive biases, quirks, and limitations.
True.
They also have a big pile of their own new idiosyncratic quirks.
https://www.lesswrong.com/posts/aPeJE8bSo6rAFoLqg/solidgoldmagikarp-plus-prompt-generation
These are bizarre behaviour patterns that don't resemble any humans.
This looks less like a human, and more like a very realistic painted statue. It looks like a human, complete with painted on warts, but scratch the paint, and the inhuman nature shows through.
...The width of mindspace is compl
then we should not expect Moore's law to end with brains still having a non-trivial thermodynamic efficiency advantage over digital computers. Except that is exactly what is happening. TSMC is approaching the limits of circuit miniaturization, and it is increasingly obvious that fully closing the (now not so large) gap with the brain will require more directly mimicking it through neuromorphic computing[2].
This is a clear error.
There is no particular reason to expect TSMC to taper off at a point anywhere near the theoretical limits.
A closely anal...
I don't have much to contribute on AI risk, but I do want to say +1 for the gutsy title. It's not often you see the equivalent of "Contra The Founding Mission of an Entire Community".
Eliezer Yudkowsky predicts doom from AI: that humanity faces likely extinction in the near future (years or decades) from a rogue unaligned superintelligent AI system. Moreover he predicts that this is the default outcome, and AI alignment is so incredibly difficult that even he failed to solve it.
EY is an entertaining and skilled writer, but do not confuse rhetorical writing talent for depth and breadth of technical knowledge. I do not have EY's talents there, or Scott Alexander's poetic powers of prose. My skill points instead have gone near exclusively towards extensive study of neuroscience, deep learning, and graphics/GPU programming. More than most, I actually have the depth and breadth of technical knowledge necessary to evaluate these claims in detail.
I have evaluated this model in detail and found it substantially incorrect and in fact brazenly naively overconfident.
Intro
Even though the central prediction of the doom model is necessarily un-observable for anthropic reasons, alternative models (such as my own, or moravec's, or hanson's) have already made substantially better predictions, such that EY's doom model has low posterior probability.
EY has espoused this doom model for over a decade, and hasn't updated it much from what I can tell. Here is the classic doom model as I understand it, starting first with key background assumptions/claims:
Brain inefficiency: The human brain is inefficient in multiple dimensions/ways/metrics that translate into intelligence per dollar; inefficient as a hardware platform in key metrics such as thermodynamic efficiency.
Mind inefficiency or human incompetence: In terms of software he describes the brain as an inefficient complex "kludgy mess of spaghetti-code". He derived these insights from the influential evolved modularity hypothesis as popularized in ev psych by Tooby and Cosmides. He pooh-poohed neural networks, and in fact actively bet against them in actions by hiring researchers trained in abstract math/philosophy, ignoring neuroscience and early DL, etc.
More room at the bottom: Naturally dovetailing with points 1 and 2, EY confidently predicts there is enormous room for further software and hardware improvement, the latter especially through strong drexlerian nanotech.
That Alien mindspace: EY claims human mindspace is an incredibly narrow twisty complex target to hit, whereas the space of AI mindspace is vast, and AI designs will be something like random rolls from this vast alien landscape resulting in an incredibly low probability of hitting the narrow human target.
Doom naturally follows from these assumptions: Sometime in the near future some team discovers the hidden keys of intelligence and creates a human-level AGI which then rewrites its own source code, initiating a self improvement recursion cascade which ultimately increases the AGI's computational efficiency (intelligence/$, intelligence/J, etc) by many OOM to far surpass human brains, which then quickly results in the AGI developing strong nanotech and killing all humans within a matter of days or even hours.
If assumptions 1 and 2 don't hold (relative to 3) then there is little to no room for recursive self improvement. If assumption 4 is completely wrong then the default outcome is not doom regardless.
Every one of his key assumptions is mostly wrong, as I and others predicted well in advance. EY seems to have been systematically overconfident as an early futurist, and then perhaps updated later to avoid specific predictions, but without much updating his mental models (specifically his nanotech-woo model, as we will see).
Brain Hardware Efficiency
EY correctly recognizes that thermodynamic efficiency is a key metric for computation/intelligence, and he confidently, brazenly claims (as of late 2021), that the brain is not that efficient and about 6 OOM from thermodynamic limits:
EY is just completely out of his depth here: he doesn't seem to understand how the Landauer limit actually works, doesn't seem to understand that synapses are analog MACs which minimally require OOMs more energy than simple binary switches, doesn't seem to have a good model of the interconnect requirements, etc.
Some attempt to defend EY by invoking reversible computing, but EY explicitly states that ATP synthase may be close to 100% thermodynamically efficient, and explicitly links the end result of extreme inefficiency to the specific cause of pumping "thousands of ions in and out of each stretch of axon and dendrite" - which would be irrelevant when comparing to some exotic reversible superconducting or optical computer. That he doesn't mention reversible computing, together with the hint that "biology is simply not that efficient", helps establish that we are both discussing conventional irreversible computation: not exotic reversible or quantum computing (neither of which is practical in the near future, or relevant for the nanotech he envisions, which is fundamentally robotic and thus constrained by the efficiency of applying energy to irreversibly transform matter). He seems to believe biology is inefficient even given the practical constraints it is working with, not merely inefficient compared to all possible future hypothetical exotic computing platforms considered without regard for other tradeoffs. Finally, if he actually believes (as I do) that brains are efficient within the constraints of conventional irreversible computation, this would in fact substantially weaken his larger argument - and EY is not the kind of writer who weakens his own arguments.
In actuality biology is incredibly thermodynamically efficient, and generally seems to be near pareto-optimal in that regard at the cellular nanobot level, but we'll get back to that.
In a 30 year human "training run" the brain uses somewhere between 1e23 and 1e25 flops. ANNs trained with this amount of compute already capture much - but not all - of human intelligence. One likely reason is that flops is not the only metric of relevance, and a human brain training run also uses 1e23 to 1e25 bytes of memops, which is still OOM more than the likely largest ANN training run to date (GPT4) - because GPUs have a 2 or 3 OOM gap between flops and memops.
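For transparency, here is where the 1e23-1e25 range comes from; the per-second brain compute estimates are the assumption doing all the work:

```python
# 30 years of wall-clock time at an assumed 1e14-1e16 flop/s of brain compute.
seconds_30_years = 30 * 365.25 * 24 * 3600        # ~9.5e8 s
for brain_flops_per_s in (1e14, 1e16):
    total = brain_flops_per_s * seconds_30_years
    print(f"{brain_flops_per_s:.0e} flop/s over 30 years -> {total:.1e} flops")
```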
My model instead predicts that AGI will require GPT4 -ish levels of training compute, and SI will require far more. To the extent that recursive self-improvement is actually a thing in the NN paradigm, it's something that NNs mostly just do automatically (and something the brain currently still does better than ANNs).
Mind Software Efficiency
EY derived many of his negative beliefs about the human mind from the cognitive biases and ev psych literature, and especially Tooby and Cosmides' influential evolved modularity hypothesis. The primary competitor to evolved modularity was/is the universal learning hypothesis and associated scaling hypothesis, and there was already sufficient evidence to rule out evolved modularity back in 2015 or earlier.
Let's quickly assess the predictions of evolved modularity vs universal learning/scaling. Evolved modularity posits that the brain is a kludgy mess of domain specific evolved mechanisms ("spaghetti code" in EY's words), and thus AGI will probably not come from brain reverse engineering. Human intelligence is exceptional because evolution figured out some "core to generality" that prior primate brains don't have, but humans have only the minimal early version of this, and there is likely huge room for further improvement.
The universal learning/scaling model instead posits that there is a single obvious algorithmic signature for intelligence (approx Bayesian inference), it isn't that hard to figure out, evolution found it multiple times, human DL researchers also figured much of it out in the 90s - ie intelligence is easy - it just takes enormous amounts of compute for training. As long as you don't shoot yourself in the foot - as long as your architectural prior is flexible enough (ex transformers), as long as your approx to Bayesian inference actually converges correctly (normalization etc) etc - then the amount of intelligence you get is proportional to the net compute spent on training. The human brain isn't exceptional - it's just a scaled-up primate brain, but scaling up the net training compute by 10x (3x from larger brain, 3x from extended neoteny, and some from arch/hyperparam tweaking) was enough for linguistic intelligence and the concomitant turing transition to emerge[1]. EY hates the word emergence, but intelligence is an emergent phenomenon.
The universal learning/scaling model was largely correct - as tested by openAI scaling up GPT to proto-AGI.
That does not mean we are on the final scaling curve. The brain is of course strong evidence of other scaling choices that look different from Chinchilla scaling. A human brain's natural 'clock rate' of about 100 Hz supports a thoughtspeed of about 10 tokens per second, or only about 10 billion tokens per lifetime training run. GPT3 trained on about 50x a lifetime's worth of experience/data, and GPT4 may have trained on 1000 human lifetimes of experience/data. You can spend roughly the same compute budget training a huge brain-sized model for just one human lifetime, or spend it on a 100x smaller model trained for far longer. You don't end up in exactly the same space of course - GPT4 has far more crystallized knowledge than any one human, but seems to still lack much of a human domain expert's fluid intelligence capabilities.
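Reproducing that arithmetic (the GPT3/GPT4 token counts below are rough public estimates, treated here as assumptions):

```python
# Lifetime-token arithmetic: 10 tokens/s over a ~30 year "training run".
seconds_per_lifetime = 30 * 365.25 * 24 * 3600       # ~9.5e8 s
human_tokens = 10 * seconds_per_lifetime             # ~1e10 tokens at 10 tok/s
gpt3_tokens = 0.5e12                                  # ~0.3-0.5T tokens (assumed)
gpt4_tokens = 10e12                                   # ~10T tokens (assumed/rumored)
print(f"human lifetime: {human_tokens:.1e} tokens")
print(f"GPT3: ~{gpt3_tokens/human_tokens:.0f} lifetimes, GPT4: ~{gpt4_tokens/human_tokens:.0f} lifetimes")
```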
Moore room at the Bottom
If the brain really is ~6 OOM from thermodynamic efficiency limits, then we should not expect Moore's law to end with brains still having a non-trivial thermodynamic efficiency advantage over digital computers. Except that is exactly what is happening. TSMC is approaching the limits of circuit miniaturization, and it is increasingly obvious that fully closing the (now not so large) gap with the brain will require more directly mimicking it through neuromorphic computing[2].
Biological cells operate directly at thermodynamic efficiency limits: they copy DNA using near minimal energy, and in general they perform robotics tasks of rearranging matter using near minimal energy. For nanotech replicators (and nanorobots in general) like biological cells thermodynamic efficiency is the dominant constraint, and biology is already pareto optimal there. No SI will ever create strong nanotech that significantly improves on the thermodynamic efficiency of biology - unless/until they can rewrite the laws of physics.
Of course an AGI could still kill much of humanity using advanced biotech weapons - ex a supervirus - but that is beyond the scope of EY's specific model, and for various reasons mostly stemming from the strong prior that biology is super efficient I expect humanity to be very difficult to kill in this way (and growing harder to kill every year as we advance prosaic AI tech). Also killing humanity would likely not be in the best interests of even unaligned AGI, because humans will probably continue to be key components of the economy (as highly efficient general purpose robots) long after AGI running in datacenters takes most higher paying intellectual jobs. So instead I expect unaligned power-seeking AGIs to adopt much more covert strategies for world domination.
That Alien Mindspace
In the "design space of minds in general" EY says:
Quintin Pope has already written out a well argued critique of this alien mindspace meme from the DL perspective, and I already criticized this meme once when it was fresh over a decade ago. So today I will instead take a somewhat different approach (an updated elaboration of my original critique).
Imagine we have some set of mysterious NNs, which we'd like to replicate, but we only have black box access. By that I mean we have many many examples of the likely partial inputs and outputs of these networks, and some ideas about the architecture, but we don't have any direct access to the weights.
It turns out there is a simple and surprisingly successful technique which one can use to create an arbitrary partial emulation of any ensemble of NNs: distillation. In essence distillation is simply the process of training one NN on the collected inputs/outputs of other NNs, such that it learns to emulate them.
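A minimal sketch of what distillation looks like in code (an illustrative toy setup, not how any particular LLM was actually trained): the student only ever sees the teacher's inputs and outputs, never its weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy placeholder architectures; the "teacher" stands in for the black-box ensemble.
teacher = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)).eval()
student = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(1000):
    x = torch.randn(128, 32)                      # stand-in for collected inputs
    with torch.no_grad():
        teacher_logits = teacher(x)               # black-box access: outputs only
    # The student is trained to match the teacher's output distribution (soft targets).
    loss = F.kl_div(F.log_softmax(student(x), dim=-1),
                    F.softmax(teacher_logits, dim=-1),
                    reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()
```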
This is exactly how we train modern large ANNs, and LLMs specifically: by training them on the internet, we are training them on human thoughts and thus (partially) distilling human minds.
Thus my model (or the systems/cybernetic model in general) correctly predicted - well in advance - that LLMs would have anthropomorphic cognition: mirroring much of our seemingly idiosyncratic cognitive biases, quirks, and limitations. Thus we have AGI that can write poems and code (like humans) but struggles with multiplying numbers (like humans), generally exhibits human like psychology, is susceptible to flattery, priming, the Jungian "shadow self" effect, etc. There is a large growing pile of specific evidence that LLMs are distilling/simulating human minds, some of which I and others have collected in prior posts, but the strength of this argument should already clearly establish a strong prior expectation that distillation should be the default outcome.
The width of mindspace is completely irrelevant. Moravec, myself and the other systems-thinkers were correct: AI is and will be our mind children; the technosphere extends the noosphere.
This alone does not strongly entail that AGI will be aligned by default, but it does defeat EY's argument that AGI will be unaligned by default (and he loses many Bayes points that I gain).
The Risk Which Remains
To be clear, I am not arguing that AGI is not a threat. It is rather obviously the pivotal eschatonic event, the closing chapter in human history. Of course 'it' is dangerous, for we are dangerous. But that does not mean that 1.) extinction is the most likely outcome, or 2.) that alignment is intrinsically more difficult than AGI, or 3.) that EY's specific arguments are the especially relevant and correct way to arrive at any such conclusions.
You will likely die, but probably not because of a nanotech holocaust initiated by a god-like machine superintelligence. Instead you will probably die when you simply can no longer afford the tech required to continue living. If AI does end up causing humanity's extinction, it will probably be the result of a slow more prosaic process of gradually out-competing us economically. AGI is not inherently mortal and can afford patience, unlike us mere humans.
The turing transition: brains evolved linguistic symbolic communication which permits compressing and sharing thoughts across brains, forming a new layer of networked social computational organization and allowing minds to emerge as software entities. This is a one time transition, as there is nothing more universal/general than a turing machine. ↩︎
Neuromorphic computing is the main eventual long-term threat to current GPUs/accelerators and a continuation of the trend of embedding efficient matrix ops into the hardware, but it is unlikely to completely replace them for various reasons, and I don't expect it to be very viable until traditional moore's law has mostly ended. ↩︎