Review

Eliezer Yudkowsky predicts doom from AI: that humanity faces likely extinction in the near future (years or decades) from a rogue, unaligned, superintelligent AI system. Moreover, he predicts that this is the default outcome, and that AI alignment is so incredibly difficult that even he failed to solve it.

EY is an entertaining and skilled writer, but do not confuse rhetorical writing talent with depth and breadth of technical knowledge. I do not have EY's talents there, or Scott Alexander's poetic powers of prose. My skill points have instead gone nearly exclusively towards extensive study of neuroscience, deep learning, and graphics/GPU programming. More than most, I actually have the depth and breadth of technical knowledge necessary to evaluate these claims in detail.

I have evaluated this model in detail and found it substantially incorrect and, in fact, brazenly, naively overconfident.

Intro

Even though the central prediction of the doom model is necessarily unobservable for anthropic reasons, alternative models (such as my own, or Moravec's, or Hanson's) have already made substantially better predictions, such that EY's doom model has low posterior probability.

EY has espoused this doom model for over a decade, and hasn't updated it much from what I can tell. Here is the classic doom model as I understand it, starting first with key background assumptions/claims:

  1. Brain inefficiency: The human brain is inefficient in multiple dimensions/ways/metrics that translate into intelligence per dollar; inefficient as a hardware platform in key metrics such as thermodynamic efficiency.

  2. Mind inefficiency or human incompetence: In terms of software he describes the brain as an inefficient complex "kludgy mess of spaghetti-code". He derived these insights from the influential evolved modularity hypothesis as popularized in ev psych by Tooby and Cosmides. He pooh-poohed neural networks, and in fact actively bet against them through his actions: hiring researchers trained in abstract math/philosophy, ignoring neuroscience and early DL, etc.

  3. More room at the bottom: Naturally dovetailing with points 1 and 2, EY confidently predicts there is enormous room for further software and hardware improvement, the latter especially through strong drexlerian nanotech.

  4. That Alien mindspace: EY claims human mindspace is an incredibly narrow, twisty, complex target to hit, whereas the space of possible AI minds is vast, and AI designs will be something like random rolls from this vast alien landscape, resulting in an incredibly low probability of hitting the narrow human target.

Doom naturally follows from these assumptions: Sometime in the near future some team discovers the hidden keys of intelligence and creates a human-level AGI which then rewrites its own source code, initiating a self improvement recursion cascade which ultimately increases the AGI's computational efficiency (intelligence/$, intelligence/J, etc) by many OOM to far surpass human brains, which then quickly results in the AGI developing strong nanotech and killing all humans within a matter of days or even hours.

If assumptions 1 and 2 don't hold (relative to 3) then there is little to no room for recursive self improvement. If assumption 4 is completely wrong then the default outcome is not doom regardless.

Every one of his key assumptions is mostly wrong, as I and others predicted well in advance. EY seems to have been systematically overconfident as an early futurist, and then perhaps updated later to avoid specific predictions, but without updating his mental models much (specifically his nanotech-woo model, as we will see).

Brain Hardware Efficiency

EY correctly recognizes that thermodynamic efficiency is a key metric for computation/intelligence, and he confidently, brazenly claims (as of late 2021) that the brain is not that efficient and is about 6 OOM from thermodynamic limits:

Which brings me to the second line of very obvious-seeming reasoning that converges upon the same conclusion - that it is in principle possible to build an AGI much more computationally efficient than a human brain - namely that biology is simply not that efficient, and especially when it comes to huge complicated things that it has started doing relatively recently.

ATP synthase may be close to 100% thermodynamically efficient, but ATP synthase is literally over 1.5 billion years old and a core bottleneck on all biological metabolism. Brains have to pump thousands of ions in and out of each stretch of axon and dendrite, in order to restore their ability to fire another fast neural spike. The result is that the brain's computation is something like half a million times less efficient than the thermodynamic limit for its temperature - so around two millionths as efficient as ATP synthase. And neurons are a hell of a lot older than the biological software for general intelligence!

The software for a human brain is not going to be 100% efficient compared to the theoretical maximum, nor 10% efficient, nor 1% efficient, even before taking into account the whole thing with parallelism vs. serialism, precision vs. imprecision, or similarly clear low-level differences.

EY is just completely out of his depth here: he doesn't seem to understand how the Landauer limit actually works, doesn't seem to understand that synapses are analog MACs (multiply-accumulate operations) which minimally require OOMs more energy than simple binary switches, doesn't seem to have a good model of the interconnect requirements, etc.

Some attempt to defend EY by invoking reversible computing, but EY explicitly states that ATP synthase may be close to 100% thermodynamically efficient, and explicitly links the end result of extreme inefficiency to the specific cause of pumping "thousands of ions in and out of each stretch of axon and dendrite" - which would be irrelevant when comparing to some exotic reversible superconducting or optical computer. That he doesn't mention reversible computing, together with the hint that "biology is simply not that efficient", establishes that we are both discussing conventional irreversible computation: not exotic reversible or quantum computing (neither of which is practical in the near future, or relevant for the nanotech he envisions, which is fundamentally robotic and thus constrained by the efficiency of applying energy to irreversibly transform matter). He seems to believe biology is inefficient even given the practical constraints it is working with, not merely inefficient compared to all possible future hypothetical exotic computing platforms without consideration for other tradeoffs. Finally, if he actually believed (as I do) that brains are efficient within the constraints of conventional irreversible computation, this would substantially weaken his larger argument - and EY is not the kind of writer who weakens his own arguments.

In actuality biology is incredibly thermodynamically efficient, and generally seems to be near pareto-optimal in that regard at the cellular nanobot level, but we'll get back to that.

In a 30 year human "training run" the brain uses somewhere between 1e23 and 1e25 flops. ANNs trained with this amount of compute already capture much - but not all - of human intelligence. One likely reason is that flops is not the only metric of relevance: a human brain training run also uses 1e23 to 1e25 bytes of memops, which is still OOMs more than the likely largest ANN training run to date (GPT4) - because GPUs have a 2 or 3 OOM gap between flops and memops.
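
For concreteness, here is a back-of-envelope version of that estimate; the synapse count, mean firing rate, and the flop/byte accounting per synaptic event are rough ballpark assumptions, not measurements:

```python
# Back-of-envelope sketch of a 30-year human "training run" (all inputs are rough assumptions).
seconds = 30 * 3.15e7            # ~9.5e8 seconds in 30 years
synapses = 1e14                  # ~100 trillion synapses (commonly cited ballpark)
mean_rate_hz = 1.0               # average synaptic event rate; plausibly ~0.1-10 Hz

synaptic_events = synapses * mean_rate_hz * seconds
print(f"synaptic events over 30 years: ~{synaptic_events:.0e}")   # ~1e23

# Counting each event as ~1-100 flop-equivalents spans the 1e23-1e25 flops quoted above;
# counting ~1-100 bytes of traffic per event gives a similar 1e23+ bytes of memops.
```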

My model instead predicts that AGI will require GPT4-ish levels of training compute, and SI will require far more. To the extent that recursive self-improvement is actually a thing in the NN paradigm, it's something that NNs mostly just do automatically (and something the brain currently still does better than ANNs).

Mind Software Efficiency

EY derived many of his negative beliefs about the human mind from the cognitive biases and ev psych literature, and especially Tooby and Cosmides' influential evolved modularity hypothesis. The primary competitor to evolved modularity was/is the universal learning hypothesis and associated scaling hypothesis, and there was already sufficient evidence to rule out evolved modularity back in 2015 or earlier.

Let's quickly assess the predictions of evolved modularity vs universal learning/scaling. Evolved modularity posits that the brain is a kludgy mess of domain-specific evolved mechanisms ("spaghetti code" in EY's words), and thus AGI will probably not come from brain reverse engineering. On this view, human intelligence is exceptional because evolution figured out some "core to generality" that prior primate brains don't have, but humans have only the minimal early version of this, and there is likely huge room for further improvement.

The universal learning/scaling model instead posits that there is a single obvious algorithmic signature for intelligence (approx bayesian inference), it isn't that hard to figure out, evolution found it multiple times, and human DL researchers also figured much of it out in the 90s - i.e. intelligence is easy - it just takes enormous amounts of compute for training. As long as you don't shoot yourself in the foot - as long as your architectural prior is flexible enough (e.g. transformers), as long as your approximation to bayesian inference actually converges correctly (normalization etc.) - then the amount of intelligence you get is proportional to the net compute spent on training. The human brain isn't exceptional - it's just a scaled-up primate brain, but scaling up the net training compute by 10x (3x from a larger brain, 3x from extended neoteny, and some from arch/hyperparam tweaking) was enough for linguistic intelligence and the concomitant turing transition to emerge[1]. EY hates the word emergence, but intelligence is an emergent phenomenon.
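
To make the "intelligence proportional to training compute" claim concrete, the usual operationalization is an empirical scaling law for loss; here is a minimal sketch using the parametric form and approximate fitted constants from Hoffmann et al. (2022) (the constants are from that paper, not from this post, and are purely illustrative):

```python
# Chinchilla-style parametric scaling law: loss as a function of parameters N and tokens D.
def scaling_loss(N: float, D: float) -> float:
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28   # approximate published fits
    return E + A / N**alpha + B / D**beta

# Loss falls smoothly as both model size and data grow - no special sauce, just more compute.
for N, D in [(1e9, 2e10), (7e10, 1.4e12), (1e12, 2e13)]:
    print(f"N={N:.0e} params, D={D:.0e} tokens -> predicted loss {scaling_loss(N, D):.2f}")
```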

The universal learning/scaling model was largely correct - as tested by OpenAI scaling up GPT to proto-AGI.

That does not mean we are on the final scaling curve. The brain is of course strong evidence of other scaling choices that look different from chinchilla scaling. A human brain's natural 'clock rate' of about 100hz supports a thoughtspeed of about 10 tokens per second, or only about 10 billion tokens per lifetime training run. GPT3 trained on about 50 human lifetimes' worth of experience/data, and GPT4 may have trained on 1000 human lifetimes of experience/data. You can spend roughly the same compute budget training a huge brain-sized model for just one human lifetime, or spend it on a 100x smaller model trained for far longer. You don't end up in exactly the same space of course - GPT4 has far more crystallized knowledge than any one human, but seems to still lack much of a human domain expert's fluid intelligence capabilities.
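
A quick sketch of that tradeoff using the standard dense-transformer approximation C ~ 6*N*D; the brain-scale "parameter" count and lifetime token count are the rough assumptions used above, while the GPT-3 figures are its published parameter and token counts:

```python
# Same-ballpark compute, very different splits between model size and training tokens.
def train_compute(params: float, tokens: float) -> float:
    return 6 * params * tokens          # standard dense-transformer estimate: C ~ 6*N*D

brain_like = train_compute(1e14, 1e10)      # ~1e14 synapse-"weights", ~10B lifetime tokens (assumed)
gpt3_like = train_compute(1.75e11, 3e11)    # GPT-3: 175B params, ~300B training tokens

print(f"brain-like lifetime run: ~{brain_like:.0e} flops")   # ~6e24
print(f"GPT-3-style run:         ~{gpt3_like:.0e} flops")    # ~3e23
# Both land inside the 1e23-1e25 window above, despite radically different model-size vs data splits.
```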

Moore Room at the Bottom

If the brain really is ~6 OOM from thermodynamic efficiency limits, then we should not expect moore's law to end with brains still having a non-trivial thermodynamic efficiency advantage over digital computers. Except that is exactly what is happening. TSMC is approaching the limits of circuit miniaturization, and it is increasingly obvious that fully closing the (now not so large) gap with the brain will require more directly mimicking it through neuromorphic computing[2].

Biological cells operate directly at thermodynamic efficiency limits: they copy DNA using near minimal energy, and in general they perform robotics tasks of rearranging matter using near minimal energy. For nanotech replicators (and nanorobots in general) like biological cells thermodynamic efficiency is the dominant constraint, and biology is already pareto optimal there. No SI will ever create strong nanotech that significantly improves on the thermodynamic efficiency of biology - unless/until they can rewrite the laws of physics.
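
A crude sketch of the DNA-copying part of that claim; the bond-energy figure and the 2-bits-per-base accounting are textbook ballpark assumptions, and proofreading/packaging overhead is ignored:

```python
import math

kB, T = 1.38e-23, 310.0                          # Boltzmann constant (J/K), body temperature (K)
landauer_per_bit = kB * T * math.log(2)          # ~3e-21 J

# Incorporating one nucleotide consumes roughly two high-energy phosphate bonds
# (dNTP -> dNMP + PPi, then PPi hydrolysis), each ~30-50 kJ/mol; take ~40 kJ/mol.
energy_per_base = 2 * 4e4 / 6.022e23             # ~1.3e-19 J per base pair copied
bits_per_base = 2                                # 4-letter alphabet

ratio = (energy_per_base / bits_per_base) / landauer_per_bit
print(f"Landauer bound at 310 K: {landauer_per_bit:.1e} J/bit")
print(f"DNA copying: ~{energy_per_base / bits_per_base:.1e} J/bit, i.e. ~{ratio:.0f}x the bound")
# Under these crude assumptions DNA copying sits within ~1-2 OOM of the Landauer bound,
# and reliable copying against thermal noise plausibly requires a multiple of the bare bound anyway.
```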

Of course an AGI could still kill much of humanity using advanced biotech weapons - e.g. a supervirus - but that is beyond the scope of EY's specific model, and for various reasons mostly stemming from the strong prior that biology is super efficient I expect humanity to be very difficult to kill in this way (and growing harder to kill every year as we advance prosaic AI tech). Also killing humanity would likely not be in the best interests of even unaligned AGI, because humans will probably continue to be key components of the economy (as highly efficient general purpose robots) long after AGI running in datacenters takes most higher-paying intellectual jobs. So instead I expect unaligned power-seeking AGIs to adopt much more covert strategies for world domination.

That Alien Mindspace

In the "design space of minds in general" EY says:

Any two AI designs might be less similar to each other than you are to a petunia.

Quintin Pope has already written out a well argued critique of this alien mindspace meme from the DL perspective, and I already criticized this meme once when it was fresh over a decade ago. So today I will instead take a somewhat different approach (an updated elaboration of my original critique).

Imagine we have some set of mysterious NNs which we'd like to replicate, but we only have black box access. By that I mean we have many, many examples of partial inputs and outputs of these networks, and some ideas about the architecture, but we don't have any direct access to the weights.

It turns out there is a simple and surprisingly successful technique which one can use to create an arbitrary partial emulation of any ensemble of NNs: distillation. In essence distillation is simply the process of training one NN on the collected inputs/outputs of other NNs, such that it learns to emulate them.
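
A minimal sketch of distillation in PyTorch; the toy teacher/student networks and training loop here are illustrative stand-ins, not any particular real system:

```python
# Minimal distillation sketch: train a "student" network to imitate a black-box "teacher"
# purely from the teacher's outputs on sampled inputs (hypothetical toy models, not LLMs).
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))  # stands in for the black box
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(1000):
    x = torch.randn(256, 32)                      # inputs we can feed the black box
    with torch.no_grad():
        teacher_logits = teacher(x)               # all we observe: the teacher's outputs
    student_logits = student(x)
    # KL divergence between the student's and teacher's output distributions
    loss = F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(teacher_logits, dim=-1),
                    reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()
# For LLMs the same idea plays out with next-token targets: training on text humans wrote
# is (partial) distillation of the processes that produced that text.
```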

This is exactly how we train modern large ANNs, and LLMs specifically: by training them on the internet, we are training them on human thoughts and thus (partially) distilling human minds.

Thus my model (or the systems/cybernetic model in general) correctly predicted - well in advance - that LLMs would have anthropomorphic cognition: mirroring many of our seemingly idiosyncratic cognitive biases, quirks, and limitations. Thus we have AGI that can write poems and code (like humans) but struggles with multiplying numbers (like humans), generally exhibits human-like psychology, is susceptible to flattery, priming, the Jungian "shadow self" effect, etc. There is a large and growing pile of specific evidence that LLMs are distilling/simulating human minds, some of which I and others have collected in prior posts, but the strength of this argument should already clearly establish a strong prior expectation that distillation should be the default outcome.

The width of mindspace is completely irrelevant. Moravec, myself and the other systems-thinkers were correct: AI is and will be our mind children; the technosphere extends the noosphere.

This alone does not strongly entail that AGI will be aligned by default, but it does defeat EY's argument that AGI will be unaligned by default (and he loses many bayes points, which I gain).

The Risk Which Remains

To be clear, I am not arguing that AGI is not a threat. It is rather obviously the pivotal eschatonic event, the closing chapter in human history. Of course 'it' is dangerous, for we are dangerous. But that does not mean that 1.) extinction is the most likely outcome, or 2.) that alignment is intrinsically more difficult than AGI, or 3.) that EY's specific arguments are the especially relevant and correct way to arrive at any such conclusions.

You will likely die, but probably not because of a nanotech holocaust initiated by a god-like machine superintelligence. Instead you will probably die when you simply can no longer afford the tech required to continue living. If AI does end up causing humanity's extinction, it will probably be the result of a slow more prosaic process of gradually out-competing us economically. AGI is not inherently mortal and can afford patience, unlike us mere humans.


  1. The turing transition: brains evolved linguistic symbolic communication which permits compressing and sharing thoughts across brains, forming a new layer of networked social computational organization and allowing minds to emerge as software entities. This is a one time transition, as there is nothing more universal/general than a turing machine. ↩︎

  2. Neuromorphic computing is the main eventual long-term threat to current GPUs/accelerators and a continuation of the trend of embedding efficient matrix ops into the hardware, but it is unlikely to completely replace them for various reasons, and I don't expect it to be very viable until traditional moore's law has mostly ended. ↩︎

Contra Yudkowsky on AI Doom
habryka

I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily. 

Maybe it will do that using GPUs, or maybe it will do that using some more neuromorphic design, but I really don't understand why energy density matters very much. The vast majority of energy that current humans produce is of course not spent on running human brains, and there are easily 10-30 OOMs of improvement lying around without going into density (just using the energy output of a single power plant under your model would produce something that would likely be easily capable of disempowering humanity). 

More broadly, you list these three "assumptions" of Eliezer's worldview: 

The brain inefficiency assumption: The human brain is inefficient in multiple dimensions/ways/metrics that translate into intelligence per dollar; inefficient as a hardware platform in key metrics such as thermodynamic efficiency.

The mind inefficiency or human incompetence assumption: In terms of software he describes the brain as an inefficient complex "kludgy mess of spaghetti-code". He deri

... (read more)

I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.

I'm going to expand on this.

Jacob's conclusion to the speed section of his post on brain efficiency is this:

The brain is a million times slower than digital computers, but its slow speed is probably efficient for its given energy budget, as it allows for a full utilization of an enormous memory capacity and memory bandwidth. As a consequence of being very slow, brains are enormously circuit cycle efficient. Thus even some hypothetical superintelligence, running on non-exotic hardware, will not be able to think much faster than an artificial brain running on equivalent hardware at the same clock rate.

Let's accept all Jacob's analysis about the tradeoffs of clock speed, memory capacity and bandwidth.

The force of his conclusion depends on the superintelligence "running on equivalent hardware." Obviously, core to Eliezer's superintelligence argument, and habryka's comment here, is the point that the hardware underpinning AI can be made large and expanded upon in a way that is not possible... (read more)

First, he needs to explain why any efficiency constraints can't be overcome by just throwing a lot of material and energy resources into building and powering inefficient or as-efficient-as-human-brains GPUs. If energy is not a taut constraint for AGI, and it's also expected to be an increasing fraction of costs over time, then that sounds like an argument that we can overcome any efficiency limits with increasing expenditures to achieve superhuman performance.

If Jake claims to disagree with the claim that ai can starkly surpass humans [now disproven - he has made more explicit that it can], I'd roll my eyes at him. He is doing a significant amount of work based on the premise that this ai can surpass humans. His claims about safety must therefore not rely on ai being limited in capability; if his claims had relied on ai being naturally capability bounded I'd have rolled to disbelieve [edit: his claims do not rely on it]. I don't think his claims rely on it, as I currently think his views on safety are damn close to simply being a lower resolution version of mine held overconfidently [this is intended to be a pointer to stalking both our profiles]; it's possible he actually disa... (read more)

DirectedEvolution
Respectfully, it's hard for me to follow your comment because of the amount of times you say things like "If Jake claims to disagree with this," "based on the premise that this is false," "must therefore not rely on it or be false," and "I don't think they rely on it." The double negatives plus pointing to things with the word "this" and "it" makes me lose confidence in my ability to track your line of thinking. If you could speak in the positive and replace your "pointer terms" like "this" and "it" with the concrete claims you're referring to, that would help a lot!
the gears to ascension
Understandable, I edited in clearer references - did that resolve all the issues? I'm not sure in return that I parsed all your issues parsing :) I appreciate the specific request!
DirectedEvolution
It helps! There are still some double negatives ("His claims about safety must therefore not rely on ai not surpassing humans, or be false" could be reworded to "his claims about safety can only be true if they allow for AI surpassing humans," for example), and I, not being a superintelligence, would find that easier to parse :) The "pointers" bit is mostly fixed by you replacing the word "this" with the phrase "the claim that ai can starkly surpass humans." Thank you for the edits!
jacob_cannell
I don't need to explain that as I don't believe it. Of course you can overcome efficiency constraints somewhat by brute force - and that is why I agree energy is not by itself an especially taut constraint for early AGI, but it is a taut constraint for SI. You can't overcome any limits just by increasing expenditures. See my reply here for an example. I don't really feel this need, because EY already agrees thermodynamic efficiency is important, and i'm arguing specifically against core claims of his model. Computation simply is energy organized towards some end, and intelligence is a form of computation. A superintelligence that can clearly overpower humanity is - almost by definition - something with greater intelligence than humanity, which thus translates into compute and energy requirements through efficiency factors.
DirectedEvolution
It’s absolutely valid to make a local argument against specific parts of Eliezer’s model. However, you have a lot of other arguments “attached” that don’t straightforwardly flow from the parts of Eliezer’s model you’re mainly attacking. That’s a debate style choice that’s up to you, but as a reader who is hoping to learn from you, it becomes distracting because I have to put a lot of extra work into distinguishing “this is a key argument against point 3 from EY’s efficiency model” from “this is a side argument consisting of one assertion about bioweapons based on unstated biology background knowledge.” Would it be better if we switched from interpreting your post as “a tightly focused argument on demolishing EY’s core efficiency-based arguments,” to “laying out Jacob’s overall view on AI risk, with a lot of emphasis on efficiency arguments?” If that’s the best way to look at it then I retract the objection I’m making here, except to say it wasn’t as clear as it could have been.

The bioweapons point is something of a tangent, but I felt compelled to mention it because every time I've pointed out that strong nanotech can't have any core thermodynamic efficiency advantage over biology, someone has to mention superviruses or something, even though that isn't part of EY's model - he talks about diamond nanobots. But sure, that paragraph is something of a tangent.

EY's model requires slightly-smarter-than-us AGI running on normal hardware to start a FOOM cycle of recursive self improvement resulting in many OOM intelligence improvement in a short amount of time. That requires some combination of 1.) many OOM software improvement on current hardware, 2.) many OOM hardware improvement with current foundry tech, or 3.) completely new foundry tech with many OOM improvement over current - ie nanotech woo. The viability of any of this is entirely dependent on near-term engineering practicality.

DirectedEvolution
I think I see what you're saying here. Correct me if I'm wrong. You're saying that there's an argument floating around that goes something like this: And it's this argument specifically that you are dispatching with your efficiency arguments. Because, for inescapable physics reasons, AI will hit an efficiency wall, and it can't become more intelligent than humans on hardware with equivalent size, energy, and so on. Loosely speaking, it's impossible to build a device something significantly smaller than a brain and using less power than a brain running AI that's more than 1-2 OOMs smarter than a brain, and we can certainly rule out a superintelligence 6 OOMs smarter than humans running on a device smaller and less energy-intensive than a brain.  You have other arguments about practical engineering constraints, the potential utility to an AI of keeping humans around, the difficulty of building grey goo, and so on, the "alien minds" argument, but those are all based on separate counterarguments. You're also not arguing about whether an AI just 2-100x as intelligent as humans might be dangerous based on efficiency considerations. You do have arguments in some or all of these areas, but the efficiency arguments are meant to just deal with this one specific scenario about a 6 OOM (not a 2 OOM)  improvement in intelligence during a training run without accessing more hardware than was made available during the training run. Is that correct?

I'm confused because you describe an "argument specifically that you are dispatching with your efficiency arguments", and the first paragraph sounds like an EY argument, but the 2nd more like my argument. (And 'dispatching' is ambiguous)

Also "being already superintelligent" presumes the conclusion at the onset.

So lets restart:

  1. Someone creates an AGI a bit smarter than humans.
  2. It creates even smarter AGI - by rewriting its own source code.
  3. After the Nth iteration, once software OOM improvements are tapped out, it creates nanotech assemblers to continue growing OOM in power (or alternatively somehow gets OOM improvement with existing foundry tech, but that seems less likely as part of EY's model).
  4. At some point it has more intelligence/compute than all of humanity, and kills us with nanotech or something.

EY and I agree on 1 but diverge past that. Point 2 is partly a matter of software efficiency but not entirely. Recall that I correctly predicted in advance that AGI requires brain-like massive training compute, which largely defeats EY's view of 2 where it's just a modest "rewrite of its own source code". The efficiency considerations matter for both 2 and 3, as they determine how effectively it can quickly turn resources (energy/materials/money/etc) into bigger better training runs to upgrade its intelligence.

DirectedEvolution
Ugh yes, I have no idea why I originally formatted it with the second paragraph quoted as I had it originally (which I fully intended as an articulation of your argument, a rebuttal to the first EY-style paragraph). Just a confusing formatting and structure error on my part. Sorry about that, thanks for your patience.

So as a summary, you agree that AI could be trained a bit smarter than humans, but you disagree with the model where AI could suddenly iteratively extract like 6 OOMs better performance on the same hardware it's running on, all at once, figure out ways to interact with the physical world again within the hardware it's already training on, and then strike humanity all at once with undetectable nanotech before the training run is even complete.

The inability of the AI to attain 6 OOMs better performance on its training hardware during its training run by recursively self-improving its own software is mainly based on physical efficiency limits, and this is why you put such heavy emphasis on them. And the idea that neural net-like structures that are very demanding in terms of compute, energy, space, etc appear to be the only tractable road to superintelligence means that there's no alternative, much more efficient scheme the neural net form of the AI could find to rewrite itself into a fundamentally more efficient architecture on this scale.

Again, you have other arguments to deal with other concerns and to make other predictions about the outcome of training superintelligent AI, but dispatching this specific scenario is where your efficiency arguments are most important. Is that correct?
jacob_cannell
Yes but I again expect AGI to use continuous learning, so the training run doesn't really end. But yes I largely agree with that summary. NN/DL in its various flavors are simply what efficient approx bayesian inference involves, and there are not viable non-equivalent dramatically better alternatives.

Thanks Jacob for talking me through your model. I agree with you that this is a model that EY and others associated with him have put forth. I've looked back through Eliezer's old posts, and he is consistently against the idea that LLMs are the path to superintelligence (not just that they're not the only path, but he outright denies that superintelligence could come from neural nets).

My update, based on your arguments here, is that any future claim about a mechanism for iterative self-improvement that happens suddenly, on the training hardware and involves > 2 OOMs of improvement, needs to first deal with the objections you are raising here to be a meaningful way of moving the conversation forward.

I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.

I am genuinely curious and confused as to what exactly you concretely imagine this supposed 'superintelligence' to be, such that it is not already the size of a factory, such that you mention "size of a factory" as if that is something actually worth mentioning - at all. Please show at least your first pass fermi estimates for the compute requirements. By that I mean - what are the compute requirements for the initial SI - and then the later presumably more powerful 'factory'?

Maybe it will do that using GPUs, or maybe it will do that using some more neuromorphic design, but I really don't understand why energy density matters very much.

I would suggest reading more about advanced GPU/accelerator design, and then about datacenter design and the thermodynamic/cooling considerations therein.

The vast majority of energy that current humans produce is of course not spent on running human brains, and there are easily 10-30 OOMs of improvement lying around without going into density (j

... (read more)

I am genuinely curious and confused as to what exactly you concretely imagine this supposed 'superintelligence' to be, such that is not already the size of a factory, such that you mention "size of a factory" as if that is something actually worth mentioning - at all. Please show at least your first pass fermi estimates for the compute requirements.

 

Despite your claim to be "genuinely curious and confused," the overarching tone and content of this bit does not strike me as projecting curiosity or confusion, but instead confident and sharp-toned burden-of-proof-shifting to habryka. That's merely a stylistic note, not impacting the content of your claims.

It sounds here like you are agreeing with him that you can deal with any limits to ops/mm^3 limits by simply building a bigger computer. It's therefore hard for me to see why these arguments about efficiency limitations matter very much for AI's ability to be superintelligent and exhibit superhuman takeover capabilities.

I can see why maybe human brains, being efficient according to certain metrics, might be a useful tool for the AI to keep around, but I don't see why we ought to feel at all reassured by that. I don't really want... (read more)

Despite your claim to be "genuinely curious and confused," the overarching tone and content of this bit does not strike me as projecting curiosity or confusion, but instead confident and sharp-toned burden-of-proof-shifting to habryka

I see how that tone could come off as rude, but really I don't understand habryka's model when he says "a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily."

So it takes 1% of a single power plant output to train GPT4. If GPT-4 got put on chips and distributed, wouldn't it take only a very small amount of power comparatively to actually run it once trained? Why are we talking about training costs rather than primarily about the cost to operate models once they have been trained?

The transformer arch is fully parallelizable only during training; for inference on GPUs/accelerators it is roughly as inefficient as RNNs, or worse. The inference costs of GPT4 are of course an openai/microsoft secret, but it is not a cheap model. Also human-level AGI, let alone superintelligence, will likely require continual learning/training.
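
A toy illustration of the parallel-in-training versus sequential-in-generation structure being referenced (this is not a real language model; the module and dimensions are arbitrary stand-ins):

```python
# Training-style pass: the whole sequence is processed in one parallel call.
# Generation-style pass: one call per emitted token, each step depending on the last.
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True)
model = nn.TransformerEncoder(layer, num_layers=4)

tokens = torch.randn(1, 512, 256)
_ = model(tokens)                          # 512 positions, one parallel forward pass

prefix = torch.randn(1, 1, 256)
for _ in range(16):                        # 16 "tokens" -> 16 sequential forward passes
    out = model(prefix)
    prefix = torch.cat([prefix, out[:, -1:, :]], dim=1)   # stand-in for sampling the next token
```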

I guess by "put on chips" you mean ... (read more)

Re Yudkowsky, I don’t think his entire argument rests on efficiency, and the pieces that don’t can’t be dispatched by arguing about efficiency.

Regarding “alien mindspace,” what I mean is that the physical form of AI, and whatever awareness the AI has of that, makes it alien. Like, if I knew I could potentially transmit my consciousness with perfect precision over the internet and create self-clones almost effortlessly, I would think very differently than I do now.

His argument entirely depends on efficiency. He claims that near future AGI somewhat smarter than us creates even smarter AGI and so on, recursively bottoming out in something that is many many OOM more intelligent than us without using unrealistic amounts of energy, and all of this happens very quickly.

So that's entirely an argument that boils down to practical computational engineering efficiency considerations. Additionally he needs the AGI to be unaligned by default, and that argument is also faulty.

DirectedEvolution
In your other recent comment to me, you said: It seems like in one place, you're saying EY's model depends on near term engineering practicality, and in another, that it depends on physics-constrained efficiency which you argue invalidates it. Being no expert on the physics-based efficiency arguments, I'm happy to concede the physics constraints. But I'm struggling to understand their relevance to non-physics-based efficiency arguments or their strong bearing on matters of engineering practicality.

My understanding is that your argument goes something like this:

  1. You can't build something many OOMs more intelligent than a brain on hardware with roughly the same size and energy consumption as the brain.
  2. Therefore, building a superintelligent AI would require investing more energy and more material resources than a brain uses.
  3. Therefore... and here's where the argument loses steam for me.

Why can't we or the AI just invest lots of material and energy resources? How much smarter than us does an unaligned AI need to be to pose a threat, and why should we think resources are a major constraint to get it to recursively self-improve itself to get to that point? Why should we think it will need constant retraining to recursively self-improve? Why do we think it'll want to keep an economy going?

As far as the "anthropomorphic" counterargument to the "vast space of alien minds" thing, I fully agree that it appears the easiest way to predict tokens from human text is to simulate a human mind. That doesn't mean the AI is a human mind, or that it is intrinsically constrained to human values. Being able to articulate those values and imitate behaviors that accord with those values is a capability, not a constraint. We have evidence from things like ChaosGPT or jailbreaks that you can easily have the AI behave in ways that appear unaligned, and that even the appearance of consistent alignment has to be consistently enforced in ways that look awfully fragile. Overa
habryka
You just said in your comment to me that a single power plant is enough to run 100M brains. It seems like you need zero hardware progress in order to get something much smarter without unrealistic amounts of energy, so I just don't understand the relevance of this.
jacob_cannell
I said longer term - using hypothetical brain-parity neuromorphic computing (uploads or neuromorphic AGI). We need enormous hardware progress to reach that. Current tech on GPUs requires large supercomputers to train 1e25+ flops models like GPT4 that are approaching, but not quite at, human level AGI. If the rumour of 1T params is true, then it takes a small cluster and ~10 kW just to run some smallish number of instances of the model. Getting something much, much smarter than us would require enormous amounts of computation and energy without large advances in software and hardware.
habryka
Sure. We will probably get enormous hardware progress over the next few decades, so that's not really an obstacle.  It seems to me your argument is "smarter than human intelligence cannot make enormous hardware or software progress in a relatively short amount of time", but this has nothing to do with "efficiency arguments". The bottleneck is not energy, the bottleneck is algorithmic improvements and improvements to GPU production, neither of which is remotely bottlenecked on energy consumption. No, as you said, it would require like, a power plant worth of energy. Maybe even like 10 power plants or so if you are really stretching it, but as you said, the really central bottleneck here is GPU production, not energy in any relevant way.

Sure. We will probably get enormous hardware progress over the next few decades, so that's not really an obstacle.

As we get more hardware and slow mostly-aligned AGI/AI progress, this further raises the bar for foom.

It seems to me your argument is "smarter than human intelligence cannot make enormous hardware or software progress in a relatively short amount of time", but this has nothing to do with "efficiency arguments".

That is actually an efficiency argument, and in my brain efficiency post I discuss multiple sub components of net efficiency that translate into intelligence/$.

The bottleneck is not energy, the bottleneck is algorithmic improvements and improvements to GPU production, neither of which is remotely bottlenecked on energy consumption.

Ahh I see - energy efficiency is tightly coupled to other circuit efficiency metrics as they are all primarily driven by shrinkage. As you increasingly bottom out hardware improvements energy then becomes an increasingly more direct constraint. This is already happening with GPUs where power consumption is roughly doubling with each generation, and could soon dominate operating costs.

See here where I line the roodman model up... (read more)

lemonhope
How much room is there in algorithmic improvements?
toastje
Maybe it would be a good idea to change the title of this essay to: as to not give people hope that there would be a counter argument somewhere in this article to his more general claim:
habryka
This seems like it straightforwardly agrees that energy efficiency is not in any way a bottleneck, so I don't understand the focus of this post on efficiency. I also don't know what you mean by longer term. More room at the bottom was of course also talking longer term (you can't build new hardware in a few weeks, unless you have nanotech, but then you can also build new factories in a few weeks), so I don't understand why you are suddenly talking as if "longer term" was some kind of shift of the topic.

Eliezer's model is that we definitely won't have many decades with AIs smarter but not much smarter than humans, since there appear to be many ways to scale up intelligence, both via algorithmic progress and via hardware progress. Eliezer thinks that drexlerian nanotech is one of the main ways to do this, and if you buy that premise, then the efficiency arguments don't really matter, since clearly you can just scale things up horizontally and build a bunch of GPUs. But even if you don't, you can still just scale things up horizontally and increase GPU production (and in any case, energy efficiency is not the bottleneck here, it's GPU production, which this post doesn't talk about).

I don't understand the relevance of this. You seem to be now talking about a completely different scenario than what I understood Eliezer to be talking about. Eliezer does not think that a slightly superhuman AI would be capable of improving the hardware efficiency of its hardware completely on its own.

Both scenarios (going both big, in that you just use whole power-plant levels of energy, or going down in that you improve efficiency of chips) require changing semiconductor manufacturing, which is unlikely to be one of the first things a nascent AI does, unless it does successfully develop and deploy drexlerian nanotech. Eliezer in his model here was talking about what are reasonable limits that we would be approaching here relatively soon after an AI passes human levels.

I don't u

To reiterate, the model of EY that I am critiquing is one where an AGI rapidly fooms through many OOM of efficiency improvements. All key required improvements are efficiency improvements - it needs to improve its world modelling/planning per unit compute, and/or improve compute per dollar and/or compute per joule, etc.

In EY's model there are some, perhaps many, OOM software improvements over the initial NN arch/algorithms, perhaps then continued with more OOM hardware improvements. I don't believe "buying more GPUs" is a key part of his model - it is far far too slow to provide even one OOM upgrade. Renting/hacking your way to even one OOM more GPUs is also largely unrealistic (I run one of the larger GPU compute markets and talk to many suppliers, I have inside knowledge here).

Both scenarios (going both big, in that you just use whole power-plant levels of energy, or going down in that you improve efficiency of chips) require changing semiconductor manufacturing, which is unlikely to be one of the first things a nascent AI does, unless it does successfully develop and deploy drexlerian nanotech

Right, so I have arguments against drexlerian nanotech (Moore room at the bot... (read more)

the gears to ascension
I don't think he's at all claiming safety is trivial or that humans can expect to remain in charge. control-capture foom is very much permitted by his model and he says so directly; much bigger minds are allowed. But his model suggests that reflective algorithmic improvement is not the panacea that yudkowsky expected, nor that beating biology head to head is easy even for a very superintelligent system. this does not change any claim I would make about safety; it should barely be an update for anyone who has already updated off of deep learning. but it should knock down yudkowsky's view of capability scaling in algorithms thoroughly. this is relevant to prediction of which kinds of system are a threat to other systems and how.
Matthew Barnett
Presumably it takes a gigantic amount of compute to train a "brain the size of a factory"? If we assume that training a human-level AI will take 10^28 FLOP (which is quite optimistic), the Chinchilla scaling laws predict that training a model 10,000 times larger would take about 10^36 FLOP, which is far more than the total amount of compute available to humans cumulatively over our history. By the time the world is training factory-sized brains, I expect human labor to already have been made obsolete by previous generations of AIs that were smarter than us, but not vastly so. Presumably this is Jacob's model of the future too?
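
Spelling out the quadratic step in that estimate (assuming compute-optimal scaling, where training tokens grow roughly in proportion to parameters):

```python
# Under Chinchilla-optimal scaling D scales ~linearly with N, so C ~ 6*N*D grows ~quadratically in N.
def scaled_compute(base_compute: float, size_multiplier: float) -> float:
    return base_compute * size_multiplier**2     # N -> k*N and D -> k*D gives C -> k^2 * C

print(f"{scaled_compute(1e28, 1e4):.0e} FLOP")   # 1e+36, matching the figure above
```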

Biology is incredibly efficient, and generally seems to be near pareto-optimal.

This seems really implausible. I'd like to see a debate about this. E.g. why can't I improve on heat by having super-cooled fluid pumped throughout my artificial brain; doesn't having no skull-size limit help a lot; doesn't metal help; doesn't it help to not have to worry about immune system stuff; doesn't it help to be able to maintain full neuroplasticity; etc. 

Biology is incredibly efficient at certain things that happen at the cell level. To me, it seems like OP is extrapolating this observation rather too broadly. Human brains are quite inefficient at things they haven't faced selective pressure to be good at, like matrix multiplication.

Claiming that human brains are near Pareto-optimal efficiency for general intelligence seems like a huge stretch to me. Even assuming that's true, I'm much more worried about absolute levels of general intelligence rather than intelligence per Watt. Conventional nuclear bombs are dangerous even though they aren't anywhere near the efficiency of a theoretical antimatter bomb. AI "brains" need not be constrained by the size and energy constraints of a human brain.

jacob_cannell
The human brain hardware is essentially a giant analog/digital hybrid vector matrix multiplication engine if you squint the right way, and later neuromorphic hardware for AGI will look similar. But GPT4 isn't good at explicit matrix multiplication either.
Donald Hobson
>But GPT4 isn't good at explicit matrix multiplication either. So it is also very inefficient.  Probably a software problem. 

Your instinct is right. The Landauer limit says that it takes at least kT ln(2) of energy to erase 1 bit of information, which is necessary to run a function which outputs 1 bit (to erase the output bit). The important thing to note is that it scales with temperature (measured on an absolute scale). Human brains operate at 310 Kelvin. Ordinary chips can already operate down to around ~230 Kelvin, and there is even a recently developed chip which operates at ~0.02 Kelvin.

So human brains being near the thermodynamic limit in this case means very little about what sort of efficiencies are possible in practice.
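
For reference, the temperature dependence being invoked is just k*T*ln(2) per erased bit:

```python
import math

kB = 1.38e-23   # Boltzmann constant, J/K

for T in (310, 230, 0.02):   # brain, a cold conventional chip, a cryogenic chip
    print(f"T = {T:>6} K -> Landauer bound {kB * T * math.log(2):.1e} J per bit erased")
```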

Your point about skull-sizes [being bounded by childbirth death risk] seems very strong for evolutionary reasons, to which I would also add the fact that bird brains seem to do similar amounts of cognition (to smallish mammals) in a much more compact volume without having substantially higher body temperatures (~315 Kelvin).

Cooling the computer doesn't let you get around the Landauer limit! The savings in energy you get by erasing bits at low temperature are offset by the energy you need to dissipate to keep your computer cold. (Erasing a bit at low temperature still generates some heat, and when you work out how much energy your refrigerator has to use to get rid of that heat, it turns out that you must dissipate the same amount as the Landauer limit says you'd have to if you just erased the bit at ambient temperatures.) To get real savings, you have to actually put your computer in an environment that is naturally colder. For example, if you could put a computer in deep space, that would work.
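
The bookkeeping behind that claim, assuming an ideal (Carnot) refrigerator; real refrigerators only make it worse:

```python
import math

kB = 1.38e-23

def erase_bit_total_cost(T_cold: float, T_ambient: float) -> float:
    """Energy to erase one bit in a cold bath plus the minimum work to pump that heat to ambient."""
    erase = kB * T_cold * math.log(2)             # heat dumped into the cold bath
    pump = erase * (T_ambient / T_cold - 1.0)     # ideal refrigerator work (Carnot limit)
    return erase + pump                           # algebraically equals kB * T_ambient * ln(2)

ambient = 300.0
for T_cold in (230.0, 0.02):
    print(f"T_cold = {T_cold} K -> {erase_bit_total_cost(T_cold, ambient):.2e} J/bit "
          f"(ambient-temperature bound: {kB * ambient * math.log(2):.2e} J/bit)")
```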

On the other hand, there might also be other good reasons to keep a computer cold, for example if you want to lower the voltage needed to represent a bit, then keeping your computer cold would plausibly help with that. It just won't reduce your Landauer-limit-imposed power bill.

None of this is to say that I agree with the rest of Jacob's analysis of thermodynamic efficiency, I believe he's made a couple of shaky assumptions and one actual mistake. Since this is getting a lot of attention, I might write a post on it.

O O
Deep space is a poor medium as the only energy dissipation there is radiation, which is slower than convection on Earth. Vacuums are typically used to insulate things (thermos).

Ordinary chips can already operate down to around ~230 Kelvin, and there is even a recently developed chip which operates at ~0.02 Kelvin.

In a room temp bath this always costs more energy - there is no free lunch in cooling. However in the depths of outer space this may become relevant.

Adele Lopez
That is true, and I concede that that weakens my point. It still seems to be the case that you could get a ~35% efficiency increase by operating in e.g. Antarctica. I also have this intuition I'll need to think more about that there are trade-offs with the Landauer limit that could get substantial gains by separating things that are biologically constrained to be close... similar to how a human with an air conditioner can thrive in much hotter environments (using more energy overall, but not energy that has to be in thermal contact with the brain via e.g. the same circulatory system).
jacob_cannell
Norway/Sweden do happen to be currently popular datacenter building locations, but more for cheap power than cooling from what I understand. The problem with Antarctica would be terrible solar production for much of the year.
Donald Hobson
You can play the same game in the other direction. Given a cold source, you can run your chips hot, and use a steam engine to recapture some of the heat.  The Landauer limit still applies. 

I don't think heat dissipation is actually a limiting factor for humans as things stand right now. Looking at the heat dissipation capabilities of a human brain from three perspectives (maximum possible heat dissipation by sweat glands across the whole body, maximum actual amount of sustained power output by a human in practice, maximum heat transfer from the brain to arterial blood with current-human levels of arterial bloodflow), none of them look to me to be close to the 20w the human brain consumes.

  • Based on sweat production of athletic people reaching 2L per hour, that gives an estimate of ~1kW of sustained cooling capacity for an entire human (a quick check of this number appears after this list)
  • 5 watts per kg seems to be pretty close to the maximum power output well-trained humans can actually output in practice for a full hour, so that suggests that a 70 kg human has at least 350 watts of sustained cooling capacity (and probably more, because the limiting factor does not seem to be overheating).
  • Bloodflow to the brain is about 45L / h, and brains tolerate temperature ranges of 3-4ºC, so working backwards from that we get that a 160W brain would reach temperatures of about 3ºC higher than arterial blood assuming that arterial
... (read more)
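
A quick check of the sweat number in the first bullet above; it uses only the latent heat of vaporization of water and assumes all the sweat actually evaporates:

```python
latent_heat_water = 2.26e6        # J/kg, heat of vaporization
sweat_kg_per_hour = 2.0           # ~2 L/hour (assumed, per the bullet above)

cooling_watts = sweat_kg_per_hour * latent_heat_water / 3600
print(f"~{cooling_watts:.0f} W of evaporative cooling")   # ~1.3 kW, consistent with the ~1 kW figure
```
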
hairyfigment
I note in passing that the elephant brain is not only much larger, but also has many more neurons than any human brain. Since I've no reason to believe the elephant brain is maximally efficient, making the same claim for our brains should require much more evidence than I'm seeing.
gilch

That's if you're counting the cerebellum, which doesn't seem to contribute much to intelligence, but is important for controlling the complicated musculature of a trunk and large body.

By cortical neuron count, humans have about 18 billion, while elephants have less than 6 billion, comparable to a chimpanzee. (source)

Elephants are undeniably intelligent as animals go, but not at human level.

Even blue whales barely approach human level by cortical neuron count, although some cetaceans (notably orcas) exceed it.

TekhneMakre
jacob_cannell's post here https://www.lesswrong.com/posts/xwBuoE9p8GE7RAuhd/brain-efficiency-much-more-than-you-wanted-to-know#Space argues that: Does that seem about right to you?
faul_sname
I conclude something more like "the brain consumes perhaps 1 to 2 OOM less energy than the biological limits of energy density for something of its size, but is constrained to its somewhat lower than maximal energy density due in part to energy availability considerations" but I suspect that this is more of a figure/ground type of disagreement about which things are salient to look at vs a factual disagreement. That said @jacob_cannell is likely to be much more informed in this space than I am -- if the thermodynamic cooling considerations actually bind much more tightly than I thought, I'd be interested to know that (although not necessarily immediately, I expect that he's dealing with rather a lot of demands on his time that are downstream of kicking the hornet's nest here).
the gears to ascension
efficient for the temperature it runs at. Jake is correct about the fundamental comparison, but he's leaving off the part where he expects reversible computing to fundamentally change the efficiency tradeoffs for intelligence eventually, which is essentially "the best way to make use of near perfect cooling" as a research field; I don't have a link to where he's said this before, since I'm remembering conversations we had out loud.
TekhneMakre
But how is "efficient for the temperature it runs at" relevant to whether there's much room to improve on how much compute biology provides?
the gears to ascension
it's relevant in that there's a lot of room to improve, it's just not at the same energy budget and temperature. I'm not trying to imply a big hidden iceberg in addition to that claim; what it implies is up to your analysis.
jacob_cannell
Near pareto-optimal in terms of thermodynamic efficiency as replicators and nanobots, see the discussions and links here and here.
TekhneMakre
Then how is that relevant to the argument in your OP? I thought you were arguing: That's what I responded to in my top-level comment. Is that not what you're arguing? If it is what you're arguing, then I'm confused because it seems like here in this comment you're talking about something irrelevant and not responding to my comment (though I could be confused about that as well!).
jacob_cannell
The specific line where I said "biology is incredibly efficient, and generally seems to be near pareto-optimal" occurs immediately after, and is mainly referring to, the EY claim that "biology is not that efficient" and his more specific claim about thermodynamic efficiency - which I already spent a whole long post refuting. None of your suggestions improve thermodynamic efficiency, nor do they matter much in terms of OOM. EY's argument is essentially that AGI will quickly find many OOM software improvement, and then many more OOM improvement via new nanotech hardware.

Except that is exactly what is happening. TSMC is approaching the limits of circuit miniaturization, and it is increasing obvious that fully closing the (now not so large) gap with the brain will require more directly mimicking it through neuromorphic computing[2].

You mention a few times that you seem confident about Moore's law ending very soon. I am confused where this confidence comes from (though you might have looked into this more than I have). 

In general the transistor-density aspect of Moore's law always seemed pretty contingent to me. The economic pressures care about flops/$, not about transistor density, which has just historically been the best way to get flops/$. Also for forecasting AI dynamics, flops/$ seems like it matters a lot more, since in the near future AI seems unlikely to have to care much about transistor density, given that there are easily 10-20 OOMs of energy and materials to be used on earth's surface for some kind of semiconductor or neuromorphic compute production. 

And in the space of flops/$, Moore's law seems to be going strong. The last report from AI Impacts I remember reading suggests things were going strong until at least 2020: 

ht... (read more)

Jensen Huang/Nvidia is almost unarguably one of TSMC's most important clients and probably has some insights/access to their roadmaps, and I don't particularly suspect he is lying when he claims Moore's Law is dead; it matches my own analysis of TSMC's public roadmap, as well as my analysis of the industry research/chatter/gossip/analysis. Moore's Law was a long recursive miniaturization optimization process which was always naturally destined to bottom out somewhat before new leading-edge foundries (staying on Moore's law) cost sizable fractions of world GDP and features approach minimal sizes (well predicted in advance).

This obviously isn't the end of technological progress in computing! It's just the end of the easy era. Neuromorphic computing is much harder for comparatively small gains. Reversible computing seems almost impossibly difficult, such that many envision just jumping straight to quantum computing, which itself is no panacea and very far off.

And this 2022 analysis suggests things were also going quite strong very recently,

As were chip clock frequencies under Dennard scaling, until that suddenly ended. I have uncertainty over how far we are from minimal viable switch energies, but it is not multiple OOMs. There are more architectural tricks in the pipeline, in the vein of lower-precision tensorcores, but not many of those left either.

Want to take a bet? $1000, even odds.

I predict flops/$ to continue improving at between a factor of 2x every 2 years and 2x every 3 years. Happy to have someone else be a referee on whether it holds up. 

[Edit: Actually, to avoid having to condition on a fast takeoff itself, let's say "improving by more than a factor of 2x every 3 years for the next 6 years"]

I may be up for that but we need to first define 'flops', acceptable GPUs/products, how to calculate prices (preferably some standard rental price with power cost), and finally the bet implementation.
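
For concreteness, here is a minimal sketch of one way such a bet could be scored; the GPU specs and rental prices below are placeholders for illustration, not agreed terms:

def flops_per_dollar(peak_flops_per_sec, rental_usd_per_hour):
    # flops delivered per rented dollar, assuming full utilization; the rental price bundles power
    return peak_flops_per_sec * 3600 / rental_usd_per_hour

baseline = flops_per_dollar(312e12, 2.00)   # placeholder: an A100-class FP16 peak at a $2/hr rental
later = flops_per_dollar(1000e12, 3.00)     # placeholder: some later flagship at a $3/hr rental

improvement = later / baseline
threshold = 2 ** (6 / 3)                    # "2x every 3 years" over a 6-year window -> 4x
print(f"{improvement:.2f}x improvement vs {threshold:.0f}x threshold; bet resolves yes: {improvement > threshold}")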

1Mo Putera
Curious, did this bet happen? Since Jacob said he may be up for it depending on various specifics.
3jacob_cannell
Part of the issue is my post/comment was about Moore's law (transistor density for mass-produced nodes), which is a major input to but distinct from flops/$. As I mentioned somewhere, there is still some free optimization energy in extracting more flops/$ at the circuit level even if Moore's law ends. Moore's law is very specifically about fab efficiency as measured in transistors/cm^2 for large chip runs - not the flops/$ habryka wanted to bet on. Even when Moore's law is over, I expect some continued progress in flops/$. All that being said, nvidia's new flagship GPU everyone is using - the H100, which is replacing the A100 and launched just a bit after habryka proposed the bet - actually offers near zero improvement in flops/$ (the price increased in direct proportion to the flops increase). So I probably should have taken the bet if it was narrowly defined as flops/$ for the flagship GPUs most teams are currently using for training foundation models.
2Mo Putera
Thanks Jacob. I've been reading the back-and-forth between you and other commenters (not just habryka) in both this post and your brain efficiency writeup, and it's confusing to me why some folks so confidently dismiss energy efficiency considerations with handwavy arguments not backed by BOTECs.  While I have your attention – do you have a view on how far we are from ops/J physical limits? Your analysis suggests we're only 1-2 OOMs away from the ~10^-15 J/op limit, and if I'm not misapplying Koomey's law (2x every 2.5y back in 2015, I'll assume slowdown to 3y doubling by now) this suggests we're only 10-20 years away, which sounds awfully near, albeit incidentally in the ballpark of most AGI timelines (yours, Metaculus etc). 
4jacob_cannell
TSMC 4N is a little over 1e10 transistors/cm^2 for GPUs and roughly 5e-18 J switch energy assuming dense activity (little dark silicon). The practical transistor density limit with minimal few-electron transistors is somewhere around ~5e11 trans/cm^2, but the minimal viable high-speed switching energy is around ~2e-18 J. So there is another 1 to 2 OOMs of further density scaling, but less room for further switching energy reduction. Thus scaling past this point increasingly involves dark silicon or complex expensive cooling, and thus diminishing returns either way. Achieving 1e-15 J/flop seems doable now for low precision flops (fp4, perhaps fp8 with some tricks/tradeoffs); most of the cost is data movement, as pulling even a single bit from RAM just 1 cm away costs around 1e-12 J.
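
A quick back-of-envelope using only the numbers in this exchange (the 3-year Koomey doubling time is Mo Putera's assumption above, not an established figure):

import math

density_now, density_limit = 1e10, 5e11    # transistors/cm^2: TSMC 4N vs few-electron limit, per above
switch_now, switch_limit = 5e-18, 2e-18    # J per switch event, per above

print(f"density headroom: ~{math.log10(density_limit / density_now):.1f} OOM")      # ~1.7
print(f"switch-energy headroom: ~{math.log10(switch_now / switch_limit):.1f} OOM")  # ~0.4

# Koomey-style timeline for 1-2 remaining OOMs in J/op at an assumed 3-year doubling time
for remaining_oom in (1, 2):
    print(f"{remaining_oom} OOM left -> ~{remaining_oom * math.log2(10) * 3:.0f} years")  # ~10 and ~20 years

# data movement dominates: one bit fetched from RAM 1 cm away vs the ~1e-15 J/flop target
print(f"interconnect penalty: ~{math.log10(1e-12 / 1e-15):.0f} OOM")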
1habryka
It did not

He boo-hooed neural networks, and in fact actively bet against them in actions by hiring researchers trained in abstract math/philosophy, ignoring neuroscience and early DL, etc.

This seems to assume that those researchers were meant to work out how to create AI. But the goal of that research was rather to formalize and study some of the challenges in AI alignment in crisp language to make them as clear as possible. The intent was not to study the question of "how do we build AI" but rather "what would we want from an AI and what would prevent us from getting that, assuming that we could build one". That approach doesn't make any assumptions about how the AI would be built; it could be neural nets or anything else. 

Eliezer makes that explicit in e.g. this SSC comment:

MIRI doesn’t assume all AIs will be logical and I really need to write a long long screed about this at some point if I can stop myself from banging the keyboard so hard that the keys break. We worked on problems involving logic, because when you are confused about a *really big* thing, one of the ways to proceed is to try to list out all the really deep obstacles. And then, instead of the usual practice of trying to

... (read more)

EY's belief distribution about NNs and early DL from over a decade ago, and how that reflects on his predictive track record, has already been extensively litigated in other recent threads like here. I mostly agree that EY 2008 and later is somewhat cautious/circumspect about making explicitly future-disprovable predictions, but he surely did seem to exude skepticism, which complements my interpretation of his actions.

That being said, I also largely agree that MIRI's research path was chosen specifically to try to be more generic than any viable route to AGI. But one could also consider that something of a failure or missed opportunity vs investing more in studying neural networks, the neuroscience of human alignment, etc.

But I've always said (perhaps not in public, but nonetheless) that I thought MIRI had a very small chance of success, but it was still a reasonable bet for at least one team to make, just in case the connectivists were all wrong about this DL thing.

Max H

Thus my model (or the systems/cybernetic model in general) correctly predicted - well in advance - that LLMs would have anthropomorphic cognition: mirroring much of our seemingly idiosyncratic cognitive biases, quirks, and limitations.

This seems false. LLMs are trained to predict, which often results in them mimicking certain kinds of human errors. Mimicking errors doesn't mean that the underlying cognition which produced those errors is similar.

This is exactly how we train modern large ANNs, and LLMs specifically: by training them on the internet, we are training them on human thoughts and thus (partially) distilling human minds.

In what sense is predicting internet text "training [LLMs] on human thoughts"? Human thoughts are causally upstream of some internet text, so learning to predict human thoughts is one way of being good at predicting, but it's certainly not the only one. More on this general point here.

One thing that I have observed, working with LLMs, is that when they're predicting the next token in a Python REPL they also make kinda similar mistakes to the ones that a human who wasn't paying that much attention would make. For example, consider the following

>>> a, b = 3, 5    # input
>>> a + b          # input
8                  # input
>>> a, b = b, a    # input
>>> a * b          # input
15                 # prediction (text-davinci-003, temperature=0, correct)
>>> a / b          # input
1.0                # prediction (text-davinci-003, temperature=0, incorrect but understandable mistake)
>>> a              # input
5                  # prediction (text-davinci-003, temperature=0, correct)
>>> a / b          # input
1.0                # prediction (text-davinci-003, temperature=0, incorrect but understandable mistake)
>>> a              # input
5                  # prediction (text-davinci-003, temperature=0, correct)
>>> a / b          # input
1.0                # prediction (text-davinci-003, temperature=0, incorrect but understandable mistake)
>>> b              # input
3           
... (read more)
7Max H
An interesting example! A couple remarks:
* a more human mistake might be guessing 0.6 and not 1.0?
* After the mistake, it's not clear what the "correct" answer is, from a text prediction perspective. If I were trying to predict the output of my python interpreter, and it output 1.0, I'd predict that future outputs on the same input would also be "wrong" - that either I was using some kind of bugged interpreter, or that I was looking at some kind of human-guessed transcript of a python session.
5faul_sname
Yeah, that one's "the best example of the behavior that I was able to demonstrate from scratch with the openai playground in 2 minutes", not "the best example of the behavior I've ever seen". Mostly the instances I've seen were chess-specific results on a model that I specifically fine-tuned on Python REPL transcripts that looked like

>>> import chess
>>> board = chess.Board()
>>> board.push_san('Na3')
Move.from_uci('b1a3')
>>> print(board.piece_at(chess.parse_square('b1')))

and it would print N instead of None (except that in the actual examples it mostly was a much longer transcript, and it was more like it would forget where the pieces were if the transcript contained an unusual move or just too many moves).

For context I was trying to see if a small language model could be fine-tuned to play chess, and was working under the hypothesis of "a Python REPL will make the model behave as if statefulness holds". And then, of course, the Othello paper came out, and bing chat came out and just flat out could play chess without having been explicitly trained on it, and the question of "can a language model play chess" became rather less compelling because the answer was just "yes". But that project is where a lot of my "the mistakes tend to look like things a careless human does, not weird alien mistakes" intuitions ultimately come from.
1Htarlov
An alternative explanation of the mistakes is that making mistakes and then correcting them was rewarded during additional post-training refinement stages. I work with GPT-4 daily and sometimes it feels like it makes mistakes on purpose just to be able to say that it is sorry for the confusion and then correct them. It also feels like it makes fewer mistakes when you ask politely (using please, thank you, etc.), which is rather strange. Nevertheless, distillation seems like a very plausible thing that is also going on here. It does not distill the whole of a human mind, though. There are areas that are intuitive for the average human, even a small child, that are not for GPT-4. For example, it has problems with concepts of 3D geometry and visualizing things in 3D. It may have similar gaps in other areas, including more important ones (like moral intuitions).
6jacob_cannell
Internet text contains the inputs and outputs of human minds in the sense that every story, post, article, essay, book, etc written by humans first went through our brains, word by word, token by token, tracing through our minds. Training on internet text is literally training on human thoughts because text written by humans is literally an encoding of human thoughts. The fact that it is an incomplete and partial encoding is mostly irrelevant, as given enough data you can infer through any such gaps. Only a small fraction of the pixels in an image are sufficient to reconstruct it.
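
As a toy illustration of the pixel analogy - a smooth synthetic image and simple interpolation, nothing like the scale or structure of LLM training; numpy and scipy assumed:

import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)
y, x = np.mgrid[0:64, 0:64]
img = np.sin(x / 7.0) + np.cos(y / 9.0)              # stand-in for a smooth image
mask = rng.random(img.shape) < 0.10                  # keep ~10% of pixels: the "partial encoding"

recon = griddata(np.column_stack([y[mask], x[mask]]), img[mask], (y, x), method="linear")
recon = np.nan_to_num(recon, nan=img[mask].mean())   # crude fill for gaps near the borders
print("mean abs error:", float(np.abs(recon - img).mean()))  # small relative to the image's ~4-unit range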

Even if every one of your object level objections is likely to be right, this wouldn't shift me much in terms of policies I think we should pursue, because the downside risks from TAI are astronomically large even at small probabilities (unless you discount all future and non-human life to 0). I see Eliezer as making arguments about the worst ways things could go wrong and why it's not guaranteed that they won't go that way. We could get lucky, but we shouldn't count on luck. So even if Eliezer is wrong, he's wrong in ways that, if we adopt policies that account for his arguments, better protect us from existential catastrophe at the cost of reaching TAI a few decades later - a small price to pay to offset very large risks that exist even at small probabilities.

I am reasonably sympathetic to this argument, and I agree that the difference between EY's p(doom) > 50% and my p(doom) of perhaps 5% to 10% doesn't obviously cash out into major policy differences.

I of course fully agree with EY/bostrom/others that AI is the dominant risk, we should be appropriately cautious, etc. This is more about why I find EY's specific classic doom argument to be uncompelling.

My own doom scenario is somewhat different and more subtle, but mostly beyond scope of this (fairly quick) summary essay.

7jimv
You mention here that "of course" you agree that AI is the dominant risk, and that you rate p(doom) somewhere in the 5-10% range. But that wasn't at all clear to me from reading the opening of the article. As written, that opener suggests to me that you think the overall model of doom being likely is substantially incorrect (not just the details, which I've elided, of it being the default). I feel it would be very helpful to ground the article from the outset with the note you've made here, i.e. that your argument is with EY's specific doom case, that you retain a significant p(doom), but that it's based on different reasoning.
9philh
Eliezer believes and argues that things go wrong by default, with no way he sees to avoid that. Not just "no guarantee they won't go wrong". It may be that his arguments are sufficient to convince you of "no guarantee they won't go wrong" but not to convince you of "they go wrong by default, with no apparent way to avoid that". But the weaker claim is not what he's arguing.

This is interesting but would benefit from more citations for claims and fewer personal attacks on Eliezer.

I had the same impression at first, but in the areas where I most wanted these, I realized that Jacob linked to additional posts where he has defended specific claims at length.

Here is one example:

EY is just completely out of his depth here: he doesn't seem to understand how the Landauer limit actually works, doesn't seem to understand that synapses are analog MACs which minimally require OOMs more energy than simple binary switches, doesn't seem to understand that interconnect dominates energy usage regardless, etc.

I usually find Tyler Cowenesque (and heck, Yudkowskian) phrases like this irritating, and usually they're pretty hard to interrogate, but Jacob helpfully links to an entire factpost he wrote on this specific point, elaborating on this claim in detail.
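
As a point of reference for the scale being argued about, here is the Landauer arithmetic; the 1e-15 J/op figure is the one discussed earlier in this thread, and this comparison by itself says nothing about analog MAC costs, which are argued in the linked post:

import math

k_B = 1.380649e-23                  # Boltzmann constant, J/K
T = 310.0                           # kelvin, roughly body temperature
landauer = k_B * T * math.log(2)    # minimum energy to erase one bit
print(f"kT*ln(2) at 310 K: {landauer:.1e} J")           # ~3.0e-21 J
print(f"1e-15 J/op is ~{1e-15 / landauer:.0e} x that")  # ~3e5 x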
 

He does something similar here:

EY derived much of his negative beliefs about the human mind from the cognitive biases and ev psych literature, and especially Tooby and Cosmide's influential evolved modularity hypothesis. The primary competitor to evolved modularity was/is the universal learning hypothesis and associated scaling hypothesis, and there was already sufficient evidence to rule out evolved modularity back i

... (read more)

Humanity is generating and consuming enormous amounts of power - why is the power budget even relevant? And even if it were, energy for running brains ultimately comes from the Sun - if you include the agriculture energy chain, and "grade" the energy efficiency of brains by the amount of solar energy it ultimately takes to power a brain, AI definitely has the potential to be more efficient. And even if a single human brain is fairly efficient, human civilization clearly is not. With AI, you can quickly scale up the amount of compute you use, whereas for humans, scaling beyond a single brain is very inefficient.

If the optimal AGI design running on GPUs takes about 10 GPUs and 10 kW to rival one human-brain power, and a superintelligence which kills humanity a la the foom model requires 10 billion human brain power and thus 100 billion GPUs and a 100 terawatt power plant - that is just not something that is possible in any near term.

In EY's model there is supposedly a ~6 OOM improvement from nanotech, so you could get the 10 billion human brainpower with a much more feasible 100 MW power plant and roughly 100 thousand GPU-equivalents.
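
The arithmetic behind these two scenarios, using only the assumptions stated above:

gpus_per_hbp = 10         # GPUs per human-brain-power (HBP), as assumed above
kw_per_hbp = 10           # kW per HBP, as assumed above
hbp_for_si = 10e9         # 10 billion HBP for the hypothesized humanity-killing SI

gpus = gpus_per_hbp * hbp_for_si             # 1e11 = 100 billion GPUs
power_tw = kw_per_hbp * hbp_for_si / 1e9     # 1e11 kW = 100 TW
print(f"{gpus:.0e} GPUs, {power_tw:.0f} TW")

nanotech_gain = 1e6                          # EY's claimed ~6 OOM gain from nanotech
print(f"{gpus / nanotech_gain:.0e} GPU-equivalents, {power_tw * 1e6 / nanotech_gain:.0f} MW")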

you're assuming sublinear scaling. why wouldn't it be superlinear post training? it certainly seems like it is now. it need not be sharply superlinear like yud expected to still be superlinear.

7Anon User
Exactly! I'd expect compute to scale way better than humans - not necessarily because the intelligence of compute scales so well, but because the intelligence of human groups scales so poorly...
5jacob_cannell
So I assumed a specific relationship between "one unit of human-brain power" and "superintelligence capable of killing humanity", where I use human-brain power as a unit, but that doesn't actually have to be linear scaling - imagine a graph with two labeled data points, one at (human, X:1) and another at (SI, X:10B); you can draw many different curves connecting those two points, and the Y axis is sort of arbitrary. Now maybe 10B HBP to kill humanity seems too high, but I assume humanity as a civilization includes a ton of other compute, AI, and AGI, and I don't really put much credence in strong nanotech.
5habryka
To be clear, I don't know anyone who would currently defend the claim that you need a single system with the computational needs of all 10 billion human brains. That seems like at least 5 OOMs too much. Even simulating 10M humans is likely enough, but you can probably do many OOMs better by skipping the incredible inefficiency of humans coordinating with each other in a global economy.
4jacob_cannell
If you believe modern economies are incredibly inefficient coordination mechanisms, that's a deeper disagreement beyond this post. But in general my estimate for the intellectual work required to create an entirely new path to a much better compute substrate is something at least vaguely on the order of the amount of intellectual work accumulated in our current foundry tech. That is not my estimate for the minimal amount of intelligence required to take over the world in some sense - that would probably require less. But again this is focused on critiquing scenarios where a superintelligence (something greater than humanity in net intelligence) bootstraps rapidly from AGI.
7habryka
Yep, seems plausibly like a relevant crux. Modern economies sure seem incredibly inefficient, especially when viewed through the lens of "how much is this system doing long-term planning and trying to improve its own intelligence".
2Donald Hobson
In many important tasks in the modern economy, it isn't possible to replace one expert with any number of average humans. A large fraction of average humans aren't experts. A large fraction of human brains are stacking shelves or driving cars or playing computer games or relaxing, etc. Given a list of important tasks in the computer supply chain, most humans, most of the time, are simply not making any attempt at all to solve them. And of course a few percent of the modern economy is actively trying to blow each other up. 
9Steven Byrnes
To put some numbers on that, USA brains directly consume 20W × 330M = 6.6 GW, whereas the USA food system consumes ≈500 GW [not counting sunlight falling on crops] (≈15% of the 3300 GW total USA energy consumption).
dr_s

You will likely die, but probably not because of a nanotech holocaust initiated by a god-like machine superintelligence.

This I agree with and always assumed, but it is also largely irrelevant if the end conclusion is that AGI still destroys us all. To most people, I'd say, the specific method of death doesn't matter as much as the substance. It's a special kind of academic argument, one where we can endlessly debate precisely how the end will come about through making this thing, while we all mostly agree that this thing we are making - and that we could stop making - will likely end us all. Sane people (and civilizations) just... don't make the deadly thing.

I haven't gone through the numbers yet, so I'll give that a try, but out of the box it feels to me like your arguments about biology's computational efficiency aren't the end of it. I actually mentioned the topic as one possible point of interest here: https://www.lesswrong.com/posts/76n4pMcoDBTdXHTLY/ideas-for-studies-on-agi-risk. My impression is that biology can come up with some spectacularly efficient trade-offs, but that's only within the rules of biology. For example, biology can produce very fast animals with very good legs, b... (read more)

There are some other assumptions that go into Eliezer's model and that are required for doom. I can think of one very clearly, which is:

5.  The transition to that god-AGI will be so quick that other entities won't have time to also reach superhuman capabilities. There are no "intermediate" AGIs that can be used to work on alignment-related problems or even as a defence against unaligned AGIs.

This is the first contra AI doom case I've read which felt like it was addressing some of the core questions, rather than nitpicking on some irrelevant point, or just completely failing to understand the AI doom argument.

So, whilst I still think some of your points need fleshing out/further arguments, thank you very much for this post!

The little pockets of cognitive science that I've geeked out about - usually in the predictive processing camp - have featured researchers who are usually either quite surprised by, or going to great lengths to underline, the importance of language and culture in our embodied / extended / enacted cognition.

A simple version of the story I have in my head is this: We have physical brains thanks to evolution, and then by being an embodied predictive perception/action loop out in the world, we started transforming our world into affordances for new perceptio... (read more)

Actually I think the shoggoth mask framing is somewhat correct, but it also applies to humans. We don't have a single fixed personality; we are also mask-wearers.

5faul_sname
The argument that shifts me the most away from thinking of it with the shoggoth-mask analogy is the implication that a mask has a single coherent actor behind it. But if you can avoid that mental failure mode I think the shoggoth-mask analogy is basically correct.

Hm, neuron impulses travel at around 200 m/s, electric signals travel at around 2e8 m/s, so I think electronics have an advantage there. (I agree that you may have a point with "That Alien Mindspace".)

5jacob_cannell
The brain's slow signaling speed seems to be mostly for energy efficiency, but it is also closely tuned to brain size, such that signal delay is not a significant problem.

I agree that the human brain is roughly at a local optimum. But think about what could be done just by adding a fiber optic connection between two brains (I think there are some ethical issues here, so this is a thought experiment, not something I recommend). The two brains could be a kilometer apart, and the signal between them on the fiber optic link takes less time than a signal takes to get from one side of a regular brain to the other. So these two brains could think together (probably with some (a lot?) neural rewiring) as fast as a regular brain thinks individually. Repeat with some more brains.
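
A rough check on that latency claim; the ~15 cm brain diameter is an assumption here, while the two speeds are the ones quoted above:

brain_diameter_m = 0.15       # assumption: roughly 15 cm across a human brain
axon_speed_m_s = 200.0        # fast myelinated axons, as quoted above
fiber_speed_m_s = 2e8         # signal speed over fiber/wire, as quoted above

across_brain_ms = brain_diameter_m / axon_speed_m_s * 1e3    # ~0.75 ms
over_1km_fiber_us = 1000.0 / fiber_speed_m_s * 1e6           # ~5 microseconds
print(f"across one brain: {across_brain_ms:.2f} ms; 1 km of fiber: {over_1km_fiber_us:.0f} us")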

Or imagine if myelination were under conscious control. If you need to learn a new language, demyelinate the right parts of the brain, learn the language quickly, and then remyelinate them.

So I think even without changing things much neurons could be used in ways that provide faster thinking and faster learning.

As for energy efficiency, there is no reason that a superintelligence has to be limited to the approximately 20 watts that a human brain has access to. Gaming computers can have 1000 W power supplies, which is 50 times more power. I think 50 brains thinking together really ... (read more)

  1. "These GPUs cost $1M and use 10x the energy of a human for the same work" is still a pretty bad deal for any workers that have to compete with that. And I don't expect economic gains to go to displaced workers.

  2. Even if an AI is more expensive per unit of computational capacity than humans, it being much faster and immortal would still be a threat. I could imagine a single immortal human genius becoming world-emperor eventually. Now imagine them operating 10^3 or even 10^6 times faster than ordinary humans.

This post raised some interesting points, and stimulated a bunch of interesting discussion in the comments. I updated a little bit away from foom-like scenarios and towards slow-takeoff scenarios. Thanks. For that, I'd like to upvote this post.

On the other hand: I think direct/non-polite/uncompromising argumentation against other arguments, models, or beliefs is (usually) fine and good. And I think it's especially important to counter-argue possible inaccuracies in key models that lots of people have about AI/ML/alignment. However, in many places, the post... (read more)

A very naive question for Jacob. A few years ago the fact that bird brains are about 10x more computationally dense than human brains was mentioned on SlateStarCodex and by Diana Fleischman. This is something I would not expect to be true if there were not some significant "room at the bottom." 

Is this false? Does this not imply what I think it should? Am I just wrong in thinking this is of any relevance? 

https://slatestarcodex.com/2019/03/25/neurons-and-intelligence-a-birdbrained-perspective/

I don't understand the physics, so this is just me not... (read more)

7jacob_cannell
Bird brains have higher neuron density, especially in the forebrain, but I'm unsure if this also translates into higher synaptic density or just fewer synapses per neuron. Regardless, it does look like bird brains are optimized more heavily for compactness, but it's not clear what tradeoffs may be being made there. But heat-transport cooling scales with the surface area whereas compute, and thus heat production, scales with volume, so brains tend to become less dense as they grow larger absent more heroic cooling efforts.
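
The square-cube point, spelled out with a spherical-brain toy model; the ~0.015 W/cm^3 power density is a rough assumption (about 20 W over ~1300 cm^3):

import math

def surface_heat_flux(radius_cm, power_density_w_per_cm3):
    # heat produced scales with volume (r^3) but must exit through surface area (r^2)
    volume = (4 / 3) * math.pi * radius_cm ** 3
    area = 4 * math.pi * radius_cm ** 2
    return power_density_w_per_cm3 * volume / area   # = density * r / 3, grows linearly with r

for r_cm in (2, 4, 8):   # arbitrary radii at a fixed volumetric power density
    print(f"r = {r_cm} cm -> {surface_heat_flux(r_cm, 0.015):.3f} W/cm^2 through the surface")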

If we look at the game of Go, AI managed to become vastly better than humans. An AI that can outcompete humans at any task the way that AlphaGo can outcompete humans at Go is a serious problem even if it's not capable of directly figuring out how to build nanobots. 

This is exactly how we train modern large ANNs, and LLMs specifically: by training them on the internet, we are training them on human thoughts and thus (partially) distilling human minds.

While it's true that currently most of the training data we put into LLMs seems human-created, I don't th... (read more)

Thus my model (or the systems/cybernetic model in general) correctly predicted - well in advance - that LLMs would have anthropomorphic cognition: mirroring much of our seemingly idiosyncratic cognitive biases, quirks, and limitations. Thus we have AGI that can write poems and code (like humans) but struggles with multiplying numbers (like humans), generally exhibits human-like psychology, is susceptible to flattery, priming, the Jungian "shadow self" effect, etc.

 

Is this comment the best example of your model predicting anthropomorphic cognition? I r... (read more)

4jacob_cannell
It's more fleshed out in the 2015 ULM post.
[anonymous]

The universal learning/scaling model was largely correct - as tested by openAI scaling up GPT to proto-AGI.

I don't understand how OpenAI's success at scaling GPT proves the universal learning model. Couldn't there be an as-yet-undiscovered algorithm for intelligence that is more efficient?

4jacob_cannell
If I have a model which predicts "this simple architecture scales up to human intelligence with enough compute", and that is tested and indeed shown to be correct, then the model is validated. And it helps further rule out an entire space of theories about intelligence: namely all the theories that intelligence is very complicated and requires many complex interacting innate algorithms (evolved modularity, which EY seemed to subscribe to). Sure, there could be other algorithms for intelligence that are more efficient, and I already said I don't think we are quite on the final scaling curve with transformers. But over time the probability mass remaining for these undiscovered algorithms continually diminishes as we explore ever more of the algorithmic search space. Furthermore, evolution extensively explored the search space for architectures/algorithms for intelligent agents, and essentially found common variants of universal learning on NNs in multiple unrelated lineages, substantially adding to the evidence that yes, this really is as good as it gets (at least for any near-term conventional computers).
1[anonymous]
I see, thanks for clarifying.

Humans suck at arithmetic. Really suck. From a comparison of current GPUs to a human trying and failing to multiply 10-digit numbers in their head, we can conclude that something about humans, hardware or software, is incredibly inefficient. 

Almost all humans have roughly the same-sized brain. 

So even if Einstein's brain was operating at 100% efficiency, the brain of the average human is operating at a lot less.

I.e. intelligence is easy - it just takes enormous amounts of compute for training.

Making a technology work at all is generally easier than m... (read more)

mirroring much of our seemingly idiosyncratic cognitive biases, quirks, and limitations.

True. 

They also have a big pile of their own new idiosyncratic quirks. 

https://www.lesswrong.com/posts/aPeJE8bSo6rAFoLqg/solidgoldmagikarp-plus-prompt-generation

These are bizarre behaviour patterns that don't resemble any humans. 

This looks less like a human, and more like a very realistic painted statue. It looks like a human, complete with painted-on warts, but scratch the paint, and the inhuman nature shows through. 

The width of mindspace is compl

... (read more)

Biological cells operate directly at thermodynamic efficiency limits:

Well, muscles are less efficient than steam engines, which is why hamster-wheel electricity is a dumb idea: burning the hamster food in a steam engine is more efficient. 

then we should not expect Moore's law to end with brains still having a non-trivial thermodynamic efficiency advantage over digital computers. Except that is exactly what is happening. TSMC is approaching the limits of circuit miniaturization, and it is increasingly obvious that fully closing the (now not so large) gap with the brain will require more directly mimicking it through neuromorphic computing[2].

 

This is a clear error. 

There is no particular reason to expect TSMC to taper off at a point anywhere near the theoretical limits.

A closely anal... (read more)

I don't have much to contribute on AI risk, but I do want to say +1 for the gutsy title. It's not often you see the equivalent of "Contra The Founding Mission of an Entire Community".