Starting in 2008, Robin Hanson and Eliezer Yudkowsky debated the likelihood of FOOM: a rapid and localized increase in some AI's intelligence that occurs because an AI recursively improves itself.

As Yudkowsky summarizes his position:

I think that, at some point in the development of Artificial Intelligence, we are likely to see a fast, local increase in capability—“AI go FOOM.” Just to be clear on the claim, “fast” means on a timescale of weeks or hours rather than years or decades; and “FOOM” means way the hell smarter than anything else around, capable of delivering in short time periods technological advancements that would take humans decades, probably including full-scale molecular nanotechnology. (FOOM, 235)

Over the course of this debate, both Hanson and Yudkowsky made a number of incidental predictions about things which could occur before the advent of artificial superintelligence -- or for which we could at the very least receive strong evidence before artificial superintelligence.

On the object level, my conclusion is that when you examine these predictions, Hanson probably does a little better than Yudkowsky. Depending on how you weigh different topics, though, I could see arguments ranging from "they do about the same" to "Hanson does much better."

On one meta level, my conclusion is that Hanson's view -- that we should try to use abstractions that have proven prior predictive power -- looks like a pretty good policy.

On another meta level, my conclusion -- springing to a great degree from how painful seeking clear predictions in 700 pages of words has been -- is that if anyone says "I have a great track record" without pointing to specific predictions that they made, you should probably ignore them, or maybe point out their lack of epistemic virtue if you have the energy to spare for doing that kind of criticism productively.


There are a number of difficulties involved in evaluating some public figure's track record. We want to avoid cherry-picking sets of particularly good or bad predictions. And we want to have some baseline to compare them to.

We can mitigate both of these difficulties -- although not, alas, eliminate them -- by choosing one document to evaluate: "The Hanson-Yudkowsky Foom Debate". (All future page numbers refer to this PDF.) Note that the PDF includes (1) the debate-via-blogposts which took place on OvercomingBias, (2) an actual in-person debate that took place at Jane Street in 2011, and (3) further summary materials from Hanson (further blogposts) and Yudkowsky ("Intelligence Explosion Microeconomics"). This spans a period from 2008 to 2013.

I do not intend this to be a complete review of everything in these arguments.

The discussion spans the time from the big bang until hypothetical far future galactic civilizations. My review is a little more constrained: I am only going to look at predictions for which I think we've received strong evidence in the 15 or so years since the debate started.

Note also that the context of this debate was quite different than it would be if it happened today.

At the time of the debate, both Hanson and Yudkowsky believed that machine intelligence would be extremely important, but that the time of its arrival was uncertain. They thought that it would probably arrive this century, but neither had the very certain, short timelines which are common today.

At this point Yudkowsky was interested in actually creating a recursively self-improving artificial intelligence, a "seed AI." For instance, in 2006 the Singularity Institute -- what MIRI was before it was renamed -- had a website explicitly stating that they sought funding to create recursively self-improving AI. During the Jane Street debate Yudkowsky humorously describes the Singularity Institute as the "Institute for Carefully Programmed Intelligence Explosion."

So this context is quite different than today.

I think that if I make a major mistake in the below, it's probably that I missed some major statements from Hanson or Yudkowsky, rather than drastically mis-calling the items that I did include. Such a mistake could happen in part because I have tried to be conservative, and mostly included predictions which seem to have multiple affirmations in the text. But I definitely did skim-read parts of the debate that seemed irrelevant to predictions, such as the parts about the origin of life or about the morality of taking over the world with a superhuman AI. So I could very well have missed something.

Feel free to mention such missed predictions in the comments, although please quote and cite page numbers. Rereading this has confirmed my belief that the recollected mythology of positions advanced during this debate is... somewhat different than what people's actual positions were.

Predictions -- Relatively Easy To Call

In this section, I'm going to include predictions which appear to me relatively straightforward.

I don't think many people who read the FOOM debate for the first time now would dispute them. Although they are nevertheless disputable if you try really hard, like everything.

I'll phrase each prediction so that Yudkowsky takes the positive, and Hanson the negative.

"Cyc is not a Promising Approach to Machine Intelligence"

Cyc was (and is) an effort to build an artificial intelligence by building a vast database of logically-connected facts about the world by hand. So a belief like "Bob works as an engineer" is represented by relating the entity <Bob> to <engineer> with <works-as>, in this database. These facts would then be entered into an "inference engine," which can reason about them in long chains of valid proofs. Right now, CycCorp claims Cyc has a knowledge base with 25 million axioms, 40,000 predicates, and so on. Its creator, Douglas Lenat, moved on to Cyc from Eurisko because he decided AI needed a large base of knowledge to work correctly.
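To make the "facts plus inference engine" idea concrete, here is a minimal sketch in Python. It is a toy under loud assumptions: the triples, the `is-a` predicate, and the single transitivity rule are all invented for illustration and are not Cyc's actual representation language or reasoning machinery.

```python
# Toy illustration only: a hand-entered fact base plus one inference rule.
# The facts, predicate names, and rule are hypothetical, not Cyc's real format.

facts = {
    ("Bob", "works-as", "engineer"),
    ("engineer", "is-a", "profession"),
    ("profession", "is-a", "occupation"),
}


def forward_chain(facts):
    """Apply is-a(x, y) & is-a(y, z) => is-a(x, z) until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        # Snapshot the current is-a links so we can add to `derived` safely.
        is_a = [(s, o) for (s, p, o) in derived if p == "is-a"]
        for (x, y1) in is_a:
            for (y2, z) in is_a:
                new_fact = (x, "is-a", z)
                if y1 == y2 and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


closure = forward_chain(facts)
# The engine derives a fact nobody typed in by hand.
print(("engineer", "is-a", "occupation") in closure)
```

The point of the approach is that each hand-entered fact can participate in many derivations, which is why Lenat bet on entering a very large number of them.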

Hanson thinks that this is a promising approach, stating:

The lesson Lenat took from Eurisko is that architecture is overrated; AIs learn slowly now mainly because they know so little. So we need to explicitly code knowledge by hand until we have enough to build systems effective at asking questions, reading, and learning for themselves. Prior AI researchers were too comfortable starting every project over from scratch; they needed to join to create larger integrated knowledge bases.... They had to start somewhere, and in my opinion they have now collected a knowledge base with a truly spectacular size, scope, and integration. Other architectures may well work better, but if knowing lots is anywhere near as important as Lenat thinks, I’d expect serious AI attempts to import Cyc’s knowledge, translating it into a new representation. (FOOM, 226)

On the other hand, Yudkowsky thinks Cyc has approximately zero chance of working well:

Knowledge isn’t being able to repeat back English statements. This is true even of humans. It’s a hundred times more true of AIs, even if you turn the words into tokens and put the tokens in tree structures... A basic exercise to perform with any supposed AI is to replace all the English names with random gensyms and see what the AI can still do, if anything. Deep Blue remains invariant under this exercise. Cyc, maybe, could count—it may have a genuine understanding of the word “four”—and could check certain uncomplicatedly structured axiom sets for logical consistency, although not, of course, anything on the order of say Peano arithmetic. The rest of Cyc is bogus. If it knows about anything, it only knows about certain relatively small and simple mathematical objects, certainly nothing about the real world. (FOOM, 228)
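One way to read the gensym exercise is sketched below in Python. The fact base, the `gensym_rename` helper, and the `chain_pairs` query are all hypothetical; the sketch only shows that a purely symbolic query gives structurally identical answers after every English name is replaced by a meaningless token, which is the sense in which the names contribute nothing to what the system "knows."

```python
import itertools


def gensym_rename(facts):
    """Replace every symbol with a meaningless generated name (G0000, G0001, ...)."""
    counter = itertools.count()
    mapping = {}

    def sym(name):
        if name not in mapping:
            mapping[name] = f"G{next(counter):04d}"
        return mapping[name]

    renamed = {(sym(s), sym(p), sym(o)) for (s, p, o) in facts}
    return renamed, mapping


def chain_pairs(facts, pred):
    """Pairs (x, z) linked by two hops of the same predicate."""
    return {
        (x, z)
        for (x, p1, y1) in facts if p1 == pred
        for (y2, p2, z) in facts if p2 == pred and y1 == y2
    }


# Hypothetical hand-entered facts, for illustration only.
facts = {
    ("Bob", "works-as", "engineer"),
    ("engineer", "is-a", "profession"),
    ("profession", "is-a", "occupation"),
}

original = chain_pairs(facts, "is-a")
renamed_facts, mapping = gensym_rename(facts)
renamed = chain_pairs(renamed_facts, mapping["is-a"])

# The renamed answer is exactly the original answer with names swapped out.
expected = {(mapping[x], mapping[z]) for (x, z) in original}
print(renamed == expected)
```

Everything this toy system can do survives the renaming, so none of its competence depended on "engineer" meaning engineer.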

Yudkowsky seems obviously right from where we stand now; Cyc was not promising.

The most advanced modern AI systems have zero need to import knowledge from Cyc, and Cyc's abilities pale beside modern LLMs. By 2011, Hanson concedes at least somewhat to Yudkowsky's position and states that Cyc might not have enough information or might be in the wrong format (FOOM, 496).

Aside: What counts as being right

I think that Yudkowsky is obviously right here.

But if you wished, you could say that Hanson's position that "Cyc is promising" has not been entirely falsified. CycCorp still appears to have customers. Their product functions. They advertise the conclusions of Cyc as auditable, in a way that the conclusions of DL are not, and this is true. Functionally, Cyc is surpassed by machine learning for basically everything -- but you could say that in the future the approach might possibly turn things around. It's a logically coherent thing to say.

Nevertheless -- I'm comfortable saying that Cyc is the wrong approach, and that Yudkowsky clearly had the better predictions about this. As Yudkowsky said even in 2011, Cyc being promising has "been incrementally more and more falsified" each year (FOOM, 476), and each year since 2011 has been further incremental falsification.

My basic criterion for judgement is that, if you had believed Hanson's view, you'd have been waaaaaaaaay more surprised by the future than if you had believed Yudkowsky's view. This will be the approach I'm taking for all the other predictions as well.
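One standard way to make "how surprised would you have been" quantitative is surprisal: the negative log of the probability you assigned to what actually happened. The Python sketch below uses made-up probabilities purely for illustration; neither Hanson nor Yudkowsky stated numbers like these, and the debate is not scored this way anywhere in the text.

```python
import math


def surprisal_bits(prob_assigned_to_outcome):
    """Bits of surprise for someone who gave the actual outcome this probability."""
    return -math.log2(prob_assigned_to_outcome)


# Hypothetical credences in the outcome that actually occurred
# (illustrative only, not taken from the debate).
view_a = 0.9  # a view that mostly expected the outcome
view_b = 0.3  # a view that mostly did not

gap = surprisal_bits(view_b) - surprisal_bits(view_a)
print(f"view A: {surprisal_bits(view_a):.2f} bits")
print(f"view B: {surprisal_bits(view_b):.2f} bits")
print(f"view B was more surprised by {gap:.2f} bits")
```

The gap in bits is the log of the ratio of the two credences, so it measures how much the outcome should shift you from one view toward the other.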

"AI Comes Before Whole-Brain Emulations"

Intelligence-on-computers could come in at least two ways.

It could come through AI and machine-learning algorithms manually coded by humans, perhaps inspired by the human brain but ultimately only loosely connected to it. Or it could come from some kind of high-resolution scan of a human brain, which is then virtualized and run on a computer: a whole brain emulation (WBE or "em").

Hanson literally wrote the book on ems (albeit after this debate) and thinks that ems are marginally more likely to occur before hand-coded AI (FOOM, 26).

Yudkowsky also had -- as of the Hanson-Yudkowsky debate, not now -- very broad intervals for the arrival of machine intelligence, which he summarizes as "I don’t know which decade and you don’t know either" (FOOM, 682). Nevertheless, he thinks AI is likely to occur before ems.

AI seems well on its way, and ems as distant as they did in 2008, so I'm comfortable saying that Yudkowsky's position looks far more accurate right now.

Nevertheless, both Yudkowsky and Hanson explicitly call attention to the very broad distribution of their own timelines, so it is a small update towards Yudkowsky over Hanson.

"AI Won't Be Able to Exchange Cognitive Content Easily"

A central part of the dispute between Yudkowsky and Hanson is how localized future growth rates will be.

They both think that an economy with machine intelligences in it -- either em or AI -- will grow very quickly compared to our current economy.

But Hanson sees a world where "these AIs, and their human owners, and the economy that surrounds them, undergo a collective FOOM of self-improvement. No local agent is capable of doing all this work, only the collective system" (FOOM, 276, Yudkowsky summarizing Hanson). Yudkowsky, on the other hand, sees a world where an individual AI undergoes a rapid spike in self-improvement relative to the world; where a brain in a box in a basement can grow quickly to come to out-think all of humanity.

One thing that could influence whether growth is more local or global is whether AIs can trade cognitive content. If trading such cognitive content with your neighbors is more advantageous -- or trading in general is advantageous -- then growth will probably be more global; if trading is less advantageous, growth will probably be more local.

Yudkowsky thinks trading or simply exchanging cognitive content between AIs is quite unlikely. Part of this is because of the state of AI in 2008, where no one AI architecture has grown to dominate the others:

And I have to say that, looking over the diversity of architectures proposed at any AGI conference I’ve attended, it is very hard to imagine directly trading cognitive content between any two of them. It would be an immense amount of work just to set up a language in which they could communicate what they take to be facts about the world—never mind preprocessed cognitive content.

And a little earlier:

Trading cognitive content between diverse AIs is more difficult and less likely than it might sound. Consider the field of AI as it works today. Is there any standard database of cognitive content that you buy off the shelf and plug into your amazing new system, whether it be a chess player or a new data-mining algorithm? If it’s a chess-playing program, there are databases of stored games—but that’s not the same as having databases of preprocessed cognitive content. So far as I can tell, the diversity of cognitive architectures acts as a tremendous barrier to trading around cognitive content. (FOOM, 278)

But not all of this is simply projecting the present into the future. He further thinks that even if different AIs were to have the same architecture, trading cognitive content between them would be quite difficult:

If you have many AIs around that are all built on the same architecture by the same programmers, they might, with a fair amount of work, be able to pass around learned cognitive content. Even this is less trivial than it sounds. If two AIs both see an apple for the first time, and they both independently form concepts about that apple, and they both independently build some new cognitive content around those concepts, then their thoughts are effectively written in a different language. (FOOM, 278)

By default, he also expects more sophisticated, advanced AIs to have representations that are more opaque to each other. He thinks this effect will be so significant that pre-FOOM AIs might be incapable of sharing content at all: "AI would have to get very sophisticated before it got over the “hump” of increased sophistication making sharing harder instead of easier. I’m not sure this is pre-takeoff sophistication we’re talking about, here" (FOOM, 280).

Again—in today’s world, sharing of cognitive content between diverse AIs doesn’t happen, even though there are lots of machine learning algorithms out there doing various jobs. You could say things would happen differently in the future, but it’d be up to you to make that case. (FOOM, 280)

Hanson, on the other hand, thinks that the current diverse state of AI architectures is simply an artifact of the early state of AI development. As AI research finds solutions that work, we should expect that architectures become more standardized. And as architectures become more standardized, this will make sharing between AIs easier:

Almost every new technology comes at first in a dizzying variety of styles and then converges to what later seems the “obvious” configuration. It is actually quite an eye-opener to go back and see old might-have-beens, from steam-powered cars to pneumatic tube mail to memex to Engelbart’s computer tools. Techs that are only imagined, not implemented, take on the widest range of variations. When actual implementations appear, people slowly figure out what works better, while network and other scale effects lock in popular approaches... But of course “visionaries” take a wide range of incompatible approaches. Commercial software tries much harder to match standards and share sources. (FOOM, 339-340)

This makes him think that sharing between AIs is likely to occur relatively easily, because AI progress will make architectures more similar, which makes it easier to share cognitive content between AIs.

Hanson is the clear winner here. We don't have AIs that are exchanging cognitive content, because we don't have AIs that are sufficiently agent-like to do this. But humans now exchange cognitive AI content all the time.

Per Hanson's prediction, AI architectures have standardized around one thing -- neural networks, and even around a single neural network architecture (Transformers) to a very great degree. The diversity Yudkowsky observed in architectures has shrunk enormously, comparatively speaking.

Moreover, granting neural networks, trading cognitive content has turned out to be not particularly hard. It does not require superintelligence to share representations between different neural networks; a language model can be adapted to handle visual data without enormous difficulty. Encodings from BERT or an ImageNet model can be applied to a variety of downstream tasks, and this is by now a standard element in toolkits and workflows. When you share architectures and training data, as for two differently fine-tuned diffusion models, you can get semantically meaningful merges between networks simply by taking the actual averages of their weights. Thoughts are not remotely "written in a different language."
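The weight-averaging merge mentioned above is simple enough to sketch in a few lines of plain Python. This is a minimal illustration under stated assumptions: the two "models" are hypothetical dictionaries of parameter lists with invented names and values, standing in for two fine-tunes of the same base network, and real merging tools operate on framework tensors rather than lists.

```python
def average_weights(model_a, model_b, alpha=0.5):
    """Linearly interpolate two models that share an architecture.

    alpha=0.5 is a plain average; other values weight one parent more heavily.
    """
    if model_a.keys() != model_b.keys():
        raise ValueError("models must share an architecture (same parameter names)")
    merged = {}
    for name in model_a:
        wa, wb = model_a[name], model_b[name]
        if len(wa) != len(wb):
            raise ValueError(f"shape mismatch in {name}")
        merged[name] = [alpha * a + (1 - alpha) * b for a, b in zip(wa, wb)]
    return merged


# Hypothetical parameters for two fine-tunes of one base model.
finetune_a = {"layer1.weight": [0.2, -0.4], "layer1.bias": [0.1]}
finetune_b = {"layer1.weight": [0.6, 0.0], "layer1.bias": [0.3]}

merged = average_weights(finetune_a, finetune_b)
print(merged)
```

Note that the merge never translates between representations: it relies on the two networks already agreeing on what each parameter position is for, which is the opposite of thoughts being written in different languages.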

So generally, cognitive content looks to be relatively easy to swap between different systems. It remains easy to swap as systems get smarter, and workflows that involve such swapping are becoming increasingly common. Hanson's view looks more accurate.

"Improvements in One AI Project Generally Won't Improve Another Much"

This issue mirrors the one above.

Just as the ease of sharing cognitive content between AIs bears on local vs. global takeoff, so does the ease of sharing cognitive algorithms between AIs: that is, whether the improvements you make to one AI could be relatively easily transferred to another.

Yudkowsky states:

The same sort of barriers that apply to trading direct cognitive content would also apply to trading changes in cognitive source code.... It’s a whole lot easier to modify the source code in the interior of your own mind than to take that modification and sell it to a friend who happens to be written on different source code.... This is another localizing force. It means that the improvements you make to yourself, and the compound interest earned on those improvements, are likely to stay local. If the scenario with an AI takeoff is anything at all like the modern world in which all the attempted AGI projects have completely incommensurable architectures, then any self-improvements will definitely stay put, not spread.

Yudkowsky does relax his confidence about sharing cognitive algorithms by the time of the 2011 debate, noting that chess algorithms have benefitted from sharing techniques, but still maintains his overall position (FOOM, 663).

Similarly to the above, Hanson thinks that as progress occurs, improvements will begin to be shared.

Yudkowsky is again pretty clearly wrong here.

An actual improvement to, say, how Transformers work would help with speech recognition, language modelling, image recognition, image segmentation, and so on and so forth. Improvements to AI-relevant hardware are a trillion-dollar business. Work compounds so easily on other work that many alignment-concerned people want to conduct all AI research in secret.

Hanson's position looks entirely correct.

"Algorithms are Much More Important Than Compute for AI Progress"

Different views about the nature of AI imply different things about how quickly AIs could FOOM.

If most of the space between the sub-human AIs of 2008 and potentially superhuman AIs of the future is algorithmic, then growth could be very fast and localized as AI discovers these algorithms. The "brain in a box in a basement" frequently mentioned in the Jane Street debate could discover algorithms that let it move from merely human to godlike intelligence overnight.

On the other hand, if a lot of the space between AIs of 2008 and superhuman AIs of the future is in size of compute needed -- or if greater compute is at least a prerequisite for having superhuman AI -- then growth is likely to be slower because AIs need to obtain new hardware or even build new hardware. A computer in a basement somewhere would need to purchase time in the cloud, hack GPUs, or purchase hardware to massively increase its intelligence, which could take more time and is at least more visible.

Yudkowsky uniformly insists that qualitative algorithmic differences are more important than compute, and moreover that great quantities of compute are not a prerequisite.

For instance, he says that "quantity [of minds] < (size, speed) [of minds] < quality [of minds]" (FOOM, 601). He expects "returns on algorithms to dominate" during an intelligence explosion (FOOM, 627). He consistently extends this belief into the past, noting that although human brains are four times bigger than chimpanzee brains "this tells us very little because most of the differences between humans and chimps are almost certainly algorithmic" (FOOM, 613).

When he mentions that compute could contribute to AI progress, he always makes clear that algorithms will be more important:

Let us consider first the prospect of an advanced AI already running on so much computing power that it is hard to speed up. I find this scenario somewhat hard to analyze because I expect AI to be mostly about algorithms rather than lots of hardware, but I can’t rule out scenarios where the AI is developed by some large agency which was running its AI project on huge amounts of hardware from the beginning... Thus I cannot say that the overall scenario is implausible. (FOOM, 628, emphasis mine)

As another illustration of his belief that limited compute is no obstacle to FOOM, he gives a "rough estimate" that you could probably run a mind about as smart as a human's on a 2008 desktop, "or maybe even a desktop computer from 1996" (FOOM, 257).

[Image: 1996 Desktop, Top of the Line]

But a desktop from 1996 isn't even the lower limit. If a superintelligence were doing the design for a mind, he continues, "you could probably have [mind of] roughly human formidability on something substantially smaller" (FOOM, 257).

This view about the non-necessity of compute is thoroughly and deliberately integrated into Yudkowsky's view, without particular prodding from Hanson -- he has several asides in FOOM where he explains how Moravec or Kurzweil's reasoning about needing human-equivalent compute for AI is entirely wrong (FOOM, 19, 256).

Hanson does not cover the topic of compute as much.

To the degree he does, he is extremely dubious that there is any small handful of algorithmic insights in intelligence-space that will grant intelligence; he also emphasizes hardware much more.

For instance, he approvingly states that "the usual lore among older artificial intelligence researchers is that new proposed architectural concepts are almost always some sort of rearranging of older architectural concepts." He continues:

AI successes come when hardware costs fall enough to implement old methods more vigorously. Most recent big AI successes are due to better ability to integrate a diversity of small contributions. See how Watson won, or Peter Norvig on massive data beating elegant theories. New architecture deserves only small credit for recent success.... Future superintelligences will exist, but their vast and broad mental capacities will come mainly from vast mental content and computational resources. (FOOM, 497)

Yudkowsky seems quite wrong here, and Hanson right, about one of the central trends -- and maybe the central trend -- of the last dozen years of AI. Implementing old methods more vigorously is more or less exactly what got modern deep learning started; algorithms in absence of huge compute have achieved approximately nothing.

The Deep Learning revolution is generally dated from 2012's AlexNet. The most important thing about AlexNet isn't any particular algorithm; the most important thing is that the authors wrote their code with CUDA to run on GPUs, which let them make the neural network far bigger than it could otherwise have been while training in a mere week. Pretty much all subsequent progress in DL has hinged on the continuing explosion of compute resources since then. Someone who believed Yudkowsky would have been extremely surprised by 2012-2020, when compute spent on ML runs doubled every 6 months and when that doubling was nearly always key for the improved performance.
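To get a feel for what a 6-month doubling time implies, here is the arithmetic as a short Python sketch. It takes the figures in the paragraph above as given (2012 to 2020, doubling every 6 months); the 24-month comparison line is an added assumption, included only as a rough stand-in for ordinary hardware improvement.

```python
def growth_factor(years, doubling_months):
    """Total multiplicative growth after `years` at a fixed doubling time."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings


# Figures taken from the text: 2012-2020 with a 6-month doubling time.
ml_compute = growth_factor(years=8, doubling_months=6)

# Assumed comparison point: a 24-month doubling time over the same period.
slow_baseline = growth_factor(years=8, doubling_months=24)

print(ml_compute)     # 65536.0 (16 doublings)
print(slow_baseline)  # 16.0 (4 doublings)
```

On these numbers, the compute used for training runs grew by a factor of tens of thousands over the period, far beyond what a fixed desktop-sized budget could absorb.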

Algorithms do matter. I think finding the right algorithms and data, rather than getting enough compute, is probably the biggest current obstacle for extremely compute-rich organizations like OpenAI or Google right now. But it is nevertheless indisputable that algorithms have not had the primary importance Yudkowsky attributed to them, in the absence of vastly increased compute. Put it this way: there still exist comparatively compute-frugal AI startups like Keen Technologies -- but even these still need to buy things like a DGX station that would have been, by a wide margin, the most powerful supercomputer in the world if it had existed in 2008. So a comparatively compute-frugal program now is still compute-rich beyond anything Yudkowsky points to over the course of the debate.

(If you're further interested in the topic you should of course read Gwern on the scaling hypothesis.)

Yudkowsky himself sometimes appears to have changed his mind at least somewhat -- if he still thought that algorithms were the key to AGI, he wouldn't have advocated for banning huge GPU clusters with international law, because that's the kind of thing which would predictably focus more attention on improved algorithms, no?

On the other hand -- he seems (?) to still think that if only AI researchers were smart enough, progress would not involve huge compute? From his discussion with Ngo:

[A] lot of the current interesting results have been from people spending huge compute (as wasn't the case to nearly the same degree in 2008) and if things happen on short timelines it seems reasonable to guess that the future will look that much like the present. This is very much due to cognitive limitations of the researchers rather than a basic fact about computer science, but cognitive limitations are also facts and often stable ones.

"The last decade of progress has depended on compute because everyone is too stupid to program human-level AI on a 2008 computer," could be the most Yudkowskan possible response to the evidence of the past ten years.

But -- regardless of Yudkowsky's current position -- it still remains that you'd have been extremely surprised by the last decade's use of compute if you had believed him, and much less surprised if you had believed Hanson.

Predictions -- Harder to Call

The above cases seem to me relatively clear.

The cases below seem to me pretty sensitive to what kind of predictions you take Hanson and Yudkowsky to be making, and how favorably or unfavorably you read them. There are greater interpretive degrees of freedom.

Nevertheless I include this section, mostly because I've seen various claims that evidence supports one person or another.

"Human Content is Unimportant Compared to the Right Architecture"

A topic that comes up again and again over the course of the debate -- particularly in its later parts -- is how important the "content" of all prior human civilization might be.

That is, consider all the explicit knowledge encoded in all the books humans have written. Consider also all the implicit knowledge encoded in human praxis and tradition: how to swing an axe to cut down a tree, how to run a large team of AI science researchers, how to navigate different desired levels of kitchen cleanliness among roommates, how to use an arc-welder, how to calm a crying baby, and so on forever. Consider also all the content encoded not even in anyone's brains, but in the economic and social relationships without which society does not function.

How important is this kind of "content"?

It could be that this content, built up over the course of human civilization, is actually something AI would likely need. After all, humans take the first two decades or so of their life trying to absorb a big chunk of it. So it might be difficult for an AI to rederive all human scientific knowledge without this content.

Alternately, the vast edifice of prior human civilization and knowledge might fall before a more elegant AI architecture. The AI might find that it could easily recreate most of this knowledge without much difficulty, then quickly vault past it.

Hanson generally thinks that this content is extremely important.

...since a million years ago when humans probably had language, we are now a vastly more powerful species, because we used this ability to collect cultural content and built up a vast society that contains so much more. I think that if you took humans and made some better architectural innovations to them and put a pile of them off in the forest somewhere, we’re still going to outcompete them if they’re isolated from us because we just have this vaster base that we have built up since then. (FOOM, 449, emphasis mine)

And again, Hanson:

I see our main heritage from the past as all the innovations embodied in the design of biological cells/bodies, of human minds, and of the processes/habits of our hunting, farming, and industrial economies. These innovations are mostly steadily accumulating modular “content” within our architectures, produced via competitive processes and implicitly containing both beliefs and values.

Yudkowsky, on the other hand, thinks that with the right architecture you can just skip over a lot of human content:

It seems to me at least that if we look at the present cognitive landscape, we’re getting really strong information that... humans can develop all sorts of content that lets them totally outcompete other animal species who have been doing things for millions of years longer than we have by virtue of architecture, and anyone who doesn’t have the architecture isn’t really in the running for it. (FOOM, 448)

Notably, Yudkowsky has also claimed, some years after the debate, that the evidence supports him in this domain.

In 2017 AlphaGoZero was released, which was able to learn Go at a superhuman level without learning from any human games at all. Yudkowsky then explained how this was evidence for his position:

I emphasize how all the mighty human edifice of Go knowledge, the joseki and tactics developed over centuries of play, the experts teaching children from an early age, was entirely discarded by AlphaGo Zero with a subsequent performance improvement. These mighty edifices of human knowledge, as I understand the Hansonian thesis, are supposed to be the bulwark against rapid gains in AI capability across multiple domains at once. I was like "Human intelligence is crap and our accumulated skills are crap" and this appears to have been borne out.

Yes, Go is a closed system allowing for self-play. It still took humans centuries to learn how to play it. Perhaps the new Hansonian bulwark against rapid capability gain can be that the environment has lots of empirical bits that are supposed to be very hard to learn, even in the limit of AI thoughts fast enough to blow past centuries of human-style learning in 3 days; and that humans have learned these vital bits over centuries of cultural accumulation of knowledge, even though we know that humans take centuries to do 3 days of AI learning when humans have all the empirical bits they need; and that AIs cannot absorb this knowledge very quickly using "architecture", even though humans learn it from each other using architecture. If so, then let's write down this new world-wrecking assumption (that is, the world ends if the assumption is false) and be on the lookout for further evidence that this assumption might perhaps be wrong.

Tl;dr: As others are already remarking, the situation with AlphaGo Zero looks nothing like the Hansonian hypothesis and a heck of a lot more like the Yudkowskian one.

So Yudkowsky says.

If we round off Hanson's position to "content from humans is likely to matter a lot" and Yudkowsky's to "human content is crap," then I think that AlphaGoZero is some level of evidence in support of Yudkowsky's view. (Although Hanson responded by saying it was a very small piece of evidence, because his view always permitted narrow tools to make quick progress without content, and AGZ is certainly a narrow tool.)

On the other hand, is it the only piece of evidence reality gives us on this matter? Is it the most important?

One additional piece of data is that some subsequent developments of more complex game-playing AI have not been able to discard human data. Neither DeepMind's StarCraft II agent nor OpenAI's Dota 2 agent -- both of which came after the Go-playing AIs -- was able to train without being jumpstarted by human data. StarCraft II and Dota 2 are far more like the world than Go -- they involve partial information, randomness, and much more complex ontologies. So this might be an iota of evidence for something like a Hansonian view.

But far more importantly, and even further in the same direction -- non-narrow tools like GPT-4 are generally trained by dumping a significant fraction of all written human content into them. Training them well currently relies in part on mildly druidical knowledge about the right percent of the different parts of human content to dump into them -- should you have 5% code or 15% code? Multilingual or not? More arXiv or more Stack Overflow? There is reasonable speculation that we will run out of sufficient high-quality human content to feed these systems. The recent PaLM 2 paper has 18 authors for the data section -- more than it has for the architecture section! (Although both have fewer than the infrastructure section gets, of course -- how to employ compute still remains big.) So content is hugely important for LLMs.

Given that GPT-4 and similar programs look to be by far the most generally intelligent AI entities in the real world rather than a game world yet made, it's hard for me to see this as anything other than some evidence that content in Hanson's sense might matter a lot. If LLMs matter more for future general intelligence than AlphaGoZero -- which is a genuinely uncertain "if" for me -- then Hanson probably gets some fractional number of Bayes points over Yudkowsky. If not, maybe the reverse?

I don't think the predictions are remotely clear enough for either person to claim reality as on their side.

"Simple AI architectures will generalize very well" (Claim probably not made)

Different AI architectures can be more simple or more complex.

AlphaGo, which combines Monte-Carlo Tree Search, a policy network and a value network, is probably more architecturally complex than GPT-3, which is mostly a single giant transformer. Something like DreamerV3 is probably more complex than either, although you very quickly get into discussion of "what counts as complexity?" But there is in any event a spectrum of architectural complexity out there -- a system of one giant neural network trained end-to-end is relatively less complex, and a system of multiple neural networks trained with different objective functions is relatively more complex.

Yudkowsky has claimed (since the FOOM debate) that he predicted (in the FOOM debate) something akin to "simple architectures will generalize very well over broad domains." Thus, during his discussion with Ngo last year:

But you can also see powerful practical hints that these things [intelligence and agency] are much more correlated than, eg, Robin Hanson was imagining during the FOOM debate, because Robin did not think something like GPT-3 should exist; Robin thought you should need to train lots of specific domains that didn't generalize. I argued then with Robin that it was something of a hint that humans had visual cortex and cerebellar cortex but not Car Design Cortex, in order to design cars. Then in real life, it proved that reality was far to the Eliezer side of Eliezer on the Eliezer-Robin axis, and things like GPT-3 were built with less architectural complexity and generalized more than I was arguing to Robin that complex architectures should generalize over domains.

In general, I think right now it does look like you can get a pretty architecturally simple network doing a lot of cool cross-domain things. So if Yudkowsky had predicted it and Hanson had denied it, it would be some level of evidence for Yudkowsky's view over Hanson's.

The problem is that Yudkowsky mostly.... just doesn't seem to predict this unambiguously? I have ctrl-f'd for "car," "automobile," "cortex" through the PDF, and just not found that particular claim.

He does make some similar claims. For instance, Yudkowsky does claim that human level AI will be universally cross-domain.

In other words, trying to get humanlike performance in just one domain is divorcing a final product of that economy from all the work that stands behind it. It’s like having a global economy that can only manufacture toasters, but not dishwashers or light bulbs. You can have something like Deep Blue that beats humans at chess in an inhuman, specialized way; but I don’t think it would be easy to get humanish performance at, say, biology R&D, without a whole mind and architecture standing behind it that would also be able to accomplish other things. Tasks that draw on our cross-domain-ness, or our long-range real-world strategizing, or our ability to formulate new hypotheses, or our ability to use very high-level abstractions—I don’t think that you would be able to replace a human in just that one job, without also having something that would be able to learn many different jobs.

Unfortunately, this is a claim that an architecture will have breadth, but not a claim about the simplicity of the architecture. It is also -- granting that we don't have AIs that can do long-range planning -- one for which we haven't received good information.

Here's a claim Yudkowsky and Hanson disagree about that could be interpreted as "simple architectures will generalize far" -- Yudkowsky says that only a few insights separate AI from being human-level.

On one hand, you'd think that saying a "few insights" separate AI from human-level-ness sort-of implies that the AI would have a simple architecture. But on the other hand, you could truthfully say only a few insights let you steer rockets around, fundamentally... but rockets nevertheless have pretty complex architectures. I'm not sure that the notion of "few insights" really corresponds to "simple architecture." In the dialog, it more seems to correspond to.... FOOM-ability, to the idea that you can find an insight while thinking in a basement that lets your thinking improve 2x, which is indifferent to the simplicity of the architecture the insight lets you find.

Let me return to what Yudkowsky and Hanson actually say, to show why.

Yudkowsky claims that a small handful of insights will likely propel an ML model from infrahumanity to superhumanity. He characterises the number as "about ten" (FOOM, 445) but also says it might be just one or two important ones (FOOM, 450). He affirms that "intelligence is about architecture" and that "architecture is mostly about deep insights" (FOOM, 406, emphasis his) and thus that the people who make an AI FOOM will have done so because of new deep insights (FOOM, 436).

Hanson, by contrast, thinks "powerful architectural insights are quite rare" (FOOM, 496). He believes that "most tools require lots more than a few key insights to be effective—they also require thousands of small insights that usually accumulate from a large community of tool builders and users" (FOOM, 10). He does think that there are some large insights in general, but insights "are probably distributed something like a power law, with many small-scope insights and a few large-scope" (FOOM, 144).

We shouldn’t underrate the power of insight, but we shouldn’t overrate it either; some systems can just be a mass of details, and to master such systems you must master those details. And if you pin your hopes for AI progress on powerful future insights, you have to ask how often such insights occur, and how many we would need. The track record so far doesn’t look especially encouraging. (FOOM, 351)
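
One way to make the "power law" picture concrete is a small calculation. The sketch below is only an illustration under stated assumptions: the function name `top_share`, the count of 1000 insights, and the exponents are all hypothetical and do not come from the debate. It assumes the k-th largest insight has scope proportional to k raised to a negative exponent, and asks what fraction of the total the ten largest insights account for.

```python
def top_share(n_insights: int, top_k: int, exponent: float) -> float:
    """Fraction of total 'scope' held by the top_k largest insights,
    assuming the k-th largest insight has scope proportional to k ** -exponent."""
    scopes = [k ** -exponent for k in range(1, n_insights + 1)]
    return sum(scopes[:top_k]) / sum(scopes)


# Hypothetical numbers: 1000 insights, and three illustrative exponents.
for exponent in (0.5, 1.0, 2.0):
    share = top_share(1000, 10, exponent)
    print(f"exponent={exponent}: top 10 of 1000 insights hold {share:.0%} of total scope")
```

With a steep exponent the few largest insights dominate the total, and with a shallow one the long tail of small insights does. So "insights follow a power law" is compatible with either side's picture until someone commits to how steep the distribution is.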

So Hanson in general thinks AI will look like most technology -- see the progress of planes, cars, guns, and so on -- in that progress comes from hundreds of tiny refinements and improvements. There's no moment in the history of planes where they suddenly become useful -- there are 100s of small and big improvements all gradually moving planes from "mostly useless, with rare exceptions" to "incredibly useful."

Yudkowsky, on the other hand, thinks that AI will look more like a handful of "eureka!" moments, followed up by some coding and subsequent world-transformation. This is witnessed, of course, by the plan of MIRI (then the Singularity Institute) to build a seed AGI entirely on their own.

If we take this as the disagreement -- will AI progress come from a handful of big insights, or many small ones -- I think the world right now looks a great deal more like Hanson's view than Yudkowsky's. In his interview with Lex Fridman, Sam Altman characterizes GPT-4 as improving on GPT-3 in a hundred little things rather than a few big things, and that's... by far... my impression of current ML progress. So when I interpret their disagreement in terms of the kind of work you need to do before attaining AGI, I tend to agree that Hanson is right.

On the other hand, we could return to saying that "few insights" implies "simple architecture." I don't think this is... exactly... implied by the text? I'll admit that the vibes are for sure more on Yudkowsky's side. So if we interpret the text that way, then I'd tend to agree that Yudkowsky is right.

Either way, though, I don't think Yudkowsky and Hanson were really clear about what was going on and about what kind of anticipations they were making.


I was going to have a whole section of things that didn't quite make the cut vis-a-vis predictions, but were super suggestive, but that could be seen as trying to influence the results on my part. So I'm just going to bail instead, mostly.


Who was more right?

When I look at the above claims, Hanson's record looks a little better than Yudkowsky's, albeit with a small sample size. If you weight the Cyc prediction a ton, maybe you could get them to parity. I think it would be weird not to see the compute prediction as a little more important than the Cyc prediction, though.

Note that Hanson currently thinks the chances of AI doom are < 1%, while Yudkowsky thinks that they are > 99%. (Hanson thinks the chances of doom are... maybe somewhat lower than Yudkowsky, but they seem to have different ontologies of what qualifies as "doom" as the comments point out.)

What Actual Lessons Can We Learn, Other Than Some Stuff About Deferral to Authority That Everyone Will Ignore Because We Like to Pretend We Do Not Defer to Authority, Even Though We All Fucking Do?

I was mildly surprised by how well some economic abstractions hold up.

The big part of the meta-debate in FOOM -- which they return to over and over again -- is whether you should try to use mostly only mental tools whose results have proven useful in the past.

Hanson's view is that if you use rules which you think retrodict data well but which haven't been vetted by actual predictions, you are almost certain to make mistakes because humans psychologically cannot distinguish actual retrodictions from post-hoc fitting. To avoid this post-hoc fitting, you should only use tools which have proven useful for actual predictions. Thus, he prefers to use economic abstractions which have been thus vetted over novel abstractions invented for the purpose.

I think this holds up pretty well. Yudkowsky makes predictions about future use of compute in AI, based on his attempted retrodictions about human evolution, human skull size, and so on. These predictions mostly failed. On the other hand, Hanson makes some predictions about AI converging to more similar systems, about advances in these systems mutually improving competing systems, and so on, based only on economic theory. These predictions succeeded.

Overall, I think "don't lean heavily on abstractions you haven't yet gotten actual good predictions from" comes out pretty well from the debate, and I continue to heavily endorse research evaluation proposals related to it.


71 comments

Moreover, granting neural networks, trading cognitive content has turned out to be not particularly hard. It does not require superintelligence to share representations between different neural networks; a language model can be adapted to handle visual data without enormous difficulty. Encodings from BERT or an ImageNet model can be applied to a variety of downstream tasks, and this is by now a standard element in toolkits and workflows. When you share architectures and training data, as for two differently fine-tuned diffusion models, you can get semantically meaningful merges between networks simply by taking the actual averages of their weights. Thoughts are not remotely "written in a different language."

Huh, I am very surprised by this section. When I read the description I thought you would obviously call this prediction the other way around. 

The part where you can average weights is unique to diffusion models, as far as I can tell, which makes sense because the 2-d structure of the images is very local, and so this establishes a strong preferred basis for the representations of different networks. 

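Exchanging knowledge between two language models currently seems approximately impossible? Like, you can train on the outputs, but I don't think there is really any way for two language models to learn from each other by exchanging any kind of cognitive content, or to improve the internal representations of a language model by giving it access to the internal representations of another language model.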

I would also call this one for Eliezer. I think we mostly just retrain AI systems without reusing anything. I think that's what you'd guess on Eliezer's model, and very surprising on Robin's model. The extent to which we throw things away is surprising even to a very simple common-sense observer.

I would have called "Human content is unimportant" for Robin---it seems like the existing ML systems that are driving current excitement (and are closest to being useful) lean extremely heavily on imitation of human experts and mostly don't make new knowledge themselves. So far game-playing AI has been an exception rather than the rule (and this special case was already mostly established by the time of the debate).

That said, I think it would be reasonable to postpone judgment on most of these questions since we're not yet in the end of days (Robin thinks it's still fairly far, and Eliezer thinks it's close but things will change a lot by the intelligence explosion). The main ones I'd be prepared to call unambiguously already are:

  • Short AI timelines and very general AI architectures: obvious advantage to Eliezer.
  • Importance of compute, massive capital investment, and large projects selling their output to the world: obvious advantage to Robin.

These aren't literally settled, but market odds have moved really far since the debate, and they both seem like defining features of the current world. In each case I'd say that one of the two participants was clearly super wrong and the other was basically right.

Max H (4mo):
If someone succeeds in getting, say a ~13B parameter model to be equal in performance (at high-level tasks) to a previous-gen model 10x that size, using a 10x smaller FLOPs budget during training, isn't that a pretty big win for Eliezer? That seems to be kind of what is happening: this list mostly has larger models at the top, but not uniformly so. I'd say, it was more like, there was a large minimum amount of compute needed to make things work at all, but most of the innovation in LLMs comes from algorithmic improvements needed to make them work at all. Hobbyists and startups can train their own models from scratch without massive capital investment, though not the absolute largest ones, and not completely for free. This capability does require massive capital expenditures by hardware manufacturers to improve the underlying compute technology sufficiently, but massive capital investments in silicon manufacturing technology are nothing new, even if they have been accelerated and directed a bit by AI in the last 15 years. And I don't think it would have been surprising to Eliezer (or anyone else in 2008) that if you dump more compute at some problems, you get gradually increasing performance. For example, in 2008, you could have made massive capital investments to build the largest supercomputer in the world, and gotten the best chess engine by enabling the SoTA algorithms to search 1 or 2 levels deeper in the Chess game tree. Or you could have used that money to pay for researchers to continue looking for algorithmic improvements and optimizations.

The part where you can average weights is unique to diffusion models, as far as I can tell, which makes sense because the 2-d structure of the images is very local, and so this establishes a strong preferred basis for the representations of different networks.

Exchanging knowledge between two language models currently seems approximately impossible? Like, you can train on the outputs, but I don't think there is really any way for two language models to learn from each other by exchanging any kind of cognitive content, or to improve the internal representations of a language model by giving it access to the internal representations of another language model.

There's a pretty rich literature on this stuff, transferring representational/functional content between neural networks.

Averaging weights to transfer knowledge is not unique to diffusion models. It works on image models trained with non-diffusion setups, as well as on non-image tasks such as language modeling. Exchanging knowledge between language models via weight averaging is possible pr... (read more)

I think requiring a "common initialization + early training trajectory" is a pretty huge obstacle to knowledge sharing, and would de-facto make knowledge sharing among the vast majority of large language models infeasible.  I do think stuff like stitching via cross-attention is kind of interesting, but it feels like a non-scalable way of knowledge sharing, unless I am misunderstanding how it works. I don't know much about Knowledge Distillation, so maybe that is actually something that would fit the "knowledge sharing is easy" description (my models here aren't very confident, and I don't have super strong predictions on whether knowledge sharing among LLMs is possible or impossible, my sense was just that so far we haven't succeeded at doing it without very large costs, which is why, as far as I can tell, new large language models are basically always trained from scratch after we made some architectural changes).

I think requiring a "common initialization + early training trajectory" is a pretty huge obstacle to knowledge sharing, and would de-facto make knowledge sharing among the vast majority of large language models infeasible.

Agreed. That part of my comment was aimed only at the claim about weight averaging only working for diffusion/image models, not about knowledge sharing more generally.
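
For readers unfamiliar with what "weight averaging" means mechanically, here is a minimal sketch. It is a toy and not any library's API: the `Weights` type, the `average_weights` function, and the two tiny "models" are hypothetical names made up for illustration, and plain Python lists stand in for the tensors a real framework would use. It assumes both models share one architecture, meaning the same parameter names and shapes. As the comments above note, in practice the merge tends to be meaningful mainly when the models also share an initialization and early training trajectory.

```python
from typing import Dict, List

# Hypothetical stand-in for a model checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def average_weights(a: Weights, b: Weights, alpha: float = 0.5) -> Weights:
    """Elementwise interpolation: alpha * a + (1 - alpha) * b.

    Raises ValueError if the two models do not share parameter names and shapes.
    """
    if a.keys() != b.keys():
        raise ValueError("models do not share parameter names")
    merged: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [alpha * x + (1 - alpha) * y for x, y in zip(a[name], b[name])]
    return merged


# Two toy "fine-tunes" with identical structure (made-up numbers).
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = average_weights(model_a, model_b)
print(merged)
```

The operation itself is trivial. The open question in this thread is when the averaged weights still compute something useful, which the code cannot answer.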

I do think stuff like stitching via cross-attention is kind of interesting, but it feels like a non-scalable way of knowledge sharing, unless I am misunderstanding how it works.

Not sure I see any particular argument against the scalability of knowledge exchange between LLMs in general or via cross-attention, though. Especially if we're comparing the cost of transfer to the cost of re-running the original training. That's why people are exploring this, especially smaller/independent researchers. There's a bunch of concurrent recent efforts to take frozen unimodal models and stitch them into multimodal ones (example from a few days ago). Heck, the dominant approach in the community of LLM hobbyists seems to be transferring behaviors and knowledge from GPT-4 into LLaMA variants via targeted synthetic data generation. What kind of scalability are you thinking of?

In addition to what cfoster0 said, I'm kinda excited about the next ~2-3 years of cross LLM knowledge transfer, so this seems a differing prediction about the future, which is fun.

My model for why it hasn't happened already is in part just that most models know the same stuff, because they're trained on extremely similar enormous swathes of text, so there's no gain to be had by sticking them together. That would be why more effort goes into LLM / images / video glue than LLM / LLM glue.

But abstractly, a world where LLMs can meaningfully be connected to vision models but not to other LLMs would be surprising to me. I expect something like training a model on code, and another model on non-code text, and then sticking them together to be possible.

If we take this as the disagreement -- will AI progress come from a handful of big insights, or many small ones -- I think the world right now looks a great deal more like Hanson's view than Yudkowsky's. In his interview with Lex Fridman, Sam Altman characterizes GPT-4 as improving on GPT-3 in a hundred little things rather than a few big things, and that's... by far... my impression of current ML progress. So when I interpret their disagreement in terms of the kind of work you need to do before attaining AGI, I tend to agree that Hanson is right.

This also feels confused to me. Of course the key insight of the Transformer architecture was super simple, and as far as I can tell the primary difference between GPT-4 and GPT-3 is throwing a lot more compute at it, combined with a lot of engineering work to get it to work at larger scales and more GPUs (in a way that doesn't substantially improve performance). 

We don't know how GPT-4 works, but I would currently bet that within 2-3 years we will see a system that gets GPT-4 performance and compute-efficiency whose source-code is extremely simple and does not require a lot of clever hacks, but whose difference from GPT-3 will be best characterized by "0 to 2 concrete insights that improved things", since that is exactly what we've seen with GPT-2 and GPT-3. The first system to reach a capability threshold often has a bunch of hacks, usually stemming from a lack of polish or understanding or just bugs, which then iteratively get pared down as progress continues.

I agree I'm confused here. But it's hard to come down to clear interpretations. I kinda think Hanson and Yudkowsky are also confused.

Like, here are some possible interpretations on this issue, and how I'd position Hanson and Yudkowsky on them based on my recollection and on vibes.

  1. Improvements in our ways of making AI will be incremental. (Hanson pro, Yudkowsky maaaybe con, and we need some way to operationalize "incremental", so probably just ambiguous)
  2. Improvements in our ways of making AI will be made by lots of different people distributed over space and time. (Hanson pro, Yudkowsky maybe con, seems pretty Hanson favored)
  3. AI in its final form will have elegant architecture (Hanson more con, Yudkowsky more pro, seems Yudkowsky favored, but I'm unhappy with what "elegant" means)

Or even 4. People know when they're making a significant improvement to AI -- the difference between "clever hack" and "deep insight" is something you see from beforehand just as much as afterwards. (Hanson vibes con, Yudkowsky vibes pro, gotta read 1000 pages of philosophy of progress before you call it, maybe depends on the technology, I tend to think people often don't know)

Which is why this overall section is in the "hard to call" area.

Eli Tyre (4mo):
Here's a market for your claim. GPT-4 performance and compute efficiency from a simple architecture before 2026  
Eli Tyre (4mo):
How do I embed the market directly into the comment, instead of having a link to which people click through?
You just copy the link to the market, and if you paste it into an empty new paragraph it should automatically be replaced with an embed.

Note that Hanson currently thinks the chances of AI doom are < 1%, while Yudkowsky thinks that they are > 99%.


It is good to note that Hanson's optimistic scenario would be considered doom by many (including Yudkowsky). Yudkowsky's definition of doom/utopia is not the same as Hanson's.

This is important in many discussions. Many non-doomers have definitions of utopia that many others consider dystopian. E.g., AI will replace humans to create a very interesting future where the AIs conquer the stars; some think this is positive, while others think this is doom because there are no humans.

Lukas Finnveden (4mo):
This was also my impression. Curious if OP or anyone else has a source for the <1% claim? (Partially interested in order to tell exactly what kind of "doom" this is anti-predicting.)
Here is a summary of the Hanson position (by himself). He is very clear about humanity being replaced by AI.

I skimmed this, but I get the sense that you're interpreting Hanson's predictions in ways that he would not have agreed with. My cached thoughts suggest that Hanson's model predicts deep learning couldn't possibly work, because creating "intelligence" will require lots of custom engineering for different skills instead of "GPU go brr". Hence his admiration of Cyc: it is focusing on implementing a whole host of skills with lots of integrated knowledge.

See his post "I heart CYC". Here's a quote from it, which I think highlights Hanson's own interpretation of "architecture is overrated":

The lesson Lenat took from EURISKO is that architecture is overrated; AIs learn slowly now mainly because they know so little. So we need to explicitly code knowledge by hand until we have enough to build systems effective at asking questions, reading, and learning for themselves. Prior AI researchers were too comfortable starting every project over from scratch; they needed to join to create larger integrated knowledge bases. This still seems to me a reasonable view, and anyone who thinks Lenat created the best AI system ever should consider seriously the lesson he thinks he learned.

That sure doesn't l... (read more)

I do in fact include the same quote you include in the section titled "Cyc is not a Promising Approach to Machine Intelligence." That's part of the reason why that section resolves in favor of Yudkowsky. I agree that Hanson thinks skills in general will be harder to acquire than Yudkowsky thinks. I think that could easily be another point for Yudkowsky in the "human content vs right architecture." Like many points there in that section, I don't think it's operationalized particularly well, which is why I don't call it either way.

But -- regardless of Yudkowsky's current position -- it still remains that you'd have been extremely surprised by the last decade's use of compute if you had believed him, and much less surprised if you had believed Hanson.

I think you are pointing towards something real here, but also, algorithmic progress is currently outpacing compute growth by quite a bit, at least according to the Epoch AI estimates I remember. I also expect algorithmic progress to increase in importance. 

I do think that some of the deep learning revolution turned out to be kind of compute bottlenecked, but I don't believe this is currently that true anymore, though I think it's kind of messy (since it's unclear what fraction of compute-optimizations themselves were bottlenecked on making it cheaper to experiment by having cheaper compute). 

I do think that some of the deep learning revolution turned out to be kind of compute bottlenecked, but I don't believe this is currently that true anymore

I had kind of the exact opposite impression of compute bottlenecks (that deep learning was not meaningfully compute bottlenecked until very recently). OpenAI apparently has a bunch of products and probably also experiments that are literally just waiting for H100s to arrive. Probably this is mainly due to the massive demand for inference, but still, this seems like a kind of actual hardware bottleneck that is pretty new for the field of DL. It kind of has a parallel to Bitcoin mining technology, where the ability to get the latest-gen ASICs first was (still is?) a big factor in miner profitability.

Huh, maybe. My current guess is that things aren't really "compute bottlenecked". It's just the case that we now have profitable enough AI that we really want to have better compute. But if we didn't get cheaper compute, we would still see performance increase a lot as we find ways to improve compute-efficiency the same way we've been improving it a lot over the past 5-10 years, and that for any given period of time, the algorithmic progress is a bigger deal for increasing performance than the degree to which compute got cheaper in the same period.
I'd say usually bottlenecks aren't absolute, but instead quantifiable and flexible based on costs, time, etc.? One could say that we've reached the threshold where we're bottlenecked on inference-compute, whereas previously talk of compute bottlenecks was about training-compute. This seems to matter for some FOOM scenarios since e.g. it limits the FOOM that can be achieved by self-duplicating. But the fact that AI companies are trying their hardest to scale up compute, and are also actively researching more compute-efficient algorithms, means IMO that the inference-compute bottleneck will be short-lived.
In what sense are they "not trying their hardest"?
I think you inserted an extra "not".
Oh gosh, how did I hallucinate that?
Maybe you're an LLM.
This is true, but as a picture of the past, this is underselling compute by focusing on cost of compute rather than compute itself. I.e., in the period between 2012 and 2020:
  • Algo efficiency improved 44x, if we use the OpenAI efficiency baseline for AlexNet.
  • Cost of compute improved by... less than 44x, let's say, if we use a reasonable guess based off Moore's law. So algo efficiency was more important than the cost per FLOP going down.
  • But, using EpochAI's estimates for a 6 month doubling time, total compute per training run increased > 10,000x.
So just looking at cost of compute is somewhat misleading. Cost per FLOP went down, but the amount spent went up from just dollars on a training run to tens of thousands of dollars on a training run.
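
The arithmetic behind the comment above can be checked in a few lines. This is a back-of-the-envelope sketch, not a careful estimate. The 6-month doubling time and the 44x figure are the ones cited in the comment. The 2-year halving of cost per FLOP is an assumption added here as one common reading of "Moore's law", and the helper name `growth_from_doubling` is hypothetical.

```python
def growth_from_doubling(years: float, doubling_time_years: float) -> float:
    """Total multiplicative growth over `years` given a fixed doubling time."""
    return 2 ** (years / doubling_time_years)


YEARS = 2020 - 2012  # the period discussed in the comment

# Training compute per run, using the 6-month doubling time cited above.
compute_growth = growth_from_doubling(YEARS, 0.5)

# ASSUMPTION: cost per FLOP halves every 2 years (one reading of Moore's law).
cost_improvement = growth_from_doubling(YEARS, 2.0)

# Algorithmic efficiency gain cited above (OpenAI's AlexNet baseline).
algo_efficiency = 44

# If compute per run grew faster than cost per FLOP fell, spending per run
# must have grown by roughly the ratio of the two.
implied_spend_growth = compute_growth / cost_improvement

print(f"compute per run: {compute_growth:,.0f}x")
print(f"cost per FLOP:   {cost_improvement:,.0f}x cheaper")
print(f"algo efficiency: {algo_efficiency}x")
print(f"implied spend:   {implied_spend_growth:,.0f}x")
```

Under these assumptions, growth in compute per run is far larger than either the algorithmic gain or the fall in cost per FLOP, and the gap has to be covered by more spending per run.
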
It is ridiculous to interpret this as some general algo efficiency improvement - it's a specific improvement in a specific measure (flops) which doesn't even directly translate into equivalent wall-clock time performance, and is/was already encapsulated in sparsity techniques. There has been extremely little improvement in general algorithm efficiency, compared to hardware improvement.
Not disagreeing. Am still interested in a longer-form view of why the 44x estimate overestimates, if you're interested in writing it (think you mentioned looking into it one time).
It's like starting with an uncompressed image, and then compressing it further each year using different compressors (which aren't even the best known, as better compressors were already known earlier or at the beginning), and then measuring the data size reduction over time and claiming it as a form of "general software efficiency improvement". It's nothing remotely comparable to Moore's law progress (which more generally actually improves a wide variety of software).
This is not right, at least in computer vision. They seem to be the same order of magnitude. Physical compute has grown at 0.6 OOM/year and physical compute requirements have decreased at 0.1 to 1.0 OOM/year; see a summary here or an in-depth investigation here. Another relevant quote
Cool, makes sense. Sounds like I remembered the upper bound for the algorithmic efficiency estimate. Thanks for correcting!
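
Rates in "OOM/year" (orders of magnitude per year) and doubling times are two ways of stating the same thing, and converting between them makes the figures in this exchange easier to compare. A small sketch follows; the helper names are hypothetical, and the rates are the ones quoted in the comment above.

```python
import math


def oom_per_year_to_factor(oom_per_year: float) -> float:
    """Multiplicative change per year for a rate given in orders of magnitude per year."""
    return 10 ** oom_per_year


def oom_per_year_to_doubling_time(oom_per_year: float) -> float:
    """Years needed to double (or halve) at the given OOM/year rate."""
    return math.log10(2) / oom_per_year


rates = [
    ("physical compute growth", 0.6),
    ("algorithmic gain, low estimate", 0.1),
    ("algorithmic gain, high estimate", 1.0),
]
for label, rate in rates:
    factor = oom_per_year_to_factor(rate)
    months = 12 * oom_per_year_to_doubling_time(rate)
    print(f"{label}: {factor:.2f}x per year, doubling every {months:.1f} months")
```

A rate of 0.6 OOM/year works out to a doubling time of roughly six months, which lines up with the doubling-time figure quoted earlier in the thread. The 0.1 to 1.0 range for algorithmic gains spans doubling times from about three years down to under four months.
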
Algorithmic improvement has more FOOM potential. Hardware always has a lag. 
That is, to a very basic approximation, correct. Davidson's takeoff model illustrates this point, where a "software singularity" happens for some parameter settings due to software not being restrained to the same degree by capital inputs. I would point out however that our current understanding of how software progress happens is somewhat poor. Experimentation is definitely a big component of software progress, and it is often understated on LW. More research on this soon!

Note that Hanson currently thinks the chances of AI doom are < 1%

I think this is a common misconception of Hanson's views. If you define "doom" as human extinction, he's put it at about 30% within one year after human-level AI (I don't have a more recent link on hand but I've seen him talk about it on Twitter a few times, and I don't think he's changed his views substantially).

Hanson's chance of extinction is close to 100%. He just thinks it's slower. He is optimistic about something that most would call a dystopia (a very interesting technological race that will conquer the stars before the grabby aliens do). The discussion between Yudkowsky and Hanson is about whether we are dying fast or slow. It is not really a doomer vs non-doomer debate from my perspective (still a very interesting debate btw, both have good arguments). I do appreciate the Hanson perspective. It is well thought out and coherent. I just would not call it optimistic (because of the extinction). I have no ready example of a coherent non-extinction view of the future. Does anybody have a good example of one?
Yeah, one example is the view that AGI won't happen, either because it's just too hard and humanity won't devote sufficient resources to it, or because we recognize it will kill us all.

I think this is a pretty good and fair roundup, but I want to add a very lazy bit of personal context, short of actually explaining my takes:

Both when I read the FOOM debate, and skimming over it again now, in my personal opinion Yudkowsky largely comes off better. Yudkowsky makes a few major mistakes that are clearly visible now, like being dismissive of dumb, scaled, connectionist architectures, but the arguments seem otherwise repairable. By contrast, I do not know how to defend Hanson's position well.

I don't state this to claim a winner, and for sure there are people who read the arguments the other way, but only to suggest to the reader, if you have the time, consider taking a look and forming your own opinion.

I don't think that's a mistake at all. Sure, they've given us impressive commercial products, but no progress towards AGI, so the dismissiveness is completely justified.
This doesn't feel like a constructive way to engage with the zeitgeist here. Obviously Yudkowsky plus most people here disagree with you on this. As such, if you want to engage productively on this point, you should find a place better set up to discuss whether NNs uninformatively dead-end. Two such places are the open thread or a new post where you lay out your basic argument.

Yudkowsky seems quite wrong here, and Hanson right, about one of the central trends -- and maybe the central trend -- of the last dozen years of AI. Implementing old methods more vigorously is more or less exactly what got modern deep learning started; algorithms in absence of huge compute have achieved approximately nothing.


Really? If you sent a bunch of H100 GPUs (and infrastructure needed to run them) back in time to 2008, people might have been able to invent transformers, GPTs, and all the little quirks that actually make them work a little faster, and a little more cheaply.

OTOH, if you sent back Attention is all you need (and some other papers or documentation on ML from the last decade), without the accompanying hardware, people likely would have gotten pretty far, pretty quickly, just using 2008-level hardware (or buying / building more and faster hardware, once they knew the right algorithms to run on them). People didn't necessarily have a use for all the extra compute, until they invented the algorithms which could actually make use of it.

Even today, just scaling up GPTs even further is one obvious thing to try that is currently somewhat bottlenecked on super...

So I think that is just straightforwardly true. Everyone times the start of the deep learning... thing, to 2012's AlexNet. AlexNet has convolutions, and ReLU, and backprop, but didn't invent any of them. Here's what Wikipedia says is important about AlexNet. So... I think that what I'm saying about how DL started is the boring consensus. Of course, new algorithms did come along, and I agree that they are important. But still -- if there's something important that has worked without big compute, what is it? (I do agree that in a counterfactual world I'd probably prefer to get Attention Is All You Need.) And yeah, I accidentally posted this a month ago for 30 min when it was in draft, so you might have seen it before.
1Max H4mo
I would say that the "vigor" was almost entirely bottlenecked on researcher effort and serial thinking time, rather than compute resources. A bunch of concrete examples to demonstrate what I mean:
* There are some product rollouts (e.g. multimodal and 32k GPT-4) and probably some frontier capabilities experiments which are currently bottlenecked on H100 capacity. But this is a very recent phenomenon, and has more to do with the sudden massive demand for inference rather than anything to do with training. In the meantime, there are plenty of other ways that OpenAI and others are pushing the capabilities frontier by things other than just piling on more layers on more GPUs.
* If you sent back the recipe for training the smallest GPT model that works at all (GPT-J 6B, maybe?), people in 2008 could probably cobble together existing GPUs into a supercomputer, or, failing that, have the foundries of 2008 fabricate ASICs, and have it working in <1 year.
* OTOH, if researchers in 2008 had access to computing resources of today, I suspect it would take them many years to get to GPT-3. Maybe not the full 14 years, since now many of their smaller-scale training runs go much quicker, and they can try larger things out much faster. But the time required to: (a) think up which experiments to try (b) implement the code to try them, and (c) analyze the results, dominates the time spent waiting around for a training run to finish.
* More generally, it's not obvious how to "just scale things up" with more compute, and figuring out the exact way of scaling things is itself an algorithmic innovation. You can't just literally throw GPUs at the DL methods of 10-15 years ago and have them work.
* "Researcher time" isn't exactly the same thing as algorithmic innovation, but I think that has been the actual bottleneck on the rate of capabilities advancement during most of the last 15 years (again, with very recent except
What is so great about that 2007 paper? Can you please explain the bizarre use of the word "compute" here? Is this a typo? "compute" is a verb. The noun form would be "computing" or "computing power."
4Max H3mo
The paper is from 2017, not 2007. It's one of the foundational papers that kicked off the current wave of transformer-based AI. The use of compute as a noun is pretty standard, see e.g. this post: Algorithmic Improvement Is Probably Faster Than Scaling Now. (Haven't checked, but ChatGPT or Bing could probably have answered both of these questions for you.)
Sorry, 2007 was a typo. I'm not sure how to interpret the ironic comment about asking an LLM, though.
5Max H3mo
It was not meant as irony or a joke. Both questions you asked are simple factual questions that you could have answered quickly on your own using an LLM or a traditional search engine.
LLMs' answers on factual questions are not trustworthy; they are often hallucinatory. Also, I was obviously asking you for your views, since you wrote the comment.

An actual improvement to say, how Transformers work, would help with speech recognition, language modelling, image recognition, image segmentation, and so on and so forth. Improvements to AI-relevant hardware are a trillion-dollar business. Work compounds so easily on other work that many alignment-concerned people want to conduct all AI research in secret.

This section feels like it misunderstands what Yudkowsky is trying to say here, though I am not confident. I expected this point to not be about "what happens if you find an improvement to transformers i...

I think an important point missing from the discussion on compute is training vs inference: you can totally get a state-of-the-art language model performing inference on a laptop.

This is a slight point in favor of Yudkowsky: thinking is cheap, finding the right algorithm (including weights) is expensive. Right now we're brute-forcing the discovery of this algorithm using a LOT of data, and maybe it's impossible to do any better than brute-forcing. (Well, the human brain can do it, but I'll ignore that.)

Could you run a LLM on a desktop from 2008? No. But, o...

So, like, I remain pretty strongly pro Hanson on this point:
1. I think LLaMA 7b is very cool, but it's really stretching it to call it a state-of-the-art language model. It's much worse than LLaMA 65b, which is much worse than GPT-4, which most people think is > 100b as far as I know. I'm using a 12b model right now while working on an interpretability project... and it is just much, much dumber than these big ones.
2. Not being able to train isn't a small deal, I think. Learning in a long-term way is a big part of intelligence.
3. Overall, and not to be too glib, I don't see why fitting a static and subhuman mind into consumer hardware from 2023 means that Yudkowsky doesn't lose points for saying you can fit a learning (implied) and human-level mind into consumer hardware from 2008.
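
To put the model sizes in this comment in concrete terms, here is a minimal sketch of the weights-only memory footprint at different precisions. The 7b, 12b, and 65b parameter counts come from the comment; the choice of 16-bit and 4-bit precision is an assumption for illustration, and the estimate ignores activations, the KV cache, and runtime overhead, so real requirements are higher.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Parameter counts mentioned in the comment.
MODELS = {"7b": 7e9, "12b": 12e9, "65b": 65e9}

# Hypothetical precisions: half precision and 4-bit quantization.
PRECISIONS = (16, 4)

if __name__ == "__main__":
    for name, n_params in MODELS.items():
        sizes = ", ".join(
            f"{bits}-bit: {weight_memory_gb(n_params, bits):.1f} GB"
            for bits in PRECISIONS
        )
        print(f"{name}: {sizes}")
```

On this estimate, a 7b model fits in laptop memory only once quantized, while a 65b model does not fit even at 4 bits on most consumer machines, which is the gap the first point is getting at.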
Because one has nothing to do with the other. LLMs are getting bigger and bigger, but that says nothing about whether a mind designed algorithmically could fit on consumer hardware.

which is indifferent to the simplicify of the architecture the insight lets you find.

The bolded should be "simplicity". 

It does not require superintelligence to share representations between different neural networks

I don’t think you can train one transformer on a dataset that doesn’t contain any mentions of the fact X but mentions fact Y, then train the second transformer on a dataset that contains Y but not X, and then easily share the knowledge of X and Y between them

Let's say we have a language model that only knows how to speak English and a second one that only knows how to speak Japanese. Is your expectation that there would be no way to glue these two LLMs together to build an English-to-Japanese translator such that training the "glue" takes <1% of the compute used to train the independent models? I weakly expect the opposite, largely based on stuff like this, and based on playing around with using algebraic value editing to get an LLM to output French in response to English (but also note that the LLM I did that with knew English and the general shape of what French looks like, so there's no guarantee that result scales or would transfer the way I'm imagining).
Correct. They're two entirely different models. There's no way they could interoperate without massive computing and building a new model. (Aside: was that a typo, or did you intend to say "compute" instead of "computing power"?)
It has historically been shown that one can interpolate between a vision model and a language model[1]. And, more recently, it has been shown that yes, you can use a fancy transformer to map between intermediate representations in your image and text models, but you don't have to do that and in fact it works fine[2] to just use your frozen image encoder, then a linear mapping (!), then your text decoder. I personally expect a similar phenomenon if you use the first half of an English-only pretrained language model and the second half of a Japanese-only pretrained language model -- you might not literally be able to use a linear mapping as above, but I expect you could use a quite cheap mapping. That said, I am not aware of anyone who has actually attempted the thing, so I could be wrong that the result from [2] will generalize that far.
Yeah, I did mean "computing power" there. I think it's just a weird way that people in my industry use words.[3]
[1] Example: DeepMind's Flamingo, which demonstrated that it was possible at all to take a pretrained language model and a pretrained vision model, and glue them together into a multimodal model, and that doing so produced SOTA results on a number of benchmarks. See also this paper, also out of DeepMind.
[2] Per Linearly Mapping from Image to Text Space.
[3] For example, see this HN discussion about it. See also the "compute" section of this post, which talks about things that are "compute-bound" rather than "bounded on the amount of available computing power". Why waste time use lot word when few word do trick?
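
The "frozen encoder, then a linear mapping, then decoder" idea can be illustrated with a toy sketch. Everything here is an assumption for illustration: the "embeddings" are synthetic random vectors rather than outputs of real models, the names are hypothetical, and the data is constructed so that an exact linear map exists. It shows only the cheap fitting step for the glue, not the empirical claim that real representation spaces are linearly related.

```python
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def fit_linear_map(xs, ys, lr=0.1, steps=500):
    """Fit W minimizing mean squared error of W x against y, by full-batch gradient descent."""
    d_in, d_out, n = len(xs[0]), len(ys[0]), len(xs)
    W = [[0.0] * d_in for _ in range(d_out)]
    for _ in range(steps):
        grad = [[0.0] * d_in for _ in range(d_out)]
        for x, y in zip(xs, ys):
            pred = matvec(W, x)
            for i in range(d_out):
                err = pred[i] - y[i]
                for j in range(d_in):
                    grad[i][j] += 2.0 * err * x[j] / n
        for i in range(d_out):
            for j in range(d_in):
                W[i][j] -= lr * grad[i][j]
    return W

rng = random.Random(0)  # fixed seed so the run is reproducible
D_IN, D_OUT, N = 3, 2, 200

# Hypothetical ground-truth relation between the two frozen representation spaces.
TRUE_W = [[rng.gauss(0, 1) for _ in range(D_IN)] for _ in range(D_OUT)]

# Stand-ins for frozen encoder outputs and the decoder-side targets.
source_embeddings = [[rng.gauss(0, 1) for _ in range(D_IN)] for _ in range(N)]
target_embeddings = [matvec(TRUE_W, x) for x in source_embeddings]

# Only the glue is trained; the "models" producing the embeddings stay fixed.
W_hat = fit_linear_map(source_embeddings, target_embeddings)
max_err = max(
    abs(W_hat[i][j] - TRUE_W[i][j]) for i in range(D_OUT) for j in range(D_IN)
)
print(f"largest deviation from the true map: {max_err:.2e}")
```

The glue here has D_IN × D_OUT parameters, which is tiny next to either model; whether a map this cheap suffices between two independently trained language models is exactly the open question in the exchange above.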
1Mikhail Samin4mo
(“There’s no way” is too strong a claim. My expectation is that there’s a way to train something from scratch, using <1% of the compute that was used to train either LLM, that works better.) But I was talking about sharing the internal representations between the two already trained transformers.

I really disagree with this article. It's basically just saying that you drank the LLM Kool-Aid. LLMs are massively overhyped. GPT-x is not the way to AGI.

This article could have been written a dozen years ago. A dozen years ago, people were saying the same thing: "we've given up on the Good Old-Fashioned AI / Douglas Hofstadter approach of writing algorithms and trying to find insights! it doesn't give us commercial products, whereas the statistical / neural network stuff does!"

And our response was the same as it is today. GOFAI is hard. No one expected t...

For what it's worth, I'm at least somewhat an LLM-plateau-ist -- on balance at least somewhat dubious we get AGI from models in which 99% of compute is spent on next-word prediction in big LLMs. I really think Nostalgebraist's take has merit and the last few months have made me think it has more merit. Yann LeCun's "LLMs are an off-ramp to AGI" might come back to show his foresight. Etc etc. But it isn't just LLM progress which has hinged on big quantities of compute. Everything in deep learning -- ResNets, vision Transformers, speech-to-text, text-to-speech, AlphaGo, EfficientZero, Dota5, VPT, and so on -- has used more and more compute. I think at least some of this deep learning stuff is an important step to human-like intelligence, which is why I think this is good evidence against Yudkowsky. If you think none of the DL stuff is a step, then you can indeed maintain that compute doesn't matter, of course, and that I am horribly wrong. But if you think the DL stuff is an important step, it becomes more difficult to maintain.

By 2011, Hanson concedes at least somewhat to Yudkowsky's position and states that Cyc might not have enough information or be in the wrong format (FOOM, 496).

I looked for it on that page, but instead it's on 497 (second-to-last numbered paragraph), where he says:

4. The AI system Eliezer most respects for its promising architecture is eurisko. Its author, Doug Lenat, concluded from it that our main obstacle is not architecture but mental content—the more one knows, the faster one can learn. Lenat’s new Cyc system has much content, though it still doesn’t learn fast. Cyc might not have enough content yet, or perhaps Lenat sought the wrong content or format.

Thank you, this has many interesting points. The takeoff question is the heart of predicting x-risk. With a soft takeoff, catastrophe seems unlikely; with a hard takeoff, it seems likely.

One point though. "Foom" was intended to be a synonym for "intelligence explosion" and "hard takeoff". But not for "recursive self-improvement", although EY perceived the latter to be the main argument for the former, though not the only one. He wrote:

[Recursive self-improvement] is the biggest, most interesting, hardest-to-analyze, sharpest break-with-the-past contributing to the

...
"The upper bound of what can be learned from a dataset is not the most capable trajectory, but the conditional structure of the universe implicated by their sum".
Being able to perfectly imitate a chimpanzee would probably also require superhuman intelligence. But such a system would still only be able to imitate chimpanzees. Effectively, it would be much less intelligent than a human. Same for imitating human text. It's very hard, but the result wouldn't yield large capabilities.
Do please read the post. Being able to predict human text requires vastly superhuman capabilities, because predicting human text requires predicting the processes that generated said text. And large tracts of text are just reporting on empirical features of the world. Alternatively, just read the post I linked.
I did read your post. The fact that something like predicting text requires superhuman capabilities of some sort does not mean that the task itself will result in superhuman capabilities. That's the crucial point. It is much harder to imitate human text than to write while being a human, but that doesn't mean the imitated human itself is any more capable than the original. An analogy. The fact that building fusion power plants is much harder than building fission power plants doesn't at all mean that the former are better. They could even be worse. There is a fundamental disconnect between the difficulty of a task and the usefulness of that task.
2Matt Goldenberg4mo
It depends on your ability to extract the information from the model. RLHF and instruction tuning are algorithms of this kind, which allow certain capabilities besides next-token prediction to be extracted from the model. I suspect many other search and extraction techniques will be found, which can leverage latent capabilities and understandings in the model that aren't modelled in its text outputs.
This approach doesn't seem to work with in-context learning. Then it is unclear whether fine-tuning could be more successful.
2Matt Goldenberg4mo
I think there are probably many approaches that don't work.
I am aware of just three methods to modify GPTs: in-context learning (prompting), supervised fine-tuning, and reinforcement fine-tuning. The achievable effects seem rather similar.
2Matt Goldenberg4mo
There are many other ways to search the network in the literature, such as Activation Vectors. And I suspect we're just getting started on these sorts of search methods.
