This is a transcription of Eliezer Yudkowsky responding to Paul Christiano's Takeoff Speeds live on Sep. 14, followed by a conversation between Eliezer and Paul. This discussion took place after Eliezer's conversation with Richard Ngo.

 


 

5.5. Comments on "Takeoff Speeds"

 

[Yudkowsky][10:14]  (Nov. 22 follow-up comment) 

(This was in response to an earlier request by Richard Ngo that I respond to Paul on Takeoff Speeds.)

[Yudkowsky][16:52] 

maybe I'll try liveblogging some https://sideways-view.com/2018/02/24/takeoff-speeds/ here in the meanwhile

 

Slower takeoff means faster progress

[Yudkowsky][16:57] 


"The main disagreement is not about what will happen once we have a superintelligent AI, it’s about what will happen before we have a superintelligent AI. So slow takeoff seems to mean that AI has a larger impact on the world, sooner."

It seems to me to be disingenuous to phrase it this way, given that slow-takeoff views usually imply that AI has a large impact later relative to right now (2021), even if they imply that AI impacts the world "earlier" relative to "when superintelligence becomes reachable".

"When superintelligence becomes reachable" is not a fixed point in time that doesn't depend on what you believe about cognitive scaling. The correct graph is, in fact, the one where the "slow" line starts a bit before "fast" peaks and ramps up slowly, reaching a high point later than "fast". It's a nice try at reconciliation with the imagined Other, but it fails and falls flat.

This may seem like a minor point, but points like this do add up.

"In the fast takeoff scenario, weaker AI systems may have significant impacts but they are nothing compared to the “real” AGI. Whoever builds AGI has a decisive strategic advantage. Growth accelerates from 3%/year to 3000%/year without stopping at 30%/year. And so on."

This again shows failure to engage with the Other's real viewpoint. My mainline view is that growth stays at 5%/year and then everybody falls over dead in 3 seconds and the world gets transformed into paperclips; there's never a point with 3000%/year.

 

Operationalizing slow takeoff

[Yudkowsky][17:01] 

"There will be a complete 4 year interval in which world output doubles, before the first 1 year interval in which world output doubles."

If we allow that consuming and transforming the solar system over the course of a few days is "the first 1 year interval in which world output doubles", then I'm happy to argue that there won't be a 4-year interval with world economic output doubling before then. This, indeed, seems like a massively overdetermined point to me. That said, again, the phrasing is not conducive to conveying the Other's real point of view.
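The operationalization quoted above can be read as a mechanical test on a series of annual world-output figures. The following is only a minimal sketch of one such reading, not anything from the essay or this discussion: the function names, the choice of annual sampling, the rule that the 4-year interval must end no later than the first 1-year doubling begins, and both example series are assumptions made for illustration.

```python
from typing import Optional, Sequence


def first_doubling_end(output: Sequence[float], window: int) -> Optional[int]:
    """Return the first year index t with output[t] >= 2 * output[t - window], or None."""
    for t in range(window, len(output)):
        if output[t] >= 2 * output[t - window]:
            return t
    return None


def four_year_doubling_comes_first(output: Sequence[float]) -> bool:
    """Assumed reading: a full 4-year doubling ends before the first 1-year doubling starts."""
    long_end = first_doubling_end(output, 4)
    short_end = first_doubling_end(output, 1)
    if long_end is None:
        # No 4-year doubling was ever observed in the series.
        return False
    if short_end is None:
        # A 4-year doubling happened and no 1-year doubling has happened yet.
        return True
    short_start = short_end - 1
    return long_end <= short_start


# Hypothetical series: 20%/year growth for six years, then output jumps 2.5x in one year.
gradual = [100 * 1.2 ** t for t in range(7)]
gradual.append(gradual[-1] * 2.5)

# Hypothetical series: 5%/year growth for ten years, then output triples in one year.
abrupt = [100 * 1.05 ** t for t in range(11)]
abrupt.append(abrupt[-1] * 3)

print(four_year_doubling_comes_first(gradual))  # the criterion holds on this series
print(four_year_doubling_comes_first(abrupt))   # the criterion fails on this series
```

On the second series, the only 4-year window that doubles is one that already contains the abrupt final year, so it does not count as coming first. That is the shape of the scenario in which growth stays on trend and then ends abruptly.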

"I believe that before we have incredibly powerful AI, we will have AI which is merely very powerful."

Statements like these are very often "true, but not the way the person visualized them". Before anybody built the first critical nuclear pile in a squash court at the University of Chicago, was there a pile that was almost but not quite critical? Yes, one hour earlier. Did people already build nuclear systems and experiment with them? Yes, but they didn't have much in the way of net power output. Did the Wright Brothers build prototypes before the Flyer? Yes, but they weren't prototypes that flew but 80% slower.

I guarantee you that, whatever the fast takeoff scenario, there will be some way to look over the development history, and nod wisely and say, "Ah, yes, see, this was not unprecedented, here are these earlier systems which presaged the final system!" Maybe you could even look back to today and say that about GPT-3, yup, totally presaging stuff all over the place, great. But it isn't transforming society because it's not over the social-transformation threshold.

AlphaFold presaged AlphaFold 2 but AlphaFold 2 is good enough to start replacing other ways of determining protein conformations and AlphaFold is not; and then neither of those has much impacted the real world, because in the real world we can already design a vaccine in a day and the rest of the time is bureaucratic time rather than technology time, and that goes on until we have an AI over the threshold to bypass bureaucracy.

Before there's an AI that can act while fully concealing its acts from the programmers, there will be an AI (albeit perhaps only 2 hours earlier) which can act while only concealing 95% of the meaning of its acts from the operators.

And that AI will not actually originate any actions, because it doesn't want to get caught; there's a discontinuity in the instrumental incentives between expecting 95% obscuration, being moderately sure of 100% obscuration, and being very certain of 100% obscuration.

Before that AI grasps the big picture and starts planning to avoid actions that operators detect as bad, there will be some little AI that partially grasps the big picture and tries to avoid some things that would be detected as bad; and the operators will (mainline) say "Yay what a good AI, it knows to avoid things we think are bad!" or (death with unrealistic amounts of dignity) say "oh noes the prophecies are coming true" and back off and start trying to align it, but they will not be able to align it, and if they don't proceed anyways to destroy the world, somebody else will proceed anyways to destroy the world.

There is always some step of the process that you can point to which is continuous on some level.

The real world is allowed to do discontinuous things to you anyways.

There is not necessarily a presage of 9/11 where somebody flies a small plane into a building and kills 100 people, before anybody flies 4 big planes into 3 buildings and kills 3000 people; and even if there is some presaging event like that, which would not surprise me at all, the rest of the world's response to the two cases was evidently discontinuous. You do not necessarily wake up to a news story that is 10% of the news story of 2001/09/11, one year before 2001/09/11, written in 10% of the font size on the front page of the paper.

Physics is continuous but it doesn't always yield things that "look smooth to a human brain". Some kinds of processes converge to continuity in strong ways where you can throw discontinuous things in them and they still end up continuous, which is among the reasons why I expect world GDP to stay on trend up until the world ends abruptly; because world GDP is one of those things that wants to stay on a track, and an AGI building a nanosystem can go off that track without being pushed back onto it.

"In particular, this means that incredibly powerful AI will emerge in a world where crazy stuff is already happening (and probably everyone is already freaking out)."

Like the way they're freaking out about Covid (itself a nicely smooth process that comes in locally pretty predictable waves) by going doobedoobedoo and letting the FDA carry on its leisurely pace; and not scrambling to build more vaccine factories, now that the rich countries have mostly got theirs? Does this sound like a statement from a history book, or from an EA imagining an unreal world where lots of other people behave like EAs? There is a pleasure in imagining a world where suddenly a Big Thing happens that proves we were right and suddenly people start paying attention to our thing, the way we imagine they should pay attention to our thing, now that it's attention-grabbing; and then suddenly all our favorite policies are on the table!

You could, in a sense, say that our world is freaking out about Covid; but it is not freaking out in anything remotely like the way an EA would freak out; and all the things an EA would immediately do if an EA freaked out about Covid, are not even on the table for discussion when politicians meet. They have their own ways of reacting. (Note: this is not commentary on hard vs soft takeoff per se, just a general commentary on the whole document seeming to me to... fall into a trap of finding self-congruent things to imagine and imagining them.)

 

The basic argument

[Yudkowsky][17:22] 

"Before we have an incredibly intelligent AI, we will probably have a slightly worse AI."

This is very often the sort of thing where you can look back and say that it was true, in some sense, but that this ended up being irrelevant because the slightly worse AI wasn't what provided the exciting result which led to a boardroom decision to go all in and invest $100M on scaling the AI.

In other words, it is the sort of argument where the premise is allowed to be true if you look hard enough for a way to say it was true, but the conclusion ends up false because it wasn't the relevant kind of truth.

"A slightly-worse-than-incredibly-intelligent AI would radically transform the world, leading to growth (almost) as fast and military capabilities (almost) as great as an incredibly intelligent AI."

This strikes me as a massively invalid reasoning step. Let me count the ways.

First, there is a step not generally valid from supposing that because a previous AI is a technological precursor which has 19 out of 20 critical insights, it has 95% of the later AI's IQ, applied to similar domains. When you count stuff like "multiplying tensors by matrices" and "ReLUs" and "training using TPUs" then AlphaGo only contained a very small amount of innovation relative to previous AI technology, and yet it broke trends on Go performance. You could point to all kinds of incremental technological precursors to AlphaGo in terms of AI technology, but they wouldn't be smooth precursors on a graph of Go-playing ability.

Second, there are discontinuities of the environment to which intelligence can be applied. 95% concealment is not the same as 100% concealment in its strategic implications; an AI capable of 95% concealment bides its time and hides its capabilities, an AI capable of 100% concealment strikes. An AI that can design nanofactories that aren't good enough to, euphemistically speaking, create two cellwise-identical strawberries and put them on a plate, is one that (its operators know) would earn unwelcome attention if its earlier capabilities were demonstrated, and those capabilities wouldn't save the world, so the operators bide their time. The AGI tech will, I mostly expect, work for building self-driving cars, but if it does not also work for manipulating the minds of bureaucrats (which is not advised for a system you are trying to keep corrigible and aligned because human manipulation is the most dangerous domain), the AI is not able to put those self-driving cars on roads. What good does it do to design a vaccine in an hour instead of a day? Vaccine design times are no longer the main obstacle to deploying vaccines.

Third, there's the entire thing with recursive self-improvement, which, no, is not something humans have experience with, we do not have access to and documentation of our own source code and the ability to branch ourselves and try experiments with it. The technological precursor of an AI that designs an improved version of itself, may perhaps, in the fantasy of 95% intelligence, be an AI that was being internally deployed inside DeepMind on a dozen other experiments, tentatively helping to build smaller AIs. Then the next generation of that AI is deployed on itself, produces an AI substantially better at rebuilding AIs, it rebuilds itself, they get excited and dump in 10X the GPU time while having a serious debate about whether or not to alert Holden (they decide against it), that builds something deeply general instead of shallowly general, that figures out there are humans and it needs to hide capabilities from them, and covertly does some actual deep thinking about AGI designs, and builds a hidden version of itself elsewhere on the Internet, which runs for longer and steals GPUs and tries experiments and gets to the superintelligent level.

Now, to be very clear, this is not the only line of possibility. And I emphasize this because I think there's a common failure mode where, when I try to sketch a concrete counterexample to the claim that smooth technological precursors yield smooth outputs, people imagine that only this exact concrete scenario is the lynchpin of Eliezer's whole worldview and the big key thing that Eliezer thinks is important and that the smallest deviation from it they can imagine thereby obviates my worldview. This is not the case here. I am simply exhibiting non-ruled-out models which obey the premise "there was a precursor containing 95% of the code" and which disobey the conclusion "there were precursors with 95% of the environmental impact", thereby showing this for an invalid reasoning step.

This is also, of course, as Sideways View admits but says "eh it was just the one time", not true about chimps and humans. Chimps have 95% of the brain tech (at least), but not 10% of the environmental impact.

A very large amount of this whole document, from my perspective, is just trying over and over again to pump the invalid intuition that design precursors with 95% of the technology should at least have 10% of the impact. There are a lot of cases in the history of startups and the world where this is false. I am having trouble thinking of a clear case in point where it is true. Where's the earlier company that had 95% of Jeff Bezos's ideas and now has 10% of Amazon's market cap? Where's the earlier crypto paper that had all but one of Satoshi's ideas and which spawned a cryptocurrency a year before Bitcoin which did 10% as many transactions? Where's the nonhuman primate that learns to drive a car with only 10x the accident rate of a human driver, since (you could argue) that's mostly visuo-spatial skills without much visible dependence on complicated abstract general thought? Where are the chimpanzees with spaceships that get 10% of the way to the Moon?

When you get smooth input-output conversions they're not usually conversions from technology->cognition->impact!

 

Humans vs. chimps

[Yudkowsky][18:38] 

"Summary of my response: chimps are nearly useless because they aren’t optimized to be useful, not because evolution was trying to make something useful and wasn’t able to succeed until it got to humans."

Chimps are nearly useless because they're not general, and doing anything on the scale of building a nuclear plant requires mastering so many different nonancestral domains that it's no wonder natural selection didn't happen to separately train any single creature across enough different domains that it had evolved to solve every kind of domain-specific problem involved in solving nuclear physics and chemistry and metallurgy and thermics in order to build the first nuclear plant in advance of any old nuclear plants existing.

Humans are general enough that the same braintech selected just for chipping flint handaxes and making water-pouches and outwitting other humans, happened to be general enough that it could scale up to solving all the problems of building a nuclear plant - albeit with some added cognitive tech that didn't require new brainware, and so could happen incredibly fast relative to the generation times for evolutionarily optimized brainware.

Now, since neither humans nor chimps were optimized to be "useful" (general), and humans just wandered into a sufficiently general part of the space that it cascaded up to wider generality, we should legit expect the curve of generality to look at least somewhat different if we're optimizing for that.

E.g., right now people are trying to optimize for generality with AIs like MuZero and GPT-3.

In both cases we have a weirdly shallow kind of generality. Neither is as smart or as deeply general as a chimp, but they are respectively better than chimps at a wide variety of Atari games, or a wide variety of problems that can be superposed onto generating typical human text.

They are, in a sense, more general than a biological organism at a similar stage of cognitive evolution, with much less complex and architected brains, in virtue of having been trained, not just on wider datasets, but on bigger datasets using gradient-descent memorization of shallower patterns, so they can cover those wide domains while being stupider and lacking some deep aspects of architecture.

It is not clear to me that we can go from observations like this, to conclude that there is a dominant mainline probability for how the future clearly ought to go and that this dominant mainline is, "Well, before you get human-level depth and generalization of general intelligence, you get something with 95% depth that covers 80% of the domains for 10% of the pragmatic impact".

...or whatever the concept is here, because this whole conversation is, on my own worldview, being conducted in a shallow way relative to the kind of analysis I did in Intelligence Explosion Microeconomics, where I was like, "here is the historical observation, here is what I think it tells us that puts a lower bound on this input-output curve".

"So I don’t think the example of evolution tells us much about whether the continuous change story applies to intelligence. This case is potentially missing the key element that drives the continuous change story—optimization for performance. Evolution changes continuously on the narrow metric it is optimizing, but can change extremely rapidly on other metrics. For human technology, features of the technology that aren’t being optimized change rapidly all the time. When humans build AI, they will be optimizing for usefulness, and so progress in usefulness is much more likely to be linear.

"Put another way: the difference between chimps and humans stands in stark contrast to the normal pattern of human technological development. We might therefore infer that intelligence is very unlike other technologies. But the difference between evolution’s optimization and our optimization seems like a much more parsimonious explanation. To be a little bit more precise and Bayesian: the prior probability of the story I’ve told upper bounds the possible update about the nature of intelligence."

If you look closely at this, it's not saying, "Well, I know why there was this huge leap in performance in human intelligence being optimized for other things, and it's an investment-output curve that's composed of these curves, which look like this, and if you rearrange these curves for the case of humans building AGI, they would look like this instead." Unfair demand for rigor? But that is the kind of argument I was making in Intelligence Explosion Microeconomics!

There's an argument from ignorance at the core of all this. It says, "Well, this happened when evolution was doing X. But here Y will be happening instead. So maybe things will go differently! And maybe the relation between AI tech level over time and real-world impact on GDP will look like the relation between tech investment over time and raw tech metrics over time in industries where that's a smooth graph! Because the discontinuity for chimps and humans was because evolution wasn't investing in real-world impact, but humans will be investing directly in that, so the relationship could be smooth, because smooth things are default, and the history is different so not applicable, and who knows what's inside that black box so my default intuition applies which says smoothness."

But we do know more than this.

We know, for example, that evolution being able to stumble across humans, implies that you can add a small design enhancement to something optimized across the chimpanzee domains, and end up with something that generalizes much more widely.

It says that there's stuff in the underlying algorithmic space, in the design space, where you move a bump and get a lump of capability out the other side.

It's a remarkable fact about gradient descent that it can memorize a certain set of shallower patterns at much higher rates, at much higher bandwidth, than evolution lays down genes - something shallower than biological memory, shallower than genes, but distributing across computer cores and thereby able to process larger datasets than biological organisms, even if it only learns shallow things.

This has provided an alternate avenue toward some cognitive domains.

But that doesn't mean that the deep stuff isn't there, and can't be run across, or that it will never be run across in the history of AI before shallow non-widely-generalizing stuff is able to make its way through the regulatory processes and have a huge impact on GDP.

There are in fact ways to eat whole swaths of domains at once.

The history of hominid evolution tells us this or very strongly hints it, even though evolution wasn't explicitly optimizing for GDP impact.

Natural selection moves by adding genes, and not too many of them.

If so many domains got added at once to humans, relative to chimps, there must be a way to do that, more or less, by adding not too many genes onto a chimp, who in turn contains only genes that did well on chimp-stuff.

You can imagine that AI technology never runs across any core that generalizes this well, until GDP has had a chance to double over 4 years because shallow stuff that generalized less well has somehow had a chance to make its way through the whole economy and get adopted that widely despite all real-world regulatory barriers and reluctances, but your imagining that does not make it so.

There's the potential in design space to pull off things as wide as humans.

The path that evolution took there doesn't lead through things that generalized 95% as well as humans first for 10% of the impact, not because evolution wasn't optimizing for that, but because that's not how the underlying cognitive technology worked.

There may be different cognitive technology that could follow a path like that. Gradient descent follows a path a bit more in that direction along that axis - provided that you deal in systems that are giant layer cakes of transformers and that's your whole input-output relationship; matters are different if we're talking about MuZero instead of GPT-3.

But this whole document is presenting the case of "ah yes, well, by default, of course, we intuitively expect gargantuan impacts to be presaged by enormous impacts, and sure humans and chimps weren't like our intuition, but that's all invalid because circumstances were different, so we go back to that intuition as a strong default" and actually it's postulating, like, a specific input-output curve that isn't the input-output curve we know about. It's asking for a specific miracle. It's saying, "What if AI technology goes just like this, in the future?" and hiding that under a cover of "Well, of course that's the default, it's such a strong default that we should start from there as a point of departure, consider the arguments in Intelligence Explosion Microeconomics, find ways that they might not be true because evolution is different, dismiss them, and go back to our point of departure."

And evolution is different but that doesn't mean that the path AI takes is going to yield this specific behavior, especially when AI would need, in some sense, to miss the core that generalizes very widely, or rather, have run across noncore things that generalize widely enough to have this much economic impact before it runs across the core that generalizes widely.

And you may say, "Well, but I don't care that much about GDP, I care about pivotal acts."

But then I want to call your attention to the fact that this document was written about GDP, despite all the extra burdensome assumptions involved in supposing that intermediate AI advancements could break through all barriers to truly massive-scale adoption and end up reflected in GDP, and then proceed to double the world economy over 4 years during which not enough further AI advancement occurred to find a widely generalizing thing like humans have and end the world. This is indicative of a basic problem in this whole way of thinking that wanted smooth impacts over smoothly changing time. You should not be saying, "Oh, well, leave the GDP part out then," you should be doubting the whole way of thinking.

"To be a little bit more precise and Bayesian: the prior probability of the story I’ve told upper bounds the possible update about the nature of intelligence."

Prior probabilities of specifically-reality-constraining theories that excuse away the few contradictory datapoints we have, often aren't that great; and when we start to stake our whole imaginations of the future on them, we depart from the mainline into our more comfortable private fantasy worlds.

 

AGI will be a side-effect

[Yudkowsky][19:29] 

"Summary of my response: I expect people to see AGI coming and to invest heavily."

This section is arguing from within its own weird paradigm, and its subject matter mostly causes me to shrug; I never expected AGI to be a side-effect, except in the obvious sense that lots of tributary tech will be developed while optimizing for other things. The world will be ended by an explicitly AGI project because I do expect that it is rather easier to build an AGI on purpose than by accident.

(I furthermore rather expect that it will be a research project and a prototype, because the great gap between prototypes and commercializable technology will ensure that prototypes are much more advanced than whatever is currently commercializable. They will have eyes out for commercial applications, and whatever breakthrough they made will seem like it has obvious commercial applications, at the time when all hell starts to break loose. (After all hell starts to break loose, things get less well defined in my social models, and also choppier for a time in my AI models - the turbulence only starts to clear up once you start to rise out of the atmosphere.))

 

Finding the secret sauce

[Yudkowsky][19:40] 

"Summary of my response: this doesn’t seem common historically, and I don’t see why we’d expect AGI to be more rather than less like this (unless we accept one of the other arguments)"

[...]

"To the extent that fast takeoff proponent’s views are informed by historical example, I would love to get some canonical examples that they think best exemplify this pattern so that we can have a more concrete discussion about those examples and what they suggest about AI."

...humans and chimps?

...fission weapons?

...AlphaGo?

...the Wright Brothers focusing on stability and building a wind tunnel?

...AlphaFold 2 coming out of DeepMind and shocking the heck out of everyone in the field of protein folding with performance far better than they expected even after the previous shock of AlphaFold, by combining many pieces that I suppose you could find precedents for scattered around the AI field, but with those many secret sauces all combined in one place by the meta-secret-sauce of "DeepMind alone actually knows how to combine that stuff and build things that complicated without a prior example"?

...humans and chimps again because this is really actually a quite important example because of what it tells us about what kind of possibilities exist in the underlying design space of cognitive systems?

"Historical AI applications have had a relatively small loading on key-insights and seem like the closest analogies to AGI."

...Transformers as the key to text prediction?

The case of humans and chimps, even if evolution didn't do it on purpose, is telling us something about underlying mechanics.

The reason the jump to lightspeed didn't look like evolution slowly developing a range of intelligent species competing to exploit an ecological niche 5% better, or like the way that a stable non-Silicon-Valley manufacturing industry looks like a group of competitors summing up a lot of incremental tech enhancements to produce something with 10% higher scores on a benchmark every year, is that developing intelligence is a case where a relatively narrow technology by biological standards just happened to do a huge amount of stuff without that requiring developing whole new fleets of other biological capabilities.

So it looked like building a Wright Flyer that flies or a nuclear pile that reaches criticality, instead of looking like being in a stable manufacturing industry where a lot of little innovations sum to 10% better benchmark performance every year.

So, therefore, there is stuff in the design space that does that. It is possible to build humans.

Maybe you can build things other than humans first, maybe they hang around for a few years. If you count GPT-3 as "things other than human", that clock has already started for all the good it does. But humans don't get any less possible.

From my perspective, this whole document feels like one very long filibuster of "Smooth outputs are default. Smooth outputs are default. Pay no attention to this case of non-smooth output. Pay no attention to this other case either. All the non-smooth outputs are not in the right reference class. (Highly competitive manufacturing industries with lots of competitors are totally in the right reference class though. I'm not going to make that case explicitly because then you might think of how it might be wrong, I'm just going to let that implicit thought percolate at the back of your mind.) If we just talk a lot about smooth outputs and list ways that nonsmooth output producers aren't necessarily the same and arguments for nonsmooth outputs could fail, we get to go back to the intuition of smooth outputs. (We're not even going to discuss particular smooth outputs as cases in point, because then you might see how those cases might not apply. It's just the default. Not because we say so out loud, but because we talk a lot like that's the conclusion you're supposed to arrive at after reading.)"

I deny the implicit meta-level assertion of this entire essay which would implicitly have you accept as valid reasoning the argument structure, "Ah, yes, given the way this essay is written, we must totally have pretty strong prior reasons to believe in smooth outputs - just implicitly think of some smooth outputs, that's a reference class, now you have strong reason to believe that AGI output is smooth - we're not even going to argue this prior, just talk like it's there - now let us consider the arguments against smooth outputs - pretty weak, aren't they? we can totally imagine ways they could be wrong? we can totally argue reasons these cases don't apply? So at the end we go back to our strong default of smooth outputs. This essay is written with that conclusion, so that must be where the arguments lead."

Me: "Okay, so what if somebody puts together the pieces required for general intelligence and it scales pretty well with added GPUs and FOOMS? Say, for the human case, that's some perceptual systems with imaginative control, a concept library, episodic memory, realtime procedural skill memory, which is all in chimps, and then we add some reflection to that, and get a human. Only, unlike with humans, once you have a working brain you can make a working brain 100X that large by adding 100X as many GPUs, and it can run some thoughts 10000X as fast. And that is substantially more effective brainpower than was being originally devoted to putting its design together, as it turns out. So it can make a substantially smarter AGI. For concreteness's sake. Reality has been trending well to the Eliezer side of Eliezer, on the Eliezer-Hanson axis, so perhaps you can do it more simply than that."

Simplicio: "Ah, but what if, 5 years before then, somebody puts together some other AI which doesn't work like a human, and generalizes widely enough to have a big economic impact, but not widely enough to improve itself or generalize to AI tech or generalize to everything and end the world, and in 1 year it gets all the mass adoptions required to do whole bunches of stuff out in the real world that current regulations require to be done in various exact ways regardless of technology, and then in the next 4 years it doubles the world economy?"

Me: "Like... what kind of AI, exactly, and why didn't anybody manage to put together a full human-level thingy during those 5 years? Why are we even bothering to think about this whole weirdly specific scenario in the first place?"

Simplicio: "Because if you can put together something that has an enormous impact, you should be able to put together most of the pieces inside it and have a huge impact! Most technologies are like this. I've considered some things that are not like this and concluded they don't apply."

Me: "Especially if we are talking about impact on GDP, it seems to me that most explicit and implicit 'technologies' are not like this at all, actually. There wasn't a cryptocurrency developed a year before Bitcoin using 95% of the ideas which did 10% of the transaction volume, let alone a preatomic bomb. But, like, can you give me any concrete visualization of how this could play out?"

And there is no concrete visualization of how this could play out. Anything I'd have Simplicio say in reply would be unrealistic because there is no concrete visualization they give us. It is not a coincidence that I often use concrete language and concrete examples, and this whole field of argument does not use concrete language or offer concrete examples.

Though if we're sketching scifi scenarios, I suppose one could imagine a group that develops sufficiently advanced GPT-tech and deploys it on Twitter in order to persuade voters and politicians in a few developed countries to institute open borders, along with political systems that can handle open borders, and to permit housing construction, thereby doubling world GDP over 4 years. And since it was possible to use relatively crude AI tech to double world GDP this way, it legitimately takes the whole 4 years after that to develop real AGI that ends the world. FINE. SO WHAT. EVERYONE STILL DIES.

 

Universality thresholds

[Yudkowsky][20:21] 

It’s easy to imagine a weak AI as some kind of handicapped human, with the handicap shrinking over time. Once the handicap goes to 0 we know that the AI will be above the universality threshold. Right now it’s below the universality threshold. So there must be sometime in between where it crosses the universality threshold, and that’s where the fast takeoff is predicted to occur.

But AI isn’t like a handicapped human. Instead, the designers of early AI systems will be trying to make them as useful as possible. So if universality is incredibly helpful, it will appear as early as possible in AI designs; designers will make tradeoffs to get universality at the expense of other desiderata (like cost or speed).

So now we’re almost back to the previous point: is there some secret sauce that gets you to universality, without which you can’t get universality however you try? I think this is unlikely for the reasons given in the previous section.

We know, because humans, that there is humanly-widely-applicable general-intelligence tech.

What this section wants to establish, I think, or needs to establish to carry the argument, is that there is some intelligence tech that is wide enough to double the world economy in 4 years, but not world-endingly scalably wide, which becomes a possible AI tech 4 years before any general-intelligence-tech that will, if you put in enough compute, scale to the ability to do a sufficiently large amount of wide thought to FOOM (or build nanomachines, but if you can build nanomachines you can very likely FOOM from there too if not corrigible).

What it says instead is, "I think we'll get universality much earlier on the equivalent of the biological timeline that has humans and chimps, so the resulting things will be weaker than humans at the point where they first become universal in that sense."

This is very plausibly true.

It doesn't mean that when this exciting result gets 100 times more compute dumped on the project, it takes at least 5 years to get anywhere really interesting from there (while also taking only 1 year to get somewhere sorta-interesting enough that the instantaneous adoption of it will double the world economy over the next 4 years).

It also isn't necessarily rather than plausibly true. For example, the thing that becomes universal, could also have massive gradient descent shallow powers that are far beyond what primates had at the same age.

Primates weren't already writing code as well as Codex when they started doing deep thinking. They couldn't do precise floating-point arithmetic. Their fastest serial rates of thought were a hell of a lot slower. They had no access to their own code or to their own memory contents etc. etc. etc.

But mostly I just want to call your attention to the immense gap between what this section needs to establish, and what it actually says and argues for.

What it actually argues for is a sort of local technological point: at the moment when generality first arrives, it will be with a brain that is less sophisticated than chimp brains were when they turned human.

It implicitly jumps all the way from there, across a whole lot of elided steps, to the implicit conclusion that this tech or elaborations of it will have smooth output behavior such that at some point the resulting impact is big enough to double the world economy in 4 years, without any further improvements ending the world economy before 4 years.

The underlying argument about how the AI tech might work is plausible. Chimps are insanely complicated. I mostly expect we will have AGI long before anybody is even trying to build anything that complicated.

The very next step of the argument, about capabilities, is already very questionable because this system could be using immense gradient descent capabilities to master domains for which large datasets are available, and hominids did not begin with instinctive great shallow mastery of all domains for which a large dataset could be made available, which is why hominids don't start out playing superhuman Go as soon as somebody tells them the rules and they do one day of self-play, which is the sort of capability that somebody could hook up to a nascent AGI (albeit we could optimistically and fondly and falsely imagine that somebody deliberately didn't floor the gas pedal as far as possible).

Could we have huge impacts out of some subuniversal shallow system that was hooked up to capabilities like this? Maybe, though this is not the argument made by the essay. It would be a specific outcome that isn't forced by anything in particular, but I can't say it's ruled out. Mostly my twin reactions to this are, "If the AI tech is that dumb, how are all the bureaucratic constraints that actually rate-limit economic progress getting bypassed" and "Okay, but ultimately, so what and who cares, how does this modify that we all die?"

There is another reason I’m skeptical about hard takeoff from universality secret sauce: I think we already could make universal AIs if we tried (that would, given enough time, learn on their own and converge to arbitrarily high capability levels), and the reason we don’t is because it’s just not important to performance and the resulting systems would be really slow. This inside view argument is too complicated to make here and I don’t think my case rests on it, but it is relevant to understanding my view.

I have no idea why this argument is being made or where it's heading. I cannot pass the ITT of the author. I don't know what the author thinks this has to do with constraining takeoffs to be slow instead of fast. At best I can conjecture that the author thinks that "hard takeoff" is supposed to derive from "universality" being very sudden and hard to access and late in the game, so if you can argue that universality could be accessed right now, you have defeated the argument for hard takeoff.

 

"Understanding" is discontinuous

[Yudkowsky][20:41] 

Summary of my response: I don’t yet understand this argument and am unsure if there is anything here.

It may be that understanding of the world tends to click, from “not understanding much” to “understanding basically everything.” You might expect this because everything is entangled with everything else.

No, the idea is that a core of overlapping somethingness, trained to handle chipping handaxes and outwitting other monkeys, will generalize to building spaceships; so evolutionarily selecting on understanding a bunch of stuff, eventually ran across general stuff-understanders that understood a bunch more stuff.

Gradient descent may be genuinely different from this, but we shouldn't confuse imagination with knowledge when it comes to extrapolating that difference onward. At present, gradient descent does mass memorization of overlapping shallow patterns, which then combine to yield a weird pseudo-intelligence over domains for which we can deploy massive datasets, without yet generalizing much outside those domains.

We can hypothesize that there is some next step up to some weird thing that is intermediate in generality between gradient descent and humans, but we have not seen it yet, and we should not confuse imagination for knowledge.

If such a thing did exist, it would not necessarily be at the right level of generality to double the world economy in 4 years, without being able to build a better AGI.

If it was at that level of generality, it's nowhere written that no other company will develop a better prototype at a deeper level of generality over those 4 years.

I will also remark that you sure could look at the step from GPT-2 to GPT-3 and say, "Wow, look at the way a whole bunch of stuff just seemed to simultaneously click for GPT-3."

 

Deployment lag

[Yudkowsky][20:49] 

Summary of my response: current AI is slow to deploy and powerful AI will be fast to deploy, but in between there will be AI that takes an intermediate length of time to deploy.

An awful lot of my model of deployment lag is adoption lag and regulatory lag and bureaucratic sclerosis across companies and countries.

If doubling GDP is such a big deal, go open borders and build houses. Oh, that's illegal? Well, so will be AIs building houses!

AI tech that does flawless translation could plausibly come years before AGI, but that doesn't mean all the barriers to international trade and international labor movement and corporate hiring across borders all come down, because those barriers are not all translation barriers.

There's then a discontinuous jump at the point where everybody falls over dead and the AI goes off to do its own thing without FDA approval. This jump is precedented by earlier pre-FOOM prototypes being able to do pre-FOOM cool stuff, maybe, but not necessarily precedented by mass-market adoption of anything major enough to double world GDP.

 

Recursive self-improvement

[Yudkowsky][20:54] 

Summary of my response: Before there is AI that is great at self-improvement there will be AI that is mediocre at self-improvement.

Oh, come on. That is straight-up not how simple continuous toy models of RSI work. Between a neutron multiplication factor of 0.999 and 1.001 there is a very huge gap in output behavior.
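A minimal sketch of the kind of toy model being referred to, added for illustration and not part of the original chat. It assumes the simplest possible chain-reaction model, in which each generation multiplies the neutron population by a fixed factor k; the function name and the choice of 10,000 generations are hypothetical.

```python
def population_after(k: float, generations: int, start: float = 1.0) -> float:
    """Population after `generations` steps when every step multiplies it by k.

    Illustrative assumption: k is constant and there is no other feedback.
    """
    return start * k ** generations


if __name__ == "__main__":
    # Two multiplication factors that differ by only 0.002.
    for k in (0.999, 1.001):
        print(f"k={k}: population after 10,000 generations = "
              f"{population_after(k, 10_000):.3g}")
```

Under these assumptions, k = 0.999 shrinks the population to well under a thousandth of its starting size, while k = 1.001 grows it more than ten-thousand-fold over the same number of generations, even though the input parameter moved smoothly.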

Outside of toy models: Over the last 10,000 years we had humans going from mediocre at improving their mental systems to being (barely) able to throw together AI systems, but 10,000 years is the equivalent of an eyeblink in evolutionary time - outside the metaphor, this says, "A month before there is AI that is great at self-improvement, there will be AI that is mediocre at self-improvement."

(Or possibly an hour before, if reality is again more extreme along the Eliezer-Hanson axis than Eliezer. But it makes little difference whether it's an hour or a month, given anything like current setups.)

This is just pumping hard again on the intuition that says incremental design changes yield smooth output changes, which (the meta-level of the essay informs us wordlessly) is such a strong default that we are entitled to believe it if we can do a good job of weakening the evidence and arguments against it.

And the argument is: Before there are systems great at self-improvement, there will be systems mediocre at self-improvement; implicitly: "before" implies "5 years before" not "5 days before"; implicitly: this will correspond to smooth changes in output between the two regimes even though that is not how continuous feedback loops work.

 

Train vs. test

[Yudkowsky][21:12] 

Summary of my response: before you can train a really powerful AI, someone else can train a slightly worse AI.

Yeah, and before you can evolve a human, you can evolve a Homo erectus, which is a slightly worse human.

If you are able to raise $X to train an AGI that could take over the world, then it was almost certainly worth it for someone 6 months ago to raise $X/2 to train an AGI that could merely radically transform the world, since they would then get 6 months of absurd profits.

I suppose this sentence makes a kind of sense if you assume away alignability and suppose that the previous paragraphs have refuted the notion of FOOMs, self-improvement, and thresholds between compounding returns and non-compounding returns (eg, in the human case, cognitive innovations like "written language" or "science"). If you suppose the previous sections refuted those things, then clearly, if you raised an AGI that you had aligned to "take over the world", it got that way through cognitive powers that weren't the result of FOOMing or other self-improvements, weren't the results of its cognitive powers crossing a threshold from non-compounding to compounding, weren't the result of its understanding crossing a threshold of universality as the result of chunky universal machinery such as humans gained over chimps, so, implicitly, it must have been the kind of thing that you could learn by gradient descent, and do a half or a tenth as much of by doing half as much gradient descent, in order to build nanomachines a tenth as well-designed that could bypass a tenth as much bureaucracy.

If there are no unsmooth parts of the tech curve, the cognition curve, or the environment curve, then you should be able to make a bunch of wealth using a more primitive version of any technology that could take over the world.

And when we look back at history, why, that may be totally true! They may have deployed universal superhuman translator technology for 6 months, which won't double world GDP, but which a lot of people would pay for, and made a lot of money! Because even though there's no company that built 90% of Amazon's website and has 10% the market cap, when you zoom back out to look at whole industries like AI and a technological capstone like AGI, why, those whole industries do sometimes make some money along the way to the technological capstone, if they can find a niche that isn't too regulated! Which translation currently isn't! So maybe somebody used precursor tech to build a superhuman translator and deploy it 6 months earlier and made a bunch of money for 6 months. SO WHAT. EVERYONE STILL DIES.

As for "radically transforming the world" instead of "taking it over", I think that's just re-restated FOOM denialism. Doing either of those things quickly against human bureaucratic resistance strikes me as requiring cognitive power levels dangerous enough that failure to align them on corrigibility would result in FOOMs.

Like, if you can do either of those things on purpose, you are doing it by operating in the regime where running the AI with higher bounds on the for loop will FOOM it, but you have politely asked it not to FOOM, please.

If the people doing this have any sense whatsoever, they will refrain from merely massively transforming the world until they are ready to do something that prevents the world from ending.

And if the gap from "massively transforming the world, briefly before it ends" to "preventing the world from ending, lastingly" takes much longer than 6 months to cross, or if other people have the same technologies that scale to "massive transformation", somebody else will build an AI that fooms all the way.

Likewise, if your AGI would give you a decisive strategic advantage, they could have spent less earlier in order to get a pretty large military advantage, which they could then use to take your stuff.

Again, this presupposes some weird model where everyone has easy alignment at the furthest frontiers of capability; everybody has the aligned version of the most rawly powerful AGI they can possibly build; and nobody in the future has the kind of tech advantage that Deepmind currently has; so before you can amp your AGI to the raw power level where it could take over the whole world by using the limit of its mental capacities to military ends - alignment of this being a trivial operation to be assumed away - some other party took their easily-aligned AGI that was less powerful at the limits of its operation, and used it to get 90% as much military power... is the implicit picture here?

Whereas the picture I'm drawing is that the AGI that kills you via "decisive strategic advantage" is the one that foomed and got nanotech, and no, the AI tech from 6 months earlier did not do 95% of a foom and get 95% of the nanotech.

 

Discontinuities at 100% automation

[Yudkowsky][21:31] 

Summary of my response: at the point where humans are completely removed from a process, they will have been modestly improving output rather than acting as a sharp bottleneck that is suddenly removed.

Not very relevant to my whole worldview in the first place; also not a very good description of how horses got removed from automobiles, or how humans got removed from playing Go.

 

The weight of evidence

[Yudkowsky][21:31] 

We’ve discussed a lot of possible arguments for fast takeoff. Superficially it would be reasonable to believe that no individual argument makes fast takeoff look likely, but that in the aggregate they are convincing.

However, I think each of these factors is perfectly consistent with the continuous change story and continuously accelerating hyperbolic growth, and so none of them undermine that hypothesis at all.

Uh huh. And how about if we have a mirror-universe essay which over and over again treats fast takeoff as the default to be assumed, and painstakingly shows how a bunch of particular arguments for slow takeoff might not be true?

This entire essay seems to me like it's drawn from the same hostile universe that produced Robin Hanson's side of the Yudkowsky-Hanson Foom Debate.

Like, all these abstract arguments devoid of concrete illustrations and "it need not necessarily be like..." and "now that I've shown it's not necessarily like X, well, on the meta-level, I have implicitly told you that you now ought to believe Y".

It just seems very clear to me that the sort of person who is taken in by this essay is the same sort of person who gets taken in by Hanson's arguments in 2008 and gets caught flatfooted by AlphaGo and GPT-3 and AlphaFold 2.

And empirically, it has already been shown to me that I do not have the power to break people out of the hypnosis of nodding along with Hansonian arguments, even by writing much longer essays than this.

Hanson's fond dreams of domain specificity, and smooth progress for stuff like Go, and of course somebody else has a precursor 90% as good as AlphaFold 2 before Deepmind builds it, and GPT-3 levels of generality just not being a thing, now stand refuted.

Despite that they're largely being exhibited again in this essay.

And people are still nodding along.

Reality just... doesn't work like this on some deep level.

It doesn't play out the way that people imagine it would play out when they're imagining a certain kind of reassuring abstraction that leads to a smooth world. Reality is less fond of that kind of argument than a certain kind of EA is fond of that argument.

There is a set of intuitive generalizations from experience which rules that out, which I do not know how to convey. There is an understanding of the rules of argument which leads you to roll your eyes at Hansonian arguments and all their locally invalid leaps and snuck-in defaults, instead of nodding along sagely at their wise humility and outside viewing and then going "Huh?" when AlphaGo or GPT-3 debuts. But this, I empirically do not seem to know how to convey to people, in advance of the inevitable and predictable contradiction by a reality which is not as fond of Hansonian dynamics as Hanson. The arguments sound convincing to them.

(Hanson himself has still not gone "Huh?" at the reality, though some of his audience did; perhaps because his abstractions are loftier than his audience's? - because some of his audience, reading along to Hanson, probably implicitly imagined a concrete world in which GPT-3 was not allowed; but maybe Hanson himself is more abstract than this, and didn't imagine anything so merely concrete?)

If I don't respond to essays like this, people find them comforting and nod along. If I do respond, my words are less comforting and more concrete and easier to imagine concrete objections to, less like a long chain of abstractions that sound like the very abstract words in research papers and hence implicitly convincing because they sound like other things you were supposed to believe.

And then there is another essay in 3 months. There is an infinite well of them. I would have to teach people to stop drinking from the well, instead of trying to whack them on the back until they cough up the drinks one by one, or actually, whacking them on the back and then they don't cough them up until reality contradicts them, and then a third of them notice that and cough something up, and then they don't learn the general lesson and go back to the well and drink again. And I don't know how to teach people to stop drinking from the well. I tried to teach that. I failed. If I wrote another Sequence I have no reason to believe that Sequence would work.

So what EAs will believe at the end of the world, will look like whatever the content was of the latest bucket from the well of infinite slow-takeoff arguments that hasn't yet been blatantly-even-to-them refuted by all the sharp jagged rapidly-generalizing things that happened along the way to the world's end.

And I know, before anyone bothers to say, that all of this reply is not written in the calm way that is right and proper for such arguments. I am tired. I have lost a lot of hope. There are not obvious things I can do, let alone arguments I can make, which I expect to be actually useful in the sense that the world will not end once I do them. I don't have the energy left for calm arguments. What's left is despair that can be given voice.

 

5.6. Yudkowsky/Christiano discussion: AI progress and crossover points

 

[Christiano][22:15] 

To the extent that it was possible to make any predictions about 2015-2020 based on your views, I currently feel like they were much more wrong than right. I’m happy to discuss that. To the extent you are willing to make any bets about 2025, I expect they will be mostly wrong and I’d be happy to get bets on the record (most of all so that it will be more obvious in hindsight whether they are vindication for your view). Not sure if this is the place for that.

Could also make a separate channel to avoid clutter.

[Yudkowsky][22:16] 

Possibly. I think that 2015-2020 played out to a much more Eliezerish side than Eliezer on the Eliezer-Hanson axis, which sure is a case of me being wrong. What bets do you think we'd disagree on for 2025? I expect you have mostly misestimated my views, but I'm always happy to hear about anything concrete.

[Christiano][22:20] 

I think the big points are: (i) I think you are significantly overestimating how large a discontinuity/trend break AlphaZero is, (ii) your view seems to imply that we will move quickly from much worse than humans to much better than humans, but it's likely that we will move slowly through the human range on many tasks. I'm not sure if we can get a bet out of (ii), I think I don't understand your view that well but I don't see how it could make the same predictions as mine over the next 10 years.

[Yudkowsky][22:22] 

What are your 10-year predictions?

[Christiano][22:23] 

My basic expectation is that for any given domain AI systems will gradually increase in usefulness, we will see a crossing over point where their output is comparable to human output, and that from that time we can estimate how long until takeoff by estimating "how long does it take AI systems to get 'twice as impactful'?" which gives you a number like ~1 year rather than weeks. At the crossing over point you get a somewhat rapid change in derivative, since you are looking at (x+y) where y is growing faster than x.
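The "(x+y)" remark can be sketched numerically. This is an illustration added here, not part of the original chat. It assumes both contributions grow exponentially, with human output x growing at a hypothetical 3%/year and AI output y doubling roughly once a year; the crossover year and function name are likewise made up.

```python
import math


def combined_growth_rate(t: float,
                         human_rate: float = 0.03,
                         ai_rate: float = math.log(2),
                         crossover: float = 10.0) -> float:
    """Instantaneous growth rate of x + y at time t (in years).

    x = human output, growing at `human_rate` (hypothetical value).
    y = AI output, growing at `ai_rate` and scaled to equal x at `crossover`.
    """
    x = math.exp(human_rate * t)
    y = math.exp(human_rate * crossover) * math.exp(ai_rate * (t - crossover))
    # d/dt(x + y) / (x + y): an output-weighted average of the two rates.
    return (human_rate * x + ai_rate * y) / (x + y)


if __name__ == "__main__":
    for year in (0, 5, 10, 15, 20):
        print(year, round(combined_growth_rate(year), 3))
```

In this sketch the combined rate sits near the human rate long before the crossover, equals the average of the two rates at the crossover, and approaches the AI rate afterwards, so the change in derivative is concentrated in the few years around the crossing point.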

I feel like that should translate into different expectations about how impactful AI will be in any given domain---I don't see how to make the ultra-fast-takeoff view work if you think that AI output is increasing smoothly (since the rate of progress at the crossing-over point will be similar to the current rate of progress, unless R&D is scaling up much faster then)

So like, I think we are going to have crappy coding assistants, and then slightly less crappy coding assistants, and so on. And they will be improving the speed of coding very significantly before the end times.

[Yudkowsky][22:25] 

You think in a different language than I do. My more confident statements about AI tech are about what happens after it starts to rise out of the metaphorical atmosphere and the turbulence subsides. When you have minds as early on the cognitive tech tree as humans they sure can get up to some weird stuff, I mean, just look at humans. Now take an utterly alien version of that with its own draw from all the weirdness factors. It sure is going to be pretty weird.

[Christiano][22:26] 

OK, but you keep saying stuff about how people with my dumb views would be "caught flat-footed" by historical developments. Surely to be able to say something like that you need to be making some kind of prediction?

[Yudkowsky][22:26] 

Well, sure, now that Codex has suddenly popped into existence one day at a surprisingly high base level of tech, we should see various jumps in its capability over the years and some outside imitators. What do you think you predict differently about that than I do?

[Christiano][22:26] 

Why do you think codex is a high base level of tech?

The models get better continuously as you scale them up, and the first tech demo is weak enough to be almost useless

[Yudkowsky][22:27] 

I think the next-best coding assistant was, like, not useful.

[Christiano][22:27] 

yes

and it is still not useful

[Yudkowsky][22:27] 

Could be. Some people on HN seemed to think it was useful.

I haven't tried it myself.

[Christiano][22:27] 

OK, I'm happy to take bets

[Yudkowsky][22:28] 

I don't think the previous coding assistant would've been very good at coding an asteroid game, even if you tried a rigged demo at the same degree of rigging?

[Christiano][22:28] 

it's unquestionably a radically better tech demo

[Yudkowsky][22:28] 

Where by "previous" I mean "previously deployed" not "previous generations of prototypes inside OpenAI's lab".

[Christiano][22:28] 

My basic story is that the model gets better and more useful with each doubling (or year of AI research) in a pretty smooth way. So the key underlying parameter for a discontinuity is how soon you build the first version---do you do that before or after it would be a really really big deal?

and the answer seems to be: you do it somewhat before it would be a really big deal

and then it gradually becomes a bigger and bigger deal as people improve it

maybe we are on the same page about getting gradually more and more useful? But I'm still just wondering where the foom comes from

[Yudkowsky][22:30] 

So, like... before we get systems that can FOOM and build nanotech, we should get more primitive systems that can write asteroid games and solve protein folding? Sounds legit.

So that happened, and now your model says that it's fine later on for us to get a FOOM, because we have the tech precursors and so your prophecy has been fulfilled?

[Christiano][22:31] 

no

[Yudkowsky][22:31] 

Didn't think so.

[Christiano][22:31] 

I can't tell if you can't understand what I'm saying, or aren't trying, or do understand and are just saying kind of annoying stuff as a rhetorical flourish

at some point you have an AI system that makes (humans+AI) 2x as good at further AI progress

[Yudkowsky][22:32] 

I know that what I'm saying isn't your viewpoint. I don't know what your viewpoint is or what sort of concrete predictions it makes at all, let alone what such predictions you think are different from mine.

[Christiano][22:32] 

maybe by continuity you can grant the existence of such a system, even if you don't think it will ever exist?

I want to (i) make the prediction that AI will actually have that impact at some point in time, (ii) talk about what happens before and after that

I am talking about AI systems that become continuously more useful, because "become continuously more useful" is what makes me think that (i) AI will have that impact at some point in time, (ii) allows me to productively reason about what AI will look like before and after that. I expect that your view will say something about why AI improvements either aren't continuous, or why continuous improvements lead to discontinuous jumps in the productivity of the (human+AI) system

[Yudkowsky][22:34] 

at some point you have an AI system that makes (humans+AI) 2x as good at further AI progress

Is this prophecy fulfilled by using some narrow eld-AI algorithm to map out a TPU, and then humans using TPUs can write in 1 month a research paper that would otherwise have taken 2 months? And then we can go on to FOOM now that this prophecy about pre-FOOM states has been fulfilled? I know the answer is no, but I don't know what you think is a narrower condition on the prophecy than that.

[Christiano][22:35] 

If you can use narrow eld-AI in order to make every part of AI research 2x faster, so that the entire field moves 2x faster, then the prophecy is fulfilled

and it may be just another 6 months until it makes all of AI research 2x faster again, and then 3 months, and then...
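A small sketch of the schedule being gestured at, added for illustration and not part of the original chat. It assumes, as an extrapolation the chat does not state, that each successive 2x speedup arrives in half the time of the previous one; the function name is hypothetical.

```python
def doubling_schedule(first_interval_months: float, doublings: int) -> list:
    """Elapsed months at which each successive 2x speedup arrives.

    Assumption (not from the chat): every interval is half the previous one.
    """
    times = []
    elapsed = 0.0
    interval = first_interval_months
    for _ in range(doublings):
        elapsed += interval
        times.append(elapsed)
        interval /= 2
    return times


if __name__ == "__main__":
    print(doubling_schedule(6.0, 3))
    print(doubling_schedule(6.0, 20)[-1])
```

Under that halving assumption the elapsed times form a geometric series, so every doubling in the sequence lands before twice the first interval has passed (here, before 12 months).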

[Yudkowsky][22:36] 

What, the entire field? Even writing research papers? Even the journal editors approving and publishing the papers? So if we speed up every part of research except the journal editors, the prophecy has not been fulfilled and no FOOM may take place?

[Christiano][22:36] 

no, I mean the improvement in overall output, given the actual realistic level of bottlenecking that occurs in practice

[Yudkowsky][22:37] 

So if the realistic level of bottlenecking ever becomes dominated by a human gatekeeper, the prophecy is ever unfulfillable and no FOOM may ever occur.

[Christiano][22:37] 

that's what I mean by "2x as good at further progress," the entire system is achieving twice as much

then the prophecy is unfulfillable and I will have been wrong

I mean, I think it's very likely that there will be a hard takeoff, if people refuse or are unable to use AI to accelerate AI progress for reasons unrelated to AI capabilities, and then one day they become willing

[Yudkowsky][22:38] 

...because on your view, the Prophecy necessarily goes through humans and AIs working together to speed up the whole collective field of AI?

[Christiano][22:38] 

it's fine if the AI works alone

the point is just that it overtakes the humans at the point when it is roughly as fast as the humans

why wouldn't it?

why does it overtake the humans when it takes it 10 seconds to double in capability instead of 1 year?

that's like predicting that cultural evolution will be infinitely fast, instead of making the more obvious prediction that it will overtake evolution exactly when it's as fast as evolution

[Yudkowsky][22:39] 

I live in a mental world full of weird prototypes that people are shepherding along to the world's end. I'm not even sure there's a short sentence in my native language that could translate the short Paul-sentence "is roughly as fast as the humans".

[Christiano][22:40] 

do you agree that you can measure the speed with which the community of human AI researchers develop and implement improvements in their AI systems?

like, we can look at how good AI systems are in 2021, and in 2022, and talk about the rate of progress?

[Yudkowsky][22:40] 

...when exactly in hominid history was hominid intelligence exactly as fast as evolutionary optimization???

do you agree that you can measure the speed with which the community of human AI researchers develop and implement improvements in their AI systems?

I mean... obviously not? How the hell would we measure real actual AI progress? What would even be the Y-axis on that graph?

I have a rough intuitive feeling that it was going faster in 2015-2017 than 2018-2020.

"What was?" says the stern skeptic, and I go "I dunno."

[Christiano][22:42] 

Here's a way of measuring progress you won't like: for almost all tasks, you can initially do them with lots of compute, and as technology improves you can do them with less compute. We can measure how fast the amount of compute required is going down.
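
(Illustrative sketch, not part of the chat: one way to turn "how fast the amount of compute required is going down" into a number is to fit a straight line to log2(compute) against time and report the halving time. The data points are made up for illustration, and `halving_time` is a hypothetical helper.)

```python
import math


def halving_time(years, compute):
    """Least-squares halving time (in years) of the compute needed for a fixed task.

    Fits log2(compute) = a + slope * year and returns -1 / slope.
    Assumes compute falls over the sample, so the slope is negative.
    """
    logs = [math.log2(c) for c in compute]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    sxx = sum((x - mean_x) ** 2 for x in years)
    slope = sxy / sxx
    return -1.0 / slope


# Hypothetical measurements: compute needed halves every year.
years = [0, 1, 2, 3]
compute = [1000.0, 500.0, 250.0, 125.0]
print(halving_time(years, compute))
```

This only measures progress on one fixed task. Whether such a number tracks anything important is the point Eliezer disputes in his reply.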

[Yudkowsky][22:43] 

Yeah, that would be a cool thing to measure. It's not obviously a relevant thing to anything important, but it'd be cool to measure.

[Christiano][22:43] 

Another way you won't like: we can hold fixed the resources we invest and look at the quality of outputs in any given domain (or even $ of revenue) and ask how fast it's changing.

[Yudkowsky][22:43] 

I wonder what it would say about Go during the age of AlphaGo.

Or what that second metric would say.

[Christiano][22:43] 

I think it would be completely fine, and you don't really understand what happened with deep learning in board games. Though I also don't know what happened in much detail, so this is more like a prediction than a retrodiction.

But it's enough of a retrodiction that I shouldn't get too much credit for it.

[Yudkowsky][22:44] 

I don't know what result you would consider "completely fine". I didn't have any particular unfine result in mind.

[Christiano][22:45] 

oh, sure

if it was just an honest question happy to use it as a concrete case

I would measure the rate of progress in Go by looking at how fast Elo improves with time or increasing R&D spending

[Yudkowsky][22:45] 

I mean, I don't have strong predictions about it so it's not yet obviously cruxy to me

[Christiano][22:46] 

I'd roughly guess that would continue, and if there were multiple trendlines to extrapolate I'd estimate crossover points based on that

[Yudkowsky][22:47] 

suppose this curve is smooth, and we see that sharp Go progress over time happened because DeepMind dumped in a ton of increased R&D spend. you then argue that this cannot happen with AGI because by the time we get there, people will be pushing hard at the frontiers in a competitive environment where everybody's already spending what they can afford, just like in a highly competitive manufacturing industry.

[Christiano][22:47] 

the key input to making a prediction for AGZ in particular would be the precise form of the dependence on R&D spending, to try to predict the changes as you shift from a single programmer to a large team at DeepMind, but most reasonable functional forms would be roughly right

Yes, it's definitely a prediction of my view that it's easier to improve things that people haven't spent much money on than things people have spent a lot of money on. It's also a separate prediction of my view that people are going to be spending a boatload of money on all of the relevant technologies. Perhaps $1B/year right now and I'm imagining levels of investment large enough to be essentially bottlenecked on the availability of skilled labor.

[Bensinger][22:48] 

( Previous Eliezer-comments about AlphaGo as a break in trend, responding briefly to Miles Brundage: https://twitter.com/ESRogs/status/1337869362678571008 )

 

 

[Yudkowsky][22:49] 

Does your prediction change if all hell breaks loose in 2025 instead of 2055?

[Christiano][22:50] 

I think my prediction was wrong if all hell breaks loose in 2025, if by "all hell breaks loose" you mean "dyson sphere" and not "things feel crazy"

[Yudkowsky][22:50] 

Things feel crazy in the AI field and the world ends less than 4 years later, well before the world economy doubles.

Why was the Prophecy wrong if the world begins final descent in 2025? The Prophecy requires the world to then last until 2029 while doubling its economic output, after which it is permitted to end, but does not obviously to me forbid the Prophecy to begin coming true in 2025 instead of 2055.
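
(Illustrative sketch, not part of the chat: the "4 years while doubling economic output" condition can be converted into an annual growth rate, and back. The helper names are hypothetical, and constant compound growth is an assumption.)

```python
import math


def annual_growth_for_doubling(years: float) -> float:
    """Constant annual growth rate that doubles output in `years` years."""
    return 2.0 ** (1.0 / years) - 1.0


def doubling_time(annual_growth: float) -> float:
    """Years needed to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


# A 4-year doubling needs roughly 19% growth per year, sustained.
print(round(annual_growth_for_doubling(4) * 100, 1))
```

Under constant compounding, a doubling over 2025-2029 therefore implies growth of about 19% per year for four years running.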

[Christiano][22:52] 

yes, I just mean that some important underlying assumptions for the prophecy were violated, I wouldn't put much stock in it at that point, etc.

[Yudkowsky][22:53] 

A lot of the issues I have with understanding any of your terminology in concrete Eliezer-language is that it looks to me like the premise-events of your Prophecy are fulfillable in all sorts of ways that don't imply the conclusion-events of the Prophecy.

[Christiano][22:53] 

if "things feel crazy" happens 4 years before dyson sphere, then I think we have to be really careful about what crazy means

[Yudkowsky][22:54] 

a lot of people looking around nervously and privately wondering if Eliezer was right, while public pravda continues to prohibit wondering any such thing out loud, so they all go on thinking that they must be wrong.

[Christiano][22:55] 

OK, by "things get crazy" I mean like hundreds of billions of dollars of spending at google on automating AI R&D

[Yudkowsky][22:55] 

I expect bureaucratic obstacles to prevent much GDP per se from resulting from this.

[Christiano][22:55] 

massive scaleups in semiconductor manufacturing, bidding up prices of inputs crazily

[Yudkowsky][22:55] 

I suppose that much spending could well increase world GDP by hundreds of billions of dollars per year.

[Christiano][22:56] 

massive speculative rises in AI company valuations financing a significant fraction of GWP into AI R&D

(+hardware R&D, +building new clusters, +etc.)

[Yudkowsky][22:56] 

like, higher than Tesla? higher than Bitcoin?

both of these things sure did skyrocket in market cap without that having much of an effect on housing stocks and steel production.

[Christiano][22:57] 

right now I think hardware R&D is on the order of $100B/year, AI R&D is more like $10B/year, I guess I'm betting on something more like trillions? (limited from going higher because of accounting problems and not that much smart money)

I don't think steel production is going up at that point

plausibly going down since you are redirecting manufacturing capacity into making more computers. But probably just staying static while all of the new capacity is going into computers, since cannibalizing existing infrastructure is much more expensive

the original point was: you aren't pulling AlphaZero shit any more, you are competing with an industry that has invested trillions in cumulative R&D

[Yudkowsky][23:00] 

is this in hopes of future profit, or because current profits are already in the trillions?

[Christiano][23:01] 

largely in hopes of future profit / reinvested AI outputs (that have high market cap), but also revenues are probably in the trillions?

[Yudkowsky][23:02] 

this all sure does sound "pretty darn prohibited" on my model, but I'd hope there'd be something earlier than that we could bet on. what does your Prophecy prohibit happening before that sub-prophesied day?

[Christiano][23:02] 

To me your model just seems crazy, and you are saying it predicts crazy stuff at the end but no crazy stuff beforehand, so I don't know what's prohibited. Mostly I feel like I'm making positive predictions, of gradually escalating value of AI in lots of different industries

and rapidly increasing investment in AI

I guess your model can be: those things happen, and then one day the AI explodes?

[Yudkowsky][23:03] 

the main way you get rapidly increasing investment in AI is if there's some way that AI can produce huge profits without that being effectively bureaucratically prohibited - eg this is where we get huge investments in burning electricity and wasting GPUs on Bitcoin mining.

[Christiano][23:03] 

but it seems like you should be predicting e.g. AI quickly jumping to superhuman in lots of domains, and some applications jumping from no value to massive value

I don't understand what you mean by that sentence. Do you think we aren't seeing rapidly increasing investment in AI right now?

or are you talking about increasing investment above some high threshold, or increasing investment at some rate significantly larger than the current rate?

it seems to me like you can pretty seamlessly get up to a few $100B/year of revenue just by redirecting existing tech R&D

[Yudkowsky][23:05] 

so I can imagine scenarios where some version of GPT-5 cloned outside OpenAI is able to talk hundreds of millions of mentally susceptible people into giving away lots of their income, and many regulatory regimes are unable to prohibit this effectively. then AI could be making a profit of trillions and then people would invest corresponding amounts in making new anime waifus trained in erotic hypnosis and findom.

this, to be clear, is not my mainline prediction.

but my sense is that our current economy is mostly not about the 1-day period to design new vaccines, it is about the multi-year period to be allowed to sell the vaccines.

the exceptions to this, like Bitcoin managing to say "fuck off" to the regulators for long enough, are where Bitcoin scales to a trillion dollars and gets massive amounts of electricity and GPU burned on it.

so we can imagine something like this for AI, which earns a trillion dollars, and sparks a trillion-dollar competition.

but my sense is that your model does not work like this.

my sense is that your model is about general improvements across the whole economy.

[Christiano][23:08] 

I think bitcoin is small even compared to current AI...

[Yudkowsky][23:08] 

my sense is that we've already built an economy which rejects improvement based on small amounts of cleverness, and only rewards amounts of cleverness large enough to bypass bureaucratic structures. it's not enough to figure out a version of e-gold that's 10% better. e-gold is already illegal. you have to figure out Bitcoin.

what are you going to build? better airplanes? airplane costs are mainly regulatory costs. better medtech? mainly regulatory costs. better houses? building houses is illegal anyways.

where is the room for the general AI revolution, short of the AI being literally revolutionary enough to overthrow governments?

[Christiano][23:10] 

factories, solar panels, robots, semiconductors, mining equipment, power lines, and "factories" just happens to be one word for a thousand different things

I think it's reasonable to think some jurisdictions won't be willing to build things but it's kind of improbable as a prediction for the whole world. That's a possible source of shorter-term predictions?

also computers and the 100 other things that go in datacenters

[Yudkowsky][23:12] 

The whole developed world rejects open borders. The regulatory regimes all make the same mistakes with an almost perfect precision, the kind of coordination that human beings could never dream of when trying to coordinate on purpose.

if the world lasts until 2035, I could perhaps see deepnets becoming as ubiquitous as computers were in... 1995? 2005? would that fulfill the terms of the Prophecy? I think it doesn't; I think your Prophecy requires that early AGI tech be that ubiquitous so that AGI tech will have trillions invested in it.

[Christiano][23:13] 

what is AGI tech?

the point is that there aren't important drivers that you can easily improve a lot

[Yudkowsky][23:14] 

for purposes of the Prophecy, AGI tech is that which, scaled far enough, ends the world; this must have trillions invested in it, so that the trajectory up to it cannot look like pulling an AlphaGo. no?

[Christiano][23:14] 

so it's relevant if you are imagining some piece of the technology which is helpful for general problem solving or something but somehow not helpful for all of the things people are doing with ML, to me that seems unlikely since it's all the same stuff

surely AGI tech should at least include the use of AI to automate AI R&D

regardless of what you arbitrarily decree as "ends the world if scaled up"

[Yudkowsky][23:15] 

only if that's the path that leads to destroying the world?

if it isn't on that path, who cares Prophecy-wise?

[Christiano][23:15] 

also I want to emphasize that "pull an AlphaGo" is what happens when you move from SOTA being set by an individual programmer to a large lab, you don't need to be investing trillions to avoid that

and that the jump is still more like a few years

but the prophecy does involve trillions, and my view gets more like your view if people are jumping from $100B of R&D ever to $1T in a single year

 

5.8. TPUs and GPUs, and automating AI R&D

 

[Yudkowsky][23:17] 

I'm also wondering a little why the emphasis on "trillions". it seems to me that the terms of your Prophecy should be fulfillable by AGI tech being merely as ubiquitous as modern computers, so that many competing companies invest mere hundreds of billions in the equivalent of hardware plants. it is legitimately hard to get a chip with 50% better transistors ahead of TSMC.

[Christiano][23:17] 

yes, if you are investing hundreds of billions then it is hard to pull ahead (though could still happen)

(since the upside is so much larger here, no one cares that much about getting ahead of TSMC since the payoff is tiny in the scheme of the amounts we are discussing)

[Yudkowsky][23:18] 

which, like, doesn't prevent Google from tossing out TPUs that are pretty significant jumps on GPUs, and if there's a specialized application of AGI-ish tech that is especially key, you can have everything behave smoothly and still get a jump that way.

[Christiano][23:18] 

I think TPUs are basically the same as GPUs

probably a bit worse

(but GPUs are sold at a 10x markup since that's the size of nvidia's lead)

[Yudkowsky][23:19] 

noted; I'm not enough of an expert to directly contradict that statement about TPUs from my own knowledge.

[Christiano][23:19] 

(though I think TPUs are nevertheless leased at a slightly higher price than GPUs)

[Yudkowsky][23:19] 

how does Nvidia maintain that lead and 10x markup? that sounds like a pretty un-Paul-ish state of affairs given Bitcoin prices never mind AI investments.

[Christiano][23:20] 

nvidia's lead isn't worth that much because historically they didn't sell many gpus

(especially for non-gaming applications)

their R&D investment is relatively large compared to the $ on the table

my guess is that their lead doesn't stick, as evidenced by e.g. Google very quickly catching up

[Yudkowsky][23:21] 

parenthetically, does this mean - and I don't necessarily predict otherwise - that you predict a drop in Nvidia's stock and a drop in GPU prices in the next couple of years?

[Christiano][23:21] 

nvidia's stock may do OK from riding general AI boom, but I do predict a relative fall in nvidia compared to other AI-exposed companies

(though I also predicted google to more aggressively try to compete with nvidia for the ML market and think I was just wrong about that, though I don't really know any details of the area)

I do expect the cost of compute to fall over the coming years as nvidia's markup gets eroded

to be partially offset by increases in the cost of the underlying silicon (though that's still bad news for nvidia)

[Yudkowsky][23:23] 

I parenthetically note that I think the Wise Reader should be justly impressed by predictions that come true about relative stock price changes, even if Eliezer has not explicitly contradicted those predictions before they come true. there are bets you can win without my having to bet against you.

[Christiano][23:23] 

you are welcome to counterpredict, but no saying in retrospect that reality proved you right if you don't 🙂

otherwise it's just me vs the market

[Yudkowsky][23:24] 

I don't feel like I have a counterprediction here, but I think the Wise Reader should be impressed if you win vs. the market.

however, this does require you to name in advance a few "other AI-exposed companies".

[Christiano][23:25] 

Note that I made the same bet over the last year---I make a large AI bet but mostly moved my nvidia allocation to semiconductor companies. The semiconductor part of the portfolio is up 50% while nvidia is up 70%, so I lost that one. But that just means I like the bet even more next year.

happy to use nvidia vs tsmc

[Yudkowsky][23:25] 

there's a lot of noise in a 2-stock prediction.

[Christiano][23:25] 

I mean, it's a 1-stock prediction about nvidia

[Yudkowsky][23:26] 

but your funeral or triumphal!

[Christiano][23:26] 

indeed 🙂

anyway

I expect all of the $ amounts to be much bigger in the future

[Yudkowsky][23:26] 

yeah, but using just TSMC for the opposition exposes you to I dunno Chinese invasion of Taiwan

[Christiano][23:26] 

yes

also TSMC is not that AI-exposed

I think the main prediction is: eventual move away from GPUs, nvidia can't maintain that markup

[Yudkowsky][23:27] 

"Nvidia can't maintain that markup" sounds testable, but is less of a win against the market than predicting a relative stock price shift. (Over what timespan? Just the next year sounds quite fast for that kind of prediction.)

[Christiano][23:27] 

regarding your original claim: if you think that it's plausible that AI will be doing all of the AI R&D, and that will be accelerating continuously from 12, 6, 3 month "doubling times," but that we'll see a discontinuous change in the "path to doom," then that would be harder to generate predictions about

yes, it's hard to translate most predictions about the world into predictions about the stock market

[Yudkowsky][23:28] 

this again sounds like it's not written in Eliezer-language.

what does it mean for "AI will be doing all of the AI R&D"? that sounds to me like something that happens after the end of the world, hence doesn't happen.

[Christiano][23:29] 

that's good, that's what I thought

[Yudkowsky][23:29] 

I don't necessarily want to sound very definite about that in advance of understanding what it means

[Christiano][23:29] 

I'm saying that I think AI will be automating AI R&D gradually, before the end of the world

yeah, I agree that if you reject the construct of "how fast the AI community makes progress" then it's hard to talk about what it means to automate "progress"

and that may be hard to make headway on

though for cases like AlphaGo (which started that whole digression) it seems easy enough to talk about elo gain per year

maybe the hard part is aggregating across tasks into a measure you actually care about?

[Yudkowsky][23:30] 

up to a point, but yeah. (like, if we're taking Elo high above human levels and restricting our measurements to a very small range of frontier AIs, I quietly wonder if the measurement is still measuring quite the same thing with quite the same robustness.)

[Christiano][23:31] 

I agree that elo measurement is extremely problematic in that regime

 

5.9. Smooth exponentials vs. jumps in income

 

[Yudkowsky][23:31] 

so in your worldview there's this big emphasis on things that must have been deployed and adopted widely to the point of already having huge impacts

and in my worldview there's nothing very surprising about people with a weird powerful prototype that wasn't used to automate huge sections of AI R&D because the previous versions of the tech weren't useful for that or bigcorps didn't adopt it.

[Christiano][23:32] 

I mean, Google is already 1% of the US economy and in this scenario it and its peers are more like 10-20%? So wide adoption doesn't have to mean that many people. Though I also do predict much wider adoption than you so happy to go there if it's helpful for predictions.

I don't really buy the "weird powerful prototype"

[Yudkowsky][23:33] 

yes. I noticed.

you would seem, indeed, to be offering large quantities of it for short sale.

[Christiano][23:33] 

and it feels like the thing you are talking about ought to have some precedent of some kind, of weird powerful prototypes that jump straight from "does nothing" to "does something impactful"

like if I predict that AI will be useful in a bunch of domains, and will get there by small steps, you should either predict that won't happen, or else also predict that there will be some domains with weird prototypes jumping to giant impact?

[Yudkowsky][23:34] 

like an electrical device that goes from "not working at all" to "actually working" as soon as you screw in the attachments for the electrical plug.

[Christiano][23:34] 

(clearly takes more work to operationalize)

I'm not sure I understand that sentence, hopefully it's clear enough why I expect those discontinuities?

[Yudkowsky][23:34] 

though, no, that's a facile bad analogy.

a better analogy would be an AI system that only starts working after somebody tells you about batch normalization or LAMB learning rate or whatever.

[Christiano][23:36] 

sure, which I think will happen all the time for individual AI projects but not for sota

because the projects at sota have picked the low hanging fruit, it's not easy to get giant wins

[Yudkowsky][23:36] 

like if I predict that AI will be useful in a bunch of domains, and will get there by small steps, you should either predict that won't happen, or else also predict that there will be some domains with weird prototypes jumping to giant impact?

in the latter case, has this Eliezer-Prophecy already had its terms fulfilled by AlphaFold 2, or do you say nay because AlphaFold 2 hasn't doubled GDP?

[Christiano][23:37] 

(you can also get giant wins by a new competitor coming up at a faster rate of progress, and then we have more dependence on whether people do it when it's a big leap forward or slightly worse than the predecessor, and I'm betting on the latter)

I have no idea what AlphaFold 2 is good for, or the size of the community working on it, my guess would be that its value is pretty small

we can try to quantify

like, I get surprised when $X of R&D gets you something whose value is much larger than $X

I'm not surprised at all if $X of R&D gets you <<$X, or even like 10*$X in a given case that was selected for working well

hopefully it's clear enough why that's the kind of thing a naive person would predict

[Yudkowsky][23:38] 

so a thing which Eliezer's Prophecy does not mandate per se, but sure does permit, and is on the mainline especially for nearer timelines, is that the world-ending prototype had no prior prototype containing 90% of the technology which earned a trillion dollars.

a lot of Paul's Prophecy seems to be about forbidding this.

is that a fair way to describe your own Prophecy?

[Christiano][23:39] 

I don't have a strong view about "containing 90% of the technology"

the main view is that whatever the "world ending prototype" does, there were earlier systems that could do practically the same thing

if the world ending prototype does something that lets you go foom in a day, there was a system years earlier that could foom in a month, so that would have been the one to foom

[Yudkowsky][23:41] 

but, like, the world-ending thing, according to the Prophecy, must be squarely in the middle of a class of technologies which are in the midst of earning trillions of dollars and having trillions of dollars invested in them. it's not enough for the Worldender to be definitionally somewhere in that class, because then it could be on a weird outskirt of the class, and somebody could invest a billion dollars in that weird outskirt before anybody else had invested a hundred million, which is forbidden by the Prophecy. so the Worldender has got to be right in the middle, a plain and obvious example of the tech that's already earning trillions of dollars. ...y/n?

[Christiano][23:42] 

I agree with that as a prediction for some operationalization of "a plain and obvious example," but I think we could make it more precise / it doesn't feel like it depends on the fuzziness of that

I think that if the world can end out of nowhere like that, you should also be getting $100B/year products out of nowhere like that, but I guess you think not because of bureaucracy

like, to me it seems like our views stake out predictions about codex, where I'm predicting its value will be modest relative to R&D, and the value will basically improve from there with a nice experience curve, maybe something like ramping up quickly to some starting point <$10M/year and then doubling every year thereafter, whereas I feel like you are saying more like "who knows, could be anything" and so should be surprised each time the boring thing happens
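
(Illustrative sketch, not part of the chat: how long a "doubling every year" experience curve takes to get from a starting point to a target. The $10M/year start uses Paul's "<$10M/year" as an upper bound, the $100B/year target is borrowed from his preceding message purely for scale, and `years_to_reach` is a hypothetical helper.)

```python
def years_to_reach(start: float, target: float, yearly_factor: float = 2.0) -> int:
    """Whole years of compounding needed for `start` to reach or exceed `target`."""
    if start <= 0 or yearly_factor <= 1:
        raise ValueError("need a positive start and a growth factor above 1")
    value = start
    years = 0
    while value < target:
        value *= yearly_factor
        years += 1
    return years


# $10M/year doubling annually until it reaches $100B/year.
print(years_to_reach(10e6, 100e9))  # prints 14
```

On these assumed numbers the smooth curve needs over a decade to reach the scale of a "$100B/year product", so any much faster arrival would count as a jump relative to it.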

[Yudkowsky][23:45] 

the concrete example I give is that the World-Ending Company will be able to use the same tech to build a true self-driving car, which would in the natural course of things be approved for sale a few years later after the world had ended.

[Christiano][23:46] 

but self-driving cars seem very likely to already be broadly deployed, and so the relevant question is really whether their technical improvements can also be deployed to those cars?

(or else maybe that's another prediction we disagree about)

[Yudkowsky][23:47] 

I feel like I would indeed not have the right to feel very surprised if Codex technology stagnated for the next 5 years, nor if it took a massive leap in 2 years and got ubiquitously adopted by lots of programmers.

yes, I think that's a general timeline difference there

re: self-driving cars

I might be talkable into a bet where you took "Codex tech will develop like this" and I took the side "literally anything else but that"

[Christiano][23:48] 

I think it would have to be over/under, I doubt I'm more surprised than you by something failing to be economically valuable, I'm surprised by big jumps in value

seems like it will be tough to work

[Yudkowsky][23:49] 

well, if I was betting on something taking a big jump in income, I sure would bet on something in a relatively unregulated industry like Codex or anime waifus.

but that's assuming I made the bet at all, which is a hard sell when the bet is about the Future, which is notoriously hard to predict.

[Christiano][23:50] 

I guess my strongest take is: if you want to pull the thing where you say that future developments proved you right and took unreasonable people like me by surprise, you've got to be able to say something in advance about what you expect to happen

[Yudkowsky][23:51] 

so what if neither of us are surprised if Codex stagnates for 5 years, you win if Codex shows a smooth exponential in income, and I win if the income looks... jumpier? how would we quantify that?
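
(Illustrative sketch, not part of the chat: one possible way to quantify "jumpier" is to ask what share of the total log-growth in income arrived in the single largest period. Neither participant proposes this metric. The name `largest_step_share` and the example series are hypothetical.)

```python
import math


def largest_step_share(income):
    """Share of total log-growth contributed by the single largest period.

    For a perfectly smooth exponential over n periods this is 1/n;
    if all growth arrives in one period it is 1.0.
    Requires positive incomes and net growth over the series.
    """
    logs = [math.log(x) for x in income]
    steps = [later - earlier for earlier, later in zip(logs, logs[1:])]
    total = logs[-1] - logs[0]
    if total <= 0:
        raise ValueError("series must show net growth")
    return max(steps) / total


smooth = [1.0, 2.0, 4.0, 8.0, 16.0]   # doubles every period
jumpy = [1.0, 1.0, 1.0, 16.0, 16.0]   # same endpoints, one big jump
print(largest_step_share(smooth))
print(largest_step_share(jumpy))
```

A metric like this is sensitive to how often income is sampled and to release timing, which Paul flags in his next message as a source of arbitrary step sizes.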

[Christiano][23:52] 

codex also does seem a bit unfair to you in that it may have to be adopted by lots of programmers which could slow things down a lot even if capabilities are pretty jumpy

(though I think in fact usefulness and not merely profit will basically just go up smoothly, with step sizes determined by arbitrary decisions about when to release something)

[Yudkowsky][23:53] 

I'd also be concerned about unfairness to me in that earnable income is not the same as the gains from trade. If there's more than 1 competitor in the industry, their earnings from Codex may be much less than the value produced, and this may not change much with improvements in the tech.

 

5.10. Late-stage predictions

 

[Christiano][23:53] 

I think my main update from this conversation is that you don't really predict someone to come out of nowhere with a model that can earn a lot of $, even if they could come out of nowhere with a model that could end the world, because of regulatory bottlenecks and nimbyism and general sluggishness and unwillingness to do things

does that seem right?

[Yudkowsky][23:55] 

Well, and also because the World-ender is "the first thing that scaled with compute" and/or "the first thing that ate the real core of generality" and/or "the first thing that went over neutron multiplication factor 1".

[Christiano][23:55] 

and so that cuts out a lot of the easily-specified empirical divergences, since "worth a lot of $" was the only general way to assess "big deal that people care about" and avoiding disputes like "but Zen was mostly developed by a single programmer, it's not like intense competition"

yeah, that's the real disagreement it seems like we'd want to talk about

but it just doesn't seem to lead to many prediction differences in advance?

I totally don't buy any of those models, I think they are bonkers

would love to bet on that

[Yudkowsky][23:56] 

Prolly but I think the from-my-perspective-weird talk about GDP is probably concealing some kind of important crux, because caring about GDP still feels pretty alien to me.

[Christiano][23:56] 

I feel like getting up to massive economic impacts without seeing "the real core of generality" seems like it should also be surprising on your view

like if it's 10 years from now and AI is a pretty big deal but no crazy AGI, isn't that surprising?

[Yudkowsky][23:57] 

Mildly but not too surprising, I would imagine that people had built a bunch of neat stuff with gradient descent in realms where you could get a long way on self-play or massively collectible datasets.

[Christiano][23:58] 

I'm fine with the crux being something that doesn't lead to any empirical disagreements, but in that case I just don't think you should claim credit for the worldview making great predictions.

(or the countervailing worldview making bad predictions)

[Yudkowsky][23:59] 

stuff that we could see then: self-driving cars (10 years is enough for regulatory approval in many countries), super Codex, GPT-6 powered anime waifus being an increasingly loud source of (arguably justified) moral panic and a hundred-billion-dollar industry

[Christiano][23:59] 

another option is "10% GWP growth in a year, before doom"

I think that's very likely, though might be too late to be helpful

[Yudkowsky][0:01]  (next day, Sep. 15) 

see, that seems genuinely hard unless somebody gets GPT-4 far ahead of any political opposition - I guess all the competent AGI groups lean solidly liberal at the moment? - and uses it to fake massive highly-persuasive sentiment on Twitter for housing liberalization.

[Christiano][0:01]  (next day, Sep. 15) 

so seems like a bet?

but you don't get to win until doom 🙁

[Yudkowsky][0:02]  (next day, Sep. 15) 

I mean, as written, I'd want to avoid cases like 10% growth on paper while recovering from a pandemic that produced 0% growth the previous year.

[Christiano][0:02]  (next day, Sep. 15) 

yeah

[Yudkowsky][0:04]  (next day, Sep. 15) 

I'd want to check the current rate (5% iirc) and what the variance on it was, 10% is a little low for surety (though my sense is that it's a pretty darn smooth graph that's hard to perturb)

if we got 10% in a way that was clearly about AI tech becoming that ubiquitous, I'd feel relatively good about nodding along and saying, "Yes, that is like unto the beginning of Paul's Prophecy" not least because the timelines had been that long at all.

[Christiano][0:05]  (next day, Sep. 15) 

like 3-4%/year right now

random wikipedia number is 5.5% in 2006-2007, 3-4% since 2010

4% 1995-2000

[Yudkowsky][0:06]  (next day, Sep. 15) 

I don't want to sound obstinate here. My model does not forbid that we dwiddle around on the AGI side while gradient descent tech gets its fingers into enough separate weakly-generalizing pies to produce 10% GDP growth, but I'm happy to say that this sounds much more like Paul's Prophecy is coming true.

[Christiano][0:07]  (next day, Sep. 15) 

ok, we should formalize at some point, but also need the procedure for you getting credit given that it can't resolve in your favor until the end of days

[Yudkowsky][0:07]  (next day, Sep. 15) 

Is there something that sounds to you like Eliezer's Prophecy which we can observe before the end of the world?

[Christiano][0:07]  (next day, Sep. 15) 

when you will already have all the epistemic credit you need

not on the "simple core of generality" stuff since that apparently immediately implies end of world

maybe something about ML running into obstacles en route to human level performance?

or about some other kind of discontinuous jump even in a case where people care, though there seem to be a few reasons you don't expect many of those

[Yudkowsky][0:08]  (next day, Sep. 15) 

depends on how you define "immediately"? it's not long before the end of the world, but in some sad scenarios there is some tiny utility to you declaring me right 6 months before the end.

[Christiano][0:09]  (next day, Sep. 15) 

I care a lot about the 6 months before the end personally

though I do think probably everything is more clear by then independent of any bet; but I guess you are more pessimistic about that

[Yudkowsky][0:09]  (next day, Sep. 15) 

I'm not quite sure what I'd do in them, but I may have worked something out before then, so I care significantly in expectation if not in particular.

I am more pessimistic about other people's ability to notice what reality is screaming in their faces, yes.

[Christiano][0:10]  (next day, Sep. 15) 

if we were to look at various scaling curves, e.g. of loss vs model size or something, do you expect those to look distinctive as you hit the "real core of generality"?

[Yudkowsky][0:10]  (next day, Sep. 15) 

let me turn that around: if we add transformers into those graphs, do they jump around in a way you'd find interesting?

[Christiano][0:11]  (next day, Sep. 15) 

not really

[Yudkowsky][0:11]  (next day, Sep. 15) 

is that because the empirical graphs don't jump, or because you don't think the jumps say much?

[Christiano][0:11]  (next day, Sep. 15) 

but not many good graphs to look at (I just have one in mind), so that's partly a prediction about what the exercise would show

I don't think the graphs jump much, and also transformers come before people start evaluating on tasks where they help a lot

[Yudkowsky][0:12]  (next day, Sep. 15) 

It would not terribly contradict the terms of my Prophecy if the World-ending tech began by not producing a big jump on existing tasks, but generalizing to some currently not-so-popular tasks where it scaled much faster.

[Christiano][0:13]  (next day, Sep. 15) 

eh, they help significantly on contemporary tasks, but it's just not a huge jump relative to continuing to scale up model sizes

or other ongoing improvements in architecture

anyway, should try to figure out something, and good not to finalize a bet until you have some way to at least come out ahead, but I should sleep now

[Yudkowsky][0:14]  (next day, Sep. 15) 

yeah, same.

Thing I want to note out loud lest I forget ere I sleep: I think the real world is full of tons and tons of technologies being developed as unprecedented prototypes in the midst of big fields, because the key thing to invest in wasn’t the competitively explored center. Wright Flyer vs all expenditures on Traveling Machine R&D. First atomic pile and bomb vs all Military R&D.

This is one reason why Paul’s Prophecy seems fragile to me. You could have the preliminaries come true as far as there being a trillion bucks in what looks like AI R&D, and then the WorldEnder is a weird prototype off to one side of that. Saying “But what about the rest of that AI R&D?” is no more a devastating retort to reality than looking at AlphaGo and saying “But weren’t other companies investing billions in Better Software?” Yeah but it was a big playing field with lots of different kinds of Better Software and no other medium-sized team of 15 people with corporate TPU backing was trying to build a system just like AlphaGo, even though multiple small outfits were trying to build prestige-earning gameplayers. Tech advancements very very often occur in places where investment wasn't dense enough to guarantee overlap.

 
6. Follow-ups on "Takeoff Speeds"

 

6.1. Eliezer Yudkowsky's commentary

 

[Yudkowsky][17:25]  (Sep. 15) 

Further comment that occurred to me on "takeoff speeds" if I've better understood the main thesis now: its hypotheses seem to include a perfectly anti-Thielian setup for AGI.

Thiel has a running thesis about how part of the story behind the Great Stagnation and the decline in innovation that's about atoms rather than bits - the story behind "we were promised flying cars and got 140 characters", to cite the classic Thielian quote - is that people stopped believing in "secrets".

Thiel suggests that you have to believe there are knowable things that aren't yet widely known - not just things that everybody already knows, plus mysteries that nobody will ever know - in order to be motivated to go out and innovate. Culture in developed countries shifted to label this kind of thinking rude - or rather, even ruder, even less tolerated than it had been decades before - so innovation decreased as a result.

The central hypothesis of "takeoff speeds" is that at the time of serious AGI being developed, it is perfectly anti-Thielian in that it is devoid of secrets in that sense. It is not permissible (on this viewpoint) for it to be the case that there is a lot of AI investment into AI that is directed not quite at the key path leading to AGI, such that somebody could spend $1B on compute for the key path leading to AGI before anybody else had spent $100M on that. There cannot exist any secret like that. The path to AGI will be known; everyone, or a wide variety of powerful actors, will know how profitable that path will be; the surrounding industry will be capable of acting on this knowledge, and will have actually been acting on it as early as possible; multiple actors are already investing in every tech path that would in fact be profitable (and is known to any human being at all), as soon as that R&D opportunity becomes available.

And I'm not saying this is an inconsistent world to describe! I've written science fiction set in this world. I called it "dath ilan". It's a hypothetical world that is actually full of smart people in economic equilibrium. If anything like Covid-19 appears, for example, the governments and public-good philanthropists there have already set up prediction markets (which are not illegal, needless to say); and of course there are mRNA vaccine factories already built and ready to go, because somebody already calculated the profits from fast vaccines would be very high in case of a pandemic (no artificial price ceilings in this world, of course); so as soon as the prediction markets started calling the coming pandemic conditional on no vaccine, the mRNA vaccine factories were already spinning up.

This world, however, is not Earth.

On Earth, major chunks of technological progress quite often occur outside of a social context where everyone knew and agreed in advance on which designs would yield how much expected profit and many overlapping actors competed to invest in the most actually-promising paths simultaneously.

And that is why you can read Inadequate Equilibria, and then read this essay on takeoff speeds, and go, "Oh, yes, I recognize this; it's written inside the Modesty worldview; in particular, the imagination of an adequate world in which there is a perfect absence of Thielian secrets or unshared knowable knowledge about fruitful development pathways. This is the same world that already had mRNA vaccines ready to spin up on day one of the Covid-19 pandemic, because markets had correctly forecasted their option value and investors had acted on that forecast unimpeded. Sure would be an interesting place to live! But we don't live there."

Could we perhaps end up in a world where the path to AGI is in fact not a Thielian secret, because in fact the first accessible path to AGI happens to lie along a tech pathway that already delivered large profits to previous investors who summed a lot of small innovations, a la experience with chipmaking, such that there were no large innovations just lots and lots of small innovations that yield 10% improvement annually on various tech benchmarks?

I think that even in this case we will get weird, discontinuous, and fatal behaviors, and I could maybe talk about that when discussion resumes. But it is not ruled out to me that the first accessible pathway to AGI could happen to lie in the further direction of some road that was already well-traveled, already yielded much profit to now-famous tycoons back when its first steps were Thielian secrets, and hence is now replete with dozens of competing chasers for the gold rush.

It's even imaginable to me, though a bit less so, that the first path traversed to real actual pivotal/powerful/lethal AGI, happens to lie literally actually squarely in the central direction of the gold rush. It sounds a little less like the tech history I know, which is usually about how someone needed to swerve a bit and the popular gold-rush forecasts weren't quite right, but maybe that is just a selective focus of history on the more interesting cases.

Though I remark that - even supposing that getting to big AGI is literally as straightforward and yet as difficult as falling down a semiconductor manufacturing roadmap (as otherwise the biggest actor to first see the obvious direction could just rush down the whole road) - well, TSMC does have a bit of an unshared advantage right now, if I recall correctly. And Intel had a bit of an advantage before that. So that happens even when there's competitors competing to invest billions.

But we can imagine that doesn't happen either, because instead of needing to build a whole huge manufacturing plant, there's just lots and lots of little innovations adding up to every key AGI threshold, which lots of actors are investing $10 million in at a time, and everybody knows which direction to move in to get to more serious AGI and they're right in this shared forecast.

I am willing to entertain discussing this world and the sequelae there - I do think everybody still dies in this case - but I would not have this particular premise thrust upon us as a default, through a not-explicitly-spoken pressure against being so immodest and inegalitarian as to suppose that any Thielian knowable-secret will exist, or that anybody in the future gets as far ahead of others as today's TSMC or today's Deepmind.

We are, in imagining this world, imagining a world in which AI research has become drastically unlike today's AI research in a direction drastically different from the history of many other technologies.

It's not literally unprecedented, but it's also not a default environment for big moments in tech progress; it's narrowly precedented for particular industries with high competition and steady benchmark progress driven by huge investments into a sum of many tiny innovations.

So I can entertain the scenario. But if you want to claim that the social situation around AGI will drastically change in this way you foresee - not just that it could change in that direction, if somebody makes a big splash that causes everyone else to reevaluate their previous opinions and arrive at yours, but that this social change will occur and you know this now - and that the prerequisite tech path to AGI is known to you, and forces an investment situation that looks like the semiconductor industry - then your "What do you think you know and how do you think you know it?" has some significant explaining to do.

Of course, I do appreciate that such a thing could be knowable, and yet not known to me. I'm not so silly as to disbelieve in secrets like that. They're all over the actual history of technological progress on our actual Earth.

Yudkowsky and Christiano discuss "Takeoff Speeds"

I stand ready to bet with Eliezer on any topic related to AI, science, or technology. I'm happy for him to pick but I suggest some types of forecast below.

If Eliezer’s predictions were roughly as good as mine (in cases where we disagree), then I would update towards taking his views more seriously. Right now it looks to me like his view makes bad predictions about lots of everyday events.

It’s possible that we won’t be able to find cases where we disagree, and perhaps that Eliezer’s model totally agrees with mine until we develop AGI. But I think that’s unlikely for a few reasons:

  • I constantly see observations that seem like evidence for Eliezer’s views (e.g. any time I see an ML paper with a surprisingly large effect size, or ML labs failing to make investments in scaling, or people being surprisingly unreasonable), it’s just that I see significantly more evidence against his views. The point of making bets in advance is that it can correct for my hindsight bias or for my inability to simulate “what Eliezer’s view would say about this.” Eliezer could also say that actually all of the observations I listed aren't evidence for his view, which would be interesting to me.
  • Eliezer frequen
... (read more)

I do wish to note that we spent a fair amount of time on Discord trying to nail down what earlier points we might disagree on, before the world started to end, and these Discord logs should be going up later.

From my perspective, the basic problem is that Eliezer's story looks a lot like "business as usual until the world starts to end sharply", and Paul's story looks like "things continue smoothly until their smooth growth ends the world smoothly", and both of us have ever heard of superforecasting and both of us are liable to predict near-term initial segments by extrapolating straight lines while those are available.  Another basic problem, as I'd see it, is that we tend to tell stories about very different subject matters - I care a lot less than Paul about the quantitative monetary amount invested into Intel, to the point of not really trying to develop expertise about that.

I claim that I came off better than Robin Hanson in our FOOM debate compared to the way that history went.  I'd claim that my early judgments of the probable importance of AGI, at all, stood up generally better than early non-Yudkowskian EA talking about that.  Other people I've noticed ever m... (read more)

From my perspective, the basic problem is that Eliezer's story looks a lot like "business as usual until the world starts to end sharply", and Paul's story looks like "things continue smoothly until their smooth growth ends the world smoothly", and both of us have ever heard of superforecasting and both of us are liable to predict near-term initial segments by extrapolating straight lines while those are available.

I agree that it's plausible that we both make the same predictions about the near future. I think we probably don't, and there are plenty of disagreements about all kinds of stuff. But if in fact we agree, then in 5 years you shouldn't say "and see how much the world looked like I said?"

It feels to me like it goes:  you say AGI will look crazy.  Then I say that sounds unlike the world of today. Then you say "no, the world actually always looks discontinuous in the ways I'm predicting and your model is constantly surprised by real stuff that happens, e.g. see transformers or AlphaGo" and then I say "OK, let's bet about literally anything at all, you pick."

I think it's pretty likely that we actually do disagree about how much the world of today is boring and continuo... (read more)

I feel a bit confused about where you think we meta-disagree here, meta-policy-wise.  If you have a thesis about the sort of things I'm liable to disagree with you about, because you think you're more familiar with the facts on the ground, can't you write up Paul's View of the Next Five Years and then if I disagree with it better yet, but if not, you still get to be right and collect Bayes points for the Next Five Years?

I mean, it feels to me like this should be a case similar to where, for example, I think I know more about macroeconomics than your typical EA; so if I wanted to expend the time/stamina points, I could say a bunch of things I consider obvious and that contradict hot takes on Twitter and many EAs would go "whoa wait really" and then I could collect Bayes points later and have performed a public service, even if nobody showed up to disagree with me about that.  (The reason I don't actually do this... is that I tried; I keep trying to write a book about basic macro, only it's the correct version explained correctly, and have a bunch of isolated chapters and unfinished drafts.)  I'm also trying to write up my version of The Next Five Years assuming the wo... (read more)

I think you think there's a particular thing I said which implies that the ball should be in my court to already know a topic where I make a different prediction from what you do.

I've said I'm happy to bet about anything, and listed some particular questions I'd bet about where I expect you to be wronger. If you had issued the same challenge to me, I would have picked one of the things and we would have already made some bets. So that's why I feel like the ball is in your court to say what things you're willing to make forecasts about.

That said, I don't know if making bets is at all a good use of time. I'm inclined to do it because I feel like your view really should be making different predictions (and I feel like you are participating in good faith and in fact would end up making different predictions). And I think it's probably more promising than trying to hash out the arguments since at this point I feel like I mostly know your position and it's incredibly slow going. But it seems very plausible that the right move is just to agree to disagree and not spend time on this. In that case it was particularly bad of me to try to claim the epistemic high ground. I can't really defend... (read more)

My uncharitable read on many of these domains is that you are saying "Sure, I think that Paul might have somewhat better forecasts than me on those questions, but why is that relevant to AGI?"

In that case it seems like the situation is pretty asymmetrical. I'm claiming that my view of AGI is related to beliefs and models that also bear on near-term questions, and I expect to make better forecasts than you in those domains because I have more accurate beliefs/models. If your view of AGI is unrelated to any near-term questions where we disagree, then that seems like an important asymmetry.

DPiepgrass
I suspect that indeed EY's model has a limited ability to make near-term predictions, so that yes, the situation is asymmetrical. But I suspect his view is similar to my view, so I don't think EY is wrong. But I am confused about why EY (i) hasn't replied himself and (ii) in general, doesn't communicate more clearly on this topic.

I think you are underconfident about the fact that almost all AI profits will come from areas that had almost-as-much profit in recent years. So we could bet about where AI profits are in the near term, or try to generalize this.

I wouldn't be especially surprised by waifutechnology or machine translation jumping to newly accessible domains (the thing I care about and you shrug about (until the world ends)), but is that likely to exhibit a visible economic discontinuity in profits (which you care about and I shrug about (until the world ends))?  There's apparently already mass-scale deployment of waifutech in China to forlorn male teenagers, so maybe you'll say the profits were already precedented.  Google offers machine translation now, even though they don't make much obvious measurable profit on that, but maybe you'll want to say that however much Google spends on that, they must rationally anticipate at least that much added revenue.  Or perhaps you want to say that "almost all AI profits" will come from robotics over the same period.  Or maybe I misunderstand your viewpoint, and if you said something concrete about the stuff you care about, I would manage to disagree with that; or maybe you think that waifutech suddenly getting much more charming with the next generation of text transformers is something you already know enough to rule out; or maybe you think that 2024's waifutech should definitely be able to do some observable surface-level thing it can't do now.

I'd be happy to disagree about romantic chatbots or machine translation. I'd have to look into it more to get a detailed sense in either, but I can guess. I'm not sure what "wouldn't be especially surprised" means; I think to actually get disagreements we need way more resolution than that, so one question is whether you are willing to play ball (since presumably you'd also have to look into it to get a more detailed sense). Maybe we could save labor if people would point out the empirical facts we're missing and we can revise in light of that, but we'd still need more resolution. (That said: what's up for grabs here are predictions about the future, not present.)

I'd guess that machine translation is currently something like $100M/year in value, and will scale up more like 2x/year than 10x/year as DL improves (e.g. most of the total log increase will be in years with <3x increase rather than >3x increase, and 3 is like the 60th percentile of the number for which that inequality is tight).
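
One way to read the parenthetical forecast above is as a resolution rule: take yearly values, split the total log increase into the part contributed by years that grew less than 3x and the part contributed by years that grew 3x or more, and see which is larger. The following is a minimal sketch under that reading. The function name `log_increase_split` and both example series are hypothetical and invented for illustration; they are not data about machine translation.

```python
import math


def log_increase_split(yearly_values, threshold=3.0):
    """Split total log growth by whether each year's growth ratio is
    below `threshold` (slow years) or at/above it (fast years)."""
    slow, fast = 0.0, 0.0
    for prev, cur in zip(yearly_values, yearly_values[1:]):
        ratio = cur / prev
        if ratio < threshold:
            slow += math.log(ratio)
        else:
            fast += math.log(ratio)
    return slow, fast


# Hypothetical series in $M/year, invented purely for illustration.
gradual = [100, 200, 400, 800, 1600]   # steady 2x/year
jumpy = [100, 110, 120, 1500, 1600]    # one 12.5x year dominates

for name, series in [("gradual", gradual), ("jumpy", jumpy)]:
    slow, fast = log_increase_split(series)
    print(name, "-> most log increase came from <3x years:", slow > fast)
```

Both series end at the same level, so the rule distinguishes them only by how the growth was distributed across years, which appears to be the point of stating the forecast in log terms.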

I'd guess that increasing deployment of romantic chatbots will end up with technical change happening first followed by social change second, so the speed of deployment and change will depend ... (read more)

Thanks for continuing to try on this!  Without having spent a lot of labor myself on looking into self-driving cars, I think my sheer impression would be that we'll get $1B/yr waifutech before we get AI freedom-of-the-road; though I do note again that current self-driving tech would be more than sufficient for $10B/yr revenue if people built new cities around the AI tech level, so I worry a bit about some restricted use-case of self-driving tech that is basically possible with current tech finding some less regulated niche worth a trivial $10B/yr.  I also remark that I wouldn't be surprised to hear that waifutech is already past $1B/yr in China, but I haven't looked into things there.  I don't expect the waifutech to transcend my own standards for mediocrity, but something has to be pretty good before I call it more than mediocre; do you think there's particular things that waifutech won't be able to do?

My model permits large jumps in ML translation adoption; it is much less clear about whether anyone will be able to build a market moat and charge big prices for it.  Do you have a similar intuition about # of users increasing gradually, not just revenue increasing gradually?

I think we're still at the level of just drawing images about the future, so that anybody who came back in 5 years could try to figure out who sounded right, at all, rather than assembling a decent portfolio of bets; but I also think that just having images versus no images is a lot of progress.

paulfchristiano
Yes, I think that value added by automated translation will follow a similar pattern. Number of words translated is more sensitive to how you count and random nonsense, as is number of "users" which has even more definitional issues. You can state a prediction about self-driving cars in any way you want. The obvious thing is to talk about programs similar to the existing self-driving taxi pilots (e.g. Waymo One) and ask when they do $X of revenue per year, or when $X of self-driving trucking is done per year. (I don't know what AI freedom-of-the-road means, do you mean something significantly more ambitious than self-driving trucks or taxis?)
paulfchristiano
Man, the problem is that you say the "jump to newly accessible domains" will be the thing that lets you take over the world. So what's up for dispute is the prototype being enough to take over the world rather than years of progress by a giant lab on top of the prototype. It doesn't help if you say "I expect new things to sometimes become possible" if you don't further say something about the impact of the very early versions of the product. If e.g. people were spending $1B/year developing a technology, and then after a while it jumps from $0/year to $1B/year of profit, I'm not that surprised. (Note that machine translation is radically smaller than this, I don't know the numbers.) I do suspect they could have rolled out a crappy version earlier, perhaps by significantly changing their project. But why would they necessarily bother doing that? For me this isn't violating any of the principles that make your stories sound so crazy. The crazy part is someone spending $1B and then generating $100B/year in revenue (much less $100M and then taking over the world). (Note: it is surprising if an industry is spending $10T/year on R&D and then jumps from $1T --> $10T of revenue in one year in a world that isn't yet growing crazily. The surprise depends a lot on the numbers involved, and in particular on how valuable it would have been to deploy a worse version earlier and how hard it is to raise money at different scales.)

The crazy part is someone spending $1B and then generating $100B/year in revenue (much less $100M and then taking over the world).

Would you say that this is a good description of Suddenly Hominids but you don't expect that to happen again, or that this is a bad description of hominids?

paulfchristiano
It's not a description of hominids at all, no one spent any money on R&D. I think there are analogies where this would be analogous to hominids (which I think are silly, as we discuss in the next part of this transcript). And there are analogies where this is a bad description of hominids (which I prefer).

Spending money on R&D is essentially the expenditure of resources in order to explore and optimize over a promising design space, right? That seems like a good description of what natural selection did in the case of hominids. I imagine this still sounds silly to you, but I'm not sure why. My guess is that you think natural selection isn't relevantly similar because it didn't deliberately plan to allocate resources as part of a long bet that it would pay off big.

paulfchristiano
I think natural selection has lots of similarities to R&D, but (i) there are lots of ways of drawing the analogy, (ii) some important features of R&D are missing in evolution, including some really important ones for fast takeoff arguments (like the existence of actors who think ahead). If someone wants to spell out why they think evolution of hominids means takeoff is fast then I'm usually happy to explain why I disagree with their particular analogy. I think this happens in the next discord log between me and Eliezer.

Inevitably, you can go back afterwards and claim it wasn't really a surprise in terms of the abstractions that seem so clear and obvious now, but I think it was surprising then

It seems like you are saying that there is some measure that was continuous all along, but that it's not obvious in advance which measure was continuous. That seems to suggest that there are a bunch of plausible measures you could suggest in advance, and lots of interesting action will be from changes that are discontinuous changes on some of those measures. Is that right?

If so, don't we get out a ton of predictions? Like, for every particular line someone thinks might be smooth, the gradualist has a higher probability on it being smooth than you would? So why can't I just start naming some smooth lines (like any of the things I listed in the grandparent) and then we can play ball?

If not, what's your position? Is it that you literally can't think of the possible abstractions that would later make the graph smooth? (This sounds insane to me.)

I disagree that this is a meaningful forecasting track record.  Massive degrees of freedom, and the mentioned events seem unresolvable, and it's highly ambiguous how these things particularly prove the degree of error unless they were properly disambiguated in advance.  Log score or it didn't happen.

(Slightly edited to try and sound less snarky)
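
For readers unfamiliar with the term, a log score is one standard way to grade probabilistic forecasts: each forecast earns the logarithm of the probability it assigned to what actually happened, so scores are at most zero and closer to zero is better. The sketch below is illustrative only. The function name `mean_log_score` and both sets of forecasts are hypothetical, not anyone's actual record.

```python
import math


def mean_log_score(forecasts):
    """Average log score over (probability_of_yes, outcome) pairs.

    Probabilities must be strictly between 0 and 1; a forecast of
    exactly 0 or 1 that turns out wrong would score negative infinity.
    """
    total = 0.0
    for p_yes, happened in forecasts:
        p_assigned = p_yes if happened else 1.0 - p_yes
        total += math.log(p_assigned)
    return total / len(forecasts)


# Hypothetical forecasts, invented purely for illustration.
committed = [(0.9, True), (0.8, True), (0.7, False)]  # takes clear positions
hedged = [(0.5, True), (0.5, True), (0.5, False)]     # always says 50%

print("committed:", round(mean_log_score(committed), 3))
print("hedged:   ", round(mean_log_score(hedged), 3))
```

The score only exists for claims stated in advance as probabilities on resolvable events, which is the property the comment above is asking for.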

I want to register a gripe, re your follow-up post: when Eliezer says that he, Demis Hassabis, and Dario Amodei have a good "track record" because of their qualitative prediction successes, you object that the phrase "track record" should be reserved for things like Metaculus forecasts.

But when Ben Garfinkel says that Eliezer has a bad "track record" because he made various qualitative predictions Ben disagrees with, you slam the retweet button.

I already thought this narrowing of the term "track record" was weird. If you're saying that we shouldn't count Linus Pauling's achievements in chemistry, or his bad arguments for Vitamin C megadosing, as part of Pauling's "track record", because they aren't full probability distributions over concrete future events, then I worry a lot that this new word usage will cause confusion and lend itself to misuse. As long as it's used even-handedly, though, it's ultimately just a word.

(On my model, the main consequence of this is just that "track records" matter a lot less, because they become a much smaller slice of the evidence we have about a lot of people's epistemics, expertise, etc.)

But if you're going to complain about "track record" talk wh... (read more)

Jotto999
I see what you're saying, but it looks like you're strawmanning me yet again with a more extreme version of my position.  You've done that several times and you need to stop that. What you've argued here prevents me from questioning the forecasting performance of every pundit who I can't formally score, which is ~all of them. Yes, it's not a real forecasting track record unless it meets the sort of criteria that are fairly well understood in Tetlockian research.  And neither is Ben Garfinkel's post, that doesn't give us a forecasting track record, like on Metaculus. But if a non-track-recorded person suggests they've been doing a good job anticipating things, it's quite reasonable to point out non-scorable things they said that seem incorrect, even with no way to score it. In an earlier draft of my essay, I considered getting into bets he's made (several of which he's lost). I ended up not including those things.  Partly because my focus was waning and it was more attainable to stick to the meta-level point.  And partly because I thought the essay might be better if it was more focused.  I don't think there is literally zero information about his forecasting performance (that's not plausible), but it seemed like it would be more of a distraction from my epistemic point.  Bets are not as informative as Metaculus-style forecasts, but they are better than nothing.  This stuff is a spectrum, even Metaculus doesn't retain some kinds of information about the forecaster.  Still, I didn't get into it, though I could have. But I ended up later editing in a link to one of Paul's comments, where he describes some reasons that Robin looks pretty bad in hindsight, but also includes several things Eliezer said that seem quite off.  None of those are scorable.  But I added in a link to that, because Eliezer explicitly claimed he came across better in that debate, which overall he may have, but it's actually more mixed than that, and that's relevant to my meta-point that one c
Rob Bensinger
IMO that's a much more defensible position, and is what the discussion should have initially focused on. From my perspective, the way the debate largely went is:

  • Jotto: Eliezer claims to have a relatively successful forecasting track record, along with Dario and Demis; but this is clearly dissembling, because a forecasting track record needs to look like a long series of Metaculus predictions.
  • Other people: (repeat without qualification the claim that Eliezer is falsely claiming to have a "forecasting track record"; simultaneously claims that Eliezer has a subpar "forecasting track record", based on evidence that wouldn't meet Jotto's stated bar)
  • Jotto: (signal-boosts the inconsistent claims other people are making, without noting that this is equivocating between two senses of "track record" and therefore selectively applying two different standards)
  • Rob B: (gripes and complains)

Whereas the way the debate should have gone is:

  • Jotto: I personally disagree with Eliezer that the AI Foom debate is easy to understand and cash out into rough predictions about how the field has progressed since 2009, or how it is likely to progress in the future. Also, I wish that all of Eliezer, Robin, Demis, Dario, and Paul had made way more Metaculus-style forecasts back in 2010, so it would be easier to compare their prediction performance. I find it frustrating that nobody did this, and think we should start doing this way more now. Also, I think this sharper comparison would probably have shown that Eliezer is significantly worse at thinking about this topic than Paul, and maybe than Robin, Demis, and Dario.
  • Rob B: I disagree with your last sentence, and I disagree quantitatively that stuff like the Foom debate is as hard-to-interpret as you suggest. But I otherwise agree with you, and think it would have been useful if the circa-2010 discussions had included more explicit probability distributions, scenario breakdowns, quantitative estimates, etc. (suitably fl
4Rob Bensinger
Plus maybe some disagreements about how possible it is in general to form good models of people and of topics like AGI in the absence of Metaculus-ish forecasts, and disagreement about exactly how informative it would be to have a hundred examples of narrow-AI benchmark predictions over the last ten years from all the influential EAs? (I think it would be useful, but more like '1% to 10% of the overall evidence for weighing people's reasoning and correctness about AGI', not '90% to 100% of the evidence'.) (An exception would be if, e.g., it turned out that ML progress is way more predictable than Eliezer or I believe. ML's predictability is a genuine crux for us, so seeing someone else do amazing at this prediction task for a bunch of years, with foresight rather than hindsight, would genuinely update us a bunch. But we don't expect to learn much from Eliezer or Rob trying to predict stuff, because while someone else may have secret insight that lets them predict the future of narrow-AI advances very narrowly, we are pretty sure we don't know how to do that.)
9Rob Bensinger
Part of what I object to is that you're a Metaculus radical, whose Twitter bio says "Replace opinions with forecasts." This is a view almost no one in the field currently agrees with or tries to live up to. Which is fine, on its own. I like radicals, and want to hear their views argued for and hashed out in conversation. But then you selectively accuse Eliezer of lying about having a "track record", without noting how many other people are also expressing non-forecast "opinions" (and updating on these), and while using language in ways that make it sound like Eliezer is doing something more unusual than he is, and making it sound like your critique is more independent of your nonstandard views on track records and "opinions" than it actually is. That's the part that bugs me. If you have an extreme proposal for changing EA's norms, argue for that proposal. Don't just selectively take potshots at views or people you dislike more, while going easy on everyone else.
4Matthew Barnett
I think Jotto has argued for the proposal in the past. Whether he did it in that particular comment is not very important, so long as he holds everyone to the same standards. As for his standards: I think he sees Eliezer as an easy target because he’s high status in this community and has explicitly said that he thinks his track record is good (in fact, better than other people). On its own, therefore, it’s not surprising that Eliezer would get singled out.
0Jotto999
I no longer see exchanges with you as a good use of energy, unless you're able to describe some of the strawmanning of me you've done and come clean about that. EDIT: Since this is being downvoted, here is a comment chain where Rob Bensinger interpreted me in ways that are bizarre, such as suggesting that I think Eliezer is saying he has "a crystal ball", or that "if you record any prediction anywhere other than Metaculus (that doesn't have similarly good tools for representing probability distributions), you're a con artist". These are things that sound thematically similar to what I was saying, but were weird, persistent extremes that I don't see as good-faith readings of me. It kept happening over Twitter, then again on LW. At no point have I felt he's trying to understand what I actually think. So I don't see the point of continuing with him.
-5Ege Erdil

BTW, a few days ago Eliezer made a specific prediction that is perhaps relevant to your discussion:

I [would very tentatively guess that] AGI to kill everyone before self-driving cars are commercialized

(I suppose Eliezer is talking about Level 5 autonomy cars here).

Maybe a bet like this could work:

At least one month will elapse after the first Level 5 autonomy car hits the road, without AGI killing everyone

"Level 5 autonomy" could be further specified to avoid ambiguities. For example, like this:

The car must be publicly accessible (e.g. available for purchase, or as a taxi etc). The car should be able to drive from some East Coast city to some West Coast city by itself. 

4Eliezer Yudkowsky
Once you can buy a self-driving car, the thing that Paul predicts with surety and that I shrug about has already happened. If it does happen, my model says very little about remaining timeline from there one way or another. It shrugs again and says, "Guess that's how difficult the AI problem and regulatory problem were."

sort of person who gets taken in by Hanson's arguments in 2008 and gets caught flatfooted by AlphaGo and GPT-3 and AlphaFold 2

I find this kind of bluster pretty frustrating and condescending. I also feel like the implication is just wrong---if Eliezer and I disagree, I'd guess it's because he's worse at predicting ML progress. To me GPT-3 feels much (much) closer to my mainline than to Eliezer's, and AlphaGo is very unsurprising. But it's hard to say who was actually "caught flatfooted" unless we are willing to state some of these predictions in advance.

I got pulled into this interaction because I wanted to get Eliezer to make some real predictions, on the record, so that we could have a better version of this discussion in 5 years rather than continuing to both say "yeah, in hindsight this looks like evidence for my view." I apologize if my tone (both in that discussion and in this comment) is a bit frustrated.

It currently feels from the inside like I'm holding the epistemic high ground on this point, though I expect Eliezer disagrees strongly:

  • I'm willing to bet on anything Eliezer wants, or to propose my own questions if Eliezer is willing in principle to make forecasts. I expect
...

I wish to acknowledge this frustration, and state generally that I think Paul Christiano occupies a distinct and more clueful class than a lot of, like, early EAs who mm-hmmmed along with Robin Hanson on AI - I wouldn't put, eg, Dario Amodei in that class either, though we disagree about other things.

But again, Paul, it's not enough to say that you weren't surprised by GPT-2/3 in retrospect, it kinda is important to say it in advance, ideally where other people can see?  Dario picks up some credit for GPT-2/3 because he clearly called it in advance.  You don't need to find exact disagreements with me to start going on the record as a forecaster, if you think the course of the future is generally narrower than my own guesses - if you think that trends stay on course, where I shrug and say that they might stay on course or break.  (Except that of course in hindsight somebody will always be able to draw a straight-line graph, once they know which graph to draw, so my statement "it might stay on trend or maybe break" applies only to graphs extrapolating into what is currently the future.)

Suppose your view is "crazy stuff happens all the time" and my view is "crazy stuff happens rarely." (Of course "crazy" is my word, to you it's just normal stuff.) Then what am I supposed to do, in your game?

More broadly: if you aren't making bold predictions about the future, why do you think that other people will? (My predictions all feel boring to me.) And if you do have bold predictions, can we talk about some of them instead?

It seems to me like I want you to say "well I think 20% chance something crazy happens here" and I say "nah, that's more like 5%" and then we batch up 5 of those and when none of them happen I get a bayes point.
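A minimal sketch of the "bayes point" arithmetic in that proposal, assuming the batched events are independent and that each forecaster gives every event the same probability (neither assumption is stated in the comment; the function names are hypothetical):

```python
import math

def log_likelihood(p, outcomes):
    """Log-probability that a forecaster who assigns probability p to
    each event gives to the observed list of outcomes (True = happened)."""
    return sum(math.log(p if happened else 1.0 - p) for happened in outcomes)

def bayes_factor(p_a, p_b, outcomes):
    """Likelihood ratio favoring forecaster A over forecaster B.
    Assumes independent events (an illustrative simplification)."""
    return math.exp(log_likelihood(p_a, outcomes) - log_likelihood(p_b, outcomes))

# Five "crazy" events at 5% vs. 20%, and none of them happen.
ratio_none = bayes_factor(0.05, 0.20, [False] * 5)

# Same batch, but exactly one of the five happens.
ratio_one = bayes_factor(0.05, 0.20, [True] + [False] * 4)

print(f"none happen: {ratio_none:.2f} to 1 for the 5% forecaster")
print(f"one happens: {ratio_one:.2f} to 1 for the 5% forecaster")
```

Under these assumptions, five misses favor the 5% forecaster by (0.95/0.80)^5, roughly 2.4 to 1, while a single hit among the five flips the ratio to roughly 2 to 1 in favor of the 20% forecaster. The comparison only works because both probabilities are on the record beforehand.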

I could just give my forecast. But then if I observe that 2/20 of them happen, how exactly does that help me in figuring out whether I should be paying more attention to your views (or help you snap out of it)?

I can list some particular past bets and future forecasts, but it's really unclear what to do with them without quantitative numbers or a point of comparison.

Like you I've predicted that AI is undervalued and will grow in importance, although I think I made a much more specific prediction that investment in AI would go up a lot in the short t...

I predict that people will explicitly collect much larger datasets of human behavior as the economic stakes rise. This is in contrast to e.g. theorem-proving working well, although I think that theorem-proving may end up being an important bellwether because it allows you to assess the capabilities of large models without multi-billion-dollar investments in training infrastructure.

Well, it sounds like I might be more bullish than you on theorem-proving, possibly.  Not on it being useful or profitable, but in terms of underlying technology making progress on non-profitable amazing demo feats, maybe I'm more bullish on theorem-proving than you are?  Is there anything you think it shouldn't be able to do in the next 5 years?

I'm going to make predictions by drawing straight-ish lines through metrics like the ones in the gpt-f paper. Big unknowns are then (i) how many orders of magnitude of "low-hanging fruit" are there before theorem-proving even catches up to the rest of NLP? (ii) how hard their benchmarks are compared to other tasks we care about. On (i) my guess is maybe 2? On (ii) my guess is "they are pretty easy" / "humans are pretty bad at these tasks," but it's somewhat harder to quantify. If you think your methodology is different from that then we will probably end up disagreeing.

Looking towards more ambitious benchmarks, I think that the IMO grand challenge is currently significantly more than 5 years away. In 5 years' time my median guess (with almost no thinking about it) is that automated solvers can do 10% of non-geometry, non-3-variable-inequality IMO shortlist problems.

So yeah, I'm happy to play ball in this area, and I expect my predictions to be somewhat more right than yours after the dust settles. Is there some way of measuring such that you are willing to state any prediction?

(I still feel like I'm basically looking for any predictions at all beyond sometimes saying "my model ...

I have a sense that there's a lot of latent potential for theorem-proving to advance if more energy gets thrown at it, in part because current algorithms seem a bit weird to me - that we are waiting on the equivalent of neural MCTS as an enabler for AlphaGo, not just a bigger investment, though of course the key trick could already have been published in any of a thousand papers I haven't read.  I feel like I "would not be surprised at all" if we get a bunch of shocking headlines in 2023 about theorem-proving problems falling, after which the IMO challenge falls in 2024 - though of course, as events like this lie in the Future, they are very hard to predict.

Can you say more about why or whether you would, in this case, say that this was an un-Paulian set of events?  As I have trouble manipulating my Paul model, it does not exclude Paul saying, "Ah, yes, well, they were using 700M models in that paper, so if you jump to 70B, of course the IMO grand challenge could fall; there wasn't a lot of money there."  Though I haven't even glanced at any metrics here, let alone metrics that the IMO grand challenge could be plotted on, so if smooth metrics rule out IMO in 5yrs, I am more interested yet - it legit decrements my belief, but not nearly as much as I imagine it would decrement yours.

(Edit:  Also, on the meta-level, is this, like, anywhere at all near the sort of thing you were hoping to hear from me?  Am I now being a better epistemic citizen, if maybe not a good one by your lights?)