AGI Ruin: A List of Lethalities

To start off, I don't see much point in formally betting $20 on an event conditioned on something I assign <<50% probability of happening within the next 30 years (a powerful AI is launched and fails catastrophically, we're both still alive to settle the bet, and the failure is unambiguously attributed to the AI). I mean sure, I can accept the bet, but largely because I don't believe it matters one way or another, so I don't think it counts from the epistemological virtue standpoint.

But I can state what I'd disagree with in your terms if I were to take it seriously, just to clarify my argument:

  1. Sounds good.
  2. Mostly sounds good, but I'd push back that "not actually running anything close to the dangerous limit" sounds like a win to me, even if theoretical research continues. One pretty straightforward Schelling point for a ban/moratorium on AGI research is "never train or run anything > X parameters", with X << dangerous level at the then-current paradigm. It may be easier to explain to the public and politicians than many other potential limits, and this is important. It's much easier to control too - checking that nobody collects and uses a gigashitton of GPUs [without supervision] is easier than checking every researcher's laptop. Additionally, we'll have nuclear weapons tests as a precedent.
  3. That's the core of my argument, really. If the consortium of 200 world experts says "this happened because your AI wasn't aligned, let's stop all AI research", then Facebook AI or China can tell the consortium to go fuck themselves, and I agree with your skepticism that it'd make all labs pause for even a month (see: gain of function research, covid). But if it becomes public knowledge that a catastrophe with 1mln casualties happened because of AI, then it can trigger a panic which will make both world leaders and the public really, honestly want to restrict this AI stuff, and it will both justify and enable the draconian measures required to make every lab actually stop the research. Similar to how panics about nuclear energy, terrorism and covid worked. I propose defining "public agreement" as "leaders of the relevant countries (defined as the countries housing the labs from p.1, so US, China, maybe UK and a couple of others) each issue a clear public statement saying that the catastrophe happened because of an unaligned AI". This is not an unreasonable ask, they were this unanimous about quite a few things, including vaccines.

AGI Ruin: A List of Lethalities

What Steven Byrnes said, but also my reading is that 1) in the current paradigm it's near-damn-impossible to build such an AI without creating an unaligned AI in the process (how else do you gradient-descend your way into a book on aligned AIs?) and 2) if you do make an unaligned AI powerful enough to write such a textbook, it'll probably proceed to convert the entire mass of the universe into textbooks, or do something similarly incompatible with human life.

AGI Ruin: A List of Lethalities

It might, given some luck and that all the pro-safety actors play their cards right. Assuming by "all labs" you mean "all labs developing AIs at or near the then-current limit of computational power", or something along those lines, and by "research" you mean "practical research", i.e. training and running models. The model I have in mind is not that everyone involved will intellectually agree that such research should be stopped, but that a large enough share of the public and governments will get scared and exert pressure on the labs. Consider how most of the world was able to (imperfectly) coordinate to slow Covid spread, or how nobody has prototyped a supersonic passenger jet in decades, or, again, nuclear energy - we as a species can do such things in principle, even though often for the wrong reasons.

I'm not informed enough to give meaningful probabilities on this, but to honor the tradition, I'd say that given a catastrophe with immediate, graphic death toll >=1mln happening in or near the developed world, I'd estimate >75% probability that ~all seriously dangerous activity will be stopped for at least a month, and >50% that it'll be stopped for at least a year. With the caveat that the catastrophe was unambiguously attributed to the AI, think "Fukushima was a nuclear explosion", not "Covid maybe sorta kinda plausibly escaped from the lab but well who knows".

AGI Ruin: A List of Lethalities

The important difference is that nuclear weapons are destructive because they worked exactly as intended, and the AI in this scenario is destructive because it failed horrendously. Plus, the concept of rogue AI has been firmly ingrained into public consciousness by now, which afaik was not the case with extremely destructive weapons in the 1940s [1]. So hopefully this will produce more public outrage (and scare among the elites themselves) => stricter external and internal limitations on all agents developing AIs. But in the end I agree, it'll only buy time, maybe a few decades if we are lucky, to solve the problem properly or to build saner political institutions.

  1. ^

    Yes, I'm sure there was a scifi novel or two before 1945 describing bombs of immense power. But I don't think any of them were anywhere near as widely known as Matrix or Terminator.

AGI Ruin: A List of Lethalities

How possible is it that a misaligned, narrowly-superhuman AI is launched, fails catastrophically with casualties in the 10^4 - 10^9 range, and the [remainder of] humanity is "scared straight" and from that moment onward treats the AI technology the way we treat nuclear technology now - i.e. effectively strangles it into stagnation with regulations - or even more conservatively? From my naive perspective it is somewhat plausible politically, based on the only example of ~world-destroying technology that we have today. And this list of arguments doesn't seem to rule out this possibility. Is there an independent argument by EY as to why this is not plausible technologically? I.e., why AIs narrow/weak enough to not be inevitably world-destroying but powerful enough to fail catastrophically are unlikely to be developed [soon enough]?

(To be clear, the above scenario is nothing like a path to victory and I'm not claiming it's very likely. More like a tiny remaining possibility for our world to survive.)

Why rationalists are not much concerned about mortality?

Yes and no. 1-6 are obviously necessary but not sufficient - there's much more to diet and exercise than "not too much" and "some" respectively. 7 and 8 are kinda minor and of dubious utility except in some narrow circumstances, so whatever. And 9 and 10 are hotly debated and that's exactly what you'd need rationality for, as well as for figuring out the right pattern of diet and exercise. And I mean right for each individual person, not in general, and the same with supplements - a 60-year-old should have much higher tolerance for potential risks of a longevity treatment than a 25yo, since the latter has less to gain and more to lose.

Why rationalists are not much concerned about mortality?

I would be very surprised if inflammation or loss of proteostasis did not have any effect on fascia, if only because they have a negative effect on ~everything. But more importantly, I don't think there's any significant number of people dying from fascia stiffness? That's one of the main ideas behind the hallmarks of aging, that you don't have to solve the entire problem in its every minuscule aspect at once. If you could just forestall all these hallmarks or even just some of them, you could probably increase lifespan and healthspan significantly, thus buying more time to fix other problems (or develop completely new approaches like mind uploading or regenerative medicine or whatever else).

Why rationalists are not much concerned about mortality?

You're fighting a strawman (nobody's going to deny death to anyone, and except for the seriously ill, most people who truly want to die now have an option to do so; myself I'm actually pro-euthanasia). And, once again, you want to inflict on literally everyone a fate you say you don't want for yourself. Also, I don't accept the premise that there's any innate power balance in the universe that we ought to uphold even at the cost of our lives, we do not inhabit a Marvel movie. And you're assuming knowledge which you can't possibly have, about exactly how human consciousness functions and what alterations to it we'll be able to make in the next centuries or millennia.

Why rationalists are not much concerned about mortality?

That's, like, 99.95% probability, a one-in-two-thousand chance. You'd have two orders of magnitude higher chances of survival if you were to literally shoot yourself with a literal gun. I'm not sure you can forecast anything at all (about humans or technologies) with this degree of certainty decades into the future, definitely not that every single one of dozens of attempts in a technology you're not an expert in will fail and every single one of hundreds of attempts in another technology you're not an expert in (building aligned AGI) will fail.
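
To make the arithmetic above concrete, here is a rough sketch. It assumes the attempts are independent, which is a simplification of mine, and the count of 24 attempts is a hypothetical stand-in for "dozens".

```python
from fractions import Fraction

# The stated confidence, as an exact fraction.
p_stated = Fraction(9995, 10000)      # 99.95%
p_remaining = 1 - p_stated            # what is left over: 1/2000

# "Two orders of magnitude higher" means multiplying by 100.
p_hundredfold = p_remaining * 100     # 1/20, i.e. 5%


def per_attempt_failure(p_all_fail: float, n_attempts: int) -> float:
    """Failure probability each attempt needs so that all n_attempts
    fail with probability p_all_fail (assuming independent attempts)."""
    return p_all_fail ** (1.0 / n_attempts)


# Hypothetical: 24 independent attempts standing in for "dozens".
needed = per_attempt_failure(float(p_stated), 24)
print(p_remaining, p_hundredfold, needed)
```

Under these assumptions, being 99.95% sure that all 24 attempts fail requires being more than 99.99% sure that each individual attempt fails.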

I don't believe there are any tradeoffs I can make which would give me a 50% chance to live to 300 years.

I don't believe it either, it's a thought experiment, I assumed it'd be obvious since it's a very common technique to estimate how much one should value low probabilities.

Why rationalists are not much concerned about mortality?

Equating high risk/high reward strategies with Pascal's Wager is a way too common failure mode, and putting numbers on your estimates helps to avoid it. How much is VERY TINY, how much do you think the best available options really cost, and how much would you be willing to pay (assuming you have that kind of money) for a 50% chance of living to 300 years?

To be clear, I'm not so much trying to convince you personally, as to get a generally better sense of the inferential distances involved.
