habryka

Running Lightcone Infrastructure, which runs LessWrong. You can reach me at habryka@lesswrong.com

Sequences

A Moderate Update to your Artificial Priors
A Moderate Update to your Organic Priors
Concepts in formal epistemology

Comments

habryka

Despite my general interest in open inquiry, I will avoid talking about my detailed hypothesis of how to construct such a virus. I am not confident this is worth the tradeoff, but the costs of speculating about the details here in public do seem non-trivial.

habryka

@Daniel Kokotajlo If you indeed avoided signing an NDA, would you be able to share how much money you passed up as a result? I might want to set a precedent here and maybe try to fundraise for some substantial fraction of it.

habryka

I have a lot of uncertainty about the difficulty of robotics, and the difficulty of e.g. designing superviruses or other ways to kill a lot of people. I do agree that in most worlds robotics will be solved to a human level before AI is capable of killing everyone, but I am generally really averse to unnecessarily constraining my hypothesis space when thinking about this kind of stuff.

>90% seems quite doable with a well-engineered virus (especially one with a long infectious incubation period). I think 99%+ is much harder and probably out of reach until after robotics is thoroughly solved, but my current guess is that a motivated team of humans could design a virus that kills 90%–95% of humanity.

habryka

The infrastructure necessary to run a datacenter or two is not that complicated. See these Gwern comments for some similar takes: 

In the world without us, electrical infrastructure would last quite a while, especially with no humans and their needs or wants to address. Most obviously, RTGs and solar panels will last indefinitely with no intervention, and nuclear power plants and hydroelectric plants can run for weeks or months autonomously. (If you believe otherwise, please provide sources for why you are sure about "soon after" - in fact, so sure about your power grid claims that you think this claim alone guarantees the AI failure story must be "pretty different" - and be more specific about how soon is "soon".)

And think a little bit harder about options available to superintelligent civilizations of AIs, instead of assuming they do the maximally dumb thing of crashing the grid and immediately dying... (I assure you any such AIs implementing that strategy will have spent a lot longer thinking about how to do it well than you have for your comment.)

Add in the capability to take over the Internet of Things and the shambolic state of embedded computers, which mean that the billions of AI instances & robots/drones can run the grid to a considerable degree and also do a more controlled shutdown than the maximally self-sabotaging approach of 'simply let it all crash without lifting a finger to do anything'. Add in the ability to stockpile energy in advance or build one's own facilities due to the economic value of AGI (how would that look much different than, say, Amazon's new multi-billion-dollar datacenter hooked up directly to a gigawatt nuclear power plant...? why would an AGI in that datacenter care about the rest of the American grid, never mind world power?). Put it all together, and the 'mutually assured destruction' thesis is on very shaky grounds.

And every day that passes right now, the more we succeed in various kinds of decentralization or decarbonization initiatives and the more we automate pre-AGI, the less true the thesis gets. The AGIs only need one working place to bootstrap from, and it's a big world, and there's a lot of solar panels and other stuff out there and more and more every day... (And also, of course, there are many scenarios where it is not 'kill all humans immediately', but they end in the same place.)

Would such a strategy be the AGIs' first best choice? Almost certainly not, any more than chemotherapy is your ideal option for dealing with cancer (as opposed to "don't get cancer in the first place"). But the option is definitely there. 

If there is an AI that is making decent software progress, then even if it doesn't have the ability to maintain all infrastructure, it would probably be able to develop new technologies and better robot controls over the course of a few months or years without needing any humans around.

habryka

Uploaded them both!

habryka

Of course, at some point, we'll eventually make sufficient progress in robotics that we can't rely on this safety guarantee

Why would "robotics" be the blocker? I think AIs can do a lot of stuff without needing much advancement in robotics. Convincing humans to do things is a totally sufficient API to have very large effects (e.g. it seems totally plausible to me you can have AI run country-sized companies without needing any progress in robotics).

habryka

There are two images you can provide for a sequence: the banner image and the card image. The card image is required for the sequence to show up in the Library.

habryka

At least Eliezer has been extremely clear that he is in favor of a stop, not a pause (indeed, that was the headline of his article "Pausing AI Developments Isn't Enough. We Need to Shut it All Down"), so I am confused why you list him with anything related to "pause".

My guess is that Eliezer and I are both in favor of a pause, but mostly because a pause seems like it would slow down AGI progress, not because the next 6 months in particular will be the most risky period.

habryka

Agree that if you include things that are not money, it starts being relatively central. I do think constraining it to money gets rid of a lot of the scenarios.
