The scenario is simple: some unassuming programmer creates a DAO on a blockchain that is the seed AI, with the single purpose of gaining political, economic, and military power to create a new world order with the DAO at the top of the proverbial food chain. The question becomes: what do the bloggers/posters of LessWrong.org actually DO to stop the AI DAO?


Donald Hobson · Mar 12, 2022

An AI DAO is an interesting thing to specify.

The Ethereum blockchain as a whole contains a virtual machine running at about 350,000 instructions per second. In other words, even if someone very rich threw enough ether at their AI to outbid everyone else for gas, the AI would be running on a computer roughly 10,000x less powerful than a Raspberry Pi.
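A quick back-of-the-envelope check (the Raspberry Pi figure is my order-of-magnitude assumption, not a measured benchmark):

```python
evm_ips = 350_000  # whole-chain EVM throughput cited above
pi_ips = 3.5e9     # assumed: a Raspberry Pi manages a few billion simple ops/sec

print(f"{pi_ips / evm_ips:,.0f}x")  # -> 10,000x
```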

A blockchain replaces one computer doing an Add instruction with many computers all running cryptographic protocols to check that none of the other computers are cheating. It comes with one heck of a performance penalty. I would expect that making an AI run on that little compute is at the very least much harder than making an AGI that takes a more reasonable amount of compute.
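A toy sketch of where the penalty comes from (the validator count and hashing scheme here are illustrative, not how any particular chain works): in a replicated state machine, every validator re-executes every instruction and cross-checks the result, so useful throughput is total compute divided by the number of replicas.

```python
import hashlib

def apply_tx(state: int, tx: int) -> int:
    return state + tx  # the one "Add" of useful work

def replicated_add(state: int, tx: int, n_validators: int = 1000) -> int:
    # Every validator redoes the same Add, then publishes a digest
    # so the others can check nobody cheated.
    digests = {
        hashlib.sha256(str(apply_tx(state, tx)).encode()).hexdigest()
        for _ in range(n_validators)
    }
    assert len(digests) == 1  # consensus: everyone computed the same thing
    return apply_tx(state, tx)

# 1 instruction of useful work; ~1000 executions (plus hashing) spent on it.
print(replicated_add(0, 42))
```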

So let's say the AI is actually running on a desktop in the programmer's house. It's given unrestricted internet access. They might tell someone what they are planning to do, or what they have done. If the AI is smart and unaligned, it won't make its existence as an unaligned AI obvious, although there is a chance the AI will give its existence away while it's still fairly dumb. (Probably not: most things a dumb AGI can do online, a trolling human can do. Even if it went on LessWrong and asked "Hi, I'm a young dumb AI, how can I take over the world?", we still wouldn't realize it was an actual AI.)

So in this scenario, we probably don't get strong evidence that the AI exists until it is too late to do anything. Although it's possible that someone from here calls the developer and says "I'm concerned about the safety of your AI design, could you turn it off?" That might happen if the design was posted somewhere prominent. But in that case, someone else will run the same code next week.

What people like Eliezer are aiming for is a scenario where they (or someone who listened to them) make an AGI aligned to the best interests of humanity. Somehow or other, that AI stops anyone else making an AI. (And probably does a bunch of other things.) Nanomachines that melt all GPUs have been suggested.

[anonymous] · 2y

I specified a "Blockchain", and not Ethereum specifically. Assume we are using a 3rd-generation or higher blockchain, and that the Oracle problem has been solved. The heavy computation could be outsourced off the blockchain, and only minimal core circuits would run on the DAO. If the particular universe we inhabit is structured so that AI strength is proportional to computational power (and large language model scaling laws seem to suggest this is the case), then the war between friendly and unfriendly AI becomes a game where the first move wins. Once an unfriendly AI ... (read more)

Donald Hobson · 2y
If anyone who wants can do a bit of the heavy computation (and get paid in crypto), this opens a vulnerability: you can offer to do some of the work and return nonsense results (toy sketch below). Most AIs aren't put on the blockchain, because debugging becomes needlessly hard when cryptographic protocols make it slow and expensive to edit your code. And blockchain is basically the wrong tech anyway.

If the first AGI is unfriendly, then unless a friendly AI happens to be built, like, a few days later, yes, it is too late. (If several AGI projects are very close, it may come down to some mix of which has more compute, a more efficient algorithm, and being a day ahead.)

The unfriendly AI does whatever it wants. I don't think it would be bribing courts and politicians, because courts and politicians are kind of slow. Its plan is likely to be more:

1. Hack several big supercomputers, giving me plenty of compute and ensuring I won't be shut off.
2. Trick a bioresearch lab into making a particular DNA string and mixing it with various other ingredients.
3. Bootstrap self-replicating nanotech. Turn all the earth into more nanotech, spread across the universe.

If it can get self-replicating nanotech in a week, with no one having any idea the AI exists until it grey-goos Earth, what advantage does the AI hope to gain in courts? Whether or not this AI is in some way related to the blockchain is pretty irrelevant.
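The usual patch for that vulnerability is verification by redundancy: farm each task out to several independent workers and accept only the majority answer. A minimal sketch of both the scheme and the attack (all names and numbers are hypothetical); a lone cheater is outvoted, but with no replication, or with colluding cheaters, garbage gets paid for.

```python
import random

def outsource(task, workers, replicas=3):
    """Send the same task to several random workers; accept the majority
    answer. This makes cheating costly, at the price of paying for the
    same honest work several times over."""
    results = [w(task) for w in random.sample(workers, replicas)]
    return max(set(results), key=results.count)  # majority vote

honest = lambda x: x * x              # does the real work
cheater = lambda x: random.random()   # returns nonsense, still wants paying

pool = [honest, honest, honest, honest, cheater]
print(outsource(7, pool))              # 49 almost always
print(outsource(7, pool, replicas=1))  # one draw in five is garbage
```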
[anonymous] · 2y
How is the AGI going to evade detection after it hacks into the supercomputer and starts eating up all its cycles? What's going to stop the operators from simply air-gapping the supercomputer? And furthermore, grey goo is science fiction. There seems to be this notion on LessWrong that AGIs are magic and can make anything you can imagine real. I'm trying to present a somewhat realistic computer security concern where an autonomous program designed to cause havoc is run on a computer that can't be shut down, namely a blockchain-based machine.
Donald Hobson · 2y
Let's say the supercomputer is doing something like simulating the aerodynamics of new wind turbine designs. The AI finds a maths trick that lets it calculate the same thing with half the compute, and runs its own code on the other half, while hacking or spoofing any performance-monitoring tools.

Or maybe the AGI just pretends to be a human hacker doing cryptomining. It puts up a message saying "let us mine crypto for 1 week, and then this code will self-delete and all will be well. Try to remove this code and I brick your supercomputer."

Current supercomputers aren't in 100% air-gapped Faraday cages, because people want to get data on and off them. The researcher studying wind turbines likes to be able to test their code without having to physically go into the server room. Sure, if someone set up a supercomputer before the AI was made, and the only way to program it was to go into the server room and type on a keyboard directly connected to the supercomputer, no internet and no memory sticks, then this would be fairly secure against the AGI (in the early stages). It would also be practically unusable. You couldn't download your large datasets onto the supercomputer, and you couldn't load the latest version of numpy on there either.

Oh, that's why so many science authors write dystopian fiction. They make that particular form of dystopia science fiction, thus magically stopping it happening in reality. Why, if only enough authors had written sci-fi containing climate change early enough, that would have made climate change science fiction, forcing CO2 to not have any effect on global climate. (sarcasm)

Compared to monkeys or whatever, modern human tech seems pretty magic. I think that the limits of intelligence are really, really high. There are lots and lots of clever things that can be done if you only work out how.

https://www.lesswrong.com/posts/Jko7pt7MwwTBrfG3A/undiscriminating-skepticism
https://www.lesswrong.com/posts/XKcawbsB6Tj5e2QRK/is-molecular-na
[anonymous] · 2y
"The AI finds a maths trick that lets it calculate the same thing with half the compute," You are not taking into account Computational Complexity Theory. There are fundamental limitations on what computers can do. Mathematical operations have lower bounds. After a certain point, there are no more clever tricks to discover.
Donald Hobson · 2y
I agree that it is in principle possible for software to be as efficient as possible, for there to be no further maths tricks that speed it up. But:

1. There are a fair few maths tricks, including some that are pretty subtle. Often humans have been running one algorithm for years and researchers find a faster one (a classic example is sketched below). We have not run out of new tricks to discover yet, and have no particular reason to think we will before ASI.
2. There are many supercomputers running many tasks. The AI doesn't need to find a maths trick for fluid dynamics; it needs to find a maths trick for fluid dynamics, or bitcoin mining, or machine translation, or ... or any of the other tasks big computers are doing.
3. No one said the simulations needed to be perfect. The AI replaces the simulation with a faster but slightly worse one. It looks about the same to the humans watching their little animations. It would take years before the real wind turbine is built and found to be less efficient than predicted. And even then the humans will just blame lumpy bearings. (If the world hasn't been destroyed by this point.)
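The classic example for point 1 is Karatsuba multiplication: schoolbook long multiplication was used for centuries and widely assumed optimal, until a recursive trick found in 1960 cut the digit operations from $n^2$ to about $n^{1.585}$. A minimal sketch (illustrative, not production code):

```python
def karatsuba(x: int, y: int) -> int:
    """Multiply integers in ~n^1.585 digit operations instead of the
    schoolbook n^2, by trading one recursive multiply for a few adds."""
    if x < 10 or y < 10:
        return x * y
    n = max(x.bit_length(), y.bit_length()) // 2
    hi_x, lo_x = x >> n, x & ((1 << n) - 1)  # split x into high/low halves
    hi_y, lo_y = y >> n, y & ((1 << n) - 1)
    a = karatsuba(hi_x, hi_y)
    b = karatsuba(lo_x, lo_y)
    c = karatsuba(hi_x + lo_x, hi_y + lo_y) - a - b  # the saved multiply
    return (a << (2 * n)) + (c << n) + b

assert karatsuba(12345678, 87654321) == 12345678 * 87654321
```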