Crossposted from my Medium blog
Today’s computer viruses can copy themselves between computers, but they can’t mutate. That might change very soon, leading to a serious crisis.
Research on program synthesis is proceeding quickly, using neural reinforcement learning. With a similar approach, researchers in 2017 showed that a computer can teach itself to master chess and go, entirely from scratch.
The programs currently being generated are very small: dividing two numbers, reversing a string of letters, etc. It could be that there’s another chasm to cross before programs for more interesting tasks can be generated. But what if there’s not?
Consider the problem of automatically generating a program that can install itself on another computer on the network. To make this work, the statistical process generating the program needs some way to know which partial programs are most promising. Metaphorically, the “AI” needs some way to “know” when it’s “on the right track”.
Imagine you were the one generating this program, and you knew nothing except that it should be a sequence of characters. Even knowing nothing, if you could get feedback on each character, you’d quickly be able to get the result through sheer trial and error. You would just have to try each character one-by-one, keeping the character that performed best. If you only got feedback on each character pair, your life would be harder, but you’d still get there eventually. If feedback only came after each complete line, you probably wouldn’t get to the answer in a human lifetime.
The simple point here is that the task of generating malware automatically will be difficult if the “reward” is all-or-nothing — if we only know we’re on the right track once the program is fully successful. If partial success can be rewarded, it will be much easier to find the correct solution.
It might be that there are lots of obvious, intermediate rewards that make generating malware very easy. Exit without error, connection to the target IP, long and varied communication all seem like good signs that the program is on the right track.
If it turns out that writing a virus is much easier than mastering chess, we could be in a lot of trouble. Every time the virus infects another computer, the new host can be put to work improving and spreading the malware. The improvements apply to the process of generating the malware, rather than any single exploit. Once the attack gains momentum, it could be finding and exploiting new vulnerabilities faster than they can be patched. Vulnerabilities could be uncovered and exploited at any level, including low-level firmware.
People love getting into philosophical debates about the hard problem of consciousness, and whether "real" artificial intelligence is possible or likely. Debating these things is as fine a hobby as any, but we shouldn’t lose sight of the fact that mutating self-replicating systems don’t need anything like “general intelligence” to pose a huge threat.
For the first big AI safety crisis, think robot flu, not robot uprising.
It does seem like it might be a big jump from knowing how to write a program to a simple spec to independently discovering the latest processor bug, seeing its potential for malware, and writing that malware.
It doesn't seem like an AI-complete jump, though, just a big one.
Isn't it a big assumption that all computers are equally vulnerable?
I don't think I meant to imply that -- could you point out where I seem to be making that assumption?
Obviously there are more exploits for a computer running Windows 95 than a carefully firewalled Linux server.
First post on LW2, so apologies if I've not kept norms properly -- let me know if I should edit.
I considered doing this as a cross-post but it felt weird without rewriting, as the knowledge assumptions were all wrong --- so I decided to just link.
I'll note that I personally always prefer crossposts (and if the context/background info is non-obvious, maybe just include a disclaimer at the top about what assumptions the article makes). Also, having read it, it seemed pretty reasonable for the expected background knowledge of a LW reader.
In other news, welcome to the site. :)
Okay, I'll paste the content in. I think you're right -- a link post is pretty much strictly worse.