Eliezer, I understand the logic of what you are saying. If AI is an existential threat, then only FriendlyAI can save us. Since any self-improving AI can quickly become unstoppable, FriendlyAI must be built first and deployed the moment it is ready. The team that built it would in fact have a moral imperative to deploy it without risking consultation with anyone else.
I assume you also understand where I'm coming from. Out here in the "normal" world, you sound like a zealot who would destroy the human race in order to save it. Anyone who has implemented a large software project would laugh at the idea of coming up with a provably correct meta-goal, stable under all possible evolutions of an AI, itself implemented provably correctly.
The idea of a goal (or even a meta-goal) that we can all agree on strikes me as absurd. The idea of hitting the start button on something that could destroy the human race, based on nothing more than pages of logic, would be considered ridiculous by practically every member of the human race.
I understand if you think you are right about all of this, and don't need to listen to or even react to criticism. In that case, why do you blog? Why do you waste your time answering comments? Why aren't you out working on FriendlyAI for as many hours as you can manage?
And if this is an existential threat, are the Luddites right? Wouldn't the best tactic for extending the life of the human race be to kill all AI and nanotech researchers?
Tim, there are neural simulation projects underway already. I think there are a large number of nerds who would consider becoming uploads. I don't see why you think this makes no sense. And when you say "once we have AI", what do you mean? AI covers a lot of territory. Do you just mean some little problem-solving box, or what?
Moshe Gurvich, thanks for the encouragement. I can never decide if my problem is depression as a disease, or just a reaction to my particular life circumstances.
There are people who recommend purely cognitive approaches to depression, including a lot of self-monitoring. Finding a project that engages you, so that you don't dwell on your depression, is a different approach, although also purely cognitive.
My point on the original post, though, was that you might naively assume that people would be scared of self-modification. But then you see people using Prozac without a second thought. More commonly, alcohol and other mood-altering substances are used. So perhaps we aren't frightened of self-modification after all.
As Eliezer implies, that would make us even more unstable as uploads with direct self-modification abilities.
I'm depressed about the coming end of the human race. Got a solution for that? :-)
Eliezer, I'm aware of nanotech. And I know you think the human race is obsolete when AI comes along. And I also think that you might be right, and that people like you might have the power to make it so.
But I also believe that if the rest of the human race really thought that was a possibility, you'd be burned at the stake.
Do you have any regard for the opinions of humanity at all? If you were in the position of having an AI in front of you, that you had convinced yourself was friendly, would you let it out of the box without bothering to consult anyone else?
So do you think it's possible to deal with depression by thinking "oh, just ignore that mood. It's just a defective portion of my brain speaking"?
Or is the act of getting an antidepressant med basically acting on the desire to change your own brain?
What does it say about our regard for self and willingness to change our mental structure that so many people take antidepressants? If we were uploaded, would we freely modify our minds, or fear losing ourselves in the process?
I forgot I posted over here the other day, and so I didn't check back. For anyone still reading this thread, here's a bit of an email exchange I had on this subject. I'd really like a "FriendlyAI scenarios" thread.
From the few sentences I read on CEV, you are basically saying “I don’t know what I want or what the human race wants, but here I have a superintelligent AI. Let’s ask it!” This is clever, even if it means the solution is completely unknown at this point. Still, there are problems. I envision this as a two-step process. First, you ask the AI “what feasible future do I want?” and then you implement it. In practice, this means what you are really asking is “tell me a story so convincing, I will give you the power to implement it.” I’m not sure that’s wise, unless you really trust the AI!
Still, suppose this is done in good faith. You still have to convince the world that this is the right solution, and that the AI can be trusted to implement it. Or, the AI development group could just become convinced and force the solution on the human race without agreement. This is one of the “see if the AI can talk itself out of the box” setups.
Even if you did have a solution so persuasive that the world agrees to implement it (and thereby give up control of its own future), I can see some options here as to how the AI proceeds.
Option A) The AI reads human literature, movies, TV, documentaries, examines human brains, watches humans interact, etc., and comes up with a theory of human motivation, which it uses to produce a solution: the optimum feasible world for human beings.
Option B) The AI uploads a sample of the human race, then runs them (reinitializing each time) through various scenario worlds. It would evolve a scenario world that the majority of the uploads could live with. This is the solution.
Option C) The AI uploads a sample and then upgrades them to have power equivalent to its own. It then asks these human-derived AIs to solve the problem. This seems the most problematic of the solution techniques, since there would be many possible versions of an upgraded human mind, and deciding which one to create is a value judgment that strongly affects the outcome. For example, it could give one uploaded copy of you artistic talent and another mathematical talent. The two versions of you might then think very differently about the next upgrade step, with the artist asking for verbal skills and the mathematician asking for musical talents. After many iterations, you would end up with two completely different minds with different values, based on the upgrade path taken.
All of these require a superintelligent AI, which as we know, is a dangerous thing to create. It seems to me you are saying “let’s take a horrible risk, then ask this question in order to prevent something horrible from happening.” Or in other words, to create a Friendly AI, you are requiring us to create a possibly Unfriendly AI first.
I also don’t find any of this convincing without at least one plausible answer to the “what does the human race want” question. If we don’t have any idea of that answer, I find it unlikely that the AI would come up with something we’d find satisfactory. It might come up with a true answer, but not one that we would agree with, if we don’t have any starting point. More on that below.
What’s more, an AI of this power could just create an upload. I personally think that an upload is the best version of Friendly AI we are going to come up with. As has been noted, the space of all possible intelligence is probably very large, with all possible human intelligence a small blob in this space. Human intelligence varies a lot, from artists and scholars and saints to serial killers and dictators and religious fanatics. By definition, the space of all intelligence varies even more. Scary versions of AI are easy to come up with, but think of bizarre ones as well. For example, an “artistic” AI that just creates and destroys “interesting” versions of the human race, as a way of expressing itself.
You could consider the software we write already as a point in this intelligence space. We know what that sort of rule-based intelligence is like. It’s brittle, unstable and unpredictable in changed circumstances. We don’t want an AI with any of those characteristics. I think they come from the way we do engineering though, so I would expect any human-designed AI to share them.
An upload has advantages over a designed AI. We know a lot about human minds, including how they fail. We are used to dealing with humans and detecting lies or insanity. We can compare the upload with the original to see if the simulation is working properly. We know how to communicate with the upload, and know that it solves problems and sees the world the same way we do. The “tile the world with smiley faces” problem is reduced.
If we had uploads, we would have a more natural path to Friendly AI. We could upload selected individuals, run them through scenarios at an accelerated pace, and see what happens. We could do the same with uploaded communities. We know they don’t have superintelligent capabilities like we fear a self-improving AI might. It would be easier to build confidence that the AI really was friendly, especially since there would be versions of the same people in both the outside world and inside the simulations. As we gradually turned up the clock, these AIs would become more and more capable of handling research questions. At some point, they would come to dominate research and government, since they would simply think faster. It’s not necessarily a rapid launch scenario. In other words, just “weakly godlike uploads” to produce your Friendly AI. This is not that different from your CEV approach.
It’s been argued that since uploads are so complex, there will inevitably be designed AI before uploads. It might even require a very competent AI to do the upload. Still, computer technology is advancing so rapidly, it might only be a few years between the point where hardware could support a powerful designed AI, and the time when uploads are possible. There might not actually be enough time between those two points to design and test a powerful AI. In that case, simulating brain tissue might be the quickest path to AI, if it takes less time than designing AI from scratch.
When I mentioned that the human race could survive as uploads, I was thinking of a comment in one of the Singularity pieces. It said something like “the AI doesn’t have to be unfriendly. It could just have a better use for the atoms that make up your body.” The idea is that the AI would convert the mass of the earth into processors, destroying humanity unintentionally. But, an AI that capable could also simulate all of humanity in upload form with a tiny fraction of its capabilities. It’s odd to think of it that way, but simulating all human minds really would be a trivial byproduct of the Singularity. Perhaps by insisting that the biological human race have a future (and hence, that Earth be preserved), we are just thinking too small.
Finally, I want to make some comments about possible human futures. You mentioned the “sysop scenario”, which sounds like “just don’t allow people to hurt one another and things will be fine.” But this obviously isn’t enough. Will people be able to starve one another? If not, do people just keep living without food? Will people be able to imprison one another? If not, does the sysop just make jails break open? What does this mean for organizing society, if you can’t really punish anyone? If there are no consequences for obnoxious behavior? (maybe it all ends up looking like blog comments... :-)
I also think this doesn’t solve the main problem. As long as humanity is basically unchanged, it will continue to invent things, including dangerous things like AI. If you want a biological humanity living on a real Earth, and you want it not to go extinct, either by self destruction, or by transhumanism, then you have to change humanity. Technological humanity just isn’t stable in the long run.
I think that means removing the tiny percentage of humans who do serious technology. It’s easy to imagine a world of humans, unchanged in any important respect, who just don’t have advanced mathematical ability. They can do all the trial-and-error engineering they want, and live in a world as complex as anything the 18th or 19th century produced, but they can’t have Newtons or Einsteins, no calculus or quantum mechanics. A creature capable of those things would eventually create AI and destroy/change itself. I think that any goal which includes “preserve the human race” must also include “don’t let them change themselves or invent AI.” And that means “no nerds.”
Naturally, whenever I mention this to nerds, they are horrified. What, they ask, is the point of a world like that, where technical progress is impossible? But, I would argue that our human minds will eventually hit some limit anyway, even if we don’t create a Singularity. And I would also argue that for the vast majority of human history, people have lived without 20th-century style technical progress. There’s also no reason why the world can’t improve itself considerably just experimenting with political and economic systems. Technology might help reduce world poverty, but it could also worsen it (think robotics causing unemployment.) And there are other things that could reduce world poverty as well, like better governments.
There's that old quote: "never let your sense of morality keep you from doing what you know is right."
I'd still like an answer to the most basic Friendly AI question: what do you want it to do? Forget the implementation problems for a second, and just give me a scenario where the AI is doing what you want it to do. What does that world look like? Because I don't even know what I want from that future.
Here's a doubt for you: I'm a nerd, I like nerds, I've worked on technology, and I've loved techie projects since I was a kid. Grew up on SF, all of that.
My problem lately is that I can't take Friendly AI arguments seriously. I do think AI is possible, and that we will invent it. I do think that at some point in the next few hundred years, it will be game over for the human race. We will be replaced and/or transformed.
I kind of like the human race! And I'm forced to conclude that a human race without that tiny fraction of nerds could last a good long time yet (tens of thousands of years) and would change only slowly, through biological evolution. They would not do much technology, since it takes nerds (in the broadest sense) to do this. But, they would still have fulfilling, human, lives.
On the other hand, I don't think a human race with nerds can forever avoid inventing a self-destructive technology like AI. Much as I have been brought up to think of politicians and generals as destroyers, and scientists and other nerds as creators, I have to admit that it's ultimately the other way around.
The non-nerds can't destroy the human race. Only we nerds can do that.
That's my particular crisis of faith. Care to take a side?
Tim, do we have any idea what is required for uploads? Do we have any idea what is required for AGI? How can you make those comparisons?
If we thin-section and scan a frozen brain, the result is an immense amount of data, but at least potentially it captures everything you need to know about that brain. This is a solvable technological problem. If we understand neurons well enough, we can simulate that mapped brain. Again, that's just a matter of compute power. I'm sure there's a huge distance from a simulated scan to a functional virtual human, but it doesn't strike me as impossible. Are we really farther from doing that than from building a FriendlyAI from first principles?
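To put some very rough numbers on "immense amount of data" and "a matter of compute power", here's a back-of-envelope sketch in Python. Every constant in it is my own ballpark assumption (scan resolution, neuron and synapse counts, per-synapse cost), not a figure from this thread, and the real values could easily be off by orders of magnitude:

```python
# Back-of-envelope estimate of the data and compute involved in scanning
# and simulating a whole human brain. All constants below are assumptions
# chosen for illustration only.

BRAIN_VOLUME_M3 = 1.3e-3                    # ~1.3 liters of brain tissue (assumed)
VOXEL_VOLUME_M3 = 10e-9 * 10e-9 * 30e-9     # ~10 x 10 x 30 nm scan voxel (assumed)
BYTES_PER_VOXEL = 1                         # 1 byte of grayscale per voxel (assumed)

NEURONS = 8.6e10                            # ~86 billion neurons (common ballpark)
SYNAPSES_PER_NEURON = 1e4                   # ~10,000 synapses per neuron (assumed)
MEAN_FIRING_HZ = 1.0                        # average firing rate (assumed)
OPS_PER_SYNAPSE_EVENT = 10                  # arithmetic per synaptic event (assumed)

def scan_data_bytes():
    """Raw image data produced by slicing and imaging the whole brain."""
    voxels = BRAIN_VOLUME_M3 / VOXEL_VOLUME_M3
    return voxels * BYTES_PER_VOXEL

def simulation_ops_per_second():
    """Crude estimate of compute needed to run the mapped brain in real time."""
    synapses = NEURONS * SYNAPSES_PER_NEURON
    return synapses * MEAN_FIRING_HZ * OPS_PER_SYNAPSE_EVENT

if __name__ == "__main__":
    print(f"Raw scan data: ~{scan_data_bytes():.1e} bytes")
    print(f"Real-time simulation: ~{simulation_ops_per_second():.1e} ops/sec")
```

Under those assumptions the raw scan lands in the hundreds-of-exabytes range and real-time simulation somewhere around 10^16 operations per second: enormous, but the kind of target that steady hardware progress eventually reaches, which is the point of the comparison.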
Nick, what I'd like to see in order to take this FriendlyAI concept seriously is some scenario, even with a lot of hand-waving, of how it would work and what kind of results it would produce. All I've seen in a year of lurking on this board is very abstract and high-level.
I don't take FriendlyAI seriously because I think it's the wrong idea, from start to finish. There is no common goal that we would agree on. Any high-level moral goal is going to be impossible to state with mathematical precision. Any implementation of an AI that tries to satisfy that goal will be too complex to prove correct. It's a mirage.
Eliezer writes: "[FAI] computes a metamoral question, looking for reflective equilibria of your current inconsistent and unknowledgeable self; something along the lines of 'What would you ask me to do if you knew what I know and thought as fast as I do?'" This strikes me as a clever dodge of the question. As I put it in my post, “I don’t know what I want or what the human race wants, but here I have a superintelligent AI. Let’s ask it!” It just adds another layer of opacity to the entire question.
If this is your metagoal, you are prepared to activate a possibly unfriendly AI with absolutely no notion of what it would actually do. What kind of "proof" could you possibly construct that would show this AI will act the way you want it to, when you don't even know how you want it to act?
I fall back to the view that Eliezer has actually stated, that the space of all possible intelligences is much larger than the space of human intelligences. That most "points" in that space would be incomprehensible or insane by human standards. And so I think the only solution is some kind of upload society, one that can be examined more effectively by ordinary humans. One that can work with us and gain trust. Ordinary human minds in simulation, not self-modifying, and not accelerated. Once we've gotten used to that, we can gradually introduce faster human minds or modified human minds.
This all or nothing approach to FriendlyAI strikes me as a dead end.
This idea of writing off the human race, and assuming that some select team will just hit the button and change the world, like it or not, strikes me as morally bankrupt.