Artificial intelligence is getting smarter by leaps and bounds — within this century, research suggests, a computer AI could be as "smart" as a human being. And then, says Nick Bostrom, it will overtake us: "Machine intelligence is the last invention that humanity will ever need to make." A philosopher and technologist, Bostrom asks us to think hard about the world we're building right now, driven by thinking machines. Will our smart machines help to preserve humanity and our values — or will they have values of their own?

I realize this could have gone in a post in a media thread rather than its own topic, but it seems big enough, and likely enough to prompt discussion, to have its own thread.

I liked the talk, although it was less polished than TED talks often are. What was missing, I think, was any indication of how to solve the problem. He could be seen as just an ivory tower philosopher speculating on something that might be a problem one day: apart from mentioning at the beginning that he works with mathematicians and IT guys, he gives no real impression that this problem is already being actively worked on.


This is my first comment on LessWrong.

I just wrote a post replying to part of Bostrom's talk, but apparently I need 20 Karma points to post it, so... let it be a long comment instead:

Bostrom should modify his standard reply to the common "We'd just shut off / contain the AI" claim

In his most recent TED Talk, "What happens when our computers get smarter than we are?", Superintelligence author Prof. Nick Bostrom spends over two minutes replying to a common claim: that we could just shut off an AI, or preemptively contain it in a box, to prevent it from doing things we don't like, so there is no need to be too concerned about the possible future development of AI with misconceived or poorly specified goals:

Now you might say, if a computer starts sticking electrodes into people's faces, we'd just shut it off. A, this is not necessarily so easy to do if we've grown dependent on the system -- like, where is the off switch to the Internet? B, why haven't the chimpanzees flicked the off switch to humanity, or the Neanderthals? They certainly had reasons. We have an off switch, for example, right here. (Choking) The reason is that we are an intelligent adversary; we can anticipate threats and plan around them. But so could a superintelligent agent, and it would be much better at that than we are. The point is, we should not be confident that we have this under control here.

And we could try to make our job a little bit easier by, say, putting the A.I. in a box, like a secure software environment, a virtual reality simulation from which it cannot escape. But how confident can we be that the A.I. couldn't find a bug. Given that merely human hackers find bugs all the time, I'd say, probably not very confident. So we disconnect the ethernet cable to create an air gap, but again, like merely human hackers routinely transgress air gaps using social engineering. Right now, as I speak, I'm sure there is some employee out there somewhere who has been talked into handing out her account details by somebody claiming to be from the I.T. department.

More creative scenarios are also possible, like if you're the A.I., you can imagine wiggling electrodes around in your internal circuitry to create radio waves that you can use to communicate. Or maybe you could pretend to malfunction, and then when the programmers open you up to see what went wrong with you, they look at the source code -- Bam! -- the manipulation can take place. Or it could output the blueprint to a really nifty technology, and when we implement it, it has some surreptitious side effect that the A.I. had planned. The point here is that we should not be confident in our ability to keep a superintelligent genie locked up in its bottle forever. Sooner or later, it will out.

If I recall correctly, Bostrom has replied to this claim in this manner in several of the talks he has given. While what he says is correct, I think that there is a more important point he should also be making when replying to this claim.

The point is that even if containing an AI in a box so that it could not escape and cause damage were somehow feasible, it would still be incredibly important for us to determine how to create AI that shares our interests and values (friendly AI). And we would still have great reason to be concerned about the creation of unfriendly AI. This is because other people, such as terrorists, could still create an unfriendly AI and intentionally release it into the world to wreak havoc and potentially cause an existential catastrophe.

The idea that we need not worry too much about making AI friendly, because we could always contain the AI in a box until we knew it was safe to release, is confused. The main confusion is not that we couldn't actually contain it in the box. Rather, it is that the primary reason for wanting to figure out quickly how to make a friendly AI is so that we can make one before anyone else makes an unfriendly AI.

In his TED Talk, Bostrom continues:

I believe that the answer here is to figure out how to create superintelligent A.I. such that even if -- when -- it escapes, it is still safe because it is fundamentally on our side because it shares our values. I see no way around this difficult problem.

Bostrom could have strengthened his argument for the position that there is no way around this difficult problem by stating my point above.

That is, he could have pointed out that even if we somehow developed a reliable way to keep a superintelligent genie locked up in its bottle forever, this still would not allow us to avoid having to solve the difficult problem of creating friendly AI with human values, since there would still be a high risk that other people in the world with not-so-good intentions would eventually develop an unfriendly AI and intentionally release it upon the world, or simply not exercise the caution necessary to keep it contained.

Once the technology to make superintelligent AI is developed, good people will be pressured to create friendly AI and let it take control of the future of the world ASAP. The longer they wait, the greater the risk that not-so-good people will develop AI that isn't specifically designed to have human values. This is why solving the value alignment problem soon is so important.

I'm not sure your argument proves your claim. I think what you've shown is that there exist reasons other than the inability to create perfect boxes to care about the value alignment problem.

We can flip your argument around and apply it to your claim: imagine a world where there was only one team with the ability to make superintelligent AI. I would argue that it would still be extremely unsafe to build an AI and try to box it. I don't think that this lets me conclude that a lack of boxing ability is the true reason that the value alignment problem is so important.

I agree that there are several reasons why solving the value alignment problem is important.

Note that when I said that Bostrom should "modify" his reply I didn't mean that he should make a different point instead of the point he made, but rather meant that he should make another point in addition to the point he already made. As I said:

While what [Bostrom] says is correct, I think that there is a more important point he should also be making when replying to this claim.

Ah, I see. Fair enough!

I thought it was excellent, and not at all too ivory tower, although he moved through more inferential steps than in the average TED talk.

I thoroughly enjoyed it and think it was really well done. I can't perfectly judge how accessible it would be to those unfamiliar with x-risk mitigation and AI, but I think it was pretty good in that respect and did a good job of justifying the value alignment problem without seeming threatening.

I like how he made sure to position the people working on the value alignment problem as separate from those actually developing the potentially-awesome-but-potentially-world-ending AI so that the audience won't have any reason to not support what he's doing. I just hope the implicit framing of superintelligent AI as an inevitability, not a possibility, isn't so much of an inferential leap that it takes people out of reality-mode and into fantasy-mode.

I wouldn't have been able to guess the date this speech was given. The major outline seems 10 years old.

Is that a problem? Reiterating the basics is always a useful thing, and he didn't have much more time after doing so.


Excellent layout of the real problem: the control mechanism, rather than the creation of AI itself.

I started this Reddit on Ethereum where AI may be recreated first: