(misleading title removed)

by The_Jaded_One · 2 min read · 28th Jan 2015 · 8 comments


Personal Blog

An article by AAAI president Tom Dietterich and Director of Microsoft Research Eric Horvitz has recently received some media attention (BBC, etc.) for downplaying AI existential risks. You can go read it yourself, but the key paragraph is this:

A third set of risks echo the tale of the Sorcerer’s Apprentice. Suppose we tell a self-driving car to “get us to the airport as quickly as possible!” Would the autonomous driving system put the pedal to the metal and drive at 300 mph while running over pedestrians? Troubling scenarios of this form have appeared recently in the press. Other fears center on the prospect of out-of-control superintelligences that threaten the survival of humanity. All of these examples refer to cases where humans have failed to correctly instruct the AI algorithm in how it should behave.

This is not a new problem. An important aspect of any AI system that interacts with people is that it must reason about what people intend rather than carrying out commands in a literal manner. An AI system should not only act on a set of rules that it is instructed to obey — it must also analyze and understand whether the behavior that a human is requesting is likely to be judged as “normal” or “reasonable” by most people. It should also be continuously monitoring itself to detect abnormal internal behaviors, which might signal bugs, cyberattacks, or failures in its understanding of its actions. In addition to relying on internal mechanisms to ensure proper behavior, AI systems need to have the capability — and responsibility — of working with people to obtain feedback and guidance. They must know when to stop and “ask for directions” — and always be open for feedback.

Some of the most exciting opportunities ahead for AI bring together the complementary talents of people and computing systems. AI-enabled devices are ... (examples follow) ...

In reality, creating real-time control systems where control needs to shift rapidly and fluidly between people and AI algorithms is difficult. Some airline accidents occurred when pilots took over from the autopilots. The problem is that unless the human operator has been paying very close attention, he or she will lack a detailed understanding of the current situation.

AI doomsday scenarios belong more in the realm of science fiction than science fact.

They continue:

However, we still have a great deal of work to do to address the concerns and risks afoot with our growing reliance on AI systems. Each of the three important risks outlined above (programming errors, cyberattacks, “Sorcerer’s Apprentice”) is being addressed by current research, but greater efforts are needed.

We urge our colleagues in industry and academia to join us in identifying and studying these risks and in finding solutions to addressing them, and we call on government funding agencies and philanthropic initiatives to support this research. We urge the technology industry to devote even more attention to software quality and cybersecurity as we increasingly rely on AI in safety-critical functions. And we must not put AI algorithms in control of potentially-dangerous systems until we can provide a high degree of assurance that they will behave safely and properly.

I feel that Horvitz and Dietterich somewhat contradict themselves here. They start their rebuttal by confidently asserting that "This is not a new problem", but go on to say that an AI system should "be continuously monitoring itself to detect ... failures in its understanding of its actions". Of course, anyone who knows anything about the history of AI will know that AI systems are notoriously bad at knowing when they've completely "lost the plot". The two solutions they outline, an AI system understanding what counts as "reasonable" (the commonsense knowledge problem) and an AI usefully self-monitoring in situations with real-life complexity, are both hopelessly beyond the current state of the art. Yes, the problem is not new, but that doesn't mean it isn't a problem.

More importantly, Horvitz and Dietterich don't really engage with the idea that superintelligence makes the control problem qualitatively harder.

Reading between the lines, I suspect that they don't really think superintelligence is a genuine possibility. Their mental model of the world seems to be that from now until eternity we will have a series of incrementally better Siris and Cortanas which helpfully suggest which present to buy for grandma or what to wear on a trip to Boston. That is, they think that the current state of the art will never be qualitatively superseded: there will never be an AI that is better at AI science than they are, that will self-improve, etc.

If so, the rest of their position makes a lot of sense.

One question that keeps kicking around in my mind is this: if someone's true but unstated objection to the problem of AI risk is that superintelligence will never happen, how do you change their mind?
