In Artificial Intelligence as a Positive and Negative Factor in Global Risk, Yudkowsky uses the following parable to illustrate the danger of using case-based learning to produce the goal systems of advanced AIs:

Once upon a time, the US Army wanted to use neural networks to automatically detect camouflaged enemy tanks.  The researchers trained a neural net on 50 photos of camouflaged tanks in trees, and 50 photos of trees without tanks.  Using standard techniques for supervised learning, the researchers trained the neural network to a weighting that correctly loaded the training set - output "yes" for the 50 photos of camouflaged tanks, and output "no" for the 50 photos of forest.  This did not ensure, or even imply, that new examples would be classified correctly.  The neural network might have "learned" 100 special cases that would not generalize to any new problem.  Wisely, the researchers had originally taken 200 photos, 100 photos of tanks and 100 photos of trees.  They had used only 50 of each for the training set.  The researchers ran the neural network on the remaining 100 photos, and without further training the neural network classified all remaining photos correctly.  Success confirmed!  The researchers handed the finished work to the Pentagon, which soon handed it back, complaining that in their own tests the neural network did no better than chance at discriminating photos.

It turned out that in the researchers' data set, photos of camouflaged tanks had been taken on cloudy days, while photos of plain forest had been taken on sunny days.  The neural network had learned to distinguish cloudy days from sunny days, instead of distinguishing camouflaged tanks from empty forest.
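
To see how easily this can happen, here's a minimal sketch - my own toy reconstruction, not anything from the paper - using numpy and scikit-learn. The "photos" are reduced to two invented features, an overall brightness and a faint "tank cue"; brightness is confounded with the label in the training and held-out sets but not in the "field" set, so the classifier mostly learns the weather.

```python
# Toy sketch of the failure mode (not from the paper): a classifier trained on data
# where brightness is confounded with the label learns brightness, not tanks.
# The feature construction here is invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_photos(n, confounded):
    """Each 'photo' is two numbers: overall brightness and a weak, noisy tank cue."""
    labels = rng.integers(0, 2, size=n)                # 1 = camouflaged tank, 0 = empty forest
    tank_cue = labels + rng.normal(0.0, 2.0, size=n)   # the real signal, but faint
    if confounded:
        brightness = (1 - labels) + rng.normal(0.0, 0.1, size=n)  # tanks shot on cloudy days
    else:
        brightness = rng.normal(0.5, 0.5, size=n)                 # weather unrelated to tanks
    return np.column_stack([brightness, tank_cue]), labels

X_train, y_train = make_photos(100, confounded=True)    # the researchers' 50 + 50 photos
X_held, y_held = make_photos(100, confounded=True)      # their remaining 100 photos
X_field, y_field = make_photos(2000, confounded=False)  # the Pentagon's own tests

clf = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy (confound intact):", clf.score(X_held, y_held))    # near 1.0
print("field accuracy (confound broken):   ", clf.score(X_field, y_field))  # drops toward chance
print("weights [brightness, tank cue]:     ", clf.coef_[0])                 # brightness gets most of the weight
```

The held-out score looks great because those photos share the confound; the field score collapses because they don't.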

I once stumbled across the source of this parable online, but now I can't find it.

Anyway, I'm curious: Are there any well-known examples of this kind of problem actually causing serious damage — say, when a narrow AI trained via machine learning was placed into a somewhat novel environment?

I sat on the floor with my dog and taught him to roll over in a few sessions. Each session was five minutes, and he had it by the end of each one.

The third session was a bit different from the first two because I sat in a chair. I began by making a whirling motion with my hand and saying "roll over". He quickly shuffled a few feet to the right, crashed hard into the wall, and fell over. He looked confusedly at the wall, and then at me - at the wall for attacking him, and at me for withholding his reward of food.

He had, after all, performed the trick that I had taught him - pointing his head at my crotch and shuffling to the right, not stopping until he flipped 360 degrees.

That I intended for him to be learning to roll over didn't matter - this is reinforcement learning.

Ed Fredkin has since sent me a personal email:

By the way, the story about the two pictures of a field, with and without army tanks in the picture, comes from me. I attended a meeting in Los Angeles, about half a century ago where someone gave a paper showing how a random net could be trained to detect the tanks in the picture. I was in the audience. At the end of the talk I stood up and made the comment that it was obvious that the picture with the tanks was made on a sunny day while the other picture (of the same field without the tanks) was made on a cloudy day. I suggested that the "neural net" had merely trained itself to recognize the difference between a bright picture and a dim picture.

My favorite example of this is the "racist" face tracking camera.

That's pretty funny, though it appears the cause may not have been, say, the engineers training the face tracking software mostly on white people.

Amusing, and perhaps inconvenient for a few users before it came to the attention of the manufacturer, but I don't think that's really "serious damage".

It's almost certainly not the actual source of the "parable" - or if it is, the story was greatly exaggerated in its retelling (admittedly not unlikely) - but this may well be the original study (probably the most commonly reused data set in the field), and this is a useful overview of the topic.

Does that help?

Except "November Fort Carson RSTA Data Collection Final Report" was released in 1994 covering data collection from 1993, but the parable was described in 1992 in the "What Artificial Experts Can and Cannot Do" paper.

The earliest reference to the parable that I can find is in this paper from 1992. (Paywalled, so here's the relevant page.) I also found another paper which attributes the story to this book, but the limited Google preview does not show me a specific discussion of it in the book.

Here's the full version of "What Artificial Experts Can and Cannot Do" (1992): http://www.jefftk.com/dreyfus92.pdf

It has:

... consider the legend of one of connectionism's first applications. In the early days of the perceptron ...

Expanded my comments into a post: http://www.jefftk.com/p/detecting-tanks

There's also https://neil.fraser.name/writing/tank/ from 1998 which says the "story might be apocryphal", so by that point it sounds like it had been passed around a lot.

In the "Building Neural Networks" book, the bottom of page 199 seems to be about "classifying military tanks in SAR imagery". It goes on to say it is only interested in "tank" / "non-tank" categories.

But it also doesn't look like it's a version of this story - that section of the book is just a straightforward "how to distinguish tanks" bit.

Great, thanks!

Every time you've missed an important email because of a spam filter false positive.

This was also discussed in Magical Categories. I don't know the source of the parable, though.

Dataset bias is a huge topic in computer vision at the moment. An excellent (and humorous) overview is given by Efros and Torralba: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5995347&tag=1.

Pakistan claims that U.S. Predator drones routinely bomb outdoor weddings, mistaking them for Al-Qaeda rallies. I couldn't immediately find an authoritative source, but if there's any truth to it, this sort of phenomenon is probably to blame.

The drones aren't given much in the way of self-direction. I'd guess this is more due to human error than anything else.

U.S. Predator drones don't fire automatically, yet.

I'm reminded of one of your early, naively breathless articles here on the value of expert systems from the mid-80s and earlier.

Why don't you write a post on how it is naive? Do you actually know something about practical application of these methods?

Yes, if experts say that they use quantifiable data X, Y, and Z to predict outcomes, the fact that simple algorithms beat them using only that data might not matter if the experts really use other data. But there is lots of evidence that experts are terrible with non-quantifiable data - for example, thinking that interviews are useful in hiring. Tetlock finds that ecologically valid use of these trivial models beats experts in politics.
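
For concreteness, here's a rough sketch - my own illustration, with invented cue names and numbers - of the kind of "trivial model" at issue: a unit-weight "improper" linear model in the style of Dawes, which just z-scores the quantified cues, orients them so higher is better, and adds them up.

```python
# Rough sketch of a unit-weight ("improper") linear model, the simplest kind of
# statistical prediction rule.  The cue names and sample numbers are invented.
import numpy as np

def unit_weight_spr(cues, signs):
    """Z-score each cue across cases, orient so higher = better, and sum."""
    cues = np.asarray(cues, dtype=float)
    z = (cues - cues.mean(axis=0)) / cues.std(axis=0)
    return z @ np.asarray(signs, dtype=float)

# Hypothetical hiring cues per candidate: [test score, years of experience, days absent]
candidates = [[82, 4, 10],
              [71, 9,  2],
              [90, 1, 15]]
scores = unit_weight_spr(candidates, signs=[+1, +1, -1])
print(scores)  # rank candidates by the rule rather than by interview impressions
```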

this one:

http://lesswrong.com/lw/3gv/statistical_prediction_rules_outperform_expert/

When based on the same evidence, the predictions of SPRs are at least as reliable as, and are typically more reliable than, the predictions of human experts for problems of social prediction.

Hmm yes, 'same evidence'.