Holden Karnofsky's Singularity Institute critique: other objections

by Paul Crowley, 1 min read, 11th May 2012, 6 comments



The sheer length of GiveWell co-founder and co-executive director Holden Karnofsky's excellent critique of the Singularity Institute means that it's hard to keep track of the resulting discussion.  I propose to break out each of his objections into a separate Discussion post so that each receives the attention it deserves.

Other objections to SI's views

There are other debates about the likelihood of SI's work being relevant/helpful; for example,

  • It isn't clear whether the development of AGI is imminent enough to be relevant, or whether other risks to humanity are closer.
  • It isn't clear whether AGI would be as powerful as SI's views imply. (I discussed this briefly in Karnofsky/Tallinn 2011.)
  • It isn't clear whether even an extremely powerful UFAI would choose to attack humans as opposed to negotiating with them. (I find it somewhat helpful to analogize UFAI-human interactions to human-mosquito interactions. Humans are enormously more intelligent than mosquitoes; humans are good at predicting, manipulating, and destroying mosquitoes; humans do not value mosquitoes' welfare; humans have other goals that mosquitoes interfere with; humans would like to see mosquitoes eradicated at least from certain parts of the planet. Yet humans haven't accomplished such eradication, and it is easy to imagine scenarios in which humans would prefer honest negotiation and trade with mosquitoes to any other arrangement, if such negotiation and trade were possible.)

Unlike the three objections I focus on, these other issues have been discussed a fair amount, and if they were the only objections to SI's arguments I would find SI's case strong (i.e., I would find its scenario likely enough to warrant investment).

Comments

In connection with this discussion, I am pleased to announce a new initiative, the Unfriendly AI Pseudocode Contest!

Objective of the contest: To produce convincing examples of how a harmless-looking computer program, that has not been specifically designed to be "friendly", could end up destroying the world. To explore the nature of AI danger without actually doing dangerous things.

Examples: A familiar example of unplanned unfriendliness is the program designed to calculate pi, which reasons that it could calculate pi with much more accuracy if it turned the Earth into one giant computer. Here a harmless-looking goal (calculate pi) combines with a harmless-looking enhancement (vastly increased "intelligence") to produce a harmful outcome (the Earth turned into one giant computer that does nothing but calculate pi).

An entry in the Unfriendly AI Pseudocode Contest intended to illustrate this scenario would need to be specified in much more detail than this. For example, it might contain a pseudocode specification of the pi-calculating program in a harmless "unenhanced" state, then a description of a harmless-looking enhancement, and then an analysis demonstrating that the program has become an existential risk.

Prizes: The accolades of your peers. The uneasy admiration of a terrified humanity, for whom your little demo has become the standard example of why "friendliness" matters. The gratitude of nihilist supervillains, for whom your pseudocode provides a convenient blueprint for action...

A variant of this contest with less catastrophic unfriendliness actually ran for a few years: the (now defunct) Underhanded C Contest. The description below is from the contest web page:

The Underhanded C Contest is an annual contest to write innocent-looking C code implementing malicious behavior. In this contest you must write C code that is as readable, clear, innocent and straightforward as possible, and yet it must fail to perform at its apparent function. To be more specific, it should do something subtly evil.

Every year, we will propose a challenge to coders to solve a simple data processing problem, but with covert malicious behavior. Examples include miscounting votes, shaving money from financial transactions, or leaking information to an eavesdropper. The main goal, however, is to write source code that easily passes visual inspection by other programmers.

This contest sounds seriously cool and possibly useful, but it looks like a valid entry would require the pseudocode for a general intelligence, which as far as I know is beyond the capability of anyone reading this post.

I expect at this stage, you'd be allowed an occasional "and then a miracle occurs" until we work out what step two looks like.

Lesswrong is not an enjoyable place to post pseudocode. I learned this today.

[This comment is no longer endorsed by its author]

Intelligence is the greatest tool humans have. Computers show a path to implementing intelligence outside a human brain. We should prepare for AGI as best we can.