I tend to think differently on this one.
Wherever I look in this world, I see lost causes everywhere. I see Goodhart's law and Campbell's law on the loose everywhere. I see insane optimizers everywhere. Political parties that concentrate more on show, pomp and campaign funds than on actual issues. Corporations that seek money to the exclusion of actually creating value. Governments that chase employment and GDP growth even when those are propped up by artificial stimuli rather than sustainable patterns of production and trade.
One might argue that none of these systems is actually as intelligent as a well-educated human at any given moment in time. But that's the point, isn't it? If you're unable to stop sub-human optimizers, how are you going to curb a near-human or superhuman one?
For me, the scary idea is not so much an idea as an extension of something that is already happening in this world.
I really liked this comment by FrankAdamek:
With regards to teaching an AI to care: what you can teach a mind depends on the mind. The best examples come from human beings: for hundreds of years many (though not all) parents have taught their children that it is wrong to have sex before marriage, a precept that many people break even when they think they shouldn't and feel bad about it. And that's with our built-in desires for social acceptance and hardware for propositional morality. For another example, you can't train tigers to care about their handlers. No matter how much time you spend with them and care for them, they sometimes bite off arms just because they are hungry. I understand most big cats are like this.
It's quite true that nobody plans to build a system with no concern for human life, but it's also true that many people assume Friendliness is easy.
A particularly troubling quote from the post:
I think the relation between breadth of intelligence and depth of empathy is a subtle issue which none of us fully understands (yet). It's possible that with sufficient real-world intelligence tends to come a sense of connectedness with the universe that militates against squashing other sentiences. But I'm not terribly certain of this, any more than I'm terribly certain of its opposite.
The obvious truth is that mind-design space contains every combination of intelligence and empathy.
One of my fundamental contentions is that empathy is a requirement for intelligence beyond a certain point because the consequences of lacking it are too severe to overcome.
Two questions:
1) The consequences for whom?
2) How much empathy do you have for, oh, say, an E. coli bacterium?
Connecting these two questions is left as an exercise for the reader. ;-)
One of my fundamental contentions is that empathy is a requirement for intelligence beyond a certain point because the consequences of lacking it are too severe to overcome.
Human psychopaths are a counterexample to this claim, and they seem to be doing alright in spite of active efforts by the rest of humanity to detect and eliminate them.
One thing that I think is relevant, in the discussion of existential risk, is Martin Weitzman's "Dismal Theorem" and Jim Manzi's analysis of it. (Link to the article, link to the paper.)
There, the topic is not unfriendly AI, but climate change. Regardless of what you think of the topic, it has attracted more attention than AGI, and people writing about existential risk are often using climate change as an example.
Martin Weitzman, a Harvard economist, deals with the probability of extreme disasters, and whether it's worth it in cost-benefit terms to deal with them. Our problem, in cases of extreme uncertainty, is that we don't only have probability distributions, we have uncertain probability distributions; it's possible we got the models wrong. Weitzman's paper takes this into account. He creates a family of probability distributions, indexed over a certain parameter, and integrates over it -- and he proves that the process of taking "probability distributions of probability distributions" has the result of making the final distribution fat-tailed. So fat-tailed that the expected-utility integral doesn't converge.
This is a terrible consequence. Because if the PDF of th...
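For what it's worth, here is a minimal sketch of (my understanding of) the fat-tail mechanism, using a deliberately simplified setup rather than Weitzman's full model: normal growth with unknown variance and CRRA utility, with notation that is mine, not his.

```latex
% Simplified sketch of the structural-uncertainty mechanism (my own
% stylized version, not Weitzman's full model).
%
% Growth g is normal given the variance, but the variance itself is
% uncertain; integrating it out with a diffuse (e.g. inverse-gamma)
% prior gives a Student-t predictive density with polynomial tails:
g \mid \sigma^2 \sim \mathcal{N}(\mu, \sigma^2), \qquad
p(g) = \int p(g \mid \sigma^2)\, p(\sigma^2)\, d\sigma^2 \;\sim\; \text{Student-}t .
%
% With CRRA utility and future consumption C = C_0 e^{g}, expected
% marginal utility involves E[e^{-\eta g}]. The exponential grows
% faster as g \to -\infty than the polynomial tail of p(g) decays, so
E\!\left[ e^{-\eta g} \right] = \int_{-\infty}^{\infty} e^{-\eta g}\, p(g)\, dg = \infty .
```

That divergence is the sense in which the "fat tail swamps the calculation": willingness to pay to avoid the tail is unbounded under this setup.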
The talk about uncertainty is indeed a red herring. There are two things going on here:
1) A linear aggregative (or fast-growing enough in the relevant range) social welfare function makes even small probabilities of existential risk more important than large costs or benefits today. This is the Bostrom astronomical waste point. Weitzman just uses a peculiar model (with agents with bizarre preferences that assign infinite disutility to death, and a strangely constricted probability distribution over outcomes) to indirectly introduce this. You can reject it with a bounded social welfare function like Manzi or Nordhaus, representing your limited willingness to sacrifice for future generations.
2) The fact that there are many existential risks competing for our attention, and many routes to affecting existential risk, so that spending effort on any particular risk now means not spending that effort on other existential risks, or keeping it around while new knowledge accumulates, etc. Does the x-risk reduction from climate change mitigation beat the reduction from asteroid defense or lobbying for arms control treaties at the current margin? Weitzman addresses this by saying that the risk from surprise catastrophic climate change is much higher than other existential risks collectively, which I don't find plausible.
There is a large, continuous spectrum between making an AI and hoping it works out okay, and waiting for a formal proof of friendliness. Now, I don't think a complete proof is feasible; we've never managed a formal proof for anything close to that level of complexity, and the proof would be as likely to contain bugs as the program would. However, that doesn't mean we shouldn't push in that direction. Current practice in AI research seems to be to publish everything and take no safety precautions whatsoever, and that is definitely not good.
Suppose an AGI is created, initially not very smart but capable of rapid improvement, either with further development by humans or by giving it computing resources and letting it self-improve. Suppose, further, that its creators publish the source code, or allow it to be leaked or stolen.
AI improvement will probably proceed in a series of steps: the AI designs a successor, spends some time inspecting it to make sure the successor has the same values, then hands over control, then repeat. At each stage, the same tradeoff between speed and safety applies: more time spent verifying the successor means a lower probability of error, but a higher probab...
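As a toy calculation (entirely my own construction, with made-up numbers) of how that tradeoff compounds: even a small per-step chance of a verification miss adds up over many handovers.

```python
# Toy model of the speed/safety tradeoff across self-improvement steps:
# each handover has some probability of introducing a value error that
# verification failed to catch, and those chances compound.

def p_value_drift(steps: int, per_step_error: float) -> float:
    """Probability that at least one of `steps` handovers goes wrong."""
    return 1 - (1 - per_step_error) ** steps

for per_step_error in (0.001, 0.01, 0.05):
    print(per_step_error, round(p_value_drift(100, per_step_error), 3))
# 0.001 -> ~0.095, 0.01 -> ~0.634, 0.05 -> ~0.994 over 100 handovers:
# small per-step error rates still accumulate into large cumulative risk.
```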
When he was paraphrasing the reasons:
Human value is fragile as well as complex, so if you create an AGI with a roughly-human-like value system, then this may not be good enough, and it is likely to rapidly diverge into something with little or no respect for human values
... that doesn't seem quite right. The main problem with values being fragile isn't that a "roughly-human-like value system" might diverge rapidly; it's that properly implementing a "roughly-human-like value system" is actually quite hard, and most AGI programmers seem to underestimate its complexity and go for "hacky" solutions, which I find somewhat scary.
Ben seems aware of this, and later goes on to say:
This is related to the point Eliezer Yudkowsky makes that "value is complex" -- actually, human value is not only complex, it's nebulous and fuzzy and ever-shifting, and humans largely grok it by implicit procedural, empathic and episodic knowledge rather than explicit declarative or linguistic knowledge.
... which seems to be one of the reasons to pay extra attention to it (and this also seems to be a reason given by Eliezer, whereas Ben almost presents it as a counterpoint to Eliezer).
For me, the oddest thing about Goertzel's article is his claim that SIAI's argument is so unclear that he had to construct it himself. The way he describes the argument is completely congruent with what I've been reading here.
In any case, his argument that provable Friendliness may not be possible, and that it makes more sense to take an incremental approach to AGI than to hold off on AGI until Friendliness is proven, seems reasonable.
Has it been demonstrated that Friendliness is provable?
If Goertzel's claim that "SIAI's argument is so unclear that he had to construct it himself" can't be disproven by the simple expedient of posting a single link to an immediately available, well-structured, top-down argument, then SIAI should regard producing one as an obvious high-priority, high-value task. If it can be disproven by such a link, then that link needs to be more widely advertised, since it seems that none of us are aware of it.
There are some fundamentally incorrect assumptions that have become gospel.
So go ahead and point them out. My guess is that in the ensuing debate it will be found that 1/4 of them are indeed fundamentally incorrect assumptions, 1/4 of them are arguably correct, and 1/2 of them are not really "assumptions that have become gospel". But until you provide your list, there is no way to know.
So: it seems as though the "default case" of a software company shipping an application would be that it crashes, or goes into an infinite loop - since that's what happens unless steps are specifically taken to avoid it.
Not quite. The "default case" of a software company shipping an application is that there will definitely be bugs in the parts of the software they have not specifically and sufficiently tested... where "bugs" can mean anything from crashes or loops, to data corruption.
The analogy here -- and it's so direct and obvious a relationship that it's a stretch to even call it an analogy! -- is that if you haven't specifically tested your self-improving AGI for it, there are likely to be bugs in the "not killing us all" parts.
I repeat: we already know that untested scenarios nearly always have bugs, because human beings are bad at predicting what complex programs will do, outside of the specific scenarios they've envisioned.
And we are spectacularly bad at this, even for crap like accounting software. It is hubris verging on sheer insanity to assume that humans will be able to (by default) write a self-improving AGI that has to be bug-free from the moment it is first run.
Can the machine fix its own bugs?
How do you plan to fix the bugs in its bug-fixing ability, before the bug-fixing ability is applied to fixing bugs in the "don't kill everyone" routine? ;-)
More to the point, how do you know that you and the machine have the same definition of "bug"? That seems to me like the fundamental danger of self-improving AGI: if you don't agree with it on what counts as a "bug", then you're screwed.
(Relevant SF example: a short story in which the AI ship -- also the story's narrator -- explains how she corrected her creator's all-too-human error: he said their goal was to reach the stars, and yet for some reason, he set their course to land on a planet. Silly human!)
What about a "controlled ascent"?
How would that be the default case, if you're explicitly taking precautions?
Much is unclear. I believe this post is a good opportunity to give a roundup of the problem, for anyone who hasn't read the comments thread here:
The risk from recursive self-improvement is either dramatic enough to outweigh the low probability of the event, or likely enough to outweigh the probability of other existential risks. This is the idea everything revolves around in this community (it's not obvious, but I believe so). It is an idea that, if true, possibly affects everyone and our collective future, if not the whole universe.
I believe that someone ...
I'm fighting against giants here, as someone who only mastered elementary school. I believe it should be easy to refute my arguments or show me where I am wrong, or point me to some documents I should read up on. But I just don't see that happening. I talk to other smart people online as well; that way I was actually able to overcome religion. But seldom have there been people less persuasive than you when it comes to risks associated with artificial intelligence and the technological singularity. Yes, maybe I'm unable to comprehend it right now, I grant you that. Whatever the reason, I'm not convinced and will say so as long as it takes. Of course you don't need to convince me, but I don't need to stop questioning either.
Here is a very good comment by Ben Goertzel that pinpoints it:
This is what discussions with SIAI people on the Scary Idea almost always come down to!
The prototypical dialogue goes like this.
SIAI Guy: If you make a human-level AGI using OpenCog, without a provably Friendly design, it will almost surely kill us all.
Ben: Why?
SIAI Guy: The argument is really complex, but if you read Less Wrong you should understand it
Ben: I read the Less Wrong blog posts. Isn't there somewhere that the argument is presented formally and systematically?
SIAI Guy: No. It's really complex, and nobody in-the-know had time to really spell it out like that.
Good article. Thx for posting. I agree with much of it, but ...
Goertzel writes:
I do see a real risk that, if we proceed in the manner I'm advocating, some nasty people will take the early-stage AGIs and either use them for bad ends, or proceed to hastily create a superhuman AGI that then does bad things of its own volition. These are real risks that must be thought about hard, and protected against as necessary. But they are different from the Scary Idea.
Is this really different from the Scary Idea?
I've always thought of this as part of the Scary Id...
Ben's post states,
Finally, I note that most of the other knowledgeable futurist scientists and philosophers, who have come into close contact with SIAI's perspective, also don't accept the Scary Idea. Examples include Robin Hanson, Nick Bostrom and Ray Kurzweil.
Is there a reference for Bostrom's position on AGI-without-FAI risk? Is Goertzel correct here?
Although Goertzel is no longer on the Team page of the SIAI site, his profile on the Advisors page states that
Ben Goertzel, Ph.D., is SIAI Director of Research, responsible for overseeing the direction of the Institute's research division.
I assume this is an oversight, left unchanged from before. (Edit: Fixed!)
Also, on the Research Areas page, areas 4, "Customization of Existing Open-Source Projects", and 6, "AGI Evaluation Mechanisms", are distinctly of an AGI-without-FAI nature, from Goertzel's project.
Goertzel's article seems basically reasonable to me. There were some mis-statements at the very end that I can excuse, because by that point part of his argument was that certain kinds of hyperbole came up over and over, and his text was mimicking the form of the hyperbolic arguments even as it criticized them. The grandmother line and IQ-obsessed aliens spring to mind :-P
Given his summary of the "Scary AGI Thesis"...
...If someone builds an advanced AGI without a provably Friendly architecture, probably it will have a hard takeoff, and then probab
Ben Goertzel also says "If one fully accepts SIAI's Scary Idea, then one should not work on practical AGI projects..." Here is another recent quote that is relevant:
...What I find a continuing source of amazement is that there is a subculture of people half of whom believe that AI will lead to the solving of all mankind's problems (which we might call Kurzweilian S^) and the other half of which is more or less certain (75% certain) that it will lead to annihilation. Let's call the latter the SIAI S^.
Yet you SIAI S^ invite these proponents of global
The Al Gore hypocrisy claim is misleading. Global warming changes the equilibrium sea level, but it takes many centuries to reach that equilibrium (glaciers can't melt instantly, etc). So climate change activists like to say that there will be sea level rises of hundreds of feet given certain emissions pathways, but neglect to mention that this won't happen in the 21st century. So there's no contradiction between buying oceanfront property only slightly above sea level and claiming that there will be large eventual sea level increases from global warming.
The thing to critique would be the misleading rhetoric that gives the impression (by mentioning that the carbon emissions by such and such a date will be enough to trigger sea level rises, but not mentioning the much longer lag until those rises fully occur) that the sea level rises will happen mostly this century.
Regarding Hughes' point, even if one thinks that an activity has harmful effects, that doesn't mean that a campaign to ban it won't do more harm than good. That would essentially be making bitter enemies of several of the groups (AI academia and industry) with the greatest potential to reduce risk, and discredit the whole...
Thanks to Kevin for the original pointer.
Key points, some of which I already mentioned in the post Should I believe what the SIAI claims?:
Yes, you may argue: the Scary Idea hasn't been rigorously shown to be true… but what if it IS true?
OK but ... pointing out that something scary is possible, is a very different thing from having an argument that it's likely.
The Scary Idea is certainly something to keep in mind, but there are also many other risks to keep in mind, some much more definite and palpable.
[...]
...Also, there are always possibilities l
Check out SIAI's publications page. Kaj's most recent paper (published at ECAP '10) is a good two-page summary of why AGI can be an x-risk, for anyone who is unfamiliar with SIAI's position:
Having such beliefs with absolute certainty is incorrect; we don't have sufficient understanding for that. But weak beliefs multiplied by astronomical value lead to the same drastic actions, whose cost-benefit analysis doesn't take notice of small inconveniences such as being perceived to be crazy.
The Unabomber performed some "drastic actions". I expect he didn't mind if he was "perceived to be crazy" by others - although he didn't want to plead insanity.
Does astronomical value outweigh astronomically low probability? You can come up with all kinds of scenarios that carry astronomical value - an astronomical number of scenarios, if you allow for astronomically low probability. Isn't this betting on infinity?
I would like to explore Ben's reasons for rejecting the premises of the argument.
I think the first of the above points is reasonably plausible
He offers the possibility that intelligence might cause or imply empathy; I feel that although we see that connection when we look at all of Earth's creatures, correlation doesn't imply causation, so that (intelligence AND empathy) doesn't mean (intelligence IMPLIES empathy) - it probably means (evolution IMPLIES intelligence AND empathy) and we aren't using natural selection to build an AI.
...I doubt human value
The problem with Pascal's Wager isn't that it's a Wager. The problem with Pascal's Wager and Pascal's Mugging (its analogue in finite expected utility maximization), as near as I can tell, is that if you do an expected utility calculation including one outcome that has a tiny probability but enough utility or disutility to weigh heavily in the calculation anyway, you need to include every possible outcome that is around that level of improbability, or you are privileging a hypothesis and are probably making the calculation less accurate in the process. If you actually are including every other hypothesis at that level of improbability, for instance if you are a galaxy-sized Bayesian superintelligence who, for reasons beyond my mortal mind's comprehension, has decided not to just dismiss those tiny possibilities a priori anyway, then it still shouldn't be any problem; at that point, you should get a sane, nearly-optimal answer.
So, is this situation a Pascal's Mugging? I don't think it is. 1% isn't at the same level of ridiculous improbability as, say, Yahweh existing, or the mugger's threat being true. 1% chances actually happen pretty often, so it's both possible and prudent to tak...
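As a toy illustration of the "privileging a hypothesis" failure described above (the numbers are made up, chosen only to make the effect visible), compare an expected-utility calculation that includes a single extreme low-probability outcome against one that also includes the comparably improbable outcomes pointing the other way:

```python
# Toy expected-utility calculations (made-up numbers) showing why
# including one tiny-probability, huge-utility outcome while omitting
# its equally improbable counterparts skews the result.

def expected_utility(outcomes):
    """outcomes: iterable of (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)

ordinary = [(0.999999998, 1.0)]                # everyday outcomes

# Privileged hypothesis: one extreme outcome at p = 1e-9 included alone.
privileged = ordinary + [(1e-9, 1e12)]

# Every hypothesis at that improbability level included; here a
# symmetric pair that roughly cancels.
balanced = ordinary + [(1e-9, 1e12), (1e-9, -1e12)]

print(expected_utility(privileged))   # dominated by the single extreme term (~1001)
print(expected_utility(balanced))     # back near the ordinary value (~1)
```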
I’m also not big on friendly AI, but my position differs somewhat. I’m pretty skeptical about a very local hard takeoff scenario, where within a month one unnoticed machine in a basement takes over a world like ours. And even given such a scenario, the chance that its creators could constrain it greatly via a provably friendly design seems remote. And the chance such constraint comes from a small team that is secretive to avoid assisting reckless others seems even more remote.
[...] I just see little point anytime soon in trying to coordinate to prevent such an outcome.
Perhaps the current state of evidence really is insufficient to support the scary hypothesis.
But surely, if one agrees that AI ethics is an existentially important problem, one should also agree that it makes sense for people to work on a theory of AI ethics. Regardless of which hypothesis turns out to be true.
Just because we don't currently have evidence that a killer asteroid is heading for the Earth, doesn't mean we shouldn't look anyway...
At the Singularity Summit's "Meet and Greet", I spoke with both Ben Goertzel and Eliezer Yudkowsky (among others) about this specific problem.
I am FAR more in line with Ben's position than with Eliezer's (probably because both Ben and I are either working or studying directly on the "how to do" aspect of AI, rather than just concocting philosophical conundrums for AI, such as the "Paperclip Maximizer" scenario of Eliezer's, which I find highly dubious).
AI isn't going to spring fully formed out of some box of parts. It may be a...
That is, rather than "if you go ahead with an AGI when you're not 100% sure that it's safe, you're committing the Holocaust," I suppose my view is closer to "if you avoid creating beneficial AGI because of speculative concerns, then you're killing my grandma" !!
Yeah, that may very well be a big risk too. As I said before here: Or maybe most civilisations are so cautious that even if something is estimated to be safe by the majority, they avoid it anyway. And this overcaution makes them either evolve so slowly that the chance of a ...
No, I also prefer text, and rarely watch youtube links when they're given here.
Videos can be worth it when they add good visual explanations. But good visual explanations can also be added to text.
There are downsides to being popular. A significant one is creating fans that don't actually understand what you're saying very well, and then go around giving a bad impression of you.
Having a moderate amount of smart fans would be way better than having lots of silly fans. I'm a bit fearful of what kind of crowd a large number of easy-to-digest videos would attract...
This post doesn't show up under "NEW", nor does it show up under "Recent Posts".
ADDED: Never mind. I forgot I had "disliked" it, and had "do not show an article once I've disliked it" set.
(I disliked it because I find it kind of shocking that Ben, who's very smart, and whom I'm pretty sure has read the things that I would refer him to on the subject, would say that the Scary Idea hasn't been laid out sufficiently. Maybe some people need every detail spelled out for them, but Ben isn't one of them. Also, he is com...
Have you read it?
I've looked at it.
I believe it is utter nonsense.
That is my impression too. Which is why I don't understand why you are complaining about censorship of ideas and wondering why EY doesn't spend more time refuting ideas.
As I understand it, we are talking about actions that might be undertaken by an AI that you and I would call insane. The "censorship" is intended to mitigate the harm that might be done by such an AI. Since I think it possible that a future AI (particularly one built by certain people) might actually be i...
Regardless of dis/agreement, the guy has a really cool voice: http://www.youtube.com/watch?v=wS6DKeGvBW8&feature=related
On Ben's blog post, I noted that a poll at the 2008 global catastrophic risks conference put the existential risk of machine intelligence at 5% - and that the people attending probably had some of the largest estimations of risk of anyone on the planet - since they were a self-selected group attending a conference on the topic.
"Molecular nanotech weapons" also get 5%. Presumably there's going to be a heavy intersection between those two figures - even though in the paper they seem to be adding them together!
The motivation for the censorship is not to keep the idea from the AGI. It is to keep the idea from you. For your own good.
Seriously. And don't ask me to explain.
As I said, explanations exist. Don't confuse them with actual good understanding, which as far as I know nobody has managed to attain yet.
This comment and your other comments that are being voted down should rather be turned into a top-level post. Some people here seem to be horribly confused about this.
I couldn't agree more, upvoted.
...The idea of provably safe AGI is typically presented as something that would exist within mathematical computation theory or some variant thereof. So that's one obvious limitation of the idea: mathematical computers don't exist in the real world, and real-world physical computers must be interpreted in terms of the laws of physics, and humans' best understanding of the "laws" of physics seems to radically change from time to time. So even if there were a design for provably safe real-world AGI, based on current physics, the relevance of the proo
Again, I don't think this terminology is adequate.
Let's not dwell on terminology, where the denoted concepts remain much more urgently unclear.
[...] SIAI's Scary Idea goes way beyond the mere statement that there are risks as well as benefits associated with advanced AGI, and that AGI is a potential existential risk.
[...] Although an intense interest in rationalism is one of the hallmarks of the SIAI community, still I have not yet seen a clear logical argument for the Scary Idea laid out anywhere. (If I'm wrong, please send me the link, and I'll revise this post accordingly. Be aware that I've already at least skimmed everything Eliezer Yudkowsky has written on related topics.)
So if one wants a clear argument for the Scary Idea, one basically has to construct it oneself.
[...] If you put the above points all together, you come up with a heuristic argument for the Scary Idea. Roughly, the argument goes something like: If someone builds an advanced AGI without a provably Friendly architecture, probably it will have a hard takeoff, and then probably this will lead to a superhuman AGI system with an architecture drawn from the vast majority of mind-architectures that are not sufficiently harmonious with the complex, fragile human value system to make humans happy and keep humans around.
The line of argument makes sense, if you accept the premises.
But, I don't.
Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It), October 29, 2010. Thanks to XiXiDu for the pointer.