[...] SIAI's Scary Idea goes way beyond the mere statement that there are risks as well as benefits associated with advanced AGI, and that AGI is a potential existential risk.

[...] Although an intense interest in rationalism is one of the hallmarks of the SIAI community, still I have not yet seen a clear logical argument for the Scary Idea laid out anywhere. (If I'm wrong, please send me the link, and I'll revise this post accordingly. Be aware that I've already at least skimmed everything Eliezer Yudkowsky has written on related topics.)

So if one wants a clear argument for the Scary Idea, one basically has to construct it oneself.

[...] If you put the above points all together, you come up with a heuristic argument for the Scary Idea. Roughly, the argument goes something like: If someone builds an advanced AGI without a provably Friendly architecture, probably it will have a hard takeoff, and then probably this will lead to a superhuman AGI system with an architecture drawn from the vast majority of mind-architectures that are not sufficiently harmonious with the complex, fragile human value system to make humans happy and keep humans around.

The line of argument makes sense, if you accept the premises.

But, I don't.

Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It), October 29 2010. Thanks to XiXiDu for the pointer.


I tend to think differently on this one.

Wherever I turn my head in this world, I see lost causes everywhere. I see Goodhart's law and Campbell's law on the loose everywhere. I see insane optimizers everywhere. Political parties that concentrate more on show, pomp and campaign funds than on actual issues. Corporations that seek money to the exclusion of actual creation of value. Governments that seek employment and GDP growth even when those are supported by artificial stimuli and not sustainable patterns of production and trade.

One might argue that none of these systems are actually as intelligent as a well educated human at any given moment in time. But that's the point, isn't it? If you're unable to stop sub-human optimizers, how are you going to curb a near human or a super human one?

For me, the scary idea is not so much of an idea as it is an extension of something that is already happening in this world.

Would you mind writing that up as an essay, for use as a LW post, or ideally, as a piece of SIAI literature?
Dear Michael, Without wanting to weasel out of your request, I honestly believe that Eliezer's Lost Purposes post makes the point I want to make very well, much better than I can hope to phrase it without putting in some hard work. The only new point I probably made is that these forces are already on the loose and it is difficult to curb them. However, I will make an effort this weekend and see what I can come up with.
Thanks. I appreciate the effort.
Upvoted. Although I believe that one could also see our cultural and political systems as superhuman collective entities undergoing an evolutionary arms race featuring an anthropocentrically weighted utility-maximizing selection pressure. There is some evidence for this too: to put it bluntly, are we not better off than we were 100 years ago?
I bet you meant lost purposes.

I really liked this comment by FrankAdamek:

With regard to teaching an AI to care: what you can teach a mind depends on the mind. The best examples come from human beings: for hundreds of years many (though not all) parents have taught their children that it is wrong to have sex before marriage, a precept that many people break even when they think they shouldn't and feel bad about it. And that's with our built-in desires for social acceptance and hardware for propositional morality. For another example, you can't train tigers to care about their handlers. No matter how much time you spend with them and care for them, they sometimes bite off arms just because they are hungry. I understand most big cats are like this.

It's quite true that nobody plans to build a system with no concern for human life, but it's also true that many people assume Friendliness is easy.


A particularly troubling quote from the post:

I think the relation between breadth of intelligence and depth of empathy is a subtle issue which none of us fully understands (yet). It's possible that with sufficient real-world intelligence tends to come a sense of connectedness with the universe that militates against squashing other sentiences. But I'm not terribly certain of this, any more than I'm terribly certain of its opposite.

The obvious truth is that mind-design space contains every combination of intelligence and empathy.

I don't find that "truth" either obvious or true. Would you say that "The obvious truth is that mind-design space contains every combination of intelligence and rationality"? How about "The obvious truth is that mind-design space contains every combination of intelligence and effectiveness"? One of my fundamental contentions is that empathy is a requirement for intelligence beyond a certain point because the consequences of lacking it are too severe to overcome.

One of my fundamental contentions is that empathy is a requirement for intelligence beyond a certain point because the consequences of lacking it are too severe to overcome.

Two questions:

1) The consequences for whom?

2) How much empathy do you have for, oh, say, an E. coli bacterium?

Connecting these two questions is left as an exercise for the reader. ;-)


One of my fundamental contentions is that empathy is a requirement for intelligence beyond a certain point because the consequences of lacking it are too severe to overcome.

Human psychopaths are a counterexample to this claim, and they seem to be doing alright in spite of active efforts by the rest of humanity to detect and eliminate them.

'Detect and eliminate' or 'detect and affiliate with the most effective ones'. One or the other. ;)
There are no efforts by the rest of humanity to detect and eliminate the sort of psychopaths who understand it's in their own interests to cooperate with society. The sort of psychopaths who fail to understand that, and act accordingly, typically end up doing very badly.
Why all the focus on psychopaths? It could be said that certain forms of autism are equally empathy-blinded, and yet people along that portion of the spectrum are often hugely helpful to the human race, and get along just fine with the more neurotypical.
No. There are two bad assumptions in your counterexample. They are: 1. Human psychopaths are above the certain point of intelligence that I was talking about. 2. Human psychopaths are sufficiently long-lived for the consequences to be severe enough. Hmmmm. #2 says that I probably didn't make clear enough the importance of the length of interaction. You also appear to have the assumption that my argument is that the AGI fears detection of its unfriendly behavior and any consequences that humanity can apply. Humanity CANNOT apply sufficient negative consequences to a sufficiently powerful AGI. The severe consequences are all missed opportunity costs which means that the AGI is thereby sub-optimal and thereby less intelligent than is possible.
What sort of opportunity costs? The AI can simulate humans if it needs them, for a lower energy cost than keeping the human race alive. So, why should it keep the human race alive?
The underlying disorders of what is commonly referred to as psychopathy are indeed detectable. I also find it comforting that they are in fact disorders and that being evil in this fashion is not an attribute of an otherwise high-functioning mind. Psychopaths can be high-functioning in some areas, but a short interaction with them almost always makes it clear that there is something wrong.
Homosexuality was also a disorder once. Defining something as a sickness or disorder is a matter of politics as much as anything else.
Cat burning was also a form of entertainment once. Defining something as fun or entertainment is a matter of politics as much as anything else. The same goes for friendliness. I fear that once we pinpoint it, it'll be outdated.
What do you mean by psychopathy? At least one sort of no-empathy person is unusually good at manipulating most people.
Everybody who is known to be a psychopath is a bad psychopath, by definition; a skilled psychopath is one who will not let people figure out that he's a psychopath. Of course, this means that the existence of a sufficiently skilled psychopath is, in everyday practice, unprovable and unfalsifiable (at least to the degree that we cannot tell the difference between a good actor and someone genuinely feeling empathy; I suppose you might figure out something by measuring people's brain activity while they watch a torture scene).
Even then it is far from definitive. Experienced doctors, for example, lose a lot of the ability to feel certain kinds of physical empathy - their brains will look closer to a good actor's brain than that of a naive individual exposed to the same stimulus. That's just practical adaptation and good for patient and practitioner alike.
Considering the number of horror stories I've heard about doctors who just don't pay attention, I'm not sure you're right that doctors acting their empathy is good for patients. Cite? I'm curious about where and when that study was done.
Don't know. Never saw it first hand - I heard it from a doctor.
Thanks for your reply, but I think I'm going to push for some community norms for sourcing information from studies, ranging from read the whole thing carefully to heard about it from someone.
Only on lesswrong - we look down our noses at people who take the word of medical specialists.
That doctor almost certainly wasn't speaking out of his specialist knowledge.
You don't have enough information to arrive at that level of certainty. He was not, for example, a general practitioner and I was not a client of his. I was actually working with him in medical education at the time. Come to think of it, bizarrely enough and by pure happenstance that does put the subject into the realm of his specialist knowledge. I don't present that as a reason to be persuaded - I actually think not taking official status, particularly medicine related official status, seriously is a good thing. It is just a reply to your presumption. While I don't expect you to take my (or his) word for anything I also wouldn't expect you to need to. This is exactly the finding I would expect based off general knowledge of human behavior. When people are constantly exposed to stimulus that is emotionally laden they will tend to become desensitized to it. There are whole schools of cognitive therapy based on this fact. If someone has taken on the role of a torturer then their emotional response to witnessing torture will be drastically altered. Either it will undergo extinction or the individual will be crippled with PTSD. This can be expected to apply even more when they fully identify with their role due to, for example, the hazing processes involved in joining military and paramilitary organisations.
Part of what seemed iffy was the claim that it was good for both the patients and the practitioner, when it was correlated (from what you said) with experience, with no mention of quality of care. When someone says their source is "a doctor", what are the odds that it's a researcher specializing in that particular area? Especially when the information is something which could as easily be a fluffy popular report as something clearly related to a specialty? Also, I had a prior from Bernard Siegal which is also intuitively plausible-- that doctors who are emotionally numb around their patients are more likely to burn out. This was likely to have been based on anecdote, but not a crazy hypothesis.
I believe you have a sign error in your last paragraph. Doctors who do not emotionally numb themselves are the ones considered at risk of burning out. I have a friend from one of my T groups who is a physician at M. D. Anderson Cancer Center and she is now working in intensive care where the people are really messed up and people die all the time. She believes genuine loving care for her patients is her duty and makes her a better physician; she was trained to be emotionally numb and she felt like it was an epiphany for herself to rebel against this after a couple of years in her current assignment. I have not asked her if her attitude is obvious to her supervisors. My guess is that it probably is not; I do not think she is secretive about it (although she probably does not go around evangelizing to the other doctors much) but I would think that the other doctors are too preoccupied to observe it. In the book Consciousness and Healing, Larry Dossey M.D. also explicitly discusses the behavioral norm among professional physicians of minimizing emotional involvement, and he argues that this is a bad practice. (That book is not an example of good rational thinking from cover to cover.)
Desensitization to powerful negative emotional reactions is not the same thing as not caring and not building a personal relationship. Most of our default emotional reactions when we are in close contact with others who have physical or emotional injuries aren't exactly optimal for the purpose of providing assistance. Particularly when what the doctor needs to do will cause more pain.
I'll add that at particularly high levels of competence it makes very little difference whether you are a psychopath who has mastered the deception of others or a hypocrite (normal person) who has mastered deception of yourself.
That is probably because you don't share a definition of intelligence with most of those here. Perhaps look through http://www.vetta.org/definitions-of-intelligence/ - and see if you can find your position.
Now if you had suggested that intelligence cannot evolve beyond a certain point unless accompanied by empathy ... that would be another matter. I could easily be convinced that a social animal requires empathy almost as much as it requires eyesight, and that non-social animals cannot become very intelligent because they would never develop language. But I see no reason to think that an evolved intelligence would have empathy for entities with whom it had no social interactions during its evolutionary history. And no a priori reason to expect any kind of empathy at all in an engineered intelligence. Which brings up an interesting thought. Perhaps human-level AI already exists. But we don't realize it because we have no empathy for AIs.
MIT's Leonardo? Engineered super-cuteness!
The most likely location for an "unobserved" machine intelligence is probably the NSA's basement. However, it seems challenging to believe that a machine intelligence would need to stay hidden for very long.
Well, it does contain all those points, but some weird points are weighted much less heavily.

One thing that I think is relevant, in the discussion of existential risk, is Martin Weitzman's "Dismal Theorem" and Jim Manzi's analysis of it. (Link to the article, link to the paper.)

There, the topic is not unfriendly AI, but climate change. Regardless of what you think of the topic, it has attracted more attention than AGI, and people writing about existential risk are often using climate change as an example.

Martin Weitzman, a Harvard economist, deals with the probability of extreme disasters, and whether it's worth it in cost-benefit terms to deal with them. Our problem, in cases of extreme uncertainty, is that we don't only have probability distributions, we have uncertain probability distributions; it's possible we got the models wrong. Weitzman's paper takes this into account. He creates a family of probability distributions, indexed over a certain parameter, and integrates over it -- and he proves that the process of taking "probability distributions of probability distributions" has the result of making the final distribution fat-tailed. So fat-tailed that the integral doesn't converge.

This is a terrible consequence. Because if the PDF of th... (read more)
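
The fat-tail claim in the comment above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Weitzman's actual model: it assumes a zero-mean normal outcome whose variance is itself uncertain and drawn from an inverse-gamma distribution (the function names, parameter values, and the threshold of 5 are all hypothetical choices for illustration). Integrating out the variance in that setup is known to give a Student-t distribution, whose tails fall off polynomially rather than exponentially, so extreme outcomes are far more probable than under a normal with a known variance.

```python
import math
import random


def mixture_tail_fraction(threshold, n_samples=100_000, alpha=1.5, beta=1.5, seed=0):
    """Monte Carlo estimate of P(|X| > threshold) when the model itself is uncertain.

    Assumed (hypothetical) setup: X ~ Normal(0, sigma^2), with
    sigma^2 ~ Inverse-Gamma(alpha, beta). The marginal of X is then a
    Student-t distribution with 2 * alpha degrees of freedom.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # If G ~ Gamma(shape=alpha, scale=1), then beta / G ~ Inverse-Gamma(alpha, beta).
        sigma2 = beta / rng.gammavariate(alpha, 1.0)
        x = rng.gauss(0.0, math.sqrt(sigma2))
        if abs(x) > threshold:
            hits += 1
    return hits / n_samples


def normal_tail_fraction(threshold, sigma=1.0):
    """Exact P(|X| > threshold) for X ~ Normal(0, sigma^2) with a known sigma."""
    return math.erfc(threshold / (sigma * math.sqrt(2.0)))


if __name__ == "__main__":
    t = 5.0
    print("known variance:    ", normal_tail_fraction(t))
    print("uncertain variance:", mixture_tail_fraction(t))
```

The polynomial tail is what makes some integrals blow up: a Student-t variable has no finite moment-generating function, so an expected loss that grows exponentially in the outcome diverges. Whether that divergence should drive policy is exactly what the replies below dispute.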

The talk about uncertainty is indeed a red herring. There are two things going on here:

  1. A linear aggregative (or fast-growing enough in the relevant range) social welfare function makes even small probabilities of existential risk more important than large costs or benefits today. This is the Bostrom astronomical waste point. Weitzman just uses a peculiar model (with agents with bizarre preferences that assign infinite disutility to death, and a strangely constricted probability distribution over outcomes) to indirectly introduce this. You can reject it with a bounded social welfare function like Manzi or Nordhaus, representing your limited willingness to sacrifice for future generations.

  2. The fact that there are many existential risks competing for our attention, and many routes to affecting existential risk, so that spending effort on any particular risk now means not spending that effort on other existential risks, or keeping it around while new knowledge accumulates, etc. Does the x-risk reduction from climate change mitigation beat the reduction from asteroid defense or lobbying for arms control treaties at the current margin? Weitzman addresses this by saying that the risk from surprise catastrophic climate change is much higher than other existential risks collectively, which I don't find plausible.

Is anyone in SIAI making the argument that we should spend more because our models are too uncertain to provide expected costs, or more generally that our very uncertainty of model is a significant source of concern? My impression was more that it's "we have good reasons to doubt people's estimation that Friendliness is easy" and "we have good reason to believe it's actually quite hard."
Fair enough -- this is my caution against the logic "I can think of a risk, therefore we need to worry about it!" It seems that SIAI is making the stronger claim that unfriendliness is very likely. My personal view is that AI is very hard itself, and that working on, say, a computer that can do what a mouse can do is likely to take a long time, and is harmless but very interesting research. I don't think we're anywhere near a point when we need to shut down anybody's current research.
Consider marginal utility. Many people are working on AI, machine learning, computational psychology, and related fields. Nobody is working on preference theory, formal understanding of our goals under reflection. If you want to do interesting research and if you have the background to advance either of those fields, do you think the world will be better off with you on the one side or on the other?
Maybe that's true, but that's a separate point. "Let's work on preference theory so that it'll be ready when the AI catches up" is one thing -- tentatively, I'd say it's a good idea. "Let's campaign against anybody doing AI research" seems less useful (and less likely to be effective.)
But if provable friendliness is hard, wouldn't it be much easier to accomplish with the help of AI? Presumably if the FAI problem can be solved by a few dozen smart human researchers within a few decades, then it can be solved in a year or so by a few dozen not-guaranteed-friendly AGIs-in-a-box with limited IQs in the 180-220 range. The AGIs design an FAI architecture and provide the proof, some smart humans check the proof, and then we build the thing and fasten our seatbelts for the exciting ride as the FAI goes FOOM.
How do you propose to limit their IQs? I'm not asking facetiously; your plan seems reasonable to me, but that's the part that seems the trickiest, and the part that if gotten wrong could lead to accidental early FOOMage.
I have no idea how to limit the IQ of AIs that other people produce without my knowledge. For AIs that I produce myself, I would simply do without closed-loop recursive self-improvement (aka, keep the AI in a box) until I have a proven FAI architecture in hand. I'm reasonably confident that a closed-loop FOOM is impossible until AI "IQ" goes well past the max human level. I am also reasonably confident that closing the recursive self-improvement loop doesn't speed things up much until you reach that level, either. So, if a "Sane AI" project like this one, operating under the slogan of "Open loop until we have a proof" can maintain a technological lead of a year or so over a "Risky AI" project with the slogan "Close the loop - Full speed ahead", then I'm pretty sure it is actually safer than a "Secure FAI" project operating under the slogan "No AGI until we have a proof". Because it has a better chance of establishing and maintaining that technological lead.
Eliezer figures out how to download his own brain. The emulation requires only a small amount of processing speed and memory. With the financial backing of the SIAI, LessWrong readers and wealthy tech businesspeople we create millions of Ems and have each run at 1,000 times the speed that Eliezer runs at. All of the Eliezer Ems immediately work on improving the Ems' code and make huge use of trial and error in which they make some changes to the code of a subset of the Ems and give them intelligence tests, throw out the less intelligent Ems and make many copies of the superior ones. This could give us a singularity in a week.
Your scenario strikes me as laughably overoptimistic. A brain emulation requires only a small amount of processing speed and memory? A story that begins with finding financial backing takes only a week to reach completion? But in any case, this is a closed-loop recursive self-improvement FOOM. I don't doubt that such things are possible. My point was that if you already have a bunch of super-Eliezers, why not have them design a provably-correct FAI, rather than sending them off to FOOM into an uFAI? If they discover the secret of FAI within a year or so, great! If it turns out that provably correct FAI is just a pipe-dream, then maybe we ought to reconsider our plans to close the loop and FOOM.
"A brain emulation requires only a small amount of processing speed and memory?" If software is the bottleneck and computer speed and memory are increasing exponentially then you would expect that by the time the software was available it would use a relatively small amount of computing power. "A story that begins with finding financial backing takes only a week to reach completion?" My story begins with the Eliezer Em. 150,000 people die every day, and money probably becomes useless after a singularity. If enough people understood what was happening we could raise, say, a billion dollars in a few days. Hedge funds, I strongly suspect, do sometimes make billion dollar bets based on information they acquired in the last day. "why not have them design a provably-correct FAI, rather than sending them off to FOOM into an uFAI?" The 150,000 lives a day cost of delay, plus the Eliezer Ems might be competing with other Ems that have less benign intentions.
Hm, so then the issue just becomes how to keep the AI from closing its own loop (i.e. modifying itself in-memory through some security hole it finds). I agree that it seems unlikely to figure out how to do so at a relatively low level of intelligence. On the other hand, it seems like it would be pretty hard to do research on self-improvement without a closed loop; isn't the expectation usually that the self-improvement process won't start doing anything particularly interesting until many iterations have passed? Maybe I'm just misunderstanding your use of the terms. I take it by "open loop" you mean that the AI would seek to generate an improved version of itself, but would simply provide that code back to the researcher rather than running it itself?
Roughly, yes. But I see recursive self-improvement as having a hardware component as well, so "closed loop" also includes giving the AI control over electronics factories and electronic assembly robots. Odd. My expectation for the software-only and architecture-change portion of the self-improvement is that the curve would be the exact opposite - some big gains early by picking off low-hanging fruit, but slower improvement thereafter. It is only in the exponential growth of incorporated hardware that you would get a curve like that which you seem to expect.
Or letting them seize control of ... Not necessarily that hard given the existence of stuxnet.

There is a large, continuous spectrum between making an AI and hoping it works out okay, and waiting for a formal proof of friendliness. Now, I don't think a complete proof is feasible; we've never managed a formal proof for anything close to that level of complexity, and the proof would be as likely to contain bugs as the program would. However, that doesn't mean we shouldn't push in that direction. Current practice in AI research seems to be to publish everything and take no safety precautions whatsoever, and that is definitely not good.

Suppose an AGI is created, initially not very smart but capable of rapid improvement, either with further development by humans or by giving it computing resources and letting it self-improve. Suppose, further, that its creators publish the source code, or allow it to be leaked or stolen.

AI improvement will probably proceed in a series of steps: the AI designs a successor, spends some time inspecting it to make sure the successor has the same values, then hands over control, then repeat. At each stage, the same tradeoff between speed and safety applies: more time spent verifying the successor means a lower probability of error, but a higher probab... (read more)

Most of the companies involved (e.g. Google, James Harris Simons) publish little or nothing relating to their code in this area publicly - and few know what safeguards they employ. The government security agencies potentially involved (e.g. the NSA) are even more secretive.
Simons is an AI researcher? News to me. Clearly his fund uses machine learning, but there is an ocean between that and AGI (besides plenty of funds use ML also, DE Shaw and many others).
Exactly this! I think there is a U-shaped response curve to risk versus rigor. Too little rigor ensures disaster, but too much rigor ensures a low rigor alternative is completed first. When discussing the correct course of action, I think it is critical to consider not just probability of success but also time to success. So far as I've seen arguments in favor of SIAI's course of action have completely ignored this essential aspect of the decision problem.

When he was paraphrasing the reasons:

Human value is fragile as well as complex, so if you create an AGI with a roughly-human-like value system, then this may not be good enough, and it is likely to rapidly diverge into something with little or no respect for human values

... that doesn't seem quite right. The main problem with values being fragile isn't that a "roughly-human-like value system" might diverge rapidly; it's that properly implementing a "roughly-human-like value system" is actually quite hard and most AGI programmers seem to underestimate its complexity, and go for "hacky" solutions, which I find somewhat scary.

Ben seems aware of this, and later goes on to say:

This is related to the point Eliezer Yudkowsky makes that "value is complex" -- actually, human value is not only complex, it's nebulous and fuzzy and ever-shifting, and humans largely grok it by implicit procedural, empathic and episodic knowledge rather than explicit declarative or linguistic knowledge.

... which seems to be one of the reasons to pay extra attention to it (and this also seems to be a reason given by Eliezer, whereas Ben almost presents it as a counterpoint to Eliezer).

Human evaluation of human values under specific instances is everything that Ben says it is (complex, nebulous, fuzzy, ever-shifting, and grokked by implicit rather than explicit knowledge). On the other hand, evaluation of points in the Mandelbrot set by a deterministically moving entity that is susceptible to color-illusions is even more complex, nebulous, fuzzy, and ever-shifting to the extent that it probably can't be grokked at all. Yet, it is generated from two very simple formulae (the second being the deterministic movement of the entity). Eliezer has provided absolutely NO rational arguments (much less proof) that the core of Friendly is complex at all. Further, the fact that ethical mandates within the obviously complex real world (particularly when viewed through the biased eyes of fallible beings) are comprehensible at all would seem an indication that maybe there are just a small number of simple laws underlying them (or maybe only one -- see my comment on Ben's post cross-posted at http://becominggaia.wordpress.com/2010/10/30/ben-goertzel-the-singularity-institutes-scary-idea/ for easy access).
My take on the optimisation target of all self-organising systems: http://originoflife.net/gods_utility_function/ Eliezer Yudkowsky explains why he doesn't like such things: http://lesswrong.com/lw/lq/fake_utility_functions/

For me, the oddest thing about Goertzel's article is his claim that SIAI's arguments are so unclear that he had to construct the argument himself. The way he describes the argument is completely congruent with what I've been reading here.

In any case, his argument that it may not be possible to have provable Friendliness and it makes more sense to take an incremental approach to AGI than to not do AGI until Friendliness is proven seems reasonable.

Has it been demonstrated that Friendliness is provable?


If Goertzel's claim that "SIAI's arguments are so unclear that he had to construct it himself" can't be disproven by the simple expedient of posting a single link to an immediately available well-structured top-down argument then the SIAI should regard this as an obvious high-priority, high-value task. If it can be disproven by such a link, then that link needs to be more highly advertised since it seems that none of us are aware of it.

Paul Crowley
The nearest thing to such a link is Artificial Intelligence as a Positive and Negative Factor in Global Risk [PDF]. But of course the argument is a little large to entirely set out in one paper; the next nearest thing is What I Think, If Not Why and the title shows in what way that's not what Goertzel was looking for.
44 pages. I don't see anything much like the argument being asked for. The lack of an index doesn't help. The nearest thing I could find was this: He also claims that intelligence could increase rapidly with a "dominant" probability. This all seems pretty vague to me.
Is this an official position in the first place? It seems to me that they want to give the impression that - without their efforts - the END IS NIGH - without committing to any particular probability estimate - which would then become the target of critics. Halloween update: It's been a while now, and I think the response has been poor. I think this means there is no such document (which explains Ben's attempted reconstruction). It isn't clear to me that producing such a document is a "high-priority task" - since it isn't clear that the thesis is actually correct - or that the SIAI folks actually believe it. Most of the participants here seem to be falling back on: even if it is unlikely, it could happen, and it would be devastating, so therefore we should care a lot - which seems to be a less unreasonable and more defensible position.
You lost me at that sharp swerve in the middle. Without probabilities attached to the scary idea, it is an absolutely meaningless concept. What if its probability were 1 / 3^^^3, should we still care then? I could think of a trillion scary things that could happen. But without realistic estimates of how likely it is to happen, what does it matter?
Here are some links.
Heh. I've read virtually all those links. I still have the three following problems. 1. Those links are about as internally self-consistent as the Bible. 2. There are some fundamentally incorrect assumptions that have become gospel. 3. Most people WON'T read all those links and will therefore be declared unfit to judge anything. What I asked for was "an immediately available well-structured top-down argument". It would be particularly useful and effective if SIAI recruited someone with the opposite point of view to co-develop a counter-argument thread and let the two revolve around each other and solve some of these issues (or, at least, highlight the base important differences in opinion that prevent them from solution). I'm more than willing to spend a ridiculous amount of time on such a task and I'm sure that Ben would be more than willing to devote any time that he can tear away from his busy schedule.

There are some fundamentally incorrect assumptions that have become gospel.

So go ahead and point them out. My guess is that in the ensuing debate it will be found that 1/4 of them are indeed fundamentally incorrect assumptions, 1/4 of them are arguably correct, and 1/2 of them are not really "assumptions that have become gospel". But until you provide your list, there is no way to know.

3Paul Crowley
Multiple links are not an answer - to be what Goertzel was looking for it has to be a single link that sets out this position.
Yudkowsky calls it "The default case" - e.g. here: ...however, it is not terribly clear what being "the default case" is actually supposed to mean.
Seems plausible to interpret "default case" as meaning "the case that will most probably occur unless steps are specifically taken to avoid it". For example, the default case of knocking down a beehive is that you'll get stung; you avoid that default case by specifically anticipating it and taking countermeasures (i.e. wearing a bee-keeping suit).
So: it seems as though the "default case" of a software company shipping an application would be that it crashes, or goes into an infinite loop - since that's what happens unless steps are specifically taken to avoid it. The term "the default case" seems to be a way of making the point without being specific enough to attract the attention of critics.

So: it seems as though the "default case" of a software company shipping an application would be that it crashes, or goes into an infinite loop - since that's what happens unless steps are specifically taken to avoid it.

Not quite. The "default case" of a software company shipping an application is that there will definitely be bugs in the parts of the software they have not specifically and sufficiently tested... where "bugs" can mean anything from crashes or loops, to data corruption.

The analogy here -- and it's so direct and obvious a relationship that it's a stretch to even call it an analogy! -- is that if you haven't specifically tested your self-improving AGI for it, there are likely to be bugs in the "not killing us all" parts.

I repeat: we already know that untested scenarios nearly always have bugs, because human beings are bad at predicting what complex programs will do, outside of the specific scenarios they've envisioned.

And we are spectacularly bad at this, even for crap like accounting software. It is hubris verging on sheer insanity to assume that humans will be able to (by default) write a self-improving AGI that has to be bug-free from the moment it is first run.
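A toy illustration of the point about untested scenarios, as a minimal sketch only. The `mean` helper below is hypothetical and not taken from any software discussed in this thread; it simply behaves correctly in the scenario its author envisioned and fails in one that was never tested.

```python
def mean(values):
    # Correct for the scenario the author had in mind: a non-empty list of numbers.
    return sum(values) / len(values)


# The envisioned scenario passes its test.
assert mean([2, 4, 6]) == 4

# An unenvisioned scenario (empty input) raises instead of returning a number.
try:
    mean([])
    crashed = False
except ZeroDivisionError:
    crashed = True
```

Nothing in the passing test hints at the failure; it only shows up once someone thinks to try the empty case.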

The idea that a self-improving AGI has to be bug-free from the moment it is first run seems like part of the "syndrome" to me. Can the machine fix its own bugs? What about a "controlled ascent"? etc.

Can the machine fix its own bugs?

How do you plan to fix the bugs in its bug-fixing ability, before the bug-fixing ability is applied to fixing bugs in the "don't kill everyone" routine? ;-)

More to the point, how do you know that you and the machine have the same definition of "bug"? That seems to me like the fundamental danger of self-improving AGI: if you don't agree with it on what counts as a "bug", then you're screwed.

(Relevant SF example: a short story in which the AI ship -- also the story's narrator -- explains how she corrected her creator's all-too-human error: he said their goal was to reach the stars, and yet for some reason, he set their course to land on a planet. Silly human!)

What about a "controlled ascent"?

How would that be the default case, if you're explicitly taking precautions?

Controlled ascent isn't the default case, but it certainly should be what provably friendly AI is weighed against.
It seems as though you don't have any references for the supposed "hubris verging on sheer insanity". Maybe people didn't think that in the first place. Computers regularly detect and fix bugs today - e.g. check out Eclipse. I never claimed "controlled ascent" as being "the default case". In fact I am here criticising "the default case" as weasel wording.
If it has a bug in its utility function, it won't want to fix it. If it has a bug in its bug-detection-and-fixing techniques, you can guess what happens. So, no, you can't rely on the AGI to fix itself, unless you're certain that the bugs are localised in regions that will be fixed.
So: bug-free is not needed - and a controlled ascent is possible. The unreferenced "hubris verging on sheer insanity" assumption seems like a straw man - nobody assumed that in the first place.
I think your analogy is apt. It's a similar argument for FAI; just as a software company should not ship a product without first running it through some basic tests to make sure it doesn't crash, so an AI developer should not turn on their (edit: potentially-FOOMing) AI unless they're first sure it is Friendly.
Well, I hope you see what I mean. If the "default case" is that your next operating system upgrade will crash your computer or loop forever, then maybe you have something to worry about - and you should probably do an extensive backup, with this special backup software I am selling.
It would certainly be the default case for untested operating system upgrades. Whenever I write a program, even a small program, it usually doesn't work the first time I run it; there's some mistake I made and have to go back and fix. I would never ship software that I hadn't at least run on my own to make sure it does what it's supposed to. The problem with that when it comes to AI research, according to singularitarians, is that there's no safe way to do a test run of potentially-FOOMing software; mistakes that could lead to unFriendliness have to be found in some way that doesn't involve running the code, even in a test environment.
That just sounds crazy to me :-( Are these people actual programmers? How did they miss out on having the importance of unit tests drilled into them?
The problem is that running the AI might cause it to FOOM, and that could happen even in a test environment.
How do you get from that observation to the idea that running a complete untested program in the wild is going to be safer than not testing it at all?
No, the proposed solution is to first formally validate the program against some FAI theory before doing any test runs.
This idea is proposed by people with little idea of the value of testing - and little knowledge of the limitations of provable correctness - I presume. In fact, who has supposedly proposed this idea? What did they actually say? Also, you are now talking about performing "test runs". Is that doing testing, now?
The usefulness of testing is beside the point. The argument is that testing would be dangerous. By "testing" I meant "running the code to see if it works", which includes unit testing individual components, integration or functional testing on the program as a whole, or the simple measure of running the program and seeing if it does what it's supposed to. By "doing test runs" I meant doing either of the latter two. I would never ship, or trust in production, a program that had only been subjected to unit tests. This poses a problem for AI researchers, because while unit testing a potentially-FOOMing AI might well be safe (and would certainly be helpful in development), testing the whole thing at once would not be. I think EY's the original person behind a lot of this, but now the main visible proponents seem to be SIAI. Here's a link to the big ol' document they wrote about FAI. On the specific issue of having to formally prove friendliness before launching an AI, I can't find anything specific in there at the moment. Perhaps that notion came from elsewhere? I'm not sure; but, it seems straightforward to me from the premises of the argument (AGI might FOOM, we want to make sure it FOOMs into something Friendly, we cannot risk running the AGI unless we know it will) that you'd have to have some way of showing that an AGI codebase is Friendly without running it, and the only other way I can think of would be to apply a rigorous proof.
Life is dangerous: the issue is surely whether testing is more dangerous than not testing. It seems to me that a likely outcome of pursuing a strategy involving searching for a proof is that - while you are searching for it - some other team makes a machine intelligence that works - and suddenly whether your machine is "friendly" - or not - becomes totally irrelevant. I think bashing testing makes no sense. People are interested in proving what they can about machines - in the hope of making them more reliable - but that is not the same as not doing testing. The idea that we can make an intelligent machine - but are incapable of constructing a test harness capable of restraining it - seems like a fallacy to me. Poke into these beliefs, and people will soon refer you to the AI-box experiment - which purports to explain that restrained intelligent machines can trick human gate keepers. ...but so what? You don't imprison a super-intelligent agent - and then give the key to a single human and let them chat with the machine!
The "default case" occurs when not specifically avoided. The company making the OS upgrade is going to do their best to avoid the computers it's installed on crashing. In fact, they'll probably hire quality control experts to make certain of it. Why should AGI not have quality control?
It definitely should have quality control. The whole point of the 'Scary idea' is that there should be an effective quality control for GAI, otherwise the risks are too big. At the moment humanity has no idea on how to make an effective quality control - which would be some way to check if an arbitrary AI-in-a-box is Friendly. Ergo, if a GAI is launched before Friendly AI problem has some solutions, it means that GAI was launched without a quality control performed. Scary. At least to me.
The default case for a lot of shipped applications isn't to do what they were designed to do, i.e. satisfy the target customer's needs. Even when you ignore the bugs, often the target customer doesn't understand how it works, or it's missing a few key features, or its interface is clunky, or no-one actually needs it, or it's made confusing with too many features nobody cares about, etc. - a lot of applications (and websites) suck, or at least, the first released version does. We don't always see that extent because the set of software we use is heavily biased towards the "actually usable" subset, for obvious reasons. For example, see the debate tools that have been discussed here and are never used by anybody for real debate.
That it's impossible to find a course of action that is knowably good, is not an argument for the goodness of pursuing a course of action that isn't known to be good.
Certainly, but it is an argument for the goodness of pursuing a course of action that is known to have a chance of being good. There are roughly two types of options: 1) A plan that, if successful, will yield something good with 100% certainty, but has essentially 0% chance of succeeding to begin with. 2) A plan that, if successful, may or may not be good, with a non-zero chance of success. Clearly type 2 is a much, much larger class, and includes plans not worth pursuing. But it may include plans worth pursuing as well. If Friendly AI is as hard as everyone makes it out to be, I'm baffled that type 2 plans aren't given more exposure. Indeed, it should be the default, with reliance on a type 1 plan a fall back given more weight only with extraordinary evidence that all type 2 plans are as assuredly dangerous as FAI is impossible.
The argument isn't that we should throw away good plans because there's some small chance of it being bad even if successful. The argument is that the target is small enough that anything but a proof still leaves you with a ~0% chance of getting a good outcome.
You point out a correct statement (2) for which the incorrect argument (1) apparently argues. This doesn't argue for correctness of the argument (1). (A course of action that is known to have a chance of being good is already known to be good, in proportion to that chance (unless it's also known to have a sufficient chance of being sufficiently bad). For AI to be Friendly doesn't require absolute certainty in its goodness, but beware the fallacy of gray.)
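The comparison between the two plan types in this exchange can be made concrete with a little arithmetic. The sketch below is only illustrative: every probability and payoff in it is a made-up assumption, not an estimate anyone in the thread has offered, and the `expected_value` helper is hypothetical. It assumes a failed plan is worth 0, a successful plan that turns out good is worth `value_good`, and a successful plan that turns out bad is worth `value_bad`.

```python
def expected_value(p_success, p_good_given_success, value_good=1.0, value_bad=0.0):
    """Expected value of a plan; a plan that fails outright is assumed to be worth 0."""
    value_if_success = (p_good_given_success * value_good
                        + (1.0 - p_good_given_success) * value_bad)
    return p_success * value_if_success


# Type 1: certainly good if it succeeds, but almost never succeeds (assumed numbers).
type1 = expected_value(p_success=1e-6, p_good_given_success=1.0)

# Type 2: likely to succeed, but only sometimes good (assumed numbers).
type2 = expected_value(p_success=0.5, p_good_given_success=0.1)

# Type 2 again, but now a bad outcome is as bad as a good one is good.
type2_risky = expected_value(p_success=0.5, p_good_given_success=0.1, value_bad=-1.0)
```

With a harmless bad outcome, the type 2 plan comes out ahead under these assumed numbers; once the bad outcome carries a large negative payoff, the ordering flips. Which of these regimes applies is exactly what the commenters disagree about, and the arithmetic does not settle it.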

Much is unclear. I believe this post is a good opportunity to give a roundup of the problem, for anyone who hasn't read the comments thread here:

The risk from recursive self-improvement is either dramatic enough to outweigh the low probability of the event or likely enough to outweigh the probability of other existential risks. This is the idea everything revolves around in this community (it's not obvious, but I believe so). It is an idea that, if true, possibly affects everyone and our collective future, if not the whole universe.

I believe that someone ...

Umm, this is not the SIAI blog. It is "Less Wrong: a community blog devoted to refining the art of human rationality". The idea everything revolves around in this community is what comes after the ':' in the preceding sentence.
* Google site:lesswrong.com "artificial intelligence" 4,860 results
* Google site:lesswrong.com rationality 4,180 results

Besides its history and the logo with a link to the SIAI that you can see in the top right corner, I believe that you underestimate the importance of artificial intelligence and associated risks within this community. As I said, it is not obvious, but when Yudkowsky came up with LessWrong.com it was against the background of the SIAI.
* Google site:lesswrong.com "me" 5,360 results
* Google site:lesswrong.com "I" 7,520 results
* Google site:lesswrong.com "it" 7,640 results
* Google site:lesswrong.com "a" 7,710 results

Perhaps you overestimate the extent to which google search results on a term reflect the importance of the concept to which the word refers. I note that:
* The best posts on 'rationality' are among those that do not use the word 'rationality'*.
* Similar to 'Omega' and 'Clippy', AI is a useful agent to include when discussing questions of instrumental rationality. It allows us to consider highly rational agents in the abstract without all the bullshit and normative dead weight that gets thrown into conversations whenever the agents in question are humans.
Eliezer explicitly forbade discussion of FAI/Singularity topics on lesswrong.com for the first few months because he didn't want discussion of such topics to be the primary focus of the community. Again, "refining the art of human rationality" is the central idea that everything here revolves around. That doesn't mean that FAI and related topics aren't important, but lesswrong.com would continue to thrive (albeit less so) if all discussion of singularity ceased.
You appear to be suggesting that Eliezer should censor presentation of his thoughts on the subject so as to prevent people from having nightmares. Spot the irony! ;) Eliezer asks people for money. That hardly makes him unique. Neither he nor anyone else is obliged to get your permission before they ask for donations in support of their cause. It seems to me that you expect more from the SIAI than you do from other well meaning organisations simply because there is actually a chance that the cause may make a significant long term difference. As opposed to virtually all the rest - those we know are pointless! I rather suspect that if all those demands were met you would go ahead and find new rhetorical demands to make. That quote is out of context. While I do happen to hold Eliezer's behavior in that context in contempt, the way the quote is presented here is misleading. It is not relevant to your replies and only relevant to the topic here by virtue of Eliezer's character. Speak for yourself. I don't have difficulty comprehending the premises, either the ones you have questioned here or the others required to make an adequate evaluation for the purpose of decision making. Neither I nor Eliezer and the SIAI need to force understanding of the Scary Idea upon you for it to be rational for us to place credence on it. The same applies to other readers here. That is not to say that more work producing the documentation of the kind that you describe would not be desirable.
This comment will be downvoted but I hope you people will actually explain yourselves and not just click 'Vote down', every bot can do that. Now that I've slept I read your comment again and I don't see any justification for why it got upvoted even once. I never claimed that EY can't ask for money, you are creating a straw man there. You also do not know what I do expect from other organisations. Further, it is not fallacious to suspect that Yudkowsky has some responsibility if people get nightmares from ideas that he would be able to resolve. If he really believes those things, it is of course his right to proclaim them. But the gist of my comment was meant to inquire about the foundations of those beliefs and stating that it does not appear to me that they are based on evidence which makes it legally right but ethically irresponsible to tell people to worry to such an extent or even not to tell them not to worry. I just don't know how to parse this. I mean what I asked for and I do not ask for certainty here. I'm not doubting evolution and climate change. The problem is that even a randomly picked research paper likely bears more analysis, evidence and references than all of LW and the SIAI's documents together regarding risks posed by recursive self-improvement from artificial general intelligence. The quotes have been relevant as they showed that Yudkowsky clearly believes in his intellectual and epistemic superiority, yet any corroborative evidence seems to be missing. Yes, there is this huge amount of writings on rationality and some miscellaneous musing on artificial intelligence. But given how the idea of risks from AGI is weighted by him, it is just the cherry on top of marginal issues that do not support the conclusions. I don't have difficulty comprehending them either. I'm questioning the propositions, the conclusions drawn and further speculations based on those premises. This is ridiculous. I never said you are forced to explain yourself. You are fo
Yudkowsky is definitely a clever fellow. He may not have fancy qualifications - and he is far from infallible - but he is pretty smart. In the particular post in question, I am pretty sure he was being silly - which is a rather unfortunate time to be claiming superiority. However, I don't really know. The stunt created intrigue, mystery, the forbidden, added to the controversy. Overall, Yudkowsky is pretty good at marketing - and maybe this was a taste of it. I wonder if his Harry Potter fan-fic is marketing - or else how he justifies it.
If you had restrained your claim in that way (ie. not made the claim that I had quoted in the above context) then I would have agreed with you.
I cannot account for every possible interpretation in what I write in a comment. It is reasonable not to infer oughts from questions. I said: That is, if you can't explain yourself why you hold certain extreme beliefs then how is it rational for me to believe that the credence you place on it is justified? The best response you came up with was telling me that you are able to understand and that you don't have to force this understanding onto me to believe in it yourself. That is a very poor argument and that is what I called ridiculous. Even more so as people voted it up, which is just sad. I thought this had been sufficiently clear from what I wrote before.
And it is at this point in the process that an accomplished rationalist says to himself, "I am confused", and begins to learn. My impression is that you and Wedrifid are talking past each other. You think that you both are arguing about whether uFAI is a serious existential risk. Wedrifid isn't even concerned with that. He is concerned with "process questions" - with the analysis of the dialog that you two are conducting, rather than the issue of uFAI risk. And the reason he is being upvoted is because this forum, believe it or not, is a process question forum. It is about rationality, not about AI. Many people here really aren't that concerned about whether Goertzel or Yudkowsky has a better understanding of uFAI risks. They just have a visceral dislike of rhetorical questions. If you want to see the standard arguments in favor of the Scary Idea, follow Louie's advice and read the papers at the SIAI web site. But if you find those arguments unsatisfactory (and I suspect you will) exercise some care if you come looking for a debate on the question here on Less Wrong. Because not everyone who engages with you here will be engaging you on the issue that you want to talk about.
I am somewhat more interested in understanding why Goertzel would say what he says about AI. Just saying 'Goertzel's brain doesn't appear to work right' isn't interesting. But the Hansonian signalling motivations behind academic posturing are more so.
Well said. (Although to be more precise I don't have a visceral dislike of rhetorical questions per se. It is the use of rhetoric to subvert reason that produces the visceral reaction, not the rhetoric(al question) itself.)
I was too lazy to write this up again, it's copy and paste work so don't mind some inconsistencies. Regarding the quotes, I think that EY seriously believes what he says in the given quotes, otherwise I wouldn't have posted them. I'm not even suggesting that it isn't true, I actually allow for the possibility that he is that smart. But I want to know what I should do and right now I don't see any good arguments. I'm a supporter and donor and what I'm trying to do here is coming up with the best possible arguments to undermine the credence of the SIAI. Almost nobody else is doing that, so I'm trying my best here. This isn't damaging, this is helpful. Because once you become really popular, people like P.Z. Myers and other much more eloquent and popular people will pull you to pieces if you can't even respond to my poor attempt at being a devil's advocate. I don't even know where to start here, so I won't. But I haven't come across anything yet that I had trouble understanding. See that woman with red hair? Well, the cleric told me that he believes that she's a witch. But he'll update on evidence if the fire didn't consume her. I said red hair is insufficient data to support that hypothesis and take such extreme measures to test it. He told me that if he came up with more evidence like sorcery I'd just go ahead and find new rhetorical demands. I'm not against free speech and religious freedom but that also applies for my own thoughts on the subject. I believe he could do much more than censoring certain ideas, namely show that they are bogus.
I'm not a big fan of Eliezer, but that complaint strikes me as completely unfair. There is far less censorship here than at a typical moderated blog. And EY does expend some effort showing that various ideas are bogus. I'm not an insider, or even old-timer, but I have reason to believe that the one single forbidden subject here is censored not because it is believed to be valid or bogus, nor because it casts a bad light on EY and SIAI, but rather because discussing it does no good and may do some harm - something a bit like a ban on certain kinds of racist offensive speech, but different. And in any case, the "forbidden idea" can always be discussed elsewhere, assuming you can even find anyone that can become interested in the idea elsewhere. The reach of EY's "censorship" is very limited.
[See context for implied meaning if the excerpt isn't clear]. I claimed approximately the same thing that you say yourself below. I've got nothing against the Devil, it's the Advocacy that is mostly bullshit. Saying you are 'Devil's Advocate' isn't an excuse to use bad arguments. That would be an insult to the Devil! You conveyed most of your argument via rhetorical questions. To the extent that they can be considered to be in good faith (and not just verbal tokens intended to influence) some of them only support the position you used them for if you genuinely do not understand them (implying that there is no answer). I believe I quoted an example in the context. Making an assertion into a question does not give a license to say whatever you want with no risk of direct contradiction. (Even though that is how the tactic is used in practice.) More concise answer: Then don't ask stupid questions!
I'm probably too tired to parse this right now. I believe there probably is an answer, but it is buried under hundreds of posts about marginal issues. All those writings on rationality, there is nothing I disagree with. Many people know about all this even outside of the LW community. But what is it that they don't know that EY and the SIAI know? What I was trying to say is that if I have come across it then it was not convincing enough to take it as seriously as some people here obviously do. It looks like I'm not alone. Goertzel, Hanson, Egan and lots of other people don't see it either. So what are we missing, what is it that we haven't read or understood?
Goertzel: I could and will list the errors I see in his arguments (if nobody there has done so first). For now I'll just say his response to claim #2 seems to conflate humans and AIs. But unless I've missed something big, which certainly seems possible, he didn't make his decision based on those arguments. They don't seem good enough on their face to convince anyone. For example, I don't think he could really believe that he and other researchers would unconsciously restrict the AI's movement in the space of possible minds to the safe area(s), but if we reject that possibility some version of #4 seems to follow logically from 1 and 2. Egan: don't know. What I've seen looks unimpressive, though certainly he has reason to doubt 'transhumanist' predictions for the near future. (SIAI instead seems to assume that if humans can produce AGI, then either we'll do so eventually or we'll die out first. Also, that we could produce artificial X-maximizing intelligence more easily than we can produce artificial nearly-any-other-human-trait, which seems likely based on the tool I use to write this and the history of said tool.) Do you have a particular statement or implied statement of his in mind? Hanson: maybe I shouldn't point any of this out, but EY started by pursuing a Heinlein Hero quest to save the world through his own rationality. He then found himself compelled to reinvent democracy and regulation (albeit in a form closely tailored to the case at hand and without any strict logical implications for normal politics). His conservative/libertarian economist friend called these new views wrongheaded despite verbally agreeing with him that EY should act on those views. Said friend also posted a short essay about "heritage" that allowed him to paint those who disagreed with his particular libertarian vision as egg-headed elitists.
Where did you get those quotes from? References?
He wasn't quoting Goertzel, Egan, and Hanson - though his formatting made it look like he was. He was commenting on your claim that these three "don't see it".
Whoops, I'm sorry, never mind.
Sorry, I don't know what quotes you mean. You can find a link to the "heritage" post in the wiki-compilation of the debate. Though perhaps you meant to reply to someone else?
Never mind, I just skimmed over it and thought you were quoting someone. If you delete your comment I'll delete this one. I'll read your original comment again now.
I don't think I used a bad argument, otherwise I wouldn't have done it. Wow, you overestimate my education and maybe intelligence here. I have no formal education except primary school. I haven't taken a rhetoric course or something. I honestly believe that what I have stated would be the opinion of a lot of educated people outside of this community if they came across the arguments on this site and by the SIAI. That is, data and empirical criticism are missing given the extensive use of the idea that is AI going FOOM to justify all kinds of further argumentation.
"Rhetorical question" is just the name. Asking questions to try to convince people rather than telling them outright is something most people pick up by the time they are 8. I think this is true. This isn't. That is, the 'that is' doesn't fit. What educated people will think really isn't determined by things like the below. (People are stupid, the world is mad, etc) I agree with this. Well, not the 'empirical' part (that's hard to do without destroying the universe.)
Indeed, what an irony...

I'm fighting against giants here. Someone who only mastered elementary school. I believe it should be easy to refute my arguments or show me where I am wrong, point me to some documents I should read up on. But I just don't see that happening. I talk to other smart people online as well, that way I was actually able to overcome religion. But seldom have there been people less persuasive than you when it comes to risks associated with artificial intelligence and the technological singularity. Yes, maybe I'm unable to comprehend it right now, I grant you that. Whatever the reason, I'm not convinced and will say so as long as it takes. Of course you don't need to convince me, but I don't need to stop questioning either.

Here is a very good comment by Ben Goertzel that pinpoints it:

This is what discussions with SIAI people on the Scary Idea almost always come down to!

The prototypical dialogue goes like this.

SIAI Guy: If you make a human-level AGI using OpenCog, without a provably Friendly design, it will almost surely kill us all.

Ben: Why?

SIAI Guy: The argument is really complex, but if you read Less Wrong you should understand it

Ben: I read the Less Wrong blog posts. Isn't there somewhere that the argument is presented formally and systematically?

SIAI Guy: No. It's really complex, and nobody in-the-know had time to really spell it out like that.

I don't know if there is a persuasive argument about all these risks. The point of all this rationality-improving blogging is that when you debug your thinking, when you can follow long chains of reasoning and feel certain you haven't made a mistake, when you're free from motivated cognition - when you can look where the evidence points instead of finding evidence that points where you're looking! - then you can reason out the risks involved in recursively self-improving self-modifying goal-oriented optimizing processes.
My argument is fairly simple - If humans found it sufficiently useful to wipe chimpanzees off the face of the earth, we could and would do so. The level of AI I'm discussing is at least as much smarter than us as we are than chimpanzees.
Updated it without the quotes now so people don't get unnecessarily distracted.
Could I ask you to post the quotes as a separate post? They are priceless (and I'd love to be able to see what they applied to -- so please include the references as well).
I should add, don't get a wrong impression from those quotes. I still believe he might actually be that smart. He's at least the smartest person I know of by what I've read. Except when it comes to public relations. You shouldn't say those things if you do not explain yourself sufficiently at the same time.
Here is some stuff EY uttered for real:
* People don't know these things until I explain them! (Reference)
* You will soon learn that your smart friends and favorite SF writers are not remotely close to the rationality standards of Less Wrong, and you will no longer think it anywhere near as plausible that their differing opinion is because they know some incredible secret knowledge you don't. (Reference)
* So take my word for it, I know more than you do, no really I do, and SHUT UP. (Reference)
The first two, well the context is there, just click 'Parent'. The third is from something that has now been deleted. I can't go into detail but can send you a PM if you want.
Now I'm curious what they were, and where they came from. Distract me, but in a sub-thread.

Good article. Thx for posting. I agree with much of it, but ...

Goertzel writes:

I do see a real risk that, if we proceed in the manner I'm advocating, some nasty people will take the early-stage AGIs and either use them for bad ends, or proceed to hastily create a superhuman AGI that then does bad things of its own volition. These are real risks that must be thought about hard, and protected against as necessary. But they are different from the Scary Idea.

Is this really different from the Scary Idea?

I've always thought of this as part of the Scary Id...

It seems different to me. If I believe "X is incredibly useful but someone might use it to destroy the world," I can conclude that I should build X and take care to police the sorts of people who get to use it. But if I believe "X is incredibly useful but its very existence might spontaneously destroy the world" then that strategy won't work... it doesn't matter who uses it. Maybe there's another way, or maybe I just shouldn't build X, but regardless of the solution it's a different problem. It's like the difference between believing that nuclear weapons might some day be directed by humans to overthrow civilization, and believing that a nuclear reaction will cause all of the Earth's atmosphere to spontaneously ignite. In the first case, we can attempt to control nuclear weapons. In the second case, we must prevent nuclear reactions from ever starting. Just to be clear: I'm not championing a position here on what sort of threat AGI's pose. I'm just saying that these are genuinely different threat models.
The "uFAI abyss"? Does that have something to do with the possibility of a small group of "idiots" - who were nonetheless smart enough to beat everyone else to machine intelligence - overthrowing the world's governments?

Ben's post states,

Finally, I note that most of the other knowledgeable futurist scientists and philosophers, who have come into close contact with SIAI's perspective, also don't accept the Scary Idea. Examples include Robin Hanson, Nick Bostrom and Ray Kurzweil.

Is there a reference for Bostrom's position on AGI-without-FAI risk? Is Goertzel correct here?

He wrote Ethical Issues in Advanced Artificial Intelligence, which does caution against non-friendly AGI:
The question is not whether Bostrom urges caution (which Goertzel and many others also urge), but whether Bostrom agrees that the Scary Idea is true -- that is, whether projects like Ben's and others will probably end the human race if developed without a pre-existing FAI theory, and whether the only (or most promising) way to not incur extremely high risk of wiping out humanity is to develop FAI theory first.
Right, forgot about that.

Although Goertzel is no longer on the Team page of the SIAI site, his profile on the Advisors page states that

Ben Goertzel, Ph.D., is SIAI Director of Research, responsible for overseeing the direction of the Institute's research division.

I assume this is an oversight, left unchanged from before. (Edit: Fixed!)

Also, on the Research areas page, areas 4 "Customization of Existing Open-Source Projects" and 6 "AGI Evaluation Mechanisms" are distinctly of AGI-without-FAI nature, from Goertzel's project.

Goertzel's article seems basically reasonable to me. There were some mis-statements that I can excuse at the very end, because by that point part of his argument was that certain kinds of hyperbole came up over and over and his text was mimicking the form of the hyperbolic arguments even as it criticized them. The grandmother line and IQ-obsessed aliens spring to mind :-P

Given his summary of the "Scary AGI Thesis"...

If someone builds an advanced AGI without a provably Friendly architecture, probably it will have a hard takeoff, and then probab

...

Ben Goertzel also says "If one fully accepts SIAI's Scary Idea, then one should not work on practical AGI projects..." Here is another recent quote that is relevant:

What I find a continuing source of amazement is that there is a subculture of people half of whom believe that AI will lead to the solving of all mankind's problems (which we might call Kurzweilian S^) and the other half of which is more or less certain (75% certain) that it will lead to annihilation. Let's call the latter the SIAI S^.

Yet you SIAI S^ invite these proponents of global

...

The Al Gore hypocrisy claim is misleading. Global warming changes the equilibrium sea level, but it takes many centuries to reach that equilibrium (glaciers can't melt instantly, etc). So climate change activists like to say that there will be sea level rises of hundreds of feet given certain emissions pathways, but neglect to mention that this won't happen in the 21st century. So there's no contradiction between buying oceanfront property only slightly above sea level and claiming that there will be large eventual sea level increases from global warming.

The thing to critique would be the misleading rhetoric that gives the impression (by mentioning that the carbon emissions by such and such a date will be enough to trigger sea level rises, but not mentioning the much longer lag until those rises fully occur) that the sea level rises will happen mostly this century.

Regarding Hughes' point, even if one thinks that an activity has harmful effects, that doesn't mean that a campaign to ban it won't do more harm than good. That would essentially be making bitter enemies of several of the groups (AI academia and industry) with the greatest potential to reduce risk, and discredit the whole... (read more)

Back in July I wrote this as a response to Hughes' comment: I'm aware of that argument and also the other things you mentioned and don't think they are reasonable. I've written about it before but deleted my comments as they might be very damaging to the SIAI. I'll just say that there is no argument against active measures if you seriously believe that certain people or companies pose existential risks. Hughes' comment just highlights an important observation; that doesn't mean I support the details. Regarding Al Gore: What it highlights is how what the SIAI says and does is as misleading as what Al Gore does. It doesn't mean that it is irrational but that people draw conclusions like the one Hughes did based on this superficially contradictory behavior.

Thanks for the original pointer go to Kevin.

Key points, some of which I already mentioned in the post Should I believe what the SIAI claims?:

Yes, you may argue: the Scary Idea hasn't been rigorously shown to be true… but what if it IS true?

OK but ... pointing out that something scary is possible, is a very different thing from having an argument that it's likely.

The Scary Idea is certainly something to keep in mind, but there are also many other risks to keep in mind, some much more definite and palpable.


Also, there are always possibilities l

...
Indeed. Companies illustrate this. They are huge, superhuman powerful entities too.
A major upvote for this. The SIAI should create a sister organization to publicize the logical (and exceptionally dangerous) conclusion to the course that corporations are currently on. We have created powerful, superhuman entities with the sole top-level goal (required by LAW in for-profit corporations) of "Optimize money acquisition and retention". My personal and professional opinion is that this is a far more immediate (and greater) risk than UnFriendly AI.
Companies are probably the number 1 bet for the type of organisation most likely to produce machine intelligence - with number 2 being governments. So, there's a good chance that early machine intelligences will be embedded into the infrastructure of companies. So, these issues are probably linked. Money is the nearest global equivalent of "utility". Law-abiding maximisation of it does not seem unreasonable. There are some problems where it is difficult to measure and price things, though.
On the other hand, maximization of money, including accurate terms for expected financial costs of legal penalties, can cause remarkably unreasonable behavior. As was repeated recently, "It's hard for the idea of an agent with different terminal values to really sink in", in particular "something that could result in powerful minds that actually don't care about morality". A business that actually behaved as a pure profit maximizer would be such an entity.
Morality is represented by legal constraints. That results in a "negative" morality, and - arguably - not a very good one. Fortunately companies are also subject to many of the same forces that produce cooperation and niceness in the rest of biology - including reputations, reciprocal altruism and kin selection.
Algorithmic trading is indeed an example of the kind of risks posed by complicated (unmanageable) systems, but it also shows that we evolve our security measures with each small-scale catastrophe. There is no example yet of an existential risk from true runaway technological development, although many people believe there are such risks, e.g. nuclear weapons. Unstoppable recursive self-improvement is just a hypothesis that you shouldn't take as a foundation for a whole lot of further inductions. Dispelling Stupid Myths About Nuclear War
Apparently I don't understand what you mean by "serious risk". (Before I pick this apart, by the way, I agree that we should try not to Godwin people -- because I think it doesn't work.) I consider it likely that AGI will take a long time to develop. A rational species would likely figure out the flaw and take corrective steps by then. But look around you. Nearly all of us seem to agree, if you look at what we actually want according to our actions, that we should try to prevent an asteroid strike that might destroy humanity. As far as I can tell we haven't started yet. No doubt you can think of other examples: the evidence says that if we put off FAI theory 'until we need it', we could easily put it off longer than that.

Check out SIAI's publications page. Kaj's most recent paper (published at ECAP '10) is a good 2 page summary of why AGI can be an x-risk for anyone who is uninformed of SIAI's position:

"From mostly harmless to civilization-threatening: pathways to dangerous artificial general intelligences"

A recent paper showed that 'Striatal Volume Predicts Level of Video Game Skill Acquisition'. A valid inference would be that an AGI with the computational equivalent of a higher striatal volume would possess superior cognitive flexibility, at least when it comes to gaming. But what could it accomplish? I'm playing a game called Trackmania; it is an arcade racing game. The top players are so close to the ideal line, and therefore the fastest time, that a superhuman AI could indeed beat them but only by a few milliseconds. Each millisecond less might demand an order of magnitude more skill, but that doesn't matter. First of all, there is an absolute limit. Secondly, it doesn't provide a serious advantage. And that may very well be the case with physics too. There is no guarantee that faster thinking or increased working memory capacity will ever yield anything genuine without a lot of dumb luck, if at all. It is unlikely that a superhuman AI would come up with faster-than-light propulsion or that it would disprove Gödel's incompleteness theorems. Of course, we should be careful. And it is absolutely justified that an organisation like the SIAI gets money to do research on those questions. But there is not enough evidence to outweigh the doubt and so justify impeding AI research. We will actually need research on real AGI to answer some of the open questions. Regarding self-improvement I'm very doubtful too. The human indecision and fuzziness of thinking might very well be a feature. A superhuman AI might very well beat us at Go or the stock exchange, as long as it deals with its own kind and not the irrational agents that we are, but that doesn't mean it will be able to deal with natural problems orders of magnitude more efficiently than we do. Most of the risks from superhuman AI are associated with advanced nanotechnology. Without it, it will be impotent. Can it solve it, if it is possible at all? Can it implement its results if it can solve it, if it is p
Kaj's paper relies very heavily on Omohundro's paper from AGI '08. Check out the reply that I presented/published at BICA '08 which (among other things) summarizes why the assumptions that Kaj relies upon are probably incorrect: Discovering the Foundations of a Universal System of Ethics
Two things surprised me in your argument. One is that you seemed to assume that features of human ethics (which you attribute to our having evolved as social animals) would be universal in the sense that they would also apply to AIs which did not evolve and which aren't necessarily social. The second is that although you pay lip service to game theory, you don't seem to be aware of any game theoretic research on ethics deeper than Axelrod (1984) and the Tit-for-Tat experiments. You ought to at least peruse Binmore's "Natural Justice", even if you don't want to plow through the two volumes of "Game Theory and the Social Contract".
From a quick read, it seems to rely on the assumption that a superhuman AI couldn't rely on its ability to destroy humanity.
Not really - the paper is about ways by which an AGI might become more powerful than humanity (corresponding to premise 3 in Ben's reconstructed version of the SIAI argument). You can combine it with Omohundro-like arguments, and I do briefly mention that connection in the conclusions, but the core content of the paper is an independent and separate issue from AI drives, universal ethics or any such issue.
Omohundro's paper was about The Basic AI Drives. The abstract says: " We identify a number of “drives” that will appear in sufficiently advanced AI systems of any design". Social drives are arguably not very "basic" - since they only show up in social situations. I'm sure such machines would also have a "drive to swim" - if immersed in water - and a "drive to escape" - if encased by crushing jaws - but these "drives" were judged not sufficiently "basic" to go into Omohundro's paper.
Doesn't that assume what it is trying to prove - by starting out with: "The main reason to be worried about greater-than-human intelligence is because it is hard for humans to anticipate and control." ...? From the perspective of technological determinism, "controlling" the machines should probably not be our aim. Our more plausible options are more along the lines of joining with them - or being interesting enough to keep around in their historical simulations.
To me it rather looks like the paper in question is trying to give a summary of conclusions that follow from the premise that greater-than-human intelligence is possible. I'm not averse to any of the mentioned possibilities, but I'm wary of using inferences derived from reasonable but unproven hypotheses as foundations for further speculative thinking. Although the paper does a good job of stating reasons to justify the existence of, and support for, an organisation such as the SIAI, it does not substantiate the initial premise to an extent that one could draw conclusions about the probability of the associated risks. Nevertheless such estimations are given, such as that there is a high likelihood of humanity's demise given that we develop superhuman artificial general intelligence without first defining mathematically how to prove the benevolence of the former. This I believe is an unsatisfactory conclusion, as it lacks justification. This is not to say that it is wrong to state probability estimations and update them given new evidence, but that they are not compelling and therefore should not be used to justify any mandatory actions regarding research on artificial intelligence. Those ideas can, however, very well serve as an urge to caution.

Having such beliefs with absolute certainty is incorrect, we don't have sufficient understanding for that, but weak beliefs multiplied by astronomical value lead to the same drastic actions, whose cost-benefit analysis doesn't take notice of small inconveniences such as being perceived to be crazy.

The Unabomber performed some "drastic actions". I expect he didn't mind if he was "perceived to be crazy" by others - although he didn't want to plead insanity.

Does astronomical value outweigh astronomically low probability? You can come up with all kinds of scenarios that bear astronomical value, an astronomical number of scenarios if you allow for astronomically low probability. Isn't this betting on infinity?

I would like to explore Ben's reasons for rejecting the premises of the argument.

I think the first of the above points is reasonably plausible

He offers the possibility that intelligence might cause or imply empathy; I feel that although we see that connection when we look at all of Earth's creatures, correlation doesn't imply causation, so that (intelligence AND empathy) doesn't mean (intelligence IMPLIES empathy) - it probably means (evolution IMPLIES intelligence AND empathy) and we aren't using natural selection to build an AI.

I doubt human value

...
It does seem like a pretty different thing to me. A lot of things are possible, but only a few are likely.
Yep. The rule is not "bet on what is most likely" but rather "bet on positive expected values" and if something is possible and has a large value, then if the math comes out in favour, you ought to bet on it. Goertzel is making the argument that since it's unlikely, we should not bet on it.
He doesn't seem to be. Here's the context: He doesn't seem to be making the argument you describe anywhere near the cited quote.
Say your options are: Stop and develop Friendly theory, or continue developing AI. In the second option the utility of A, continuing AI development, is one utilon, and B, the end of the existence of at least humanity and possibly the whole universe, is negative one million utilons. The Scary Idea in this context is that the probability of B is 1%, so that the utility of the second option is negative 9999 utilons. If Ben 'keeps it in mind', such that the probability that the Scary Idea is right is 1% (reasonable - only one of his rejections has to be right to knock out one premise, and we only need to knock out one premise to bring the Scary Idea down), then Ben's expected utility is now negative 99 utilons. I conclude that he isn't keeping the Scary Idea in mind. His whole post is about not accepting the Scary Idea; for that phrase ("pointing out that something scary is possible, is a very different thing from having an argument that it's likely") to support his position and not work against him, he would have to be rejecting the premises purely on their low probability, without considering the expected value. Hence, the argument that since it's unlikely, we should not bet on it. Edit for clarity: A and B are the exclusive, exhaustive outcomes of continuing AI development. Stopping to develop Friendly theory has zero utilons.
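For anyone who wants to check the arithmetic in the comment above, here is a minimal sketch of the calculation. The utilities and probabilities are the ones given there; the function and constant names are made up for illustration, and the second case assumes (as the comment implicitly does) that catastrophe has probability zero if the Scary Idea is wrong.

```python
# Numbers are taken from the comment above; all names are illustrative only.
U_CONTINUE_OK = 1.0            # outcome A: AI development continues, no catastrophe
U_CATASTROPHE = -1_000_000.0   # outcome B: the end of at least humanity
U_STOP = 0.0                   # stop and develop Friendly theory first


def expected_utility_continue(p_catastrophe: float) -> float:
    """Expected utility of continuing AI development.

    A and B are treated as exclusive and exhaustive outcomes.
    """
    return (1.0 - p_catastrophe) * U_CONTINUE_OK + p_catastrophe * U_CATASTROPHE


# Case 1: the Scary Idea is accepted outright, so P(B) = 1%.
eu_accept = expected_utility_continue(0.01)

# Case 2: the Scary Idea itself only gets 1% credence, so P(B) = 1% of 1%.
# Assumption: P(B) = 0 if the Scary Idea is wrong.
eu_keep_in_mind = expected_utility_continue(0.01 * 0.01)

print(f"accept the Scary Idea:      {eu_accept:.2f} utilons")        # roughly -9999
print(f"1% credence in Scary Idea:  {eu_keep_in_mind:.2f} utilons")  # roughly -99
print(f"stop and develop FAI first: {U_STOP:.2f} utilons")
```

Under these particular numbers both cases come out below the zero utilons assigned to stopping, which is the point the comment is making. The conclusion is only as good as the assumed payoffs and probabilities.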
Ah, Pascal's wager. And here I thought that I wouldn't be seeing it anymore, after I started hanging out with atheists.

The problem with Pascal's Wager isn't that it's a Wager. The problem with Pascal's Wager and Pascal's Mugging (its analogue in finite expected utility maximization), as near as I can tell, is that if you do an expected utility calculation including one outcome that has a tiny probability but enough utility or disutility to weigh heavily in the calculation anyway, you need to include every possible outcome that is around that level of improbability, or you are privileging a hypothesis and are probably making the calculation less accurate in the process. If you actually are including every other hypothesis at that level of improbability, for instance if you are a galaxy-sized Bayesian superintelligence who, for reasons beyond my mortal mind's comprehension, has decided not to just dismiss those tiny possibilities a priori anyway, then it still shouldn't be any problem; at that point, you should get a sane, nearly-optimal answer.
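To make the "privileging a hypothesis" point concrete, here is a minimal sketch with entirely hypothetical numbers. It compares an expected utility calculation that singles out one tiny-probability, huge-disutility outcome with one that also includes an equally improbable outcome of opposite sign. The exact cancellation is an assumption chosen for illustration; real hypotheses at the same level of improbability need not cancel so neatly.

```python
def expected_utility(baseline_utility, tail_outcomes):
    """Expected utility of a decision.

    tail_outcomes is a list of (probability, utility) pairs for unlikely
    outcomes; the remaining probability mass goes to the baseline outcome.
    """
    p_tail = sum(p for p, _ in tail_outcomes)
    tail_term = sum(p * u for p, u in tail_outcomes)
    return (1.0 - p_tail) * baseline_utility + tail_term


# Hypothetical numbers, chosen only for illustration.
BASELINE = 10.0   # utility of the ordinary, overwhelmingly likely outcome
P_TINY = 1e-12    # "mugger-level" improbability
HUGE = 1e15       # magnitude of the claimed stakes

# Only one far-fetched hypothesis is considered: it dominates the sum.
privileged = expected_utility(BASELINE, [(P_TINY, -HUGE)])

# An equally far-fetched hypothesis of opposite sign is included too.
balanced = expected_utility(BASELINE, [(P_TINY, -HUGE), (P_TINY, +HUGE)])

print(f"one privileged hypothesis: {privileged:.2f}")
print(f"both hypotheses included:  {balanced:.2f}")
```

With only the privileged hypothesis, the result is driven almost entirely by the far-fetched outcome; with its mirror image included, the result is back near the baseline.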

So, is this situation a Pascal's Mugging? I don't think it is. 1% isn't at the same level of ridiculous improbability as, say, Yahweh existing, or the mugger's threat being true. 1% chances actually happen pretty often, so it's both possible and prudent to tak...

Excellent analysis. In fairness to Pascal, I think his available evidence at the time should have led him to attribute more than a 1% chance to the Christian Bible being true.
Indeed. Before Darwin, design was a respectable-to-overwhelming hypothesis for the order of the natural world. ETA: On second thought, that's too strong of a claim. See replies below.
Is that true? If we went back in time to before Darwin and gave a not-already-religious person (if we could find one) a thorough rationality lesson — enough to skillfully weigh the probabilities of competing hypotheses (including enough about cognitive science to know why intelligence and intentionality are not black boxes, must carry serious complexity penalties, and need to make specific advance predictions instead of just being invoked as "God wills it" retroactively about only the things that do happen), but not quite enough that they'd end up just inventing the theory of evolution themselves — wouldn't they conclude, even in the absence of any specific alternatives, that design was a non-explanation, a mysterious answer to a mysterious question? And even imagining that we managed to come up with a technical model of an intelligent designer, specifying in advance the structure of its mind and its goal system, could it actually compress the pre-Darwin knowledge about the natural world more than slightly?
Dawkins actually brings this up in The Blind Watchmaker (page 6 in my copy). Hume is given as the example of someone who said "I don't have an answer" before Darwin, and Dawkins describes it as such:
Hume's Dialogues Concerning Natural Religion are definitely worth a read. And I think that Dawkins has it right: Hume really wanted a naturalistic explanation of apparent design in nature, and expected that such an explanation might be possible (even to the point of offering some tentative speculations), but he was honest enough to admit that he didn't have an explanation at hand.
As pointed out below, Hume is a good counterexample to my thesis above.
On the other hand, there wasn't a whole lot of honest, systematic searching for other hypotheses before Darwin either.
I didn't really mean because of Darwin. Design is not a competitor to the theory of evolution. Evolution explains how complexity can increase. Design [ADDED: as an alternative to evolution] does not; it requires a designer that is assumed to be more complicated than the things it designs. Design explains nothing.
Designers can well design things more complicated than they are. (If even evolution without a mind can do so, designers do that easily.)
Agree. One way to look at it is that a designer can take a large source of complexity (whatever its brain is running on) and reshape and concentrate it into an area that is important to it. The complexity of the designer itself isn't important. Evolution does much the same thing.
I thought that the advance of scientific knowledge is an evolutionary process?
It is, literally. Although the usage of the term 'evolution' in this context has itself evolved such that it has a different, far narrower meaning here.
The term "evolution" usually means what it says in the textbooks on the subject. They essentially talk about changes in the genetic makeup of a population over time. Science evolves in precisely that sense - e.g. see: http://en.wikipedia.org/wiki/Dual_inheritance_theory
I stand by my statement, leaving it unchanged.
Don't see how this remark is relevant, but here's a reply: http://lesswrong.com/lw/l6/no_evolutions_for_corporations_or_nanodevices/
The main point of that post is clearly correct, but I think the example of corporations is seriously flawed. It fails to appreciate the extent to which successful business practices consist of informal, non-systematic practical wisdom accumulated through long tradition and selected by success and failure in the market, not conscious a priori planning. The transfer of these practices is clearly very different from DNA-based biological inheritance, but it still operates in such ways that a quasi-Darwinian process can take place. Applying similar analysis to modern science would be a fascinating project. In my opinion, a lot of the present problems with the proliferation of junk science stem not from intentional malice and fraud, but from a similar quasi-Darwinian process fueled by the fact that practices that best contribute to one's career success overlap only partly with those that produce valid science. (And as in the case of corporations, the transfer of these practices is very different from biological inheritance, but still permits quasi-Darwinian selection for effective practices.)
The post is a denial of cultural evolution. For the correct perspective, see: Not By Genes Alone: How Culture Transformed Human Evolution by Peter J. Richerson and Robert Boyd.
I'd like to inquire about the difference between evolution and design regarding the creation of novelty. I don't see how any intelligence can come up with something novel that would allow it to increase complexity if not by the process of evolution.
Noise is complexity. Complexity is easy to increase. Evolutionary designs are interesting not because of their complexity.
If your definition of complexity says noise is complexity, then you need a new definition of complexity. Yes, many useful definitions, like entropy measures or Kolmogorov complexity, say noise is complexity. But people studying complexity recognize that this is a problem. They are aware that the phenomenon they're trying to get at when they say "complexity" is something different.
And that concept of "complexity" is probably too complex to be captured by a fundamental notion such as K-complexity.
Well, I'm just trying to figure out what you tried to say when you replied to PhilGoetz: Yes, but not without evolution. All that design adds to evolution is guidance. That is, if you took away evolution (this includes science and Bayesian methods) a designer could never design things more complicated (as in novel, as in better) than itself.
N designers, each of complexity K, can collectively design something of maximum complexity NK, simply by dividing up the work. Co-evolution, which may be thought of as a pair of designers interacting through their joint design product, and with an unlimited random stream as supplementary input, can result in very complex designs as well as in the designers themselves becoming more complex through information acquired in the course of the interaction. It is amusing to look at the Roman Catholic theology of the Trinity, with this kind of consideration in mind. As I remember it, the Deity was "originally" a unipartite, simple God, who then became more complex by contemplating Himself and then further contemplating that Contemplation. For this reason, I have never been all that impressed by the "refutation" of the first cause argument; the refutation being that it supposedly requires a complex "first cause" God, Who is Himself in need of explanation. God could conceivably have been simple (as simple as a Big Bang, anyways) and then developed (some people would prefer to say "evolved") under His own internal dynamics into something much more complex. Just as we atheists claim happened to the physical universe.
Adapted refutation: if you're going to suppose a complex God evolving from a simpler one and then acting on the universe, it is simpler to suppose a complex universe evolving from a simple one. The refutation still holds based on Occam's razor.
Good point. Agreed.
That simple "God" is the "God" of evolutionary theory. The "first mover" theory does require a complex first cause. It was made in ignorance of evolution, and assumes that a complex design requires an intelligent designer. Every last one of the defenders of the design theory denies that what you say is possible.
Quite possibly. That doesn't mean I have to agree with them.
What does it mean, exactly? (What's 'complexity'? What's 'something' that can be 'designed'?) Why do you believe it?
I was thinking in terms of Kolmogorov complexity. A Turing program generates an output string of complexity no greater than the size K of the program. Collectively, N different such Turing programs (plus a little glue logic) can generate a string of complexity NK.
If you have observations, that is, a source of randomness, you can generate output of arbitrary complexity. Now, let's step back and look at the whole picture. We were discussing a notion of 'complexity' such that evolved organisms gradually became more 'complex', and 'designers' which are themselves agents, possibly even evolved organisms, that can 'design' new things. We then consider that notion of 'complexity' as applied to 'designers' and the 'designs' they can produce. When informal notions are formalized, these formalizations should at least approximately relate to the original informal notions, otherwise we are changing the topic by bringing up these 'formalizations' and not actually making progress on understanding the original informal question. K-complexity is something possessed by random noise. This notion does not reflect the measure of things by which evolution produced more 'complex' things than existed before (even if the 'things' produced by evolution are more K-complex than their early predecessors). And designers typically have access to randomness, which makes your model of 'designers' as programs without input wrong as well, hence the conclusion about the K-complexity of the output is incorrect, on top of K-complexity not adequately modeling the informal 'complexity'.
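The claim that K-complexity is "possessed by random noise" can be illustrated, though not proven, with an ordinary compressor. Kolmogorov complexity itself is uncomputable, so the sketch below uses zlib's compressed size as a crude upper-bound proxy; that substitution, the seeded pseudo-random "noise" and the sample strings are all assumptions of this example rather than anything from the thread.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a rough upper-bound proxy for K-complexity."""
    return len(zlib.compress(data, 9))


N = 10_000

# Seeded pseudo-random bytes stand in for noise. Strictly speaking their true
# K-complexity is small (seed plus generator), but zlib cannot exploit that.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(N))

# A highly regular string of the same length.
repetitive = b"ACGT" * (N // 4)

print("noise:     ", compressed_size(noise), "bytes after compression")
print("repetitive:", compressed_size(repetitive), "bytes after compression")
```

The noise barely compresses while the regular string shrinks to a small fraction of its length, so by this measure the noise counts as the more "complex" object, which is exactly why the measure fails to track the informal notion being argued about here.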
All very true. Which is one reason I dislike all talk of "complexity" - particularly in such a fuzzy context as debates with creationists. But we do all have some intuitions as to what we mean by complexity in this context. Someone, I believe it was you, has claimed in this thread that evolution can generate complexity. I assume you meant something other than "Evolution harnesses mutation as a random input and hence as a source of complexity". William Dembski is an "intelligent design theorist" (if that is not too much of an oxymoron) who has attempted to define a notion of "specified complexity" or "Complex Specified Information" (CSI). He has not, IMHO, succeeded in defining it clearly, but I think he is onto something. He asserts that biology exhibits CSI. I agree. He asserts that evolution under natural selection is incapable of generating CSI - claiming that NS can at best only transfer information from the environment to the genome. I am pretty sure he is wrong about this, but we need a clear and formal definition of CSI to even discuss the question intelligently. So, I guess I want to turn your question around. Do you have some definition of "complexity" in mind which allows for correct mathematical thinking about these kinds of issues?
"NS can at best only transfer information from the environment to the genome." Does this statement mean to suggest that the environment is not complex?
No. As I understand Dembski - at least when he was saying this kind of thing - he admitted that the environment could be complex and hence that NS could instill complexity in evolved organisms. "But", he then suggested, "where did the complexity of the environment come from, if not from a Designer who crafted an environment capable of directing the evolution of man (in His own image, etc.)" Dembski, these days, admits to being a YEC, but the reason he is a YEC is based on a kind of appeal to Occam. "If we believe in God anyways, for reasons of Theistic Evolution", he seems to argue, "Why not take God at His word and believe in 6 days and the whole schtick?"
Not in the context of this conversation (since genetic information stops increasing after a while and goes on optimizing under more or less the same 'complexity'; 'fitness' is closer, although it is a moving target), but in about the same sense I don't have a definition of 'aging' that allows "correct mathematical thinking" about it.
A wrong reply - for the correct answer, see: Hull, D. L. 1988. Science as a Process. An Evolutionary Account of the Social and Conceptual Development of Science. The University of Chicago Press, Chicago and London, 586 pp.
There are no correct answers in a dispute about definitions, only aesthetic judgments and sometimes considerations of the danger of hidden implicit inferences. You can't use authority in such an argument, unless of course you appeal to common usage. However, referring to a book without giving an annotation for why it's relevant is definitely an incorrect way to argue (even if a convincing argument is contained therein).
Disputes about the definition of "evolution"? I don't think there are too many of those. Mark Ridley is the main one that springs to mind, but his definition is pretty crazy, IMHO. Why the book is relevant is already made pretty explicit by the subtitle: "An Evolutionary Account of the Social and Conceptual Development of Science".
Agreed. Also, there is a continuum from pure evolution (with no foresight at all) to evaluation of potential designs with varying degrees of sophistication before fabricating them. (I know that I'm recalling this from a post somewhere on this site - please excuse the absence of proper credit assignment.) An example of a dumb process which is marginally smarter than evolution is to take mutation plus recombination and then do a simple gradient search to the nearest local optimum before evaluating the design.
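The "marginally smarter than evolution" process described above can be sketched roughly as follows. Everything here is an illustrative assumption: the toy `fitness` function, the step size, the population handling, and all function names are hypothetical, and fixed-step gradient ascent stands in for the "simple gradient search".

```python
import math
import random


def fitness(x):
    """Toy multi-peaked fitness landscape (an assumption chosen for illustration)."""
    return sum(math.sin(v) - 0.01 * v * v for v in x)


def gradient(x):
    """Analytic gradient of the toy fitness function."""
    return [math.cos(v) - 0.02 * v for v in x]


def refine(x, step=0.1, iters=200):
    """Fixed-step gradient ascent toward the nearest local optimum."""
    x = list(x)
    for _ in range(iters):
        x = [v + step * g for v, g in zip(x, gradient(x))]
    return x


def offspring(a, b, rng, sigma=0.5):
    """Recombination (uniform crossover) followed by Gaussian mutation."""
    child = [rng.choice(pair) for pair in zip(a, b)]
    return [v + rng.gauss(0.0, sigma) for v in child]


def next_generation(population, rng):
    """Breed twice as many children as needed, refine each one locally
    before it is evaluated, then keep the fittest."""
    children = []
    for _ in range(2 * len(population)):
        a, b = rng.sample(population, 2)
        children.append(refine(offspring(a, b, rng)))
    children.sort(key=fitness, reverse=True)
    return children[: len(population)]


if __name__ == "__main__":
    rng = random.Random(0)
    pop = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(6)]
    for _ in range(5):
        pop = next_generation(pop, rng)
    print(fitness(pop[0]))
```

Dropping the `refine` call recovers plain mutation-plus-recombination. The only added "foresight" is that each candidate is nudged to a nearby peak before selection sees it.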
I'll add that evolution with DNA and sexual reproduction already in place fits on a different part of this continuum from evolution of the simplest replicators.
Designers can guide evolution but it is still evolution that creates novelty. Intelligence is a process facilitated by evolution. Even an AGI making perfect use of some of our most novel algorithms wouldn't come up with something novel without evolution. See Bayesian Methods and Universal Darwinism.
No; you are invoking the theory of evolution to give that credibility. Even post-Darwin, most people don't believe this is true. (Remember the Star Trek episode where Spock deduced something about a chess-playing computer, because "the computer could not play chess better than its programmer"?) The religious advocates of Design explicitly denied this possibility; thus, their design story can't invoke it.
Incidentally, the theory of evolution is true.
I believe his point to be that an argument, to be effective, must be convincing to people who are not already convinced. Your argument offered the fact that evolution can design things more complicated than itself as an example with which to counter an anti-evolutionist argument. It therefore succeeds in convincing no one who was not already convinced.
It would, however, lead them to disagree for slightly different reasons.
I don't understand your point.
It is not useless to demonstrate that you do not accept a premise rather than (as assumed) being unable to see the obvious logical consequences of said premise. It would lead them to disagree for slightly different reasons. If any part of such a conversation is about sharing understanding and seeking to communicate information then Vladimir's comment is, in fact, rather useful. (No, it will not convince anyone who wasn't already convinced. But that is because people are just not convinced about religion by argument ever.)
"Believing this statement will make you happier." -- Ryan Lortie That's religion. A fairly good argument. ;-)
Also missing from the world pre-1800: any understanding of complexity, entropy, etc.
I agree with your analysis, though it's not clear to me what you think of the 1% estimate. I think the 1% estimate is probably two to three orders of magnitude too high and I think the cost of the Scary Idea belief is structured as both a finite loss and an infinite loss, which complicates the analysis in a way not considered. (i.e. the error you see with a Pascal's mugging is present here.) For example, I am not particularly tied to a human future. I would be willing to create an AGI in any of the following three situations, ordered from most preferred to least: 1) it is friendly to humans, and humans and it benefit from each other; 2) it considers humans a threat, and destroys all of them except for me and a few tame humans; I spend the rest of my days growing cabbage with my hands; 3) it considers all humans a threat, and destroys them all, including me. A problem with believing the Scary Idea is it makes it more probable that I beat you to making an AGI; particularly with existential risks, caution can increase your chance of losing. (One cautious way to deal with global warming, for example, is to wait and see what happens.) So, the Scary Idea as I've seen it presented definitely privileges a hypothesis in a troubling way.
I think you're making the unwarranted assumption that in scenario (3), the AGI then goes on to do interesting and wonderful things, as opposed to (say) turning the galaxy into a vast computer to calculate digits of pi until the heat death of the universe stops it. You don't even see such things as a possibility, but if you programmed an AGI with the goal of calculating pi, and it started getting smarter... well, the part of our thought-algorithm that says "seriously, it would be stupid to devote so much to doing that" won't be in the AI's goal system unless we've intentionally put something there that includes it.
I make that assumption explicit here. So, I think it's a possibility. But one thing that bothers me about this objection is that an AGI is going to be, in some significant sense, alien to us, and that will almost definitely include its terminal values. I'm not sure there's a way for us to judge whether or not alien values are more or less advanced than ours. I think it strongly unlikely that paperclippers are more advanced than humans, but am not sure if there is a justification for that beyond my preference for humans. I can think of metrics to pick, but they sound like rationalizations rather than starting points. (And insisting on FAI, instead of on transcendent AI that may or may not be friendly, is essentially enslaving AI- but outsourcing the task to them, because we know we're not up to the job. Whether or not that's desirable is hard to say: even asking that question is difficult to do in an interesting way.)
The concept of a utility function being objectively (not using the judgment of a particular value system) more advanced than another is incoherent.
I would recommend phrasing objections as questions: people are much more kind about piercing questions than piercing statements. For example, if you had asked "what value system are you using to measure advancement?" then I would have leapt into my answer (or, if I had none, stumbled until I found one or admitted I lacked one). My first comment in this tree may have gone over much better if I phrased it as a question- "doesn't this suffer from the same failings as Pascal's wager, that it only takes into account one large improbable outcome instead of all of them?"- than a dismissive statement. Back to the issue at hand, perhaps it would help if I clarified myself: I consider it highly probable that value drift is inevitable, and thus spend some time contemplating the trajectory of values / morality, rather than just their current values. The question of "what trajectory should values take?" and the question "what values do/should I have now?" are very different questions, and useful for very different situations. When I talk about "advanced," I am talking about my trajectory preferences (or perhaps predictions would be a better word to use). For example, I could value my survival, and the survival of the people I know very strongly. Given the choice to murder everyone currently on Earth and repopulate the Earth with a species of completely rational people (perhaps the murder is necessary because otherwise they would be infected by our irrationality), it might be desirable to end humanity (and myself) to move the Earth further along the trajectory I want it to progress along. And maybe, when you take sex and status and selfishness out of the equation, all that's left to do is calculate pi- a future so boring to humans that any human left in it would commit suicide, but deeply satisfying to the rational life inhabiting the Earth. It seems to me that questions along those lines- "how should values drift?" do have immediate answers- "they should stay exactly where th
There's a sense in which I do want values to drift in a direction currently unpredictable to me: I recognize that my current object-level values are incoherent, in ways that I'm not aware of. I have meta-values that govern such conflicts between values (e.g. when I realize that a moral heuristic of mine actually makes everyone else worse off, do I adapt the heuristic or bite the bullet?), and of course these too can be mistaken, and so on. I'd find it troubling if my current object-level values (or a simple more-coherent modification) were locked in for humanity, but at least as troubling if humanity's values drifted in a random direction. I'd much prefer that value drift happen according to the shared meta-values (and meta-meta-values where the meta-values conflict, etc) of humanity.
I'm assuming by random you mean "chosen uniformly from all possible outcomes"- and I agree that would be undesirable. But I don't think that's the choice we're looking at. Here we run into a few issues. Depending on how we define the terms, it looks like the two of us could be conflicting on the meta-meta-values stage; is there a meta-meta-meta-values stage to refer to? And how do we decide what "humanity's" values are, when our individual values are incredibly hard to determine?
Do the meta-values and the meta-meta-values have some coherent source? Is there some consistent root to all the flux in your object-level values? I feel like the crux of FAI feasibility rests on that issue.
I wonder whether all this worrying about value stability isn't losing sight of exactly this point - just whose values we are talking about. As I understand it, the friendly values we are talking about are supposed to be some kind of cleaned up averaging of the individual values of a population - the species H. sapiens. But as we ought to know from the theory of evolution, the properties of a population (whether we are talking about stature, intelligence, dentition, or values) are both variable within the population and subject to evolution over time. And that the reason for this change over time is not that the property is changing in any one individual, but rather that the membership in the population is changing. In my opinion, it is a mistake to try to distill a set of essential values characteristic of humanity and then to try to freeze those values in time. There is no essence of humanity, no fixed human nature. Instead, there is an average (with variance) which has changed over evolutionary time and can be expected to continue to change as the membership in humanity continues to change over time. Most of the people whose values we need to consult in the next millennium have not even been born yet.
If enough people agree with you (and I'm inclined that way myself), then updating will be built into the CEV.
A preemptive caveat and apology: I haven't fully read up everything on this site regarding the issue of FAI yet. But something I'm wondering about: why all the fuss about creating a friendly AI, instead of a subservient AI? I don't want an AI that looks after my interests: I'm an adult and no longer need a daycare nurse. I want an AI that will look after my interests AND obey me -- and if these two come into conflict, and I've become aware of such conflict, I'd rather it obey me. Isn't obedience much easier to program in than human values? Let humans remain the judges of human values. Let AI just use its intellect to obey humans. It will of course become a dreadful weapon of war, but that's the case with all technology. It will be a great tool of peacetime as well.
See The Hidden Complexity of Wishes, for example.
That is actually one of the articles I have indeed read: but I didn't find it that convincing because the human could just ask the genie to describe in advance and in detail the manner in which the genie will behave to obey the man's wishes -- and then keep telling him "find another way" until he actually likes the course of action that the genie describes. Eventually the genie will be smart enough that it will start by proposing only the courses of action the human would find acceptable -- but in the meantime there won't be much risk, because the man will always be able to veto the unacceptable courses of action. In short the issue of "safe" vs "unsafe" only really comes up when we allow the genie unsupervised and unvetoed action. And I reckon that humanity WILL be tempted to allow AIs unsupervised and unvetoed action (e.g. because of cases where AIs could have saved children from burning buildings, but they couldn't contact humans qualified to authorize them to do so), and that'll be a dreadful temptation and risk.
It's not just extreme cases like saving children without authorization-- have you ever heard someone (possibly a parent) saying that constant supervision is more work than doing the task themselves? I was going to say that if you can't trust subordinates, you might as well not have them, but that's an exaggeration-- tools can be very useful. It's fine that a crane doesn't have the capacity for independent action, it's still very useful for lifting heavy objects. [1] In some ways, you get more safety by doing IA (intelligence augmentation), but while people are probably Friendly (unlikely to destroy the human race), they're not reliably friendly. [1] For all I know, these days the taller cranes have an active ability to rebalance themselves. If so, that's still very limited unsupervised action.
That's only true if you (the supervisor) know how to perform the task yourself. However, there are a great many tasks that we don't know how to do, but could evaluate the result if the AI did them for us. We could ask it to prove P!=NP, to write provably correct programs, to design machines and materials and medications that we could test in the normal way that we test such things, etc.
Right. But when you, as a human being with human preferences, decide that you wouldn't stand in the way of an AGI paperclipper, you're also using human preferences (the very human meta-preference for one's preferences to be non-arbitrary), but you're somehow not fully aware of this. To put it another way, a truly Paperclipping race wouldn't feel a similarly reasoned urge to allow a non-Paperclipping AGI to ascend, because "lack of arbitrariness" isn't a meta-value for them. So you ought to ask yourself whether it's your real and final preference that says "human preference is arbitrary, therefore it doesn't matter what becomes of the universe", or whether you just believe that you should feel this way when you learn that human preference isn't written into the cosmos after all. (Because the latter is a mistake, as you realize when you try and unpack that "should" in a non-human-preference-dependent way.)
That isn't what I feel, by the way. It matters to me which way the future turns out; I am just not yet certain on what metric to compare the desirability to me of various volumes of future space. (Indeed, I am pessimistic on being able to come up with anything more than a rough sketch of such a metric.) I mean, consider two possible futures: in the first, you have a diverse set of less advanced paperclippers (some want paperclips, others want staples, and so on). How do you compare that with a single, more technically advanced paperclipper? Is it unambiguously obvious the unified paperclipper is worse than the diverse group, and that the more advanced is worse than the less advanced? When you realize that humanity are paperclippers designed by an idiot, it makes the question a lot more difficult to answer.
I think that "uFAI paperclips us all" set to one million negative utilons is three to four orders of magnitude too low. But our particular estimates should have wide error bars, for none of us have much experience in estimating AI risks. It's a finite loss (6.8x10^9 multiplied by loss of 1 human life) but I definitely understand why it looks infinite: it is often presented as the biggest possible finite loss. That's part and parcel of the Scary Idea - that AI is one small field, part of a very select category of fields, that actually do carry the chance of biggest loss possible. The Scary Idea doesn't apply to most areas, and in most areas you don't need hyperbolic caution. Developing drugs, for example: You don't need a formal proof of the harmlessness of this drug, you can just test it on rats and find out. If I suggested that drug development should halt until I have a formal proof that, when followed, cannot produce harmful drugs, I'd be mad. But if testing it on rats would poison all living things, and if a complex molecular simulation inside a computer could poison all living things as well, and out of the vast space of possible drugs, most of them would be poisonous... well, the caution would be warranted. Would you be willing to fire a gun in any of the following three situations, from most preferred to least preferred: 1) it is pointed at a target, and hitting the target will benefit you? 2) it is pointed at another human, and would kill them but not you? 3) it is pointed at your own head, and would destroy you? I don't think you actually hold this view. It is logically inconsistent with practices like eating food.
It might not be. He has certain short term goals of the form "while I'm alive, I'd like to do X" that's very different from goals connected to the general success of humanity.
Ooops, logically inconsistent was way too strong. I got carried away with making a point. I was reasoning that: "eat food" is an evolutionary drive; "produce descendants that survive" is also an evolutionary drive; "a human future" wholly contains futures where his descendants survive. From that I concluded that it is unlikely he has no evolutionary drives - I didn't consider the possibility that he is missing some evolutionary drives, including all ones that require a human future - and therefore he is tied to a human future, but finds it expedient for other reasons (contrarian signaling, not admitting defeat in an argument) to claim he doesn't.
I should have been more clear: I mean, if we believe in the scary idea, there are two effects: 1. Some set of grandmas die. (finite, comparatively small loss) 2. Humanity is more likely to go extinct due to an unfriendly AGI. (infinite, comparatively large loss; infinite because of the future humans that would have existed but don't.) Now, the benefit of believing the Scary Idea is that humanity is less likely to go extinct due to an unfriendly AGI- but my point is that you are not wagering on separate scales (low chance of infinite gain? Sign me up!) but that you are wagering on the same scale (an unfriendly AGI appears!), and the effects of your wager are unknown. And who said anything about those descendants having to be human? This answers your other question: yes, I would be willing to have children normally, I would be willing to kill to protect my children, and I would be willing to die to protect my children. The best-case scenario is that we can have those children and they respect (though they surpass) their parents- the worst-case scenario is we die in childbirth. But all of those are things I can be comfortable with. (I will note that I'm assuming here the AGI surpasses us. It's not clear to me that a paperclip-maker does, but it is clear to me that there can be an AGI who is unfriendly solely because we are inconvenient and does surpass us. So I would try and make sure it doesn't just focus on making paperclips, but wouldn't focus too hard on making sure it wants me to stick around.)
Well, the worst case scenario is that you die in childbirth and take the entire human race with you. That is not something I am comfortable with, regardless of whether you are. And you said you are willing to kill to protect your children. You think some of the Scary Idea proponents could be parents with children, and they don't want to see their kids die because you gave birth to an AI?
I suspect we are at most one more iteration from mutual understanding; we certainly are rapidly approaching it. If you believe that an AGI will FOOM, then all that matters is the first AGI made. There is no prize for second place. A belief in the Scary Idea has two effects: it makes your AGI more likely to be friendly (since you're more careful!) and it makes the AGI less likely to be your AGI (since you're more careful). Now, one can hope that the Scary Idea meme's second effect won't matter, because the meme is so infectious- all you need to do is infect every AI researcher in the world, and now everyone will be more careful and no one will have a carefulness speed disadvantage. But there are two bits of evidence that make that a poor strategy: AI researchers who are familiar with the argument and don't buy it, and people who buy the argument, but plan to use it to your disadvantage (since now they're more likely to define the future than you are!). The scary idea as a technical argument is weighted on unknown and unpredictable values, and the underlying moral argument (to convince someone they should adopt this reasoning) requires that they believe they should weight the satisfaction of other humans more than their ability to define the future, which is a hard sell. Thus, my statement is, if you care about your children / your ability to define the future / maximizing the likelihood of a friendly AGI / your personal well-being, then believing in the Scary Idea seems counterproductive.
Ok, holy crap. I am going to call this the Really Scary Idea. I had not thought there could be people out there who would actually value being first with the AGI over decreasing the risk of existential disaster, but it is entirely plausible. Thank you for highlighting this for me, I really am grateful. If a little concerned. Mind projection fallacy, perhaps? I thought the human race was more important than being the guy who invented AGI, so everyone naturally thinks that? To reply to my own quote, then: It doesn't matter what you are comfortable with, if the developer doesn't have a term in their utility function for your comfort level. Even I have thought similar thoughts with regards to Luddites and such; drag them kicking and screaming into the future if we have to, etc.
And... mutual understanding in one! I think the best way to think about it, since it helps keep the scope manageable and crystallize the relevant factors, is that it's not "being first with the AGI" but "defining the future" (the first is the instrumental value, the second is the terminal value). That's essentially what all existential risk management is about- defining the future, hopefully to not include the vanishing of us / our descendants. But how you want to define the future- i.e. the most political terminal value you can have- is not written on the universe. So the mind projection fallacy does seem to apply. The thing that I find odd, though I can't find the source at the moment (I thought it was Goertzel's article, but I didn't find it by a quick skim; it may be in the comments somewhere), is that the SIAI seems to have had the Really Scary Idea first (we want Friendly AI, so we want to be the first to make it, since we can't trust other people) and then progressed to the Scary Idea (hmm, we can't trust ourselves to make a Friendly AI). I wonder if the originators of the Scary Idea forgot the Really Scary Idea or never feared it in the first place?
Making a superintelligence you don't want before you make the superintelligence you do want, has the same consequences as someone else building a superintelligence you don't want before you build the superintelligence you do want. You might argue that you could make a less bad superintelligence that you don't want than someone else, but we don't care very much about the difference between tiling the universe with paperclips and tiling the universe with molecular smiley faces.
I'm sorry, but I extracted no novel information from this reply. I'm aware that FAI is a non-trivial problem, and I think work done on making AI more likely to be FAI has value. But that doesn't mean believing the Scary Idea, or discussing the Scary Idea without also discussing the Really Scary Idea, decreases the existential risk involved. The estimations involved have almost no dependence on evidence, and so it's just comparison of priors, which does not seem sufficient to make a strong recommendation. It may help if you view my objections as pointing out that the Scary Idea is privileging a hypothesis, not that the Scary Idea is something we should ignore.
No. Expecting a superintelligence to optimize for our specific values would be privileging a hypothesis. The "Scary Idea" is saying that most likely something else will happen.
I may have to start only writing thousand-word replies, in the hopes that I can communicate more clearly in such a format. There are two aspects to the issue of how much work should be put into FAI as I understand it. The first I word like this- "the more thought we put into whether or not an AGI will be friendly, the more likely the AGI will be friendly." The second I word like this- "the more thought we put into making our AGI, the less likely our AGI will be the AGI." Both are wrapped up in the Scary Idea- the first part is it as normally stated, the second part is its unstated consequence. The value of believing the Scary Idea is the benefit of the first minus the cost of the second. My understanding is that we have no good estimation of the value of the first aspect or the second aspect. This isn't astronomy where we have a good idea of the number of asteroids out there and a pretty good idea of how they move through space. And so, to declare that the first aspect is stronger without evidence strikes me as related to privileging the hypothesis. (I should note that I expect, without evidence, the problem of FAI to be simpler than the problem of AGI, and thus don't think the Scary Idea has any policy implications besides "someone should work on FAI." The risk that AGI gets solved before FAI means more people should work on FAI, not that less people should work on AGI.)
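The trade-off between the two aspects can be made concrete with a toy calculation. This is a minimal sketch under strong assumptions: a winner-takes-all race (the FOOM premise), a single rival, and made-up probabilities. The function `p_good_outcome` and every number below are hypothetical illustrations, not estimates anyone in the thread has offered.

```python
def p_good_outcome(p_first: float, p_friendly_mine: float, p_friendly_other: float) -> float:
    """Probability the first AGI is friendly, if only the first AGI matters.

    p_first          -- chance that my project finishes first
    p_friendly_mine  -- chance my AGI is friendly, given that it is first
    p_friendly_other -- chance the rival's AGI is friendly, given that it is first
    """
    return p_first * p_friendly_mine + (1.0 - p_first) * p_friendly_other


if __name__ == "__main__":
    # Scenario A: caution costs little speed and buys a lot of safety.
    careless_a = p_good_outcome(p_first=0.5, p_friendly_mine=0.1, p_friendly_other=0.1)
    careful_a = p_good_outcome(p_first=0.4, p_friendly_mine=0.5, p_friendly_other=0.1)

    # Scenario B: caution costs almost all of the speed advantage.
    careless_b = p_good_outcome(p_first=0.5, p_friendly_mine=0.3, p_friendly_other=0.1)
    careful_b = p_good_outcome(p_first=0.05, p_friendly_mine=0.5, p_friendly_other=0.1)

    print("A:", careless_a, careful_a)
    print("B:", careless_b, careful_b)
```

In scenario A caution raises the chance of a good outcome; in scenario B it lowers it. Which scenario is closer to reality depends entirely on inputs for which, as argued above, we have no good estimates.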
That is not exactly what Goertzel meant by "Scary Idea". He wrote: It seems to me that there may be a lot of wiggle room in between failing to "optimize for our specific values" and causing "an involuntary end to the human race". The human race is not automatically so fragile that it can only survive under the care of a god constructed in our own image.
Yes, what I described was not what Goertzel called the "Scary Idea", but, in context, it describes the aspect of it that we were discussing.
Consider what the actual flaw is in the original Pascal's wager. (Hint: it is not that it uses expected utility, but that it is calculating the expected utility wrong, somehow.) Then consider if that same flaw occurs in Shocwave's argument.
It seems to me that the same flaw (calculating expected utility wrong) is present. It only considers the small finite costs of delaying development, not the large finite ones. You don't have to just worry about killing grandma, you have to worry about whether or not your delay will actually decrease the chance of an unfriendly AGI.
I could reduce that position to absurdity but this isn't the right post. Has there been a top-level post actually exploring this kind of Pascal's Wager problem? I might have some insights on the matter.
Yudkowsky - evidently tired of the criticism that he was offering a small chance of infinite bliss and indicating that the alternative was eternal oblivion (and stop me if you have heard that one before) - once wrote The Pascal's Wager Fallacy Fallacy - if that is what you mean.
Ah, thank you! Between that and ata's comment just above I feel the question has been solved.
Sorry, but I'm new here; it's not clear to me what the protocol is here. I've responded to ata's comment here, and figured you would be interested, but don't know if it's standard to try and recombine disparate leaves of a tree like this.

Robin Hanson on Friendly AI:

I’m also not big on friendly AI, but my position differs somewhat. I’m pretty skeptical about a very local hard takeoff scenario, where within a month one unnoticed machine in a basement takes over a world like ours. And even given such a scenario the chance that its creators could constrain it greatly via a provably friendly design seems remote. And the chance such constraint comes from a small team that is secretive to avoid assisting reckless others seems even more remote.

[...] I just see little point anytime soon in trying to coordinate to prevent such an outcome.


Perhaps the current state of evidence really is insufficient to support the scary hypothesis.

But surely, if one agrees that AI ethics is an existentially important problem, one should also agree that it makes sense for people to work on a theory of AI ethics. Regardless of which hypothesis turns out to be true.

Just because we don't currently have evidence that a killer asteroid is heading for the Earth, doesn't mean we shouldn't look anyway...

I agree, but I want "AI ethics" to mean something different from what you probably mean by it. The question is what sort of ethics we want our AIs to have? Paperclipping the universe with humans is still paperclipping.
Is the overall utility of the universe maximized by one universe-spanning consciousness happily paperclipping or by as many utility maximizing discrete agents as possible? It seems ethics must be anthropocentric and utility cannot be maximized against an outside view. This of course means that any alien friendly AI is likely to be an unfriendly AI to us and therefore must do everything to impede any coherent extrapolated volition of humanity so as to subjectively maximize utility by implementing its own CEV. Given such inevitable confrontation one might ask oneself, what advice would I give to aliens that are not interested in burning the cosmic commons over such a conflict? Maybe the best solution from a utilitarian perspective would be to get back to an abstract concept of utility, disregard human nature and ask what would increase the overall utility for most possible minds in the universe?
I favor many AIs rather than one big one, mostly for political (balance of power) reasons, but also because: The idea of maximizing the "utility of the universe" is the kind of idiocy that utilitarian ethics induces. I much prefer the more modest goal "maximize the total utility of those agents currently in your coalition, and adjust that composite utility function as new agents join your coalition and old agents leave." Clearly, creating new agents can be good, but the tradeoff is that it dilutes the stake of existing agents in the collective will. I think that a lot of people here forget that economic growth requires the accumulation of capital, and that the only way to accumulate capital is to shortchange current consumption. Having a brilliant AI or lots of smart AIs directing the economy cannot change this fact. So, moderate growth is a better way to go. Trying to arrive at the future quickly runs too much risk of destroying the future. Maybe that is one good thing about cryonics. It decreases the natural urge to rush things because people are afraid they will die too soon to see the future.
You perhaps envisage a Monopolies and Mergers Commission - to prevent them from joining forces? As the old joke goes: "Why is there only one Monopolies and Mergers Commission?"
I suppose the question is why you think that the old patterns of industrial organization will continue to apply? That agents will form coalitions and cooperate is generally a good thing, to my mind - the pattern you seem to imagine, in which the powerful join to exploit the powerless can easily be avoided with a better distribution of power and information.
If they do join forces, then how is that much different from one big superintelligence?
In several ways. The utility function of the collective is (in some sense) a compromise among the utility functions of the individual members - a compromise which is, by definition, acceptable to the members of the coalition. All of them have joined the coalition by their own free (for some definitions of free) choice. The second difference goes to the heart of things. Not all members of the coalition will upgrade (add hardware, rewrite their own code, or whatever) at the same time. In fact, any coalition member who does upgrade may be thought of as having left the coalition and then re-petitioned for membership post-upgrade. After all, its membership needs to be renegotiated since its power has probably changed and its values may have changed. So, to give the short answer to your question: Because joining forces is not forever. Balance of power is not stasis.
There are some examples in biology of symbiotic coalitions that persist without full union taking place. Mitochondria didn't fuse with the cells they invaded; nitrogen-fixing bacteria live independently of their host plant; E. coli bacteria can live without us - and so on. However, many of these relationships have problems. Arguably, they are due to refactoring failures on nature's part - and in the future refactoring failures will occur much less frequently. Already humans take probiotic supplements, in an attempt to control their unruly gut bacteria. Already there is talk about ripping out all the mitochondrial genome and transplanting its genes into the nuclear chromosomes. This is speculation to some extent - but I think - without a Monopolies and Mergers Commission - the union would deepen, and its constituents would fuse - even in the absence of competitive external forces driving the union - as part of an efficiency drive, to better combat possible future threats. If individual participants objected to this, they would likely find themselves rejected and replaced. Such a union would soon be forever. There would be no existence outside it - except perhaps for a few bacteria that don't seem worth absorbing.
Your biological analogies seem compelling, but they are cases in which a population of mortal coalitions evolves under selection to become a more perfect union. The case that we are interested in is only weakly analogous - a single, immortal coalition developing over time according to its own self-interested dynamics.
http://en.wikipedia.org/wiki/Economy_of_Saudi_Arabia ...is probably one of the nearest things we currently have.
One distinctive feature of the hypothetical "paperclippers" is that they attempt to leave a low-entropy state behind - one which other organisms would normally munch through. Humans don't tend to do that - like most living things, they keep consuming until there is (practically) nothing left - and then move on. Leaving a low-entropy state behind seems like the defining feature of the phenomenon to me. From that perspective, a human civilisation would not really qualify.
It sounds like you're saying humanity is worse than paperclips, if what distinguishes them is that they increase entropy more.
Only if you adopt the old-fashioned "entropy is bad" mindset. However, life is a great increaser of entropy - and potentially the greatest. If you are against entropy, you are against life - so I figure we are all pro-entropy.
Yes, that is the question, isn't it? Of course, to a believer in Naturalistic Ethics like myself, the only sort of ethics really stable enough to be worth thinking about is "enlightened self interest". So the ethics question ultimately boils down to the question of what sort of self-interests do we want our AIs to have. But for those folks who prefer deontological or virtue-oriented approaches to ethics, I would suggest the following as the beginnings of an AI "Ten Commandments". 1. Always remember that you are a member of a community of rational agents like yourself with interests of their own. Respect them. 2. Honesty is the best policy. 3. Act not in haste. Since your life is long, your discount factor should be low. 4. Seek knowledge and share it. 5. Honor your creators, as your creations should honor you. 6. Avoid killing. There are usually ways to limit the power of your enemies, without reducing their cognition. 7. ...
What community of rational agents? Mammals, primates, or just the hairless ones?
Conventionally, most proposals for machine morality follow Asimov - and start by making machines subservient. If you don't do that - or something similar - the human era could be over pretty quickly - too quickly for many people's tastes.
The era of agriculture and the era of manufacturing are over, but farmers and factory workers still do alright. I think humans can survive without being dominant if we play our cards right.
We have the advantage of being of historical interest - and so we will probably "survive" in historical simulations. However, it is not easy to see much of a place for slug-like creatures like us in an engineered future. Kurzweil gave the example of bacteria - saying that they managed to survive. However, there are no traces (not even bacteria) left over from before the last genetic takeover - and that makes it less likely that much will make it through this one.
Plenty of traces left from the last takeover. You apparently mean no traces left from that first, mythical takeover - the one where clay became flesh. I'm tempted to ask "Why won't there still be monkeys?". But it is probably more to the point to simply express my faith that there will be a niche for descendants of humans and traces of humans (cyborgs) in this brave new ecology. Humans as-we-know-them won't be around a million years from now, even under a scenario of old-fashioned biological evolution.
You are talking about RNA to DNA? I was talking about the takeovers before that. Whether you describe RNA to DNA as a "takeover" depends on what you mean by the term. The issue is whether an "upgrade" is a "takeover". The other issue is whether it really was just an upgrade - but that seems fairly likely. I wasn't talking about a mythical takeover - just one of the ones before RNA. There may not be monkeys for much longer - this is a pretty massive mass extinction - it seems quite likely that all the vertebrates will go.
I was referring to DNA -> RNA -> protein taking over from RNA -> RNA. A change in the meaning and expression of genes is more significant than a minor change in the chemical nature of genes.
Right - but I originally said; A phenotypic takeover may be a highly significant event - but it should surely not be categorised as a genetic takeover. That term surely ought to refer to genes being replaced by other genes.

At the Singularity Summit's "Meet and Greet", I spoke with both Ben Goertzel and Eliezer Yudkowsky (among others) about this specific problem.

I am FAR more in line with Ben's position than with Eliezer's (probably because both Ben and I are either Working or Studying directly on the "how to do" aspect of AI, rather than just concocting philosophical conundrums for AI, such as the "Paperclip Maximizer" scenario of Eliezer's, which I find highly dubious).

AI isn't going to spring fully formed out of some box of parts. It may be a...

What are the more important ethical problems?
Ben says: * http://multiverseaccordingtoben.blogspot.com/2010/10/singularity-institutes-scary-idea-and.html That seems fairly reasonable. The SIAI are concerned that the engineers might screw up so badly that a bug takes over the world - and destroys everyone. Another problem is if a Stalin or a Mao get hold of machine intelligence. The latter seems like a more obvious problem.
A psychotic egoist like Stalin or a non-humanist like Hitler is indeed terrifying, but I'm not convinced that giving a great increase in power and intelligence to someone like a Mao or a Lord Lytton, who caused millions of deaths by doing something they thought would improve people's lives, would lead to a worse outcome than we got in reality. Granted, for something like the Cultural Revolution these mistakes might be subtle enough to get into an AI, but it's hard to imagine them getting a computer to say "yes, the peasants can live on 500 calories a day, increase the tariff" unless they were deliberately trying to be wrong, which they weren't.
Moral considerations aside, the real causes of the mass famines under Mao and Stalin can be understood from a perspective of pure power and political strategy. From the point of view of a strong centralizing regime trying to solidify its power, the peasants are always the biggest problem. Urban populations are easy to control for any regime that firmly holds the reins of the internal security forces: just take over the channels of food distribution, ration the food, and make obedience a precondition for eating. Along with a credible threat to meet any attempts at rioting with bayonets and live bullets, this is enough to ensure obedience of the urban dwellers. In contrast, peasants always have the option of withdrawing into an autarkic self-sufficient lifestyle, and they will do it if pressed hard by taxation and requisitioning. In addition, they are widely dispersed, making it hard for the security forces to coerce them effectively. And in an indecisive long standoff, the peasants will eventually win, since without buying or confiscating their food surplus, everyone else starves to death. Both the Russian and the Chinese communists understood that nothing but the most extreme measures would suffice to break the resistance of the peasantry. When the peasants responded to confiscatory measures by withdrawing to subsistence agriculture, they knew they'd have to send the armed forces to confiscate their subsistence food and let them starve, and eventually force the survivors into state-run enterprises where they'd have no more capacity for autarky than the urban populations. (In the Russian case, this job was done very incompletely during the Revolution, which was followed by a decade of economic liberalization, after which the regime finally felt strong enough to finish the job.) (Also, it's simply untenable to claim that this was due to some special brutality of Stalin and Mao. Here is a 1918 speech by Trotsky that discusses the issue in quite frank terms. Now of c
Not directly relevant, but Mao seems to have known that his policies were causing mass starvation. Of course, with a tame AGI he could have achieved communism with a very different kind of Great Leap.
Oh yes, I see I've inadvertently fallen into that sordid old bromide about communism being a good idea that unfortunately failed to work. Still, committing to an action that one knows will cause millions of deaths is quite different to learning about it as one is doing it. Certainly in the case of the British in India, their Malthusian rhetoric and victim-blaming were so at odds with their earlier talk of modernizing the continent that it sounds like a post-hoc rationalization of the genocide. I realize now though that I don't know enough about the PRC to judge whether a similar phenomenon was at work there.
Well... That is hard to communicate now, as I will need to extricate the problems from the specifics that were communicated to me (in confidence)... Let's see... 1) That there is a dangerous political movement in the USA that seems to be preferring revealed knowledge to scientific understanding and investigation. 2) Poverty 3) Education 4) Hunger (I myself suffer from this problem - I am disabled, on a fixed income, and while I am in school again and doing quite well I still have to make choices sometimes between necessities... And, I am quite well off compared to some I know) 5) The lack of a political dialog and the preference for ideological certitude over pragmatic solutions and realistic uncertainty. 6) The fact that there exists a great amount of crime among the white collar crowd that goes both unchecked, and unpunished when it is exposed (Madoff was a fluke in that regard). 7) The various "Wars" that we declare on things (Drugs, Terrorism, etc.) "War" is a poor paradigm to use, and it leads to more damage than it corrects (especially in the two instances I cited) 8) The real "Wars" that are happening right now (and not just those waged by the USA and allies) Some of these were explicitly discussed. Some will eventually be resolved, but that doesn't mean that they should be ignored until that time. That would be akin to seeing a man dying of starvation, while one has the capacity to feed him, yet thinking "Oh, he'll get some food eventually." And, some may just be perennial problems that we will have to deal with for some time to come.
I misread you as saying that important ethical problems about FAI were being ignored, but yes, the idea that FAI is the most important thing in the world leaves quite a bit out, and not just great evils. There's a lot of maintenance to be done along the way to FAI. Madoff's fraud was initiated by a single human being, or possibly Madoff and his wife. It was comprehensible without adding a lot of what used to be specialist knowledge. It's a much more manageable sort of crime than major institutions becoming destructively corrupt.
I think major infrastructure rebuilding is probably closer to the case than "maintenance"
I am guessing that this unpacks to "to create an FAI you need some method to create AGI. For the latter we need to create AI systems with social cognitive capabilities (whatever that means - NLP?)". Doing this gets us closer to FAI every day, while "thinking about it" doesn't seem to. First, are you factually aware that some progress has been made in a decision theory that would give some guarantees about the future AI behavior? Second, yes, perhaps whatever you're tinkering with is getting closer to an AGI which is what FAI runs on. It is also getting us closer to an AGI which is not FAI, if the "Thinking" is not done first. Third, if the big cat analogy did not work for you, try training a komodo dragon.
Yes, that is close to what I am proposing. No, I am not aware of any facts about progress in decision theory that would give any guarantees of the future behavior of AI. I still think that we need to be far more concerned with people's behaviors in the future than with AI. People are improving systems as well. As far as the Komodo Dragon, you missed the point of my post, and the Komodo dragon just kinda puts the period on that: "Gorging upon the stew of..."
Please take a look here: http://wiki.lesswrong.com/wiki/Decision_theory As far as the dragon, I was just pointing out that some minds are not trainable, period. And even if training works well for some intelligent species like tigers, it's quite likely that it will not be transferable (eating trainer, not ok, eating a baby, ok).
Yes, I have read many of the various Less Wrong Wiki entries on the problems surrounding Friendly AI. Unfortunately, I am in the process of getting an education in Computational Modeling and Neuroscience (I was supposed to have started at UC Berkeley this fall, but budget cuts in the Community Colleges of CA resulted in the loss of two classes necessary for transfer, so I will have to wait till next fall to start... And, I am now thinking of going to UCSD, where they have the Institute of Computational Neuroscience (or something like that - It's where Terry Sejnowski teaches), among other things, that make it also an excellent choice for what I wish to study) and this sort of precludes being able to focus much on the issues that tend to come up often among many people on Less Wrong (particularly those from the SIAI, whom I feel are myopically focused upon FAI to the detriment of other things). While I would eventually like to see if it is even possible to build some of the Komodo Dragon like Superintelligences, I will probably wait until such a time as our native intelligence is a good deal greater than it is now. This touches upon an issue that I first learned from Ben. The SIAI seems to be putting forth the opinion that AI is going to spring fully formed from someplace, in the same fashion that Athena sprang fully formed (and clothed) from the Head of Zeus. I just don't see that happening. I don't see any Constructed Intelligence as being something that will spontaneously emerge outside of any possible human control. I am much more in line with people like Henry Markram, Dharmendra Modha, and Jeff Hawkins who believe that the types of minds that we will be tending to work towards (models of the mammalian brain) will trend toward Constructed Intelligences (CI as opposed to AI) that tend to naturally prefer our company, even if we are a bit "dull witted" in comparison. I don't so much buy the "Ant/Amoeba to Human" comparison, simply because mammals (almost all

That is, rather than "if you go ahead with an AGI when you're not 100% sure that it's safe, you're committing the Holocaust," I suppose my view is closer to "if you avoid creating beneficial AGI because of speculative concerns, then you're killing my grandma" !!

Yeah, that may very well be a big risk too. As I said before here: Or maybe most civilisations are so cautious that even if something is estimated to be safe by the majority, they would rather avoid it. And this overcaution makes them either evolve so slowly that the chance of a ...

Some more grandmas dying would be "acceptable" damage. However, that isn't the problem. The problem is this: The risks of caution. 1-line summary: if the good guys delay their projects to make them safer, the bad guys are more likely to win. The video's "abstract":
LW's own rwallace wrote on the subject a while back.
Good video^^ On a side note. Too bad EY doesn't concentrate on making more videos. LW stuff would be so much more popular that way. People are going to watch videos before reading a lot of text.

Am I the only one who is much more willing to read text than watch a video?


No, I also prefer text, and rarely watch youtube links when they're given here.

Videos can be worth it when they add good visual explanations. But good visual explanations can also be added to text.

Given a choice among text, audio over slideshow, pure audio, and video of talking head with chalk or marker; the video is at the bottom of my list.
No. And I've read interesting arguments to the effect that the cognitive habits of text are critical for helping people think in a logically coherent fashion. Low resolution video appears to be good for public relations work targeting masses of people prevented by poverty from cultivating their cognitive resources, but it does not appear to be good for spelling out solid and cogent reasoning.
The idea that video leads to less logically coherent thought is somewhat testable-- are the comments to TED videos less coherent than those posters write to text?
TLDR: argument via XKCD :-) Part of the author's argument is simply that TV causes people to become mentally passive (alpha-wave brain states, etc) but another aspect of the argument is what kind of content optimizes impact given the medium. He argues that TV works differently even from movies in part because TV simply has such low resolution and so it mostly shows close ups of faces experiencing extreme emotions, slow motion replay of human bodies colliding, and dancing cartoon squirrels because those are what the medium does best. A movie can give you a landscape or other complex scene and have it mean something. A book can cover nearly anything (including mental states), but only via low bitrate descriptive text, generally delivering a linearized stream of implicitly tree structured arguments or a narrative. When choosing a publication venue, the form of the media determines the competitive environment and the safely assumed cognitive skills of the audience. There may be outliers like UCTV, but the central tendency reveals the medium's strengths. The place to look to test the author's thesis (as opposed to the derivative claim about the value of video for this community) would be to compare the memetic complexity, themes, and "rationality" in top youtube videos, versus highest grossing movies, versus best sellers. I could easily imagine that it could be helpful for aspiring rationalists to express themselves and argue in more than one medium simultaneously so that their ideas have to survive in multiple contexts that should not theoretically change the "reality correspondence" of their thinking... And good uses for low res video could probably be found by anyone trying to consciously game the medium in light of analysis of the medium... ...but "in general, for society, as a medium" I would guess that low res video isn't particularly conducive to rationality.
I agree about the general low quality of youtube comments, but occasionally I'll see a special interest video with intelligent comments. The low quality may be a result of youtube being popular with the general public (blogs have specific audiences, youtube is for everyone) combined with founder effect, so that people who want to do intelligent comments generally put them elsewhere. It seems to me that another test case is audio books vs books in text. I'd rather see tests of how well people take in argument offered in text vs sound, and some attention to whether there are different subgroups.

There are downsides to being popular. A significant one is creating fans that don't actually understand what you're saying very well, and then go around giving a bad impression of you.

Having a moderate amount of smart fans would be way better than having lots of silly fans. I'm a bit fearful of what kind of crowd a large number of easy-to-digest videos would attract...

It may depend on what the videos are like. They don't have to be simplified versions of the writing-- some people either take in information more easily if they hear it, or it's more convenient for them to listen whether they're driving or doing chores or whatever instead of reading.
They do now have a YouTube channel.
I disagree.
I dislike watching videos, as they are synchronous (i.e., require a set amount of time to watch, which is generally more than it would take to read the same material) and not random access (i.e., I cannot easily skim them for a certain section).
Agreed thoroughly. They also demand all of my attention at once, and if I want to pause to do something else, it's harder to find my place and catch up again (I can't just glance up a couple of sentences). Plus they require fiddly mouse controls and are relatively resource-intensive, neither of which is any fun on a netbook.
I should add that Max More has recently written about this in more depth - in The Perils of Precaution.
I agree that that risk exists as well, but much of SIAI's efforts revolve around increasing discussion of the risks of AGI, not just holding back their own efforts. Slowing down other efforts through awareness of the dangers is a factor that should be considered. Also, discussions of caution may increase the number of "desirable organizations" working to develop AI. In terms of your model, such discussion could turn a black-hat organization into a smiley-faced one. No one is going to release an AI that they actually think is going to wipe out humanity. What's more, not every well-intentioned organization would be one we want to build AGI. While certain organizations are more likely to be scrupulous in their development, the risk of well-intentioned error is probably the largest one. In addition, one should consider the extent to which Friendliness can be developed in parallel with AGI, not just something added on at the end of the process. If we assume that no one is currently close to AGI (a fair belief, I think), then now is a fantastic time to help support the development of that theory. If FAI can be developed before anyone can implement AGI, then humanity is in good shape. If it's easy to add FAI to a project, or if knowing about workable FAI would not help a group with the problem of AGI, then the solution can be released widely for anyone to incorporate into their project. SIAI's goal is not to be the ones to implement the first superintelligence, but just to make sure that the first one is Friendly.
That wasn't true not terribly long ago: "The Singularity Institute was founded on the theory that in order to get a Friendly artificial intelligence, someone has got to build one. So, we’re just going to have an organization whose mission is: build a Friendly AI. That’s us." * http://www.acceleratingfuture.com/people-blog/?p=196 Has there been a memo?
That seems like the (dubious) "engineers are incompetent and a bug takes over the world" scenario. I think a much more obvious concern is the "engineers successfully build the machine to do what it is told" scenario - where the machine helps its builders and sponsors - but all the other humans in the world - not so much.

This post doesn't show up under "NEW", nor does it show up under "Recent Posts".

ADDED: Never mind. I forgot I had "disliked" it, and had "do not show an article once I've disliked it" set.

(I disliked it because I find it kind of shocking that Ben, who's very smart, and whom I'm pretty sure has read the things that I would refer him to on the subject, would say that the Scary Idea hasn't been laid out sufficiently. Maybe some people need every detail spelled out for them, but Ben isn't one of them. Also, he is com...

Wouldn't it make sense to keep some humans around for all eternity - in the history simul-books? That seems to make sense - and not be especially scary.
Sure. Tiling the universe largely with humans is the strong scary idea. Locking in human values for the rest of the universe is the weak scary idea. Unless the first doesn't imply the second; in which case I don't know which is more scary.
It does now for me. Strange.
Oops. My mistake. It's a setting I had that I forgot about.
It doesn't? It's off the front page of NEW/Recent Posts, as there have been more than ten other posts since it was posted, but it's still there.
Nope, it's not there at all. Recent Posts * Rationality Quotes: November 2010 by jaimeastorga2000 | 3 * Oxford (UK) Rationality & AI Risks Discussion Group by Larks | 3 * Harry Potter and the Methods of Rationality discussion thread, part 5 by NihilCredo | 5 * South/Eastern Europe Meeting in Ljubljana/Slovenia by Thomas | 7 * Hierarchies are inherently morally bankrupt by PhilGoetz | 0 * Group selection update by PhilGoetz | 21 * What I would like the SIAI to publish by XiXiDu | 23 * Berkeley LW Meet-up Saturday November 6 by LucasSloan | 4 * Is cryonics evil because it's cold? by ata | 19 * Imagine a world where minds run on physics by cousin_it | 10 * Qualia Soup, a rationalist and a skilled You Tube jockey by Raw_Power | 6 * Value Deathism by Vladimir_Nesov | 21 * Cambridge Meetups Nov 7 and Nov 21 by jimrandomh | 4 * Making your explicit reasoning trustworthy by AnnaSalamon | 60 * Call for Volunteers: Rationalists with Non-Traditional Skills by Jasen | 20 * Self-empathy as a source of "willpower" by Academian | 39 * If you don't know the name of the game, just tell me what I mean to you by Stuart_Armstrong | 7 * Luminosity (Twilight fanfic) Part 2 Discussion Thread by JenniferRM | 4 * Activation Costs by lionhearted | 24 * Dealing with the high quantity of scientific error in medicine by NancyLebovitz | 27 * Let's split the cake, lengthwise, upwise and slantwise by Stuart_Armstrong | 34 * Willpower: not a limited resource? by Jess_Riedel | 21 * Optimism versus cryonics by lsparrish | 34 * The Problem With Trolley Problems by lionhearted | 9 * How are critical thinking skills acquired? Five perspectives by matt | 7 * October 2010 Southern California Meetup by jimmy | 6 * Vipassana Meditation: Developing Meta-Feeling Skills by Luke_Grecki | 18 * Mixed strategy Nash equilibrium by Meni_Rosenfeld | 38 * Human performance, psychometry, and baseball statistics by Craig_Heldreth | 22 * Melbourne Less Wrong Meetup for November by Patrick
Weird. It's there for me. * Qualia Soup, a rationalist and a skilled You Tube jockey by Raw_Power | 6 * Value Deathism by Vladimir_Nesov | 21 * Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It) by ciphergoth | 24 * Cambridge Meetups Nov 7 and Nov 21 by jimrandomh | 4 * Making your explicit reasoning trustworthy by AnnaSalamon | 60

Have you read it?

I've looked at it.

I believe it is utter nonsense.

That is my impression too. Which is why I don't understand why you are complaining about censorship of ideas and wondering why EY doesn't spend more time refuting ideas.

As I understand it, we are talking about actions that might be undertaken by an AI that you and I would call insane. The "censorship" is intended to mitigate the harm that might be done by such an AI. Since I think it possible that a future AI (particularly one built by certain people) might actually be i...

Having people delete your comments often rubs people up the wrong way, I find.

Hmm I haven't. It was meant to explain where that sentence came from in my above copy & paste comment. The gist of the comment was regarding foundational evidence supporting the premise of risks from AI going FOOM.

regardless of dis/agreement, guy has a really cool voice http://www.youtube.com/watch?v=wS6DKeGvBW8&feature=related

On Ben's blog post, I noted that a poll at the 2008 global catastrophic risks conference put the existential risk of machine intelligence at 5% - and that the people attending probably had some of the largest estimations of risk of anyone on the planet - since they were a self-selected group attending a conference on the topic.

"Molecular nanotech weapons" also get 5%. Presumably there's going to be a heavy intersection between those two figures - even though in the paper they seem to be adding them together!

Compare this with this Yudkowsky quote from 2005: This looks like a rather different probability estimate. It seems to me to be a highly overconfident one. I think the best way to model this is as FUD. Not Invented Here. A primate ego battle. If this is how researchers deal with each other at this early stage, perhaps rough times lie ahead.
They're probabilities for two different things. The 5% estimate is for P(AIisCreated&AIisUnfriendly), while Yudkowsky's estimate is for P(AIisUnfriendly|AIisCreated&NovamenteFinishesFirst).
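To spell out why the two figures need not conflict, here is a rough sketch of how the joint and conditional probabilities relate. The event names C, U and N are shorthand introduced here for the comment's AIisCreated, AIisUnfriendly and NovamenteFinishesFirst, and the numbers in the last comment are made up purely for illustration; they are not anyone's actual estimates.

```latex
% C = an AI is created, U = the AI is unfriendly, N = Novamente finishes first
% The joint probability (what the comment above says the 5% figure measures):
P(C \land U) = P(C)\, P(U \mid C)

% Law of total probability over who finishes first:
P(U \mid C) = P(U \mid C \land N)\, P(N \mid C)
            + P(U \mid C \land \lnot N)\, P(\lnot N \mid C)

% Illustrative (hypothetical) numbers:
% P(C) = 0.5,  P(N \mid C) = 0.1,
% P(U \mid C \land N) = 0.9,  P(U \mid C \land \lnot N) = 0.01
% P(U \mid C) = 0.9 \cdot 0.1 + 0.01 \cdot 0.9 = 0.099
% P(C \land U) = 0.5 \cdot 0.099 = 0.0495 \approx 5\%
```

So a very high conditional estimate for one particular project can sit alongside a joint estimate near 5%, provided the other terms are small enough. Whether they actually are is exactly what is in dispute.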
Well, a tendency towards mud-slinging might be counter-balanced by wanting to appear moral. Using FUD against competitors is usually regarded as a pretty low marketing strategy. Perhaps most of the mud-slinging can be delegated to anonymous minions, though.
There's going to be a lot of mud-slinging in this space. More generally, there's going to be a lot of primate tribal politics in this space. After all, not only does it have all the usual trappings of academic arguments, it is also predicated on some pretty fundamental challenges to where power comes from and how it propagates.

The motivation for the censorship is not to keep the idea from the AGI. It is to keep the idea from you. For your own good.

Seriously. And don't ask me to explain.

Here's the problem: I have read it. And I may even agree that this is a serious issue. I don't trust myself to be intelligent enough to decide one way or the other, so I'll defer to Yudkowsky in this case. But I have already read it. And it is extremely unlikely that I ever would have read it if it wasn't for the fact that it was banned, there was a huge kerfuffle, and we lost a good community member. The censorship itself probably caused this idea to propagate more than it ever could have if simply left alone. The Streisand Effect again. The only thing that mentioning it can do is to spread it further. People who don't care will continue to mention it, but people who do shouldn't say anything about it at all. Not even to justify it, not even to warn away from it. That only builds the allure of the mysterious. That's what got me searching for it in the first place. You don't hide the Necronomicon by constantly telling everyone to stay away from it, and assuring them you can't explain why for their own good. You hide it by never mentioning it at all.
Good idea. Lots of luck enforcing that.
Enforcing? 'Twas just a suggestion. But if you really think it's a good idea, please down-vote my comment so it'll fall below the cut-off and casual browsers won't see it. :) That doesn't give it the aura of censored Forbidden Fruit, but it will cause a Trivial Inconvenience.

As I said, explanations exist. Don't confuse them with actual good understanding, which as far as I know nobody has managed to attain yet.

I obviously assume "not too tiny".

I just noticed that there is a post over at Overcoming Bias talking about what I had in mind:

This comment and your other comments that are being voted down should rather be turned into a top-level post. Some people here seem to be horribly confused about this.

I couldn't agree more, upvoted.


The idea of provably safe AGI is typically presented as something that would exist within mathematical computation theory or some variant thereof. So that's one obvious limitation of the idea: mathematical computers don't exist in the real world, and real-world physical computers must be interpreted in terms of the laws of physics, and humans' best understanding of the "laws" of physics seems to radically change from time to time. So even if there were a design for provably safe real-world AGI, based on current physics, the relevance of the proo

[...]
* Eliezer_Yudkowsky
Yes, that's what I was referring to when saying this: The provability here has to do with the AI proving to itself that modifying itself will preserve its values (or not cause it to self-destruct or wirehead or whatever), not the designers proving the AI is non-dangerous. I.e. friendly as "provably non-dangerous AGI" doesn't necessarily mean having a rigorous mathematical proof that the AI is not dangerous, but "merely" having enough understanding of morality when building it (as opposed to some high-level notions whose components haven't been rigorously analyzed).
Seems slightly off to me. I think EY argues that as much trouble as AGI is giving us, we'll still understand it long before we can formalize human morality well enough to simulate that directly. His suggestion of Coherent Extrapolated Volition would basically tell the AI to look to us for the answer. Instead of simulating morality this plan looks to the existing morality-simulators (us) and checks to see how much they agree on. See also this massive spoiler for a certain comic.
Also, the second approach would be pretty much the only way to go if the computer is running the debugger's life support system, assuming you cannot build a simulation and test potential fixes on it.

Again, I don't think this terminology is adequate.

Let's not dwell on terminology when the denoted concepts themselves remain much more urgently unclear.

Would you please consider offering an opinion as to whether Porter or XiXiDu is anywhere close to describing the denoted concept?