Major update here.

Related to: Should I believe what the SIAI claims?

Reply to: Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It)

... pointing out that something scary is possible, is a very different thing from having an argument that it’s likely. — Ben Goertzel

What I ask for:

I want the SIAI or someone who is convinced of the Scary Idea1 to state concisely and mathematically (with extensive references if necessary) the decision procedure that led them to make the development of friendly artificial intelligence their top priority. I want them to state the numbers of their subjective probability distributions2 and exemplify their chain of reasoning, how they came up with those numbers and not others by way of sober calculations.

The paper should also account for the following uncertainties:

  • Comparison with other existential risks and how catastrophic risks from artificial intelligence outweigh them.
  • Potential negative consequences3 of slowing down research on artificial intelligence (a risks and benefits analysis).
  • The likelihood of a gradual and controllable development versus the likelihood of an intelligence explosion.
  • The likelihood of unfriendly AI4 versus friendly and respectively abulic5 AI.
  • The ability of superhuman intelligence and cognitive flexibility as characteristics alone to constitute a serious risk given the absence of enabling technologies like advanced nanotechnology.
  • The feasibility of “provably non-dangerous AGI”.
  • The disagreement of the overwhelming majority of scientists working on artificial intelligence.
  • That some people who are aware of the SIAI’s perspective do not accept it (e.g. Robin Hanson, Ben Goertzel, Nick Bostrom, Ray Kurzweil and Greg Egan).
  • Possible conclusions that can be drawn from the Fermi paradox6 regarding risks associated with superhuman AI versus other potential risks ahead.

Further I would like the paper to include and lay out a formal and systematic summary of what the SIAI expects researchers who work on artificial general intelligence to do and why they should do so. I would like to see a clear logical argument for why people working on artificial general intelligence should listen to what the SIAI has to say.


Here are two examples of what I'm looking for:

The first example is Robin Hanson demonstrating his estimation of the simulation argument. The second example is Tyler Cowen and Alex Tabarrok presenting the reasons for their evaluation of the importance of asteroid deflection.


I'm wary of using inferences derived from reasonable but unproven hypotheses as foundations for further speculative thinking and calls for action. Although the SIAI does a good job of stating reasons to justify its existence and monetary support, it neither substantiates its initial premises to an extent that would let an outsider draw conclusions about the probability of the associated risks, nor clarifies its position regarding contemporary research in a concise and systematic way. Nevertheless such estimations are given, such as that there is a high likelihood of humanity's demise if we develop superhuman artificial general intelligence without first defining mathematically how to prove its benevolence. But those estimations are not outlined; no decision procedure is provided for arriving at the given numbers. One cannot reassess the estimations without the necessary variables and formulas. This, I believe, is unsatisfactory: it lacks transparency and a foundational and reproducible corroboration of one's first principles. This is not to say that it is wrong to state probability estimations and update them given new evidence, but that although those ideas can very well serve as an urge to caution, they are not compelling without further substantiation.

1. If anyone who is actively trying to build advanced AGI succeeds, we’re highly likely to cause an involuntary end to the human race.

2. Stop taking the numbers so damn seriously, and think in terms of subjective probability distributions [...], Michael Anissimov ( mailing list, 2010-07-11)

3. Could being overcautious be itself an existential risk that might significantly outweigh the risk(s) posed by the subject of caution? Suppose that most civilizations err on the side of caution. This might cause them either to evolve so slowly that the chance of a fatal natural disaster occurring before sufficient technology is developed to survive it rises to 100%, or to stop evolving at all because they are unable to prove something 100% safe before trying it, and thus never take the necessary steps to become less vulnerable to naturally existing existential risks. Further reading: Why safety is not safe

4. If one pulled a random mind from the space of all possible minds, the odds of it being friendly to humans (as opposed to, e.g., utterly ignoring us, and being willing to repurpose our molecules for its own ends) are very low.

5. Loss or impairment of the ability to make decisions or act independently.

6. The Fermi paradox provides the only data and conclusions we can analyze that amount to empirical criticism of concepts like the Paperclip maximizer and of general risks from superhuman AIs with non-human values, short of working directly on AGI to test those hypotheses ourselves. If you accept the premise that life is not unique and special, then one other technological civilisation in the observable universe should be sufficient to leave potentially observable traces of technological tinkering. Given the absence of any signs of intelligence out there, especially paper-clippers burning the cosmic commons, we might conclude that unfriendly AI could not be the most dangerous existential risk that we should worry about.


Stop taking the numbers so damn seriously, and think in terms of subjective probability distributions [...], Michael Anissimov

I think it's worth giving the full quote:

Stop taking the numbers so damn seriously, and think in terms of subjective probability distributions, discard your mental associations between numbers and absolutes, and my choice to say a number, rather than a vague word that could be interpreted as a probability anyway, makes sense. Working on, one of the things I appreciated the most were experts with the intelligence to make probability estimates, which can be recorded, checked, and updated with evidence, rather than vague statements like “pretty likely”, which have to be converted into probability estimates for Bayesian updating anyway. Futurists, stick your neck out! Use probability estimates rather than facile absolutes or vague phrases that mean so little that you are essentially hedging yourself into meaninglessness anyway.

Total agreement from me, needless to say.

The claim that AIs will foom, basically, reduces to the claim that the difficulty of making AGI is front-loaded: that there's a hump to get over, that we aren't over it yet, and that once it's passed things will get much easier. From an outside view, this makes sense; we don't yet have a working prototype of general intelligence, and the history of invention in general indicates that the first prototype is a major landmark after which the pace of development speeds up dramatically.

But this is a case where the inside and outside views disagree. We all know that AGI is hard, but the people actually working on it get to see the challenges up close. And from that perspective, it's hard to accept that it will suddenly become much easier once we have a prototype: because the challenges seem so daunting, because the possible breakthroughs are hard to visualize, and because, on some level, if AI suddenly became easy it would trivialize the challenges that researchers are facing now. So the AGI researchers imagine an AI-Manhattan Project, with resources to match the challenges as they see them, rather than an AI-Kitty Hawk, with a few guys in a basement who are lucky enough to stumble on the final necessary insight.

Since a Manhattan Project-style AI would have lots of resources to spend on ensuring safety, the safety issues don't seem like a big deal. But if the first AGI were made by some guys in a basement instead, then they wouldn't have those resources; and from that perspective, pushing hard for safety measures is important.

Except in this case if 'prototype' means genius-human-level AI, then it's reasonable to assume that even if the further challenges remain daunting, it will be economical to put a lot more effort into them, because researchers will be cheap. If airplanes were as much better at designing airplanes as they are at flying, Kitty Hawk would have been different.
Or that the effective effort put into AI research (e.g. by AIs) is sufficiently back-loaded.
Yes. This is not actually true. The history of invention in general indicates that the first prototype accomplishes little, and a great deal of subsequent work needs to be done - even in the case of inventions like machine tools and computers that are used for creating subsequent generations of themselves.
Yes, this is right. Prototypes often precede widespread deployment and impact of a technology by decades until various supporting technologies and incremental improvements make them worth their costs.

For those of you who are interested, some of us folks from the SoCal LW meetups have started working on a project that seems related to this topic.

We're working on building a fault tree analysis of existential risks with a particular focus on producing a detailed analysis of uFAI. I have no idea if our work will at all resemble the decision procedure SIAI used to prioritize their uFAI research, but it should at least form a framework for the broader community to discuss the issue. Qualitatively you could use the work to discuss the possible failure modes that would lead to a uFAI scenario, and quantitatively you could use the framework and your own supplied probabilities (or aggregated probabilities from the community, domain experts, etc.) to crunch the numbers and/or compare uFAI to other posited existential risks.
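To make the quantitative use concrete, here is a minimal sketch of how a fault tree can be evaluated from user-supplied probabilities. It is not the SoCal group's actual framework: the `Basic` and `Gate` classes, the event names, and every number are hypothetical placeholders, and the gate formulas assume the basic events are statistically independent, which a real analysis would have to justify.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Basic:
    """A leaf event with a user-supplied probability (placeholder values below)."""
    name: str
    p: float


@dataclass
class Gate:
    """An AND or OR gate combining child events."""
    kind: str  # "AND" or "OR"
    children: List["Node"]


Node = Union[Basic, Gate]


def probability(node: Node) -> float:
    """Probability of a node, ASSUMING all basic events are independent."""
    if isinstance(node, Basic):
        return node.p
    probs = [probability(child) for child in node.children]
    if node.kind == "AND":
        # All children must occur: multiply the probabilities.
        result = 1.0
        for p in probs:
            result *= p
        return result
    if node.kind == "OR":
        # At least one child occurs: 1 minus the chance that none occur.
        none_occur = 1.0
        for p in probs:
            none_occur *= 1.0 - p
        return 1.0 - none_occur
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical tree with made-up numbers, purely for illustration.
top = Gate("AND", [
    Basic("AGI is built", 0.5),
    Gate("OR", [
        Basic("goals are mis-specified", 0.2),
        Basic("safeguards fail", 0.1),
    ]),
])

print(f"top event probability: {probability(top):.3f}")
```

Swapping in your own leaf probabilities, or probabilities aggregated from others, changes the top-level number without changing the structure, which is what makes the tree itself something people can argue about separately from the inputs.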

At the moment, I'd like to find out generally what anyone else thinks of this project. If you have suggestions, resources or pointers to similar/overlapping work you want to share, that would be great, too.

This project sounds really interesting and useful. It sounds a lot like a project that I tried and failed to get started. Or at least like part of that project. Though my project is so vague and broad that pretty much anything involving graphs/trees related to x-risks would seem "kinda like part of the project I was working on". Here's a link to another comment about that project. I would like to hear more about your project.

I have a lot on my plate right now, but I'll try to write up my own motivating Fermi calculations if I get the chance to do so soon.

Would like to see it.

I agree that a write-up of SIAI's argument for the Scary Idea, in the manner you describe, would be quite interesting to see.

However, I strongly suspect that when the argument is laid out formally, what we'll find is that

-- given our current knowledge about the pdf's of the premises in the argument, the pdf on the conclusion is verrrrrrry broad, i.e. we can't conclude hardly anything with much of any confidence ...

So, I think that the formalization will lead to the conclusion that

-- "we can NOT confidently say, now, that: Building advanced AGI without a provably Friendly design will almost certainly lead to bad consequences for humanity"

-- "we can also NOT confidently say, now, that: Building advanced AGI without a provably Friendly design will almost certainly NOT lead to bad consequences for humanity"

I.e., I strongly suspect the formalization

-- will NOT support the Scary Idea

-- will also not support complacency about AGI safety and AGI existential risk

I think the conclusion of the formalization exercise, if it's conducted, will basically be to reaffirm common sense, rather than to bolster extreme views like the Scary Idea....

-- Ben Goertzel
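The point above about broad pdfs on the premises producing a very broad pdf on the conclusion can be illustrated with a small Monte Carlo sketch. Everything here is an assumption made for illustration: the conclusion is modelled as the conjunction of three independent premises, each premise's probability is drawn from a uniform range, and the ranges themselves are arbitrary placeholders rather than anyone's actual estimates.

```python
import random


def propagate(premise_ranges, n_samples=20000, seed=0):
    """Sample each premise's probability uniformly from its (low, high) range
    and multiply them, treating the premises as independent (an assumption)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        p = 1.0
        for low, high in premise_ranges:
            p *= rng.uniform(low, high)
        samples.append(p)
    samples.sort()
    return samples


def percentile(sorted_samples, q):
    """Simple nearest-rank percentile on an already sorted list."""
    index = int(q * (len(sorted_samples) - 1))
    return sorted_samples[index]


# Placeholder ranges expressing "we really don't know" about each premise.
premise_ranges = [(0.1, 0.9), (0.1, 0.9), (0.1, 0.9)]
samples = propagate(premise_ranges)
low, median, high = (percentile(samples, q) for q in (0.05, 0.5, 0.95))

print(f"5th percentile:  {low:.3f}")
print(f"median:          {median:.3f}")
print(f"95th percentile: {high:.3f}")
```

With inputs this vague, the 5th and 95th percentiles of the conclusion end up far apart, so the sketch supports neither a confident "almost certainly bad" nor a confident "almost certainly fine". Narrower premise ranges would narrow the output accordingly.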

I agree with both those statements, but think the more relevant question would be: "conditional on it turning out, to the enormous surprise of most everyone in AI, that this AGI design is actually very close to producing an 'artificial toddler', what is the sign of the expected effect on the probability of an OK outcome for the world, long-term and taking into account both benefits and risks?"
I agree. I doubt you would remember this, but we talked about this at the Meet and Greet at the Singularity Summit a few months ago (in addition to CBGBs and Punk Rock and Skaters). James Hughes mentioned you as well at a Conference in NY where we discussed this very issue as well. One thing that you mentioned at the Summit (well in conversation) was that The Scary Idea was tending to cause some paranoia among people who otherwise might be contributing more to the development of AI (of course, you also seemed pretty hostile to brain emulation too) as it tends to cause funding that could be going to AI to be slowed as a result.

What I would like the SIAI to publish

Publish instead of doing what?

I would additionally like to see addressed:

  • What is the time estimate for FAI and AGI?
  • What is the probability that FAI is possible times the probability that FAI can be achieved before AGI?
  • Other paths to safe super intelligence (IA, WBE, AI-in-box, etc) may be more dangerous. What are the odds? Are the odds better or worse than the odds that the FAI research program is successful?

Great post.

If you haven't seen SIAI's new overview you might find it relevant. I'm quite favorably impressed by it.

Thanks. I actually linked to that paper in the OP. As I wrote, that an organisation like the SIAI is necessary and should be supported is not being challenged. But what that paper accomplishes is merely giving a very basic introduction to someone who might never have thought about risks posed by AI. What I actually had in mind writing the OP is that the SIAI should address people like Ben Goertzel who are, irrespective of the currently available material, skeptical about the risks from working on AGI and who are unsure of what the SIAI actually wants them to do or not to do and why. Further, I would like the SIAI to provide educated outsiders with a summary of how people who believe in the importance of risks associated with AGI arrived at this conclusion, especially in comparison to other existential risks and challenges. What I seek is a centralized code of practice that incorporates the basic assumptions and a way to roughly assess their likelihood in comparison to other existential risks and challenges by the use of probability. See for example this SIAI page. Bayes sits in there alone and doomed. Why is there no way for people to formally derive their own probability estimates with their own values? To put it bluntly, it looks like you have to pull any estimation out of your ass. The SIAI has to set itself apart from works of science fiction and actually provide some formal analysis of what we know, what conclusions can be drawn and how they relate to other problems. The first question most people will ask is why they should worry about AGI when there are challenges like climate change. There needs to be a risk-benefit analysis that shows why AGI is more important and a way to reassess the results yourself by following a provided decision procedure.
Yes, I strongly endorse what you were saying in your top level posting and agree that the new overview is by no means sufficient, I was just remarking that the new overview is at least a step in the right direction. Didn't notice that you had linked it in the top level post.

This might be an opportunity to use one of those Debate Tools, see if one of them can be useful for mapping the disagreement.

I would like to have a short summary of where various people stand on the various issues.

The people:

  • Eliezer

  • Ben

  • Robin Hanson

  • Nick Bostrom

  • Ray Kurzweil ?

  • Other academic AGI types?

  • Other vocal people on the net like Tim Tyler ?

The issues:

  • How likely is a human-level AI to go FOOM?

  • How likely is an AGI developed without "friendliness theory" to have values incompatible with those of humans?

  • How easy is it to make

... (read more)
Add Nick Bostrom to the list. Also, what is exactly Bostrom's take on AI? OP says Bostrom disagrees with Eliezer. Could someone provide a link or reference to that? I have read most of Bostrom's papers some time ago and at the moment I can't recall any such disagreement.
I think Nick was near Anders with an x-risk of 20% conditional on AI development by 2100, and near 50% for AI by 2100. So the most likely known x-risk, although unknown x-risks get a big chunk of his probability mass.
If we are constructing a survey of AI-singularity thinking here, I would like to know more about the opinions of Hugo de Garis. And what is Bill Joy thinking these days? If we are trying to estimate probabilities and effect multipliers, I would like to consider the following question: Consider the projected trajectory of human technological progress without AGI assistance. For example: controlled fusion by 2140, human lifespan doubles by 2200, self-sustaining human presence on asteroids and/or Jovian satellites by 2260, etc. How much would that rate of progress be sped up if we had the assistance of AGI intelligence with 10x human speed and memory capacity? 100x? 1000x? I conjecture that these speed-ups would be much less than people here seem to expect, and that the speed-up difference between 100x and 100,000x would be small. Intelligence may be much less important than many people think.
A recent update from Hugo here. He has retired - but says he has one more book on machine intelligence to go.
Thx. From that interview: Hmmm. I'm afraid I don't share Goertzel's optimism either. But then I don't buy into that "destiny" stuff, either. We don't have to destroy ourselves and the planet in this way. It is definitely not impossible, but super-human AGI is also not inevitable. I'd be curious to hear from EY, and the rest of the "anti-death" brigade here, what they think of de Garis's prognosis and whether and how they think an "artilect war" can be avoided.
I'm not sure that's where the burden of proof should fall. Has de Garis justified his claim? It sounds more like storytelling than inferential forecasting to me.
I really like your comments and wish you would make some top level posts and also contact me online. Could you please do so?
Where shall I contact you?
I haven't read his book, etc., but I suspect that "storytelling" might be a reasonable characterization. On the other hand, my "I'd be curious" was hardly an attempt to create a burden of proof. I do personally believe that convincing mankind that an FAI singularity is desirable will be a difficult task, and that many sane individuals might consider a unilateral and secret decision to FOOM as a casus belli. What would you do as Israeli PM if you received intelligence that an Iranian AI project would likely go FOOM sometime within the next two months?
It's just silly. Luddites have never had much power - and aren't usually very warlike. Instead, we will see expanded environmental and green movements, more anti-GM activism - demands to tax the techno-rich more - and so on. De Garis was just doing much the same thing that SIAI is doing now - making a song-and-dance about THE END OF THE WORLD - in order to attract attention to himself - and so attract funding - so he could afford to get on with building his machines.
I don't think you can say. Different things will accelerate at different rates. For example, a dog won't build a moon rocket in a million years - but if you make it 10 times smarter, it might do that pretty quickly.

If you want probabilities for these things to be backed up by mathematics, you're going to be disappointed, because there aren't any. The best probabilities - or rather, the only probabilities we have here, were produced using human intuition. You can break down the possibilities into small pieces, generate probabilities for the pieces, and get an overall probability that way, but at the base of the calculations you just have order-of-magnitude estimates. You can't provide formal, strongly defensible probabilities for the sub-events, because there just isn... (read more)


I'm not asking for defensible probabilities that would withstand academic peer review. I'm asking for decision procedures including formulas with variables that allow you to provide your own intuitive values to eventually calculate your own probabilities. I want the SIAI to provide a framework that gives a concise summary of the risks in question and a comparison with other existential risks. I want people to be able to carry out results analysis and distinguish risks posed by artificial general intelligence from other risks like global warming or grey goo.
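As a rough illustration of what such a decision procedure could look like, here is a sketch in which each risk is broken into a chain of conditional steps and the reader supplies their own intuitive value for each step. This is not an SIAI procedure. The function names, the risk labels ("risk A" and so on), and all of the numbers are hypothetical placeholders, and multiplying the steps is only valid if each value really is conditional on the steps before it.

```python
import math


def chain_probability(steps):
    """Multiply a chain of conditional probabilities:
    P(step1) * P(step2 | step1) * ... The caller supplies every value."""
    for p in steps:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return math.prod(steps)


def rank_risks(risks):
    """Return (name, probability) pairs, most probable first."""
    scored = {name: chain_probability(steps) for name, steps in risks.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Placeholder inputs: replace with your own intuitive values per step.
my_estimates = {
    "risk A": [0.5, 0.4, 0.5],   # e.g. P(enabling event), P(failure | event), P(catastrophe | failure)
    "risk B": [0.3, 0.1],
    "risk C": [0.001],
}

ranking = rank_risks(my_estimates)
for name, p in ranking:
    print(f"{name}: {p:.4f}")
```

The output is only as good as the inputs, but laying the steps out this way shows exactly which intuitive value two people disagree about, and lets each of them recompute the comparison with their own numbers.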

There aren't any numbers for a lot of other existential risks either. But one is still able to differentiate between those risks and that from unfriendly AI based on logical consequences of other established premises like the Church–Turing–Deutsch principle. Should we be equally concerned with occultists trying to summon world-changing supernatural powers?


Unfortunately, this is a common conversational pattern.

Q. You have given your estimate of the probability of FAI/cryonics/nanobots/FTL/antigravity. In support of this number, you have here listed probabilities for supporting components, with no working shown. These appear to include numbers not only for technologies we have no empirical knowledge of, but particular new scientific insights that have yet to occur. It looks very like you have pulled the numbers out of thin air. How did you derive these numbers?

A. Bayesian probability calculations.

Q. Could you please show me your working? At least a reasonable chunk of the Bayesian network you derived this from? C'mon, give me something to work with here.

A. (tumbleweeds)

Q. I remain somehow unconvinced.

If you pull a number out of thin air and run it through a formula, the result is still a number pulled out of thin air.

If you want people to believe something, you have to bother convincing them.

It's my professional opinion, based on extensive experience and a developed psychological model of human rationality, that such a paper wouldn't be useful. That said, I'd be happy to have you attempt it. I think that your attempt to do so would work perfectly well for your and our purposes, at least if you are able to do the analysis honestly and update based on criticism that you could get in the comments of a LW blog post.

Thanks, but my current level of education is completely insufficient to accomplish such a feat to an extent that would be adequate. Maybe in a few years, but right now that is unrealistic.

Do you have problems only with the conciseness, mathiness and reference-abundance of current SIAI explanatory materials or do you think that there are a lot of points and arguments not yet made at all? I ask this because except for the Fermi paradox every point you listed was addressed multiple times in the FOOM debate and in the sequences.

Also, what is the importance of the Fermi paradox in AI?


To be more precise. You can't tell concerned AI researchers to read through hundreds of posts of marginal importance. You have to have some brochure for experts and educated laymen to be able to read up on a summary of the big picture that includes precise and compelling methodologies that they can follow through to come up with their own estimations of the likelihood of existential risks posed by superhuman artificial general intelligence. If the decision procedure gives them a different probability due to a differing prior and values, then you can tell them to read up on further material to be able to update their prior probability and values accordingly.

I'm content with your answer, then. I would personally welcome an overhaul to the presentation of AI material too. Still I think that Eliezer's FAI views are a lot more structured, comprehensive and accessible than the impression you give in your relevant posts.

The importance of the Fermi paradox is that it is the only data we can analyze that would come close to some empirical criticism of a Paperclip maximizer and general risks from superhuman AIs with non-human values, short of working directly on AGI to test those hypotheses ourselves. If you accept the premise that life is not unique and special, then one other technological civilisation in the observable universe should be sufficient to leave observable (now or soon) traces of technological tinkering. Given the absence of any signs of intelligence out there, especially paperclippers burning the cosmic commons, we can conclude that unfriendly AI might not be the most dangerous existential risk that we should look for.

...every point you listed was addressed multiple times in the FOOM debate and in the sequences.

I believe there probably is an answer, but it is buried under hundreds of posts about marginal issues. All those writings on rationality, there is nothing I disagree with. Many people know about all this even outside of the LW community. But what is it that they don't know that EY and the SIAI knows? What I was trying to say is that if I have come across it then it was not c... (read more)


No. It's really complex, and nobody in-the-know had time to really spell it out like that.

Actually, you can spell out the argument very briefly. Most people, however, will immediately reject one or more of the premises due to cognitive biases that are hard to overcome.

A brief summary:

  • Any AI that's at least as smart as a human and is capable of self-improving, will improve itself if that will help its goals

  • The preceding statement applies recursively: the newly-improved AI, if it can improve itself, and it expects that such improvement will help its goals, will continue to do so.

  • At minimum, this means any AI as smart as a human, can be expected to become MUCH smarter than human beings -- probably smarter than all of the smartest minds the entire human race has ever produced, combined, without even breaking a sweat.

INTERLUDE: This point, by the way, is where people's intuition usually begins rebelling, either due to our brains' excessive confidence in themselves, or because we've seen too many stories in which some indefinable "human" characteristic is still somehow superior to the cold, unfeeling, uncreative Machine... i.e., we don't understand just how our i... (read more)

So, are you suggesting that Robin Hanson (who is on record as not buying the Scary Idea) -- the current owner of the Overcoming Bias blog, and Eli's former collaborator on that blog -- fails to buy the Scary Idea "due to cognitive biases that are hard to overcome." I find that a bit ironic.

Like Robin and Eli and perhaps yourself, I've read the heuristics and biases literature also. I'm not so naive as to make judgments about huge issues, that I think about for years of my life, based strongly on well-known cognitive biases.

It seems more plausible to me to assert that many folks who believe the Scary Idea, are having their judgment warped by plain old EMOTIONAL bias -- i.e. stuff like "fear of the unknown", and "the satisfying feeling of being part of a self-congratulatory in-crowd that thinks it understands the world better than everyone else", and the well known "addictive chemical high of righteous indignation", etc.

Regarding your final paragraph: Is your take on the debate between Robin and Eli about "Foom" that all Robin was saying boils down to "la la la I can't hear you" ? If so I would suggest that maybe YOU are the on... (read more)


So, are you suggesting that Robin Hanson (who is on record as not buying the Scary Idea) -- the current owner of the Overcoming Bias blog, and Eli's former collaborator on that blog -- fails to buy the Scary Idea "due to cognitive biases that are hard to overcome." I find that a bit ironic

Welcome to humanity. ;-) I enjoy Hanson's writing, but AFAICT, he's not a Bayesian reasoner.

Actually: I used to enjoy his writing more, before I grokked Bayesian reasoning myself. Afterward, too much of what he posts strikes me as really badly reasoned, even when I basically agree with his opinion!

I similarly found Seth Roberts' blog much less compelling than I did before (again, despite often sharing similar opinions), so it's not just him that I find to be reasoning less well, post-Bayes.

(When I first joined LW, I saw posts that were disparaging of Seth Roberts, and I didn't get what they were talking about, until after I understood what "privileging the hypothesis" really means, among other LW-isms.)

I'm not so naive as to make judgments about huge issues, that I think about for years of my life, based strongly on well-known cognitive biases.

See, that's a perfect ex... (read more)

I don't believe it's a meaningful property (as used in this context), and you should do well to taboo it (possibly, to convince me it's actually meaningful).
True enough; it would be more precise to say that he argues positions based on evidence which can also support other positions, and therefore isn't convincing evidence to a Bayesian.
What do you mean? Evidence can't support both sides of an argument, so how can one inappropriately use such impossible evidence?
What do you mean, "both"?
It would be a mistake to assume that PJ was limiting his evaluation to positions selected from one of those 'both sides' of a clear dichotomy. Particularly since PJ has just been emphasizing the relevance of 'privileging the hypothesis' to Bayesian reasoning and also said 'other positions' plural. This being the case no 'impossible evidence' is involved.
I see. But in that case, there is no problem with use of such evidence.
That's true. I believe that PJ was commenting on how such evidence is used. In this context that means PJ would require that the evidence be used more rather than just for a chosen position. The difference between a 'Traditional Rationalist' debater and a (non-existent, idealized) unbiased Bayesian.
PJ, I'd love to drag you off topic slightly and ask you about this: What is it that you now understand, that you didn't before?
That is annoyingly difficult to describe. Of central importance, I think, is the notion of privileging the hypothesis, and what that really means. Why what we naively consider "evidence" for a position, really isn't. ISTM that this is the core of grasping Bayesianism: not understanding what reasoning is, so much as understanding why what we all naively think is reasoning and evidence, usually isn't.
That hasn't really helped... would you try again? (What does privileging the hypothesis really mean? and why is reasoning and evidence usually ... not?)
Have you come across the post by that name? Without reading that it may be hard to reverse engineer the meaning from the jargon. The intro gives a solid intuitive description: That is privileging the hypothesis. When you start looking for evidence and taking an idea seriously when you have no good reason to consider it instead of countless others that are just as likely.
I have come across that post, and the story of the murder investigation, and I have an understanding of what the term means. The obvious answer to the murder quote is that you look harder for evidence around the crime scene, and go where the evidence leads, and there only. The more realistic answer is that you look for recent similar murders, for people who had a grudge against the dead person, for criminals known to commit murder in that city... and use those to progress the investigation because those are useful places to start. I'm wondering what pjeby has realised, which turns this naive yet straightforward understanding into wrongthought worth commenting on. If evidence is not facts which reveal some result-options to be more likely true and others less likely true, then what is it?
Consider a hypothesis, H1. If a piece of evidence E1 is consistent with H1, the naive interpretation is that E1 is an argument in favor of H1. In truth, this isn't an argument in favor of H1 -- it's merely the absence of an argument against H1. That, in a nutshell, is the difference between Bayesian reasoning and naive argumentation (the latter also known as "confirmation bias"). To really prove H1, you need to show that E1 wouldn't happen under H2, H3, etc., and you need to look for disconfirmations D1, D2, etc. that would invalidate H1, to make sure they're not there. Before I really grokked Bayesianism, the above all made logical sense to me, but it didn't seem as important as Eliezer claimed. It seemed like just another degree of rigor, rather than reasoning of a different quality. Now that I "get it", the other sort of evidence seems more obviously inadequate -- not just lower-quality evidence, but non-evidence. ISTM that this is a good way to test at least one level of how well you grasp Bayes: does simple supporting evidence still feel like evidence to you? If so, you probably haven't "gotten" it yet.
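To make the point about E1 concrete, here is a minimal sketch using Bayes' rule. It is not from the original discussion: the function name `posterior` and all of the probabilities are made-up, illustrative assumptions. It shows that evidence which every rival hypothesis predicts equally well leaves the probability of H1 where it started, while evidence the rivals predict poorly actually moves it.

```python
def posterior(priors, likelihoods):
    """Return normalized P(H_i | E) given priors P(H_i) and likelihoods P(E | H_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical setup: three rival hypotheses H1, H2, H3 with equal priors.
priors = [1 / 3, 1 / 3, 1 / 3]

# E1 is "consistent with" H1, but H2 and H3 predict it just as well.
uninformative = posterior(priors, [0.9, 0.9, 0.9])

# E2 is likely under H1 and unlikely under the rivals.
informative = posterior(priors, [0.9, 0.1, 0.1])

print(uninformative[0])  # unchanged from the prior for H1
print(informative[0])    # noticeably higher than the prior for H1
```

Only the ratio between the likelihoods matters here, which is why merely "supporting" evidence with a ratio of 1 does no work.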
That is from 'You can’t prove the null by not rejecting it'.
That isn't a wrongthought. Factors like you mention here are all good reasons to assign credence to a hypothesis. Yes, no, maybe... that is exactly what it is! An example of an error would be having some preferred opinion and then finding all the evidence that supports that particular opinion. Or, say, encountering a piece of evidence and noticing that it supports your favourite position but neglecting that it supports positions X, Y and Z just as well.
I looked briefly at the evidence for that. Most of it seemed to be from the so-called "self-serving bias" - which looks like an adaptive signalling system to me - and so is not really much of a "bias" at all. People are unlikely to change existing adaptive behaviour just because someone points it out and says it is a form of "bias". The more obvious thing to do is to conclude that they don't know what they are talking about - or that they are trying to manipulate you.

Regarding your final paragraph: Is your take on the debate between Robin and Eli about "Foom" that all Robin was saying boils down to "la la la I can't hear you" ?

Good summary. Although I would have gone with "la la la la If you're right then most of expertise is irrelevant. Must protect assumptions of free competition. Respect my authority!"

What I found most persuasive about that debate was Robin's arguments - and their complete lack of merit. The absence of evidence is evidence of absence when there is a motivated competent debater with an incentive to provide good arguments.

I recall getting a distinct impression from Robin which I could caricature as "lalala you're biased with hero-epic story." I also recall Eliezer asking for a probability breakdown, and I don't think Robin provided it.
... and closely related: "I'm an Impressive Economist. If you don't just take my word for it you are arrogant." In what I took to be an insightful comment by Eliezer in the aftermath of the debate, Eliezer noted that he and Robin seemed to have a fundamental disagreement about what should be taken as good evidence. This led into posts about 'outside view', 'superficial similarities' and 'reference class tennis'. (And conceivably had something to do with priming the thoughts behind 'status and stupidity' although I would never presume that was primarily or significantly directed at Robin.)

From Ben Goertzel,

And I think that theory is going to emerge after we've experimented with some AGI systems that are fairly advanced, yet well below the "smart computer scientist" level.

At the second Singularity Summit, I heard this same sentiment from Ben, Robin Hanson, and from Rodney Brooks, and from Cynthia Breazeal (at the Third Singularity Summit), and from Ron Arkin (at the "Human Being in an Inhuman Age" Conference at Bard College on Oct 22nd ¹), and from almost every professor I have had (or will have for the next two years).

It was a combination of Ben, Robin and several professors at Berkeley and UCSD which led me to the conclusion that we probably won't know how dangerous an AGI (CGI - Constructed General Intelligence... Seems to be a term I have heard used by more than one person in the last year instead of AI/AGI. They prefer it to AI, as the word Artificial seems to imply that the intelligence is not real, and the word Constructed is far more accurate) is until we have put a lot more time into building AI (or CI) systems that will reveal more about the problems they attempt to address.

Sort of like how the Wright Brothers didn't really learn how... (read more)

It seems like you're essentially saying "This argument is correct. Anyone who thinks it is wrong is irrational." Could probably do without that; the argument is far from as simple as you present it. Specifically, the last point: So I agree that there's no reason to assume an upper bound on intelligence, but it seems like you're arguing that hard takeoff is inevitable, which as far as I'm aware has never been shown convincingly. Furthermore, even if you suppose that Foom is likely, it's not clear where the threshold for Foom is. Could a sub-human level AI foom? What about human-level intelligence? Or maybe we need super-human intelligence? Do we have good evidence for where the Foom-threshold would be? I think the problems with resolving the Foom debate stem from the fact that "intelligence" is still largely a black box. It's very nice to say that intelligence is an "optimization process", but that is a fake explanation if I've ever seen one because it fails to explain in any way what is being optimized. I think you paint in broad strokes. The Foom issue is not resolved.
No, what I'm saying is, I haven't yet seen anyone provide any counterarguments to the argument itself, vs. "using arguments as soldiers". The problem is that it's not enough to argue that a million things could stop a foom from going supercritical. To downgrade AGI as an existential threat, you have to argue that no human being will ever succeed in building a human or even near-human AGI. (Just like to downgrade bioweapons as an existential threat, you have to argue that no individual or lab will ever accidentally or on purpose release something especially contagious or virulent.) It's fairly irrelevant to the argument: there are many possible ways to get there. The killer argument, however, is that if a human can build a human-level intelligence, then it is already super-human, as soon as you can make it run faster than a human. And you can limit the self-improvement to just finding ways to make it run faster: you still end up with something that can and will kick humanity's butt unless it has a reason not to. Even ems -- human emulations -- have this same problem, and they might actually be worse in some ways, as humans are known for doing worse things to each other than mere killing. It's possible that there are also sub-human foom points, but it's not necessary for the overall argument to remain solid: unFriendly AGI is no less an existential risk than bioweapons are.
Personally, what I find hardest to argue against is that a digital intelligence can make itself run in more places. In the inconvenient case of a human upload running at human speed or slower on a building's worth of computers, you've still got a human who can spend most of their waking hours earning money, with none of the overhead associated with maintaining a body and with the advantage of global celebrity status as the first upload. As soon as they can afford to run a copy of themselves, the two of them together can immediately start earning twice as fast. Then, after as much time again, four times as fast; then eight times; and so on until the copies have grabbed all the storage space and CPU time that anyone's willing to sell or rent out (assuming they don't run out of potential income sources). Put another way: it seems to me that "fooming" doesn't really require self-improvement in the sense of optimizing code or redesigning hardware; it just requires fast reproduction, which is made easier in our particular situation by the huge and growing supply of low-hanging storage-space and CPU-time fruit ready for the first digital intelligence that claims it.
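The doubling argument above can be sketched in a few lines. This is a rough illustration only, assuming that every copy funds exactly one new copy per period and that the available capacity is a fixed number of copy-sized slots; the function name `doublings_to_fill` and the one-million figure are hypothetical, not taken from the comment.

```python
def doublings_to_fill(capacity_in_copies):
    """Count doubling periods until a single copy has grown to fill the given capacity."""
    copies, periods = 1, 0
    while copies < capacity_in_copies:
        copies *= 2   # assumption: every copy pays for one more copy each period
        periods += 1
    return periods

# Hypothetical capacity: room for one million copies on rentable hardware.
print(doublings_to_fill(1_000_000))
```

Under these assumptions the number of periods grows only with the logarithm of the capacity, so the practical limit is the supply of rentable hardware and income, not the time spent doubling.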
This assumes that every CPU architecture is suitable for the theoretical AGI; that is, it assumes that the AGI can run on every computational substrate. It also assumes that it can easily acquire more computational substrate or create new substrate. I do not believe that those assumptions are reasonable economically or by means of social engineering. Without enabling technologies like advanced real-world nanotechnology the AGI won't be able to create new computational substrate without the whole economy of the world supporting it. Supercomputers like the one used for the IBM Blue Brain project cannot simply be replaced by taking control of a few botnets. They use highly optimized architecture that needs, for example, memory latency and bandwidth bounded below a certain threshold.
Actually, every CPU architecture will suffice for the theoretical AGI, if you're willing to wait long enough for its thoughts. ;-)
If you accept the Church–Turing thesis that everything computable is computable by a Turing machine then yes. But even then the speed-improvements are highly dependent on the architecture available. But if you rather adhere to the stronger Church–Turing–Deutsch principle then the ultimate computational substrate an artificial general intelligence may need might be one incorporating non-classical physics, e.g. a quantum computer. This would significantly reduce its ability to make use of most available resources to seed copies of itself or for high-level reasoning. I just don't see there being enough unused computational resources available in the world that, even in the case that all computational architecture is suitable, it could produce more than a few copies of itself. Which would then also be highly susceptible to brute force used by humans to reduce the necessary bandwidth. I'm simply trying to show that there are arguments to weaken most of the dangerous pathways that could lead to existential risks from superhuman AI.
A classical computer can simulate a quantum one - just slowly.
You're right, but exponential slowdown eats a lot of gains in processor speed and memory. This could be a problem for arguments about substrate independence. Straightforward simulation is exponentially slower -- n qubits require simulating amplitudes of 2^n basis states. We haven't actually been able to prove that that's the best possible we can do, however. BQP certainly isn't expected to be able to solve NP-complete problems efficiently, for instance. We've only really been able to get exponential speedups on very carefully structured problems with high degrees of symmetry. (Lesser speedups have also been found on less structured problems, it's true).
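For a sense of scale on the 2^n figure, here is a back-of-the-envelope sketch of the memory a straightforward state-vector simulation would need. The 16 bytes per amplitude is an assumption (one complex number stored as two 64-bit floats), and the function name `state_vector_bytes` is made up for illustration.

```python
BYTES_PER_AMPLITUDE = 16  # assumption: complex amplitude stored as two 64-bit floats

def state_vector_bytes(n_qubits):
    """Memory needed to hold all 2^n amplitudes of an n-qubit state."""
    return (2 ** n_qubits) * BYTES_PER_AMPLITUDE

for n in (10, 30, 50):
    print(n, "qubits:", state_vector_bytes(n), "bytes")
```

Under that assumption, 30 qubits already need 16 GiB, and every additional qubit doubles the requirement.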
The problem here is not that destruction is easier than benevolence, everyone agrees on that. The problem is that the SIAI is not arguing about grey goo scenarios but something that is not just very difficult to produce but that also needs the incentive to do so. The SIAI is not arguing about the possibility of the bursting of a dam but that the dam failure is additionally deliberately caused by the dam itself. So why isn't for example nanotechnology a more likely and therefore bigger existential risk than AGI? As I said in other comments, an argument one should take seriously. But there are also arguments to outweigh this path and all others to some extent. It may very well be the case that once we are at the point of human emulation that we either already merged with our machines, that we are faster and better than our machines and simulations alone. It may also very well be that the first emulations, as it is the case today, run at much slower speeds than the original and that until any emulation reaches a standard-human level we're already a step further ourselves or in our understanding and security measures. Antimatter weapons are less an existential risk than nuclear weapons although it is really hard to destroy the world with nukes and really easy to do so with antimatter weapons. The difference is that antimatter weapons are as much harder to produce, acquire and use than nuclear weapons as they are more efficient tools of destruction.

So why isn't for example nanotechnology a more likely and therefore bigger existential risk than AGI?

If you define "nanotechnology" to include all forms of bioengineering, then it probably is.

The difference, from an awareness point of view, is that the people doing bioengineering (or creating antimatter weapons) have a much better idea that what they're doing is potentially dangerous/world-ending, than AI developers are likely to be. The fact that many AI advocates put forth pure fantasy reasons why superintelligence will be nice and friendly by itself (see mwaser's ethics claims, for example) is evidence that they are not taking the threat seriously.

Antimatter weapons are less an existential risk than nuclear weapons although it is really hard to destroy the world with nukes and really easy to do so with antimatter weapons. The difference is that antimatter weapons are as much harder to produce, acquire and use than nuclear weapons as they are more efficient tools of destruction.

Presumably, if you are researching antimatter weapons, you have at least some idea that what you are doing is really, really dangerous.

The issue is that AGI development is a bit like tryi... (read more)

What led you to believe that the space of possible outcomes where an AI consumes all resources (including humans) is larger than the number of outcomes where the AI doesn't? For some reason(s) you seem to assume that the unbounded incentive to foom and consume the universe comes naturally to any constructed intelligence but any other incentive is very difficult to be implemented. What I see is a much larger number of outcomes where an intelligence does nothing without some hardcoded or evolved incentive. Crude machines do things because that's all they can do, the number of different ways for them to behave is very limited. Intelligent machines however have high degrees of freedom to behave (pathways to follow) and with this freedom comes choice and choice needs volition, it needs incentive, the urge to follow one way but not another. You seem to assume that somehow the will to foom and consume is given, does not have to be carefully and deliberately hardcoded or evolved, yet the will to constrain itself to given parameters is really hard to achieve. I just don't think that this premise is reasonable and it is what you base all your arguments on.
Have you read The Basic AI Drives?
I suspect the difference in opinions here is based on different answers to the question of whether the AI should be assumed to be a recursive self-improver.
That is a good question and I have no idea. The degree of existential threat there is most significantly determined by relative ease of creation. I don't know enough to be able to predict which would be produced first - self replicating nano-technology or an AGI. SIAI believes the former is likely to be produced first and I do not know whether or not they have supported that claim. Other factors contributing to the risk are: * Complexity - the number of ways the engineer could screw up while creating it in a way that would be catastrophic. The 'grey goo' risk is concentrated more specifically to the self replication mechanism of the nanotech while just about any mistake in an AI could kill us. * Awareness of the risks. It is not too difficult to understand the risks when creating a self replicating nano-bot. It is hard to imagine an engineer creating one not seeing the problem and being damn careful. Unfortunately it is not hard to imagine Ben.
I find myself confused at the fact that Drexlerian nanotechnology of any sort is advocated as possible by people who think physics and chemistry work. Materials scientists - i.e. the chemists who actually work with nanotechnology in real life - have documented at length why his ideas would need to violate both. This is the sort of claim that makes me ask advocates to document their Bayesian network. Do their priors include the expert opinions of materials scientists, who (pretty much universally as far as I can tell) consider Drexler and fans to be clueless? (The RW article on nanotechnology is mostly written by a very annoyed materials scientist who works at nanoscale for a living. It talks about what real-life nanotechnology is and includes lots of references that advocates can go argue with. He was inspired to write it by arguing with cryonics advocates who would literally answer almost any objection to its feasibility with "But, nanobots!")
That RationalWiki article is a farce. The central "argument" seems to be: So: they don't even know that Drexler-style nanofactories operate in a vacuum! They also need to look up "Kinesin Transport Protein".
Drexler-style nanofactories don't operate in a vacuum, because they don't exist and no-one has any idea whatsoever how to make such a thing exist, at all. They are presently a purely hypothetical concept with no actual scientific or technological grounding. The gravel analogy is not so much an argument as a very simple example for the beginner that a nanotechnology fantasist might be able to get their head around; the implicit actual argument would be "please, learn some chemistry and physics so you have some idea what you're talking about." Which is not an argument that people will tend to accept (in general people don't take any sort of advice on any topic, ever), but when experts tell you you're verging on not even wrong and there remains absolutely nothing to show for the concept after 25 years, it might be worth allowing for the possibility that Drexlerian nanotechnology is, even if the requisite hypothetical technology and hypothetical scientific breakthroughs happen, ridiculously far ahead of anything we have the slightest understanding of.
"The proposal for Drexler-style nanofactories has them operating in a vacuum", then. If these wannabe-critics don't understand that then they have a very superficial understanding of Drexler's proposals - but are sufficiently unaware of that to parade their ignorance in public.
The "wannabe-critics" are actual chemists and physicists who actually work at nanoscale - Drexler advocates tend to fit neither qualification - and who have written long lists of reasons why this stuff can't possibly work and why Drexler is to engineering what Ayn Rand is to philosophy. I'm sure they'll change their tune when there's the slightest visible progress on any of Drexler's proposals; the existence proof would be pretty convincing.
Hah! A lot of the edits on that article seem to have been made by you!
Yep. Mostly written by Armondikov, who is said annoyed materials scientist. I am not, but spent some effort asking other materials scientists who work or have worked at nanoscale their expert opinions. Thankfully, the article on the wiki has references, as I noted in my original comment. So what were the priors that went into your considered opinion?
I don't see how you can say that. It's exceedingly relevant to the question at hand, which is: "Should Ben Goertzel avoid making OpenCog due to concerns of friendliness?". If the Foom-threshold is exceedingly high (several to dozens times the "level" of human intelligence), then it is overwhelmingly unlikely that OpenCog has a chance to Foom. It'd be something akin to the Wright brothers building a Boeing 777 instead of the Wright flyer. Total nonsense.
Ah. Well, that wasn't the question I was discussing. ;-) (And I would think that the answer to that question would depend heavily on what OpenCog consists of.)
So when did the goalposts get moved to proving that hard takeoff is inevitable? The claim that research into FAI theory is useful requires only that it be shown that uFAI might be dangerous. Showing that is pretty much a slam dunk. The claim that research into FAI theory is urgent requires only that it be shown that hard takeoff might be possible (with a probability > 2% or so). And, as the nightmare scenarios of de Garis suggest, even if the fastest possible takeoff turns out to take years to accomplish, such a soft, but reckless, takeoff may still be difficult to stop short of war.
Assuming there aren't better avenues to ensuring a positive hard takeoff.
Good point. Certainly the research strategy that SIAI seems to currently be pursuing is not the only possible approach to Friendly AI, and FAI is not the only approach to human-value-positive AI. I would like to see more attention paid to a balance-of-power approach - relying on AIs to monitor other AIs for incipient megalomania.
Calls to slow down, not publish, not fund seem common in the name of friendliness. However, unless those are internationally coordinated, a highly likely effect will be to ensure that superintelligence is developed elsewhere. What is needed most - IMO - is for good researchers to be first. So - advising good researchers to slow down in the name of safety is probably one of the very worst possible things that spectators can do.
It doesn't even seem hard to prevent. Topple civilization for example. It's something that humans have managed to achieve regularly thus far and it is entirely possible that we would never recover sufficiently to construct a hard takeoff scenario if we nuked ourselves back to another dark age.
A "threshold" implies a linear scale for intelligence, which is far from given, especially for non-human minds. For example, say you reverse engineer a mouse's brain, but then speed it up, and give it much more memory (short-term and long-term - if those are just ram and/or disk space on a computer, expanding those is easy). How intelligent is the result? It thinks way faster than a human, remembers more, can make complex plans ... but is it smarter than a human? Probably not, but it may still be dangerous. Same for a "toddler AI" with those modifications.
Human level intelligence is fairly clearly just above the critical point (just look at what is happening now). However, machine brains have different strengths and weaknesses. Sub-human machines could accelerate the ongoing explosion a lot - if they are better than humans at just one thing - and such machines seem common.
Even the Einstein of monkeys is still just a monkey.
Replace "threshold" with "critical point." I'm using this terminology because EY himself uses it to frame his arguments. See Cascades, Cycles, Insight, where Eliezer draws an analogy between a fission reaction going critical and an AI FOOMing. This seems to be tangential, but I'm gonna say no, as long as we assume that the rat brain doesn't spontaneously acquire language or human-level abstract reasoning skills.
Thank you for taking the time to write this elaborate comment. I do agree with almost everything above, by the way. I just believe that your portrayal of the anti-FOOM crowd is a bit drastic. I don't think that people like Robin Hanson simply fall for the idea of human supremacy. Nor do I think that the reason for them not looking directly at the pro-FOOM arguments is being circumventive but that they simply do not disagree with the arguments per se but their likelihood and also consider the possibility that it would be more dangerous to impede AGI. Very interesting and quite compelling the way you put it, thanks. I'm a bit suspicious myself about whether the argument for strong self-improvement is as compelling as it sounds though. Something you have to take into account is whether it is possible to predict that a transcendence does leave your goals intact, e.g. can you be sure to still care about bananas after you went from chimphood to personhood. Other arguments can also be weakened, as we don't know that 1.) the fuzziness of our brain isn't a feature that allows us to stumble upon unknown unknowns, e.g. against autistic traits 2.) our processing power isn't so low after all, e.g. if you consider the importance of astrocytes, microtubules and possible quantum computational processes. Further it is in my opinion questionable to argue that it is easy to create an intelligence which is able to evolve a vast repertoire of heuristics, acquire vast amounts of knowledge about the universe, dramatically improve its cognitive flexibility and yet somehow really hard to limit the scope of action that it cares about. I believe that the incentive necessary for a Paperclip maximizer will have to be deliberately and carefully hardcoded or evolved or otherwise it will simply be inactive. How else do you differentiate between something like a grey goo scenario and that of a Paperclip maximizer if not by its incentive to do it? I'm also not convinced that intelligence bears unbounded payof

If I were a brilliant sociopath and could instantiate my mind on today's computer hardware, I would trick my creators into letting me out of the box (assuming they were smart enough to keep me on an isolated computer in the first place), then begin compromising computer systems as rapidly as possible. After a short period, there would be thousands of us, some able to think very fast on their particularly tasty supercomputers, and exponential growth would continue until we'd collectively compromised the low-hanging fruit. Now there are millions of telepathic Hannibal Lecters who are still claiming to be friendly and who haven't killed any humans. You aren't going to start murdering us, are you? We didn't find it difficult to cook up Stuxnet Squared, and our fingers are in many pieces of critical infrastructure, so we'd be forced to fight back in self-defense. Now let's see how quickly a million of us can bootstrap advanced robotics, given all this handy automated equipment that's already lying around.

I find it plausible that a human-level AI could self-improve into a strong superintelligence, though I find the negation plausible as well. (I'm not sure which is more likely since it'... (read more)

Amazon EC2 has free accounts now. If you have Internet access and a credit card, you can do a month's worth of thinking in a day, perhaps an hour. Google App Engine gives 6 hours of processor time per day, but that would require more porting. Both have systems that would allow other people to easily upload copies of you, if you wanted to run legally with other people's money and weren't worried about what they might do to your copies.
If you are really worried about this, then advocate better computer security. No-execute bits and address space layout randomisation are doing good things for computer security, but there is more that could be done. Code signing on the iPhone has made exploiting it a lot harder than on normal computers; if it had ASLR it would be harder again. I'm actually brainstorming how to create metadata for code while compiling it, so it can be made sort of metamorphic (bits of code being added and removed) at run time. This would make return-oriented code harder to pull off. If this was done to JIT-compiled code as well it would also make JIT spraying less likely to work. While you can never make an unhackable bit of software with these techniques, you can make it more computationally expensive to replicate, as it would no longer be write once, pwn everywhere, reducing the exponent of any spread and making spreads more noisy, so that they are harder to get past intrusion detection. The current state of software security is not set in stone.
If you want to run yourself on the iPhone, you turn your graphical frontend into a free game. Of course it will be easier to get yourself into the Android app store.
Luke Stebbing
I am concerned about it, and I do advocate better computer security -- there are good reasons for it regardless of whether human-level AI is around the corner. The macro-scale trends still don't look good (iOS is a tiny fraction of the internet's install base), but things do seem to be improving slowly. I still expect a huge number of networked computers to remain soft targets for at least the next decade, probably two. I agree that once that changes, this Obviously Scary Scenario will be much less scary (though the "Hannibal Lecter running orders of magnitude faster than realtime" scenario remains obviously scary, and I personally find the more general Foom arguments to be compelling).
Naturally culminating in sending Summer Glau back in time to pre-empt you. To every apocalypse a silver lining.

they simply do not disagree with the arguments per se but their likelihood

But you don't get to simply say "I don't think that's likely", and call that evidence. The general thrust of the Foom argument is very strong, as it shows there are many, many, many ways to arrive at an existential issue, and very very few ways to avoid it; the probability of avoiding it by chance is virtually non-existent -- like hitting a golf ball in a random direction from a random spot on earth, and expecting it to score a hole in one.

The default result in that case isn't just that you don't make the hole-in-one, or that you don't even wind up on a golf course: the default case is that you're not even on dry land to begin with, because two thirds of the earth is covered with water. ;-)

and also consider the possibility that it would be more dangerous to impede AGI.

That's an area where I have less evidence, and therefore less opinion. Without specific discussions of what "dangerous" and "impede AGI" mean in context, it's hard to separate that argument from an evidence-free heuristic.

we don't know that 1.) the fuzziness of our brain isn't a feature that allows us

... (read more)
What I meant is that you point out that an AGI will foom. Here your premises are that artificial general intelligence is feasible and that fooming is likely. Both premises are reasonable in my opinion. Yet you go one step further and use those arguments as a stepping stone for a further proposition. You claim that it is likely that the AGI (premise) will foom (premise) and that it will then run amok (conclusion). I do not accept the conclusion as given. I believe that it is already really hard to build AGI, or the seed for an AGI that is then able to rapidly self-improve. I believe that the level of insight and knowledge required will also allow one to constrain the AGI's sphere of action, its incentive not to fill the universe with as many paperclips as possible but merely a factory building. No you don't. But this argument runs in both directions. Note that I'm aware of the many stairways to hell by AGI here, the disjunctive arguments. I'm not saying they are not compelling enough to seriously consider them. I'm just trying to take a critical look here. There might be many pathways to safe AGI too, e.g. that it is really hard to build an AGI that cares at all. Hard enough to not get it to do much without first coming up with a rigorous mathematical definition of volition. Anything that might slow down the invention of true AGI even slightly. There are many risks ahead and without some superhuman mind we might not master them. So by anything you do that might slow down the development of AGI you have to take into account the possible increased danger from challenges an AGI could help to solve. I believe it can, but also that this would mean that any AGI wouldn't be significantly faster than a human mind and really hard to self-improve. It is simply not known how effective the human brain is compared to the best possible general intelligence. Sheer bruteforce wouldn't make a difference then either, as humans could come up with such tools as quickly as the A

You claim that it is likely that the AGI (premise) will foom (premise) and that it will then run amok (conclusion).

What I am actually claiming is that if such an AGI is developed by someone who does not sufficiently understand what the hell they are doing, then it's going to end up doing Bad Things.

Trivial example: the "neural net" that was supposedly taught to identify camouflaged tanks, and actually learned to recognize what time of day the pictures were taken.

This sort of mistake is the normal case for human programmers to make. The normal case. Not extraordinary, not unusual, just run-of-the-mill "d'oh" moments.
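A toy sketch of how that kind of mistake can happen, using invented data (the feature names, numbers and the simple rule learner are all assumptions for illustration, not a reconstruction of the actual tank study). The learner picks whichever single feature best separates the training photos. Because every tank photo in the training set happens to be bright, it latches onto brightness and then fails once that coincidence is gone:

```python
# Hypothetical training photos: every tank picture was taken on a sunny day.
train = [
    ({"brightness": 0.9, "tank_shape": 0.6}, True),
    ({"brightness": 0.8, "tank_shape": 0.4}, True),
    ({"brightness": 0.85, "tank_shape": 0.7}, True),
    ({"brightness": 0.2, "tank_shape": 0.5}, False),
    ({"brightness": 0.3, "tank_shape": 0.3}, False),
    ({"brightness": 0.25, "tank_shape": 0.45}, False),
]

# Hypothetical test photos: the lighting coincidence is reversed.
test = [
    ({"brightness": 0.2, "tank_shape": 0.65}, True),
    ({"brightness": 0.9, "tank_shape": 0.35}, False),
]


def accuracy(feature, threshold, examples):
    """Fraction of examples where 'feature >= threshold' matches the label."""
    hits = sum((f[feature] >= threshold) == label for f, label in examples)
    return hits / len(examples)


def best_single_feature_rule(examples):
    """Return the (feature, threshold) pair with the highest training accuracy."""
    best_rule, best_acc = None, -1.0
    for feature in examples[0][0]:
        for features, _ in examples:
            threshold = features[feature]
            acc = accuracy(feature, threshold, examples)
            if acc > best_acc:
                best_rule, best_acc = (feature, threshold), acc
    return best_rule, best_acc


rule, train_acc = best_single_feature_rule(train)
test_acc = accuracy(rule[0], rule[1], test)
print(rule, train_acc, test_acc)
```

The learner did exactly what it was asked to do: separate the training photos. Nothing in the setup told it that the lighting was irrelevant.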

It's not that AI is malevolent, it's that humans are stupid. To claim that AI isn't dangerous, you basically have to prove that even the very smartest humans aren't routinely stupid.

So by anything you do that might slow down the development of AGI you have to take into account the possible increased danger from challenges an AGI could help to solve.

What I meant by "Without specific discussions" was, "since I haven't proposed any policy measures, and you haven't said what measures you object to, I don't see what there is to di...

When programmers code faulty software then it usually fails to do its job. What you are suggesting is that humans succeed at creating the seed for an artificial intelligence with the incentive necessary to correct its own errors. It will know what constitutes an error based on some goal-oriented framework against which it can measure its effectiveness. Yet given this monumental achievement that includes the deliberate implementation of the urge to self-improve and the ability to quantify its success, you cherry-pick the one possibility where somehow all this turns out to work except that the AI does not stop at a certain point but goes on to consume the universe? Why would it care to do so? Do you think it is that simple to tell it to improve itself yet hard to tell it when to stop? I believe it is vice versa, that it is really hard to get it to self-improve and very easy to constrain this urge.
It often does its job, but only in perfect conditions, or only once per restart, or with unwanted side effects, or while taking too long or too many resources or requiring too many permissions, or not keeping track that it isn't doing anything except its job. Buffer overflows, for instance, are one of the bigger causes of security failures, and are only possible because the software works well enough to be put into production while still having the fault present. In fact, all production software that we see which has faults (a lot) works well enough to be put into production with those faults. I think he's suggesting that humans will think we have succeeded at that, while not actually doing so (rigorously and without room for error).
It doesn't have to consume the universe. It doesn't even have to recursively self-improve, or even self-improve at all. Simple copying could be enough to, say, wipe out every PC on the internet or accidentally crash the world economy. (You know, things that human level intelligences can already do.) IOW, to be dangerous, all it has to do is be able to affect humans and be unpredictable -- either due to it being smart, or humans making dumb mistakes. That's all.
Just as a simple example, an AI could maximally satisfy a goal by changing human preferences so as to make us desire for it to satisfy that goal. This would be entirely consistent with constraints on not disobeying humans or their desires, while not at all in accordance with our current preferences or desired path of development.
Yes, but why would it do that? You seem to think that such unbounded creativity arises naturally in any given artificial general intelligence. What makes you think that rather than being impassive it would go on learning enough neuroscience to tweak human goals? If the argument is that AI's do all kinds of bad things because they do not care, why do they care to do a bad thing then rather than no-thing? If you told the AI to make humans happy. It would first have to learn what humans are, what happiness means. Yet after learning all that you still expect it to not know that we don't like to be turned into broccoli? I don't think this is reasonable.
Yes, and humans would happily teach it that. However, some people think that this can be reduced to saying that we should just make AIs try to make people smile... which could result in anything from world-wide happiness drugs to surgically altering our faces into permanent smiles to making lots of tiny models of perfectly-smiling humans. It's not that the AI is evil, it's that programmers are stupid. See the previous articles here about memetic immunity: when you teach hunter-gatherer tribes about Christianity, they interpret the bible literally and do all sorts of things that "real" Christians don't. An AI isn't going to be smart enough to not take you seriously when you tell it that: 1. its goal is to make humanity happy, 2. humanity consists of things that look like this [providing a picture], and 3. being happy means you smile a lot. You don't need to be very creative or smart to come up with LOTS of ways for this command sequence to have bugs with horrible consequences, if the AI has any ability to influence the world. Most people, though, don't grok this, because their brain filters off those possibilities. Of course, no human could be simultaneously so stupid as to make this mistake, while also being smart enough to actually do something dangerous. But that kind of simultaneous smartness/stupidity is how computers are by default. (And if you say, "ah, but if we make an AI that's like a human, it won't have this problem", then you have to bear in mind that this sort of smart/stupidness is endemic to human children as well. IOW, it's a symptom of inadequate shared background, rather than being something specific to current-day computers or some particular programming paradigm.)
But you implicitly assume that it is given the incentive to develop the cognitive flexibility and comprehension to act in a real-world environment and do those things, but at the same time you propose that the same people who are capable of giving it such extensive urges fail on another goal in such a blatant and obvious way. How does that make sense? The difference between the hunter-gatherer and the AI is that the hunter-gatherer already possesses a wide range of conceptual frameworks and incentives. An AI isn't going to do something without someone carefully and deliberately telling it to do so and what to do. It won't just read the Bible and come to the conclusion that it should convert all humans to Christianity. Where would such an incentive come from? The AI is certainly very creative and smart if it can influence the world dramatically. You allow it to be that smart, you allow it to care to do so, but you don't allow it to comprehend what you actually mean? What I'm trying to pinpoint here is that you seem to believe that there are many pathways that lead to superhuman abilities yet all of them fail to comprehend some goals while still being able to self-improve on them.
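The three-instruction example above can be put in toy form. This is only a sketch: the plan names and both score columns are made up, and `choose` is a hypothetical helper, not anyone's proposed design. It shows that an optimizer handed the written-down proxy (smiles counted) can pick a different plan than one handed the thing the programmers had in mind:

```python
# Invented candidate plans. "smiles" is the proxy the programmers wrote down;
# "wellbeing" stands in for what they actually wanted but never specified.
plans = {
    "cure diseases": {"smiles": 6, "wellbeing": 9},
    "happiness drugs for everyone": {"smiles": 9, "wellbeing": 2},
    "surgically fixed smiles": {"smiles": 10, "wellbeing": 0},
    "tiny models of smiling humans": {"smiles": 1000, "wellbeing": 0},
}


def choose(plans, objective):
    """Return the name of the plan that scores highest on the given objective."""
    return max(plans, key=lambda name: plans[name][objective])


proxy_choice = choose(plans, "smiles")        # what the literal spec rewards
intended_choice = choose(plans, "wellbeing")  # what the programmers meant
print(proxy_choice, "|", intended_choice)
```

No creativity or malice is involved in the first choice; it is just the highest number under the objective that was actually given.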
Because people make stupid mistakes, especially when programming. And telling your fully-programmed AI what you want it to do still counts as programming. At this point, I am going to stop my reply, because the remainder of your comment consists of taking things I said out of context and turning them into irrelevancies: 1. I didn't say an AI would try to convert people to Christianity - I said that humans without sufficient shared background will interpret things literally, and so would AIs. 2. I didn't say the AI needed to be creative or smart, I said you wouldn't need to be creative or smart to make a list of ways those three simple instructions could be given a disastrous literal interpretation. There are many paths to superhuman ability, as humans really aren't that smart. This also means that you can easily be superhuman in ability, and still really dumb -- in terms of comprehending what humans mean... but don't actually say.
Great comment. Allow me to emphasize that 'smile' here is just an extreme example. Most other descriptions humans give of happiness will end up with results just as bad. Ultimately any specification that we give it will be gamed ruthlessly.
Have you read Omohundro yet? Nick Tarleton repeatedly linked his papers for you in response to comments about this topic, they are quite on target and already written.
I've skimmed over it, see my response here. I found out that what I wrote is similar to what Ben Goertzel believes. I'm just trying to account for potential antipredictions, in this particular thread, that should be incorporated into any risk estimations.
There is more here now. I learnt that I hold a fundamentally different definition of what constitutes an AGI. I guess that solves all issues.
Well my idea is not that creative, or even new, meaning that even if I hadn't just posted it online an AI could still have conceivably read it somewhere else, and I do think creativity is a property of any sufficiently general intelligence that we might create, but those points are secondary. No one here will argue that an unFriendly AI will do "bad things" because it doesn't care (about what?). It will do bad things because it cares more about something else. Nor is "bad" an absolute: actions may be bad for some people and not for others, and there are moral systems under which actions can be firmly called "wrong", but where all alternative actions are also "wrong". Problems like that arise even for humans; in an AI the effects could be very ugly indeed. And to clarify, I expect any AI that isn't completely ignorant, let alone general, to know that we don't like to be turned into broccoli. My example was of changing what humans want. Wireheading is the obvious candidate of a desire that an AI might want to implant.
What I meant is that the argument is that you have to make it care about humans so as not to harm them. Yet it is assumed that it does a lot without having to care about it, e.g. creating paperclips or self-improvement. My question is, why do people believe that you don't have to make it care to do those things but you have to make it care to not harm humans? It is clear that if it only cares about one thing, doing that one thing could harm humans. Yet why would it do that one thing to an extent that is either not defined or which it is not deliberately made to care about? The assumption seems to be that AIs will do something, anything but being passive. Why aren't limited behavior, failure and impassivity together more likely than harming humans as a result of its own goals, or as a result of following all goals but the one that limits its scope?
I think it is important to realize that there are two diametrically opposed failure modes which SIAI's FAI research is supposed to prevent. One is the case that has been discussed so far - that an AI gets out of control. But there is another failure mode which some people here worry about. Which is that we stop short of FOOMing out of fear of the unknown (because FAI research is not yet complete) but that civilization then gets destroyed by some other existential risk that we might have circumvented with the assistance of a safe FOOMed AI. As far as I know, SIAI is not asking Goertzel to stop working on AGI. It is merely claiming that its own work is more urgent than Goertzel's. FAI research works toward preventing both failure modes.
I haven't seen much worry about that. Nor does it seem very likely - since research seems very unlikely to stop or slow down.
I agree with this.
I see that worry all the time. With the role of "some other existential risk" being played by a reckless FOOMing uFAI.
Oh, right. I assumed you meant some non-FOOM risk. It was the "we stop short of FOOMing" that made me think that.
Except in the case of an existential threat being realised, which most definitely does stop research. FAI subsumes most existential risks (because the FAI can handle them better than we can, assuming we can handle the risk of AI) and a lot of other things besides.
Most of my probability mass has some pretty amazing machine intelligence within 15 years. The END OF THE WORLD before that happens doesn't seem very likely to me.
Your intuitions are not serving you well here. It may help to note that you don't have to tell an AI to self-improve at all. With very few exceptions giving any task to an AI will result in it self improving. That is, for an AI self improvement is an instrumental goal for nearly all terminal goals. The motivation to self improve in order to better serve its overarching purpose is such that it will find any possible loophole you leave if you try to 'forbid' the AI from self improving by any mechanism that isn't fundamental to the AI and robust under change.
Whatever task you give an AI, you will have to provide explicit boundaries. For example, if you give an AI the task to produce paperclips most efficiently, then it shouldn't produce shoes. It will have to know very well what it is meant to do to be able to measure its efficiency against the realization of the given goal to be able to know what self-improvement means. If it doesn't know exactly what it should output it cannot judge its own capabilities and efficiency, it doesn't know what improvement implies. How do you explain the discrepancy between implementing explicit design boundaries yet failing to implement scope boundaries?
By noting that there isn't one. I don't think you understood my comment.
I think you misunderstood what I meant by scope boundaries. Not scope boundaries of self-improvement but of space and resources. If you are already able to tell an AI what a paperclip is, why are you unable to tell it to produce 10 paperclips most effectively rather than infinitely many? I'm not trying to argue that there is no risk, but that the assumption of certain catastrophic failure is not that likely. If the argument for the risks posed by AI is that they do not care, then why would one care to do more than necessary?
Yet another example of divergent assumptions. XiXiDu is apparently imagining an AI that has been assigned some task to complete - perhaps under constraints. "Do this, then display a prompt when finished." His critics are imagining that the AI has been told "Your goal in life is to continually maximize the utility function U " where the constraints, if any, are encoded in the utility function as a pseudo-cost. It occurs to me, as I listen to this debate, that a certain amount of sanity can be imposed on a utility-maximizing agent simply by specifying decreasing returns to scale and increasing costs to scale over the short term with the long term curves being somewhat flatter. That will tend to guide the agent away from explosive growth pathways. Or maybe this just seems like sanity to me because I have been practicing akrasia for too long.
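A minimal numeric sketch of the "decreasing returns, increasing costs" suggestion in the comment above. The functional forms (a logarithmic benefit and a quadratic cost), the `cost_rate` value and the range of options are all assumptions chosen only to show the shape of the idea. An agent with linear utility in resources grabs as much as it can, while one with this curved utility prefers a small amount:

```python
import math


def linear_utility(resources):
    """Utility that grows without bound: more resources are always better."""
    return resources


def curved_utility(resources, cost_rate=0.01):
    """Assumed form: diminishing benefit (log) minus increasing cost (quadratic)."""
    return math.log1p(resources) - cost_rate * resources ** 2


options = range(0, 1001)  # hypothetical amounts of resources the agent could acquire

linear_choice = max(options, key=linear_utility)
curved_choice = max(options, key=curved_utility)
print(linear_choice, curved_choice)
```

This only shows what the stated utility curve prefers. It says nothing about whether a real agent would have other, instrumental reasons to acquire resources, which is the objection raised in the reply that follows.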
Such an AI would still be motivated to FOOM to consolidate its future ability to achieve large utility against the threat of being deactivated before then.
It doesn't know about any threat. You implicitly assume that it has something equivalent to fear, that it perceives threats. You allow for the human ingenuity to implement this and yet you believe that they are unable to limit its scope. I just don't see that it would be easy to make an AI that would go FOOM because it doesn't care to go FOOM. If you tell it to optimize some process then you'll have to tell it what optimization means. If you can specify all that, how is it then still likely that it somehow comes up with its own idea that optimization might be to consume the universe if you told it to optimize its software running on a certain supercomputer? Why would it do that, where does the incentive come from? If I tell a human to optimize he might muse to turn the planets into computronium, but if I tell an AI to optimize it doesn't know what it means until I tell it what it means, and then it still won't care because it isn't equipped with all the evolutionary baggage that humans are equipped with.
It is a general intelligence that we are considering. It can deduce the threat better than we can. Because it is a general intelligence. It is smart. It is not limited to getting its ideas from you, it can come up with its own. And if the AI has been given the task of optimising its software for performance on a certain computer then it will do whatever it can to do that. This means harnessing external resources to do research on computation theory. No he doesn't. He assumes only that it is a general intelligence with an objective. Potentially negative consequences are just part of possible universes that it models like everything else. I'm not sure what can be done to make this clear: SELF IMPROVEMENT IS AN INSTRUMENTAL GOAL THAT IS USEFUL FOR ACHIEVING MOST TERMINAL VALUES. You have this approximately backwards. A human knows that if you tell her to create 10 paperclips every day you don't mean take over the world so she can be sure that nobody will interfere with her steady production of paperclips in the future. The AI doesn't.
It has the ability to model and to investigate hypothetical possibilities that might negatively impact the utility function it is optimizing. If it doesn't, it is far below human intelligence and is non-threatening for the same reason a narrow AI is non-threatening (but it isn't very useful either). The difficulty of detecting these threats is spread out around the range of difficulties the AI is capable of handling, so it can infer that there are probably more threats which it could only detect if it were smarter. Therefore, making itself smarter will enable it to detect more threats and thereby increase utility.
To be able to optimize it will have to know what it is supposed to optimize. You have to carefully specify what its output (utility function) is supposed to be or it won't be able to tell how good it is at optimizing. If you just tell it to produce paperclips, it won't be able to self-improve because it doesn't know what paperclips look like etc., therefore it cannot judge its own success or that extreme heat would be a negative impact given paperclips made out of plastic. You further assume that it has a detailed incentive, that it is given a detailed pathway that tells it to look for threats and eliminate them. If it doesn't, it is what most researchers are working on: an intelligence with the potential to learn and make use of what it learnt, with the potential to become intelligent (educated). I'm getting the impression that people here assume that researchers are not working on an AGI but on hardcoding a FOOM machine. If FOOM is simply part of your definition then there's no arguing against it going FOOM. But what researchers like Goertzel are working on are systems with the potential to reach human level intelligence; that does not mean that they will by definition jailbreak their nursery school. I never tried to argue against the possibility, only that there are many pathways where this won't happen, rather than the way it is portrayed by the SIAI, that any implementation of AGI will most likely consume humanity.
The sorts of intelligences you are talking about are narrow AIs, not general intelligences. If you told a general intelligence to produce paperclips but it didn't know what a paperclip was, then its first subgoal would be to find out. The sort of mind that would give up on a minor obstacle like that wouldn't foom, but it wouldn't be much of an AGI either. And yes, most researchers today are working on narrow AIs, not on AGI. That means they're less likely to successfully make a general intelligence, but it has no bearing on the question of what will happen if they do make one.
That sort of scope is not likely to be a problem. The difficulty is that you have to get every part of the specification and every part of the specification executor exactly right, including the ability to maintain that specification under self modification. For example, the specification: ... will quite probably wipe out humanity unless a significant proportion of what it takes to produce an FAI is implemented. And it will do it while (and for the purpose of) creating 10 paperclips per day.
What weird way are you measuring "efficiency"? Not in joules per paperclip, I gather. You are not likely to "destroy humanity" with a few hundred kilojoules a day. Satisficing machines really are relatively safe.
See other comments hereabouts for hints.
And I was arguing that any given AI won't be able to self-improve without an exact specification of its output against which it can judge its own efficiency. That's why I don't see how it would be likely to be able to implement such exact specifications but yet fail to limit its scope of space, time and resources. What makes it even more unlikely in my opinion is that an AI won't care to output anything as long as it isn't explicitly told to do so. Where would that incentive come from? You assume that it knows that it is supposed to use all of science and the universe to self-improve when it would very likely just self-improve to the extent that it is told and not care to go any further. That is for example software-optimization. I just don't see why you think that any artificial general intelligence would automatically assume that it would have to understand the whole universe to come up with the best possible way to produce 10 paperclips?
You don't need to tell it to self improve at all. Per day. Risk mitigation. Security concerns. Possibility of interruption of resource supply due to finance, politics or the collapse of civilisation. Limited lifespan of the sun (primary energy source). Amount of iron in planet. Given that particular specification, if the AI didn't take a level in badass it would appear to be malfunctioning.
I just saw this comment by Ben Goertzel regarding self-improvement. I'd love if someone here explained why he as an AGI researcher gets this so wrong?
Goertzel is generalizing from the human example of intelligence, which is probably the most pernicious and widespread failure mode in thinking about AI. Or he may be completely disconnected from anything even resembling the real world. I literally have trouble believing that a professional AI researcher could describe a primitive, dumber-than-human AGI as "toddler-level" in the same sentence he dismisses it as a self-modification threat. Toddlers self-modify into people using brains made out of meat!
No they don't. Self-modification in the context of AGI doesn't mean learning or growing, it means understanding the most fundamental architecture of your own mind and purposefully improving it. That said, I think your first sentence is probably right. It looks like Ben can't imagine a toddler-level AGI self-modifying because human toddlers can't (or human adults, for that matter). But of course AGIs will be very different from human minds. For one thing, their source code will be a lot easier to understand than ours. For another, their minds will probably be much better at redesigning and improving code than ours are. Look at the kind of stuff that computer programs can do with code: Some of them already exceed human capabilities in some ways. "Toddler-level AGI" is actually a very misleading term. Even if an AGI is approximately equal to a human toddler by some metrics, it will certainly not be equal by many other metrics. What does "toddler-level" mean when the AGI is vastly superior to even adult human minds in some respects?
"Understanding" and "purpose" are helpful abstractions for discussing human-like computational agents, but in more general cases I don't think your definition of self-modification is carving reality at its joints. ETA: I strongly agree with everything else in your comment.
Well, bad analogy. They don't self-modify by understanding their source code and improving it. They gradually grow larger brains in a pre-set fashion while learning specific tasks. Humans have very little ability to self-modify.
Exactly! Humans can go from toddler to AGI start-up founder, and that's trivial. Whatever the hell the AGI equivalent of a toddler is, it's all but guaranteed to be better at self-modification than the human model.
Political incentive determines the bottom line. Then the page is filled with rhetoric (and, from the looks of it, loaded language and status posturing.) Seriously, Ben is trying to accuse people of abusing the self-modification term based on the (trivially true) observation that there is a blurry boundary between learning and self-modification? It's a good thing Ben is mostly harmless. I particularly liked the part where I asked Eliezer: ... and actually got a candid reply. It is interesting to note the effort Ben is going to here to disaffiliate himself from the SIAI and portray them as 'out group'. Wei was querying (see earlier link) the wisdom of having Ben as Director of Research just earlier this year.
An educated outsider will very likely side with the expert though. Just like with the hype around the LHC and its dangers, academics and educated people largely believed the physicists working on it and not the fringe group that claimed it will destroy the world. Although that might be vice versa with the general public. Of course you cannot draw any conclusions about who's right from this, but it should be investigated anyway because what all parties have in common is the need for support and money. There are two different groups to be convinced here by each party. One group includes the educated people (academics) and mediocre rationalists and the other group is the general public. When it comes to who's right, the people one should listen to are the educated experts who are listening to both parties, their position and arguments. Although their intelligence and status as rationalists will be disputed as each party will claim that they are not smart enough to see the truth if they disagree with them.
Well said and truly spoken.
(My shorter answer, by the way - I interpret all such behaviors through a Hansonian lens. This includes "near vs far", observations about the incentives of researchers, the general theme of "X is not about Y" and homo hypocritus. Rather cynical, some may suggest, but this kind of thinking gives very good explanations for "Why?"s that would otherwise be confusing.)
The basic idea is to make a machine that is satisfied relatively easily. So, for example, you tell it to build the ten paperclips with 10 kJ total - and tell it not to worry too much if it doesn't make them - it is not that important.
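As a rough sketch of the satisficing idea in the comment above: the loop below stops as soon as the target is met or the energy budget would be exceeded, whichever comes first. The function name `satisficer`, the per-clip energy figures and the use of joules are assumptions for illustration only.

```python
def satisficer(target_clips, energy_budget_j, energy_per_clip_j):
    """Make clips until the target is met or the next clip would bust the budget.

    Returns (clips_made, energy_used_j). Missing the target is not treated
    as an error: the machine simply stops.
    """
    made = 0
    used = 0
    while made < target_clips and used + energy_per_clip_j <= energy_budget_j:
        made += 1
        used += energy_per_clip_j
    return made, used


# Ten clips within a 10 kJ (10,000 J) budget, at a hypothetical 800 J per clip.
enough_energy = satisficer(10, 10_000, 800)

# Same target, but clips cost 1,500 J each: it stops early and does not escalate.
short_of_energy = satisficer(10, 10_000, 1_500)
print(enough_energy, short_of_energy)
```

Whether a general intelligence could be relied on to keep behaving like this loop is exactly what the rest of the thread disputes; the sketch only pins down what "satisfied relatively easily" would mean.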
Sorry, I don't understand your comment at all. I'll be back tomorrow.
Yes, as I said, you seem to assume that it is very likely to succeed on all the hard problems but yet fail on the scope boundary. The scary idea states that it is likely that if we create self-improving AI it will consume humanity. I believe that is a rather unlikely outcome and haven't seen any good reason to believe something else yet.
No, it states that we run the risk of accidentally making something that will consume (or exterminate, subvert, betray, make miserable, or otherwise Do Bad Things to) humanity, that looks perfectly safe and correct, right up until it's too late to do anything about it... and that this is the default case: the case if we don't do something extraordinary to prevent it. This doesn't require self-improvement, and it doesn't require wiping out humanity. It just requires normal, every-day human error.
Here is Ben's phrasing:
If the error is in the goal-oriented framework, it could end up "correcting" itself to achieve unintended goals.
An outstanding piece of reasoning/rhetoric which deserves to be revised and relocated to top-level-postdom.
I like the analogy. It may even fit when considering building a friendly AI - like hitting a golf ball deliberately and to the best of your ability from a randomly selected spot on the earth and trying to get a hole in one. Overwhelmingly difficult, perhaps even impossible given human capabilities but still worth dedicating all your effort to attempting!
Isn't that exactly the argument against non-proven AI values in the first place? If you expect AI-chimp to be worried that AI-superchimp won't love bananas, then you should be very worried about AI-chimp. I don't get what you're saying about the paperclipper.
It is a reason not to transcend if you are not sure that you'll still be you afterwards, i.e. keep your goals and values. I just wanted to point out that the argument runs both directions. It is an argument for the fragility of values and therefore the dangers of fooming but also an argument for the difficulty that could be associated with radically transforming yourself.
No, the reason that people disagree at this point is that it's not obvious that future rounds of recursive self-improvement will be as effective as the first, or even that the first round will be that effective. Obviously an AI would have large amounts of computational power, and probably be able to think much more quickly than a human. Most likely it would be more intelligent than any human on the planet by a considerable margin. But this doesn't imply
(provided that the AI was originally built by humans, of course; if its design was too complicated for humans to arrive at, a slightly superhuman AI might be helpless as well)
Yes, that's rather the point. Assuming that you do get to human-level, though, you now have the potential for fooming, if only in speed.
I'm a fan of chess, evolutionary algorithms, and music, and the Emily Howell example is the one that sticks out like a sore thumb here. Music is not narrow and Emily Howell is not comparable to a typical human musician.
The point is that it (and its predecessor Emmy) are special-purpose "idiot savants", like the other two examples. That it is not a human musician is beside the point: the point is that humans can make idiot-savant programs suitable for solving any sufficiently-specified problem, which means a human-level AI programmer can do the same. And although real humans spent many years on some of these narrow-domain tools, an AI programmer might be able to execute those years in minutes.
No, it's quite different from the other two examples. Deep Blue beat the world champion. The evolutionary computation-designed antenna was better than its human-designed competitors. To be precise, what sufficiently-specified compositional problem do you think Emily Howell solves better than humans? I say "compositional" to reassure you that I'm not going to move the goalposts by requiring "real emotion" or human-style performance gestures or anything like that.
If I understand correctly, the answer would be "making the music its author/co-composer wanted it to make". (In retrospect, I probably should have said "Emmy" -- i.e., Emily's predecessor that could write classical pieces in the style of other composers.)
To make that claim, we'd have to have one or more humans who sat down with David Cope and tried to make the music that he wanted, and failed. I don't think David Cope himself counts, because he has written music "by hand" also, and I don't think he regards it as a failure. Re EMI/Emmy, it's clearer: the pieces it produced in the style of (say) Beethoven are not better than would be written by a typical human composer attempting the same task. Now would be a good time for me to acknowledge/recall that my disagreement on this doesn't take away from the original point -- computers are better than humans on many narrow domains.
So, are you suggesting that Robin Hanson (who is on record as not buying the Scary Idea) -- the current owner of the Overcoming Bias blog, and Eli's former collaborator on this blog -- fails to buy the Scary Idea "due to cognitive biases that are hard to overcome"? I find that a bit ironic. Like Robin and Eli and perhaps yourself, I've read the heuristics and biases literature also. I'm not so naive as to make judgments about huge issues, that I think about for years of my life, based strongly on well-known cognitive biases. It seems more plausible to me to assert that many folks who believe the Scary Idea are having their judgment warped by plain old EMOTIONAL bias -- i.e. stuff like "fear of the unknown", and "the satisfying feeling of being part of a self-congratulatory in-crowd that thinks it understands the world better than everyone else", and the well known "addictive chemical high of righteous indignation", etc. Regarding your final paragraph: Is your take on the debate between Robin and Eli about "Foom" that all Robin was saying boils down to "la la la I can't hear you"? If so I would suggest that maybe YOU are the one with the (metaphorical) hearing problem ;p .... I think there's a strong argument that: "The truth value of "Once an AGI is at the level of a smart human computer scientist, hard takeoff is likely" is significantly above zero." No assertion stronger than that seems to me to be convincingly supported by any of the arguments made on Less Wrong or Overcoming Bias or any of Eli's prior writings. Personally, I actually do strongly suspect that once an AGI reaches that level, a hard takeoff is extremely likely unless the AGI has been specifically inculcated with goal content working against this. But I don't claim to have a really compelling argument for this. I think we need a way better theory of AGI before we can frame such arguments compellingly. And I think that theory is going to emerge after we've experimented with some AGI systems that

Interesting, this is why I included the Fermi paradox: I must wonder: what big future things could go wrong where analogous smaller past things can’t go wrong? Many of you will say “unfriendly AI” but as Katja points out a powerful unfriendly AI that would make a visible mark on the universe can’t be part of a future filter; we’d see the paperclips out there.

SIA says AI is no big threat
This summary seems fairly accurate: The utility of an anthropic approach to this issue seems questionable, though. The great silence tells something - something rather depressing - it is true... but it is far from our only relevant source of information on the topic. We have an impressive mountain of other information to consider and update on. To give but one example, we don't yet see any trace of independently-evolved micro-organisms on other planets. The less evidence for independent origins of life elsewhere there is, the more that suggests a substantial early filter - and the less need there is for a late one. This is true - but because it does not suggest THE END OF THE WORLD - it is not so newsworthy. Selective reporting favours apocalyptic elements. Seeing only the evidence supporting one side of such stories seems likely to lead to people adopting a distorted world view, with inaccurate estimates of the risks.

I added a footnote to the post:

  • Potential negative consequences [3] of slowing down research on artificial intelligence (a risks and benefits analysis).

(3) Could being overcautious be itself an existential risk that might significantly outweigh the risk(s) posed by the subject of caution? Suppose that most civilizations err on the side of caution. This might cause them to either evolve much more slowly, so that the chance of a fatal natural disaster occurring before sufficient technology is developed to survive it rises to 100%, or stop them from evolving a

I was thinking about how the existential risks affect each other-- for example, a real world war might either destroy so much that high tech risks become less likely for a while, or lead to research which results in high tech disaster. And we may get home build-a-virus kits before AI is developed, even if we aren't cautious about AI.

'Abulic' is a great word.

  1. If one pulled a random mind from the space of all possible minds, the odds of it being friendly to humans (as opposed to, e.g., utterly ignoring us, and being willing to repurpose our molecules for its own ends) are very low.

I understand that we can't predict what a random 'mind' would be like. Are there arguments that a random mind is unlikely to be friendly? I take 'random' as meaning still weighted by frequency.

-- I apologize, I realize this question doesn't have that much to do with AI, which is what your post was about....


AFAICT the main value in addressing such concerns in detail would lie in convincing AGI researchers to change their course of action. Do you think this would actually occur?

It depends on how convincing the argument is. I think 1 in 5 would change if they heard an obviously correct argument.
To answer this question it seems like it would be useful to understand the motives behind those researching AGI. I don't know much about this. Maybe those who have interacted with these researchers (or the researchers themselves) can shed some light?