What I would like the SIAI to publish

Major update here.

Related to: Should I believe what the SIAI claims?

Reply to: Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It)

... pointing out that something scary is possible, is a very different thing from having an argument that it’s likely. — Ben Goertzel

What I ask for:

I want the SIAI or someone who is convinced of the Scary Idea1 to state concisely and mathematically (with extensive references if necessary) the decision procedure that led them to make the development of friendly artificial intelligence their top priority. I want them to state the numbers of their subjective probability distributions2 and to exemplify their chain of reasoning: how they arrived at those numbers, and not others, by way of sober calculation.

The paper should also account for the following uncertainties:

  • A comparison with other existential risks, and why catastrophic risks from artificial intelligence outweigh them.
  • Potential negative consequences3 of slowing down research on artificial intelligence (a risks and benefits analysis).
  • The likelihood of a gradual and controllable development versus the likelihood of an intelligence explosion.
  • The likelihood of unfriendly AI4 versus friendly or merely abulic5 AI.
  • Whether superhuman intelligence and cognitive flexibility alone constitute a serious risk, given the absence of enabling technologies like advanced nanotechnology.
  • The feasibility of “provably non-dangerous AGI”.
  • The disagreement of the overwhelming majority of scientists working on artificial intelligence.
  • That some people who are aware of the SIAI’s perspective do not accept it (e.g. Robin Hanson, Ben Goertzel, Nick Bostrom, Ray Kurzweil and Greg Egan).
  • Possible conclusions that can be drawn from the Fermi paradox6 regarding risks associated with superhuman AI versus other potential risks ahead.

Further, I would like the paper to lay out a formal and systematic summary of what the SIAI expects researchers who work on artificial general intelligence to do, and why they should do so. I would like to see a clear logical argument for why people working on artificial general intelligence should listen to what the SIAI has to say.

Examples:

Here are two examples of what I'm looking for:

The first example is Robin Hanson demonstrating his estimation of the simulation argument. The second example is Tyler Cowen and Alex Tabarrok presenting the reasons for their evaluation of the importance of asteroid deflection.
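As a toy illustration of the kind of explicit, reproducible calculation I have in mind, consider the sketch below. All numbers are made up for the sake of the example; they are not the SIAI's estimates or mine.

    # Toy expected-loss comparison of existential risks (hypothetical numbers).
    risks = {
        # name: (subjective P(catastrophe this century), fraction of future value lost)
        "unfriendly AGI":      (0.05, 1.0),
        "engineered pandemic": (0.03, 0.5),
        "asteroid impact":     (0.0001, 0.8),
    }

    for name, (p, loss) in sorted(risks.items(),
                                  key=lambda kv: kv[1][0] * kv[1][1],
                                  reverse=True):
        print(f"{name:22s} expected loss = {p * loss:.5f}")

The point is not the particular numbers but that anyone could substitute their own and rerun the comparison.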

Reasons:

I'm wary of using inferences derived from reasonable but unproven hypotheses as foundations for further speculative thinking and calls for action. Although the SIAI does a good job of stating reasons to justify its existence and monetary support, it neither substantiates its initial premises to an extent that would allow an outsider to draw conclusions about the probability of the associated risks, nor does it clarify its position regarding contemporary research in a concise and systematic way. Nevertheless, estimates are given, such as a high likelihood of humanity's demise if we develop superhuman artificial general intelligence without first defining, mathematically, how to prove its benevolence. But those estimates are not derived in the open: no decision procedure is provided for arriving at the given numbers, and one cannot reassess them without the necessary variables and formulas. I believe this is unsatisfactory; it lacks transparency and a reproducible corroboration of one's first principles. This is not to say that it is wrong to state probability estimates and update them given new evidence, but that, although those ideas can well serve as a call for caution, they are not compelling without further substantiation.


1. If anyone actively trying to build advanced AGI succeeds, we’re highly likely to cause an involuntary end to the human race.

2. Stop taking the numbers so damn seriously, and think in terms of subjective probability distributions [...], Michael Anissimov (existential.ieet.org mailing list, 2010-07-11)

3. Could being overcautious itself be an existential risk that might significantly outweigh the risk(s) posed by the subject of caution? Suppose that most civilizations err on the side of caution. This might cause them either to develop much more slowly, so that the chance of a fatal natural disaster occurring before sufficient technology is developed to survive it approaches 100%, or to stop developing altogether, being unable to prove something 100% safe before trying it and thus never taking the steps necessary to become less vulnerable to naturally existing existential risks. Further reading: Why safety is not safe

4. If one pulled a random mind from the space of all possible minds, the odds of it being friendly to humans (as opposed to, e.g., utterly ignoring us, and being willing to repurpose our molecules for its own ends) are very low.

5. Loss or impairment of the ability to make decisions or act independently.

6. The Fermi paradox provides the only data we can analyze that amounts to empirical criticism of concepts like the Paperclip maximizer, and of general risks from superhuman AIs with non-human values, without working directly on AGI to test those hypotheses ourselves. If you accept the premise that life is not unique and special, then one other technological civilisation in the observable universe should be sufficient to leave potentially observable traces of technological tinkering. Given the absence of any signs of intelligence out there, especially paperclippers burning the cosmic commons, we might conclude that unfriendly AI may not be the most dangerous existential risk that we should worry about.

Comments:

Stop taking the numbers so damn seriously, and think in terms of subjective probability distributions [...], Michael Anissimov

I think it's worth giving the full quote:

Stop taking the numbers so damn seriously, and think in terms of subjective probability distributions, discard your mental associations between numbers and absolutes, and my choice to say a number, rather than a vague word that could be interpreted as a probability anyway, makes sense. Working on www.theuncertainfuture.com, one of the things I appreciated the most were experts with the intelligence to make probability estimates, which can be recorded, checked, and updated with evidence, rather than vague statements like “pretty likely”, which have to be converted into probability estimates for Bayesian updating anyway. Futurists, stick your neck out! Use probability estimates rather than facile absolutes or vague phrases that mean so little that you are essentially hedging yourself into meaninglessness anyway.

Total agreement from me, needless to say.

I agree that a write-up of SIAI's argument for the Scary Idea, in the manner you describe, would be quite interesting to see.

However, I strongly suspect that when the argument is laid out formally, what we'll find is that

-- given our current knowledge about the pdf's of the premises in the argument, the pdf on the conclusion is verrrrrrry broad, i.e. we can hardly conclude anything with much of any confidence ...

So, I think that the formalization will lead to the conclusion that

-- "we can NOT confidently say, now, that: Building advanced AGI without a provably Friendly design will almost certainly lead to bad consequences for humanity"

-- "we can also NOT confidently say, now, that: Building advanced AGI without a provably Friendly design will almost certainly NOT lead to bad consequences for humanity"

I.e., I strongly suspect the formalization

-- will NOT support the Scary Idea

-- will also not support complacency about AGI safety and AGI existential risk

I think the conclusion of the formalization exercise, if it's conducted, will basically be to reaffirm common sense, rather than to bolster extreme views like the Scary Idea....

-- Ben Goertzel

So, I think that the formalization will lead to the conclusion that "we can NOT confidently say, now, that: Building advanced AGI without a provably Friendly design will almost certainly lead to bad consequences for humanity" "we can also NOT confidently say, now, that: Building advanced AGI without a provably Friendly design will almost certainly NOT lead to bad consequences for humanity"

I agree with both those statements, but think the more relevant question would be:

"conditional on it turning out, to the enormous surprise of most everyone in AI, that this AGI design is actually very close to producing an 'artificial toddler', what is the sign of the expected effect on the probability of an OK outcome for the world, long-term and taking into account both benefits and risks?" .

I agree.

I doubt you would remember this, but we talked about this at the Meet and Greet at the Singularity Summit a few months ago (in addition to CBGBs and Punk Rock and Skaters).

James Hughes mentioned you as well at a conference in NY where we discussed this very issue.

One thing that you mentioned at the Summit (well, in conversation) was that the Scary Idea tends to cause some paranoia among people who might otherwise be contributing more to the development of AI (of course, you also seemed pretty hostile to brain emulation), since it slows funding that could otherwise be going to AI.

For those of you who are interested, some of us folks from the SoCal LW meetups have started working on a project that seems related to this topic.

We're working on building a fault tree analysis of existential risks, with a particular focus on producing a detailed analysis of uFAI. I have no idea if our work will at all resemble the decision procedure SIAI used to prioritize their uFAI research, but it should at least form a framework for the broader community to discuss the issue. Qualitatively, you could use the work to discuss the possible failure modes that would lead to a uFAI scenario; quantitatively, you could use the framework and your own supplied probabilities (or aggregated probabilities from the community, domain experts, etc.) to crunch the numbers and/or compare uFAI to other posited existential risks.
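A minimal sketch of how such a tree might be evaluated (the structure, event names and probabilities below are hypothetical placeholders, not our actual model; basic events are treated as independent for simplicity):

    from functools import reduce

    def and_gate(*ps):
        # All events must occur: multiply probabilities (independence assumed).
        return reduce(lambda a, b: a * b, ps, 1.0)

    def or_gate(*ps):
        # At least one event occurs: one minus the product of the complements.
        return 1.0 - reduce(lambda a, b: a * (1.0 - b), ps, 1.0)

    # Basic events -- supply your own subjective probabilities here.
    p_agi_this_century   = 0.30  # AGI is built before 2100
    p_no_safety_theory   = 0.80  # no adequate Friendliness/safety theory in time
    p_hard_takeoff       = 0.30  # rapid, uncontrollable capability gain
    p_slow_but_unchecked = 0.20  # slow takeoff, but safety still neglected

    p_ufai = and_gate(p_agi_this_century,
                      p_no_safety_theory,
                      or_gate(p_hard_takeoff, p_slow_but_unchecked))
    print(f"P(uFAI scenario) = {p_ufai:.3f}")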

At the moment, I'd like to find out generally what anyone else thinks of this project. If you have suggestions, resources or pointers to similar/overlapping work you want to share, that would be great, too.

This project sounds really interesting and useful.

It sounds a lot like a project that I tried and failed to get started, or at least like part of that project. Though my project is so vague and broad that pretty much anything involving graphs/trees related to x-risks would seem "kinda like part of the project I was working on".

Here's a link to another comment about that project

I would like to hear more about your project.

The claim that AIs will foom, basically, reduces to the claim that the difficulty of making AGI is front-loaded: that there's a hump to get over, that we aren't over it yet, and that once it's passed things will get much easier. From an outside view, this makes sense; we don't yet have a working prototype of general intelligence, and the history of invention in general indicates that the first prototype is a major landmark after which the pace of development speeds up dramatically.

But this is a case where the inside and outside views disagree. We all know that AGI is hard, but the people actually working on it get to see the challenges up close. And from that perspective, it's hard to accept that it will suddenly become much easier once we have a prototype - the challenges seem so daunting, the possible breakthroughs are hard to visualize, and, on some level, if AI suddenly became easy it would trivialize the challenges that researchers are facing now. So the AGI researchers imagine an AI-Manhattan Project, with resources to match the challenges as they see them, rather than an AI-Kitty Hawk, with a few guys in a basement who are lucky enough to stumble on the final necessary insight.

Since a Manhattan Project-style AI would have lots of resources to spend on ensuring safety, the safety issues don't seem like a big deal. But if the first AGI were made by some guys in a basement instead, they wouldn't have those resources; and from that perspective, pushing hard for safety measures is important.

Except in this case if 'prototype' means genius-human-level AI, then it's reasonable to assume that even if the further challenges remain daunting, it will be economical to put a lot more effort into them, because researchers will be cheap.

If airplanes were as much better at designing airplanes as they are at flying, Kitty Hawk would have been different.

The claim that AIs will foom, basically, reduces to the claim that the difficulty of making AGI is front-loaded

Or that the effective effort put into AI research (e.g. by AIs) is sufficiently back-loaded.

The claim that AIs will foom, basically, reduces to the claim that the difficulty of making AGI is front-loaded

Yes.

the history of invention in general indicates that the first prototype is a major landmark after which the pace of development speeds up dramatically.

This is not actually true. The history of invention in general indicates that the first prototype accomplishes little, and a great deal of subsequent work needs to be done - even in the case of inventions like machine tools and computers that are used for creating subsequent generations of themselves.

Yes, this is right. Prototypes often precede widespread deployment and impact of a technology by decades until various supporting technologies and incremental improvements make them worth their costs.

I have a lot on my plate right now, but I'll try to write up my own motivating Fermi calculations if I get the chance to do so soon.

What I would like the SIAI to publish

Publish instead of doing what?

I would additionally like to see addressed:

  • What is the time estimate for FAI and AGI?
  • What is the probability that FAI is possible, times the probability that FAI can be achieved before AGI?
  • Other paths to safe superintelligence (IA, WBE, AI-in-a-box, etc.) may be more dangerous. What are the odds? Are the odds better or worse than the odds that the FAI research program is successful?

This might be an opportunity to use one of those Debate Tools, to see if one of them can be useful for mapping the disagreement.

I would like to have a short summary of where various people stand on the various issues.

The people:

  • Eliezer

  • Ben

  • Robin Hanson

  • Nick Bostrom

  • Ray Kurzweil ?

  • Other academic AGI types?

  • Other vocal people on the net like Tim Tyler ?

The issues:

  • How likely is a human-level AI to go FOOM?

  • How likely is an AGI developed without "friendliness theory" to have values incompatible with those of humans?

  • How easy is it to make an AGI (really frickin' hard, or really really really frickin' hard?)?

  • How likely is it that Ben Goerzel's "toddler AGI" would succeed, if he gets funding etc.?

  • How likely is it that Ben Goerzel's "toddler AGI" would be dangerous, if he succeeded?

  • How likely is it that some group will develop an AGI before 2050? (Or more generally, estimated timelines of AGI)

Add Nick Bostrom to the list.

Also, what exactly is Bostrom's take on AI? The OP says Bostrom disagrees with Eliezer. Could someone provide a link or reference to that? I have read most of Bostrom's papers some time ago and at the moment I can't recall any such disagreement.

I think Nick was near Anders, with an x-risk of 20% conditional on AI development by 2100, and near 50% for AI by 2100. That would make AI the most likely known x-risk, although unknown x-risks get a big chunk of his probability mass.
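(Multiplying those figures out gives roughly a 10% unconditional probability of AI-driven existential catastrophe by 2100, if I have his numbers right.)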

If we are constructing a survey of AI-singularity thinking here, I would like to know more about the opinions of Hugo de Garis. And what is Bill Joy thinking these days?

If we are trying to estimate probabilities and effect multipliers, I would like to consider the following question: Consider the projected trajectory of human technological progress without AGI assistance. For example: controlled fusion by 2140, human lifespan doubles by 2200, self-sustaining human presence on asteroids and/or Jovian satellites by 2260, etc. How much would that rate of progress be sped up if we had the assistance of AGI intelligence with 10x human speed and memory capacity? 100x? 1000x?

I conjecture that these speed-ups would be much less than people here seem to expect, and that the speed-up difference between 100x and 100,000x would be small. Intelligence may be much less important than many people think.

A recent update from Hugo here. He has retired - but says he has one more book on machine intelligence to go.

Thx. From that interview:

Interviewer: So what's your take on Ben Goertzel's Cosmism, as expressed in "A Cosmist Manifesto"?

de Garis: Ben and I have essentially the same vision, i.e. that it’s the destiny of humanity to serve as the stepping-stone towards the creation of artilects. Where we differ is on the political front. I don’t share his optimism that the rise of the artilects will be peaceful. I think it will be extremely violent — an artilect war, killing billions of people.

Hmmm. I'm afraid I don't share Goertzel's optimism either. But then I don't buy into that "destiny" stuff, either. We don't have to destroy ourselves and the planet in this way. It is definitely not impossible, but super-human AGI is also not inevitable.

I'd be curious to hear from EY, and the rest of the "anti-death" brigade here, what they think of de Garis's prognosis and whether and how they think an "artilect war" can be avoided.

I'd be curious to hear from EY, and the rest of the "anti-death" brigade here, what they think of de Garis's prognosis and whether and how they think an "artilect war" can be avoided.

I'm not sure that's where the burden of proof should fall. Has de Garis justified his claim? It sounds more like storytelling than inferential forecasting to me.

I really like your comments and wish you would make some top level posts and also contact me online. Could you please do so?

I haven't read his book, etc., but I suspect that "storytelling" might be a reasonable characterization. On the other hand, my "I'd be curious" was hardly an attempt to create a burden of proof.

I do personally believe that convincing mankind that an FAI singularity is desirable will be a difficult task, and that many sane individuals might consider a unilateral and secret decision to FOOM as a casus belli. What would you do as Israeli PM if you received intelligence that an Iranian AI project would likely go FOOM sometime within the next two months?

It's just silly. Luddites have never had much power - and aren't usually very warlike.

Instead, we will see expanded environmental and green movements, more anti-GM activism - demands to tax the techno-rich more - and so on.

De Garis was just doing much the same thing that SIAI is doing now - making a song-and-dance about THE END OF THE WORLD - in order to attract attention to himself - and so attract funding - so he could afford to get on with building his machines.

Consider the projected trajectory of human technological progress without AGI assistance. For example: controlled fusion by 2140, human lifespan doubles by 2200, self-sustaining human presence on asteroids and/or Jovian satellites by 2260, etc. How much would that rate of progress be sped up if we had the assistance of AGI intelligence with 10x human speed and memory capacity? 100x? 1000x?

I don't think you can say. Different things will accelerate at different rates. For example, a dog won't build a moon rocket in a million years - but if you make it 10 times smarter, it might do that pretty quickly.

Great post.

If you haven't seen SIAI's new overview you might find it relevant. I'm quite favorably impressed by it.

Thanks. I actually linked to that paper in the OP. As I wrote, that an organisation like the SIAI is necessary and should be supported is not being challenged. But what that paper accomplishes is merely giving a very basic introduction to someone who might never have thought about risks posed by AI. What I actually had in mind writing the OP is that the SIAI should address people like Ben Goertzel who are, irrespective of the currently available material, skeptical about the risks from working on AGI, and who are unsure of what the SIAI actually wants them to do or not do, and why. Further, I would like the SIAI to provide educated outsiders with a summary of how people who believe in the importance of the risks associated with AGI arrived at this conclusion, especially in comparison to other existential risks and challenges.

What I seek is a centralized code of practice that incorporates the basic assumptions and a way to roughly assess their likelihood, in comparison to other existential risks and challenges, by the use of probability. See for example this SIAI page. Bayes sits in there alone and doomed. Why is there no way for people to formally derive their own probability estimates with their own values? To put it bluntly, it looks like you have to pull any estimate out of your ass. The SIAI has to set itself apart from works of science fiction and actually provide some formal analysis of what we know, what conclusions can be drawn, and how they relate to other problems. The first question most people will ask is why to worry about AGI when there are challenges like climate change. There needs to be a risk-benefit analysis that shows why AGI is more important, and a way to reassess the results yourself by following a provided decision procedure.

Yes, I strongly endorse what you were saying in your top level posting and agree that the new overview is by no means sufficient, I was just remarking that the new overview is at least a step in the right direction. Didn't notice that you had linked it in the top level post.

If you want probabilities for these things to be backed up by mathematics, you're going to be disappointed, because there aren't any. The best probabilities - or rather, the only probabilities we have here - were produced using human intuition. You can break down the possibilities into small pieces, generate probabilities for the pieces, and get an overall probability that way, but at the base of the calculations you just have order-of-magnitude estimates. You can't provide formal, strongly defensible probabilities for the sub-events, because there just isn't any data - and there won't be any data until after the danger of AGI has destroyed us, or passed without destroying us. And that, I think, is the reason why SIAI doesn't provide any numbers: since they'd only be order-of-magnitude estimates, they'd give the people who had already made up their minds something to attack.

I'm not asking for defensible probabilities that would withstand academic peer review. I'm asking for decision procedures, including formulas with variables, that allow you to supply your own intuitive values and eventually calculate your own probabilities. I want the SIAI to provide a framework that gives a concise summary of the risks in question and a comparison with other existential risks. I want people to be able to analyze the results themselves and distinguish the risks posed by artificial general intelligence from other risks like global warming or grey goo.
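For illustration only, a toy version of such a procedure might look like the following; the factorization and the default numbers are hypothetical, not anything the SIAI has published.

    def p_ai_catastrophe(p_agi_this_century=0.3,
                         p_unfriendly_given_agi=0.5,
                         p_uncontainable_given_unfriendly=0.7):
        # Chain of conditional probabilities; replace the defaults with your own values.
        return (p_agi_this_century
                * p_unfriendly_given_agi
                * p_uncontainable_given_unfriendly)

    print(p_ai_catastrophe())                        # default assumptions
    print(p_ai_catastrophe(p_agi_this_century=0.1,
                           p_unfriendly_given_agi=0.2))  # a skeptic's numbers

Even something this crude would let a reader see exactly which assumptions drive the headline number.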

There aren't any numbers for a lot of other existential risks either. But one is still able to differentiate between those risks and the risk from unfriendly AI based on logical consequences of other established premises, like the Church–Turing–Deutsch principle. Should we be equally concerned with occultists trying to summon world-changing supernatural powers?

+1

Unfortunately, this is a common conversational pattern.

Q. You have given your estimate of the probability of FAI/cryonics/nanobots/FTL/antigravity. In support of this number, you have here listed probabilities for supporting components, with no working shown. These appear to include numbers not only for technologies we have no empirical knowledge of, but particular new scientific insights that have yet to occur. It looks very like you have pulled the numbers out of thin air. How did you derive these numbers?

A. Bayesian probability calculations.

Q. Could you please show me your working? At least a reasonable chunk of the Bayesian network you derived this from? C'mon, give me something to work with here.

A. (tumbleweeds)

Q. I remain somehow unconvinced.

If you pull a number out of thin air and run it through a formula, the result is still a number pulled out of thin air.

If you want people to believe something, you have to bother convincing them.

It's my professional opinion, based on extensive experience and a developed psychological model of human rationality, that such a paper wouldn't be useful. That said, I'd be happy to have you attempt it. I think that your attempt to do so would work perfectly well for your and our purposes, at least if you are able to do the analysis honestly and update based on criticism that you could get in the comments of a LW blog post.

Thanks, but my current level of education is completely insufficient to accomplish such a feat to an extent that would be adequate. Maybe in a few years, but right now that is unrealistic.

Interesting; this is why I included the Fermi paradox:

... so I must wonder: what big future things could go wrong where analogous smaller past things can’t go wrong? Many of you will say “unfriendly AI” but as Katja points out a powerful unfriendly AI that would make a visible mark on the universe can’t be part of a future filter; we’d see the paperclips out there.

SIA says AI is no big threat

Artificial Intelligence could explode in power and leave the direct control of humans in the next century or so. It may then move on to optimize the reachable universe to its goals. Some think this sequence of events likely.

If this occurred, it would constitute an instance of our star passing the entire Great Filter. If we should cause such an intelligence explosion then, we are the first civilization in roughly the past light cone to be in such a position. If anyone else had been in this position, our part of the universe would already be optimized, which it arguably doesn’t appear to be. This means that if there is a big (optimizing much of the reachable universe) AI explosion in our future, the entire strength of the Great Filter is in steps before us.

This means a big AI explosion is less likely after considering the strength of the Great Filter, and much less likely if one uses the Self Indication Assumption (SIA).

SIA implies that we are unlikely to give rise to an intelligence explosion for similar reasons, but probably much more strongly.

This summary seems fairly accurate:

In summary, if you begin with some uncertainty about whether we precede an AI explosion, then updating on the observed large total filter and accepting SIA should make you much less confident in that outcome.

The utility of an anthropic approach to this issue seems questionable, though. The great silence tells us something - something rather depressing, it is true - but it is far from our only relevant source of information on the topic. We have an impressive mountain of other information to consider and update on.

To give but one example, we don't yet see any trace of independently-evolved micro-organisms on other planets. The less evidence for independent origins of life elsewhere there is, the more that suggests a substantial early filter - and the less need there is for a late one.

This is true - but because it does not suggest THE END OF THE WORLD, it is not so newsworthy. Selective reporting favours apocalyptic elements. Seeing only the evidence supporting one side of such stories seems likely to lead people to adopt a distorted world view, with inaccurate estimates of the risks.

I added a footnote to the post:

  • Potential negative consequences [3] of slowing down research on artificial intelligence (a risks and benefits analysis).

(3) Could being overcautious itself be an existential risk that might significantly outweigh the risk(s) posed by the subject of caution? Suppose that most civilizations err on the side of caution. This might cause them either to develop much more slowly, so that the chance of a fatal natural disaster occurring before sufficient technology is developed to survive it approaches 100%, or to stop developing altogether, being unable to prove something 100% safe before trying it and thus never taking the steps necessary to become less vulnerable to naturally existing existential risks. Further reading: Why safety is not safe

I was thinking about how the existential risks affect each other-- for example, a real world war might either destroy so much that high tech risks become less likely for a while, or lead to research which results in high tech disaster.

And we may get home build-a-virus kits before AI is developed, even if we aren't cautious about AI.

Do you have problems only with the conciseness, mathiness and reference-abundance of current SIAI explanatory materials or do you think that there are a lot of points and arguments not yet made at all? I ask this because except for the Fermi paradox every point you listed was addressed multiple times in the FOOM debate and in the sequences.

Also, what is the importance of the Fermi paradox in AI?

To be more precise: you can't tell concerned AI researchers to read through hundreds of posts of marginal importance. You have to have some brochure that allows experts and educated laymen to read up on a summary of the big picture, one that includes precise and compelling methodologies they can follow to come up with their own estimates of the likelihood of existential risks posed by superhuman artificial general intelligence. If the decision procedure gives them a different probability due to a differing prior and values, then you can tell them to read further material so they can update their prior probability and values accordingly.

I'm content with your answer, then. I would personally welcome an overhaul to the presentation of AI material too. Still I think that Eliezer's FAI views are a lot more structured, comprehensive and accessible than the impression you give in your relevant posts.

The importance of the Fermi paradox is that it is the only data we can analyze that comes close to an empirical criticism of the Paperclip maximizer and of general risks from superhuman AIs with non-human values, without working directly on AGI to test those hypotheses ourselves. If you accept the premise that life is not unique and special, then one other technological civilisation in the observable universe should be sufficient to leave observable (now or soon) traces of technological tinkering. Given the absence of any signs of intelligence out there, especially paperclippers burning the cosmic commons, we can conclude that unfriendly AI might not be the most dangerous existential risk that we should look for.

...every point you listed was addressed multiple times in the FOOM debate and in the sequences.

I believe there probably is an answer, but it is buried under hundreds of posts about marginal issues. All those writings on rationality - there is nothing I disagree with. Many people know about all this even outside of the LW community. But what is it that they don't know that EY and the SIAI know? What I was trying to say is that if I have come across it, then it was not convincing enough to take it as seriously as some people here obviously do.

It looks like I'm not alone. Goertzel, Hanson, Egan and lots of other people don't see it either. So what are we missing? What is it that we haven't read or understood?

Here is a very good comment by Ben Goertzel that pinpoints it:

This is what discussions with SIAI people on the Scary Idea almost always come down to!

The prototypical dialogue goes like this.

SIAI Guy: If you make a human-level AGI using OpenCog, without a provably Friendly design, it will almost surely kill us all.

Ben: Why?

SIAI Guy: The argument is really complex, but if you read Less Wrong you should understand it.

Ben: I read the Less Wrong blog posts. Isn't there somewhere that the argument is presented formally and systematically?

SIAI Guy: No. It's really complex, and nobody in-the-know had time to really spell it out like that.

No. It's really complex, and nobody in-the-know had time to really spell it out like that.

Actually, you can spell out the argument very briefly. Most people, however, will immediately reject one or more of the premises due to cognitive biases that are hard to overcome.

A brief summary:

  • Any AI that's at least as smart as a human and is capable of self-improving will improve itself if that will help its goals

  • The preceding statement applies recursively: the newly-improved AI, if it can improve itself, and it expects that such improvement will help its goals, will continue to do so.

  • At minimum, this means any AI as smart as a human, can be expected to become MUCH smarter than human beings -- probably smarter than all of the smartest minds the entire human race has ever produced, combined, without even breaking a sweat.

INTERLUDE: This point, by the way, is where people's intuition usually begins rebelling, either due to our brains' excessive confidence in themselves, or because we've seen too many stories in which some indefinable "human" characteristic is still somehow superior to the cold, unfeeling, uncreative Machine... i.e., we don't understand just how our intuition and creativity are actually cheap hacks to work around our relatively low processing power -- dumb brute force is already "smarter" than human beings in any narrow domain (see Deep Blue, evolutionary algorithms for antenna design, Emily Howell, etc.), and a human-level AGI can reasonably be assumed capable of programming up narrow-domain brute forcers for any given narrow domain.

And it doesn't even have to be that narrow or brute: it could build specialized Eurisko-like solvers, and manage them at least as intelligently as Lenat did to win the Traveller tournaments.

In short, human beings have a vastly inflated opinion of themselves, relative to AI. An AI only has to be as smart as a good human programmer (while running at a higher clock speed than a human) and have access to lots of raw computing resources, in order to be capable of out-thinking the best human beings.

And that's only one possible way to get to ridiculously superhuman intelligence levels... and it doesn't require superhuman insights for an AI to achieve, just human-level intelligence and lots of processing power.

The people who reject the FAI argument are the people who, for whatever reason, can't get themselves to believe that a machine can go from being as smart as a human, to massively smarter in a short amount of time, or who can't accept the logical consequences of combining that idea with a few additional premises, like:

  • It's hard to predict the behavior of something smarter than you

  • Actually, it's hard to predict the behavior of something different than you: human beings do very badly at guessing what other people are thinking, intending, or are capable of doing, despite the fact that we're incredibly similar to each other.

  • AIs, however, will be much smarter than humans, and therefore very "different", even if they are otherwise exact replicas of humans (e.g. "ems").

  • Greater intelligence can be translated into greater power to manipulate the physical world, through a variety of possible means. Manipulating humans to do your bidding, coming up with new technologies, or just being more efficient at resource exploitation... or something we haven't thought of. (Note that pointing out weaknesses in individual pathways here doesn't kill the argument: there is more than one pathway, so you'd need a general reason why more intelligence doesn't ever equal more power. Humans seem like a counterexample to any such general reason, though.)

  • You can't control what you can't predict, and what you can't control is potentially dangerous. If there's something you can't control, and it's vastly more powerful than you, you'd better make sure it gives a damn about you. Ants get stepped on, because most of us don't care very much about ants.

Note, by the way, that this means that indifference alone is deadly. An AI doesn't have to want to kill us, it just has to be too busy thinking about something else to notice when it tramples us underfoot.

This is another inferential step that is dreadfully counterintuitive: it seems to our brains that of course an AI would notice, of course it would care... what's more important than human beings, after all?

But that happens only because our brains are projecting themselves onto the AI -- seeing the AI thought process as though it were a human. Yet, the AI only cares about what it's programmed to care about, explicitly or implicitly. Humans, OTOH, care about a ton of individual different things (the LW "a thousand shards of desire" concept), which we like to think can be summarized in a few grand principles.

But being able to summarize the principles is not the same thing as making the individual cares ("shards") be derivable from the general principle. That would be like saying that you could take Aristotle's list of what great drama should be, and then throw it into a computer and have the computer write a bunch of plays that people would like!

To put it another way, the sort of principles we like to use to summarize our thousand shards are just placeholders and organizers for our mental categories -- they are not the actual things we care about... and unless we put those actual things in to an AI, we will end up with an alien superbeing that may inadvertently wipe out things we care about, while it's busy trying to do whatever else we told it to do... as indifferently as we step on bugs when we're busy with something more important to us.

So, to summarize: the arguments are not that complex. What's complex is getting people past the part where their intuition reflexively rejects both the premises and the conclusions, and tells their logical brains to make up reasons to justify the rejection, post hoc, or to look for details to poke holes in, so that they can avoid looking at the overall thrust of the argument.

While my summation here of the anti-Foom position is somewhat unkindly phrased, I have to assume that it is the truth, because none of the anti-Foomers ever seem to actually address any of the pro-Foomer arguments or premises. AFAICT (and I am not associated with SIAI in any way, btw, I just wandered in here off the internet, and was around for the earliest Foom debates on OvercomingBias.com), the anti-Foom arguments always seem to consist of finding ways to never really look too closely at the pro-Foom arguments at all, and instead making up alternative arguments that can be dismissed or made fun of, or arguing that things shouldn't be that way, and therefore the premises should be changed.

That was a pretty big convincer for me that the pro-Foom argument was worth looking more into, as the anti-Foom arguments seem to generally boil down to "la la la I can't hear you".

So, are you suggesting that Robin Hanson (who is on record as not buying the Scary Idea) -- the current owner of the Overcoming Bias blog, and Eli's former collaborator on that blog -- fails to buy the Scary Idea "due to cognitive biases that are hard to overcome." I find that a bit ironic.

Like Robin and Eli and perhaps yourself, I've read the heuristics and biases literature also. I'm not so naive as to make judgments about huge issues, that I think about for years of my life, based strongly on well-known cognitive biases.

It seems more plausible to me to assert that many folks who believe the Scary Idea are having their judgment warped by plain old EMOTIONAL bias -- i.e. stuff like "fear of the unknown", and "the satisfying feeling of being part of a self-congratulatory in-crowd that thinks it understands the world better than everyone else", and the well known "addictive chemical high of righteous indignation", etc.

Regarding your final paragraph: Is your take on the debate between Robin and Eli about "Foom" that all Robin was saying boils down to "la la la I can't hear you" ? If so I would suggest that maybe YOU are the one with the (metaphorical) hearing problem ;p ....

I think there's a strong argument that: "The truth value of "Once an AGI is at the level of a smart human computer scientist, hard takeoff is likely" is significantly above zero." No assertion stronger than that seems to me to be convincingly supported by any of the arguments made on Less Wrong or Overcoming Bias or any of Eli's prior writings.

Personally, I actually do strongly suspect that once an AGI reaches that level, a hard takeoff is extremely likely unless the AGI has been specifically inculcated with goal content working against this. But I don't claim to have a really compelling argument for this. I think we need a way better theory of AGI before we can frame such arguments compellingly. And I think that theory is going to emerge after we've experimented with some AGI systems that are fairly advanced, yet well below the "smart computer scientist" level.

Regarding your final paragraph: Is your take on the debate between Robin and Eli about "Foom" that all Robin was saying boils down to "la la la I can't hear you" ?

Good summary. Although I would have gone with "la la la la If you're right then most expertise is irrelevant. Must protect assumptions of free competition. Respect my authority!"

What I found most persuasive about that debate was Robin's arguments - and their complete lack of merit. The absence of evidence is evidence of absence when there is a motivated competent debater with an incentive to provide good arguments.