Summary: I do not understand why MIRI hasn’t produced a non-technical document (pamphlet/blog post/video) to persuade people that UFAI is a serious concern. Creating and distributing this document should be MIRI’s top priority.

If you want to make sure the first AGI is FAI, one way to do so is (1) to be the first to create an AI and ensure it is FAI. Another is (2) to persuade people, in large numbers, that UFAI is a legitimate concern. Ideally this would become a widely shared concern, so that nobody falls into the trap of Eliezer circa 1999: “I’m going to build an AI and see how it works”.

1) is tough for an organisation of MIRI’s size. 2) is a realistic goal. It benefits from: 

Funding: MIRI’s funding almost certainly goes up if more people are concerned with AI x-risk. Ditto FHI.
Scalability: If MIRI has a new math finding, that's one new theorem. If MIRI creates a convincing demonstration that we have to worry about AI, spreading this message to a million people is plausible.
Partial goal completion: making a math breakthrough that reduces the time to AI might be counter-productive. Persuading an additional person of the dangers of UFAI raises the sanity waterline.
Task difficulty: creating an AI is hard. Persuading people that “UFAI is a possible extinction risk. Take it seriously” is nothing like as difficult. (I was persuaded of this in about 20 minutes of conversation.)

One possible response is “it’s not possible to persuade people without math backgrounds, training in rationality, engineering degrees, etc”. To which I reply: what’s the data supporting that hypothesis? How much effort has MIRI expended in trying to explain to intelligent non-LW readers what they’re doing and why they’re doing it? And what were the results?

Another possible response is “We have done this, and it's available on our website. Read the Five Theses”. To which I reply: Is this in the ideal form to persuade a McKinsey consultant who’s never read Less Wrong? If an entrepreneur with net worth $20m but no math background wants to donate to the most efficient charity he finds, would he be convinced? What efforts has MIRI made to test the hypothesis that the Five Theses, or Evidence and Import, or any other document, has been tailored to optimise the chance of convincing others?
(Further – if MIRI _does_ think this is as persuasive as it can possibly be, why doesn't it shift focus to get the Five Theses read by as many people as possible?)

Here’s one way to go about accomplishing this. Write up an explanation of the concerns MIRI has and how it is trying to allay them, and do so in clear English. (The Five Theses are available in Up-Goer Five form. Writing them in language readable by the average college graduate should be a cinch compared to that.) Send it out to a few of the target market and find the points that could be expanded, clarified, or made more convincing. Maybe provide two versions and see which one gets the most positive response. Continue this process until the document has been through a series of iterations and shows no signs of improvement. Then shift focus to getting that link read by as many people as possible. Ask all of MIRI’s donors, all LW readers, HPMOR subscribers, friends and family etc, to forward that one document to their friends.
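
The "provide two versions and see which one gets the most positive response" step can be made a little more concrete. What follows is only a minimal sketch, assuming each reader's reaction can be recorded as positive or not; it uses a standard two-proportion z-test, which is my choice of method and not anything MIRI is known to use. The function name and the reader counts are hypothetical and for illustration only.

```python
import math


def two_proportion_z_test(pos_a, n_a, pos_b, n_b):
    """Compare the positive-response rates of two document versions.

    pos_a, pos_b: number of readers who responded positively to each version.
    n_a, n_b: number of readers who were sent each version.
    Returns (z, p_value), where p_value is two-sided.
    """
    rate_a = pos_a / n_a
    rate_b = pos_b / n_b
    # Pooled rate under the null hypothesis that both versions perform equally.
    pooled = (pos_a + pos_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # Every reader responded the same way; there is no detectable difference.
        return 0.0, 1.0
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical numbers: 100 readers per version.
    z, p = two_proportion_z_test(pos_a=30, n_a=100, pos_b=15, n_b=100)
    print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With small pilot groups the test has little power, so an inconclusive result is weak evidence either way; the sketch is meant to show the shape of the iteration loop, not to replace reading the qualitative feedback.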

  • Pamphlets work for wells in Africa. They don't work for MIRI's mission. The inferential distance is too great, the ideas are too Far, the impact is too far away.
  • Eliezer spent SIAI's early years appealing directly to people about AI. Some good people found him, but the people were being filtered for "interest in future technology" rather than "able to think," and thus when Eliezer would make basic arguments about e.g. the orthogonality thesis or basic AI drives, the responses he would get were basically random (except for the few good people). So Eliezer wrote The Sequences and HPMoR and now the filter is "able to think" or at least "interest in improving one's thinking," and these people, in our experience, are much more likely to do useful things when we present the case for EA, for x-risk reduction, for FAI research, etc.
  • Still, we keep trying direct mission appeals, to some extent. I've given my standard talk, currently titled "Effective Altruism and Machine Intelligence," at Quixey, Facebook, and Heroku. This talk explains effective altruism, astronomical stakes, the x-risk landscape, and the challenge of FAI, all in 25 min
...

Pamphlets work for wells in Africa. They don't work for MIRI's mission. The inferential distance is too great, the ideas are too Far, the impact is too far away.

Didn't you get convinced about AI risk by reading a short paragraph of I. J. Good?

Certainly there exist people who will be pushed to useful action by a pamphlet. They're fairly common for wells in Africa, and rare for risks from self-improving AI. To get 5 "hits" with well pamphlets, you've got to distribute maybe 1000 pamphlets. To get 5 hits with self-improving AI pamphlets, you've got to distribute maybe 100,000 pamphlets. Obviously you should be able to target the pamphlets better than that, but then distribution and planning costs are a lot higher, and the cost per New Useful Person looks higher to me on that plan than distributing HPMoR to leading universities and tech companies, which is a plan for which we already have good evidence of effectiveness, and which we are therefore doing.
Yes. But MIRI's ideas have now influenced the mainstream. Since 2011 we have had Norvig & Russell, Barrat, etc, providing some proof by authority and social proof. The next step is not to popularize the ideas to a mass audience, but to continue targeting the relevant elite audience, e.g. Gary Marcus (not that he really gets it). HPMOR has had some success at reaching the younger and more flexible of these, but bringing some more senior people on board will allow the junior researchers to work on MIRI-style work without ruining their careers -- as-is, some are doing it as a part-time hobby during a PhD on another topic, which is a precarious situation. MIRI is actually having some success at this. It seems that this audience can now be targeted with a decent chance of success and high value for that success. Here I am talking about the academic community, but the forward-thinking tech-millionaire community is a harder nut to crack and probably needs a separate plan.
I would hesitate to use failure during "SIAI's early years" to justify the ease or difficulty of the task. First, the organization seems far more capable now than it was at the time. Second, the landscape has shifted dramatically even in the last few years. Limited AI is continuing to expand and with it discussion of the potential impacts (most of it ill-informed, but still). While I share your skepticism about pamphlets as such, I do tend to think that MIRI has a greater chance of shifting the odds away from UFAI with persuasion/education rather than trying to build an FAI or doing mathematical research.
I agree and would also add that "Eliezer failed in 2001 to convince many people" does not imply "Eliezer in 2013 is incapable of persuading people". From his writings, I understand he has changed his views considerably in the last dozen years.
Who says the speculation of potential impacts is damagingly ill-informed? Just because people think of "AI" and then jump to "robots" and then "robots who are used to replace workers, destroy all our jobs, and then rise up in revolution as a robotic resurrection of Communism" doesn't mean they're not correctly reasoning that the creation of AI is dangerous.
The next time you give your talk, record it, and put it on YouTube.
Thanks, Luke. This is an informative reply, and it's great to hear you have a standard talk! Is it publicly available, and where can I see it if so? Maybe MIRI should ask FOAFs to publicise it? It's also great to hear that MIRI has tried one pamphlet. I would agree that "This one pamphlet we tried didn't work" points us in the direction that "No pamphlet MIRI can produce will accomplish much", but that proposition is far from certain. I'd still be interested in the general case of "Can MIRI reduce the chance of UFAI x-risk through pamphlets?" You may be right. But, it is possible to convince intelligent non-rationalists to take UFAI x-risk seriously in less than an hour (I've tested this), and anything that can do that process in a manner that scales well would have a huge impact. What's the Value of Information on trying to do that? You mention the Sequences and HPMOR (which I've sent to a number of people with the instruction "set aside what you're doing and read this"). I definitely agree that they filter nicely for "able to think". But they also require a huge time commitment on the part of the reader, whereas a pamphlet or blog post would not.
For what value of "taking seriously" is that statement true?
"Hear ridiculous-sounding proposition, mark it as ridiculous, engage explanation, begin to accept arguments, begin to worry about this, agree to look at further reading"
It could be useful to attach a, "If you didn't like/agree with the contents of this pamphlet, please tell us why at," note to any given pamphlet. Personally I'd find it easier to just look at the contents of the pamphlet with the understanding that 99% of people will ignore it and see if a second draft has the same flaws.
Thanks, Luke. This is an informative reply, and it's great to hear you have a standard talk! Where can I find it? (or if it's not publicly available, why isn't it?) Do you have more details on the 4 page pamphlet? I would be interested in seeing it, if it still exists. Obviously nobody would get from the single premise "This one pamphlet we tried didn't work" to the conclusion "pamphlets don't work", so I'd still be interested in the general case of "Can MIRI reduce the chance of UFAI x-risk through pamphlets?" I'd also love to know your reasoning behind this statement: I am willing to believe the second sentence, but given that it is possible to convince intelligent non-rationalists to take UFAI x-risk seriously (I've tested this), I would like to consider ways in which we can spread this.
There has got to be enough writing by now that an effective chain mail can be written. ETA: The chain mail suggestion isn't knocked down in luke's comment. If it's not relevant or worthy of acknowledging, please explain why. ETA2: As annoying as some chain mail might be, it does work because it does get around. It can be a very effective method of spreading an idea.

Facing the Intelligence Explosion is a nontechnical introduction.

Great. A five-minute video would be better. Maybe ask the SciShow people if they want to make one! They're really good at compressing complex topics into such a format and if they understood the issue, they're rational enough to want to help.
I agree and I like it. I think it could be further optimised for "convince intelligent non-LWers who have been sent one link from their rationalist friends and will read only that one link", but it could definitely serve as a great starting point.

I do not understand why MIRI hasn’t produced a non-technical document (pamphlet/blog post/video) to persuade people that UFAI is a serious concern.

It would be far more useful if MIRI provided technical argumentation for its Scary Idea. There are a lot of AGI researchers, myself included, who remain entirely unconvinced. AGI researchers - the people who would actually create a UFAI - are paying attention and not sufficiently convinced to change their behavior. Shouldn't that be of more concern than a non-technical audience?

A decade of effort on EY's part has taken the idea of friendliness mainstream. It is now accepted as fact by most AGI researchers that intelligence and morality are orthogonal concepts, despite contrary intuition, and that even with the best of intentions a powerful, self-modifying AGI could be a dangerous thing. The degree of difference in belief is in the probability assigned to that "could".

Has the community responded? Yes. Quite a few mainstream AGI researchers have proposed architectures for friendly AGI, or realistic boxing/oracle setups, or a friendliness analysis of their own AGI design. To my knowledge MIRI has yet to engage with any of these proposals. Why?

I want a believable answer to that before a non-technical pamphlet or video, please.

Not entirely true - there's been a bit of a preliminary response in the form of surveying and briefly commenting on the different proposals.
Kaj, thank you. I'm glad that MIRI is taking up this issue. I'm disappointed that this was not more prominently advertised or featured on the website. I am, presumably, among the target audience of this report and didn't know it existed. Skimming it, I'm sorry to see that it doesn't discuss my favored approach to confinement of oracle AIs. But I will have to read it more closely before passing judgment. Is this document still being worked on? Where is the best place to post criticisms or questions? I hope that MIRI also generates a technical report justifying the Scary Idea, which is of somewhat greater importance as it frames the entire issue.
This discussion page is the official preferred location for discussing the article, though e.g. this comment thread is really just as fine. It's no longer being worked on, but minor changes (esp. fixing obvious errors) are possible. What's your favored approach to Oracle AI confinement? We tried to cover most of the existing AI risk literature, but there's so much of it and some of it is in such unexpected places that we were bound to miss something.
It's based on the simple observation that the goal set of outcomes which accomplish the task and are acceptable to humans is strictly greater than the goal set which accomplish the task, are acceptable to humans, and secretly nefarious (the typical scenario for unboxing an oracle AI). The difference between the two is quite large, and an auditor should be able to determine from analysis of its reported state and measured resource usage, whether it is calculating the former or the latter. In case that doesn't make sense, here's a simple way to achieve confinement with > 50% success rate (thought experiment only): (1) put the UFAI in a room with metered electricity; (2) attach a nuclear device to the meter, near the room; (3) set it to detonate if it is not fed hashcash/bitcoin proofs-of-work equal to the amount of energy consumed (with assumptions about achievable hashes/kJ, a sliding window and error bars, etc.). The AI is then free to do whatever it wants... provided that what it chooses to do is compute hashes and nothing else, or self-terminate. The principle is the same, except instead of cryptographic proof-of-work we let the AI do useful work and use complexity analysis and probabilistic verification to estimate work done and to make sure that the AI is not cooking the books in its response, and instead of a destructive explosion we cut power and perform forensic analysis. It becomes supremely difficult for an AI to deceive us with bounded computation, since doing so basically involves finding a nefarious response that looks innocent, and then constructing or maintaining a believable alternate-reality version of itself which if run would stumble upon this same result for human-acceptable reasons, while being much less efficient (to account for the difference in resource consumption) yet believably so, and to extract audit-able proofs from that alternate reality version without actually simulating it. Furthermore, from the start it would have to be sufficien
Sorry but that's a dismal plan. This is an AI we're talking about, not a pocket calculator, correct? And you think you can determine so precisely the amount of computation it will need to carry out, that monitoring its electricity consumption is a sensible way to tell whether it's doing bad things on the side? If it's performing advanced cognitive tasks, it will be doing things like deciding to conduct searches of solution spaces that only it has identified, using heuristics of its own invention. How are you going to obtain your estimates of space and time complexity expected of such a search? I assume you don't want to just ask the AI itself! I suppose it might be instructive and fun to try to design a system architecture which works in this way... Any AGI has to pay some attention to the space and time demands of a planned computation. The existence of a physical bomb which goes off if it exceeds its computational budget, just means that the negative feedback associated with such a budgetary event is even stronger than usual. :-)
I addressed this point specifically. There is no ghost in the machine. Should it decide to perform such counter-measures, both that decision and execution of the counter-measures are themselves cognitive processes, their execution observable to the outside world. With a proper machine architecture, the auditors could have access to all sorts of information. They could have records of process execution and summaries of memory access patterns, which would have to match the audit log provided by the AI. The entire mental space of the AI would be available for inspection from outside; regions which show activity but don't show up in the audit logs would be a large red flag. We can take higher resolution logs of random moments, and check if that activity pattern matches some portion of the final high-level audit. There are probably a dozen other ways in which unfriendly activity could be detected, beyond what I've written here and what Nick Bostrom et al have explored in their AI boxing paper. These would be defenses in depth: the UFAI would have to overcome all defenses, including ones it may not know about on its first attempt in order to successfully deceive its human operators.

So here are some more problems I have:

  • UFAI isn't necessarily about deception. You also have to worry that the AI will perform its assigned task in a way inimical to human values, that jumps through constraints intended to prevent this, through sheer ingenuity... Suppose the AI is designed to do X, something that human beings want, but that humans also care about Y and Z. And suppose the AI isn't designed to intrinsically respect Y and Z. Instead there are constraints C that it knows about, the violation of which is also monitored by human beings, and these constraints are supposed to protect values Y and Z from violation. You have to worry that the AI will achieve X in a way which satisfies C but still violates Y and Z.

  • Auditing has the potential to slow down the AI - the AI may be paused regularly for forensic analysis and/or it may go slow in order to satisfy the safety constraints. Audited AI projects may be overtaken by others with a different methodology.

  • You want humans to "take us through the singularity". But we aren't through the singularity until superhuman intelligence exists. Is your plan, therefore, to suppress development of superhuman AI, until there

...
Oracle / tool AI is. The usual premise is that questions are asked to the superhuman AI, and responses only implemented if they are comprehensible, sane, and morally acceptable. Your example of satisfies C but still violates Y and Z would be picked up by the human oversight (or, the output is too complicated to be understood, and is shelved). Blindly following the AI's directives is a failure mode the oracle AI path is meant to avoid. Further, search processes do not happen across solutions which are seemingly ok but deviously setup an AI breakout or kill-all-humans scenario just by random chance - the probability of that is astronomically low. So really, the only likely ways in which the AI says to do X, but ends up violating unstated constraints Y and Z is if (a) the human overseers failed at their one and only job, or (b) deception. Yup, it does. This is a race, but the question is not “is this approach faster than straight-up UFAI?” but rather “is this approach faster than other pathways to friendly AI?” FAI is a strict subset of the UFAI problem: there is no approach to FAI which is faster than a straight sprint to UFAI, consequences be damned. My own informed opinion is that (UF)AGI is only 10-20 years away, max. Provably-friendly AI is not even a well defined problem, but by any definition it is strictly harder. The only estimates I've seen come out of MIRI for their approach puts FAI decades further out (I remember Luke saying 50-70 years). Such a date makes sense when compared with progress in verifiable computing in other fields. But 2nd place doesn't count for anything here. Oracle / tool AGI has the advantage of making safeguards a parallel development. The core AGI is not provably friendly, and can be developed at the same breakneck pace as one would expect of a hedge fund exploring this area. The security controls can be developed and put in place in parallel, without holding up work on the AGI itself. It does require choosing a particular architect
How do you decide whether some interaction of a complex neural net is friendly or unfriendly? It's very hard to tell what a neural net or complex algorithm is doing even if you have logs.
Don't use a neural net (or variants like deep belief networks). The field has advanced quite a bit since the 60's, and since the late 80's there have been machine learning and knowledge representation structures which are human and/or auditor comprehensible, such as probabilistic graphical models. This would have to be first class types of the virtual machine which implements the AGI if you are using auditing as a confinement mechanism. But that's not really a restriction as many AI techniques are already phrased in terms of these models (including Eliezer's own TDT, for example), and others have simple adaptations.
I think they need to cut into Strong and Weak versions of the Scary Idea. Weak Version: AIs behave "as intuitively expected", like assignable robots or animals, but their reward/value signals are unaligned with ours, so they eventually "rebel" or "wirehead" as we might imagine. Since AIs will be cheaper to produce/reproduce than humans (if not, why are they economically useful?), they will have large population numbers (or a large, resourceful singleton instance), and become a threat to people. Friendliness becomes a matter of designing systems for containing potentially rogue AIs and designing goal systems to prevent these problems from happening in the first place. Strong Version: Any AI except an approvedly Friendly AI will instantly go all Singularity and paper-clip the universe within too short a period of time for us to stop it; any attempts to contain the AI will fail as the AI proceeds to take control over human minds through a mere text channel and build an army of zombies. Stronger Version: This may already have happened, since all those people you see on the street seem like such stupid, brainwashed sheeple already ;-).
In other words, all AGI researchers are already well aware of this problem and take precautions according to their best understanding?
s/all/most/ - you will never get them all. But yes, that's an accurate statement. Friendliness is taught in artificial intelligence classes at university, and gets mention in most recent AI books I've seen. Pull up the AGI conference proceedings and search for "friendly" or "safe" - you'll find a couple of invited talks and presented papers each year. Many project roadmaps include significant human oversight of the developing AGI, and/or boxing mechanisms, for the purpose of ensuring friendliness proactive response.

Overexposure of an idea can be harmful as well. Look at how Kurzweil promoted his idea of the singularity. While many of the ideas (such as intelligence explosion) are solid, people don't take Kurzweil seriously anymore, to a large extent.

It would be useful to debate why Kurzweil isn't taken seriously anymore. Is it because of the fraction of wrong predictions? Or is it simply because of the way he's presented them? Answering these questions would be useful to avoid ending up like Kurzweil has.

While not doubting the accuracy of the assertion, why precisely do you believe Kurzweil isn't taken seriously anymore, and in what specific ways is this a bad thing for him/his goals/the effect it has on society?
I wasn't aware Kurzweil was ever taken seriously in the first place.

At least he's been cited: Google Scholar reports 1600+ citations for The Singularity is Near as well as for The Age of Spiritual Machines, his earlier book on the same theme.

Also, if we're talking about him in general, and not just his Singularity-related writings, Wikipedia reports that:

Kurzweil received the 1999 National Medal of Technology and Innovation, America's highest honor in technology, from President Clinton in a White House ceremony. He was the recipient of the $500,000 Lemelson-MIT Prize for 2001, the world's largest for innovation. And in 2002 he was inducted into the National Inventors Hall of Fame, established by the U.S. Patent Office. He has received nineteen honorary doctorates, and honors from three U.S. presidents. Kurzweil has been described as a "restless genius" by The Wall Street Journal and "the ultimate thinking machine" by Forbes. PBS included Kurzweil as one of 16 "revolutionaries who made America" along with other inventors of the past two centuries. Inc. magazine ranked him #8 among the "most fascinating" entrepreneurs in the United States and called him "Edison's rightful heir".

I'd point out that much of the above is not (at least not entirely) related to his futurism - Kurzweil has done a lot of other things.
That was the point - he already had a lot of credibility from his earlier achievements, which might cause people to also take his futurist claims more seriously than if the same books had been written by random nobodies.
Wow - reading comprehension fail, retracted.
Director of Engineering at Google. I'm pretty sure that some very smart people are taking him seriously.
It's bad because as I understand it, his goals are to make people adjust their behavior and attitude for the singularity before it happens (something that is well aligned with what MIRI wants to do) and if he isn't taken seriously then people won't do this. Such things include taking seriously transhumanist concepts (life extension, uploading, etc.) and other concepts such as cryonics. I can't speak for Kurzweil but it seems that he thinks that if people took these ideas seriously right now, we would be headed for a much smoother and more pleasant ride into the future (as opposed to suddenly being awoken to a hard FOOM scenario rapidly eating up your house, your lunch, and then you). I agree with this perspective.

One possible response is “it’s not possible to persuade people without math backgrounds, training in rationality, engineering degrees, etc”. To which I reply: what’s the data supporting that hypothesis? How much effort has MIRI expended in trying to explain to intelligent non-LW readers what they’re doing and why they’re doing it? And what were the results?

Convincing people in Greenpeace that a UFAI presents a risk that they should care about has its own dangers. There is a risk that you associate caring about UFAI with Luddites.

If you get a broad public...

Not only that
Is "bad publicity" worse than "good publicity" here? If strong AI became a hot political topic, it would raise awareness considerably. The fiction surrounding strong AI should bias the population towards understanding it as a legitimate threat. Each political party in turn will have their own agenda, trying to attach whatever connotations they want to the issue, but if the public at large started really worrying about uFAI, that's kind of the goal here.
Politically, people who fear AI might go after companies like Google. I don't think that the public at large is the target audience. The important thing is that the people who could potentially build an AGI understand that they are not smart enough to contain the AGI. If you have a lot of people making bad arguments for why UFAI is a danger, smart MIT people might just say, hey, those people are wrong, I'm smart enough to program an AGI that does what I want. I mean, take a topic like genetic engineering. There are valid dangers involved in genetic engineering. On the other hand, the people who think that all genetically modified food is poison are wrong. As a result a lot of self-professed skeptics and atheists see it as their duty to defend genetic engineering.
Right, but what damage is really being done to GE? Does all the FUD stop the people who go into the science from understanding the dangers? If uFAI is popularized, academia will pretty much be forced to seriously address the issue. Ideally, this is something we'll only need to do once; after it's known and taken seriously, the people who work on AI will be under intense pressure to ensure they're avoiding the dangers here. Google probably already has an AI (and AI-risk) team internally that they've just had no reason to publicize. If uFAI becomes widely worried about, you can bet they'd make it known they were taking their own precautions.
Letting plants grow their own pesticides for killing off things that eat the plants sounds to me like a bad strategy if you want healthy food. It makes things much easier for the farmer, but to me it doesn't sound like a road that we should go down. I wouldn't want to buy such food in the supermarket, but I have no problem with buying genetically modified food that adds extra vitamins. Then there are various issues with introducing new species. Issues about monocultures. Bioweapons. The whole work is dangerous. Safety is really hard.
This is more or less the opposite of what we actually use genetic engineering of crops for. Production of pesticides isn't something that plants were incapable of until we started tinkering with their genes, it's something they've been doing for hundreds of millions of years. Plants in nature have to deal with tradeoffs between producing their own natural pesticides and using their biological resources for other things, such as more rapid growth, greater drought resistance, etc. In general, genetically engineered plants actually have less innate pest resistance, which farmers then compensate for by spraying pesticides onto them, because it allows them to trade off that natural pesticide production for faster growth.
ChristianKl may be thinking of Bt corn (maize) and, for instance, the Starlink corn recall. Bt corn certainly does express a pesticide, namely Bacillus thuringiensis toxin.
Somewhat tangentially: does it sound like a better or a worse strategy than not letting plants do this, and growing the plants in an environment where external pesticides are regularly applied to them? (This really is a question about GMOs, not some kind of oblique analogical question about AIs.)
"AIs" -> "experts being informed in their field of study" ETA: Was this not actually apparent?
As a matter of evolutionary biology plants have been doing this for many millions of years and are pretty good at making poisons.
Is there reason to believe someone in the field of genetic engineering would make such a mistake? Shouldn't someone in the field be more aware of that and other potential dangers, despite the GE FUD they've no doubt encountered outside of academia? It seems like the FUD should just be motivating them to understand the risks even more—if for no other reason than simply to correct people's misconceptions on the issue. Your reasoning for why the "bad" publicity would have severe (or any notable) repercussions isn't apparent. This just doesn't seem very realistic when you consider all the variables.
Because those people do engineer plants to produce pesticides? Bt Potato was the first which was approved by the FDA in 1995. The commercial incentives that exist encourage the development of such products. A customer in a store doesn't see whether a potato is engineered to have more vitamins. He doesn't see whether it's engineered to produce pesticides. He buys a potato. It's cheaper to grow potatoes that produce their own pesticides than it is to grow potatoes that don't. In the case of potatoes it might be harmless. We don't eat the green of the potatoes anyway, so why bother if the green has additional poison? But you can slip up. Biology is complicated. You could have changed something that also gets the poison to be produced in the edible parts. It's not a question of motivation. Politics is the mindkiller. If a topic gets political, people on all sides of the debate get stupid. According to Eliezer it takes strong math skills to see how an AGI can overtake their own utility function and is therefore dangerous. Eliezer made the point that it's very difficult to explain to people who are invested in their AGI design that it's dangerous, because that part needs complicated math. It's easy to say in the abstract that some AGI might become UFAI, but it's hard to do the assessment for any individual proposal.
Based on my (subjective and anecdotal, I'll admit) personal experiences, I think it would be bad. Look at climate change.
Is there something wrong with climate change in the world today? Yes, it's hotly debated by millions of people, a super-majority of them being entirely unqualified to even have an opinion, but is this a bad thing? Would less public awareness of the issue of climate change have been better? What differences would there be? Would organizations be investing in "green" and alternative energy if not for the publicity surrounding climate change? It's easy to look back after the fact and say, "The market handled it!" But the truth is that the publicity and the corresponding opinions of thousands of entrepreneurs is part of that market. Looking at the two markets: 1. MIRI's warning of uFAI is popularized. 2. MIRI's warning of uFAI continues in obscurity. The latter just seems a ton less likely to mitigate uFAI risks than the former.
The failure mode that I'm most concerned about is overreaction followed by a backlash of dismissal. If that happened, the end result would be far worse than obscurity.

Nastier issue: the harder argument of convincing people UFAI is an avoidable risk. If you can't convince people they've got a realistic chance (ie: one they would gamble on, given the possible benefits of FAI) of winning this issue, then it doesn't matter how informed they are.

See: Juergen Schmidhuber's interview on this very website, where he basically says, "We're damn near AI in my lab, and yes, it is a rational optimization process," followed by, "We see no way to prevent the paper-clipping of humanity whatsoever, so we stopped giving a damn and just focus on doing our research."

This is what CFAR is for


Dang, and here I was thinking they were trying to help me improve my life.

through ponies!

This post makes me wonder if the relevant information could be compressed into a series of self-contained videos along the lines of MinutePhysics. So far as I can tell most people find video more accessible. (I don't, but I'm an outlier, like most here)

I'm going to guess it's impossible, but I'm not sure if it's Shut Up and Do the Impossible impossible or Just Lose Hope Already impossible.

HPMOR could end with Harry destroying the world through an UFAI. The last chapters already pointed to Harry destroying the world.

Strategically that seems to be the best choice. HPMOR is more viral than some technical document. There is already effort invested in getting a lot of people to read HPMOR.

People bond with the characters. Ending the book with "now everyone is dead because an AGI went FOOM" lets people take that scenario seriously, and that's exactly the right time to tell them: "Hey, this scenario could also happen in our world, so let's do something to prevent it from happening."

I would consider it probably the worst possible ending for HPMoR. I assume that Eliezer is smart enough to avoid overt propaganda.
Scott Garrabrant
What do you mean "smart enough?" You think that that ending would do harm for FAI?
It would likely "do harm" to the story and consequently reduce its appeal and influence.
Even more people have read the Bible, the Quran, and the Vedas, so why not put out pamphlets in which Jesus, Muhammad and Krishna discuss AGI?
I would be interested in reading them.
Jesus: We excel at absorbing external influences and have no problems with setting up new cults (just look at Virgin Mary) -- so we'll just make a Holy Quadrinity! Once you go beyond monotheism there's no good reason to stop at three... Muhammad: Ah, another prophet of Allah! I said I was the last but maybe I was mistaken about that. But one prophet more, one prophet less -- all is in the hands of Allah. Krishna: Meh, Kali is more impressive anyways. Now where are my girls?
That would probably upset many existing Christians. Clearly Jesus' second coming is in AI form.
Robot Jesus! :-) And rapture is clearly just an upload.
No, it couldn't.
There are multiple claims in the book that Harry will destroy the world. It starts in the first chapter with "The world will end". Interestingly, that wasn't there at the time the chapter was first published, but was added retrospectively. Creating an AI in that world is just a matter of creating a magical item. Harry knows how to make them self-aware. Harry knows that magical creatures like trolls constantly self-modify through magic. Harry is into inventing new powerful spells. All the pieces for building an AGI that goes FOOM are there in the book.
I assign 2% probability on this scenario. What probability do you assign?
Given that the pieces were there the last time I read it, p=.99 for that claim. The more interesting claim is that an AGI actually goes FOOM. I say p=.65.
Yeah, that was the claim I meant. Would you be willing to bet on this? I'd be willing to bet 2 of my dollars against 1 of yours that no AGI will go FOOM in the remainder of the HPMoR story (for a maximum of 200 of my dollars vs 100 of yours).
Even in early 2012, I didn't think 2:1 was the odds for an AGI fooming in MoR... How would you like to bet 1 of your dollars against 3 of my dollars that an AGI will go FOOM? Up to a max of 120 of my dollars and 40 of yours; i.e. if an AGI goes foom, I pay you $120 and if it doesn't, you pay me $40. (Payment through Paypal.) Given your expressed odds, this should look like a good deal to you.
I said I assign 2% probability on an AGI going FOOM in the story. So how would this look like a good deal for me? The odds I offered to ChristianKl were meant to express a middle ground between the odds I expressed (2%) and the odds he expressed (65%) so that the bet would seem about equally profitable to both of us, given our stated probabilities.
Bah! Fine then, we won't bet. IMO, you should have offered more generous terms. If your true probability is 2%, then that's odds of 1:49, while his 65% would be about 1:0.54, if I'm cranking the formula right. So a 1:2 doesn't seem like a true split.
You are probably right about how it's not a true split -- I just did a stupid "add and divide by 2" on the percentages, but it doesn't really work like that. He would expect to lose once every 3 times, but given my percentages I expected to lose once every 50 times. (I'm not very mathy at all)
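For anyone who wants to check the arithmetic in this exchange, here is a rough Python sketch. The function names `odds_against` and `expected_gain` are made up for illustration, and the bet terms are assumed to be the 2-to-1 offer above (the 2% side risks $2 to win $1 that no AGI goes FOOM). It converts each stated probability into 1:x odds and then computes what each side expects to gain by their own estimate.

```python
def odds_against(p):
    """For an event with probability p, return x such that the odds are 1:x."""
    return (1 - p) / p


def expected_gain(p_win, win_amount, lose_amount):
    """Expected dollars gained per bet, judged by the bettor's own probability of winning."""
    return p_win * win_amount - (1 - p_win) * lose_amount


# Stated probabilities that an AGI goes FOOM in the story.
p_skeptic = 0.02
p_believer = 0.65

print(f"2% as odds:  1:{odds_against(p_skeptic):.2f}")
print(f"65% as odds: 1:{odds_against(p_believer):.2f}")

# Assumed terms: the skeptic risks $2 to win $1 that no FOOM happens,
# so the believer risks $1 to win $2 that it does.
skeptic_ev = expected_gain(1 - p_skeptic, win_amount=1, lose_amount=2)
believer_ev = expected_gain(p_believer, win_amount=2, lose_amount=1)
print(f"skeptic's expected gain by his own estimate:  ${skeptic_ev:.2f}")
print(f"believer's expected gain by his own estimate: ${believer_ev:.2f}")
```

By this measure the 2:1 terms come out close to even in expected dollars per bet (about $0.94 versus $0.95), even though 2:1 is nowhere near the midpoint of 1:49 and roughly 1:0.54. Whether that counts as a "true split" depends on which notion of fairness you pick.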
At the moment I unfortunately don't have enough cash to invest in betting projects. Additionally, I don't know Eliezer personally, and there are people on LessWrong who do and who might have access to nonpublic information. As a result it's not a good topic for betting money.
Fortunately, that's why we have PredictionBook! Looking through my compilation of predictions, I see we already have two relevant predictions: * Harry will create a superintelligent AI using magic or magical objects * and it won't be Friendly, killing many/all (I've added a new more general one as well.)
I added my prediction to that.
In fiction, deus (or diabolus) ex machina is considered an anti-pattern.
That strikes me as incredibly likely to backfire. Most obviously, a paper with more than half a million words is a little much as an introductory work, especially with things like the War of the Three Armies (because Death Note wasn't complicated enough!). Media where our heroes destroy a planet also tend to have issues with word of mouth when not a comedy or written by Tomino. More subtly, there are some serious criticisms of the idea of the Singularity and more generally of transhumanism, which rest on things that would be obviated in HPMoR by nature of the Harry Potter series starting as a fantasy series for young teens, and genre conventions of fantasy series, rather than by the strength of MIRI's arguments. Many of these criticisms are not terribly strong. They are still shouted as if strong AI were Rumpelstiltskin, unable to stand the sound of an oddly formed name, and HPMoR would have to be twisted very hard to counter them.
A lot of people think of strong AI like C3PO from Star Wars. Science fiction has the power of giving people mental models even if it isn't realistic. The magical environment of the Matrix movies shapes how people think about the simulation argument.
Very true. I'd recommend against using Star Wars as a setting for cautionary tales about the Singularity, as well. The Harry Potter setting is just particularly bad, because we've already seen and encountered methods for producing human-intelligence artificial constructs that think just like a human. If Rationalist!Harry ends up having the solar system wallpapered with smiley faces, it's a lot less believable that he did it because The Machine Doesn't Care when quite a number of other machines already have. You'll have to fight assumptions like metaphysical dualism or what limitations self-reinforcing processes might have, no matter what you do, because those mental models apply in fairly broad strokes, but it's a lot easier to do so when the setting isn't fighting you at the same time.
I don't think that you have to fight assumptions of metaphysical dualism. I think that the people who don't believe in UFAI as a risk on that basis are not the ones that are dangerous and might develop an AGI.
That's an appealing thought, but I'm not sure it's a true one. For one, if we're talking about appealing to general audiences, many folk won't be trying to develop an AGI, but still be relevant to our interests. Thinking AGI cannot invent because they lack souls, or that AGI will be friendly if annoying golden translation droids, may be inconsistent with writing evolutionary algorithms, but is not certainly inconsistent with having investment or political capital. At a deeper level, a lot of folk do hold such beliefs and simultaneously have inconsistent belief structures, which may still leave them dangerous. It is demonstrably possible to have incorrect beliefs about evolution yet run a PCR, or to think it's easy to preserve semantic significance but also be a computer programmer. It's tempting to dismiss people who hold irrational beliefs since rationality strongly correlates with long-term success, but from an absolute safety perspective that gets increasingly risky.
You need a bit more to develop an AGI than running PCR that someone else invented. I don't think you can develop an AGI when you think AGI are impossible due to metaphysical dualism. You can believe that humans have souls and still design AGI that have minds but no souls, but you won't get far at developing an AGI with something like a mind if you think that task is impossible.
I don't expect AI itself to show up, but I think it's clear that in the story magic is serving as a sort of metaphor for AI, with Harry playing the role of an ambitious AI researcher: Harry wants to use magic to solve death and make everything perfect, but we've gotten a lot of warning that Harry's plans could go horribly wrong and possibly destroy the world. Eliezer once mentioned he was considering a "solve this puzzle or the story ends sad" conclusion for HPMOR like he did for Three Worlds Collide. If Eliezer goes through with that, I expect the "sad" ending to be "Harry destroys the world." Or if Eliezer doesn't do that, he may just make it clear how Harry came very close to destroying the world before finding another solution.
EDIT: So how does Harry almost destroy the world? My own personal theory is "conservation law arbitrage." Or maybe some plan involving Dementors going horribly wrong.
This comment from a few years back and the associated discussion seems vaguely relevant.