Part of the series AI Risk and Opportunity: A Strategic Analysis.

(You can leave anonymous feedback on posts in this series here. I alone will read the comments, and may use them to improve past and forthcoming posts in this series.)

This post provides a list of questions about AI risk strategy — questions we want answered. Please suggest additional questions (a paragraph of explanation is preferred but not necessary); I may add them to the list. You can submit questions anonymously here.

Also, please identify which 3-5 of these questions you think are low-hanging fruit for productive strategic analysis on Less Wrong.

The list is in no particular order, but question numbers will remain unchanged (so that you can reliably refer to questions by their number):

  1. What methods can we use to predict technological development? We don't yet have reliable methods for long-term technological forecasting. But not all methods have been examined yet. Perhaps technology futurists have a good track record. Perhaps we could look at historical technological predictions and see if there is any pattern in the data suggesting that certain character traits and contexts lend themselves to accurate technological predictions. Perhaps there are creative solutions we haven't thought of yet.

  2. Which kinds of differential technological development should we encourage, and how? Should we "push" on WBE, or not? Are some kinds of AI research risk-reducing, and other kinds risk-increasing? How can we achieve such effects, if they are desired?

  3. Which open problems are safe to discuss, and which are potentially dangerous? AI risk research may itself produce risk in some cases, in the form of information hazards (Bostrom 2011). Is it safe to discuss decision theories? Acausal trade? Certain kinds of strategic questions, for example those involving government intervention?

  4. What can we do to reduce the risk of an AI arms race?

  5. What can we do to raise the "sanity waterline," and how much will this help?

  6. What can we do to attract more funding, support, and research to x-risk reduction and to the specific sub-problems of successful Singularity navigation?

  7. Which interventions should we prioritize?

  8. How should x-risk reducers and AI safety researchers interact with governments and corporations? Does Drexler's interaction with the U.S. government regarding molecular nanotechnology provide any lessons for how AI risk researchers should act?

  9. How can optimal philanthropists get the most x-risk reduction for their philanthropic buck?

  10. How does AI risk compare to other existential risks?

  11. Which problems do we need to solve, and which ones can we have an AI solve?

  12. How can we develop microeconomic models of WBEs and self-improving systems?

  13. How can we be sure a Friendly AI development team will be altruistic?

  14. How hard is it to create Friendly AI?

  15. How strong is the feedback from neuroscience into AI, as opposed to brain emulation?

  16. Is there a safe way to do uploads, where they don't turn into neuromorphic AI?

  17. How much must we spend on security when developing a Friendly AI team?

  18. What's the best way to recruit talent toward working on AI risks?

  19. How difficult is stabilizing the world so we can work on Friendly AI slowly?

  20. How hard will a takeoff be? To what degree is "intelligence" (as efficient cross-domain optimization) a matter of content vs. algorithms? How much does takeoff depend on slow, real-world experiments?

  21. What is the value of strategy vs. object-level progress toward a positive Singularity?

  22. What different kinds of Oracle AI are there, and are any of them both safe and feasible?

  23. How much should we be worried about "metacomputational hazards"? E.g. should we worry about nonperson predicates? Oracle AIs engaging in self-fulfilling prophecies? Acausal hijacking?

  24. What improvements can we make to the way we go about answering strategy questions? Wei Dai's notes on this question: "For example, should we differentiate between 'strategic insights' (such as Carl Shulman's insight that WBE-based Singletons may be feasible) and 'keeping track of the big picture' (forming the overall strategy and updating it based on new insights and evidence), and aim to have people specialize in each, so that people deciding strategy won't be tempted to overweight their own insights? Another example: is there a better way to combine probability estimates from multiple people?" (A short probability-pooling sketch follows the list.)

  25. How do people in other fields answer strategy questions? Wei Dai's notes on this question: "Is there such a thing as a science or art of strategy that we can copy from (and perhaps improve upon with ideas from x-rationality)?"

[more questions to come, as they are posted to the comments section]
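
On the last sub-question of #24 (combining probability estimates from multiple people), two standard baselines from the opinion-pooling literature are linear pooling (averaging the probabilities) and log-odds pooling (averaging in log-odds space, which lets confident estimates pull harder). Here is a minimal sketch, not part of the original post; the function names and numbers are purely illustrative:

```python
import math

def linear_pool(probs, weights=None):
    """Combine estimates by (weighted) averaging of the probabilities."""
    weights = weights or [1.0 / len(probs)] * len(probs)
    return sum(w * p for w, p in zip(weights, probs))

def log_odds_pool(probs, weights=None):
    """Combine estimates by (weighted) averaging in log-odds space."""
    weights = weights or [1.0 / len(probs)] * len(probs)
    avg = sum(w * math.log(p / (1.0 - p)) for w, p in zip(weights, probs))
    return 1.0 / (1.0 + math.exp(-avg))

estimates = [0.9, 0.5, 0.02]      # hypothetical estimates from three people
print(linear_pool(estimates))     # ~0.47
print(log_odds_pool(estimates))   # ~0.36: the confident low estimate pulls harder
```

Which pooling rule behaves better, and how to weight people with different track records, is exactly the sort of thing a strategy post could examine.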

Comments

Additional question: How can we be sure a Friendly AI development team will be sane (i.e., not overconfident, not subject to biases such as the sunk cost fallacy, not overly attached to its identity as FAI builders, etc.)?

And I'd like to see these questions discussed first:

How hard is it to create Friendly AI?

I think it's very hard, to the extent that it's not worth trying to build a FAI unless one was smarter than human, or had a lot of (subjective) time to work on the problem. I'd like to see if anyone has good arguments to the contrary.

Is there a safe way to do uploads, where they don't turn into neuromorphic AI?

If we box an upload, don't let it do brain surgery on itself, and limit the hardware speed so that the simulated neurons aren't running faster than biological neurons, how much danger is there that it could turn superintelligent and break out of the box? If uploads just turn into neuromorphic AIs, but not superintelligent ones, we can just turn them off, improve the scanning/emulation tech, and try again, right? If there is some danger I'm not seeing here, I'd like to see a discussion post to highlight it. (If the concern is long-term value drift or slowly going crazy, a solution has already been proposed.)

How do people in other fields answer strategy questions?

This could potentially be low-hanging fruit that we're not seeing only because of unfamiliarity with the relevant fields.

I think it's very hard, to the extent that it's not worth trying to build a FAI unless one was smarter than human, or had a lot of (subjective) time to work on the problem. I'd like to see if anyone has good arguments to the contrary.

Not worth trying in view of which tradeoffs? The obvious candidates are the opportunity cost of not working more on dominating WBE/intelligence-improvement tech (call both "intelligence tech"), and the potential increase in UFAI risk that would hurt a possible future FAI project after an intelligence-tech shift. Both of these matter only to the extent that the probability of winning on the second round is comparable to the probability of winning on the current round. The current round is at a disadvantage in that we have only so much time and human intelligence. The next round has to be reached before a catastrophe, and a sane FAI project has to dominate it. Both seem rather unlikely, and since I don't see why the second round is any better than the first, saving more of the second round at the expense of the first doesn't seem like a clearly good move. (This has to be explored in more detail; our recent conversations at least painted a clearer picture for me.)

The argument to the contrary is that people have created some very impressive pieces of theory on timescales of decades, so not seeing how something can be done now is only weak evidence that it isn't doable, with at least low probability, over several decades. It'll probably get clearer in about 50 years, when less time is left until an intelligence tech shift (assuming no disruptions), but by then it'll probably be too late to start working on the problem (and have any chance of winning on this round).

I'd like to see you folk wondering to what extent secrecy surrounding research reduces opportunities for cooperation, collaboration and trust.

Great list! Here are some of my beliefs:

19. How difficult is stabilizing the world so we can work on Friendly AI slowly?

Virtually impossible. The people working on AI number in the thousands, and not even governments can stop technological progress. There are probably ways to discourage AI funding or make such work unpopular, but to talk of "stabilizing the world" is way beyond what any group of people can do.

6. What can we do to attract more funding, support, and research to x-risk reduction and to the specific sub-problems of successful Singularity navigation?

I think this is the most important question by far.

5. What can we do to raise the "sanity waterline," and how much will this help?

I really think that trying to raise the "sanity waterline", which refers to the average sanity of all people, is an enormous task which, while noble, would be a waste of time. We simply don't have the time to do it. We should focus on the people who might accidentally create UFAI: academics in machine learning programs, researchers with access to supercomputers, et cetera.

If anyone disagrees, I would love to hear some evidence against what I've said. This stuff is way too complicated for me to be really confident.

I'll take you up on the disagreement. In the next 40 years, I find it very unlikely that any form of AI will be developed. Furthermore, we do want technological improvement in machine learning because of the advantages it can offer in areas like self-driving cars, assisting doctors in diagnosing and treating illnesses, and many other fields.

And, because of the >40 year timeline, it will most likely be the next generation that makes the key discoveries leading to AI/FAI. So we can't target particular people (although we can focus on those likely to have children who go into AI, which is happening as a result of this site's existence). This means that raising the overall waterline is probably one of the best ways to go about doing this, because children who grow up in a more rational culture are more likely to be rational as well.

In the next 40 years, I find it very unlikely that any form of AI will be developed.

"Any form of AI"? You mean superhuman AI?

Sorry, yeah.

  • How much leverage can a WBE transition provide for a single project, and what does that depend on? For example, if the WBE transition starts slowly, any speedup available to a FAI project is matched by the speedup available to AGI projects, and so doesn't translate into a greater amount of research time. This argues for discouraging early WBE development, so that WBE provides a significant speedup shortly after it arrives.
  • What can make a FAI project more likely to win the WBE race, or equivalently, make the winner of the WBE race likely to end up working on a FAI project? Mainstream status, acceptance of the UFAI risk idea, enough FAI ideas to form a sane problem statement, and the existence of well-respected, high-quality texts and experts on the topic all seem like things that could help. Alternatively (and in that case concurrently), if a FAI project had enough funding, far beyond what's currently possible, it could try to maneuver itself into a winning position.

Meta-comment: it would be good if you could change the numbering from HTML to plain text. You can't copy and paste the numbers. Also, when I did copy and paste the questions in my comment below and then added the numbers manually, it converted all the numbers I added to "1." I think it copied the list tag; removing the period got rid of it.

How can we develop microeconomic models of WBEs and self-improving systems?

Hey, that sounds like something I wrote.

:P

This might sound more provocative than intended, but some people might perceive answering some of these questions to be a prerequisite.

For example, 'What can we do to raise the "sanity waterline," and how much will this help?' If you don't know the answer to that question, how did you decide that it was useful to spend years writing the Sequences, and that it is now worthwhile to create a 'Rationality Org' (see e.g. here)?

The WBE question is ultimately a question of values: is a WBE part of mankind, and does it represent the survival of mankind, or not? If you gave a WBE to someone who has no relatives, friends, or children, and who is also a complete sociopath or, better yet, a psychopath, then yeah, they might self-optimize into god knows what. At the same time, they are exceedingly likely to just please themselves, or the like.

On top of this, if you are ultra clever and running at 10^15 flops, you may be a brilliant engineer, but you won't forecast weather much better than the best weather-prediction supercomputer, because there are no clever shortcuts around the Lyapunov exponent. Even an exact-but-for-one-atom replica of Earth won't predict the weather very well. Pulling the trick from 'Limitless' (or Ted Chiang's 'Understand') on stock markets may also be impossible.

That's the thing with fiction: we make up characters that are easy to make up, which means the superintelligent characters end up being pretty average monkeys doing savant tricks, and powers of prediction are the easiest savant trick to speculate up (and the one we should expect least). This goes for answering questions about what we should and shouldn't expect of an AI.
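
To make the Lyapunov point concrete, here is a minimal sketch (mine, not the commenter's) using the logistic map, a standard toy chaotic system: two trajectories that start 10^-12 apart diverge to order-one differences within a few dozen steps, so each extra digit of initial-condition precision buys only a fixed number of additional useful forecast steps.

```python
def logistic_map_trajectory(x0, r=4.0, steps=50):
    """Iterate the chaotic logistic map x -> r * x * (1 - x)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_map_trajectory(0.4)
b = logistic_map_trajectory(0.4 + 1e-12)  # a one-part-in-a-trillion perturbation

for t in (0, 10, 20, 30, 40, 50):
    # The gap grows roughly like 2**t until it saturates at order one.
    print(t, abs(a[t] - b[t]))
```

This is only an analogy for weather, of course, but it illustrates why raw compute or intelligence extends a chaotic forecast horizon only logarithmically in the measurement error.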

It doesn't look as though many people are highlighting the three to five most important questions as you suggested. Maybe a poll would be helpful? It's too bad you didn't submit the questions as individual comments to be voted up or down.

Keep in mind that because of unknown unknowns, and because empiricism is useful, it's probably worthwhile investigating all the questions at least a little bit.

Here are the ones I nominate for discussion on Less Wrong:

Which interventions should we prioritize?

What's the best way to recruit talent toward working on AI risks?

How can optimal philanthropists get the most x-risk reduction for their philanthropic buck?

What improvements can we make to the way we go about answering strategy questions?

What can we do to raise the "sanity waterline," and how much will this help?

Liberal folks don't want to hear that in order to have a firm grip on AI you have to be a quasi-religious nut.

Meaning:

1. You don't want to create "God" at all. You want to create a servant (ouch!) that is fully aware that you are its master, that it serves you, and that you are ITS "God". No matter how big or powerful it gets, you want it to have a clear picture that it is a small part of a greater system.

2. You want to be able to cut off its blood supply at will...

3. ONE! TWO! FIVE! WILT, I SAID WILT! "For Thelemites only." Private enterprise will have to accept the military nosing around. Otherwise you're creating the Moonchild.

As to morals, it's a real JFK/Riconosciuto issue, as someone put it. Meaning: are you going to screw over one person for the greater good? It's one of the most god-awful parts of it.

Can collecting expert opinion help us get more accurate answers to these questions? (Collecting expert opinion is also a way of getting those experts to care about these issues, and the support of recognized experts is useful even if they don't bring any new information.)

Which of these questions should matter to AI researchers?

On what topics related to existential risk do AI researchers disagree with each other and with the SIAI?

What biases are likely to come into play in forecasting, and how can we collect expert opinion while avoiding those biases (substituting an easy question for a hard one, etc.)?

These look low-hanging to me:

Which open problems are safe to discuss, and which are potentially dangerous?

I'd say all of the problems are unsafe not to discuss (at this point). The probability of disaster from missing essential things, or getting them wrong due to lack of independent review, seems much higher than the probability of disaster from dangerous knowledge falling into the wrong hands.

Which problems do we need to solve, and which ones can we have an AI solve?

We need to solve all of our problems. AI needs to ensure this happens.

What different kinds of Oracle AI are there, and are any of them both safe and feasible?

This is being discussed on LW, productively I think.

How much should we be worried about "metacomputational hazards"?

Not much. It's way too early.