What does the world look like, the day before FAI efforts succeed?


by [anonymous], 16th Nov 2012 (5 min read, 64 comments)



TL;DR: let's visualize what the world looks like if we successfully prepare for the Singularity.

I remember reading once, though I can't remember where, about a technique called 'contrasting'. The idea is to visualize a world where you've accomplished your goals, and visualize the current world, and hold the two worlds in contrast to each other. Apparently there was a study about this; the experimental 'contrasting' group was more successful than the control in accomplishing its goals.

It occurred to me that we need some of this. Strategic insights about the path to FAI are neither robust nor likely to be highly reliable. And in order to find a path forward, you need to know where you're trying to go. Thus, some contrasting:

It's the year 20XX. The time is 10 AM, on the day that will thereafter be remembered as the beginning of the post-Singularity world. Since the dawn of the century, a movement has risen in defense of humanity's future. What began with mailing lists and blog posts became a slew of businesses, political interventions, infrastructure improvements, social influences, and technological innovations designed to ensure the safety of the world.

Despite all odds, we exerted a truly extraordinary effort, and we did it. The AI research is done; we've laboriously tested and re-tested our code, and everyone agrees that the AI is safe. It's time to hit 'Run'.

And so I ask you, before we hit the button: what does this world look like? In the scenario where we nail it, which achievements enabled our success? Socially? Politically? Technologically? What resources did we acquire? Did we have superior technology, or a high degree of secrecy? Was FAI research highly prestigious, attractive, and well-funded? Did we acquire the ability to move quickly, or did we slow unFriendly AI research efforts? What else?

I had a few ideas, which I divided between scenarios where we did a 'fantastic', 'good', or 'sufficient' job at preparing for the Singularity. But I need more ideas! I'd like to fill this out in detail, with the help of Less Wrong. So if you have ideas, write them in the comments, and I'll update the list.

Some meta points:

  • This speculation is going to be, well, pretty speculative. That's fine - I'm just trying to put some points on the map. 
  • However, I'd like to get a list of reasonable possibilities, not detailed sci-fi stories. Do your best.
  • In most cases, I'd like to consolidate categories of possibilities. For example, we could consolidate "the FAI team has exclusive access to smart drugs" and "the FAI team has exclusive access to brain-computer interfaces" into "the FAI team has exclusive access to intelligence-amplification technology." 
  • However, I don't want too much consolidation. For example, I wouldn't want to consolidate "the FAI team gets an incredible amount of government funding" and "the FAI team has exclusive access to intelligence-amplification technology" into "the FAI team has a lot of power".
  • Lots of these possibilities are going to be mutually exclusive; don't see them as aspects of the same scenario, but rather different scenarios.

Anyway - I'll start.

Visualizing the pre-FAI world

  • Fantastic scenarios
    • The FAI team has exclusive access to intelligence amplification technology, and uses it to ensure Friendliness & strategically reduce X-risk.
    • The government supports Friendliness research, and contributes significant resources to the problem. 
    • The government actively implements legislation which FAI experts and strategists believe has a high probability of making AI research safer.
    • FAI research becomes a highly prestigious and well-funded field, relative to AGI research.
    • Powerful social memes exist regarding AI safety; any new proposal for AI research is met with a strong reaction (among the populace and among academics alike) asking about safety precautions. It is low status to research AI without concern for Friendliness.
    • The FAI team discovers important strategic insights through a growing ecosystem of prediction technology: stables of experts, prediction markets, and opinion aggregation.
    • The FAI team implements deliberate X-risk reduction efforts to stave off non-AI X-risks. Those might include a global nanotech immune system, cheap and rigorous biotech tests and safeguards, nuclear safeguards, etc.
    • The FAI team implements the infrastructure for a high-security research effort, perhaps offshore, implementing the best available security measures designed to reduce harmful information leaks.
    • Giles writes: Large amounts of funding are available, via government or through business. The FAI team and its support network may have used superior rationality to acquire very large amounts of money.
    • Giles writes: The technical problem of establishing Friendliness is easier than expected; we are able to construct a 'utility function' (or a procedure for determining such a function) in order to implement human values that people (including people with a broad range of expertise) are happy with.
    • Crude_Dolorium writes: FAI research proceeds much faster than AI research, so by the time we can make a superhuman AI, we already know how to make it Friendly (and we know what we really want that to mean).
  • Pretty good scenarios
    • Intelligence amplification technology access isn't exclusive to the FAI team, but it is differentially adopted by the FAI team and their supporting network, resulting in a net increase in FAI team intelligence relative to baseline. The FAI team uses it to ensure Friendliness and implement strategy surrounding FAI research.
    • The government has extended some kind of support for Friendliness research, such as limited funding. No protective legislation is forthcoming.
    • FAI research becomes slightly more high status than today, and additional researchers are attracted to answer important open questions about FAI.
    • Friendliness and rationality memes grow at a reasonable rate, and by the time the Friendliness program occurs, society is more sane.
    • We get slightly better at making predictions, mostly by refining our current research and discussion strategies. This allows us a few key insights that are instrumental in reducing X-risk.
    • Some X-risk reduction efforts have been implemented, but with varying levels of success. Insights about which X-risk efforts matter are of dubious quality, and the success of each effort doesn't correlate well with the seriousness of the X-risk. Nevertheless, some X-risk reduction is achieved, and humanity survives long enough to implement FAI.
    • Some security efforts are implemented, making it difficult but not impossible for pre-Friendly AI tech to be leaked. Nevertheless, no leaks happen.
    • Giles writes: Funding is harder to come by, but small donations, limited government funding, or moderately successful business efforts suffice to fund the FAI team.
    • Giles writes: The technical problem of aggregating values through a Friendliness function is difficult; people have contradictory and differing values. However, there is broad agreement as to how to aggregate preferences. Most people accept that FAI needs to respect values of humanity as a whole, not just their own.
    • Crude_Dolorium writes: Superhuman AI arrives before we learn how to make it Friendly, but we do learn how to make an 'Anchorite' AI that definitely won't take over the world. The first superhuman AIs use this architecture, and we use them to solve the harder problems of FAI before anyone sets off an exploding UFAI.
  • Sufficiently good scenarios
    • Intelligence amplification technology is widespread, preventing any differential adoption by the FAI team. However, FAI researchers are able to keep up with competing efforts to use that technology for AI research.
    • The government doesn't support Friendliness research, but the research group stays out of trouble and avoids government interference.
    • FAI research never becomes prestigious or high-status, but the FAI team is able to answer the important questions anyway.
    • Memes regarding Friendliness aren't significantly more widespread than today, but the movement has grown enough to attract the talent necessary to implement a Friendliness program.
    • Predictive ability is no better than it is today, but the few insights we've gathered suffice to build the FAI team and make the project happen.
    • There are no significant and successful X-risk reduction efforts, but humanity survives long enough to implement FAI anyway.
    • No significant security measures are implemented for the FAI project. Still, via cooperation and because the team is relatively unknown, no dangerous leaks occur.
    • Giles writes: The team is forced to operate on a shoestring budget, but succeeds anyway because the problem turns out to not be incredibly sensitive to funding constraints.
    • Giles writes: The technical problem of aggregating values is incredibly difficult. Many important human values contradict each other, and we have discovered no "best" solution to those conflicts. Most people agree on the need for a compromise but quibble over how that compromise should be reached. Nevertheless, we come up with a satisfactory compromise.
    • Crude_Dolorium writes: The problems of Friendliness aren't solved in time, or the solutions don't apply to practical architectures, or the creators of the first superhuman AIs don't use them, so the AIs have only unreliable safeguards. They're given cheap, attainable goals; the creators have tools to read the AIs' minds to ensure they're not trying anything naughty, and killswitches to stop them; they have an aversion to increasing their intelligence beyond a certain point, and to whatever other failure modes the creators anticipate; they're given little or no network connectivity; they're kept ignorant of facts more relevant to exploding than to their assigned tasks; they require special hardware, so it's harder for them to explode; and they're otherwise designed to be safer if not actually safe. Fortunately they don't encounter any really dangerous failure modes before they're replaced with descendants that really are safe.

 
