Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This post is a result of numerous discussions with other participants and organizers of the MIRI Summer Fellows Program 2019. It describes ideas that are likely already known by many researchers. However, given how often disagreements about slow/fast takeoffs come up, I believe there is significant value in making them common knowledge.

Takeoff speed & why it matters

In Superintelligence, Nick Bostrom distinguishes between slow, medium, and fast AI takeoff scenarios (where the takeoff speed is measured by how much real-world time passes between the milestones of human-level AI (HLAI) and superintelligent AI (SAI)). He argues that slow takeoff should be reasonably safe since humanity would have sufficient time to coordinate and solve the AI alignment problem, while fast takeoff would be particularly dangerous since we wouldn't be able to react to what the AI does.

Real-world time is not what matters

In many scenarios, the real-time takeoff speed indeed strongly correlates with our ability to influence the outcome. However, we can also imagine many scenarios where this is not the case. As an example, suppose we obtain HLAI by simulating humans in virtual environments, and that this procedure additionally fully preserves the simulated humans' alignment with humanity. Since this effectively increases the speed at which humanity operates, we might get a "fully controlled takeoff" even if the transition from HLAI to SAI[1] only takes a few days of real-world time. More generally, if our path to HLAI also increases the effectiveness of humanity's efforts, the "effective time" we get between HLAI and SAI will scale accordingly. For example, this might be the case if we go the way of Iterated Distillation and Amplification or Comprehensive AI Services. Less controversially, suppose we automate most of the current programming tasks and increase the re-usability of code, such that every computer scientist becomes 100 times as effective as they are now. A given stretch of real-world time would then buy far more alignment-relevant work than it does today.

Conclusion: measure useful work

Given these examples, I think we should measure takeoff speeds not in real-world time, but rather in (some operationalization of) the work-towards-AI-alignment that humanity will be able to do between HLAI and SAI. Anecdotal examples of such measures might include "integral of the human-originating GDP between HLAI and SAI" or "number of AI safety papers published between HLAI and SAI". I believe that finding a non-anecdotal operationalization would benefit many AI policy/strategy discussions.
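To make the "integral" framing concrete, here is a minimal sketch in Python. It is only one possible toy operationalization, not a proposed standard: the function name `effective_work`, the notion of a "work rate" measured relative to today's baseline, and the specific multipliers and durations are all hypothetical and chosen purely for illustration.

```python
from typing import Callable


def effective_work(
    rate: Callable[[float], float],
    t_hlai: float,
    t_sai: float,
    steps: int = 10_000,
) -> float:
    """Approximate the integral of rate(t) from t_hlai to t_sai.

    `rate(t)` is a hypothetical measure of how much alignment-relevant work
    humanity does per unit of real-world time, relative to today's baseline
    (1.0 = today's pace). Times are in years. Uses the trapezoidal rule.
    """
    if t_sai < t_hlai:
        raise ValueError("t_sai must not precede t_hlai")
    if steps < 1:
        raise ValueError("steps must be at least 1")

    dt = (t_sai - t_hlai) / steps
    total = 0.0
    for i in range(steps):
        left = t_hlai + i * dt
        right = left + dt
        total += 0.5 * (rate(left) + rate(right)) * dt
    return total


# Scenario A (assumed numbers): a "slow" takeoff of 10 real-world years
# at today's pace of work.
slow = effective_work(lambda t: 1.0, 0.0, 10.0)

# Scenario B (assumed numbers): a "fast" takeoff of 5 real-world days,
# but with humanity's effective pace multiplied by 1000.
fast = effective_work(lambda t: 1000.0, 0.0, 5 / 365)

print(f"slow takeoff: {slow:.1f} baseline-years of work")
print(f"fast takeoff: {fast:.1f} baseline-years of work")
```

Under these made-up numbers, the five-day takeoff yields about 13.7 baseline-years of work, more than the ten-year one. The point is not the specific figures but that the ranking of scenarios by real-world time and by useful work can differ.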

  1. Recall that Bostrom distinguishes between speed, collective, and quality superintelligence. Arguably, being able to simulate humans (with enough compute) already constitutes a speed superintelligence. However, I don't think this diminishes the overall point of the post. ↩︎

1 comment

This is an interesting take on framing Takeoff Dynamic Timelines, and I understand why it became your emphasis: the ability to align the AI is the outcome that really matters in that situation.

That said, I see two major challenges with defining time through any form of operationalization standards:

  1. Operations can change (and should improve). Humans often surprise themselves with what they can operationalize in times of crisis (see output during WWII for a simple, clear example). This means that your time scale will fluctuate with operational effectiveness and scale; put differently, we don't have a sense of the scale of the operational input in the 'transition period'.
  2. We're in the nascent stages, where we cannot predict how much operational input is required for successful alignment. In other words, we have no way to judge how much operational input will get us the outcome (operational output) we desire.

Without either a clear operational input (capacity and effectiveness will likely fluctuate wildly over the coming years / decades) or a clear operational output (we do not have a clear sense of the amount of input needed to get our desired outcome), it's not a highly effective measure.

Specifically, not knowing either of these would lead to the conclusion that "we want more clock time", and perhaps more "crisis time" specifically, to maximize our operational inputs and thus the probability of a successful alignment outcome.

All that said, I think you hit on a key idea here — we need to make sure that we are measuring and tracking what actually matters: our ability to safely align the AI in question.

Curious if you think I missed something here?