The "Post-Singularity Social Contract" and Bostrom's "Vulnerable World Hypothesis"

by philosophytorres · 4 min read · 25th Nov 2018 · 6 comments

I thought it might be worth outlining a few interesting (at least in my view!) parallels between Nick Bostrom’s recent working draft paper on the “vulnerable world hypothesis” (VWH) and my slightly less recent book chapter “Superintelligence and the Future of Governance: On Prioritizing the Control Problem at the End of History.”

In brief, my chapter offers an explicit (and highly speculative) proposal for extricating ourselves from the “semi-anarchic default condition” in which we currently find ourselves, although I don’t use that term. But I do more or less describe the three features of this condition that Bostrom identifies:

(1) The phenomenon of “agential risks”—which Bostrom refers to as the “apocalyptic residual,” a phrase that I think will ultimately introduce confusion and thus ought to be eschewed!—entails that, once the destructive technological means become available, the world will almost certainly face a global-scale catastrophe.

(2) I argue that sousveillance and David Brin’s “transparent society” model are both inadequate to prevent global-scale attacks from risky agents with virtually 100 percent reliability. Furthermore, I contend that the alternative of asymmetrical, invasive global surveillance systems will almost certainly be misused and abused by those (“mere humans”) in charge, thus threatening another type of existential hazard: totalitarianism.

(3) Finally, I suggest that a human-controlled singleton (or global governing system) is unlikely to take shape on the relevant timescales. That is, yes, there appears to be some general momentum toward a unipolar configuration (see Bostrom’s “singleton hypothesis,” which seems plausible to me), but unprecedented destructive capabilities will likely be widely distributed among nonstate actors before this happens.

I then argue for a form of algocracy: One way to avoid almost certain doom (due to agential risks and world-destroying technologies) is to design a superintelligent algorithm to coordinate planetary affairs and, in effect, spy on all citizens in order to preemptively obviate global-scale attacks, some of which could have irreversible consequences. Ultimately, my conclusion is that, given the exponential development of dual-use emerging technologies, we may need to actually accelerate work on both (a) the philosophical control problem, and (b) the technical problem of creating an artificial superintelligence. As I write:

First, everything hangs on our ability to solve the control problem and create a friendly superintelligence capable of wise governance. This challenge is formidable enough given that many AI experts anticipate a human-level AI within this century—meaning that there appears to be a deadline—but the trends outlined in Figure 2 open up the possibility that we may have even less time to figure out what our “human values” are and how they can be encoded in “the AI’s programming language, and ultimately in primitives such as mathematical operators and addresses pointing to the contents of individual memory registers” (Bostrom 2014). Thus, the present paper offers a novel reason for allocating large amounts of resources for projects that focus on solving the control problem: not only will continued progress in computer science make the control problem probably unavoidable, but the convergence of state and nonstate power could require new forms of global governance—namely, a friendly supersingleton—within the coming decades.

This being said, here are some differences between the two papers:

(i) Bostrom also emphasizes phenomena like non-omnicidal actors who are locked in situations that, for structural reasons, lead them to pursue actions that cause disaster.

(ii) Whereas Bostrom focuses on the possibility of extracting a “black ball”—i.e., a “technology that invariably or by default destroys the civilization that invents it”—out of the urn of innovation, I am essentially suggesting that we may have already extracted such a ball. For example, synthetic biology (more specifically: de-skilling, plus the Internet, plus the falling cost of lab equipment) will almost certainly place unprecedented destructive power within arm’s reach of a large number of terrorist groups or even single individuals. As I have elsewhere demonstrated—in two articles on agential risks—there are plenty of violent individuals with omnicidal inclinations looming in the shadows of society! That said, some pretty simple calculations reveal that the probability of a successful global-scale attack by any one group or individual need only be tiny for annihilation to be more or less guaranteed over the course of decades or centuries, because risk accumulates across agents and across time. Both John Sotos (in an article titled “Biotechnology and the Lifetime of Technical Civilizations”) and I (in an article titled “Facing Disaster: The Great Challenges Framework,” although I had similar calculations in my 2017 book) have crunched the numbers to show that, quoting from “Facing Disaster”:

For the sake of illustration, let’s posit that there are 1,000 terror agents in a population of 10 billion and that the probability per decade of any one of these individuals gaining access to world-destroying weapons … is only 1 percent. What overall level of existential risk would this expose the entire population to? It turns out that, given these assumptions, the probability of a doomsday attack per decade would be a staggering 99.995 percent. One gets the same result if the number of terror agents is 10,000 and the probability of access is 0.1 percent, or if the number is 10 million and the probability is 0.000001. Now consider that the probability of access may become far greater than 0.000001—or even 1—percent, given the trend of [the radical democratization of science and technology], and that the number of terror agents could exceed 10 million, which is a mere 0.1 percent of 10 billion. It appears that an existential strike could be more or less inescapable.
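The figures in the quoted passage are easy to verify. Assuming (as the passage implicitly does) that each agent's chance of gaining access is independent of the others, the probability that at least one of N agents succeeds in a decade is 1 - (1 - p)^N; all three quoted cases have N × p = 10, which also indicates that the final "0.000001" should be read as a raw probability (i.e., 0.0001 percent) rather than as a percentage. Here is a minimal sketch (the function name is my own, not from either paper):

```python
def doomsday_probability(n_agents: int, p_access: float) -> float:
    """Probability that at least one of n_agents independently gains
    access to world-destroying weapons in a decade, given a per-agent
    probability p_access. Assumes independence across agents."""
    return 1.0 - (1.0 - p_access) ** n_agents

# The three cases from the quoted passage. Each has n * p = 10, so each
# yields roughly 1 - e^(-10), i.e., the passage's ~99.995 percent.
for n, p in [(1_000, 0.01), (10_000, 0.001), (10_000_000, 0.000001)]:
    print(f"n = {n:>10,}   p = {p:<8g}   P(attack) = {doomsday_probability(n, p):.4%}")
```

The near-identity of the three results is just the Poisson approximation at work: whenever N × p is held fixed and p is small, 1 - (1 - p)^N ≈ 1 - e^(-N·p).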

This suggests that a scenario as extreme as the “easy nuke” one that Bostrom outlines in his paper need not be the case for the conclusion of the VWH to obtain. Recall that this hypothesis states:

If technological development continues then a set of capabilities will at some point be attained that make the devastation of civilization extremely likely, unless civilization sufficiently exits the semi-anarchic default condition.

My sense is, again, that these capabilities are already with us today—that is, they are emerging (rapidly) from the exponentially growing field of synthetic biology, although it’s possible that they could arise from atomically precise manufacturing as well. Indeed, this is a premise of my argument in “Superintelligence and the Future of Governance,” which I refer to as “The Threat of Universal Unilateralism.” I summarize the argument in section 1 of the paper as follows:

(i) The Threat of Universal Unilateralism: Emerging technologies are enabling a rapidly growing number of nonstate actors to unilaterally inflict unprecedented harm on the global village; this trend of mass empowerment is significantly increasing the probability of an existential catastrophe—and could even constitute a Great Filter (Sotos 2017).

(ii) The Preemption Principle: If we wish to obviate an existential catastrophe, then societies will need a way to preemptively avert not just most but all possible attacks with existential consequences, since the consequences of an existential catastrophe are by definition irreversible.

(iii) The Need for a Singleton: The most effective way to preemptively avert attacks is through some regime of mass surveillance that enables governing bodies to monitor the actions, and perhaps even the brain states, of citizens; ultimately, this will require the formation of a singleton.

(iv) The Threat of State Dissolution: The trend of (i) will severely undercut the capacity of governing bodies to effectively monitor their citizens, because the capacity of states to provide security depends upon a sufficiently large “power differential” between themselves and their citizens.

(v) The Limits of Security: If states are unable to effectively monitor their citizens, they will be unable to neutralize the threat posed by (i), thus resulting in a high probability of an existential catastrophe.

There seem, at least to my eyes, to be some significant overlaps here with Bostrom's (fascinating) paper, which suggests a possible convergence of scholars on a single conception of humanity's (near-)future predicament on spaceship Earth. Perhaps the main difference is, once more, that Bostrom’s thesis hinges upon the notion of a not-yet-realized-but-maybe-possible “black ball” technology, whereas the message of my analysis is far more urgent: “weapons of total destruction” (WTDs) already exist, are in their adolescence (so to speak), and will likely mature in the coming years and decades. Put differently, humanity will need to escape the semi-anarchic default condition in which we currently reside quite soon, or else face almost certain annihilation. My solution is algocratic, with everything depending upon success on the control problem.

This post is, admittedly, quite hastily written. Please don’t hesitate to let me know if aspects of it are opaque and thus require clarification. I would also—as always—welcome comments of any sort!