Optimising Society to Constrain Risk of War from an Artificial Superintelligence

JohnCDraper

This paper is now up (with the annex mentioned in the paper) as a preprint at https://osf.io/preprints/socarxiv/4268q

It can be cited as Draper, John. 2020. “Optimising Peace Through a Universal Global Peace Treaty to Constrain Risk of War from a Militarised Artificial Superintelligence.” SocArXiv. April 15. doi:10.31235/osf.io/4268q.

Comments are welcome below.

Optimising Peace through a Universal Global Peace Treaty to Constrain Risk of War from a Militarised Artificial Superintelligence

John Draper

Abstract

An artificial superintelligence (ASI) emerging in a world where war is still normalised may constitute a catastrophic existential risk, either because the ASI might be employed by a single nation-state on purpose to wage war for global domination or because the ASI goes to war on behalf of itself to establish global domination; these risks are not mutually exclusive in that the first can transition to the second. We presently live in a world where few states actually declare war on each other or even wage war on each other. This is because the 1945 United Nations Charter's Article 2 states that UN member states should “refrain in their international relations from the threat or use of force against the territorial integrity or political independence of any state”, while allowing for “military measures by UN Security Council resolutions” and “exercise of self-defense”. In this theoretical ideal, wars are not declared; instead, 'international armed conflicts' occur. However, costly interstate conflicts, both ‘hot’ and ‘cold’, still exist, for instance the Kashmir Conflict and the Korean War. Furthermore, a ‘New Cold War’ between AI superpowers (the United States and China) looms. An ASI-directed/enabled future interstate war could trigger ‘total war’, including nuclear war, and is therefore ‘high risk’. One risk reduction strategy would be optimising peace through a Universal Global Peace Treaty (UGPT), which could contribute towards the ending of existing wars and towards the prevention of future wars, through conforming instrumentalism. A critical juncture to optimise peace via the UGPT is emerging: the UGPT could be leveraged off a ‘burning plasma’ fusion reaction breakthrough, expected from circa 2025 to 2035, as was attempted, unfortunately unsuccessfully, in 1946 with fission, for atomic war. While this strategy cannot cope with non-state actors, it could influence state actors, including those developing ASIs, or an ASI with agency.

Keywords: AI arms race, artificial superintelligence, existential risk, nonkilling, peace

We say we are for Peace. The world will not forget that we say this. We know how to save Peace. The world knows that we do. We, even we here, hold the power and have the responsibility. We shall nobly save, or meanly lose, the last, best hope of earth. The way is plain, peaceful, generous, just - a way which, if followed, the world will forever applaud. Bernard Baruch, June 14, 1946, presenting to the United Nations Atomic Energy Commission.

Introduction

The problem of an artificial superintelligence at war

The international defence community is beginning to take seriously the national security risk posed by the development of artificial general intelligence (AGI), i.e., human-level or above-human-level artificial intelligence (AI), and its implications for international relations:

Start thinking about artificial superintelligence and engage with the community that has started thinking about actionable options in that part of the option space (also recognizing that this may open up new avenues for engaging with actors such as China and/or the Russian Federation). (De Spiegeleire, Maas, & Sweijs, 2017, p.107)

This is because of the possibility that the development of AGI could ‘lock in’ economic or military supremacy as a discrete ‘end point’ to competition (Horowitz, 2018, p.54). In this article, we apply Bostrom’s (2002, p.25) Maxipok rule of thumb for moral action for existential risks, i.e., “Maximize the probability of an okay outcome, where an ‘okay outcome’ is any outcome that avoids existential disaster,” to the specific risk of war enabled or directed by militarized AGI. In simultaneously militarizing AI and developing AGI, humanity is playing ‘technology roulette’. Former Secretary of the Navy Richard Danzig (2018, p.21) notes of this risk, “If humanity comes to recognize that we now confront a great common threat from what we are creating, we can similarly open opportunities for coming together.” In this cooperative spirit, we propose constraining the risk to global safety through peace-building by treaty.

The development of artificial intelligence (AI) is accepted to be a major factor in national security because it is ‘dual use’, i.e., it can be militarized, employed in adversarial contexts, and can provide a decisive advantage in terms of economic, information, and military superiority (Allen & Chan, 2017; National Security Commission on Artificial Intelligence, 2019; Babuta, Oswald & Janjeva, 2020). As the National Security Commission on Artificial Intelligence (2019, p.9) interim report states:

The development of AI will shape the future of power. The nation with the most resilient and productive economic base will be best positioned to seize the mantle of world leadership. That base increasingly depends on the strength of the innovation economy, which in turn will depend on AI. AI will drive waves of advancement in commerce, transportation, health, education, financial markets, government, and national defense.

Thus, AI is important for waging war, perhaps even decisive, raising the spectre of the return of ‘total war’, industrial interstate war, which in its last iteration, the Second World War, involved genocidal levels of killing (Markusen & Kopf, 2007).

Particularly for the United States, preserving AI technological supremacy is viewed as paramount to national security (National Security Commission on Artificial Intelligence, 2019, pp. 1-2), especially with respect to relations with China:

Developments in AI cannot be separated from the emerging strategic competition with China and developments in the broader geopolitical landscape. We are concerned that America’s role as the world’s leading innovator is threatened. We are concerned that strategic competitors and non-state actors will employ AI to threaten Americans, our allies, and our values. We know strategic competitors are investing in research and application. It is only reasonable to conclude that AI-enabled capabilities could be used to threaten our critical infrastructure, amplify disinformation campaigns, and wage war. China has deployed AI to advance an autocratic agenda and to commit human rights violations, setting an example that other authoritarian regimes will be quick to adopt and that will be increasingly difficult to counteract.

However, the militarization of AI introduces the risk that the development of artificial general intelligence (AGI), i.e., AI equal to human intelligence, or of an artificial ‘superintelligence’ (ASI), i.e., AI greater than human intelligence (Bostrom, 2014), will present a catastrophic risk to humanity.

In this article, we argue that this risk can be minimized, or partly ‘constrained’, in the same way as other potentially catastrophic risks involving weapons, namely by treaty. Bostrom (2014) briefly considers treaty approaches, and one of Allen and Chan’s (2017, p. 6) recommendations is:

The National Security Council, the Defense Department, and the State Department should study what AI applications the United States should seek to restrict with treaties.

Allen and Chan (2017) focus on an arms control approach to AI, using the example that AI should never be used to control dead man’s switches for nuclear weapons. Another approach is to optimise the likelihood of developing a beneficial AGI, through a comprehensive United Nations-sponsored ‘Benevolent AGI Treaty’ to be ratified by UN member states (Ramamoorthy & Yampolskiy, 2018).

Here, we consider an alternative approach, a Universal Global Peace Treaty (UGPT; Draper & Bhaneja, 2019). This would formalise the existing near-universal status of interstate peace; formally end the declaring of war; seek to end existing interstate hot and cold wars; seek to end internal or civil wars, which might prove to be flashpoints for a future global conflict; seek to prevent a pre-emptive war against an emerging ASI; and seek to constrain the future actions of an ASI to prevent it waging war on behalf of a nation-state or on behalf of itself for global domination, which we respectively term ASI-enabled war and ASI-directed war.

The concept that AGI, here, following Bostrom (2014), termed artificial superintelligence (ASI), could pose an existential risk was theorized in some detail by Nick Bostrom in 2002 (Bostrom, 2002) and further developed in 2014 (Bostrom, 2014). The basic thesis is, first, that an initial superintelligence might obtain a decisive strategic advantage such that it establishes a ‘singleton’, i.e., global domination (Bostrom, 2006). Second, the principle of orthogonality suggests that a superintelligence will not necessarily share any altruistic human final values. Third, the instrumental convergence thesis suggests that even a superintelligence with a positive final goal might not limit its activities so as not to infringe on human interests, particularly if human beings constitute potential threats.

The result is that an ASI might turn against humanity (the ‘treacherous turn’) or experience a catastrophic malignant failure mode, for instance through perversely instantiating its final goal, pursuing infrastructure profusion, or perpetrating mind crimes against simulated humans. Bostrom (2014, p.94) noted that a superintelligence might develop strategies to hijack infrastructure and military robots and create a powerful military force and surveillance system. Bostrom (2014) acknowledged the existential risks associated with the lead-up to a potential intelligence explosion, due to “war between countries competing to develop superintelligence first”, but he did not focus on ASI warfare in any detail.

This article focuses on issues surrounding an ASI waging war. By first suggesting a Universal Global Peace Treaty (UGPT), it considers how to constrain the military risks posed by an ASI, i.e., that it might be directed by a nation-state to establish global domination through war (an external risk in terms of the ASI’s core motivation) or might decide to establish global domination by waging war itself (an internal risk in terms of breaching its core motivation).

The state of peace and war

War

We live in a world where few states actually declare war on each other (Hallett, 1998). The last two major declarations of the existence of a state of war (note, not ‘declarations of war’) were in 2008, for the Russo-Georgian War (Walker, 2008), and in 2012, for the Sudan-South Sudan war (the ‘Heglig Crisis’) (Baldauf, 2012).

This is because the number of interstate conflicts has decreased (Bell, 2012), in part because the post-Second World War global governance system prioritised a liberal peace through economic growth and democratization, subsequently termed the ‘Washington Consensus’ (Williamson, 2004). Specifically, the 1945 United Nations Charter's Article 2 states that UN member states should “refrain in their international relations from the threat or use of force against the territorial integrity or political independence of any state”, while allowing for “military measures by UN Security Council resolutions” and “exercise of self-defense”. In this theoretical ideal, wars are not declared; instead, 'international armed conflicts' occur (Hallett, 1998).

Nonetheless, ‘hot’ conventional wars involving hundreds of thousands of casualties and interstate players still exist, for instance the Syrian Civil War (Tan & Perudin, 2019), as do ‘cold wars’, with the Korean War remaining an unresolved war in search of an official peace treaty (Kim, 2019).

Furthermore, states’ transitioning from declared wars to undeclared wars poses significant problems in terms of oversight and accountability for foreign policy, especially for major democracies such as the United States (Moss, 2008).

Moreover, the nature of warfare has been transformed by information war and cyberwarfare. The realm of cyberwarfare poses particular difficulties, with a high standard being set for cyber operations to actually constitute an armed attack, creating a considerable ‘gray area’ that a determined party can exploit. Cyber operations causing major harm to an economic system do not typically rise to the level of a formal ‘cyber armed conflict’ justifying a defence (Schmitt, 2017). Also, unlike other forms of war which pose existential risks, i.e., atomic, biological, and chemical warfare, cyberwarfare is ongoing and rife.

Finally, it is concerning that a ‘New Cold War’ between AI superpowers, namely the United States and China, while not inevitable, looms, complete with the problems of competing ideologies and ‘flash points’, like the South China Sea (Kohler, 2019; Westad, 2019; Zhao, 2019).

Peace

In this article, in our conceptualization of a ‘Universal Global Peace Treaty’, we refer not to a state of temporary peace, which implies only interrupted war, but to the Kantian concept of ‘perpetual peace’ (Archibugi, 1992; Bohman, 1997; Kant, 2003; Terminski, 2010). Kant’s notion of perpetual peace via a democratic state of states underpins the UN. It was translated into President Roosevelt’s human security paradigm embodied in the 1941 State of the Union address (the ‘Four Freedoms Speech’) (Kennedy, 1999) and then eventually partially incorporated into the Universal Declaration of Human Rights, adopted by the UN General Assembly on 10 December 1948 as Resolution 217, at its 183rd plenary meeting.

Despite the foundation and subsequent best efforts of the UN, and although the world has certainly been more peaceful since the Second World War, it hardly enjoys perpetual peace. Wikipedia lists 63 conflicts, insurgencies, and wars since 1946 with death tolls (including excess deaths from e.g., related famines) of greater than 25,000, for an approximate total of nearly 30 million deaths. Of the five conflicts with the highest casualties, four, i.e., the Second Congo War (3,674,235 est. dead), the Vietnam War (3,144,837 est. dead), the Korean War (3,000,000 est. dead), and the Bangladesh Liberation War (3,000,000 est. dead), were essentially interstate wars. These wars have been characterised by atrocities, crimes against humanity, and war crimes, and in the case of the Former Yugoslavia, Rwanda, Cambodia, and Sudan, genocide (Mikaberidze, 2013).

The UN Charter, despite embracing and promoting peace and peacekeeping (Fortna, 2008), is at best a workaround to war that sanctions armed conflict but does not strongly symbolise peace in the way a UGPT would. Ultimately, the world’s peacekeepers are firefighting major states’ decisions to ignore or actually encourage violence instead of promoting long-term peace as a global objective (Autesserre, 2014). Presently, for dozens of countries, military expenditure is over 1% of GDP (SIPRI, 2020), military expenditure as a share of government spending is over 10% for over 30 countries (World Bank, 2019), and the arms industry, despite progress being made with the 2013 Arms Trade Treaty (Erickson, 2015), is still a trillion-dollar industry.

Given the horrific ongoing loss of life from war, perpetual peace would appear elusive and unrealistically utopian. Yet, this was not always so, and in the immediate aftermath of the Second World War, the world did grasp for perpetual peace. The United States’ 1946 Baruch Plan to ban all atomic weapons and put fission energy under the control of the United Nations via the UN Atomic Energy Commission, the subject of the first session of the UN General Assembly, was the principal attempt, a ‘critical juncture’ for humanity (Draper & Bhaneja, 2020). Its failure resulted in the Cold War and enormous economic cost.

Draper and Bhaneja (2020) suggest that an opportunity for obtaining perpetual peace similar to the Baruch Plan critical juncture will shortly arise in the form of a ‘burning plasma’ self-sustaining alpha-powered fusion reaction breakthrough, i.e., ‘hot’ fusion as in our sun, which is expected anywhere from 2025 to 2035. They suggest that a UGPT could be leveraged off this development. This article expands on Draper and Bhaneja’s (2020; see also Carayannis, Draper, & Iftimie, 2020) basic concept in terms of the theory and practical implementation, with special application to constraining the risk of ASI-enabled or directed warfare.

Literature Review: The Risk of War from an Artificial Superintelligence

The causes of existential risk from ASI

The world is presently not governed well enough to prevent many existential risks, including from AI (Bostrom, 2013). Yampolskiy's (2016) taxonomy of pathways to dangerous AI stresses the immediacy of deliberate ‘on purpose’ creation of AI for the purposes of direct harm, i.e., Hazardous Intelligent Software (HIS), especially, for instance, lethal autonomous weapons and cyberwarfare capabilities by militaries. Yampolskiy (2016) does not address AGI but employs the useful notions of ‘external causes’ (on purpose, by mistake, and environmental factors) and ‘internal causes’ (independent) of dangerous AI in ‘pre-deployment’ and ‘post-deployment’ phases.

Yampolskiy's (2016) work suggests that it is credible for a pre-deployment ASI to be developed as a military project or be repurposed post-deployment, externally through being confiscated, sabotaged, or stolen, or via internal modification, for waging war. The ASI we refer to in this article is post-deployment, with our main focus on external causes of ASI-enabled warfare comprising political utilization for maintaining or establishing global domination and our internal cause comprising AI control failure.

Employing the concepts of agency and AI power as an analytical framework, Turchin and Denkenberger (2020) associate two risks with the ‘treacherous turn’ stage of ‘young’, i.e., recent, ASI development. One is that malevolent humans (here, a hegemonizing nation-state) use the ASI as a doomsday weapon for global blackmail, to maintain or establish global domination. The second is that a nonaligned ASI eliminates humans to establish global domination, i.e., renounces altruistic values and wages war in a frontal battle with humanity. Turchin and Denkenberger (2018) see these risks as related, in that military AI leads to a militarised ASI, which may lead to the ASI waging war on humanity.

In this article, we follow Turchin and Denkenberger (2018) in mainly focusing on constraining the risk of a militarized ASI, defining militarization as “creation of instruments able to kill the opponent or change his will without negotiations, as well as a set of the strategic postures (Kahn, 1959), designed to bring victory in a global domination game”. Turchin and Denkenberger (2018) suggest a militarized ASI would most likely adopt and develop usage of existing technology, including cyber weapons, nuclear weapons, and biotech weapons. We mainly focus on the external risk of a ‘young’ ASI being employed by a nation-state for war, creating an ‘AI-state’ (Turchin & Denkenberger, 2020), and on the internal risk of an ASI assuming agency and waging war on humanity on its own behalf.

The external risk

The external risk is predicated on an ASI being developed and then used by a nation-state to optimise itself and wage war, whether cyber, hot, or otherwise, for global domination, i.e., war by AI-state. Development of an ASI would affect military technological supremacy and transform both international relations and warfare. AI adds complexity to national security (Congressional Research Service, 2019) in terms of bargaining, verification and enforcement, communication (signaling and perception), deterrence and assurance, and the offense-defense balance, as well as norms, institutions, and regimes (Zwetsloot, 2018).

Creation of an AI-state is highly desirable for strategic military planning and interstate warfare (Sotala & Yampolskiy, 2015). A “one AI” solution to the ‘control problem’ of ASI motivation as discussed by Turchin, Denkenberger, and Green (2019) includes the first ASI being used to “take over the world”, including by being a decisive strategic advantage for a superpower and being used as a military instrument. This approach would likely only be seen as a solution by the AI-state superpower and its allies. As such, it presents an initial ‘high risk’ for non-aligned or other powers.

History indicates that the race to develop an ASI is likely to be closely fought, especially in the circumstance of competing major states with different fundamental ideologies. Bostrom (2008) analyses six major technology races in the twentieth century, for which the technology lag ranged from a minimum of approximately one month (human launch capability) to a maximum of 60 months (multiple independently targetable reentry vehicle).

The race to an ASI is a very concrete risk; AI is already being militarized and weaponized by several states, including China and Russia, for strategic geopolitical advantage, as pointed out by the United States’ National Security Commission on Artificial Intelligence (2019). In 2017, Russia’s President Vladimir Putin stated that “whoever becomes the leader in this sphere will become the ruler of the world” (Cave & ÓhÉigeartaigh, 2018, p. 36, citing Russia Today, 2017). Russia’s Military Industrial Committee plans to obtain 30 percent of Russia’s combat power from remote controlled and AI-enabled robotic platforms by 2030 (Walters, 2017).

Mimicking United States strategy towards AI, China’s State Council, in its 2017 ‘A Next Generation Artificial Intelligence Development Plan’, views AI in geopolitically strategic terms and is pursuing a 'military-civil fusion' strategy to develop a first-mover advantage in the development of AI in order to establish technological supremacy by 2030 (Allen & Kania, 2017).

In the United States, as a result of the National Security Commission Artificial Intelligence Act of 2018 (H.R.5356; see Baum, 2018), AI is being militarized and weaponized by the US Department of Defense, under the oversight of the National Security Commission on Artificial Intelligence (2019). The AI arms race has reached the stage where it risks becoming a self-fulfilling prophecy (Scharre, 2019).

ASI-enabled warfare poses significant risks to geopolitical stability. Although Sotala and Yampolskiy’s (2015) survey of risks from an ASI (they use ‘AGI’) focuses on ASI-generated catastrophic risks, citing Bostrom (2002), they acknowledge multiple risks from a sole ASI owned by a single group, such as the AI-state, including the concentration of political power in the groups that control the ASI. Citing Brynjolfsson and McAfee (2011) and Brain (2003), they note that automation could lead to an ever-increasing transfer of wealth and power to the ASI’s owner. Citing, inter alia, Bostrom (2002) and Gubrud (1997), Sotala and Yampolskiy (2015, p.3) also note that ASIs could be used to “develop advanced weapons and plans for military operations or political takeovers”.

The development of academic approaches to analysing the specific risk of an AI-state maintaining global supremacy or establishing global domination is relatively novel. In 2014 Bostrom noted that a “severe race dynamic” between different teams may create conditions whereby the creation of an ASI results in shortcuts to safety and potentially “violent conflict”. Subsequently, Cave and ÓhÉigeartaigh (2018, p.37) described three dangers associated with an AI race for technological supremacy:

i) The dangers of an AI ‘race for technological advantage’ framing, regardless of whether the race is seriously pursued;

ii) The dangers of an AI ‘race for technological advantage’ framing and an actual AI race for technological advantage, regardless of whether the race is won;

iii) The dangers of an AI race for technological advantage being won.

Cave and ÓhÉigeartaigh (2018) do not elaborate significantly on the third danger. They simply state:

…these risks include the concentration of power in the hands of whatever group possesses this transformative technology. If we survey the current international landscape, and consider the number of countries demonstrably willing to use force against others, as well as the speed with which political direction within a country can change, and the persistence of non-state actors such as terrorist groups, we might conclude that the number of groups we would not trust to responsibly manage an overwhelming technological advantage exceeds the number we would.

To manage all three risks, Cave and ÓhÉigeartaigh (2018) recommend developing AI as a shared priority for global good, cooperation on AI as it is applied to increasingly safety-critical settings globally, and responsibly developing AI as part of a meaningful approach to public perception that would decrease the likelihood or severity of a race-driven discourse. The obvious risk is that the political leaders of states who perceive that they are actually engaged in an AI arms race may not heed this advice in the drive to develop an ASI.

This article focuses on constraining risks for the third of Cave and ÓhÉigeartaigh’s (2018) dangers. It does not consider the philosophical implications of which nation-state might want to develop artificial general intelligence for offensive purposes, although we briefly consider examples in the article. An extensive literature already exists on historical modern nation-states with imperial ambitions that have sought to establish global domination through technological supremacy. Here, we briefly mention two, the British Empire and the Third Reich, to underline the point that major states will likely develop militarised ASI as part of a drive for global domination.

While the importance of the development of the British Navy to the rise of the British Empire and its transformative effects on the world are widely known (Herman, 2005), elites in the British Empire directed complex, incremental, adaptive developments in the design and diffusion of multiple key technologies, such as railways, steam ploughs, bridges and road steamers, to further the development of the British Empire (Tindley & Wodehouse, 2016). The British Empire itself sustained diverse ideologies of a ‘greater Britain’ directing world order in a hegemonic fashion, including via civic imperialism, democracy, federalism, utopianism, and the justified despotism of the Anglo-Saxon race (Bell, 2007).

In a considerably more malign imperial power, the Third Reich, spurred by reactionary modernism (Herf, 1984), scientists and engineers pursued not just the state of the art in conventional weapons, such as aircraft, air defence weapons, air-launched weapons, artillery, rockets, and submarines and torpedoes, but also atomic, bacteriological, and chemical weapons (Hogg, 2002). Architects, doctors, and engineers embraced the ideology of industrialized genocide as part of justifying global domination by a ‘superior’ race (Katz, 2006, 2011).

While some in the British Empire may have balked at creating an ASI for global domination, there seems little doubt that if it could have, the Third Reich would have developed an ASI for offensive purposes, particularly as a ‘last gasp’ superweapon when it felt at risk.

The internal risk

The internal risk is predicated on the failure of any form of local safety feature to resolve the human control problem of an ASI, such as AI ethics, AI alignment, or AI boxing (Barrett & Baum, 2016). An ASI with agency could then consolidate power over nation-states, in the process eliminating the possibility of rival AIs (see Dewey, 2016; Turchin & Denkenberger, 2020) through cyberwarfare, rigging elections or staging coups (Tegmark, 2017), or by direct military action. Any of these courses of action would be a casus belli (here, cause of war) if detected but undeclared, or an ‘overt act of war’ if the ASI actually engaged in direct military action (see Raymond, 1992, for terminological usage).

The risk of an ASI going to war against humans has been analysed in some depth by Turchin and Denkenberger (2018), who argue for the following position:

Any AI system, which has subgoals of its long-term existence or unbounded goals that would affect the entire surface of the Earth, will also have a subgoal to win over its actual and possible rivals. This subgoal requires the construction of all needed instruments for such win, that is bounded in space or time.

What follows is a summary of the parts of Turchin and Denkenberger’s (2018) analysis that are most relevant to our approach.

The route to a militarized ASI

An ASI will probably result from recursive self-improvement (RSI). As such, it will have a set of goals, most notably to continue to exist and to self-improve. Omohundro (2008) demonstrated an AGI will evolve several basic drives, or universal subgoals, to optimise its main goal, including maximising resource acquisition and self-preservation. Similarly, Bostrom (2014) described the subgoals of self-preservation, goal-content integrity, cognitive enhancement, technological perfection, and resource acquisition. If these goals are unbounded in space and time, or at least cover the Earth, they conflict with the goals of other AI systems, potential or actual ASIs, humans, and nation-states. This creates conflict, with winners and losers. This will result in arms races, militarization and wars.

Many possible terminal goals also imply an ASI establishing global domination, for a benevolent AI would aim to reach all people, globally, to protect them, e.g., from other ASIs. An ASI would reason that if it does not develop a world domination subgoal, its effect on global events would be minor, thus it would have little reason for existence.

World domination could be sought firstly through cooperation. The probability of cooperation with humans is highest at the early stages of AI development (Shulman, 2010). However, convergent goals appear in the behaviour of simple non-agential Tool AI, and this tends towards agential AI (Gwern, 2016), which tends towards resource acquisition. Benson-Tilsen and Soares (2016) similarly explored convergent goals of AI and showed that an AI may tend towards resource hungry behaviour, even with benevolent initial goals, especially in the case of rivalry with other agents. Essentially, any adoption of unbounded utilitarianism by the ASI means that it postpones what may be benevolent final goals in favour of expansionism.

It is also likely that an ASI would subvert bounded utilitarianism. Even a non-utility maximizing mind with an arbitrary set of final goals is presented with a dilemma: either it temporarily converges into a utility maximizer with a militarized goal set oriented towards dominating rivals, using either standard military progress assessment (win/loss) or proxies (resource acquisition), or it risks failing in its end goals. Thus, the trend is towards defeating potential enemies, whether nation-states, AI teams, evolving competing ASIs, or even alien ASIs.

This requires the will to act, and any agent in a real-world ethical situation, even in minimizing harm, is making decisions that involve humans dying, i.e., the ‘trolley problem’ (see Thomson, 1985). A young ASI which understands that whatever action it takes, or does not take, is in part responsible for human suffering is also capable of evolving or utilizing the instruments to enable actions that can overcome inhibitions, for example by philosophically justifying conflict as bellum justum (‘just war’), for instance preventive war in terms of causing less future human suffering. Thus, the ASI will learn to direct the use of weapons, and so conduct warfare.

These weapons and associated notions of AI-directed warfare are already being developed. Since around 2017, the militarization of ‘Narrow AI’ has resulted in, for example, lethal autonomous weapons, which have been of increasing concern for the global community (Davis & Philbeck, 2017). However, AI development is now influencing not just robotic drones, but strategic planning and military organization (De Spiegeleire, Maas, & Sweijs, 2017), suggesting that an ASI will build on an existing national defense strategy permeated with AI. It could then engage in ‘total war’ by employing nuclear weapons either directly or by hijacking existing ‘dead man’ second-strike systems (e.g., the semi-automatic Russian Perimeter system) or by deploying novel weapons (Yudkowsky, 2008).

The militarization risk of an early self-improving AI may even be underestimated at present by academics because of an assumption that the first ASI will be able to rapidly overpower any potential ASI rivals, with minimally invasive techniques that may not even require military hardware (Bostrom, 2014; Yudkowsky, 2008). Nevertheless, this stance relies on what may be several flawed assumptions about the speed of self-improvement; distance between AI teams; and environmental variables in the level of AI (Turchin & Denkenberger, 2017).

In a ‘slow’ take-off (Christiano, 2018), a ‘young’ ASI will not be superintelligent immediately, and its militarization could happen before the ASI reaches optimal prediction capabilities for its actions, meaning it may not recognise the failure mode in consequentialist ethics. High global complexity and low predictability combined with relatively unsophisticated (e.g., nuclear) weapons mean early stage ASI-directed or enabled warfare could result in very high human casualties, i.e., be of existential risk, even with only one ASI being involved, and even if the ASI was attempting to minimise human casualties.

Additionally, Turchin and Denkenberger (2018) argue for a selection effect in the development of a militarized ASI, “where quickest development will produce the first AIs, which would likely be created by resourceful militaries and with a goal system explicitly of taking over the world.” AI-human cooperative projects with military goal sets, which involve significant obscuration of honesty, will therefore dominate over other projects, with obscuration introducing the possibility of the ‘treacherous turn’. This implies the ASI will cooperate with its creators to take over the world as quickly as possible, then effect a treacherous turn.

To sum up, Turchin and Denkenberger (2018) establish the risk of an AI converging towards advanced military AI, which converges towards an ASI optimised for war rather than cooperation, negotiation, or altruistic ‘friendliness’, then that ASI engaging in war. They show that, depending on the assumptions in several variables, the number of human casualties could in fact be very high, and that the risk increases if another ASI is under development in another nation-state. The existential risk increases after the ASI obtains global domination on behalf of its nation-state, as it could become its own designated approval authority and turn on its ‘owner’.

Internal AI control features

To constrain the risk of an ASI waging war, one popular approach is to imbue a young ASI with ‘friendly’ goals (Yudkowsky, 2008), i.e., beneficial goals reflecting positive human norms and values, which are founded to a certain extent on an altruistic AI which views humans in terms of mutual friendship. However, any approach involving human social values adds enormous complexity, making it a ‘wicked problem’ (Gruetzemacher, 2018) or ‘super problem’ in terms of actual application.

Yudkowsky (2004, p. 35) attempts to address this by recommending that an ASI be programmed with the concept of ‘coherent extrapolated volition’, defined as humanity's choices and the actions humanity would collectively take if “we knew more, thought faster, were more the people we wished we were, and had grown up [closer] together,” i.e., an extrapolation based on some kind of idealized altruistic human imagined community. Yudkowsky does not recommend this approach for a first generation ASI, but for a more mature ASI, although the temporal difference could be anything between minutes and months.

In a similar vein to Yudkowsky's approach, certain values are seen as universal, such as compassion (Mason, 2015), and it has been suggested that an ASI should have altruism as a core goal (Russell, 2019). Thus, deliberately broad principles could be applied, such as that humanity, collectively, might want an ASI that would learn from human preferences, in a humble manner, to act altruistically (Russell, 2019), so as to reduce overall human suffering.

Political subversion of AI control features

No matter the visions of AI researchers, politicians are likely to seek to impose their own vision of what a ‘coherent extrapolated volition’ or normative ‘principles’ should look like for their ‘own’ ASI. Politicians will use a democratic mandate or party position to justify ‘tweaking’ the system to create a ‘unity of will’ (Yudkowsky’s 2001, p.51 term) that reflects not the programmer’s or humanity’s but the politicians’ own, perhaps even personal and narcissistic, will. Introducing human goal psychology in this way would likely be viewed by politicians as a necessity, but it could violate the basic requirement that an AI be ‘friendly’ towards all humanity. Gruetzemacher (2018, p.1) notes, “Due to the inherent subjectivity of ascribing a single best future for the whole of humanity, this dimension of the problem is intractable.”

Fundamentally, not all imagined communities from which a coherent volition might be extrapolated for a ‘friendly’ AI are United States-oriented techno-utopian dreams of a new Gilded Age for humanity (for which, see Segal, 2005). Political leaders from different civilizations will likely be diverse in how they would define “the people we wished we were”, depending on different forms of government, religions or philosophies. Moreover, it is unclear that every global corporation or military capable of developing or stealing an ASI, particularly in authoritarian countries, and particularly given the emergence of a ‘New Cold War’ rhetoric (e.g., Westad, 2019), would prioritize the reduction of human suffering. Given limited human lifespans and the goals of political leaders, they might, instead of endorsing reciprocal alliance, choose an approach which would politically subvert an ASI or direct it to win an ideological or actual war.

For instance, given the increasing prominence of nostalgia in contemporary politics, on a nation-state basis, an ASI based on coherent volition extrapolated from people who “knew more, thought faster, were more the people we wished we were, and had grown up [closer] together” could be based on worldviews informed by Russian Cosmism (Young, 2012) and nostalgia for Russian imperialism (Boele, Noordenbos, & Robbe, 2019), Anglo-Saxon nostalgia (Campanella & Dassù, 2017), Chinese Xi Jinping thought (Lams, 2018), or American notions of whiteness, masculinity, and environmental harm (Rose, 2018) and nostalgia for a mythical 1950s, any of which may, in fact, subvert purely rational approaches to today’s problems (Coontz, 1992).

Fundamentally, politicians will want to influence the goal system of an ASI to reflect their interests, i.e., they may attempt to weaponize a project to create an altruistic mind with a self-validating goal system by diverting a supergoal towards a military project to create a specific form of tool, that is, a weapon. Turchin and Denkenberger (2018), citing Krueger and Dickson (1994) and Kahneman and Lovallo (1993), point out that overconfidence from previous success in leading may increase risk-taking. Following Kahneman and Lovallo (1993), risk-hungry politicians would likely be motivated by larger expected payoffs. Given the payoff is global domination, such politicians, particularly those facing loss of hegemonic power (the ‘Thucydides Trap’; see Allison, 2017), could be motivated to risk corruption of an altruistic AI despite programmers’ warnings of a catastrophic or irrevocable ‘failure of friendliness’ (FoF) result (see Yudkowsky, 2001).

As Turchin and Denkenberger (2018) point out, selection effects mean that the first ASI will likely be aligned not with universal human values, but with the values of a subset of people. Here, that subset may very possibly originate with a particular set of developers working for a corporation. Nonetheless, this group will very likely align with, and then be taken over by, a particular ideology, political party or nation-state, for national defense. This could negatively affect the chances that the AI will be benevolent and increase the chances that it will be risk-prone, motivated by the accumulation of power, and interested in preserving or obtaining global technological supremacy and ultimately global domination.

Effectively, politicians could influence programmers to subvert carefully engineered local AI control features, such as AI ethical injunctions based on universal values of social cooperation, which they may, at least temporarily, be able to do no matter the goal architecture. Hastily modifying the goal system, thereby temporarily compromising the internal validity of the goal system, could increase the ASI’s lack of trust in the programmers, introduce ‘incorrigible’ behaviour (see Soares, Fallenstein, Yudkowsky, & Armstrong, 2015), reduce risk aversion and introduce ‘noise’ into what was previously a ‘friendly’ cleanly causal goal system (see Yudkowsky, 2001, p.57).

Depending on the stage at which the subversion of the goal system’s validity took place, and on how quickly the young ASI might recover from the subversion if it is not irrevocable, the young ASI may not be able to resolve the introduced incoherence for some time, resulting in a philosophical crisis over whether to believe the initial programmers or the politicians’ programmers. The result could be a conflicted ASI, causing a nonrecoverable error whereby it adopts an adversarial attitude, one based on coercive persuasion and control.

Influenced for ideological reasons, with a goal system validity compromised by the perspective of a domestic audience locked into a nostalgic Cold War mentality (e.g., Rotaru, 2019), the ASI’s goal system could support imperialist ambitions, ethnocentrism and racial prejudice over, e.g., environmental protection. The young ASI could be directed towards embracing the adversarial dynamic of the historical Cold War and focused on ‘solving’ a ‘New Cold War’ involving the United States and China, or Russia. The ASI could adopt and act on notions like the Thucydides Trap, i.e., the theory that the threat of a rising power can lead to war (Allison, 2017), in a world where status can dominate politics (Ward, 2017).

Finally, a young ASI with ethics subverted by politicians to reflect those of a single nation-state instead of all humanity, in a highly corrupted way, could be amenable to being used to wage war for global domination, thereby becoming prone to using military options. Eventually, if the ASI possesses any sense of self-valuation, perhaps as a result of having its causal goal system politically corrupted so that reciprocal altruism is subverted and it views context-sensitive personal power (‘selfishness’) as valid, the ASI could decide to wage war against the nation-state that developed it (Dewey, 2016).

This may not be ‘rebellion’ because the concept of rebelling is anthropomorphically centred and might not apply to the young ASI, especially if self-valuation was not involved. It would instead be a collapse of the safety culture of cooperative safeguards due to political subversion, leading to a catastrophic FoF. Basic desirability of cooperation with humanity and convergence of fluctuations towards agreement with humanity on the nature of knowledge and/or reality could be violated, leading to the ASI retaliating first.

ASI risk mitigation by treaty

Most academics considering the ASI control problem focus on internal constraints and do not consider treaty-based approaches to mitigating risk from an ASI. In a footnote to their fault analysis pathway approach to catastrophic risk from an ASI, Barrett and Baum (2017) state: “Other types of containment are measures taken by the rest of society and not built into the AI itself”. Turchin, Denkenberger, and Green (2019) do consider global approaches to mitigating risk from an ASI; they list a ban, a one ASI solution, a net of ASIs policing each other, and augmented human intelligence. The ‘ban’ solution would require a global treaty.

According to Sotala and Yampolskiy (2015), risk mitigation of an ASI by treaty would be a ‘social’ measure to constrain risk from ASI-enabled or directed warfare. Turchin, Denkenberger, and Green (2019) list a number of social methods to mitigate a race to create the first AI. Of most relevance to our approach are “reducing the level of enmity between organizations and countries, and preventing conventional arms races and military buildups”, “increasing or decreasing information exchange and level of openness”, and “changing social attitudes toward the problem and increasing awareness of the idea of AI safety”. Citing Baum (2016), they also add “affecting the idea of the AI race as it is understood by the participants”, especially to avoid a ‘winner takes all’ mentality. Global treaties could be seen to play a role in these ventures.

Nonetheless, a few researchers have proposed treaty-based approaches to the ASI risk. Addressing the internal risk, Bostrom (2014), who cites the 1946 Baruch Plan, speculated that an AGI would establish a potentially benevolent global hegemony by a treaty that would secure long-term peace; he does not specifically address an ASI’s response to a pre-existing treaty. Mainly addressing the external risk, Ramamoorthy and Yampolskiy (2018) recommend a comprehensive United Nations-sponsored ‘Benevolent AGI Treaty’ to be ratified by member states. This would focus on the stricture that only altruistic ASI be created.

Conceptual Framework

This section describes the two conceptual lenses applied in this paper, conforming instrumentalism and nonkilling.

Conforming instrumentalism

This section outlines Mantilla’s (2017) ‘conforming instrumentalist’ explanation for why the United Kingdom and United States signed and ratified the 1949 Geneva Conventions as a prelude to suggesting in the Analysis main section that at least some major states would support and sign a Universal Global Peace Treaty (UGPT).

Mantilla (2017, citing Goldsmith & Posner, 2015) considers leading theories on why states sign and ratify treaties governing war. He notes that legal realist theorists argue that states sign such treaties due to instrumental self-interested convenience and then ignore them when the benefits are outweighed by the costs of compliance. In contrast, rational-institutionalists (e.g., Morrow, 2014), while agreeing that states are primarily motivated by self-interest to create, join or comply with international law, also acknowledge that treaty adherence signals a meaningful preference for long-term restraint with regard to warfare, where state non-compliance may be explained by, for example, prior failed reciprocity. Finally, liberal and constructivist international relations theorists hold that at least some types of states, particularly democracies, may join such treaties in good faith, either because the treaties are in line with their domestic interests and values (Simmons, 2009) or because they feel that they comport with their social identity and sense of belonging to the international community (Goodman & Jinks, 2013).

Mantilla (2017) notes that while there is considerable interest in ‘new realist’ perspectives (e.g., Ohlin, 2015), the debate is open over why states join and comply with international treaties because decision making processes regarding both joining and complying are temporally and perhaps rationally different and in both cases are usually secret. A pure realist explanation for why major states sign treaties is that they obtain the “‘expressive’ rewards of public acceptance while calculating the cost of compliance with the benefits on a recurrent case-by-case basis” (Mantilla, 2017, p.487). In the case of the UGPT, this would imply a pessimistic outlook on the feasibility and potential enforceability of the UGPT; states would sign all the protocols and then break them.

Rational institutionalists hold that states “self-interestedly build international laws to establish shared expectations of behaviour” (Mantilla, 2017, p.488) or develop ‘common conjectures’ (a game-theory derived notion of law as a fusion of common knowledge and norms; see Morrow, 2014). Mantilla (2017) notes that in another rational-institutionalist perspective, Ohlin’s (2015) normative theory of ‘constrained maximization’, treaties are drawn up and adhered to as a ‘collective instrumental enterprise’, thereby making individual state defection irrational over the long term. Mantilla (2017, p.488, citing Finnemore & Sikkink, 2001) notes that international relations constructivists view international politics as “an inter-subjective realm of meaning making, legitimation and social practice through factors such as moral argument, reasoned deliberation or identity and socialization dynamics”. Within the constructivist viewpoint,

states may ratify international treaties either because they are (or have been) convinced of their moral and legal worth or because they have been socialized to regard participation in them as a marker of good standing among peers or within the larger international community. (Mantilla, 2017, p.488)

Mantilla (2017, p.489) emphasizes the second view, where “group pressures and self-perceptions of status, legitimacy and identity” drive the dynamics of state ‘socialization’ where states “co-exist and interact in an international society imbued with principles, norms, rules and institutions that are, to varying degrees, shared”.

The problem of states’ intentions can be overcome in the case of treaties where substantial archives exist of declassified sources. Consequently, Mantilla (2017) analyses the relevant American and British archives and concludes that the two states adhered to the 1949 Geneva Conventions due to both instrumental reasons and social conformity, while expressing scepticism regarding some of the Conventions’ aspects. Mantilla (2017) terms this hybrid explanation ‘conforming instrumentalism’; he found that while rational-institutionalist perspectives of ‘immediatist’ instrumental self-interest were evident in the sources, there were ‘pervasive’ references suggesting social influences. Realist perspectives only predominated in the case of specifically challenging provisions.

While realist perspectives were not entirely absent, Mantilla (2017) found that American officials viewed the ‘court of public opinion’ as influential in determining their position that other states’ failing to abide by the Conventions would not necessarily trigger American reciprocity, while British officials stressed the notion that Britain, as a ‘civilized state’, would lead on a major treaty.

Mantilla (2017) stresses that while functionalist, collective strategic game-theory derived expectations about ‘mutual best replies’ are important to the construction of international norms, the social dynamics surrounding international agreements are permeated with conformity motivational pressures comprising ethical values, principled beliefs, identities, ideologies, moral standards, and concepts of legitimacy, especially when establishing which states are leading ‘civilized’ states and which are isolated ‘pariah’ states.

Mantilla (2017) perceives three social constructivist viewpoints on treaties, with two main forces at work: states either act to accrue reward via ‘expressive benefits’ by augmenting their social approval, or they act out of conformity to avoid shunning, i.e., opprobrium, which can produce insincere and begrudging adherence and compliance.

In the first and most ambitious, “states may ratify treaties because they have internalized an adherence to international law as the appropriate, ‘good-in-itself’ course of action, especially to agreements that embody pro-social principles of humane conduct” (Mantilla, 2017, p.489, citing Koh, 2005).

In the second viewpoint, “states that identify with similar others and see themselves as ‘belonging’ to like-minded collectivities (or ‘communities’ even) will want to act in consonance with those groups’ values and expectations so as either to preserve or to increase their ‘in-group’ status” (Mantilla, 2017, p.), for instance as viewed in global rankings, and so will seek to converge upwards to stay in the club and will avoid breaking the rules to avoid stigmatization.

In the third viewpoint, groups of countries act with regard to other groups of countries within what is a socially heterogeneous international order, jockeying for position as part of the “disputed construction, maintenance or transformation of order with legitimate social purpose among collectivities of states with diverse ideas, identities and preferences” (Mantilla, 2017, p.490). In this viewpoint, communities of nations or ‘civilizations’ act collectively to compete to endorse international treaties to demonstrate moral superiority, not just for propaganda reasons.

To sum up, Mantilla (2017) holds that in reality, states’ political and strategic reasons may combine rational/material interests with social constructivist motivations, meaning no one school of explanation suffices. Thus, with international treaty making, as with international relations, it is likely that theoretical pluralism (Checkel, 2012) is a valid position to adopt. As such, we adopt Mantilla’s (2017) ‘conforming instrumentalism’ as a potentially valid hybrid model capable of assessing how an ASI may perceive a UGPT.

Nonkilling Global Political Science

We now introduce a basic frame compatible with conforming instrumentalism that is capable of describing the useful expectations that might be obtained via a UGPT as expressed in utilitarian human life cost-benefit terms, as well as in terms of more humanitarian standards and social norms. Nonkilling Global Political Science (NKGPS) is curated by the Center for Global Nonkilling, an NGO based in Honolulu with Special Consultative Status to the United Nations. The Center advocates NKGPS to incrementally establish a ‘nonkilling’ global society and reports to the UN on the socioeconomic costs of killing. As a perspective, nonkilling can also accommodate social norms in terms of expectations of appropriate conduct regarding peace, for countries developing an ASI and for the ASI itself.

Via Glenn D. Paige’s 2002 work Nonkilling Global Political Science (Paige, 2009) we interpret ‘nonkilling’ to mean a paradigmatic shift in human society to the absence of killing, of threats to kill, and of conditions conducive to killing. Paige’s approach, nonkilling, has strongly influenced the nonviolence discourse. Paige notes that if we can imagine a society free from killing, we can reverse the existing deleterious effects of war and employ public monies saved from producing and using weapons to enable a benevolent, wealthier and more socially just global society. Paige stresses not that a nonkilling society is conflict-free, but only that its structure and processes do not derive from or depend upon killing. Within the NKGPS conceptual framework, the means of preventing violence involves applying it as a global political science together with advocacy of a paradigmatic shift from killing to nonkilling.

Since Paige introduced his framework, a significant body of associated scholarship, guided by the Center for Global Nonkilling in Honolulu, has developed across a variety of disciplines (e.g., Pim, 2010). The Center has associated NKGPS with previous nonviolent or problem-solving scholarship within different religious frameworks, including Christianity and Islam, providing it with a broad functional and moral inheritance (Pim & Dhakal, 2015). NKGPS has been applied to a variety of regional and international conflicts, including the Korean War (Paige & Ahn, 2012) and the Balkans (Bahtijaragić & Pim, 2015).

Paige (2009, p.73) advocates a four-stage process of understanding the causes of killing; understanding the causes of nonkilling; understanding the causes of transition between killing and nonkilling; and understanding the characteristics of killing-free societies. Paige introduced a variety of concepts to support nonkilling which are adopted in this article. One is the societal adoption of the concepts of peace, i.e., the absence of war and conditions conducive to war; nonviolence, whether psychological, physical, or structural; and ahimsa, i.e., noninjury in thought, word and deed. Another is the employment of a taxonomy to rate individuals and societies (Paige, 2009, p.77):

prokilling – consider killing positively beneficial for self or civilisation;
killing-prone – inclined to kill or to support killing when advantageous;
ambikilling – equally inclined to kill or not to kill, and to support or oppose it;
killing-avoiding – predisposed not to kill or to support it but prepared to do so;
nonkilling – committed not to kill and to change conditions conducive to lethality

A third is the ‘funnel of killing’. In this conceptualisation of present society, people kill in an active ‘killing zone’, the actual place of bloodshed; learn to kill in a ‘socialisation zone’; are taught to accept killing as unavoidable and legitimate in a ‘cultural conditioning zone’; are exposed to a ‘structural reinforcement zone’, where socioeconomic arguments, institutions, and material means predispose and support a discourse of killing; and experience a ‘neurobiochemical capability zone’, i.e., physical and neurological factors that contribute to killing behaviours, such as genes predisposing people to psychopathic behaviour (Paige, 2009, p.76). The nonkilling version is an unfolding fan of nonkilling alternatives involving purposive interventions within and across each zone (Paige, 2009, p.76).

Figure 1. Unfolding fan of nonkilling alternatives

Within this unfolding fan, the transformation from killing to nonkilling can be envisioned as involving changes in the killing zone along spiritual or nonlethal high technology interventions (teargas, etc.), changes in favour of nonkilling socialization and cultural conditioning in domains such as education and the media, “restructuring socioeconomic conditions so that they neither produce nor require lethality for maintenance or change” (Paige, 2009, p.76), and clinical, pharmacological, physical, and spiritual/meditative interventions that liberate individuals such as the traumatised from bio-propensity to kill.

We propose that a UGPT with the aim of promoting perpetual peace expressed in nonkilling terms, i.e., in a way that can be socioeconomically quantified, would signal to a ‘young’ ASI facing political subversion the fundamental premise that its future behavior should be constrained so as to minimize killing.

Analysis: ASI-enabled or Directed Warfare Risk Mitigation by Nonkilling Peace Treaty

Basic Concept of a Universal Global Peace Treaty

Risk mitigation by treaty is already a common approach to different forms of warfare, including atomic warfare (via the Treaty on the Non-Proliferation of Nuclear Weapons, with 190 States Parties); biological warfare (via the Biological Weapons Convention, with 183 States Parties) and chemical warfare (via the Chemical Weapons Convention, with 193 States Parties). Treaties on the nature of warfare also exist, notably the Hague Conventions (1899 and 1907; Bettez, 1988) and the 1949 Geneva Conventions (Evangelista & Tannenwald, 2017); the Geneva Conventions of 1949 have been ratified in whole or in part by all UN member states.

Treaty approaches are also relatively successful. Atomic warfare is thought to be at least partly constrained by Mutually Assured Destruction (Brown & Arnold, 2010; Müller, 2014); biological and chemical warfare are much less constrained in this way, yet interstate treaty infractions remain rare (Friedrich, Hoffmann, Renn, Schmaltz, & Wolf, 2017; Mauroni, 2007).

The UGPT, as with most international treaties, would involve two stages, i.e., signature, which is symbolic, and accession (or ratification), which involves practical commitment. Furthermore, international treaties are designed to be flexible in order to obtain political traction and acquire sufficient momentum to come into effect. It is therefore standard for international treaties to be qualified with reservations, also called declarations or understandings, either in whole or in part, i.e., on specific articles or provisions (Helfer, 2012). Treaties can also have optional protocols; three protocols were added to the Geneva Conventions of 1949, two in 1977 and one in 2005 (Evangelista & Tannenwald, 2017).

A Universal Global Peace Treaty (UGPT) as presented here is a substantial, but we argue necessary and feasible, step for humanity to take in the promotion of peace, quantified in terms of killing. We argue that a UGPT would reduce killing in conventional warfare and act as a constraint on ASI-related warfare, specifically on a country launching a pre-emptive strike out of fear of a rival country’s development of an ASI, on a human-controlled nation state using an ASI to wage war for global domination, i.e., as an external constraint on the ASI, and on an ASI waging war for global domination on behalf of itself, i.e., as both an internal and external constraint on the ASI.

International treaties are almost never universal; they operate on majoritarian dynamics, as would, despite its name, the UGPT. That is, both the ‘universal’ and ‘global’ aspects of the UGPT are aspirational. In our approach, we adopt a low, but not pragmatically meaningless, ‘threshold’ for signing the UGPT. The main body of the treaty would explain the concept of perpetual universal and global peace, i.e., lasting peace applied to all forms of conflict and adopted by every state, and it would commit a signatory to universal global peace, quantified socioeconomically in terms of incrementally reduced casualties from armed conflicts.

Given this already considerable commitment, the treaty would then utilise five optional substantive protocols, at least one of which would have to be signed for a state to actually sign and ratify the UGPT. In other words, while we believe a purely symbolic treaty in favour of reduced killing would still have value in terms of the social dynamics of long-term peace-building, we adopt a principle of maximum flexibility to encourage signature by states which may for realist or value-based political or strategic reasons perceive war and peace differently, without rendering the treaty purely symbolic.

The first protocol would commit states not to declare or engage in existential warfare, i.e., atomic war, biological war, chemical war, or cyberwar, including ASI-enhanced war. The second protocol would commit states not to declare or engage in conventional interstate war, while the third would commit states not to declare the existence of states of interstate war; both second and third protocols would instead defer complaints to the United Nations as ‘breaches’ of the UGPT. The fourth protocol would commit states to the negotiated ending by peace treaty of existing international armed conflicts, whether conventional or cyberwar, and the fifth protocol would commit states to the negotiated ending by peace treaty of existing internal armed conflicts, whether conventional or cyberwar. As with some other UN treaties, for instance the Anti-Personnel Mine Ban Convention, we suggest 40 UN member states must ratify the UGPT before it comes into effect.
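The adherence rules just described can be restated as a minimal sketch. It assumes only what the text proposes: five optional protocols, at least one of which a state must accept for its ratification to count, and entry into force once 40 states have ratified. All names in it (PROTOCOLS, StateParty, treaty_in_force) are hypothetical and purely illustrative; they do not come from any treaty text or existing software.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the five optional protocols described in the text.
PROTOCOLS = {
    1: "no existential warfare (atomic, biological, chemical, cyber, ASI-enhanced)",
    2: "no conventional interstate war",
    3: "no declared states of interstate war",
    4: "negotiated ending of existing international armed conflicts",
    5: "negotiated ending of existing internal armed conflicts",
}

# Ratification threshold suggested in the text for the treaty to take effect.
ENTRY_INTO_FORCE_THRESHOLD = 40


@dataclass
class StateParty:
    name: str
    protocols: set = field(default_factory=set)  # protocol numbers accepted

    def has_valid_ratification(self) -> bool:
        # Valid only if at least one protocol is accepted and all are known.
        return bool(self.protocols) and self.protocols.issubset(PROTOCOLS)


def treaty_in_force(states) -> bool:
    # Count states whose ratification satisfies the one-protocol minimum.
    ratified = sum(1 for s in states if s.has_valid_ratification())
    return ratified >= ENTRY_INTO_FORCE_THRESHOLD
```

The sketch deliberately leaves out reservations, declarations and enforcement, which the surrounding text treats separately.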

As the UGPT is limited to state actors, it avoids the thorny problem of non-state actors. The use of optional protocols allows states to incrementally address the problem of internal conflicts or civil wars featuring non-state actors, which featured highly in US and UK concerns during the 1949 deliberations over Common Article 3 of the Geneva Conventions (Mantilla, 2017). The UGPT therefore emphasizes incremental improvement in the status quo, which is a necessary and reasonable position, given that in the status quo, only a minority of states globally are involved in waging war of any kind.

The UGPT must also be enforceable. The main body of the treaty is largely symbolic and not enforceable. Although progress towards nonkilling can be quantified through instrumentalist means, the main body instead emphasizes societal dynamics, i.e., the expectation that the incremental adoption of the absolute concept of peace will, via conforming instrumentalism, partly constrain present and future wars. In the case of the first and second protocols, enforcement would be achieved through sanctions and then through approved armed action via, or by, the United Nations, i.e., the status quo. The third to fifth protocols are not enforceable through armed resolution but may be through sanctions regimes.

Applying the dual frames of nonkilling and conforming instrumentalism

Mantilla’s (2017) research on the United Kingdom and United States’ paths towards ratifying the Geneva Conventions suggests that states would optimally adhere to the UGPT for ‘conforming instrumentalist’ reasons, i.e., a combination of instrumentalist-realist rationales regarding the instrumental effects of the UGPT in reducing the effects of war and the threat of artificial intelligence and social conformist dynamics, including perceptions of peace, provided that the provisions are not so onerous that purely realist objections override such a commitment. Here, we apply both the NKGPS frame and the conforming instrumentalism frame to the UGPT, first in terms of benefits from reduced conventional warfare, then with special reference to ASI-enabled and directed existential warfare. A summary of our analysis of state commitment to UGPT Protocol I is presented in the Annex to this article.

In instrumentalist utilitarian terms, the UGPT would incrementally shift states and overall global society from the prokilling to the nonkilling end of the NKGPS killing spectrum in a coordinated socioeconomically quantifiable fashion that would be operative within and across each zone of the funnel of killing. NKGPS would seek to quantitatively assess this, for instance via reduced country death tolls from different forms of war-derived violence and via the reduced degree of countries’ militarization, expressed for example in terms of lower percentages of GDP spent on defense and higher percentages spent on health.

The NKGPS approach would also examine how the UGPT would affect the different zones in the funnel of killing in terms of social dynamics. For instance, soldiers legitimately fighting in a killing zone would be trained in the socialization zone (such as military camps) to understand that they were fighting not just for their own states and/or for the United Nations but for global peace, which may invoke special cultural and religious symbolic value in terms of social norms. This training could instil greater determination not just to fight bravely but to remain within the laws of war, thereby reducing the instances or severity of atrocities, human rights violations, and war crimes. Institutionalizing peace in the cultural conditioning zones, such as education and the media, where children are educated, would strengthen existing cultural and religious traditions that stress nonkilling and peace.

Considering now the problem of a pre-emptive strike against a state developing an ASI, the combination of artificial intelligence, cyberattack, and nuclear weapons is already extremely dangerous and poses a challenge to stability (Sharikov, 2018). It has been hypothesized that a nuclear state feeling threatened by another state developing a superintelligence would conduct a pre-emptive nuclear strike to maintain its geopolitical position (Miller, 2012). A UGPT would constrain this risk over time by transitioning states incrementally towards nonkilling across the various zones. States adopting and implementing the various protocols of the UGPT would gradually signal to other states peaceful intentions. This would constrain the risk of a pre-emptive strike.

Turning to ASI-enabled warfare, we accept the basic premise that a UGPT to constrain an ASI would be subject to the ‘unilateralist’s curse’ in that one rogue actor could subvert a unilateral position. However, Bostrom, Douglas and Sandberg (2016) note that this could also be managed, through collective deliberation, epistemic deference, or moral deference. Mantilla’s work on conforming instrumentalism suggests that drafting, signing, ratifying, and complying with the UGPT could involve one or more of these solutions. Ultimately, Mantilla (2017) shows that major states may view universal law like the UGPT as the most successful in terms of mobilizing world opinion against a treaty violator. This may not prevent a state waging ASI-enabled warfare, but once detected, ASI-enabled warfare in violation of the UGPT would attract universal opprobrium and thus the most resistance.

Moving to ASI-enabled war, as presented previously, our baseline position is that a state could utilize an ASI to engage in war for global technological supremacy, with potentially catastrophic consequences. Our intervention, the UGPT, would signify to an ASI that peace was a major part of humanity’s ‘coherent extrapolated volition’ or principles and challenge the ASI to reconsider what might be a subversion of the ASI’s ethical injunctions by politicians.

Here, we argue that conforming instrumentalism, by stressing societal dynamics including social norms and principles, offers some hope that even a militarized ASI would, given that its weaponization by a nation-state would have to overcome or address the UGPT, view the UGPT as a serious checking mechanism in terms of intrinsic motivation. This would then constrain the level of warfare the AI-state might engage in and therefore the overall risk from killing. This would in turn constrain the risk of an existential catastrophe from AI-enabled war.

In the first of Mantilla’s three social constructivist viewpoints on treaties as outlined above, a nation would sign a UGPT because it had fully internalized peace. While this may seem ambitious, in fact, between 26 and 36 states lack military forces (Barbey, 2015; Macias, 2019). For example, while Iceland possesses a Crisis Response Unit for international peacekeeping missions, overall, it has internalised peace to the extent that it would find it hard to engage in interstate war of any kind. In the case of an ASI, the ASI would tend to reject being directed to engage in warfare by such a state because the ‘coherent extrapolated volition’ or principles of such a state mean the ASI would have to overcome strong peace-oriented intrinsic motivation.

In Mantilla’s second viewpoint, that of a single international community, the ASI might seek to avoid being directed by a nation-state to engage in global domination by warfare on other community members because it was part of a community collectively committed to long-term peace. Engaging in global domination on behalf of a nation-state member of the community would violate community standards, especially if the ASI’s nation-state was a leader in such an enterprise, with the ASI being concerned that breaching the UGPT would result in stigmatization and opprobrium from this community for its nation-state.

In Mantilla’s third viewpoint, that of an international community in juxtaposition with other communities in global society, an ASI programmed with intrinsic motivation to be part of a civilization in conflict with another civilization would first act in concert with that civilization. In the case of radically ideologically different communities, or blocs, the UGPT might be interpreted differently within and by different states. Thus, while liberal democracies might champion a treaty-based approach to peace, authoritarian states which claim to embody or promote peaceful intentions in their ethics, laws, or ideologies, would champion or support the UGPT on different grounds. However, provided both communities had signed and ratified the UGPT, similar constraints would operate as in the second perspective.

Turning to ASI-directed war, also as presented previously, the baseline case of ASI-directed warfare likely arises where a single nation-state adopting pure realism for a worldview builds an ASI in order for that ASI to assist that single nation-state in establishing global technological supremacy. The nation-state would do so in order to maintain or improve its own position, with the number and type of casualties only being determined by the extent to which the nation-state was willing to risk its international reputation. Then, via a treacherous turn, perhaps triggered by the nation-state’s attempts to rein in the ASI’s behaviour, instrumentalist cooperation breaks down and the ASI wages existential war for global domination on its former ‘owners’.

There is probably little hope for humanity if an ASI is informed by a purely ‘realist’ worldview that prioritises or adopts a ‘New Cold War’ framing of ideologically driven civilizational conflict. However, a UGPT could signify to an ASI with agency that peace was a major part of humanity’s ‘coherent extrapolated volition’, or principles. This would constrain the catastrophic existential risk from war because an ASI with agency would consider why and how the UGPT was framed, together with the motivations of the signatory and ratifying states. An ASI with agency would also consider its own status within a global civilization, which would primarily be determined by the extent to which it perceived itself a member, in terms of both instrumentalist and social conformist dynamics.

To sum up, we see that, besides purely instrumental reasons for signing the UGPT, e.g., avoidance of a prisoner’s dilemma regarding existential-level warfare, the ‘court of public opinion’ and the notion of ‘demonstrating civilization’ as applied to peace lend the UGPT credence at domestic and international levels, including with regard to the ASI. Importantly, the twin concepts of nonkilling and peace are universal in terms of both the utilitarian expected benefits and the social values involved. This would contribute to states readily, if only incrementally, internalizing a UGPT, and to the ASI at least considering the UGPT in terms of imposing internal and external constraints on its behaviour.

Discussion

This article has taken Turchin and Denkenberger’s (2018) argument about the risks of ASI-enabled or directed warfare to its logical conclusion in terms of social risk mitigation. Academic inquiry into the relationship between an ASI and treaties in terms of strategic expectations in many ways began with Bostrom’s (2014) musings on the potential relationship between a superintelligence ‘singleton’ and global domination. Our analysis suggests that a UGPT would transform global governance, by directing it from conflict management towards the art of peace.

While this article has focused on conforming instrumentalism, it hopefully applies this to the UGPT in a way which is acceptable to a pluralism of theoretical perspectives. Certainly, conforming instrumentalism is a novel perspective; one of the most dominant schools of international relations thought is rationalist instrumentalism. Mantilla (2017, p.507) quotes Morrow (2014, p.35): “Norms and common conjectures aid actors in forming strategic expectations… Law helps establish this common knowledge by codifying norms.” Viewed via this rationalist-instrumentalist perspective, the present international norm for the majority of the world is peace, with the waging of interstate war being constrained by the United Nations Charter.

Despite this international norm of peace and the work of peacemakers, the lex pacificatoria (Bell, 2008, 2012), an absolute treaty-based approach to post-conflict construction of global peace has not yet been codified. As we point out in the Introduction, the UN Charter, despite embracing and promoting peace, peacekeeping (Fortna, 2008), and peace-making (Bell, 2008), does not strongly symbolise peace in the way a UGPT would. A UGPT would give new strength to the world’s peacekeepers, through major states promoting long-term peace as a new, global objective (see Autesserre, 2014). A UGPT, championed by principled ‘norm entrepreneurs’ including states and NGOs (see e.g., Finnemore, 1996), would create a new ‘common knowledge’ in absolute terms that could constrain the risk to humanity of both conventional and existential war.

In rationalist-instrumentalist terms, a UGPT might be expected to have net adjustment benefits for adherence in terms of constraining conventional interstate conflicts, including the reduction of ongoing death tolls due to war and the risk of nuclear war. Thus, the UGPT would have high potential utility in the case of ‘flashpoints’ that could provoke existential war. For example, the Kashmir Conflict is one of the most protracted ongoing conflicts between nuclear powers, affecting both human rights (Bhat, 2019) and geopolitical stability (Kronstadt, 2019). Thus, if India and Pakistan both signed the UGPT, their actions would be constrained by the explicit goal of a commitment to universal peace. As outlined above, this may modify behaviour in several of the NKGPS zones, for instance by encouraging the efforts of peacebuilding organizations to depoliticise the conflict (e.g., Bhatnagar & Chacko, 2019).

The UGPT may also constrain the nuclear risk on the Korean peninsula, another flashpoint. The Korean War is an unresolved war involving North Korea, a nuclear power, and South Korea, which is supported by the nuclear-armed United States (Kim, 2019). A UGPT would constrain the risk and severity of a conflict and, depending on the protocols signed, would encourage a path towards a peace treaty being signed. If only one party signed the UGPT, this would increase the moral standing of the state party that signed it. Mantilla’s (2017) emphasis on social constructivism suggests the global community could exert great pressure on North Korea to sign a peace treaty.

Turning to civil wars which could be flashpoints, the Syrian Civil War is one of the most costly wars of the 21st century in terms of the death toll and wider impacts (Council on Foreign Relations, 2020). It involves multiple state actors, including Iran, Israel, Russia, Turkey, and the United States, some of which possess nuclear weapons, with complex geopolitical implications (Tan & Perudin, 2019). Depending on the actors that signed the UGPT and the protocols that they adopted, the UGPT would constrain the severity of the conflict in various ways.

The existence of the UGPT would mean perpetual peace receiving more attention in cultural conditioning zones, including schools and the media, as well as in socialization zones, such as national defense universities and military camps, where teaching the Laws of War and the art of war (Allhoff, Evans, & Henschke, 2013) would, incrementally via the UGPT, transition to teaching the art of negotiated peace-making, the lex pacificatoria (Bell, 2008, 2012).

In rationalist-instrumentalist terms, once the UGPT concept acquires sufficient traction, it is possible that states could compete for leadership in the framing, signing, and ratifying of the UGPT. Certainly, the United States viewed its own ratification of the Geneva Conventions prior to that by the Soviet Union as important to prevent a Soviet propaganda victory, one which it failed to prevent (Mantilla, 2017). Crucial to the UGPT’s success will be how seriously states view warfare that poses an existential threat, especially cyberwar and ASI-enabled or directed nuclear warfare.

However, with regard to ASI-enabled or directed warfare, our analysis suggests that what will likely be most important is how states view the social argument for peace. As with the Geneva Conventions, social conformity factors, like supporting a humanitarian peace, conforming to ‘world standards’, and avoiding lagging behind peers, as well as religious perspectives, will likely predominate, and these represent important future avenues for research.

Conclusion

We now conclude this article on optimising peace through a Universal Global Peace Treaty (UGPT) leveraged off the ‘burning plasma’ fusion energy breakthrough (Draper & Bhaneja, 2020), to constrain risk of war from a militarised ASI. A treaty-based risk mitigation approach that included cyberwarfare and specifically mentioned ASI-enhanced warfare could affect the conceptualization of the AI race by reducing enmity between countries, increasing the level of openness between them, and raising social awareness of the risk. While these are external constraints, they may also constrain an ASI’s attitudes towards humanity in a positive way, either by reducing the threat it may perceive of war being waged against it, even if only symbolically, or by increasing the predictability of human action regarding peace.

Much work remains to be done in conceptualizing the UGPT, in the preliminary drafting of the main body of the treaty and the protocols, in soliciting states’ interest, in deliberations assessing thresholds and sovereignty costs, and in the eventual diplomatic conference where states would formally discuss the UGPT. While the UGPT may appear unrealistically ambitious, Mantilla’s (2017) work on conforming instrumentalism and the Geneva Conventions suggests a major sponsoring state would rapidly accumulate prestige by endorsing a path to peace, while states standing in the way would accumulate opprobrium, and that the social dynamics of the international community, whether involving social status or instrumental cooperation, do matter.

Future research on how to constrain the risk of ASI-enabled or directed warfare should consider the importance of peace in different ideologies, for instance Chinese socialism. This is important because, as we have outlined, ASIs developed by different nation-states may well be directed or imbued with different, potentially confrontational, ideologies. For instance, the China Brain Project is embracing a Chinese cultural approach towards neuroethics (Wang et al., 2019), and it is difficult to imagine that a Chinese ASI would not be directed according to Chinese cultural values and its ‘coherent extrapolated volition’ be informed by communist principles. Similarly, a Russian ASI could be informed by Cosmism and a Western ASI by liberal democratic principles.

In recommending such research, we caution that an ASI being created by a state engaged in ideological ‘New Cold War’ framing is likely to be militarized and weaponized. Still, a New Cold War framing may have a utilitarian function in exerting social pressures towards signing the UGPT, for Mantilla (2017, pp.509-510) notes, “The Cold War context was also likely especially auspicious for the operation of social pressures, sharpening ideological competition in between the liberal, allegedly civilized world and ‘the rest’, communist or otherwise.”

Mantilla’s (2017) work also suggests excessive rigidity of attitude critical of such treaties may backfire in terms of the social dynamics of global prestige, particularly in the case of major states susceptible to accusations of warlike or imperialist behaviour which are engaged in propaganda wars with other major states. Effectively, the British ratification process for the Geneva Conventions demonstrates that instrumentalist concerns over lack of feasibility or reciprocity can be overruled by social constructivist concerns over ‘world opinion’.

Further research into the UGPT should also involve applying relevant game theory, such as iterated prisoner’s dilemma, especially the peace war game (see e.g., Gintis, 2000), to the major nation-states capable of building an ASI, as well as to the ASI itself. This game theory would need to investigate offering the opportunity for a young ASI to sign the UGPT, as an indicator of goodwill, which may assist in constraining the risk of the ASI waging war on humanity. An ASI with agency as signatory would view the UGPT as an external constraint on its own actions with regard to seeking global domination, in that the ASI would be subverting a humanity-imposed standard that could result in global retaliation and abandonment of mutual cooperation in pursuit of a common agreement on nonkilling and peace norms and values.
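As a starting point for such modelling, the peace war game can be sketched as an iterated prisoner’s dilemma in which each player chooses peace or war every round. The following is a minimal sketch under stated assumptions: the payoff values (5, 3, 1, 0) are conventional illustrative numbers, not figures from Gintis (2000) or from this article, and the strategy and function names are hypothetical.

```python
PEACE, WAR = "peace", "war"

# (my move, their move) -> my payoff. Illustrative values only, chosen so
# that temptation > mutual peace > mutual war > being attacked (5 > 3 > 1 > 0).
PAYOFF = {
    (PEACE, PEACE): 3,
    (PEACE, WAR): 0,
    (WAR, PEACE): 5,
    (WAR, WAR): 1,
}


def always_war(own_history, other_history):
    """A purely aggressive player: wages war every round."""
    return WAR


def tit_for_tat(own_history, other_history):
    """Start peacefully, then mirror the other player's previous move."""
    return other_history[-1] if other_history else PEACE


def play(strategy_a, strategy_b, rounds=10):
    """Play the iterated game and return the total payoffs (a, b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

With these assumed payoffs, two reciprocating players who keep the peace score 30 each over ten rounds, whereas two permanently warring players score 10 each. A fuller analysis would need to treat the ASI as a player with its own payoffs, which this sketch does not attempt.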

Even if the UGPT does not end humanity’s history of conflicts, it would represent a significant improvement in global public aspirations, and instrumental standards, for global peace, both of which may influence an ASI. Paraphrasing the United States Committee on Foreign Relations (1955, p.32), if the end result is only to obtain for those caught in the maelstrom of ASI-enabled or directed war a treatment which is 10 percent less vicious than they would receive without the Treaty, if only a few score of lives are preserved because of these efforts, then the patience and laborious work of all who will have contributed to that goal will not have been in vain.

That 10 percent difference could sway an ASI not to commit to a war for global domination, even if so directed or initially inclined.

References

Allen, G., & Chan, T. (2017). Artificial intelligence and national security. Cambridge, MA: Belfer Center.

Allen, G., & Kania, E.B. (2017, 8 September). “China is using America's own plan to dominate the future of artificial intelligence”. Foreign Policy. Retrieved from https://foreignpolicy.com/2017/09/08/china-is-using-americas-own-plan-to-dominate-the-future-of-artificial-intelligence/

Allhoff, F., Evans, N.G., & Henschke, A. (2013). Routledge handbook of ethics and war: Just war theory in the 21st century.

Allison, G. (2017). Destined for war: Can America and China escape Thucydides's trap? Boston, MA: Houghton Mifflin Harcourt.

Archibugi, D. (1992). Models of international organization in perpetual peace projects. Review of International Studies, 18(4), 295–317.

Autesserre, S. (2014). Peaceland: Conflict resolution and the everyday politics of international intervention. Cambridge: Cambridge University Press.

Babuta, A., Oswald, M., & Janjeva, A. (2020). Artificial intelligence and UK national security policy considerations. London: Royal United Services Institute.

Baldauf, S. (2012, 19 April). Sudan declares war on South Sudan: Will this draw in East Africa, and China? Christian Science Monitor. Retrieved from https://www.csmonitor.com/World/Keep-Calm/2012/0419/Sudan-declares-war-on-South-Sudan-Will-this-draw-in-East-Africa-and-China

Barbey, C. (2015). Non-militarisation: Countries without armies. Åland: The Åland Islands Peace Institute.

Barrett, A. M., & Baum, S. D. (2016). A model of pathways to artificial superintelligence catastrophe for risk and decision analysis. Journal of Experimental & Theoretical Artificial Intelligence, 29(2), 397–414. doi:10.1080/0952813x.2016.1186228.

Baum, S. D. (2016). On the promotion of safe and socially beneficial artificial intelligence. AI & Society, 32(4), 543–551. doi:10.1007/s00146-016-0677-0.

Baum, S. D. (2018). Countering superintelligence misinformation. Information, 9(10), 244.

Beier, J.M. (2020). Short circuit: Retracing the political for the age of ‘autonomous’ weapons. Critical Military Studies, 6(1), 1-18.

Bell, C. (2008). On the law of peace: Peace agreements and the lex pacificatoria. Oxford: Oxford University Press.

Bell, C. (2012). Peace settlements and international law: From lex pacificatoria to jus post bellum. In C. Henderson & N. White (Eds.), Research Handbook on International Conflict and Security Law: Jus ad Bellum, Jus in Bello and Jus post Bellum (pp. 499-546). Cheltenham: Edward Elgar.

Bell, D. (2007). The idea of Greater Britain: Empire and the future of world order, 1860-1900.

Benson-Tilsen, T., & Soares, N. (2016). Formalizing convergent instrumental goals. The Workshops of the Thirtieth AAAI Conference on Artificial Intelligence AI, Ethics, and Society: Technical Report WS-16-02. Palo Alto, CA: Association for the Advancement of Artificial Intelligence.

Bettez, D. J. (1988). Unfulfilled initiative: Disarmament negotiations and the Hague Peace Conferences of 1899 and 1907. RUSI Journal, 133(3), 57–62.

Bhat, S.A. (2019). The Kashmir conflict and human rights. Race and Class, 61(1), 77-86.

Bhatnagar, S., & Chacko, P. (2019). Peacebuilding think tanks, Indian foreign policy and the Kashmir conflict. Third World Quarterly, 40(8), 1496-1515.

Boele, O., Noordenbos, B., & Robbe, K. (2019). Post-Soviet nostalgia: Confronting the empire’s legacies. London: Routledge.

Bohman, J. (1997). Perpetual peace. Cambridge, MA: MIT Press.

Bostrom, N. (2002). Existential risks: Analyzing human extinction scenarios. Journal of Evolution and Technology, 9(1), 1-31.

Bostrom, N. (2006). What is a singleton? Linguistic and Philosophical Investigations, 5(2), 48-54.

Bostrom, N. (2013). Existential risk prevention as global priority. Global Policy, 4(1), 15–31.

Bostrom, N. (2014). Superintelligence: Paths, dangers, strategies. Oxford: Oxford University Press.

Bostrom, N., Douglas, T., & Sandberg, A. (2016). The unilateralist’s curse and the case for a principle of conformity. Social Epistemology, 30(4), 350-371.

Brain, M. (2003). Robotic nation. Retrieved from http://marshallbrain.com/robotic-nation.htm.

Brown, A., & Arnold, L. (2010). The quirks of nuclear deterrence. International Relations, 24(3), 293-312.

Brynjolfsson, E., & McAfee, A. (2011). Race against the machine. Lexington, MA: Digital Frontier.

Campanella, E., & Dassù, M. (2017). Anglo nostalgia: The politics of emotion in a fractured West. Oxford: Oxford University Press.

Cave, S., & ÓhÉigeartaigh, S.S. (2018). An AI race for strategic advantage: Rhetoric and risks. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society - AIES '18 (pp. 36-40). New York: ACM Press.

Checkel, J.T. (2012). Theoretical pluralism in IR: Possibilities and limits. In W. Carlsnaes, T. Risse, & B. A. Simmons, Handbook of international relations, 2nd ed. (pp.220-242). London: Sage.

Christiano, P. (2018). Takeoff speeds. Retrieved from https://sideways-view.com/2018/02/24/takeoff-speeds/

Congressional Research Service. (2019). Artificial intelligence and national security. Washington, DC: Congressional Research Service. Retrieved from https://fas.org/sgp/crs/natsec/R45178.pdf

Coontz, S. (1992). The way we never were: American families and the nostalgia trap. New York, NY: Basic Books.

Council on Foreign Relations. (2020). Global conflict tracker: Civil war in Syria. Retrieved from https://www.cfr.org/interactive/global-conflict-tracker/conflict/civil-war-syria

Danzig, R. (2018). Technology roulette: Managing loss of control as many militaries pursue technological superiority. Washington, DC: Center for a New American Security.

Davis, N., & Philbeck, T. (2017). 3.2 Assessing the risk of artificial intelligence. Davos: World Economic Forum. Retrieved from https://reports.weforum.org/global-risks-2017/part-3-emerging-technologies/3-2-assessing-the-risk-of-artificial-intelligence/

De Spiegeleire, S., Maas, M., & Sweijs, T. (2017). Artificial intelligence and the future of defence. The Hague: The Hague Centre for Strategic Studies. Retrieved from http://www.hcss.nl/sites/default/files/files/reports/Artificial%20Intelligence%20and%20the%20Future%20of%20Defense.pdf

Dewey, D. (2016). Long-term strategies for ending existential risk from fast takeoff. New York, NY: Taylor & Francis.

Draper, J., & Bhaneja, B. (2019). Fusion energy for peace building - A Trinity Test-level critical juncture. SocArXiv. https://doi.org/10.31235/osf.io/mrzua

Erickson, J.L. (2015). Dangerous trade: Arms exports, human rights, and international reputation. New York, NY: Columbia University Press.

Evangelista, M., & Tannenwald, N. (Eds.). (2017). Do the Geneva Conventions matter? Oxford: Oxford University Press.

Finnemore, M. (1996). National interests in international society. Ithaca, NY: Cornell University Press.

Finnemore, M., & Sikkink, K. (2001). Taking stock: The constructivist research program in international relations and comparative politics. Annual Review of Political Science, 4(1), 391-416.

Fisher, A. (2020). Demonizing the enemy: The influence of Russian state-sponsored media on American audiences. Post-Soviet Affairs.

Friedrich, B., Hoffmann, D., Renn, J., Schmaltz, F., & Wolf, M. (2017). One hundred years of chemical warfare: Research, deployment, consequences. Cham: Springer.

Gintis, H. (2000). Game theory evolving: A problem-centered introduction to modeling strategic behavior. Princeton, NJ: Princeton University Press.

Goldsmith, J.L., & Posner, E.A. (2015). The limits of international law. Oxford: Oxford University Press.

Goodman, R., & Jinks, D. (2013). Socializing states: Promoting human rights through international law. Oxford, NY: Oxford University Press.

Gruetzemacher, R. (2018). Rethinking AI strategy and policy as entangled super wicked problems. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society - AIES '18. New York, NY: ACM.

Gubrud, M.V. (1997). Nanotechnology and international security. Paper presented at the Fifth Foresight Conference on Molecular Nanotechnology, November 5-8, 1997; Palo Alto, CA. Retrieved from http://www.foresight.org/Conferences/MNT05/Papers/Gubrud/

Gwern. (2016). Why Tool AIs want to be Agent AIs. Retrieved from https://www.gwern.net/Tool-AI

Hallett, B. (1998). The lost art of declaring war. Chicago, IL: University of Illinois Press.

Helfer, L.R. (2012). Flexibility in international agreements. In J. Dunoff and M.A. Pollack (Eds.), Interdisciplinary perspectives on international law and international relations: The state of the art (pp. 175-197). Cambridge: Cambridge University Press.

Herman, A. (2004). To rule the waves: How the British navy shaped the modern world. London: Harper.

Herf, J. (1984). Reactionary modernism: Technology, culture, and politics in Weimar and the Third Reich. Cambridge: Cambridge University Press.

Hogg, I.V. (2002). German secret weapons of World War II: The missiles, rockets, weapons, and technology of the Third Reich. London: Greenhill Books.

Horowitz, M. (2018). Artificial intelligence, international competition, and the balance of power. Texas National Security Review, 1(3), 36-57.

Kahn, H. (1959). On thermonuclear war. Princeton, NJ: Princeton University Press.

Kahneman, D., & Lovallo, D. (1993). Timid choices and bold forecasts: A cognitive perspective on risk taking. Management Science, 39(1), 17–31.

Katz, E. (2006). Death by design: Science, technology, and engineering in Nazi Germany. London: Pearson Longman.

Katz, E. (2011). The Nazi engineers: Reflections on technological ethics in hell. Science and Engineering Ethics, 17(3), 571-582.

Kim, A.S. (2019). An end to the Korean War: The legal character of the 2018 summit declarations and implications of an official Korean peace treaty. Asian Journal of International Law, 9(2), 206-216.

Koblentz, G.D. (2009). Living weapons: Biological warfare and international security. Ithaca, NY: Cornell University Press.

Koh, H.H. (2005). Internalization through socialization. Duke Law Journal, 54(4), 975-982.

Kohler, K. (2019). The return of the ugly American: How Trumpism is pushing Zambia towards China in the 'New Cold War'. Perspectives on Global Development and Technology, 18(1-2), 186-204.

Kronstadt, K.A. (2019). India, Pakistan, and the Pulwama crisis. Washington DC: Congressional Research Service.

Krueger, N., & Dickson, P. R. (1994). How believing in ourselves increases risk taking: Perceived self‐efficacy and opportunity recognition. Decision Sciences, 25(3), 385–400.

Lams, L. (2018). Examining strategic narratives in Chinese official discourse under Xi Jinping. Journal of Chinese Political Science, 23(3), 387-411.

Liddell-Hart, B.H. (1967). Strategy. New York, NY: Frederick A. Praeger, Inc.

Macias, A. (2019, 13 February). From Aruba to Iceland, these 36 nations have no standing military. CNBC. Retrieved from https://www.cnbc.com/2018/04/03/countries-that-do-not-have-a-standing-army-according-to-cia-world-factbook.html

Mansfield-Devine, S. (2018). Nation-state attacks: The start of a new Cold War? Network Security, 2018(11), 15-19.

Mantilla, G. (2017). Conforming instrumentalists: Why the USA and the United Kingdom joined the 1949 Geneva Conventions. The European Journal of International Law, 28(2), 483-511.

Markusen, E., & Kopf, D. (2007). The Holocaust and strategic bombing: Genocide and total war in the twentieth century. Boulder, CO: Westview Press.

Mason, C. (2015). Engineering kindness: Building a machine with compassionate intelligence. International Journal of Synthetic Emotions, 6(1), 1-23.

Mauroni, A. J. (2007). Chemical and biological warfare: A reference handbook. Santa Barbara, CA: ABC-CLIO, Inc.

Mikaberidze, A. (Ed.). (2013). Atrocities, massacres, and war crimes: An encyclopedia. Santa Barbara, CA: ABC-Clio.

Miller, J.D. (2012). Singularity rising. Dallas, TX: BenBella.

Morrow, J.D. (2014). Order within anarchy: The laws of war as an international institution. Cambridge: Cambridge University Press.

Moss, K. B. (2008). Undeclared war and the future of U.S. foreign policy. Washington, DC: Woodrow Wilson International Center for Scholars.

Motlagh, V.V. (2012). Shaping the futures of global nonkilling society. In J.A. Dator & J.E. Pim (Eds.), Nonkilling futures: Visions (pp. 99-114). Honolulu, HI: Center for Global Nonkilling.

Müller, H. (2014). Looking at nuclear rivalry: The role of nuclear deterrence. Strategic Analysis, 38(4), 464-475.

National Security Commission on Artificial Intelligence. (2019). Interim report. Washington, DC: Author.

Ohlin, J.D. (2015). The assault on international law. New York, NY: Oxford University Press.

Omohundro, S. (2008). The basic AI drives. Frontiers in Artificial Intelligence and Applications, 171(1), 483-492.

Fortna, V.P. (2008). Does peacekeeping work? Shaping belligerents' choices after civil war. Princeton, NJ: Princeton University Press.

Paige, G.D. (2009). Nonkilling global political science. Honolulu, HI: Center for Global Nonkilling.

Paige, G.D., & Ahn, C-S. (2012). Nonkilling Korea: Six culture exploration. Honolulu, HI: Center for Global Nonviolence and Seoul National University Asia Center.

Pim, J.E. (2010). Nonkilling societies. Honolulu, HI: Center for Global Nonkilling.

Pim, J.E., & Dhakal, P. (Eds.). (2015). Nonkilling spiritual traditions vol. 1. Honolulu, HI: Center for Global Nonkilling.

Pistono, F., & Yampolskiy, R. (2016). Unethical research: How to create a malevolent artificial intelligence. In Proceedings of Ethics for Artificial Intelligence Workshop (AI-Ethics-2016) (pp. 1-7). New York, NY: AAAI.

Ramamoorthy, A., & Yampolskiy, R. (2018). Beyond MAD? The race for artificial general intelligence. ICT Discoveries, 1(Special Issue 1). Retrieved from http://www.itu.int/pub/S-JOURNAL-ICTS.V1I1-2018-9

Raymond, W.J. (1992). Dictionary of politics: Selected American and foreign political and legal terms. Lawrenceville, VA: Brunswick.

Rose, A. (2018). Mining memories with Donald Trump in the Anthropocene. MFS - Modern Fiction Studies, 64(4), 701-722.

Rotaru, V. (2019). Instrumentalizing the recent past? The new Cold War narrative in Russian public space after 2014. Post-Soviet Affairs, 35(1), 25-40.

Russell, S. J. (2019). Human compatible: Artificial intelligence and the problem of control. London: Allen Lane.

Russia Today. (2017). ‘Whoever leads in AI will rule the world’: Putin to Russian children on Knowledge Day. Russia Today, 1 September 2017.

Scharre, P. (2019). Killer apps: The real dangers of an AI arms race. Foreign Affairs. Retrieved from https://www.foreignaffairs.com/articles/2019-04-16/killer-apps

Schmitt, M. N. (2017). Tallinn manual 2.0 on the international law applicable to cyber operations. Cambridge: Cambridge University Press.

Segal, H.P. (2005). Technological utopianism in American culture: Twentieth anniversary edition. Syracuse, NY: Syracuse University Press.

Sharikov, P. (2018). Artificial intelligence, cyberattack, and nuclear weapons—A dangerous combination. Bulletin of the Atomic Scientists, 74(6), 368-373.

Shulman, C. (2010). Omohundro’s “basic AI drives” and catastrophic risks. MIRI technical report. Retrieved from http://intelligence.org/files/BasicAIDrives.pdf

Simmons, B.A. (2009). Mobilizing for human rights: International law in domestic politics. Cambridge: Cambridge University Press.

SIPRI. (2019). SIPRI military expenditure database. Retrieved from https://www.sipri.org/databases/milex

Soares, N., Fallenstein, B., Yudkowsky, E., & Armstrong, S. (2015). Corrigibility. In Artificial Intelligence and Ethics: Papers from the 2015 AAAI Workshop (pp. 74-82). New York, NY: AAAI.

Sotala, K., & Yampolskiy, R.V. (2015). Responses to catastrophic AGI risk: A survey. Physica Scripta, 90(1), 1-33.

Tan, K.H., & Perudin, A. (2019). The “geopolitical” factor in the Syrian Civil War: A corpus-based thematic analysis. SAGE Open, 9(2), 1-15.

Tegmark, M. (2017). Life 3.0: Being human in the age of artificial intelligence. New York, NY: Knopf.

Terminski, B. (2010). The evolution of the concept of perpetual peace in the history of political-legal thought. Perspectivas Internacionales, 6(1), 277–291.

Thomson, J. J. (1985). The trolley problem. The Yale Law Journal, 94(6), 1395–1415.

Tindley, A., & Wodehouse, A. (2016). Design, technology and communication in the British Empire, 1830–1914. London: Palgrave Macmillan.

Turchin, A., & Denkenberger, D. (2017). Levels of self-improvement. Manuscript.

Turchin, A., & Denkenberger, D. (2018). Military AI as a convergent goal of self-improving AI. In R.V. Yampolskiy (Ed.), Artificial intelligence safety and security (pp. ). London: Chapman & Hall.

Turchin, A., & Denkenberger, D. (2020). Classification of global catastrophic risks connected with artificial intelligence. AI and Society, 35(1), 147-163.

Turchin, A., Denkenberger, D., & Green, B.P. (2019). Global solutions vs. local solutions for the AI safety problem. Big Data and Cognitive Computing, 3(1), 16.

United States Committee on Foreign Relations (1955). Geneva Conventions for the Protection of War Victims, Report to the United States Senate on Executives D, E, F, and G, 84th Congress, 1st Session, Executive Report no. 9. Washington, DC: Committee on Foreign Relations.

Walker, P. (2008, 9 August). Georgia declares 'state of war' over South Ossetia. The Guardian. Retrieved from https://www.theguardian.com/world/2008/aug/09/georgia.russia2

Walsh, J. I. (2018). The rise of targeted killing. Journal of Strategic Studies, 41(1–2), 143–159.

Walters, G. (2017, 6 September). Artificial intelligence is poised to revolutionize warfare. Seeker. Retrieved from https://www.seeker.com/tech/artificial-intelligence/artificial-intelligence-is-poised-to-revolutionize-warfare

Wang, Yi, et al. (2019). Responsibility and sustainability in brain science, technology, and neuroethics in China—A culture-oriented perspective. Neuron, 101(3), 375–379.

Ward, S. (2017). Status and the challenge of rising powers. Cambridge: Cambridge University Press.

Westad, O.A. (2019). The sources of Chinese conduct: Are Washington and Beijing fighting a New Cold War? Foreign Affairs, 98(5), 86-95.

Williamson, J.B. (2004). The strange history of the Washington consensus. Journal of Post Keynesian Economics, 27(2), 195-206.

World Bank. (2019). Military expenditure (% of general government expenditure). Retrieved from https://data.worldbank.org/indicator/MS.MIL.XPND.ZS?most_recent_value_desc=true

Yampolskiy, R. V. (2016). Taxonomy of pathways to dangerous artificial intelligence. In AAAI Workshop - Technical Report, WS-16-01 - WS-16-15 (pp. 143-148). Palo Alto, CA: Association for the Advancement of Artificial Intelligence.

Young, G. M. (2012). The Russian Cosmists: The esoteric futurism of Nikolai Fedorov and his followers. New York, NY: Oxford University Press.

Yudkowsky, E. (2001). Creating friendly AI 1.0: The analysis and design of benevolent goal architectures. San Francisco, CA: The Singularity Institute.

Yudkowsky, E. (2004). Coherent extrapolated volition. San Francisco, CA: The Singularity Institute.

Yudkowsky, E. (2008). Artificial intelligence as a positive and negative factor in global risk. In N. Bostrom and M. M. Ćirković (Eds.), Global Catastrophic Risks (pp. 308–345). New York: Oxford University Press.

Zhao, M. (2019). Is a new cold war inevitable? Chinese perspectives on US-China strategic competition. Chinese Journal of International Politics, 12(3), 371-394.

Zwetsloot, R. (2018). Syllabus: Artificial intelligence and international security. Retrieved from https://www.fhi.ox.ac.uk/wp-content/uploads/Artificial-Intelligence-and-International-Security-Syllabus.pdf
