International cooperation vs. AI arms race

Summary

I think there's a decent chance that governments will be the first to build artificial general intelligence (AI). International hostility, especially an AI arms race, could exacerbate risk-taking, hostile motivations, and errors of judgment when creating AI. If so, then international cooperation could be an important factor to consider when evaluating the flow-through effects of charities. That said, we may not want to popularize the arms-race consideration too openly lest we accelerate the race.

Will governments build AI first?

AI poses a national-security threat, and unless the militaries of powerful countries are very naive, it seems to me unlikely they'd allow AI research to proceed in private indefinitely. At some point the US military would confiscate the project from Google or Goldman Sachs, if the US military isn't already ahead of them in secret by that point. (DARPA already funds a lot of public AI research.)

There are some scenarios in which private AI research wouldn't be nationalized:

  • An unexpected AI foom before anyone realizes what was coming.
  • The private developers stay underground for long enough not to be caught. This becomes less likely the more government surveillance improves (see "Arms Control and Intelligence Explosions").
  • AI developers move to a "safe haven" country where they can't be taken over. (It seems like the international community might prevent this, however, in the same way it now seeks to suppress terrorism in other countries.)
Each of these scenarios could happen, but it seems most likely to me that governments would ultimately control AI development.

AI arms races

Government AI development could go wrong in several ways. Probably most on LW feel the prevailing scenario is that governments would botch the process by not realizing the risks at hand. It's also possible that governments would use the AI for malevolent, totalitarian purposes.

It seems that both of these bad scenarios would be exacerbated by international conflict. Greater hostility means countries are more inclined to use AI as a weapon. Indeed, whoever builds the first AI can take over the world, which makes building AI the ultimate arms race. A USA-China race is one reasonable possibility.

Arms races encourage risk-taking -- being willing to skimp on safety measures to improve your odds of winning ("Racing to the Precipice"). In addition, the weaponization of AI could lead to worse expected outcomes in general. CEV seems to have less hope of success in a Cold War scenario. ("What? You want to include the evil Chinese in your CEV??") (ETA: With a pure CEV, presumably it would eventually count Chinese values even if it started with just Americans, because people would become more enlightened during the process. However, when we imagine more crude democratic decision outcomes, this becomes less likely.)

Ways to avoid an arms race

Averting an AI arms race seems to be an important topic for research. It could be partly informed by the Cold War and other nuclear arms races, as well as by other efforts at nonproliferation of chemical and biological weapons.

Apart from more robust arms control, other factors might help:

  • Improved international institutions like the UN, allowing for better enforcement against defection by one state.
  • In the long run, a scenario of global governance (i.e., a Leviathan or singleton) would likely be ideal for strengthening international cooperation, just as nation states reduce intra-state violence.
  • Better construction and enforcement of nonproliferation treaties.
  • Improved game theory and international-relations scholarship on the causes of arms races and how to avert them. (For instance, arms races have sometimes been modeled as iterated prisoner's dilemmas with imperfect information.)
  • Improved verification, which has historically been a weak point for nuclear arms control. (The concern is that if you haven't verified well enough, the other side might be arming while you're not.)
  • Moral tolerance and multicultural perspective, aiming to reduce people's sense of nationalism. (In the limit where neither Americans nor Chinese cared which government won the race, there would be no point in having the race.)
  • Improved trade, democracy, and other forces that historically have reduced the likelihood of war.

Are these efforts cost-effective?

World peace is hardly a goal unique to effective altruists (EAs), so we shouldn't necessarily expect low-hanging fruit. On the other hand, projects like nuclear nonproliferation seem relatively underfunded even compared with anti-poverty charities.

I suspect more direct MIRI-type research has higher expected value, but among EAs who don't want to fund MIRI specifically, encouraging donations toward international cooperation could be valuable, since it's certainly a more mainstream cause. I wonder if GiveWell would consider studying global cooperation specifically beyond its indirect relationship with catastrophic risks.

Should we publicize AI arms races?

When I mentioned this topic to a friend, he pointed out that we might not want the idea of AI arms races too widely known, because then governments might take the concern more seriously and therefore start the race earlier -- giving us less time to prepare and less time to work on FAI in the meanwhile. From David Chalmers, "The Singularity: A Philosophical Analysis" (footnote 14):

When I discussed these issues with cadets and staff at the West Point Military Academy, the question arose as to whether the US military or other branches of the government might attempt to prevent the creation of AI or AI+, due to the risks of an intelligence explosion. The consensus was that they would not, as such prevention would only increase the chances that AI or AI+ would first be created by a foreign power. One might even expect an AI arms race at some point, once the potential consequences of an intelligence explosion are registered. According to this reasoning, although AI+ would have risks from the standpoint of the US government, the risks of Chinese AI+ (say) would be far greater.

We should take this information-hazard concern seriously and remember the unilateralist's curse. If it proves decisive against explicitly discussing AI arms races, we might instead encourage international cooperation without explaining why. Fortunately, it wouldn't be hard to encourage international cooperation on grounds other than AI arms races if we wanted to do so.

ETA: Also note that a government-level arms race might be preferable to a Wild West race among a dozen private AI developers where coordination and compromise would be not just difficult but potentially impossible.

Please forgive the self-promotion, but this is from Chapter 5 of my book Singularity Rising:

"Successfully creating an obedient ultra-intelligence would give a country control of everything, making ultra-AI far more militarily useful than mere atomic weapons. The first nation to create an obedient ultra-AI would also instantly acquire the capacity to terminate its rivals’ AI development projects. Knowing the stakes, rival nations might go full throttle to win an ultra-AI race, even if they understood that haste could cause them to create a world destroying ultra-intelligence. These rivals might realize the danger and desperately wish to come to an agreement to reduce the peril, but they might find that the logic of the widely used game theory paradox of the Prisoners’ Dilemma thwarts all cooperation efforts."

"Scenario 2: Generals, I [The United States President] have ordered the CIA to try to penetrate the Chinese seed AI development program, but I’m not hopeful, since the entire program consists of only twenty software engineers. Similarly, although Chinese intelligence must be using all their resources to break into our development program, the small size of our program means they will likely fail. I’ve thought about suggesting to the Chinese that we each monitor the other’s development program, but then I realized that each of us would cheat by creating a fake program that the other could watch while the real program continued to operate in secret. Since we can’t monitor the Chinese and they can’t monitor us, I’m ordering you to proceed quickly."

"Scenario 4: Generals, I order you to immediately bomb the Chinese AI research facilities because they are on track to finish a few months before we do. Fortunately, their development effort is on a large enough scale that our spies were able to locate it. I know you worry that the Chinese will retaliate against us, but as soon as we attack, I will let the Chinese know that our seed AI development team operates out of submarines undetectable by their military. I will tell the Chinese that if they don’t strike back at us, then after the Singularity we will treat them extremely well."

Scenario 5: "Generals, I order you to strike China with a thousand hydrogen bombs. Our spies have determined that the Chinese are on the verge of activating their seed AI. Based on information given to them by the CIA, our AI development team believes that if the Chinese create an AI, it has a 20% chance of extinguishing mankind.

I personally called the Chinese Premier, told him everything we know about his program, and urged him to slow down, lest he destroy us all. But the Premier denied even having an AI program and probably believes that I'm lying to give our program time to finish ahead of his. And (to be honest), even if a Chinese ultra-AI would be just as safe as ours, I would be willing to deceive the Chinese if it would give our program a greater chance of beating theirs.

Tragically, our spies haven't been able to pinpoint the location of the Chinese program, and the only action I can take that has a high chance of stopping the program is to kill almost everyone in China. The Chinese have a robust second strike capacity and I'm certain that they'll respond to our attack by hitting us with biological weapons and hundreds of hydrogen bombs.

If unchecked, radioactive fallout and weaponized pathogens will eventually wipe out the human race. But our ultra-AI, which I'm 90% confident we will be able to develop within 15 years, could undoubtedly clean up the radiation and pathogens, modify humans so they won't be affected by either, or even use nanotechnology to terraform Mars and transport our species there. Our AI program operates out of submarines and secure underground bases that can withstand any Chinese attack. Based on my intelligence I'm almost certain that the Chinese haven't similarly protected their program. I can't use the threat of thermonuclear war to make the Chinese halt their program because they would then place their development team outside of our grasp.

Within a year, we will probably have the technical ability to activate a seed AI, but once the Chinese threat has been annihilated our team will have no reason to hurry and could take a decade to fine-tune their seed AI. If we delay, any intelligence explosion we create will have an extremely high probability of yielding a friendly AI. Some people on our team think that, given another decade, they will be able to mathematically prove that the seed AI will turn into a friendly ultra-AI.

A friendly AI would allow trillions and trillions of people to eventually live their lives, and mankind and our descendants could survive to the end of the universe in utopia. In contrast, an unfriendly AI would destroy us. I have decided to make the survival of mankind my overwhelming priority. Consequently, since a thermonuclear war would non-trivially increase the chance of mankind’s survival, I believe that it's my moral duty to initiate war, even though my war will kill over a billion human beings. Physicists haven't ruled out the possibility of time travel, so perhaps our ultra-AI will be able to save all of the people I'm about to kill."

There's a fourth possibility for how the first AGI won't be government-written: governments might overlook the potential until it's too late. It seems counter-intuitive to us, since we live in this idea-space, but most governments are still in the process of noticing the internet. There probably isn't someone whose job it is to notice that uFAI is a risk, so it's entirely possible no one will.

As Paul Graham wrote in a different context:

Fortunately for startups, big companies are extremely good at denial. If you take the trouble to attack them from an oblique angle, they'll meet you half-way and maneuver to keep you in their blind spot.

Governments are even bigger, and even better at denial.

If this seems to be happening, we should probably encourage it.

Hmm, it seems like government AI development might be preferable to a Wild West of private groups. At least in a US-China arms race you have just two parties and so have a shot at treaties and iterated-prisoner's-dilemma (IPD) game dynamics. With unregulated private developers, you have a multiplayer prisoner's dilemma, making IPD-type cooperation or other forms of coordination much harder.

True, but governments have some really scary terminal values.

What are they, how do you know them, and how certain are you?

most governments are still in the process of noticing the internet

I don't know about most governments, but at least some governments are well into the process of achieving full control of the Internet and are also using it to do things they couldn't do before it existed.

I'm very uneasy as to how to properly discuss AI research:

  • One can't warn of the dangers of AI without bragging of its power. Will the warning increase or decrease the probability of UAI?
  • One can't advise responsible people not to attempt to make an AI without increasing the risk that the first AI will be made by someone irresponsible. But what are the chances that an AI made with good intentions destroys humanity anyway?
  • AI research seems to correspond with a prisoner's dilemma, so I wouldn't expect cooperation.
  • I don't know whether it is a better idea to oppose AI research or support it.

I think I can safely conclude:

  • Anyone who already shows an interest in AI should be warned of the dangers.
  • Advise using encryption at all levels when doing AI research, on the assumption that anyone who would steal AI research is likely to be more dangerous (likely to make a mistake, or evil).
  • Support research into "what is good" in terms that might help programming
  • Support research into the obedience aspect of AI (either directly to the author, or to the author's intended programming)

As to the involvement of government, I'm really nervous about that whether it is an individual nation or supposed cooperation.

These are tricky issues. :)

AI research seems to correspond with a prisoner's dilemma, so I wouldn't expect cooperation.

Fortunately, many real-world scenarios are iterated prisoner's dilemmas (e.g., moving ahead with your country's AI research faster than what was agreed upon). We can also set up side payments against defection, such as by an international governing body. And changing people's views about the payoffs (such as by encouraging an internationalist outlook) could make the game no longer a prisoner's dilemma.

In general, this highlights the importance of improving theory of, institutions for, and inclinations toward compromise.
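
As a rough illustration of how side payments against defection can change the game, here is a minimal Python sketch. The payoff numbers and the size of the penalty are hypothetical, chosen only to make the arithmetic easy to follow, and the function names are made up for this example. It checks the standard ordering that defines a prisoner's dilemma (temptation > reward > punishment > sucker's payoff) before and after a penalty is charged to any defector.

```python
def is_prisoners_dilemma(temptation, reward, punishment, sucker):
    """True if the payoffs have the standard prisoner's dilemma ordering."""
    return temptation > reward > punishment > sucker


def defection_is_dominant(temptation, reward, punishment, sucker):
    """True if defecting pays more whatever the other player does."""
    return temptation > reward and punishment > sucker


def with_defection_penalty(temptation, reward, punishment, sucker, penalty):
    """Payoffs after a third party (e.g., a governing body) fines defectors.

    A defector pays the penalty whether or not the other side cooperates,
    so only the two defection payoffs change.
    """
    return temptation - penalty, reward, punishment - penalty, sucker


# Hypothetical payoffs for one country: defect against a cooperator (5),
# mutual cooperation (3), mutual defection (1), cooperate against a defector (0).
base = (5, 3, 1, 0)
fined = with_defection_penalty(*base, penalty=3)

print(is_prisoners_dilemma(*base), defection_is_dominant(*base))
print(is_prisoners_dilemma(*fined), defection_is_dominant(*fined))
```

With these made-up numbers, a penalty of 3 makes defection worse than cooperation in both cases, so the game stops being a prisoner's dilemma. Whether any real institution could credibly impose such a penalty is exactly the enforcement question raised above.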

governments would botch the process by not realizing the risks at hand.

To be fair, so would private companies and individuals.

It's also possible that governments would use the AI for malevolent, totalitarian purposes.

It's less likely IMO that a government would launch a completely independent top secret AI project with the explicit goal of "take over and optimize existence", relying on FOOMing and first-mover advantage.

More likely, an existing highly funded arm of the government - the military, the intelligence service, the homeland department, the financial services - will try to build an AI that will be told to further their narrow goals. Starting from "build a superweapon", "spy on the enemy premier", "put down a revolution", "fix the economy", all the way to "destroy all other militaries", "gather all information", "control all citizens", and "control all money".

In such a scenario, the AI not only won't be told to optimize for "all people" or "all nations", but it won't even be told to optimize for "all interests of our country".

To be fair, so would private companies and individuals.

Yes, perhaps more so. :) The main point in the post was that risks of botching the process increase in a competitive scenario where you're pressed for time.

Did you really just publicly post an idea that has "should we discuss this idea publicly" as a major open question?

might not want the idea of AI arms races too widely known

It's a post on LW, not a concerted effort to publicize said ideas in the respective government circles. The quip (if you meant it as such) may appear obvious, but is inapt on a second-order approximation.

I had two private conversations first to ask whether I should make this post, and the general consensus was that it was net good to share on LW. It seems the upside to making the topic more widely discussed among ourselves exceeds the potential downside.

I should also note that it's not completely obvious if making the idea widely known to governments is net bad -- maybe this would help curb Wild West development scenarios. But we should get careful consensus on a decision like that before moving ahead.

If AI developers are sufficiently concerned about this risk, maybe they could develop AI in a large international consortium?

How much would AI developers be willing to sacrifice? They may be sufficiently concerned about this risk as explained, but motivated and well-funded organizations (or governments) should have no problem attempting to influence, persuade or convert a fraction of AI developers to think otherwise.

I wonder if global climate change can be used as an analogy highlighting what some climate scientists are willing to publish due to funding and/or other incentives beyond scientific inquiry.

"What? You want to include the evil Chinese in your CEV??"

It seems to me that a correctly implemented CEV including only Americans (or only Chinese) would lead to a significantly better outcome than an incorrectly implemented CEV.

It also seems to me that a correctly implemented CEV including only myself and a few friends and trusted figures of authority would lead to a much better outcome than a CEV including all Americans or all Chinese.

Could be, although remember that everyone else would also prefer for just them, their friends, and their trusted figures to be in CEV. Including more people is for reasons of compromise, not necessarily intrinsic value.

Isaac_Davis made a good point that a true CEV might not depend that sensitively on what country it was seeded from. The bigger danger I had in mind would be the (much more likely) outcome of imperfect CEV, such as regular democracy. In that case, excluding the Chinese could lead to more parochial outcomes, and the Chinese would then also have more reason to worry about a US AI.

Could be, although remember that everyone else would also prefer for just them, their friends, and their trusted figures to be in CEV. Including more people is for reasons of compromise, not necessarily intrinsic value.

That's my point. If you're funding a small-team top secret AGI project, you can keep your seed community small too; you don't need to compromise. Especially if you're consciously racing to finish your project before any rivals, you won't want to include those rivals in your CEV.

Well, what does that imply your fellow prisoners in this one-shot prisoner's dilemma are deciding to do in the secret of their basements? Maybe our best bet is to change the payoffs so that we get a different game than a one-shot PD, via explicit coordination agreements and surveillance to enforce them.

The surveillance would have to be good enough to prevent all attempts made by the most powerful governments to develop in secret something that may (eventually) require nothing beyond a few programmers in a few rooms running code.

This is a real issue. Verifying compliance with AI-limitation agreements is much harder than with nuclear agreements, and already those have issues. Carl's paper suggests lie detection and other advanced transparency measures as possibilities, but it's unclear if governments will tolerate this even when the future of the galaxy is at stake.

Good point. :) With a pure CEV, it might converge to roughly the same thing. Where it could matter more is with a much more crude form of democracy determining the AI's values.

Also, if you're in a hurry to get the AI out the door before the other guy does, you don't have a lot of time for CEV or even for regular democratically made choices.

[Edited]

Or more simply, war and the multilateralist's curse are bad. Things being equal, we should avoid these. So I agree with promoting international cooperation. Exceptions might be if a rogue state was about to do something worse than war or the world was unified already on Vital Issues. Neither of these seem that compelling now.

Then there's that whole AI thing...

whoever builds the first AI can take over the world, which makes building AI the ultimate arms race.

As the Wikipedians often say, "citation needed". The first "AI" was built decades ago. It evidently failed to "take over the world". Possibly someday a machine will take over the world - but it may not be the first one built.

In the opening sentence I used the (perhaps unwise) abbreviation "artificial general intelligence (AI)" because I meant AGI throughout the piece, but I wanted to be able to say just "AI" for convenience. Maybe I should have said "AGI" instead.

The first OS didn't take over the world. The first search engine didn't take over the world. The first government didn't take over the world. The first agent of some type taking over the world is dramatic - but there's no good reason to think that it will happen. History better supports models where pioneers typically get their lunch eaten by bigger fish coming up from behind them.

Yes, let's engage in reference class tennis instead of thinking about object level features.

Doesn't someone have to hit the ball back for it to be "tennis"? If anyone does so, we can then compare reference classes - and see who has the better set. Are you suggesting this sort of thing is not productive? On what grounds?

Doesn't someone have to hit the ball back for it to be "tennis"?

Looks like someone already did.

And I'm not just suggesting this is not productive, I'm saying it's not productive. My reasoning is standard: see here and also here.

Standard? Invoking reference classes is a form of arguing by analogy. It's a basic thinking tool. Don't knock it if you don't know how to use it.

Don't be obnoxious. I linked to two posts that discuss the issue in depth. There's no need to reduce my comment to one meaningless word.

If we're talking reference classes, I would cite the example that the first hominid species to develop human-level intelligence took over the world.

At an object level, if AI research goes secret at some point, it seems unlikely, though not impossible, that if team A develops human-level AGI, then team B will develop super-human-level AGI before team A does. If the research is fully public (which seems dubious but again isn't impossible), then these advantages would be less pronounced, and it might well be that many teams could be in close competition even after human-level AGI. Still, because human-level AGI can be scaled to run very quickly, it seems likely it could bootstrap itself to stay in the lead.

If we're talking reference classes, I would cite the example that the first hominid species to develop human-level intelligence took over the world.

Note that humans haven't "taken over the world" in many senses of the phrase. We are massively outnumbered and out-massed by our own symbionts - and by other creatures.

Machine intelligence probably won't be a "secret" technology for long - due to the economic pressure to embed it.

While it's true that things will go faster in the future, that applies about equally to all players - in a phenomenon commonly known as "internet time".

As has been pointed out numerous times on LessWrong, history is not a very good guide for dealing with AI since it is likely to be a singular (if you'll excuse the pun) event in history. Perhaps the only other thing it can be compared with is life itself, and we currently have no information about how it arose (did the first self-replicating molecule lead to all life as we know it? Or were there many competing forms of life, one of which eventually won?)

As has been pointed out numerous times on LessWrong, history is not a very good guide for dealing with AI since it is likely to be a singular (if you'll excuse the pun) event in history. Perhaps the only other thing it can be compared with is life itself [...]

What, a new thinking technology? You can't be serious.

Well, nice to know we're planning our global thermonuclear wars decades before there's any sign we'll need a global thermonuclear war for any good reason.

Goddamnit, do you people just like plotting wars!?

You're equivocating on the word "need". When one refers to needing most things, it means we're better off with them than with not having them. But for global thermonuclear war, the comparison is not to having no war; the comparison is to having a war where other parties are the ones with all the nukes.

Furthermore, describing many actions in terms of "need" is misleading. "Needing" something normally implies a naive model where if you want X to happen, you are willing to do X and vice versa. Look up everything that has been written here about precommitting; nuclear war is a case of precommitting and precommitting to something can actually reduce its likelihood.

You're equivocating on the word "need". When one refers to needing most things, it means we're better off with them than with not having them. But for global thermonuclear war, the comparison is not to having no war; the comparison is to having a war where other parties are the ones with all the nukes.

No, we're not talking about that kind of war. We're not talking about a balance of power that can be maintained through anti-proliferation laws (though I certainly support international agreements to not build AI and contribute to a shared, international FAI project!). If we get to the point of an American FAI versus a Chinese FAI, the two AIs will negotiate a rational compromise to best suit the American and Chinese CEVs (which won't even be that different compared to, say, Clippy).

Whereas if we get one UFAI that manages to go FOOM, it doesn't fucking matter who built it: we're all dead.

So the issue is not, "You don't build UFAI and I won't build UFAI." The issue is simply: don't build UFAI, ever, at all. All humans have rational reason to buy this proposition.

There are actually two better options here than preemptively plotting an existential-risk-grade war. They are not dichotomous and I personally support employing both.

  • Plot an international treaty to limit the creation of FOOM-able AIs outside a strict framework of cooperative FAI development that involves a broad scientific community and limits the resources needed for rogue states or organizations to develop UFAI. This favors the singleton approach advocated by Nick Bostrom and Eliezer Yudkowsky, and also avoids thermonuclear war. An Iraq-style conventional war of regime change is already a severe enough threat to bend most nations' interests in favor of either cooperative FAI development or just not developing AI.

  • For the case of a restricted-domain FAI being created, encourage global economic cooperation and cultural interaction, to ensure that whether the first FAI is Chinese or American, it will infer values over humans of a more global rather than parochial culture and orientation (though I had thought Eliezer's cognitivist approach to human ethics was meant to be difficult to corrupt using mere cultural brainwashing).

That leaves the following military options: in case of a regime showing signs of going rogue and developing their own AI, utilize conventional warfare (which in an increasingly economically interconnected world is already extremely painful for anyone except North Korea or the very poor, neither of which is good at building AIs). In case of an actual UFAI appearing and beginning a process of paper-clipping the world within a timespan in which we can see it coming before it kills us: consider annihilating the planet.

However, strangely enough, none of these options suit the cultural fetish around here for sitting around in hooded cloaks plotting the doom of others in secret and feeling ever-so-"rational" about ourselves for being willing to engage in deception, secrecy, and murder for the Greater Good. So I predict people here won't actually want to take those options, because the terminal goal at work is Be Part of the Conspiracy rather than Ensure the First Superintelligence is Friendly.

Thanks, Eli. You make some good points amidst the storm. :)

I think the scenario James elaborated was meant to be a fictional portrayal of a bad outcome that we should seek to avoid. That it was pasted without context may have given the impression that he actually supported such a strategy.

I mostly agree with your bullet points. Working toward cooperation and global unification, especially before things get ugly, is what I was suggesting in the opening post.

Even if uFAI would destroy its creators, people still have incentive to skimp on safety measures in an arms-race situation because they're trading off some increased chance of winning against some increased chance of killing everyone. If winning the race is better than letting someone else win, then you're willing to tolerate some increased risk of killing everyone. This is why I suggested promoting internationalist perspective as one way to improve the situation -- because then individual countries would care less about winning the race.

BTW, it's not clear that Clippy would kill us all. Like in any other struggle for power, a newly created Clippy might compromise with humans by keeping them alive and giving them some of what they want. This is especially likely if Clippy is risk averse.

Interesting. So there are backup safety strategies. That's quite comforting to know, actually.

I think the scenario James elaborated was meant to be a fictional portrayal of a bad outcome that we should seek to avoid. That it was pasted without context may have given the impression that he actually supported such a strategy.

Oh thank God. I'd like to apologize for my behavior, but to be honest this community is oftentimes over my Poe's Law Line where I can no longer actually tell if someone is acting out a fictional parody of a certain idea or actually believes in that idea.

Next time I guess I'll just assign much more probability to the "this person is portraying a fictional hypothetical" notion.

If winning the race is better than letting someone else win, then you're willing to tolerate some increased risk of killing everyone.

Sorry, could you explain? I'm not seeing it. That is, I'm not seeing how increasing the probability that your victory equates with your own suicide is better than letting someone else just kill you. You're dead either way.

No worries. :-)

That is, I'm not seeing how increasing the probability that your victory equates with your own suicide is better than letting someone else just kill you. You're dead either way.

Say that value(you win) = +4, value(others win) = +2, value(all die) = 0. If you skimp on safety measures for yourself, you can increase your probability of winning relative to others, and this is worth some increased chance of killing everyone. Let me know if you want further clarification. :) The final endpoint of this process will be a Nash equilibrium, as discussed in "Racing to the Precipice," but what I described could be one step toward reaching that equilibrium.
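To make the arithmetic concrete, here is a minimal sketch of the trade-off using the payoffs above (+4, +2, 0). The probabilities are made-up numbers chosen only for illustration, and the function name `expected_value` is hypothetical; nothing here comes from "Racing to the Precipice."

```python
def expected_value(p_win, p_doom, v_win=4.0, v_others=2.0, v_doom=0.0):
    """Expected payoff for one racer.

    With probability p_doom everyone dies (value v_doom). Otherwise the
    racer wins with probability p_win (value v_win) or someone else
    wins (value v_others). All probabilities are illustrative assumptions.
    """
    p_survive = 1.0 - p_doom
    return p_doom * v_doom + p_survive * (p_win * v_win + (1.0 - p_win) * v_others)

# Careful development: even odds of winning, 10% chance of killing everyone.
careful = expected_value(p_win=0.5, p_doom=0.10)

# Skimping on safety: better odds of winning, somewhat higher chance of doom.
reckless = expected_value(p_win=0.8, p_doom=0.15)

# Skimping much harder: the added risk of doom now outweighs the better odds.
very_reckless = expected_value(p_win=0.8, p_doom=0.40)

print(careful, reckless, very_reckless)
```

Under these assumed numbers the moderately reckless racer comes out ahead of the careful one, which is the incentive described above, while the very reckless racer does worse. Shrinking the gap between value(you win) and value(others win), as an internationalist perspective would, shrinks the amount of extra risk worth taking.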

none of these options suit the cultural fetish around here for sitting around in hooded cloaks plotting the doom of others in secret and feeling ever-so-"rational" about ourselves for being willing to engage in deception, secrecy, and murder for the Greater Good.

Oh, how... rebel of you.

May I recommend less drama?

May I recommend less drama?

Frankly, when someone writes a post recommending global thermonuclear war as a possible option, that's my line. My suggested courses of action are noticeably less melodramatic and noticeably closer to the plain, boring field of WW3-prevention.

But I gave you the upvote anyway for calling out my davkanik tendencies.

Frankly, when someone writes a post recommending global thermonuclear war as a possible option, that's my line.

I'm genuinely confused. There's an analogy to a nuclear arms race running through the OP, but as best I can tell it's mostly linking AI development controls to Cold War-era arms control efforts -- which seems reasonable, if inexact. Certainly it's not advocating tossing nukes around.

Can you point me to exactly what you're responding to?

Ah, I seem to be referring to James' excerpt from his book rather than the OP:

A friendly AI would allow trillions and trillions of people to eventually live their lives, and mankind and our descendents could survive to the end of the universe in utopia. In contrast, an unfriendly AI would destroy us. I have decided to make the survival of mankind my overwhelming priority. Consequently, since a thermonuclear war would non-trivially increase the chance of mankind’s survival, I believe that it's my moral duty to initiate war, even though my war will kill over a billion human beings.

Oh, that makes more sense. I'd assumed, since this thread was rooted under the OP, that you were responding to that.

After reading James's post, though, I don't think it's meant to be treated as comprehensive, much less prescriptive. He seems to be giving some (fictional) outlines of outcomes that could arise in the absence of early and aggressive cooperation on AI development; the stakes at that point are high, so the consequences are rather precipitous, but this is still something to avoid rather than something to pursue. Reading between the lines, in fact, I'd say the policy implications he's gesturing towards are much the same as those you've been talking about upthread.

On the other hand, it's very early to be hashing out scenarios like this, and doing so doesn't say anything particularly good about us from a PR perspective. It's hard enough getting people to take AI seriously as a risk, full stop; we don't need to exacerbate that with wild apocalyptic fantasies just yet.

It's hard enough getting people to take AI seriously as a risk, full stop

This bears investigating. I mean, come on, the popular view of AI among the masses is that All AI Is A Crapshoot, that every single time it will end in the Robot Wars. So how on Earth can it be difficult to convince people that UFAI is an issue?

I mean, hell, if I wanted to scare someone, I'd just point out that no currently-known model of AGI includes a way to explicitly specify goals desirable to humans. That oughtta scare folks.

I've talked to a number of folks who conclude that AIs will be superintelligent and therefore will naturally derive and follow the true morality (you know, the same one we do), and dismiss all that Robot Wars stuff as television crap (not unreasonably, as far as it goes).

(you know, the same one we do),

Which one's that, eh ;-)?

folks who conclude that AIs will be superintelligent and therefore will naturally derive and follow the true morality

Are these religious people? I mean, come on, where do you get moral realism if not from some kind of moral metaphysics?

dismiss all that Robot Wars stuff as television crap (not unreasonably, as far as it goes).

Certainly it's not unreasonable. One UFAI versus humans with no FAI to fight back, I wouldn't call anything so one-sided a war.

(And I'm sooo not making the Dalek reference that I really want to. Someone else should do it.)

moral realism

Pedantic complaint about language: moral realism simply says that moral claims do state facts, and at least some of them are true. It takes further assumptions ("internalism") to claim that these moral facts are universally compelling in the sense of moving any intelligent being to action. (I personally believe the latter assumption to be nonsense, hence AGI is a really bad idea.)

Granted, I don't know of any nice precise term for that position that all intelligent beings must necessarily do the right thing, possibly because it's so ridiculous no philosopher would profess it publicly in such words. On the other hand, motivational internalism would seem to be very intuitive, judging by the pervasiveness of the view that AI doesn't pose any risk.

Granted, I don't know of any nice precise term for that position that all intelligent beings must necessarily do the right thing

Isn't it called Convergence?

Are you under the impression that CEV advocates around here believe that all intelligent beings must necessarily do the right thing?

On the whole, confusion reigns, but there is a fairly consistent tendency to reject Intrinsic Motivation without argument.

What's "Intrinsic Motivation"? The only hits for it on LW are about akrasia.

So, moral motivational internalism. Then I agree that we tend to reject it. For example, here. You can make it work by having "this motivates the person considering it" be incorporated into the definition of "right", but that results in a relativist definition, and I don't see any need for it anyway.

Motivational internalism may not be an obvious truth, but that doesn't mean its falsehood is the default. I don't see the relevance of the link.

So, basically, what we call "terminal values"?

No, the idea of motivational internalism is that you can't judge something as right or wrong without being motivated to pursue or avoid it. Like if the word "right" was short for "this thing matches my terminal values".

The alternative is externalism, where "right" means {X, Y, Z} and we (some/most/all humans) are motivated to pursue it just because we like {X, Y, Z}.

Does "Intrinsic Motivation" in this context entail that all intelligent beings must necessarily do the right thing?

If so, then I agree that we tend to reject it. As for "without argument"... do you mean you've read the local discussions of the topic and find them unconvincing? Or do you mean you believe it hasn't been discussed at all?

If not, then I don't know what you're saying.

If you prefer to continue expressing yourself in gnomic utterances, that's of course your choice, but I find it an unhelpful way to communicate and will tap out here if so.

If not, I'm

Eh, maybe? I've seen "convergence thesis" thrown about on LW, but it's hardly established terminology. Not sure it would be fair to use a phrase so easily confused with Bostrom's much more reasonable Instrumental Convergence Thesis either. (Also, it has nothing to do with CEV so I don't see the point of that link.)

I've never had that conversation with explicitly religious people, and moral realism at the "some things are just wrong and any sufficiently intelligent system will know it" level is hardly unheard of among atheists.

moral realism at the "some things are just wrong and any sufficiently intelligent system will know it" level is hardly unheard of among atheists.

Really? I mean, sorry for blathering, but I find this extremely surprising. I always considered it a simple fact that if you don't have some kind of religious/faith-based metaphysics operating, you can't be a moral realist. What experiment could you possibly perform to test moral-realist hypotheses, particularly when dealing with nonhumans? It simply doesn't make any sense.

Oh well.

Moral realism makes no more sense with religion. As CS Lewis said: "Nonsense does not cease to be nonsense when we put the words 'God can' before it."

Disagreed, depending on your definition of "morality". A sufficiently totalitarian God can easily not only decide what is moral but force us to find the proper morality morally compelling.

(There is at least one religion that actually believes something along these lines, though I don't follow it.)

Ok, that definition is not nonsense. But in that case, it could happen without God too. Maybe the universe's laws cause people to converge on some morality, either due to the logic of evolutionary cooperation or another principle. It could even be an extra feature of physics that forces this convergence.

Perhaps Eli and you are talking past each other a bit. A certain kind of god would be strong evidence for moral realism, but moral realism wouldn't be strong evidence for a god of any kind.

Well sure, but if you're claiming physics enforces a moral order, you've reinvented non-theistic religion.

I find this extremely surprising.

Why? Beliefs that make no sense are very common. Atheists are no exception.

Actually, if anything, I'd call it the reverse. Religious people know where we're making unevidenced assumptions.

Really? I mean, sorry for blathering, but I find this extremely surprising. I always considered it a simple fact that if you don't have some kind of religious/faith-based metaphysics operating, you can't be a moral realist. What experiment could you possibly perform

That would be epistemology...

to test moral-realist hypotheses, particularly when dealing with nonhumans? It simply doesn't make any sense.

There are rationally acceptable subjects that don't use empiricism, such as maths, and there are subjects such as economics which have a mixed epistemology.

However, if this epistemological-sounding complaint is actually about metaphysics, i.e. "what experiment could you perform to detect a non-natural moral property", the answer is that moral realists have to suppose the existence of a special psychological faculty.

Might I suggest you take a look at the metaethics sequence? This position is explained very well.

Which position? The metaethics sequence isn't clearly realist, or anything else.

Well no, not really. The meta-ethics sequence takes a cognitivist position: there is some cognitive algorithm called ethics, which actual people implement imperfectly but which you could somehow generalize to obtain a "perfect" reification.

That's not moral realism ("morality is a part of the universe itself, external to human beings"), that's objective moral-cognitivism ("morality is a measurable part of us but has no other grounding in external reality").

Can you rule out a form of objective moral-cognitivism that applies to any sufficiently rational and intelligent being?

Unless morality consists of game theory, I can rule out any objective moral cognitivism that applies to any intelligent and/or rational being.

Why shouldn't it consist of optimal rules for achieving certain goals?

Well if you knew what the goals were and could prove that such goals appeal to all intelligent, rational beings, including but not limited to humans, UFAI, Great Cthulhu, and business corporations...

I don't need to do that. We are used to the idea that some people don't find morality appealing, and we have mechanisms such as social disapproval and prisons to get the recalcitrant to play along.

That depends: what are you talking about? I seem to recall you defined the term as something that Eliezer might agree with. If you've risen to the level of clear disagreement, I haven't seen it.

A good refinement of the question is how you think AI could go wrong (that being Eliezer's field) if we reject whatever you're asking about.

You would have the exact failure mode you are already envisaging... clippies and so on. OMC is a way AI would not go wrong. MIRI needs to argue it is unfeasible or unlikely to show that uFAI is likely.

You talk as though religion were something that appeared in people's minds fully formed and without causes, and that the logical fallacies associated with it were then caused by religion.

Hmm. Fair point. "We imagine the universe as we are."

What experiment could you possibly perform to test moral-realist hypotheses, particularly when dealing with nonhumans?

You seem to be confusing atheism with positivism. In particular, the kind of positivism that's self-refuting.

Works at what? Note that it's not a synonym for science, or empiricism, or the scientific method.

The proposition "only propositions that can be empirically tested are meaningful" cannot be empirically tested.

Are these religious people? I mean, come on, where do you get moral realism if not from some kind of moral metaphysics?

From abstract reason or psychological facts, or physical facts, or a mixture.

There is a subject called economics. It tells you how to achieve certain goals, such as maximising GDP. It doesn't do that by corresponding to a metaphysical Economics Object, it does that with a mixture of theoretical reasoning and examination of evidence.

There is a subject called ethics. It tells you how to achieve certain goals, such as maximising happiness....

There is a subject called ethics. It tells you how to achieve certain goals, such as maximising happiness....

Well there's the problem: ethics does not automatically start out with a happiness-utilitarian goal. Lots of extant ethical systems use other terminal goals. For instance...

Sufficient rationality will tell you how to maximize any goal, once you can clearly define the goal.

Rationality is quite helpful for clarifying goals too.

Of course economics doesn't have the well-established laws of physical science: it wouldn't be much of an analogy for ethics if it did. But having an epistemology that doesn't work very well is not the same as having an epistemology that requires non-natural entities.

The main problem with economics is not its descriptive, but its predictive power. Too many of economics' calculations need to suppose that everyone will behave rationally, which regular people can't be trusted to do. Same problem with politics.

So how on Earth can it be difficult to convince people that UFAI is an issue?

Well, there's a couple prongs to that. For one thing, it's tagged as fiction in most people's minds, as might be suggested by the fact that it's easily described in trope. That's bad enough by itself.

Probably more importantly, though, there's a ferocious tendency to anthropomorphize this sort of thing, and you can't really grok UFAI without burning a good bit of that tendency out of your head. Sure, we ourselves aren't capital-F Friendly, but we're a far cry yet from a paperclip maximizer or even most of the subtler failures of machine ethics; a jealous or capricious machine god is bad, but we're talking Screwtape here, not Azathoth. HAL and Agent Smith are the villains of their stories, but they're human in most of the ways that count.

You may also notice that we tend to win fictional robot wars.

Also, note that the tropes tend to work against people who say "we have a systematic proof that our design of AI will be Friendly". In fact, in general the only way a fictional AI will turn out 'friendly' is if it is created entirely by accident - ANY fictional attempt to intentionally create a Friendly AI will result in an abomination, usually through some kind of "dick Genie" interpretation of its Friendliness rules.

Yeah. I think I'd consider that a form of backdoor anthropomorphization by way of vitalism, though. Since we tend to think of physically nonhuman intelligences as cognitively human, and since we tend to think of human ethics and cognition as something sacred and ineffable, fictional attempts to eff them tend to be written as crude morality plays.

Intelligence arising organically from a telephone exchange or an educational game or something doesn't trigger the same taboos.

when someone writes a post recommending global thermonuclear war as a possible option

Looks like you (emphasis mine):

In case of an actual UFAI appearing and beginning a process of paper-clipping the world within a timespan that we can see it coming before it kills us: consider annihilating the planet

and

my davkanik tendencies

You can be a contrarian with less drama perfectly well :-)

Looks like you (emphasis mine):

I would note that "we are all in the process of dying horribly" is actually a pretty dramatic situation. At the moment, actually, I'm not banking on ever seeing it: I think actual AI creation requires so much expertise and faces such extreme feasibility barriers that successfully building a functioning software-embodied optimization process tends to require a group effort, and in a group effort someone thinks hard about what the goal system is.

I would note that "we are all in the process of dying horribly" is actually a pretty dramatic situation.

Given that "we are all in the process of dying" is true for all living beings for as long as living beings existed, I don't see anything dramatic in here. As to "horribly", what is special about today's "horror" compared to, say, a hundred years ago?

I hadn't meant today. I had meant in the case of a UFAI getting loose. That's one of those rare situations where you should consider yourself assuredly dead already and start considering how you're going to kill the damn UFAI, whatever that costs you.

Whereas in the present day, I would not employ "nuke it from orbit; only way to be sure" solutions to, well, anything.

Frankly, when someone writes a post recommending global thermonuclear war as a possible option, that's my line. My suggested courses of action are noticeably less melodramatic and noticeably closer to the plain, boring field of WW3-prevention.

The currently fashionable descriptor is "metacontrarianism" - you might get better responses if you phrase your objection in that way.

(man, I LOVE when things go factorially N-meta)

I'm not actually sure who the metacontrarian is here.

I think there's a decent chance that governments will be the first to build artificial general intelligence (AI).

Have governments historically been good at developing innovative software? Last I heard they were having trouble with CRUD websites. Just sayin'.

I guess it'd probably be better to look at DARPA's track record in particular.

Have governments historically been good at developing innovative software?

Don't forget that ARPA invented the internet, DARPA funds Boston Dynamics, the NSA was (and possibly still is) ahead of everyone else at crypto-tech, etc.

Governments are the richest entities, which lets them hire the smartest people. And they are the most powerful entities, which lets them stop rivals and nationalize private research efforts. They are also the best-informed entities on many subjects.

Governments are the richest entities, which lets them hire the smartest people.

Beyond a certain dollar amount, it seems that smart people typically start caring about other stuff like what they're working towards, how smart their co-workers are, etc. I'd expect that many/most top software engineers would prefer to work at Google doing good for the world making $150K than the US government doing bad for the world making $200K.

This is a good point, but I'm not sure how much of that is driven by "doing good for the world" and how much by "working at Google"; so governments might try to use private contractors. Also, it's not entirely obvious that the average Google project improves the world more than the average government program (that requires top programmers).

Governments don't earn their money through market savvy, so they tend to lack the experience and skill to recognize talent when they see it. Without that, it becomes very difficult to hire the most capable people, even when you have large amounts of money.

Without that, it becomes very difficult to hire the most capable people, even when you have large amounts of money.

A theoretical argument, but does it hold empirically? In my experience, the most capable scientists all work for government organizations.

Hi Heller. How are we determining that they are the most capable? I feel like there are many ways to measure, and I think science includes the type of studying that allows people to create good websites and computer software and hardware, and good music and movies and so on (I think all these things can be researched formally). With that in mind, I'd say that the most capable science is being done by private companies.

Hopefully that makes sense. This isn't intended to muddle the term "scientist."

I think science includes the type of studying that allows people to create good websites and computer software and hardware, and good music and movies and so on

That's so incredibly broad as to be a useless definition of "scientist." Let's use "scientist" to mean someone engaged in basic research oriented around the natural world (as opposed to an engineer involved in more applied research). My categorization isn't perfect, but your grouping puts musicians, actors, programmers, actuaries, engineers, etc. all into an umbrella category of "scientist."

Most fundamental research happens at public institutions under public grants (even the private institutions get massive public subsidy). Also, since basic research is a public good, economic theory would expect private institutions to systematically underinvest in it.

I'm not suggesting that everyone who does those things is a scientist, but that those things CAN be studied scientifically.

For example, not all singers are scientists, but the people who created auto-tune probably did so through scientific research, and, at least in an objective note-matching sense, it makes singers better.

Take weapons systems as an example. Few would claim that the government has been a failure at building nuclear arsenals, conventional-weapons fleets, remote weapons-control systems, etc. Of course, it may do so inefficiently, and the US military may sometimes perform poorly (e.g., Vietnam, Iraq), but on the whole nobody in the world would dare go up against it. The same could be true for a government AGI.

The US military has certainly developed some extremely powerful weapons. But as you said and I agree, we have to understand whether it was done more efficiently or capably than a market would have done it, and I'm not sure if there's a good example to use for weapons development.

Maybe we should look at government space programs compared to private spaceflight?

Yes, companies can sometimes produce technologies at lower cost. But my thinking is that when the technology is as much of a security threat as AGI, governments would use their power to prohibit private development of it (just as governments prevent private selling of advanced military weapons). Combined with the fact that governments are not totally ineffective, this makes it plausible that the first AGI will be built by a government. Of course, governments might not be first, especially if private companies are fast enough to outrun government prohibitions.

when the technology is as much of a security threat as AGI, governments would use their power to prohibit private development of it

This assumes that the government recognizes AGI development as a security threat, which is not a given.

Agreed. It's an interesting question whether we want governments to realize it or not. I lean toward the "yes" side (in general, it seems better when governments understand catastrophic risks), but we should debate the question more before taking action.

This may sometimes be the case, but note that "market savvy" isn't necessary to gain useful experience in recognizing skill in prospective employees. You just need effective feedback mechanisms that tell you whether or not you're doing a good job.

Many government institutions operate in the absence of such feedback mechanisms, but not all.

I think the question in this case is whether feedback mechanisms outside of proper free market forces should be labeled "effective," since many of us consider the accuracy of free market feedback to be light-years beyond that of rough individual human judgement.

(apologies if this is sliding into an inappropriate political discussion)

Free market feedback is generally strong, but often subject to perverse incentives. There are matters I would be more comfortable leaving in the hands of free market than the government, and other matters where I would be much less comfortable seeing them handled by the free market. I think that a "general case" where the balance clearly lies in favor of one or the other is probably mythical.

I've read and seen some really thought-provoking material on ways in which the free market could supposedly do a lot of traditional government roles. There are also sites like judge.me which are testing some of it out, including private contract enforcement and law. So I wouldn't automatically say that government is better at certain things.

What kind of perverse incentives are you concerned with? There is certainly some incentive to do things like using force and deception to get money or resources, but the market also includes a mechanism for punishing this and disincentivizing that type of behavior, and I'd say the same incentive exists in governments.
