LessWrong should offer a short pitch on why AI will be dangerous, and this aims to be that.

Many people think, "Why would humanity make dangerous AI? Seems like a stupid idea. Can't we just make the safe kind?" No. Humanity will make dangerous AI for the same reason we made every other technology dangerous: it's more useful.

A knife sharp enough to cut fruit can cut your finger. Electrical outlets with enough power to run your refrigerator can stop your heart. A car with enough horsepower to carry your family up a hill can easily kill pedestrians. Useful systems must be dangerous because useful implies they can have large effects on their environment. "Large effects" can be good or bad depending on your perspective.

We'll make AI powerful too: knowledgeable enough to cure diseases, tactical enough to outsmart terrorists, and capable enough to run an economy by itself, all so we can relax! That means our AI will also be knowledgeable enough to invent new diseases, tactical enough to outsmart freedom fighters, and responsible enough to run a military-industrial complex all by itself.

We won't do the latter on purpose, any more than we crash cars or cut our fingers on purpose. The problem is that, even when nobody is being malicious, there are more ways for complex systems to go "wrong" than "right", because this universe is psychopathic. The asteroid that killed the dinosaurs wasn't malicious; it just didn't care. Nowhere in the behavior of the asteroid was encoded a consideration of life, or even itself. It just followed mathematical laws (of gravity) which don't care about "morality".

Like the asteroid, AI systems just follow mathematical laws (of intelligence) which don't care about "morality". We'll add safety components that can consider moral questions, but mistakes will be made, as they have been with all previous technologies.

The odds of a mistake go up a lot when you consider that the whole selling point of AI is to be smarter than us, and that entities smarter than you are unpredictable. They will do things you didn't think of, sometimes things you didn't even know were possible. AI presents a unique and unfamiliar danger to humanity, because unlike other technologies, an AI disaster might not wait around for you to come clean it up. That wouldn't be very intelligent, would it?


1. Most posts are long, detailed, and use esoteric language. Not good for first impressions. Entry-level posts should be easier to find, too.



Hi, I'm a skeptic of the Alignment Problem, meaning, I don't expect TAI will be dangerously misaligned. I'll go over some of the reasons why I'm skeptical, and I'll suggest how you could persuade me to take the problem more seriously.

Firstly, I definitely agree with the "insurance" argument. Even if I think that TAI is unlikely to be misaligned in a disastrous way, there is still the possibility of a rogue AI causing disaster, and I 100% agree that it is worth investing in solving the Alignment Problem. So I DO NOT think that alignment research should be dismissed, ignored or defunded. In fact I would agree with most people on this forum that alignment research is severely underfunded and not taken seriously enough.

I also agree that misalignment is possible. I agree with the Orthogonality Thesis. Someone could build a paperclip maximizer (for example) if they for some reason wanted to. My opinion is that institutional actors (corporate and military research institutes) will not deploy misaligned AI. They will be aware that their AI is misaligned and will not deploy it until it is aligned, even if that takes a long time, because it simply is not in their self-interest to deploy an AI which is not useful to them. Also, if misaligned AI is deployed, the amount of damage that it could do is severely limited.

Corporate actors will not deploy an AI if it is likely to result in financial, legal or public relations disaster for them. This constraint does not apply to military actors, so if world war breaks out or if the Chinese or US Military-Industrial Complex takes the lead in AI away from the current leaders, US corporate research, that could be disastrous and is a scenario I am very afraid of. However, my primary fear is not of the "Technical" Alignment Problem but of the "Political" Alignment Problem. Allow me to explain.

Intelligence is commonly defined as "the ability to achieve goals" (or something like that). When we talk about AI-induced disaster, the AI is usually imagined to be extremely powerful (in at least one dimension). In other words, the fear is not of weak AI but of Artificial Super Intelligence, and "super" is a way of saying "the AI is super powerful and therefore dangerous." When discussing how disaster could happen, the story often goes "the AI invents grey goo nanotechnology/designs a lethal pathogen/discovers a lower vacuum energy and destroys the world."

What's never explained is how AI could suddenly become so powerful that it can achieve what entire nations of extremely smart people could never do. China, the Soviet Union, Nazi Germany or the US (at our worst) were never able to existentially destroy or totally dominate their enemies, despite extreme efforts to do so. Why would an AI be able to do so, and why would you expect it to happen by accident the moment AI is let out of the box?

A much scarier and much more plausible scenario involves conventional military hardware and police state tactics scaled up to an extreme. Suppose an evil dictator gets control of AGI and builds a fleet of billions of autonomous quadcopters (drones) armed with ordinary guns. He uses the drones to spy on his people and suppress any attempt at resistance. If this dictator is not stopped by a competing military force, either because said dictator rules the sole superpower/one world government, or because the competing superpowers are also evil dictatorships, then this dictator would be able to oppress everyone and could not be opposed, possibly for the rest of eternity.

The dictator could be an ASI, but the dictator could also be a human being using a sufficiently aligned AI. Either way, we have a serious problem. Not the Technical Alignment Problem, but the Political Alignment Problem. Westerners used to believe that there was some sort of force of nature bending the moral arc towards justice, freedom and democracy as society progressed. IMHO this trend towards better society worked until the year 2001, when 9/11 caused the US to turn backwards (starting with the Patriot Act) and then more recently the rest of the world started to revert towards autocracy for some reason.

The problem is making the future safe and free for the common person. For all of human history, dictators were limited in how cruel and oppressive they could be. If the dictator was so awful that even his rank-and-file police and military would not support him, he would be toppled eventually. AI potentially takes that constraint away. An AI that is infinitely subservient (totally aligned) will follow the dictator's orders no matter how evil he is.

The Political Alignment Problem is probably not solvable -- we can only hope that free people prevail over autocracy. But enough politics, let's go back to the Technical Alignment Problem. Why don't I think it's a concern? Simply put, I don't think early AIs will be so powerful. Our world is built on millennia of legal, economic, technological and military safeguards to prevent individual actors (with the exception of autocrats) from doing too much damage. To my knowledge, the worst damage that any non-state individual actor could do was limited to about 16,000 deaths (upper estimate). Society is designed to limit the damage that bad actors and accidents can do, and there is no reason to believe that rogue AI will be dramatically more powerful than corporations or terrorists.

In summary, my objection is approximately the old response of "why wouldn't we just unplug the damn thing" with the added point that we would be willing to use police or if necessary military force if the AI resists. To be convinced otherwise, I would need to understand why early AIs would become so much more powerful than corporations, terrorists or nation-states.

> My opinion is that institutional actors (corporate and military research institutes) will not deploy misaligned AI. They will be aware that their AI is misaligned and will not deploy it until it is aligned,

Why do you think that the creators of AI will know if it's misaligned? If it's anything like current AI research, we are talking about people applying a large amount of compute to an algorithm they have fairly limited understanding of. Once you have the AI, you can try running tests on it, but if the AI realises it's being tested, it will try to trick you, acting nice in testing and only turning on you once deployed. You probably won't know if the AI is aligned without a lot of theoretical results that don't yet exist. And the AI's behaviour is likely to be actively misleading.

Why do you think AI can't cause harm until it is "deployed"? The AI is running on a computer in a research lab. The security of this computer may be anywhere from pretty good to totally absent. The AI's level of hacking skills is also unknown. If the AI is very smart, it's likely that it can hack its way out and get all over the internet with no one ever deciding to release it.

> What's never explained is how AI could suddenly become so powerful that it can achieve what entire nations of extremely smart people could never do.

How could a machine move that big rock when all the strongest people in the tribe have tried and failed? The difference between "very smart human" and the theoretical limits of intelligence may well be like the difference between "very fast cheetah" and the speed of light.

"why wouldn't we just unplug the damn thing"

Stopping WW2 is easy: the enemy needs air, right? So just don't give them any air and they will be dead in seconds.

Reasons unplugging an AI might not be the magic solution:

  1. You don't know it's misaligned. It's acting nicely so far. You don't realize it's plotting something.
  2. It's all over the internet. Millions of computers all over the world, including some on satellites.
  3. It's good at lying and manipulating humans. Maybe someone is making a lot of money from the AI. Maybe the AI hacked a bank and hired security guards for its datacenter. Maybe some random gullible person has been persuaded to run the AI on their gaming PC with a cute face and a sob story. If anyone in the world could download the AI's code off the internet and run it and get superhuman advice, many people would. Almost all our communication is digital, so good luck convincing people that the AI needs to be destroyed when the internet is full of very persuasive pro-AI arguments.
  4. It's developed its own solar-powered nanobot hardware.

Turning the AI off works a fraction of a second after the AI is turned on. But this is useless; no one would turn an AI on and then immediately turn it off again. The person turning an unaligned AI on is likely mistaken in some way about what their AI will do. The AI will make sure not to correct that flawed conception until it's too late.

I want to add that the AI probably does not know it is misaligned for a while.

I don't know how to simply communicate to you the concept of things beyond the human level. Helpfully, Eliezer wrote a short story to communicate it, and I'll link to that.

What do you think would happen if we built an intelligence of that relative speed-up to humanity, and connected it to the internet?

> I would need to understand why early AIs would become so much more powerful than corporations, terrorists or nation-states

One argument I removed to make it shorter was approximately: "It doesn't have to take over the world to cause you harm". And since early misaligned AI is more likely to appear in a developed country, your odds of being harmed by it are higher compared to someone in an undeveloped country. If ISIS suddenly found itself 500 strong in Silicon Valley and in control of Google's servers, surely you would have the right to be concerned before they had a good chance of taking over the whole world. And you'd be doubly worried if you did not understand how it went from 0 to 500 "strong", or what the next increase in strength might be. You understand how nation states and terrorist organizations grow. I don't think anyone currently understands, well, how AI grows in intelligence.

There were a million other arguments I wanted to "head off" in this post, but the whole point of introductory material is to be short.

> there is no reason to believe that rogue AI will be dramatically more powerful than corporations or terrorists

I don't think that's true. If our AI ends up no more powerful than existing corporations or terrorists, why are we spending billions on it? It had better be more powerful than something. I agree alignment might not be "solvable" for the reasons you mention, and I don't claim that it is. 

I am specifically claiming AI will be unusually dangerous, though.


I think that you raise a crucial point. I find it challenging to explain to people that AI is likely very dangerous. It's much easier to explain that pandemics, nuclear wars or environmental crises are dangerous. I think this is mainly due to the abstractness of AI and the concreteness of those other dangers, leading to availability bias.

The most common counterarguments I've heard from people about why AI isn't a serious risk are:

  • AI is impossible, and it is just "mechanical" and lacks some magical properties only humans have.
  • When we build AIs, we will not embed them with negative human traits such as hate, anger and vengeance.
  • Technology has been the most significant driver of improvements in human well-being, and there's no solid evidence that this will change.

I have found that comparing the relationship between humans and chimpanzees with the one between hypothetical AIs and humans is a good explanation that people find compelling. There's plenty of evidence suggesting that chimpanzees are pretty intelligent, but they are just not nearly intelligent enough to influence human decision making. This has resulted in chimps spending their lives in captivity across the globe.

Another good explanation is based on insurance. The probability that your house will be destroyed is small, but it's still prudent to buy home insurance. Suppose you believe that the likelihood that AI will be dangerous is small. Is it not wise that we insure ourselves by dedicating resources towards the development of safe AI?

As another short argument: We don't need an argument for why AI is dangerous, because dangerous is the default state of powerful things. There needs to be a reason AI would be safe.

Nice! I think it would be helpful to add a bunch of links at the end along the lines of "Here are some Respected Thinkers You Respect who agree that AI Risk is a Serious Possibility" and "Here are some lengthier, more rigorous pitches and overviews for you to read if you want to engage more with the basic ideas and arguments sketched here."

I've been thinking of a pitch that starts along these lines:

"You know how you kind of hate Facebook but can't stop using it?" 

I feel like most people I know intuitively understand that.  

I'm still playing with the second half of the pitch.  Something like, "What if it were 200x better at keeping you using it and didn't care that you hate it?" or "What if it had nukes?"

> unlike other technologies, an AI disaster might not wait around for you to come clean it up

I think this piece is extremely important, and I would have put it in a more central place. The whole "instrumental goal preservation" argument makes AI risk very different from the knife/electricity/car analogies. It means that you only get one shot, and can't rely on iterative engineering. Without that piece, the argument is effectively (but not exactly) considering only low-stakes alignment.

In fact, I think if we get rid of this piece of the alignment problem, basically all of the difficulty goes away. If you can always try again after something goes wrong, then if a solution exists you will always find it eventually. 

This piece seems like much of what makes the difference between "AI could potentially cause harm" and "AI could potentially be the most important problem in the world". And I think even the most bullish techno-optimist probably won't deny the former claim if you press them on it.

Might follow this up with a post?

I think that the central argument for AI risk has the following structure and could be formulated without LW-slang:

  1. AI can have unexpected capability gains which will make it significantly above the human level.
  2. Such AI can create dangerous weapons (likely, nanobots).
  3. AI will control these weapons.
  4. This AI will likely have a dangerous goal system and will not care about human wellbeing.

It doesn't have LW slang, but it uses words that are difficult for the average person to understand.

Depending on who you are talking to, for-profit corporations are a good analogy for what is meant by "misaligned". You can then point out that those same organizations are likely to make AI with profit maximization in mind, and might skimp on moral restraint in favor of being superhumanly good at PR.

Use that comparison with the wrong person and they'll call you a communist.