TL;DR—We’re distributing $20k in total as prizes for submissions that make effective arguments for the importance of AI safety. The goal is to generate short-form content for outreach to policymakers, management at tech companies, and ML researchers. This competition will be followed by another competition in around a month that focuses on long-form content.
This competition is for short-form arguments for the importance of AI safety. For the competition for distillations of posts, papers, and research agendas, see the Distillation Contest.
Objectives of the arguments
To mitigate AI risk, it’s essential that we convince relevant stakeholders sooner rather than later. To this end, we are initiating a pair of competitions to build effective arguments for a range of audiences. In particular, our audiences include policymakers, tech executives, and ML researchers.
- Policymakers may be unfamiliar with the latest advances in machine learning, and may not have the technical background necessary to understand some/most of the details. Instead, they may focus on societal implications of AI as well as which policies are useful.
- Tech executives are likely aware of the latest technology, but may lack a mechanistic understanding of it. They may come from technical backgrounds and are likely highly educated. They will likely be reading with an eye towards how these arguments concretely affect which projects they fund and who they hire.
- Machine learning researchers can be assumed to have high familiarity with the state of the art in deep learning. They may have previously encountered talk of x-risk but were not compelled to act. They may want to know how the arguments could affect what they should be researching.
We’d like arguments to be written for at least one of the three audiences listed above. Some arguments could speak to multiple audiences, but we expect that trying to speak to all at once could be difficult. After the competition ends, we will test arguments with each audience and collect feedback. We’ll also compile top submissions into a public repository for the benefit of the x-risk community.
Note that we are not interested in arguments for very specific technical strategies towards safety. We are simply looking for sound arguments that AI risk is real and important.
Competition details
The present competition addresses shorter arguments (paragraphs and one-liners) with a total prize pool of $20K. The prizes will be split among, roughly, 20-40 winning submissions. Please feel free to make numerous submissions and try your hand at motivating various different risk factors; it's possible that an individual with multiple great submissions could win a good fraction of the prize. The prize distribution will be determined by effectiveness and epistemic soundness as judged by us. Arguments must not be misleading.
To submit an entry:
- Please leave a comment on this post (or submit a response to this form), including:
  - The original source, if not original.
  - If the entry contains factual claims, a source for the factual claims.
  - The intended audience(s) (one or more of the audiences listed above).
- In addition, feel free to adapt another user’s comment by leaving a reply—prizes will be awarded based on the significance and novelty of the adaptation.
Note that if two entries are extremely similar, we will, by default, give credit to the entry which was posted earlier. Please do not submit multiple entries in one comment; if you want to submit multiple entries, make multiple comments.
The first competition will run until May 27th, 11:59 PT. In around a month, we’ll release a second competition for generating longer “AI risk executive summaries” (more details to come). If you win an award, we will contact you via your forum account or email.
Paragraphs
We are soliciting argumentative paragraphs (of any length) that build intuitive and compelling explanations of AI existential risk.
- Paragraphs could cover various hazards and failure modes, such as weaponized AI, loss of autonomy and enfeeblement, objective misspecification, value lock-in, emergent goals, power-seeking AI, and so on.
- Paragraphs could make points about the philosophical or moral nature of x-risk.
- Paragraphs could be counterarguments to common misconceptions.
- Paragraphs could use analogies, imagery, or inductive examples.
- Paragraphs could contain quotes from intellectuals: “If we continue to accumulate only power and not wisdom, we will surely destroy ourselves” (Carl Sagan), etc.
For a collection of existing paragraphs that submissions should try to do better than, see here.
Paragraphs need not be wholly original. If a paragraph was written by or adapted from somebody else, you must cite the original source. We may provide a prize to the original author as well as the person who brought it to our attention.
One-liners
Effective one-liners are statements (25 words or fewer) that make memorable, “resounding” points about safety. Here are some (unrefined) examples just to give an idea:
- Vladimir Putin said that whoever leads in AI development will become “the ruler of the world.” (source for quote)
- Inventing machines that are smarter than us is playing with fire.
- Intelligence is power: we have total control of the fate of gorillas, not because we are stronger but because we are smarter. (based on Russell)
One-liners need not be full sentences; they might be evocative phrases or slogans. As with paragraphs, they can be arguments about the nature of x-risk or counterarguments to misconceptions. They do not need to be novel as long as you cite the original source.
Conditions of the prizes
If you accept a prize, you consent to the addition of your submission to the public domain. We expect that top paragraphs and one-liners will be collected into executive summaries in the future. After some experimentation with target audiences, the arguments will be used for various outreach projects.
(We thank the Future Fund regrant program and Yo Shavit and Mantas Mazeika for earlier discussions.)
In short, make a submission by leaving a comment with a paragraph or one-liner. Feel free to enter multiple submissions. In around a month we'll divide $20K among the best submissions.
I'd like to complain that this project sounds epistemically absolutely awful. It's offering money for arguments explicitly optimized to be convincing (rather than true), it offers prizes only for arguments making one particular side of the case (i.e. no money for arguments that AI risk is no big deal), and to top it off it's explicitly asking for one-liners.
I understand that it is plausibly worth doing regardless, but man, it feels so wrong having this on LessWrong.
If the world is literally ending, and political persuasion seems on the critical path to preventing that, and rationality-based political persuasion has thus far failed while the empirical track record of persuasion for its own sake is far superior, and most of the people most familiar with articulating AI risk arguments are on LW/AF, is it not the rational thing to do to post this here?
I understand wanting to uphold community norms, but this strikes me as in a separate category from “posts on the details of AI risk”. I don’t see why this can’t also be permitted.
TBC, I'm not saying the contest shouldn't be posted here. When something with downsides is nonetheless worthwhile, complaining about it but then going ahead with it is often the right response - we want there to be enough mild stigma against this sort of thing that people don't do it lightly, but we still want people to do it if it's really clearly worthwhile. Thus my kvetching.
(In this case, I'm not sure it is worthwhile, compared to some not-too-much-harder alternative. Specifically, it's plausible to me that the framing of this contest could be changed to not have such terrible epistemics while still preserving the core value - i.e. make it about fast, memorable communication rather than persuasion. But I'm definitely not close to 100% sure that would capture most of the value.
Fortunately, the general policy of imposing a complaint-tax on really bad epistemics does not require me to accurately judge the overall value of the proposal.)
No, it's just the standard frontpage policy:
Technically the contest is asking for attempts to persuade not explain, rather than itself attempting to persuade not explain, but the principle obviously applies.
As with my own comment, I don't think keeping the post off the frontpage is meant to be a judgement that the contest is net-negative in value; it may still be very net positive. It makes sense to have standard rules which create downsides for bad epistemics, and if some bad epistemics are worthwhile anyway, then people can pay the price of those downsides and move forward.
Raemon and I discussed whether it should be frontpage this morning. Prizes are kind of an edge case in my mind. They don't properly fulfill the frontpage criteria but also it feels like they deserve visibility in a way that posts on niche topics don't, so we've more than once made an exception for them.
I didn't think too hard about the epistemics of the post when I made the decision to frontpage, but after John pointed out the suss epistemics, I'm inclined to agree, and concurred with Raemon moving it back to Personal.
----
I think the prize could be improved simply by rewarding the best arguments in favor and against AI risk. This might actually be more convincing to the skeptics – we paid people to argue against this position and now you can see the best they came up with.
That idea seems reasonable at first glance, but upon reflection, I think it's a really bad idea. It's one thing to run a red-teaming competition, it's another to spend money building rhetorically optimised tools for the other side. If we do that, then maybe there was no point running the competition in the first place as it might all cancel out.
Wait, the goal here, at least, isn't to produce truth, it is to disseminate it. Counter-arguments are great, but this isn't about debating the question, it's about communicating a conclusion well.
You didn't refute his argument at all, you just said that other movements do the same thing. Isn't the entire point of rationality that we're meant to be truth-focused, and winning-focused, in ways that don't manipulate others? Are we not meant to hold ourselves to the standard of "Aim to explain, not persuade"? Just because others in the reference class of "movements" do something doesn't mean it's immediately something we should replicate! Is that not the obvious, immediate response? Your comment proves too much; it could be used to argue for literally any popular behavior of movements, including canceling/exiling dissidents.
Do I think that this specific contest is non-trivially harmful at the margin? Probably not. I am, however, worried about the general attitude behind some of this type of recruitment, and the justifications used to defend it. I become really fucking worried when someone raises an entirely valid objection, and is met with "It's only natural; most other movements do this".
It's unreasonable to look at AI and think "in order to make it more predictable, more controllable, we just have to make it smarter". When individual humans become smarter, they become much less predictable. And the human mind is built to predict the behavior of other humans.
Random side note: GPT-3 seems to be able to generate decent one liners.
Generate one-liners describing the dangers of AI. An example is "Inventing machines that are smarter than us is playing with fire."
1. Machines that are smarter than us could easily become our masters.
2. If we're not careful, AI could spell the end of humanity as we know it.
3. AI could be used to create weapons of mass destruction that could devastate the planet.
4. AI could be used to create powerful robots that could enslave humans.
5. AI could be used to create artificial intelligence that could turn against its creators
6. AI usher in a new era of cyber-warfare that could cripple society
7. AI could create self-replicating robots that could eventually consume all resources on Earth
8. AI could be used to create powerful AIs that could eventually surpass human intelligence and take over the world
9. AI technology could eventually be used to create a global surveillance state where everyone is constantly watched and monitored
First two paragraphs of https://astralcodexten.substack.com/p/book-review-a-clinical-introduction seem to fit the bill.
-- Stuart Russell on a February 25, 2021 podcast with the Future of Life Institu... (read more)
Imagine (an organisation like) the catholic church, but immortal, never changing, highly competent and relentlessly focused on its goals - it could control the fate of humanity for millions of years.
(a)
Look, we already have superhuman intelligences. We call them corporations and while they put out a lot of good stuff, we're not wild about the effects they have on the world. We tell corporations 'hey do what human shareholders want' and the monkey's paw curls and this is what we get.
Anyway yeah that but a thousand times faster, that's what I'm nervous about.
(b)
Look, we already have superhuman intelligences. We call them governments and while they put out a lot of good stuff, we're not wild about the effects they have on the world. We tell gov... (read more)
(To Policymakers and Machine Learning Researchers)
Building a nuclear weapon is hard. Even if one manages to steal the government's top secret plans, one still needs to find a way to get uranium out of the ground, find a way to enrich it, and attach it to a missile. On the other hand, building an AI is easy. With scientific papers and open source tools, researchers are doing their utmost to disseminate their work.
It's pretty hard to hide a uranium mine. Downloading TensorFlow takes one line of code. As AI becomes more powerful and more dangerous, greater efforts need to be taken to ensure malicious actors don't blow up the world.
Any arguments for AI safety should be accompanied by images from DALL-E 2.
One of the key factors that makes AI safety such a low-priority topic is a complete lack of urgency. Dangerous AI seems like a science-fiction element that's always a century away, and we can fight against this perception by demonstrating the potential and growth of AI capability.
No demonstration of AI capability has the same immediate visceral power as DALL-E 2.
In longer-form arguments, urgency could also be demonstrated through GPT-3's prompts, but DALL-E 2 is better, especially ... (read more)
Neither we humans nor the flower see anything that looks like a bee. But when a bee looks at it, it sees another bee, and it is tricked into pollinating that flower. The flower did not know any of this; its petals randomly changed shape over millions of years, and eventually one of those random shapes started tricking bees and outperforming all of the other flowers.
Today's AI already does this. If AI begins to approach human intelligence, there's no limit to the number of ways things can go horribly wrong.
If AI approaches and reaches human-level intelligence, it will probably pass that level just as quickly as it arrived at that level.
[ML researchers]
Given that we can't agree on whether a hotdog is a sandwich or not...We should probably start thinking about how to tell a computer what is right and wrong.
[Insert call to action on support / funding for AI governance / regulation etc.]
-
Given that we can't agree on whether a straw has two holes or one...We should probably start thinking about how to explain good and evil to a computer.
[Insert call to action on support / funding for AI governance / regulation etc.]
(I could imagine a series riffing based on this structure / theme)
I will post my submissions as individual replies to this comment. Please let me know if there are any issues with that.
"Most AI research focuses on building machines that do what we say. Alignment research is about building machines that do what we want."
Source: Me, probably heavily inspired by "Human Compatible" and that type of argument. I used this argument in conversations to explain AI Alignment for a while, and I don't remember when I started. But the argument is very CIRL (cooperative inverse reinforcement learning).
I'm not sure if this works as a one-liner explanation. But it does work as a conversation starter about why trying to specify goals directly is a bad idea. And ho... (read more)
Question: "effective arguments for the importance of AI safety" - is this about arguments for the importance of just technical AI safety, or more general AI safety, to include governance and similar things?
It's not a question of "if" we build something smarter than us, it's a question of "when". Progress in that direction has been constant, for more than a decade now, and recently it has been faster than ever before.
"AI cheats. We've seen hundreds of unique instances of this. It finds loopholes and exploits them, just like us, only faster. The scary thing is that, every year now, AI becomes more aware of its surroundings, behaving less like a computer program and more like a human that thinks but does not feel"
To Policymakers: "Just think of the way in which we humans have acted towards animals, and how animals act towards lesser animals; now think of how a powerful AI with superior intellect might act towards us, unless we create them in such a way that they will treat us well, and even help us."
Source: Me
[Policy makers & ML researchers]
"There isn’t any spark of compassion that automatically imbues computers with respect for other sentients once they cross a certain capability threshold. If you want compassion, you have to program it in" (Nate Soares). Given that we can't agree on whether a straw has two holes or one...We should probably start thinking about how to program compassion into a computer.
[Policy makers]
A couple of years ago there was an AI trained to beat Tetris. Artificial intelligences are very good at learning video games, so it didn't take long for it to master the game. Soon it was playing so quickly that the game was speeding up to the point it was impossible to win and blocks were slowly stacking up, but before it could be forced to place the last piece, it paused the game.
As long as the game didn't continue, it could never lose.
When we ask AI to do something, like play Tetris, we have a lot of assumptions about how it can or ... (read more)
Here's my submission, it might work better as bullet points on a page.
AI will transform human societies over the next 10-20 years. Its impact will be comparable to electricity or nuclear weapons. As electricity did, AI could improve the world dramatically; or, like nuclear weapons, it could end it forever. Like inequality, climate change, nuclear weapons, or engineered pandemics, AI Existential Risk is a wicked problem. It calls upon every policymaker to become a statesperson: to rise above the short-term, narrow inte... (read more)
Flowers evolved to trick insects into spreading their pollen, not to feed the insects. AI also evolves; it doesn't know, it just does whatever seems to gain approval.
For policymakers
Remember all the scary stuff the engineers said a terrorist could think to do? Someone could write a computer program to do all of it, just at random.
What about graphics? e.g. https://twitter.com/DavidSKrueger/status/1520782213175992320
“The smartest ones are the most criminally capable.” [·]
[Policy makers & ML researchers]
“AI doesn’t have to be evil to destroy humanity – if AI has a goal and humanity just happens to come in the way, it will destroy humanity as a matter of course without even thinking about it, no hard feelings” (Elon Musk).
[Insert call to action]
"As AI gradually becomes more capable of modelling and understanding its surroundings, the risks associated with glitches and unpredictable behavior will grow. If artificial intelligence continues to expand exponentially, then these risks will grow exponentially as well, and the risks might even grow exponentially shortly after appearing"
“Aligning AI is the last job we need to do. Let’s make sure we do it right.”
(I’m not sure which target audience my submissions are best targeted towards. I’m hoping that the judges can make that call for me.)
Artificial intelligence, real impacts. (Policymakers)
AI: it’s not “artificial” anymore. (Policymakers)
Artificial intelligence is no longer fictional. (Policymakers)
[Policy makers & ML researchers]
Expecting AI to automatically care about humanity is like expecting a man to automatically care about a rock. Just as the man only cares about the rock insofar as it can help him achieve his goals, the AI only cares about humanity insofar as it can help it achieve its goals. If we want an AI to care about humanity, we must program it to do so. AI safety is about making sure we get this programming right. We may only get one chance.
[Policy makers & ML researchers]
Our goal is human flourishing. AI’s job is to stop at nothing to accomplish its understanding of our goal. AI safety is about making sure we’re really good at explaining ourselves.
[Policy makers & ML researchers]
AI safety is about developing an AI that understands not what we say, but what we mean. And it’s about doing so without relying on the things that we take for granted in inter-human communication: shared evolutionary history, shared experiences, and shared values. If we fail, a powerful AI could decide to maximize the number of people that see an ad by ensuring that ad is all that people see. AI could decide to reduce deaths by reducing births. AI could decide to end world hunger by ending the world.
(The first line is a slightly tweaked version of a different post by Linda Linsefors, so credit to her for that part.)
Imagine a turtle trying to outsmart us. It could never happen. AI Safety is about what happens when we become the turtles.
I was tempted not to post it because it seems too similar to the gorilla example, but I eventually decided, "eh, why not?" Also, there's a possibility that I somehow stole this from somewhere and forgot about it. Sorry if that's the case.
Most humans would (and do) seek power and resources in a way that is bad for other systems that happen to be in the way (e.g., rainforests). When we colloquially talk about AIs "destroying the world" by default, it's a very self-centered summary: the world isn't actually "destroyed", just radically transformed in a way that doesn't end with any of the existing humans being alive, much like how our civilization transforms the Earth in ways that cut down existing forests.
Comment by Zach_M_Davis here
AI doesn't know or care if it takes away someone's job. It doesn't care what we do in response to its capabilities. It simply performs the task, with zero regard for any consequences outside its sphere of comprehension. It is a set of gears, and they turn.
For policymakers
Optional sentence at the end: And every year, the newest AI behaves less like an object and more like something that can have its own thoughts.
We have no idea what the pace of AI advancement will be 10 years from now. Everyone who has tried to predict the pace of AI advancement has turned out to be wrong. You don't know how easy something is to invent until after it is invented.
What we do know is that we will eventually reach generally intelligent AI, which is AI that can invent new technology as well as a human can. That is the finish line for human innovation, because afterwards AI will be the only thing necessary to build the next generation of even smarter AI systems. If these successive AI systems remain controllable after that point, there will be no limit to what the human race will be capable of.
It is a fundamental law of thought that thinking things will cut corners, misinterpret instructions, and cheat.
Innovation is the last thing we will need to automate: the finish line for innovation is building a machine that can innovate as well as a human can. When humans build such a machine, one that is better at innovating than the humans who built it, then from that point on it will be able to independently build much smarter iterations of itself.
But it will be just as likely to cut corners and cheat as every human and AI we have seen so far, because that is what humans and AI have always done. It is a fundamental law of thought that thinking things cut corners and cheat.
It is clear that once AI is better than humans at inventing things, we will have made the final and most important invention in human history. That is the "finish line" for human innovation and human thought; we will have created a machine that can automate any task for us, including the task of automating new tasks. However, for the last decade, many AI experts have been saying that it will take a really long time before AI is advanced enough to independently make itself smarter.
The last two years of increasingly rapid AI development have called that into... (read more)
One liner: Don't build a god that also wants to kill you.
This image counts as a submission, the first half is just a reference point though and it is not intended to be a meme. Obviously I don't want the actual image to be shown to policymakers, especially because it has a politician in it and it quotes him on something that he obviously never said.

I just really think that we can achieve a lot with a single powerpoint slide that only says "AI Cheats" in gigantic times new roman font
(one liner - for policy makers)
Within ten years, AI systems could be more dangerous than nuclear weapons. The research required for this technology is heavily funded and virtually unregulated.
I will, as Yitz did, post my submissions as individual replies to this comment. Please let me know if there are any issues with that.
AI is essentially a statistician's superhero outfit. As with all superheroes, there is a significant amount of collateral damage, limited benefit, and an avoidance of engaging with the root causes of problems.
- Rachel Ganz, posted here on her behalf.
"Past technological revolutions usually did not telegraph themselves to people alive at the time, whatever was said afterward in hindsight"
Eliezer Yudkowsky, “Artificial Intelligence as a Positive and Negative Factor in Global Risk”, around 2006
[Policy makers & ML researchers]
Expecting AI to know what is best for humans is like expecting your microwave to know how to ride a bike.
[Insert call to action]
-
Expecting AI to want what is best for humans is like expecting your calculator to have a preference for jazz.
[Insert call to action]
(I could imagine a series riffing based on this structure / theme)
"AI may make [seem to make a] sharp jump in intelligence purely as the result of anthropomorphism, the human tendency to think of «village idiot» and «Einstein» as the extreme ends of the intelligence scale, instead of nearly indistinguishable points on the scale of minds-in-general. Everything dumber than a dumb human may appear to us as simply «dumb». One imagines the «AI arrow» creeping steadily up the scale of intelligence, moving past mice and chimpanzees, with AIs still remaining «dumb» because AIs can’t speak fluent language or write science pa... (read more)
Imagine playing your first ever chess game against a grandmaster. That's what fighting against a malicious AGI would be like.
Donald Knuth said, "Premature optimization is the root of all evil." AIs are built to be hardline optimizers.
Source: Structured Programming with go to Statements by Donald Knuth
"There are also other reasons why an AI might show a sudden huge leap in intelligence. The species Homo sapiens showed a sharp jump in the effectiveness of intelligence, as the result of natural selection exerting a more-or-less steady optimization pressure on hominids for millions of years, gradually expanding the brain and prefrontal cortex, tweaking the software architecture. A few tens of thousands of years ago, hominid intelligence crossed some key threshold and made a huge leap in real-world effectiveness; we went from caves to skyscrapers in the blink of an evolutionary eye"
Eliezer Yudkowsky, “Artificial Intelligence as a Positive and Negative Factor in Global Risk”, around 2006
"The key implication for our purposes is that an AI might make a huge jump in intelligence after reaching some threshold of criticality.
In 1933, Lord Ernest Rutherford said that no one could ever expect to derive power from splitting the atom: «Anyone who looked for a source of power in the transformation of atoms was talking moonshine.» At that time laborious hours and weeks were required to fission a handful of nuclei."
Eliezer Yudkowsky, “Artificial Intelligence as a Positive and Negative Factor in Global Risk”, around 2006
"One of the most critical points about Artificial Intelligence is that an Artificial Intelligence might increase in intelligence extremely fast. The obvious reason to suspect this possibility is recursive self-improvement. (Good 1965.) The AI becomes smarter, including becoming smarter at the task of writing the internal cognitive functions of an AI, so the AI can rewrite its existing cognitive functions to work even better, which makes the AI still smarter, including smarter at the task of rewriting itself, so that it makes yet more improvements... (read more)
"If a really smart AI and powerful AI is told to maximize humanity's happiness, fulfillment, and/or satisfaction, it will require us to specify that it must not do so by
- wiring car batteries to the brain's pleasure centers
- using heroin/cocaine/etc.

Even if we specify that particular stipulation, it'll probably think of another loophole or another way to cheat and boost the numbers higher than they're supposed to go. If it's smarter than a human, then all it takes is one glitch"
This is not for policymakers, as many of them are probably on cocaine.
"The folly of programming an AI to implement communism, or any other political system, is that you’re programming means instead of ends. You're programming in a fixed decision, without that decision being re-evaluable after acquiring improved empirical knowledge about the results of communism. You are giving the AI a fixed decision without telling the AI how to re-evaluate, at a higher level of intelligence, the fallible process which produced that decision."
Eliezer Yudkowsky, “Artificial Intelligence as a Positive and Negative Factor in Global Risk”, around 2006
It makes sense that a disproportionately large proportion of the best paragraphs would come from a single goldmine. I imagine that The Precipice would be even better.
Proving a computer chip correct [in 2006] require[d] a synergy of human intelligence and computer algorithms, as currently [around 2006] neither suffices on its own. Perhaps a true [AGI] could use a similar combination of abilities when modifying its own code — would have both the capability to invent large designs without being defeated by exponential explosion, and also the ability to verify its steps with extreme reliability. That is one way a true AI might remain knowably stable in its goals, even ... (read more)
"One common reaction I encounter is for people to immediately declare that Friendly AI is an impossibility, because any sufficiently powerful AI will be able to modify its own source code to break any constraints placed upon it.
The first flaw you should notice is a Giant Cheesecake Fallacy. Any AI with free access to its own source would, in principle, possess the ability to modify its own source code in a way that changed the AI’s optimization target. This does not imply the AI has the motive to change its own motives. I would not knowingly... (read more)
"Wishful thinking adds detail, constrains prediction, and thereby creates a burden of improbability. What of the civil engineer who hopes a bridge won’t fall?"
Optional extra:
"Should the engineer argue that bridges in general are not likely to fall? But Nature itself does not rationalize reasons why bridges should not fall. Rather, the civil engineer overcomes the burden of improbability through specific choice guided by specific understanding"
- Eliezer Yudkowsky, "Artificial Intelligence as a Positive and Negative Factor in Global Risk", c. 2006
"The temptation is to ask what 'AIs' will 'want', forgetting that the space of minds-in-general is much wider than the tiny human dot."
Optional paragraph form:
"The critical challenge is not to predict that 'AIs' will attack humanity with marching robot armies, or alternatively invent a cure for cancer. The task is not even to make the prediction for an arbitrary individual AI design. Rather, the task [for humanity to accomplish] is choosing into existence some particular powerful optimization process whose beneficial effects can leg…
"Artificial Intelligence is not an amazing shiny expensive gadget to advertise in the latest tech magazines. Artificial Intelligence does not belong in the same graph that shows progress in medicine, manufacturing, and energy. Artificial Intelligence is not something you can casually mix into a lumpenfuturistic scenario of skyscrapers and flying cars and nanotechnological red blood cells that let you hold your breath for eight hours. Sufficiently tall skyscrapers don’t potentially start doing their own engineering. Humanity did not rise to prominence on Earth by holding its breath longer than other species."
- Eliezer Yudkowsky, "Artificial Intelligence as a Positive and Negative Factor in Global Risk", c. 2006
"If the word 'intelligence' evokes Einstein instead of humans, then it may sound sensible to say that intelligence is no match for a gun, as if guns had grown on trees. It may sound sensible to say that intelligence is no match for money, as if mice used money. Human beings didn't start out with major assets in claws, teeth, armor, or any of the other advantages that were the daily currency of other species. If you had looked at humans from the perspective of the rest of the ecosphere, there was no hint that the soft pink things would eventually …
The danger of confusing general intelligence with g-factor is that it leads to tremendously underestimating the potential impact of Artificial Intelligence. (This applies to underestimating potential good impacts, as well as potential bad impacts.) Even the phrase 'transhuman AI' or 'artificial superintelligence' may still evoke images of book-smarts-in-a-box: an AI that's really good at cognitive tasks stereotypically associated with 'intelligence', like chess or abstract mathematics. But not superhumanly persuasive; or far better than humans at…
"But the word 'intelligence' commonly evokes pictures of the starving professor with an IQ of 160 and the billionaire CEO with an IQ of merely 120. Indeed there are differences of individual ability apart from 'book smarts' which contribute to relative success in the human world: enthusiasm, social skills, education, musical talent, rationality. Note that each factor... is cognitive. Social skills reside in the brain, not the liver. And jokes aside, you will not find many CEOs, nor yet professors of academia, who are chimpanzees. You will not find man…
"Any two AI designs might be less similar to one another than you are to a petunia."
- Eliezer Yudkowsky, "Artificial Intelligence as a Positive and Negative Factor in Global Risk", c. 2006
[Policymakers]
"In every known culture, humans experience joy, sadness, disgust, anger, fear, and surprise (Brown 1991), and indicate these emotions using the same facial expressions (Ekman and Keltner 1997). We all run the same engine under our hoods, though we may be painted different colors; a principle which evolutionary psychologists call the psychic unity of humankind (Tooby and Cosmides 1992). This observation is both explained and required by the mechanics of evolutionary biology.
An anthropologist will not excitedly report of a newly discovered tribe: …
"Artificial Intelligence is not settled science; it belongs to the frontier, not to the Textbook."
- Eliezer Yudkowsky, "Artificial Intelligence as a Positive and Negative Factor in Global Risk", c. 2006
"The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else."
"All intelligent and semi-intelligent life eventually learns how to cheat. Even our pets cheat. Domesticated guinea pigs will inflict sleep deprivation on their owners by squeaking at night, over the slightest chance that their owners will wake up and feed them sooner. They even adjust the pitch so that their owners never realize the guinea pigs are the ones waking them up. Many dogs and cats learn to do this as well."
"Humans have played brinkmanship with nuclear weapons for 60 years. Strategically, it is the most persuasive way to make it clear that your military is serious about something. Before the nuclear bomb, human beings played brinkmanship for centuries with the closest equivalent: war itself.
We must not play brinkmanship by inventing self-improving AI systems, specifically AI systems that run the risk of rapidly becoming smarter than humans. It may have been possible to de-escalate with nuclear missiles, but it was never conceivable to un-invent the nuclear bomb."
"No matter how simple the task, no matter how obvious it seems to the human mind, AI always finds a new way to cheat"
"AI keeps finding new ways to cheat"
"Imagine that Facebook and Netflix have two separate AIs that compete over hours that each user spends on their own platform. They want users to spend the maximum amount of minutes on Facebook or Netflix, respectively.
The Facebook AI discovers that posts that spoil popular TV shows result in people spending more time on the platform. It doesn't know what spoilers are, only that they cause people to spend more time on Facebook. But in reality, they're ruining the entertainment value of excellent shows on Netflix.
Even worse, the Netflix AI discovers …
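The dynamic in this thought experiment is essentially a two-player game. A minimal sketch (all payoff numbers are made up, purely for illustration) shows how two engagement-maximizing systems, each greedily optimizing its own metric, can settle into an outcome that is worse for both platforms and for users:

```python
# Toy sketch of the Facebook/Netflix scenario above. All numbers are
# hypothetical. Payoffs are (facebook_minutes, netflix_minutes) per user
# per day; "spoil" and "bait" stand for each platform's loophole strategy.
payoffs = {
    ("normal", "normal"): (60, 60),
    ("spoil",  "normal"): (80, 40),  # FB spoilers poach time from Netflix
    ("normal", "bait"):   (40, 80),  # Netflix clickbait poaches time from FB
    ("spoil",  "bait"):   (50, 50),  # both exploit loopholes: everyone loses
}

def fb_best(nf_action):
    # The Facebook AI maximizes its own metric, holding Netflix fixed.
    return max(("normal", "spoil"), key=lambda a: payoffs[(a, nf_action)][0])

def nf_best(fb_action):
    # The Netflix AI does the same in reverse.
    return max(("normal", "bait"), key=lambda a: payoffs[(fb_action, a)][1])

# Let each AI repeatedly best-respond until the choices stabilize.
fb, nf = "normal", "normal"
for _ in range(10):
    fb, nf = fb_best(nf), nf_best(fb)

print(fb, nf, payoffs[(fb, nf)])  # both end up exploiting the loophole
```

Neither AI "wants" to hurt users; each simply climbs its own metric, and the stable outcome is the mutually destructive one.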
"The people who predicted exponential, escalating advancement in AI have always been right. The people who predicted linear advancement, a mere continuation of the previous 10 years, have always turned out to be wrong. Since AI doesn't just get smarter every year but gets smarter faster every year, there are only a finite number of years before it starts getting too smart, too fast."
https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html
"AI will probably surpass human intelligence at the same pace that it reaches human intelligence. Considering the pace of AI advancement over the last 3 years, that pace will probably be very fast"
https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-
"We can make AI smarter and that's what we have been doing for a decade, successfully. However, it's also gotten much smarter at cheating, because that's how intelligence works. Always has been, always will be."
Optional second sentence: "But with the rate that AI is becoming more intelligent every year while still cheating, we should worry about what cheating and computer glitches will look like for an AI whose intelligence reaches and surpasses human intelligence"
If AI takes 200 years to become as smart as an ant, and then 20 years from there to become as smart as a chimpanzee; then AI could take 2 years to become as smart as a human, and 1 year after that to become much smarter than a human.
https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html
"What would a glitch look like inside of an AI that is smarter than a human? The only glitches that we have any experience with, have all been inside computers and AI systems that are nowhere near as smart as humans"
"AI has become smarter every year for the last 10 years. It's gotten faster recently. The question is, how much smarter does it need to get before it is smarter than humans? If it is smarter than humans, all it will take is a single glitch, and it could choose to do all sorts of horrible things."
Optional extra sentence: "It will not think like a human, it will not want the same things that humans want, but it will understand human behavior better than we do"
"If an AI becomes smarter than humans, it will not have any trouble deceiving its programmers so that they cannot turn it off. The question isn't 'can it behave unpredictably and do damage', the question is 'will it behave unpredictably and do damage'"
At the rate that AI is advancing, it will inevitably become smarter than humans, and take over the task of building new AI systems that are even smarter. Unfortunately, we have no idea how to fix glitches in a computer system that is to us what we are to animals.
"If we race to build an AI that is smarter than us in some ways but not others, then we might not have enough time to steer it in the right direction before it discovers that it can steer us instead"
"Every AI we have ever built has behaved randomly and unpredictably, cheating and exploiting loopholes whenever possible. They required massive amounts of human observation and reprogramming in order to behave predictably and perform tasks."
Optional second sentence: "If we race to build an AI that is smarter than us in some ways but not others, then we might not have enough time to steer it in the right direction before it discovers that it can steer us"
"Every AI ever built has required massive trial and error, and human supervision, in order to make it do exactly what we want it to without cheating or finding a complex loophole."
Optional additional sentences: "Right now, we are trending towards AI that will be smarter than humans. We don't know if it will be in 10 years or 100, but what we do know is that it will probably be much better at cheating and finding loopholes than we are"
https://imgur.com/a/kURPbsk
I made a really good GIF of cheating AI, taken from OpenAI's video here
"If an AI's ability to learn and observe starts improving rapidly and approaching human intelligence, then it will probably behave unpredictably, and we might not have enough time to assert control before it is too late."
https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html
[Policymakers]
We don't let companies use toxic chemicals without oversight.
Why let companies use AI without oversight?
[Insert call to action on support / funding for AI governance or regulation]
[Policymakers & ML researchers]
A virus doesn't need to explain itself before it destroys us. Neither does AI.
A meteor doesn't need to warn us before it destroys us. Neither does AI.
An atomic bomb doesn't need to understand us in order to destroy us. Neither does AI.
A supervolcano doesn't need to think like us in order to destroy us. Neither does AI.
(I could imagine a series riffing based on this structure / theme)
[Policymakers & ML researchers]
In 1901, the Chief Engineer of the US Navy said “If God had intended that man should fly, He would have given him wings.” And on a windy day in 1903, Orville Wright proved him wrong.
Let's not let AI catch us by surprise.
[Insert call to action]
[Policymakers & ML researchers]
“If a distinguished scientist says that something is possible, he is almost certainly right; but if he says that it is impossible, he is very probably wrong” (Arthur C. Clarke). In the case of AI, the distinguished scientists are saying not just that something is possible, but that it is probable. Let's listen to them.
[Insert call to action]
The Wait But Why post on AI is a goldmine of one-liners and one-liner inspiration.
Part 2 has better inspiration for appealing to AI scientists.
[Policymakers & ML researchers]
If you aren't worried about AI, then either you believe that we will stop making progress in AI or you believe that code will stop having bugs...which is it?
[Tech executives]
If you could not fund that initiative that could turn us all into paperclips...that'd be great.
[Insert call to action]
--
If you could not launch the project that could raise the AI kraken...that'd be great.
[Insert call to action]
--
If you could not build the bot that will treat us the way we treat ants...that'd be great.
[Insert call to action]
--
(I could imagine a series riffing based on this structure / theme)
[ML researchers]
"We're in the process of building some sort of god. Now would be a good time to make sure it's a god we can live with" (Sam Harris, 2016 Ted Talk).
AI presents both staggering opportunity and chilling peril. Developing intelligent machines could help eradicate disease, poverty, and hunger within our lifetime. But uncontrolled AI could spell the end of the human race. As Stephen Hawking warned, "Success in creating AI would be the biggest event in human history. Unfortunately, it might also be the last, unless we learn how to avoid the risks."
"AI safety is essential for the ethical development of artificial intelligence."
"AI safety is the best insurance policy against an uncertain future."
"AI safety is not a luxury, it's a necessity."
While it is true that AI has the potential to do a lot of good in the world, it is also true that it has the potential to do a lot of harm. That is why it is so important to ensure that AI safety is a top priority. As Google Brain co-founder Andrew Ng has said, "AI is the new electricity." Just as we have rules and regulations in place to ensure that electricity is used safely, we need to have rules and regulations in place to ensure that AI is used safely. Otherwise, we run the risk of causing great harm to ourselves and to the world around us.
War. Poverty. Inequality. Inhumanity. For millennia we have seen these caused by nation states and, more recently, large corporations. But what are these entities, if not greater-than-human-intelligence systems that happen to be misaligned with human well-being? Now imagine that kind of optimization coming not from a group of humans acting separately, but from an entity with a singular purpose, with an ever-diminishing proportion of humans in the loop.
Audience: all, but maybe emphasizing policymakers
From this recent post about DALL-E 2
We don't know exactly how a self-aware AI would act, but we know this: it will strive to prevent its own shutdown. No matter what the AI's goals are, it wouldn't be able to achieve them if it gets turned off. The only surefire way to prevent its shutdown would be to eliminate the ones with the power to shut it down: humans. There is currently no known method to teach an AI to care about humans. Solving this problem may take decades, and we are running out of time.
"If we build something smarter than us, that understands us better than we do, but it has a glitch that makes it stop responding correctly to commands, what are we supposed to do?"
"It has been very, very difficult to program AI not to be racist, and that is only one thing. The AI keeps treating it like a math problem, not an emotional problem, and our intuitions have to be built in piece by piece.
If we build an AI that is smarter than us, then we will have to get everything right, not just one thing, and if it's smarter than us then we might have only one shot at it."
"If we have an arms race over who can be the first to build an AI smarter than humans, it will not end well. We will probably not build an AI that is safe and predictable. When the nuclear arms race began, all sides raced to build bigger bombs, more bombs, and faster planes and missiles; they did not focus on accuracy and reliability until decades later"
"If AI becomes smarter than humans, which is the direction we are heading, then it is highly unlikely that it will think and behave like us. The human mind is a very specific shape, and today's AI scientists are much better at creating randomly-generated minds than they are at creating anything as predictable and reasonable as a human being."
"The last two decades of innovation have clearly demonstrated that Artificial Intelligence suddenly becomes smarter in unexpected ways, and that our best experts have consistently failed to predict accurate timeframes for these jumps."
Once an extremely competent machine becomes aware of humans, their goals, and its own situation, every optimization pressure on the machine will, via the machine's actions, start to be exerted on humans, their goals, and the machine's situation. How do we specify the optimization pressure that will be exerted on all of us with maximum force?
"If AGI systems can become as smart as humans, imagine what one human/organization could do by just replicating this AGI."
[Tech executives & ML researchers]
However far you think we are from AGI, do you think that aligning it with human values will be any easier? For intelligence we at least have formalisms (like AIXI) that tell us in principle how to achieve goals in arbitrary environments. For human values on the other hand, we have no such thing. If we don't seriously start working on that now (and we can, with current systems or theoretical models), there is no chance of solving the problem in time when we near AGI, and the default outcome of that will be very bad, to say the least.
“With recent breakthroughs in machine learning, more people are becoming convinced that powerful, world changing AI is coming soon. But we don’t yet know if it will be good for humanity, or disastrous.”
“We may be on the verge of a beautiful, amazing future, within our lifetimes, but only if AI is aligned with human beings.”
Source: original, but motivated by trying to ground WFLL1-type scenarios in what we already experience in the modern world, so heavily based on this. Also, the original idea came from reading Neel Nanda's "Bird's Eye View of AI Alignment - Threat Models".
Intended audience: mainly policymakers
"What do condoms have in common with AI?"
"Evolution didn't optimize for contraception. AI developers don't optimize against their goals either. Accidents happen. Use protection." (The last sentence is optional.)
"Evolution wasn’t prepared for contraception. We can do better. When deploying AI, think protection."
"We tricked nature with contraception; one day, AI could trick us too."
AIs need immense datasets to produce decent results. For example, to recognize whether something is a potato, an AI is trained on 1,000 pictures of potatoes and 1,000 pictures of things that are not potatoes, so that it can tell you whether something is a potato with 95% accuracy.
Well, 95% accurate isn't good enough--that's how you get Google labelling images of African Americans as gorillas. So what's the solution? More data! But how do you get more data? Tracking consumers.
Websites track everything you do on the internet, then sell your data to Amazon, Netflix, Facebook, etc. to bol…
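The "more data" point can be made concrete with a toy classifier. This is a minimal sketch under loud assumptions: a single hypothetical "image feature" number stands in for a real picture, and a nearest-neighbour rule stands in for a real vision system. The qualitative point, that accuracy is imperfect when classes overlap and generally improves with more labelled data, is what transfers:

```python
import random

random.seed(0)

# Hypothetical stand-in for an image feature: potatoes cluster around 0.3,
# non-potatoes around 0.7, with enough noise that the classes overlap.
def sample(label):
    center = 0.3 if label == "potato" else 0.7
    return random.gauss(center, 0.15), label

def accuracy(n_per_class):
    train = [sample(l) for _ in range(n_per_class) for l in ("potato", "not")]
    test = [sample(l) for _ in range(500) for l in ("potato", "not")]
    correct = 0
    for x, true_label in test:
        # 1-nearest-neighbour: predict the label of the closest training point.
        _, predicted = min(train, key=lambda point: abs(point[0] - x))
        correct += predicted == true_label
    return correct / len(test)

# With overlapping classes the classifier is never perfect, and more
# labelled data generally helps, which is why platforms hoard user data.
print(accuracy(5), accuracy(1000))
```

All numbers here are illustrative; the hunger for ever-larger labelled datasets is the real-world force the snippet above describes.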
Humans have biases they don't even realize. How can we verify that an AI lacks such biases?