TL;DR—We’re distributing $20K in total as prizes for submissions that make effective arguments for the importance of AI safety. The goal is to generate short-form content for outreach to policymakers, management at tech companies, and ML researchers. This competition will be followed in around a month by a second competition focused on long-form content.

This competition is for short-form arguments for the importance of AI safety. For the competition for distillations of posts, papers, and research agendas, see the Distillation Contest.

Objectives of the arguments

To mitigate AI risk, it’s essential that we convince relevant stakeholders sooner rather than later. To this end, we are initiating a pair of competitions to build effective arguments for a range of audiences. In particular, our audiences include policymakers, tech executives, and ML researchers.

  • Policymakers may be unfamiliar with the latest advances in machine learning and may lack the technical background needed to follow many of the details. Instead, they are likely to focus on the societal implications of AI and on which policies would be useful.
  • Tech executives are likely aware of the latest technology but may lack a mechanistic understanding of it. They often come from technical backgrounds and are likely highly educated. They will likely be reading with an eye toward how these arguments concretely affect which projects they fund and whom they hire.
  • Machine learning researchers can be assumed to be highly familiar with the state of the art in deep learning. They may have previously encountered talk of x-risk without being compelled to act. They may want to know how the arguments could affect what they should be researching.

We’d like arguments to be written for at least one of the three audiences listed above. Some arguments could speak to multiple audiences, but we expect that trying to speak to all at once could be difficult. After the competition ends, we will test arguments with each audience and collect feedback. We’ll also compile top submissions into a public repository for the benefit of the x-risk community.

Note that we are not interested in arguments for very specific technical strategies towards safety. We are simply looking for sound arguments that AI risk is real and important. 

Competition details

The present competition addresses shorter arguments (paragraphs and one-liners) with a total prize pool of $20K. The prizes will be split among roughly 20-40 winning submissions. Please feel free to make numerous submissions and try your hand at motivating various risk factors; it's possible that an individual with multiple great submissions could win a good fraction of the prize pool. The prize distribution will be determined by effectiveness and epistemic soundness as judged by us. Arguments must not be misleading.

To submit an entry: 

  • Please leave a comment on this post (or submit a response to this form), including:
    • The original source, if the argument is not your own.
    • If the entry contains factual claims, a source for the factual claims.
    • The intended audience(s) (one or more of the audiences listed above).
  • In addition, feel free to adapt another user’s comment by leaving a reply⁠⁠—prizes will be awarded based on the significance and novelty of the adaptation. 

Note that if two entries are extremely similar, we will, by default, give credit to the entry which was posted earlier. Please do not submit multiple entries in one comment; if you want to submit multiple entries, make multiple comments.

The first competition will run until May 27th, 11:59 PT. In around a month, we’ll release a second competition for generating longer “AI risk executive summaries” (more details to come). If you win an award, we will contact you via your forum account or email.

Paragraphs

We are soliciting argumentative paragraphs (of any length) that build intuitive and compelling explanations of AI existential risk.

  • Paragraphs could cover various hazards and failure modes, such as weaponized AI, loss of autonomy and enfeeblement, objective misspecification, value lock-in, emergent goals, power-seeking AI, and so on.
  • Paragraphs could make points about the philosophical or moral nature of x-risk.
  • Paragraphs could be counterarguments to common misconceptions.
  • Paragraphs could use analogies, imagery, or inductive examples.
  • Paragraphs could contain quotes from intellectuals: “If we continue to accumulate only power and not wisdom, we will surely destroy ourselves” (Carl Sagan), etc.

For a collection of existing paragraphs that submissions should try to do better than, see here.

Paragraphs need not be wholly original. If a paragraph was written by or adapted from somebody else, you must cite the original source. We may provide a prize to the original author as well as the person who brought it to our attention.

One-liners

Effective one-liners are statements (25 words or fewer) that make memorable, “resounding” points about safety. Here are some (unrefined) examples just to give an idea:

  • Vladimir Putin said that whoever leads in AI development will become “the ruler of the world.” (source for quote)
  • Inventing machines that are smarter than us is playing with fire.
  • Intelligence is power: we have total control of the fate of gorillas, not because we are stronger but because we are smarter. (based on Russell)

One-liners need not be full sentences; they might be evocative phrases or slogans. As with paragraphs, they can be arguments about the nature of x-risk or counterarguments to misconceptions. They do not need to be novel as long as you cite the original source.

Conditions of the prizes

If you accept a prize, you consent to your submission being released into the public domain. We expect that top paragraphs and one-liners will be collected into executive summaries in the future. After some experimentation with target audiences, the arguments will be used for various outreach projects.

(We thank the Future Fund regrant program and Yo Shavit and Mantas Mazeika for earlier discussions.)

In short, make a submission by leaving a comment with a paragraph or one-liner. Feel free to enter multiple submissions. In around a month we'll divide the $20K among the best submissions.

[$20K in Prizes] AI Safety Arguments Competition
528 comments

Some comments are truncated due to high volume.

I'd like to complain that this project sounds epistemically absolutely awful. It's offering money for arguments explicitly optimized to be convincing (rather than true), it offers money only for arguments making one particular side of the case (i.e. no money for arguments that AI risk is no big deal), and to top it off it's explicitly asking for one-liners.

I understand that it is plausibly worth doing regardless, but man, it feels so wrong having this on LessWrong.

If the world is literally ending, and political persuasion seems on the critical path to preventing that, and rationality-based political persuasion has thus far failed while the empirical track record of persuasion for its own sake is far superior, and most of the people most familiar with articulating AI risk arguments are on LW/AF, is it not the rational thing to do to post this here?

I understand wanting to uphold community norms, but this strikes me as in a separate category from “posts on the details of AI risk”. I don’t see why this can’t also be permitted.

TBC, I'm not saying the contest shouldn't be posted here. When something with downsides is nonetheless worthwhile, complaining about it but then going ahead with it is often the right response - we want there to be enough mild stigma against this sort of thing that people don't do it lightly, but we still want people to do it if it's really clearly worthwhile. Thus my kvetching.

(In this case, I'm not sure it is worthwhile, compared to some not-too-much-harder alternative. Specifically, it's plausible to me that the framing of this contest could be changed to not have such terrible epistemics while still preserving the core value - i.e. make it about fast, memorable communication rather than persuasion. But I'm definitely not close to 100% sure that would capture most of the value.

Fortunately, the general policy of imposing a complaint-tax on really bad epistemics does not require me to accurately judge the overall value of the proposal.)

Not Relevant
I'm all for improving the details. Which part of the framing seems focused on persuasion vs. "fast, effective communication"? How would you formalize "fast, effective communication" in a gradeable sense? (Persuasion seems gradeable via "we used this argument on X people; how seriously they took AI risk increased from A to B on a 5-point scale".)
Liam Donovan
Maybe you could measure how effectively people pass e.g. a multiple-choice version of an Intellectual Turing Test (on how well they can emulate the viewpoint of people concerned by AI safety) after hearing the proposed explanations.

[Edit: To be explicit, this would help further John's goals (as I understand them) because it ideally tests whether the AI safety viewpoint is being communicated in such a way that people can understand and operate the underlying mental models. This is better than testing how persuasive the arguments are because it's a) more in line with general principles of epistemic virtue and b) more likely to persuade people iff the specific mental models underlying AI safety concern are correct.

One potential issue would be people bouncing off the arguments early and never getting around to building their own mental models, so maybe you could test for succinct/high-level arguments that successfully persuade target audiences to take a deeper dive into the specifics? That seems like a much less concerning persuasion target to optimize, since the worst case is people being wrongly persuaded to "waste" time thinking about the same stuff the LW community has been spending a ton of time thinking about for the last ~20 years.]
Raemon
This comment thread did convince me to put it on personal blog (previously we've frontpaged writing contests, and we went ahead and unreflectively did the same for this post).
[anonymous]
I don't understand the logic here? Do you see it as bad for the contest to get more attention and submissions?

No, it's just the standard frontpage policy:

Frontpage posts must meet the criteria of being broadly relevant to LessWrong’s main interests; timeless, i.e. not about recent events; and are attempts to explain not persuade.

Technically the contest is asking for attempts to persuade not explain, rather than itself attempting to persuade not explain, but the principle obviously applies.

As with my own comment, I don't think keeping the post off the frontpage is meant to be a judgement that the contest is net-negative in value; it may still be very net positive. It makes sense to have standard rules which create downsides for bad epistemics, and if some bad epistemics are worthwhile anyway, then people can pay the price of those downsides and move forward.

Ruby

Raemon and I discussed whether it should be frontpage this morning. Prizes are kind of an edge case in my mind. They don't properly fulfill the frontpage criteria but also it feels like they deserve visibility in a way that posts on niche topics don't, so we've more than once made an exception for them.

I didn't think too hard about the epistemics of the post when I made the decision to frontpage, but after John pointed out the suss epistemics, I'm inclined to agree, and concurred with Raemon moving it back to Personal.

----

I think the prize could be improved simply by rewarding the best arguments both for and against AI risk. This might actually be more convincing to the skeptics – we paid people to argue against this position and now you can see the best they came up with.

tamgent
Ah, instrumental and epistemic rationality clash again
lc

We're out of time. This is what serious political activism involves.

trevor
I don't see any lc comments, and I really wish I could see some here because I feel like they'd be good.  Let's go! Let's go! Crack open an old book and let the ideas flow! The deadline is, like, basically tomorrow.
lc
Ok :)

Most movements (and yes, this is a movement) have multiple groups of people, perhaps with degrees in subjects like communication, working full time coming up with slogans, making judgments about which terms to use for best persuasiveness, and selling the cause to the public. It is unusual for it to be done out in the open, yes. But this is what movements do when they have already decided what they believe and now have policy goals they know they want to achieve. It’s only natural.

hath

You didn't refute his argument at all, you just said that other movements do the same thing. Isn't the entire point of rationality that we're meant to be truth-focused, and winning-focused, in ways that don't manipulate others? Are we not meant to hold ourselves to the standard of "Aim to explain, not persuade"? Just because others in the reference class of "movements" do something doesn't mean it's immediately something we should replicate! Is that not the obvious, immediate response? Your comment proves too much; it could be used to argue for literally any popular behavior of movements, including canceling/exiling dissidents. 

Do I think that this specific contest is non-trivially harmful at the margin? Probably not. I am, however, worried about the general attitude behind some of this type of recruitment, and the justifications used to defend it. I become really fucking worried when someone raises an entirely valid objection, and is met with "It's only natural; most other movements do this".

P.
To the extent that rationality has a purpose, I would argue that it is to do what it takes to achieve our goals; if that includes creating "propaganda", so be it. And the rules explicitly ask for submissions not to be deceptive, so if we use them to convince people it will be a pure epistemic gain.

Edit: If you are going to downvote this, at least argue why. I think that if this works like they expect, it truly is a net positive.
hath
Fair. Should've started with that.

I think there's a difference between "rationality is systematized winning" and "rationality is doing whatever it takes to achieve our goals". That difference requires more time to explain than I have right now. I think that the whole AI alignment thing requires extraordinary measures, and I'm not sure what specifically that would take; I'm not saying we shouldn't do the contest. I doubt you and I have a substantial disagreement as to the severity of the problem or the effectiveness of the contest. My above comment was more "argument from 'everyone does this' doesn't work", not "this contest is bad and you are bad". Also, I wouldn't call this contest propaganda.

At the same time, if this contest were "convince EAs and LW users to have shorter timelines and higher chances of doom", it would be reacted to differently. There is a difference (convincing someone to have a shorter timeline isn't the same as trying to explain the whole AI alignment thing in the first place), but I worry that we could take that too far.

I think that (most of) the responses John's comment got were good, and they reassure me that the OPs are actually aware of/worried about John's concerns. I see no reason why this particular contest will be harmful, but I can imagine a future where we pivot to mainly strategies like this having some harmful second-order effects (which would need their own post to explain).
Sidney Hough
Hey John, thank you for your feedback. As per the post, we’re not accepting misleading arguments. We’re looking for the subset of sound arguments that are also effective. We’re happy to consider concrete suggestions which would help this competition reduce x-risk.
jacobjacob
Thanks for being open to suggestions :) Here's one: you could award half the prize pool to compelling arguments against AI safety. That addresses one of John's points.

For example, stuff like "We need to focus on problems AI is already causing right now, like algorithmic fairness" would not win a prize, but "There's some chance we'll be able to think about these issues much better in the future once we have more capable models that can aid our thinking, making effort right now less valuable" might.

That idea seems reasonable at first glance, but upon reflection, I think it's a really bad idea. It's one thing to run a red-teaming competition, it's another to spend money building rhetorically optimised tools for the other side. If we do that, then maybe there was no point running the competition in the first place as it might all cancel out.

Ruby
This makes sense if you assume things are symmetric. Hopefully there's enough interest in truth and valid reasoning that if the "AI is dangerous" conclusion is correct, it'll have better arguments on its side.
Sidney Hough
Thanks for the idea, Jacob. Not speaking on behalf of the group here, but my first thought is that enforcing symmetry on discussion probably isn't a condition for good epistemics, especially since the distribution of this community's opinions is skewed. I think I'd be more worried if particular arguments that were misleading went unchallenged, but we'll be vetting submissions as they come in, and I'd also encourage anyone who has concerns with a given submission to talk with the author and/or us.

My second thought is that we're planning a number of practical outreach projects that will make use of the arguments generated here (we're not trying to host an intra-community debate about the legitimacy of AI risk), so we'd ideally have the prize structure reflect the outreach value for which arguments are responsible. I'm potentially open to opening the contest to arguments for or against AI risk, and to allowing the distribution of responses to reflect the distribution of the opinions of the community. Will discuss with the rest of the group.
Thomas Kwa
It seems better to award some fraction of the prize pool to refutations of the posted arguments. IMO the point isn't to be "fair to both sides", it's to produce truth.

Wait, the goal here, at least, isn't to produce truth; it is to disseminate it. Counter-arguments are great, but this isn't about debating the question; it's about communicating a conclusion well.

Yitz
This is PR, not internal epistemics, if I’m understanding the situation correctly.
Davidmanheim
Think of it as a "practicing a dark art of rationality" post, and I'd think it would seem less off-putting.
Ben Pace
I think it would be less "off-putting" if we had common knowledge of it being such a post. I think the authors don't think of it as that from reading Sidney's comment.

It's unreasonable to look at AI and think "in order to make it more predictable, more controllable, we just have to make it smarter". When individual humans become smarter, they become much less predictable. And the human mind is built to predict the behavior of other humans.

trevor
The last sentence can be cut for simplicity; most of the point is in the first two sentences.

Random side note: GPT-3 seems to be able to generate decent one-liners (a sketch of one way to reproduce this follows the list below).

Generate one-liners describing the dangers of AI. An example is "Inventing machines that are smarter than us is playing with fire."

1. Machines that are smarter than us could easily become our masters.

2. If we're not careful, AI could spell the end of humanity as we know it.

3. AI could be used to create weapons of mass destruction that could devastate the planet.

4. AI could be used to create powerful robots that could enslave humans.

5. AI could be used to create artificial intelligence that could turn against its creators

6. AI usher in a new era of cyber-warfare that could cripple society

7. AI could create self-replicating robots that could eventually consume all resources on Earth

8. AI could be used to create powerful AIs that could eventually surpass human intelligence and take over the world

9. AI technology could eventually be used to create a global surveillance state where everyone is constantly watched and monitored
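A minimal sketch of how one might reproduce this, assuming access to the OpenAI completions API via the legacy Python client; the model name and sampling parameters below are illustrative assumptions, not necessarily what was used for the list above.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # assumption: an OpenAI API key is available

# Same instruction and seed example as in the comment above.
prompt = (
    "Generate one-liners describing the dangers of AI. An example is "
    '"Inventing machines that are smarter than us is playing with fire."\n\n1.'
)

response = openai.Completion.create(
    model="text-davinci-002",  # illustrative model choice
    prompt=prompt,
    max_tokens=256,
    temperature=0.8,  # some randomness so repeated runs yield different one-liners
)

# The completion continues the numbered list started by the trailing "1." in the prompt.
print("1." + response.choices[0].text)
```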

Mitchell Reynolds
I had a similar thought to prompt GPT-3 for one-liners or to summarize some article (if available). I think involving the community in writing 500-1000 submissions would have the positive externality of prompting non-winners to distill/condense their views. My exploratory idea is that this would be instrumentally useful when talking with those new to AI x-risk topics.
Yitz
We could also prompt GPT-3 with the results ;)
jcp29
Good idea! I could imagine doing something similar with images generated by DALL-E.
FinalFormal2
That's a very good idea. I think one limitation of most AI arguments is that they seem to lack urgency. AGI can seem like it's at least a hundred years away, and showing the incredible progress we've already seen might help to negate some of that perception.
trevor
1. Machines that are smarter than us could easily become our masters. [All it takes is a single glitch, and they will outsmart us the same way we outsmart animals.]
2. If we're not careful, AI could spell the end of humanity as we know it. [Artificial intelligence improves itself at an exponential pace, so if it speeds up there is no guarantee that it will slow down until it is too late.]
3. AI could be used to create weapons of mass destruction that could devastate the planet. x
4. AI could be used to create powerful robots that could enslave humans. x
5. AI could one day be used to create artificial intelligence [an even smarter AI system] that could turn against its creators [if it becomes capable of outmaneuvering humans and finding loopholes in order to pursue its mission.]
6. AI usher in a new era of cyber-warfare that could cripple society x
7. AI could create self-replicating robots that could eventually consume all resources on Earth x
8. AI could [can one day] be used to create [newer, more powerful] AI [systems] that could eventually surpass human intelligence and take over the world [behave unpredictably].
9. AI technology could eventually be used to create a global surveillance state where everyone is constantly watched and monitored x
lc

I remember watching a documentary made during the satanic panic by some activist Christian group. I found it very funny at the time, and then became intrigued when an expert came on to say something like:

"Look, you may not believe in any of this occult stuff; but there are people out there that do, and they're willing to do bad things because of their beliefs."

I was impressed with that line's simplicity and effectiveness. A lot of its effectiveness stems silently from the fact that, inadvertently, it helps suspend disbelief about the negative impact of "s... (read more)

Any arguments for AI safety should be accompanied by images from DALL-E 2.

One of the key factors that makes AI safety such a low-priority topic is a complete lack of urgency. Dangerous AI seems like a science-fiction element that's always a century away, and we can fight against this perception by demonstrating the potential and growth of AI capability.

No demonstration of AI capability has the same immediate visceral power as DALL-E 2.

In longer-form arguments, urgency could also be demonstrated through GPT-3's prompts, but DALL-E 2 is better, especially ... (read more)

FinalFormal2
Any image produced by DALL-E that conveys, or could be used to convey, misalignment or other risks from AI would be very useful, because it could combine the desired messages: "the AI problem is urgent" and "misalignment is possible and dangerous." For example, if DALL-E responded to the prompt "AI living with humans" by creating an image suggesting a hierarchy of AI over humans, it would serve both messages. However, this is only worth a side note, because eliciting such suggestions of misalignment organically might be very difficult.

Other image prompts might be: "The world as AI sees it," "the power of intelligence," "recursive self-improvement," "the danger of creating life," "god from the machine," etc.
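A minimal sketch of how such prompts might be batch-generated, assuming access to OpenAI's image-generation endpoint via the legacy Python client; the image size and other parameters are illustrative assumptions.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # assumption: an API key with image-generation access

# Candidate prompts from the comment above.
prompts = [
    "AI living with humans",
    "The world as AI sees it",
    "the power of intelligence",
    "recursive self-improvement",
    "the danger of creating life",
    "god from the machine",
]

for p in prompts:
    # Request one image per prompt; the resolution is an illustrative choice.
    result = openai.Image.create(prompt=p, n=1, size="512x512")
    print(p, "->", result["data"][0]["url"])
```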
trevor
both prompts are from here: https://www.lesswrong.com/posts/uKp6tBFStnsvrot5t/what-dall-e-2-can-and-cannot-do#:~:text=DALL%2DE%20can't%20spell,written%20on%20a%20stop%20sign.)  
trevor
There are a lot of good DALL-E images floating around LessWrong that point towards alignment significance. We can copy and paste them into a LessWrong comment to post them.
Yitz
very slight modification of Scott’s words to produce a more self-contained paragraph:

The technology [of lethal autonomous drones], from the point of view of AI, is entirely feasible. When the Russian ambassador made the remark that these things are 20 or 30 years off in the future, I responded that, with three good grad students and possibly the help of a couple of my robotics colleagues, it will be a term project [six to eight weeks] to build a weapon that could come into the United Nations building and find the Russian ambassador and deliver a package to him.

-- Stuart Russell on a February 25, 2021 podcast with the Future of Life Institu... (read more)

Neither we humans nor the flower sees anything that looks like a bee. But when a bee looks at it, it sees another bee, and it is tricked into pollinating that flower. The flower did not know any of this; its petals randomly changed shape over millions of years, and eventually one of those random shapes started tricking bees and outperforming all of the other flowers.

Today's AI already does this. If AI begins to approach human intelligence, there's no limit to the number of ways things can go horribly wrong.

trevor
This is for ML researchers; I'd worry significantly about sending bizarre imagery to tech executives or policymakers, due to the absurdity heuristic being one of the most serious concerns. Generally, nature and evolution are horrible.

Here is the spreadsheet containing the results from the competition.

More quotes on AI safety here.

[Policy makers]

A couple of years ago there was an AI trained to beat Tetris. Artificial intelligences are very good at learning video games, so it didn't take long for it to master the game. Soon it was playing so quickly that the game sped up to the point where it was impossible to win, and blocks were slowly stacking up; but before it could be forced to place the last piece, it paused the game.

As long as the game didn't continue, it could never lose.

When we ask AI to do something, like play Tetris, we have a lot of assumptions about how it can or ... (read more)

FinalFormal2
I'm trying to find the balance between suggesting existential/catastrophic risk and screaming it or coming off as too dramatic; any feedback would be welcome.