Stronger-than-human artificial intelligence would be dangerous to humanity. It is vital that any such intelligence’s goals are aligned with humanity's goals. Maximizing the chance that this happens is a difficult, important and under-studied problem.
To encourage more and better work on this important problem, we (Zvi Mowshowitz and Vladimir Slepnev) are announcing a $5000 prize for publicly posted work advancing understanding of AI alignment, funded by Paul Christiano.
This prize will be awarded based on entries gathered over the next two months. If the prize is successful, we will award further prizes in the future.
The prize is not backed by or affiliated with any organization.
Your entry must be published online for the first time between November 3 and December 31, 2017, and contain novel ideas about AI alignment. Entries have no minimum or maximum size. Important ideas can be short!
Your entry must be written by you, and submitted before 9pm Pacific Time on December 31, 2017. Submit your entries either as links in the comments to this post, or by email to firstname.lastname@example.org. We may provide feedback on early entries to allow improvement.
We will award $5000 to between one and five winners. The first place winner will get at least $2500. The second place winner will get at least $1000. Other winners will get at least $500.
Entries will be judged subjectively. Final judgment will be by Paul Christiano. Prizes will be awarded on or before January 15, 2018.
What kind of work are we looking for?
AI Alignment focuses on ways to ensure that future smarter than human intelligence will have goals aligned with the goals of humanity. Many approaches to AI Alignment deserve attention. This includes technical and philosophical topics, as well as strategic research about related social, economic or political issues. A non-exhaustive list of technical and other topics can be found here.
We are not interested in research dealing with the dangers of existing machine learning systems commonly called AI that do not have smarter than human intelligence. These concerns are also understudied, but are not the subject of this prize except in the context of future smarter than human intelligence. We are also not interested in general AI research. We care about AI alignment, which may or may not also advance the cause of general AI research.
(Addendum: the results of the prize and the rules for the next round have now been announced.)
Here is my submission.
Thank you for motivating me to write this blog post I have been putting off for a while.
Disclaimer: If you want to measure only the contribution that came in November or later, compare to this post, which has one fewer category, no names, fewer examples, nothing about mitigation, and worse presentation.
I think this is an important idea, so I appreciate feedback, especially about presentation.
You don't mention decision theory in your list of topics, but I guess it doesn't hurt to try.
I have thought a bit about what one might call the "implementation problem of decision theory". Let's say you believe that some theory of rational decision making, e.g., evidential or updateless decision theory, is the right one for an AI to use. How would you design an AI to behave in accordance with such a normative theory? Conversely, if you just go ahead and build a system in some existing framework, how would that AI behave in Newcomb-...
I saw a talk earlier this year that mentioned this 2015 Corrigibility paper as a good starting point for someone new to alignment research. If that's still true, I started writing up some thoughts on a possible generalization of the method in that paper.
Anyway, submitting this draft early to hopefully get some feedback whether I'm on the right track:
The new version does better on sub-agent shutdown and eliminates the "managing the news" problem.
(Let me know if someone already thoug...
Should I submit? Working on this is my job, so it's maybe better to encourage others to come on board?
What should we be doing to help get more people to enter, whether by spreading the word or another way? We want this to work and result in good things, and it's iteration one, so doubtless there's a lot we're not doing right.
Zvi/Vladimir, what's your role in this - are you the judges?
I don't know if this is a useful "soft" submission, considering I am still reading and learning in the area.
But I think the current metaphors (paperclips, etc.) are not very persuasive for convincing folks in the world at large that value alignment is a BIG, HARD PROBLEM. Here is my attempt to add a possibly-new metaphor to the mix: https://nilscript.wordpress.com/2017/11/26/parenting-alignment-problem/
Posted on my blog, but might as well link it here. Not of the quality that Paul Christiano seeks, but might be of some interest, though many of the same points have been discussed over and over here and elsewhere before.
Contact: defectivealtruist at g mail
Is "publishing" on google docs ok? Here's a link:
OK, I went on a rant and revived my blog after 4 years of inactivity because entries aren't supposed to be entered as comments but are supposed to be linked to instead.
my idea: https://docs.google.com/document/d/e/2PACX-1vQ3131oaC2JhxafeR77x3nbuOcPRoxLFI0PQvxcYt6N8IqK-FFV6mcK3CMXeEpZlTxjSmSXpvYYbbq7/pub
Are there any limitations on number of submissions per person (where each submission is a distinct idea)? On number of wins per person?
Here's my entry: Friendly AI through Ontology Autogeneration. Am I allowed to keep making improvements to it even after the deadline has passed? (Doing so at my own risk, i.e. if it so happens that you've already read & judged my essay before I make my improvements, and my improvements aren't going to affect my chances of winning, that's my problem.)
Here's my entry. I think it's what you want... Hosted on DocDroid.
Hello :) I’ve created this as a framework for guiding our future with AI http://peridotai.com/call-to-artists/ AND to bootstrap interest in my art and thoughts here at https://quantumsymbol.com
You should think about the incentives of posting early in the 2 month window rather than late. Later entries will be influenced by earlier entries so you have a misalignment between wanting to win the prize and wanting to advance the conversation sooner. Christiano ought to announce that if one entry builds in a valuable way on an earlier entry by someone else, the earlier submitter will also gain subjective judgy-points in a way that he, Paul, affirms is calibrated neither to penalize early entry nor to discourage work that builds on earlier entries.
Is it possible to enter the contest as a group? Meaning, can the article written for the contest have several coauthors?
Are you looking for entries with actionable information, or would you be interested in a paper showing, for example, that AI alignment might not be as big a problem as we thought but not for a reason that will help us solve the AI alignment problem?
Should've saved my decision alignment loop post for a few days. Maybe an expansion of it? Hmm.
Submitting this entry for your consideration: https://www.lesserwrong.com/posts/bkoeQLTBbodpqHePd/ai-goal-alignment-entry-how-to-teach-a-computer-to-love. I'll email it as well. Your commitment to this call for ideas is much appreciated!
My submission is on my project blog: https://airis-ai.com/2017/12/31/friendly-ai-via-agency-sustainment/
Thank you for hosting this excellent competition! It was very inspiring. This is an idea I've been bouncing around in the back of my mind for several months now, and it is your competition that prompted me to refine it, flesh it out, and put it to paper.
My contact email is email@example.com
Hello! My newest proposal:
I would like to propose a kind of AI goal structure that would be an alternative to goal structures based on utility maximisation. The proposed alternative framework would make AI significantly safer, though it would not guarantee total safety. It can be used at the strong AI level and also well below it, so it scales well. The main idea is to replace utility maximisation with the concept of homeostasis.
Hi all. I have posted my competition entry on my blog here:
Raising Moral AI
Is it easier to teach a robot to stay safe by not tearing off its own limbs and not drilling holes in its head and not touching lava and not falling from a cliff and so on ad infinitum, or to introduce pain as an inescapable consequence of such actions and let the robot experiment and learn?
Similarly, while trying to create a safe AGI, it is futile to make an exhaustive and non-contradictory set of rules (values, policies, laws, committees) due to infinite complexity. A powerful AGI agent might find an exception or conflict in rules and...
Are teams allowed to make submissions?
I have unpublished text on the topic and will put a draft online in the next couple of weeks, and will submit it to the competition. I will add the URL here when it is ready.