This is a linkpost for https://docs.google.com/document/d/1NIg4OnQyhWGR01fMVTcxpz8jDd68JdDIyQb0ZZyB-go/edit?usp=sharing

If you are interested in working on AI alignment, and might do full or part time work given funding, consider submitting a short application to funding@ai-alignment.com.

Submitting an application is intended to be very cheap. In order to keep the evaluations cheap as well, my process is not going to be particularly fair and will focus on stuff that I can understand easily. I may have a follow-up discussion before making a decision, and I'll try not to favor applications that took more effort.

As long as you won't be offended by a cursory rejection, I encourage you to apply.

If there are features of this funding that make it unattractive, but there are other funding structures that could potentially cause you to work on AI alignment, I'm curious about that as well. Feel free to leave a comment or send an email to funding@ai-alignment.com (I probably won't respond, but it may influence my decisions in the future).


Note that this is (by far) the least incentive-skewing of all the (publicly advertised) funding channels that I know of.

Apply especially if all of 1), 2) and 3) hold:

1) you want to solve AI alignment

2) you think your cognition is pwned by Moloch

3) but you wish it wasn't

Maybe it'd be useful to make a list of all the publicly advertised funding channels? Other ones I know of:

  • http://existence.org/getting-support/
  • https://futureoflife.org/2017/12/20/2018-international-ai-safety-grants-competition/
  • https://www.lesserwrong.com/posts/4WbNGQMvuFtY3So7s/announcement-ai-alignment-prize-winners-and-next-round
  • https://intelligence.org/mirix/
  • https://www.openphilanthropy.org/focus/global-catastrophic-risks/potential-risks-advanced-artificial-intelligence/open-philanthropy-project-ai-fellows-program

I noticed your comment and created this website that lists funding channels for AI alignment research. I am planning to create a new post on LW to share it after I receive some feedback.

Interesting idea! Some thoughts: You might want to think a bit more about who your target audience is. Given that applying for a job at MIRI/FHI/etc. is always another option, it's not totally clear to me to what extent "x risk funding" is a natural category. One possible target audience is e.g. graduate students who are looking for research funding. One possible risk of engaging this audience people have talked about is that they might be more interested in optimizing their own career growth than actually solving x-risk-related problems. I'm not sure how worried to be about that. Another possible target audience is people who don't want to move to the Bay Area/Oxford/etc. Once you know who your target audience is, that makes marketing easier because you can market your site wherever that target audience hangs out. It might be that the most natural target audience is "people with math/CS expertise who are interested in working on the alignment problem", in which case you could expand your scope and also list things like open positions at MIRI, the AI safety reading group, lists of open problems, recent publications, etc.

This is great feedback! Thanks for taking the time to write it up.

One possible target audience is e.g. graduate students who are looking for research funding. One possible risk of engaging this audience people have talked about is that they might be more interested in optimizing their own career growth than actually solving x-risk-related problems. I'm not sure how worried to be about that.

Targeting graduate students is an excellent idea. I think the risk you mention is one worth considering. I've thought of two possible reasons why:

1) This project could end up amounting to nothing more than a waste of my time. Given how much time I currently expect to invest, I'm not very concerned with that at the moment. I haven't decided how much time I'm willing to potentially waste on this but it's a good idea to keep track of the time I'm putting into it.

2) If the project is successful, it could direct more people to these organizations to request funding. If these additional people are likely to care more about their own career growth than actually solving x-risk-related problems, this could make it more difficult to make good decisions about which applicants should receive funding. Making it so that there is a small barrier to finding out about these funding opportunities (e.g. needing to pay attention to what's happening in the LW community) might actually be a good thing. Right now, I'm not convinced that it is a good thing.

I'll continue to think about this.

Once you know who your target audience is, that makes marketing easier because you can market your site wherever that target audience hangs out. It might be that the most natural target audience is "people with math/CS expertise who are interested in working on the alignment problem", in which case you could expand your scope and also list things like open positions at MIRI, the AI safety reading group, lists of open problems, recent publications, etc.

I like the suggestions here and will make use of them.

Actionable stuff:

Target math/cs/philosophy grad students, focus on AI alignment research funding only, expand scope to include useful related information.

I'll start working on some changes to the site soon. Maybe the name should change too? I'm not sure.

I might take this up at a later date. I want to solve AI alignment, but I don't want to solve it now. I'd prefer it if our society's institutions (both governmental and non-governmental) were a bit more prepared.

Differential research that advances safety more than AI capability still advances AI capability.

FWIW, I think I represent the majority of safety researchers in saying that you shouldn't be too concerned with your effect on capabilities; there are many more people pushing capabilities, so most safety research is likely a drop in the capabilities bucket (although there may be important exceptions!).

Personally, I agree that improving social institutions seems more important for reducing AI-Xrisk ATM than technical work. Are you doing that? There are options for that kind of work as well, e.g. at FHI.

If you're able to contribute equally to technical safety work and institution-oriented work, my own advice would generally be to prioritize technical work. I agree with capybarelet, though, that safety researchers should be willing to do work that might synergize with capabilities research, where the tradeoff looks worth it.

On the other hand, I think "don't worry about how your research (or other actions) will impact AGI timelines or development trajectories, because whatever you're doing is probably a drop in the bucket" is a bad meme to propagate. Some of the buckets that matter aren't that large, and the drops may be much larger for some of the researchers who are particularly adept at making safety breakthroughs. (And public advice should plausibly be skewed toward those people, since most of the expected impact of advice may come from its influence on large-drop people.)

Can you give some arguments for these views?

I think the best argument against institution-oriented work is that it might be harder to make a big impact. But more importantly, I think strong global coordination is necessary and sufficient, whereas technical safety is plausibly neither.

I also agree that one should consider tradeoffs, sometimes. But every time someone has raised this concern to me (I think it's been 3x?) I think it's been a clear cut case of "why are you even worrying about that", which leads me to believe that there are a lot of people who are overconcerned about this.

I would have said that strong global coordination before we get to AGI isn't necessary. I'd also have said that strong global coordination without an alignment solution is insufficient, given that it's not realistic to shoot for levels of coordination like "let's just never build AGI". (My model of Nate would also add here that never building AGI would mean losing an incredible amount of cosmopolitan value, enough to count as an existential catastrophe in its own right.)

Maybe we could start with you saying why you think it's necessary and sufficient? That might give me a better understanding of what you have in mind by "institution-oriented work".

I also agree that one should consider tradeoffs, sometimes. But every time someone has raised this concern to me (I think it's been 3x?) I think it's been a clear cut case of "why are you even worrying about that", which leads me to believe that there are a lot of people who are overconcerned about this.

I wouldn't be at all surprised if lots of people are overconcerned about this. Many people are also underconcerned, though. I feel better about public advice that encourages people to test their models of the size of relevant drops and relevant buckets, rather than just trying to correct for a bias some people have in a particular direction (which makes overcorrection easy).

I feel better about public advice that encourages people to test their models of the size of relevant drops and relevant buckets, rather than just trying to correct for a bias some people have in a particular direction (which makes overcorrection easy).

I like this sentence a lot.

So my original response was to the statement:

Differential research that advances safety more than AI capability still advances AI capability.

Which seems to suggest that advancing AI capability is sufficient reason to avoid technical safety that has non-trivial overlap with capabilities. I think that's wrong.

RE the necessary and sufficient argument:

1) Necessary: it's unclear that a technical solution to alignment would be sufficient, since our current social institutions are not designed for superintelligent actors, and we might not develop effective new ones quickly enough

2) Sufficient: I agree that never building AGI is a potential Xrisk (or close enough). I don't think it's entirely unrealistic "to shoot for levels of coordination like 'let's just never build AGI'", although I agree it's a long shot. Supposing we have that level of coordination, we could use "never build AGI" as a backup plan while we work to solve technical safety to our satisfaction, if that is in fact possible.

I think that's wrong.

Yeah, I agree with that; my above suggestion is taking into account that this is a likely case of overconcern.

1) Necessary: it's unclear that a technical solution to alignment would be sufficient, since our current social institutions are not designed for superintelligent actors, and we might not develop effective new ones quickly enough

This sounds weaker to me than what I usually think of as a "necessary and sufficient" condition.

My view is more or less the one Eliezer points to here:

The big big problem is, “Nobody knows how to make the nice AI.” You ask people how to do it, they either don’t give you any answers or they give you answers that I can shoot down in 30 seconds as a result of having worked in this field for longer than five minutes.
It doesn’t matter how good their intentions are. It doesn’t matter if they don’t want to enact a Hollywood movie plot. They don’t know how to do it. Nobody knows how to do it. There’s no point in even talking about the arms race if the arms race is between a set of unfriendly AIs with no friendly AI in the mix.

And the one in the background when he says a competitive AGI project can't deal with large slowdowns:

Because I don't think you can get the latter degree of advantage over other AGI projects elsewhere in the world. Unless you are postulating massive global perfect surveillance schemes that don't wreck humanity's future, carried out by hyper-competent, hyper-trustworthy great powers with a deep commitment to cosmopolitan value — very unlike the observed characteristics of present great powers, and going unopposed by any other major government.

I would say that actually solving the technical problem clearly is necessary for good outcomes, whereas strong pre-AGI global coordination is helpful but not necessary. And the scenario where a leading AI company just builds sufficiently aligned AGI, runs it, and saves the world doesn't strike me as particularly implausible, relative to other 'things turn out alright' outcomes; whereas the scenario where world leaders like Trump, Putin, and Xi Jinping usher in a permanent otherwise-utopian AGI-free world government does strike me as much crazier than the ten or hundred likeliest 'things turn out alright' scenarios.

In general, better coordination reduces the difficulty of the relevant technical challenges, and technical progress reduces the difficulty of the relevant coordination challenges; so both are worth pursuing. I do think that (e.g.) reducing x-risk by 5% with coordination work is likely to be much more difficult than reducing it by 5% with technical work, and I think the necessity and sufficiency arguments are much weaker for 'just try to get everyone to be friends' approaches than for 'just try to figure out how to build this kind of machine' approaches.

My view is more or less the one Eliezer points to here:
The big big problem is, “Nobody knows how to make the nice AI.” You ask people how to do it, they either don’t give you any answers or they give you answers that I can shoot down in 30 seconds as a result of having worked in this field for longer than five minutes.

There are probably no fire alarms for "nice AI designs" either, just like there are no fire alarms for AI in general.

Why should we expect people to share "nice AI designs"?

I had been thinking about metrics for measuring progress towards shared agreed outcomes as a method of co-ordination between potentially competitive powers to avoid arms races.

I passed the draft around to a couple of the usual suspects in AI metrics/risk mitigation in hopes of getting collaborators, but no joy. I learnt that Jack Clark of OpenAI is looking at that kind of thing as well and is a lot better positioned to act on it, so I have hopes around that.

Moving on from that I'm thinking that we might need a broad base of support from people (depending upon the scenario) so being able to explain how people could still have meaningful lives post AI is important for building that support. So I've been thinking about that.

Moving on from that I'm thinking that we might need a broad base of support from people (depending upon the scenario) so being able to explain how people could still have meaningful lives post AI is important for building that support. So I've been thinking about that.

This sounds like it would be useful for getting people to support the development of AGI, rather than effective global regulation of AGI. What am I missing?

For longer time frames where there might be visible development, the public needs to trust that the political regulators of AI have their interests at heart. Otherwise they may try to make it a party political issue, which I think would be terrible for sane global regulation.

I've come across pretty strong emotion when talking about AGI even when talking about safety, which I suspect will come bubbling to the fore more as time goes by.

It may also help the morale of the thoughtful people trying to make safe AI.

A question: can one post multiple initial applications, each less than a page long? Is there a limit for the total volume?

I don't think I would be interested in being paid to work on this, but a long time ago I wrote about AI alignment in a story. It's about an AI that runs a clinic in a remote village in Africa. http://terasemjournals.net/GNJournal/GN0202/henson1.html

I can go into more detail on the backstory if you want it.

Keith

Hey! I believe we were in the same IRC channel at that time, and I also read your story back then. I still remember some of it. What is the backstory? :)