Our plan for 2019-2020: consulting for AI Safety education

by RAISE

4 min read · 3rd Jun 2019 · 15 comments

Road To AI Safety Excellence

Personal Blog

UPDATE: this plan received sizable criticism. We are reflecting on it, and working on a revision.

Tl;dr: a conversation with a grantmaker made us drop our long-held assumption that outputs needed to be concrete to be recognized. We decided to take a step back and approach the improvement of the AI Safety pipeline on a more abstract level, doing consulting and research to develop expertise in the area. This will be our focus in the next year.

Trial results

We tested our course in April. We didn’t get a positive result. It looks like this was due to bad test design: high variance and a low number of participants clouded any pattern that could have emerged. In hindsight, we clearly should have tested knowledge before the intervention as well as after it, though arguably this would have been nearly impossible given the one-month deadline that our funder imposed.

What we did learn is that we are largely unaware of the extent to which our course is being used. This is mostly because the software we use is not yet mature enough to provide this kind of data. If we want to continue building the course, we feel that our first priority ought to be setting up a feedback mechanism that gives us precise insight into how students progress through it.

However, other developments have pointed our attention away from developing the course, and towards developing the question that the course is an answer to.

If funding wasn’t a problem

During the existence of RAISE, its runway has never been longer than about 2 months. This crippled our ability to make long-term decisions, in favor of dishing out quick results to show value. Seen from a "quick feedback loops" paradigm, this may have been a healthy dynamic. It also led to sacrifices that we didn’t actually want to make.

Had we been tasked with our particular niche without any funding constraints, our first move would have been to do extensive study into what the field needs. We feel that EA is missing a management layer. There is a lot that a community-focused management consultant could do, simply by connecting all the dots and coordinating the many projects and initiatives that exist in the LTF space. We have identified 30 (!) small and large organisations that are involved in AI Safety. Not all of them are talking to each other, or even aware of each other.

Our niche being AI Safety education, we would have spent a good 6 months developing expertise and a network in this area. We would have studied the scientific frontiers of relevant domains like education and the metasciences. We would have interviewed AIS organisations and asked them what they look for in employees. We would have studied existing alignment researchers and looked for patterns. We would have talked to grantmakers and considered their models.

Funding might not be a problem

After getting turned down by the LTF fund (which was especially meaningful because they didn’t seem to be constrained by funding), we had a conversation with one of their grantmakers. The premise of the conversation was something like “what version of RAISE would you be willing to fund?” The answer was pretty much what we just described. They thought pipeline improvement was important, but hard, and just going with the first idea that sounds good (an online course) would be a lucky shot if it worked. Instead, someone should be thinking about the bigger picture first.

The mistake we had been making from the beginning was to assume we needed concrete results to be taken seriously.

Our new direction

EA really does seem to be missing a management layer. People are thinking about their careers, starting organisations, doing direct work and research. Not many people are drawing up plans for coordination on a higher level and telling people what to do. Someone ought to be dividing up the big picture into roles for people to fill. You can see the demand for this by how seriously we take 80k. They’re the only ones doing this beyond the organisational level.

Much the same in the cause area we call AI Safety Education. Most AIS organisations are necessarily thinking about hiring and training, but no one is specializing in it. In the coming year, our aim is to fill this niche, building expertise and doing management consulting. We will aim to smarten up the coordination there. Concrete outputs might be:

+ Advice for grantmakers that want to invest in the AI Safety researcher pipeline
+ Advice for students that want to get up to speed and test themselves quickly
+ Suggesting interventions for entrepreneurs that want to fill up gaps in the ecosystem
+ Publishing thinkpieces that advance the discussion of the community, like this one
+ Creating and keeping wiki pages about subjects that are relevant to us
+ Helping AIS research orgs with their recruitment process

We’re hiring

Do you think this is important? Would you like to fast-track your involvement with the Xrisk community? Do you have good Google-fu, or would you like to conduct in-depth interviews with admirable people? Most importantly, are you not afraid to hack your own trail?

We think we could use one or two more people to join us in this effort. You’d be living for free in the EA Hotel. We can’t promise any salary in addition to that. Do ask us for more info!

Let's talk

A large part of our work will involve talking to those involved in AI Safety. If you are working in this field, and interested in working on the pipeline, then we would like to talk to you.

If you have important information to share, have been plotting to do something in this area for a while, and want to compare perspectives, then we would like to talk to you.

And even if you would just like to have an open-ended chat about any of this, we would like to talk to you!

You can reach us at raise@aisafety.info

15 comments, sorted by top scoring

[-]habryka5y280

As the funder that you are very likely referring to, I do want to highlight that I don't feel like this summarizes my views particularly well. In particular this section:

EA really does seem to be missing a management layer. People are thinking about their careers, starting organisations, doing direct work and research. Not many people are drawing up plans for coordination on a higher level and telling people what to do. Someone ought to be dividing up the big picture into roles for people to fill. You can see the demand for this by how seriously we take 80k. They’re the only ones doing this beyond the organisational level.

Much the same in the cause area we call AI Safety Education. Most AIS organisations are necessarily thinking about hiring and training, but no one is specializing in it. In the coming year, our aim is to fill this niche, building expertise and doing management consulting. We will aim to smarten up the coordination there. Concrete outputs might be:

+ Advice for grantmakers that want to invest in the AI Safety researcher pipeline

+ Advice for students that want to get up to speed and test themselves quickly

+ Suggesting interventions for entrepreneurs that want to fill up gaps in the ecosystem

+ Publishing thinkpieces that advance the discussion of the community, like this one

+ Creating and keeping wiki pages about subjects that are relevant to us

+ Helping AIS research orgs with their recruitment process

I think in general people should be very hesitant to work on social coordination problems because they can't find a way to make progress on the object-level problems. My recommendation was very concretely "try to build an internal model of what really needs to happen for AI-risk to go well" and very much not "try to tell other people what really needs to happen for AI-risk", which is almost the exact opposite.

I actually think going explicitly in this direction is possibly worse than RAISE's previous plans. One of my biggest concerns with RAISE was precisely that it was trying far too early to tell people what exactly to learn and what to do, without understanding the relevant problems themselves first. This seems like it exacerbates that problem by trying to make your job explicitly about telling other people what to do.

A lot of my thoughts in this space are summarized by the discussion around Davis' recent post "Go Do Something", in particular Ray's and Ben Hoffman's comments about working on social coordination technology:

Benquo:

This works for versions of "do something" that mainly interact with objective reality, but there's a pretty awful value-misalignment problem if the way you figure out what works is through feedback from social reality.

So, for instance, learning to go camping or cook or move your body better or paint a mural on your wall might count, but starting a socially legible project may be actively harmful if you don't have a specific need that it's meeting that you're explicitly tracking. And unfortunately too much of people's idea of what "go do something" ends up pointing to trying to collect credit for doing things.

Sitting somewhere doing nothing (which is basically what much meditation is) is at least unlikely to be harmful, and while of limited use in some circumstances, often an important intermediate stage in between trying to look like you're doing things, and authentically acting in the world.

Ray:

It's been said before for sure, but worth saying periodically.

Something I'd add, which particularly seems like the failure mode I see in EA-spheres (less in rationalist spheres but they blur together):

Try to do something other than solve coordination problems.

Or, try to do something that provides immediate value to whoever uses it, regardless of whether other people are also using it.

A failure mode I see (and have often fallen to) is looking around and thinking "hmm, I don't know how to do something technical, and/or I don't have the specialist skills necessary to do something specialist. But, I can clearly see problems that stem from people being uncoordinated. I think I roughly know how people work, and I think I can understand this problem, so I will work on that."

But:

+ It actually requires just as much complex specialist knowledge to solve coordination problems as it does to do [whatever other thing you were considering].

+ Every time someone attempts to rally people around a new solution, and fails, they make it harder for the next person who tries to rally people around a new solution. This makes the coordination system overall worse.

This is a fairly different framing than Benquo's (and Eliezer's) advice, although I think it amounts to something similar.

[-]Wei Dai5y40

My recommendation was very concretely “try to build an internal model of what really needs to happen for AI-risk to go well”

I'm not sure anyone knows what really needs to happen for AI-risk to go well, including people who have been thinking about this question for many years. Do you really mean for RAISE to solve this problem, or just to think about this question for more than they already have, or to try to learn the best available model from someone else (if so who)?

[-]habryka5y30

Mostly think more about this question than they already have, which likely includes learning the best available models from others.

The critique here was more one of intention than one of epistemic state. It seems to me like there is a mental motion of being curious about how to make progress on something, even if one is still confused, which I contrast with a mental motion of "trying to look like you are working on the problem".

[-]Wei Dai5y110

Ah ok. Given that, it seems like you need to explain your critique more, or try to figure out the root cause of the wrong intention and address that, otherwise wouldn't they just switch to "trying to look like you're trying to build models of what needs to be done to solve AI risk"?

Another problem is that it seems even harder to distinguish between people who are really trying to build such models, and people who are just trying to look like they're doing that, because there's no short-term feedback from reality to tell you whether someone's model is any good. It seems like suggesting that people do that when you're not sure of their intention is really dangerous, as it could mess up the epistemic situation with AI risk models (even more than it already is). Maybe it would be better to just suggest some concrete short-term projects for them to do instead?

[-]Chris_Leong5y10

Maybe there is a possible project in this direction. I'll assume that this is general advice you'd give to many people who want to work in this space. If it is important for people to build a model of what is required for AI to go well, then people may as well work on this together. And sure, there are websites like Less Wrong, but people can exchange information much faster by chatting either in person or over Skype. (Of course, there are worries that this might lead to overly correlated answers.)

[+][anonymous]5y-90

[-]RyanCarey5y180

Hey! Thanks for sharing your experience with RAISE.

I'm sorry to say it, but I'm not convinced by this plan overall. Also, on the meta-level, I think you got insufficient feedback on the idea before sharing it. Personally, my preferred format for giving inline feedback on a project idea is Google Docs, and so I've copied this post into a GDoc HERE and added a bunch of my thoughts there.

I don't mean to discourage you guys, but I think that a bunch of aspects of this proposal are pretty ill-considered and need a bunch of revision. I'd be happy to provide further input.

[-][anonymous]5y90

Thank you. We're reflecting on this and will reach out to have a conversation soon.

[-]habryka5y40

Note: I think view access to a document is not sufficient to see comments. At least I can't see any comments.

[-]RyanCarey5y70

Should now be fixed
