Announcing AI safety Mentors and Mentees

Marius Hobbhahn

Executive summary

AI safety Mentors and Mentees is a program that matches mentors with mentees to accelerate the mentees' AI safety journey. Our aim is to get more people into full-time AI safety roles faster. We expect nearly all mentees to do this alongside their current occupation; the program is not designed to be full-time.
The program is designed to be very flexible and lightweight, e.g. as long as the mentor and mentee agree on the terms, they can do pretty much whatever makes the most sense for them. The organizational overhead for each person is less than 10 minutes for the entirety of the program.
We’re trying to fill some of the niches that other programs can’t provide. We don’t expect the program to ever scale up to huge numbers. Whenever possible, we recommend doing cohort-based programs such as AGI safety fundamentals, SERI MATS, AI safety scholars, the AI safety camp, and the like.
We are looking for more mentors and mentees. You can fill out our interest forms here (mentors, mentees). Feel free to forward these forms to people you think might be a good fit.
The requirement for being a mentor is to have a decent overview of what’s happening in AI safety and be able to point people toward relevant articles, organizations and individual researchers. All of our mentors so far are either already working full-time in AI safety or very likely to be in the near future. We think early-stage professionals in AI safety (Ph.D. students, industry researchers/engineers, or independent contributors) are ideal candidates for mentors. Note that being able to consistently spare 30 minutes per week is sufficient to mentor one mentee, so your time commitment doesn't have to be very big.
The requirement for being a mentee is to intend to work full-time in AI safety in the near future (e.g. ~2 years). There are no strict formal requirements but we strongly prefer candidates who could directly join the workforce or get independent funding to work on AI safety soon after finishing the M&M program. We expect to point many applicants to the AGI safety fundamentals program or other programs if there is no specific reason for 1-on-1 mentorship.
We have funds for people (both mentors and mentees) who really need them--so if you want to participate but don’t have the funds, we will be able to support you. By default, we suggest that the program is unpaid. In some cases, getting paid creates a better structure and more accountability--we are willing to take this into account as a reason to fund you but it should not be your primary reason.
For the last 5 months, we ran a pilot version of the program with 5 mentors and 10-15 mentees (some took breaks or stopped early). We found the following:
- On average, the mentors and mentees met 30-60 minutes every 1-4 weeks.
- Most of our current mentees are either Master's or Ph.D. students in CS, ML or physics with a strong technical background who want to fast-track their understanding of AI safety.
- Most of our current mentors are either already working full-time on AI safety or will likely be in the near future (e.g. currently on an independent research grant or finishing their Ph.D.).
- Whenever mentees are unable to consistently commit at least 5 hours per week, the program is probably not worth it for them and their mentor.
- In the pilot, most people got funding by default, but we think this led some mentees to overstate the time they could actually commit (not with bad intentions, just the planning fallacy). Thus, the program will be on a voluntary basis with support funds when necessary (but with a low bar for what “necessary” means).

Brief overview

Motivation

There are many great programs to get people deeper into AI safety, e.g. AGI safety fundamentals (AGISF), SERI MATS, AI safety scholars, or the AI safety camp. There are also multiple options to get individual grants (scroll down here for an overview) and there are multiple guides to doing career exploration in AI safety, e.g. how to pursue a career in technical alignment, leveling up in AI safety research engineering, or the AI safety starter pack. These resources are great and we don’t intend to replace them. However, we think there are some niches that can still be filled.

Flexibility: Most programs only happen once a year, so if you just missed it, you’ll have to wait for the next iteration. Furthermore, often people have very different interests and key uncertainties, so being able to design individualized schedules makes progress much easier. Also, getting some individualized feedback for your projects is usually very helpful for people who are getting up to speed with a topic. Lastly, programs like SERI MATS or MLAB are full-time and not everyone is able to do that. M&M is intended to be next to the mentee's current occupation, e.g. in parallel to a CS Master's degree.
Guidance: It’s hard to find things if you don’t know where to look. One key value that the mentors can provide is to give pointers to existing resources or organizations that have already worked on a given topic or to create connections to other researchers working on AI safety.
Confidence: It can feel intimidating to write an AI safety researcher a cold mail or to post something on the forum. Having a mentor to introduce and/or encourage you reduces the burden.

To evaluate whether our reasoning makes sense we started a pilot phase for the program at the beginning of June 2022 with a grant from the Long-term Future Fund and organizational support from AI safety support (see evaluation below).

Details of the program

The program is very flexible and as long as the mentor and mentee agree on the terms, they can pretty much do whatever makes the most sense for them. However, we provide a rough frame as the default plan.

Gauging interest: The mentor and mentee find each other. So far, this has mostly been through personal connections, e.g. meeting at EAG, knowing someone in the local group, etc. but we think this is ultimately unfair and thus want to open up our process to participants from everywhere (see interest forms for mentors and mentees). Most of our mentees so far are Ph.D. and Master's students in ML, computer science and physics. However, there are no formal requirements and we will judge applications on a case-by-case basis. We want to point out though that there is a certain level under which 1-on-1 mentorship doesn’t make sense. If you’re currently in the middle of your Bachelor’s degree, it's fine to focus on that and apply later.
First call: In the first call, the mentor and mentee discuss the content and terms of the program, e.g. what to focus on, how many hours per week, how often they meet and whether the mentee needs financial support. In this call, the mentor and the mentee design a schedule together. This schedule is mostly based on the needs and questions of the mentee, e.g. if the mentee wants to get deep into interpretability, conceptual research or forecasting, the schedule reflects these interests. By default, we select those parts of the AGISF course (101 and 201 curricula) that seem most relevant to the mentee for the first couple of weeks and then get more specialized.
The program is on a voluntary basis but we do offer to pay the mentees for the time they spend on the program in some cases. For example, some previous mentees would not have been able to participate in the first place if we didn’t provide funding because they had to work a second job. The exact funding amount is based on the needs of the participant but our default value is $30/h which is roughly 2x of what a teaching assistant would make in Germany. Per default, the program is 3-5 months but it can be shorter or longer if people want that. Mentors can also get paid if they want to! So far, most mentors have done it for free because they have a stable income.
Learning phase: This is the core of the program. In this phase, the mentees mostly work on their own, read blog posts or books, write summaries, etc. From time to time, the mentees ask questions to the mentor and discuss what they have learned. Per default, we recommend a 30- or 60-minute meeting every 2 weeks. Usually, once the mentees know more about the field, they have new interests and the schedule is adapted and extended. The core role of the mentor is to make sure the mentee is on track, answer questions, provide pointers to resources, connect with relevant individuals and give guidance.
Small final project: Toward the end of the program, we encourage people to produce a final project (similar to the capstone project in AGISF). This can be a small software project, e.g. applying interpretability tools to a neural network, or a blog post for the alignment forum or EA forum. By default, we suggest writing a distillation of the mentee's favorite reading from the learning phase. The mentor gives feedback on the project and, if the mentee wants, it will be published. It is also possible to have a non-public final project. However, we think that the accountability of publishing online and engaging with the community is very helpful and encourage our mentees to do that.
Discussing future steps: The main goal of the program is to give the mentee an option to test their fit and prepare for full-time AI safety work. Thus, in a final call, we evaluate how the person enjoyed the program and what their strengths are. If they want to continue to dive deeper into AI safety, the mentor and mentee will plan the next steps, e.g. by thinking about which company, Ph.D. program or grant to apply to. We expect to mostly work with candidates that have a plausible path to direct work or intensive up-skilling right after the program. On the other hand, we also expect to get people to a point where they can do so (e.g. our rough goal is that >80% of mentees would get funding for full-time independent upskilling or research after finishing M&M from one of the big EA funders). As a rough guideline, we think that people who finish M&M will at least meet most or all of the application criteria of the SERI MATS program.

One key design component of the M&M program is that the overhead is very small for both mentors and mentees. The goal is that the logistical overhead for the entire program is less than 10 minutes for every mentee. So far, the overhead consists of providing your bank data in case you want to be paid for the time and filling out the final survey at the end. Everything else will be taken care of. The overhead for the mentors is even smaller. They basically just have to fill in a row in a spreadsheet whenever they mentor a new mentee. Everything else is taken care of by AI safety support or Marius. If we expand the program, we might try to get help for some of the organizational overhead.

Long-term role of M&M

Currently, we are uncertain about what role M&M should take in the long run. We expect it to be something like “filling gaps in other existing programs” by having lots of flexibility.

We don’t expect to replace other programs. In fact, most of the mentees' schedules so far were created by picking and choosing parts of the AGISF schedule (101 and 201 curricula) and other AIS posts. We also expect to send many applicants directly to other programs in case that makes the most sense for them.

We are currently unsure how much we should scale the program. This is mostly because we are unsure about how much value it provides compared to its costs (e.g. mentor time). We will try to monitor this over time and expand or reduce the size of the program accordingly.

An additional benefit of M&M is that the mentors can provide a reference for future programs or jobs since they have a more detailed assessment of the mentee than in most other programs.

The bottleneck is mentorship

Ultimately, the bottleneck is and will probably stay mentorship. Many people would profit from 1-on-1 mentorship but this comes at the cost of the mentor’s time. Investing 30-60 minutes every two weeks might not sound like much but it can quickly grow with more mentees and take up headspace in other ways (e.g. answering messages between the meetings).

There are some egoistic reasons to be a mentor, e.g. building a network and getting more mentoring experience, but we ultimately expect most mentors to do it for altruistic reasons.

We continuously evaluate the program and check in with our mentors to see if they think their time investment is justified by the value they provide. So far, we are cautiously optimistic since the mentorship seems more valuable and requires less time investment than we initially expected.

In case you think you might make a good mentor, please consider reaching out. We are especially looking for members of underrepresented communities, e.g. women and people from developing countries. We think the ideal candidates for mentorship are early-stage professionals working in AI safety or an adjacent field (e.g. Ph.D. students, industry researchers/engineers or independent contributors).

Evaluation of the pilot phase

We received funding for a pilot phase at the beginning of June 2022 from the Long-term Future Fund. There are currently 5 mentors and 10-15 mentees in the program.

How do we know we failed/succeeded?

Ultimately, we want to get more full-time AI safety researchers/engineers. Thus, we think the two main metrics are:

- Did we increase the probability that mentees pursue a full-time career in AI safety compared to a world without M&M?
- Did we decrease the time it takes mentees to get into a full-time career in AI safety compared to a world without M&M?

Both of these are obviously hard to evaluate because we have to estimate counterfactuals. Currently, we estimate these quantities mostly by asking the mentees themselves and using the subjective experiences of the mentors. We would like to find more accurate approximations of these quantities and welcome suggestions.

That said, we want to point out that rather small changes in these metrics could already justify the program. For example, if we invested 5-15 mentor-hours per mentee, a 10 percentage point increase in p(full-time work) or cutting the time to get there by 3 months would already be a good outcome.
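To make the arithmetic behind this example explicit, here is a minimal sketch in Python. It only uses the numbers from the paragraph above (5-15 mentor-hours per mentee, a 10 percentage point increase); the function name and the idea of dividing hours by the probability increase are our own illustrative assumptions, not an official metric of the program.

```python
def mentor_hours_per_extra_researcher(hours_per_mentee: float,
                                      increase_pp: float) -> float:
    """Expected mentor-hours per additional full-time AI safety person.

    hours_per_mentee: mentor time invested in one mentee.
    increase_pp: assumed increase in p(full-time work), in percentage points.
    Hypothetical helper for illustration only.
    """
    if increase_pp <= 0:
        raise ValueError("increase_pp must be positive")
    # A 10 pp increase means 10 mentees yield one additional person in expectation.
    mentees_per_extra_person = 100 / increase_pp
    return hours_per_mentee * mentees_per_extra_person


# Numbers taken from the example in the text.
low = mentor_hours_per_extra_researcher(5, 10)
high = mentor_hours_per_extra_researcher(15, 10)
print(f"{low:.0f}-{high:.0f} mentor-hours per additional full-time person")
```

Under these assumptions, the example corresponds to roughly 50-150 mentor-hours per additional person entering full-time AI safety work in expectation. This ignores mentee time, funding and the separate benefit of getting there faster.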

Further evidence for the success or failure of the program includes:

- Whether the mentees enjoyed it and are excited about AI safety.
- Whether the mentors think it was worth their time.
- How the wider AI safety community reacts to it.

Evidence from the pilot phase

We evaluate the 3 mentees who have finished or nearly finished their run through the program (most mentees did not start in June).

These three mentees:

- Have put between 50 and 100 hours into the program so far.
- All received funding.
- Liked the structure and accountability of the program and the flexibility wrt topic choice.
- Disliked when they could not meet the goals they set for the week.
- Estimate that M&M increased their probability of getting deeper into AI safety (e.g. independent research or full-time work) by 15/20/30 percentage points.
- Estimate that M&M reduced their time to get deeper into AI safety (e.g. independent research or full-time work) by 0/3/10 months.
- Would all recommend the program to a friend in a similar position.
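As a rough aid for reading these self-reports, the following sketch summarizes the two sets of estimates. The values are the ones listed above; the variable names are illustrative, and we do not assume that the i-th entries of the two lists belong to the same mentee. With only three data points, this is a summary of anecdotes rather than a measurement.

```python
from statistics import mean, median

# Self-reported estimates from the three mentees (values from the list above).
prob_increase_pp = [15, 20, 30]   # percentage points
months_saved = [0, 3, 10]         # months

summary = {
    "mean_prob_increase_pp": mean(prob_increase_pp),
    "median_prob_increase_pp": median(prob_increase_pp),
    "mean_months_saved": mean(months_saved),
    "median_months_saved": median(months_saved),
}

for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Both the mean and the median of the self-reported probability increase lie above the 10 percentage point threshold used in the example earlier, but self-reports of counterfactuals should be read with caution.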

The five mentors:

- All want to continue to mentor (but 2 of the 5 mentors only under slightly modified conditions).
- Some were surprised about how helpful a 30-minute chat every 2 weeks could be to the mentee.
- All agreed that every participant has to be able to consistently commit 5 hours or more per week for mentorship to make sense.
- 3 out of 5 mentors think it was worth their time and 2 out of 5 were not sure but think it would be with slightly modified conditions.

Conclusions

We think the idea to match mentors with mentees is pretty simple, probably useful and has seen lots of success before, e.g. in SERI MATS. We’re not sure if the particular way we do that in M&M is optimal and intend to evaluate this regularly to make sure we’re not wasting otherwise valuable time. For now, we open up the program for applications to mentors and mentees but we intend to stay relatively small until we get more info about the program's impact. Feedback is very welcome. If you want to help with or take over some of the design and operational aspects of the program, please reach out to Marius.
