Documenting Journey Into AI Safety

jacobhaimes

Summary

Recently, I have noticed that many individuals have become discouraged due to a seeming lack of progress in their attempts to break into a career in AI safety. It seems to me that a key bottleneck in the pipeline for beginning an AI safety career is access to mentors.

One method that may help address this need for mentorship would be for an individual going through this process to document their path, with a focus on peer and mentor feedback.^[1] Although this content would not be tailored to each specific listener, my intuition is that a lot of the feedback provided to one individual will still be valuable to others who have not yet been able to find mentors themselves. A series like this could then be a resource for newcomers, and could significantly increase the number of people who can get involved in the field.

If you have the time, I would greatly appreciate any feedback that you have regarding my thought process and this endeavor. Check out the Feedback Request section for some more information.

Feedback Request

I have prioritized getting this out quickly over this being my final plan, as I believe we can reach a much better result more efficiently together than I could on my own. So, if something seems incredibly obvious and I didn't mention it or take it into account, please let me know! Here are some questions, grouped by audience, to think about as you read through this post:

If you are new to AI safety, would a series like this be helpful? Is there anything you would like to make sure is covered early on (if possible)?
Field builders, am I making a field-building faux pas, or does something like this make sense to you? Is there any way I can improve it?
AI safety veterans, am I missing an obvious opportunity or angle? Is there anything that you think would be highly valuable to talk about in a series like this?

Brief Intro

My name is Jacob Haimes, and I live just outside of Boulder, Colorado, USA. I have an MS in computational modeling. In my free time I love playing all kinds of games, especially TTRPGs.

Examining the Situation

When I began thinking about making a career change into AI safety, I saw two options: up-skill and apply for positions, or apply for a PhD (with the intent to focus on AI safety, and enter the field once finished with the program). After discussing with a number of individuals, I decided on the former for the following reasons:

Consensus seemed to be that a PhD is most effective when the individual knows precisely what niche/problem they plan on studying - I did not.
The field of AI safety is currently relatively small, but is growing, meaning that work which is done now could have an increased impact (as it is likely to influence future work and/or practices).
Timelines for TAI/AGI may be short enough that we need as many individuals as possible working on AI safety as soon as possible.

In these same conversations, I would ask what resources and methods others had found to be the most helpful. The most common piece of advice went something like this: "Courses are helpful to some extent, but I found actually contributing towards research and/or working on projects to be much more effective." When I asked where to go to contribute to these kinds of efforts, I was pointed to some Discord servers, opportunities boards, research groups, grants, organizations, and even some start-ups and companies that would serve more to build career capital than to let me add value to AI safety directly.

Unfortunately, a majority of the job postings aren't looking for newcomers; instead, they include statements such as "must have 5+ years of work in a relevant field" or "Master's required, PhD preferred." To limit the pool even more, a very significant portion of the beginner-accepting positions require relocation to Washington D.C., the San Francisco Bay Area, or London, and all are extremely competitive. It seems that the frequently parroted idea of "we need more people interested in AI safety" isn't wholly accurate; a more authentic version would be "we need more sufficiently experienced and credentialed people interested in AI safety."^[2]

The Inconsistency

As far as I can tell, my story up to this point is a relatively common one, meaning that we have found a routinely occurring inconsistency: as people begin thinking about entering into the AI safety field, they are discouraged from pursuing a PhD, but once they know enough to theoretically be able to contribute, there are very few opportunities for individuals with their level of experience.

The value of a PhD, in addition to having a relatively stable salary for 3-5 years, is getting experience doing research (in your field of study) under the guidance and oversight of someone who has significantly more knowledge and practice. Theoretically, at least, this kind of relationship is not monopolized by higher education; instead, mentorship is constrained by the availability of mentors. Since the field is rapidly growing (nice work, field-builders), there is no way that our current AI safety experts have the capacity to mentor all newcomers, continue pushing forward their own work, and have a healthy work-life balance.

Something That May Help

To resolve this, we need a way to amplify the mentorship that is getting done. I think that I might be able to assist in this endeavor by recording/documenting my transition to a career in AI safety, including conversations that I have with peers/mentors, my progress through courses I am able to get into (with the facilitator's permission), and how my perspectives change as I learn more. This won't be as good as talking directly to one's own mentor, but should provide an additional resource for individuals that are finding it difficult to know what to do next, and how to do it.

Regarding Conflicts of Interest

As I was writing this out, I realized that this post could be perceived as primarily self-serving. This perspective would consider me as an individual who hadn't yet found mentors, and was trying to secure the best ones under the guise of helping others. Perhaps you weren't thinking this, but if I noticed it myself, I think it is probably worth addressing.

I am committed to working in the AI safety space, and I currently have a stable part-time job. Because of this, I am confident that I will, at some point, find fantastic mentors. I do not want to poach mentors from others, so I have come to a two-part solution:

I commit to mentoring others once I have the skills and experience to do so.
The proposed series should not be taken into account when weighing whether or not you would reach out to me and/or offer mentorship.

If you have the availability to mentor someone, please do so.

Series Structure

After receiving the resources that I mentioned earlier, I put myself out there on Discord, messaging individuals and channels about opportunities I could contribute to, and I applied for all of the other opportunities that were accessible to me. Even so, I found no traction, so I began trying to make my own opportunities.

Through networking with others in my AI Governance cohort, I was eventually able to connect with another researcher who was willing to collaborate on mechanistic interpretability research, and I began working with him. I've also been keeping a close eye on the forums and job boards for new opportunities, leading me to Linda Linsefors's post about applying to be a research lead for the upcoming AI Safety Camp. Although at first I was not sure if I would meet the requirements for such a position, I reached out, and verified with her that I could be a good fit. Since then I have gone through multiple iterations of my formal research proposal (effectively the application for AISC), and just in doing that I have learned a lot.

With this in mind, the series would most likely be primarily a podcast, with the intent to produce additional content (videos, documents) when a situation was particularly conducive to it. The first episode of the podcast would be a summary of my path thus far, including the resources and experiences that have been most valuable. After this (with the possibility of revisiting anything that others are particularly interested in), the podcast would turn into an audio journal about the two projects I am currently working on (the AISC research plan and the mechanistic interpretability research), as well as any that I pick up over time. I have been having multiple meetings a week with connections I have made throughout the courses, and I would (with permission) record the audio of those sessions as well. I would then edit these recordings down to focus on the most important parts, which should help hold listener attention. In addition, I would be happy to create posts with my thoughts on why and how specific resources were helpful to me.

Moving forward, I will then continue to apply to programs/courses (e.g., SERI-MATS), with the intent of sharing as much content as is safely possible (I could see there being some edge cases where research requires more safety measures, although I doubt I will be in a position to contribute towards those kinds of projects for some time).

Acknowledgements

Special thanks to Linda Linsefors, Peter Gebauer, and Chris Lonsberry for feedback on this post and our discussions. I also would not have made this post without the help of AI Safety Quest's navigation call, 80,000 Hours 1-1 advice, or BlueDot Impact's AI courses.

I would also like to thank those of you who took the time to read through this post and provide feedback.

^[1] Full credit to Peter Gebauer for having this idea during one of our discussions.
^[2] I have even found one relatively recent (late March, 2023) post claiming that OpenAI would love to hire more researchers, if only there were any great researchers willing to work on alignment.
