Cross-posted from the Effective Altruism Forum
Summary: AI safety university groups are a promising way to grow the talent pool working to reduce existential risk from AI and can be fairly straightforward to set up successfully. Based on the experience of several student groups last semester, running AGI Safety Fundamentals or a similar AI safety reading group can attract dozens of participants and boost their likelihood of working on AI safety, while engaging many who wouldn’t have been reached by cause-neutral effective altruism community building. If you don’t have the time or experience to run an AI safety reading group, it can be valuable to broadly publicize a virtual program facilitated by another group (e.g., EA Cambridge).
Overall, AI safety field-building is a new, exciting area with much low-hanging fruit. At the same time, there are risks from poorly executed field-building which should be thoughtfully handled, such as inducing a poor impression of AI safety or increasing interest in accelerating the development of (potentially unsafe) artificial general intelligence (AGI).
If you’re interested in AI safety field-building, either at a university or in another setting (e.g., technical workplaces), fill out this form to get support from AI Safety Hub! Even if you think you’d do fine without a call, I’d strongly recommend taking a few seconds to fill this out, so that they can connect you with resources and other group organizers.
Transformative AI might pose considerable existential risk within the coming decades or later this century. Trained to maximize an objective, AI systems often find creative and undesirable ways to do so. YouTube's recommendation algorithm, trained to suggest content to maximize user engagement, ended up recommending increasingly extreme content and proliferating misinformation. Language models such as GPT-3, trained to predict the next word in a piece of text, can be prompted to produce offensive content and leak personal information from the training data. As AI capabilities advance, we can expect the stakes of these risks from misaligned systems to dramatically increase. Aligning deep learning systems with human values might be extremely challenging, and approaches that work for making a simpler model safe may fail for more advanced models. At the moment, leading AI companies such as Google Brain and Meta AI are actively working on developing AI with increasingly general capabilities (see A path to autonomous machine intelligence and Pathways), while not investing much in research to ensure that future systems are aligned and safe. (For a more thorough introduction to existential risk from AI, see Introduction to AI Safety - Robert Miles, Why AI alignment could be hard with modern deep learning, AI alignment - Wikipedia, and Power-seeking AI and X-risk mitigation.)
To make progress on the problem of existential risk from AI, it’s extremely valuable to have additional talented people working on it. That includes people in a variety of roles, such as machine learning safety research, conceptual alignment research, policy research, software engineering, operations, machine learning engineering, security, social science, and communications. Universities are a major source of potential talent, especially as students are actively exploring different possible career options. However, the vast majority of students have not heard about AI safety, even among students interested in AI, and fewer still have considered how they could work on AI safety.
Despite the need for many more people to work on AI safety, there hasn’t been much high-fidelity, in-depth AI safety community-building at universities. But this year, there’s been a much stronger effort to actually try, and it seems to have been broadly successful. A few case studies from this past semester:
Still, AI safety field-building remains relatively neglected in the world and in EA community groups. While there are 80+ EA university groups around the world, there are only 16 groups that I know of that are running significant AI safety activities (e.g., a program similar to AGI Safety Fundamentals). [Update: As of November 2022, I believe there are ~30 such groups.] Just a handful of groups have AI governance programming. There’s a lot of low-hanging fruit in the area, and a lot of capacity for more people to help out in this space. (Even relatively large and successful groups could benefit from more community builders; for example, EA at Harvard is currently interested in having an additional community builder working full time for AI safety field-building. Note that you don't need to be a university student to work on university community-building; indeed, OxAI Safety Hub and EA Cambridge are assisted by full-time staff who have already graduated.)
If you’re quite busy and limited on free time, one option is to just publicize a virtual AGI Safety Fundamentals program to computer science (CS) students at your university, the next time EA Cambridge runs the program (sign up here to be notified when it does). Depending on your university, there might be ways to do marketing which don’t take much time. I think it might just take a couple hours, if that.
Perhaps your university allows you to email all the CS students by emailing a listserv or asking an academic advisor. You could post about it to an online group used by students, such as Reddit, Facebook, or Discord. Maybe you could ask professors or clubs to forward an email to students. And of course, you could encourage your friends to sign up. (In publicity, I’d probably refrain from mentioning “artificial general intelligence” or “existential risk” to appear more mainstream; see EA NYU’s fellowship page for one possible way to describe the program.)
You might wonder, why would anyone sign up for a program like AGI Safety Fundamentals, especially when students are so busy? Here are some factors which I think helped EA at Georgia Tech get a good number of participants this past semester:
And since the participants commit to attending the discussion meetings every week, they're fairly likely to attend most meetings.
It can be valuable to have some sort of recurring in-person meetings, such as weekly events with free lunch. This could be a purely social event, or it could involve watching some videos relevant to AI safety (e.g., Robert Miles’ YouTube channel).
Additionally, we might be able to help you save time with group organizing by connecting you with operations assistants.
I think AI safety clubs can contribute to AI safety through pursuing two primary goals:
For the first goal, one option which works well is running a local version of the AGI Safety Fundamentals reading group. As mentioned above, even if you don’t have time to facilitate discussions yourself, it could be quite valuable to publicize EA Cambridge’s virtual program to your university, the next time EA Cambridge runs it.
This section is meant to help you concretely envision what AI safety field-building could look like, especially based on what groups have done in the past. However, it’s not intended to be a comprehensive implementation guide and omits many helpful resources which I could share. If you’d like to get involved, please get in touch!
Publicizing and running an introductory AI safety reading group appears to be a great way to build up a community of people interested in AI safety, analogous to how the Intro EA Program is an excellent way of starting an EA university club and has worked for ~50 university EA groups. I know of nine university groups which ran the AGI Safety Fundamentals alignment program this past semester: Oxford, Cambridge, Georgia Tech, University of Virginia, Northwestern, Mila, MIT, Harvard, and Stanford. Additionally, Columbia EA ran an AI safety reading group using a locally developed curriculum.
The AGI Safety Fundamentals alignment program is a semester-long reading group on AI alignment. Topics include an introduction to machine learning, existential risk from AGI, inverse reinforcement learning, reward modeling, scalable oversight, agent foundations, and AI safety careers. This program involves weekly 1.5-hour discussions in small groups of 4–6 participants and one discussion facilitator, as well as 2.5 hours of readings and exercises to be done before discussion meetings. Participants don’t need any prior experience in CS or machine learning, though it is helpful. The curriculum was first created in January 2021 by Richard Ngo, an AI governance researcher at OpenAI.
If you’d like to run the program locally, you’d create an application form, publicize the program to CS students at your university, and facilitate weekly 1.5-hour discussions for cohorts of 4–6 participants. You’d want to be familiar with the readings beforehand, but otherwise, facilitating discussions is fairly straightforward since there’s a facilitator guide. (If you’d prefer a curriculum with a lower time commitment for participants, check out Columbia EA’s AI safety curriculum.)
You might be surprised by how many people would be interested in applying to join a reading group on AI safety! If you market it well, you can get a good number of applications. I think a lot of students are just excited to learn more about deep learning, even if they don’t have any prior exposure to AI safety content. But even having a small number of participants could be very valuable.
If you don't have the time or experience to facilitate a few cohorts of the AGI Safety Fundamentals program, feel free to register interest for the global program here and publicize it when applications open. To help participants get to know each other outside of Zoom calls, it might be valuable to coordinate in-person meetups for participants, such as casual lunch socials.
Organizing a reading group is only one possible model for building a community. Other plans I’ve heard of for establishing a community are: running a series of workshops, organizing an AI safety retreat, or running an “eliciting latent knowledge” or distillation contest. Additional activities that have worked well for groups include a watch party of Robert Miles’ YouTube channel on AI safety, speaker events, paper reading groups, casual social events, and simple one-on-one conversations – but I think simply publicizing/running AGI Safety Fundamentals would be an excellent way to get a group started.
Besides getting people interested in working on AI safety, local groups can play a valuable part in helping people gain the skills necessary to actually contribute to the field.
One simple option for helping members with upskilling is to connect members with existing resources for upskilling and to provide basic career advice for interested members. You could encourage and support members to:
Still, there’s a gap between knowing what to do and actually doing it, and closing it might take just a bit of extra support. Stanford AI Alignment is planning to run coworking sessions and to set up peer accountability around SMART goals.
After there’s a solid community of students interested in AI safety, you could organize group activities such as the following (based on suggestions from AI safety researchers):
In planning upskilling activities, it’s helpful to first get a sense of the skills and experience necessary for various careers, such as by reading “How to pursue a career in technical AI alignment” (strongly recommended!) and browsing job descriptions from the 80,000 Hours jobs board. I think it could also be useful to be familiar with the overall process for internships or getting involved with research – see the interview process for software engineering internships, the Tech Interview Handbook, the Machine Learning Interviews Book, and a thread on emailing professors about joining their research lab.
You’d be a great fit if you are:
Some reasons to not work on this:
AI safety field-building at universities is a promising and neglected way to engage more people to work on reducing existential risk from AI. You don’t have to be studying at a top CS or policy university, since you can work on AI safety field-building where you are or at another university.
If your university doesn’t already have much of a community interested in AI safety, a great option to get started would be to organize an introductory AI safety seminar program and publicize it to local students. You could use EA Cambridge’s alignment curriculum or governance curriculum, or Columbia EA’s alignment curriculum. If you’re not able to facilitate the program yourself, EA Cambridge or another EA group should be able to support you in recruiting virtual facilitators from the hundreds of past participants. Then, all you’d have to do is create an application form, publicize it to your school (e.g., through listservs), assign participants and facilitators to cohorts, and coordinate the program.
It’s also valuable to encourage participants to take further action such as:
Other valuable activities that groups can run include projects, paper reading groups, guest speaker events, one-on-one conversations, and retreats.
You could organize a group either on a part-time or full-time basis. Group funding for snacks, books, and much more is available from the Centre for Effective Altruism’s Group Support Funding. Besides being impactful, group organizing can also be a great way to build career capital.
I’m Michael Chen, a recent graduate from Georgia Tech. I’m currently a research intern at the Center for Human-Compatible AI, though this post only reflects my personal views, inspired by the activities of other AI safety community groups. Last year, I co-founded EA at Georgia Tech along with Anish Upadhayay. Before then, I felt pretty pessimistic about the viability of starting an EA group there, but it turns out that starting a group was a lot easier and more successful than I expected. I’d be really excited to support the development of other AI safety university groups!
If you’re interested in helping to build up the AI safety community at universities, please fill out this interest form here!
Thanks to Justis Mills, Jamie Bernardi, Luise Wöhlke, Anjay Friedman, and Thomas Woodside for providing suggestions and feedback on this post! All mistakes are my own.
When I say “AI safety university group”, I’m also including EA clubs which have significant activities focused on building the AI safety community. To the average computer science student, “effective altruism” sounds like it has nothing to do with either AI or computer science, so from a marketing perspective, I think it can be helpful to run AI safety activities under an AI safety club, but I don’t think it matters that much overall.
Another risk is if humanity fails to create a future with net positive value for sentient beings, despite successfully managing to avoid extinction or disempowerment from AI – see The future might not be so great and A typology of s-risks.
This post will largely focus on university groups, but some of the content here may apply to workplace groups. While reading groups (also known as “fellowships” or “seminar programs”) have been a successful way for EA or AI safety university groups to attract new members, this is unlikely to be as fruitful for workplace groups, as employees rarely have time to participate in a reading group. Workplace groups may want to focus on activities with a lower time commitment for participants, such as introductory talks or workshop series.
Research impact is heavy-tailed; the most influential machine learning researchers have ~1,000× as many highly influential citations. Other roles like engineering might be less heavy-tailed; still, AI safety organizations are looking for quite excellent engineers. Here, I don't necessarily mean reaching people who are already highly skilled in machine learning, policy research, etc. While that would be great, we can also reach bright students who could become highly skilled later. Consequently, for people whose comparative advantage is in reaching university students, it can make sense to prioritize reaching people who are especially likely to be very talented in the future.
See Communications Specialist | Fund for Alignment Research (FAR) and Call For Distillers - AI Alignment Forum.
That would be (in no particular order) OxAI Safety Hub, EA Cambridge, EA Warwick, EA at MIT / AI Safety at MIT, EA at Harvard / EA at the Harvard Kennedy School, EA NYU, EA at the University of Virginia, EA at Georgia Tech, EA Northwestern, AI Safety at Mila, Columbia EA, EA Georgetown, London School of Economics EA, Imperial EA, EA at UC Berkeley, the Stanford Existential Risks Initiative, and UCLA EA. If I’m missing your group, sorry about that! Feel free to shoot me a message and say hello :)
Here are my recommendations for learning about AI safety: