AI Alignment Research Engineer Accelerator (ARENA): call for applicants

CallumMcDougall

(Edited, to now include a section specifically for FAQs about the virtual program.)

TL;DR

Apply here for the third iteration of ARENA (Jan 8th - Feb 2nd)!

Introduction

We are excited to announce the third iteration of ARENA (Alignment Research Engineer Accelerator), a 4-week ML bootcamp with a focus on AI safety. Our mission is to prepare participants for full-time careers as research engineers in AI safety, e.g. at leading organizations or as independent researchers.

The program will run from January 8th - February 2nd 2024^[1], and will be held at the offices of the London Initiative for Safe AI. These offices are also being used by several safety orgs (BlueDot, Apollo, Leap Labs), as well as the current London MATS cohort, and several independent researchers. We expect this to bring several benefits, e.g. facilitating productive discussions about AI safety & different agendas, and allowing participants to form a better picture of what working on AI safety can look like in practice.

ARENA offers a unique opportunity for those interested in AI safety to learn valuable technical skills, work on their own projects, and make open-source contributions to AI safety-related libraries. The program is comparable to MLAB or WMLB, but extends over a longer period to facilitate deeper dives into the content, and more open-ended project work with supervision.

For more information, see our website.

Also note that we have a Slack group designed to support independent study of the material (join link here).

Outline of Content

The 4-week program will be structured as follows:

Chapter 0 - Fundamentals

Before getting into more advanced topics, we first cover the basics of deep learning, including basic machine learning terminology, what neural networks are, and how to train them. We will also cover some subjects we expect to be useful going forwards, e.g. using GPT-3 and 4 to streamline your learning, good coding practices, and version control.

Note - participants can optionally not attend the program during this week, and instead join us at the start of Chapter 1, if they'd prefer this option and if we're confident that they are already comfortable with the material in this chapter.

Topics include:

PyTorch basics
CNNs, Residual Neural Networks
Optimization (SGD, Adam, etc)
Backpropagation
Hyperparameter search with Weights and Biases
GANs & VAEs

Duration: 5 days

Chapter 1 - Transformers & Interpretability

In this chapter, you will learn all about transformers, and build and train your own. You'll also study LLM interpretability, a field which has been advanced by Anthropic’s Transformer Circuits sequence, and open-source work by Neel Nanda. This chapter will also branch into areas more accurately classed as "model internals" than interpretability, e.g. recent work on steering vectors.

Topics include:

GPT models (building your own GPT-2)
Training and sampling from transformers
TransformerLens
In-context Learning and Induction Heads
Indirect Object Identification
Superposition
Steering Vectors

Duration: 5 days

Chapter 2 - Reinforcement Learning

In this chapter, you will learn about some of the fundamentals of RL, and work with OpenAI’s Gym environment to run your own experiments.

Topics include:

Fundamentals of RL
Vanilla Policy Gradient
Proximal Policy Optimization
RLHF (& finetuning LLMs with RLHF)
Gym & Gymnasium environments

Duration: 5 days

Chapter 3 - Paper Replications

We will conclude this program with paper replications, where participants will get guidance and mentorship while they replicate a paper containing material relevant to this course. This should draw on many of the skills and much of the knowledge participants will have accumulated over the previous 3 weeks.

Duration: 5 days

Below is a diagram of the curriculum as a whole, and the dependencies between sections. Note that this may change slightly in the lead-up to the program.

Here is some sample material from the course, which you will be able to fully understand once you reach that point in the course. This notebook is on Indirect Object Identification (from the chapter on Transformers & Mechanistic Interpretability). It will represent one of a set of optional 2-day mini projects which participants can choose from towards the end of that 5-day period.

Virtual Program

We're also very excited to announce a virtual program, which will be hosted by BlueDot's platform (the same one which is used to host the AI Safety Fundamentals course). People who are interested in participating virtually should also apply using the same form (they can express their preference for virtual / in-person in this form).

Call for staff

As well as inviting applications from participants, we're also interested in applications from teaching assistants (TAs). You can apply to be a TA for specific chapters of content, if you have particular expertise in them. TAs will be well compensated for their time. Please contact callum@arena.education with any more questions, or comment on this post and we will try and respond in a timely manner.

We're also interested in people who can provide DevOps support, particularly during the first week (e.g. setting people up on virtual machines, and resolving technical problems).

Lastly, if there are some chapters of the course you're highly knowledgeable in, and others you would like to skill up in, we'd be open to a hybrid system of part-TA-ing, part participating. If you're interested in something like this, you should put "staff" rather than "participant" in the application form (linked at the end of this post), and in the form you'll have the opportunity to specify exactly what you're interested in.

FAQ (general / in-person program)

If you have a question not in this list, which you think it would be valuable for people to have an answer for, please comment your question below and we will respond. If you have a question which you don't want to make public, you can message us directly (or ask the question in your application form).

Q: Who is this program suitable for?

A: We welcome applications from people who fit most or all of the following criteria:

Care about AI safety and making future development of AI go well
Relatively strong math skills (e.g. about one year's worth of university-level applied math)
Strong programmers (e.g. have a CS degree / work experience in SWE, or have worked on personal projects involving a lot of coding)
Have experience coding in Python
(if applying for in-person program) Would be able to travel to London for 4 weeks, starting 8th Jan (or for 3 weeks, starting 15th Jan, if skipping the optional first week)

We are open to people of all levels of experience, whether they are still in school or have already graduated.

Note - these criteria are mainly intended as guidelines. If you're uncertain whether you meet these criteria, or you don't meet some of them but still think you might be a good fit for the program, please do apply! You can also reach out to us directly, at callum@arena.education.

Q: What will an average day in this program look like?

A: At the start of the program, most days will involve pair programming, working through structured exercises designed to cover all the essential material in a particular chapter. The purpose is to get you more familiar with the material in a hands-on way. There will also usually be a short selection of required readings, designed to inform the coding exercises.

As we move through the course, some chapters will transition into more open-ended material. For example, in the Transformers & Interpretability chapter, after you complete the core exercises, you'll be able to choose from a large set of different exercises, covering topics as broad as model editing, superposition, circuit discovery, grokking, discovering latent knowledge, and more. In the last week, you'll choose a research paper related to the content we've covered so far & replicate its results (possibly even extend them!). There will still be TA supervision during these sections, but the goal is for you to develop your own research & implementation skills. Although we strongly encourage paper replication during this chapter, we would also be willing to support well-scoped projects if participants are excited about them.

The program will run on weekdays. Each day will be roughly the length of a normal working day (9am-5pm). There will be no compulsory attendance on weekends, but we might organize AI safety discussion groups or social events during this time. The office space will be available 24/7 for anyone who wants to use it outside regular hours.

Q: How many participants will there be?

A: We're expecting 10-15 participants in the in-person program. For the virtual program, we're anticipating numbers in the range of 20-40.

Q: Will there be prerequisite materials?

A: Yes, we will be sending you prerequisite reading & exercises covering material such as PyTorch, einops and some linear algebra (this will be in the form of a Colab notebook). We expect that these will take approximately 1-2 days to complete.

Q: When is the application deadline?

A: The deadline for submitting applications is November 27th, 11:59 pm UK time (i.e. 3 weeks from today). However, we will be interviewing and making offers to candidates on a rolling basis.

Q: What will the application process look like?

A: There will be three steps:

Fill out the application form (this is designed to take ~20 minutes).
Perform a coding assessment.
Interview virtually with one of us, so we can find out more about your background and interests in AI safety & this course.

Q: Can I join for some sections, but not others?

A: Participants will be expected to attend the whole workshop. The material is interconnected, so not learning all of it would lead to a disjointed experience. We have limited space and therefore are more excited about offering spots to participants who are able to come for the entirety of the program.

The exception to this is the first week, which participants can choose to opt in or out of based on their level of prior experience.

Q: Will you pay stipends to participants?

A: Unfortunately, we won't be able to pay stipends to participants. However, we will be providing housing & travel assistance to in-person participants (see below).

Q: Which costs will you be covering, for the in-person program?

A: We will be providing £500 in travel assistance per participant, as well as £1000 in housing assistance, and up to £500 in visa assistance if this is needed. Meals and office space will be covered by us (and the LISA offices are kept well-stocked with snacks!).

Q: I'm interested in trialling some of the material, or recommending material to be added. Is there a way I can do this?

A: If either of these is the case, please feel free to reach out directly via an EAForum/LessWrong message (or email at callum@arena.education) - we'd love to hear from you!

Q: Do you plan to run more bootcamps in the future?

A: Possibly! If you can't make these dates, then we encourage you to submit an application anyway (the form is designed to be relatively low-effort to fill out). We were pleased with the impact & feedback received from the second iteration of this program, and we'd be excited to continue to run these bootcamps if this program is also well-received. We may also take steps to help university groups teach a version of this curriculum (similar to how WMLB and CamLAB teach versions of the MLAB curriculum).

Q: How has the program been changed from the second iteration?

A: For a full answer to this question, see our impact report for the previous iteration of this program. A short version of the main changes we've chosen to make:

A shorter program overall (4 weeks rather than 6),
Paper replications rather than capstone projects to conclude the program,
A much more extensive "Transformers & Interpretability" chapter, including material outside of the strict definition of mechanistic interpretability (e.g. activation steering and DLK),
Removal of the "Training at Scale" chapter (although we'll be keeping some aspects of this curriculum),
Some other small adjustments to particular chapters (e.g. adding material on GANs & VAEs in the first week, replacing other less essential material such as array stride operations),
A better-structured reading program, which ties closer into the curriculum material.

FAQ (virtual program)

Q: Will the virtual program be full-time?

A: Yes, it will be full time for the 3/4 week duration of the program. Each day will usually consist of a lecture of length 15-60 minutes outlining the core ideas which are important for that day's exercises, followed by working through the coding exercises. We don't expect virtual participants to complete as many exercises as the in-person participants (see next question).

Q: Will the virtual and in-person curricula be different?

A: They'll follow the same basic week-to-week structure. However, we understand that people completing the program virtually won't be able to work at the same pace as people completing it in-person. We will be indicating which parts of the exercises are essential and which can be skipped, and we'll also generally avoid having exercises be too interdependent so participants don't fall behind if they can't complete one particular day.

Q: Can I miss sessions in the virtual program?

A: As mentioned above, the exercises generally won't be heavily interdependent. There are some exceptions (e.g. none of the material in the transformers chapter can be tackled until you've gone through the process of implementing your own transformer). However, we'll do our best to structure things so that participants can complete less of a particular section (or even miss sections) and still keep pace with the rest of the course.

Q: What feedback will be given during the virtual program?

A: At the end of each day of exercises, participants will submit their work (in the form of a link to a shared Colab, or a GitHub repo). They can also request feedback on specific parts of their solution, which we will be able to provide.

Q: What if I'm interested in the virtual program, and I've already done some of the exercises?

A: If these are the compulsory exercises (e.g. any in chapter 0, or the start of chapter 1), you can work on bonus material during this time if you prefer. Alternatively, you can work on some of the other optional sets of exercises (e.g. chapter 1 contains a very large set of different possible exercises, of which most participants will only have time to try a few). Regardless of which sections you've completed, we still encourage attending the virtual talks and events we'll host during this time.

Q: Will the virtual program still have pair-programming?

A: Yes, we will be providing support for pair programming, and heavily encouraging this. If it's not feasible for you, then we will allow people to work on their own (or choose a middle ground, e.g. coworking but not pair programming). However, we strongly encourage pair programming because it's a pretty vital part of the overall experience.

Q: Will the assessments be different for virtual and in-person?

A: No, the assessments will be the same for both cases. However, we expect to have a higher bar for the in-person assessments, because places will be more limited.

Q: Will we provide compute support for the virtual program?

A: Unfortunately, we won't be able to provide compute via a provider as we will for the in-person program. We'll encourage the use of Colab (all the exercises have associated Colab notebooks), and we will be able to reimburse people for Colab Pro subscriptions for the duration of ARENA. If people would prefer to use their own IDEs & source their own compute, then we'll provide as much assistance as we can (setup instructions and a Slack group for DevOps support-related questions), although we won't be able to offer as much support as we can for the in-person group. We therefore recommend the Colab option to most people, unless you're already comfortable setting up your own coding workspace & environment.

Q: I want to meet people who are interested in alignment, and am worried the virtual program will not help me do that.

A: We intend to help virtual participants network with each other, through pair programming, virtual discussion groups, and office hours. We also encourage virtual participants to reach out to each other to discuss the material, cowork, or hear what others are working on.

Q: Can I attend a hybrid of virtual / in-person?

A: We will be able to invite virtual participants as guests to the office for short periods, if they are in the vicinity of the LISA offices. This will be a great opportunity to meet us and chat to the other researchers using the space. However, we can't guarantee virtual participants the use of the office space for extended periods.

Link to Apply

Here is the link to apply (it is the same for participants and staff). You shouldn't spend longer than 20-30 minutes on it.

We look forward to receiving your application!

^[1] The first week will be optional, as we discuss further down in this post.
