Lessons from organizing a technical AI safety bootcamp

by Vili Kohonen, Dmitrii Gusev
28th Sep 2025
19 min read
Comments
Sergei Smirnov:

"Finally, one considerable operational hindrance was that internally we didn't have a proper tracker for stuff to do. We assumed things would be straightforward..." Could you please elaborate, giving examples of what was not straightforward and what required ad hoc meetings?

Vili Kohonen:

For example, while inviting speakers was mainly Dmitrii's responsibility, it wasn't that easy, and I invited some of them myself as I knew the relevant people personally. We also went over several smaller tasks together, like which questions to include in the feedback forms and what information we needed to provide about Helsinki for people coming from abroad.

In addition, two dynamics really increased the number of our ad hoc meetings:

  1. We really wanted to do things well. Dmitrii liked to ask for my feedback and I readily gave it to him, and sometimes vice versa.
  2. Gather Town made it super easy to have a chat. When another person is online, you can wave them over to your desk in two clicks, and accepting takes the other person just one click.

I think these things improved the program. But again, we were operating reactively, and it was quite stressful. Had we specced the bootcamp out properly, used e.g. Trello, and held structured meetings where we only needed to confirm everything was going mostly as planned, I think we would have saved loads of time and lowered our stress levels. We just didn't have the time or the experience this time around.

Aay17ush:

This is well-documented, and I'm happy the program turned out well! 


Summary

This post describes how we organized the Finnish Alignment Engineering Bootcamp, a 6-week technical AI safety bootcamp for 12 people. The bootcamp was created jointly with the Finnish Center for Safe AI (Tutke) and Effective Altruism (EA) Finland. It was composed of five weeks of remote learning based on the ARENA curriculum and a one-week on-site research sprint. We provide extensive details of our work and lessons learned along the way. Hopefully, this post helps others build similar programs and run them (or existing ones) more effectively. We don't focus here on our impact, although you can read about it here.

Thanks to

- Santeri Tani and Karla Still for their help with creating the program,

- James Hindmarch, Joly Scriven, David Quarel, and Nicky Pochinkov from ARENA; Gergő Gáspár from ENAIS; and Clark Urzo from Whitebox Research for their advice,

- Reetta Kohonen, Claude 4 Sonnet and GPT-5 for extensive comments on the draft of this text.

 

Structure of the post

This post describes the creation and running of the Finnish Alignment Engineering Bootcamp (FAEB) under the Finnish Center for Safe AI (Tutke) and Effective Altruism (EA) Finland. The post is divided into the following subsections, each of which ends with short lessons learned:

    1. Team and preparations

    2. Candidate recruitment

    3. Infrastructure and logistics

    4. Remote learning and TAing

    5. Speakers and extracurriculars

    6. In-person project week

We go over our work and considerations in detail, so it might make sense to skim this post and concentrate on the lessons learned unless you are very intrigued.

 

1. Team and preparations

Vili (a Math PhD student at Aalto University) was admitted to the fifth ARENA program, while Dmitrii (a CS bachelor's student at Aalto University) had been organizing local AI safety events. When several people in the Finnish AI safety community expressed interest in studying the ARENA curriculum, and EA Finland offered funding for summer projects, Dmitrii proposed creating a formal program; arguably, there are more people interested in contributing to AI safety research than the current supply of programs can upskill. Vili agreed on the condition that Dmitrii commit to handling all operations, allowing Vili to focus solely on curriculum structure and part-time teaching. Both of us had other commitments during the program, which we do not recommend. Santeri Tani from Tutke promised support.

Due to financial and time constraints, we knew that going through the ARENA materials had to be done remotely, but we also wanted an in-person project week. The benefits of getting people in the same space are huge, and while the structure of the closed-ended ARENA curriculum supports remote learning, the value of actually working on open-ended research projects cannot be overstated. We consulted several people about how to conduct such a program, and although it didn't sound easy, it didn't sound too hard either. In retrospect, this was correct, but we severely underestimated the workload. Some of them expressed skepticism about the remote structure, but we didn't really have a choice.

The initial funding obtained from EA Finland was barely sufficient to cover compute for ten people for five weeks and a week of shared accommodation in Finland for the project week. Santeri Tani from Tutke obtained additional funding that allowed for some breathing room and the acceptance of three more participants. Still severely financially constrained, we moved forward without salaries. In the end, we received extra funding from OpenPhil just before the project week, so we could provide lunches and dinners together with some transport support, and the grant also included small salaries for the two of us. Working with OpenPhil was pleasant and smooth, though we are uncertain whether we would have received their support without EA Finland and Tutke vouching for us by offering initial funding. Tutke also offered bonuses for our work.

The final expense breakdown was:

  • €3,500 for accommodation,
  • €3,000 for meals and snacks,
  • €2,000 for compute and API calls,
  • €800 for other minor expenses, such as the coding test, booking a sauna for the after-party, etc,
  • The residual budget was used for travel support.[1]

It must be emphasized that we were in a terrible rush. Due to scheduling conflicts, the decision to move forward with the program was made on May 13, and the start date was set to June 16, leaving only a month to find participants and get everything ready. This was the minimum we expected to need to make everything work, but in hindsight, we should have postponed the start by at least a week.

We set our objectives as:

    1. Upskilling participants to produce or support AI safety work

    2. Increasing the amount of effort people put into AI safety

    3. Having an attrition rate of at most 33%

    4. Getting an 8/10 satisfaction rating with the program

The first objective was hazy, but the reason we set out on the task was that there is an influx of really smart people who need to learn the basics of technical AI safety before they can do research (or other impactful work) to mitigate existential risks from AI. This was hard to measure, but we applied learning metrics from ARENA to have something quantifiable, including self-reported knowledge levels in areas such as LLM evals and technical research experience.

The second objective was a supporting proxy for the first. We ran a pre-program survey asking how much time people were putting into AI safety, and we will run a follow-up survey soon (three months after the end of the program). The aim was for the program to increase participants' engagement and effort toward the cause.

The target attrition rate (proportion of people dropping out) was set to 33%. This was a number that people familiar with conducting similar programs said was respectable.

The 8/10 satisfaction target was an internal judgment call, reflecting the speed at which we were moving, our scarce resources, and our limited experience.

Because we were running a remote program, we knew we would receive applications from people who could not attend longer in-person programs, many of whom most likely also couldn't work on the bootcamp full-time. Hence, we decided to split the program into full-time and part-time tracks. Vili was hesitant about this for a long time, knowing from experience how strenuous studying the materials could be, but in hindsight it was 100% the right choice. The participant split was approximately 50/50, and we would have missed out on very talented and motivated people had we enforced full-time participation only. The full-time track covered the ARENA materials in full, while for the part-time track, Vili held 1-on-1s to find the subset of the materials that would benefit each participant most.[2]

The first and maybe even the largest mistake we made was that we didn't explicitly plan expected deliverables and timelines, or clarify our team responsibilities. Even though we were running "only" a 6-week program, there were many moving pieces, and instead of Dmitrii doing all of the operations and Vili concentrating on teaching, we spent a lot of time confused about what to do and who should do it. This resulted in lots of ad hoc meetings to go over trivial issues and assign responsibilities on the go.[3]

 

    Lessons learned

  • Start preparations early, preferably at least three months in advance, especially if applying for funding.
  • Ask for advice. When you are motivated and show people you've done your homework, people are likely to help.
  • Set clear objectives and create a project plan. Have a "good enough" picture of how to arrange everything.
  • Have clear roles and responsibilities with estimates of the required working hours. This is even more important if you are not planning to organize the program full-time and have obligations on the side.
  • Estimate the budget and leave some buffer. You don't want to go close to exhausting all your resources, let alone burn your own money.

 

2. Candidate recruitment

Dmitrii created a web page with an FAQ and an application form for the program. We wanted a short, easy-to-fill form that would still tell us whether an applicant satisfied our main criteria of having:

  • technical skills to complete the program
  • a (somewhat) clear path to contributing to AI safety
  • experience of working independently on a significant project (research, entrepreneurship, personal projects)

The application period was two weeks, and Dmitrii marketed FAEB through several channels, including the AI Safety Events & Training Newsletter, the AI alignment Slack, and the channels of Finnish AI-safety-related groups. This was very important for a new program like FAEB, and backing from Tutke and EA Finland also lent us some credibility. We received ~80 high-quality applications.

We narrowed the pool to the 25 most promising applicants, those who seemed able to complete the program and contribute to AI safety soon after. This was roughly twice the number of people we expected to accept, leaving leeway for candidates who might not fit the bill in the end and for candidates who might decline an offer.

The rest of the recruitment process consisted of a 30-minute interview and a 60-minute coding test. The interview questions sought to elicit a more nuanced understanding of candidates' stances on AI risks and how they form views on safety approaches. It was also crucial to clarify how committed people were to the program. This was fuzzy, but it could be somewhat inferred from responses about other commitments during the program and from how willing candidates were to fly to Finland at their own cost, as budget constraints meant we couldn't cover travel expenses. The most promising candidates also gave good answers to how the program would help them counterfactually if they were accepted, and what their plan B was. Building trust both ways was important, so we tried to make the interviews as honest and transparent as possible, ending each interview with more program details and time for the candidates to ask questions.

Because the ARENA materials are very technical and almost exclusively focused on programming, assessing people's coding skills was mandatory. We considered skipping a paid coding test, but that didn't make any sense. We then "splurged" $200 on Coderbyte, a platform we had heard was OK, as we were already almost late in sending invitations for the next recruitment stage. Cheaper alternatives that would also do the job probably exist.

Vili wanted to conduct all of the interviews himself. This was plenty of work, but the argument was that, given his experience in ARENA, he could best assess who could and would complete the program. He was also responsible for the program structure and contents, so he could answer program-related questions best.

We then discussed together whom to accept into the program. Out of the very narrowed-down applicant pool, we made offers to 13 people, and everyone accepted. Had we had more money, we would have liked to accept more people into the program.

During the recruitment process, Dmitrii communicated with the candidates quickly and clearly. This was very important for us: people put time and effort into their applications, so we wanted to give them the best experience possible with our resources. We also wanted to encourage talented and motivated people whom we couldn't help this time.

 

    Lessons learned

  • Offer a good description and FAQ for the program.
  • Market the program in as many places as possible, especially if it's new.
  • Have explicit criteria with which to assess applicants.
  • Make sure you receive enough information during the recruitment process to differentiate the best applicants while keeping it as easy and light as possible.
  • Communicate transparently and promptly with applicants about how the recruitment process is going and why certain decisions were made.

 

3. Infrastructure and logistics

We were doing most things for the first time, so we encouraged everyone to give open feedback, although we only realised halfway through the program that we needed a formal weekly Google Forms survey. Receiving frequent input from the whole cohort was very important. While regular 1-on-1s are essential, they might give a biased view of how things are going; complementing them with anonymous feedback can elicit great suggestions for improvement.

We used Slack as the course platform to inform participants about what was happening and to host asynchronous discussion. We tried hard to create a set of channels that were mutually exclusive and collectively exhaustive, but participants still found it confusing. The best suggestion we got and implemented was to create one pinned post with all the most relevant information and links.

Gather Town was our choice for the platform where collaboration actually happened. This was suggested by everyone we talked with prior to the program, and we echo their view that it makes sense to pay for Gather. People commented that in the beginning, the platform felt a little weird, but that they quickly became accustomed to it, and it became surprisingly nice and fun.

LettuceMeet is commonly used to find suitable times for group activities, but when we tried it, we encountered some technical problems and switched to When2Meet.

We provided participants with compute infrastructure using the ARENA infra. Nicky has done an amazing job with it and helped us a lot. The README is even more instructive now, so setup should be relatively straightforward. Some people still used their own machines, and in principle one could work through the materials on Google Colab or similar, but we received lots of good feedback about the computational environment. To anyone considering a program based on the ARENA curriculum, we strongly recommend providing the environment.

However, we didn't find a good solution for allocating the VMs to participants. Concretely, we just had a Google Sheet where people added their name when they were on a machine and had to remember to remove it when they stopped working. This is a small quibble, but with more time it would have been nicer to, e.g., run a script on the proxy server that detects and displays when an SSH connection to a VM is established.
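
For illustration, here is a minimal sketch of what such a script could look like, assuming the proxy forwards SSH traffic to the VMs (e.g. via SSH ProxyJump) so that connections to their port 22 appear in the proxy's socket table; the VM inventory below is a hypothetical placeholder:

#!/usr/bin/env python3
"""Poll the proxy for active SSH connections to each VM and print a
simple occupancy table. Sketch only: fill in the real VM inventory."""

import subprocess
import time

# Hypothetical inventory: VM name -> internal IP address.
VMS = {
    "vm-01": "10.0.0.11",
    "vm-02": "10.0.0.12",
}

def active_ssh_peers() -> set[str]:
    """Return peer IPs with an established TCP connection to port 22,
    as reported by `ss`. The state filter drops the State column, so
    the peer address is the fourth field of each row."""
    out = subprocess.run(
        ["ss", "-tn", "state", "established", "( dport = :22 )"],
        capture_output=True, text=True, check=True,
    ).stdout
    peers = set()
    for line in out.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 4:
            peers.add(fields[3].rsplit(":", 1)[0])  # strip the port
    return peers

if __name__ == "__main__":
    while True:
        busy = active_ssh_peers()
        for name, ip in VMS.items():
            print(f"{name:8} {'in use' if ip in busy else 'free'}")
        print("-" * 20)
        time.sleep(30)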

Regarding the VM infrastructure, a major mistake was that Vili didn't notice that RunPod, our compute provider, supported automatic billing. Although we had secured funding, it wasn't immediately available to Vili during the program, so he was paying for compute out of his own pocket and topped up the balance only in small, calculated amounts to cover the upcoming days. During the project week, both compute costs and stress increased, and on the third day Vili made the error of not topping up the balance enough. The funds ran out just before the fourth day started, and several people lost their overnight runs and some data. This was devastating to Vili personally, but fortunately the damage was limited, and the lost work could mostly be recovered by adding extra VMs to run things in parallel. Make sure to set up automatic billing.
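
A cheap safety net alongside automatic billing is a cron job that warns you before the balance runs dry. A minimal sketch only: fetch_balance is a hypothetical stub to be wired to your provider's real billing API; here it just reads an environment variable so the script runs as-is.

#!/usr/bin/env python3
"""Low-balance alert for prepaid compute, meant to run from cron."""

import os
import sys

THRESHOLD_USD = 20.0  # alert when the prepaid balance drops below this

def fetch_balance() -> float:
    # Stub: read the balance from an environment variable so the script
    # runs as-is. Replace this with a real call to your provider's
    # billing API.
    return float(os.environ.get("COMPUTE_BALANCE_USD", "0"))

def main() -> int:
    balance = fetch_balance()
    if balance < THRESHOLD_USD:
        # A nonzero exit lets cron's mail (or a wrapper that pings
        # Slack/email) surface the alert to the organizers.
        print(f"WARNING: compute balance is ${balance:.2f}, below the "
              f"${THRESHOLD_USD:.2f} threshold. Top up now.")
        return 1
    print(f"Compute balance OK: ${balance:.2f}")
    return 0

if __name__ == "__main__":
    sys.exit(main())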

Finally, one considerable operational hindrance was that we didn't have a proper internal tracker for things to do. We assumed things would be straightforward and that we could manage the majority of our work independently, but things were messy and required coordination. Mix in stress, occasional forgetfulness, and perfectionism, and the result was even more stress. We realized this fully around halfway through the program but still didn't course-correct properly, wasting time and energy checking on each other. Use Trello or a similar project management service, have light processes for recurring things, and consider, e.g., weekly sessions with a proper structure. This way, you have actual proactive space to assess how things are going instead of reactively holding ad hoc meetings all the time.

 

    Lessons learned

  • Make a simple set of requirements for your infrastructure and find suitable tools or platforms to satisfy those. Google Forms, Slack, Gather Town, When2Meet and ARENA infra worked really well for us.
  • Get constant feedback. Tie forms to activities and gently nudge people to continuously provide you with proposals for improvement.
  • Pay attention to billing; cut your subscriptions when you don't need them, and for regular running costs, make sure the billing is automated.
  • Have structure or processes for handling recurring tasks and also a way to discuss ad hoc issues. This both depends on and determines how you work as a team, with a large impact on efficiency and stress.

 

4. Remote learning and TAing

The program schedule was given in a Google Sheet. Vili was available 3-4 hours a day in Gather Town to provide help or just talk in general. It was quite hard to find suitable times as a) we had participants from East Asia to the West Coast of the US[4], and b) Vili had several other work and life duties. Fortunately, LLMs had already improved to a level where people got significant help from them.
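
Finding the least painful slots across such a spread is easier when you can see every UTC hour in everyone's local time at once. A minimal sketch using Python's standard zoneinfo module, with a hypothetical mix of participant time zones:

from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

# Hypothetical mix of participant time zones spanning ~16 hours.
ZONES = ["Asia/Tokyo", "Europe/Helsinki", "America/Los_Angeles"]

# Print the local time in every zone for each UTC hour of one day,
# so you can eyeball which slots are least painful for everyone.
base = datetime(2025, 6, 16, tzinfo=ZoneInfo("UTC"))
for h in range(24):
    t = base + timedelta(hours=h)
    row = "  ".join(
        f"{z.split('/')[-1]}={t.astimezone(ZoneInfo(z)):%H:%M}" for z in ZONES
    )
    print(f"{t:%H:%M} UTC  {row}")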

Vili also created a document with study tips. While the ARENA materials are excellent and have clear study objectives, they are also very heavy, and participants mentioned sometimes getting lost or just being confused about specific parts. The document gave a very high-level description of why the materials were studied and mentioned common pitfalls to avoid.

Similar ARENA-based programs have greatly benefited from pair programming, and people recommended that we try it, or at least have people working in small groups. We encouraged people to come to Gather to work together and provided matchmaking for pair programming, but collaboration was really hard and happened only a few times. Arguably, the main reasons were:

1. Large time zone differences,

2. People covered the materials at different speeds,

3. We didn't have structured social activities immediately after kickoff, so people didn't get to know each other properly.

Fortunately, the people admitted to the program were very independent and worked through the materials on their own. Especially because there was little interaction, we should have pushed people to share their thoughts about the materials in Slack; this started happening organically during the second week.

Because the ARENA materials are heavy and Vili wanted participants to engage with higher-level AI safety concepts, we reserved a day for recap and group discussions after each five-day block of study (one ARENA chapter).[5] Participants could use that day to go over materials they found especially difficult or to spend time on unfinished parts. We also assigned two shorter readings, parts of papers and blog posts, to highlight more strategic issues of safety research relevant to the chapter. The cohort was then split into three groups, which held 1.5-hour discussions to share thoughts about the ARENA materials and the readings. The readings received great feedback, while feedback on the group discussions ranged from great to lukewarm. The session times were set according to availability, and Vili provided detailed instructions on how to conduct the discussions, but the groups themselves were responsible for running them, as we didn't have time to facilitate.

For remote programs like this, it is very important to have regular contact with all participants. Especially when people span a wide range of time zones, you might only rarely see certain participants online. Everyone has their own life going on, and checking that everyone is OK and staying on track is vital. Vili found several participants very stressed at points during the program; fortunately, they noticeably relaxed after a short chat in which Vili made sure everything was fine. On top of encouragement, it was sometimes necessary to reprioritize and drop a little of the agreed workload. Life gets in the way, or things just aren't working out, and making adjustments is obviously much better than people quitting completely because it's too much.

 

    Lessons learned

  • Regularly check in with all participants about their progress and well-being, and make personalised adjustments if needed.
  • Promote a social learning environment where people share their experiences (if not in Gather Town, at least on Slack or similar). We are still really unsure how to make this work well in a remote setting.
  • Create opportunities for people to discuss meta-level issues. In-person programs are unrivalled in this; for remote programs, you need to spend extra effort.
  • If you do group discussions, consider facilitating them. Otherwise, have very clear instructions and potentially appoint a facilitator from the group.
  • If you use ARENA materials, check our study tips document.

 

5. Speakers and extracurriculars

In addition to the learning materials and group discussions, we wanted to provide some quality-of-life extras, namely exposure to people working in the field, a writing workshop, and relaxed social activities.

We managed to find great speakers for each week and received very good feedback on this. Speaker dates and their topics were tied to the learning materials under study at the time. We also had one talk with a policy focus and another with a career focus, both of which were greatly appreciated.

We fumbled a little with the speakers. We had agreed on the sessions and sent calendar invites, but forgot to remind two speakers that their talk was scheduled for the next day, and both of them forgot. We managed to reach one of them, and they joined a little late, but the other talk had to be rescheduled. There were also problems with recordings. These are very basic things we should have handled better.

The writing workshop was a 2-hour session that extended ARENA's threat modelling exercise. Instead of moving from a model property to a threat model in 40-60 minutes, the workshop allowed two to three times as much time and asked participants to write a detailed threat model and then work out the model property to evaluate. Using LLMs as a writing aid was encouraged. The structure was to:

1. Write a story about how things go horribly wrong because of AI,

2. Give feedback on other people's stories,

3. Improve one's story based on feedback,

4. Identify actions in the story that could have helped things go better,

5. Define 1-3 related model properties to evaluate, i.e. information that could have increased the probability of things going well.

The original plan also included another round of feedback, followed by specifying thresholds for the properties that would warrant action. In practice, the five steps alone took two hours for most people. The writing workshop received good feedback, although some people noted there could have been more interaction. We used a Google Doc in which everyone had their own tab, so the feedback round consisted of commenting on other people's writing there; it could just as easily have been discussions in breakout rooms.

Finally, we didn't have regular socials from the beginning. This was a mistake. We had introductions at the kickoff and a public board in Gather Town with pictures and fun facts about all participants, but people would have greatly benefited from more structured social activities from the get-go. We thought of fun ideas for socials early on but dropped them, because "fun" after 8 hours of studying on a screen at home probably didn't mean more time on a screen at home. Then we were too busy to think about socials and forgot about them. Just setting a regular time for people to take a break and talk is probably a good way to start. Roughly halfway through the program, we introduced regular coffee chats due to popular demand, but we didn't hype them or follow through properly, so they died down fast. Had we introduced some socials immediately, the group discussions, spontaneous collaborations, and general cohort feeling would most likely have improved.

 

    Lessons learned

  • Set up socials and emphasize them from the very beginning. People want to get to know each other, and socials help the cohort feel more like a community.
  • Find speakers early on and check in with them close to the actual event.
  • Even when doing a writing workshop, it's good to get people talking.

 

6. In-person project week

After five weeks of remote study, the long-awaited in-person project week came. The final day of the remote phase was dedicated to finding and pitching project ideas and forming groups. Of course, most people had started thinking of a project much earlier than that, and Vili wrote extensive guidance for this in the study tips document. Vili emphasized that it was most likely best to group up for the project, but we still had only three two-person groups. Some people mentioned afterwards that it probably would have been nicer to work with others, so maybe we should have stressed that even more.

The accommodation location and type were communicated early, right after we made the booking in the first week of the remote phase. As the phase neared its end, we ran an additional survey on project-week practicalities, such as dietary restrictions and preferences and the need for EU plug and HDMI adapters. We ensured that participants had purchased flights and confirmed their exact arrival dates.

The project week consisted of working on the project Monday to Friday, with Friday afternoon reserved for presentations and the evening for a party. Hence, Dmitrii booked accommodation in Finland for a week from the Saturday before the project week until the Saturday after the program ended. We provided instructions on how to get from the airport to the shared accommodation and how public transport worked in Helsinki. There were socials on Sunday and some weekday evenings, but these were all voluntary, as we also wanted to give people spare time.

For work, we managed to get a modern IT classroom at Aalto University. A Finnish participant who happened to live close to the shared accommodation acted as a local guide, helping participants navigate to the right place, which was extremely helpful. Both Vili and Dmitrii were available 9-5 every day of the project week, Vili for the projects and Dmitrii for everything else.[6] One participant mentioned it would be nice to discuss meta-level topics more, so one day we took an hour for group discussions with high-level prompts about alignment research, which were highly appreciated. Vili also had 1-on-1s with everyone at the end of the week to help set them up post-program and connect them with helpful people.

In the end, we had 10 project presentations. For each, we reserved 10 minutes for the presentation itself and 5 minutes for Q&A plus the transition to the next one, with a 15-minute break halfway through (10 × 15 minutes plus the break comes to 2 hours 45 minutes). We kept the schedule on track by using the Q&As as a buffer and reserving roughly two minutes per transition; when there were fewer questions, we moved to the next presentation immediately to save time.

After everything had concluded, we headed to a rooftop sauna to celebrate!

The happy cohort at the after-party.

    

    Lessons learned

  • Have time for people to think of projects and make sure people understand what they are trying to achieve. See the “Capstone project” tab of the study tips document.
  • Facilitate and encourage people to group up for projects. This was harder in a remote setting.
  • Ask for all the relevant participant information, such as dietary preferences. Trying your best to accommodate people's needs makes a huge difference.
  • Remember to celebrate all the hard work everyone has put into the program!

 

Conclusions

In the end, we achieved both of our quantitative goals:

  1. Overall program satisfaction was 9/10,
  2. Attrition rate was 8%.

The program concluded with great projects, and roughly half of our cohort of 12 are now actively working in AI safety: some in full-time research positions, others part-time, and a few exploring product and management angles. Further, all self-assessment metrics increased, so we consider the objective of "upskilling participants to produce or support AI safety work" achieved as well. Progress on the remaining goal, increasing the amount of effort people put into AI safety, is yet to be measured with a follow-up survey. A more detailed write-up of the results can be found here.

Running a bootcamp is hard, but it becomes much easier when you can leverage other people's work. If you think you could do it, you probably can. Ask for help, set clear objectives and main tasks, communicate clearly, constantly ask for feedback, put yourself into it, and things will fall into place. We managed to run a 6-week ARENA-based bootcamp successfully with two (officially) part-time employees, a limited budget, and lots of stress. Despite the challenges, we had a good time!

And remember, it's just a short program. Be human and empathetic. Encourage each other, cherish small wins and have fun!

 

  1. ^

    This was for 12 participants, although the accommodation was reserved for 13 people.

  2. ^

    Some parts of ARENA are fundamental, others are less important. Vili mostly agrees with Leon Lang on his assessment of the materials.

  3. ^

    Had we been very clear about all of this, the program probably would not have happened, as Vili would have bailed out upon understanding the actual workload.

  4. ^

    The largest difference between participants' time zones was 16 hours. This made scheduling quite challenging, especially since we wanted everyone to attend the talks. We asked participants in the most extreme time zones whether they preferred unusually early or late sessions in their local time, then built our schedule around their preferences.

  5. ^

    See the schedule document for more details.

  6. ^

    The local treats provided by Dmitrii, such as salty liquorice, received praise and, in certain cases, disgust.