Thank you for giving more context to EleutherAI's stance on acceleration and linking to your newest paper.
I support the claim that your open model contributes to AI safety research, and I generally agree that it improves the alignment landscape. I can also understand why you are not detailing possible failure modes of releasing LLMs, as doing so would basically amount to stating a bunch of infohazards.
But at least for me, this opens up the question of how far previously closed models should be opened up for the sake of alignment research. If an aligned researcher can benefit from access, so can a non-aligned researcher, hence the "accidental acceleration."
If we want more talented people in AI safety, we should focus on creating new programs. I think most readers will find this easy to defend, but I want to quickly write out my thoughts.
Even ambitious scaling of MATS, Astra, and similar programs won't produce enough AI safety researchers to solve the field's core problems.
Even if these programs scaled aggressively, I think there are still many potential contributors who would not apply to them. From talking with ambitious, driven people who could work in AI safety, the reasons can be logistical (not wanting to relocate, not wanting part- or full-time work, financial security, etc.), but they can also be subject-specific, like not feeling part of a program's target demographic. Especially because of the latter, we should think about new programs that target possible demographics more narrowly. Some interesting ideas:
- Geographically diverse programs (or initiatives that let people doing SPAR in non-hub regions co-work together)
- Open-source or hacking communities (Apart and EleutherAI might already do this, but it could be a lot stronger given how capable that community is)
- Startup/founder communities (there have been a few programs like Catalyze Impact, def/acc, or Seldon lab, but these could be a lot stronger given how important that community could be for AI safety)
- "Finance bros" and consultants (Consultants for Impact might do this, but it could perhaps be stronger given that several top contributors in AI governance have a consulting background)
- Debaters, quant traders, historians, military history enthusiasts, the OSINT community, MUN or model-NATO enthusiasts, and writing communities
These are just off the top of my head; I could easily be missing far more important communities that are not covered by existing efforts. I also wouldn't recommend running a program in any of these areas unless you have a strong belief that it will lead to participants contributing positively to AI safety.
One strong objection is that it is not clear how many potential contributors we are losing because of this. Especially for programs that target candidates already familiar with AI GCR: would they really not apply to a fellowship just because they don't feel part of its professional demographic? Another consideration is that reputation might trump domain specificity. Programs that scale and have existed longer can build prestige and a reputation, which unlocks a demographic of applicants who care about participating in prestigious or reputable programs. This argument also cuts the other way: new programs attract a specific demographic of people who have high trust in the community and are excited to take risks on new projects. From my experience participating in ARENA's first cohort and organizing Pivotal's first fellowship, this might loosely correlate with "success in AI safety." It would be interesting to look across fellowships and analyze the data; if someone is interested in doing this, please reach out.
I still believe it is worth testing whether new programs in any of these less explored areas could unlock a stream of talent into AI safety that we otherwise wouldn't have. If you are excited about doing this, please reach out! Finally, scaling existing programs is also clearly part of the AI safety field-growth effort, and this post should not be viewed as a case against doing so.