Listen to this newsletter on The Alignment Newsletter Podcast.
Alignment Newsletter is a publication with recent content relevant to AI alignment. Find all Alignment Newsletter resources here. In particular, you can look through this spreadsheet of all summaries that have ever been in the newsletter.
Please note that this newsletter represents my personal views and not those of DeepMind.
Sorry for the long hiatus! I was really busy over the past few months and just didn't find time to write this newsletter. (Realistically, I was also a bit tired of writing it and so lacked motivation.) I'm intending to go back to writing it now, though I don't think I can realistically commit to publishing weekly; we'll see how often I end up publishing. For now, have a list of all the things I should have advertised to you whose deadlines haven't already passed.
Survey on AI alignment resources (Anonymous) (summarized by Rohin): This survey is being run by an outside collaborator in partnership with the Centre for Effective Altruism (CEA). They ask that you fill it out to help field builders find out which resources you have found most useful for learning about and/or keeping track of the AI alignment field. Results will help inform which resources to promote in the future, and what type of resources we should make more of.
Announcing the Inverse Scaling Prize ($250k Prize Pool) (Ethan Perez et al.) (summarized by Rohin): This prize, with a $250k pool, asks participants to find new examples of tasks where pretrained language models exhibit inverse scaling: that is, models get worse at the task as they are scaled up. Notably, you do not need to know how to program to participate: a submission consists solely of a dataset giving at least 300 examples of the task.
Inverse scaling is particularly relevant to AI alignment, for two main reasons. First, it directly helps us understand how the language modeling objective ("predict the next word") is outer misaligned, since we are finding tasks where models that do better according to the language modeling objective do worse on the task of interest. Second, the experience from examining inverse scaling tasks could lead to general observations about how best to detect misalignment.
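To make "models get worse at the task as they are scaled up" concrete, here is a minimal sketch of one way you might check for the trend. It is not the prize's official evaluation: the function names, the choice of a least-squares slope against log parameter count, and the numbers in the example are all assumptions made for illustration.

```python
import math


def scaling_slope(param_counts, scores):
    """Least-squares slope of task score against log10(parameter count).

    Hypothetical helper: a negative slope means the score tends to drop
    as models get larger.
    """
    xs = [math.log10(n) for n in param_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def shows_inverse_scaling(param_counts, scores):
    """Return True if the fitted trend is downward (an assumed criterion)."""
    return scaling_slope(param_counts, scores) < 0


# Made-up model sizes and accuracies, purely illustrative.
sizes = [1e8, 1e9, 1e10, 1e11]
accuracy = [0.62, 0.55, 0.47, 0.41]
print(shows_inverse_scaling(sizes, accuracy))
```

A single fitted slope is a crude summary: it says nothing about noise or about trends that reverse at larger scales, so treat it as a first look and not as a verdict.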
$500 bounty for alignment contest ideas (Akash) (summarized by Rohin): The authors are offering a $500 bounty for producing a frame of the alignment problem that is accessible to smart high schoolers/college students and people without ML backgrounds. (See the post for details; this summary doesn't capture everything well.)
Job ad: Bowman Group Open Research Positions (Sam Bowman) (summarized by Rohin): Sam Bowman is looking for people to join a research center at NYU that'll focus on empirical alignment work, primarily on large language models. There are a variety of roles to apply for (depending primarily on how much research experience you already have).
Job ad: Postdoc at the Algorithmic Alignment Group (summarized by Rohin): This position at Dylan Hadfield-Menell's lab will lead the design and implementation of a large-scale Cooperative AI contest to take place next year, alongside collaborators at DeepMind and the Cooperative AI Foundation.
Job ad: OpenAI Trust & Safety Operations Contractor (summarized by Rohin): In this remote contractor role, you would evaluate submissions to OpenAI's App Review process to ensure they comply with OpenAI's policies. Apply here by July 13, 5pm Pacific Time.
Job ad: Director of CSER (summarized by Rohin): Application deadline is July 31. Quoting the job ad: "The Director will be expected to provide visionary leadership for the Centre, to maintain and enhance its reputation for cutting-edge research, to develop and oversee fundraising and new project and programme design, to ensure the proper functioning of its operations and administration, and to lead its endeavours to secure longevity for the Centre within the University."
Job ads: Redwood Research (summarized by Rohin): Redwood Research works directly on AI alignment research, and hosts and operates Constellation, a shared office space for longtermist organizations including ARC, MIRI, and Open Philanthropy. They are hiring for a number of operations and technical roles.
Job ads: Roles at the Fund for Alignment Research (summarized by Rohin): The Fund for Alignment Research (FAR) is a new organization that helps AI safety researchers, primarily in academia, pursue high-impact research by hiring contractors. It is currently hiring for Operation Manager, Research Engineer, and Communication Specialist roles.
Job ads: Encultured AI (summarized by Rohin): Encultured AI is a new for-profit company with a public benefit mission: to develop technologies promoting the long-term survival and flourishing of humanity and other sentient life. They are hiring for a Machine Learning Engineer and an Immersive Interface Engineer role.
Job ads: Fathom Radiant (summarized by Rohin): Fathom Radiant is a public benefit corporation that aims to build a new type of computer which they hope to use to support AI alignment efforts. They have several open roles, including (but not limited to) "Scientists / Engineers", "Builders", and "Software Engineer, Lab".
I'm always happy to hear feedback; you can send it to me, Rohin Shah, by replying to this email.
An audio podcast version of the Alignment Newsletter, recorded by Robert Miles, is available.