How many people are working (directly) on reducing existential risk from AI?

Benjamin Hilton

Summary

I've updated my estimate of the number of FTE (full-time equivalent) working (directly) on reducing existential risks from AI from 300 FTE to 400 FTE.

Below I've pasted some slightly edited excerpts of the relevant sections of the 80,000 Hours profile on preventing an AI-related catastrophe.

New 80,000 Hours estimate of the number of people working on reducing AI risk

Neglectedness estimate

We estimate there are around 400 people around the world working directly on reducing the chances of an AI-related existential catastrophe (with a 90% confidence interval ranging between 200 and 1,000). Of these, about three quarters are working on technical AI safety research, with the rest split between strategy (and other governance) research and advocacy.^[1] We think there are around 800 people working in complementary roles, but we’re highly uncertain about this estimate.

Footnote on methodology

It’s difficult to estimate this number.

Ideally we want to estimate the number of FTE (“full-time equivalent”) working on the problem of reducing existential risks from AI.

But there are lots of ambiguities around what counts as working on the issue. So I tried to use the following guidelines in my estimates:

I didn’t include people who might think of themselves on a career path that is building towards a role preventing an AI-related catastrophe, but who are currently skilling up rather than working directly on the problem.
I included researchers, engineers, and other staff that seem to work directly on technical AI safety research or AI strategy and governance. But there’s an uncertain boundary between these people and others who I chose not to include. For example, I didn’t include machine learning engineers whose role is building AI systems that might be used for safety research but aren’t primarily designed for that purpose.
I only included time spent on work that seems related to reducing the potentially existential risks from AI, like those discussed in this article. Lots of wider AI safety and AI ethics work that focuses on reducing other risks from AI seems relevant to reducing existential risks, and this ‘indirect’ work makes the estimate difficult. I decided not to include indirect work on reducing the risks of an AI-related catastrophe (see our problem framework for more).
Relatedly, I didn’t include people working on other problems that might indirectly affect the chances of an AI-related catastrophe, such as epistemics and improving institutional decision-making, reducing the chances of great power conflict, or building effective altruism.

With those decisions made, I estimated this in three different ways.

First, for each organisation in the AI Watch database, I estimated the number of FTE working directly on reducing existential risks from AI. I did this by looking at the number of staff listed at each organisation, both in total and in 2022, as well as the number of researchers listed at each organisation. Overall I estimated that there were 76 to 536 FTE working on technical AI safety (90% confidence), with a mean of 196 FTE. I estimated that there were 51 to 359 FTE working on AI governance and strategy (90% confidence), with a mean of 151 FTE. There’s a lot of subjective judgement in these estimates because of the ambiguities above. The estimates could be too low if AI Watch is missing data on some organisations, or too high if the data counts people more than once or includes people who no longer work in the area.

Second, I adapted the methodology used by Gavin Leech’s estimate of the number of people working on reducing existential risks from AI. I split the organisations in Leech’s estimate into technical safety and governance/strategy. I adapted Leech’s figures for the proportion of computer science academic work relevant to the topic to fit my definitions above, and made a related estimate for work outside computer science but within academia that is relevant. Overall I estimated that there were 125 to 1,848 FTE working on technical AI safety (90% confidence), with a mean of 580 FTE. I estimated that there were 48 to 268 FTE working on AI governance and strategy (90% confidence), with a mean of 100 FTE.

Third, I looked at the estimates of similar numbers by Stephen McAleese. I made minor changes to McAleese’s categorisation of organisations, to ensure the numbers were consistent with the previous two estimates. Overall I estimated that there were 110 to 552 FTE working on technical AI safety (90% confidence), with a mean of 267 FTE. I estimated that there were 36 to 193 FTE working on AI governance and strategy (90% confidence), with a mean of 81 FTE.

I took a geometric mean of the three estimates to form a final estimate, and combined confidence intervals by assuming that distributions were approximately lognormal.
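
As a rough illustration of this combination step, here is a minimal Python sketch. It is not the calculation from the spreadsheet. It takes the means and 90% intervals quoted above, takes a geometric mean of the means, and shows one possible way of combining intervals under a lognormal assumption. The function names are hypothetical, and treating the three estimates as independent is an assumption made only for this sketch.

```python
import math
from statistics import NormalDist, geometric_mean

# Means (FTE) quoted in the text for the three approaches.
TECHNICAL_MEANS = [196, 580, 267]
GOVERNANCE_MEANS = [151, 100, 81]

# 90% intervals (FTE) quoted in the text for technical AI safety.
TECHNICAL_INTERVALS = [(76, 536), (125, 1848), (110, 552)]

# z-score of the 95th percentile: a 90% interval spans about +/- 1.645 sigma.
Z90 = NormalDist().inv_cdf(0.95)


def lognormal_from_interval(low, high):
    """Fit (mu, sigma) of a lognormal whose 5th/95th percentiles are low/high."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z90)
    return mu, sigma


def combine_by_geometric_mean(params):
    """(mu, sigma) of the geometric mean of several lognormals.

    ASSUMPTION: the inputs are treated as independent, which the text
    above does not state.
    """
    n = len(params)
    mu = sum(m for m, _ in params) / n
    sigma = math.sqrt(sum(s ** 2 for _, s in params)) / n
    return mu, sigma


def interval_90(mu, sigma):
    """5th and 95th percentiles of a lognormal with parameters (mu, sigma)."""
    return math.exp(mu - Z90 * sigma), math.exp(mu + Z90 * sigma)


technical = geometric_mean(TECHNICAL_MEANS)
governance = geometric_mean(GOVERNANCE_MEANS)
total = technical + governance

fitted = [lognormal_from_interval(lo, hi) for lo, hi in TECHNICAL_INTERVALS]
combined_low, combined_high = interval_90(*combine_by_geometric_mean(fitted))

print(f"technical ~{technical:.0f}, governance ~{governance:.0f}, total ~{total:.0f}")
print(f"combined technical 90% interval: {combined_low:.0f} to {combined_high:.0f}")
```

The geometric means of the quoted means come to roughly 312 FTE (technical) and 107 FTE (governance and strategy), which is in line with the headline figure of around 400 with about three quarters in technical work. The combined interval from this sketch should not be expected to match the published one, since the independence assumption narrows it.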

Finally, I estimated the number of FTE in complementary roles using the AI Watch database. For relevant organisations, I identified those where there was enough data listed about the number of researchers at those organisations. I calculated the ratio between the number of researchers in 2022 and the number of staff in 2022, as recorded in the database. I calculated the mean of those ratios, and a confidence interval using the standard deviation. I used this ratio to calculate the overall number of support staff by assuming that estimates of the number of staff are lognormally distributed and that the estimate of this ratio is normally distributed. Overall I estimated that there were 2 to 2,357 FTE in complementary roles (90% confidence), with a mean of 770 FTE.
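
One way to picture this kind of calculation is a small Monte Carlo simulation. The sketch below is an interpretation, not the spreadsheet's method: it assumes the ratio means researchers divided by total staff, draws the direct FTE count from a lognormal fitted to the 200 to 1,000 interval quoted earlier, draws the ratio from a normal distribution, and discards ratios outside (0, 1]. The ratio mean and standard deviation used here (0.5 and 0.2) are hypothetical placeholders, not figures from the AI Watch database, and the function name is made up for illustration.

```python
import math
import random
from statistics import NormalDist

# z-score of the 95th percentile, used to turn a 90% interval into sigma.
Z90 = NormalDist().inv_cdf(0.95)


def simulate_complementary_fte(direct_low, direct_high, ratio_mean, ratio_sd,
                               n_samples=100_000, seed=0):
    """Return (5th percentile, median, 95th percentile) of complementary FTE.

    direct_low/direct_high: 90% interval for direct FTE (lognormal assumed).
    ratio_mean/ratio_sd: researchers / total staff (normal assumed).
    """
    rng = random.Random(seed)
    mu = (math.log(direct_low) + math.log(direct_high)) / 2
    sigma = (math.log(direct_high) - math.log(direct_low)) / (2 * Z90)

    samples = []
    while len(samples) < n_samples:
        ratio = rng.gauss(ratio_mean, ratio_sd)
        if not 0 < ratio <= 1:
            continue  # ASSUMPTION: discard impossible ratios
        direct = rng.lognormvariate(mu, sigma)
        total_staff = direct / ratio
        samples.append(total_staff - direct)

    samples.sort()

    def percentile(q):
        return samples[int(q * (n_samples - 1))]

    return percentile(0.05), percentile(0.5), percentile(0.95)


# HYPOTHETICAL ratio inputs, chosen only to make the sketch runnable.
low, median, high = simulate_complementary_fte(200, 1000, ratio_mean=0.5, ratio_sd=0.2)
print(f"complementary FTE: {low:.0f} to {high:.0f} (median {median:.0f})")
```

Dividing by a ratio that can get close to zero produces a long upper tail, which is one plausible reason an interval of this kind ends up very wide.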

There are likely many errors in this methodology, but I expect these errors are small compared to the uncertainty in the underlying data I’m using. Ultimately, I’m still highly uncertain about the overall FTE working on preventing an AI-related catastrophe, but I’m confident enough that the number is relatively small to say that the problem as a whole is highly neglected.

I’m very uncertain about this estimate. It involved a number of highly subjective judgement calls. You can see the (very rough) spreadsheet I worked off here. If you have any feedback, I’d really appreciate it if you could tell me what you think using this form.

Some extra thoughts from me

This number is extremely difficult to estimate.

Like any Fermi estimate, I'd expect there to be a number of mistakes in this estimate. I think there will be two main types:

Bad judgment calls when estimating the number of people working at each organisation, e.g. based on "what counts as an FTE working directly on this issue", "how wrong is the AI Watch database on this organisation", etc.
Errors in calculation / estimating uncertainty, etc.

Again, like in any Fermi estimate, I'd hope that these errors will roughly cancel out overall.

I didn't spend much time on this (maybe about 2 days of work). This is because I'd guess that more work won't improve the estimate by decision-relevant amounts. Some reasons for this:

A rougher version of this estimate that I'd used previously came to an answer of 300 FTE. That estimate took around 3-4 hours of work. While 300 FTE to 400 FTE is a large proportional change, it still represents a highly neglected field and doesn't seem substantially decision-relevant.
Errors in collecting data on this seem large in a way that couldn't be easily mitigated by doing more work.
There would still be substantial subjective judgement in an estimate that took more time. My uncertainty in this estimate includes uncertainty in whether these are the right judgement calls (on the criteria of "is it truthful, across a distribution of plausible definitions, to say that this is the number of FTE working directly on reducing existential risk from AI"), and it seems very difficult to reduce that uncertainty.

^[1] Note that before 19 December 2022, this page gave a lower estimate of 300 FTE working on reducing existential risks from AI, of which around two thirds were working on technical AI safety research, with the rest split between strategy (and other governance) research and advocacy.
This change represents a (hopefully!) improved estimate, rather than a notable change in the number of researchers.

David Scott Krueger

In the current climate, I think playing up the neglectedness and "directly working on x-risks" is somewhat likely to be counterproductive, especially if not done carefully. Some reasons:

1) It fosters an "us-vs-them" mindset.
2) It fails to acknowledge that these researchers don't know what the most effective ways are to reduce x-risk, and there is not much consensus (and that which does exist is likely partially due to insular community epistemics).
3) It discounts the many researchers doing work that is technically indistinguishable from the work by researchers "directly working on x-risks".
4) Concern about x-risk (or more generally, the impact of powerful AI) from AI researchers is increasing organically, and we want to welcome this concern, rather than (accidentally/implicitly/etc.) telling people they don't count.

I think we should be working to develop clearer ideas about which kinds of work are differentially useful for x-safety, seeking to build a broader (outside this community) consensus about that, and trying to incentivize more explicit focus on x-safety.