Disclaimer: the models presented are extremely naive and simple, and assume that existential risk from AI is higher than 20%. Play around with the models using this (mostly GPT-4 generated) Jupyter notebook.
1 microdoom = 1/1,000,000 probability of existential risk
The model has the following assumptions:
The goal is to estimate the expected decrease in existential risk that would result from adding one more person to the current AI safety workforce. By inputting the current size of the workforce, the ideal size, and the potential absolute risk reduction, the model gives the expected decrease.
If we run this with:
we get that one additional career averts 49 microdooms. Because of diminishing returns, the impact from an additional career is very sensitive to how big the workforce currently is.
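To make the sensitivity to workforce size concrete, here is a minimal sketch of a diminishing-returns curve. The logarithmic form, the function name `marginal_microdooms_log`, and the inputs (a 10,000-person ideal workforce and a 10 percentage point total risk reduction) are assumptions chosen for illustration. They are not taken from the notebook, so this sketch is not expected to reproduce the 49 microdoom figure.

```python
import math

def marginal_microdooms_log(current, ideal, total_reduction):
    """Microdooms averted by one more person under an ASSUMED log-shaped curve.

    Assumption (not from the post): cumulative risk reduction grows as
    total_reduction * ln(n) / ln(ideal), reaching total_reduction at n = ideal.
    """
    def cumulative_reduction(n):
        return total_reduction * math.log(n) / math.log(ideal)

    # Marginal effect = reduction with one more person minus reduction now,
    # converted from probability to microdooms (1e-6 probability each).
    return (cumulative_reduction(current + 1) - cumulative_reduction(current)) * 1e6

# Illustrative workforce sizes; only 350 appears in the post.
for size in (100, 350, 1_000, 3_500):
    estimate = marginal_microdooms_log(size, ideal=10_000, total_reduction=0.10)
    print(f"workforce {size:>5}: {estimate:.1f} microdooms per additional career")
```

Under this assumed curve the marginal impact falls roughly in proportion to 1/n, so a tenfold larger workforce makes an additional career about ten times less impactful.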
We assume that the impact of professionals in the field follows a Pareto distribution, where 10% of the people account for 90% of the impact.
We get that, if you’re a typical current AIS professional (between the 10th and 90th percentile), you avert somewhere between 10 and 270 microdooms. Because of how skewed the distribution is, the mean is at 286 microdooms, which is higher than the 90th percentile.
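As a rough check on why the mean can sit above the 90th percentile, the sketch below fits a classical (type I) Pareto distribution to the "10% of people account for 90% of the impact" rule and compares its percentiles with its mean. The helper names and the unit scale are assumptions for illustration. The notebook's own parametrization may differ, so this does not reproduce the 10, 270, and 286 microdoom figures, only the qualitative pattern.

```python
import math

def pareto_alpha(top_fraction, top_share):
    """Shape parameter such that the top `top_fraction` holds `top_share` of the total.

    For a type I Pareto with alpha > 1, the top fraction p holds a share
    p ** (1 - 1/alpha); solving for alpha gives the expression below.
    """
    return 1.0 / (1.0 - math.log(top_share) / math.log(top_fraction))

def pareto_quantile(q, alpha, scale=1.0):
    """Value below which a fraction q of the population falls."""
    return scale * (1.0 - q) ** (-1.0 / alpha)

def pareto_mean(alpha, scale=1.0):
    """Mean of a type I Pareto distribution (finite only for alpha > 1)."""
    return alpha * scale / (alpha - 1.0)

alpha = pareto_alpha(top_fraction=0.10, top_share=0.90)
p10 = pareto_quantile(0.10, alpha)
p90 = pareto_quantile(0.90, alpha)
mean = pareto_mean(alpha)

# Values are in arbitrary units (scale = 1), not microdooms.
print(f"alpha = {alpha:.3f}")
print(f"10th percentile = {p10:.2f}, 90th percentile = {p90:.2f}, mean = {mean:.2f}")
```

With a shape parameter this close to 1, the tail is heavy enough that a small number of people pull the mean above what 90% of the field achieves.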
If we instead just assume that going from 350 current people to 10,000 people would decrease x-risk linearly by 10 percentage points, we get that one additional career averts about 10 microdooms.
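The linear estimate is simple enough to check directly. This is a minimal sketch, with a function name of my own choosing, that reads the 10% as an absolute risk reduction spread evenly over the added people.

```python
def marginal_microdooms_linear(current, target, total_reduction):
    """Microdooms averted per additional person if risk falls linearly.

    total_reduction is an absolute probability (0.10 = 10 percentage points),
    shared equally among the (target - current) people added.
    """
    return total_reduction / (target - current) * 1e6

linear_estimate = marginal_microdooms_linear(current=350, target=10_000, total_reduction=0.10)
print(f"{linear_estimate:.1f} microdooms per additional career")  # about 10.4
```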
Every model points to the conclusion that one additional AI safety professional decreases existential risk from AI by at least one microdoom.
Because there are 8 billion people alive today, averting one microdoom roughly corresponds to saving 8 thousand current human lives (especially under short timelines, where the meaning of “current” doesn’t change much). If one is willing to pay $5k to save one current human life (roughly how much it costs GiveWell top charities to save one), this amounts to $40M.
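The conversion above is plain arithmetic, sketched here with the figures from the text. The variable names are my own, and the $5k cost per life is the rough figure quoted above rather than a precise estimate.

```python
# Figures taken from the text above.
world_population = 8_000_000_000   # people alive today
microdooms_per_unit_risk = 1_000_000  # 1 microdoom = 1/1,000,000 probability
cost_per_life_usd = 5_000          # rough cost to save one life, as quoted above

# Expected current lives saved by averting one microdoom.
lives_per_microdoom = world_population // microdooms_per_unit_risk

# Dollar value of one microdoom at that willingness to pay.
value_per_microdoom_usd = lives_per_microdoom * cost_per_life_usd

print(f"{lives_per_microdoom:,} expected lives per microdoom")
print(f"${value_per_microdoom_usd:,} per microdoom")
```

Multiplying by the per-career estimates from the models gives the implied value of one additional career under the same assumptions.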
One microdoom is also one millionth of the entire future. If we expect our descendants to spread only through the Milky Way and no other galaxies, then, taking the galaxy to contain on the order of 300 billion star systems, this amounts to roughly 300,000 star systems.
AI safety as a field is probably marginally (with regard to number of people or amount of funding) much more effective at saving current human lives than the global health charities GiveWell recommends. I think GiveWell shouldn’t be modeled as wanting to recommend organizations that save as many current lives as possible. I think a more accurate way to model them is “GiveWell recommends organizations that are [within the Overton Window]/[have very sound data to back impact estimates] that save as many current lives as possible.” If GiveWell wanted to recommend organizations that save as many human lives as possible, their portfolio would probably be entirely made up of AI safety orgs.
Because of organizational inertia, and my expectation that GiveWell will stay a global health charity recommendation service, I think it’s very worth thinking about creating a donation recommendation organization that evaluates (or at the very least compiles and recommends) AI safety organizations instead. Something like "The AI Safety Fund", without any other baggage, just plain AI safety. There might be huge increases in interest and funding in the AI safety space, and currently it’s not very obvious where a concerned individual with extra money should donate it.
Some other possibilities that may be worth considering and could further reduce the estimated impact, at least for an individual looking to work on AI safety themselves:
Also, the estimate of the current number of researchers probably underestimates the number of people (or person-hours) who will work on AI safety. You should probably expect further growth in the number of people working on AI safety, because the topic is getting mainstream coverage and support, Hinton and Bengio have become advocates, and it's being pushed more in EA (funding, community building, career advice).
However, the FTX collapse is reason to believe there will be less funding going forward.