Cross-posted to The EA Forum and jordanarel.com

A huge thank you to Aman Patel, Justis Mills, Joel McGuire, Dony Christie, Sofya Lebedeva, and Michael Chen for invaluable feedback which greatly improved this post. All errors are 100% on me; all credit goes to these wonderful folks.

tl;dr

In this post I make a range of speculative,[1] back-of-the-envelope Fermi estimates of conversion rates from existential risk (x-risk) work and donations to lives saved from nonexistence.

I note that individual x-risk work likely has extreme variance ex post, and that most or even all x-risk work may have little or no impact. This may play an important role in whether these estimates are acceptable under certain frameworks.

With pessimistic assumptions, including that humanity never leaves earth and the impossibility of digital minds, I (very roughly) estimate x-risk work, in expectation (ex ante), on average, saves about one life per hour of work or per $100 (US Dollars) donated.

With moderate assumptions, including a significant possibility of interstellar travel and digital minds, I estimate x-risk work, in expectation, on average, saves about 10^36, or a trillion trillion trillion lives per minute of work or per dollar donated.

With optimistic assumptions, including a small possibility of achieving computational efficiency close to the limits of physics, I estimate x-risk work, in expectation, on average, saves about 10^72, or a trillion trillion trillion trillion trillion trillion lives per minute of work or per dollar donated.

In the introduction to each section I discuss how each of these wide-ranging estimates may be useful.

I estimate that individual x-risk work may, in expectation, have up to 100,000 times more or less impact than average, and may often have negative impact ex post.

In conclusion, those working on x-risk should take this not as a cue to over-work, but instead as a reminder to take very good care of themselves, their personal needs, and their well-being, so as not to burn out or become ineffective.

Introduction

In the EA Forum Podcast’s recent summary of “Measuring Good Better”, I learned that Founders Pledge values cash at $199 per WELLBY.

This got me wondering:

“How many lives can those working on preventing x-risk expect to save per time worked[2]/dollar donated[3]?”

Here is a brief estimate.

Epistemic Status

This is my first x-risk/longtermism research project. I appreciate help correcting my assumptions!

Why & Why Not Make An Estimate?

The upside of this estimate is that it is motivating to know the value of our work at a personal level.

It also provides a useful reference point when thinking and writing about the impact of x-risk work.

A downside is the potential to cause overwork, leading to burnout. I address this at the end of the post.

These estimates may also make some people feel compelled to work on x-risk when this is not a good fit for their skill-set or inclinations.

Assumptions

The three sections of this post give a range of estimates based on scenarios with pessimistic, moderate, and optimistic assumptions about the impact of x-risk work.

Expected value estimates sometimes locate the majority of value in speculative tail scenarios. I believe a range of estimates, given various scenarios, offers a clearer understanding of the range of possible effects of x-risk work.

To simplify, I use a few background assumptions throughout.

Background Assumptions

I use an expected value[4] framework to estimate lives saved. I do not expect these estimates to hold under other frameworks. In particular, frameworks that value certainty of impact and reject small, highly speculative chances of enormous impact[5] may find these estimates unacceptable.

In each scenario I estimate the lives saved by the average of all x-risk work. X-risk work with more or less leverage may be more or less effective. I discuss this in the “Inside View Adjustments for Individuals” section.

Whenever I say "lives saved" this is shorthand for “future lives saved from nonexistence.” This is not the same as saving existing lives, whose loss may cause profound emotional pain for people left behind, and which some may consider more tragic than future people never being born.[6]

I assume a zero-discount rate for the value of future lives, meaning I assume the value of a life is not dependent on when that life occurs.

I exclude consideration of how differences in future well-being affect expected value. This short-form gives my speculations on future well-being, which were originally included in this post.

I exclude the possibilities of faster than light travel and interventions which may create infinite value.

Factors Analyzed

I used the following factors to make estimates.

Factors:

  • Is mind-uploading possible?
  • How far are we from the limits of computation?
  • Is interstellar travel possible?
  • Do anthropic arguments imply we have no long-term impact?
  • How long will it take to achieve existential security?
  • On average, how many people will be working on x-risk?
  • How much will the sum of all x-risk work reduce existential risk?
  • How much does individual x-risk work vary in expected impact? By:
    • X-risk sub-cause area
    • Intervention quality
    • Individual skill
    • Timing of x-risk work

Pessimistic Estimate

This estimate is intended to be near the lower limit a reasonable person might estimate for the expected value of x-risk work, given pessimistic assumptions, including that humanity never leaves earth and the impossibility of digital minds.

Because this estimate is so low relative to the other estimates, it is noticeably sensitive to small, arbitrary-seeming changes in assumptions, and so should be taken as, at best, a very crude approximation which could easily be several orders of magnitude higher or lower.

I believe this estimate is most useful for showing that even with minimal sci-fi speculation, it is plausible that x-risk work could be a cost-effective way of doing good for risk-neutral[7] EAs.

For an even greater range of estimates, here is a brief short-form appendix with two highly pessimistic estimates, and two estimates between the pessimistic and moderate estimates.

Pessimistic Assumptions

Pessimistically, we might assume that mind uploading or brain emulations are impossible. This would preclude the possibility of a very large number of what Holden Karnofsky calls “digital people.”

This may occur if artificial sentience turns out to be impossible, due to consciousness requiring biological hardware. This is not implausible, as we do not yet fully understand what consciousness is.

Next, we could pessimistically assume that humanity never leaves earth. Perhaps interstellar travel is impossible, and life elsewhere in the solar system proves too difficult or undesirable.

We might also estimate that, conservatively, the earth has a carrying capacity of 1 billion people, and that with humanity's best efforts the earth remains habitable for 1 billion years.

This would mean that for the pessimistic scenario, if the average lifespan is approximately 100 years, sustainably preventing x-risk would save 10^9 * 10^9 / 10^2, or 10^16 lives.
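
As a quick sanity check, the arithmetic above can be sketched in a few lines of Python. The variable names are illustrative and are not part of the original estimate.

```python
# Pessimistic scenario: humanity never leaves Earth and digital minds are impossible.
carrying_capacity = 10**9   # people alive at any one time
habitable_years = 10**9     # years the Earth remains habitable
lifespan_years = 10**2      # average years per life

# Total lives = person-years available / years per life.
potential_lives = carrying_capacity * habitable_years // lifespan_years

print(potential_lives == 10**16)  # True
```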

Finally, some anthropic arguments state it is incredibly surprising that we just-so-happen to find ourselves exactly where (or when) we are in the universe, at what potentially seems to be “the most important century” or “hinge of history.” Because of this, some argue, it seems highly likely we are in a simulation, the world is about to end, or we are in some other weird situation that means we may actually have no impact on the long-term future. Relatedly, the possibility of aliens may also eliminate much of the value of x-risk work, as this would mean we are not solely responsible for creating value in the universe.[8]

I think there may be some resolution to these arguments,[9] but if we pessimistically assume these arguments are so devastating that there is a 99% chance we have no impact on the long-term future, we lose a factor of 100, or 2 orders of magnitude of expected value due to anthropics.

Pessimistic Analysis 

It seems likely most existential risk will occur relatively soon. In “The Precipice” Toby Ord estimates 1/3 of all existential risk will occur this century.

This is because existential risk seems largely to be caused by new, potentially dangerous technologies, such as nuclear weapons, bio-technology, and advanced artificial intelligence. The development of such technologies is occurring at a rapid, likely unsustainable rate.

Holden Karnofsky points out that if our rate of economic growth slows to just 2%, and this rate of growth continues for another 8,200 years, every atom in the galaxy would have to be supporting multiple economies the size of our current global economy, which seems unlikely.
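
As a rough check on this claim, the compounding can be sketched in Python. The figure of 10^70 atoms in the galaxy is an assumed upper bound that does not appear in this post, and the variable names are illustrative.

```python
import math

growth_rate = 0.02          # 2% annual growth, from the text
years = 8_200               # time horizon, from the text
atoms_in_galaxy_oom = 70    # ASSUMPTION: rough upper bound of 10^70 atoms in the galaxy

# Orders of magnitude by which the economy would grow over the period.
growth_oom = years * math.log10(1 + growth_rate)

# Number of present-day-sized economies each atom would need to support.
economies_per_atom = 10 ** (growth_oom - atoms_in_galaxy_oom)

print(f"growth factor: about 10^{growth_oom:.1f}")
print(f"economies per atom: about {economies_per_atom:.1f}")
```

Under these assumptions the economy grows by a factor of more than 10^70, so each atom would need to support more than one economy the size of today's.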

He also points out that with a high likelihood of transformative AI occurring sometime soon, something like a Process for Automating Scientific and Technological Advancement (PASTA) could lead to the rapid development of all technologies that can, in principle, be developed.

It seems likely most advanced technologies will be developed soon and that, conservatively, we will likely either succumb to an existential catastrophe or find a way of sustainably managing existential risk within the next 10,000 years.

Toby Ord suggests in “The Precipice” that the first goal of longtermists should be “existential security,” a state of near-zero existential risk, sustainable indefinitely into the future.

We might pessimistically (pessimistic in terms of x-risk work efficiency[10]) estimate that if we have 100,000 people working on x-risk for the next 10,000 years, this will result in only a 1% absolute increase in our chances of achieving sustainable existential security. This may be reasonable if reducing existential risk is exceedingly difficult, perhaps because certain advanced technologies destroy civilization by default.

If we assume (roughly) an average lifespan of 100 years per person, and 100,000 hours per career, then there are about 1,000 hours of work per year, per person, averaged over the lifespan of people working on x-risk.

If we multiply 1,000 hours of work per year, per person * 100,000 people working on x-risk * 10,000 years, we get 1 trillion (10^12) total hours of work to achieve a 1% increased chance of existential security.
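
The total-hours figure can be reproduced with a short sketch. The names below are illustrative; the numbers are the ones given in the text.

```python
# Hours of x-risk work per person per year, averaged over a whole lifespan.
career_hours = 100_000
lifespan_years = 100
hours_per_person_year = career_hours // lifespan_years   # 1,000

# Total hours across the whole field over the whole period.
workers = 100_000
years = 10_000
total_hours = hours_per_person_year * workers * years

print(total_hours == 10**12)  # True
```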

As stated, if digital minds are impossible; we do not leave earth; and the earth is habitable for a billion years, by a billion people, at a hundred years per life; this leads to the possibility of 10^16 lives.

Incorporating our pessimistic assumptions about anthropic arguments, we get another 2 orders of magnitude reduction of expected value.

This means that, pessimistically, 10^12 hours of x-risk work will increase the chance of saving 10^16 lives by 1%, and accounting for anthropic arguments there's only a 1% chance this is correct, or

10^16  *  10^-2  *  10^-2  /  10^12  =  1

This gives a pessimistic estimate that over the next 10,000 years, in expectation, on average, x-risk work will save one life for every hour of work.

If we assume high quality x-risk work pays about $100 per hour, then in expectation, on average, x-risk work pessimistically saves 1 life per $100 donated.

In conclusion, I estimate that in expectation, on average, x-risk work pessimistically saves 1 life per hour of work, or per $100 donated.
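
The full pessimistic calculation can be sketched as follows, using exact fractions so that no rounding creeps in. The variable names are illustrative, and the $100 hourly cost is the assumption stated above.

```python
from fractions import Fraction

potential_lives = 10**16                 # lives possible on Earth alone
risk_reduction = Fraction(1, 100)        # 1% absolute increase in chance of existential security
anthropic_discount = Fraction(1, 100)    # 1% chance we can affect the long-term future at all
total_hours = 10**12                     # hours of x-risk work over 10,000 years
hourly_cost_usd = 100                    # assumed cost of one hour of x-risk work

lives_per_hour = potential_lives * risk_reduction * anthropic_discount / total_hours
lives_per_dollar = lives_per_hour / hourly_cost_usd

print(lives_per_hour)    # 1
print(lives_per_dollar)  # 1/100
```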

Moderate Estimate

This estimate is intended to be a moderate estimate of the expected value of x-risk work. It includes a significant possibility of interstellar travel and digital minds.

I think this estimate is useful to show x-risk work could plausibly be an extremely cost-effective way for risk-neutral EAs to do good.

For slightly more conservative scenarios which still show the possibility of extreme cost-effectiveness, see scenarios number 3 and 4 in the appendix.   

Moderate Assumptions 

Bostrom estimates 10^52 potential future lives at 100 years per life, assuming interstellar travel and digital minds are possible. This is a 36 orders of magnitude increase[11] from the 10^16 possible lives from the pessimistic estimate.

It seems somewhat likely consciousness is computational, and therefore digital minds are possible. It also seems likely interstellar travel will be achievable for advanced human civilization. But, conservatively, we could give each of these only a 1/3 chance of being possible, giving an approximately 1 order of magnitude reduction in expected value.

Bostrom’s estimate of 10^52 possible lives is based on “technologies for whose feasibility a strong case has already been made.” For now we could conservatively assume there are no further breakthroughs in physics or computer science which enable significantly more efficient computation.

Because anthropic analysis is difficult,[9] we might assume there is a 90% chance there is something strange about our situation that eliminates our ability to influence the long-term future. This is a 1 order of magnitude increase from the pessimistic estimate.

Altogether, our moderate assumptions increase the expected value of x-risk work by 36 orders of magnitude from the pessimistic estimate.
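
A minimal sketch of this tally, working in orders of magnitude (base-10 logarithms). The names are illustrative; the post rounds the two 1/3 probabilities to a single order of magnitude, while the sketch keeps the unrounded value.

```python
import math

bostrom_lives_oom = 52        # Bostrom's estimate of potential future lives
pessimistic_lives_oom = 16    # Earth-bound estimate from the pessimistic scenario
p_digital_minds = 1 / 3       # chance digital minds are possible
p_interstellar = 1 / 3        # chance interstellar travel is possible
p_impact_moderate = 0.10      # chance anthropics leaves us with long-term impact
p_impact_pessimistic = 0.01   # the same chance in the pessimistic scenario

lives_gain = bostrom_lives_oom - pessimistic_lives_oom                 # +36
feasibility_cost = math.log10(p_digital_minds * p_interstellar)        # roughly -1
anthropic_gain = math.log10(p_impact_moderate / p_impact_pessimistic)  # roughly +1

net_shift = lives_gain + feasibility_cost + anthropic_gain
print(round(net_shift))  # 36
```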

Moderate Analysis 

If we stick with the original estimate of 100,000 people working on x-risk, considering technological trends, it seems 1,000 years to achieve existential security, rather than 10,000 years, might be a more realistic time frame.

It also seems plausible that this work could increase our absolute likelihood of reaching existential security by 10%, rather than the pessimistic estimate of 1%.

If, compared with the pessimistic estimate, we need only 1/10 as much time, and have a ten-fold greater likelihood of success, we get a 2 orders of magnitude increase in the estimated efficiency of x-risk work.

When we combine:

  1. Our moderate assumptions about digital minds, interstellar travel, and anthropics (36 orders of magnitude)
  2. The likelihood that existential security is more achievable and will take less time than estimated in the pessimistic analysis (2 orders of magnitude)

We get a total of 38 orders of magnitude increase in the expected value of x-risk work and donations, for a moderate estimate that, in expectation, on average, x-risk work will save about 10^38 lives per hour, approximately 10^36 lives per minute, or 10^34 lives per second.

If we again assume high quality x-risk work pays about $100 per hour, then in expectation, on average, x-risk work saves 10^36 lives per dollar donated.

That means in expectation, on average, x-risk work saves about 10^36, or a trillion trillion trillion, lives per minute of work, or per dollar donated.
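
The unit conversions in the moderate estimate can be sketched like this. The names are illustrative, and the per-minute and per-second figures are only order-of-magnitude approximations.

```python
pessimistic_lives_per_hour = 1   # baseline from the pessimistic estimate
oom_gain = 36 + 2                # moderate assumptions plus moderate analysis
hourly_cost_usd = 100            # assumed cost of one hour of x-risk work

lives_per_hour = pessimistic_lives_per_hour * 10**oom_gain
lives_per_minute = lives_per_hour / 60      # on the order of 10^36
lives_per_second = lives_per_hour / 3600    # on the order of 10^34
lives_per_dollar = lives_per_hour // hourly_cost_usd

print(f"per minute: {lives_per_minute:.1e}")
print(f"per second: {lives_per_second:.1e}")
print(lives_per_dollar == 10**36)  # True
```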

Optimistic Estimate

This estimate is intended to be near the upper limit a reasonable person might estimate the expected value of x-risk work to be, given optimistic assumptions, including a small possibility of achieving computational efficiency close to the limits of physics.

I believe this estimate is useful for showing we may be vastly underestimating the value of x-risk work, given the possibility of technological breakthroughs. It seems presumptuous to assume we have just now reached the pinnacle of technological achievement.

Notably, for truly risk-neutral EAs, it is possible that, in expectation, the vast majority of value lies in the remote possibility of extremely good outcomes, and so a scenario like this may carry real weight under such frameworks.

Optimistic Assumptions 

Limits of Computation

Because I have no background in computer science, I may make more mistakes in this section than others. Please correct me where I am mistaken.

Bostrom’s estimate of 10^52 possible lives is based on brain emulations using “technologies for whose feasibility a strong case has already been made.” The paper where he estimated this was written in 2003. I am quite uncertain, but I assume this may exclude cutting-edge technologies such as quantum computing and advanced technologies such as reversible computing.

In a footnote, Bostrom cites the possibility of as many as 10^121 thermodynamically irreversible computations possible[12] "if all mass-energy in the accessible universe is saved until the cosmic microwave background temperature ceases to decline . . . and is then used for computation."

If we use Bostrom's figure of 10^17 operations per second for human brains, and (conservatively) less than 10^10 seconds per human life, then we get less than 10^27 operations per life.

Dividing 10^121 by 10^27, we get, in principle more than 10^94 virtual human life equivalents possible,[13] at the limits of quantum mechanics, entropy, cosmology, and computing technology as we currently understand them.

This is 42 orders of magnitude greater than Bostrom's estimate of 10^52 lives. Because it is highly speculative just how efficient computing might eventually be, we could estimate we only have at best a 1/1,000 chance of reaching 1/1,000th of this limit, resulting in a 6 orders of magnitude reduction in expected value.

If we assume the same “moderate estimate” adjustments in expected value due to anthropic arguments and the potential impossibility of digital minds & interstellar travel, we get a total of 36 orders of magnitude increase in the expected value of x-risk work due to possible breakthroughs in computation.
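
The chain of exponents in this section can be checked with a short sketch. All quantities are orders of magnitude, and the names are illustrative.

```python
# A 100-year life is comfortably under 10^10 seconds.
seconds_per_century = 100 * 365.25 * 24 * 3600
assert seconds_per_century < 10**10

total_ops_oom = 121              # Bostrom's footnote: computations possible in the universe
brain_ops_per_second_oom = 17    # Bostrom's figure for a human brain
seconds_per_life_oom = 10        # conservative upper bound checked above
bostrom_lives_oom = 52           # Bostrom's main estimate of potential lives

ops_per_life_oom = brain_ops_per_second_oom + seconds_per_life_oom   # 27
max_lives_oom = total_ops_oom - ops_per_life_oom                     # 94
gain_over_bostrom_oom = max_lives_oom - bostrom_lives_oom            # 42

# 1/1,000 chance of reaching 1/1,000th of the limit: 3 + 3 orders of magnitude.
speculation_discount_oom = 3 + 3
net_gain_oom = gain_over_bostrom_oom - speculation_discount_oom

print(max_lives_oom, gain_over_bostrom_oom, net_gain_oom)  # 94 42 36
```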

Optimistic Analysis 

For the optimistic analysis, we could use the same assumption that 100,000 people working on x-risk over the next 1,000 years can achieve a 10% absolute reduction in existential risk.

When we incorporate:

  1. Potential breakthroughs in computing leading to far greater numbers of digital minds, near the limits of physics (36 orders of magnitude)

We get another 36 orders of magnitude increase in the expected value of x-risk work and donations, for an optimistic estimate that, in expectation, on average, x-risk  work may optimistically save about 10^74 lives per hour, approximately 10^72 lives per minute, or 10^70 lives per second.

If we again assume high quality x-risk work pays about $100 per hour, then in expectation, on average, x-risk work saves 10^72 lives per dollar donated.

That means in expectation, on average, x-risk work optimistically saves 10^72, or a trillion trillion trillion trillion trillion trillion lives per minute of work, or per dollar donated.

Inside View Adjustments for Individuals Working On X-Risk

Finally, taking the inside view, any individual working on x-risk can compare the work they are doing or funding to other x-risk work to get a more precise estimate. This step is extremely important.

Most importantly, this is because a sizable minority of x-risk work, or possibly even the majority of x-risk work, may be net negative.[14]

Assuming individuals working on x-risk are extremely thoughtful about the possibility of net negative work, and so x-risk work is on average positive, we might expect the effectiveness of x-risk work and donations to fall on a fat-tailed distribution, with some interventions being many times more effective than average.

Due to cluelessness it is extremely difficult to predict which x-risk work will be effective. Nonetheless, individuals working on x-risk can make some inside view adjustments based on expert opinion and general factors predictive of good work on complex, difficult projects in general. This may include:

  • Expert estimates of x-risk sub-cause area effectiveness such as nuclear war, bio-risk, AI safety, etc.
  • Expert estimates of intervention effectiveness
  • Timing of opportunity; work done earlier and at critical moments may be more effective
  • Ratio of funding available for x-risk work to talented people working on x-risk
  • The level of expertise of the individual working on x-risk
  • Quantity and quality of past successes of the individual
  • The individual working on x-risk's: 
    • Intelligence & rationality
    • Grit/ambition/growth mindset
    • Social & emotional intelligence
  • Any other special information on the work - though this last consideration is likely roughly canceled out by each individual's cognitive biases in favor of their own work, so it should not be given too much weight, if any[15]

“Importance, tractability, and neglectedness” are useful for estimating the relative value of x-risk work. Toby Ord’s “soon, sudden, sharp”[16] framework from “The Precipice,” and William MacAskill’s “significance, persistence, contingency”[17] framework from “What We Owe The Future” are useful supplements to the “importance, tractability, neglectedness” framework when evaluating x-risk work.

Toby Ord estimates x-risk from AI is 1 in 10 this century, while risk from asteroids is 1 in 1,000,000 this century, a difference of 5 orders of magnitude. It seems at least as much is spent on asteroid detection and deflection as AI safety. Perhaps, however, asteroid deflection is at least 10 times more tractable than AI safety, reducing this difference by 1 order of magnitude. While there may be even greater differences between x-risk sub-cause areas, this gives a variance between x-risk sub-cause areas of up to at least 4 orders of magnitude.

Further, we might expect interventions within x-risk sub-cause areas to have massive variance, some having a small effect, and some having huge or even most of the effect.[5] While in global health some interventions are 100 times better or worse than average, x-risk sub-cause area interventions likely have far greater variance than this. These global health interventions, however, are evaluated ex post. Because it is extremely difficult to know the effectiveness of x-risk interventions until they have been implemented, we might expect closer to an ex ante variance of 10 times better or worse than average. This gives another 2 orders of magnitude variance between interventions.

Next, we might estimate that some individuals working on x-risk may be at least 10 times more or less effective than average. This gives another 2 orders of magnitude variance for individual effectiveness.

Finally, it seems likely work in the next few decades or at critical moments may be at least[18] 10 times higher leverage than average x-risk work, with later work and work in dry spots 10 times lower leverage. This gives 2 orders of magnitude variance for timing of work.

Altogether, if these effects were fully independent and multiplicative (which is unlikely), this gives a maximum estimate of 10 orders of magnitude variance of (positive-value) individual x-risk work. This means a particular individual, working on a particular intervention, within a particular x-risk sub-cause area, at a particular time, might adjust their inside view estimate to, at most, 100,000 times more or less impact than average (though of course, as noted,[14] a large amount of work may also have negative impact).
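
The tally of variance factors can be sketched as follows. It assumes, as the paragraph above does for its upper bound, that the factors simply multiply; the names are illustrative.

```python
# Risk this century, as orders of magnitude (Toby Ord's estimates).
ai_risk_oom = -1          # 1 in 10
asteroid_risk_oom = -6    # 1 in 1,000,000
tractability_offset = 1   # asteroid work assumed 10 times more tractable

# Full spread (best to worst) for each factor, in orders of magnitude.
spread_oom = {
    "sub-cause area": (ai_risk_oom - asteroid_risk_oom) - tractability_offset,  # 4
    "intervention": 2,   # 10 times better or worse than average
    "individual": 2,     # 10 times more or less effective than average
    "timing": 2,         # 10 times higher or lower leverage than average
}

total_spread_oom = sum(spread_oom.values())            # 10
max_multiplier = 10 ** (total_spread_oom // 2)         # half the spread on each side of average

print(total_spread_oom, max_multiplier)  # 10 100000
```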

Inspiration & Proximity 

Inspiration

These estimates are quite intense, to say the least, and I hope they illuminate why individuals like myself working on x-risk feel so passionate about our work.

Most importantly, for me, these lives are not abstract expected value calculations, but real people, people who laugh and play, who experience joy and tenderness, people who feel and who love; we are saving the lives of real people, people who in all actuality exist.

Just as people who are far away in space do, in fact, exist, people who are far away in time also, actually, exist. It is a sad chauvinism of our times that we so often treat these people with less consideration just because they are far away from us.

As individuals working on x-risk, we are each, in expectation, individually responsible for trillions of real people’s lives every minute. My hope is that we can find this inspiring, knowing that our work does more good than we can fathom.

The Proximity Principle

The take-away from this post is not that you should agonize over the trillions of trillions of trillions of men, women, and children you are thoughtlessly murdering each time you splurge on a Starbucks pumpkin spice latte or watch cat videos on YouTube — or in any way whatsoever commit the ethical sin of making non-optimal use of your time.

The point of this post is not to create an x-risk “dead children currency” analogue. Instead it is meant to be motivating background information, giving us all the more good reason to be thoughtful about our self-care and productivity.

I call the principle of strategically caring for yourself and those closest to you “The Proximity Principle,” something I discovered after several failed attempts to be perfectly purely altruistic. It roughly states that:

  1. It is easiest to affect those closest to you (in space, time, and relatedness) - including yourself
  2. Taking care of yourself and those closest to you is high leverage for multiplying your own effectiveness in the future[19]

This post estimates conversion rates for time and money into lives saved. To account for proximity, perhaps we also need conversion rates of time and money into increases in personal productivity, personal health & well-being, mental health, self-development, personal relationships, and EA community culture.

These factors of proximity may be hard to quantify, but probably less hard than we think, and seem like fruitful research directions for social-science oriented EAs. I think these factors are highly valuable relative to time and money, even if only valued instrumentally.

In general, for those who feel compelled to over-work to an unhealthy point, who have had a tendency to burn out in the past, or who think this may be a problem for them, I would suggest erring on the side of over-compensating in favor of self-care.

This means finding self-care activities that make you feel happy, energized, refreshed, and a sense of existential hope — and, furthermore, doing these activities regularly, more than the minimum you feel you need to in order to work optimally.

I like to think of this as keeping my tank nearly full, rather than perpetually halfway full or nearly empty. From a systems theory perspective, you are creating a continuous inflow and keeping your energy stocks high, rather than waiting until they are fully depleted and panic/exhaustion mode alerts you to refill.

For me, daily meditation, daily exercise, healthy diet, and good sleep habits are most essential. But each person is different, so find what works for you.

Remember, if you want to change the future, you need to be at your best. You are your most valuable asset. Invest in yourself.

  1. ^

    While some of the numbers in the estimate are borrowed from the work of others (who may also be speculating), many are best guesses, and may be off by several orders of magnitude.

  2. ^

    For more information on x-risk work, 80,000 Hours is a great resource.

  3. ^

    Somewhat surprisingly, it is relatively easy to donate to x-risk work through The Long-Term Future Fund. 

  4. ^

    I estimate only the expected value of the best-case outcome in each scenario, ignoring additional contributions from less-optimal outcomes, as they would only slightly increase the expected value of each estimate.

  5. ^

    Because an existential catastrophe is discrete (it either happens or doesn't happen) and has a limited number of contributing factors, a particular bit of x-risk work only has impact in the rare cases that it contributes to avoiding an existential catastrophe, although in these cases it may have enormous impact. This is especially true of targeted/narrow interventions.

    Furthermore, there is a significant chance all x-risk work will have zero or negative impact, as the far future is incredibly hard to predict, and x-risk work may fail to avert an existential catastrophe, despite our best efforts; alternatively, we may, by default, not be headed for an existential catastrophe.

  6. ^

    This post originally used the term "lives saved" without mentioning nonexistence, but JBlack on LessWrong pointed out that the term “lives saved” could be misleading in that it equates saving present lives with creating new future lives. While I take the total view and so feel these are relatively equivalent (if we exclude the flow-through effects, including the emotional pain caused to those left behind by the deceased), those who take other views such as the person-affecting view may feel very differently about this.

  7. ^

    By risk-neutral, I mean EAs who use expected value estimates to make decisions, even in cases where there is only a minuscule chance of having an astronomical impact, but a massive chance (risk) they will have no impact.

  8. ^

    Being in a simulation or aliens eliminating much of the value of x-risk work may only apply in the moderate and optimistic scenarios, in which digital minds and interstellar travel are assumed to be possible.

  9. ^

    Anthropics can be complex, with bizarre scenarios incorporating multiple levels of multiverses, advanced AI, extraterrestrial life, simulations, Boltzmann Brains, baby universes, solipsism, self-observing universes, etc. so I will not attempt a detailed analysis.

    But, for example, The Fermi Paradox might be resolved to some degree by Grabby Aliens, The Rare Earth Hypothesis, or my personal favorite, The Youngness Paradox.

    More importantly, The Doomsday Argument could be resolved[20] by postulating that our reference class for our place in the universe may soon end; yet this won’t lead to extinction or reduction in value.

    This would be the case, for example, if we populate the universe with emulations of maximally happy conscious entities that are unable to accurately observe the universe-at-large and be surprised by their place in it; therefore, we would not be the same reference class as the many future lives our x-risk work intends to save;

    i.e. if the future lives we save are not capable of observing and being unsurprised by their own, more typical place in the universe, it should not surprise us that we do not observe ourselves as them, being unsurprised.

  10. ^

    Note that when I say "pessimistic" or "conservative," I always mean this from the point of view of how much impact each unit of x-risk work has on reducing existential risk. The more people working on existential risk, and the longer they spend working on it, the less impactful each unit of x-risk work is per unit of existential security achieved. If we instead estimate fewer people working on x-risk for less time, and estimate the same reduction in risk, this would lead to much more optimistic estimates per unit of work.

  11. ^

    My understanding is that 21 orders of magnitude are due to interstellar travel, and conditional on interstellar travel being possible, the other 15 orders of magnitude are due to digital minds. 

  12. ^

    I am mostly relying on Bostrom here, though I also know that Seth Lloyd calculated that a 1 kilogram 1 liter “ultimate laptop” could compute a maximum of over 10^50 operations per second, based on the laws of entropy and quantum theory, or more than 10^33 times faster than the human brain.

  13. ^

    While this may seem outrageous, and may be even more improbable than 99.9% unlikely, it seems very hard to know this with certainty, and it is necessary to incorporate this possibility somehow, especially for the optimistic expected value calculation.

    Furthermore, the seeming unlikelihood may be balanced out somewhat by the possibility of breakthroughs in physics that allow many orders of magnitude more possible lives than this.

  14. ^

    There are several reasons for this:

    The community of individuals working on x-risk is currently small, and so mediocre work may distract from and crowd out more important work.

    Some work may spread dangerous ideas (information hazards), such as ideas apocalyptic terrorists could use to do harm, or ideas that cause unscrupulous states to realize potentially long-term dangerous technologies could give them a decisive short-term strategic advantage.

    Some work may accidentally push forward capabilities of harmful technologies more than it pushes forward safety. This can occur with dual-use research of concern such as gain-of-function research. Relatedly, some work may fail to pursue differential technological development when it should be pursued, for example AI capabilities research that speeds up the development of advanced artificial intelligence faster than we can ensure it is safe. 

    Finally, x-risk work may successfully lock in extremely sub-optimal values. For example, we could successfully permanently “align” and enslave sentient AI whose suffering consciousness outweighs humans’ happiness, and we either do not know or do not care that it is conscious; or we could successfully lock in human values that are many orders of magnitude less good than the best values we could lock in.

  15. ^

    Additionally, the individual is comparing themselves with other individuals working on x-risk. These are, at present, often highly intelligent and ambitious people recruited from elite universities. A priori, it is just as likely their impact is less than average (including negative[14]) as greater than average.

  16. ^

    Soon - how soon will the existential threat occur

    Sudden - how sudden will the onset be (how much time will there be to take action between public awareness of the problem and occurrence of existential catastrophe)

    Sharp - how sharp is the distribution; how likely are warning shots (events which are similar, but don’t end civilization, such that people better prepare for the x-risk)

  17. ^

    Significance - goodness or badness of a state of affairs at any given moment

    Persistence - length of time a state of affairs will last

    Contingency - how likely the state of affairs would have been to occur if no intervention was made

  18. ^

    Work at extremely critical times may have much greater variance, but this may be hard to predict and is likely not multiplicative with the other factors.

  19. ^

    For those with high altruistic leverage, such as being near the “hinge of history,” point number 2 may be highly dominant.

  20. ^

    The Youngness Paradox might also resolve The Doomsday Argument, especially in the pessimistic scenario.

5 comments

There are two major unexamined assumptions underlying this analysis.

The most flagrant is the assumption that the expected value of all work done now on x-risk is positive. You might hope that it is, but you can't actually know or even have rationally high confidence in it. Without this assumption, you might be able to say that anything we do today is important, but can't say that it's equivalent to saving lives. You may equally well be doing something equivalent to ending lives.

Another serious unjustified assumption is that the correct measure is some aggregated utility that is linear in the number of people who come to exist. I have extreme doubts that murdering 7 billion people today is ethically justifiable if it would increase the population capacity of the universe a trillion years from now by 0.0000000000000000000000000000000000000001% even though it means that a lot more people get to live. Likewise I have an expectation that allowing capacity for one more potential person to exist a trillion years from now is morally much less worthwhile than saving an actual person today.

As to your second objection, I think that for many people the question of whether murdering people in order to save other people is a good idea is a separate moral question from which altruistic actions we should take to have the most positive impact. I am certainly not advocating murdering billions of people.

But whether saving present people or (in expectation) saving many more unborn future people is a better use of altruistic resources seems to be largely a matter of temperament. I have heard a few discussions of this and they never seem to make much sense to me. For me it is literally as simple as people being further away in time which is another dimension, not really any different than spatial dimensions, except that time flows in one direction and so we have much less information about it.

But uncertainty only calls into question whether or not we have impact in expectation; for me it has no bearing on the reality of this impact or the moral value of these lives. I cannot seem to comprehend why other people value future people less than present people, assuming you have equal ability to influence either. I would really like for there to be some rational solution, but it always feels like people are talking past each other in these types of discussions. If there is one child tortured today it cannot somehow be morally equivalent to ten children being tortured tomorrow. If I can ensure one person lives a life overflowing with joy today, I would be willing to forego this if I knew with certainty I could ensure one hundred people live lives overflowing with joy in one hundred years. I don’t feel like there is a time limit on morality; to be honest, it still confuses me why exactly some people feel otherwise.

You also mentioned something about differing percentages of the population. Many of these questions don’t work in reality because there are a lot of flow-through effects, but if you ignore those, I also don’t see how 8,000 people today suffering lives of torture might be better than 8 early humans a couple hundred thousand years ago suffering lives of torture, even if that means it was 1/1,000,000 of the population in the first case and 1/1,000 of the population in the second case (just a wild guess).

These questions might be complicated if you take the average view on population ethics instead of the total view, and I actually do give some credence to the average view, but I nonetheless think the amount of value created by averting x-risk is so huge that it probably outweighs these considerations, at least for the risk neutral.

I'm not actually talking about "a person being tortured today" versus "a person being tortured tomorrow". I agree those are equivalent, from some hypothetical external viewpoint and assuming that various types of uncertainty are declared by fiat to be absent.

It's about "a person who actually exists getting to continue their life that would otherwise be terminated" versus "a person being able to come to exist in the future versus not ever existing". I have serious doubts that these are morally equivalent, and am inclined to believe that they are not even on a comparable scale. In particular, I think using the term "saving a life" for the latter is not only unjustified, but wilfully deceptive.

Even if there does turn out to be a strong argument for the two outcomes being comparable on some numerical scale, I expect to still strongly disfavour any use of terminology that equates them as this post does.

Ah, thanks for the clarification, this is very helpful. I made a few updates including changing the title of the piece and adding a note about this in the assumptions. Here is the assumption and footnote I added, which I think explains my views on this:

Whenever I say "lives saved" this is shorthand for “future lives saved from nonexistence.” This is not the same as saving existing lives, which may cause profound emotional pain for people left behind, and some may consider more tragic than future people never being born.[6]

Here is footnote 6, created for brevity of the main piece: 

This post originally used the term "lives saved" without mentioning nonexistence, but JBlack on LessWrong pointed out that the term “lives saved” could be misleading in that it equates saving present lives with creating new future lives. While I take the total view and so feel these are relatively equivalent (if we exclude the flow-through effects, including the emotional pain caused to those left behind by the deceased), those who take other views such as the person-affecting view may feel very differently about this.

Here is a related assumption I added based on an EA Forum comment:

I assume a zero-discount rate for the value of future lives, meaning I assume the value of a life is not dependent on when that life occurs.

I hope this shows why I think the term is not unjustified. I certainly was not intending to be willfully deceptive, and I apologize if it seemed this way. I believe in the equal value of all conscious experience quite strongly, and this includes future people, so for me “lives saved” or “lives saved from nonexistence” carries the correct emotional tone and moral connotations. I can definitely respect that other people may feel differently.

I am curious whether this clarifies our difference in intuitions, or if there is some other reason you see the ending of a life as worse than the non-existence of life.

Interesting objections!

I mentioned a few times that some and perhaps most x-risk work may have negative value ex post. I go into detail on how work may be negative in footnote 14.

It seems somewhat unreasonable to me, however, to be virtually 100% confident that x-risk work is as likely to have zero or negative value ex ante as it is to have positive value.

I tried to include the extreme difficulty of influencing the future by giving work relatively low efficacy, i.e. in the moderate case 100,000 (hopefully extremely competent) people working on x-risk for 1000 years only cause a 10% reduction of x-risk in expectation, in other words effectively a 90% likelihood of failure. In the pessimistic estimate 100,000 people working on it for 10,000 years only cause a 1% reduction in x-risk.

Perhaps this could be a few orders of magnitude lower, say 1 billion people working on x-risk for 1 million years only reduce existential risk by 1 in 1 trillion in expectation (if these numbers seem absurd you can use lower numbers of people or time, but this increases the number of lives saved per unit of work). This would make the pessimistic estimate have very low value, but the moderate estimate would still be highly valuable (10^18 lives per minute of work).
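
As a rough check on the 10^18 figure, here is a minimal sketch of the scaling. It assumes that lives saved per unit of work is proportional to the expected risk reduction divided by total person-years, with the number of future lives held fixed. The function and variable names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def impact_per_unit_work(risk_reduction, people, years):
    """Quantity proportional to lives saved per unit of work.

    Assumption: impact scales linearly with expected risk reduction
    and inversely with total person-years spent achieving it.
    """
    return Fraction(risk_reduction) / (people * years)

# Moderate case: 100,000 people for 1,000 years achieve a 10% reduction.
moderate = impact_per_unit_work(Fraction(1, 10), 100_000, 1_000)

# Harsher case: 1 billion people for 1 million years achieve a
# 1-in-1-trillion reduction.
harsher = impact_per_unit_work(Fraction(1, 10**12), 10**9, 10**6)

# Factor by which each unit of work becomes less valuable.
scale = harsher / moderate

# Apply that factor to the moderate estimate of 10^36 lives per minute.
moderate_lives_per_minute = 10**36
adjusted_lives_per_minute = moderate_lives_per_minute * scale

print(scale, adjusted_lives_per_minute)
```

Total work grows by 7 orders of magnitude and the risk reduction shrinks by 11, so each unit of work is worth 18 orders of magnitude less, which takes the moderate estimate from 10^36 down to 10^18 lives per minute.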

All that is to say, I think that while you could be much more pessimistic, I don’t think it changes the conclusion by that much, except in the pessimistic case - unless you have extremely high certainty that we cannot predict what is likely to help prevent x-risk. I did give two more pessimistic scenarios in the appendix which I say may be plausible under certain assumptions, such as 100% certainty that X-risk is inevitable. I will add that this case is also valid if you assume a 100% certainty that we can’t predict what will reduce X-risk, as I think this is a valid point.