• Safety from Xrisk is a common good: We all benefit by making it less likely that we will all die.
  • In general, people are somewhat selfish, and value their own personal safety over that of another (uniform) randomly chosen person.
  • Thus individuals are not automatically properly incentivized to safeguard the common good of safety from Xrisk.

I hope you all knew that already ;)


From the original article by Nick Bostrom: "Reductions in existential risks are global public goods [13] and may therefore be undersupplied by the market [14]. Existential risks are a menace for everybody and may require acting on the international plane. Respect for national sovereignty is not a legitimate excuse for failing to take countermeasures against a major existential risk." See: https://nickbostrom.com/existential/risks.html

I see what you're saying, but I don't think it's a great fit for the concept, and I don't think it's helpful to reinforce what similarities there are.

Traditional commons problems (inefficient and destructive use of resources due to the lack of a mechanism to encourage efficient uses) are typically solved by assigning an owner or curator, who decides what limits to impose (often in the form of an entry price or usage rent, which means the resource gets used by those who think it's more valuable than the price, and provides revenue to replenish or maintain the property). Tragedy of the commons is generally applied to non-excludable (can't keep people out), rivalrous (meaning it's not unlimited in usage) topics. And the solution is exclusion.

X-risk mitigation is more like the "should we build a lighthouse" problem - it's non-excludable but also non-rivalrous, since someone using it doesn't keep anyone else from using it. In fact, many beneficiaries don't even know they're taking advantage of it. Here, the difficulty is knowing what level of investment is efficient. You can't get a price signal, so there's no way to know what level of cost for any given project would make people prefer to take the risk rather than spend the cost. There's a lower bound on what people will voluntarily contribute (which is how many lighthouses actually got built - through sponsorships and donations by townspeople and ship owners). In modern times, this is often a topic where government officials make claims that they know the right things, and use it as a reason to justify taxes. Sometimes, they may even be right.

In case others haven't seen it, here's a great little matrix summarising the classification of goods on "rivalry" and "excludability" axes.
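For reference, the standard two-axis classification can be sketched in a few lines of Python. The function name `classify_good` and the example entries are illustrative assumptions; the lighthouse and pasture placements follow how the thread above describes them.

```python
def classify_good(rivalrous: bool, excludable: bool) -> str:
    """Return the textbook category for a good on the two axes.

    rivalrous:  one person's use leaves less for others.
    excludable: people can be kept from using it.
    """
    if rivalrous and excludable:
        return "private good"
    if rivalrous:
        # Rivalrous but open to all: the classic "commons".
        return "common-pool resource"
    if excludable:
        return "club good"
    return "public good"


# Hypothetical examples, placed as the comments above describe them.
examples = {
    "overgrazed pasture": classify_good(rivalrous=True, excludable=False),
    "lighthouse": classify_good(rivalrous=False, excludable=False),
}

for name, kind in examples.items():
    print(f"{name}: {kind}")
```

Under this scheme, whether X-risk mitigation counts as a commons problem depends on which resource you pick and whether you judge it rivalrous, which is the point the replies below argue about.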

I think it is rivalrous.

Xrisk mitigation isn't the resource; risky behavior is the resource. If you engage in more risky behavior, then I can't engage in as much risky behavior without pushing us over into a socially unacceptable level of total risky behavior.

I think existence is the resource. Risky behavior and risk-mitigating behavior both impact the probabilities of such resource in the future. The fundamental resource (enjoying existence when disaster has not happened yet) is non-rivalrous and non-excludable. Funding for risk-mitigating projects is both rivalrous and excludable - that's just normal goods. Allowing risk-increasing behaviors is ... difficult to model.

True, but from a marketing perspective it's better to emphasize the fact that reducing x-risk is in each individual's self-interest even if no one else is doing it. Also, instead of talking about AI arms races, we should talk about why AI done right means a post-scarcity era whose benefits can be shared by all. There's no real benefit to being the person who triggers the post-scarcity era.

I dunno... I think describing them as a tragedy of the commons can help people understand why the problems are challenging and deserving of attention.

Tragedies of the commons usually involve some personal incentive to defect, which doesn't seem true in the framework you have. Of course, you could get such an incentive if you include race dynamics where safety takes "extra time", and then it would seem like a tragedy of the commons (though race to the bottom seems more appropriate).

If there is a cost to reducing Xrisk (which I think is a reasonable assumption), then there will be an incentive to defect, i.e. to underinvest in reducing Xrisk. There's still *some* incentive to prevent Xrisk, but to some people everyone dying is not much worse than just them dying.

Cool, I think we agree.

I'm not sure. I was trying to disagree with your top level comment :P

Ah, you're right, we don't really agree, I misunderstood.

I think we basically agree on the actual object-level thing and I'm mostly disagreeing on the use of "tragedy of the commons" as a description of it. I don't think this is important though so I'd prefer to drop it.

Tbc, I agree with this:

If there is a cost to reducing Xrisk (which I think is a reasonable assumption), then there will be an incentive [...] to underinvest in reducing Xrisk. There's still *some* incentive to prevent Xrisk, but to some people everyone dying is not much worse than just them dying.

An important aspect is that people disagree about which (if any) X-risks are real.

That makes it quite different from the usual scenario, where people agree that the situation sucks but each of them has individual incentives to contribute to making it worse. Such a situation allows solutions like collectively agreeing to impose a penalty on people who make things worse (thus changing their individual incentive gradient). But if people disagree, imposing the penalty is politically not possible.

The boundary between disagreement about whether something is real and different preferences about the costs of mitigation is, alas, somewhat porous. E.g., when there was debate about the dangers of smoking, you were much less likely to think there was a lot of danger if you were either a smoker or an employee of a tobacco company than if you were neither of those things.

I don't know how strong this effect is in the domain of existential risk; it might depend on what things one counts as existential risks.

I think there's an important distinction between x-risks and most other things we consider to be tragedies of commons: the reward for "cooperating" against "defectors" in an x-risk scenario (putting in disproportionate effort/resources to solve the problem) is still massively positive, conditional on the effort succeeding (and in many calculations, prior to that conditional). In most central examples of tragedies of the commons, the payoff for being a "good actor" surrounded by bad actors is net-negative, even assuming the stewardship is successful.

The common thread is that there might be a free-rider problem in both cases, of course.

This is making the somewhat dubious assumption that X risks are not so neglected that even a "selfish" individual would work to reduce them. Of course, in the not too unreasonable scenario where the cosmic commons is divided up evenly, and you use your portion to make a vast number of duplicates of yourself, the utility, if your utility is linear in copies of yourself, would be vast. Or you might hope to live for a ridiculously long time in a post-singularity world.

The effect that a single person can have on X risks is small, but if they were selfish with no time discounting, it would be a better option than hedonism now. Although a third alternative of sitting in a padded room being very very safe could be even better.

I acknowledge that there's a distinction, but I fail to see how it's important. When you (shut up and) multiply out the probabilities, the expected personal reward for putting in disproportionate resources is negative, and the personal-welfare-optimizing level of effort is lower than the social-welfare-optimizing level.

Why is it important that that negative expected reward is made up of a positive term plus a large negative term (X-risk defense), instead of a negative term plus a different negative term (overgrazing)?
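The gap between the personal-welfare-optimizing and social-welfare-optimizing effort levels can be made concrete with a toy model. This is a minimal sketch under assumptions not stated in the thread: risk reduction is linear in total effort, each person's cost is quadratic in their own effort, and each person weights everyone else's survival by a hypothetical `altruism` factor between 0 (purely selfish) and 1 (values others as much as themselves). All names and numbers are illustrative.

```python
def chosen_effort(
    n_people: int,
    value_of_survival: float,
    risk_reduction_per_effort: float,
    cost_scale: float,
    altruism: float,
) -> float:
    """Effort that maximizes one person's utility in the toy model.

    Person i's utility is
        (1 + altruism * (n_people - 1)) * value_of_survival
            * risk_reduction_per_effort * total_effort
        - cost_scale * effort_i ** 2 / 2.
    Setting the derivative with respect to effort_i to zero gives
    the closed form returned below.
    """
    weight = 1 + altruism * (n_people - 1)
    return value_of_survival * risk_reduction_per_effort * weight / cost_scale


# Illustrative parameters only; chosen so total risk reduction stays small.
params = dict(
    n_people=100,
    value_of_survival=1.0,
    risk_reduction_per_effort=0.001,
    cost_scale=1.0,
)

selfish = chosen_effort(altruism=0.0, **params)
# altruism=1.0 weights everyone equally, which matches the effort
# that maximizes the sum of all utilities in this model.
socially_optimal = chosen_effort(altruism=1.0, **params)

print("selfish effort:", selfish)
print("socially optimal effort:", socially_optimal)
```

In this sketch the selfish effort is still positive, which matches the point that there is some incentive to prevent Xrisk, but it falls short of the social optimum by a factor of `n_people`. Intermediate `altruism` values land in between.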

Not all proposed solutions to x-risk fit this pattern: If government spends taxes to build survival shelters that will shelter only a chosen few who will then go on to perpetuate humanity in case of a cataclysm, most taxpayers receive no personal benefit.

Similarly, if government-funded programs solve AI value loading problems and the ultimate values don't reflect my personal self-regarding preferences, I don't benefit from the forced funding and may in fact be harmed by it. This is also true for any scientific research whose effect can be harmful to me personally even if it reduces x-risk overall.
