Don't Condition on no Catastrophes

by Scott Garrabrant, 21st Feb 2018

I often hear people say things like "By what date do you assign 50% chance to reaching AGI, conditioned on no other form of civilizational collapse happening first?" The purpose of this post is to make this question make you cringe.

I think that most people mentally replace the conditional with something like "if other forms of civilizational collapse were magically not a thing, and did not have to enter into your model." Further, I think this is the more useful question to discuss, as it makes it easier to double crux, or download other people's models. However, not everyone does this, and it is not the question being asked.

To illustrate the difference, consider Alice, who, if she ignored other forms of civilizational collapse, would think that the AGI arrival date is uniform over the next 100 years. However, she also thinks that, if not for AGI, an extinction-level nuclear war would happen at a time uniformly distributed over the next 100 years. Alice is not concerned about any other catastrophes.

Alice has these two independent distributions on when each event will happen if, counterfactually, the other were to magically not happen. However, the world is such that as soon as one of the events happens, it causes the other event to not happen, because the world is made very different.

When asking about Alice's median AGI date, ignoring civilizational collapse, we would like to encourage her to say 50 years. However, her median AGI date conditional on no nuclear war happening first is actually about 29 years (the conditional mean is about 33 years). This is because conditioning on no nuclear war happening first biases towards AGI dates that are early enough to stop a counterfactual future nuclear war.
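For readers who want to check the numbers, here is a short derivation under Alice's stated assumptions. The symbols are my own labels, not the post's: A is the counterfactual AGI date and N the counterfactual nuclear war date, both independent and uniform on [0, 100], so that P(A < N) = 1/2.

```latex
\begin{align*}
P(A \le a \mid A < N)
  &= \frac{\int_0^a \frac{1}{100}\left(1 - \frac{s}{100}\right) ds}{P(A < N)}
   = \frac{\frac{a}{100} - \frac{a^2}{2 \cdot 100^2}}{1/2}
   = 1 - \left(1 - \frac{a}{100}\right)^2, \\
\text{median: } & 1 - \left(1 - \frac{m}{100}\right)^2 = \frac{1}{2}
   \;\Longrightarrow\; m = 100\left(1 - \frac{1}{\sqrt{2}}\right) \approx 29.3, \\
\text{mean: } & \int_0^{100} a \cdot \frac{2}{100}\left(1 - \frac{a}{100}\right) da
   = \frac{100}{3} \approx 33.3.
\end{align*}
```

So under these assumptions the conditional median is roughly 29 years and the conditional mean roughly 33 years, both well below the 50 years Alice would give when ignoring nuclear war.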

The form of the question I would like to ask Alice is as follows:

Take your distribution over ways the future can go, and sample a random future. If that future ends with nuclear war at time $t$, sample another world with the property that neither AGI nor any other catastrophe happens before time $t$. If that world ends with a non-AGI catastrophe, redefine $t$ to be the time of the catastrophe in that world, and repeat the process, until you get a world that ends with AGI, with no other catastrophe happening first. Use this as your new distribution over futures, and tell me the median AGI date.

Note that conditioning on no other catastrophe happening first is the same procedure, except when you sample a new future, you do not require that it has the property that neither AGI nor any other catastrophe happens before time $t$.
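The two procedures can be compared with a small Monte Carlo sketch for Alice's toy model. This is only an illustration under stated assumptions: the function names (sample_future, conditioned_agi_date, resampled_agi_date, medians) are hypothetical, and I assume that "a world where nothing happens before time t" means both counterfactual dates are redrawn uniformly from [t, 100], which is what independent uniform dates give when conditioned on being later than t.

```python
import random
import statistics

HORIZON = 100.0  # years; both of Alice's counterfactual dates are uniform on [0, HORIZON]


def sample_future(rng, start=0.0):
    """Sample one future in which nothing happens before `start`.

    Assumption: independent uniform dates, conditioned on being later than
    `start`, are uniform on [start, HORIZON]. Whichever event comes first
    ends the world. Ties (vanishingly rare) are counted as AGI.
    """
    agi = rng.uniform(start, HORIZON)
    war = rng.uniform(start, HORIZON)
    return ("agi", agi) if agi <= war else ("war", war)


def conditioned_agi_date(rng):
    """Plain conditioning: resample from scratch until AGI comes first."""
    while True:
        event, time = sample_future(rng)
        if event == "agi":
            return time


def resampled_agi_date(rng):
    """The alternative: after a war at time t, resample a future in which
    nothing happens before t, and repeat until the future ends with AGI."""
    event, time = sample_future(rng)
    while event != "agi":
        event, time = sample_future(rng, start=time)
    return time


def medians(n=100_000, seed=0):
    """Return (median under conditioning, median under the alternative)."""
    rng = random.Random(seed)
    cond = statistics.median(conditioned_agi_date(rng) for _ in range(n))
    resamp = statistics.median(resampled_agi_date(rng) for _ in range(n))
    return cond, resamp


if __name__ == "__main__":
    cond, resamp = medians()
    print(f"median AGI date, conditioning on no war first: {cond:.1f}")
    print(f"median AGI date, resampling procedure:         {resamp:.1f}")
```

With enough samples, the first median should land near 29 years and the second near 50 years. The only difference between the two functions is the start argument passed when resampling, which mirrors the single difference between the two procedures described above.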

I don't have a good name for this alternative to conditioning, and would like suggestions in the comments. You may notice a similarity between it and causal counterfactuals. You may also notice a similarity between it and the thing you do to get Solomonoff Induction out of the Universal Semimeasure.
