Really enjoyed reading Zvi's Crux List.  

One really important takeaway from lists like this is that "we don't know what we don't know". Namely, P(DOOM) depends intimately on a whole bunch of details about the universe that are mostly unknowable in advance.

In that spirit, here are some more cruxes that P(DOOM) depends on.

Does AI favor the attacker (in a military sense)?

Suppose I invented a weapon that was exactly like modern weapons, but with two differences:

  1. It is 100x faster than existing weapons
  2. It is 100x more precise than existing weapons

Obviously such a weapon would favor the defender, right?

Which is harder to improve: AI capabilities or AI ethics?

Suppose it is absolutely trivial to build an AI that is sentient, but extremely difficult to build an AI that is super-intelligent.

In that case, Alignment is super-easy because we can easily create a bunch of AIs with high moral worth without increasing the overall danger to humanity, right?

Suppose, on the other hand, that it is extremely easy to build AI that is super-intelligent but practically impossible to create AI with goals, sentience, etc.

In that case, Alignment is super-easy because it will be easy to pull off a pivotal act, right?

Is AI inherently centralizing or decentralizing?

Suppose that AI is inherently centralizing (training runs require large quantities of compute, which gives the few largest players the biggest advantage).

This is good for alignment, because it means we don't have to worry about a lone madman doing something that endangers all of humanity, right?

Suppose on the other hand that AI is inherently decentralizing (because literally anyone can run an AI on their GPU).

This is good for Alignment because centralized power is generally bad, right?

How hard is it to emulate human minds?

If emulating human minds (say by training an LLM to imitate human speech) is trivial, this is good for Alignment because it means that it is easy to teach AI human values, right?

Suppose, on the other hand, that emulating human minds is literally impossible because of the no-cloning theorem. This is good for Alignment because it eliminates a large category of S-Risk, right?

Is Many Worlds True?

If the Many Worlds interpretation is true, this is good for Alignment because it means that even a small P(~DOOM) implies that we obtain a large absolute amount of future utility, right?

Suppose, on the other hand, that objective wavefunction collapse is true. This is good for Alignment because it implies humans have Free Will, right?

What is the point of all this?

Each of these questions has the property that:

  1. Knowing the answer in advance is practically impossible
  2. Knowing the answer should change your P(DOOM) a lot

One thing this should definitely do is rule out any claims that P(DOOM) is extremely high or extremely low.
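To make the arithmetic behind that concrete, here is a toy sketch in Python. The numbers are entirely hypothetical, made up for illustration (none of them come from Zvi's list or anywhere else): it treats five cruxes as coin flips and shows how conditional P(DOOM) estimates ranging from near 0 to near 1 still average out to a middling number when you don't know which world you are in.

```python
import itertools

# Toy numbers, purely illustrative: five binary cruxes, each a coin flip
# from our current vantage point.  Conditional on any particular
# combination of answers, P(DOOM) may be extreme, but marginalizing over
# our ignorance pulls the overall estimate toward the middle.

def p_doom_given(answers):
    """Hypothetical conditional P(DOOM): the more cruxes that break
    'badly', the higher the conditional probability of doom."""
    bad = sum(answers)
    table = {0: 0.01, 1: 0.05, 2: 0.25, 3: 0.60, 4: 0.90, 5: 0.99}
    return table[bad]

combos = list(itertools.product([0, 1], repeat=5))  # 32 equally likely worlds
marginal = sum(p_doom_given(c) for c in combos) / len(combos)

print("conditional estimates span 0.01 to 0.99")
print(f"marginal P(DOOM) under total ignorance: {marginal:.2f}")  # about 0.45
```

On these made-up numbers, the conditional estimates run from 0.01 to 0.99, yet the marginal lands around 0.45. The exact figure is meaningless; the point is that unresolved, high-impact cruxes keep the all-things-considered estimate away from the extremes.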

Additionally, I think it weakens the precautionary principle a bit. Because we don't even have a framework for estimating whether a particular action (say, researching AI consciousness) increases or decreases P(DOOM), it is impossible to decide in advance which actions are "precautionary".

Finally, and most controversially here on LessWrong, it means that we should give extra deference to those with hands-on experience. Because Alignment depends so finely on the details of AI technology, those who know the technology most intimately through practical experience are likely to be nearer to the truth.
