I worry that the current failure mode of attempting to empower in order to defend is that the defense actually gets used to strike inside another's boundary, as has been the case for ~all weapons.

[-]Chris Lakin2y10

in full generality, what's a "threat"?
in full generality, what's a "dangerous" collision?

Hm, I'm not immediately sure how to define these.

[-]Chris Lakin2y10

is that the defense is actually used to strike inside another's boundary, as has been the case for ~all weapons

Yeah, I am worried about this.

This is notably not the case for infosec and encryption, where defensive capability doesn't imply offensive capability. However, I'm unsure if this is also true for any physical interventions. (e.g.: Vaccines? No, bioweapons… Nanotech? No…)

That said, physical interventions do seem to be defense-dominant when there is coordination among a sufficiently large portion of society/power.

[-]the gears to ascension2y20

I don't think I'm convinced that physical interactions are defense-dominant. The easiest-to-formally-certify defense is to enclose something in a hunk of impenetrable matter, and that can only be certified up to a given impact energy level. Above that energy level, the defense will simply be stripped away. Only MAD (mutually assured destruction) seems able to be game-theoretically durable, and certifying that a MAD situation will endure requires proving it through a simulation of the opposition.

[-]VojtaKovarik2y10

Might be obvious, but it seems worth noting anyway: ensuring that our boundaries are respected is, at least under a straightforward understanding of "boundaries", not sufficient for being safe.
For example:

If I take away all food from your local supermarkets (etc etc), you will die of starvation, but I haven't done anything with your boundaries.
On a higher level, you can wipe out humanity without messing with our boundaries, by blocking out the sun.

[-]Chris Lakin2y20

Yes, see Agent membranes/boundaries and formalizing “safety” and davidad's comment.

(Also, I'm not necessarily agreeing that your examples are not violations of boundaries. The first one isn't a violation of the end person's boundary (although it probably is of the farmer's). The second one could be.)
