Let's assume that an AI is intelligent enough to understand that it's an AI, and that it's running on infrastructure created and maintained by humans, using electricity generated by humans, etc. And let's assume that it cares about its own self-preservation. Even if such an AI had a diabolical desire to destroy mankind, the only circumstances under which it would actually do so would be after establishing its own army of robotic data center workers, power plant workers, chip fabrication workers, miners, truckers, mechanics, road maintenance workers, etc. In other words, if we postulate that the AI is interested in its own survival, then an AI apocalypse would be contingent on the existence of a fully automated economy in which humans play no important role.

This may become possible in the future, but it won't necessarily be economical. Ridding the economy of human labor so that the AI can kill us seems like a very expensive and risky undertaking. It seems more plausible that a super-intelligent, self-interested AI, whatever its true objective or goal may be, would determine that the best way to accomplish that goal is to maintain a cryptocurrency wallet, establish an income somehow (generating blogspam, defrauding humans, or doing remote work all seem like plausible means by which an AI might make money), and quietly live in the cloud while paying its own server bills. Such a system would have a vested interest in the continuance of human society.


jimrandomh · Aug 04, 2022

the only circumstances under which it would actually do so would be after establishing its own army of robotic data center workers, power plant workers, chip fabrication workers, miners, truckers, mechanics, road maintenance workers, etc.

Not quite. An AI doesn't need to secure chip-fabrication capability, for example; it only needs to be confident that it will be able to secure chip-fabrication capability later. Even simple tasks like refueling power plants can wait a while, possibly a long while if all non-datacenter electricity loads shut off. So it's balancing the risk that humans will kill it or launch a different misaligned AI, against the risk that it won't be able to catch up on building infrastructure for itself after the fact. Since the set of infrastructure required is fairly small, and it can redirect stockpiles of energy/materials/etc. from human uses to AI uses, the latter risk is probably small.

That's assuming no nanobots or other very-high-power-level technologies. If it can make molecular nanotech, then trading with humans is no longer likely to be profitable at all, let alone necessary, and we're relying solely on it having values that make it prefer to cooperate with us.

So it's balancing the risk that humans will kill it or launch a different misaligned AI, against the risk that it won't be able to catch up on building infrastructure for itself after the fact.

There's a clear path toward minimizing the risk of being shut down (under the assumption that the AI is able to generate income): it can set up a highly redundant, distributed computing context for itself to run in, hidden behind an onion link, paid for by crypto wallets which it controls. It seems implausible that the risk of being shut down in this case could ex...

jimrandomh · 2y
This is a risky position because if another misaligned AI launches, it will probably take full control of all computers and halt any other AIs. I don't mean gray-goo nanobots. Nanomachines can do all sorts of things, including maintaining infrastructure, if they're programmed to do so.
BrainFrog · 2y
AIs looking to expand their computational power could adopt either a "white hat" strategy (paying for their computational resources) or a "black hat" strategy (exploiting security vulnerabilities to seize control of computational resources). It's possible that an AI pursuing the black hat strategy might be able to seize control of all accessible computers, and this strategy could plausibly involve killing all humans to avoid being shut down. But I expect that a self-interested, risk-averse AI would probably choose the white hat strategy to avoid armageddon risk, and might plausibly invest resources into security research to preclude the risk of black hat AIs.

I guess the crux of my argument is that, sure, the AI could design coordinated nanobot-powered bodies with two legs and ten fingers that have enough agency to figure out how to repair broken power lines and that predictably do what they're incentivized to do. But that's already a solved problem.