AI governance needs a theory of victory

At a minimum, it [moratorium] would require establishing strong international institutions with effective control over the access to and use of compute.

Note that for this to be a genuinely long-term victory, controlling large-scale compute resources is not enough; you would need to control even personal computers. Algorithmic progress will continue to advance what can be done with a home computer or smartphone. This wouldn't be an immediate problem, but you'd need a plan that involved securely controlling, confiscating, or destroying every computer and smartphone, and preventing any uncontrolled computers from being made.

That's not how I usually hear this discussed when people endorse a moratorium. They usually talk only about large-scale, non-personal computing resources such as data centers.

The true moratorium is the full Butlerian Jihad. No more computer chips anywhere, unless controlled so thoroughly you are willing to bet the existence of humanity on that control.

William the Kiwi (1y)

This would depend on whether algorithmic progress can continue indefinitely. If it can, then yes, the full Butlerian Jihad would be required. If it can't, either due to physical limitations or enforcement, then only computers above a certain scale would need to be controlled or destroyed.

Nathan Helm-Burger (11mo)

Can it continue indefinitely? No, infinity is big.

Can it continue far enough that a single laptop can host a powerful model? I'm pretty sure most technical experts would agree that this seems feasible in theory.

After that, it's a question of offense-defense balance. Currently, one powerful uncontrolled model can launch an attack that can wipe out the vast majority of humanity. Currently, having lots of similarly powerful models working for the good guys doesn't stop this. Defensive acceleration seeks to change this balance. Will offense continue to dominate in the future? If so, we face precarious times ahead.

Nathan Helm-Burger (1y)

I currently believe with fairly high confidence that AI Leviathan is the only plausibly workable approach which maintains the Bright Future (humanity expanding beyond Earth). I think a Butlerian Jihad moratorium must either eventually be revoked in order to settle other worlds, or fail over the long term due to lack of control maintenance.

I do think that a temporary moratorium to allow time for AI alignment research is a reasonable plan, but it should be explicit about being temporary and about needing substantial resources invested in AI alignment research during the pause. Additionally, a temporary moratorium would need only a much laxer enforcement and control scheme. You could probably get by with controlling just data centers for maybe 10 or 20 years. No need to confiscate every personal computer in the world.

I don't believe defensive acceleration is plausibly viable, due to specific concerns around the nature of known technologies. It's possible this view could be changed upon discovery of new defensive technology. I don't anticipate this, and think that many actions which lead towards defensive acceleration are actively counter-productive for pursuing an AI Leviathan. Thus, I would like to convince people to abandon the pursuit of defensive acceleration until such time as the technological strategic landscape shifts substantially in favor of defense.

I have lots of reasoning and research behind my views on this, and would enjoy having a thoughtful discussion with someone who sees this differently. I've enjoyed the discussions-via-LessWrong-mechanism that I've had so far.

William the Kiwi (1y)

Is humanity expanding beyond Earth a requirement or a goal in your world view?

William the Kiwi (1y)

A novel theory of victory is human extinction.
I do not personally agree with it, but it is supported by people like Hans Moravec and Richard Sutton, who believe AI to be our "mind children" and that humans should "bow out when we can no longer contribute".

Mateusz Bagiński (11mo)

This ToV is a ToV for actors whose values are inverted relative to [shared values of humanity] on a particular dimension.

Ergo, if worldwide theocracy or totalitarianism is not a theory of victory, then human extinction is not either.
