In the transcript of Buck's talk on Cruxes for working on AI safety, there's an example of a bad argument for why we should be worried about people building spaceships that are too small.

Imagine if I said to you: “One day humans are going to try and take humans to Mars. And it turns out that most designs of a spaceship to Mars don't have enough food on them for humans to not starve over the course of their three-month-long trip to Mars. We need to work on this problem. We need to work on the problem of making sure that when people build spaceships to Mars they have enough food in them for the people who are in the spaceships.”

It's implied that you could turn this into an argument for why people are not going to build unaligned AGI. I parse it as an analogical argument: people would not build spaceships that don't have enough food on them, because they don't want people to die; analogously, people will not build unaligned AGI, because they don't want people to die.

So what are the disanalogies? One is that it is harder to tell whether an AGI is aligned than whether a spaceship has enough food on it. I don't think this can do much of the work, because if that were the problem with spaceships, people would just spend more effort on telling whether they have enough food on them, or not build them. Similarly, if this were the only problem, then people would just put more effort into determining whether an AGI is aligned before turning it on, or they would not build one until it got cheaper to tell. A related disanalogy is that there is more agreement about which spaceship designs have enough food, and how to tell, than there is about which AGI designs are aligned, and how to tell.

Another disanalogy is that everybody knows that if you design a spaceship without enough food on it and send people to Mars in it, those people will die. Not very many people know that if you design an AGI that is not aligned and turn it on, people will likely die (or some other bad outcome will happen).

Are these disanalogies enough to make the argument not go through? Are there other important disanalogies? If you don't find the analogical argument convincing, why not?

3 Answers

Similarly, if this were the only problem, then people would just put more effort into determining whether an AGI is aligned before turning it on, or not build them.

The traditional arguments for why AGI could go wrong imply that it could go wrong even if you put an immense amount of effort into trying to patch errors. In machine learning, when we validate our models, we ideally do so in an environment that we think matches the real world, but it's common for the real world to turn out to be subtly different. In the extreme case, you could perform comprehensive testing and verification and still fail to properly assess the real-world impact.
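To make that concrete, here is a minimal sketch (my own illustration, not from the original answer) of this kind of validation gap: a classifier looks fine on held-out data drawn from the training distribution, yet quietly degrades when the deployment distribution is shifted slightly. The synthetic data, the size of the shift, and the use of `LogisticRegression` are all arbitrary choices made for illustration.

```python
# Toy sketch: a model that validates well on data matching the training
# distribution can still degrade when the deployment data is subtly shifted.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample(n, shift=0.0):
    """Two Gaussian classes; `shift` nudges every input (a small distribution shift)."""
    x0 = rng.normal(loc=[0.0 + shift, 0.0 + shift], scale=1.0, size=(n, 2))
    x1 = rng.normal(loc=[2.0 + shift, 2.0 + shift], scale=1.0, size=(n, 2))
    X = np.vstack([x0, x1])
    y = np.array([0] * n + [1] * n)
    return X, y

X_train, y_train = sample(2000)                # training distribution
X_val, y_val = sample(2000)                    # validation "matches" training
X_deploy, y_deploy = sample(2000, shift=1.0)   # the real world is subtly off

model = LogisticRegression().fit(X_train, y_train)
print("validation accuracy:", model.score(X_val, y_val))        # looks fine
print("deployment accuracy:", model.score(X_deploy, y_deploy))  # quietly worse
```

The point is only that a validation set which "matches" training can pass while deployment still fails; no amount of effort on that validation set would have revealed the shift.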

If the cost of properly ensuring safety is arbitrarily high, there is a point at which people will begin deploying unsafe systems. This is inevitable, unless you could somehow either ban computer hardware or stop AI research insights from proliferating.

I talked about this issue with Buck in the comments (my comment, Buck's answer).

What I pointed out was that the spaceship example has very specific features:

  • Both personal and economic incentives push against the problem occurring.
  • The problem is obvious when one is confronted with the situation.
  • At the point where the problem becomes obvious, you can still solve it.

My intuition is that the main disanalogies with the AGI case are the first one (at least the economic incentives, which might push people to try dangerous things when the potential returns are great) and the last one, depending on your position on takeoffs.

One big difference is that "having enough food" admits a value function ("quantity of food") that is both well understood and, for the most part, smooth and continuous over the design space, given today's design methodology: if we try to design a ship with a particular amount of food and make a tiny mistake, it's unlikely that the quantity of food will change that much. In contrast, the "how well is it aligned" metric is very poorly understood (at least compared with "amount of food on a spaceship") and a lot more discontinuous: using today's techniques for designing AIs, a tiny error in alignment is almost certain to cause catastrophic failure. Basically, we do not know exactly what it means to get it right; even if we knew, we do not know what the acceptable error tolerances are; and even if we knew that, we do not know how to meet them. None of that applies to the amount of food on a spaceship.
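A toy numerical contrast (my own illustration, which simply assumes the answer's claim that alignment behaves like a hard threshold) makes the smooth-versus-discontinuous point concrete: a small design error barely changes a smooth metric like total food, but flips a thresholded metric from fine to catastrophic.

```python
# Toy contrast between a smooth design metric and a discontinuous one.
# Modelling "alignment" as a hard threshold is an assumption for
# illustration, not an established fact.

def food_supply_kg(design_error, intended_kg=1000.0):
    # Smooth metric: a small mistake changes the total food only a little.
    return intended_kg * (1.0 - design_error)

def toy_alignment_outcome(design_error, tolerance=1e-3):
    # Discontinuous metric (by assumption): any error beyond a tiny
    # tolerance is treated as catastrophic failure.
    return "ok" if abs(design_error) < tolerance else "catastrophe"

for err in (0.0, 0.01, 0.05):
    print(f"design error {err:4.0%}: "
          f"food = {food_supply_kg(err):6.1f} kg, "
          f"toy alignment outcome = {toy_alignment_outcome(err)}")
```

Only the shape of the two metrics under small perturbations matters here, not the specific numbers or the arbitrary tolerance.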