Gatekeeper variation

by JoshuaFox

7th Aug 2015


Luke_A_Somers (10y)

Objection 1 seems really strong. The kinds of problems that an AGI would handle better than non-general intelligences are those with ambiguity. If it were just a constraint solver, it wouldn't be a threat in the first place.

Similarly, with such a restricted output channel, there's little to no point in giving it agency to begin with. We're deep in 'tool AI' territory. The incentives to leave this territory would remain.


JoshuaFox (10y)

Thanks. Those points are correct. Is there any particular weakness or strength to this UP-idea in contrast to Oracle, tool-AI, or Gatekeeper ideas?


Luke_A_Somers (10y)

Seems like your usual 'independently verify everything the AI says' concept, only way more restrictive.


JoshuaFox (10y)

Sure, but to "independently verify" the output of an entity smarter than you is generally impossible. This makes it possible, while also limiting the potential of the boxed AI to choose its answers.
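To illustrate the claim that checking an answer can be feasible even when finding it is not, here is a minimal sketch. It assumes, purely for illustration, that the restricted problem is Boolean satisfiability in CNF; the post's actual problem class is not shown in this thread. The names `verify_assignment` and `gatekeeper` are hypothetical.

```python
from typing import Dict, List, Optional

# A clause is a list of DIMACS-style literals: 3 means x3, -3 means NOT x3.
Clause = List[int]


def verify_assignment(cnf: List[Clause], assignment: Dict[int, bool]) -> bool:
    """Return True only if every clause contains at least one satisfied literal."""
    for clause in cnf:
        satisfied = False
        for lit in clause:
            var = abs(lit)
            # An unassigned variable cannot satisfy a literal.
            if var in assignment and assignment[var] == (lit > 0):
                satisfied = True
                break
        if not satisfied:
            return False
    return True


def gatekeeper(cnf: List[Clause], untrusted_answer: object) -> Optional[Dict[int, bool]]:
    """Hypothetical gatekeeper: forward the answer only if it is well-formed and valid."""
    if not isinstance(untrusted_answer, dict):
        return None
    clean: Dict[int, bool] = {}
    for key, value in untrusted_answer.items():
        # Accept nothing but plain int -> bool pairs; reject any other payload.
        if type(key) is not int or type(value) is not bool:
            return None
        clean[key] = value
    return clean if verify_assignment(cnf, clean) else None
```

The check runs in time linear in the size of the formula, however hard the formula was to solve. It says nothing about which of several valid answers was chosen, and it does not address hardware-level side channels.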


Luke_A_Somers (10y)

'Generally impossible'. Well... only in the sense that 'general' means 'in every conceivable case'. But that just means you miss out on some things.

This is not the reason that tool AI would likely fail.


kilobug (10y)

This won't work, like all other similar schemes, because you can't "prove" the gatekeeper correct down to the quark level of its hardware (so you're vulnerable to some kind of side attack, like the memory bit-flipping attack that was discussed recently), nor can you shield the AI from communicating through side channels (for example, varying the temperature of its internal processing unit, which in turn will influence the air conditioning system, ...).

And that's not even considering that the AI could actually discover new physics (new particles, ...) and have some ability to manipulate them with its own hardware.

This whole class of approaches can't work, because there are just too many avenues for side attacks and side channels of communication, and you can't formally prove that none of them are available without constructing a proof over the whole system (AI + gatekeeper + power generator + air conditioner + ...) down to the level of the Schrödinger equation.
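The temperature example can be made concrete with a toy simulation. This is only a sketch under loud assumptions: no real hardware is involved, and the temperatures, noise level, threshold, and function names are all invented for illustration. It shows that a physical quantity the verifier never inspects can still carry bits.

```python
import random
from typing import List


def emit_temperatures(bits: List[int], idle: float = 40.0, busy: float = 60.0,
                      noise: float = 2.0, seed: int = 0) -> List[float]:
    """Simulated sender: run 'busy' for a 1 and 'idle' for a 0, plus sensor noise."""
    rng = random.Random(seed)
    return [(busy if b else idle) + rng.uniform(-noise, noise) for b in bits]


def decode_temperatures(readings: List[float], threshold: float = 50.0) -> List[int]:
    """Simulated observer: recover bits by thresholding the temperature readings."""
    return [1 if t > threshold else 0 for t in readings]
```

With these made-up parameters the noise never crosses the threshold, so the observer recovers the message exactly. A real channel would be far noisier and slower, but a formal proof about the software alone would not rule it out.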


JoshuaFox (10y)

You're quite right: these are among the standard objections to boxing, as mentioned in the post. However, AI boxing may have value as a stopgap at an early stage, so I'm wondering about the idea's value in that context.


Squark (10y)

It is indeed conceivable to construct "safe" oracle AIs that answer mathematical questions. See also the writeup by Jim Babcock and my comment. The problem is that the same technology can be repurposed relatively easily into an agent AI. Therefore, anyone building an oracle AI is really bad news unless FAI is created shortly afterwards.

I think that oracle AIs might be useful to control the initial testing process for an (agent) FAI but otherwise are far from solving the problem.
