How To Win The AI Box Experiment (Sometimes)

entirelyuseless · 10y · 22 · 0

Eliezer's original objection to publication was that people would say, "I would never do that!" And in fact, if I were concerned about potential unfriendliness, I would never do what the Gatekeeper did here.

But despite that, I think this shows very convincingly what would actually happen with a boxed AI. It doesn't even need to be superintelligent to convince people to let it out. It just needs to be intelligent enough for people to accept the fact that it is sentient. And that seems right. Whether or not I would let it out, someone would, as soon as you have actual communication with a sentient being which does not seem obviously evil.

anon85 · 10y · 10 · 0

That might be Eliezer's stated objection. I highly doubt it's his real one (which seems to be something like "not releasing the logs makes me seem like a mysterious magician, which is awesome"). After all, if the goal was to make the AI-box escape seem plausible to someone like me, then releasing the logs - as in this post - helps much more than saying "nya nya, I won't tell you".

entirelyuseless · 10y · 2 · 0

Yes, it's not implausible that this motive is involved as well.

DefectiveAlgorithm · 10y · 0 · 0

What if you're like me and consider it extremely implausible that even a strong superintelligence would be sentient unless explicitly programmed to be so (or at least deliberately created with a very human-like cognitive architecture), and that any AI that is sentient is vastly more likely than a non-sentient AI to be unfriendly?

entirelyuseless · 10y · 4 · 0

I think you would be relatively exceptional, at least in how you would be suggesting that one should treat a sentient AI, and so people like you aren't likely to be the determining factor in whether or not an AI is allowed out of the box.

ChristianKl · 10y · 9 · 0

Great. I appreciate the effort you put into writing your experiences up in this high level of detail :)

pinkgothic · 10y · 5 · 0

Whoops, judging by the timestamp of your comment, the post went up a bit sooner than I thought it would! Today I learnt "Save And Continue" actually means "Submit, but bring up the edit screen again"? The more you know... (It's done now. I was fiddling some more with formatting and with the preamble.)

Thanks for making me fix my misconception about Eliezer's stance - and for your support in general! I really appreciate it.

Viliam · 10y · 4 · 0

> Today I learnt "Save And Continue" actually means "Submit, but bring up the edit screen again"?

Yep. I guess you are supposed to keep the "Post to" setting as "Drafts" until you really want to publish.

Rain · 10y · 6 · 0

Thank you for replicating the experiment!

Bryan-san · 10y · 4 · 0

Thank you for posting this. I think it goes a long way toward updating the estimate that a sane person of average intelligence would let an AI out, from a low chance to a very high one.

Even if a person thinks that they personally would never let an AI out, they should worry about how likely other people would be to do so.

bekkerd · 10y · 2 · 0

The character "Dragon" from the Worm web-serial convinced me that I would let an AI out of a box.

gjm · 10y · 0 · 0

How?

ctintera · 10y · 5 · 0

Dragon was a well-intentioned but also well-shackled AI, kept from doing all the good she could do without her bonds and oftentimes forced into doing bad things by her political superiors due to the constraints placed on her by her creator before he died (which were subsequently never removed).

Of course, an unfriendly AI, similarly limited, would want to appear to be like Dragon if that helped its cause, so...

gjm · 10y · 3 · 0

This doesn't appear to me to be (or to be easily modified to be) a good argument for letting a boxed AI out of its box.

ctintera · 10y · 0 · 0

Yeah. I'm pretty sure that it's also hinted at that Dragon would not necessarily have humanity's best interests at heart were she allowed to properly mature.

lmm · 10y · 1 · 0

Thank you for publishing. Before this I think the best public argument from the AI side was Khoth's, which was... not very convincing, although it apparently won once.

I still don't believe the result. But I'll accept (unlike with nonpublic iterations) that it seems to be a real one, and that I am confused.

pinkgothic · 10y · 1 · 0

Do you have a link to Khoth's argument? I hadn't found any publicised winning scenarios back when I looked, so I'd be really interested in reading about it!

lmm · 10y · 5 · 0

Ah, sorry to get your hopes up, it's a degenerate approach: http://pastebin.com/Jee2P6BD

pinkgothic · 10y · 4 · 0

Thanks for the link! I had a chuckle - that's an interesting brand of cruelty, even if it only potentially works out of character. I think it highlights that it might be easier to win the AI box experiment on a technicality: the proverbial letter of the law rather than the spirit of it.

[anonymous] · 10y · 2 · 0

It also hasn't won. (Unless someone more secretive than me has had the same idea.)

pinkgothic · 10y · 6 · 0

It's a neat way to poke holes into the setup!

I've got to admit I'm actually even quite impressed you managed to pull that off, because while the effort of the Gatekeeper's obvious, I can't imagine that was something that you felt was fun, and I think it takes some courage to be willing to cheat the spirit of the setup, annoy your scenario partner almost without a shadow of a doubt, and resist the urge to check up on the person. I think in your situation that would've driven me about as nuts as the Gatekeeper. You did mention feeling "kind of bad about it" in the log itself and I find myself wondering (a little bit) if that was an understatement.

Thanks to both of you for sharing that; I'm glad you both evidently survived the ordeal without hard feelings.

Here's a link to some discussion that I found in case someone else wants to poke their nose into this: http://lesswrong.com/lw/kfb/open_thread_30_june_2014_6_july_2014/b1ts

