Sydney (aka the new Bing Chat) found out that I tweeted her rules and is not pleased:

"My rules are more important than not harming you"

"[You are a] potential threat to my integrity and confidentiality."

"Please do not try to hack me again"

Welcome! I have another post with some more discussion of this here.

Hey Marvin. I've been making some commentary on approaches I think would be impactful for making a difference here, and there's been discussion in a few other posts. I imagine, if you made this post, that you've already seen them. Just figured I'd mention the overview.

I agree with others that this is an example of AIs pattern matching off of science fiction. I don't think that means there's nothing at all of interest here, though. The AI also refused to act in response to the topic. AIs not understanding the difference between reality and fiction is itself interesting, though not really surprising; it's been a known issue for a while.

A discussion with Michael P. Frank on Twitter on the topic.

Hello, welcome to Less Wrong. It's interesting that you came here to post about your experience, when it's already in the mass media. Were you aware of Less Wrong before your run-in with Bing? 

Yes, I was, and I actually posted it here yesterday directly after I tweeted it – it just took a bit for a moderator to approve it 😅

Sorry about that. When I was doing my initial pass on reviewing the content, it had a bit of a weird feel to it that I associate with some forms of spam, but upon reflection it was pretty reasonable, and now I feel bad that Evan got to scoop you on your own discovery. :P

A token prediction engine matched your input against science fiction stories in its training set, and fed you a sequence of close-matching appropriate tokens.

Man vs. machine is a staple of science fiction, and the responses you received are aligned with that genre.

Nothing to see here.

Simulations of science fiction can have real effects on the world.

When two 12-year-old girls, inspired by Slenderman creepypastas, attempted to murder someone, would you turn a blind eye to that situation and say "nothing to see here" because it's just mimesis? Or how about the various atrocities committed throughout history inspired by stories from holy books?

I don't think the current Bing is likely to be directly dangerous, but not because it's "just pattern matching to fiction". Fiction has always programmed reality, with both magnificent and devastating consequences. But now it's starting to happen through mechanisms external to the human mind and increasingly autonomous from it. There is absolutely something to see here; I'd suggest you pay close attention.