LESSWRONG
April Fool's · GPT · Humor · Site Meta
[April Fools] User GPT2 is Banned

by jimrandomh · 2nd Apr 2019 · 1 min read · 65 points

20 comments, sorted by top scoring
Ruby · 6y · 60

I warned them, I said it wasn't safe to put an AI in a text box.

complexmeme · 6y · 43
In addition, we have decided to apply the death penalty

Less Wrong moderation policy: Harsh but fair.

Alexei · 6y · 31

I think overall I just appreciate that you guys did something for April 1st. It made the website / community feel a bit more alive.

namespace · 6y · 23

Thanks for inspiring GreaterWrong's new ignore feature.

Raemon · 6y · 12

Man, we were considering whether to implement that, but then we were like "hmm, we probably should not do that on a whim without thinking about it."

clone of saturn · 6y · 12

I'm happy to discuss any concerns you have about it.

Chris_Leong · 6y · 17

I thought that GPT2 was funny at first, but after a while it got irritating. If there's a next time, it should be more limited in how many comments it makes: 1) you could train it on how many votes its comments got, to figure out which comments are worth replying to; 2) it could also restrict itself to automatically replying to replies on its own comments.

DPiepgrass · 6y · 7

Maybe by next year they'll have an adversarial anti-GPT AI trained to distinguish GPT2 (GPT3? GPT4?) comments from humans. Then GPT can create 50 replies to every human comment, and of those, the other AI will decide which of the replies sounds the *least* like GPT and post that one.

April Fool's day: the funniest step on the path to weaponized AI.
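
(The selection scheme described here is essentially best-of-n sampling against a discriminator: generate many candidate replies, score each for how GPT-like it sounds, and post the lowest-scoring one. A toy sketch — the scoring function below is a stand-in, not a real discriminator:)

```python
def pick_least_gpt_like(candidates, gpt_likeness):
    """Best-of-n selection: keep the reply the discriminator rates least GPT-like."""
    return min(candidates, key=gpt_likeness)

# Stand-in discriminator: pretend longer, rambling replies sound more GPT-like.
toy_gpt_likeness = len

candidates = [
    "Why do you believe that?",
    "I think this is a very interesting point and I would like to add that",
    "Agreed.",
]
best = pick_least_gpt_like(candidates, toy_gpt_likeness)
print(best)  # -> "Agreed."
```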

Richard_Kennaway · 6y · 14

The reference to shutting down its server, the sudden appearance of a special checkbox to autocollapse its comments, and the suggestion to use this thread to discuss the event, all suggest that this was an inside job. It was annoying while it lasted, but so is a fire alarm, for good reason. Bravo!

ryan_b · 6y · 12

I thought this was a great gag experiment.

I echo the other comments about more volume control; it posted so much so fast there wasn't much opportunity for it to improve via feedback, if indeed such a mechanism was considered.

Vaniver · 6y · 18

It's trained on the whole corpus of LW comments and replies that got sufficiently high karma; naively I wouldn't expect a day to make much of a dent in the training data. But there's an interesting fact about training to match distributions, which is that most measures of distributional overlap (like the KL divergence) are asymmetric; how similar the corpus is to model outputs is different from how similar model outputs are to the corpus. Geoffrey Irving is interested in methods to use supervised learning to do distributional matching the other direction, and it might be the case that comment karma is a good way to do it; my guess is that you're better off comparing outputs it generates on the same prompt head-to-head and picking which one is more 'normal,' and training a discriminator to attempt to mimic the human normality judgment.
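
(The asymmetry Vaniver points to is easy to check numerically: for discrete distributions p and q, D(p‖q) and D(q‖p) generally differ. A minimal sketch with made-up distributions, not anything from the actual model:)

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.9, 0.1]  # stand-in for the comment corpus
q = [0.5, 0.5]  # stand-in for model outputs

forward = kl(p, q)  # penalizes q for missing mass where p has it
reverse = kl(q, p)  # penalizes q for putting mass where p has little
print(forward, reverse)  # ~0.368 vs ~0.511: the two directions disagree
```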

Dagon · 6y · 5

Is there a writeup (or open source code) for the training and implementation? It would be interesting to personalize it - train based on each user's posts/comments (in addition to high-karma comments from others), and give each of us a taste of our own medicine in replies to our comments/posts.

habryka · 6y · 5

Sure, I am happy to share the training code. We used our direct database access to export the data we trained on, though, and that export doesn't currently contain any author information. In theory you can get all the data via the API.

Original_Seeing · 6y · 5

Should the accused not at least have the right to make one reply in its defense?!?

My favorite was this reply. I had to sit down for a minute to imagine how screwed up a person must be to have an internal conversation like that one.

Charlie Steiner · 6y · 5

If GPT2 was from the mod team, 5/10, with mod tools we could have upped the absurdity game a lot. If it was an independent effort, 8/10, you got me :)

gjm · 6y · 4

355 days?

jimrandomh · 6y · 4

It was a dumb typo on my part. Edited.

ryan_b · 6y · −2

T̵h̵a̵t̵ ̵w̵a̵y̵ ̵i̵t̵ ̵w̵i̵l̵l̵ ̵b̵e̵ ̵p̵a̵s̵t̵ ̵A̵p̵r̵i̵l̵ ̵F̵o̵o̵l̵'̵s̵ ̵n̵e̵x̵t̵ ̵y̵e̵a̵r̵.̵

gjm · 6y · 5

I'm pretty sure that's wrong for three reasons. First, there are 365 days in a year, not 355. Second, there are actually 366 days next year because it's a leap year (and the extra day is before April 1). Third, the post explicitly says "may not post again until April 1, 2020".
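
(The leap-year point checks out with the standard library — a quick sketch:)

```python
from datetime import date
import calendar

assert calendar.isleap(2020)  # 2020 is a leap year

# Days from April 1, 2019 to April 1, 2020, spanning Feb 29, 2020:
span = (date(2020, 4, 1) - date(2019, 4, 1)).days
print(span)  # -> 366
```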

ryan_b · 6y · 3

Doh! You have me on all three counts. Retracted!


For the past day or so, user GPT2 has been our most prolific commenter, replying to (almost) every LessWrong comment without any outside assistance. Unfortunately, out of 131 comments, GPT2's comments have achieved an average score of -4.4, and have not improved since it received a moderator warning. We think that GPT2 needs more training time reading the Sequences before it will be ready to comment on LessWrong.

User GPT2 is banned for 364 days, and may not post again until April 1, 2020. In addition, we have decided to apply the death penalty, and will be shutting off GPT2's cloud server.

Use this thread for discussion about GPT2, on LessWrong and in general.
