Work done @ SERI-MATS, idea from a conversation with Ivan Vendrov at Future Forum earlier this year.

Misaligned systems are all around us. They are what make me watch another video of a man in filthy shorts building a hut using only tools made from rocks and his own armpit hair. And the reason I have never, ever watched a single episode of Flavourful Origins in isolation. Maybe they make you mindlessly seek cat gifs, or keep you scrolling twitter in a cosy fug of righteous indignation long after you should be asleep. They could also be the reason your uncle is a bit more xenophobic now than he used to be. A bit more dismissive of the ‘snowflakes’. A bit more superior, and less kind. 

None of this is new, of course. Advertising and propaganda have been around for a long time. But it feels different now – or at least it does to me. I was born in 1990 – I haven’t got a chance. TikTok’s fabulous algorithm steamrollers my brain. I look up, glazed after an undetermined period of involuntary consumption, wondering what happened and why I feel so hollow. I had to delete the app.

Maybe this isn’t you. I’m prone to flow states of both the deeply productive and malign kinds: good at losing myself in a task, terrible at multitasking. But surely, no-one really looks back at a four hour TikTok binge and thinks “that was an excellent use of my time”. 

(This isn’t to say that personalised recommender systems or targeted ads or compelling news feeds are bad per se – just that they’re probably not very well aligned with your longer term goals.)

So, probably none of the AI systems in your life are optimising hard for your flourishing right now. Money, yes; attention, yes; but long term, real happiness? Probably not. 

(Though, I guess there are things like AI-powered fitness or habit forming apps, etc?) 

Could we change that? I’ve been kicking an idea around for a while. It’s not very well formed yet, but it’s something like making wrappers around existing algorithms to shift the optimisation objective. Sick of twitter bickering? A guardian AI could adjust your news feed to give you content that’s more satisfying and informative, and less likely to drag you into pointless arguments. Want to refocus your YouTube recommendations to give you great maths lecture content, without being derailed by popular science videos? Or, stay down with the kids on TikTok without getting lost in it? The guardian wrapper could work to your advantage, retaining the power and joy of these systems while blunting the more pernicious effects. You don’t have to feel frustrated when your app blocker kicks in – you can enjoy your twenty minutes of twitter-time and then get on with your life. Like a sunrise lamp, instead of an alarm clock.
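To make the wrapper idea a little more concrete: a minimal sketch of a guardian re-ranker, in Python. The function names, scores, and the blending weight `alpha` are all placeholders I've invented for illustration — the real question of where `user_scores` come from (a learned model of your longer-term goals?) is exactly the hard part this post is gesturing at.

```python
def guardian_rerank(candidates, platform_scores, user_scores, alpha=0.7, k=10):
    """Re-rank a platform's candidate feed by blending its engagement
    scores with scores from the user's own objective.

    candidates: list of item ids from the platform's recommender
    platform_scores / user_scores: dicts mapping item id -> score in [0, 1]
    alpha: weight on the user's objective (1.0 ignores the platform entirely)
    k: how many items to keep in the re-ranked feed
    """
    blended = {
        item: alpha * user_scores.get(item, 0.0)
              + (1 - alpha) * platform_scores.get(item, 0.0)
        for item in candidates
    }
    return sorted(candidates, key=lambda i: blended[i], reverse=True)[:k]
```

The design choice here is that the guardian sits *downstream* of the existing recommender, so it keeps the platform's power (candidate generation) while shifting the final objective.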

This idea maybe has broader applications than just making recommender systems better for you. There are all kinds of subtle interventions a guardian AI could make to protect you and improve your life in a million small ways. I’m sure you can think of lots of them. “You’ve not seen X for a while, you’re both free on Saturday and it’ll be sunny! Shall I book a tennis court?”.

Worth exploring? More broadly, this kind of stuff seems like a nice low-stakes-yet-real-world test bed for some important ideas. How to define good proxies for flourishing, satisfaction, et cetera with minimal human input? How to mitigate the effects of misaligned black-box systems that want to hijack our puny human brains?


Comments

Cool post! I think the minimum viable "guardian" implementation would be to

  • embed each post/video/tweet into some high-dimensional space
  • find out which regions of that space are nasty (we can do this collectively - e.g. my clickbait is probably clickbaity for you too)
  • filter out those regions
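The three steps above can be sketched as follows, assuming each item already comes with an embedding vector and that users' "nasty" flags are pooled into a shared set. The distance threshold `radius` and the item names are placeholders; a real version would need a proper embedding model and a tuned notion of "region".

```python
import numpy as np

def filter_feed(item_vecs, flagged_vecs, radius=0.5):
    """Keep only items whose embedding is farther than `radius` from
    every collectively-flagged ("nasty") example.

    item_vecs: dict mapping item id -> embedding vector
    flagged_vecs: array of shape (n_flagged, dim), pooled across users
    """
    flagged = np.asarray(flagged_vecs, dtype=float)
    kept = []
    for item, vec in item_vecs.items():
        dists = np.linalg.norm(flagged - np.asarray(vec, dtype=float), axis=1)
        if dists.min() > radius:
            kept.append(item)
    return kept
```

This treats "nasty regions" as balls around flagged examples; a cluster-based version (flagging whole density regions rather than neighbourhoods of individual points) would be the natural next step.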

I tried to do something along these lines for YouTube:

I couldn't find a good way to embed videos using ML, so I just scraped which videos recommend each other, and made a graph from that (which kinda is an embedding). Then I let users narrow down on some particular region of that graph. So you can not only avoid some nasty regions, but you can also decide what you want to watch right now, instead of the algorithm deciding for you. So this gives the user more autonomy.
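The graph-as-embedding approach described above can be sketched like this (a toy reconstruction, not the commenter's actual code): build an adjacency structure from scraped "video A recommends video B" pairs, then let the user pick a seed video and browse its neighbourhood, rather than letting the algorithm pick for them.

```python
from collections import deque

def build_graph(edges):
    """edges: iterable of (video, recommended_video) pairs scraped from
    the site. Returns an undirected adjacency dict - a crude stand-in
    for an embedding, where graph-nearby videos count as 'similar'."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def region(graph, seed, max_hops=2):
    """All videos within `max_hops` of `seed`: the 'region' the user
    chooses to browse. Plain breadth-first search."""
    seen = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, hops + 1))
    return seen
```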

The accuracy isn't very satisfying yet. I think the biggest problem with systems like these is the network effect - you could get much better results with some collaborative filtering.
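For what "collaborative filtering" could mean here, a minimal user-based sketch (my own illustration, not part of the original comment): given a users × items matrix of clickbait flags, predict scores for one user as a similarity-weighted vote over the other users' flags.

```python
import numpy as np

def predict_flags(flags, target_user):
    """flags: (n_users, n_items) matrix, 1 = user flagged the item as
    clickbait, 0 = no flag. Returns predicted clickbait scores for
    `target_user` as a cosine-similarity-weighted vote of other users."""
    flags = np.asarray(flags, dtype=float)
    target = flags[target_user]
    # cosine similarity between the target user and every user
    norms = np.linalg.norm(flags, axis=1) * (np.linalg.norm(target) or 1.0)
    sims = flags @ target / np.where(norms == 0, 1.0, norms)
    sims[target_user] = 0.0  # exclude the user's own row from the vote
    weight = sims.sum() or 1.0
    return sims @ flags / weight
```

The network-effect problem is visible right in the maths: with few users, the similarity weights are too noisy to mean anything.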

> we can do this collectively - f.e. my clickbait is probably clickbaity for you too

This assumes good faith. As soon as enough people learn about the Guardian AI, I expect Twitter threads coordinating people: "let's flag all outgroup content as 'clickbait'".

Just like people are abusing current systems by falsely labeling the content they want removed as "spam" or "porn" or "original research" or whichever label effectively means "this will be hidden from the audience".

Oh yeah, definitely. I think such a system shouldn't try to enforce one "truth" - which content is objectively good or bad.

I'd much rather see people forming groups, each with its own moderation rules. And let people be a part of multiple groups. There are a lot of methods that could be tried out, e.g. some groups could use algorithms like EigenTrust, to decide how much to trust users.
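The core of EigenTrust is simple enough to sketch here (this is the basic power iteration from the paper, without the pre-trusted-peer damping a real deployment would want): row-normalise each user's local trust ratings, then iterate to the principal left eigenvector, which is the global trust vector.

```python
import numpy as np

def eigentrust(local_trust, n_iter=50):
    """local_trust: (n, n) matrix where local_trust[i][j] is how much
    user i trusts user j (non-negative). Returns the global trust
    vector: the stationary distribution of the row-normalised matrix,
    found by power iteration."""
    C = np.asarray(local_trust, dtype=float)
    n = len(C)
    row_sums = C.sum(axis=1, keepdims=True)
    # users who trust no one are treated as trusting everyone equally
    C = np.divide(C, row_sums, out=np.full_like(C, 1.0 / n), where=row_sums > 0)
    t = np.full(n, 1.0 / n)  # start from uniform trust
    for _ in range(n_iter):
        t = C.T @ t
    return t
```

The appeal for a group-moderation setting is that trust propagates transitively (people you trust lend weight to the people *they* trust), which is harder to game with a simple brigade than raw flag counts.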

But before we can get to that, I see a more prohibitive problem - that it will be hard to get enough people to get that system off the ground.

Could such a thing be developed right now? It wouldn't take any more AI than the recommender systems optimised for clicks. But I'd prefer that it be, and be called, a servant rather than a "guardian".

Yeah, I think it could be! I’m considering pursuing it after SERI-MATS. I’ll need a couple of cofounders.

I like the "guardian" framing a lot! Besides the direct impact on human flourishing, I think a substantial fraction of x-risk comes from the deployment of superhumanly persuasive AI systems. It seems increasingly urgent that we deploy some kind of guardian technology that at least monitors, and ideally protects, against such superhuman persuaders.
