LW wiki spam filtering

The same one who defined a predicate checking against the age of the user's account, maybe the karma score is similarly available? Also, there's been some fiddling with karma representation recently, so it may be an easy task for the person who implemented that.

[-]gwern13y50

The same one who defined a predicate checking against the age of the user's account, maybe the karma score is similarly available?

You're not a programmer, are you?

MediaWiki and Reddit are entirely different programs, so no. Using one of the canned rules in a commonly used extension designed for non-programmers is very different from figuring out how to interface a mess written in PHP with a custom Python codebase, extract such information from the latter, and expose it to the rules in the former.

Writing the rule based on the extension docs took me maybe 5 minutes. I would be a little shocked if Trike could make the karma available in the live site, with testing that it works correctly, in anything less than 2 orders of magnitude more time (8 hours).

[-]Kawoomba13y10

I see.

It seemed like a reasonable guess; I haven't checked out the specific architecture, I'm not familiar with the MediaWiki API, and I didn't know the extension was designed for non-programmers.

If web programming were clean and nifty, the wiki and the forum system would both reference the same user database on some MySQL backend. If it's an "organically grown" mess, never mind. But the API should still be well defined. Now, it was only an idle suggestion in case someone missed some low-hanging fruit; I didn't expect it to be some assured panacea. I've only ever done one PHP/MySQL website, around 9 years ago, and I don't much care for web programming.

Could also just remove the whole "recent wiki edits" portion of the sidebar. It doesn't provide much and takes up screen space on all the important pages.

[-]gwern13y50

Could also just remove the whole "recent wiki edits" portion of the sidebar. It doesn't provide much and takes up screen space on all the important pages.

It'd be much improved if it simply filtered out the registration of new user accounts.
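
As a rough illustration of that filter, here is a minimal Python sketch. It is not the site's actual sidebar code (which isn't shown in this thread); the `Change` record, its field names, and the `"newusers"` log label are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Change:
    kind: str      # assumed values: "edit", "new", "log"
    log_type: str  # assumed: "newusers" marks an account registration; "" for non-log entries
    title: str

def visible_in_sidebar(changes):
    """Keep every change except account-registration log entries."""
    return [c for c in changes
            if not (c.kind == "log" and c.log_type == "newusers")]

feed = [
    Change("edit", "", "Sequences"),
    Change("log", "newusers", "User:Spam123"),
    Change("log", "delete", "Buy cheap watches"),
]
print([c.title for c in visible_in_sidebar(feed)])
```

Only the registration entry is dropped; real edits and the spam deletions still show up.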

[-]falenas10813y00

Don't we already have code to extract karma points values for the minimum for creating a new discussion post?

[-]gwern13y100

Discussion posts are made within the Reddit system, which is Lesswrong.com; that is entirely separate from the MediaWiki instance, which is wiki.lesswrong.com.

[-]NancyLebovitz13y-20

Why not require positive karma to create or edit wiki pages? There's no relevant downside to it.

There may be people who love the wiki, but not LW. Requiring positive karma might make editing the wiki more trouble than it's worth for them.

[-]Kawoomba13y10

Just saying "hi" in the Welcome thread or posting basically any random sentence whatsoever in the Quotes thread should take care of that; a single karma point would suffice.

[-]maia13y70

To be fair, it is possible to actually be downvoted in the Quotes thread.

[+]Kawoomba13y-80

[-]A1987dM13y10

Beware trivial inconveniences!

[-]Desrtopa13y70

Considering that the great majority of wiki edits are spam and spam deletion, some more inconvenience may be warranted.

[-]wwa13y30

I have so far defined one rule: page creation is forbidden for users younger than X hours

Never, ever publicly post your constants. If it was a site-specific spammer, he can now create accounts X hours before posting, aka good old cookie-aging.
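
For concreteness, the quoted rule boils down to a single age check. A minimal Python sketch, not the real extension rule (which isn't shown here); `may_create_page` and its parameters are made-up names, and the threshold stays a parameter because X is never stated. The dates and the 6-hour value below are arbitrary.

```python
from datetime import datetime, timedelta

def may_create_page(registered_at: datetime, now: datetime, min_age: timedelta) -> bool:
    """Allow page creation only once the account is at least min_age old."""
    return now - registered_at >= min_age

# Arbitrary illustrative threshold, NOT the real value of X.
x = timedelta(hours=6)
now = datetime(2013, 1, 2, 12, 0)

fresh = datetime(2013, 1, 2, 11, 0)  # registered 1 hour before posting
aged = datetime(2013, 1, 1, 12, 0)   # registered 24 hours before posting

print(may_create_page(fresh, now, x))  # False
print(may_create_page(aged, now, x))   # True
```

The second case is the cookie-aging attack: once `min_age` is known, registering early is all it takes.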

Overkill security-professional solution (if you don't mind Ajax and some coding, though): have the site, or at least the crucial part of it, self-decrypt with a one-time pad. This doubles the size (if applied to the whole site), but robots extremely rarely run scripts, so both chunks parse as garbage. And even if they did understand JavaScript, you could make the problem "AI-hard" in principle (yes... I do realize there's no such formal class).
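
A rough sketch of the mechanism, in Python only to show the idea; on a real page the decrypt step would have to run as JavaScript in the browser. The function names are made up for the example. Since the pad ships alongside the ciphertext, this is obfuscation against non-script-running bots rather than secrecy.

```python
import secrets

def otp_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """XOR the content with a fresh random pad of the same length."""
    pad = secrets.token_bytes(len(plaintext))
    cipher = bytes(p ^ k for p, k in zip(plaintext, pad))
    return pad, cipher

def otp_decrypt(pad: bytes, cipher: bytes) -> bytes:
    """XOR again with the same pad to recover the original content."""
    return bytes(c ^ k for c, k in zip(cipher, pad))

page = b"<p>Edit this page</p>"
pad, cipher = otp_encrypt(page)

assert otp_decrypt(pad, cipher) == page
# Both chunks must be served, hence "doubles the size".
print(len(pad) + len(cipher) == 2 * len(page))  # True
```

Each chunk on its own is random-looking bytes; only a client that runs the XOR step sees the page.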

Also, for fun: http://xkcd.com/810/

[-]gwern13y00

Never, ever publicly post your constants. If it was a site-specific spammer, he can now create accounts X hours before posting, aka good old cookie-aging.

I'm not worried. As you can see from the sidebar, spammers have been prolifically creating accounts for countless months and almost all accounts wind up never being used. My inference is that most of them are being stymied by other anti-spam features. None of the spam seems to be done by hand, and certainly they aren't looking on an obscure post on a different domain for a value that they would stumble upon accidentally ('hm, my spam account didn't work yesterday, let's take a look to see if the error went away - oh, it did, there must be a 24-hr timeout').

[-]wwa13y10

I totally agree there's a very low probability of leak there. There's still a (meta) reason to do stuff like this though.

If you have a high-value target (like building an AI), you need insanely paranoid security. That means training the mindset of patching everything you possibly can, not because you think it's unsafe but, you know, just because. E.g., you could tell the difference between a good and a bad sysadmin just by looking at his own PC: a good sysadmin will have every single drive TrueCrypted, no matter the inconvenience. This is an opportunity to train the mind for the job.

[-]Larks13y30

Thank you for doing this.

[-]jamesf13y30

It's pretty silly that the site about instrumental rationality isn't the greatest site on the internet yet. I don't have any concrete suggestions to fix it besides "If that bothers you enough, start looking at source code relevant to these problems".

[-]James_Miller13y-20

In a saner world micropayments would take care of this as everyone would expect to have to pay a penny to evidence that they are not a spambot. But as we don't have a good micropayments infrastructure this solution is likely unavailable to you.

[-]gwern13y150

Which can of course still be misapplied: the Bitcoin wiki at some point began requiring a payment of some bitcoin. I logged in to make some more edits, discovered it applied to me too, and mentally told them to go screw themselves; I wasn't about to pay for the privilege of working for them when I had already proven myself with previous good edits.

[-]jamesf13y70

Requiring people to fork over any amount of money at all usually results in many fewer participants. This doesn't seem better overall as long as other effective solutions to spam exist.

[-]Shmi13y20

Probably because it's currently very hard to pay 0.1c hassle-free.

[-]David_Gerard13y-20

Artificial shortages are artificial. Why should people bother giving you money to work for you?

[-]James_Miller13y-20

To prove, at an extremely low cost to you and zero cost to society, that you are not going to spam them.

[-]David_Gerard13y10

It appears this idea does not work in practice.

[-]James_Miller13y20

Hence my starting with "In a saner world"

[-]Richard_Kennaway13y70

I see. The solution does not work, but it is nevertheless correct, because it should work. It is the people who are wrong.

In the real world, a solution to a problem is something that actually solves the problem. An engineer does not get to blame metal for obstinately rusting despite his wonderful anti-rusting invention. Instead, he recognises that his invention does not work. Neither does a would-be social engineer get to blame the people for not behaving as he thinks they should.

[-]James_Miller13y10

"It is the people who are wrong."

Hence the need for a website called "LessWrong"

[-]gwern13y00

As the person whose experience Gerard is citing, I would point out that my beef is largely with how the micropayment is being done: known-good users (like myself) were not being grandfathered in, as would be sane.

Having dealt with so much spam on the LW wiki, I wouldn't be averse to a micropayments thing if it were even slightly reasonable to expect random LWers to possess bitcoins and there were some non-payment way of editing with moderation (e.g. having the first n edits by users who choose not to pay go into a moderation queue before going live).
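
Roughly, the routing I have in mind looks like the Python sketch below. It's purely hypothetical: no such system exists on the wiki, and `User`, `route_edit`, and the field names are invented for illustration; n is left as a parameter.

```python
from dataclasses import dataclass

@dataclass
class User:
    name: str
    has_paid: bool = False       # chose the micropayment route
    grandfathered: bool = False  # known-good user with prior good edits
    approved_edits: int = 0      # edits that already passed moderation

def route_edit(user: User, n: int) -> str:
    """Return "live" or "moderation" for this user's next edit."""
    if user.grandfathered or user.has_paid or user.approved_edits >= n:
        return "live"
    return "moderation"

n = 3  # arbitrary example value
print(route_edit(User("veteran", grandfathered=True), n))  # live
print(route_edit(User("payer", has_paid=True), n))         # live
print(route_edit(User("newbie"), n))                       # moderation
print(route_edit(User("regular", approved_edits=3), n))    # live
```

Grandfathering is checked first, so nobody with a track record ever sees the paywall or the queue.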
