{ |one, step, back| }

Spam, Spam, Spam, Spam ...   10 May 06

Today’s topic is Spam … and what we are doing about it.

Spam by the Numbers

Anyone who visits the RubyGarden wiki regularly has probably run into wiki spam. You know what I mean: defaced pages with hundreds of links to questionable web sites, all done with the goal of increasing Google page rank.

Just to give you an idea of the magnitude of this problem, make a guess at how many times during the past 7 days someone tried to deface the RubyGarden wiki with spam.

Got a number? It’s probably too low.

According to the logs, we had 18,139 attacks against our wiki. In just seven days! Over the past few weeks we have been averaging between 17,000 and 20,000 attacks in a 7-day period.

That’s a lot of spam.

Fortunately, most of the attacks went directly into the wiki tarpit, where only other spammers saw the results. Only about 250 attacks made it to the real wiki, where they needed to be cleaned up by hand.

(For those who aren’t familiar with a wiki tarpit, it is a shadow wiki behind the real wiki where spammers are directed. The spammers spend all their time updating a virtual wiki that no one, except other spammers, will ever see. The goal is to have the spammers waste their time instead of ours.)
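As a rough illustration of the idea (not the actual RubyGarden code), a tarpit can be pictured as two page stores with a router in front of them. Everything named here, including `TarpitWiki`, `flag_spammer`, and the choice to identify visitors by IP address, is a hypothetical stand-in.

```python
class TarpitWiki:
    """Toy model of a wiki with a shadow 'tarpit' copy for flagged visitors."""

    def __init__(self):
        self.real_pages = {}    # what ordinary visitors see
        self.tarpit_pages = {}  # shadow edits, seen only by flagged visitors
        self.flagged = set()    # assumption: visitors are identified by IP

    def flag_spammer(self, ip):
        self.flagged.add(ip)

    def save(self, ip, title, text):
        # Flagged visitors write to the shadow copy; the real wiki is untouched.
        store = self.tarpit_pages if ip in self.flagged else self.real_pages
        store[title] = text

    def read(self, ip, title):
        # Flagged visitors are shown tarpit edits where they exist, and the
        # real page otherwise, so the shadow wiki looks like the real one.
        if ip in self.flagged and title in self.tarpit_pages:
            return self.tarpit_pages[title]
        return self.real_pages.get(title, "")


wiki = TarpitWiki()
wiki.save("203.0.113.5", "HomePage", "Welcome to the wiki")
wiki.flag_spammer("192.0.2.66")
wiki.save("192.0.2.66", "HomePage", "buy cheap pills")
```

After these calls the flagged address sees its own edit on HomePage, while everyone else still sees the original text. A legitimate user who is flagged by mistake would see the spammed copy too, which is the failure mode described next.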

Now the tarpit isn’t perfect. Sometimes legitimate users get sent to the tarpit instead of the real wiki. If you ever went to RubyGarden and saw spam on almost every page, you were probably in the tarpit.

But cleaning up 250 spams instead of 18,000? That is a pretty good success story.

But We Need Something Better

As good as the tarpit approach is, we still need something better. The UseMod wiki software we are using makes it painful to clean up spam. The average page needs about four clicks to despam, with a lot of hard-to-automate decision making in the process. See this demo for a look at what I do to clean up a UseMod wiki page. Go ahead, click now. I’ll wait for you.

That’s a lot of work. Despamming several hundred posts can take hours.

Ruse

Ruse is a new wiki with built-in anti-spam features. It supports UseMod-style markup, so all of the RubyGarden pages can be easily migrated into it. It has an integrated tarpit that makes despamming a page a single button press. In fact, Ruse can move all of an author’s pending posts into the tarpit with a single click. Ruse can mark edits as spam based on either content (e.g. linking to a known spam site) or IP address (coming from a known spammer).

Best of all, Ruse makes it easy to distribute the job of detecting and marking spam across the regular contributors to the wiki.
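To make the two marking rules concrete, here is a minimal sketch of how content-based and IP-based checks, plus the "tarpit everything by this author" action, might look. It is not Ruse's code: the `Edit` record, the blocklists, and both function names are hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical blocklists; a real wiki would build these up over time.
SPAM_DOMAINS = {"spam.example"}
SPAM_IPS = {"192.0.2.66"}

# Rough URL matcher; brackets and quotes are excluded to keep parsing simple.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>\[\]]+")


@dataclass
class Edit:
    author: str
    ip: str
    text: str
    in_tarpit: bool = False


def looks_like_spam(edit):
    """Flag an edit by its source IP or by a link to a known spam domain."""
    if edit.ip in SPAM_IPS:
        return True
    for url in URL_PATTERN.findall(edit.text):
        host = urlparse(url).hostname or ""
        if host in SPAM_DOMAINS:
            return True
    return False


def tarpit_author(pending, author):
    """Move every pending edit by one author into the tarpit; return the count."""
    moved = 0
    for edit in pending:
        if edit.author == author and not edit.in_tarpit:
            edit.in_tarpit = True
            moved += 1
    return moved
```

In this sketch, one call to `tarpit_author` plays the role of the single click: once any edit by an author has been judged spam, all of that author's pending edits can be swept into the tarpit together, including ones that look harmless on their own.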

Watch this demo to see Ruse in action.

The Beta

Chad and I have set up a mirror of the RubyGarden wiki at http://rubygarden.org:3000/Ruby. This is a trial run of the software before we commit to using it. Feel free to check it out, kick the tires and beat on it a bit. Shoot, if you have a secret desire to be a spam writer, go for it, just to see what happens.

You can post anonymously, or sign up for a guest account. After a certain number of spam-free postings, guest accounts are upgraded to full member accounts.
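The upgrade rule could be sketched roughly as follows. The threshold of 5 and the decision to reset the count after a spam post are both assumptions made for illustration, since the real number and the real policy are not stated here, and the `Account` class is hypothetical.

```python
PROMOTION_THRESHOLD = 5  # hypothetical; the actual number is not given


class Account:
    """Toy account that starts as a guest and can be promoted to member."""

    def __init__(self, name):
        self.name = name
        self.role = "guest"
        self.clean_posts = 0

    def record_post(self, is_spam):
        if is_spam:
            # Assumption: a post marked as spam resets the clean count.
            self.clean_posts = 0
            return
        self.clean_posts += 1
        if self.role == "guest" and self.clean_posts >= PROMOTION_THRESHOLD:
            self.role = "member"
```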

Oh, the account “spammer” with password “spammer” is already set up if you want to see how the wiki reacts to spammers.

But remember, this is a beta trial, and the content of the wiki will be reset before it goes “live” for real.

Documentation is a bit skimpy right now, but we are working on that too.

Enjoy.

 

Formatted: 07-Oct-08 22:22
Feedback: jim@weirichhouse.org