How about trying reCAPTCHA?

I like to check this site several times a day, but recently there have been so many spam posts that I can't tell from the recent post list whether there's any new, interesting content. That makes my favorite website the one most affected by spam. Really annoying!

reCAPTCHA is an add-in that lets us help digitize books that were published before the computer age. Each time we type in the captcha, we're helping resolve (as I understand it) OCR issues in the preparation of these older works so they can be issued in digital form (such as free e-books).

I'd be happy to type in the captchas for a worthy cause. Actually, if there were a website where I could spend a few minutes every day, I'd be glad to go there and help with the project.

I tried to post about this yesterday with links, but maybe it was flagged as spam. Something needs to be done, and reCAPTCHA seems like a good solution. How about trying it?

You can go to to find out more.


we try

That will be up to Doug and the rest of the crew... I do know that we try very hard to catch the spam as fast as possible. I recently started a new job and my time here has been drastically reduced. However, I think I've cleaned up the bulk of the mess, and hopefully the spam filter will "learn" and catch more... :)

my artwork | my blog

Actually, I like this idea a lot!

Normally, I hate captchas, but when I first read about this a few months ago, I thought it was a great idea, and worth doing. Sadly, I have yet to find a site that uses it. I think it would be great if DIYPlanner did. I mean, this is a site about the value of the printed word, and people here tend to be readers as well. Why not use this, not merely to put the brakes on spammers, but to help in digitizing some great books of the past? I'm really surprised that more people haven't commented on this. :-)


I'm sure I'll like it too,

I'm sure I'll like it too, but can someone explain in very lay terms exactly what it is? TIA!

I'm not sure if this classes

I'm not sure if this classes as layman's terms, but here's an explanation of reCaptcha: in short, it's a captcha system that also uses the people logging in to digitize scanned books, a word at a time.

Captcha and reCaptcha

You are probably familiar with captcha--many web sites and boards (including Blogger) have barely legible text in images--usually distorted, or in some way obscured. The user is prompted to enter the text from this image. This is a way to keep robots from spamming.

What reCaptcha does is, instead of using random or other text, use single words scanned in from old books. There are two words: a known word, and an unknown or improperly scanned one that requires human intervention. If you've ever run OCR (optical character recognition) on scanned texts, you are probably familiar with the problem: in a typical OCR session, there are always words that don't scan properly. That is what is happening on this project, which is trying to scan in worthwhile books, so they came up with the idea of getting help via the captcha device.

So how does it work? When you do the reCaptcha, you type in those two words. The one that is known is used for the "captcha" and the unknown word is added to the database. The reCaptcha system then compiles the data from the unknown word (which is sent to multiple people) and picks the most common word returned among the results. This word is then added to the "known" words, and the work continues with new words.
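
For anyone who finds code easier to follow than prose, here is a rough Python sketch of the voting idea described above. It is only an illustration of the description in this post, not reCaptcha's actual code: the function names, the `votes` dictionary, and the `min_votes` threshold of 3 are all made up for the example.

```python
from collections import Counter

def check_control_word(typed, known_answer):
    """The known word acts as the actual captcha test."""
    return typed.strip().lower() == known_answer.strip().lower()

def record_response(votes, unknown_id, typed_control, known_answer, typed_unknown):
    """Count a guess for the unknown word only if the known word was typed correctly.

    `votes` maps a (hypothetical) unknown-word id to a Counter of guesses.
    Returns True if the user passed the captcha.
    """
    if not check_control_word(typed_control, known_answer):
        return False
    guess = typed_unknown.strip().lower()
    votes.setdefault(unknown_id, Counter())[guess] += 1
    return True

def resolve_word(votes, unknown_id, min_votes=3):
    """Return the most common guess once enough people have answered.

    `min_votes` is an assumed threshold; returns None until it is reached.
    """
    counter = votes.get(unknown_id)
    if not counter or sum(counter.values()) < min_votes:
        return None
    return counter.most_common(1)[0][0]
```

So if three people pass the known word and two of them read the unknown word as "upon" while one reads it as "upen", `resolve_word` would settle on "upon", and that word could then join the "known" pool.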