reCAPTCHA is completely ineffective

Elliot Marx

Jul 18, 2011, 5:08:25 AM
to reCAPTCHA
I set up a forum using Invision Power Board and reCAPTCHA. I used this
for years and it kept puzzling me why my forum kept getting spammed.
No kidding, I was getting 40-50 people signing up daily all with names
like WeLikeViagra blah blah blah.

As a last resort I switched to something called "Normal CAPTCHA
(requires GD)". Although there are fewer letters in the CAPTCHA
fill-in, something about the random angling and coloring of the letters
makes this far more effective. After I switched, the spam went away
completely.

Think about it. If you take 2 words, put them side by side, and just
angle them a bit or put a line through them, any character recognition
system can read the words. If you're going to develop a technology
please don't go off bragging about how effective it is until you've
actually proven it to work. I personally have suffered and my
marketing has suffered for years because of this.

PJH

Jul 18, 2011, 5:15:41 AM
to reca...@googlegroups.com
On Mon, Jul 18, 2011 at 10:08 AM, Elliot Marx <marx....@gmail.com> wrote:
> Invision Power Board

Sounds more like a problem with Invision's implementation of
reCAPTCHA, not the captcha itself.

--
PJH

Jerry Hollombe

Jul 18, 2011, 8:02:41 PM
to reca...@googlegroups.com


As an experiment, I put some tracking code on my reCAPTCHA-protected
web page. Results:

In the last 24 hours, 25 attempts were made to spam the page.
reCAPTCHA blocked 23 of them.
The other two were blocked by my internal spam filter.

However, every one of the attempts reCAPTCHA blocked would have been
blocked by my spam filter anyway, so final score is:

reCAPTCHA 92%
Spam filter 100%

Also during that time, there were no legitimate posts to the page, so
the reCAPTCHA project got 23 bad responses and only two good ones.

Given the above, is there any reason I should continue to hassle my few
legitimate users by making them solve reCAPTCHA puzzles?

--
Jerry Hollombe
Webmaster, http://www.thegarret.info/
Producer, http://www.cafepress.com/thegarretshop.14394351

PJH

Jul 19, 2011, 4:51:28 AM
to reca...@googlegroups.com
On Tue, Jul 19, 2011 at 1:02 AM, Jerry Hollombe
<jerry.h...@gmail.com> wrote:
> Given the above, is there any reason I should continue to hassle my few
> legitimate users by making them solve reCAPTCHA puzzles?

Depends.

Have you had complaints?

Does your spam filter have its own false positives/negatives?

How easy is it to re-enable if you decide to disable it now, but find
you think you need it later on if/when you get busier?

Remember, reCAPTCHA isn't meant to be a be-all and end-all to
preventing spammers; it should be used in conjunction with other
things (like your spam filter.)

--
PJH

Jerry Hollombe

Jul 19, 2011, 10:57:29 AM
to reca...@googlegroups.com

On 7/19/2011 1:51 AM, PJH wrote:
> On Tue, Jul 19, 2011 at 1:02 AM, Jerry Hollombe
> <jerry.h...@gmail.com> wrote:
>> Given the above, is there any reason I should continue to hassle my few
>> legitimate users by making them solve reCAPTCHA puzzles?
>
> Depends.
>
> Have you had complaints?

No, but I get very little traffic in general and don't usually track
reCAPTCHA results, so there's no way to know how many legitimate users
have simply given up or refused to try.

> Does your spam filter have its own false positives/negatives?

None in the last 48 hours. One false positive in the past six months
and that's been corrected.

> How easy is it to re-enable if you decide to disable it now, but find
> you think you need it later on if/when you get busier?

Trivial. (I'm not an amateur and I'm thoroughly familiar with the code.)

> Remember, reCAPTCHA isn't meant to be a be-all and end-all to
> preventing spammers; it should be used in conjunction with other
> things (like your spam filter.)

Right now it's providing no benefit to me, is a nuisance for my users
and is doing more harm than good for the reCAPTCHA project. I'm not
saying this applies to everyone, but for an extremely low traffic site
like mine, I'm beginning to think it's counterproductive.

PJH

Jul 19, 2011, 10:58:30 AM
to reca...@googlegroups.com
On Tue, Jul 19, 2011 at 3:57 PM, Jerry Hollombe
<jerry.h...@gmail.com> wrote:
> Trivial.  (I'm not an amateur and I'm thoroughly familiar with the code.)

I'd say disable it for the moment then.

--
PJH

Jerry Hollombe

Jul 19, 2011, 11:47:14 AM
to reca...@googlegroups.com

reCAPTCHA finally caught one spam attempt that my spam filter would have
missed. Not even spam really, just some idiot posting a meaningless
message for no obvious reason -- more like graffiti. That's one out of
39 in the past day and a half, but it did save me some time and
annoyance, so I think I'll keep it for now.

Thanks for your responses.

Elliot Marx

Jul 21, 2011, 4:07:57 AM
to reCAPTCHA
On Jul 18, 5:15 pm, PJH <pauljherr...@gmail.com> wrote:
> On Mon, Jul 18, 2011 at 10:08 AM, Elliot Marx <marx.ell...@gmail.com> wrote:
> > Invision Power Board
>
> Sounds more like a problem with Invision's implementation of
> reCAPTCHA, not the captcha itself.
>
> --
> PJH

This sounds to me like a standard "If there's a problem, has to be
something else" response.

The implementation looks fine: the reCAPTCHA images appeared and
login wasn't allowed unless the correct input was typed. The same was
true for the normal CAPTCHA.

Again I've tried two different settings for CAPTCHA:
1) "Normal CAPTCHA (requires GD)": When I did this for 1 week, not 1
spam got through. I actually had a legitimate new user log on at that
time. I'll have more now that I can advertise our site and don't have
to worry about spam.
2) "ReCAPTCHA": I had 70 spam attacks daily that got through. I had to
manually weed them out.

Even if, as Jerry says, reCAPTCHA catches 92%, that means in my case I'd
still have at least 5 spams a day I'd have to weed out manually. Forums
are much worse spam targets, which probably explains the difference.

In the meantime, I'm not touching reCAPTCHA and I'm extremely
disappointed. You guys need to acknowledge the shortcomings and use
that information to improve it.

PJH

Jul 21, 2011, 4:12:56 AM
to reca...@googlegroups.com
On Thu, Jul 21, 2011 at 9:07 AM, Elliot Marx <marx....@gmail.com> wrote:
> On Jul 18, 5:15 pm, PJH <pauljherr...@gmail.com> wrote:
>> On Mon, Jul 18, 2011 at 10:08 AM, Elliot Marx <marx.ell...@gmail.com> wrote:
>> > Invision Power Board
>>
>> Sounds more like a problem with Invision's implementation of
>> reCAPTCHA, not the captcha itself.
>>
>> --
>> PJH
>
> This sounds to me like a standard "If there's a problem, has to be
> something else" response.

Because *every* other site that uses reCAPTCHA but not IPB has exactly
the same problem...

Oh, but wait - they don't.

> You guys need to acknowledge the shortcomings and use
> that information to improve it.

Nobody on this group has the ability to fix IPB's implementation of
something that works perfectly well elsewhere. Unless we happen to
have one of IPB's developers on here of course.

Perhaps you could consider complaining to IPB instead?

--
PJH

Elliot Marx

Jul 21, 2011, 8:29:54 AM
to reCAPTCHA
Yep, I already complained to IPB. You can check what they
have to say about reCAPTCHA:

http://community.invisionpower.com/resources/documentation/index.html...

Here's the section on reCAPTCHA:

"reCAPTCHA Options

The reCAPTCHA service is provided at no cost to you from reCAPTCHA.net
and its sponsors. The settings in this area allow you to customize the
service as provided by reCAPTCHA.

The API settings are will work as shipped and we would like to thank
reCAPTCHA for providing a global API key for use by all IPS customers.
This means that reCAPTCHA will work right of the box with no
configuration on your part."

So I find it hard to believe that they would promote reCAPTCHA so much
and not have it working properly on IPB...

Also, this note from the references of the article that
appeared in Science magazine explains reCAPTCHA vs. standard
CAPTCHA even better:

"8. Because computer programs can easily attempt to pass the CAPTCHA
multiple times, if a computer has a success rate of even 5%, the
CAPTCHA is considered broken. A typical convention is that a program
should not be able to pass the CAPTCHA with a success rate of more
than 1 in 10,000. (Downloading 10,000 CAPTCHA images requires
substantial usage of bandwidth, exposing the IP address as potentially
abusive.) Our system uses more than 100,000 words, which yields a
probability of random guessing that is much smaller than 1/10,000. By
contrast, conventional CAPTCHAs that use seven random characters yield
an even smaller probability of success for random guessing: 1/36^7."
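
The arithmetic in that note can be checked with a short Python sketch. The 36-symbol alphabet comes from the quoted 1/36^7. Treating each retry as an independent trial, and modelling the word-based challenge as a single word drawn uniformly from exactly 100,000, are simplifying assumptions made here for illustration and are not taken from the article.

```python
from fractions import Fraction

# Random-guess odds for a conventional CAPTCHA: 7 characters drawn from a
# 36-symbol alphabet, matching the 1/36^7 figure in the quoted note.
conventional = Fraction(1, 36 ** 7)

# Assumption: one word drawn uniformly from exactly 100,000 words.
word_based = Fraction(1, 100_000)


def p_at_least_one_pass(p_single: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1 - (1 - p_single) ** attempts


print(36 ** 7)                                   # 78364164096
print(word_based < Fraction(1, 10_000))          # True
print(conventional < word_based)                 # True
print(round(p_at_least_one_pass(0.05, 100), 3))  # 0.994
```

Under these assumptions, a program that passes only 5% of the time still gets through almost certainly within 100 tries, which is why the note treats a 5% success rate as broken.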

The spammers are getting really smart. I'd imagine some combination of
OCR combined with databasing the images is giving them an edge and
explains why many are coming through. With the proliferation of
databases and forums it's easy for them to garner this information
from many forums which use the same engine.

The standard CAPTCHA is doing a lot to help:
1) True randomized images rather than a database.
2) Constantly varying fonts, backgrounds, and character orientations.
3) A set of possible images too large for spammers to memorize or
learn.

Another thing that might explain this is that on IPB I tried to get a
new reCAPTCHA Private Key and reCAPTCHA Public key from your website
but then reCAPTCHA didn't work any more. I had to use the keys
generated by IPB. Perhaps the spammers got smart and figured that
everyone using IPB was getting the same keys?

Unfortunately, I would argue that at this point computers definitely
have a success rate of over 5% against reCAPTCHA.
Jerry Hollombe had the same issue, and I don't think that we're the
only ones having this issue.


orgsit...@aol.com

Jul 21, 2011, 5:13:35 PM
to reca...@googlegroups.com
I'm sorry, but I have to comment...

Is this IPB's great command of the English language in their "disclaimer"?

"The API settings are will (huh?) work as shipped and we would like to thank reCAPTCHA for providing a global API key for use by all IPS customers.

This means that reCAPTCHA will work right of the box with no configuration on your part."

That aside, it has been reported that reCaptcha so annoyed spammers that they started hiring people for next to nothing to solve the requested words and send the spam mail and forum posts by hand. So I suppose it is fairly easy to hire a few hundred or a few thousand people, pay them a pittance, and have them crank out spam while solving the words.
 
Second, please remember what the real purpose of reCaptcha was: not really to stop spammers, but to have humans help decipher scanned words from old or damaged texts, in the hope that a large number of matching answers would eventually recover a word almost lost to time.
 
Third, does your forum board have any secondary filters, such as blocking posts with URLs or excessively misspelled words? What about checking the login ID or IP address against the national anti-spammers database and blocking based on that? Could IPB start providing a database tracking service that users can log spammers to, so that once a spammer's count reaches a certain point, their posts do not go through?
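
As a rough illustration of the secondary filters suggested above, here is a minimal Python sketch that rejects a post when the poster's IP is on a blocklist or the text contains too many links. Everything in it is hypothetical: the link limit, the in-memory `BLOCKED_IPS` set (a stand-in for whatever shared anti-spam database a site might consult), and the function names. It is not IPB's code or any real service's API.

```python
import re

# Hypothetical threshold and blocklist, for illustration only.
MAX_LINKS = 2
BLOCKED_IPS = {"203.0.113.7", "198.51.100.23"}  # documentation-range addresses

# Matches http(s) URLs and bare "www." links.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)


def count_links(text: str) -> int:
    """Count URL-like substrings in a post."""
    return len(URL_PATTERN.findall(text))


def should_block(post_text: str, poster_ip: str) -> bool:
    """Block known-bad IPs outright; otherwise block link-heavy posts."""
    if poster_ip in BLOCKED_IPS:
        return True
    return count_links(post_text) > MAX_LINKS


print(should_block("Great mix last night!", "192.0.2.10"))  # False
print(should_block("Great mix last night!", "203.0.113.7"))  # True
print(should_block(
    "http://a.example http://b.example www.c.example", "192.0.2.10"
))  # True
```

A filter like this runs after the CAPTCHA, so it can also catch posts typed in by paid human solvers.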
 
We have to think outside the box.  My site uses reCaptcha, and yes, was having lots of problems with spammers.  Yet, I had good users who would send me their spam from the service so I could start building an exclusion list. 
 
We are fighting a fight like the movie & software biz does against illegal copies in China.  Close down one shop / avenue, and they will just set up another path in.
 
Let's try and be HELPFUL to reCaptcha and offer positive solutions to the problem.
Let's also remember that the spammers read these posts, so they LOVE every complaint, along with every potential solution we discuss, for we have just shared the key to the house with them.
 
Sometimes there are proper times to be vague.

Elliot Marx

Jul 21, 2011, 11:22:56 PM
to reCAPTCHA
This is the response I received from Invision Power Board:

"Hello,

Several users have already notified Google that they have seen a huge
surge in spam, and are linking it to a crack of the reCaptcha system.
IP.Board is not the only system affected - WordPress, vBulliten, and
any other system using reCaptcha is affected. You can read more and
follow the discussion on the reCaptcha website:
http://groups.google.com/group/recaptcha/browse_thread/thread/9de60ecda3dc7687

Unfortunately, because reCaptcha is a 3rd party service, we are not
able to investigate the cause or offer a solution to the situation.
It's simply beyond our control.

-Collin S.
Invision Power Services, Inc."

orgsitesi: Of course you shouldn't discuss algorithms for generating
CAPTCHAs. I'm just mentioning what is obvious to anyone who has ever
looked at a standard CAPTCHA and, frankly, what any spammer can notice
within a few seconds of looking at one. No harm done there.

I also don't buy your second argument. CAPTCHA stands for "Completely
Automated Public Turing Test to Tell Computers and Humans Apart." It's
using this acronym so it should be doing the job but it isn't.

The people who register on my website - mostly DJs - use all sorts of
crazy user name spellings which are often difficult to distinguish
from spam. Many are even from countries with URLs similar to the
spammers'. I don't have the time or energy to build lists and
algorithms to exclude people. The only 100% sure way is to approve
everyone then block the spam forum posters but this is tremendously
time consuming.

So yes, if we want to be helpful: just like anyone who writes anti-virus
software, if reCAPTCHA doesn't want to end up in the bin of failed
inventions, then whoever works for reCAPTCHA needs to get on the ball
and get one step ahead. The code's been cracked, they know the flaws,
and they need to get busy finding smarter ways to "Tell Computers and
Humans Apart."

Jerry Hollombe

Jul 21, 2011, 11:41:27 PM
to reca...@googlegroups.com

On 7/21/2011 8:22 PM, Elliot Marx wrote:

> So yes, if we want to be helpful: just like anyone who writes
> anti-virus software, if reCAPTCHA doesn't want to end up in the bin of
> failed inventions, then whoever works for reCAPTCHA needs to get on
> the ball and get one step ahead. The code's been cracked, they know
> the flaws, and they need to get busy finding smarter ways to "Tell
> Computers and Humans Apart."

The problem is it's not computers that are the source of the spam.
There are real live people in the world, mostly in Bangladesh it seems,
who think getting paid a few pennies an hour to sit at a terminal and
solve CAPTCHAs all day is a good job. Some have even advertised their
services in this forum -- presumably to any professional spammers who
may be lurking.

reCAPTCHA does effectively distinguish between humans and computers. It
blocks dozens of spam attempts at my site every day. It can't and isn't
intended to block spam entered by humans. For that I've built my own
filters and they block about 99% of what little reCAPTCHA can't. I
continue to tune my filters as best I can. (A webmaster's job is never
done. /-: )

Orgsit...@aol.com

Jul 21, 2011, 11:46:09 PM
to reca...@googlegroups.com
Have you considered implementing the following?
 
 
Sincerely,

-Scott

Elliot Marx

Jul 22, 2011, 12:12:27 AM
to reCAPTCHA
Jerry: Kudos for your patience. I can't hire a full time webmaster to
cover for reCAPTCHA's flaws. reCAPTCHA has fallen a step behind the
spammers and needs to catch up.

Jerry Hollombe

Jul 22, 2011, 1:21:55 AM
to reca...@googlegroups.com

It can't and isn't intended to. It's just a CAPTCHA and, despite claims
to the contrary, I don't think there's much, if any, evidence that
computers are solving the challenges at a useful rate. If they are,
we're out of luck because, if the challenges are made any more
difficult, humans won't be able to solve them either. In fact,
reCAPTCHA is blocking at least 98% of the attempts on my site, which is
evidence the machines aren't beating it.

We _know_ there are human beings solving the challenges for the
spammers. They've hawked their services in this forum. _By
definition_, CAPTCHAs can't solve that problem.

Building spam filters isn't a full time job. It's pretty
straightforward. In PHP, you create a function called isSpam() and pass
it the variables the spammers fill in. It does simple searches for
typical spam strings and returns true if it finds one, false if it
doesn't. On the rare occasion that a spammer gets through (maybe once a
week or so), I add an appropriate string to the filter, if there is one,
and that's the end of that.
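
The isSpam() approach described above can be sketched as follows. This is a minimal illustration written in Python rather than the PHP mentioned in the post. The phrase list, the name `is_spam`, and the case-insensitive matching are assumptions made for the example, not the actual filter.

```python
# Hypothetical spam phrases; in practice the list grows each time
# something slips through.
SPAM_STRINGS = ["viagra", "cheap pills", "casino bonus"]


def is_spam(*fields: str) -> bool:
    """Return True if any submitted field contains a known spam string."""
    for field in fields:
        lowered = field.lower()  # assumption: match case-insensitively
        if any(phrase in lowered for phrase in SPAM_STRINGS):
            return True
    return False


print(is_spam("WeLikeViagra", "hello"))                 # True
print(is_spam("DJ Mixmaster", "Looking for gig tips"))  # False
```

Adding a string to `SPAM_STRINGS` after a spammer gets through is the only maintenance step the post describes.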

Others here have mentioned on-line anti-spam services that do similar
things with IP addresses and such. I'm considering adding them to my
filter, but, so far, it isn't worth the bother.

PJH

Jul 22, 2011, 3:15:01 AM
to reca...@googlegroups.com

Can we have a link to IPB's module, or whatever they call it, that
implements reCAPTCHA please?

--
PJH

tom wible

Jul 22, 2011, 8:54:12 PM
to reca...@googlegroups.com
in the past 2 days i've seen ~16 spam comments on
http://bostoncamerata.org/blog, most of which were caught by wordpress...this is
remarkable only in that i usually see 1-2 spams/month.