Spammers on web2pyslices.com

Bruno Rocha

Jun 15, 2013, 12:40:50 AM
to web...@googlegroups.com, web2py-developers
Hi,

Recently we have been getting too much spam posted on web2pyslices.com.

I am deleting the posts one by one, but it has become difficult to keep track of them.

We need to implement a captcha system or some other kind of spam blocking.

Is there any volunteer to do this for the user registration form and also for the article post form?

I am in a rush between work and medical treatments. I tried, but I really have no time to develop this now.

If anybody can take this on, please email me and I will give you access to the development version of the code on pythonanywhere.

Thanks.

[]'s

---

Bruno Rocha

Paolo valleri

Jun 15, 2013, 4:11:49 AM
to web...@googlegroups.com, web2py-developers
Personally I don't like captcha images. Before delving into the implementation of something like that, it is worth trying the honeypot mechanism, namely a 'hidden field': a field that, if filled out, allows you to distinguish between requests from users and requests from robots. The field is hidden by CSS properties, so real users aren't able to fill it in.
More here: http://en.wikipedia.org/wiki/Honeypot_%28computing%29  Actually, I have never tested whether that really works!
We could think about implementing it as an option for web2py; it would be very welcome.
Finally, it seems that web2pyslices.com registration already has a captcha. Have you already implemented it?
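
A minimal sketch of the server-side half of the honeypot idea, in plain Python. The field name `website` and the function name are hypothetical, not part of web2py; the form is assumed to contain an extra input with that name, hidden from real users via CSS.

```python
# Hypothetical field name; pick something a bot is likely to fill in.
HONEYPOT_FIELD = "website"


def looks_like_bot(form_vars):
    """Return True when the hidden honeypot field came back non-empty.

    `form_vars` is any dict-like mapping of submitted form values
    (for example, request.vars in a web2py controller).
    """
    value = form_vars.get(HONEYPOT_FIELD) or ""
    return value.strip() != ""


# A real user never sees the field, so it arrives empty or missing.
human = {"title": "My slice", "body": "Hello", "website": ""}
# A naive bot fills in every input it finds.
bot = {"title": "Buy now", "body": "spam", "website": "http://spam.example"}

print(looks_like_bot(human))  # False
print(looks_like_bot(bot))    # True
```

The check only catches bots that fill in every field; a bot that leaves unknown fields empty passes it.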

Paolo

Bruno Rocha

Jun 15, 2013, 5:33:12 AM
to web2py-developers, web...@googlegroups.com
Implemented only for user registration, but spammers are registering anyway, so I guess they have a captcha-breaking system, or there is a hole in the website's security (needs investigation).

A captcha, honeypot, email confirmation or something similar would be really nice if implemented in the "create a slice" form.

I hope to get some help on this.

Thanks.


--
-- mail from:GoogleGroups "web2py-developers" mailing list
make speech: web2py-d...@googlegroups.com
unsubscribe: web2py-develop...@googlegroups.com
details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-develop...@googlegroups.com.

For more options, visit https://groups.google.com/groups/opt_out.
 
 

Anthony

Jun 15, 2013, 9:11:03 AM
to web2py-d...@googlegroups.com, web...@googlegroups.com
Is it possible these are not being posted by bots? If so, we might need another tactic, such as requiring that a new user's first post be approved by a moderator.

Anthony

Alan Etkin

Jun 15, 2013, 9:37:26 AM
to web...@googlegroups.com, web2py-d...@googlegroups.com
> Is it possible these are not being posted by bots?

It would take a very smart bot to pass a captcha (no?). Maybe it is possible to change the type of captcha used (e.g. random visual tests such as arithmetic with objects)? I'm clueless about authentication beyond the built-in web2py features, but I can help by running tests against the web2pyslices app if needed.

> If so, we might need another tactic, such as requiring that a new user's first post be approved by a moderator.

+1

Niphlod

Jun 15, 2013, 9:51:48 AM
to web...@googlegroups.com, web2py-d...@googlegroups.com
I have an unrelated (on web2py's side) website that uses captchas from Google, and bots are successfully registering on it (of course, they need to be approved first, but it's a PITA to remove them anyway).
There are captcha services that decode the images for you (and your bot).

I'm working on a threaded comments plugin in my spare time, and for spam prevention I just add some hidden fields that need JavaScript to be filled in. Given that bots running JavaScript code are a small percentage, this should mitigate the issue (at least, a similar technique on the aforementioned site is keeping spambots away).
One small problem, though: users with JavaScript disabled are left out. If that is fine, I can share the draft code (I was waiting to complete the plugin before posting it to GitHub and here, but if needed that's not a big deal).

paolo....@gmail.com

Jun 15, 2013, 12:17:37 PM
to web...@googlegroups.com

I don't see a single approach able to tackle the issue completely; we should implement several techniques together.
Anyway, what shall we do when a bot is detected? Do we have some sort of blacklist? If so, instead of starting with an empty list, we could start from a publicly available blacklist of bots.

Niphlod

Jun 15, 2013, 12:36:05 PM
to web...@googlegroups.com


On Saturday, June 15, 2013 6:17:37 PM UTC+2, Paolo valleri wrote:

> I don't see an approach able to tackle the issue at all, we should implement several techniques together.
> Anyway, what shall we do when a bot is detected? Have we got a sort of blacklist? If so, instead of starting with an empty list, we could think to start from a public available blacklist of bot.

Ehm. When you detect a bot, you don't allow it to post anything ^_^.
Do you want to ban its IP? Sure, that can be done, but it's quite a stretch.
BTW, I created a stopforumspam plugin that validates against a list of IPs and/or emails if you want to do that kind of check, but I think it's superfluous if honeypots work out.

 

villas

Jun 15, 2013, 3:10:51 PM
to web...@googlegroups.com
I started to get bot spam.  So I introduced the non-displayed honeypot field that the bots would complete.  This worked great at first,  but the bots seemed to learn the trick and started leaving it empty.  So the spam returned. 

After a little research,  I decided that I liked those questions that humans can easily answer,  but bots cannot. 

Only problem with the questions is that there can be several 'right' answers eg. zero, 0, none, nil, nothing -- might all be acceptable answers.  Also the questions needed to be selected at random.  So my solution needed to be not only simple but flexible too. 

Anyhow I wrote that code and I have not been troubled by spam since.  If anyone is interested in this idea I will extract the code from my app and post it.
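
Until that code is posted, here is a rough sketch of how such a random question check might look. It is not villas's implementation; the question bank, names and normalization rule are all assumptions made for illustration.

```python
import random

# Hypothetical question bank: each question maps to a set of accepted answers.
QUESTIONS = {
    "How many legs does a snake have?": {"0", "zero", "none", "nil", "nothing"},
    "What colour is a ripe banana?": {"yellow"},
    "What is the sum of 3 and 2?": {"5", "five"},
}


def pick_question(rng=random):
    """Choose one question at random to show on the form."""
    return rng.choice(sorted(QUESTIONS))


def is_correct(question, answer):
    """Compare a normalized answer against the accepted set for the question."""
    accepted = QUESTIONS.get(question, set())
    return (answer or "").strip().lower() in accepted


question = pick_question()
print(question)
print(is_correct("How many legs does a snake have?", " Zero "))  # True
```

The server has to remember which question it showed (for example in the session), so that the bot cannot choose the question it answers. Swapping the question bank is then just a matter of editing the dictionary.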

Regards,  D

 


Niphlod

Jun 15, 2013, 3:29:42 PM
to web...@googlegroups.com
Honeypot is the first stop, JavaScript evaluation is the second... Another step that is almost impossible to break is to require registration for everything (via Google and Facebook), but that can scare off users (although probably not users of web2pyslices.com).

villas

Jun 15, 2013, 4:17:57 PM
to web...@googlegroups.com
Why is honeypot field your first stop when I just said that it was too unreliable?  (Especially as my site was not even worth spamming). 

JS verification can be bypassed completely by bots,  so what do you have in mind?

My question/answer solution is better than a simple honeypot field,  so didn't you like that idea?

Alan Etkin

Jun 15, 2013, 4:39:02 PM
to web...@googlegroups.com
What about this? Would it be useful? It looks like the source is not maintained nowadays.

http://www.voidspace.org.uk/python/akismet_python.html

Niphlod

Jun 15, 2013, 4:52:44 PM
to web...@googlegroups.com
Rephrasing: the honeypot field (i.e. hidden from users, needs to be empty, usually filled by bots) is the "simplest implementation".
If the people programming bots want to tackle your site, they find the honeypot pretty easily and then code accordingly, so back to square one.

The second one is JS execution. This is effectively mitigating a lot (>90%) of attempts on the aforementioned site.
I don't know why you state that JS verification can be bypassed by bots... usually they don't want to waste CPU resources loading a full JS environment (SpiderMonkey, PhantomJS) just to crack your site (that's the other <10%).


Next is moderation: everything gets checked in as a draft, and then moderators check those messages and "publish" them. Anthony's solution is the best "bang for the buck" one, but a management "console" has to be programmed anyway, and every bot that gets in needs to be removed anyway (as right now).
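
To make the draft-then-publish flow concrete, here is a small in-memory sketch that combines it with the "first post must be approved" rule. The class and method names are hypothetical; a real app would keep the status in a database table rather than in a list.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str
    body: str
    status: str = "draft"  # every new post starts unpublished


@dataclass
class ModerationQueue:
    trusted: set = field(default_factory=set)  # authors approved before
    posts: list = field(default_factory=list)

    def submit(self, author, body):
        """Store a post; publish directly only for already-trusted authors."""
        post = Post(author, body)
        if author in self.trusted:
            post.status = "published"
        self.posts.append(post)
        return post

    def approve(self, post):
        """Moderator action: publish the post and trust its author."""
        post.status = "published"
        self.trusted.add(post.author)

    def reject(self, post):
        """Moderator action: drop the post without trusting the author."""
        self.posts.remove(post)

    def pending(self):
        """Posts still waiting for a moderator."""
        return [p for p in self.posts if p.status == "draft"]


queue = ModerationQueue()
first = queue.submit("alice", "My first slice")
print(first.status)                                  # draft
queue.approve(first)
print(queue.submit("alice", "Second slice").status)  # published
```

The `pending()` list is what the management "console" would show to moderators.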

Here end the solutions that are "transparent to the users".

Next step, we want to "annoy" users... Requiring them to fill in a captcha, reply to a question, or register all make sense, but it kind of shifts the problem. Captchas are being passed right now, and a small database of answers gets beaten easily if the bot's programmer is looking at your site (or, with Wolfram Alpha, if he wants to enjoy natural language programming).

Another step: pass everything through predefined filters, such as stopforumspam. It surely won't get you killed, but it can be heavy on a social app (thinking about slices, which runs on pythonanywhere; I don't know how much "juice" it has).
Yet another step: pass everything to Akismet (@alan beat me to it while I was writing the post). It doesn't always work and can ban on false positives. Synchronous use slows the comment form a lot; asynchronous use needs some machinery to work (again, checking everything in as a draft, then submitting to Akismet and pruning whatever comes back as "it's a spam message").

Niphlod

Jun 15, 2013, 4:54:28 PM
to web...@googlegroups.com
Almost forgot. The second step is actually to add a time verification, to check that the user opened the comment form at least 5 seconds before posting.
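
One possible way to do the time check without server-side storage is a signed timestamp in a hidden field. This is only a sketch under assumptions: the secret key, the one-hour upper limit and the function names are made up for the example.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # assumption: a per-site secret kept on the server
MIN_SECONDS = 5            # minimum time a human needs to fill in the form
MAX_SECONDS = 3600         # assumption: reject tokens older than one hour


def _sign(text):
    return hmac.new(SECRET_KEY, text.encode(), hashlib.sha256).hexdigest()


def issue_token(now=None):
    """Return 'timestamp:signature' to embed in a hidden form field."""
    ts = str(int(time.time() if now is None else now))
    return ts + ":" + _sign(ts)


def token_rejected(token, now=None):
    """Return True if the token is forged, too recent, or too old."""
    now = time.time() if now is None else now
    try:
        ts, sig = token.split(":", 1)
        issued = int(ts)
    except (AttributeError, ValueError):
        return True
    # Compare as bytes so non-ASCII input cannot raise an exception.
    if not hmac.compare_digest(sig.encode(), _sign(ts).encode()):
        return True
    elapsed = now - issued
    return elapsed < MIN_SECONDS or elapsed > MAX_SECONDS


token = issue_token(now=1000)
print(token_rejected(token, now=1002))  # True, posted after 2 seconds
print(token_rejected(token, now=1010))  # False, posted after 10 seconds
```

The signature stops a bot from simply sending an old timestamp of its own choosing.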

Alan Etkin

Jun 15, 2013, 6:53:39 PM
to
> almost forgot. Second step is actually to add a time verification to see if user opened the comment form at least 5 seconds before posting.

web2py could ship these as default validators, as long as they can actually be implemented:

db.auth_user.email.requires = IS_NOT_SPAMMER(...)
db.<table>.<string field>.requires = IS_NOT_SPAM(...)

using one or more of the resources posted in this thread

EDIT: Niphlod's stopforumspam plugin already has an IS_NOT_SPAMMER like validator
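
For illustration, a validator of that shape might look roughly like the sketch below. It is hypothetical and is not the code from Niphlod's plugin; it assumes the usual web2py validator convention of a callable returning `(value, None)` on success or `(value, error_message)` on failure, and it checks against a local blacklist instead of a remote service.

```python
class IS_NOT_SPAMMER(object):
    """Hypothetical validator in the style of web2py's validators.

    Assumption: like the built-in validators, it is called with the field
    value and returns (value, None) on success or (value, error_message).
    `blacklist` is any iterable of known spammer emails; a real one might
    be fed from a service such as stopforumspam.
    """

    def __init__(self, blacklist, error_message="This address is blocked"):
        self.blacklist = {entry.lower() for entry in blacklist}
        self.error_message = error_message

    def __call__(self, value):
        if (value or "").strip().lower() in self.blacklist:
            return (value, self.error_message)
        return (value, None)


check = IS_NOT_SPAMMER(["bot@spam.example"])
print(check("alice@example.com"))  # ('alice@example.com', None)
print(check("Bot@Spam.example"))   # ('Bot@Spam.example', 'This address is blocked')
```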

villas

Jun 15, 2013, 6:56:36 PM
to web...@googlegroups.com
> rephrasing. honeypot field (i.e. hidden from users, needs to be empty, usually filled by bots) is the "simplier implementation".
> If users programming bots wants to tackle your site, they find the honeypot pretty easily and then code accordingly, so back to square 1.

My experience is this: if the "users programming bots" attack your site then you are dealing with a human and CAPTCHAs are irrelevant.  We are dealing with bots here.  Honeypot empty field seemed to be OK once,  but not now.


> Second one is js execution. This is effectively mitigating a lot (>90%) of attempts on the aforementioned site.
> I don't know why you state that js verification can be passed by bots... usually they don't want to waste cpu resources loading a full js environment (spidermonkey, phantomjs) just to crack your site (that's the other <10%)


I believe most bots initially bypass JS by ignoring it -- they detect the fields and how they are submitted and then just try to whizz off some data.  The spammers are just building on to that basic approach to defeat the most obvious defences.  e.g. The empty honeypot field. 

However,  there are still very effective ways of obfuscating the form with JS.  The added JS checkbox is currently very effective.  I would recommend that as a minimally obtrusive test.  However,  I also think the spambots will eventually start looking out for that one too.
http://uxmovement.com/forms/captchas-vs-spambots-why-the-checkbox-captcha-wins/

For a slightly more obtrusive test the Qs and As are good.  The obvious ones are static prompts,  e.g. What is the sum of 3 and 2?  I had good success with that for a while,  but some spambots seem to look for that now.  I currently have total success with my random Qs and As and I can totally change these if and when the spambots catch up.

I like the 5 secs idea -- I never thought of that!

Summary:
1. I would try the JS checkbox, 5 secs delay,  and/or look at my Qs and As implementation.
2. Failing that you have to fall back to:  increasingly complex image CAPTCHAs, Login, Moderation and Akismet solutions etc.

If the spambot programmers target your site (and let's face it, they would only do so if you are a worthy target),  you have to work with the second group anyway :(

Joe Barnhart

Jun 16, 2013, 4:09:33 AM
to web...@googlegroups.com, web2py-developers
At least one site I use regularly implemented a 24-hour posting delay.  Sign up today and your posting ability starts tomorrow.  It was a little annoying to newbies but it really zeroed the spam!
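
The delay rule itself is tiny. A sketch, assuming the registration time is stored as a naive `datetime` on the user record (the function and constant names are made up):

```python
from datetime import datetime, timedelta

POSTING_DELAY = timedelta(hours=24)


def can_post(registered_at, now=None):
    """Return True once the account is at least POSTING_DELAY old."""
    now = datetime.now() if now is None else now
    return now - registered_at >= POSTING_DELAY


signup = datetime(2013, 6, 15, 12, 0)
print(can_post(signup, now=datetime(2013, 6, 16, 11, 0)))  # False, only 23 hours
print(can_post(signup, now=datetime(2013, 6, 16, 12, 0)))  # True, 24 hours
```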

-- Joe

Michele Comitini

Jun 16, 2013, 6:07:41 AM
to web...@googlegroups.com, web2py-developers
As an alternative method, there is a very robust solution: client authentication using an X.509 client certificate.  For a user, installing the certificate is simpler than answering questions or reading weird captchas, and afterwards he can forget about it: the browser does all the authentication by itself using the SSL/TLS protocol.  It all depends on the usage scenario, though. You need a PKI that generates a PKCS#12 archive (certificate plus private key) and lets the user install it in his browser.  For my needs I have written a simple PKI for web2py here:


The code is really simple. The advantage is that certificate generation can be automated during the registration process of any web2py app.  There are other and better PKI implementations around, much more complex to manage, but it depends on how much security and how many features you need.  To avoid browser complaints about insecure certificates, just use the server private key that you use in your PKI to request a cheap or free server certificate (startssl.com is a good one), install it on your web server along with the private key, and you are done. web2py supports X.509 auth out of the box with Rocket, but you can use most SSL-enabled servers: Apache, nginx, Cherokee and many others.

mic



Niphlod

Jun 16, 2013, 8:11:45 AM
to web...@googlegroups.com, web2py-developers
@mcm: apart from having to explain to a user how to set up his browser to provide client auth with SSL, I don't think that pythonanywhere lets you use client-side SSL auth.
@joe: talking about "annoying", a 24-hour stop would surely make me angry. The problem here is stopping bots; with this you have to manually unregister them anyway.
@villas: JS execution lets you run some server-side code that needs to be executed by the client too. Let the bots figure out that they need to reverse a string and send only half of it, deciphering your JS function... Just by loading a JS environment they lose roughly 70% of their processing power.

I think that honeypot + timestamp + JS execution are transparent to the end user and keep the vast majority of bots out. Every captcha solution needs to trim out a large percentage of unwanted behaviours. A 100%-proof solution only comes with high-grade security, but let's face it, web2pyslices.com doesn't need to be a banking site.
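
As a sketch of the server side of the string-reversal idea: the server hands out a random challenge, the page's JavaScript is expected to reverse it and send back the first half, and the server recomputes the same answer. The names, the secret key and the signing scheme are assumptions for the example, not code from the plugin.

```python
import hashlib
import hmac
import secrets

SECRET_KEY = b"change-me"  # assumption: a per-site secret kept on the server


def _sign(text):
    return hmac.new(SECRET_KEY, text.encode(), hashlib.sha256).hexdigest()


def new_challenge():
    """Return (challenge, signature), both embedded in the page."""
    challenge = secrets.token_hex(8)  # 16 hex characters
    return challenge, _sign(challenge)


def expected_answer(challenge):
    """What the page's JavaScript should compute: reverse, keep the first half.

    A matching client-side expression could be:
    c.split("").reverse().join("").slice(0, c.length / 2)
    """
    reversed_text = challenge[::-1]
    return reversed_text[: len(reversed_text) // 2]


def verify(challenge, signature, answer):
    """Accept only a server-issued challenge with the correct answer."""
    if not hmac.compare_digest(signature.encode(), _sign(challenge).encode()):
        return False
    return hmac.compare_digest(answer.encode(), expected_answer(challenge).encode())


print(expected_answer("abcdef"))  # fed
challenge, signature = new_challenge()
print(verify(challenge, signature, expected_answer(challenge)))  # True
print(verify(challenge, signature, challenge))                   # False
```

A bot that never runs the script has no answer to send. A bot written specifically for the site can still reimplement the function, and a captured challenge can be replayed unless it is also tied to a timestamp or the session.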

villas

Jun 16, 2013, 4:46:36 PM
to web...@googlegroups.com, web2py-developers
>> I think that honeypot + timestamp + js execution are transparent to the end user and keep the vast majority of bots out.

Yes that sounds very good.
Re: Honeypot.  As already mentioned, a display:none input box on its own does not seem to defeat spammers these days.  However,  there could be other innovative ways of styling it so that real users ignored it.

Paolo Betti

Jun 17, 2013, 6:44:32 AM
to web2py-d...@googlegroups.com, web...@googlegroups.com
Hi,

This is a different kind of solution, but have you ever tried the CloudFlare (cloudflare.com) service?

It is a kind of online caching proxy.

I use it with my site, which has very, very low traffic :-) but open comments, and the spammers have disappeared.

{the site is made with Plone but I have to upgrade to web2py as soon as possible ;-)}

PB