Preventing spam attacks on Ajax views


Julien Phalip

Nov 28, 2008, 5:27:51 PM
to Django users
Hi,

I'm building a rating app, so people can rate any kind of object (e.g.
a video, a news entry, etc.). The rating is done anonymously (there's
no user account on that site) and via an Ajax query. The view
currently only takes one parameter, the rating value (a float), so I
don't think I can use something like Akismet.

To prevent multiple ratings by the same person, a flag is set in the
session. Obviously it means that the person can rate again if she uses
a different browser or if the session expires, but that's not a big
issue.
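As a rough sketch, the session-flag check could look something like this (plain-Python stand-in for the Django session object; `rate_object` and the dict-based store are hypothetical names, not an existing API):

```python
def rate_object(session, object_id, value, store):
    """Record a rating unless this session has already rated the object.

    `session` is a dict-like object (in a real view, request.session);
    `store` is a stand-in for wherever ratings are persisted.
    """
    key = 'rated_%s' % object_id
    if session.get(key):
        return False  # this session already rated the object
    store.setdefault(object_id, []).append(value)
    session[key] = True  # flag the session so a second rating is refused
    return True
```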

Now, what worries me is potential spam attacks. How can I identify if
the request is from a genuine person or a bot? I started implementing
a system which records IP addresses and prevents anybody from rating
twice from the same IP within a given short time. But if genuine people
are behind a proxy, IP uniqueness cannot be guaranteed and they may all
be mistaken for a bot.

Are there some algorithms in Django to cope with this kind of
situation? Maybe passing some kind of key protection in the URL?

Thanks a lot,

Julien

Malcolm Tredinnick

Nov 28, 2008, 8:57:33 PM
to django...@googlegroups.com

On Fri, 2008-11-28 at 14:27 -0800, Julien Phalip wrote:
> Hi,
>
> I'm building a rating app, so people can rate any kind of object (e.g.
> a video, a news entry, etc.). The rating is done anonymously (there's
> no user account on that site) and via an Ajax query. The view
> currently only takes one parameter, the rating value (a float), so I
> don't think I can use something like Akismet.
>
> To prevent multiple ratings by the same person, a flag is set in the
> session. Obviously it means that the person can rate again if she uses
> a different browser or if the session expires, but that's not a big
> issue.
>
> Now, what worries me is potential spam attacks. How can I identify if
> the request is from a genuine person or a bot? I started implementing
> a system which records IP addresses and prevents anybody from rating
> twice from the same IP within a given short time. But if genuine people
> are behind a proxy, IP uniqueness cannot be guaranteed and they may all
> be mistaken for a bot.

All you can do are social engineering hacks. There's no way for the web
protocol to know and Django doesn't ship with anything like that (since
it's an arms race -- what works one day in one scenario doesn't work in
other places). You have no way of knowing if submissions from the same
IP address are from a bot or a human, so you just have to decide whether
to allow those or not. Since major bot attacks rarely come from single
IP addresses in any case, it's only a first line of defence (really
major attacks are done in distributed form, so they come from multiple
locations -- but that often isn't an issue for normal sites, since
you're miles below the radar for such attackers).

Rate-limit throttling is one fairly easy way to keep things in line: a
maximum number per minute on average (keeping a running moving average
value is easy) and a maximum number per day, for example.
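A minimal sketch of that idea, assuming an exponential moving average of requests per minute plus a hard daily cap (the `RateThrottle` class and its parameters are hypothetical, not something Django ships with):

```python
import time

class RateThrottle:
    """Per-source throttle: a running (exponential moving) average of
    requests per minute, plus a hard cap on requests per day."""

    def __init__(self, max_per_minute, max_per_day, alpha=0.3):
        self.max_per_minute = max_per_minute
        self.max_per_day = max_per_day
        self.alpha = alpha       # smoothing factor for the moving average
        self.rate = 0.0          # smoothed requests-per-minute estimate
        self.last_seen = None
        self.day_count = 0
        self.day_start = time.time()

    def allow(self, now=None):
        """Record one request and return True if it is within both limits."""
        now = time.time() if now is None else now
        if now - self.day_start >= 86400:
            # a day has passed: reset the daily counter
            self.day_count, self.day_start = 0, now
        if self.last_seen is not None:
            gap = max(now - self.last_seen, 1e-9)
            instant_rate = 60.0 / gap  # requests/minute implied by this gap
            self.rate = self.alpha * instant_rate + (1 - self.alpha) * self.rate
        self.last_seen = now
        self.day_count += 1
        return self.rate <= self.max_per_minute and self.day_count <= self.max_per_day
```

In a real view you would keep one such throttle per client key (e.g. per IP, via Django's cache) rather than a single global one.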


>
> Are there some algorithms in Django to cope with this kind of
> situation?

No, for the reason mentioned above.

> Maybe passing some kind of key protection in the URL?

Anything automatic isn't going to stop a bot any more than it would stop
a human. That's why the effective prevention measures are something
unpredictable that a human has to do -- from a CAPTCHA-style entry to
just entering the word "orange" in a box to solving a mathematics
problem.
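For instance, the maths-problem variant can be as small as this (a sketch; `make_challenge` and `check_answer` are made-up helper names, and the expected answer would be stashed in the session between requests):

```python
import random

def make_challenge():
    """Return a trivial arithmetic question plus its expected answer."""
    a, b = random.randint(1, 9), random.randint(1, 9)
    return ('What is %d + %d?' % (a, b), a + b)

def check_answer(expected, submitted):
    """True only if the submitted form value matches the expected answer."""
    try:
        return int(submitted) == expected
    except (TypeError, ValueError):
        return False  # non-numeric or missing input fails the check
```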

Regards,
Malcolm


Andrei Eftimie

Nov 29, 2008, 6:32:15 AM
to Django users
Probably the best thing would be to have accounts...

Julien Phalip

Nov 29, 2008, 4:46:56 PM
to Django users
Thanks for your replies, Andrei and Malcolm. Unfortunately that site
does not (and will not) have user accounts, and the client does not
want captcha-like solutions either ("too cumbersome for users"). So I
guess there is no perfect protection here; in this case we'll resort to
monitoring IPs and the time between requests as a first layer of
protection. That should be enough for the size and popularity of that
site.
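Concretely, that first layer will be something along these lines (a sketch with an in-memory dict; in the real app the timestamps would live in Django's cache, and `ip_allowed` and `MIN_INTERVAL` are names I'm making up here):

```python
import time

_last_rating = {}   # ip -> timestamp of that IP's most recent rating
MIN_INTERVAL = 30   # seconds a given IP must wait between ratings

def ip_allowed(ip, now=None):
    """First-layer check: refuse a rating if the same IP rated too recently."""
    now = time.time() if now is None else now
    last = _last_rating.get(ip)
    if last is not None and now - last < MIN_INTERVAL:
        return False  # too soon after the previous rating from this IP
    _last_rating[ip] = now
    return True
```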

Regards,

Julien