How to deal with CSRF middleware from a crawler

Torsten Bronger

Nov 24, 2014, 3:01:39 PM
to django...@googlegroups.com
Hello!

We use crawlers, which in our case are Python scripts that read data
from disk and send a lot of HTTP POST requests to the Django
deployment. The POST requests hit the same URLs/Views that are also
used by the browser to edit something with a web <form>.

If I activate the CSRF middleware, does this mean that our crawlers
have to make GET requests before every POST in order to get the CSRF
token? This would slow them down significantly ... Can one exclude
certain usernames from the CSRF checks? Or do you see another way
of keeping the number of HTTP requests small in the crawlers?

Bye,
Torsten.

--
Torsten Bronger Jabber ID: torsten...@jabber.rwth-aachen.de
or http://bronger-jmp.appspot.com

Carl Meyer

Nov 24, 2014, 3:50:18 PM
to django...@googlegroups.com
Hi Torsten,

On 11/24/2014 01:00 PM, Torsten Bronger wrote:
> We use crawlers, which in our case are Python scripts that read data
> from disk and send a lot of HTTP POST requests to the Django
> deployment. The POST requests hit the same URLs/Views that are also
> used by the browser to edit something with a web <form>.
>
> If I activate the CSRF middleware, does this mean that our crawlers
> have to make GET requests before every POST in order to get the CSRF
> token? This would slow them down significantly ... Can one exclude
> certain usernames from the CSRF checks? Or do you see another way
> of keeping the number of HTTP requests small in the crawlers?

Unless you've modified the CSRF implementation locally, all it does is
check that the CSRF token provided in a cookie matches the one provided
in the POST data. This is effective because browser same-origin policy
prevents malicious JS from reading the cookie value that the user's
browser will send, or from controlling the sent cookie value.
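
To make the comparison concrete, here is a minimal sketch of a double-submit check of this kind. It is not Django's actual code: the function name `csrf_check_passes` is made up for illustration, and only the names `csrftoken` (cookie) and `csrfmiddlewaretoken` (form field) are Django's defaults.

```python
import hmac


def csrf_check_passes(cookies: dict, post_data: dict) -> bool:
    """Simplified double-submit check (illustrative, not Django's code).

    The request passes only if the token in the cookie equals the
    token submitted in the POST body.
    """
    cookie_token = cookies.get("csrftoken", "")
    form_token = post_data.get("csrfmiddlewaretoken", "")
    if not cookie_token or not form_token:
        # A missing cookie or a missing form field always fails.
        return False
    # Constant-time comparison of the two values.
    return hmac.compare_digest(cookie_token.encode(), form_token.encode())
```

A page on another origin can make the browser send the cookie, but it cannot read the cookie's value to copy it into the form field, so the two values will not match.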

But this means that the CSRF protection is simple to bypass in a case
like yours: you can just set the CSRF cookie and the POST var to the
same value in all your crawler's requests. It doesn't matter what that
value is.

Carl


Torsten Bronger

Nov 25, 2014, 4:28:47 PM
to django...@googlegroups.com
Hello!

Carl Meyer writes:

> [...]
>
> Unless you've modified the CSRF implementation locally, all it
> does is check that the CSRF token provided in a cookie matches the
> one provided in the POST data. [...]
>
> But this means that the CSRF protection is simple to bypass in a
> case like yours: you can just set the CSRF cookie and the POST var
> to the same value in all your crawler's requests. It doesn't
> matter what that value is.

Thank you. I was ignorant about the details of this anti-CSRF
mechanism. It's working now, even for the login view itself.