How can I block curl requests


Kate

unread,
Aug 1, 2012, 4:43:58 PM8/1/12
to google-a...@googlegroups.com
I am getting tens of thousands of curl requests, thousands of times more than browser requests, and I want to block them. I'm using Python. They are coming from many different IPs, most in Europe. If I can't stop them I will have to close my site or move to a new provider.

Thanks in advance,

Kate






Rerngvit Yanggratoke

unread,
Aug 1, 2012, 4:51:04 PM8/1/12
to google-a...@googlegroups.com
I think you can blacklist those IPs. Have a look at https://developers.google.com/appengine/docs/java/config/dos#About_dos_xml or https://developers.google.com/appengine/docs/python/config/dos.








--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/L7wDtd5rrUcJ.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.



--
Best Regards,
Rerngvit Yanggratoke 

Kate

unread,
Aug 1, 2012, 5:14:45 PM8/1/12
to google-a...@googlegroups.com

Yes I can, but there's a limit to the number (100, I think), and there are tens of thousands of them. I guess I'll just have to move from GAE, as there doesn't seem to be any way of doing this and I can't be paying for these requests.



Kyle Finley

unread,
Aug 1, 2012, 5:55:08 PM8/1/12
to google-a...@googlegroups.com
Hi Kate,

Maybe you could look into cloudflare.com. It offers protection against DDoS attacks, which is what these requests appear to be.

- Kyle

Drake

unread,
Aug 1, 2012, 6:45:01 PM8/1/12
to google-a...@googlegroups.com

You can check the user agent of the request. Is the user agent the same?

 


Jeff Schnitzer

unread,
Aug 1, 2012, 6:59:00 PM8/1/12
to google-a...@googlegroups.com
Oh, the irony...

Jeff

Kyle Finley

unread,
Aug 1, 2012, 7:06:38 PM8/1/12
to google-a...@googlegroups.com
Yes, I should mention that today might not be a good day to test CloudFlare, though.




Kyle Finley

unread,
Aug 1, 2012, 7:11:56 PM8/1/12
to google-a...@googlegroups.com
Even if she did the check in a middleware, she would still have to handle/reject the requests by spinning up instances, though. Right? 

Drake

unread,
Aug 1, 2012, 7:16:43 PM8/1/12
to google-a...@googlegroups.com

Yes you’d still have to check user agent, but that could be very early in the processing, and returning a denied message will typically cause the offender to go away.

 


Kate

unread,
Aug 1, 2012, 9:58:06 PM8/1/12
to google-a...@googlegroups.com
Yes Brandon, the user agent is the same, but how do I issue the denied message?

Thanks,

Kate

hyperflame

unread,
Aug 1, 2012, 10:01:56 PM8/1/12
to Google App Engine
Return an HTTP 429 status code (it stands for Too Many Requests). See
https://developers.google.com/appengine/docs/python/tools/webapp/redirects

Kyle Finley

unread,
Aug 2, 2012, 12:40:57 AM8/2/12
to google-a...@googlegroups.com
Kate,

You could check the user agent in a middleware something like this:

# in your appengine_config.py file - root directory
from webob import Response

class AntiCurlMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if environ['HTTP_USER_AGENT'] is 'curl':
            resp = Response('Too many requests!')
            resp.status_code = 423
            return resp(environ, start_response)
        return self.app(environ, start_response)

def webapp_add_wsgi_middleware(app):
    return AntiCurlMiddlewarey(app)

I haven't tested this, so if anyone sees any errors please let me know.

- Kyle

Kate

unread,
Aug 2, 2012, 2:55:17 PM8/2/12
to google-a...@googlegroups.com
Well I tried this by testing user agent but it passes the test and the page loads correctly, which it shouldn't in this example.
Am I meant to pass 'app' as such? I am not sure of this parameter.

Here is my code 
from webob import Response

class AntiCurlMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        ua = os.environ.get('HTTP_USER_AGENT', "unknown")
        sindex = string.find(ua,'Win',0)
        if sindex > 0:
            resp = Response('Too many requests!')
            resp.status_code = 423
            return resp(environ, start_response)
        return self.app(environ, start_response)


def webapp_add_wsgi_middleware(app):
    return AntiCurlMiddlewarey(app)



hyperflame

unread,
Aug 2, 2012, 3:43:44 PM8/2/12
to Google App Engine
You're trying to find curl requests by looking for a "win" string?
IIRC, curl uses "curl" as its default user-agent, you have to look for
that. Also, it's a bad idea to look for "Win", as legitimate requests
(users using WINdows) will be blocked.

Also, it should be noted (and I believe a number of people have
already mentioned) that curl allows a person to change their
user-agent; just looking for a "curl" user-agent may not stop the problem.

Kate

unread,
Aug 2, 2012, 3:47:57 PM8/2/12
to google-a...@googlegroups.com
I am looking for 'win' in user agent as that IS the USER Agent I am using to test. It should evaluate to true  and give a 423 error.

I do not want to test it using curl. I am trying to see if I can return a 423 error. It is a TEST!

I am running it locally to see if it will block.

It doesn't.

'Win' is in my user agent - as I am testing.

Kyle Finley

unread,
Aug 2, 2012, 4:42:44 PM8/2/12
to google-a...@googlegroups.com
Kate, sorry there were some errors

Here's a working example:

It should be "status" not "status_code", AntiCurlMiddleware not AntiCurlMiddlewarey and I used startswith instead of is

You can test it here:


will return:

* About to connect() to anticurl.scotch-media.appspot.com port 80 (#0)
*   Trying 173.194.77.141... connected
* Connected to anticurl.scotch-media.appspot.com (173.194.77.141) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
> Accept: */*
< HTTP/1.1 423 Locked
< Content-Type: text/html; charset=UTF-8
< Vary: Accept-Encoding
< Date: Thu, 02 Aug 2012 20:40:13 GMT
< Server: Google Frontend
< Cache-Control: private
< Transfer-Encoding: chunked
* Connection #0 to host anticurl.scotch-media.appspot.com left intact
* Closing connection #0
Too many requests!
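
The three fixes described above can be sketched as a plain WSGI middleware. This is not the file deployed at the test URL; it is a minimal illustration that avoids webob, and `make_anti_curl_middleware`, `demo_app` and `call` are hypothetical names introduced here. It also reads the header with `.get()`, an added precaution so that a request without a User-Agent does not raise a KeyError.

```python
def make_anti_curl_middleware(app, blocked_prefix='curl'):
    """Wrap a WSGI app; reject requests whose User-Agent starts with blocked_prefix."""
    def middleware(environ, start_response):
        # .get() avoids a KeyError when the client sends no User-Agent header.
        user_agent = environ.get('HTTP_USER_AGENT', '')
        # startswith() compares text. `is` compares object identity, so it
        # would not match a full value such as 'curl/7.21.4 (...)'.
        if user_agent.startswith(blocked_prefix):
            body = b'Too many requests!'
            start_response('423 Locked', [
                ('Content-Type', 'text/plain'),
                ('Content-Length', str(len(body))),
            ])
            return [body]
        return app(environ, start_response)
    return middleware


def demo_app(environ, start_response):
    """Stand-in for the real application (hypothetical)."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']


def call(app, user_agent=None):
    """Invoke a WSGI app with a minimal environ and return (status, body)."""
    environ = {'REQUEST_METHOD': 'GET', 'PATH_INFO': '/'}
    if user_agent is not None:
        environ['HTTP_USER_AGENT'] = user_agent
    captured = {}

    def start_response(status, headers):
        captured['status'] = status

    body = b''.join(app(environ, start_response))
    return captured['status'], body


wrapped = make_anti_curl_middleware(demo_app)
print(call(wrapped, 'curl/7.21.4 (universal-apple-darwin11.0)'))
print(call(wrapped, 'Mozilla/5.0'))
```

In appengine_config.py the equivalent hook would be to return `make_anti_curl_middleware(app)` from `webapp_add_wsgi_middleware(app)`. Calling the wrapped app locally with a fake environ, as `call` does, makes it possible to check the blocking logic without installing curl.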

I hope that helps

- Kyle

Kate

unread,
Aug 2, 2012, 5:24:53 PM8/2/12
to google-a...@googlegroups.com

Thanks Kyle,

I don't have a system with access to curl and will download it for Windows later. Meanwhile I've put your code on my site.

Can you test it with curl for me? I tried testing for user agent starting with  Mozilla , from Firefox but it didn't work. 


Many thanks!!!!!!

Kate

Wilson MacGyver

unread,
Aug 2, 2012, 5:34:34 PM8/2/12
to google-a...@googlegroups.com
it's not working, I did curl 'http://www.australiansabroad.com/' and
it let me in



--
Omnem crede diem tibi diluxisse supremum.

Kyle Finley

unread,
Aug 2, 2012, 6:36:24 PM8/2/12
to google-a...@googlegroups.com
Same here. It let me in.

Did you copy this file to the root directory?


Maybe the instance needs to be restarted? Does anyone else have any ideas?

- Kyle

Kate

unread,
Aug 2, 2012, 6:48:25 PM8/2/12
to google-a...@googlegroups.com
Yes it is in the root directory. I am stumped! I didn't think it was working as I tried testing for different browsers and it didn't catch them.

Kate

unread,
Aug 2, 2012, 6:49:24 PM8/2/12
to google-a...@googlegroups.com
How do I restart the instance?

Kyle Finley

unread,
Aug 2, 2012, 6:55:37 PM8/2/12
to google-a...@googlegroups.com
> How do I restart the instance?

At appengine.google.com, in the Instances section, you should see a list of instances. Each one has a "Shutdown" button.

> Yes it is in the root directory. I am stumped! I didn't think it was working as I tried testing for different browsers and it didn't catch them.

I don't know, that's strange. It's working here:

And if you have included the file in your project, it should at the very least stop me from using curl to access your site.

Kyle Finley

unread,
Aug 2, 2012, 7:05:57 PM8/2/12
to google-a...@googlegroups.com
If the middleware doesn't block them: I believe the issue with CloudFlare has been resolved, so maybe you could test their service. They have a free tier (https://www.cloudflare.com/plans), and CloudFlare should block the requests before they reach your app, saving you from starting instances to handle the curl requests.

Kate

unread,
Aug 2, 2012, 8:39:10 PM8/2/12
to google-a...@googlegroups.com
I restarted the instance. I also got a copy of curl.exe for windows and it lets me through! :-(

My code is exactly this (below) and is in the file appengine_config.py in the root directory.

I also tested your example from my machine and got blocked.

I can't think of why this could be! I even altered appengine_config.py to have a syntax error and it picked it up, so I know it is loading.


from webob import Response

class AntiCurlMiddleware(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if environ['HTTP_USER_AGENT'].startswith('curl'):
            resp = Response('Too many requests!')
            resp.status = 423
            return resp(environ, start_response)
        return self.app(environ, start_response)

def webapp_add_wsgi_middleware(app):
    return AntiCurlMiddleware(app)

Kyle Finley

unread,
Aug 2, 2012, 10:34:58 PM8/2/12
to google-a...@googlegroups.com
That is strange. The only other thing I can think of is that the requests are not dynamic requests, so the middleware is not being called. Is the URL root being handled in the app.yaml? Something like:

- url: /
  static_files: index.html
  upload: index.html

Another thought: what framework are you using? webapp, webapp2, Flask? It shouldn't matter, but I'm kind of out of ideas.

Is the code on Github or somewhere I could take look? If you want you can send me the link off list.

- Kyle



Joshua Smith

unread,
Aug 3, 2012, 9:15:13 AM8/3/12
to google-a...@googlegroups.com
What might be helpful would be:

1. Add some logging. Up top:

import logging

then in the __call__ method:

    def __call__(self, environ, start_response):
        logging.info('__call__ sees UA: "%s"', environ['HTTP_USER_AGENT'])
        if environ['HTTP_USER_AGENT'].startswith('curl'):

2. Deploy that

3. Check your logs. If you are being hit with lots of inbound curl calls, you should get some clues right away. But you can also try hitting it with curl yourself.




Kyle Finley

unread,
Aug 3, 2012, 10:34:11 AM8/3/12
to google-a...@googlegroups.com
Hi Joshua, 

Thank you, that's a good thought. 

Kate sent me some files offline, and I believe we've figured out the problem. For the middleware to work you must be using WSGI, not CGI. Someone please correct me if I'm wrong, but I believe she would have to upgrade her app to python27 to use it. The alternative is to do the check in the webapp request handler:

def check_for_curl(self):
    if self.request.environ['HTTP_USER_AGENT'].startswith('curl'):
        return self.error(401)

class MainHandler(webapp.RequestHandler):
    def get(self):
        check_for_curl(self)
        # handle request

The problem is that webapp doesn't recognize error code 429 so we have to use something else. Unless there's a simple way to make it write 429?

- Kyle

Joshua Smith

unread,
Aug 3, 2012, 10:51:23 AM8/3/12
to google-a...@googlegroups.com
There are a couple of problems with your snippet.

First, she's getting HEAD not GET requests, so you need to use a different handler.

Also, you aren't returning, so if you were in a GET request, it would proceed to handle the request regardless.

Something more like this (untested):

class MainHandler(webapp.RequestHandler):
  def head(self):
    self.error(401)

  def get(self):
    if self.request.headers['User-Agent'].startswith('curl'):
      self.error(401)
      return
    # rest of the get handler


Kyle Finley

unread,
Aug 3, 2012, 11:02:26 AM8/3/12
to google-a...@googlegroups.com
Yes, thank you.  Do you have any thoughts on how to return error code 429?

Drake

unread,
Aug 3, 2012, 11:36:05 AM8/3/12
to google-a...@googlegroups.com

I think you change 401 in this code to 429

 


Kyle Finley

unread,
Aug 3, 2012, 11:42:42 AM8/3/12
to google-a...@googlegroups.com

> I think you change 401 in this code to 429

I wish it was that easy. webapp uses a dictionary to look up the status message for each code, and 429 didn't make the list.

Traceback (most recent call last):
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 701, in __call__
    handler.get(*groups)
  File "/Users/finley/dev/scotch/operation_curl_block/main.py", line 7, in get
    return self.error(429)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 435, in error
    self.response.set_status(code)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 279, in set_status
    message = Response.http_status_message(code)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 341, in http_status_message
    raise Error('Invalid HTTP status code: %d' % code)
Error: Invalid HTTP status code: 429
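
At the WSGI level the status is just a string passed to `start_response`, so code that answers before the framework is involved is not limited by the framework's table of known codes. The following is a minimal sketch under that assumption (it needs a WSGI runtime, not CGI); `EXTRA_REASONS`, `status_line` and `too_many_requests` are hypothetical names, and the `Retry-After` value is an arbitrary illustration. Whether the App Engine front end passes a 429 through unchanged is not confirmed anywhere in this thread.

```python
# Reason phrases for codes that a framework's lookup table may not include.
# 429 (Too Many Requests) is defined in RFC 6585.
EXTRA_REASONS = {429: 'Too Many Requests'}


def status_line(code, reasons=EXTRA_REASONS):
    """Build the WSGI status string, e.g. '429 Too Many Requests'."""
    return '%d %s' % (code, reasons.get(code, 'Unknown'))


def too_many_requests(environ, start_response, retry_after=3600):
    """Tiny WSGI app that always answers 429 with a Retry-After header."""
    body = b'Too many requests!'
    start_response(status_line(429), [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
        ('Retry-After', str(retry_after)),
    ])
    return [body]
```

A middleware could call `too_many_requests(environ, start_response)` for blocked clients and hand every other request to the wrapped application.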

Drake

unread,
Aug 3, 2012, 11:45:33 AM8/3/12
to google-a...@googlegroups.com

Ah, I hadn’t checked. I usually return a permission denied Error, or a Busy Error, 503 I think (sorry not at my desk)

 

 


Joshua Smith

unread,
Aug 3, 2012, 11:45:45 AM8/3/12
to google-a...@googlegroups.com
I would have thought self.error(429). That doesn't work? Is there a doc that says which codes we are allowed to return?

Kyle Finley

unread,
Aug 3, 2012, 11:52:09 AM8/3/12
to google-a...@googlegroups.com
@Brandon
Yes, 503 would probably be better than 401.

@Joshua
No 429 doesn't work. I don't know if the allowed return values are documented, but here's the source:

Kate

unread,
Aug 6, 2012, 2:26:17 PM8/6/12
to google-a...@googlegroups.com

@Kyle

I changed to 503  and didn't update my python.

Is this good or bad????

C:\inetpub>curl -v http://www.coolabah.com
* About to connect() to www.coolabah.com port 80 (#0)
*   Trying 205.178.189.131...
* connected
* Connected to www.coolabah.com (205.178.189.131) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.26.0
> Host: www.coolabah.com
> Accept: */*
>
< HTTP/1.1 302 Moved Temporarily
< Content-Length: 0
< Location: /?bee68f00
<
* Connection #0 to host www.coolabah.com left intact
* Closing connection #0


\inetpub>

Kate

unread,
Aug 6, 2012, 2:37:42 PM8/6/12
to google-a...@googlegroups.com
Oops, my error.

If I curl to http://aussieclouds.appspot.com or to http://www.australiansabroad.com it still gets through. I will update Python. coolabah.com points to australiansabroad.com and I thought they resolved to the same address. Apparently not.

Kate

unread,
Aug 6, 2012, 2:50:55 PM8/6/12
to google-a...@googlegroups.com
Just read this. Thanks.

Looks like it works now.

Thanks to all!!!!!

Kate

Kate

unread,
Aug 6, 2012, 2:58:30 PM8/6/12
to google-a...@googlegroups.com
But I still don't like them hitting my site every 500 ms! e.g.
    1. 2012-08-06 13:56:02.725 / 401 32ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:56:02.279 / 401 70ms 0kb curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.6.2 zlib/1.2.3 libidn/1.9 libssh2/1.2.4
    1. 2012-08-06 13:55:57.921 / 401 29ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:55:55.403 / 401 9ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:55:54.323 / 401 52ms 0kb curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.6.2 zlib/1.2.3 libidn/1.9 libssh2/1.2.4
    1. 2012-08-06 13:55:54.283 / 401 33ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:55:52.814 / 401 82ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:55:52.437 / 401 50ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:55:49.063 / 401 33ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:55:45.986 / 401 10ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:55:43.610 / 401 18ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:55:42.189 / 401 29ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:55:42.114 / 401 76ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:55:39.592 / 401 51ms 0kb curl/7.21.0 (x86_64-redhat-linux-gnu) libcurl/7.21.0 NSS/3.12.10.0 zlib/1.2.5 libidn/1.18 libssh2/1.2.4
    1. 2012-08-06 13:55:38.948 / 401 85ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
    1. 2012-08-06 13:55:38.945 / 401 130ms 0kb curl/7.19.7 (i386-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.6.2 zlib/1.2.3 libidn/1.9 libssh2/1.2.4
    1. 2012-08-06 13:55:38.944 / 401 123ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18

Kyle Finley

unread,
Aug 6, 2012, 3:43:22 PM8/6/12
to google-a...@googlegroups.com
Hopefully they will stop once they realize they're being blocked, if it's a legitimate service. If it is not, and they are reading this thread, they will probably just change their user agent, though. The only way to stop them from reaching your app entirely would be to use a service like CloudFlare. I have no personal experience with CloudFlare, however, so I cannot state definitively that it will solve your problem. And adding an additional layer can result in its own issues, as demonstrated by the August 1st CloudFlare block.

- Kyle



Joshua Smith

unread,
Aug 6, 2012, 3:53:46 PM8/6/12
to google-a...@googlegroups.com
You could have some fun with them. Instead of returning an error, you could redirect them someplace:

Replace self.error(401) with:

self.redirect("http://localhost/")

for example. curl does not follow redirects unless it is told to (for example with the -L option), so this may or may not have an effect.

If they are following redirects, you could set up a backend with a handler that just sleeps for 30 seconds. Then redirect them all to that one backend, where they can all sit in line waiting.


Kate

unread,
Aug 6, 2012, 5:24:06 PM8/6/12
to google-a...@googlegroups.com

I like it!!!!!!

Ernesto Oltra

unread,
Aug 6, 2012, 6:53:21 PM8/6/12
to google-a...@googlegroups.com
In fact that server already exists, blackhole.webpagetest.org ensures it will never answer to anything.

Kate

unread,
Aug 8, 2012, 1:11:28 PM8/8/12
to google-a...@googlegroups.com
It isn't stopping them. I am just not getting errors. What is troubling is that these curl requests are counting as hits, and there are so many (tens of thousands) that it is hard for me to analyze site traffic, as genuine requests are buried in the curl stats.

Kyle Finley

unread,
Aug 8, 2012, 1:24:36 PM8/8/12
to google-a...@googlegroups.com
In the admin logs - under options - you can filter by regular expression. Does that help?
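
The same idea can be applied offline to downloaded log lines. This is a rough sketch, not a description of the admin console's filter syntax: `CURL_UA` and `split_by_client` are hypothetical names, the first sample line is abridged from a log entry posted earlier in this thread, and the second (browser) line is invented for illustration.

```python
import re

# Matches a curl User-Agent token such as "curl/7.18.2".
# The \b keeps it from matching inside "libcurl/7.18.2".
CURL_UA = re.compile(r'\bcurl/\d')


def split_by_client(log_lines):
    """Return (curl_lines, other_lines) from an iterable of log lines."""
    curl_lines, other_lines = [], []
    for line in log_lines:
        if CURL_UA.search(line):
            curl_lines.append(line)
        else:
            other_lines.append(line)
    return curl_lines, other_lines


sample = [
    '2012-08-06 13:56:02.725 / 401 32ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2',
    # Hypothetical browser request, made up for this example.
    '2012-08-06 13:56:03.100 / 200 45ms 12kb Mozilla/5.0 (Windows NT 6.1)',
]
curl_lines, other_lines = split_by_client(sample)
print(len(curl_lines), len(other_lines))
```

Counting `other_lines` gives a rough picture of genuine traffic once the curl hits are set aside.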


Kate

unread,
Aug 8, 2012, 1:49:27 PM8/8/12
to google-a...@googlegroups.com
Thanks Kyle, it does help. But I still hate those nasty people! I opened a production issue with google a week ago so maybe they will help someday ...

Kate

unread,
Aug 8, 2012, 11:17:32 PM8/8/12
to google-a...@googlegroups.com
My site is now down as I'm over quota. I can't turn billing on as it is too expensive to pay for these DoS attacks.

Thanks everyone for being helpful but I think I'm beaten on this. It seems a pity that a non-profit site could be brought down by this, but that's the case.
Google doesn't seem to care, as there has been no response on my production issue. I suppose it isn't in their interest, as I either pay for the attacks or lose my site. Very discouraging.

All attempts at blocking the attacks have only increased their volume.

Kate

Kyle Finley

unread,
Aug 9, 2012, 1:22:16 AM8/9/12
to google-a...@googlegroups.com
Kate,

Sorry to hear that. So CloudFlare.com wasn't able to block it?

- Kyle

sergey

unread,
Aug 9, 2012, 3:43:03 AM8/9/12
to google-a...@googlegroups.com
Can you show what you have in log for curl requests?

Kate

unread,
Aug 9, 2012, 7:59:57 AM8/9/12
to google-a...@googlegroups.com
Hi Sergey,

Here is a typical example
2012-08-09 06:51:16.597 / 302 30ms 0kb curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18
202.125.215.12 - - [09/Aug/2012:04:51:16 -0700] "HEAD / HTTP/1.1" 302 153 - "curl/7.18.2 (i386-redhat-linux-gnu) libcurl/7.18.2 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18" "aussieclouds.appspot.com" ms=31 cpu_ms=0 api_cpu_ms=0 cpm_usd=0.000049 instance=00c61b117c2f994812ed63184c9c5544dea738

But the IP address varies. My code forces a 302 response. Before I added the code they were throwing errors: HEAD method not found. But even though I am returning the 302 I am still getting front-end time exceeded, and these requests are taking up about 95% of my quota. So to keep the site alive I would have to pay for them. I have lost most of my European and Australian visitors because the site is down every night during those places' daylight hours. Obviously I can't continue like this, so I will have to move to a provider capable of blocking these requests.


Kate

unread,
Aug 9, 2012, 8:01:49 AM8/9/12
to google-a...@googlegroups.com
No, CloudFlare requires DNS info which I do not have.

I made a CloudFlare account but it can't resolve my domain name, www.australiansabroad.com, as I don't have a DNS entry at networksolutions.com, where I register my sites. I have a special entry that Google resolves. And if I put in my appspot site name, CloudFlare says it cannot accept that.


Barry Hunter

unread,
Aug 9, 2012, 8:16:57 AM8/9/12
to google-a...@googlegroups.com
Another thing, have you tried contacting PlanetLab themselves? (the ips you've posted so far have come from them) 

Researchers using the PlanetLab network are bound by an Acceptable Use Policy which forbids malicious or disruptive behavior. Additionally, all PlanetLab nodes are secured and actively managed by the PlanetLab Operations team.

If you are unable to determine the source of the traffic, please contact PlanetLab Support (sup...@planet-lab.org). Feel free to direct any additional concerns or questions about PlanetLab to this address.

alex

unread,
Aug 9, 2012, 9:55:29 AM8/9/12
to google-a...@googlegroups.com
Kate,

If barryhunter is right and all the IPs are coming from the same ISP anyway, you can simply block the whole subnetwork ranges of that ISP (at least temporarily) using dos.yaml:

It'll be a pain in the ass updating the file every time you encounter new subnets, but at least you could probably save some quota till you move somewhere else or figure something out.

-- alex

Kate

unread,
Aug 9, 2012, 12:13:14 PM8/9/12
to google-a...@googlegroups.com
They are not coming from the same IP. They are mostly in Europe but there are no subnets.

There are hundreds of them and google only lets you block 100.

eg
132.65.240.100 
133.15.59.2
193.136.19.13
139.165.12.211
193.166.167.5
141.219.252.133
200.17.202.195
195.130.124.1
193.1.201.27
138.48.3.202
136.159.220.40
138.251.214.78


all these and more within a minute.

They are all different.

Kate

Joshua Smith

unread,
Aug 9, 2012, 1:03:49 PM8/9/12
to google-a...@googlegroups.com
If you put those into an IP lookup utility, you'll find that they are actually all "planetlab" related.
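If you would rather script the lookup than paste addresses into a web tool, here is a rough sketch using reverse DNS from the Python standard library. The function names are made up for illustration, and matching on "planetlab" or "planet-lab" in the hostname is an assumption about how those nodes are named; an address with no PTR record simply comes back as None.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the reverse-DNS hostname for ip, or None if there is none."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        return None
    return hostname


def looks_like_planetlab(hostname):
    """Guess from the hostname alone (assumed naming pattern)."""
    if hostname is None:
        return False
    name = hostname.lower()
    return "planetlab" in name or "planet-lab" in name
```

Calling reverse_lookup on each address from the logs and passing the result to looks_like_planetlab gives a quick first pass. A whois query is still the more reliable way to find out who owns a block.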

I believe that what is happening is you are being DOS'd by a botnet created by these guys: http://www.planet-lab.org

To report a suspected violation of this policy, contact PlanetLab Support (sup...@planet-lab.org).

If that doesn't stop it, sue them. Call the FBI. Contact everyone on the steering committee: http://www.planet-lab.org/consortium


Michael Hermus

unread,
Aug 9, 2012, 1:25:23 PM8/9/12
to google-a...@googlegroups.com
Planet Lab seems to be a platform that supports a large number of third party projects. My guess would be that it isn't actually the people at Planet Lab themselves, but rather some user that created a bot which is running amok (either deliberately or by mistake).

Regardless, Joshua is right: you should contact them ASAP and let them know what is happening.

alex

unread,
Aug 9, 2012, 2:37:48 PM8/9/12
to google-a...@googlegroups.com
well, while you're notifying planetlab and whatnot you could create and upload a dos.yaml for the time being with content similar to this:

blacklist:
- subnet: 132.65.0.0/16
- subnet: 133.0.0.0/8
  description: somewhere in Japan
- subnet: 136.159.0.0/16
- subnet: 138.250.0.0/15
- subnet: 138.48.0.0/16
- subnet: 139.165.0.0/16
- subnet: 193.1.0.0/16
  description: planetlab
- subnet: 193.136.16.0/24
- subnet: 200.17.192.0/19


- put that file in your app root dir and do something like this from a terminal:

"appcfg.py update_dos ."

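A file like that can also be generated from the addresses in the logs rather than typed by hand. The sketch below is one possible way to do it with Python 3's ipaddress module, run locally and not inside the app. The function name to_blacklist_yaml and the default /16 width are assumptions: real allocations are often larger or smaller, so check the important ones with whois before blocking.

```python
import ipaddress


def to_blacklist_yaml(ips, prefix_len=16):
    """Build dos.yaml text with one subnet entry per distinct network.

    prefix_len=16 is only a guess at the block size, not a looked-up value.
    """
    networks = {
        # strict=False masks off the host bits: 132.65.240.100 -> 132.65.0.0/16
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for ip in ips
    }
    lines = ["blacklist:"]
    for net in sorted(networks):
        lines.append(f"- subnet: {net}")
    return "\n".join(lines) + "\n"
```

Writing the returned text to dos.yaml in the app root and running the update_dos command above would then upload it. Addresses that share a network collapse into a single entry, which helps with the 100-entry limit.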
Kate

unread,
Aug 9, 2012, 4:10:00 PM8/9/12
to google-a...@googlegroups.com
Thanks,

I emailed support. Tried to look up their steering committee but their database is down. Will let you all know what I hear back!




Kate

unread,
Aug 9, 2012, 4:21:03 PM8/9/12
to google-a...@googlegroups.com

I did that but don't think it will catch them all.

I had a list of 100 individual ones, see below. Can you recognize subnets in them?
blacklist:
- subnet: 82.179.176.44
- subnet: 83.230.127.124
- subnet: 88.2.234.60
- subnet: 128.10.19.53
- subnet: 128.84.154.45
- subnet: 128.42.142.44
- subnet: 128.36.233.154
- subnet: 128.111.52.59
- subnet: 128.114.63.64
- subnet: 128.138.207.45
- subnet: 128.227.150.12
- subnet: 128.208.4.198
- subnet: 128.151.65.101
- subnet: 128.223.8.112
- subnet: 128.232.103.201
- subnet: 129.10.120.194
- subnet: 129.15.78.30
- subnet: 129.74.74.20
- subnet: 129.82.12.188
- subnet: 129.93.229.139
- subnet: 129.97.74.14
- subnet: 129.108.202.11
- subnet: 129.130.252.140
- subnet: 129.237.161.194
- subnet: 130.37.193.143
- subnet: 130.83.166.243
- subnet: 130.104.72.213
- subnet: 130.216.1.22
- subnet: 130.253.21.123
- subnet: 130.195.4.68
- subnet: 130.237.50.125
- subnet: 131.179.150.72
- subnet: 131.188.44.102
- subnet: 132.65.240.100
- subnet: 132.72.23.10
- subnet: 132.170.3.32
- subnet: 132.181.10.56
- subnet: 133.1.74.163
- subnet: 133.15.59.2
- subnet: 133.68.253.242
- subnet: 134.151.255.181
- subnet: 138.4.0.120
- subnet: 138.251.214.78
- subnet: 136.159.220.40
- subnet: 138.48.3.202
- subnet: 139.78.141.245
- subnet: 139.165.12.211
- subnet: 140.109.17.181
- subnet: 140.112.107.82
- subnet: 140.123.230.248
- subnet: 141.11.0.162
- subnet: 141.20.103.211
- subnet: 141.219.252.133
- subnet: 142.103.2.2
- subnet: 143.89.49.73
- subnet: 143.225.229.238
- subnet: 155.245.47.225
- subnet: 155.246.12.163
- subnet: 156.56.250.226
- subnet: 156.62.231.244
- subnet: 157.92.44.101
- subnet: 160.80.221.39
- subnet: 161.106.240.19
- subnet: 169.226.40.2
- subnet: 169.229.50.15
- subnet: 165.91.55.8
- subnet: 165.230.49.115
- subnet: 192.1.249.138
- subnet: 192.12.33.102
- subnet: 192.16.125.11
- subnet: 192.38.109.144
- subnet: 192.41.135.219
- subnet: 192.42.43.23
- subnet: 192.107.171.145
- subnet: 193.1.201.27
- subnet: 193.138.2.13
- subnet: 193.196.39.9
- subnet: 193.167.187.185
- subnet: 193.136.19.13
- subnet: 193.166.167.5
- subnet: 193.205.215.74
- subnet: 193.226.19.31
- subnet: 194.29.178.13
- subnet: 194.167.254.19
- subnet: 194.254.215.12
- subnet: 195.130.124.1
- subnet: 198.82.160.221
- subnet: 200.0.206.137
- subnet: 200.0.206.168
- subnet: 200.17.202.195
- subnet: 200.129.132.19
- subnet: 202.23.159.52
- subnet: 202.125.215.12
- subnet: 202.237.248.222
- subnet: 202.249.37.67
- subnet: 203.110.240.190
- subnet: 203.178.133.2
- subnet: 212.51.218.235
- subnet: 213.73.40.106
- subnet: 213.131.1.101

Thanks
Kate

alex

unread,
Aug 9, 2012, 5:32:03 PM8/9/12
to google-a...@googlegroups.com
A lot of those IPs are assigned to universities, like almost literally all of them. More than 50% are US universities. This really looks like a big distributed bot network to me.

Anyway, here you go (some IPs are from the same net block so there are less than 100 entries):


- subnet: 82.179.176.0/20
- subnet: 83.230.96.0/19
- subnet: 88.2.0.0/16
- subnet: 128.10.0.0/16
- subnet: 128.84.0.0/16
- subnet: 128.42.0.0/16
- subnet: 128.36.0.0/16
- subnet: 128.111.0.0/16
- subnet: 128.114.0.0/16
- subnet: 128.138.0.0/16
- subnet: 128.227.0.0/16
- subnet: 128.208.0.0/16
- subnet: 128.151.0.0/16
- subnet: 128.223.0.0/16
- subnet: 128.232.0.0/16
- subnet: 129.10.0.0/16
- subnet: 129.15.0.0/16
- subnet: 129.74.0.0/16
- subnet: 129.82.0.0/16
- subnet: 129.93.0.0/16
- subnet: 129.97.0.0/16
- subnet: 129.108.0.0/16
- subnet: 129.130.0.0/16
- subnet: 129.237.0.0/16
- subnet: 130.37.0.0/16
- subnet: 130.83.0.0/16
- subnet: 130.104.0.0/16
- subnet: 130.216.0.0/16
- subnet: 130.253.0.0/16
- subnet: 130.195.4.0/24
- subnet: 130.237.0.0/18
- subnet: 131.179.0.0/16
- subnet: 131.188.0.0/16
- subnet: 132.64.0.0/13
- subnet: 132.72.0.0/14
- subnet: 134.151.0.0/16
- subnet: 138.4.0.0/16
- subnet: 138.250.0.0/15
- subnet: 138.48.0.0/16
- subnet: 139.165.0.0/16
- subnet: 140.109.0.0/16
- subnet: 143.225.0.0/16
- subnet: 155.245.0.0/16
- subnet: 160.80.0.0/16
- subnet: 161.106.0.0/16
- subnet: 192.16.124.0/22
- subnet: 192.38.0.0/17
- subnet: 192.41.132.0/22
- subnet: 192.42.42.0/23
- subnet: 193.1.0.0/16
- subnet: 193.138.2.0/24
- subnet: 193.196.0.0/15
- subnet: 193.166.0.0/15
- subnet: 193.136.0.0/15
- subnet: 193.204.0.0/15
- subnet: 193.226.0.0/19
- subnet: 194.29.176.0/22
- subnet: 194.167.0.0/16
- subnet: 194.254.0.0/16
- subnet: 212.51.208.0/20
- subnet: 213.73.32.0/19
- subnet: 213.131.0.0/19
- subnet: 136.159.0.0/16
- subnet: 132.170.0.0/16
- subnet: 132.181.0.0/16
- subnet: 133.0.0.0/8
- subnet: 203.0.0.0/8
- subnet: 198.82.0.0/16
- subnet: 200.0.0.0/8
- subnet: 140.112.0.0/12
- subnet: 140.123.0.0/16
- subnet: 142.103.0.0/16
- subnet: 143.89.0.0/16
- subnet: 139.78.0.0/16
- subnet: 155.246.0.0/16
- subnet: 156.56.0.0/16
- subnet: 156.62.0.0/16
- subnet: 157.92.0.0/16
- subnet: 169.226.0.0/16
- subnet: 165.91.0.0/16
- subnet: 165.230.0.0/16
- subnet: 192.1.0.0/16
- subnet: 192.12.33.0/24
- subnet: 169.229.0.0/16
- subnet: 141.219.0.0/16
# these belong to a too big block
- subnet: 141.11.0.162
- subnet: 141.20.103.211



alex

unread,
Aug 9, 2012, 5:37:41 PM8/9/12
to google-a...@googlegroups.com
Actually, scratch last three lines (starting from # these belong to...) and replace with

- subnet: 141.0.0.0/8
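With a hand-built list it is easy to end up with entries that overlap, and each one uses up a slot out of the 100. For example, 141.0.0.0/8 already covers 141.219.0.0/16. A rough way to clean that up, assuming Python 3 is available locally, is ipaddress.collapse_addresses from the standard library; the helper name collapse is made up for this sketch.

```python
import ipaddress

MAX_ENTRIES = 100  # blacklist limit mentioned earlier in the thread


def collapse(subnets):
    """Drop duplicates and entries already covered by a wider subnet."""
    # A bare address such as "141.11.0.162" is treated as a /32 network.
    networks = [ipaddress.ip_network(s.strip()) for s in subnets]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


if __name__ == "__main__":
    entries = ["141.219.0.0/16", "141.0.0.0/8", "141.11.0.162"]
    merged = collapse(entries)
    print(merged, len(merged) <= MAX_ENTRIES)
```

collapse_addresses also merges two neighbouring blocks when together they form exactly one larger block, so the result can be shorter than a plain de-duplication would give.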

Kate

unread,
Aug 9, 2012, 5:47:55 PM8/9/12
to google-a...@googlegroups.com
Terrific Alex!!!! Thank you!!!!

Kate

unread,
Aug 9, 2012, 10:10:24 PM8/9/12
to google-a...@googlegroups.com
Done,

Actually I have some more. Can you "subnet" any of these? all planet-lab ones

- subnet: 134.117.226.181
- subnet: 147.229.10.250
- subnet: 206.23.240.29
- subnet: 160.193.163.106
- subnet: 138.246.99.249
- subnet: 129.242.19.197
- subnet: 206.12.16.154
- subnet: 160.193.163.106
- subnet: 152.3.138.6
- subnet: 195.37.16.121

I am up to 100 now because of these individual ones.

alex

unread,
Aug 10, 2012, 4:41:06 AM8/10/12
to google-a...@googlegroups.com
- subnet: 134.117.0.0/16
  description: Carleton University
- subnet: 147.228.0.0/14
  description: Brno University of Technology
- subnet: 206.23.0.0/16
  description: Tennessee Board of Regents
- subnet: 160.193.0.0/16
  description: Osaka City University
- subnet: 138.246.0.0/16
  description: Ludwig-Maximilians-Universitaet Muenchen
- subnet: 129.242.0.0/16
  description: University of Tromso
- subnet: 206.12.0.0/16
  description: BCnet Vancouver
- subnet: 152.3.0.0/16
  description: Duke University
- subnet: 195.37.0.0/16
  description: Extranet der Universitaet Passau

Kate

unread,
Aug 10, 2012, 1:37:33 PM8/10/12
to google-a...@googlegroups.com
Thanks! I have used up my 100 entries and haven't got all of them!

Haven't heard back from planet-lab.org!

Will keep you informed!

Kate

unread,
Aug 10, 2012, 7:13:26 PM8/10/12
to google-a...@googlegroups.com
Heard back from planet-lab:

See below. However I don't buy the "We ensure that we only send active probes to prefixes
that receive traffic from PlanetLab, and we probe every prefix at most
once every 5 minutes if the prefix is reachable and at most three
times in a 5 minute period if we do not receive responses to our
probes."
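One way to check that claim against the request logs is to count requests per 5-minute window and see how many windows go over the stated maximum of three. This is only a sketch: it assumes the log timestamps have already been extracted as epoch seconds, it uses fixed windows (not sliding ones), and the function names are made up.

```python
from collections import Counter

WINDOW_SECONDS = 5 * 60  # the 5 minute period from the quoted description


def requests_per_window(timestamps, window=WINDOW_SECONDS):
    """Count requests in each fixed window, keyed by window index."""
    return Counter(int(ts) // window for ts in timestamps)


def windows_over_limit(timestamps, limit=3, window=WINDOW_SECONDS):
    """Return {window start time: request count} for windows above limit."""
    counts = requests_per_window(timestamps, window)
    return {bucket * window: n for bucket, n in counts.items() if n > limit}
```

If the result is non-empty for the PlanetLab addresses, the traffic is heavier than the description allows.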

Hi Kathleen,

Apologies for not resolving this sooner.

We believe we found a likely source behind the traffic you've been
referring to. As you may know, PlanetLab is a distributed systems
research test bed with 1000+ machines world wide. These machines may
share access to both research, local and public Internet. These
services are actively managed by researchers granted access to
PlanetLab accounts.

Since your site is hosted by Google, the IP addresses that you use are
not unique to you, but are shared among many Google hosted services.
Many experiments on PlanetLab nodes sent significant volume of
legitimate traffic to these IP addresses and finding the subset of
this traffic that corresponds to your service is a bit more involved.

We have however identified a likely experiment that is responsible, it
is ucr_web slice, run by researchers at University of California,
Riverside, who are cc'ed on this email. The researchers provide a
description of their work as:

"""
This slice is being used to perform measurements to detect outages on
paths on which traffic is served from PlanetLab. We passively observe
traffic outgoing from PlanetLab to see which prefixes are receiving
TCP traffic from PlanetLab, and then use a combination of passive
monitoring and active probing to detect outages on paths to these
prefixes. We ensure that we only send active probes to prefixes
that receive traffic from PlanetLab, and we probe every prefix at most
once every 5 minutes if the prefix is reachable and at most three
times in a 5 minute period if we do not receive responses to our
probes.

Drake

unread,
Aug 10, 2012, 7:20:16 PM8/10/12
to google-a...@googlegroups.com

If they didn’t follow your robots.txt and you know who they are, send a legal order, sue for damages and wipe them off the map.

 
