Limits or throttles asking for full hashes?

Jim Idle

Mar 6, 2013, 3:36:13 AM
to google-safe-...@googlegroups.com
I have written a full service for checking URLs, domains and so on, and GSB is one aspect of checking URLs. Now that I am testing it, I am running my service against a list of many thousands of known bad URLs. This generates quite a lot of hits on the hash prefixes, which means that unless I have cached the full hashes for a prefix within the last 30 minutes, I have to ask the Google service for them. Because a lot of these URLs are in the database, I am generating lots of full hash requests.
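For readers following along, the 30-minute caching described above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: `FullHashCache`, `lookup_full_hashes`, and the `fetch` callable are hypothetical names, and the 30-minute lifetime is taken from the post rather than from any specification.

```python
import time


class FullHashCache:
    """In-memory cache of full hashes keyed by hash prefix (hypothetical helper)."""

    def __init__(self, ttl_seconds=30 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds      # 30 minutes, as mentioned in the post
        self._clock = clock          # injectable so the cache can be tested
        self._entries = {}           # prefix -> (expires_at, full_hashes)

    def get(self, prefix):
        """Return the cached full hashes, or None if missing or expired."""
        entry = self._entries.get(prefix)
        if entry is None:
            return None
        expires_at, full_hashes = entry
        if self._clock() >= expires_at:
            del self._entries[prefix]
            return None
        return full_hashes

    def put(self, prefix, full_hashes):
        """Store the full hashes for a prefix with a fresh expiry time."""
        self._entries[prefix] = (self._clock() + self._ttl, frozenset(full_hashes))


def lookup_full_hashes(prefix, cache, fetch):
    """Use the cache when possible; otherwise call fetch(prefix) once and cache it.

    `fetch` stands in for whatever code performs the real full hash request.
    """
    cached = cache.get(prefix)
    if cached is not None:
        return cached
    full_hashes = frozenset(fetch(prefix))
    cache.put(prefix, full_hashes)
    return full_hashes
```

With a cache like this, a list of known bad URLs that share prefixes only costs one full hash request per prefix per cache lifetime, but a list with thousands of distinct prefixes still produces thousands of requests, which is the situation described here.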

I am also trying to load test my service, so I am running the checks as fast as I can deliver them (and I have not even got to multiple processes yet). However, once I hit about 500 full hash requests (which happens pretty quickly in this case, of course), the call always returns 503 Service Unavailable, at least for a while.

So, I know that we are supposed to back off on requesting updates for the hashes and so on if a 503 occurs, and that is all implemented. But if the request for full hashes is going to fail, and we are supposed to back off from asking for more for minutes or hours, then it is impossible to build a real-life system around GSB. It can work for a few users, where of course most of the requests will not involve malicious URLs, but I am eventually looking at lots of servers, and they will all appear to come from one IP address behind a NAT server.
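The kind of back-off being discussed can be sketched as below. This is a generic exponential back-off with jitter, not the schedule required by the Safe Browsing protocol: the base delay, the cap, the jitter range, and the names `BackoffState` and `request_with_backoff` are all assumptions made for illustration, as is treating every non-200 status as a failure.

```python
import random


class BackoffState:
    """Tracks consecutive failures and when the next request may be sent."""

    def __init__(self, base_seconds=60.0, max_seconds=8 * 3600.0, rng=random.random):
        self._base = base_seconds    # assumed first delay, not a protocol value
        self._max = max_seconds      # assumed cap, not a protocol value
        self._rng = rng              # injectable so the jitter can be tested
        self.failures = 0
        self.next_allowed = 0.0

    def can_send(self, now):
        return now >= self.next_allowed

    def record_success(self):
        self.failures = 0
        self.next_allowed = 0.0

    def record_failure(self, now):
        """Double the delay on each consecutive failure, up to the cap."""
        self.failures += 1
        delay = min(self._max, self._base * 2 ** (self.failures - 1))
        delay *= 0.5 + 0.5 * self._rng()   # jitter: 50% to 100% of the delay
        self.next_allowed = now + delay
        return delay


def request_with_backoff(send, state, now):
    """Call send() unless backing off; send() must return (status, body).

    Returns the body on HTTP 200, or None if the request was skipped or failed.
    """
    if not state.can_send(now):
        return None
    status, body = send()
    if status != 200:                      # includes the 503 seen in this thread
        state.record_failure(now)
        return None
    state.record_success()
    return body
```

During a back-off window every lookup that needs a full hash goes unanswered, which is why throttling at this level is a problem for a high-volume service.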

So, assuming that I am getting the 503 error because it thinks I am making too many requests, is there any way for me to request that the full hash requests are not throttled? I know that some people probably abuse it and go through all the prefixes asking for all the full hashes immediately, but I am definitely not doing that.

It seems that there is an ability to request more usage for the query API, but I don't see anything about the Protocol v2 system. Perhaps this limit was just not thought through and I am the first person to be load testing an array of servers with a list of URLs that generate many hits?

Thanks for any help,

Jim

I will email this to the usage request email for the lookup API, in case it is the same person that would deal with this :)

Jim Idle

Mar 8, 2013, 1:26:18 AM
to google-safe-...@googlegroups.com
Perhaps this isn't the place to ask questions of the Google staff. If not, does anyone have a clue where I can get an answer to this question?

Thanks for any pointers,

Jim

Stefan Keller

Mar 8, 2013, 1:39:24 AM
to google-safe-...@googlegroups.com
Hi,

I am not a Google guy, but https://developers.google.com/safe-browsing/lookup_guide says

"We will limit the number of URLs queried in a single POST request to be 500, which we believe is sufficient for most API users. We will also limit the number of requests that can be made with a single API key in a 24-hour period. If you expect to make more than 10,000 requests per day, you must contact us to have your API key provisioned for additional users. At the present time there is no cost for this; we want to make sure that we have contact information for any large users that may potentially affect the service and its availability. For further questions about large deployments, contact us by sending email to antiphish-ma...@google.com."
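As a rough illustration of the 500-URL limit quoted above, a client could split its URL list into batches before posting. This is only a sketch: `chunk_urls` is a hypothetical helper, and actually sending each batch to the Lookup API is left out.

```python
def chunk_urls(urls, max_per_request=500):
    """Yield lists of at most max_per_request URLs, preserving order."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    batch = []
    for url in urls:
        batch.append(url)
        if len(batch) == max_per_request:
            yield batch
            batch = []
    if batch:            # final, partially filled batch
        yield batch
```

Each batch would count as one request against the daily quota mentioned in the guide, so batching also reduces how quickly the 10,000-request threshold is reached.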

Would that help?

Best regards

Stefan

Jim Idle

Mar 8, 2013, 11:21:23 PM
to google-safe-...@googlegroups.com

Thanks Stefan,

But that concerns the Lookup API and not the v2 API. I have already sent an email to the address given there, though. Thanks for taking the time to answer this,

Jim

Garrett Casto

Mar 11, 2013, 2:59:04 PM
to google-safe-...@googlegroups.com
The same alias applies to both APIs. See https://developers.google.com/safe-browsing/developers_guide_v2#Overview. I'll respond to your request in a bit. Apologies for the delay.

Jim Idle

Mar 11, 2013, 8:52:39 PM
to google-safe-...@googlegroups.com
Ok Garrett,

Thanks for the reply. 

Jim

Garrett Casto

Apr 3, 2013, 1:36:38 PM
to google-safe-...@googlegroups.com
I agree with the proxying part, but please don't use multiple API keys for the same service. We use these to track misbehaving clients, and that is much harder to do if your traffic is split over multiple keys. The e-mail alias is set up explicitly so that you can ask us to relax our throttling restrictions on an API key if you are a service that sends more traffic than usual.


On Fri, Mar 22, 2013 at 5:23 PM, Artur Daci <solar.no...@gmail.com> wrote:
Until you find a permanent solution, I would suggest proxying the queries from your server and balancing the load across, let's say, 10 API keys. Quick and easy.

Jim Idle

Apr 3, 2013, 6:26:52 PM
to google-safe-...@googlegroups.com
Did you receive my reply, Garrett? We already proxy the requests and use one API key and a fixed source IP address.

Garrett Casto

Apr 3, 2013, 6:39:07 PM
to google-safe-...@googlegroups.com
I responded to your e-mail about a week ago and haven't seen a response. We should take the rest of this discussion off-list; I just wanted to mention to a larger audience not to use multiple keys.