URL redirectors being flagged as malware even while using Safe Browsing API

Owen Allen

Feb 7, 2017, 7:31:00 PM
to Google Safe Browsing API
For about six months we have been having periodic issues with Chrome's malware blocking screen. We manage roughly 110 websites for our clients, and each one has a URL that we route outbound links through as a means of tracking clicks. The problem is that if one of those links points to a malware site, the entire route ends up blocked as malware and all outgoing URLs are blocked with it. To correct this, back in October we instituted a system where ALL outgoing URLs are checked server-side against the Google Safe Browsing Go package. In the browser the anchor tag is written as /plugins/crm/count/?encodedString (the encoding guards against open-redirect attacks); on the server we decode the string, check it against Safe Browsing, and redirect the user to the destination site only if it comes back clean.
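
For reference, here is a stripped-down sketch of what that handler looks like conceptually, using the Go package (github.com/google/safebrowsing). The base64 encoding, route, port, and database path below are illustrative placeholders rather than our exact code:

    package main

    import (
        "encoding/base64"
        "log"
        "net/http"

        "github.com/google/safebrowsing"
    )

    var sb *safebrowsing.SafeBrowser

    // decodeTarget stands in for however the destination URL is encoded into
    // the query string (encoding it is what guards against open-redirect abuse).
    func decodeTarget(raw string) (string, error) {
        b, err := base64.URLEncoding.DecodeString(raw)
        return string(b), err
    }

    func redirectHandler(w http.ResponseWriter, r *http.Request) {
        dest, err := decodeTarget(r.URL.RawQuery)
        if err != nil {
            http.Error(w, "bad request", http.StatusBadRequest)
            return
        }
        // Check the decoded destination against the locally cached threat lists.
        threats, err := sb.LookupURLs([]string{dest})
        if err != nil || len(threats[0]) > 0 {
            // Flagged, or the lookup failed: refuse to redirect.
            http.Error(w, "destination blocked", http.StatusForbidden)
            return
        }
        // Clean according to our copy of the database: issue the 301.
        http.Redirect(w, r, dest, http.StatusMovedPermanently)
    }

    func main() {
        var err error
        sb, err = safebrowsing.NewSafeBrowser(safebrowsing.Config{
            APIKey: "YOUR_API_KEY",
            DBPath: "/tmp/safebrowsing.db", // local threat-list cache, refreshed periodically
        })
        if err != nil {
            log.Fatal(err)
        }
        defer sb.Close()
        http.HandleFunc("/plugins/crm/count/", redirectHandler)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }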

It worked pretty well for the past month or two, with no sites affected by the blocking. Every now and then a redirected site would get flagged and we would remove it from our database, but none of them caused a cascading failure that blocked the whole /plugins/crm/count/ route. Now we've had 2 sites hit by cascading blocking in the past week. The odd thing about both cases is that our server-side analytics show we redirected exactly two, yes, two real users to the malware site during the window in which it may have been infected. This means we were flagged merely for having a URL that 301s to another site, even though that 301 went through the Safe Browsing check. Unfortunately Google has been very quiet about how this whole system works, which leaves site operators like myself very frustrated. We want to comply with the rules and we will build whatever systems are necessary, but can you at least provide some guidance on what we can do to prevent getting blocked?

So here are my questions:

1. It appears to me, based on the evidence I have (because Google refuses to provide documentation on this subject!), that Google blocks sites that link to a site with malware, even if the linking site isn't actively sending traffic through those links and even if those links use the Google Safe Browsing API to prevent redirects should the destination become infected in the future. If this is true, then honestly, what is the point of the Safe Browsing API in the first place? The pattern we use is exactly the same pattern Yelp and TripAdvisor use for linking to external sites, yet they don't seem to get their entire pathway blocked. Why?

2. The timing of how the whole system works is shrouded in mystery. One of our sites showed up in Google Search Console as infected on 2/3/2017. If it shows up in Search Console at that time, does that mean it was flagged as malware around that same time? According to our analytics, only two users went through to the URL in the 5 days before 2/3/2017. Does that mean sending through 2 clicks is enough to flag an entire pathway? It is also possible that the destination was flagged as malware months ago, and had been malware for quite some time, and only within the past week did Google escalate it to Search Console. If so, that would indeed be frustrating. If Search Console is the method of notifying admins of the issue, it would make sense to notify admins as soon as the problem is known so we can correct it before our URLs are blocked as malware. Is there any documentation on how long after something is flagged it shows up in Search Console, and how long before it, and the sites linking to it, get blocked?

3. Why is it that sometimes only individual URLs (/plugins/crm/count/?specificUrl) get blocked, while other times /plugins/crm/count/ is blocked in its entirety? The cases where the entire pathway is blocked do not seem to involve either A) a high volume of traffic or B) a high number of individual infected URLs. In every case of pathway-wide blocking we had only 1 corrupt URL out of thousands, and the cases where only a single URL was blocked followed the same pattern. What causes one failure to cascade upstream to a whole pathway while another does not?

4. Is there a best practice we can adopt to solve this problem a different way, one that would insulate us from this issue? What we are doing is no different from what Yelp does with its biz_redir pathway; see https://www.yelp.com/biz/sak%C3%A9-brisbane-2?osq=Restaurants and mouse over the restaurant's website URL.

Example Client:

https://www.ontariossouthwest.com/listing/windsor-essex-barrels-bottles-%26-brews-trail/1982/ - Clicking Visit Website goes through the /plugins/crm/count/ pathway (currently blocked, but waiting for review).
https://www.visitbatonrouge.com/listing/hampton-inn-%26-suites-baton-rouge-downtown/1206/ - Clicking Visit Website goes through the /plugins/crm/count/ pathway.

Any help is appreciated.

Niti Singh

Mar 27, 2017, 4:02:27 PM
to Google Safe Browsing API
Hi Owen,
I am so glad I found your post.

I implemented click tracking very similar to what you did. We have links all over our releases and we track them through a tracking site that I built. Google Safe Browsing was blocking us every now and then, so I decided to implement the Google Safe Browsing API and check every URL before we redirect, to make sure we never redirect to a bad site. It worked fine for more than 10 months. On Friday we got blocked again. I checked everything on my end and confirmed it is all working as expected.
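
For context, my pre-redirect check is conceptually along the lines of the sketch below, which calls the v4 threatMatches:find Lookup endpoint. The client ID, threat lists, and function names are placeholders, not my exact code:

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "net/http"
    )

    type threatEntry struct {
        URL string `json:"url"`
    }

    type lookupRequest struct {
        Client struct {
            ClientID      string `json:"clientId"`
            ClientVersion string `json:"clientVersion"`
        } `json:"client"`
        ThreatInfo struct {
            ThreatTypes      []string      `json:"threatTypes"`
            PlatformTypes    []string      `json:"platformTypes"`
            ThreatEntryTypes []string      `json:"threatEntryTypes"`
            ThreatEntries    []threatEntry `json:"threatEntries"`
        } `json:"threatInfo"`
    }

    // isFlagged asks the Lookup API whether a single URL matches any threat list.
    func isFlagged(apiKey, url string) (bool, error) {
        var req lookupRequest
        req.Client.ClientID = "example-click-tracker" // placeholder
        req.Client.ClientVersion = "1.0"
        req.ThreatInfo.ThreatTypes = []string{"MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"}
        req.ThreatInfo.PlatformTypes = []string{"ANY_PLATFORM"}
        req.ThreatInfo.ThreatEntryTypes = []string{"URL"}
        req.ThreatInfo.ThreatEntries = []threatEntry{{URL: url}}

        body, err := json.Marshal(req)
        if err != nil {
            return false, err
        }
        resp, err := http.Post(
            "https://safebrowsing.googleapis.com/v4/threatMatches:find?key="+apiKey,
            "application/json", bytes.NewReader(body))
        if err != nil {
            return false, err
        }
        defer resp.Body.Close()

        // An empty JSON object in the response means no matches; a non-empty
        // "matches" array means the URL is currently flagged.
        var result struct {
            Matches []json.RawMessage `json:"matches"`
        }
        if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
            return false, err
        }
        return len(result.Matches) > 0, nil
    }

    func main() {
        flagged, err := isFlagged("YOUR_API_KEY", "http://example.com/")
        fmt.Println(flagged, err)
    }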

Now I really want to understand how this Safe Browsing logic works. Is Google identifying bad sites by following links from good sites, and then blocking the good site before it even adds the bad site to its database? If so, the Google Safe Browsing API is useless. We want to use the API to identify bad sites, not to get blocked in the process of linking to them.

This is highly frustrating. It is a matter of reputation for us.

Owen Allen

Apr 12, 2017, 1:33:35 PM
to Google Safe Browsing API
This has happened to us again. According to our logs, the circumstances were as follows:

1. At 4:27am UTC we redirected 1 user to a website that the Safe Browsing Go package reported as safe. That was the only user we redirected to that specific URL in the past 5 days.

2. At 5:00am UTC, our redirect pathway was blocked as malware, blocking all outbound URLs, not just the infected one.

3. We are running the Go package from Google, and its database updated at 3:58am, 4:28am and 4:58am.

Based on this evidence, the most plausible theory I have is that Google added the domain as malware sometime between 3:58am and 4:28am. The one redirect we sent a user through would therefore have occurred one minute before our copy of the database in the Go package updated, so Google sees us as linking to malware. This situation is incredibly frustrating because we are doing exactly what the documentation tells us to do: we are checking the URL, and we are using the Go package. Still, we end up with whole pathways blocked, not just specific URLs but an entire route that leads to many URLs.

This is a common pattern followed by numerous sites such as Bit.ly, TripAdvisor, Yelp and any URL shortener. I attempted to create a Bitly URL pointing at the infected URL, and Bitly uses the Safe Browsing system to prevent the redirect. On the other hand, ow.ly does not; it redirected me to the malware site, yet somehow ow.ly is not affected and does not have its entire redirect system shut off for linking to malware. There must be some strategy for mitigating this issue.

What I would request, if Google is going to block domains that redirect to malware, is that you at least wait long enough for your approved client (the Go package) to update. If a URL gets flagged as malware by Google at 10:01am, don't block sites redirecting to it until at least 10:31am. That way the official way of solving this problem (the Go package) has enough time to update, since it updates every 30 minutes. Either that, or provide some level of documentation on how we can structure our URLs to prevent this, because so far I cannot figure out what we are doing differently from the major redirectors.
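
Failing that, the only knob I can see on our end is how often the local database refreshes. Below is a sketch that assumes the Go package's Config.UpdatePeriod field controls the refresh interval (the default is roughly 30 minutes); whether the server's minimum wait duration allows a shorter value is an open question:

    package main

    import (
        "log"
        "time"

        "github.com/google/safebrowsing"
    )

    func main() {
        // Assumption: Config.UpdatePeriod controls how often the local threat
        // lists refresh (the package default is roughly 30 minutes); the API's
        // minimum wait duration may still override a shorter value.
        sb, err := safebrowsing.NewSafeBrowser(safebrowsing.Config{
            APIKey:       "YOUR_API_KEY",
            DBPath:       "/tmp/safebrowsing.db",
            UpdatePeriod: 5 * time.Minute,
        })
        if err != nil {
            log.Fatal(err)
        }
        defer sb.Close()
        // ...wire sb into the /plugins/crm/count/ handler as before...
    }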

Again, any help is appreciated.