chrome.webRequest not receiving events for https://clients6.google.com/generate_204


John Hinsdale

Dec 6, 2014, 10:04:03 AM12/6/14
to chromium-...@chromium.org
Hi,

I'm developing a Chromium extension that uses the webRequest API (https://developer.chrome.com/extensions/webRequest), and I've noticed that none of the request event handlers, not even the initial "onBeforeRequest", gets called for certain URLs, in a way I can't make sense of.  Specifically, while handlers are called for, e.g., https://www.google.com/generate_204, they are not for https://clients6.google.com/generate_204.  Is there some reason for this, or is it a bug?
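
For reference, here is a stripped-down version of how I register the handler (the real extension registers several of the events and does more in each; this is just the minimal shape I used to verify which URLs reach the listener, and the function name is only for illustration):

    // background.js -- manifest declares the "webRequest" permission and "<all_urls>" host access
    function logRequest(details) {
        console.log("onBeforeRequest: " + details.url);
    }
    chrome.webRequest.onBeforeRequest.addListener(
        logRequest,
        { urls: ["<all_urls>"] }   // ask for every URL the API is willing to show us
    );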

As background, the purpose of my extension is to scrub private information from some headers (e.g., version info appended to the User-Agent that is specific to the local build) on outbound requests to sites outside my extension user's network, while retaining that information for logging purposes on requests inside the user's network.  Doing this reliably won't be possible unless all external URLs can be handled by the extension.
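
To make that concrete, the scrubbing I have in mind looks roughly like this (a simplified sketch: ".corp.example.com" stands in for the user's internal network suffix, and the User-Agent rewrite pattern is only illustrative; the real logic is more involved):

    // background.js -- needs "webRequest", "webRequestBlocking" and host permissions
    var INTERNAL_SUFFIX = ".corp.example.com";   // placeholder for the user's internal network

    chrome.webRequest.onBeforeSendHeaders.addListener(
        function (details) {
            var host = new URL(details.url).hostname;
            if (host.slice(-INTERNAL_SUFFIX.length) === INTERNAL_SUFFIX) {
                return {};   // internal request: keep headers intact for logging
            }
            var headers = details.requestHeaders.map(function (h) {
                if (h.name.toLowerCase() === "user-agent") {
                    // strip the locally appended build info (illustrative pattern only)
                    return { name: h.name, value: h.value.replace(/ LocalBuild\/\S+$/, "") };
                }
                return h;
            });
            return { requestHeaders: headers };
        },
        { urls: ["<all_urls>"] },
        ["blocking", "requestHeaders"]
    );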

Is there something magic about this URL?  Are there other ones not passed to webRequest?

The doc linked above mentions an incomplete list of some URLs that are "hidden" from the API:

"... certain requests with URLs ... are hidden, e.g., chrome-extension://other_extension_id where other_extension_id is not the ID of the extension to handle the request, https://www.google.com/chrome, and others (this list is not complete)."

Is there a way to get the complete list, other than trawling the Chromium code?  Not being able to rely on complete coverage of the outbound requests will substantially reduce the effectiveness of any extension that employs webRequest.  I've noticed other situations where URLs are not passed, e.g. the URL passed on the command line to the browser is fetched w/out contacting the extension, but I suspect that is because the extension is not yet completely initialized.  In the example above, I accessed the first URL and verified w/ console logging that it got passed to the webRequest extension, while a subsequent request to the "clients6" URL was among the "hidden."

Here is a snippet from my chrome://version page:
Chromium = 37.0.2062.120 (Developer Build 281580) Ubuntu 12.04
OS = Linux 
JavaScript = V8 3.27.34.17
User Agent = Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/37.0.2062.120 Chrome/37.0.2062.120 Safari/537.36

Ideas?

Thanks,
John K. Hinsdale

Rob Wu

Dec 6, 2014, 10:32:33 AM12/6/14
to John Hinsdale, Chromium-extensions


John Hinsdale

Dec 6, 2014, 8:46:30 PM12/6/14
to chromium-...@chromium.org, hins...@gmail.com

Rob - thanks, this was very helpful.  According to the comments in "web_request_permissions.cc", it looks like all URLs hosted at the domains clients[1-9].google.com are "back-door" URLs that bypass the webRequest API and all extensions based on it, the purpose being to ensure that "internal services" -- which I interpret to mean mechanisms that ensure the correct operation of the Chromium ecosystem -- are not affected by webRequest extensions.  The comments mention extension update, extension blacklisting, Safe Browsing, etc.  All that makes sense.

However, I've got all these options turned off (the six checkboxes under "Privacy" in Settings, from "Use a web service ..." to "Enable phishing ... protection" etc.), and I still see all kinds of requests going out to clients[1-9].google.com, which thwarts my attempt to use webRequest to scrub the outbound headers as described in my first message.

On further analysis, it appears that if my extension's user is signed into Gmail (a Web service unrelated to the operation of the Chromium software ecosystem, but one that happens to be operated by Google), then Gmail makes numerous requests to clients[1-9].google.com, so those domains appear to be used both for Chromium software purposes and for Google's commercial services.  Some examples:

responds w/ an HTML document containing embedded JavaScript:
    function apsl() {   (new Image).src = "//clients2.google.com/availability/...." + (new Date).getTime() + "......." }
This then fetches a small 35-byte GIF hosted at clients2.google.com, which bypasses webRequest filtering.  This looks like a tracking GIF.

responds w/ an HTML document containing embedded JavaScript:
      <script> ... AF_initDataCallback({... data:[[ ...,"https://client-channel.google.com/",
               "https://clients4.google.com/invalidation/lcs/client"]]})
      </script>
Chromium then makes requests to clients4.google.com that bypass webRequest.

responds w/ a JavaScript document containing:
    { ... "root-1p":"https://clients6.google.com", sessionCache:{enabled:!0}, ...
Chromium then makes requests to clients6.google.com that bypass webRequest.

So basically, the Web sites serving up Google's apps and services appear to make regular use of these back-door domains, which are entirely bypassed by any webRequest-based extension that tries to limit the data sent out (extra User-Agent info in my case, but it will also render ineffective any cookie blocker or other privacy tool).

Are Chromium developers aware of this?  My first reaction is that if Google acts both as the custodian of Chromium's software integrity and as a participant in the competitive market for Web-delivered services, it should avoid even the appearance of using these back-door domains for any purpose related to its Web services, since that could be seen as a way to gain an edge over competitors in capturing tracking data from users who try not to disclose it.

It seems this could be fixed by having the Chromium code hide only those URLs known to implement the various software-integrity functions mentioned in the comments in "web_request_permissions.cc", while leaving all other URLs hosted at the clients[1-9].google.com back-door domains within the scope of the webRequest API.

Ideas?  Leaking the unwanted private header info to *.google.com defeats my extension's purpose, and the only ways to proceed I can think of are to patch Chromium or to shut off access to those domains, which I imagine would break the Google services.

Thanks for your help!

John Hinsdale

Rob Wu

Dec 7, 2014, 5:37:43 AM12/7/14
to John Hinsdale, Chromium-extensions
Even if that list of URLs with security implications is disabled, you won't be able to intercept all requests because of so-called system requests: https://crbug.com/108648#c2 (this is one of the bugs that resulted in the creation of the IsSensitiveURL method).
If you want to reliably change the User Agent string, add --user-agent="...." to the command line flags of Chrome/Chromium.
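
For example (the binary name varies by distribution, and the UA string below is just a placeholder for whatever you want sites to see):

    chromium-browser --user-agent="Mozilla/5.0 (X11; Linux x86_64) MyCustomUA/1.0"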

Kind regards,
 Rob
 https://robwu.nl


John Hinsdale

Dec 7, 2014, 9:17:05 AM12/7/14
to chromium-...@chromium.org, hins...@gmail.com

>>> Even if that list of URLs with security implications is disabled, you won't be able to intercept all requests because of so-called system requests: https://crbug.com/108648#c2 <<<

Thanks for the link; the discussion history in that bug helped a lot to explain why things are the way they are.  I'm fine with not being able to use webRequest for the system requests needed to administer the extensions infrastructure.  It won't work, though, to have webRequest end-run by some random commercial Web service provider's site, so I will need to differentiate the two.

>>> If you want to reliably change the User Agent string, add --user-agent="...." to the command line flags of Chrome/Chromium. <<<

That won't work, as I need to do it conditionally, e.g. via webRequest or similar post-startup logic.  The purpose of my extension is to scrub private information from some headers (e.g. version info appended to the User-Agent specific to the local build) on outbound requests to sites outside my extension user's network, while retaining that information for logging purposes on requests inside the user's network.

I've got some more ideas for how to make sure Chromium, in its present form, sends out only those requests to the back-door domains that are legitimate uses of their intended purpose, but it's ugly and I'm not sure it will work.  I'll follow up here on how that pans out.  It doesn't seem like these special domains should be used for other purposes, and, to me anyway, there is at least the appearance of potential misuse.

Thanks again for all the help,
John K. Hinsdale

John Hinsdale

Dec 28, 2014, 9:55:20 AM12/28/14
to chromium-...@chromium.org, hins...@gmail.com
>>> I've got some more ideas for how to make sure Chromium, in its present form, sends out only those requests to the back-door domains that are legitimate uses of their intended purpose, but it's ugly and I'm not sure it will work.  I'll follow up here on how that pans out. <<<

Finally following up as promised ... I had considered having my extension "scrub" the body of HTTP responses to eliminate any markup or JavaScript that would trigger access to these domains, but that was too hard, and I didn't want to get into the business of modifying HTTP response bodies that weren't mine.

So what I did instead was a combination of two things:
  • Preventively blocking access to sites (or, more accurately, portions of sites) that issued responses with such markup and JS.  This effectively meant blocking users' access to Gmail, YouTube and Google News; scrubbing the private header info was considered a greater need than access to those sites.
  • Where there were telltale signs in the response headers that these domains would be accessed -- specifically a "Content-Security-Policy:" header -- adjusting that header to "tighten up" access by removing these domains as allowed sources of JavaScript, and doing so only for requests/responses outside the domains' intended purpose.  (A rough sketch of both measures follows this list.)
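
In rough outline, the two listeners look like this (simplified; the block list and the CSP rewrite below only illustrate the approach, not the exact rules my extension uses):

    // 1. Block (portions of) sites whose responses trigger the back-door requests.
    //    Needs "webRequest", "webRequestBlocking" and matching host permissions.
    var BLOCKED = ["*://mail.google.com/*", "*://news.google.com/*", "*://www.youtube.com/*"];

    chrome.webRequest.onBeforeRequest.addListener(
        function (details) {
            return { cancel: true };
        },
        { urls: BLOCKED },
        ["blocking"]
    );

    // 2. Where a response carries a Content-Security-Policy header, remove the
    //    clients[1-9].google.com entries so the page cannot use them as script sources.
    chrome.webRequest.onHeadersReceived.addListener(
        function (details) {
            var headers = details.responseHeaders.map(function (h) {
                if (h.name.toLowerCase() === "content-security-policy") {
                    var value = h.value.replace(/\bhttps?:\/\/clients[1-9]\.google\.com\S*/g, "");
                    return { name: h.name, value: value };
                }
                return h;
            });
            return { responseHeaders: headers };
        },
        { urls: ["<all_urls>"] },
        ["blocking", "responseHeaders"]
    );
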
I've noticed that some Chromium-based browser distributions (e.g. Opera) do not bypass webRequest for clients[1-9].google.com, and I'm wondering whether that means they are subject to potential interference by webRequest-based extensions affecting the Chromium infrastructural URLs, or whether the developers have instead selectively bypassed only those URLs that should be bypassed.  That would be a tricky problem to solve, but the Opera developers are a fairly talented bunch, so it wouldn't be surprising.  I've published a version of my extension in question here: http://alma.com/chromium/hin/

Thanks,
John Hinsdale
 