Thanks for the detailed reply, Markus. A few responses inline.

On Wed, Dec 17, 2014 at 3:26 AM, Markus Heintz <markus...@google.com> wrote:

cc Mike West: FYI
bcc UI folks

On Wed, Dec 17, 2014 at 12:24 PM, Markus Heintz <markus...@google.com> wrote:

cc Vaclav: FYI, since you are working on content settings componentization
cc Adrienne (security): Most likely you know this already, but see the "More Context" section below as FYI, since you are very actively working on permissions these days.
cc Alex and Rebecca (UI): See the "More Context" section below as FYI. Content settings may come from various sources, so in some cases we may need to provide read-only UI when content settings are managed. I think we have not talked about this.

@Alex and Charlie:

You have different questions here:

1) How to work around issues with GURL and WebSecurityOrigins?

It would be great if any code that handles URLs and/or origins handled file URLs consistently. It would be great if we could define for Chrome what the origin of a file URL is (maybe even depending on some flags).

2) Do content settings for file://abc/ URLs need to depend on the URL path?

I don't see a good argument from the privacy perspective why file://foo.html should have different content settings than file://bar.html. However, I have not implemented this TODO yet because Microsoft allows the following file URLs: "file://server/share/" (http://msdn.microsoft.com/en-us/library/windows/desktop/ff819129(v=vs.85).aspx). I'm a bit unsure whether it is a good idea to apply the same content settings to all files across all remote shares. In a corp environment, users may trust the files on their own machine but maybe not trust the files of their co-workers on a shared location in the network. In a consumer environment, users may have a (shared?) Windows share on a router, NAS, or whatever device. I think it is fair to say that some people may not even be aware of this if it is enabled by default on a router.
So far I thought it was safer to be more fine-grained. But my personal opinion is that files on a "local" file system (which may include remote mounts and shares) are from a single party, for which we could apply the same content settings. So getting rid of the path for file URLs is what I was hoping for, ideally simply by fixing the issue mentioned in section 1 above.

Hmm, that does raise an interesting question. Obviously, it would be simpler from our perspective to treat all file URLs as the same.

One alternative is to include the file's "host" in the origin for a file URL (as referenced here, which hopefully covers the case you describe). There's immense subtlety here (see url_canon_fileurl.cc, etc.), so I think consulting an expert like brettw@ will be useful if we try to change anything.
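
To make the "host in the origin" alternative concrete, here is a minimal sketch of how a settings key for a file URL could keep the host but drop the path. FileSettingsKey is a hypothetical helper, not a Chromium API, and it assumes an already well-formed "file://" URL. Real canonicalization (drive letters, "localhost", case folding, as handled in url_canon_fileurl.cc) is far subtler than this.

```cpp
#include <string>

// Hypothetical helper (not part of Chromium): derives a content-settings key
// from a file URL by keeping the scheme and host and discarding the path.
// Returns an empty string for anything that is not a file URL.
std::string FileSettingsKey(const std::string& url) {
  const std::string prefix = "file://";
  if (url.compare(0, prefix.size(), prefix) != 0)
    return std::string();

  // The host runs from the end of "file://" up to the next '/', if any.
  const std::string::size_type host_end = url.find('/', prefix.size());
  const std::string host = url.substr(
      prefix.size(),
      host_end == std::string::npos ? std::string::npos
                                    : host_end - prefix.size());

  // Local files have an empty host, so they all share the key "file:///".
  return prefix + host + "/";
}
```

Under this assumption, all local files map to one key, while "file://server/share/..." URLs get one key per server, which would address the remote-share concern without per-path settings.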
3) Is it possible to not replicate the full URL across all processes? Or: how to send content settings to the renderer?

I'm not sure I have the full picture here yet. I'll read up more about the site isolation work. Feel free to share further docs with me if you think they are a good read.

Sure. This is a good starting point:

The premise is that we shouldn't trust the renderer to make security decisions, since it's easy to exploit and can then lie. I would love to see content settings become something we can protect, but that would mean that we would have to make the policy decisions in the browser process (with an IPC roundtrip).

Barring that (e.g., for things that need synchronous replies), we would like to minimize the amount we're relying on the renderer. That's why we're hoping to avoid giving it the full URL of any RemoteFrames: so that it's not used for security decisions.

In general, Content Settings are defined per origin-pattern pair, e.g. Camera, https://hangouts.google.com:*, https://hangouts.google.com:*, ALLOW. The pattern pair defines a first-party and a third-party (for embedded content) context. Most content settings do not make use of the second origin pattern and ignore it by using a "*" as the second pattern. But settings like location and camera/mic do.

However, this means that instead of the full URL you could use only the web security origin for looking up content settings.

@Charlie: I'm not sure what you mean by relying on the URLs in the browser process. Isn't it possible to have content from different third parties on the renderer side? Wouldn't this make it necessary to be able to look up different settings? What would you use on the renderer side instead of URLs? I guess I don't have the full picture yet, sorry.

Yes, I was asking whether an IPC roundtrip to the browser would be possible to make the security decision in these cases, since we're already asking the browser to fetch the resource for us.
I wouldn't be surprised if that's not feasible in all content settings cases, but I wanted to throw it out as an optimal solution for security.

More Context:

In Chrome we have:
- Content Settings defined by the user.
- Content Settings defined by policy.
- Content Settings defined by extensions via the content settings extensions API.
- For hosted Chrome Apps, we grant some permissions for the URLs in the app manifest.
- We have install-time permissions for Apps and extensions.
- We have optional permissions for Apps and extensions.
- And we have pseudo content settings like CONTENT_SETTINGS_TYPE_AUTO_SELECT_CERTIFICATE that are used to define site-specific settings.

In the past I was hoping to build a single service that provides all site-specific settings and permissions (including content settings). Such settings and permissions could be looked up by URL.

The reason why I'm writing this is that for Apps and Extensions we use the Extensions patterns, which support URL paths (unlike content settings patterns). This could lead to interesting situations with Hosted Apps and cookie settings. In theory, a user could block cookies for a URL that is part of a hosted App in the content settings UI, and Chrome will happily ignore this.

A single site permissions and settings service would make it much easier:
1. To build a proper UI that shows why a permission is granted for a specific URL.
2. To maintain any permissions-related code.
3. For clients, as they only need to query a single source of truth.
4. To clear any site-related settings and permissions for a particular site.

So whatever you implement, it would be great if building such a single service would still be possible.

Heh, I'd love to see such a service in the browser process, but again, I realize that might be impractical.

Charlie

----

On Tue, Dec 16, 2014 at 9:06 PM, Alex Moshchuk <ale...@chromium.org> wrote:

Thanks, Markus, this helps. You mention that this code makes it possible to have content settings for individual files.
But your TODO in ContentSettingsPattern::Matches seems to say that this will be going away:

    // TODO(markusheintz): Content settings should be defined for all files on
    // a machine. Unless there is a good use case for supporting paths for file
    // patterns, stop supporting path for file patterns.

Is this TODO implying something different, then?

Also, are there any tests that exercise the individual file matching for the top frame?

I believe that currently WebSecurityOrigin::toString() will return "null" for file URLs, since the enforceFilePathSeparation flag is turned on in Chrome by default. With that flag off (as is the case in some tests), it will return "file://". GURL.GetOrigin() on a file URL seems to return "file:///", but that isn't used in this code path.

If this check is intended solely for file URLs, maybe the more reliable way would be to check the origin's scheme before calling toString(), to distinguish from other cases where WebSecurityOrigin::toString() returns "null"?

Alex

On Tue, Dec 16, 2014 at 9:59 AM, Charlie Reis <cr...@chromium.org> wrote:

Thanks, Markus. Perhaps we can investigate what origin is returned for file URLs and whether that can be changed.

As Alex mentioned, we'd like to avoid replicating the full URL across all processes, since it may include information that we don't want leaked to malicious renderers. In cases like location settings, I wonder if it's possible to rely on the URLs known in the browser process instead of in the renderer process.

Charlie

On Tue, Dec 16, 2014 at 4:58 AM, Markus Heintz <markus...@google.com> wrote:

Content settings are based on web origins. For file URLs, the Web Origin may depend on the implementation of the agent (see http://tools.ietf.org/html/rfc6454).

GURL returns "null" as the origin for a file URL (at least it used to). Therefore we are using the scheme + path as the origin for file URLs. This makes it possible to have content settings for individual files.
The sole purpose of the code you are mentioning in your email is to handle file URLs.

For content settings it is necessary to know the top frame URL. E.g., location settings use two URLs: the embedder URL (top frame URL) and the embedded URL (frame URL).

In order not to break anything, you need to make sure that GURL returns a sane origin for file URLs. Otherwise Chrome will crash.

--

On Tue, Dec 16, 2014 at 10:35 AM, Jochen Eisinger <joc...@chromium.org> wrote:

+Dominic Battre

As is, the content settings system allows for specifying arbitrary URL patterns. Maybe Dominic or Markus can chime in about what the future for content settings holds.

On Tue Dec 16 2014 at 10:02:15 AM Marja Hölttä <ma...@chromium.org> wrote:

I touched this code a long time ago; I have only vague memories of it...

If you're refactoring so that you can ask for an origin from WebRemoteFrame, can't you also add a function there to get originOrUrl() from it (which would have the same logic as this function)?

On Tue, Dec 16, 2014 at 12:45 AM, Alex Moshchuk <ale...@chromium.org> wrote:

Hi Markus, Jochen, Marja, and Bernhard,

I'm working on site isolation and came across this code that is retrieving the top frame's URL in ContentSettingsObserver. I had a couple of questions about it, and judging by its history you seem to have been involved with it.

    GURL GetOriginOrURL(const WebFrame* frame) {
      WebString top_origin = frame->top()->document().securityOrigin().toString();
      // The |top_origin| is unique ("null") e.g., for file:// URLs. Use the
      // document URL as the primary URL in those cases.
      if (top_origin == "null")
        return frame->top()->document().url();
      return GURL(top_origin);
    }

With --site-per-process, top() may be a WebRemoteFrame rendered in a different process, in which case "top()->document().url()" will crash, since document() does not exist for remote frames.

I'm wondering whether knowing the top frame's URL is really necessary here. It seems like the URL may be used for path-based matching for file:// URLs.
Are there any other use cases besides this? Also, there's a TODO(markusheintz) in ContentSettingsPattern::Matches to remove support for path-based matching even for file:// URLs. Is this TODO going to get done any time soon?

I'm wondering about this because I've recently landed https://codereview.chromium.org/692973005/, which makes origins available on WebRemoteFrames, and I was refactoring ContentSettingsObserver to use them in https://codereview.chromium.org/789273006/. We were hoping to avoid replicating full URLs, which may leak sensitive information to untrusted renderers. Would we be breaking anything if we always used frame->top()->securityOrigin() here when in --site-per-process mode, perhaps passing "file://" for all file:// URLs to make scheme-wide matching work?

Thanks,
Alex
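
The proposal in this message (use only the replicated security origin, and pass "file://" for every file URL) can be sketched roughly as below. OriginInfo and PrimaryKeyForSettings are hypothetical names for illustration only. The real WebSecurityOrigin interface differs, and the sketch assumes the origin's scheme is still available even when its serialization is "null".

```cpp
#include <string>

// Hypothetical stand-in for the origin data replicated to a remote frame.
// This is not the real WebSecurityOrigin interface.
struct OriginInfo {
  std::string scheme;      // e.g. "https" or "file"
  std::string serialized;  // toString() result; "null" for unique origins
};

// Picks the primary key for a content settings lookup without touching the
// top frame's document URL, so it would also work for remote frames.
std::string PrimaryKeyForSettings(const OriginInfo& top_origin) {
  // Check the scheme first instead of comparing the serialization to "null",
  // so file URLs are not confused with other unique origins.
  if (top_origin.scheme == "file")
    return "file://";
  return top_origin.serialized;
}
```

Checking the scheme before the serialized form keeps other unique origins (which also serialize to "null") from being treated as file URLs, at the cost of dropping per-file settings.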