Hi API owners,
I'd like to run an experiment to collect latency metrics for my project. At a high level, my feature is initialized asynchronously at Chrome startup and (when enabled) must be completely initialized before the Network Service accesses cookies. There is prior art for this kind of startup dependency, namely reading the persistent cookie store DB into memory at startup.
Currently, the feature is disabled, so the First-Party Sets backend answers every query synchronously (and the answers are no-ops). When the feature is enabled, queries that arrive before the backend is fully initialized may be answered asynchronously. (The answers will still be no-ops for now, since the feature has not launched.) I'd like to run an experiment in which Chrome enables the feature at 50% on Canary/Dev, 50% on Beta, and 1% on Stable, to verify that the resulting latency characteristics are acceptable.
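To make the synchronous vs. asynchronous distinction concrete, here is a minimal sketch of the deferral pattern. It is not the Chromium implementation: the class `DeferredQueryBackend`, its methods, and the string-based answer are hypothetical names chosen for illustration only.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the First-Party Sets backend; not Chromium code.
class DeferredQueryBackend {
 public:
  using Callback = std::function<void(const std::string& owner)>;

  // Answers synchronously and returns true once initialized. Before that,
  // queues the query and returns false so the caller knows it is pending.
  bool Query(const std::string& site, Callback callback) {
    if (initialized_) {
      callback(Lookup(site));
      return true;
    }
    pending_.emplace_back(site, std::move(callback));
    return false;
  }

  // Marks initialization complete and answers queued queries in arrival order.
  void OnInitialized() {
    initialized_ = true;
    std::vector<std::pair<std::string, Callback>> pending;
    pending.swap(pending_);
    for (auto& entry : pending) {
      entry.second(Lookup(entry.first));
    }
  }

  std::size_t pending_count() const { return pending_.size(); }

 private:
  // No-op answer (the feature has not launched): no site has an owner.
  std::string Lookup(const std::string&) const { return std::string(); }

  bool initialized_ = false;
  std::vector<std::pair<std::string, Callback>> pending_;
};
```

In this sketch, only queries that arrive before `OnInitialized()` pay any delay; everything afterwards is answered inline, which matches the claim that post-startup cookie accesses are unaffected.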
Some more details:
- What is affected? Cookie accesses (from both HTTP and scripts) during Chrome startup.
  - Web requests that are not HTTP(S)/WS(S) are not affected.
  - HTTP(S)/WS(S) requests that are uncredentialed are not affected.
  - Cookie accesses (via HTTP or script) that occur after startup are not affected.
- What happens during First-Party Sets initialization?
  - The backend waits to receive a file from the component updater, then reads and parses its contents.
    - If the component is not yet installed, or we fail to read the file, we use an invalid base::File instead.
  - The backend reads and parses a command-line flag (or an empty string if the flag isn't present).
  - The backend merges the component and flag values.
- What metrics will I collect?
  - Fine-grained metrics to monitor each part of initialization; see https://crrev.com/c/3501915.
  - Fine-grained metrics to monitor the impact of query delays; see https://crrev.com/c/3465494.
- Any special concerns?
  - Deadlock. The backend must finish initializing no matter what, so that it can eventually serve queries for cookie accesses.
  - Priority inversion. I've updated all disk reads on the critical path to use USER_BLOCKING priority if this feature is enabled, and BEST_EFFORT otherwise.
- What's the proposed rollout plan?
  - A week or two at 50% on Canary/Dev; then
  - A week or two at 50% on Beta; then
  - Two weeks at 1% on Stable.
- Platforms: Windows, Linux, Mac, ChromeOS, Lacros, Android browser
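
The initialization steps and the two special concerns listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual backend: `SetsInitializer`, `OnComponent`, `OnFlag`, and `DiskReadPriority` are hypothetical names, the local `TaskPriority` enum merely mirrors the two base::TaskPriority values mentioned above, and the merge step is a placeholder because the real merge rules are not described here.

```cpp
#include <functional>
#include <string>
#include <utility>

// Local stand-in for the two base::TaskPriority values mentioned above.
enum class TaskPriority { BEST_EFFORT, USER_BLOCKING };

// Disk reads are on the critical path only when the feature is enabled.
inline TaskPriority DiskReadPriority(bool feature_enabled) {
  return feature_enabled ? TaskPriority::USER_BLOCKING
                         : TaskPriority::BEST_EFFORT;
}

// Hypothetical initializer: finishes once both inputs arrive, in either order.
class SetsInitializer {
 public:
  explicit SetsInitializer(std::function<void(const std::string&)> on_ready)
      : on_ready_(std::move(on_ready)) {}

  // `ok` is false when the component is missing or unreadable. The
  // initializer then proceeds with empty contents instead of waiting
  // forever, which is what avoids the deadlock concern.
  void OnComponent(bool ok, const std::string& contents) {
    component_ = ok ? contents : std::string();
    have_component_ = true;
    MaybeFinish();
  }

  // An absent command-line flag is passed as the empty string.
  void OnFlag(const std::string& value) {
    flag_ = value;
    have_flag_ = true;
    MaybeFinish();
  }

  bool done() const { return done_; }

 private:
  void MaybeFinish() {
    if (done_ || !have_component_ || !have_flag_) return;
    done_ = true;
    // Placeholder merge (assumption): join the non-empty inputs with ';'.
    std::string merged = component_;
    if (!component_.empty() && !flag_.empty()) merged += ";";
    merged += flag_;
    on_ready_(merged);
  }

  std::function<void(const std::string&)> on_ready_;
  std::string component_;
  std::string flag_;
  bool have_component_ = false;
  bool have_flag_ = false;
  bool done_ = false;
};
```

The point of the sketch is that every input path, including the failure path, reports in, so the ready callback always fires exactly once.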
Any concerns with this approach, or questions that I haven't addressed? Thanks!