%-experiments in the Blink process.


Mike West

Dec 14, 2020, 3:56:52 AM
to blink-api-owners-discuss
Hey folks,

In last week's API owners meeting, we had a short discussion around experiments that aren't appropriate for the Origin Trial framework, but instead would benefit from being rolled out to a percentage of users. I think we agreed on something like the following:
  1. Percentage experiments that only reach developer-focused releases (those limited to Chrome's Canary and Dev channel, for instance) do not require API owners' involvement. These releases are explicitly unstable, have a limited reach, and can't support strong security or privacy guarantees due to the rapid churn of changes. The value of %-experiments in these environments pretty clearly outweighs the minimal risk of baking in a behavior based on exposure.

  2. Broadly-accessible releases (Chrome's Beta and Stable channels, for instance) reach many more developers and users, and we'd like API owners to evaluate and approve percentage experiments via an explicit Intent to Experiment. This both ensures that we're evaluating the effect of an experiment on the ecosystem, and gives developers an opportunity to understand what changes might be coming their way.

  3. Percentage experiments that reach a certain level of the population (say, ~5% of stable) are difficult to distinguish from shipping with regard to their impact on developers' expectations. While it's possible that such experiments might really be experimental (Trust Tokens is a recent example), the risks are similar enough to shipping that we'd see this kind of experiment as a rare exception.
Assuming that matches others' recollection of the conversation, I'd propose adding the above requirements to https://www.chromium.org/blink/launching-features, alongside the discussion of origin trials in step 5 of the "For New Feature Incubations" section. Perhaps along the lines of:

"""
If you want to gather data on the usability of your feature that an Origin Trial can help collect, or data about your feature's impact that a percentage-based experiment could support, then proceed to the "Public Experiment" stage in ChromeStatus and fill out the required fields detailing what you hope to learn. This will generate an Intent to Experiment mail that you should send to blink-dev. After receiving at least one LGTM from API Owners, you can proceed with your origin trial or percentage experiment. Collect data and respond to any issues. From here, you may wish to return to Dev Trials, proceed to Prepare to Ship, or park the feature.

Please note that these experiments are not exempt from requiring cross-functional approvals from the Chrome launch review process.
"""

This would also require corresponding changes to the Intent to Experiment template, and to the ChromeStatus workflow as well. It also might be reasonable to create a separate page describing the experimental options in more detail, as well as the approval requirements for exceptions.

WDYT?

-mike

Jochen Eisinger

Dec 14, 2020, 4:17:33 AM
to Mike West, blink-api-owners-discuss
On Mon, Dec 14, 2020 at 9:56 AM Mike West <mk...@chromium.org> wrote:
  3. Percentage experiments that reach a certain level of the population (say, ~5% of stable) are difficult to distinguish from shipping with regard to their impact on developers' expectations. While it's possible that such experiments might really be experimental (Trust Tokens is a recent example), the risks are similar enough to shipping that we'd see this kind of experiment as a rare exception.
The Chrome launch process differentiates between 1% and >1% stable. Should we use the same here instead of picking 5%?
 

--
You received this message because you are subscribed to the Google Groups "blink-api-owners-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-api-owners-d...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-api-owners-discuss/CAKXHy%3DeiMG3TLA1Kuhr51MDwHFHQgiAmV7vZAy1i9df8v%3DC0tQ%40mail.gmail.com.

Mike West

Dec 14, 2020, 4:34:05 AM
to Jochen Eisinger, blink-api-owners-discuss
On Mon, Dec 14, 2020 at 10:17 AM Jochen Eisinger <joc...@chromium.org> wrote:
On Mon, Dec 14, 2020 at 9:56 AM Mike West <mk...@chromium.org> wrote:
  3. Percentage experiments that reach a certain level of the population (say, ~5% of stable) are difficult to distinguish from shipping with regard to their impact on developers' expectations. While it's possible that such experiments might really be experimental (Trust Tokens is a recent example), the risks are similar enough to shipping that we'd see this kind of experiment as a rare exception.
The Chrome launch process differentiates between 1% and >1% stable. Should we use the same here instead of picking 5%?

I mentioned 5% here because it was the bar we set for Trust Tokens. Aligning with 1% vs >1% sounds perfectly reasonable to me.

-mike


Chris Harrelson

Dec 14, 2020, 1:15:03 PM
to Mike West, Jochen Eisinger, blink-api-owners-discuss
On Mon, Dec 14, 2020 at 1:34 AM Mike West <mk...@chromium.org> wrote:
On Mon, Dec 14, 2020 at 10:17 AM Jochen Eisinger <joc...@chromium.org> wrote:
On Mon, Dec 14, 2020 at 9:56 AM Mike West <mk...@chromium.org> wrote:
  1. Percentage experiments that only reach developer-focused releases (those limited to Chrome's Canary and Dev channel, for instance) do not require API owners' involvement. These releases are explicitly unstable, have a limited reach, and can't support strong security or privacy guarantees due to the rapid churn of changes. The value of %-experiments in these environments pretty clearly outweighs the minimal risk of baking in a behavior based on exposure.
I'm ok with this as long as the percentage is, say, 50% or less. Anything more might confuse the issue. 

  2. Broadly-accessible releases (Chrome's Beta and Stable channels, for instance) reach many more developers and users, and we'd like API owners to evaluate and approve percentage experiments via an explicit Intent to Experiment. This both ensures that we're evaluating the effect of an experiment on the ecosystem, and gives developers an opportunity to understand what changes might be coming their way.

I think we should also explicitly say that we don't think experimenting in this way is typical or recommended for features.


Yoav Weiss

Dec 15, 2020, 2:54:16 AM
to Chris Harrelson, Mike West, Jochen Eisinger, blink-api-owners-discuss
On Mon, Dec 14, 2020 at 7:15 PM Chris Harrelson <chri...@chromium.org> wrote:


On Mon, Dec 14, 2020 at 1:34 AM Mike West <mk...@chromium.org> wrote:
On Mon, Dec 14, 2020 at 10:17 AM Jochen Eisinger <joc...@chromium.org> wrote:
On Mon, Dec 14, 2020 at 9:56 AM Mike West <mk...@chromium.org> wrote:
  2. Broadly-accessible releases (Chrome's Beta and Stable channels, for instance) reach many more developers and users, and we'd like API owners to evaluate and approve percentage experiments via an explicit Intent to Experiment. This both ensures that we're evaluating the effect of an experiment on the ecosystem, and gives developers an opportunity to understand what changes might be coming their way.

I think we should also explicitly say that we don't think experimenting in this way is typical or recommended for features.

I'm not sure we have a clear definition of "features" vs. web-exposed behavior that does warrant experimentation for e.g. performance measurements.
Also, in some cases shipping features results in breakage (e.g. UA-CH, same-site cookies) and some experimentation can help us evaluate the risk there. (although, I also agree that these cases are typically covered by an I2S and slow-rollout, rather than an I2E)

It might be helpful to talk about a concrete example: Let's say we want to test the performance benefits of preload, a web-exposed feature.
The right way to run such an experiment would be to turn off the feature on ~1% of stable and measure the experiment's impact on various metrics.
Would that require an Intent to Experiment?

I'd argue that:
a) We want this experiment to run through the intent process
b) We want to enable such experiments, even if they are done with web exposed features

Does that make sense?
 

Mike West

Dec 15, 2020, 2:56:29 AM
to Chris Harrelson, Jochen Eisinger, blink-api-owners-discuss
On Mon, Dec 14, 2020 at 7:15 PM Chris Harrelson <chri...@chromium.org> wrote:
On Mon, Dec 14, 2020 at 1:34 AM Mike West <mk...@chromium.org> wrote:
On Mon, Dec 14, 2020 at 10:17 AM Jochen Eisinger <joc...@chromium.org> wrote:
On Mon, Dec 14, 2020 at 9:56 AM Mike West <mk...@chromium.org> wrote:
  1. Percentage experiments that only reach developer-focused releases (those limited to Chrome's Canary and Dev channel, for instance) do not require API owners' involvement. These releases are explicitly unstable, have a limited reach, and can't support strong security or privacy guarantees due to the rapid churn of changes. The value of %-experiments in these environments pretty clearly outweighs the minimal risk of baking in a behavior based on exposure.
I'm ok with this as long as the percentage is, say, 50% or less. Anything more might confuse the issue. 

50% is in line with the (internal) Finch best practices doc, as it allows reasonable measurement of an A/B configuration. I could get behind requiring an Intent to Experiment for wider deployment even on Canary/Dev, though I wonder what we'd say to more complicated experiments that divided the world up into more than two groups to test variants of a proposal. As long as no one group is above 50%, I think I'd be fine with allowing developers to experiment without our involvement.


  2. Broadly-accessible releases (Chrome's Beta and Stable channels, for instance) reach many more developers and users, and we'd like API owners to evaluate and approve percentage experiments via an explicit Intent to Experiment. This both ensures that we're evaluating the effect of an experiment on the ecosystem, and gives developers an opportunity to understand what changes might be coming their way.

I think we should also explicitly say that we don't think experimenting in this way is typical or recommended for features.

I think it depends greatly on the feature. For example, the recent cache partitioning change doesn't expose a new web API, but has performance and potentially correctness implications. Experimenting with it as an origin trial doesn't seem viable, and rolling it out to 100% is risky. Experiments in the wild seem like the right way to evaluate that kind of change, and one that we should encourage.

I agree that we should discourage shipping new APIs on a percentage basis. Origin Trials seem much more appropriate for that use case.

-mike

Chris Harrelson

Dec 16, 2020, 1:04:56 PM
to Mike West, Jochen Eisinger, blink-api-owners-discuss
Responding to both Yoav and Mike's points together:

I agree that the examples you gave are ones where a percent experiment makes sense. When I said "I think we should also explicitly say that we don't think experimenting in this way is typical or recommended for features", my intent was to indicate that an intent owner should only use these kinds of experiments when necessary, because the result is that the web-exposed change will not be reliable for developers during that period, which may cause confusion or broken sites.

I think we agree on these points?

Yoav Weiss

Dec 16, 2020, 5:38:19 PM
to Chris Harrelson, Mike West, Jochen Eisinger, blink-api-owners-discuss
I think we agree, but it might be worthwhile to clarify where experiments are a legitimate tool and where they are not.
IMO, they are a legitimate tool for compatibility and performance related experiments, and can be used when:
a) We want to test the performance implications of (potentially web-exposed) behavior changes
b) We want to get a better understanding of the compatibility implications of new features or removals

(b) is something we often refer to as a "Finch rollout", and is a practice we have encouraged, following an approved intent.

Did you have something else than (b) in mind for experimenting with features? Or were you just saying that it's not something people should do without an intent? (Or, am I missing your intent entirely?)

Chris Harrelson

Dec 16, 2020, 6:04:45 PM
to Yoav Weiss, Mike West, Jochen Eisinger, blink-api-owners-discuss
On Wed, Dec 16, 2020 at 2:38 PM Yoav Weiss <yoav...@google.com> wrote:
Did you have something else than (b) in mind for experimenting with features? Or were you just saying that it's not something people should do without an intent? (Or, am I missing your intent entirely?)

I am just saying that we should say it's an unusual situation to experiment via Finch *before* an approved I2S.

Yoav Weiss

Dec 17, 2020, 3:00:51 AM
to Chris Harrelson, Mike West, Jochen Eisinger, blink-api-owners-discuss
On Thu, Dec 17, 2020 at 12:04 AM Chris Harrelson <chri...@chromium.org> wrote:


I am just saying that we should say it's an unusual situation to experiment via Finch *before* an approved I2S.

OK, we're 100% in agreement then :)

Jochen Eisinger

Dec 17, 2020, 4:08:57 AM
to Yoav Weiss, Chris Harrelson, Mike West, blink-api-owners-discuss
 I think that if we go down this route, we should introduce a new kind of i2e, or modify the i2e to state that if you run some kind of 3p OT w/o OT usage limits (because you use Finch for that), you'll need 3 lgtms or similar.

I don't think we can require an intent to ship, this would send the wrong signal to the developer community. I'd imagine us approving an i2s for the next Potassium API while it's still under development, for example, would damage the trust in this process and the potassium project.

Chris Harrelson

Dec 17, 2020, 12:08:57 PM
to Jochen Eisinger, Yoav Weiss, Mike West, blink-api-owners-discuss
On Thu, Dec 17, 2020 at 1:08 AM Jochen Eisinger <joc...@chromium.org> wrote:
I don't think we can require an intent to ship, this would send the wrong signal to the developer community. I'd imagine us approving an i2s for the next Potassium API while it's still under development, for example, would damage the trust in this process and the potassium project.

Agree that we should not use this requirement as a reason to early-approve an I2S.

Daniel Bratell

Dec 17, 2020, 1:33:12 PM
to Chris Harrelson, Jochen Eisinger, Yoav Weiss, Mike West, blink-api-owners-discuss

Since nobody has objected, I think we all agree that dev and canary (<=50%) can be experimented on without explicit api owner approval? Just checking.

/Daniel

Chris Harrelson

Dec 17, 2020, 1:47:17 PM
to Daniel Bratell, Jochen Eisinger, Yoav Weiss, Mike West, blink-api-owners-discuss
On Thu, Dec 17, 2020 at 10:33 AM Daniel Bratell <brat...@gmail.com> wrote:

Since nobody has objected, I think we all agree that dev and canary (<=50%) can be experimented on without explicit api owner approval? Just checking.

No - I think it should require one LGTM and a blink-dev email.