
A new testing policy for code landing in mozilla-central


Brian Grinstead

Sep 15, 2020, 11:03:45 AM
to dev-platform
(This is a crosspost from firefox-dev)

Hi all,

We’re rolling out a change to the review process to put more focus on automated testing. This will affect you if you review code that lands in mozilla-central.

## TLDR

Reviewers will now need to add a testing Project Tag in Phabricator when Accepting a revision. This can be done with the “Add Action” → “Change Project Tags” UI in Phabricator. There's a screenshot of this UI at https://groups.google.com/g/firefox-dev/c/IlHNKOKknwU/m/zl6cOlT2AgAJ.
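If you'd rather script this than click through the UI, here's a minimal sketch using Phabricator's Conduit API (`differential.revision.edit` with a `projects.add` transaction). The API token and project PHID below are placeholders you'd look up yourself (e.g. via the `project.search` Conduit method):

```python
# A minimal sketch of applying a testing Project Tag via Phabricator's
# Conduit API instead of the web UI. The token and project PHID are
# placeholders, not real values.
import requests

CONDUIT_URL = "https://phabricator.services.mozilla.com/api/differential.revision.edit"
API_TOKEN = "api-XXXXXXXXXXXXXXXXXXXXXXXXXXXX"   # placeholder Conduit token
TESTING_APPROVED_PHID = "PHID-PROJ-placeholder"  # placeholder PHID for testing-approved

def tag_revision(revision: str, project_phid: str) -> None:
    """Add a project tag to a Differential revision such as 'D12345'."""
    resp = requests.post(CONDUIT_URL, data={
        "api.token": API_TOKEN,
        "objectIdentifier": revision,
        "transactions[0][type]": "projects.add",
        "transactions[0][value][0]": project_phid,
    })
    resp.raise_for_status()
    body = resp.json()
    if body.get("error_info"):
        raise RuntimeError(body["error_info"])

tag_revision("D12345", TESTING_APPROVED_PHID)
```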

We’ve been piloting the policy for a few weeks with positive feedback from reviewers. Still, there may be some rough edges as we roll this out more widely. I’d like to hear about those, so please contact me directly or in the #testing-policy channel in Slack / Matrix if you have feedback or questions about how the policy should apply to a particular review.

We’re working on ways to make this more convenient in the UI and to prevent landing code without a tag in Lando. In the meantime, if you'd like a reminder to add the tag while reviewing code, Nicolas Chevobbe has built a WebExtension that automatically opens up the Project Tags UI when appropriate at https://github.com/nchevobbe/phab-test-policy.

## Why?

Automated tests in Firefox and Gecko reduce risk and allow us to quickly and confidently improve our products.

While there’s a general understanding that changes should have tests, this hasn’t been tracked or enforced consistently. We think defining an explicit policy for including automated tests with code changes will help to achieve those goals.

Also, we’ll be able to better understand exactly why and when tests aren’t being added. This can help to highlight components that need new testing capabilities or technical refactoring, and features that require increased manual testing.

There are of course trade-offs to enforcing a testing policy. Tests take time to write, maintain, and run, which can slow down day-to-day development. And given the complexity of a modern web browser, some components are impractical to test today (for example, components that interact with external hardware and software).

The policy below aims to mitigate those trade-offs with a lightweight enforcement mechanism and the ability to exempt changes at the discretion of the code reviewer.

## Policy (This text is also located at https://firefox-source-docs.mozilla.org/testing/testing-policy/)

Everything that lands in mozilla-central includes automated tests by default. Every commit has tests that cover every major piece of functionality and expected input conditions.

One of the following Project Tags must be applied in Phabricator before landing, at the discretion of the reviewer:

* `testing-approved` if it has sufficient automated test coverage.
* One of `testing-exception-*` if not. After speaking with many teams across the project we’ve identified the most common exceptions, which are detailed below.

### Exceptions

* `testing-exception-unchanged`: Commits that don’t change behavior for end users. For example:
  * Refactors, mechanical changes, and deleting dead code, as long as they aren’t meaningfully changing or removing any existing tests. Authors should consider checking for and adding missing test coverage in a separate commit before a refactor.
  * Code that doesn’t ship to users (for example: documentation, build scripts and manifest files, mach commands). Effort should be made to test these when regressions are likely to cause bustage or confusion for developers, but it’s left to the discretion of the reviewer.
* `testing-exception-ui`: Commits that change UI styling, images, or localized strings. While we have end-to-end automated tests that ensure the frontend isn’t totally broken, and screenshot-based tracking of changes over time, we currently rely only on manual testing and bug reports to surface style regressions.
* `testing-exception-elsewhere`: Commits where tests exist but live somewhere else. This requires a comment from the reviewer explaining where the tests are. For example:
  * In another commit in the stack.
  * In a followup bug.
  * In an external repository for third-party code.
  * When following the [Security Bug Approval Process](https://firefox-source-docs.mozilla.org/bug-mgmt/processes/security-approval.html), tests are usually landed later, but should be written and reviewed at the same time as the commit.
* `testing-exception-other`: Commits where none of the defined exceptions above apply but the change should still land. The reviewer should scrutinize this before using it: consider whether an exception is actually required, or whether a test could reasonably be added. This requires a comment from the reviewer explaining why it’s appropriate to land without tests. Some examples that have been identified include:
  * Interacting with external hardware or software where our code is missing abstractions to mock out the interaction.
  * Inability to reproduce a reported problem, so landing something to test a fix in Nightly.
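To make the decision procedure concrete, here is a hypothetical sketch of the kind of check a landing tool such as Lando could run before allowing a revision to land. This is illustrative only, not Lando's actual implementation:

```python
# A hypothetical sketch of a landing-time check for the testing policy.
# This is not Lando's real code; it just encodes the rules above.

TESTING_TAGS = {
    "testing-approved",
    "testing-exception-unchanged",
    "testing-exception-ui",
    "testing-exception-elsewhere",
    "testing-exception-other",
}
# Per the policy, these exceptions require an explanatory reviewer comment.
TAGS_REQUIRING_COMMENT = {"testing-exception-elsewhere", "testing-exception-other"}

def check_testing_policy(project_tags: set[str], has_reviewer_comment: bool) -> list[str]:
    """Return a list of policy problems (an empty list means OK to land)."""
    problems = []
    applied = TESTING_TAGS & project_tags
    if not applied:
        problems.append("no testing Project Tag applied")
    elif len(applied) > 1:
        problems.append(f"multiple testing tags applied: {sorted(applied)}")
    if applied & TAGS_REQUIRING_COMMENT and not has_reviewer_comment:
        problems.append("this exception requires an explanatory reviewer comment")
    return problems

# Example: testing-exception-other with no reviewer comment fails the check.
print(check_testing_policy({"testing-exception-other"}, has_reviewer_comment=False))
```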

Thanks,
Brian

Andrew Halberstadt

Sep 15, 2020, 11:21:49 AM
to Brian Grinstead, dev-platform
This change is fantastic, thanks for driving it!

Reviewers shouldn't need to feel bad about asking authors to go back and
add tests; making this official policy removes that burden of guilt (just
like how reviewbot removes the burden of pointing out linting nits).
Authors can now grumble at the process rather than the reviewer. A better
situation all around.

Cheers!


Jonathan Kew

Sep 15, 2020, 11:49:06 AM
to dev-pl...@lists.mozilla.org
On 15/09/2020 16:03, Brian Grinstead wrote:

> We’re rolling out a change to the review process to put more focus on
> automated testing. [...]

As others have said, it's great that we're setting a clearer policy here
- thanks!

One thing did strike me as a little surprising, and could perhaps be
clarified:

> * `testing-exception-other`: Commits where none of the defined
> exceptions above apply but the change should still land. The reviewer
> should scrutinize this before using it: consider whether an exception
> is actually required, or whether a test could reasonably be added.
> This requires a comment from the reviewer explaining why it’s
> appropriate to land without tests.

The point that caught my eye here was that it explicitly requires the
explanatory comment to be *from the reviewer*. I would have thought that
in many cases it would make sense for the *patch author* to provide such
an explanatory comment (which the reviewer would of course be expected
to scrutinize before granting r+). Is that scenario going to be
considered unacceptable?

Thanks,

JK

Andrew McCreight

Sep 15, 2020, 12:28:58 PM
to dev-platform
On Tue, Sep 15, 2020 at 8:03 AM Brian Grinstead <bgrin...@mozilla.com>
wrote:

> (This is a crosspost from firefox-dev)
>
> Hi all,
>
> We’re rolling out a change to the review process to put more focus on
> automated testing. This will affect you if you review code that lands in
> mozilla-central.
>
> ## TLDR
>
> Reviewers will now need to add a testing Project Tag in Phabricator when
> Accepting a revision. This can be done with the “Add Action” → “Change
> Project Tags” UI in Phabricator. There's a screenshot of this UI at
> https://groups.google.com/g/firefox-dev/c/IlHNKOKknwU/m/zl6cOlT2AgAJ.
>

I of course think having more tests is a good thing, but I don't like how
this specific process places all of the burden of understanding and
documenting some taxonomy of exceptions on the reviewer. Reviewing is
already a fairly thankless and time-consuming task. It seems like the way
this is set up is due to the specifics of how our review process is
implemented, so maybe there's no way around it.

Also, contrary to what ahal said, I don't know that tests being an official
requirement relieves the burden of guilt of asking for tests, as everybody
already knows that tests are good and that you should always write tests.
The situation is different from the linters, as the linters will fix almost
all issues automatically, so asking somebody to fix linter issues isn't
much of a burden at all. I guess it is impossible to make testing better
without somebody directly pressuring somebody, though.

My perspective might be a little distorted as I think a lot of the patches
I write would fall under the exceptions, either because they are
refactoring, or I am fixing a crash or security issue based on just a stack
trace.

Separately, one category of fixes I often deal with is fixing leaks or
crashes in areas of the code I am unfamiliar with. I can often figure out
the localized condition that causes the problem and correct that, but I
don't really know anything about, say, service workers or networking to
write a test. Do I need to hold off on landing until I can find somebody
who is willing and able to write a test for me, or until I'm willing to
invest the effort to become knowledgeable enough in an area of code I'm
unlikely to ever look at again? Neither of those feel great to me.

Another testing difficulty I've hit a few times this year (bug 1659013 and
bug 1607703) are crashes that happen when starting Firefox with unusual
command line or environment options. I think we only have one testing
framework that can test those kinds of conditions, and it seemed like it
was in the process of being deprecated. I had trouble finding somebody who
understood it, and I didn't want to spend more than a few hours trying to
figure it out for myself, so I landed it without testing. To be clear, I
could reproduce at least one of those crashes locally, but I wasn't sure
how to write a test to create those conditions. Under this new policy, how
do we handle fixing issues where there is not always a good testing
framework already? Writing a test is hopefully not too much of a burden,
but obviously having to develop and maintain a new testing framework could
be a good deal of work.

Anyways those are some disjointed thoughts based on my experience
developing Firefox that probably don't have good answers.

Andrew

Brian Grinstead

Sep 15, 2020, 2:10:58 PM
to Jonathan Kew, dev-platform
I think that’s OK and something we can clarify in the text if needed. It’s helpful for authors to explain their reasoning if they think an exception should be applied (essentially, making a request for a particular exception). We'd need to decide whether (a) the reviewer grants the request by adding the exception with a link pointing to the author's comment, (b) the reviewer grants the request by adding the exception without a comment, or (c) the author adds the exception at the same time they make the request and the reviewer implicitly grants it by not removing it. As the policy is written, (a) is already fine.

If usage feedback shows this is inconvenient and not providing value, I’m open to adjusting it accordingly. Though I did want to note that I’d be more interested in optimizing the workflow for `testing-exception-elsewhere`, which also requires a comment (as it is expected and commonly used, e.g. for stacks), than for `testing-exception-other`, which should be used less often and receive more scrutiny.

Brian


Andrew Halberstadt

Sep 15, 2020, 2:45:19 PM
to Andrew McCreight, dev-platform
On Tue, Sep 15, 2020 at 12:28 PM Andrew McCreight <amccr...@mozilla.com>
wrote:
> I don't know that tests being an official requirement relieves the burden
> of guilt of asking for tests, as everybody already knows that tests are
> good and that you should always write tests

I think this is largely true at Mozilla, but we all have our lapses from
time to time. Thankfully since we all know this, a tiny nudge is all that
we need.

> Separately, one category of fixes I often deal with is fixing leaks or
> crashes in areas of the code I am unfamiliar with. I can often figure out
> the localized condition that causes the problem and correct that, but I
> don't really know anything about, say, service workers or networking to
> write a test. Do I need to hold off on landing until I can find somebody
> who is willing and able to write a test for me, or until I'm willing to
> invest the effort to become knowledgeable enough in an area of code I'm
> unlikely to ever look at again? Neither of those feel great to me.

This sounds like an argument in favour of putting the burden of exceptions
on the reviewer. Authors can submit without a test (ideally calling out
it's because they don't know how) and then the reviewer can either:
A) Explain how to add tests for said patch, or
B) Knowing that writing a test would be difficult, grant them an exception

If a reviewer finds themselves explaining / granting exceptions often, it
could be a signal that the test infrastructure in that area needs
improvement.

Btw I think your points are all valid, I guess we'll have to see how it
turns out in practice. Hopefully we can adjust the process as needed based
on what we learn from it.
- Andrew


Brian Grinstead

Sep 15, 2020, 3:05:44 PM
to Andrew Halberstadt, Andrew McCreight, dev-platform


> On Sep 15, 2020, at 11:44 AM, Andrew Halberstadt <ah...@mozilla.com> wrote:
>
> On Tue, Sep 15, 2020 at 12:28 PM Andrew McCreight <amccr...@mozilla.com>
> wrote:
>> I don't know that tests being an official requirement relieves the burden
>> of guilt of asking for tests, as everybody already knows that tests are
>> good and that you should always write tests
>
> I think this is largely true at Mozilla, but we all have our lapses from
> time to time. Thankfully since we all know this, a tiny nudge is all that
> we need.
>
>> Separately, one category of fixes I often deal with is fixing leaks or
>> crashes in areas of the code I am unfamiliar with. I can often figure out
>> the localized condition that causes the problem and correct that, but I
>> don't really know anything about, say, service workers or networking to
>> write a test. Do I need to hold off on landing until I can find somebody
>> who is willing and able to write a test for me, or until I'm willing to
>> invest the effort to become knowledgeable enough in an area of code I'm
>> unlikely to ever look at again? Neither of those feel great to me.
>
> This sounds like an argument in favour of putting the burden of exceptions
> on the reviewer. Authors can submit without a test (ideally calling out
> it's because they don't know how) and then the reviewer can either:
> A) Explain how to add tests for said patch, or
> B) Knowing that writing a test would be difficult, grant them an exception

Or (C) the reviewer could write the test themselves and land it in a separate commit alongside the patch, marking the original patch as `testing-exception-elsewhere`.

We definitely don't want to discourage writing drive-by / “good samaritan” patches where you are fixing a bug in an area outside of your normal work. This came up a few times in discussions and is one of the reasons I prefer reviewer-specified instead of author-specified exceptions.

> If a reviewer finds themselves explaining / granting exceptions often, it
> could be a signal that the test infrastructure in that area needs
> improvement.
> Btw I think your points are all valid, I guess we'll have to see how it
> turns out in practice. Hopefully we can adjust the process as needed based
> on what we learn from it.

For sure. I’m happy to discuss any specific pain points you run into with the policy.

Eric Rescorla

Sep 16, 2020, 8:10:34 AM
to Andrew McCreight, dev-platform
On Tue, Sep 15, 2020 at 9:28 AM Andrew McCreight <amccr...@mozilla.com>
wrote:

> On Tue, Sep 15, 2020 at 8:03 AM Brian Grinstead <bgrin...@mozilla.com>
> wrote:
>
> > (This is a crosspost from firefox-dev)
> >
> > Hi all,
> >
> > We’re rolling out a change to the review process to put more focus on
> > automated testing. This will affect you if you review code that lands in
> > mozilla-central.
> >
> > ## TLDR
> >
> > Reviewers will now need to add a testing Project Tag in Phabricator when
> > Accepting a revision. This can be done with the “Add Action” → “Change
> > Project Tags” UI in Phabricator. There's a screenshot of this UI at
> > https://groups.google.com/g/firefox-dev/c/IlHNKOKknwU/m/zl6cOlT2AgAJ.
> >
>
> I of course think having more tests is a good thing, but I don't like how
> this specific process places all of the burden of understanding and
> documenting some taxonomy of exceptions on the reviewer. Reviewing is
> already a fairly thankless and time-consuming task. It seems like the way
> this is set up is due to the specifics of how our review process is
> implemented, so maybe there's no way around it.
>

I certainly have some sympathy for this, but I'm not sure that it needs to
be
addressed via tools. What I would generally expect in cases like this is
that the reviewer says "why isn't there a test for X" and the author says
"for reason Y" and either the reviewer does or does not accept that.
That's certainly been my experience on both sides of this interaction
in previous instances where there were testing policies but not this
machinery.

> Also, contrary to what ahal said, I don't know that tests being an official
> requirement relieves the burden of guilt of asking for tests, as everybody
> already knows that tests are good and that you should always write tests.
>

I also don't know about how this will impact people's internal states; I see
this as having two major benefits:

1. It tells people what we expect of them
2. It gives us the ability to measure what's actually happening and adjust
accordingly.

To expand on (2) a bit, if we look back and find that there was a lot of
code
landed without tests but where exceptions weren't filed, then we know we
need to work on one set of things. On the other hand, if we see that there
are a lot of exceptions being filed in cases we don't think there should
be (for whatever reason) then we need to work on a different set of things
(e.g., improving test frameworks in that area).



> My perspective might be a little distorted as I think a lot of the patches
> I write would fall under the exceptions, either because they are
> refactoring, or I am fixing a crash or security issue based on just a stack
> trace.
>
> Separately, one category of fixes I often deal with is fixing leaks or
> crashes in areas of the code I am unfamiliar with. I can often figure out
> the localized condition that causes the problem and correct that, but I
> don't really know anything about, say, service workers or networking to
> write a test. Do I need to hold off on landing until I can find somebody
> who is willing and able to write a test for me, or until I'm willing to
> invest the effort to become knowledgeable enough in an area of code I'm
> unlikely to ever look at again? Neither of those feel great to me.
>

Agreed that it's not an ideal situation. I think it's important to step
back and
ask how we got into that situation, though. I agree that it's not that
productive
for you to write the test, but hopefully *someone* at Mozilla understands
the
code and is able to write a test and if not we have some other problems,
right?
So what I would hope here is that you were able to identify the problem,
maybe write a fix and then hand it off to someone who could carry it over
the
line.



> Another testing difficulty I've hit a few times this year (bug 1659013 and
> bug 1607703) are crashes that happen when starting Firefox with unusual
> command line or environment options. I think we only have one testing
> framework that can test those kinds of conditions, and it seemed like it
> was in the process of being deprecated. I had trouble finding somebody who
> understood it, and I didn't want to spend more than a few hours trying to
> figure it out for myself, so I landed it without testing. To be clear, I
> could reproduce at least one of those crashes locally, but I wasn't sure
> how to write a test to create those conditions. Under this new policy, how
> do we handle fixing issues where there is not always a good testing
> framework already? Writing a test is hopefully not too much of a burden,
> but obviously having to develop and maintain a new testing framework could
> be a good deal of work.
>

As a general matter, I think we need to be fairly cautious about landing
code
where we aren't able to test. In some cases that means taking a step back
and doing a lot of work on the testing framework before you can write some
comparatively trivial code, which obviously is annoying at the time but also
is an investment in the future. In other cases, we'll have to make a
decision
that the value of the change outweighs the risk of landing untested code.
This also depends on the nature of the change: if you're fixing a crash in a
section of code, the bar might be lower than if you're adding a new feature.

-Ekr

Andrew McCreight

Sep 16, 2020, 9:34:44 AM
to dev-platform
On Wed, Sep 16, 2020 at 5:10 AM Eric Rescorla <e...@rtfm.com> wrote:

> I certainly have some sympathy for this, but I'm not sure that it needs to
> be
> addressed via tools. What I would generally expect in cases like this is
> that the reviewer says "why isn't there a test for X" and the author says
> "for reason Y" and either the reviewer does or does not accept that.
> That's certainly been my experience on both sides of this interaction
> in previous instances where there were testing policies but not this
> machinery.
>

Sure, but it just adds latency to the whole process. Hopefully requiring an
extra round trip will be rare.


>> Also, contrary to what ahal said, I don't know that tests being an official
>> requirement relieves the burden of guilt of asking for tests, as everybody
>> already knows that tests are good and that you should always write tests.
>>
>
> I also don't know about how this will impact people's internal states; I
> see
> this as having two major benefits:
>
> 1. It tells people what we expect of them
> 2. It gives us the ability to measure what's actually happening and adjust
> accordingly.
>
> To expand on (2) a bit, if we look back and find that there was a lot of
> code
> landed without tests but where exceptions weren't filed, then we know we
> need to work on one set of things. On the other hand, if we see that there
> are a lot of exceptions being filed in cases we don't think there should
> be (for whatever reason) then we need to work on a different set of things
> (e.g., improving test frameworks in that area).
>

Yeah, I think gathering data about the actual level of testing is a great
output of this process. I'm curious to see what it will turn up.
Presumably it could also be bucketed by Bugzilla components and so forth.
I mean, everybody has their own priorities. Part of the reason I'm fixing
leaks in random code is because I understand the leak fixing tools very
well, but part of it is also that I care more about leaks than the average
person. If I've fixed a leak in an obscure error case in some web API,
maybe the experts in that API are willing to review a patch to fix the
leak, but if they have to weigh trying to figure out how to write a test
for the leak versus meeting some important deadline or their H2 goal, maybe
they aren't going to be able to get to it quickly. I can't even say that's
the wrong decision for them to make.


>
>
>> Another testing difficulty I've hit a few times this year (bug 1659013 and
>> bug 1607703) are crashes that happen when starting Firefox with unusual
>> command line or environment options. I think we only have one testing
>> framework that can test those kinds of conditions, and it seemed like it
>> was in the process of being deprecated. I had trouble finding somebody who
>> understood it, and I didn't want to spend more than a few hours trying to
>> figure it out for myself, so I landed it without testing. To be clear, I
>> could reproduce at least one of those crashes locally, but I wasn't sure
>> how to write a test to create those conditions. Under this new policy, how
>> do we handle fixing issues where there is not always a good testing
>> framework already? Writing a test is hopefully not too much of a burden,
>> but obviously having to develop and maintain a new testing framework could
>> be a good deal of work.
>>
>
> As a general matter, I think we need to be fairly cautious about landing
> code
> where we aren't able to test. In some cases that means taking a step back
> and doing a lot of work on the testing framework before you can write some
> comparatively trivial code, which obviously is annoying at the time but
> also
> is an investment in the future. In other cases, we'll have to make a
> decision
> that the value of the change outweighs the risk of landing untested code.
>

That's true. In these particular cases, I might not have looked at the
crashes if I'd realized before I started investigating that they were
happening with weird options, but once I had, the fix was straightforward
so I decided I might as well land a patch.


> This also depends on the nature of the change: if you're fixing a crash in
> a
> section of code, the bar might be lower than if you're adding a new
> feature.
>

Yeah, that's a good distinction.

Andrew



surkov.a...@gmail.com

Sep 16, 2020, 10:13:00 AM
Better test coverage is the best. Agreed, having a policy to force developers to add tests must be an efficient way to improve test coverage. I worry, though, that it may reduce the amount of contributions from the community, since it complicates the development process: sometimes you have to spend more time creating a test than fixing the bug. Also, it could make developers (intentionally or unintentionally) pick up tasks that don't need tests. The worst thing is that you perhaps cannot effectively measure these side effects of the change.

Thanks,
Alexander.

Brian Grinstead

Sep 16, 2020, 10:55:41 AM
to surkov.a...@gmail.com, dev-pl...@lists.mozilla.org

> On Sep 16, 2020, at 7:12 AM, surkov.a...@gmail.com <surkov.a...@gmail.com> wrote:
>
> Better test coverage is the best. Agreed, having a policy to force developers to add tests must be an efficient way to improve test coverage. I worry, though, that it may reduce the amount of contributions from the community, since it complicates the development process: sometimes you have to spend more time creating a test than fixing the bug.

I think this is similar to the issue raised by mccr8 and discussed upthread. People who aren't familiar with a particular area of the codebase should still feel free to write a patch without a test to fix a bug. It's then up to the reviewer to either provide instructions to write a test, write a test themselves, or grant an exception.

> Also, it could make developers (intentionally or unintentionally) pick up tasks that don't need tests. The worst thing is that you perhaps cannot effectively measure these side effects of the change.

The exceptions are meant to at least partially address these points (see also ekr's point (2) above).

Brian

Kartikaya Gupta

Sep 24, 2020, 10:02:26 AM
to Brian Grinstead, dev-platform
First - thank you for making these changes! I do find the testing
prompt useful as a reviewer, and it has already resulted in a few
extra tests being added where they probably wouldn't have been
otherwise.

After living with this for a week or so, I have two (relatively minor)
papercuts:
- I really want the webextension available in Fenix. I do a surprising
number of reviews from my phone and having to manually add the tags is
annoying
- Sometimes I want to mark a patch "testing-exception-elsewhere" or
"testing-exception-other" but also leave a general review comment. It
seems awkward to me to have to mix my general review comments with the
testing-specific comment into the same input field, and I'm always
worried it will confuse the patch author. If this process is
eventually formalized into something more structured than phabricator
project tags it would be nice to have separate fields for the
testing-related comment and the general review comment.

Cheers,
kats

Brian Grinstead

Sep 24, 2020, 5:48:14 PM
to Kartikaya Gupta, dev-platform


> On Sep 24, 2020, at 7:02 AM, Kartikaya Gupta <kgu...@mozilla.com> wrote:
>
> First - thank you for making these changes! I do find the testing
> prompt useful as a reviewer, and it has already resulted in a few
> extra tests being added where they probably wouldn't have been
> otherwise.
>
> After living with this for a week or so, I have two (relatively minor)
> papercuts:

Thanks for the feedback:

> - I really want the webextension available in Fenix. I do a surprising
> number of reviews from my phone and having to manually add the tags is
> annoying

A shorter-term goal would be to ship a Phabricator extension that implements the web extension behavior so you don’t need to install anything. Longer term, it'd be great if the UI were more integrated into the Accept process rather than being driven through the Project Tags UI (for example, a radio list to select from before you can click Submit).

Of course getting a Phabricator extension rolled out for everyone has a much higher bar for testing and quality compared with the web extension, and needs to be balanced with everything else on that team's roadmap. But it's good motivation to hear you are finding the web extension useful.

> - Sometimes I want to mark a patch "testing-exception-elsewhere" or
> "testing-exception-other" but also leave a general review comment. It
> seems awkward to me to have to mix my general review comments with the
> testing-specific comment into the same input field, and I'm always
> worried it will confuse the patch author. If this process is
> eventually formalized into something more structured than phabricator
> project tags it would be nice to have separate fields for the
> testing-related comment and the general review comment.

Yeah, keeping a structured field with the testing policy comment would also be great for querying or scanning revisions. Maybe something in the revision details - though I’d want adding the comment to be more integrated than going to Edit Revision as a separate step _after_ adding the Project Tag. I've pinged Zeid to ask if there's any existing functionality along these lines.

Jeff Muizelaar

Nov 12, 2020, 11:22:16 AM
to Brian Grinstead, dev-platform
Brian,

Now that we've been doing this for a while, do we have any aggregated
data on the testing tags? i.e. how often are we adding tests vs not.

-Jeff

Brian Grinstead

Nov 12, 2020, 5:45:18 PM
to Jeff Muizelaar, dev-platform
Hi Jeff,
Thanks to help from Marco Castelluccio, we now have an artifact in taskcluster with metadata about all commits that land in m-c, which I've used to gather this data. We're still working on tooling to better understand patterns across different types of bugs & patches, but I'm happy to share what we already have.

Here's a spreadsheet showing weekly data since September: https://docs.google.com/spreadsheets/d/1xpm7V-zHCB3nyENdoVkG2ViT5iyg940SIpobY1fzR2g/edit.
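If you want to slice the numbers differently, here's a rough sketch of the kind of weekly aggregation behind that spreadsheet. The filename and the record schema (a JSON list with `pushdate` and `testing_tag` fields) are illustrative assumptions, not the real artifact format:

```python
# A rough sketch of aggregating weekly testing-tag counts from commit
# metadata. The input file and field names are assumptions for
# illustration, not the actual taskcluster artifact schema.
import json
from collections import Counter, defaultdict
from datetime import datetime, timezone

def weekly_tag_counts(records):
    """records: iterable of dicts like {"pushdate": 1605200000, "testing_tag": "approved"}."""
    weeks = defaultdict(Counter)
    for rec in records:
        dt = datetime.fromtimestamp(rec["pushdate"], tz=timezone.utc)
        week = dt.strftime("%G-W%V")  # ISO year-week, e.g. "2020-W46"
        weeks[week][rec.get("testing_tag") or "none"] += 1
    return weeks

with open("commit_metadata.json") as f:  # hypothetical local copy of the artifact
    data = json.load(f)

for week, counts in sorted(weekly_tag_counts(data).items()):
    total = sum(counts.values())
    print(week, {tag: f"{n} ({n/total:.0%})" for tag, n in counts.most_common()})
```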

A few notes:

* "none" is when a patch is reviewed/landed without a tag being applied. I've been hoping this would naturally get down to 0 as people get used to the policy and from the notifications we added in phabrictor. It dropped quickly to ~10% but is hovering right now. I've spoken with the Lando team about changing our warning checkbox into a hard blocker to enforce this.
* "unknown" counts revisions that don't have an accessible phabricator revision. This is mostly automated commits like wpt-sync (for example https://hg.mozilla.org/mozilla-central/rev/19b1604e8c8e) but also includes revisions with restricted permissions that aren't able to be loaded during generation time. For the sake of reporting I think these can be pretty much ignored.
* From most common to least common we see: “unchanged”, “approved”, “other”, “elsewhere”, “ui”. This has remained pretty stable across weeks. I’m not surprised that “unchanged” is common since it’s a bit overloaded - covering both (a) code that don’t ship to users and (b) code that’s being modified but keeping the same functionality.

Finally, I’m happy to hear feedback about how people think this has been going from a reviewer/author standpoint. Most of the feedback has been around ambiguity with “unchanged”. For example when “unchanged” versus “approved” should be applied when touching only tests: https://bugzilla.mozilla.org/show_bug.cgi?id=1672439. Or if we should somehow split up “unchanged” to differentiate between code that already has tests and code that doesn’t. #testing-policy is a good channel for that discussion if you are interested in the topic.

Thanks,
Brian