Capturing additional metadata in files

Gregory Szorc

Dec 9, 2014, 1:47:17 PM
to dev-platform
In Portland, there were a number of discussions around ideas and
features that could be implemented more easily if only we had better metadata
and annotations for source files. For example:

* Suggested reviewers for a patch
* Determine the Bugzilla component for a failing test
* Determine the Bugzilla component for a changed file so a bug can be
filed automatically
* Building a subscription service for watching code and reviews
* Defining what static analysis should run on a given source file
* Mapping changed files to impacted automation jobs (useful for
minimizing automation that runs)

There is pretty much universal consensus that as much metadata as
possible should live in the tree, next to the things being annotated.
This is in contrast to how current systems like Bugzilla's suggested
reviewers feature operate, which is to establish a separate service/data
store, essentially fragmenting the source of truth and introducing
one-off change processes.

I discussed options with Mike Hommey and we believe that files
are the appropriate default location for this metadata. We considered
alternatives such as Python sandboxes under a different
filename and standalone JSON or YAML files. We like because it
is a fully customizable Python environment that already exists and
therefore doesn't require much effort to stand up and doesn't fragment
source of truth.

This should not be a surprise: capturing non-build metadata in
files was always an eventual goal. There is already precedent for this
in defining the Sphinx documentation [1]. We just haven't had a good
reason or time to add more things. Until now.

In the weeks and months ahead, expect to start seeing work to integrate
extra metadata into files. This may require refactoring some files. We'll need to support a world where files can
be evaluated before configure is executed (so any tool with a copy of
the source and the Python package for reading files can
extract metadata in milliseconds).
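
A minimal sketch of the kind of lookup such a tool could perform, assuming annotations are expressed as path patterns mapped to metadata. The names here (ANNOTATIONS, metadata_for, the component and reviewer values) are hypothetical and only illustrate the idea; this is not the actual file format or reader API.

```python
from fnmatch import fnmatchcase

# Hypothetical in-tree annotations: (path pattern, metadata) pairs.
# Later entries are more specific and override earlier ones.
ANNOTATIONS = [
    ("*", {"bug_component": ("Example Product", "General")}),
    ("dom/*", {"bug_component": ("Example Product", "DOM")}),
    ("dom/media/*", {
        "bug_component": ("Example Product", "Media"),
        "reviewers": ["alice"],
    }),
]


def metadata_for(path, annotations=ANNOTATIONS):
    """Merge the metadata of every pattern that matches `path`.

    Needs only a source checkout and this table: no configure step,
    no build state.
    """
    result = {}
    for pattern, metadata in annotations:
        # fnmatchcase's "*" also matches "/", so "dom/*" covers subdirectories.
        if fnmatchcase(path, pattern):
            result.update(metadata)
    return result
```

Calling metadata_for("dom/media/decoder.cpp") would pick up the most specific component plus the suggested reviewer, while a path matched only by "*" falls back to the general component.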

This work should enable all kinds of awesome tooling and developer
productivity wins.

If anyone has any other crazy ideas for what metadata to capture in files to help improve processes, I'm definitely interested in
hearing them!


Nicholas Nethercote

Dec 9, 2014, 3:54:35 PM
to Gregory Szorc, dev-platform
On Tue, Dec 9, 2014 at 10:46 AM, Gregory Szorc wrote:
> I discussed options with Mike Hommey and we believe that files are
> the appropriate default location for this metadata. We considered
> alternatives such as Python sandboxes under a different
> filename and standalone JSON or YAML files. We like because it is
> a fully customizable Python environment that already exists and therefore
> doesn't require much effort to stand up and doesn't fragment source of
> truth.

Sounds reasonable to me!


Benoit Girard

Dec 10, 2014, 12:30:23 PM
to Gregory Szorc, dev-platform
On Tue, Dec 9, 2014 at 1:46 PM, Gregory Szorc wrote:

> * Building a subscription service for watching code and reviews

They all sound great. Except I'm not sure what you mean by this one. Are
you suggesting that we have something like a list of email in to
register for updates to a component? I'm not sure I'd like to see people
committing to the tree to register/CC themselves. But maybe you had
something else in mind?

Ehsan Akhgari

Dec 10, 2014, 12:48:21 PM
to Benoit Girard, Gregory Szorc, dev-platform
Yeah, I hope that is not what's being suggested here either!

Philipp Kewisch

Dec 10, 2014, 1:05:25 PM
On 12/9/14 7:46 PM, Gregory Szorc wrote:
> * Building a subscription service for watching code and reviews

I think this would be quite interesting, but I'm also not quite sure
what metadata you would put into the files.

In the past I've built a small script that uses Mozilla Pulse to figure
out when a commit changes the UUID of an IDL file, or when such a file is
added or deleted. I had a lot of ideas on how this could be built into a
real service that allows core or addon developers to better watch whether
changes to the tree could affect their own code. Maybe this would
be interesting to a larger group of people, resulting in such a service
to watch code.

I also thought it would be pretty cool for Thunderbird (or other apps,
for that matter) to use the static analysis data from DXR to determine
the chance of toolkit code changes requiring changes in Thunderbird. I
never looked into this though.

Sorry if I am off topic, I just thought I'd throw my ideas out there in
case a code watching service becomes real.


Gregory Szorc

Dec 10, 2014, 1:12:28 PM
to Ehsan Akhgari, Benoit Girard, dev-platform
On 12/10/14 9:48 AM, Ehsan Akhgari wrote:
> On 2014-12-10 12:30 PM, Benoit Girard wrote:
>> On Tue, Dec 9, 2014 at 1:46 PM, Gregory Szorc wrote:
>>> * Building a subscription service for watching code and reviews
>> They all sound great. Except I'm not sure what you mean by this one. Are
>> you suggesting that we have something like a list of email in
>> to
>> register for updates to a component? I'm not sure I'd like to see people
>> committing to the tree to register/CC themselves.
> Yeah, I hope that is not what's being suggested here either!

Subscription would not be in tree. Instead, metadata about grouping and
labels for files/directories/modules would be in the tree to make
subscriptions easier to manage. And even then I'm not convinced that is
much better than just letting people manage their own filters. I wanted
an extra bullet point, OK :)
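
A rough sketch of that split, assuming directory labels live in the tree while the subscriptions themselves are stored by an external service. All names and addresses below (DIRECTORY_LABELS, SUBSCRIPTIONS, subscribers_for) are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical in-tree metadata: labels attached to directories.
DIRECTORY_LABELS = {
    "dom": {"dom"},
    "dom/media": {"media"},
    "toolkit": {"toolkit"},
}

# Hypothetical out-of-tree data: who watches which labels.
SUBSCRIPTIONS = {
    "carol@example.com": {"media"},
    "dave@example.com": {"dom", "toolkit"},
}


def labels_for(path, directory_labels=DIRECTORY_LABELS):
    """Collect the labels of every ancestor directory of `path`."""
    labels = set()
    for parent in PurePosixPath(path).parents:
        labels |= directory_labels.get(str(parent), set())
    return labels


def subscribers_for(path, subscriptions=SUBSCRIPTIONS):
    """Return the sorted watchers whose labels overlap the file's labels."""
    labels = labels_for(path)
    return sorted(user for user, wanted in subscriptions.items()
                  if wanted & labels)
```

Nobody commits to the tree to subscribe; only the grouping changes with the code.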

David Burns

Jan 22, 2015, 6:53:41 PM
to Gregory Szorc, dev-platform
The last bullet for me is the killer feature. I recently hit an issue where
I made a fairly big change to an API, updated all the consumers that
I was aware of, and even ran a try push for the "happy" set. Unfortunately this
burnt the tree.

I see this situation as a bigger waste of resources (sheriffs' time,
infrastructure time) than people not compiling their code and pushing to a

Obviously there is an issue that annotating the tree will only give you the
"happy" set, but that is much better than what we have now and would
hopefully remove the need for people to work out what they need as try
syntax; it would be done for them.
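
As an illustration of what "done for them" could look like, here is a minimal sketch that maps a set of changed files to the jobs annotated for them. The pattern table and job names are hypothetical, and a real mapping would only ever be as complete as the annotations in the tree.

```python
from fnmatch import fnmatchcase

# Hypothetical annotations: path pattern -> automation jobs affected.
JOB_ANNOTATIONS = {
    "testing/marionette/*": {"marionette"},
    "dom/*": {"mochitest", "reftest"},
    "js/*": {"jit-test"},
}


def jobs_for_change(changed_files, annotations=JOB_ANNOTATIONS):
    """Return the sorted union of jobs annotated for any changed file."""
    jobs = set()
    for path in changed_files:
        for pattern, names in annotations.items():
            if fnmatchcase(path, pattern):
                jobs |= names
    return sorted(jobs)
```

A change touching dom/ and js/ would select the mochitest, reftest, and jit-test jobs, while files with no annotation select nothing, so a real tool would need a fallback such as running everything.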


On 9 December 2014 at 18:46, Gregory Szorc wrote:

> In Portland, there were a number of discussions around ideas and features
> that could be easier implemented if only we had better metadata and
> annotations for source files. For example:
> * Suggested reviewers for a patch
> * Determine the Bugzilla component for a failing test
> * Determine the Bugzilla component for a changed file so a bug can be
> filed automatically
> * Building a subscription service for watching code and reviews
> * Defining what static analysis should run on a given source file
> * Mapping changed files to impacted automation jobs (useful for minimizing
> automation that runs)
> There is pretty much universal consensus that as much metadata as possible
> should live in the tree, next to the things being annotated. This is in
> contrast to how current systems like Bugzilla's suggested reviewers feature
> operate, which is to establish a separate service/data store, essentially
> fragmenting the source of truth and introducing one-off change processes.
> I discussed options with Mike Hommey and we believe that files
> are the appropriate default location for this metadata. We considered
> alternatives such as Python sandboxes under a different
> filename and standalone JSON or YAML files. We like because it is
> a fully customizable Python environment that already exists and therefore
> doesn't require much effort to stand up and doesn't fragment source of
> truth.
> This should not be a surprise: capturing non-build metadata in
> files was always an eventual goal. There is already precedence for this in
> defining the Sphinx documentation [1]. We just haven't had a good reason or
> time to add more things. Until now.
> In the weeks and months ahead, expect to start seeing work to integrate
> extra metadata into files. This may require refactoring some
> files. We'll need to support a world where files can be
> evaluated before configure is executed (so any tool with a copy of the
> source and the Python package for reading files can extract
> metadata in milliseconds).
> This work should enable all kinds of awesome tooling and developer
> productivity wins.
> If anyone has any other crazy ideas for what metadata to capture in
> files to help improve processes, I'm definitely interested in
> hearing them!
> [1]
> Documentation/index.html#adding-documentation

Gregory Szorc

Feb 17, 2015, 6:46:36 PM
to Gregory Szorc, dev-platform
I'm starting to work on this implementation.

Feel free to comment on the diffs. is where it starts to get