Switching from PYthon Literal-based config files to JSON5

Dirk Pranke

Jan 12, 2017, 9:26:30 PM
to infr...@chromium.org, Sasha Bermeister, Kevin Liu
Hi all,

Recently some people on the Blink team have wanted to move from .in config files to a more common, readable, and manageable config format. We discussed switching to JSON, PYthon Literal (.pyl) files, and YAML, but settled on JSON5.

We use .pyl files in a few places in infra as well. I suggest we switch from PYL to JSON5 there too.

Advantages:
- consistency w/ dev standards
- IMO JSON5 is strictly more readable than PYL files
- well-defined syntax
- is at least something that people might've heard of 

Disadvantages:
- parser not built in to Python (there is a json5 library on PyPI -- that I wrote -- that we'd have to install or check in)
- it's easy enough to define a subset of PYL as well (which I have also done)
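
To make the first disadvantage concrete, here is a minimal sketch of a loader that picks a parser by file extension. It assumes the PyPI `json5` package is what would get installed or checked in, and it defers that import so only `.json5` files depend on it. `parse_config` and the sample file names are hypothetical, not existing infra code.

```python
import ast
import json


def parse_config(filename, text):
    """Parse config text, choosing the parser from the file extension."""
    if filename.endswith('.pyl'):
        # Built in: evaluates Python literals only, never arbitrary code.
        return ast.literal_eval(text)
    if filename.endswith('.json5'):
        # Third-party: deferred so .pyl/.json users don't need the package.
        try:
            import json5
        except ImportError as e:
            raise RuntimeError('json5 parser is not installed') from e
        return json5.loads(text)
    if filename.endswith('.json'):
        return json.loads(text)
    raise ValueError('unknown config format: %s' % filename)


print(parse_config('builders.pyl', "{'builders': ['linux',]}"))
print(parse_config('builders.json', '{"builders": ["linux"]}'))
```

Only the `.json5` branch can fail because of a missing dependency, which is the deployment cost being weighed here.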

I don't think this is an urgent thing.

I also don't want to confuse this with using protobufs. For situations where we need well-defined schemas, and where we can have generated parsing stubs, I tend to agree that we should use protobufs instead. But, I don't think we're at a point where we can just require protobufs for everything.

What do others think? Is there a good argument for keeping .pyl files? Separately, there are a few places where we use .json files for configuration, which is terrible. If we can switch from PYL to JSON5, switching from .json to .json5 seems also easy enough, but that's a separate, follow-on discussion.

-- Dirk



Sergiy Byelozyorov

Jan 13, 2017, 5:02:45 AM
to Dirk Pranke, infr...@chromium.org, Sasha Bermeister, Kevin Liu
This was discussed before: https://groups.google.com/a/google.com/d/msg/chrome-infra/OJLcUVOSmUw/SHcSRLh3wiAJ (sorry, internal only). IIRC, last time we decided to use proto3 text format, e.g. https://chromium.googlesource.com/chromium/src/+/master/infra/config/cq.cfg.

--
You received this message because you are subscribed to the Google Groups "infra-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to infra-dev+...@chromium.org.
To post to this group, send email to infr...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/infra-dev/CAEoffTCWtzu87MQoRJ5sYcPVX6P9iZArVLytu38pT1BDKJJnOA%40mail.gmail.com.
--
Sergiy Byelozyorov | Software Engineer | ser...@google.com

Google Germany GmbH
Erika-Mann-Strasse 33
80636 München

AG Hamburg, HRB 86891 | Sitz der Gesellschaft: Hamburg | Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle

Dirk Pranke

Jan 13, 2017, 12:01:21 PM
to Sergiy Byelozyorov, infr...@chromium.org, Sasha Bermeister, Kevin Liu
Yup. However, I ruled that out for this discussion :). Discussing whether we should switch the existing uses of .pyl to protos is fair, but that's much more work for very little gain, so I don't want to consider it at this time.

-- Dirk

Aaron Gable

Jan 13, 2017, 12:13:09 PM
to Dirk Pranke, Sergiy Byelozyorov, infr...@chromium.org, Sasha Bermeister, Kevin Liu
I think there were always two main arguments for .pyl files:
1) They had better syntax than .json files (supported comments, trailing commas, etc)
2) They had perfect language support in our main language
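
Both arguments can be seen in a few lines of standard-library Python. This is only an illustrative sketch: the config contents are made up, and `ast.literal_eval` is assumed here as the way a .pyl file gets read.

```python
import ast
import json

PYL_TEXT = """# Comments are allowed.
{
  'builders': ['linux', 'mac',],  # so are trailing commas
  'timeout_secs': 3600,
}
"""

# Argument 2: the parser ships with Python itself.
config = ast.literal_eval(PYL_TEXT)
print(config['builders'])

# Argument 1: strict JSON rejects the same text (comments, single quotes,
# trailing commas are all errors there).
try:
    json.loads(PYL_TEXT)
    strict_json_ok = True
except json.JSONDecodeError:
    strict_json_ok = False
print(strict_json_ok)
```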

JSON5 does away with the first argument. The second, however, still stands. There are no standard, packaged, bug-free implementations of the JSON5 parser. Not to cast any disparagement on your implementation, but it's clearly not battle-tested the way the Python parser itself is, and it isn't deployed everywhere. To switch to JSON5, we'd need to make sure that the parser library is available everywhere, and while we've gotten a lot better at that, it's still non-trivial.

Basically, I agree with you on all points except one. JSON5 seems like the better solution for everything that we're currently using .pyl for. It's more standard, it's forwards-compatible, it satisfies our documentation and formatting needs. (I disagree that it is strictly more readable; the only real difference is that keys can be unquoted, unless they contain characters like '-', but that makes it less readable in my heavily-python-influenced opinion.)
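
The quoting rule mentioned in the parenthetical can be sketched as follows. JSON5 lets a key go unquoted when it is a valid ECMAScript 5.1 identifier name; the check below is an ASCII-only simplification of that rule (the spec also admits Unicode letters and escapes), and the helper name and sample keys are hypothetical.

```python
import json
import re

# ASCII-only approximation of an ECMAScript 5.1 IdentifierName.
_ASCII_IDENTIFIER = re.compile(r'[A-Za-z_$][A-Za-z0-9_$]*')


def json5_key(key):
    """Spell a key the way a JSON5 file could: bare if allowed, else quoted."""
    if _ASCII_IDENTIFIER.fullmatch(key):
        return key
    return json.dumps(key)


for key in ('timeout_secs', 'chromium-linux', '2fast'):
    print(json5_key(key))
```

Keys such as `chromium-linux` still need quotes, so a file with mixed key styles is the likely outcome.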

The point where I actually disagree is that switching would be worthwhile. I think we have a few parallel directions forward:
1) New services that need configuration will be written in Go and configured with Protobufs
2) Old services written in python and using .pyl should be rewritten in Go and configured with protobufs
3) Old services written in python that won't be rewritten are probably not worth upgrading to JSON5 either

I'd be totally fine with saying that any new configuration files should be .proto or .json5. I don't think there's a convincing argument for perpetuating our use of .pyl. I'm just not sure there's a convincing argument for expending effort on finding and changing current instances of .pyl either, since they have essentially 0 maintenance cost.

Dirk Pranke

Jan 13, 2017, 12:24:46 PM
to Aaron Gable, Sergiy Byelozyorov, infr...@chromium.org, Sasha Bermeister, Kevin Liu
Your points are good ones, except that my implementation is definitely buggy and not battle-tested or tuned, so disparagement is definitely appropriate there :).
 
I don't think the existing .pyl files have zero maintenance cost, since new people who look at them have to learn what the format is and why it's different from the other formats we might use. So, that's an argument for consolidating formats wherever possible, and for not allowing .json5 as long as we still have .pyl.

It's also an argument for pushing more strongly to convert .pyl to protobufs, though.

From what I can tell, we have the following uses of .pyl:

in infra:
- builders.pyl files (a fair number of them)
- recipe_engine/bootstrap/deps.pyl -- I'm not actually sure why this isn't a protobuf?
- scripts/slave/logdog-params.pyl

in src:
- mb_config.pyl
- gn_isolate_map.pyl
- bootstrap/deps.pyl, again
- tools/determinism/deterministic_build_whitelist.pyl
- tools/gritsettings/translation_expectations.pyl
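
If any of these files were converted, the mechanical part could look roughly like the sketch below. It leans on the fact that every JSON document is also valid JSON5, so the standard `json` module can write the output even without a JSON5 library. The function name and sample data are hypothetical, and the approach has real limits: comments in the .pyl source are lost, tuples become lists, and non-string dict keys are coerced or rejected by `json.dumps`.

```python
import ast
import json


def pyl_to_json5(pyl_text):
    """Re-serialize Python-literal config text as JSON (a JSON5 subset)."""
    data = ast.literal_eval(pyl_text)
    return json.dumps(data, indent=2, sort_keys=True) + '\n'


SAMPLE_PYL = """{
  # Hypothetical contents, not a real builders.pyl.
  'builders': {
    'linux_rel': {'os': 'linux', 'tags': ('release', 'x64')},
  },
}
"""

converted = pyl_to_json5(SAMPLE_PYL)
print(converted)
```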

As I said, this isn't an urgent thing, so I guess the right thing to do is to table this for now, until such time as I think we do have a json5 parser solution that is stable and viable.

-- Dirk
 

Aaron Gable

Jan 13, 2017, 12:43:01 PM
to Dirk Pranke, Aaron Gable, Sergiy Byelozyorov, infr...@chromium.org, Sasha Bermeister, Kevin Liu


Robert Iannucci

Jan 13, 2017, 12:57:00 PM
to Aaron Gable, Dirk Pranke, Sergiy Byelozyorov, infr...@chromium.org, Sasha Bermeister, Kevin Liu
Adding slightly more noise to this discussion :). From my point of view, the situation is this:
  * It would be good to use textproto (or jsonpb**) everywhere. Even though it "forces you to have a schema", json and pyl secretly force you to have one as well; it's just encoded (as code) in whatever script(s) interpret the file. This can make the config difficult to work with (adding features, documentation, refactoring, writing new implementations of the scripts, writing other scripts to manipulate/generate it). Proto is better situated here because the protocol definition is independent of the language that is interpreting it (which makes it much easier to make multiple tools that can work with it at the same time).
  * Using proto with python is a pain in the ass. It requires precompilation and a system-installed C extension, which very few chromium python environments are capable of supporting today.
  * So we consider "schemaless" formats with pure-python implementations like pyl, yaml and json5
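
The first bullet's claim that json and pyl carry a hidden schema can be illustrated with a sketch like this one. The field names (`builders`, `os`, `timeout_secs`) are invented for the example; the point is only that the checks a .proto file would declare end up as ad hoc Python in whichever script reads the file.

```python
import ast


def validate(config):
    """Return a list of problems; an empty list means the config is usable."""
    if not isinstance(config, dict):
        return ['top level must be a dict']
    builders = config.get('builders')
    if not isinstance(builders, dict):
        return ["'builders' must be a dict"]
    errors = []
    for name, spec in sorted(builders.items()):
        if not isinstance(spec, dict):
            errors.append('%s: must be a dict' % name)
            continue
        # Required field: every builder needs an 'os' string.
        if not isinstance(spec.get('os'), str):
            errors.append('%s: missing string field os' % name)
        # Optional field: 'timeout_secs', if present, must be an int.
        if not isinstance(spec.get('timeout_secs', 0), int):
            errors.append('%s: timeout_secs must be an int' % name)
    return errors


good = ast.literal_eval("{'builders': {'linux_rel': {'os': 'linux'}}}")
bad = ast.literal_eval("{'builders': {'mac_rel': {'timeout_secs': '60'}}}")
print(validate(good))
print(validate(bad))
```

Any second tool that reads the same file, in Python or another language, would have to repeat these checks by hand.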

JSON5 has one more unique advantage over PYL which hasn't been discussed: since the spec for it is not language-dependent, it's capable of being interpreted in multiple languages (such as Python, C++ and Go). This advantage would remove one additional hurdle for porting an existing system, should that system be deemed portable.

As for the systems that infra actually uses:
  * deps.pyl is intrinsically python related (it specifies python dependencies in infra). It wouldn't be worth porting, imo, especially since it would be the mechanism by which we actually obtain a json5 parser in infra, unless we vendored the parser, which we've been pretty good at avoiding so far. If we port all the infra stuff to Go (unlikely and/or low-priority), then this would go away.
  * builders.pyl is sort-of intrinsically python related (it loads builder configs into the master). It MIGHT be worth porting, to make it feel closer to other config files in chromium/src, since devs may interact with it. However, luci-lite is aiming to remove the masters by Q2, so it's unclear to me if porting it would be worth it (I'm not sure what will happen to the builders.pyl file contents as part of that switch; they might turn into protos, or they might turn into JSON5 :)). If the switch to json5 took less than a week or some similarly short timeframe, it could potentially be worth it.
  * logdog-params.pyl is interpreted during a buildbot job start phase. This will definitely go away with the luci-lite switch (I believe kitchen already handles this).

There's also a holistic consistency argument (which I like): even though any single component may not be worth porting, it would be worth it for the overall system to have consistency. Because of the holistic argument, I'm not against json5, but due to the specific cases above I also don't think it's particularly high priority work. If someone has a burning desire to get it done, I certainly wouldn't stop them, but I wouldn't deprioritize anything else we have going on to work on it.

I think an actually better question might be: What would it take for us to make proto a viable option everywhere? (i.e. where a dev can just add a .proto file and start using it, and all the rest of the plumbing works). Right now the friction for using proto is very high outside of Go (and recipes, where dnj sacrificed his own longevity to mash that toolchain into shape). This question may be reducible to "what would it take for us to make proto a viable option in chromium/src (and related projects)?" (and the sibling question "If using proto in chromium/src was effortless, would folks want to use it?").

My $0.02
R

** If we do want to use jsonpb, I would not be opposed to writing a json5pb library. I don't think it would actually be hard to do, either, and might be the best of many worlds.
