There was an item on the agenda [1] for last week's TUF community meeting on the use of Canonical JSON. As the call didn't have many participants from non-reference TUF implementations, I thought it would be worthwhile to bring the discussion to the list.
The agenda item sought to answer the following questions:
- Are TUF users/implementers happy with Canonical JSON?
- If so, do we want to try and rally around a Canonical JSON specification?
- If not, do we want to try and switch to something else with wider cross-language support?
An additional question that wasn’t part of the original agenda is:
- Should the TUF specification even cover the wireline format? Or should it describe useful features of a wireline format and leave the choice up to implementers (the spec could still use JSON in the examples, so long as it’s made clear that this is not a requirement)?
For context, several concerns have been raised regarding the use of Canonical JSON. Below is a non-exhaustive collection of items from the discussion on the call, as well as from various issues filed against TUF (specification #92 [2], securesystemslib #159 [3] and tuf #457 [4]):
- Per the TUF specification: “All documents use a subset of the JSON object format, with floating-point numbers omitted. When calculating the digest of an object, we use the "canonical JSON" subdialect as described at http://wiki.laptop.org/go/Canonical_JSON” – that is, TUF metadata is a JSON subset which is canonicalized to calculate digests and signatures.
- Canonical JSON, as defined by the wiki page above, is not valid JSON. Therefore, implementers cannot use standard JSON parsers on the canonicalized metadata.
- There is an active IETF draft to standardise a JSON Canonicalization Scheme [5].
- Different TUF implementations use different canonicalization schemes. For example, the Go implementation of TUF used in Notary has a different Canonical JSON implementation from the Python reference implementation of TUF, so the two systems create incompatible TUF metadata.
- It may be desirable to have the wire format match what is verified (explicitly not requiring canonicalization/encoding before checking the signature). This would mean implementers do not need to include the JSON parser in the TCB, as it will only parse trusted (verified) metadata.
- JSON allows duplicated keys, whereas TUF should (must?) not.
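To make two of the points above concrete (canonicalized output is not always valid JSON, and standard parsers accept duplicate keys), here is a minimal Python sketch. It is an illustration under stated assumptions, not the code of any TUF implementation: `encode_canonical`, `parses_as_standard_json` and `load_rejecting_duplicates` are hypothetical names, and the encoder follows only one reading of the wiki page's rules (sorted keys, no whitespace, no floats, only `\` and `"` escaped in strings).

```python
import json


def encode_canonical(obj):
    """Encode obj following one reading of the OLPC Canonical JSON rules."""
    if obj is None:
        return "null"
    if obj is True:
        return "true"
    if obj is False:
        return "false"
    if isinstance(obj, int):
        return str(obj)
    if isinstance(obj, str):
        # Only backslash and double quote are escaped; control characters
        # such as a newline are emitted raw, which standard JSON forbids.
        return '"' + obj.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(obj, (list, tuple)):
        return "[" + ",".join(encode_canonical(v) for v in obj) + "]"
    if isinstance(obj, dict):
        if not all(isinstance(k, str) for k in obj):
            raise TypeError("object keys must be strings")
        members = (
            encode_canonical(k) + ":" + encode_canonical(obj[k])
            for k in sorted(obj)  # assumption: code point order is acceptable
        )
        return "{" + ",".join(members) + "}"
    # Floats (and any other type) are rejected.
    raise TypeError(f"cannot canonicalize {type(obj).__name__}")


def parses_as_standard_json(text):
    """Return True if Python's standard (strict) JSON parser accepts text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def _no_duplicates(pairs):
    # object_pairs_hook receives every key/value pair, duplicates included.
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate key in JSON object")
    return dict(pairs)


def load_rejecting_duplicates(text):
    """Parse JSON, raising ValueError where json.loads would keep the last value."""
    return json.loads(text, object_pairs_hook=_no_duplicates)


if __name__ == "__main__":
    canonical = encode_canonical({"b": 1, "a": "line1\nline2"})
    print(parses_as_standard_json(canonical))  # False: raw newline inside a string
    print(json.loads('{"a": 1, "a": 2}'))      # {'a': 2}: duplicate silently accepted
```

With a default `json.loads`, the canonical form of any string containing a control character fails to parse, and a document with a repeated key is accepted with the last value winning, so an implementation would need something like the `object_pairs_hook` check to enforce uniqueness.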
Some possible outcomes of this discussion could be:
- Change the spec to RECOMMEND use of a specific Canonical JSON specification, i.e. the DRAFT IETF JSON Canonicalization Scheme spec [5].
- Change the spec to RECOMMEND use of a different structured file format with more uniform cross-language support.
- Change the spec to generalise the format of metadata documents, allowing implementers free rein in metadata format.
Regards,
Joshua
1. https://docs.google.com/document/d/1p4i2Bu3I2QbcJgw3w65KX0yy5Aqy_GeOA8INJ63iLLc/edit#
2. https://github.com/theupdateframework/specification/issues/92
3. https://github.com/secure-systems-lab/securesystemslib/issues/159
4. https://github.com/theupdateframework/tuf/issues/457
5. https://datatracker.ietf.org/doc/draft-rundgren-json-canonicalization-scheme/
6. https://github.com/theupdateframework/taps/blob/master/tap11.md