W3C Trace Context

Andrew Arnott

Sep 28, 2020, 1:13:25 PM
to JSON-RPC
Given the W3C Trace Context recommendation, has anyone already decided how to add the context IDs (e.g. traceparent, tracestate) to a JSON-RPC message? I'm interested in adding this to the StreamJsonRpc library, but in a way that interoperates with other libraries.

--
Andrew Arnott
"I [may] not agree with what you have to say, but I'll defend to the death your right to say it." - S. G. Tallentyre

Nathan Fischer

Sep 30, 2020, 8:08:27 PM
to JSON-RPC
I've been meaning to add support for mercury as well.

My plan was simply to add traceparent and tracestate as top-level fields. traceparent would have an object value, and the traceparent fields would become fields of that object. tracestate would have an array value with object members, each of which would have a "key" field and a "value" field.
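
A minimal sketch of what that structured layout could look like on the wire. The field names inside traceparent are assumed to mirror the Trace Context header fields, the method name is a placeholder, and none of this is an agreed format:

```python
import json

# Hypothetical structured encoding; the field names are assumptions taken
# from the Trace Context header fields, not from any published spec.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "example.method",  # placeholder method name
    "traceparent": {
        "version": "00",
        "trace-id": "4bf92f3577b34da6a3ce929d0e0e4736",
        "parent-id": "00f067aa0ba902b7",
        "trace-flags": "01",
    },
    # An array (not an object) so that the entry order is preserved.
    "tracestate": [
        {"key": "vendor1", "value": "value1"},
        {"key": "vendor2", "value": "value2"},
    ],
}

wire = json.dumps(request)
print(wire)
```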

The best way to ensure interoperability would be to write something up and add it to https://www.w3.org/TR/trace-context-protocols-registry. Let me know if you have any interest in collaborating on that.

Andrew Arnott

Oct 1, 2020, 3:56:00 PM
to JSON-RPC
Hi Nathan,

Yes, let's collaborate! I've never contributed to a w3c recommendation, so I'm excited at the prospect.

I appreciate your initial thoughts on the design. I wonder though, wouldn't it make for a more compact and recognizable format if we just have two string properties: traceparent and tracestate, that match formatting with the HTTP headers?
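
For comparison, a sketch of the two-string alternative, where each property carries the same text the HTTP header would. The method name and the vendor entries are placeholders:

```python
import json

# Two string properties whose values match the HTTP header formatting.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "example.method",  # placeholder method name
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    "tracestate": "vendor1=value1,vendor2=value2",
}

wire = json.dumps(request)
print(wire)
```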

Nathan Fischer

Oct 1, 2020, 7:33:47 PM
to JSON-RPC
cool, me too.

I thought about just using the header format. It seems to me that the trade-off is that with the original format, all the keys and values of either header can just be passed around as one string. But breaking them into json makes them easier to parse and manipulate (which presumably our jsonrpc libraries would do).
The original format is a little more compact, but we could make it even more so by using json and switching to base64 instead of hex, but at the cost of straying even further from the existing spec.
Finally, it just feels more natural to have data represented in the json ast when using a json protocol, rather than the custom parsing format that http headers necessitate.

I don't think any of those considerations are decisive, and it's actually the more subjective idea of encoding data in json on the json protocol that tips it for me.

I also had the thought that the protocol could support both. If the value is a string then parse it like you parse the headers, if it is an object/array then read it in json. But I'm leaning towards thinking that the flexibility would not be worth it.

What do you think?

Andrew Arnott

Oct 4, 2020, 9:33:54 AM
to JSON-RPC
I suspect we may end up having to support multiple formats in our implementations inevitably due to the need to interop with other libraries that didn't coordinate on this mailing list. Proactively supporting two formats seems premature then, since we might end up supporting 2+1 instead of 1+1 formats. :) Like you, I'd prefer to settle on just one format whatever that may be.

Do you know why the spec for HTTP header encoding chose hex instead of base64? Base64 would have been compatible, and since its representation is shorter as you say, I wonder what benefit we haven't thought of that led them to choose hex, and might that apply to our JSON encoding as well?
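
To put rough numbers on the size difference, here is a small sketch that compares the hex and standard padded base64 lengths of the two binary fields. The sample values are the ones used elsewhere in this thread:

```python
import base64

def encoded_lengths(raw: bytes) -> tuple:
    """Return (hex length, base64 length) in characters for raw bytes."""
    return len(raw.hex()), len(base64.b64encode(raw))

trace_id = bytes.fromhex("4bf92f3577b34da6a3ce929d0e0e4736")  # 16 bytes
parent_id = bytes.fromhex("00f067aa0ba902b7")  # 8 bytes

for name, raw in (("trace-id", trace_id), ("parent-id", parent_id)):
    hex_len, b64_len = encoded_lengths(raw)
    print(f"{name}: {len(raw)} bytes, hex={hex_len} chars, base64={b64_len} chars")
# trace-id: 16 bytes, hex=32 chars, base64=24 chars
# parent-id: 8 bytes, hex=16 chars, base64=12 chars
```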
HTTP certainly could have broken up these values into separate headers as well, but chose to represent as just one. I don't know the history of this, but it might be for fewer bytes.
Consider too that the spec offers two levels of participation: raw propagation or contributing/changing the values. For raw propagation, simple strings are certainly easier to implement support for than having to parse JSON or creating JSON. Someone sitting on the bridge between JSON-RPC and HTTP will have to convert between these two formats as opposed to just copying the strings over.
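
A sketch of the raw-propagation case at such a bridge, assuming the JSON-RPC message carries the same two strings as the HTTP headers. The function names are hypothetical:

```python
TRACE_FIELDS = ("traceparent", "tracestate")

def http_to_jsonrpc(headers: dict, message: dict) -> dict:
    """Copy trace headers from an HTTP request onto a JSON-RPC message."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    out = dict(message)
    for field in TRACE_FIELDS:
        if field in lowered:
            out[field] = lowered[field]  # no parsing, just a copy
    return out

def jsonrpc_to_http(message: dict, headers: dict) -> dict:
    """Copy trace properties from a JSON-RPC message onto HTTP headers."""
    out = dict(headers)
    for field in TRACE_FIELDS:
        if isinstance(message.get(field), str):
            out[field] = message[field]
    return out

incoming = {"Traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
message = http_to_jsonrpc(incoming, {"jsonrpc": "2.0", "id": 1, "method": "example.method"})
print(message["traceparent"])
```

With a structured JSON encoding, both functions would need a parse or format step in place of the plain copy.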

Is there precedent for other encoding besides the single-string one presented by the HTTP header spec? If there is, such that breaking the values up is common, I'll be less concerned. But if folks tend to just reuse that single-string encoding across many protocols, I think we should follow that. 
Keeping with the single-string representation will be recognizable, and it's quite possible that someone implementing it will have access to code that can already parse the string due to their HTTP support elsewhere. If we change everything (hex to base64, string to JSON object, etc.) then nothing can be reused in their software for handling this data. 
I can imagine someone reading our spec where it's broken up as JSON objects and arrays and saying "whoa, that's too complicated. I'll just use strings". But if we go with strings, I struggle to imagine someone ignoring our spec to go with objects instead. So in the interest of writing the first spec for JSON-RPC that people will be more likely to follow, strings seem preferable.

That said, if we're writing a W3C recommendation, I value consensus over some of these reasons for keeping with the simple strings, so I'm flexible. But I believe simple strings would be preferable.

Nathan Fischer

Oct 4, 2020, 2:01:25 PM
to JSON-RPC
> I suspect we may end up having to support multiple formats in our implementations inevitably due to the need to interop with other libraries that didn't coordinate on this mailing list.

That makes sense, I agree.


> Do you know why the spec for HTTP header encoding chose hex instead of base64?

I do not. Poking around their GitHub, there isn't a discussion of base64, but they have talked about removing the fixed-length binary requirement on some fields altogether (in favor of any old string).
Side note, I have the same question about UUID. Although I'm not sure if the same readability arguments apply.


> Someone sitting on the bridge between JSON-RPC and HTTP will have to convert between these two formats as opposed to just copying the strings over.

That's a good point. I want to say that it's trivial to convert 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01 to something like {"version": 0, "trace-id": "4bf92f3577b34da6a3ce929d0e0e4736", "parent-id": "00f067aa0ba902b7", "trace-flags": 1}, but experience tells me that even doing something trivial can seem burdensome compared to doing nothing.
Also, just typing that out I had the compulsion to convert 'version' and 'trace-flags' to numbers because a) that would be the natural JSON type and b) it's smaller on the wire. But that's another degree of ambiguity that we would need to specify.
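
A minimal sketch of that conversion in both directions. It assumes version 00 with exactly four fields and keeps every field as a string, which sidesteps the number-versus-string question rather than answering it. The function names are hypothetical:

```python
FIELD_NAMES = ("version", "trace-id", "parent-id", "trace-flags")

def parse_traceparent(header: str) -> dict:
    """Split a version-00 traceparent string into named string fields."""
    parts = header.split("-")
    if len(parts) != len(FIELD_NAMES):
        # Later versions may append fields; this sketch does not handle them.
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))

def format_traceparent(fields: dict) -> str:
    """Rebuild the header string from the named fields."""
    return "-".join(fields[name] for name in FIELD_NAMES)

EXAMPLE = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
fields = parse_traceparent(EXAMPLE)
print(fields)
print(format_traceparent(fields) == EXAMPLE)
```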

> Is there precedent for other encoding besides the single-string one presented by the HTTP header spec? If there is, such that breaking the values up is common, I'll be less concerned. But if folks tend to just reuse that single-string encoding across many protocols, I think we should follow that. 

Yes, there are drafts of a binary format and a format for MQTT. The binary format breaks the fields up, converts the hex to binary, and re-combines the fields with field delimiters and length prefixes and such. The MQTT format only specifies for v5, but makes a recommendation for v3. v5 uses headers, so it uses the same formats as HTTP ( 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01 ). v3 is just a recommendation, but suggests that if the body is json, to use simple strings in the outermost object. I doubt we will see implementations of the v3 recommendation in libraries, and the spec is just a draft, but I do think it might be a bit reckless for them to recommend injecting a field with a name like 'tracestate' into json payloads whose format they know nothing about. On one hand it would be compatible with a simple string implementation of json-rpc over mqtt, but on the other it might overwrite a pre-existing value that we had put there already.

> we might end up supporting 2+1 instead of 1+1 formats

I think this is persuasive. I don't think that we would end up with 2+1 if we write for W3C, but the object format might take longer to get consensus on.
It seems like if we have one format for a first version and an inevitable additional format in a second version, and one of those formats is going to be the simple string, then it's sensible to have the simple string as the first version. It also makes for a pretty short first version of the spec.

Andrew Arnott

Oct 4, 2020, 2:39:08 PM
to JSON-RPC
LOL. Well, even as you seem to be coming over to the side of simple strings, I'm coming closer to the side of a JSON object. :)  The reason being that we encode JSON-RPC as UTF-8 as well as MessagePack, and when using MessagePack it would make sense to encode the binary portions in raw binary as you mentioned MQTT does. 
Considering that anyone who does anything more than propagate the trace headers actually must parse it, I suppose exposing their individual components internally to the library and its callers will be important. But that's just an internal detail -- my library for example could certainly render the trace data as the simple strings but expose it through its public API as the object model similar to what you proposed.

So for simplicity and following precedent of the wire protocol, strings seem to win out.
But for compactness and simplicity of more interesting implementations, the deeper object model you describe seems to win out.

So I'm split. Are there other considerations we should make? Does anyone else on this mailing list want to chip in?

Let's consider your original proposal a bit. It included:
> tracestate would have an array value with object members, each of which would have a "key" field and a "value" field.

I read this to mean we'll have something like this:
tracestate: [{"key": "vendor1", "value": "value1"},{"key": "vendor2","value":"value2"}]

That seems pretty verbose to me. I guess the spec doesn't mandate that the keys are unique from what I can find, otherwise we could just do this:
tracestate: { "vendor1": "value1", "vendor2": "value2" }
But another problem with this nicer format is that keys are allowed to start with a digit, which, while valid JSON, isn't a valid JavaScript identifier (go figure), so such properties can only be accessed with bracket notation in JavaScript. Reviewing the JSON spec, I guess JSON doesn't guarantee that the property names in a JSON object are unique either, so we could in fact use the most compact representation I have above, but I suppose it kind of defeats a primary goal of JSON-RPC if, being JSON, it can't be trivially parsed into a JavaScript object.
But what if we find a more compact form anyway? Something like:
tracestate: [["vendor1","value1"],["vendor2","value2"]]
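
To illustrate the duplicate-key concern, here is a sketch using Python's standard json module. Other parsers may behave differently, and the repeated vendor1 key is a made-up input:

```python
import json

# A JSON object that repeats a key; the text is accepted by the parser.
text = '{"vendor1": "value1", "vendor2": "value2", "vendor1": "value3"}'

# Parsed into a dict, the later duplicate silently replaces the earlier one.
as_dict = json.loads(text)
print(as_dict)

# object_pairs_hook receives every pair in document order, so nothing is
# lost, but the result is no longer a plain object.
as_pairs = json.loads(text, object_pairs_hook=list)
print(as_pairs)

# The nested-array form has no such ambiguity.
as_arrays = json.loads('[["vendor1","value1"],["vendor2","value2"]]')
print(as_arrays)
```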

Nathan Fischer

Oct 4, 2020, 10:38:12 PM
to JSON-RPC
> LOL. Well, even as you seem to be coming over to the side of simple strings, I'm coming closer to the side of a JSON object. :)

ha, ha. well good, it means that we'll have both sides out in detail by the time we're done.

> raw binary as you mentioned MQTT does.

To clarify, there are two separate formats. One for binary, one for MQTT (which is not binary). That use case makes sense, ubjson is my preferred flavor and it would be nice to have a compact representation for that.
We could just use the existing binary format for messagepack/ubjson. But then implementations need to support the proprietary trace context binary format rather than just rendering the JSON AST to binary like they already know how to do.

> I guess the spec doesn't mandate that the keys are unique from what I can find

Right, but more importantly the spec mandates that the entries be in order. JSON objects don't guarantee key ordering, so that requires arrays.
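
A sketch of reading the header-style tracestate string into an ordered list of pairs and writing it back. It assumes comma-separated key=value members with optional surrounding whitespace, and it does not validate key or value characters. The function names are hypothetical:

```python
from typing import List, Tuple

def parse_tracestate(header: str) -> List[Tuple[str, str]]:
    """Parse 'k1=v1,k2=v2' into a list of (key, value) pairs, in order."""
    entries = []
    for member in header.split(","):
        member = member.strip()
        if not member:
            continue  # skip empty list members
        key, _, value = member.partition("=")
        entries.append((key, value))
    return entries

def format_tracestate(entries: List[Tuple[str, str]]) -> str:
    """Join (key, value) pairs back into the header form, keeping order."""
    return ",".join(f"{key}={value}" for key, value in entries)

entries = parse_tracestate("vendor1=value1, vendor2=value2")
print(entries)
print(format_tracestate(entries))
```

A list keeps the entry order that the spec requires, which a JSON object would not guarantee.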

> But what if we find a more compact form anyway? Something like:
> tracestate: [["vendor1","value1"],["vendor2","value2"]]

I think that would be the most compact, and have been contemplating similar for traceparent since we started talking. But it's much more manual to parse into an object in most languages (not hard, just not automatic).
What about using short field names? [{"k": "vendor1", "v": "value1"},{"k": "vendor2","v":"value2"}]
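
For a rough sense of the trade-off, this sketch measures the compact JSON size of each candidate tracestate encoding discussed so far, using the two sample entries from this thread. Sizes will vary with real keys and values:

```python
import json

entries = [("vendor1", "value1"), ("vendor2", "value2")]

# The candidate encodings, each built from the same two entries.
encodings = {
    "header string": ",".join(f"{k}={v}" for k, v in entries),
    "long field names": [{"key": k, "value": v} for k, v in entries],
    "short field names": [{"k": k, "v": v} for k, v in entries],
    "nested arrays": [[k, v] for k, v in entries],
}

def compact_size(value) -> int:
    """Size in bytes of the value serialized as JSON without extra spaces."""
    return len(json.dumps(value, separators=(",", ":")).encode("utf-8"))

sizes = {name: compact_size(value) for name, value in encodings.items()}
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```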

> Does anyone else on this mailing list want to chip in?

+1


P.S. I have long been thinking about formalizing a method for "anonymous JSON" for the exact use case of traceparent that I mentioned, or for tracestate entries. Given some schema for an object, you could "anonymize it" into an array by removing the field names. On the other end, you could de-anonymize an array into an object given a schema. 
i.e.
{ version: 00, trace-id: 4bf92f3577b34da6a3ce929d0e0e4736, parent-id: 00f067aa0ba902b7, trace-flags: 01 }
-
[version, trace-id, parent-id, trace-flags]
=
[00, 4bf92f3577b34da6a3ce929d0e0e4736, 00f067aa0ba902b7, 01]
+
[version, trace-id, parent-id, trace-flags]
=
{ version: 00, trace-id: 4bf92f3577b34da6a3ce929d0e0e4736, parent-id: 00f067aa0ba902b7, trace-flags: 01 }
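
The round trip above can be sketched as a pair of small functions. The names anonymize and deanonymize are hypothetical, and the schema is simply an ordered tuple of field names:

```python
SCHEMA = ("version", "trace-id", "parent-id", "trace-flags")

def anonymize(obj: dict, schema: tuple) -> list:
    """Drop the field names, keeping the values in schema order."""
    return [obj[name] for name in schema]

def deanonymize(values: list, schema: tuple) -> dict:
    """Reattach the field names to a positional array."""
    if len(values) != len(schema):
        raise ValueError("array length does not match schema")
    return dict(zip(schema, values))

traceparent = {
    "version": "00",
    "trace-id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "parent-id": "00f067aa0ba902b7",
    "trace-flags": "01",
}

anonymous = anonymize(traceparent, SCHEMA)
print(anonymous)
print(deanonymize(anonymous, SCHEMA) == traceparent)
```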

Andrew Arnott

Oct 5, 2020, 9:51:35 AM
to JSON-RPC
> I have long been thinking about formalizing a method for "anonymous JSON"

Yes, yes, yes! (changing subject line to track the fork in the conversation)

As I'm defining a bunch of RPC contracts that I'll use over JSON-RPC I've been thinking about this. Not so much to minimize the overhead of the json-rpc envelope but rather the payload (args and return value). Particularly when the data is a large array, if the elements themselves are encoded as objects rather than arrays of raw values it can greatly increase the size of the payload. Defining a 'schema' as it were within the message or perhaps before that message so that I can encode everything as efficient arrays could be a huge win.
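
One way to picture the idea for a large array payload is to send the schema once inside the message and every element as a bare array. The "schema" and "rows" property names and the sample records are assumptions made for this sketch:

```python
import json

def pack(records: list, schema: tuple) -> dict:
    """Encode a list of objects as one schema plus positional rows."""
    return {
        "schema": list(schema),
        "rows": [[record[name] for name in schema] for record in records],
    }

def unpack(packed: dict) -> list:
    """Rebuild the list of objects from the schema and rows."""
    schema = packed["schema"]
    return [dict(zip(schema, row)) for row in packed["rows"]]

# Made-up sample data: 100 records with three fields each.
records = [{"id": i, "name": f"item{i}", "price": i * 1.5} for i in range(100)]
packed = pack(records, ("id", "name", "price"))

print(len(json.dumps(records)), "bytes as objects")
print(len(json.dumps(packed)), "bytes as schema + rows")
```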

I haven't given it a whole lot of thought yet, but it's an interesting area.
 
Nathan Fischer

Oct 5, 2020, 7:24:17 PM
to JSON-RPC
I don't think Google Groups acknowledges forking the conversation. Let's continue here?

Andrew Arnott

Oct 18, 2020, 10:56:09 AM
to JSON-RPC
> What about using short field names? [{"k": "vendor1", "v": "value1"},{"k": "vendor2","v":"value2"}]

I'm not a big fan of that due to readability. I think if we go with this schema approach (instead of the simple string) we should spell it out fully ("key", "value"). If it's too verbose, folks can always apply compression at a lower level.

After a week-long vacation, I could go either way on this (simple string or structured). For the sake of making progress and keeping the wire protocol simple, I propose we go with the two-string approach, like the HTTP header encoding spec. If you're still agreeable to that idea, what's the next step to get this into a W3C doc?

Nathan Fischer

Oct 20, 2020, 9:06:39 PM
to JSON-RPC
> what's the next step to get this into a W3C doc?

It looks like all that development is on GitHub. They have a Gitter channel at https://gitter.im/TraceContext/Lobby where we could stick our heads in and inquire about next steps.

Roughly, I think we want to get our spec up as a draft. Then update the trace context protocol registry to include json-rpc and our draft.

Andrew Arnott

Oct 28, 2020, 11:04:35 PM
to JSON-RPC
Nathan,

I thought I'd already replied but I don't see it in Google Groups history. I started a conversation on gitter.im as you suggested. I was invited to attend a meeting today where I presented the various options for trace-context over JSON-RPC and they saw merit in each, but ultimately they preferred the simplest option of two string properties that resemble the HTTP ones. I came to agree with them. We can discuss reasons if you're curious or object.

They're willing to take a PR to add our spec (once written) to their protocol registry. For now, I'm drafting it up here: 
https://github.com/AArnott/vs-streamjsonrpc/blob/trace-context/doc/tracecontext.md#protocol

Care to review?

Nathan Fischer

Oct 29, 2020, 7:43:42 PM
to JSON-RPC
that looks like good documentation, but I think the spec would need to be more generic.
something like this https://gist.github.com/kag0/49423bff34da6ab42a69be6fba5fd224 (shamelessly adapted from the MQTT protocol)

Andrew Arnott

Nov 3, 2020, 12:21:32 AM
to JSON-RPC
Yes, the formatting and wording style isn't consistent with the w3c specs just yet. Where should we publish the shared spec? A gist doesn't allow for comments or PRs.

It now covers both the text and binary encodings. My implementation of it is nearly complete. I just have to figure out where to stick the tracestate data so that it can propagate even if my library isn't on both ends of a process's involvement in a chain.

Nathan Fischer

Nov 3, 2020, 1:43:54 PM
to JSON-RPC
I was hoping we could get a repo in the w3c github org, like https://github.com/w3c/trace-context-mqtt

Let's leave the binary encoding out of the final spec? Given https://w3c.github.io/trace-context-binary/ is already out there, it might be better to say something like
> When using a binary encoding (e.g. MessagePack) the trace-context values MUST be encoded as specified in https://w3c.github.io/trace-context-binary

Andrew Arnott

Nov 8, 2020, 10:56:03 AM
to JSON-RPC
Reusing that binary encoding spec is an option. FWIW when I spoke with the WG a week or so ago, they agreed that the binary spec is particularly useful when binary is an option but there is no surrounding binary format to borrow from. I specifically brought up the messagepack case and they readily agreed that if serializing traceparent into a messagepack stream, it would make more sense to serialize into messagepack format rather than a foreign binary spec. It's easier from an implementation point of view to reuse the binary reader/writer than to write a custom one.

Does that change your opinion at all?

Nathan Fischer

Nov 9, 2020, 1:51:14 PM
to JSON-RPC
> Does that change your opinion at all?

A little. It looks like you deleted your branch so I can't refresh my memory on what you had spec'd for binary. But, my main thought is that it seems easier for an implementation to just import a dependency for the trace context binary format in X language than it would be to find a dependency for the trace context binary json format for that language. And if it gets down to the even more specific "trace context messagepack/ubjson/CBOR/... format" then it seems likely that developers will probably be re-implementing that specific format every time.

Andrew Arnott

Nov 10, 2020, 11:55:57 AM
to JSON-RPC
My branch was deleted because it merged into the master branch. You can still view the doc I wrote here.

You make a fair point. And it actually sounds like the same argument that led us to reuse the text-based header value spec. So I'll plan to change to conform on the binary format too.

Nathan Fischer

Nov 19, 2020, 8:09:40 PM
to JSON-RPC