Conveying a correlation ID in a REST API call via a custom HTTP header

Andrew Braae

Dec 3, 2014, 1:40:41 PM

to api-...@googlegroups.com

As per http://www.infoq.com/articles/microservices-practical-tips, which quotes Sam Newman:

"In a microservice-based system, it usually happen that a task is carried out through different nodes. This entails the risk of losing the big picture when investigating issues. An easy way to correlate tasks at different nodes that belong to the same domain-level operation is associating a correlation ID to the higher-level operation and let this ID flow through the system so you can later reconstruct the whole history of that operation as it went across the system."

How should such a correlation ID be carried through a set of nested REST calls, each making up a part of a larger, domain-level operation like "terminate an employee"?

We are leaning towards using an HTTP header, but which one? A user-defined one? Is there any prior art, or are there any conventions or precedents?

Paul Fowler

Dec 5, 2014, 9:09:28 AM

I second this question. Here is Daniel Bryant's implementation which Andrew may have already seen: http://java.dzone.com/articles/implementing-correlation-ids-0

Daniel's answer is somewhat unsatisfactory from a more "architectural" perspective, where I would like a theoretical grounding (rationale) for the approach. I always fear approaching the question incorrectly because of my prior SOAP grounding.

Should we use one or more existing headers (ETag, Content-MD5, Date, etc.)? Perhaps we should use something from our security context, such as a nonce? Should we put something in the URI?

I suppose if you want end-to-end correlation, then it must be something storable in various cache servers? (REST is, after all, a layered style as well as a cacheable one.)

darrel...@gmail.com

Dec 5, 2014, 9:23:32 AM

A new header is appropriate for this. I’ve seen this question pop up regularly. I’ve searched for specs that define an HTTP header like this but have not found anything suitable. Someone just needs to draw a line in the sand and say it’s called _this_ and write an Internet Draft for it.

Darrel

Sent from Surface

--
You received this message because you are subscribed to the Google Groups "API Craft" group.
To unsubscribe from this group and stop receiving emails from it, send an email to api-craft+...@googlegroups.com.
Visit this group at http://groups.google.com/group/api-craft.
For more options, visit https://groups.google.com/d/optout.

Adrian Cole

Dec 5, 2014, 9:50:32 AM

Some prior/current art:

A smidge below the HTTP abstraction, in HTTP/2: because HTTP/2 multiplexes, it needs to make request/response pairs a top-level concern. Each pair is carried on a stream [1], which is identified simply by a number: odd if initiated by the client, even if initiated by the server. I am not suggesting we break the abstraction, only that a similar concern exists there.

Cloud APIs commonly support a request ID. For example, many Amazon APIs place theirs in the XML response [2]. That's because it is determined by the server. For the sake of idempotency, some AWS APIs support a client-indicated token [3].

Many RPC frameworks need side channels to convey concerns such as tracing. Often these data end up in headers. An example to look at is Twitter's Zipkin [4].

Hope this helps!

-A

[1] https://tools.ietf.org/html/draft-ietf-httpbis-http2-16#section-5.1.1

[2] http://docs.aws.amazon.com/ses/latest/DeveloperGuide/query-interface-responses.html

[3] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Run_Instance_Idempotency.html

[4] https://github.com/twitter/zipkin/blob/master/doc/collector-api.md

Paul Fowler

Dec 5, 2014, 10:39:27 AM

I find the concept interesting that an intermediary could insert a unique id into a custom request header for downstream and the same id into a custom response header for upstream. In effect, all logs that include the request/response pair and headers would have the correlation id (given a few other assumptions and enforced rules.)

I am not saying this should be done, only that it is interesting that the correlation doesn't have to begin with the client or back-end server for correlation to be enabled end-to-end. The task could be an API gateway management task... I am thinking out loud here.
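The intermediary idea above could be sketched roughly as follows. This is only an illustration written as Python WSGI middleware; the header name `X-Correlation-ID` and the class name are assumptions, not anything proposed in this thread.

```python
import uuid

HEADER = "X-Correlation-ID"            # hypothetical header name
ENVIRON_KEY = "HTTP_X_CORRELATION_ID"  # how WSGI exposes that request header


class CorrelationIdMiddleware:
    """Wraps a WSGI app and plays the role of the intermediary."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Reuse an ID supplied upstream, otherwise mint one here.
        correlation_id = environ.get(ENVIRON_KEY) or uuid.uuid4().hex
        environ[ENVIRON_KEY] = correlation_id  # visible to the downstream app

        def start_response_with_id(status, headers, exc_info=None):
            # Echo the same ID on the response travelling back upstream.
            headers = list(headers) + [(HEADER, correlation_id)]
            return start_response(status, headers, exc_info)

        return self.app(environ, start_response_with_id)
```

With something like this in front of a service, both the request seen downstream and the response seen upstream carry the same ID, whether or not the original client sent one.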

Andrew Braae

Dec 6, 2014, 9:52:13 PM

Great input and links, thank you!

I hadn't really considered response headers. Maybe I'm missing something but it seems like the correlation ID only belongs in the request headers, since:

a) if the client passes the correlation ID in (via custom request header), no response header is needed since by definition the response belongs to the request.

b) if the client does not pass it in, then yes, the server could realise that no correlation ID had been passed in, and therefore generate one itself, and pass it back to the client via response header. However the client wouldn't know what to do with it (if it understood correlation IDs, it should have passed one down in the first place). Perhaps it might blindly log it (some use), but it certainly wouldn't know to pass it down again if it made further calls to other servers within the one overarching domain operation.

Daniel Bryant's example only uses a request header (though it's just a proof of concept).

Caching..hmmm

Paul Fowler

Dec 7, 2014, 5:29:12 AM

Still thinking out loud...

What if the client doesn't send the correlation id? Perhaps it is a legacy client? Thus the response header idea... Be liberal in what you accept and strict in what you provide...

And the caching does keep me thinking also... Cache is an interesting variable.

Irakli Nadareishvili

Dec 7, 2014, 10:15:48 AM

Wild thought:

since correlation IDs are often used for debugging, I wonder if HTTP headers are the best choice. HTTP headers are usually not logged in access logs or by proxies. Maybe a request parameter would be a better choice, if this is to also be used for debugging purposes.
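One possible way around the logging concern while still using a header is for each service to copy the ID into its own log lines. The following is a minimal sketch using Python's standard `logging` module; the service name, the sample ID, and the log format are all assumptions made for illustration.

```python
import contextvars
import io
import logging

# Holds the ID for the request currently being handled ("-" when unknown).
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    def filter(self, record):
        # Copy the current ID onto the record so the formatter can print it.
        record.correlation_id = correlation_id.get()
        return True


stream = io.StringIO()  # stands in for a real log file
handler = logging.StreamHandler(stream)
handler.addFilter(CorrelationIdFilter())
handler.setFormatter(
    logging.Formatter("%(correlation_id)s %(levelname)s %(message)s")
)

log = logging.getLogger("hr-service")  # hypothetical service name
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

correlation_id.set("8271")  # value that would be read from the incoming header
log.info("terminating employee 42")
print(stream.getvalue(), end="")
```

This only covers application logs; access logs and proxies would still need their own configuration to record the header.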

- Irakli

Irakli Nadareishvili

Dec 7, 2014, 10:17:59 AM

Sorry, by "request parameter" I meant URL param, in case it was confusing.

darrel...@gmail.com

Dec 7, 2014, 5:13:58 PM

Irakli,

It would make it easier to trace if the correlation ID was logged as part of the URL. However, that would introduce more complexity when it comes to caching.
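To illustrate the caching complexity: a cache keyed on the full URL treats every correlation ID as a different resource, so the key would have to be normalized first. A small Python sketch, assuming a hypothetical query parameter named `correlationId` and made-up example URLs:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACE_PARAM = "correlationId"  # hypothetical query parameter name


def cache_key(url):
    """Return the URL with the tracing parameter removed, for use as a cache key."""
    parts = urlsplit(url)
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != TRACE_PARAM
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


a = "https://api.example.com/employees/42?expand=manager&correlationId=111"
b = "https://api.example.com/employees/42?expand=manager&correlationId=222"

print(a == b)                        # False: a naive cache stores two entries
print(cache_key(a) == cache_key(b))  # True: same resource once the ID is ignored
```

Every cache between client and server would need to apply the same normalization, which is the kind of extra complexity a header avoids.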

I guess I forget sometimes that not everyone is using Runscope for logging/debugging API traffic. 😉

Darrel

Andrew Braae

Dec 7, 2014, 10:22:59 PM

(Just some notes in case it is useful/interesting to anyone).

Here's how we've decided to handle it (this is probably quite specific to our needs and is certainly not the material for an Internet Draft!):

0) all apps should now include a "correlation ID" in their log messages when known

1) when any app consumes an API, the correlation ID should be passed within the request header "correlationId"

2) apps that handle top-level domain actions (e.g. incoming user clicks, daemon initiated process, app startup, etc.) should generate a new correlation ID as the first step in processing such actions

3) apps that produce an API should use the correlation ID from the incoming request, but if not found should:

a) generate a new correlation ID locally

b) log a message saying that a new correlation ID was created since missing from the request

c) henceforth use the newly generated correlation ID as if it had been passed in (e.g., include it in log messages)

4) Correlation IDs should be randomly generated integers
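Rules 1, 3 and 4 above might look something like this in Python. It is a sketch only: the function names, the 63-bit range, and the log wording are assumptions; the header name "correlationId" comes from rule 1.

```python
import logging
import random

log = logging.getLogger("correlation")

HEADER = "correlationId"  # header name from rule 1


def new_correlation_id():
    # Rule 4: a randomly generated integer (the 63-bit range is an assumption).
    return str(random.getrandbits(63))


def resolve_correlation_id(request_headers):
    """Rule 3: use the incoming ID, or generate and log a new one."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in request_headers.items():
        if name.lower() == HEADER.lower() and value:
            return value
    correlation_id = new_correlation_id()
    log.warning("no %s header in request; generated %s", HEADER, correlation_id)
    return correlation_id


def outgoing_headers(correlation_id, extra=None):
    """Rule 1: attach the ID to every API call this app makes."""
    headers = dict(extra or {})
    headers[HEADER] = correlation_id
    return headers
```

An API handler would call `resolve_correlation_id` once on the incoming request and then pass the result to `outgoing_headers` for each nested call it makes.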

Irakli Nadareishvili

Dec 8, 2014, 5:33:37 PM

Yeah, I was worried somebody would mention caching, that's why I pre-insured myself with the "wild thought" disclaimer :)

OK, so #forgetaboutit URL params, and use Runscope, always :)

There really doesn't seem to be any pre-existing art, so should we try to standardize `saga-tracing-token' as the new HTTP header in an RFC? Since that takes a while, if we have to create a website in the meantime, I will do it. As JSON's history shows us, to "standardize" something all you need is a domain, right? :)

P.S. Per http://tools.ietf.org/html/draft-saintandre-xdash-00 we shouldn't create an 'x-' header temporarily but go for the final thing.

cheers,

Irakli

Owen Rubel

Dec 8, 2014, 6:02:17 PM

Gonna chime in...

The issue here is that across the proxy, API gate, and all instances, you are going to have to coordinate information related to API I/O. Thus you need a common apiObject which can relate API data associated with the I/O. This correlation ID is no different than any other data that needs to be associated and synced across all instances.

See the apiObject for more information (https://github.com/orubel/grails-api-toolkit-docs/wiki/API-Object)

Peter Williams

Dec 10, 2014, 1:04:12 AM

On Mon, Dec 8, 2014 at 3:33 PM, Irakli Nadareishvili <ira...@gmail.com> wrote:

> There really doesn't seem to be any pre-existing art

I think there is prior art. This all sounds very similar to the "X-Request-ID" request header [1] used by Heroku, RoR and others (though I am not sure of its exact origin). As the service calls are proxied or fanned out, this header allows the sub-requests to be correlated with the upstream request that caused them.
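As a rough sketch of that fan-out, here is how a service might tag its sub-requests using Python's standard library. The URLs and the function name are made up, and generating an ID when none arrives is an assumption rather than a documented convention. The requests are only built, not sent.

```python
import urllib.request
import uuid

REQUEST_ID_HEADER = "X-Request-ID"


def fan_out(incoming_headers, sub_urls):
    """Build one sub-request per URL, all tagged with the upstream request ID."""
    lowered = {k.lower(): v for k, v in incoming_headers.items()}
    # Falling back to a fresh UUID is an assumption made for this sketch.
    request_id = lowered.get(REQUEST_ID_HEADER.lower()) or str(uuid.uuid4())
    return [
        urllib.request.Request(url, headers={REQUEST_ID_HEADER: request_id})
        for url in sub_urls
    ]


sub_requests = fan_out(
    {"X-Request-ID": "req-123"},
    [
        "https://hr.example.com/employees/42",
        "https://payroll.example.com/accounts/42",
    ],
)
```

Each service that logs the header can then tie its part of the work back to the request that started it.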

> and should we try to standardize `saga-tracing-token' as the new HTTP header in RFC? Since that takes a while if we have to create a website meanwhile, I will do it. As JSON's history shows us, to "standardize" something all you need is a domain, right? :)

Documenting its semantics and uses in an RFC would be very useful. Someone should totally do that. :)

[1]: https://www.google.com/search?q=x-request-id+header

Peter

barelyenough.org

Owen Rubel

Dec 10, 2014, 10:01:35 AM

The tricky part is within the architecture. Forwards, batching, chaining, redirects through proxies, API gates, MQ, request/response tooling, etc. do not always preserve the header. This is why there is redundancy with I/O in tooling.

Andrew B

Dec 10, 2014, 6:03:35 PM

Thanks Peter, it sounds like the conventions around X-Request-ID are just what we want.
