This is true for calls to the ClientCall methods, but is not for calls to ClientCalls.Listener. The calls to the ClientCall.Listener methods are in the reverse order as the calls to ClientCall methods. So if both a and b above "intercept" the ClientCall.Listener, then the ordering is something like this:// startCall: b called firsta.startCall(b.startCall())// onMessage: a called firstb.onMessage(a.onMessage())
This makes it impossible for an interceptor to do something like:* setup some ThreadLocal state* invoke the next interceptor's method (e.g.: start or onMessage)* teardown ThreadLocal state
This is a common pattern for setting up tracing or logging context or integrating with other frameworks that use ThreadLocal.
This is 100% the behavior you want; anything else leads to madness. Interceptors inject themselves to the boundary because the application and the gRPC library and allow you to do "whatever you want" with the API. This produces a layering, and to preserve the layering "who calls who" must be the exact opposite for callbacks as normal method calls.
This makes it impossible for an interceptor to do something like:* setup some ThreadLocal state* invoke the next interceptor's method (e.g.: start or onMessage)* teardown ThreadLocal stateHow so? Each method call may be on a different thread, so you have to setup and teardown the ThreadLocal every method:
This is 100% the behavior you want; anything else leads to madness. Interceptors inject themselves to the boundary because the application and the gRPC library and allow you to do "whatever you want" with the API. This produces a layering, and to preserve the layering "who calls who" must be the exact opposite for callbacks as normal method calls.I believe this can make sense in some cases, but I'm not sure that's the case in general. I believe there are cases where you want the interceptor to be at the same "logical" place on both paths. For example, a tracing interceptor wants to be first on the outbound path so create a new span and wants to be first on the callback path to make it the "active span" again. As far as I can tell, it is not possible to have this setup with the current API.
Furthermore, the ServerInterceptor doesn't behave like this, i.e.: the calls to ServerCall.Listener are in the same order as the calls to interceptCall. So on that point, perhaps some small documentation updates would help people figure this out.
Hmm, right, my example was incomplete. What I meant to say was that the same interceptor cannot do this kind of setup for both the outbound and callback paths.
Again, taking the tracing interceptor as an example, I believe you would want something like this:
void onMessage(...) {try(activateSpan(span)) { super.onMessage(...) }}
I believe you want the tracing to be setup before all other interceptors on both paths. For example, so that log statements related to that request are attached to the same span.
No, it really doesn't. What if there is a retrying interceptor? How about fault injection? How about binary logging? From our own experience of creating a tracing interceptor, I will strongly assert that it should not be split as you suggest. Initially it was unclear, but as time has gone on it has only been more obvious that it needs to be a single point. Otherwise most other features have to have custom tracing support to make everything cooperate.
Ok, this is interesting and it would be great if the experience of creating that tracing interceptor was shared since it is pretty clear from my own assumptions and looking at existing tracing libraries that they're all doing it "wrong". Is there an existing write-up of this somewhere? I'd be happy to contribute something in https://grpc.io/docs/guides/ if such a thing exists.
Furthermore, the ServerInterceptor doesn't behave like this, i.e.: the calls to ServerCall.Listener are in the same order as the calls to interceptCall. So on that point, perhaps some small documentation updates would help people figure this out.That is factually incorrect. Rerun whatever test made you come to that decision.
I've written this test: https://github.com/plaflamme/grpc-java/pull/1 to replicate what I was seeing. The test checks that the order is the same and according to the CI, the test passes. I'm certainly missing something, so I'd appreciate a quick look if you can.
It can, by intercepting both the Call+Listener. But generally it is nonsensical to do so, since in one case it would call the library and in the other the application.
Yes, I agree with that. But my particular goal is to instrument the gRPC interceptors themselves. The application is "shielded" from this because it has its own "request-local" state that it will restore when the callback reaches it.For context, my particular goal is to make sure that some logging interceptor has visibility into the current span so it can add the span's ID to the log statements for correlation purposes. In practice, we do this through the "MDC" logging feature, but it's conceptually the same. So it seems important that this "tracing interceptor" be "before" the logging one in both directions.
Again, taking the tracing interceptor as an example, I believe you would want something like this:void onMessage(...) {try(activateSpan(span)) { super.onMessage(...) }}It took us a while to realize this the hard way, but you don't actually want to use the span for the callbacks. That can produce an infinite cascading of spans, as completing some work can cause other work to begin. You may want to propagate the initial span though, the span that was active before being replaced with `span`.
Agreed. This is the case in our application stack: it will correctly re-activate the initial / parent span "on top", so the "callback span" isn't really problematic. But it does mean that the child span covers a part of the application that it shouldn't, so I definitely see your point here. We probably need to adjust this so that the child span is not just hidden, but actually "closed". Thanks for this!
I think different people may come to different decisions depending on what is important to them. I honestly don't know what log statements an interceptor would do such that I'd care about the span, with the exception of a "log every RPC" interceptor.
In our case, we want to log the status code and description when the status is !OK. Furthermore, we need this statement to be attached to the request's span which was setup at the very beginning of the call (so that the span covers the whole request). We don't log anything on the outbound path, so I suppose we could more strategically place the tracing interceptor, but that seems pretty brittle. Any thoughts on a recommended approach for such a setup?
I don't know of any libraries doing this wrong.
No, there's no write-up. I honestly don't know what would be put in a general-purpose document. I do think there could be something like "this is how to integrate gRPC with MDC". But really, it seems best to just have a repo with some code somewhere with utilities instead of each user reinventing the wheel.
Maybe the documentation could make that more clear. Maybe we should also downplay which runs "first" and emphasize placement further or closer to network/application.
It can, by intercepting both the Call+Listener. But generally it is nonsensical to do so, since in one case it would call the library and in the other the application.Yes, I agree with that. But my particular goal is to instrument the gRPC interceptors themselves. The application is "shielded" from this because it has its own "request-local" state that it will restore when the callback reaches it.For context, my particular goal is to make sure that some logging interceptor has visibility into the current span so it can add the span's ID to the log statements for correlation purposes. In practice, we do this through the "MDC" logging feature, but it's conceptually the same. So it seems important that this "tracing interceptor" be "before" the logging one in both directions.I see. MDC. I do think you should split the problem into two different parts: MDC-propagation and Span-creation. Those should be separate components and be considered separately. And they have to be considered separately for client-side and server-side.
For client-side, I think you could have an MDC-propagation interceptor that sits very close to the network. Any interceptor that modifies MDC would modify the downward calls (Interceptor, ClientCall) and restore the parent value on the upward calls (Listener) (as discussed elsewhere).
For server-side, I think you set the Span on the downward calls (Interceptor, Listener)... and you could restore the original value on the upward calls (ServerCall), but it seems nothing may care. It seems there wouldn't be any MDC propagation here.
Does that sound like it would work?
Okay, so it is a "log the RPC" interceptor. I don't know if this is on client-side or server-side. I think splitting the MDC propagation from the span creation addresses this on client-side. Server-side is harder because there's nothing that would guarantee the "correct" span to be active in the MDC when the server responds; the interceptor may need to save the span in the request path to use it in the response path.