Distributed tracing with OpenTelemetry Collector - headers propagation


Lautaro Bazalo

Jan 14, 2025, 4:12:14 PM
to envoy-users

Hi everyone,

I'm part of the platform engineering team, and we are developing tracing in our delivery and observability platform.
I was reading the tracing documentation and started some proofs of concept (PoCs). In summary, the PoC consists of Envoy, two services, an OpenTelemetry Collector, and Grafana with Tempo. The test involved a client making a request to Service A, which then made another request to Service B. Service B responded to Service A, which in turn responded to the client.

The result showed that we only need to propagate the "traceparent" header between the services, which led to our current question. The Envoy tracing documentation states:

"Whichever tracing provider is being used, the service should propagate the x-request-id to enable logging across the invoked services to be correlated."

This seems to contradict our findings. What should we do in this case?

Ric Hincapie

Jan 15, 2025, 9:47:56 AM
to envoy-users
Hi. AFAIU, the x-request-id is not involved in the trace ID at all; the two serve different purposes. If you look at a trace, the trace ID is different from the x-request-id.
The quote you mention is about correlating logs across the invoked services: you want Service A and Service B to log the same x-request-id to facilitate debugging.
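
To make the distinction concrete, here is a minimal sketch of where the trace ID lives in a W3C `traceparent` header (version 00 has the layout `version-traceid-parentid-flags`). The function name `parse_traceparent` and the sample header value are illustrative assumptions, not something from this thread or from Envoy.

```python
def parse_traceparent(value):
    """Split a version-00 W3C traceparent header into its four fields."""
    parts = value.strip().split("-")
    if len(parts) < 4:
        raise ValueError("traceparent needs version, trace-id, parent-id, flags")
    version, trace_id, parent_id, flags = parts[:4]
    # The trace ID is 32 hex characters and the parent (span) ID is 16.
    if len(trace_id) != 32 or len(parent_id) != 16:
        raise ValueError("unexpected trace-id or parent-id length")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "flags": flags,
    }


# Hypothetical header value, used only for illustration.
fields = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
```

The `trace_id` field is what the tracing backend uses to group spans; x-request-id travels as a separate header and is what shows up in the access logs.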
---------
Take this example with standard access logs, from a request sent with -H "x-request-id: acbdefg123456":
GW:
[2025-01-06T06:18:43.219Z] "GET /productpage HTTP/1.1" 200 - via_upstream - "-" 0 9429 12 12 "10.244.2.6" "curl/8.5.0" "acbdefg123456" "foo.bookinfo.com" "10.244.1.5:9080" outbound|9080||productpage.bookinfo.svc.cluster.local 10.244.2.6:43240 127.0.0.1:80 127.0.0.1:41528 -
Productpage:
[2025-01-06T06:18:43.392Z] "GET /details/0 HTTP/1.1" 200 - via_upstream - "-" 0 178 1 1 "-" "curl/8.5.0" "acbdefg123456" "details:9080" "10.244.1.7:9080" outbound|9080|v1|details.bookinfo.svc.cluster.local 10.244.1.5:38466 10.96.73.84:9080 10.244.1.5:33926 - -
Details:
[2025-01-06T06:18:44.139Z] "GET /details/0 HTTP/1.1" 200 - via_upstream - "-" 0 178 0 0 "-" "curl/8.5.0" "acbdefg123456" "details:9080" "10.244.1.7:9080" inbound|9080|| 127.0.0.6:39381 10.244.1.7:9080 10.244.1.5:38466 invalid:outbound_.9080_.v1_.details.bookinfo.svc.cluster.local default

See how you can follow the request along different proxies?
----------
I'd recommend that your apps propagate all the B3 headers along with x-request-id, traceparent and tracestate, so you don't miss out on any functionality they provide now or may provide in the future.
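
A rough sketch of what that propagation could look like inside a service: copy the relevant headers from the inbound request onto every outbound call. The helper name `extract_propagation_headers` and the sample values are assumptions for illustration; the header names are the ones mentioned above plus the usual B3 multi-header set.

```python
# Headers to forward from an inbound request to outbound calls.
# x-request-id correlates access logs, traceparent/tracestate carry
# W3C Trace Context, and b3 / x-b3-* are the Zipkin B3 headers.
PROPAGATED_HEADERS = (
    "x-request-id",
    "traceparent",
    "tracestate",
    "b3",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
)


def extract_propagation_headers(incoming):
    """Return only the headers that should be copied to outbound requests."""
    # HTTP header names are case-insensitive, so normalize before matching.
    lowered = {name.lower(): value for name, value in incoming.items()}
    return {name: lowered[name] for name in PROPAGATED_HEADERS if name in lowered}


# Hypothetical inbound headers as Service A might receive them.
inbound = {
    "X-Request-Id": "acbdefg123456",
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    "Accept": "*/*",
}
outbound = extract_propagation_headers(inbound)
```

Service A would then merge `outbound` into the headers of its request to Service B, using whatever HTTP client it already has.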

Lautaro Bazalo

Jan 15, 2025, 6:25:09 PM
to envoy-users
Hi Ric, thanks for clarifying!
Do you still recommend propagating the B3 headers? I understand they are used for Zipkin, and since we will only use an OTel Collector as a sidecar in each pod, I think we will be fine with just propagating the traceparent header.

Ric Hincapie

Jan 16, 2025, 10:07:12 AM
to envoy-users
No worries.
I'd propagate them as the docs suggest, in case you later need a different provider or there's a change in direction. Say, what if at some point you decide to onboard Istio, and it integrates easily, out of the box, with Zipkin and some other vendor products? Probably the best way to go is to follow the standard.
