Native support for OpenTracing

Daniel Schierbeck

Oct 17, 2018, 8:58:11 AM
to Ruby on Rails: Core
With the advent of OpenTracing (https://opentracing.io) along with an official Ruby library (https://github.com/opentracing/opentracing-ruby) as well as growing industry support (e.g. https://www.datadoghq.com/blog/opentracing-datadog-cncf/ and http://blog.scoutapp.com/articles/2018/01/17/tutorial-distributed-tracing-in-ruby-with-opentracing), would it make sense to provide native tracing of Rails using the opentracing-ruby library?

At Zendesk we're using Datadog's proprietary tracing API, which monkey patches Rails and other libraries in order to trace key interactions. I think a more sustainable approach would be for libraries to include tracing support out of the box using the standardized OpenTracing APIs. It is then merely a matter of hooking up e.g. the Datadog trace collector in order to get a working tracing setup.

If there's interest in this, I'd be willing to contribute code. I've done a fair amount of work to trace various aspects of Rails, including Rack middleware and before/after filters, but without native support these implementations are brittle and prone to breakage when internal Rails APIs change.

I'd love to hear your thoughts on this.

Cheers,
Daniel Schierbeck

Jeremy Daer

Oct 17, 2018, 1:25:53 PM
to rubyonra...@googlegroups.com
Hey Daniel,

Absolutely! We're looking at OpenCensus (https://opencensus.io) integration, which seems to be leapfrogging OpenTracing in standardization and adoption.

Current Ruby integration, including early Rack and Rails support: https://github.com/census-instrumentation/opencensus-ruby


Production use is fantastic, but I'd particularly love to see a collector and built-in visualization for local app development and tests.

We have an existing ActiveSupport::Notifications API which works much like typical parent-span instrumentation, but it doesn't propagate or report trace context. For deeper Rails integration, we could adapt the AS::N design to more directly map to OpenCensus, or introduce ActiveSupport::Tracing if there's too much mismatch or compatibility concern. That'd allow these libraries to plug in directly without needing to carefully instrument Rails on their own. Rails should be able to participate in distributed tracing out of the box, report stats out of the box, show traces and stats in development mode, and flip between production APM vendors without specialized integration.
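
As a rough illustration of the gap described above, here is a minimal, dependency-free sketch of block-style instrumentation that also carries trace context. It is not ActiveSupport::Notifications and not the OpenCensus API: `TinyTracing`, its `Span` struct, and the `trace_id:` keyword are hypothetical names chosen for this example. Nested calls inherit the enclosing span's trace id and record it as their parent.

```ruby
require "securerandom"

# Hypothetical sketch, not ActiveSupport::Notifications or OpenCensus.
module TinyTracing
  Span = Struct.new(:name, :trace_id, :span_id, :parent_id, :payload, :duration)

  class << self
    # Finished spans are handed to every subscriber.
    def subscribers
      @subscribers ||= []
    end

    def subscribe(&block)
      subscribers << block
    end

    # Open spans for the current thread, innermost last.
    def stack
      Thread.current[:tiny_tracing_stack] ||= []
    end

    # Wraps a block in a span. The trace id comes from the enclosing span,
    # else from the caller (e.g. parsed from a request header), else is new.
    def instrument(name, payload: {}, trace_id: nil)
      parent = stack.last
      span = Span.new(
        name,
        parent ? parent.trace_id : (trace_id || SecureRandom.hex(16)),
        SecureRandom.hex(8),
        parent&.span_id,
        payload
      )
      stack.push(span)
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      begin
        yield span
      ensure
        span.duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
        stack.pop
        subscribers.each { |subscriber| subscriber.call(span) }
      end
    end
  end
end

finished = []
TinyTracing.subscribe { |span| finished << span }

# The outer span adopts an incoming trace id; the inner span inherits it.
TinyTracing.instrument("process_action.action_controller", trace_id: "abc123") do
  TinyTracing.instrument("sql.active_record", payload: { sql: "SELECT 1" }) { :ok }
end

finished.each do |span|
  puts "#{span.name} trace=#{span.trace_id} parent=#{span.parent_id.inspect}"
end
```

Inner spans finish first, so subscribers see them before their parents; an exporter would use `parent_id` to rebuild the tree.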

At Basecamp, we have a home-grown StatsD setup, similar to Datadog, that hooks Active Support notifications (https://signalvnoise.com/posts/3091-pssst-your-rails-application-has-a-secret-to-tell-you). We also parse logs from Kafka to reconstruct some traces. We'd love to extract this and rely on Rails to natively export traces and stats.

I'd love to hear what you're doing at Zendesk, where you're headed, and whether this sketch aligns well. And anyone else who's working in this area!

Best,
Jeremy

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rubyonrails-co...@googlegroups.com.
To post to this group, send email to rubyonra...@googlegroups.com.
Visit this group at https://groups.google.com/group/rubyonrails-core.
For more options, visit https://groups.google.com/d/optout.

Daniel Schierbeck

Oct 18, 2018, 5:27:17 AM
to rubyonra...@googlegroups.com
On Wed, Oct 17, 2018 at 7:26 PM Jeremy Daer <jerem...@gmail.com> wrote:
Hey Daniel,

Absolutely! We're looking at OpenCensus (https://opencensus.io) integration, which seems to be leapfrogging OpenTracing in standardization and adoption.

So now there are two standards? 😬
Is there clarity on where things are going? The point of a standard would be that we'd only need to support *one*, and not have an extra layer of abstraction.
Compared to OpenTracing, is the ecosystem mature enough to warrant us going all-in on this? I definitely see the theoretical point of a unified stats and trace standard, especially seeing as StatsD has fragmented somewhat, but is it a horse we want to bet on? I'm fine with either, as long as there are working, scalable solutions *today* for getting things working in a variety of languages and without duct tape. For instance, it seems like the Datadog exporter only supports Go?
 
Production use is fantastic, but I'd particularly love to see a collector and built-in visualization for local app development and tests.

Me too 😄 but that's probably not going to be my initial focus; I'm mainly looking at the instrumentation side of things.
 

We have an existing ActiveSupport::Notifications API which works much like typical parent-span instrumentation, but it doesn't propagate or report trace context. 
For deeper Rails integration, we could adapt the AS::N design to more directly map to OpenCensus, or introduce ActiveSupport::Tracing if there's too much mismatch or compatibility concern.

I've done a bunch of work on AS::N in the past (I'm the one who created the original ActiveSupport::Subscriber base class) and feel pretty confident that it can form the basis for this work. We'd probably want to instrument more places, e.g. each middleware invocation and maybe filters in controllers, but otherwise it's a good starting point.

I do think we need to have a specific mapping from AS::N to the tracing backend, selecting which payload keys should be propagated and maybe formatting some of the values, so it's probably not just a matter of copying everything verbatim. It sounds like you're doing the verbatim thing at Basecamp though – how is that working out? Would you be in favor of that?

We've seen issues when tracing gets too granular or too much data is captured, so I'd like to be a bit conservative.
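
One way to picture the conservative mapping discussed here is an explicit allowlist per event. The sketch below is an assumption, not an existing Rails or OpenCensus feature: `SPAN_ATTRIBUTES` and `span_attributes` are hypothetical names, and the 200-character SQL limit is arbitrary. The event names and payload keys are the ones Rails already emits for these two events.

```ruby
# Hypothetical allowlist: event name => payload keys to export, each with a
# formatter. Payload keys that are not listed are dropped.
SPAN_ATTRIBUTES = {
  "sql.active_record" => {
    name: ->(value) { value.to_s },
    sql:  ->(value) { value.to_s[0, 200] } # truncate long statements
  },
  "process_action.action_controller" => {
    controller: ->(value) { value.to_s },
    action:     ->(value) { value.to_s },
    status:     ->(value) { value.to_i }
  }
}.freeze

# Returns the string-keyed attributes to attach to a span for this event.
# Unknown events produce no attributes at all.
def span_attributes(event_name, payload)
  mapping = SPAN_ATTRIBUTES[event_name]
  return {} unless mapping

  mapping.each_with_object({}) do |(key, formatter), attributes|
    attributes[key.to_s] = formatter.call(payload[key]) if payload.key?(key)
  end
end

payload = { name: "Article Load", sql: "SELECT * FROM articles", binds: [1] }
p span_attributes("sql.active_record", payload)
```

Starting from an allowlist rather than copying payloads verbatim keeps bind values and other bulky or sensitive data out of traces unless someone opts them in.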
 
That'd allow these libraries to plug in directly without needing to carefully instrument Rails on their own. Rails should be able to participate in distributed tracing out of the box, report stats out of the box, show traces and stats in development mode, and flip between production APM vendors without specialized integration.

Yup, that's my goal as well. APM vendors should not compete on their quality of instrumentation, but on the quality of their product. One thing I want to emphasize though is that I think we need to push for standardization *beyond* Rails. AS::N could have been a great standard if it wasn't tied to AS – it hasn't seen widespread adoption because gem authors are unwilling to add a dependency on AS, I think. So we should think holistically about the entire ecosystem and what would make sense for Ruby as a whole.
 
At Basecamp, we have a home-grown StatsD setup, similar to Datadog, that hooks Active Support notifications (https://signalvnoise.com/posts/3091-pssst-your-rails-application-has-a-secret-to-tell-you). We also parse logs from Kafka to reconstruct some traces. We'd love to extract this and rely on Rails to natively export traces and stats. 

I'd love to hear what you're doing at Zendesk, where you're headed, and whether this sketch aligns well. And anyone else who's working in this area!

We're currently all-in on Datadog, and I've helped improve their instrumentation. However, I keep running into ad-hoc instrumentation being brittle, which is why I'm interested in first-class support. I think the only sustainable path forward is that gems natively support some form of tracing, either through AS::N (which would need to be extracted) or directly with a standardized tracing gem.

How would you feel about extracting AS::N, actually? Then gems could adopt it for pub/sub and it would be a lot simpler to plug in a tracing subscriber.
 

Best,
Jeremy



Daniel Schierbeck

Oct 18, 2018, 5:52:09 AM
to rubyonra...@googlegroups.com
Looks like OpenCensus already has support for development mode UIs, currently only for Java and Go though: https://opencensus.io/core-concepts/z-pages/

Have you deployed an OpenCensus integration to production? At least the metrics part looks pretty advanced, maybe too much so.

Cheers,
Daniel

Jeremy Daer

Oct 18, 2018, 3:16:17 PM
to rubyonra...@googlegroups.com
On Thu, Oct 18, 2018 at 2:27 AM 'Daniel Schierbeck' via Ruby on Rails: Core <rubyonra...@googlegroups.com> wrote:
On Wed, Oct 17, 2018 at 7:26 PM Jeremy Daer <jerem...@gmail.com> wrote:
Hey Daniel,

Absolutely! We're looking at OpenCensus (https://opencensus.io) integration, which seems to be leapfrogging OpenTracing in standardization and adoption.

So now there are two standards? 😬
Is there clarity on where things are going? The point of a standard would be that we'd only need to support *one*, and not have an extra layer of abstraction.

Right? 😂

It's still not all that clear and I haven't seen a great discussion of the hows and whys, but it seems that OpenTracing is a common-API effort whereas OpenCensus is an umbrella effort covering common formats / wire protocol and library/exporter implementations. OpenCensus clients could talk to OpenTracing services.

The fact that neither project seems to directly address the other in their FAQs suggests that there are deeper organizational or community roots to their kinda-overlapping-but-kinda-distinct disposition.

We're running with OpenCensus because it standardizes protocols, exporters, and implementations, which means we don't end up with a common API but saddled with vendor-specific "leakage" that makes it hard to switch APMs in practice.

Compared to OpenTracing, is the ecosystem mature enough to warrant us going all-in on this? I definitely see the theoretical point of a unified stats and trace standard, especially seeing as StatsD has fragmented somewhat, but is it a horse we want to bet on? I'm fine with either, as long as there are working, scalable solutions *today* for getting things working in a variety of languages and without duct tape. For instance, it seems like the Datadog exporter only supports Go?

The ecosystem is not mature enough, but we're skating to that puck, so to speak, with Rails 6. We can drive maturity on the Ruby end by holding the local exporters, client, and app integration to our "just works" standard.

 
Production use is fantastic, but I'd particularly love to see a collector and built-in visualization for local app development and tests.

Me too 😄 but that's probably not going to be my initial focus; I'm mainly looking at the instrumentation side of things.
 

We have an existing ActiveSupport::Notifications API which works much like typical parent-span instrumentation, but it doesn't propagate or report trace context. 
For deeper Rails integration, we could adapt the AS::N design to more directly map to OpenCensus, or introduce ActiveSupport::Tracing if there's too much mismatch or compatibility concern.

I've done a bunch of work on AS::N in the past (I'm the one who created the original ActiveSupport::Subscriber base class) and feel pretty confident that it can form the basis for this work. We'd probably want to instrument more places, e.g. each middleware invocation and maybe filters in controllers, but otherwise it's a good starting point.

 Sweet! Yes.
 
I do think we need to have a specific mapping from AS::N to the tracing backend, selecting which payload keys should be propagated and maybe formatting some of the values, so it's probably not just a matter of copying everything verbatim. It sounds like you're doing the verbatim thing at Basecamp though – how is that working out? Would you be in favor of that?

We're doing a lot of mapping/translation/filtering, too. Particularly since we are using StatsD without tagging support.

We've seen issues when tracing gets too granular or too much data is captured, so I'd like to be a bit conservative.

Ditto. Traces should be introduced where meaningful and actionable, not just because we have the data.

 
That'd allow these libraries to plug in directly without needing to carefully instrument Rails on their own. Rails should be able to participate in distributed tracing out of the box, report stats out of the box, show traces and stats in development mode, and flip between production APM vendors without specialized integration.

Yup, that's my goal as well. APM vendors should not compete on their quality of instrumentation, but on the quality of their product. One thing I want to emphasize though is that I think we need to push for standardization *beyond* Rails. AS::N could have been a great standard if it wasn't tied to AS – it hasn't seen widespread adoption because gem authors are unwilling to add a dependency on AS, I think. So we should think holistically about the entire ecosystem and what would make sense for Ruby as a whole.

Agreed. A higher-level AS::N-like DSL in OpenCensus would be welcome. Using it directly feels pretty bare-metal today.

 
At Basecamp, we have a home-grown StatsD setup, similar to Datadog, that hooks Active Support notifications (https://signalvnoise.com/posts/3091-pssst-your-rails-application-has-a-secret-to-tell-you). We also parse logs from Kafka to reconstruct some traces. We'd love to extract this and rely on Rails to natively export traces and stats. 

I'd love to hear what you're doing at Zendesk, where you're headed, and whether this sketch aligns well. And anyone else who's working in this area!

We're currently all-in on Datadog, and I've helped improve their instrumentation. However, I keep running into ad-hoc instrumentation being brittle, which is why I'm interested in first-class support. I think the only sustainable path forward is that gems natively support some form of tracing, either through AS::N (which would need to be extracted) or directly with a standardized tracing gem.

How would you feel about extracting AS::N, actually? Then gems could adopt it for pub/sub and it would be a lot simpler to plug in a tracing subscriber.

I'd be concerned about being able to evolve AS::N in step with Rails. But I think there's definitely room for extracting the underlying setup to OpenCensus, if they're open to that, or to a higher-level gem that wraps OpenCensus. Then AS::N could start to rely on that directly, rather than bridging to it.
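
To make the dependency question concrete, here is a sketch of how a gem could expose instrumentation without requiring Active Support: it talks to a duck-typed notifier and defaults to one that does nothing. `HttpClientGem`, `NullNotifier`, and the event name are hypothetical. The only outside assumption is that the plugged-in object responds to `instrument(name, payload) { ... }`, which is the shape of `ActiveSupport::Notifications.instrument`.

```ruby
# Hypothetical gem code; every name here is illustrative.
module HttpClientGem
  # Default notifier: runs the block and reports nothing.
  module NullNotifier
    def self.instrument(_name, _payload = {})
      yield
    end
  end

  class << self
    # Applications may assign any object that responds to
    # instrument(name, payload) { ... }.
    attr_writer :notifier

    def notifier
      @notifier ||= NullNotifier
    end

    def get(url)
      notifier.instrument("request.http_client_gem", url: url) do
        "response for #{url}" # stand-in for the real network call
      end
    end
  end
end

puts HttpClientGem.get("https://example.com")

# In a Rails app, an initializer could opt in with:
#   HttpClientGem.notifier = ActiveSupport::Notifications
```

The gem stays dependency-free, and a tracing subscriber only has to be written once against whichever notifier the application plugs in.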

Jeremy Daer

Oct 18, 2018, 3:23:52 PM
to rubyonra...@googlegroups.com
On Thu, Oct 18, 2018 at 2:52 AM 'Daniel Schierbeck' via Ruby on Rails: Core <rubyonra...@googlegroups.com> wrote:
Looks like OpenCensus already has support for development mode UIs, currently only for Java and Go though: https://opencensus.io/core-concepts/z-pages/

This is a great starting point. Rails dev can level up from there.

Have you deployed an OpenCensus integration to production? At least the metrics part looks pretty advanced, maybe too much so.

Not in production. We have a branch of Basecamp that exports to Stackdriver. In production, we'd prefer to use local agents and collectors rather than export directly to a vendor: https://github.com/census-instrumentation/opencensus-service

Daniel Schierbeck

Oct 19, 2018, 4:55:52 AM
to rubyonra...@googlegroups.com
On Thu, Oct 18, 2018 at 9:24 PM Jeremy Daer <jerem...@gmail.com> wrote:
On Thu, Oct 18, 2018 at 2:52 AM 'Daniel Schierbeck' via Ruby on Rails: Core <rubyonra...@googlegroups.com> wrote:
Looks like OpenCensus already has support for development mode UIs, currently only for Java and Go though: https://opencensus.io/core-concepts/z-pages/

This is a great starting point. Rails dev can level up from there.

Have you deployed an OpenCensus integration to production? At least the metrics part looks pretty advanced, maybe too much so.

Not in production. We have a branch of Basecamp that exports to Stackdriver. In production, we'd prefer to use local agents and collectors rather than export directly to a vendor: https://github.com/census-instrumentation/opencensus-service

Sounds like you are farther ahead than us then – we're pushing stuff directly to Datadog right now.

How about this as a starting point: I try to add proper AS::N instrumentation to the places where monkey patches are currently used, e.g. middleware execution. I'll CC you on the PRs. Maybe we can stay in touch regarding your experience with OpenCensus in production, what, if anything, would be needed in order to "natively" support it, and anything else you might think relevant? It sounds like it's premature to add a dependency on the opencensus gem and push traces from Rails itself, unless you think we're ready?

One problem is that we're unlikely to be able to run on Rails master in production, so there's little production feedback I can give.
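
The middleware case mentioned above could look roughly like the following. This is a sketch under assumptions, not the code that went into Rails: `InstrumentedMiddleware`, `RecordingNotifier`, `PassThrough`, and the event name `middleware.example` are all hypothetical, and the notifier is a stand-in for anything with an `instrument(name, payload) { ... }` method. It relies only on the Rack calling convention (`call(env)` returning a status, headers, and body), so it runs without the rack gem.

```ruby
# Hypothetical wrapper; not part of Rack or Rails.
class InstrumentedMiddleware
  def initialize(middleware, notifier)
    @middleware = middleware
    @notifier = notifier # responds to instrument(name, payload) { ... }
  end

  # Emits one event per invocation, then delegates to the wrapped middleware.
  def call(env)
    @notifier.instrument("middleware.example", middleware: @middleware.class.name) do
      @middleware.call(env)
    end
  end
end

# Stand-in notifier that records events instead of publishing them.
class RecordingNotifier
  attr_reader :events

  def initialize
    @events = []
  end

  def instrument(name, payload = {})
    @events << [name, payload]
    yield
  end
end

# Stand-in middleware that simply forwards the request.
class PassThrough
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  end
end

app = ->(_env) { [200, { "content-type" => "text/plain" }, ["ok"]] }
notifier = RecordingNotifier.new
stack = InstrumentedMiddleware.new(PassThrough.new(app), notifier)

status, _headers, _body = stack.call({})
puts status
p notifier.events
```

Wrapping at stack-build time avoids patching each middleware class, which is the brittleness the monkey-patching approach runs into.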

Jeremy Daer

Oct 19, 2018, 2:23:34 PM
to rubyonra...@googlegroups.com
On Fri, Oct 19, 2018 at 1:55 AM 'Daniel Schierbeck' via Ruby on Rails: Core <rubyonra...@googlegroups.com> wrote:
On Thu, Oct 18, 2018 at 9:24 PM Jeremy Daer <jerem...@gmail.com> wrote:
On Thu, Oct 18, 2018 at 2:52 AM 'Daniel Schierbeck' via Ruby on Rails: Core <rubyonra...@googlegroups.com> wrote:
Looks like OpenCensus already has support for development mode UIs, currently only for Java and Go though: https://opencensus.io/core-concepts/z-pages/

This is a great starting point. Rails dev can level up from there.

Have you deployed an OpenCensus integration to production? At least the metrics part looks pretty advanced, maybe too much so.

Not in production. We have a branch of Basecamp that exports to Stackdriver. In production, we'd prefer to use local agents and collectors rather than export directly to a vendor: https://github.com/census-instrumentation/opencensus-service

Sounds like you are farther ahead than us then – we're pushing stuff directly to Datadog right now.

Your instrumentation is likely further along since Datadog supports tagging and ingests traces :)

How about this as a starting point: I try to add proper AS::N instrumentation to the places where monkey patches are currently used, e.g. middleware execution. I'll CC you on the PRs. Maybe we can stay in touch regarding your experience with OpenCensus in production, what, if anything, would be needed in order to "natively" support it, and anything else you might think relevant? It sounds like it's premature to add a dependency on the opencensus gem and push traces from Rails itself, unless you think we're ready?

Great path; agreed.

One problem is that we're unlikely to be able to run on Rails master in production, so there's little production feedback I can give.

A "rails-canary" branch could be enough!

Daniel Azuma

Oct 19, 2018, 4:15:15 PM
to Ruby on Rails: Core
Hi folks,

Thought I'd jump in here as the engineer who has done most of the implementation on the opencensus gem so far. Ruby support in OpenCensus is currently a bit behind other languages—we don't yet have support for stats, z-pages, and some other things. So we're starting a push to get it up to date; I've been doing some updates myself, and it looks like Google will be donating another engineer for a period of time.

I'd love to help get OpenCensus's instrumentation fleshed out for people's use cases. The current gem does have basic integration with AS::N to collect trace information for events that are instrumented, but I'm trying very very hard not to introduce monkey patches. If you're using the opencensus gem and have particular instrumentation needs, I'll be happy to help with PRs and get them committed upstream. Please don't hesitate to reach out to me.

There also isn't a Datadog exporter yet for Ruby (that I know of), but I'd love to help get one started up.

Daniel Azuma

Daniel Schierbeck

Oct 22, 2018, 10:56:04 AM
to rubyonra...@googlegroups.com
On Fri, Oct 19, 2018 at 10:15 PM Daniel Azuma <daz...@gmail.com> wrote:
Hi folks,

Thought I'd jump in here as the engineer who has done most of the implementation on the opencensus gem so far. Ruby support in OpenCensus is currently a bit behind other languages—we don't yet have support for stats, z-pages, and some other things. So we're starting a push to get it up to date; I've been doing some updates myself, and it looks like Google will be donating another engineer for a period of time.

Sounds great!
 
I'd love to help get OpenCensus's instrumentation fleshed out for people's use cases. The current gem does have basic integration with AS::N to collect trace information for events that are instrumented, but I'm trying very very hard not to introduce monkey patches. If you're using the opencensus gem and have particular instrumentation needs, I'll be happy to help with PRs and get them committed upstream. Please don't hesitate to reach out to me.

One thing I'm unsure of is naming conventions – with Datadog, we'll have a "span name" that's close to e.g. the AS::N event names, such as `rack.request`. In addition to that, there's the notion of a "resource", typically the name of an endpoint, e.g. `ArticlesController#show`. That part seems to be missing from OpenCensus, and the span names are overloaded with both span type info and "endpoint" names. Is there a standardized way to capture both? This is important because it's nice to have a small set of span types, but the resources can number in the thousands and you'll typically filter those.
 

There also isn't a Datadog exporter yet for Ruby (that I know of), but I'd love to help get one started up.

I don't really get this part of OC – since there's a standard wire format, would you not want an external process doing the exporting?
 

Daniel Azuma

Jaana Burcu Dogan

Oct 22, 2018, 7:41:57 PM
to Ruby on Rails: Core


On Monday, October 22, 2018 at 7:56:04 AM UTC-7, Daniel Schierbeck wrote:
On Fri, Oct 19, 2018 at 10:15 PM Daniel Azuma <daz...@gmail.com> wrote:
Hi folks,

Thought I'd jump in here as the engineer who has done most of the implementation on the opencensus gem so far. Ruby support in OpenCensus is currently a bit behind other languages—we don't yet have support for stats, z-pages, and some other things. So we're starting a push to get it up to date; I've been doing some updates myself, and it looks like Google will be donating another engineer for a period of time.

Sounds great!
 
I'd love to help get OpenCensus's instrumentation fleshed out for people's use cases. The current gem does have basic integration with AS::N to collect trace information for events that are instrumented, but I'm trying very very hard not to introduce monkey patches. If you're using the opencensus gem and have particular instrumentation needs, I'll be happy to help with PRs and get them committed upstream. Please don't hesitate to reach out to me.

One thing I'm unsure of is naming conventions – with Datadog, we'll have a "span name" that's close to e.g. the AS::N event names, such as `rack.request`. In addition to that, there's the notion of a "resource", typically the name of an endpoint, e.g. `ArticlesController#show`. That part seems to be missing from OpenCensus, and the span names are overloaded with both span type info and "endpoint" names. Is there a standardized way to capture both? This is important because it's nice to have a small set of span types, but the resources can number in the thousands and you'll typically filter those.
 

There also isn't a Datadog exporter yet for Ruby (that I know of), but I'd love to help get one started up.

I don't really get this part of OC – since there's a standard wire format, would you not want an external process doing the exporting?

Jeremy Daer

Oct 22, 2018, 10:33:44 PM
to rubyonra...@googlegroups.com
On Mon, Oct 22, 2018 at 9:56 AM 'Daniel Schierbeck' via Ruby on Rails: Core <rubyonra...@googlegroups.com> wrote:
On Fri, Oct 19, 2018 at 10:15 PM Daniel Azuma <daz...@gmail.com> wrote:
I'd love to help get OpenCensus's instrumentation fleshed out for people's use cases. The current gem does have basic integration with AS::N to collect trace information for events that are instrumented, but I'm trying very very hard not to introduce monkey patches. If you're using the opencensus gem and have particular instrumentation needs, I'll be happy to help with PRs and get them committed upstream. Please don't hesitate to reach out to me.

One thing I'm unsure of is naming conventions – with Datadog, we'll have a "span name" that's close to e.g. the AS::N event names, such as `rack.request`. In addition to that, there's the notion of a "resource", typically the name of an endpoint, e.g. `ArticlesController#show`. That part seems to be missing from OpenCensus, and the span names are overloaded with both span type info and "endpoint" names. Is there a standardized way to capture both? This is important because it's nice to have a small set of span types, but the resources can number in the thousands and you'll typically filter those.

Not sure! Would check out the Datadog Go exporter to start, since it's mapping from one span to another. Looks like it's just using the span name as the resource name rather than pulling it from an annotated attribute.

I'd naively expect to see spans annotated with the controller action (picked up from the active trace context) and have that exported as Datadog resource.
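
A sketch of that idea, under assumptions: the span keeps a small, stable name and carries the endpoint as an attribute, and an exporter lifts that attribute into the resource field, falling back to the span name when it is absent. `ExportableSpan`, `to_datadog_fields`, and the `rails.endpoint` attribute key are hypothetical; this is not the Datadog exporter's actual behavior or an OpenCensus convention.

```ruby
# Hypothetical span shape: a name plus free-form string attributes.
ExportableSpan = Struct.new(:name, :attributes)

# Hypothetical attribute key used to carry the endpoint on a span.
RESOURCE_KEY = "rails.endpoint"

# Maps a span to the two fields discussed in the thread: a coarse span name
# and a fine-grained resource. Without the attribute, the resource falls
# back to the span name.
def to_datadog_fields(span)
  {
    name: span.name,
    resource: span.attributes.fetch(RESOURCE_KEY, span.name)
  }
end

annotated = ExportableSpan.new("rack.request", { RESOURCE_KEY => "ArticlesController#show" })
plain = ExportableSpan.new("rack.request", {})

p to_datadog_fields(annotated)
p to_datadog_fields(plain)
```

Keeping the endpoint out of the span name preserves a small set of span types while still allowing filtering over thousands of resources.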

I don't really get this part of OC – since there's a standard wire format, would you not want an external process doing the exporting?

Direct export can be appealing for easy-setup or quick-deploy scenarios like dev/test, one-click Heroku apps, or short-lived services like one-off jobs run outside the main cluster.

Daniel Schierbeck

Oct 26, 2018, 5:51:52 AM
to rubyonra...@googlegroups.com

--

Jeremy Daer

Oct 26, 2018, 10:57:12 PM
to rubyonra...@googlegroups.com
Nice! 👍


Jeremy Daer

Mar 19, 2019, 11:41:06 AM
to rubyonra...@googlegroups.com
And merged (targeting Rails 6.0.0.beta4) 😊

Daniel Schierbeck

Mar 19, 2019, 1:48:35 PM
to rubyonra...@googlegroups.com
Awesome! Are you working on other OC integrations?