Finagle-Zipkin Batteries


Moses Nakamura

Jul 6, 2012, 11:59:06 AM
to zipki...@googlegroups.com
I'm trying to use the ZipkinTracer in Finagle-Zipkin, and the documentation on GitHub seems to suggest that all I need to do is set it as the .tracerFactory on the server builder.  It seems to be properly hooked up to my ZipkinCollector, because if I manually call Tracer.recordRpcName I get information.  However, it doesn't seem to send anything except the records that I create manually.  Since the point of Dapper/Zipkin seems to be that it is embedded in networking/flow control/RPC libraries, I feel like there must be some way of turning on Finagle-Zipkin's automatic recording.  How do I do this?

Thanks,
Moses Nakamura

Johan Oskarsson

Jul 6, 2012, 12:46:47 PM
to zipki...@googlegroups.com
Hi Moses,

All you should need to do is set the tracerFactory with ZipkinTracer. Note that this only works for the Finagle modules to which we have added tracing. So far they include finagle-thrift, finagle-http, finagle-redis and finagle-memcache. Are you using any of these?

For an example of how we create the "core annotations" (client send, client received, server received and server send), see the HTTP codec here: https://github.com/twitter/finagle/blob/master/finagle-http/src/main/scala/com/twitter/finagle/http/Codec.scala (in particular the classes HttpClientTracingFilter and HttpServerTracingFilter).

For thrift, memcache and redis there is nothing special you need to do. Unfortunately, for finagle-http you need to explicitly enable tracing (see the case class Http in the above file). This is so that people can use finagle-http externally. I've created a todo for myself to make a note about this in the Finagle docs.
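
To illustrate the idea of an opt-in tracing filter, here is a minimal, self-contained sketch. It does not use Finagle; RecordingTracer, TracingSketch and the annotation objects are hypothetical names made up for this example, and the real filters in Codec.scala may well look different. It only shows the shape: record "client send" before the call, "client received" after it returns, and do nothing when tracing is switched off.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-ins for illustration only; these are not Finagle types.
sealed trait CoreAnnotation
case object ClientSend extends CoreAnnotation
case object ClientRecv extends CoreAnnotation

// Collects annotations in memory instead of shipping them to a collector.
final class RecordingTracer {
  private val buffer = ListBuffer.empty[CoreAnnotation]
  def record(a: CoreAnnotation): Unit = buffer += a
  def recorded: List[CoreAnnotation] = buffer.toList
}

object TracingSketch {
  // Wraps a call so that ClientSend is recorded before it runs and
  // ClientRecv after it returns. When `enabled` is false the original
  // call is handed back untouched, so nothing is recorded.
  def traced[Req, Rep](tracer: RecordingTracer, enabled: Boolean)(call: Req => Rep): Req => Rep =
    if (!enabled) call
    else { req =>
      tracer.record(ClientSend)
      val rep = call(req)
      tracer.record(ClientRecv)
      rep
    }
}
```

With `enabled = true` the tracer sees both annotations for every call; with `enabled = false` the wrapped function behaves exactly like the original, which mirrors why an HTTP service with tracing left off sends only manually created records.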

/Johan

Nakamura

Jul 6, 2012, 2:22:30 PM
to zipki...@googlegroups.com
Hi Johan,
That worked great for my HttpService, thanks!

I am using finagle-thrift's ThriftServerFramedCodec, and then protocol things from apache thrift.  I'm not sure if that's enough, but right now I'm only making http calls, so I wouldn't be able to tell.
However, now the web interface doesn't seem to be populating itself, although serviceNames and spanNames are both working.  Do I need to implement Aggregates for the web interface to work?  I am using the NullAggregates.

Also, when I clicked on the web service to get all spans, it sends a null serviceName to getTraceIdsByServiceName (which I guess is supposed to signal all?) but there is code in ZipkinQuery which seems to throw an exception when it gets a null serviceName, although my service isn't spewing errors to stdout.  Should I have implemented code in the index that lets me get all spans if I'm sent a null serviceName?  If so, why isn't the interface for getByServiceName Option[String]?

Sorry for all of the questions, I am trying to figure out what I need to implement for myself.  I'm not using Cassandra, or ZK, so I've removed the references to ZK and I've reimplemented the Index and Storage traits, and replaced the CassandraAggregates with NullAggregates.

Best,
Moses

Johan Oskarsson

Jul 6, 2012, 2:48:18 PM
to zipki...@googlegroups.com
Hey,

Cool!

So are you using a modified finagle-thrift? Not sure I understood that part.

No, you should not need to implement aggregates. Those are just for typeahead suggestions of annotations in the UI; it will work without them.

getTraceIdsByServiceName always needs a service name, so that should not be null. So that gets sent by the web ui even if you select one of the services in the list? Must be a bug there then, could you describe the scenario in a bit more detail?
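
One way to make the "service name is always required" contract explicit in a custom index is sketched below. This is only an illustration under assumptions: InMemoryIndex, QueryBoundary and their method names are hypothetical and do not mirror Zipkin's actual Index trait or ZipkinQuery. The index itself rejects a missing name, and the boundary converts a possibly-null value into an Option before it reaches the index.

```scala
import scala.collection.mutable

// Hypothetical in-memory index for illustration; names and signatures are
// assumptions and are not taken from Zipkin's Index trait.
final class InMemoryIndex {
  private val idsByService = mutable.Map.empty[String, Vector[Long]]

  // Appends a trace id to the list kept for the given service.
  def index(serviceName: String, traceId: Long): Unit = synchronized {
    val existing = idsByService.getOrElse(serviceName, Vector.empty)
    idsByService.update(serviceName, existing :+ traceId)
  }

  // A service name is mandatory: a null or empty one is rejected rather
  // than being interpreted as "all services".
  def traceIdsByServiceName(serviceName: String): Seq[Long] = synchronized {
    require(serviceName != null && serviceName.nonEmpty, "serviceName is required")
    idsByService.getOrElse(serviceName, Vector.empty)
  }
}

object QueryBoundary {
  // Turns a possibly-null name from a caller into an Option first, so the
  // index never sees null and the caller gets an error value back.
  def lookup(index: InMemoryIndex, rawServiceName: String): Either[String, Seq[Long]] =
    Option(rawServiceName).map(_.trim).filter(_.nonEmpty) match {
      case Some(name) => Right(index.traceIdsByServiceName(name))
      case None       => Left("serviceName is required")
    }
}
```

Returning a `Left` at the boundary keeps the failure visible to the caller, which would make a UI bug that sends a null name easier to spot than a silently swallowed exception.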

Out of curiosity what back end storage are you using? Would you be willing to contribute that back? Ideally we could split out Cassandra into zipkin-storage-cassandra and add another storage next to it. 
Same for ZK: I recommend having a way to change the sample rate on the fly in production, but if you don't want to rely on ZK, could you help us make that a configuration setting? Most of it should already be optional. Would be neat if we could stick to a shared code base.

Thanks again for looking into Zipkin.

/Johan

Nakamura

Jul 6, 2012, 3:13:13 PM
to zipki...@googlegroups.com
Hey,

I haven't modified my finagle-thrift, I just wasn't sure which classes I needed to be using from finagle-thrift in order to get the benefits.  The class I am using is the codec I mentioned.

I found a bug: the traces/services_json endpoint in zipkin-web returns an empty array every time.  I would make a pull request, but my version of zipkin-web is pretty hacked, and I'm not good enough at Ruby to know whether my hack is kosher or not.  Because traces/services_json wasn't returning valid service names, I was sending a null for my serviceName.

I've implemented it for Redis and I've done an in-memory version, but obviously that one only works if the query and collector are both launched by the same Java process.  Once they are no longer horrible hacks I'll probably talk to my boss about open sourcing them, yes.  I've also been working on a Mongo driver in my free time, but I don't have that much free time.

Best,
Moses

Johan Oskarsson

Jul 6, 2012, 3:50:12 PM
to zipki...@googlegroups.com
On 6 jul 2012, at 12:13, Nakamura wrote:

> Hey,
>
> I haven't modified my finagle-thrift, I just wasn't sure which classes I needed to be using from finagle-thrift in order to get the benefits.  The class I am using is the codec I mentioned.


Ah I see, yeah that should be all you need to do. Depends a bit on how the rest of your Thrift infrastructure works. If you call Finagle -> Finagle you should get one Span with the information from that call. If you do Finagle -> Vanilla Thrift you'd only get the client annotations. Same the other way around.
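
The difference between the two setups can be sketched with a small self-contained model. This is an illustration only: Reported and SpanSketch are hypothetical names, and the short strings "cs", "sr", "ss" and "cr" are used here simply as labels for client send, server received, server send and client received. Annotations reported by any host are grouped by trace id and span id, so a span ends up with all four only when both sides report.

```scala
// Hypothetical model for illustration; not Zipkin's actual data types.
final case class Reported(traceId: Long, spanId: Long, value: String)

object SpanSketch {
  // Groups annotations reported by any host into spans keyed by
  // (traceId, spanId), keeping the set of annotation labels per span.
  def assemble(reported: Seq[Reported]): Map[(Long, Long), Set[String]] =
    reported
      .groupBy(r => (r.traceId, r.spanId))
      .map { case (key, anns) => key -> anns.map(_.value).toSet }
}
```

If only the client side is instrumented, `assemble` sees just the "cs" and "cr" entries for that span; once the server also reports "sr" and "ss" under the same ids, all four land in a single span.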

> I found a bug in that the traces/services_json endpoint in zipkin-web returns an empty array every time.  I would make a pull request, but my version of zipkin-web is pretty hacked, and I'm not good enough at ruby to know whether my hack is kosher or not.  Because traces/services_json wasn't returning valid service names, I was sending a null for my serviceName.

We're working on throwing out the Ruby layer completely and moving that into Scala too. That would remove one piece of deployment complexity and probably make the UI a bit faster too. Hopefully that fixes your issue too (but probably introduces a few new ones). In the meantime, if you can isolate the fix in a pull request, we'd love to take a look at it.

> I've implemented it for redis and I've done an in memory version, but obviously that one only works if the query and collector are both launched by the same java process.  When they aren't horrible hacks I'll probably talk to my boss about open sourcing them, yes.  I've also been working on a Mongo driver in my free time, but I don't have that much free time.

Cool, sounds good.

Thanks