Finagle-Zipkin HTTP tracing trouble

1,215 views
Skip to first unread message

Robbie Strickland

unread,
Sep 20, 2012, 11:26:59 AM9/20/12
to zipki...@googlegroups.com
I am trying to understand how to correctly instrument a Finagle service using Zipkin by following the simple HTTP example, but I am not seeing any traces.  I am attempting to post directly to the collector running on my local machine, which I have verified to work correctly using the test app in zipkin-test.  I set the sample rate to 1.0 and even added some explicit annotations, but I see no evidence of tracing on the collector.  I am querying the service using a simple curl request.  Here's my code:

object ServerTest extends App {
  val service: Service[HttpRequest, HttpResponse] = new Service[HttpRequest, HttpResponse] {
    def apply(request: HttpRequest) = {
      Trace.record("Got a request")
      Future {
        Trace.record("Sending response")
        new DefaultHttpResponse(HTTP_1_1, OK)
      }
    }
  }

  val address: SocketAddress = new InetSocketAddress(10000)

  ServerBuilder()
    .codec(Http())
    .bindTo(address)
    .name("HttpServer")
    .tracerFactory(ZipkinTracer(scribeHost="Robbie-Stricklands-MacBook-Pro.local", scribePort=9410, sampleRate=1.0F))
    .build(service)
}

Thanks in advance for any help!

Robbie

Nakamura

unread,
Sep 20, 2012, 11:54:52 AM9/20/12
to zipki...@googlegroups.com
The Http() codec needs to have tracing explicitly enabled for it, that should help.   Look for the method enableTracing in the http/Codec.scala file.  Trace.record doesn't make the Tracer send traces.  That doesn't happen until logSpan gets called in RawZipkinTracer.scala.  Also, for debugging purposes, ConsoleTracer can't be beat, worth looking into, to at least see if your tracing code is at least getting hit.


Robbie

--
 
 
 

Robbie Strickland

unread,
Sep 20, 2012, 1:58:06 PM9/20/12
to zipki...@googlegroups.com
Ok, so I confirmed my tracing code is getting hit using ConsoleTracer, and I changed my codec config to ".codec(Http().enableTracing(true))" as you indicated.  I also see that logSpan is getting called by DeadlineSpanMap after 120 seconds if it "isn't triggered by a natural end".  So given this, should I presume that my traces would get logged after 120 seconds no matter what (which isn't happening)?  What is a "natural end" and is there something I need to do to explicitly end the span?  I get the whole idea, but it's not clear code-wise what pieces all need to be there to trigger the logging.  Using the ConsoleTracer it seemed to just work, and I guess I was expecting Zipkin to be the same.  Is there a simple modification to my code that will make this work, or am I missing something more fundamental?

Franklin Hu

unread,
Sep 20, 2012, 2:06:07 PM9/20/12
to zipki...@googlegroups.com
In Finagle, we hold a span in memory optimistically until we have both annotations for the Span (CLIENT_SEND and CLIENT_RECV if you're making a client call, SERVER_SEND and SERVER_RECV if it's a server), or until we reach some timeout. Are you getting both SERVER_SEND and SERVER_RECV annotations with the ConsoleTracer?


Franklin


--
 
 
 

Johan Oskarsson

unread,
Sep 20, 2012, 2:09:34 PM9/20/12
to zipki...@googlegroups.com
I would recommend adding a StatsReceiver to the ServerBuilder. That's where we report if the zipkin tracer was able to send the trace to scribe or not. I believe it is added with .reportTo(). I don't know if there's one that reports to the console, but it should be easy to implement one if there is none.

/Johan

--
 
 
 

Robbie Strickland

unread,
Sep 20, 2012, 2:12:17 PM9/20/12
to zipki...@googlegroups.com
Console output:

0920 17:34:03.998 6ac5cc80e117ce15.6ac5cc80e117ce15<:6ac5cc80e117ce15] Rpcname(HttpServer,GET)
0920 17:34:04.005 6ac5cc80e117ce15.6ac5cc80e117ce15<:6ac5cc80e117ce15] ServerAddr(0.0.0.0/0.0.0.0:10000)
0920 17:34:04.007 6ac5cc80e117ce15.6ac5cc80e117ce15<:6ac5cc80e117ce15] BinaryAnnotation(http.uri,/)
0920 17:34:04.008 6ac5cc80e117ce15.6ac5cc80e117ce15<:6ac5cc80e117ce15] ServerRecv()
0920 17:34:04.009 6ac5cc80e117ce15.6ac5cc80e117ce15<:6ac5cc80e117ce15] Message(Got a request)
0920 17:34:04.010 6ac5cc80e117ce15.6ac5cc80e117ce15<:6ac5cc80e117ce15] Message(Sending response)
0920 17:34:04.014 6ac5cc80e117ce15.6ac5cc80e117ce15<:6ac5cc80e117ce15] ServerSend()
0920 17:34:04.011 6ac5cc80e117ce15.6ac5cc80e117ce15<:6ac5cc80e117ce15] Message(Gc(1,0.PCopy,2012-09-20 17:34:04 +0000,7975000.nanoseconds))

Nakamura

unread,
Sep 20, 2012, 2:14:12 PM9/20/12
to zipki...@googlegroups.com
You can add a statsReceiver to your ZipkinTracer as one of the arguments, too.  I usually add it to the server builder and also to the ZipkinTracer.

--
 
 
 

Robbie Strickland

unread,
Sep 20, 2012, 2:42:36 PM9/20/12
to zipki...@googlegroups.com
I created a little console stats receiver, added it to my server and the tracer, and this is the output:

Counter HttpServer:connects = 1
Counter HttpServer:received_bytes = 149
Counter HttpServer:requests = 1
Stat HttpServer:handletime_us = 61161.0
Stat HttpServer:request_latency_ms = 4.0
Counter HttpServer:success = 1
Counter zipkin:create_log_entries:error:java.lang.NoClassDefFoundError: org/slf4j/impl/StaticLoggerBinder = 1
Counter HttpServer:sent_bytes = 19
Stat HttpServer:connection_received_bytes = 149.0
Stat HttpServer:connection_sent_bytes = 19.0
Stat HttpServer:connection_duration = 170.0
Stat HttpServer:connection_requests = 1.0
Counter zipkin:log_span:ok = 1

This seems to indicate a successful span being logged, but is there a way to know where it went?  My collector remains silent...

Johan Oskarsson

unread,
Sep 20, 2012, 4:06:06 PM9/20/12
to zipki...@googlegroups.com
Hmm, this line doesn't look great: Counter zipkin:create_log_entries:error:java.lang.NoClassDefFoundError: org/slf4j/impl/StaticLoggerBinder = 1

Could there be a class path issue or is the slf4j jar file simply missing?

/Johan

--
 
 
 

Robbie Strickland

unread,
Sep 20, 2012, 4:56:53 PM9/20/12
to zipki...@googlegroups.com
I saw that but assumed it non-fatal since this is a common error when you don't have a logger implementation on the classpath for SLF4J to use, but it doesn't typically throw any sort of runtime exception.  In this case, however, I dropped slf4j-simple-1.5.8.jar on the classpath and it suddenly started working.  This is problematic, since the SLF4J version used by finagle-zipkin and the one used by zipkin itself are different (1.5.8 vs 1.6.4). Seems the two should be in sync.  Either way, that was the issue.  Appreciate the help!!

Johan Oskarsson

unread,
Sep 20, 2012, 5:09:56 PM9/20/12
to zipki...@googlegroups.com
Glad it helped, but worried others might run into this. It's also not great that the log span ok counter got incremented when it actually failed.

Would you mind creating a ticket for this issue so we can track it further?

Thanks!

/Johan

--
 
 
 

Robbie Strickland

unread,
Sep 20, 2012, 5:24:40 PM9/20/12
to zipki...@googlegroups.com
Will do.

Blake Smith

unread,
Oct 2, 2013, 10:06:19 AM10/2/13
to zipki...@googlegroups.com
Hey Zipkiners,

I was experiencing similar issues to this thread: Our finagle service (version 6.5.0) was never writing anything to scribe with the ZipkinTracer enabled (I didn't see any statsReceiver activity, nor was there any log data in scribe). I plugged in the RawZipkinTracer, and started seeing these log messages via the JavaLoggerStatsReceiver:

SLF4J: Found binding in [jar:file:/Users/blake/.m2/repository/org/slf4j/slf4j-jdk14/1.6.1/slf4j-jdk14-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/blake/.m2/repository/org/slf4j/slf4j-log4j12/1.6.4/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
INF [20131001-17:13:51.763] Finagle: zipkin/create_log_entries/error/java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeAsString([B)Ljava/lang/String; incr 1
INF [20131001-17:13:51.776] Finagle: tapp/handletime_us add 106114.000000INF [20131001-17:13:51.856] Finagle: zipkin/create_log_entries/error/java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeAsString([B)Ljava/lang/String; incr 1
INF [20131001-17:13:51.877] Finagle: tapp/requests incr 1INF [20131001-17:13:51.878] Finagle: tapp/success incr 1
INF [20131001-17:13:51.879] Finagle: tapp/request_latency_ms add 161.000000INF [20131001-17:13:51.883] Finagle: zipkin/create_log_entries/error/java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeAsString([B)Ljava/lang/String; incr 1
INF [20131001-17:13:51.895] Finagle: tapp/sent_bytes incr 42INF [20131001-17:13:51.907] Finagle: tapp/closechans incr 1
INF [20131001-17:13:51.907] Finagle: tapp/connection_received_bytes add 149.000000INF [20131001-17:13:51.908] Finagle: tapp/connection_sent_bytes add 42.000000
INF [20131001-17:13:51.908] Finagle: tapp/connection_duration add 286.000000INF [20131001-17:13:51.908] Finagle: tapp/connection_requests add 1.000000
INF [20131001-17:13:51.909] Finagle: tapp/closed incr 1INF [20131001-17:13:51.916] Finagle: zipkin/log_span/ok incr 1
INF [20131001-17:13:51.916] Finagle: zipkin/log_span/ok incr 1INF [20131001-17:13:51.916] Finagle: zipkin/log_span/ok incr 1
INF [20131001-17:13:55.438] Finagle: tapp/load 0.000000INF [20131001-17:13:55.442] Finagle: tapp/pending 0.000000
INF [20131001-17:14:01.613] Finagle: tapp/connections 0.000000

Looking at our maven dependency tree, I saw that many of our jars depend on commons-codec 1.4, but finagle-http and finagle-memcached depend on commons-codec 1.5. I was able to get zipkin trace working  by explicitly adding commons-codec 1.5 as a dependency in our pom.xml, and forcing all other jars that depended on commons-codec 1.4 to use 1.5 instead. This feels like a hack, but I'm not sure how else to fix this jar dependency issue.

A few questions:
  •  Does anyone have a more elegant way to fix this classpath jar dependency problem besides adding commons-codec 1.5 as an explicit dependency to my project?
  •  It looks like log_span/ok is still being incremented even though the data was never written to Scribe. Was a ticket ever opened for this issue (from earlier in this thread), or should I open one and work on a fix?

Thanks!

Blake

Brian Degenhardt

unread,
Oct 2, 2013, 11:54:59 AM10/2/13
to zipkin-user
Versions of finagle-zipkin before, I believe, 6.6.2 required a concrete implementation of slf4j to work correctly.  You might try adding slf4j-jdk14 to your classpath.


--
 
---
You received this message because you are subscribed to the Google Groups "zipkin-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to zipkin-user...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Reply all
Reply to author
Forward
0 new messages