Too many ServiceCache-0 threads being spawned while sending to Tranquility


SAURABH VERMA

Mar 2, 2015, 4:32:57 AM
Hello,

   Please see the attached logs of the Java process that listens to Kafka and sends messages to Druid using Tranquility.
   The process has two thread pools: (i) a pool of 3 threads reading from 3 Kafka partitions, and (ii) a dedicated pool of 8 threads for each thread in (i), i.e. 24 threads in total.
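   For reference, the wiring is roughly like the sketch below (class and method names are illustrative, not the actual code):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ListenerSketch {
    public static void main(String[] args) {
        // (i) one consumer thread per Kafka partition
        ExecutorService consumers = Executors.newFixedThreadPool(3);
        List<ExecutorService> senderPools = new ArrayList<ExecutorService>();

        for (int partition = 0; partition < 3; partition++) {
            // (ii) a dedicated pool of 8 sender threads per consumer => 24 in total
            final ExecutorService senders = Executors.newFixedThreadPool(8);
            senderPools.add(senders);
            consumers.submit(new Runnable() {
                public void run() {
                    // read from the Kafka partition and hand batches to 'senders',
                    // which push them to Druid via Tranquility (calls omitted)
                }
            });
        }
    }
}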
     
   The issue is that too many ServiceCache-0 threads are being spawned by this Java process. Could you please help with this and confirm whether it is a Tranquility-side issue?

Thanks,
Saurabh
javacore.20150228.210648.31794.0001.txt

SAURABH VERMA

Mar 2, 2015, 10:38:40 PM
Additional information : 
Druid version : 0.6.171
Tranquility version : 0.3.2
Java process config :

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<configuration>
        <numProcessorThreads>3</numProcessorThreads>
        <numTranquilityThreads>6</numTranquilityThreads>
        <processorQueueSize>6</processorQueueSize>
        <tranquilityQueueSize>100000000</tranquilityQueueSize>
        <topic>VidMessage_0</topic>
        <zkAddress>phxscal1041:2181</zkAddress>
        <consumerGroup>druidRT</consumerGroup>
        <zkSessionTimeoutMs>1000</zkSessionTimeoutMs>
        <zkSyncTime>2000</zkSyncTime>
        <zkAutoCommitIntervalMs>5000</zkAutoCommitIntervalMs>
        <overlordServiceName>overlord</overlordServiceName>
        <discoveryPath>/druid/discovery</discoveryPath>
        <segmentGranularity>HOUR</segmentGranularity>
        <indexGranularity>MINUTE</indexGranularity>
        <windowPeriod>PT10M</windowPeriod>
        <warmingPeriod>PT0M</warmingPeriod>
        <numRTReplica>1</numRTReplica>
        <numPartitions>8</numPartitions>
        <maxRowsInMemory>500000</maxRowsInMemory>
        <maxPendingPersists>2</maxPendingPersists>
        <firehoseGracePeriod>5</firehoseGracePeriod>
        <tsdbHost>scal1105</tsdbHost>
        <tsdbPort>4242</tsdbPort>
        <metricCollectionPeriodInSec>10</metricCollectionPeriodInSec>
        <host>scal1150</host>
        <druidTopic>VidMessage</druidTopic>
        <tranqBatchSize>500</tranqBatchSize>
</configuration>

Could you please have a look at it?

Gian Merlino

Mar 2, 2015, 11:18:23 PM
Hi Saurabh,

Those look like possibly leaked firehose-discovery threads. How long has this process been running for? What have you got "partitions", "replicants", "segmentGranularity", and "windowPeriod" set to?

SAURABH VERMA

Mar 2, 2015, 11:56:31 PM
Gian,

    The process has been running for around 88 hours, and :-
     "partitions" : 16,
     "replicants" : 1,
     "segmentGran" : HOUR,
     "windowPeriod" : PT10M,

    Some more observations: there are two issues seen in the downstream listener.

1.  Too many ServiceCache-0 threads

    a. This is a load-independent issue; there is a clear pattern in the increase in the number of threads per hour.

    b. When a new hour starts, the downstream listener connects to the peons for that hour, so the number of threads goes up (the new threads are ServiceCache-0 threads). However, the threads connected to the peons of past hours never go away, so the total keeps increasing.

    c. Even under load there is not much extra growth; the count increases at almost the same rate (~300 per hour).

2.  Downstream listener choking

    a. This is a load-dependent issue.

    b. With Xmx30g, the RES memory of the process is 31g at the choking point, i.e. the downstream is overwhelmed by the upstream.

    c. To simulate this issue: stop the downstream listeners for a particular topic for 30 minutes (while messages keep being pushed into Kafka for that topic), then start a downstream listener so that it comes up under very heavy load (much more than peak-hour load for the same number of channels).

    d. At this point the listener gets no messages from Kafka (or gets them at a very slow rate).

    e. Probable bottlenecks: the overwhelming may not be directly inside the listener itself; it may also be in the threads connected to ZooKeeper / Kafka, as reconnections to ZooKeeper are frequently seen in this state.

Attached:

    a. fresh.txt = thread dump taken 20 minutes after listener start

    b. 1672.txt = thread dump taken when the listener choked (and had also spawned too many threads)

    c. Output_1672.log & output_fresh.log = more readable thread-state info extracted from these dumps

 
Thanks,
Saurabh

SAURABH VERMA

Mar 2, 2015, 11:58:15 PM
 The process has been running for around 8 hours (not 88; that was a typo).

Gian Merlino

Mar 3, 2015, 2:30:11 AM
For #1, can you try tranquility 0.3.3 and see if that fixes the thread leak? The number of threads should ideally be no more than the number of services that need to be tracked at any one time (overlord + one per task, so in your case 17–33 depending on where you are in the task lifecycle).

For #2, do I understand you correctly: you have a process (the "downstream listener") that reads from Kafka using a Kafka consumer and writes to Druid using tranquility. When it's under high load, throughput slows down. Is that right? If so, that sounds like GC issues, especially if your consumed-but-not-written-to-Druid messages are piling up in the heap. Can you get heap stats out of the jvm to confirm or deny that guess?
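For example, something as simple as periodically logging the MemoryMXBean numbers from inside the process would do it (a minimal sketch; jmap or jstat against the pid works just as well):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapStats {
    public static void main(String[] args) throws InterruptedException {
        while (true) {
            // Heap usage as seen by the JVM itself
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            System.out.printf("heap used=%dMB committed=%dMB max=%dMB%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
            Thread.sleep(10000);   // every 10 seconds
        }
    }
}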

SAURABH VERMA

Mar 3, 2015, 7:58:05 AM
Sure Gian, let me test with these changes 

SAURABH VERMA

Mar 3, 2015, 8:21:11 AM
Gian,


  I worked on #1 and found that the problem is still seen even after upgrading to Tranquility 0.3.3. Could you please tell me what the ServiceCache threads are used for? I observe that ServiceCache threads are spawned as soon as the new peons come up for the new hour. I have verified that the number of threads pushing to the peons via the Tranquility API is as expected (3 Kafka consumers, each with 8 Tranquility threads => 24 threads), and that is what the thread dump shows, so it does not appear to be a synchronization issue while instantiating any thread pool.

Thanks,
Saurabh

SAURABH VERMA

Mar 4, 2015, 12:15:42 AM
Hello Gian,

   Any clue on this issue? The ServiceCache threads for the previous hours are not getting shut down and thus keep accumulating.

Thanks,
Saurabh

Gian Merlino

Mar 4, 2015, 3:21:04 PM
Hi Saurabh,

For #1, I'm looking at reproducing the thread leak issue here and will get back to you with an update.

For #2, have you had a chance to take a look at those heap stats? I suspect that the thread leak is not actually causing your throughput problem, and that something else is going on.

SAURABH VERMA

Mar 4, 2015, 3:23:58 PM
I believe the same about #2, Gian: that issue is more due to latency in the processing done before pushing to Druid. The Kafka consumer is reading messages at a higher rate than they are being pushed to Druid, and that is leading to the choking issue.

For #1, if you need any more information, please let me know; I have also attached the thread dumps in an earlier post.

Thanks
Saurabh

Gian Merlino

Mar 4, 2015, 7:04:52 PM
For #1, I could reproduce the thread leak with 0.3.2 but not with 0.3.3, so I believe it to be fixed now. At any given time you should have one ServiceCache thread for the overlord and one for each running task. Since you'll have one task per replica per partition, and potentially have overlapping tasks for a short period of time around handoff, that means you should see at most 1 + (2 * replicas * partitions) of them.
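With your 16 partitions and 1 replica, that works out to at most 1 + (2 * 1 * 16) = 33 of them at any given time, consistent with the 17–33 range I mentioned earlier.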

The bug in 0.3.2 was that you used to have one ServiceCache thread for each task that *ever* ran. In 0.3.3, you should only have one for each currently running task. I'd prefer to use a single thread for all the service caching, but I don't see a way to do that with the curator service discovery api…
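To illustrate where those threads come from, the underlying Curator pattern is roughly the sketch below (simplified, not Tranquility's actual code; the service name is just an example):

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.x.discovery.ServiceCache;
import org.apache.curator.x.discovery.ServiceDiscovery;
import org.apache.curator.x.discovery.ServiceDiscoveryBuilder;

public class ServiceCacheSketch {
    static ServiceCache<Void> watchService(CuratorFramework curator, String serviceName) throws Exception {
        ServiceDiscovery<Void> discovery = ServiceDiscoveryBuilder.builder(Void.class)
            .client(curator)
            .basePath("/druid/discovery")
            .build();
        discovery.start();

        // Each started cache runs its own internal executor thread; these are
        // the ServiceCache-0 threads showing up in your dumps.
        ServiceCache<Void> cache = discovery.serviceCacheBuilder()
            .name(serviceName)     // e.g. a task's service name
            .build();
        cache.start();
        return cache;
        // The caller must call cache.close() once the task is gone; the 0.3.2
        // leak was essentially caches for finished tasks never being closed.
    }
}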

For #2, do you have any idea what the root cause of the slowness is? If not, try checking for GCing or hitting disk i/o limits on the druid peons; those are somewhat common reasons for slowness.

SAURABH VERMA

Mar 4, 2015, 10:06:37 PM
Thanks a lot Gian for this useful information,

If the issue is not in 0.3.3, then let me check once again, as last time my observation was the same even after upgrading to 0.3.3.

For #2, 

1. I observe a pause of 1s every 8s on the message pusher console; the root cause is not clear yet and is still to be investigated. There is some processing done before pushing to the peons that can cause slowness, but ideally it should not lead to an outright pause.
2. When the pause frequency increases (to 1s every 5s after about an hour), the RES memory held by the process rises sharply and never comes back down, which shows that once the pauses start the process never catches up to its original performance. RES keeps growing until it hits the allotted Xmx limit, and that is when the whole process chokes (as a cascading effect it even fails to read data from Kafka); the only option at that point is to restart the process. This is load dependent: under non-peak hours the process runs fine for 12-14 hours, but under peak hours it works for only 6-7 hours.

For #2, I have two questions that may indicate whether the threads become blocked:

Q1. Is the message push to the peons always an "async" event? That is, regardless of the load on the pusher, is the push always async as long as there are no retransmissions? I am trying to understand whether Tranquility uses some buffer at the pusher end and, under heavy load (when that buffer is completely filled), whether a push can become a "sync" event, with new pushes allowed only once the Tranquility buffer has space again. Is that the expected behaviour?

Q2. Do I understand correctly that if the peons are slow in ingesting events, it ultimately leads to slowness in the message pusher as a cascading effect? I have seen "Transient error ... Retry in ....ms" messages on the pusher console, but that seems to be more of a service discovery issue than a load issue; please correct me. Can this transient error appear under heavy load?

SAURABH VERMA

Mar 4, 2015, 10:56:40 PM
Gian, one more piece of information that I missed to observe the last time I tried 0.3.3:

When a new hour starts and new peons are spawned, I consistently get the error below:

"java.io.IOException: Unable to push events to task: index_realtime_vidmessage_2015-03-04T20:00:00.000-07:00_4_0_fgaddpfh (status = TaskRunning) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)8476 [finagle/netty3-35] WARN com.metamx.tranquility.finagle.FutureRetry$ - Transient error, will try again in 224 ms
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at com.twitter.util.Promise$Transformer.liftedTree1$1(Promise.scala:96)

"

and it loops retrying forever; it seems it is unable to find the peon. This issue is seen in 0.3.3 and not in 0.3.2.

Version info :-

Druid : 0.6.171
Java : "1.7.0"
Zookeeper : 3.4.6



Gian Merlino

Mar 5, 2015, 12:52:14 AM
Q1) It depends on the API you use. If you use the Beam API then it's always async, but it's dangerous to just keep throwing stuff in there, because you might overflow your heap if data can't actually be sent to the peon for a while. If you use the BeamPacketizer API then it's "mostly async", meaning it's async until you fill up a fixed-size buffer, at which point it becomes blocking to exert backpressure. This is usually the behavior you want, since it prevents you from running out of memory.
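Conceptually it's like putting a bounded queue in front of your sender; a tiny illustration of the backpressure behavior (not the BeamPacketizer code itself):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BackpressureSketch {
    // Bounded buffer between the Kafka consumer and the Druid sender.
    private final BlockingQueue<String> buffer = new ArrayBlockingQueue<String>(10000);

    // Called by the consumer: returns quickly while the buffer has room ("async"),
    // but blocks once it is full, which is what pushes back on the upstream.
    public void send(String event) throws InterruptedException {
        buffer.put(event);
    }

    // Called by the sender thread: drains events and pushes them to Druid.
    public void senderLoop() throws InterruptedException {
        while (true) {
            String event = buffer.take();
            // push 'event' to the peon here (omitted)
        }
    }
}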

Q2) Yes, if there's slowness in the peons it can cause slowness upstream in your message processing stuff. Peons being overloaded will cause connections to time out, which tranquility treats as transient errors and will retry with exponential backoff.
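The retry behavior itself is just exponential backoff on transient failures, along the lines of this sketch (illustrative only, not tranquility's actual retry code):

public class RetrySketch {
    // Retry an action with exponentially growing delays after transient failures.
    static void pushWithRetry(Runnable push) throws InterruptedException {
        long delayMs = 100;                              // illustrative starting delay
        while (true) {
            try {
                push.run();
                return;
            } catch (RuntimeException transientError) {
                System.out.println("Transient error, will try again in " + delayMs + " ms");
                Thread.sleep(delayMs);
                delayMs = Math.min(delayMs * 2, 30000);  // cap the backoff
            }
        }
    }
}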

SAURABH VERMA

Mar 5, 2015, 1:24:09 AM
Thank you Gian for this information. We are using BeamPacketizer, so after a point everything becomes sync, and that is why there is the choking.

Gian, there is an issue that I am facing with 0.3.3 which was not there in 0.3.2, as mentioned below:

When a new hour starts and new peons are spawned, I consistently get the error below:

"java.io.IOException: Unable to push events to task: index_realtime_vidmessage_2015-03-04T20:00:00.000-07:00_4_0_fgaddpfh (status = TaskRunning) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)8476 [finagle/netty3-35] WARN com.metamx.tranquility.finagle.FutureRetry$ - Transient error, will try again in 224 ms
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at com.twitter.util.Promise$Transformer.liftedTree1$1(Promise.scala:96)

"

and it loops retrying forever; it seems it is unable to find the peon. This issue is seen in 0.3.3 and not in 0.3.2. Is there any version-specific compatibility required between Druid, ZK, and Tranquility?

Version info :-

Druid : 0.6.171
Java : "1.7.0"
Zookeeper : 3.4.6



Gian Merlino

Mar 5, 2015, 1:30:45 AM
Is there any more detail to that error? I think there should be some more logging or stack traces surrounding it.

SAURABH VERMA

Mar 5, 2015, 1:47:42 AM
Gian,

   PFA the complete logs. 

   Adding to this, there is only one thread pushing to the 16 peons via Tranquility; I used a single thread for now to get properly ordered logs.
kc.log

SAURABH VERMA

Mar 6, 2015, 11:11:07 AM

Hello Gian,

Did you have a chance to look at this issue?

Thanks,
Saurabh


Gian Merlino

Mar 13, 2015, 8:22:00 PM
Hi Saurabh, sorry, this issue fell off my radar for a bit. Are you still seeing this problem or did you find a solution?

Fwiw, the logs you posted most recently look like there's a service discovery issue between tranquility and the druid tasks; it can't find them. Are they actually running and advertising on the proper service discovery paths? (The tasks should mention that in their logs.)
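If you want to spot-check what is actually registered, listing the discovery path directly is one way (a hypothetical snippet using the zkAddress and discoveryPath from your config):

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class DiscoveryCheck {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
            "phxscal1041:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // Under curator-x-discovery, each child of the base path is a service name
        // and each grandchild is a registered instance of that service.
        for (String service : client.getChildren().forPath("/druid/discovery")) {
            System.out.println(service + " -> "
                + client.getChildren().forPath("/druid/discovery/" + service));
        }
        client.close();
    }
}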

SAURABH VERMA

Mar 13, 2015, 9:50:34 PM

Hi Gian,

   Yes, the issue is consistently reproducible with 0.3.3 even when the tasks are running fine. As soon as I deploy with 0.3.2, it starts working. So currently we are using 0.3.2 in prod, with a wrapper script that does a rolling restart of the Kafka listeners cum Druid pushers so that we don't hit the too-many-threads issue.

The flow observed in 0.3.3 is as below:-
1. Listeners pushing data to peons successfully
2. New hour starts, new peons created
3. Listener unable to discover the new task names to connect to
4. Restart the listener and they are able to connect this time

It seems the service discovery in the listener is unable to find the new peons if they are spawned within the same ZK session, though it can find them once the listener is restarted.

That said, the issue is still there and we have not yet found its root cause.

Thanks
Saurabh

Gian Merlino

Mar 15, 2015, 9:57:47 PM
Thanks for the update. That's strange… we're running with 0.3.3 now and it fixed the thread leak for us. Can you please provide your tranquility logs around the time that the new peons are created and cannot be found (the time period from #2–#4 in your list)? Also, what versions of finagle-core, curator-framework, curator-recipes, and curator-x-discovery are you using?

Gian Merlino

Mar 19, 2015, 5:15:14 PM
Hi Saurabh, have you had a chance to gather that information? I'm hoping to do a tranquility release candidate soon, and would like to understand more about what is going wrong with 0.3.3 in your environment.

SAURABH VERMA

Mar 20, 2015, 4:15:36 AM
Gian,

Sorry for the late reply.
Please find the required information as below :-

1. Curator-{client,framework,recipes,x-discovery} = 2.6.0
2. Finagle-core = 6.20.0 

PFA the logs covering #2 to #4.
newHourLogs.txt

SAURABH VERMA

Mar 23, 2015, 12:43:45 AM
Hi Gian,

Did you get a chance to have a look at this issue?

Thanks,
Saurabh

Gian Merlino

Mar 23, 2015, 3:13:37 PM
Hmm, I don't see anything in those logs that is too different from what we see in our logs…

Do you mind attaching the logs for the task "index_realtime_vidmessage_2015-03-20T01:00:00.000-07:00_0_0_gmlgnnna"? Based on your logs, this task corresponds to the firehoseId vidmessage-08-0000-0000 and is one of the tasks that couldn't be located. I'm wondering if the task itself has anything interesting to say.

It might also be interesting to see even more logs from your tranquility process, especially around when it actually works. Perhaps from startup until it succeeds in pushing the first few events. Can you send those as well?

Also, do you mind filing a GitHub issue here: https://github.com/metamx/tranquility/issues, linking back to this conversation? I have some wild, uncorroborated guesses about what *may* be going on, so I may ask you to try a new build at some point.

Thanks for your help in tracking this down…

SAURABH VERMA

Mar 24, 2015, 12:08:38 PM
Hi Gian,

   I have filed https://github.com/metamx/tranquility/issues/19 for this issue; I will get back with the required information asap.

Thanks,
Saurabh

SAURABH VERMA

Apr 7, 2015, 12:46:19 AM
Hello Gian,

    Did you have a chance to look at this?

Thanks,
Saurabh

Gian Merlino

Apr 10, 2015, 10:25:44 PM
Hi Saurabh, I was hoping you'd be able to help out with logs I had asked about. The reason for this is that so far, everything you've posted has looked normal to me, so I'm hoping those logs have something interesting in them.

1) Task logs for "index_realtime_vidmessage_2015-03-20T01:00:00.000-07:00_0_0_gmlgnnna"
2) More logs from your tranquility process, especially around when it actually works. Perhaps from startup until it succeeds in pushing the first few events.

One other thing is, can you try updating to tranquility 0.4.2 (if you're not already using it)? We've been using this version in production for a couple of weeks and it has been solid for us. We haven't observed any issues with either thread leaks or with getting stuck.

SAURABH VERMA

May 28, 2015, 9:24:00 AM
Hi Gian,

     It's been a long time on this issue. I have updated to 0.4.2, but the error is still observed.
     Let me recap the issue:
     1. At the start of the hour there are many "no brokers available" exceptions, even though the tasks are successfully created within 5 seconds of the hour beginning.
     2. After around 2 minutes, all pusher threads are able to push data to the peons via Tranquility.

     As requested, I am attaching logs from around the time of the error and from when everything goes back to normal.

     The value '501' seen in many places in the logs is the number of messages acknowledged from Druid; data is being sent in batches of 500.
     There are 200 threads in total pushing to Druid, and the ingested event rate is 300K per second.
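     (That is roughly 300,000 / 200 = 1,500 events per second per thread, i.e. about three batches of 500 per thread per second.)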
 
Thanks,
Saurabh
aroundSuccess.log
noHost.log