Rapid Queuing of Hystrix Commands


Joseph Athman

Sep 24, 2015, 11:29:14 AM
to HystrixOSS
I'm trying to validate some behavior I'm seeing that I didn't expect.

Let's say we have a Hystrix thread pool with 25 available threads (nothing currently being processed) and a queue with a max size of 5. If we create 25 Hystrix commands in memory and then rapidly loop through them calling queue() on each one, is it possible for commands to be rejected because the queue is full? When I queue a command, does it first go to the queue and then to the thread pool, even if there are available workers in the thread pool?

Our system has a scenario where we generate a relatively large number of Hystrix commands for a single incoming request. We are seeing occasional rejection errors even though we think the thread pool always has available resources.

Thanks for the help!

Matt Jacobs

Sep 26, 2015, 1:36:42 AM
to HystrixOSS
Joseph - 

I tried to replicate your scenario in this unit test (https://github.com/Netflix/Hystrix/pull/911). In every case, the test passed and I did not see rejections. Is there anything you see that is different from your case?

One option is to remove the queue from your system. By setting maxQueueSize to -1, you use a SynchronousQueue, which rejects a command whenever the thread pool is full. This is the strategy we use in production at Netflix (at least on the app I work on).
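
As a rough illustration of that behavior, here is a JDK-only sketch (not Hystrix code). It assumes a plain ThreadPoolExecutor with equal core and maximum size backed by a SynchronousQueue; the class and method names are hypothetical. It fills every thread with a blocking task and then submits one more, which should be rejected immediately because there is no queue to hold it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; plain JDK only, no Hystrix dependency.
class SynchronousQueueDemo {

    /** Occupies every thread with a blocking task, then reports whether one extra task is rejected. */
    static boolean extraTaskRejected(int poolSize) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await(); // hold the worker thread until the demo is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            for (int i = 0; i < poolSize; i++) {
                pool.execute(blocking); // each call starts a new core thread
            }
            try {
                pool.execute(blocking); // no free worker and no queue capacity
                return false;
            } catch (RejectedExecutionException e) {
                return true;            // default AbortPolicy rejects the task
            }
        } finally {
            release.countDown();
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("extra task rejected: " + extraTaskRejected(25));
    }
}
```

With this setup a rejection means every thread really was busy, which makes the rejection count easier to interpret than with a bounded queue.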

-Matt

Joseph Athman

Sep 30, 2015, 11:09:55 AM
to HystrixOSS
It has been hard for me to replicate this in any kind of test as well, which makes me concerned that something else is going on.

Is there a JMX metric that shows the maximum number of items being worked at one time by one of the thread pools? I'm trying to figure out if we are actually maxing out a thread pool, which then causes the queue to fill up. From what I've seen while monitoring the active tasks being worked by any thread pool, I've never seen it max out, but it might be happening so fast that I miss it.

Joe

Matt Jacobs

Oct 1, 2015, 12:06:47 PM
to HystrixOSS
There was one point a couple months ago where I stepped through the precise steps that a task takes to get executed in a ThreadPoolExecutor via the JDK, but the details escape me.
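
For reference, the ThreadPoolExecutor Javadoc describes the order roughly like this: if fewer than corePoolSize threads are running, a new thread is started for the task; otherwise the task is offered to the queue; if the queue cannot accept it, a new thread is attempted up to maximumPoolSize, and failing that the task is rejected. One consequence is that once all core threads exist, even if they are idle, new tasks pass through the queue first. The JDK-only sketch below tries to exercise that path. It assumes the Hystrix pool behaves like a plain ThreadPoolExecutor with core and maximum size 25 and a LinkedBlockingQueue of capacity 5; the class and method names are hypothetical. Whether any task is rejected depends on how quickly the idle workers drain the queue, so the counts can differ from run to run.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; plain JDK only, no Hystrix dependency.
class RapidQueueDemo {

    /** Submits taskCount tasks in a tight loop and returns {accepted, rejected}. */
    static int[] submitBurst(int poolSize, int queueCapacity, int taskCount) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(queueCapacity));
        // All core threads exist and sit idle, so every execute() goes via the queue.
        pool.prestartAllCoreThreads();

        Runnable work = () -> {
            try {
                Thread.sleep(10); // stand-in for a short remote call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        int accepted = 0;
        int rejected = 0;
        for (int i = 0; i < taskCount; i++) {
            try {
                pool.execute(work);
                accepted++;
            } catch (RejectedExecutionException e) {
                rejected++; // queue was full at the moment of the offer
            }
        }

        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {accepted, rejected};
    }

    public static void main(String[] args) {
        int[] result = submitBurst(25, 5, 25);
        System.out.println("accepted=" + result[0] + " rejected=" + result[1]);
    }
}
```

A run with zero rejections does not rule the race out; it only means the workers kept up with the submitting thread that time.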

I don't believe there's a JDK metric that tracks the maximum work seen by the thread pool. There is a Hystrix metric that does: 'rollingMaxActiveThreads', seen here: https://github.com/Netflix/Hystrix/wiki/Metrics-and-Monitoring#rolling-counts-gauge. This metric gets updated every time a thread enters or leaves a Hystrix-constructed thread pool, so it should be accurate.
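
To make the idea of a "max active threads" gauge concrete, here is a hand-rolled JDK-only sketch. It is not Hystrix's implementation, and unlike the rolling metric it never resets. It assumes a ThreadPoolExecutor subclass that counts tasks in its beforeExecute and afterExecute hooks; the class name and the demo method are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical pool that remembers the highest number of tasks running at once.
class MaxActiveTrackingPool extends ThreadPoolExecutor {
    private final AtomicInteger active = new AtomicInteger();
    private final AtomicInteger maxActive = new AtomicInteger();

    MaxActiveTrackingPool(int size, int queueCapacity) {
        super(size, size, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(queueCapacity));
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        int now = active.incrementAndGet();
        maxActive.accumulateAndGet(now, Math::max); // keep the high-water mark
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        active.decrementAndGet();
        super.afterExecute(r, t);
    }

    int getMaxActive() {
        return maxActive.get();
    }

    /** Runs concurrentTasks tasks that all overlap, then returns the recorded maximum. */
    static int demo(int concurrentTasks) {
        // One spare thread so that every task can start without queueing.
        MaxActiveTrackingPool pool = new MaxActiveTrackingPool(concurrentTasks + 1, 5);
        CountDownLatch started = new CountDownLatch(concurrentTasks);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < concurrentTasks; i++) {
            pool.execute(() -> {
                started.countDown();
                try {
                    release.await(); // stay active until all tasks have started
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            started.await();
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return pool.getMaxActive();
    }

    public static void main(String[] args) {
        System.out.println("max active: " + demo(3));
    }
}
```

A high-water mark like this catches short spikes that polling the active count from a dashboard can miss.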

Let me know if you find anything interesting.  I should have time next week to dig in more deeply if you haven't found anything before then.

-Matt

Matt Jacobs

Oct 14, 2015, 2:39:37 PM
to HystrixOSS
These unit tests have failed fairly frequently in CI builds, so I created a GitHub issue to investigate further (https://github.com/Netflix/Hystrix/issues/933). I don't think any code change will come of this, since the thread pool implementation is owned by the JDK, but increasing my understanding and improving the documentation would be a great outcome.