RabbitMQ does not push messages until all messages are consumed in the consumer

Ozcan Bircan

Mar 23, 2016, 4:04:03 PM
to masstransit-discuss
I saw other threads that sounded related, but I think this one is slightly different.

My environment info:
 - RabbitMQ 3.5.3, clustered, HA, mirrored queues
 - MassTransit 2.10

I observed weird behavior in a couple of production environments which I could not replicate in QA (not surprisingly). I wanted to check whether anyone has experienced a similar problem or has an explanation for it.

- The Job Queue (mirrored) is shared between 2 competing consumers (separate machines) and is already filled with more than 1000 messages (I know that's not ideal).

- When I start both consumers, each one fetches the number of messages configured on the bus for the queue, which is prefetch = 8, and then 8 messages are consumed in parallel in each consumer, as expected.
At this point I can see that the Unacked count for the Job Queue is 16 in the RabbitMQ admin UI, which is perfectly fine.

- Quick info about the subscription handler used in the consumers: each message (job request) is handled in a separate .NET process for historical reasons. The code is simply this:

Process process = JobUtil.GetProcess(jobId);
process.Start();
Thread.Sleep(50); // Yield
process.WaitForExit(); // Wait until it finishes


- Now, the problem is that after a certain amount of time, while these jobs are running, the Unacked count in RabbitMQ drops to 0, and when I check the queue in the RabbitMQ admin UI, I see no consumers attached to the Job Queue. But there are still many messages waiting in the Job Queue, and I know the consumers are busy processing existing jobs because I can see the processes running on the consumer machines.

- After a job (process) finishes, RabbitMQ does not seem to push another message, and the Unacked count for the Job Queue remains 0. Normally this should always stay in sync with my prefetch count as long as I have messages in the queue, right?
As a result, the number of processes running in each consumer drops gradually from 8 to 1, and only when the consumer has finished executing the very last process does RabbitMQ push another 8 messages. The scenario is the same for the second consumer.
This is causing a performance problem, since my consumer always waits for the slowest job to finish before receiving 8 more jobs.

- I am trying to understand what the problem might be. I think if this were a heartbeat problem (heartbeat should be enabled by default in 2.10 anyway), the consumer would not be able to get 8 messages again after finishing the last job.

- It looks like my consumer sends multiple acknowledgements after finishing the last job, but I cannot see any reason why it would do so. I also have no explanation for why the Job Queue Unacked count in the RabbitMQ admin UI does not reflect the number of messages being processed in my consumers. The same setup in the QA environment works like a charm: Unacked is always aligned with the prefetch, and the consumers keep processing multiple jobs at all times, unlike in production.
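
To put a rough number on the slowdown described above, here is a minimal sketch that compares the two refill behaviors. It is a simulation only, not MassTransit or RabbitMQ code: the class name, the method names, and the job durations are all hypothetical and chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RefillSimulation
{
    // Batch refill: the next `prefetch` jobs start only after every job in
    // the current batch has finished (the behavior observed in production).
    public static double BatchMakespan(IReadOnlyList<double> durations, int prefetch)
    {
        double total = 0;
        for (int i = 0; i < durations.Count; i += prefetch)
        {
            // Each batch takes as long as its slowest job.
            total += durations.Skip(i).Take(prefetch).Max();
        }
        return total;
    }

    // Continuous refill: as soon as one job finishes, the freed slot takes
    // the next job (the behavior expected when Unacked stays at the prefetch).
    public static double ContinuousMakespan(IReadOnlyList<double> durations, int prefetch)
    {
        var slots = new double[prefetch]; // time at which each slot becomes free
        foreach (var d in durations)
        {
            int earliest = Array.IndexOf(slots, slots.Min());
            slots[earliest] += d;
        }
        return slots.Max();
    }

    public static void Main()
    {
        // Hypothetical workload in minutes: one 10-minute job in every
        // group of 8, the rest take 1 minute each.
        var durations = new List<double>();
        for (int i = 0; i < 32; i++) durations.Add(i % 8 == 0 ? 10 : 1);

        Console.WriteLine("Batch refill:      " + BatchMakespan(durations, 8));
        Console.WriteLine("Continuous refill: " + ContinuousMakespan(durations, 8));
    }
}
```

With this made-up workload, batch refill needs 40 minutes (four batches, each bounded by its 10-minute job), while continuous refill finishes considerably sooner because the short jobs keep flowing through the free slots.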

Sorry for the long explanation, but I tried to give as much information as I could. I am not sure whether the problem is related to using a separate process in the subscription handler and waiting for it.
Please share your thoughts, since this is a critical performance problem in production right now.

Chris Patterson

Mar 24, 2016, 1:09:23 AM
to masstrans...@googlegroups.com
Thank you for the very detailed report on the issue you are experiencing. I'm going to be doing some work on the 2.x code base in a week, so it will be good to get an understanding of how I could reproduce it.

Because of how the RabbitMQ consumer is used with MT2, I wonder if it's an issue with the built-in queued basic consumer and Ack ordering.




Ozcan Bircan

Mar 24, 2016, 1:51:15 AM
to masstransit-discuss
Hey Chris,
It looks like the load balancer that sits in front of the RabbitMQ cluster was killing the connections after some time. I don't have access to the LB to check how it is configured, so instead I changed MassTransit's configuration to bypass the LB and connect directly to one of the RabbitMQ cluster nodes. So far the problem seems to be resolved, but I'm still monitoring in case it happens again, and I will update the post with any additional findings.

I was under the assumption that the default heartbeat would force the load balancer to keep the connections alive, but apparently it was not enough. The interesting thing for me is that once the consumer finished processing the last message, it was able to receive 8 messages again without any issues. It probably creates a new connection at that point. Anyway, thanks for your response.
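
For anyone hitting the same thing, here is a minimal sketch of the check I should have done up front. It assumes the load balancer resets its idle timer on any traffic and relies on the documented RabbitMQ behavior that heartbeat frames are sent roughly every half of the negotiated heartbeat timeout. The class name, method names, and timeout values are hypothetical; I do not know the actual LB setting.

```csharp
using System;

public static class HeartbeatCheck
{
    // Heartbeat frames go out roughly every (heartbeat timeout / 2).
    public static TimeSpan FrameInterval(TimeSpan heartbeatTimeout)
    {
        return TimeSpan.FromTicks(heartbeatTimeout.Ticks / 2);
    }

    // True if heartbeat frames arrive often enough to keep an otherwise
    // idle connection from hitting the load balancer's idle timeout.
    // A zero (or negative) heartbeat timeout is treated as "heartbeats disabled".
    public static bool KeepsConnectionAlive(TimeSpan heartbeatTimeout, TimeSpan lbIdleTimeout)
    {
        if (heartbeatTimeout <= TimeSpan.Zero) return false;
        return FrameInterval(heartbeatTimeout) < lbIdleTimeout;
    }

    public static void Main()
    {
        // Hypothetical values: a 60-second LB idle timeout checked against
        // a long and a short heartbeat timeout.
        var lbIdle = TimeSpan.FromSeconds(60);
        Console.WriteLine(KeepsConnectionAlive(TimeSpan.FromSeconds(580), lbIdle));
        Console.WriteLine(KeepsConnectionAlive(TimeSpan.FromSeconds(30), lbIdle));
    }
}
```

The point is only that having a heartbeat enabled is not sufficient by itself: the negotiated interval also has to be short relative to whatever idle timeout the LB enforces.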