RabbitMQ is throttling a publisher that is ingesting millions of messages. Any help improving performance?


venkata santhosh kumar Tangudu

Oct 13, 2019, 8:01:50 AM
to rabbitmq-users

Hi,

I have been testing a use case where I need to ingest millions of messages into RabbitMQ, using both a single-threaded application and a multithreaded application. In both cases I see the same performance, and I don't understand why. Can you please help me improve the publisher's performance?

Environment:
I have two virtual machines, each with 20 cores, 50 GB of RAM, and 300 GB of SSD disk space.

On one machine, RabbitMQ (3.6.6-1) is running with default settings. I created one direct exchange bound to 7 lazy queues. All the exchanges and queues are marked as durable.

Scenario 1: From the other machine, I run a single-threaded application that ingests 25 GB of data into RabbitMQ, where each record is at most 100 bytes. The application reads each record and publishes it to the RabbitMQ exchange. It was able to ingest 39,000 records/sec. During this time, CPU and memory utilization was minimal on both machines.

Scenario 2: To simulate a multithreaded application, I wrote a Hadoop job in which 40 concurrent tasks ingest data into RabbitMQ. Here too I see the same ingestion rate of 39,000 records/sec, and CPU and memory utilization again looks normal. Each thread uses a separate connection to publish to RabbitMQ.

Why can't I achieve better performance with the multithreaded application?

Thanks
Santhosh
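
For a sense of scale, here is a rough back-of-the-envelope calculation using the figures from the post above. It is only a sketch: it assumes 1 GB = 10**9 bytes and that every record is the full 100 bytes, neither of which is stated in the thread.

```python
# Back-of-the-envelope numbers for the scenario described above.
# Assumptions (not from the thread): 1 GB = 10**9 bytes and every
# record is the full 100 bytes.
TOTAL_BYTES = 25 * 10**9      # 25 GB of input data
RECORD_BYTES = 100            # maximum record size from the post
RATE_PER_SEC = 39_000         # observed publish rate

records = TOTAL_BYTES // RECORD_BYTES                  # records to publish
payload_mb_per_sec = RATE_PER_SEC * RECORD_BYTES / 10**6
duration_hours = records / RATE_PER_SEC / 3600

print(f"records to publish : {records:,}")
print(f"payload throughput : {payload_mb_per_sec:.1f} MB/s")
print(f"time at this rate  : {duration_hours:.2f} hours")
```

Under these assumptions the payload rate is only a few MB/s, so raw bandwidth is probably not the limit here; per-message costs look like a more plausible place to investigate.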



Wesley Peng

Oct 13, 2019, 8:27:30 AM
to rabbitm...@googlegroups.com
Hi

To improve publish performance, you need to find out what the bottleneck is. In your case, the bottleneck may be disk I/O, since you set up durability. The network may also be slow in some environments. CloudAMQP has a good article on improving throughput that you may want to check.

Also, 3.6 is a very old version; please upgrade to the latest.
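
One simple way to start looking for the bottleneck is to time each stage of the publisher separately. The sketch below is only an illustration: `parse_record` and `fake_publish` are hypothetical stand-ins, and `fake_publish` would need to be replaced with the real client call to give meaningful numbers.

```python
import time

def measure_rate(step, items):
    """Run step(item) for every item and return items processed per second."""
    start = time.perf_counter()
    count = 0
    for item in items:
        step(item)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")

# Hypothetical stand-ins for the real stages of the publisher.
def parse_record(line):
    return line.strip().encode("utf-8")

def fake_publish(body):
    # Placeholder: replace with the real client publish call.
    return len(body)

lines = [f"record-{i}\n" for i in range(100_000)]
read_rate = measure_rate(parse_record, lines)
publish_rate = measure_rate(lambda l: fake_publish(parse_record(l)), lines)
print(f"parse only      : {read_rate:,.0f} records/s")
print(f"parse + publish : {publish_rate:,.0f} records/s")
```

If the "parse only" rate is already close to the end-to-end rate, the reader is the limit rather than the broker; if it is much higher, the publish path (network, disk, or broker) is the next thing to measure.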
--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.

Luke Bakken

Oct 13, 2019, 12:41:56 PM
to rabbitmq-users
> On one machine, RabbitMQ (3.6.6-1) is running with default settings. I created one direct exchange bound to 7 lazy queues. All the exchanges and queues are marked as durable.

RabbitMQ 3.6.6 is almost three years old and is not supported. Please use the latest version of RabbitMQ for tests. When asking questions on this mailing list, you should also let us know what version of Erlang you are using as well as on what operating system.
 
> Scenario 2: To simulate a multithreaded application, I wrote a Hadoop job in which 40 concurrent tasks ingest data into RabbitMQ. Here too I see the same ingestion rate of 39,000 records/sec, and CPU and memory utilization again looks normal. Each thread uses a separate connection to publish to RabbitMQ.
>
> Why can't I achieve better performance with the multithreaded application?

Why do you expect a multi-threaded publisher to be able to publish more data to RabbitMQ?

Are you using a single queue? Using a single queue is a known anti-pattern, as the queue is a single point of concurrency within RabbitMQ (https://www.cloudamqp.com/blog/2018-01-19-part4-rabbitmq-13-common-errors.html).

Rather than use your own applications to test, I strongly suggest using the PerfTest tool - https://rabbitmq.github.io/rabbitmq-perf-test/stable/htmlsingle/

Thanks,
Luke

V Santhosh Kumar Tangudu

Nov 22, 2019, 8:36:49 AM
to rabbitm...@googlegroups.com
Erlang Version: 1:22.1.3-1
RabbitMQ Version: 3.8.0-1
OS : Debian-9

I cannot use the PerfTest tool for my experiment. As you suggested, I upgraded RabbitMQ and started using lazy queues. Performance has improved.

I have another problem: disk utilization is too high. When I ingest 70 GB of data into RabbitMQ, it takes almost 300 GB of disk space. Please let me know why it takes so much disk space and how to reduce it.

--
Thanks
T.V.Santhosh Kumar
095382 55159

Wesley Peng

Nov 22, 2019, 8:46:16 AM
to 'Sven Spudat' via rabbitmq-users
Because you are using lazy queues, cold messages are stored on disk, so disk usage increases.

regards 

V Santhosh Kumar Tangudu

Nov 22, 2019, 8:53:23 AM
to rabbitm...@googlegroups.com
The total size of the data is 70 GB. I don't understand the cold-messages logic.

When I ingest 70 GB, I would expect it to take a little more than 70 GB (the extra disk space being for metadata). But it is taking more than 300 GB of disk space.

Luke Bakken

Nov 22, 2019, 9:38:35 AM
to rabbitmq-users
Hello,

Are you also consuming messages at the same time?

V Santhosh Kumar Tangudu

Nov 22, 2019, 11:54:56 AM
to rabbitm...@googlegroups.com
No, there are no consumers. That's why I am storing the data on disk for future consumption.

To view this discussion on the web, visit https://groups.google.com/d/msgid/rabbitmq-users/d466087b-70bd-4455-b9f3-f249dc0d50bd%40googlegroups.com.

Wesley Peng

Nov 22, 2019, 5:51:29 PM
to rabbitmq-users

V Santhosh Kumar Tangudu

Nov 22, 2019, 10:52:46 PM
to rabbitm...@googlegroups.com
I didn't enable any queue mirroring. I am using the default configuration that ships with RabbitMQ.

Are queue mirrors enabled by default?

Wesley Peng

Nov 22, 2019, 11:04:40 PM
to rabbitmq-users
No, mirroring is not enabled by default.

V Santhosh Kumar Tangudu

Nov 23, 2019, 12:41:35 AM
to rabbitm...@googlegroups.com
What could be the reason for the high disk usage? How can I debug this?

V Santhosh Kumar Tangudu

Nov 23, 2019, 3:11:07 AM
to rabbitm...@googlegroups.com
Let me give more information about the setup.

I created one direct exchange which is bound to 7 queues. All of them (the exchange and the queues) are marked as durable. The queues are classic and lazy. There are no consumers.

I ingested 5 GB of data, but it occupied around 25 GB of disk space. I investigated the disk usage further and found that RabbitMQ created a huge number of 4.3 MB .idx files for each queue. However, the RabbitMQ UI still shows around 6 GB of space used across all the queues.

Please find the attached screenshot for reference, and let me know what I can tweak to reduce disk utilization.



Screenshot 2019-11-23 at 1.19.32 PM.png
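
To put the disk figures reported in this thread side by side, the following sketch computes the amplification factor and a rough per-message overhead. It does not explain the cause. It assumes 1 GB = 10**9 bytes and an average message size of 100 bytes (the maximum given in the first post; the real average may be lower, which would make the per-message overhead smaller).

```python
# Rough disk-overhead arithmetic from the figures in this thread.
# Assumptions (not from the thread): 1 GB = 10**9 bytes, and messages
# average 100 bytes, the maximum size given in the first post.

def amplification(disk_gb, payload_gb):
    """Ratio of disk space used to payload ingested."""
    return disk_gb / payload_gb

def overhead_bytes_per_message(disk_gb, payload_gb, avg_msg_bytes):
    """Extra bytes on disk per message beyond the payload itself."""
    messages = payload_gb * 10**9 / avg_msg_bytes
    return (disk_gb - payload_gb) * 10**9 / messages

print(f"70 GB -> 300 GB : {amplification(300, 70):.1f}x")
print(f" 5 GB ->  25 GB : {amplification(25, 5):.1f}x")
print(f"overhead        : {overhead_bytes_per_message(25, 5, 100):.0f} bytes/message")
```

One thing that may be worth checking, although nothing in this thread confirms it is the cause: for classic queues, messages smaller than `queue_index_embed_msgs_below` (4096 bytes by default) are embedded in the queue index rather than written to the message store, so with very small messages most of the data would be expected to live in the .idx files.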

Wesley Peng

Nov 23, 2019, 4:08:01 AM
to rabbitmq-users
Hi

That's correct. There are 7 queues bound to the exchange. When a message is published to the exchange, it is delivered to all the queues that meet the criteria, for example those bound with the same routing key.

If a message is duplicated x times, disk usage increases x times.

regards.
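
The routing behavior described above can be sketched with a toy in-memory model. This is not a RabbitMQ client, and the queue and key names are hypothetical; it only illustrates that a direct exchange copies a message to every queue whose binding key equals the routing key.

```python
from collections import defaultdict

class DirectExchange:
    """Toy in-memory model of direct-exchange routing (not a RabbitMQ client)."""

    def __init__(self):
        self.bindings = defaultdict(set)   # binding key -> queue names
        self.queues = defaultdict(list)    # queue name -> stored messages

    def bind(self, queue, binding_key):
        self.bindings[binding_key].add(queue)

    def publish(self, routing_key, body):
        # Copy the message to every queue whose binding key equals
        # the routing key; return how many copies were stored.
        targets = self.bindings.get(routing_key, set())
        for queue in targets:
            self.queues[queue].append(body)
        return len(targets)

def total_stored(exchange):
    return sum(len(msgs) for msgs in exchange.queues.values())

# Case 1: seven queues, each bound with its own key (hypothetical names).
distinct = DirectExchange()
for i in range(7):
    distinct.bind(f"queue-{i}", f"key-{i}")
for n in range(700):
    distinct.publish(f"key-{n % 7}", b"x")

# Case 2: seven queues all bound with the same key.
shared = DirectExchange()
for i in range(7):
    shared.bind(f"queue-{i}", "same-key")
for n in range(700):
    shared.publish("same-key", b"x")

print(total_stored(distinct))  # 700
print(total_stored(shared))    # 4900
```

So whether the stored volume is multiplied depends on the bindings: with distinct keys each message is stored once, while a shared key stores seven copies.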



V Santhosh Kumar Tangudu

Nov 23, 2019, 5:01:28 AM
to rabbitm...@googlegroups.com
All seven queues are mutually exclusive. There is no possibility of duplication, and there is no correlation among them.

Is the message also stored in the RabbitMQ exchange (it is durable too)?

What about the huge number of small (4.3 MB) .idx files?

Wesley Peng

Nov 23, 2019, 6:59:45 AM
to rabbitmq-users
Hi

Messages are stored in queues, and queues are on disk. Those idx files are indexes to local file pieces.

Regards



V Santhosh Kumar Tangudu

Nov 23, 2019, 7:03:29 AM
to rabbitm...@googlegroups.com
These idx files take a huge amount of disk space (almost 25 GB for 5 GB of data). Is it possible to do without these idx files or to reduce their size?


Wesley Peng

Nov 23, 2019, 7:05:36 AM
to rabbitmq-users
What file system are you using? Is it ceph or something similar?

Regards 

V Santhosh Kumar Tangudu

Nov 23, 2019, 7:07:38 AM
to rabbitm...@googlegroups.com
I am not using any special file system for RabbitMQ. I am using ext4.

Wesley Peng

Nov 23, 2019, 7:16:06 AM
to rabbitmq-users
I am not sure about those idx files. Please try searching for them on Google.

Regards 

Luke Bakken

Nov 23, 2019, 4:14:42 PM
to rabbitmq-users
Hello,

You are giving us information a bit at a time, which doesn't enable us to help you quickly. What would help the most is to provide a script we can run to see the same behavior, using the same exchanges and queues you are using.

Once we have a way to reproduce what you are describing, we can help out. At this time I don't have an explanation.

Thanks,
Luke