High Latency while federating messages over 2K messages/sec


himani verma

Nov 24, 2021, 7:27:42 AM
to rabbitmq-users
Hi RabbitMQ Team, 

I am setting up RabbitMQ federation between two clusters of 3 nodes each, running on c5.2xlarge instances. The clusters are deployed in different US regions, east and west. We need message redundancy in the west region. The clusters communicate over VPC peering between the AWS regions.

I have tried multiple versions of RabbitMQ, 3.7.13 and 3.9.0. The replication delay is caused by federation: a classic queue in the east region receives 2,000 messages/sec of 512 bytes each, while the local queue used by federation forwards them to the west region at only 500 messages/sec. This behaviour occurs regardless of any consumer backlog. We also use ha-all policies for all queues and exchanges inside both clusters. CPU and memory metrics look normal.

Can someone please help me understand what the issue with federation could be here?
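
Editor's note: to put the figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in the post (2,000 messages/sec in, 500 messages/sec forwarded, 512-byte messages). The function names are hypothetical, and the bandwidth figure counts payload bytes only, ignoring AMQP framing, TCP overhead, and acknowledgements, so treat it as an approximation rather than a measurement.

```python
# Figures quoted in the post; everything else here is illustrative.
MSG_SIZE_BYTES = 512
PUBLISH_RATE = 2000      # messages/sec arriving in the east queue
FEDERATION_RATE = 500    # messages/sec forwarded to the west region


def backlog_growth(publish_rate, drain_rate):
    """Messages/sec by which the federation queue grows (never negative)."""
    return max(publish_rate - drain_rate, 0)


def payload_bandwidth_mbit(rate, size_bytes):
    """Payload-only bandwidth in Mbit/s for a given message rate and size."""
    return rate * size_bytes * 8 / 1_000_000


growth = backlog_growth(PUBLISH_RATE, FEDERATION_RATE)
print(f"Backlog grows by {growth} messages/sec ({growth * 60} per minute)")
print(f"Payload bandwidth at full rate: "
      f"{payload_bandwidth_mbit(PUBLISH_RATE, MSG_SIZE_BYTES):.3f} Mbit/s")
```

Under these assumptions the payload needs well under 10 Mbit/s even at the full publish rate, while the federation queue would accumulate tens of thousands of messages per minute.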

himani verma

Nov 24, 2021, 7:34:47 AM
to rabbitmq-users

Timothy Peng

Nov 24, 2021, 7:36:26 AM
to rabbitm...@googlegroups.com
Maybe you need a faster network connection with higher throughput
between west and east?
> --
> You received this message because you are subscribed to the Google
> Groups "rabbitmq-users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to rabbitmq-user...@googlegroups.com.
> To view this discussion on the web, visit
> https://groups.google.com/d/msgid/rabbitmq-users/21c925a1-2f6d-438c-bed4-632d0e5908ean%40googlegroups.com
> [1].
>
>
> Links:
> ------
> [1]
> https://groups.google.com/d/msgid/rabbitmq-users/21c925a1-2f6d-438c-bed4-632d0e5908ean%40googlegroups.com?utm_medium=email&utm_source=footer

Timothy Peng

Nov 24, 2021, 7:37:22 AM
to rabbitm...@googlegroups.com
Then take a look at this posting also:
https://techblog.myhostnames.com/many-factors-impact-the-throughput-of-rabbitmq/

himani verma

Nov 26, 2021, 2:35:21 AM
to rabbitmq-users
I tried creating a cluster in the east region itself to check whether the network connection between east and west is what causes such low throughput, but the behaviour remains the same. The consumer of the local federation queue keeps up at rates up to 1,000 messages/sec, but the rate drops as soon as we increase the publish rate to 1,200 messages/sec or so.

Could PerfTest be the reason, given how it creates connections? I doubt it, since the federation local queue receives messages at the same rate as the queue, but the consumption rate drops above 1,000 messages/sec.
Is there any way to see what's happening over the link? There are no related errors in the logs, and iptraf also looks fine.
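
Editor's note: one possible way to inspect link state is the management HTTP API, which exposes a `/api/federation-links` endpoint when the `rabbitmq_federation_management` plugin is enabled. The Python sketch below is a minimal example under that assumption. The helper names, the host `http://localhost:15672`, and the `guest` credentials are placeholders, not details from this thread, and the response is printed as-is rather than relying on specific field names.

```python
import base64
import json
import urllib.request


def build_links_request(base_url, user, password):
    """Build a GET request for the federation links endpoint (basic auth)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(base_url.rstrip("/") + "/api/federation-links")
    req.add_header("Authorization", "Basic " + token)
    return req


def fetch_links(base_url, user, password, timeout=10):
    """Return the parsed JSON list of federation links from the broker."""
    req = build_links_request(base_url, user, password)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def dump_links(links):
    """Pretty-print each link object exactly as the broker returned it."""
    for link in links:
        print(json.dumps(link, indent=2, sort_keys=True))
```

Calling `dump_links(fetch_links("http://localhost:15672", "guest", "guest"))` against the downstream cluster would print one JSON object per link. Polling it while the publish rate is raised may show whether links restart around the point where throughput drops.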

himani verma

Nov 26, 2021, 5:01:37 AM
to rabbitmq-users
In the logs, though, I found channels being closed from the downstream side:


2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>     supervisor: {<0.15002.21>,amqp_channel_sup_sup}
2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>     errorContext: shutdown_error
2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>     reason: shutdown
2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>     offender: [{nb_children,1},
2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>                {id,channel_sup},
2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>                {mfargs,
2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>                    {amqp_channel_sup,start_link,
2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>                        [direct,<0.14999.21>,
2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>                         <<"<rab...@ip-10-238-37-4.1637838750.14999.21>">>]}},
2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>                {restart_type,temporary},
2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>                {shutdown,infinity},
2021-11-26 09:25:59.338144+00:00 [error] <0.15002.21>                {child_type,supervisor}]
