How long should syncing take?


Dave Collins

Mar 28, 2017, 1:39:34 PM
to rabbitmq-users
Hello,

   We are still stress-testing our new RabbitMQ setup.  We have pounded it with some high volumes, and everything went extremely well.

   Now we are trying to stop one of the nodes in our mirrored HA cluster:


Two-node cluster on EC2 t2.large instances
Each runs RabbitMQ 3.6.8 and Erlang 19.3 on Ubuntu

We had a backlog of about 92,000 images (each about 300 kB).

We stopped one of the nodes and all seemed to keep going well.

Then we restarted the stopped node.  It appeared to start syncing, but then got stuck at 25%, and the RAM indicator on the "healthy" node went red at 3.1 GB with a question mark.

Stats look like this:

Messages:
Total: 98,042
Ready: 92,042
Unacked: 6,000
In Memory: 14,927
Persistent: 0
Transient, paged out: 60,838


Message body bytes:
Total: 11 GB
Ready: 10 GB
Unacked: 653 MB
In Memory: 1.4 GB
Persistent: 0
Transient, paged out: 7.1 GB

Very roughly, how long would you expect this sync to take?  10 minutes? Half an hour?  Two hours?

Eventually the entire cluster seemed to just hang.  Stress test successful!

Just looking for some overall level-setting on how long syncing SHOULD take.

Thanks,
Dave



Michael Klishin

Mar 28, 2017, 2:15:42 PM
to rabbitm...@googlegroups.com
The receiving node will block its source when it is overwhelmed (flow control in modern versions covers eager sync), and in your case all messages are transient, so you effectively *asked* RabbitMQ to keep them in RAM as much as possible.

If your queues are durable, publish messages as persistent; otherwise, lowering rabbit.vm_memory_high_watermark_paging_ratio or turning the queues into lazy queues
will help, as should lowering the eager sync batch size (IIRC it is 4,096 messages at a time by default).
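
A sketch of the paging-ratio change in the classic 3.6.x rabbitmq.config (Erlang-term) format; the value is illustrative, not a recommendation:

```
%% rabbitmq.config (3.6.x Erlang-term format); 0.25 is an illustrative value.
[
  {rabbit, [
    %% Start paging transient messages to disk at 25% of the memory
    %% high watermark instead of the default 50%.
    {vm_memory_high_watermark_paging_ratio, 0.25}
  ]}
].
```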

Michael Klishin

unread,
Mar 28, 2017, 2:20:00 PM3/28/17
to rabbitm...@googlegroups.com
It's worth mentioning that 300 kB is well above the average message size (it's usually < 4 kB), so using transient messages with the default sync batch size of 4,096
is not something I'd recommend.
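
A rough back-of-the-envelope figure, using the ~300 kB average from the original post:

```
4,096 messages per sync batch x ~300 kB each ≈ 1.2 GB of message bodies per batch
```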

Syncing 11 GB of messages will take maybe 150-200% of the time it takes for that much
data to be loaded from disk and transferred between the nodes (this is a ballpark
estimate, not a scientific number). Of course, if you run into alarms, the estimate
no longer applies.
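
Purely as an illustration of that ballpark (the ~100 MB/s effective disk-plus-network rate here is an assumption, not a measured number):

```
11 GB at ~100 MB/s         ≈ 110 s to read and transfer
x 150-200% (ballpark)      ≈ roughly 3-4 minutes, in the absence of alarms
```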


Dave Collins

Mar 28, 2017, 4:57:18 PM
to rabbitmq-users
Thanks as always Michael for your help!   

In the "real world" we are only going to have about 5-10 of these images in flight every second, and the consumer threads can prefetch 100 per thread and we have 30+ consumer threads on different consumer machines.  So there will "normally" be nothing in the RMQ image queue.  It is just when we simulate extremely high volume + dead consumers + loss of a node of the cluster that we are seeing this.

We had originally set this queue as durable, but we thought the extra writes to disk were going to slow things down too much in HA, and possibly create an unintended coupling between the producers and consumers, due to this bit in the docs (Publisher Confirms - When will messages be confirmed?): "For routable messages, the basic.ack is sent when a message has been accepted by all the queues. For persistent messages routed to durable queues, this means persisting to disk. For mirrored queues, this means that all mirrors have accepted the message."  So our concern was that, in this durable mirrored scenario, the disk writes would slow down the publisher-confirm basic.ack.
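
For context, a minimal sketch of the durable-queue / persistent-message / publisher-confirm combination being weighed here, again assuming pika (1.x) with illustrative names:

```
# Illustrative publisher sketch (pika 1.x); queue name, host, and file are made up.
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
ch = conn.channel()
ch.queue_declare(queue="images", durable=True)

# Enable publisher confirms; on a BlockingChannel, basic_publish then waits for
# the broker's confirm, which for persistent messages on durable mirrored queues
# includes the disk write and acceptance by all mirrors.
ch.confirm_delivery()

with open("image.jpg", "rb") as f:
    body = f.read()

ch.basic_publish(
    exchange="",
    routing_key="images",
    body=body,
    properties=pika.BasicProperties(delivery_mode=2),  # 2 = persistent
)
conn.close()
```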

We'll try a few of these suggestions and determine the impact.

Thanks again,
Dave

Dave Collins

Mar 30, 2017, 12:25:27 PM
to rabbitmq-users
Hello Michael,

   OK, we tried another experiment.  This time we set the queue to Durable: true and the messages to Persistent.  We also set ha-sync-batch-size to 80.  Since the average image size is around 300 kB, that should mean that, when it syncs, each batch only has to deal with around 24 MB of data.
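
One way to set that batch size, assuming an HA policy named "ha-images" matching the image queue (the policy name and pattern are made up; note that only one policy applies per queue, so ha-sync-batch-size belongs in the same policy as ha-mode):

```
rabbitmqctl set_policy ha-images "^images" \
  '{"ha-mode":"all","ha-sync-batch-size":80}' \
  --apply-to queues
```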

   We then pushed this two-node cluster up to around 14K image backlog.  All seemed to be running fine.  The RAM seemed to, in general, stay balanced between the two machines.

    Then we killed a cluster node.  Let it sit for a bit, then brought the cluster node back.

   The results were pretty much the same (sync did not finish), although it got further through the sync process: the first time it got to about 70% and then seemed to stop.  We tried restarting that node again and it got to 69% and then stopped.  The first (surviving) node also has a red RAM alarm again: 3.2 GB (3.1 GB is the high watermark).

The error report is in this gist (note "reason for termination" around line 145).

Here is a summary of the system memory:

Messages
Total: 14,514
Ready: 14,514
Unacked: 0
In Memory: 0
Persistent: 14,514
Transient: 0

Message Body Bytes
Total: 4.0 GB
Ready: 4.0 GB
Unacked: 0
In Memory: 0
Persistent: 4.0 GB
Transient: 0

   It is a bit concerning that we can't survive the restart of a cluster node.  Can you please suggest other configuration changes we might make?  Should we try this experiment on Ubuntu servers with more RAM?

Thanks in advance
Dave

Michael Klishin

Mar 30, 2017, 12:31:23 PM
to rabbitm...@googlegroups.com
The problem here is that the receiving node hits an alarm. According to your summary
there are no messages in memory, but something still triggers the alarm.

You can use `rabbitmqctl status`, the node details page in the management UI, and finally rabbitmq-top to learn more.
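
A quick sketch of those diagnostics (your status output shows rabbitmq_top already running, so the enable step is only needed on nodes that don't have it):

```
# Overall node state, including the per-category memory breakdown.
rabbitmqctl status

# Per-process memory view, surfaced as extra admin pages in the management UI.
rabbitmq-plugins enable rabbitmq_top
```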

The last time we saw an issue with eager sync overwhelming a node due to a lack of flow control was around 3.6.1-3.6.2,
and it involved hundreds of gigabytes of data. Since we introduced flow control for sync, I don't recall similar reports.


--
MK

Staff Software Engineer, Pivotal/RabbitMQ

brend...@gmail.com

Mar 30, 2017, 4:29:35 PM
to rabbitmq-users
Hey Michael,
I'm working on the project with Dave.

Here is the output of rabbitmqctl status while our node is locked up with a memory alarm:

```
[{pid,1070},
 {running_applications,
     [{rabbitmq_top,"RabbitMQ Top","3.6.8"},
      {rabbitmq_management_visualiser,"RabbitMQ Visualiser","3.6.8"},
      {rabbitmq_management,"RabbitMQ Management Console","3.6.8"},
      {rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.6.8"},
      {rabbitmq_management_agent,"RabbitMQ Management Agent","3.6.8"},
      {rabbit,"RabbitMQ","3.6.8"},
      {cowboy,"Small, fast, modular HTTP server.","1.0.4"},
      {ranch,"Socket acceptor pool for TCP protocols.","1.3.0"},
      {ssl,"Erlang/OTP SSL application","8.1.1"},
      {public_key,"Public key infrastructure","1.4"},
      {cowlib,"Support library for manipulating Web protocols.","1.0.2"},
      {crypto,"CRYPTO","3.7.3"},
      {mnesia,"MNESIA  CXC 138 12","4.14.3"},
      {os_mon,"CPO  CXC 138 46","2.4.2"},
      {inets,"INETS  CXC 138 49","6.3.6"},
      {amqp_client,"RabbitMQ AMQP Client","3.6.8"},
      {rabbit_common,
          "Modules shared by rabbitmq-server and rabbitmq-erlang-client",
          "3.6.8"},
      {compiler,"ERTS  CXC 138 10","7.0.4"},
      {xmerl,"XML parser","1.3.13"},
      {syntax_tools,"Syntax tools","2.1.1"},
      {asn1,"The Erlang ASN1 compiler version 4.0.4","4.0.4"},
      {sasl,"SASL  CXC 138 11","3.0.3"},
      {stdlib,"ERTS  CXC 138 10","3.3"},
      {kernel,"ERTS  CXC 138 10","5.2"}]},
 {os,{unix,linux}},
 {erlang_version,
     "Erlang/OTP 19 [erts-8.3] [source-d5c06c6] [64-bit] [smp:2:2] [async-threads:64] [hipe] [kernel-poll:true]\n"},
 {memory,
     [{total,3398023440},
      {connection_readers,463912},
      {connection_writers,9216},
      {connection_channels,824200},
      {connection_other,367504},
      {queue_procs,438344},
      {queue_slave_procs,16990856},
      {plugins,4379184},
      {other_proc,24018184},
      {mnesia,151312},
      {metrics,225664},
      {mgmt_db,1396112},
      {msg_index,1878640},
      {other_ets,2953144},
      {binary,3309367288},
      {code,24794668},
      {atom,1033401},
      {other_system,8954643}]},
 {alarms,[memory]},
 {listeners,
     [{clustering,25672,"::"},
      {amqp,5672,"::"},
      {'amqp/ssl',5671,"::"},
      {http,15672,"::"}]},
 {vm_memory_high_watermark,0.4},
 {vm_memory_limit,3347939328},
 {disk_free_limit,50000000},
 {disk_free,146945978368},
 {file_descriptors,
     [{total_limit,924},{total_used,10},{sockets_limit,829},{sockets_used,6}]},
 {processes,[{limit,1048576},{used,487}]},
 {run_queue,0},
 {uptime,472},
 {kernel,{net_ticktime,60}}]
```

rabbitmq-top doesn't seem to show anything odd.

Michael Klishin

Mar 30, 2017, 4:55:45 PM
to rabbitm...@googlegroups.com
So nearly all of the memory is the binary heap :/

Try enabling background GC (which was disabled by default in 3.6.7 because it's generally not worth
the effect it has on latency):
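
A minimal rabbitmq.config sketch of what enabling it looks like, to the best of my knowledge (3.6.x Erlang-term format; not quoted from the list):

```
[
  {rabbit, [
    %% Re-enable the background GC that 3.6.7 turned off by default.
    {background_gc_enabled, true}
  ]}
].
```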


brend...@gmail.com

Mar 30, 2017, 5:51:30 PM
to rabbitmq-users
GC didn't seem to help us out.

Honestly, the problem happens almost as soon as the node rejoins the cluster, so it probably doesn't have time to collect.
I will try making the GC interval shorter as well.
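
The relevant key, if I recall correctly, is rabbit.background_gc_target_interval (milliseconds, 60000 by default); an illustrative rabbitmq.config snippet:

```
[
  {rabbit, [
    {background_gc_enabled, true},
    %% Illustrative value: run background GC every 15 seconds instead of 60.
    {background_gc_target_interval, 15000}
  ]}
].
```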

```
 {memory,
     [{total,3444942128},
      {connection_readers,164248},
      {connection_writers,7680},
      {connection_channels,110080},
      {connection_other,210776},
      {queue_procs,158112},
      {queue_slave_procs,9818088},
      {plugins,2891992},
      {other_proc,18081448},
      {mnesia,133336},
      {metrics,219080},
      {mgmt_db,907672},
      {msg_index,1828352},
      {other_ets,2956648},
      {binary,3372915912},
      {code,24784275},
      {atom,1033401},
       {other_system,8937276}]}
```

Michael Klishin

Mar 30, 2017, 6:25:35 PM
to rabbitm...@googlegroups.com
The problem with binary heap GC is that it's lazy and RabbitMQ has no control
over when it happens. Usually the binary heap doesn't grow this quickly, or stay this large
for this long, on Erlang 19.x.
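
One heavy-handed workaround sometimes used to rule the binary heap in or out (illustrative, not something prescribed in this thread) is to force a full sweep of every Erlang process on the node:

```
rabbitmqctl eval '[garbage_collect(P) || P <- erlang:processes()], ok.'
```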

We will try reproducing this, but FWIW I don't recall this being reported in a while.
