Help us determine better Erlang VM memory management configuration defaults for RabbitMQ


gl...@pivotal.io

May 3, 2018, 12:05:12 PM
to rabbitmq-users
TL;DR

We have identified a set of Erlang VM memory allocator flags that reduce RabbitMQ's memory usage by up to 37%. We would like your help to determine if these flags can be applied by default.

WHY ARE WE DOING THIS?

RabbitMQ has always used Erlang VM memory allocator defaults. While the Erlang VM defaults are good for the majority of Erlang apps, we have observed that they are not ideal for RabbitMQ. Efficient memory usage is essential to RabbitMQ's normal operation: the memory alarm blocks all incoming messages and is also known to block queue synchronisation.

Our goal is to settle on an Erlang memory management configuration that is most efficient for RabbitMQ. This is harder than it sounds, since the default should be an improvement for most workloads, and it must not make things worse for any workload.

We need your help to test the new defaults that we are proposing so that we can be confident in applying them out of the box.

WHAT HAVE WE LEARNED SO FAR?

1. Setting the largest multi-block carrier size to 512KB (default is 5120KB) for binary_alloc and eheap_alloc results in ~24% lower RSS (Resident Set Size) usage under load (932MB vs 1228MB), and ~29% lower RSS usage after load (603MB vs 852MB) [1].

2. Using `ageffcbf` as the allocation strategy for binary_alloc & eheap_alloc (default is `aoffcbf`) results in ~31% lower RSS usage under load (871MB vs 1270MB), and ~20% lower RSS usage after load (674MB vs 848MB) [2].

3. Combining the lower largest multi-block carrier size with the new allocation strategy for binary_alloc & eheap_alloc results in ~37% lower RSS usage under load (867MB vs 1392MB), and ~30% lower RSS usage after load (524MB vs 751MB) [3].

4. Erlang/OTP 20.2.3 or above is required for `ageffcbf`, the memory allocation strategy that we are proposing as the new default [4].

5. It was hard to measure what goes on in Erlang's memory allocators [5].

6. Optimising Erlang's memory utilisation is hard [6].

7. Memory instrumentation was rewritten in Erlang/OTP 21 to make it easier to use [7]. As soon as RabbitMQ 3.7.x works as expected with Erlang/OTP 21 [8], we will have better tooling to continue investigating fragmentation in multi-block carriers.

All observations were made on a single-node RabbitMQ 3.7.5-rc1 with Erlang/OTP 20.3.5, running on GCP n1-standard-2 [9], and using the following workload:

* Connections: 1000
* Channels: 1000
* Queues: 100
* Queue type: durable
* Msg size in KB: 1
* Msg/s incoming: 1000
* Msg/s outgoing: 500
* Msg type: persistent

HOW CAN YOU HELP?

You can help us by trying out the proposed flags in your RabbitMQ deployment, with your specific workload, and sharing your findings on this mailing list thread. We would like to know:

1. Are you noticing any change in RabbitMQ's message throughput?
2. How did the RSS usage of the `beam.smp` OS process change after applying the new flags?
3. Any other changes worth mentioning?

The easiest way to apply the new flags is to set the following
environment variable RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS='+MHas ageffcbf +MBas ageffcbf +MHlmbcs 512 +MBlmbcs 512', and restart RabbitMQ. To check that the flags have been applied correctly, run the following command:

rabbitmqctl eval 'erlang:system_info({allocator, binary_alloc}).' | egrep '{as,ageffcbf}|{lmbcs,524288}'
                      {lmbcs,524288},
                      {as,ageffcbf}]},
                      ...
                      this will repeat CPU count + 1
                      ...
                      {lmbcs,524288},
                      {as,ageffcbf}]}

# now repeat the command for eheap_alloc
rabbitmqctl eval 'erlang:system_info({allocator, eheap_alloc}).' | egrep '{as,ageffcbf}|{lmbcs,524288}'
...

If the commands don't return any output, the new flags were not applied correctly. Double-check the beam.smp process environment (e.g. cat /proc/PID/environ) and the erl flags (e.g. pgrep -a beam). Ensure RabbitMQ was restarted after the environment variable RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS was set.
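
To make the setting survive restarts, the variable is typically set in rabbitmq-env.conf instead of being exported in a shell. A minimal sketch, assuming the conventional /etc/rabbitmq location (the path may differ on your platform or package):

# /etc/rabbitmq/rabbitmq-env.conf
RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS='+MHas ageffcbf +MBas ageffcbf +MHlmbcs 512 +MBlmbcs 512'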

Thank you all, Gerhard.

Ilya Smirnov

May 4, 2018, 3:53:36 AM
to rabbitmq-users
Hello, Gerhard,

Have you tested the +hmqd and +MMmcs erl flags?
1. +hmqd off_heap helps with big queues
2. +MMmcs 30 raises the memory cache hit percentage (in our environment it went from 69% to 94%)

And another option - try setting the smbcs value equal to lmbcs (+MBlmbcs 512 +MBsmbcs 512 +MHlmbcs 512 +MHsmbcs 512)

Thanks.

gl...@pivotal.io

May 4, 2018, 11:28:59 AM
to rabbitmq-users
Hi Ilya,


1. +hmqd off_heap helps with big queues

+hmqd off_heap is controversial. While memory usage is more efficient when there are Erlang processes with deep mailboxes, it can make throughput worse:

If the process potentially can get many messages in its queue, you are advised to set the flag to off_heap. This because a garbage collection with many messages placed on the heap can become extremely expensive and the process can consume large amounts of memory. Performance of the actual message passing is however generally better when not using flag off_heap. [1]

Adding +hmqd off_heap to the proposed erts_alloc defaults had negligible impact during load (970MB vs 967MB in RSS). After the load, it started with a 3.9% degradation (632MB vs 608MB in RSS) and eventually reached an improvement of 1.5% (591MB vs 600MB) [2]. While this may be a good configuration in certain scenarios, it doesn't feel like a default worth adopting from a memory efficiency perspective.
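
For completeness, +hmqd off_heap only changes the VM-wide default; the same behaviour can also be enabled per process, which is how one would target just the deep-mailbox processes. A minimal Erlang sketch, not RabbitMQ code (the receive loop is only for illustration):

%% Spawn a process that keeps its message queue off the heap, the
%% same behaviour that +hmqd off_heap would make the default:
Pid = spawn_opt(fun Loop() ->
                    receive _Msg -> Loop() end
                end, [{message_queue_data, off_heap}]).

%% A running process can also switch itself over:
%% process_flag(message_queue_data, off_heap).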
 
2. +MMmcs 30 raises the memory cache hit percentage (in our environment it went from 69% to 94%)
 
Adding +MMmcs 30 had negligible impact during load (950MB vs 951MB in RSS), but it did result in a 5.6% improvement after load (567MB vs 599MB in RSS) [3]. It did, however, result in lower recon_alloc:cache_hit_rates():

+MMmcs 10 (default)

[{{instance,2},
  [{hit_rate,0.6418036990471149},
   {hits,155183},
   {calls,241792}]},
 {{instance,1},
  [{hit_rate,0.6638249713791325},
   {hits,151919},
   {calls,228854}]},
 {{instance,0},
  [{hit_rate,0.6498441741228511},
   {hits,45248},
   {calls,69629}]}]

+MMmcs 30 (optimised)

[{{instance,2},
  [{hit_rate,0.5869914654215622},
   {hits,30675},
   {calls,52258}]},
 {{instance,1},
  [{hit_rate,0.5347489061146525},
   {hits,24076},
   {calls,45023}]},
 {{instance,0},
  [{hit_rate,0.6593599704032557},
   {hits,7129},
   {calls,10812}]}]
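
For anyone who wants to check their own nodes: assuming the recon library is available on the node (RabbitMQ 3.7.x bundles it), the numbers above can be reproduced with:

rabbitmqctl eval 'recon_alloc:cache_hit_rates().'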

All in all, I'm confident recommending it as part of the defaults, thanks!

And another option - try setting the smbcs value equal to lmbcs (+MBlmbcs 512 +MBsmbcs 512 +MHlmbcs 512 +MHsmbcs 512)

Adding +MBsmbcs 512 +MHsmbcs 512 had negligible impact during load (940MB vs 939MB in RSS), and a 3.7% degradation after load (592MB vs 571MB in RSS) [4]. It doesn't feel worth recommending; the default carrier growth behaviour seems just right. Can you think of a different set of metrics that would make a stronger case for smbcs?

Sean Nolan

May 10, 2018, 5:38:02 PM
to rabbitmq-users
I'm running with these new settings; only about an hour or so in, but the results look very promising. I'm running RabbitMQ 3.7.5, Erlang 20.3 on Ubuntu 16.04 on AWS EC2 m4.2xlarge instances (32GB). I used the settings you suggested plus the +MMmcs 30 that Ilya Smirnov suggested, and so far I'm seeing that where we were getting fewer than 2,000 connections per 1GB, we are now getting 3,000-4,000 connections per 1GB used. So memory per connection is roughly 50% lower now; I guess after longer usage that figure will decrease, but it is still much better.

Thanks!
Sean

Michael Klishin

May 10, 2018, 9:01:38 PM
to rabbitm...@googlegroups.com
Thanks for trying these out, Sean.

I guess it's obvious, but just in case: after we collect enough evidence of meaningful improvement (hopefully across different workloads, though we can't really know what users do, since RabbitMQ nodes never phone home), the default allocator flags will be updated in a future version of RabbitMQ.




--
MK

Staff Software Engineer, Pivotal/RabbitMQ

gl...@pivotal.io

May 31, 2018, 1:28:20 PM
to rabbitmq-users
I'm very pleased to hear, Sean, that this had the expected outcome for your RabbitMQ deployment. Is there any news worth sharing after 3 weeks of using these new defaults?

gl...@pivotal.io

May 31, 2018, 1:30:25 PM
to rabbitmq-users
Based on the last set of benchmarks that we ran against non-durable queues [1], I am confident in proposing these changes as defaults in 3.7.x. We didn't benchmark durable, mirrored or lazy queues, but we expect the results to be similar.

If anyone thinks that this is a bad change, please let me know; otherwise I'm happy to see this through into 3.7.6.

Thank you for your input!


Michael Klishin

May 31, 2018, 2:54:35 PM
to rabbitm...@googlegroups.com
We haven't identified any meaningful (statistically significant) regressions with these new defaults and there's
some empirical evidence that the new defaults make a big difference for some users. So I agree with Gerhard.


Ilya Smirnov

Jun 4, 2018, 9:08:39 AM
to rabbitmq-users
Hello, we have been using these settings for about 2 weeks in production, and they freed up to 40% of memory for us too.

gl...@pivotal.io

Jun 4, 2018, 9:58:17 AM
to rabbitmq-users
That is very encouraging; it gives me even more confidence that it's a good default as of 3.7.6 ;)

Ilya Smirnov

Jun 4, 2018, 10:44:42 AM
to rabbitmq-users
Gerhard, did you try +MBsbct +MHsbct values smaller than the default 512 (together with reducing the smbcs and lmbcs sizes)?
My average block size, for example, is 8-12KB. I tried 32 & 64 KB - usage and fragmentation were very good, but the count of "calls" exceeded billions in a short time, which I think is not good for CPU utilisation.

Michael Klishin

Jun 4, 2018, 10:53:30 AM
to rabbitm...@googlegroups.com
Ilya,

Thanks for your feedback. We likely will stick to these values for at least a few months. It's very difficult to find
optimal settings for "the average workload" because no such thing exists. So we need more evidence that certain values
work well for different users. On top of that the folks who've been working on researching this will likely switch to other things
for a while.



gl...@pivotal.io

Jun 5, 2018, 10:10:01 AM
to rabbitmq-users
We have tested the new defaults, +MBas ageffcbf +MHas ageffcbf +MBlmbcs 512 +MHlmbcs 512 +MMmcs 30 (left), against your proposed changes +MBas ageffcbf +MHas ageffcbf +MBsmbcs 32 +MHsmbcs 32 +MBlmbcs 64 +MHlmbcs 64 +MBsbct 64 +MHsbct 64 +MMmcs 30 (right), and this is what we've observed:

1. Under the load described at the top of this thread, your proposed changes result in ~7% higher RSS usage (1.34GB vs 1.25GB), but lower allocated & unused memory [1]. Since RSS is the value that is used to determine whether the memory alarm should be triggered, I don't think this is an improvement.
2. There are half as many mseg_alloc calls since the carriers are half the size, but this only seems to have a marginal impact on CPU. CPU utilisation is, however, spikier [2].
3. After load, your proposed changes result in 5% lower RSS usage (682MB vs 716MB) [3], which is an improvement worth considering, were it not for the next observation.
4. If AMQP message bodies are increased from 1,000 bytes to 100,000 bytes, allocated & unused binary_alloc reach silly numbers: 14.82GB allocated (vs 3.27GB) & 12.08GB unused (vs 531MB) [4] (the sketch below shows how these numbers can be pulled from a node). We don't know why the single-block carriers behave this way, but until we figure it out, any RSS memory savings become irrelevant.
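
For reference, this is roughly how allocated vs used memory can be inspected on a running node. A sketch, again assuming the bundled recon library:

# bytes allocated per allocator type, binary_alloc included
rabbitmqctl eval 'recon_alloc:memory(allocated_types).'
# overall ratio of used to allocated memory
rabbitmqctl eval 'recon_alloc:memory(usage).'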

In conclusion, the new memory allocator defaults are better than what we had before, and while I am convinced that we can make them even better, we would like to stick to what we've settled on for a while and wait for more feedback. Based on what we have learned so far, it's obvious that it's impossible to have defaults that fit everyone's needs, so we would like to spend some time on RabbitMQ optimisations for specific use-cases (e.g. low-latency, buffering, many channels, many queues, shared environments, etc.) before we take another pass at memory allocators.

Thank you for all your help, Ilya. Gerhard.

jenius_Yang

Jul 6, 2018, 12:59:54 AM
to rabbitmq-users
I'm running RabbitMQ 3.6.6, Erlang 20.3 on Debian 8 on instances with 8GB of memory. I used the settings you suggested with the default +MMmcs setting, and I'm seeing almost 3GB used by 6,000 connections. How can I decrease this usage, since it will soon hit the RabbitMQ VM memory threshold?

Ilya Smirnov

Jul 6, 2018, 2:42:45 AM
to rabbitmq-users
Hello,

Try reducing the sndbuf and recbuf values.
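
A sketch of what that could look like in the tcp_listen_options section of rabbitmq.config (the 16KB values are only an illustration, not a recommendation; tune them against your own throughput needs):

 {tcp_listen_options, [
                       binary,
                       {backlog, 4096},
                       {sndbuf, 16384},
                       {recbuf, 16384}
                      ]}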

Jenius_Yang

Jul 6, 2018, 3:17:58 AM
to rabbitmq-users
I have thought about that, but my config uses the defaults (version 3.6.6):

 {tcp_listen_options, [
                       binary,
                       {backlog, 4096},
                       {sndbuf, 32768},
                       {recbuf, 32768}
                      ]}


That's only 32KB per buffer: 32KB * 2 * 6,000 = 384MB. That cannot add up to even 1GB, so why is my consumption over 2GB? Or what is the default TCP buffer size for RabbitMQ 3.6.6?

Michael Klishin

Jul 6, 2018, 5:37:30 AM
to rabbitm...@googlegroups.com
Please cut it off with arbitrary questions in this thread. It was very specifically started to collect data about the effects of a certain set of allocator flags.
Start new threads for new questions; this is mailing list etiquette 101.

Connections are NOT just TCP buffers. They have state, and the runtime can (and does) preallocate more memory as it sees fit. So the minimum connection cost will never be equal to the size of its TCP buffers; it will be more. However, TCP buffers are the primary contributor in nearly every case, by far.

Starting with RabbitMQ 3.6.11, a more precise strategy is used by default, and the management UI displays allocated-but-not-yet-used memory in the breakdown [1]. This can be another critically important factor that you won't be able to see on 3.6.6.

3.6.6 is 10 patches behind even in the now EOL 3.6.x series. Please upgrade or we won't be able to help you.
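
If you do upgrade, the breakdown that the management UI shows can also be read from the command line; a sketch, assuming a 3.6.11+ node:

# the {memory, ...} section of the output holds the per-category breakdown
rabbitmqctl status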



Jenius_Yang

Jul 6, 2018, 7:31:52 AM
to rabbitmq-users
Sorry about that. Thanks for your reply!

