Understanding sample_retention_policies in RabbitMQ


Girish Kg

Jun 5, 2017, 6:18:55 PM
to rabbitmq-users
Hi,

I am trying to address a memory leak in version 3.5.6 and have changed rates_mode to "none". I would also like to reduce the retention policy to a smaller number. Can someone explain the default configuration below to me?


 {sample_retention_policies,
          [{global,[{605,5},{3660,60},{29400,600},{86400,1800}]}]}

Does that mean it collects stats at 5-second intervals and keeps them for 10 minutes and 5 seconds, then at one-minute intervals retained for about an hour, then at 10-minute intervals (kept for about 8 hours), and thereafter at 30-minute intervals (kept for a day)?

I tried changing it to exactly half:

{sample_retention_policies,
                            [{global,[{305, 5}, {1830, 60}, {14700, 600}, {43200, 1800}]}]}

but it seems like it didn't work. Most of the time I was getting a message saying "statistics not available". So it may not work the way I described. Can someone please help me understand this?

Thanks.
Girish

Michael Klishin

Jun 5, 2017, 7:10:45 PM
to rabbitm...@googlegroups.com
Samples are not actually collected at the intervals in the policy.
Stats are emitted continuously at fixed intervals and then aggregated
(added up or averaged, if we oversimplify).

When aggregating samples, sample retention policies control two things:

 * For how long the samples have to be kept (e.g. if you never request values for the last 12 hours, why keep them for so long)
 * How fine-grained the intervals should be (e.g. 5 minute intervals or 60 minute intervals)
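
To make those two knobs concrete, here is the default policy from the original question written out as an annotated config sketch. Reading each tuple as {maximum age in seconds, sample interval in seconds} follows the management plugin documentation; the approximate sample counts in the comments are plain division (age / interval) and are an assumption about the upper bound per tracked metric, not a measured figure.

```erlang
%% rabbitmq.config (classic Erlang-term format), illustrative only.
[
  {rabbitmq_management, [
    {sample_retention_policies, [
      %% Each tuple is {MaxAgeSeconds, IntervalSeconds}.
      {global, [
        {605,   5},     %% 5 s granularity,    kept 10 min 5 s  (~121 samples)
        {3660,  60},    %% 1 min granularity,  kept 1 h 1 min   (~61 samples)
        {29400, 600},   %% 10 min granularity, kept 8 h 10 min  (~49 samples)
        {86400, 1800}   %% 30 min granularity, kept 24 h        (~48 samples)
      ]}
    ]}
  ]}
].
```

Halving every maximum age, as attempted in the question, would only roughly halve these already small counts, which fits the point that aggregated samples are not where the memory goes.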

Aggregated samples actually do not take a lot of memory. What does take memory is the stats collector
when it cannot keep up with all the events (up to 3.6.7 this was the responsibility of a single node,
and in 3.5.6 of a single process).

You have provided no data that proves that the issue is with the stats DB but if you are
sure about that, there are two known strategies you should use instead:

 * Increase stats collection interval, say, to 30 or 60 seconds (most monitoring systems use 60 second intervals in practice,
so emitting stats more frequently isn't important)
 * You can set up a cron job that restarts the stats database, as described in the docs:

Or you can just upgrade to 3.6.7 first, then 3.6.10:


--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-users+unsubscribe@googlegroups.com.
To post to this group, send email to rabbitmq-users@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Girish Kg

Jun 7, 2017, 6:08:33 AM
to rabbitmq-users
Hi Michael,

Thank you so much for the response. Yes, my RabbitMQ instance gets terminated due to increased binary memory usage. For example, it starts dying when the message count goes above 10M (I have 8GB of RAM; the binaries start consuming more than this and crash Rabbit).

I had already changed the stats collection interval to 60 seconds, but the issue persists, so I was thinking that the retention policies could also help me; it looks like that is not the case. I did reset the stats DB earlier and will create a cron job to do it periodically. I believe the command "rabbitmqctl eval 'exit(erlang:whereis(rabbit_mgmt_db), please_terminate).'" returns "true" only on the server where the DB resides, and that it moves the DB to another node in the cluster. I believe it also resets the DB while doing this. Is that correct?


I was planning to upgrade RabbitMQ in the future. Can I upgrade to 3.6.9 directly, or do I need to upgrade to 3.6.7 first and then to 3.6.9?


Thank you so much. Appreciate your help on this.

Regards
Girish 

Girish Kg

Jun 7, 2017, 7:16:26 AM
to rabbitmq-users
Michael,

Thanks.

I created 30K test messages in one of the queues, which increased binary memory to 985MB (overall memory 1.1GB), and set up a cron job to execute "rabbitmqctl eval 'exit(erlang:whereis(rabbit_mgmt_db), please_terminate).'". It did execute and returned true, but I observed no change in binary memory. I am using the config below:

{rabbitmq_management, [ {rates_mode, none},
                        {collect_statistics_interval,60000},
                        {stats_event_max_backlog,500}]}
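
For reference, the RabbitMQ documentation lists collect_statistics_interval as a setting of the rabbit application, while rates_mode and (as far as I know) stats_event_max_backlog belong to rabbitmq_management. A sketch of that layout, reusing the same values as above; it is an assumption about the intended config, not a verified fix for the binary memory growth:

```erlang
%% rabbitmq.config (classic Erlang-term format), illustrative only.
[
  {rabbit, [
    %% Stats emission interval in milliseconds (60000 = 60 seconds).
    {collect_statistics_interval, 60000}
  ]},
  {rabbitmq_management, [
    %% Disable message rate calculation in the management plugin.
    {rates_mode, none},
    %% Backlog size above which fine-grained stats events may be dropped.
    {stats_event_max_backlog, 500}
  ]}
].
```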

Please let me know whether there is any other way to reset the binary memory, or whether upgrading to >3.6.3 is the only option.

Thanks. 

On Tuesday, 6 June 2017 00:10:45 UTC+1, Michael Klishin wrote:

Michael Klishin

Jun 7, 2017, 8:00:25 AM
to rabbitm...@googlegroups.com
You can upgrade to 3.6.10 directly.
