Hello rabbitmq-users,

While tracking down some latency issues in my system, I
came across background_gc.erl and the fact that a garbage collection of
all waiting processes is performed every 60 seconds. I suspect that
many users never notice this periodic collection, but my system "abuses"
RabbitMQ a bit by opening quite a few connections (several thousand).

The background_gc:gc/0 function takes a nontrivial amount of time to run
on an idle instance of RabbitMQ in my system (i.e., the system is up and
running but doing no work). Recon reveals the following:

    (rabbit@host)1> recon_trace:calls({background_gc, gc,
                                       [{'_', [], [{return_trace}]}]},
                                      {10, 1000}).
    1
    16:39:04.115221 <0.260.0> background_gc:gc()
    16:39:04.203086 <0.260.0> background_gc:gc/0 --> ok
    (rabbit@host)1>

This particular example takes almost 88 milliseconds to run. My
analysis suggests that one of my CPU cores is pegged during this time.
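
For anyone who wants a comparable number without Recon, here is a rough sketch that can be pasted into a shell attached to the node. It assumes the sweep amounts to calling erlang:garbage_collect/1 on every process whose status is waiting, which follows the description above rather than the actual contents of background_gc.erl. The names IsWaiting and Sweep are made up for illustration.

```erlang
%% Sketch only: approximates a sweep over all waiting processes.
%% Assumption: this mirrors the behaviour described above; it is not a
%% copy of background_gc.erl.
IsWaiting = fun(Pid) ->
                %% process_info/2 returns undefined if Pid has already
                %% exited, so dead processes are simply skipped.
                erlang:process_info(Pid, status) =:= {status, waiting}
            end.
Sweep = fun() ->
            Waiting = [P || P <- erlang:processes(), IsWaiting(P)],
            lists:foreach(fun(P) -> erlang:garbage_collect(P) end, Waiting),
            length(Waiting)
        end.
%% timer:tc/1 returns {ElapsedMicroseconds, ResultOfFun}.
{Micros, Count} = timer:tc(Sweep).
io:format("collected ~p waiting processes in ~.3f ms~n",
          [Count, Micros / 1000]).
```

Note that running this forces a collection of its own, so it is better tried on a test node than on one that is already latency sensitive.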

I have observed message latency of more than 2 seconds in our production
environment at the time this VM-wide garbage collection occurs. I suspect
this code exists to prevent some pathological Erlang memory management
issues with refcounted binaries, but I wanted to get the list's opinion
on its purpose.

I'm going to perform some testing using OTP 17.4 with the periodic GC
disabled. I will report my results here.

-TimS
--
Tim Stewart
t...@stoo.org