Memory usage growth when sending messages

Alexander Gagarin

Jan 19, 2020, 1:58:02 PM
to actor-framework
Hi!

I don't have a minimal code example to reproduce this yet, but let's consider the code from the closed issue https://github.com/actor-framework/actor-framework/issues/856

I did a simple test: send a `rename` message to an actor 1000000 times. The rename handler changes an internal variable to the new name and sends notifications to a couple of other actors.
The test runs through 3 stages:
1. Send `rename` messages 1000000 times in a loop (processing of the sent messages also starts in the background).
2. The loop has already finished, but messages are still being processed by the CAF runtime (call await_all_actors_done()).
3. Destroy the `actor_system` and exit.

What I see is that memory grows constantly during stages 1 and 2. At some point in stage 2 it reaches a limit and stays at that value. Then, at the very end of stage 2, a large amount of memory is released, and the smaller remainder is released in stage 3, so no leak actually occurs.

The gperftools heap profiler gives me the following top 3 memory consumers (during stage 2):

Total: 400.0 MB
   160.6  40.2%  40.2%    160.6  40.2% caf::make_message
   112.6  28.1%  68.3%    112.6  28.1% caf::make_mailbox_element
   101.5  25.4%  93.7%    101.5  25.4% caf::make_counted

Then, right after `await_all_actors_done()` finishes, I see that the top 3 have freed their memory:

Total: -300.9 MB
    66.1 -22.0% -22.0%     66.1 -22.0% caf::detail::thread_safe_actor_clock::run_dispatch_loop
    33.0 -11.0% -32.9%     33.0 -11.0% caf::detail::thread_safe_actor_clock::set_request_timeout
  -101.5  33.7%   9.2%   -101.5  33.7% caf::make_counted
  -112.6  37.4%  46.6%   -112.6  37.4% caf::make_mailbox_element
  -160.6  53.4% 100.0%   -160.6  53.4% caf::make_message

Then, in stage 3, more memory is freed:

Total: -99.1 MB
   -33.0  33.3%  33.3%    -33.0  33.3% caf::detail::thread_safe_actor_clock::set_request_timeout
   -66.1  66.7% 100.0%    -66.1  66.7% caf::detail::thread_safe_actor_clock::run_dispatch_loop

(I listed only the CAF-related top consumers here.)

Can you please explain why this happens and whether it is normal? Even though no leaks are detected, behaviour like that (growing memory consumption) is sometimes undesirable. Are there any memory-related knobs in CAF to limit memory consumption or turn on more frequent cleanup, etc.?

Alexander Gagarin

Jan 19, 2020, 2:04:28 PM
to actor-framework
Forgot to mention CAF version: 0.17.3

Dominik Charousset

Jan 24, 2020, 3:13:22 PM
to actor-f...@googlegroups.com
Hi,

> Can you please explain why this happens and is it normal?

I think what you see is simply a producer that generates messages faster than the consumers can process them. Since you’re using asynchronous messages, the sender pushes messages into the mailbox as fast as it can, with no feedback on whether the receiver can keep up. Messages pile up until they reach a maximum (when the producer stops or at least slows down), and then memory consumption decreases as messages get removed from the mailbox.

This is true for any queueing system, not just CAF. We introduced streams to CAF for adding such a feedback channel between senders and receivers.
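To illustrate what such a feedback channel buys you, here is a minimal sketch in plain standard C++. It does not use CAF and is not how CAF streams are implemented; `bounded_queue` and `run_pipeline` are hypothetical names made up for this example. The producer blocks whenever the queue is full, so the number of buffered items (and with it the memory held by pending messages) can never exceed the chosen capacity.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical bounded queue (not a CAF type): push() blocks while the queue
// is full, which is the feedback an unbounded mailbox does not provide.
class bounded_queue {
public:
  explicit bounded_queue(std::size_t capacity) : capacity_(capacity) {}

  // Blocks the producer until there is room for one more item.
  void push(int value) {
    std::unique_lock<std::mutex> guard{mtx_};
    not_full_.wait(guard, [this] { return items_.size() < capacity_; });
    items_.push_back(value);
    if (items_.size() > high_water_mark_)
      high_water_mark_ = items_.size();
    not_empty_.notify_one();
  }

  // Blocks the consumer until an item is available.
  int pop() {
    std::unique_lock<std::mutex> guard{mtx_};
    not_empty_.wait(guard, [this] { return !items_.empty(); });
    int value = items_.front();
    items_.pop_front();
    not_full_.notify_one();
    return value;
  }

  // Largest number of items that were ever buffered at the same time.
  std::size_t high_water_mark() const {
    std::lock_guard<std::mutex> guard{mtx_};
    return high_water_mark_;
  }

private:
  mutable std::mutex mtx_;
  std::condition_variable not_full_;
  std::condition_variable not_empty_;
  std::deque<int> items_;
  std::size_t capacity_;
  std::size_t high_water_mark_ = 0;
};

// Pushes `count` items from the calling thread while a second thread pops
// them, then returns the largest queue size that was observed.
std::size_t run_pipeline(std::size_t capacity, int count) {
  bounded_queue queue{capacity};
  std::thread consumer{[&queue, count] {
    for (int i = 0; i < count; ++i)
      queue.pop();
  }};
  for (int i = 0; i < count; ++i)
    queue.push(i);
  consumer.join();
  return queue.high_water_mark();
}
```

Calling `run_pipeline(64, 1000000)` never buffers more than 64 items, no matter how much faster the producer is. The price is that the producer stalls whenever the consumer falls behind, which is exactly the trade-off a feedback channel makes.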

> Are there any memory-related knobs in CAF to limit memory consumption or turn on more frequent cleanup, etc?

CAF uses reference counting to release resources back to the allocator as early as possible. If you see large chunks returned to the OS at once, that’s most likely an artifact of the malloc()/free() implementation. Maybe you get a smoother curve with something like jemalloc. But CAF has no garbage collector or other custom memory management.
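As a rough illustration of what reference counting means for the timing of releases, here is a small sketch that uses `std::shared_ptr` as a stand-in. CAF has its own reference-counted types, so this only mirrors the principle; `payload` and `live_counts_during_release` are hypothetical names for this example. The object is destroyed at the moment the last reference goes away, not at some later collection point.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical payload type that counts how many instances are alive.
struct payload {
  static int live;
  payload() { ++live; }
  ~payload() { --live; }
  payload(const payload&) = delete;
  payload& operator=(const payload&) = delete;
};

int payload::live = 0;

// Returns {live payloads while still queued, live payloads after dequeuing}.
std::pair<int, int> live_counts_during_release() {
  auto msg = std::make_shared<payload>();
  // Two pretend mailboxes share the same payload instead of copying it.
  std::vector<std::shared_ptr<payload>> mailboxes{msg, msg};
  // The sender drops its reference; the payload stays alive in the mailboxes.
  msg.reset();
  int while_queued = payload::live;
  // Dropping the last references runs the destructor right here.
  mailboxes.clear();
  int after_dequeue = payload::live;
  return {while_queued, after_dequeue};
}
```

The function returns `{1, 0}`: one payload is alive while it is still queued, and none once the last reference is gone. Whether the freed block then goes back to the OS immediately is up to the allocator, which is where malloc()/free() behaviour comes in.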

FWIW, one reason why the producer is probably faster than the consumer might be the performance overhead of the current pattern matching implementation. That’s something we want to address with CAF 0.18. Maybe you’ll see a higher throughput (and thus not as big of a spike in memory usage) with the next CAF release.

Hope that helps.

Dominik

Alexander Gagarin

Jan 28, 2020, 1:25:10 PM
to actor-framework
Dominik, thanks for the explanation!

> FWIW, one reason why the producer is probably faster than the consumer might be the performance overhead of the current pattern matching implementation. That’s something we want to address with CAF 0.18. Maybe you’ll see a higher throughput (and thus not as big of a spike in memory usage) with the next CAF release.

I've been following the development process a little and I'm very excited about the upcoming release! But it's scary at the same time because of such massive changes -)
Special thanks for the visibility-hidden support/Windows DLL: you didn't forget my issue and at least took my pull request into account. The changeset is versatile and just HUGE! Thumbs up for that.

As far as I understand, we'll now have to register every message type manually? I'm already doing much the same for Python bindings (via pybind11); maybe we'll finally get some support for such annoying things in the core language.