[erlang-questions] Node mysteriously sends 11 MB while spawning a process on another node


Filip Niksic

Apr 19, 2019, 10:47:52 AM
to erlang-q...@erlang.org
Hi all,

I am trying to understand why a node sends 11 MB of unknown data to another node while spawning a process on that node.

Let me briefly explain my setup. There are two nodes involved: main and a. I am running them in two Docker containers, which in turn run in a simulated network where I can inspect and analyze traffic using Wireshark. Once the nodes are started, main spawns a process on node a with spawn_link(). In Wireshark I can observe an exchange of ErlDP (distribution protocol) packets. The spawn_link causes a colossal REG_SEND message to be sent from main to a; the message has a length of 11011057 bytes (11 MB) and is split across 7605 TCP packets.

Now, it has to be noted that one of the arguments to the spawned process is a function closure. Could it be that this closure causes the runtime to pack all of its data structures and pass them along with the message? If so, how can such a situation be avoided? Is there some general rule of thumb that function closures should not be passed as arguments in a distributed setting?

Thanks,

Filip

Dmitry Kolesnikov

Apr 19, 2019, 11:14:18 AM
to Filip Niksic, erlang-q...@erlang.org
Hello


On 19 Apr 2019, at 17.39, Filip Niksic <fni...@seas.upenn.edu> wrote:

Now, it has to be noted that one of the arguments to the spawned process is a function closure. Could it be that this closure causes the runtime to pack all of its data structures and pass them along with the message? If so, how can such a situation be avoided? Is there some general rule of thumb that function closures should not be passed as arguments in a distributed setting?

I would bet that the closure is the main reason. Passing an anonymous function to another node serialises its entire environment, that is, every variable it captures. That can easily grow to 11 MB.

Some hints and discussion can be found here

The best practice is to use fun Mod:Fun/N rather than fun() -> … end.
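
To make the difference concrete, here is a minimal sketch that compares the encoded size of the two kinds of fun. The module name closure_size and the functions sizes/1 and work/0 are hypothetical and only for illustration. It uses term_to_binary/1 as a stand-in for what goes over the wire; the real REG_SEND payload will also contain the other spawn arguments, so treat the numbers as an approximation.

```erlang
-module(closure_size).
-export([sizes/1, work/0]).

%% Hypothetical stand-in for whatever the spawned process should run.
work() -> ok.

%% Returns {ClosureBytes, ExternalFunBytes}: the size of each fun once
%% encoded in the external term format.
sizes(N) ->
    %% Built at run time from an argument, so it is an ordinary heap term.
    Big = lists:seq(1, N),
    %% Anonymous fun: Big is a captured variable and is encoded with the fun.
    Closure = fun() -> length(Big) end,
    %% External fun: only the module name, function name and arity are encoded.
    External = fun ?MODULE:work/0,
    {byte_size(term_to_binary(Closure)),
     byte_size(term_to_binary(External))}.
```

Calling closure_size:sizes(1000000) should show the first number growing with N while the second stays at a few dozen bytes. Note that a fun Mod:Fun/N only refers to the code by name, so the module has to be loaded on the remote node as well.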

Best Regards,
Dmitry

Jesper Louis Andersen

Apr 19, 2019, 11:24:10 AM
to Filip Niksic, Erlang (E-mail)
As Dmitry says, the closure must be sent when the process is spawned. If its environment is large, you can expect something like the 11 MB you observed to be sent at spawn time.

The best way around it is to avoid capturing a large body of data when you create the fun. My guess is you are referencing a large map or list of data, which in turn gets copied. In some situations, this will also hurt when you spawn the same function locally, so there is good reason to avoid it there as well (though there are some caveats if the referenced data is part of the literal arena in the memory allocation system, and so on).
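
A minimal sketch of that advice, assuming the fun only needs one field out of a large map. The module capture_less, the functions heavy/1 and light/1, and the name key are all hypothetical; the point is only that whatever variable the fun body mentions is what gets captured.

```erlang
-module(capture_less).
-export([heavy/1, light/1]).

%% The fun body refers to Config, so the whole map is captured and
%% travels with the fun wherever it is sent.
heavy(Config) ->
    fun() -> maps:get(name, Config) end.

%% The needed value is extracted before the fun is created, so only
%% Name is captured and the rest of Config stays behind.
light(Config) ->
    Name = maps:get(name, Config),
    fun() -> Name end.
```

Both funs return the same value when called, but byte_size(term_to_binary(F)) should differ greatly between them when Config holds a large list or map. erlang:fun_info(F, env) can be used to inspect what a given fun has actually captured.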


_______________________________________________
erlang-questions mailing list
erlang-q...@erlang.org
http://erlang.org/mailman/listinfo/erlang-questions


--
J.