
stress-testing erlang


Ulf Wiger

Nov 14, 2005, 10:41:18 AM

Just thought I'd report some benchmarks running 64-bit
erlang on a 1.5 GHz SPARC with 16 GB RAM.

Running a simple ring benchmark, I was able to create
a ring of 20 million processes and pass a message around
the ring. Average time to spawn a process: 6.3 us;
average time to send & receive a message: 5.3 us.

It should be noted that to reach 20 million, I made all
processes in the ring "hibernate" while waiting for the
next message. This is an optimization in the runtime system
used to save memory in an idle process (a very aggressive GC,
if you will). This added slightly to the message-passing
cost. Otherwise, the cost of spawn and message passing
remains largely unaffected all the way.

(It goes without saying that there wasn't enough memory
left in the machine to actually do anything useful, but
I still found it gratifying that the runtime system scales
as well as it does.)

/Uffe
--
Ulf Wiger, Senior Specialist,
/ / / Architecture & Design of Carrier-Class Software
/ / / Team Leader, Software Characteristics
/ / / Ericsson AB, IMS Gateways
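
The benchmark code itself is not shown in the thread. As a rough sketch of the technique described above (a ring whose processes hibernate until the token arrives), something like the following could be used. The module name `ring_sketch` and all function names are hypothetical, and the timings it reports will not match the figures quoted above.

```erlang
-module(ring_sketch).
-export([run/1, node_loop/1]).

%% Build a ring of N hibernating processes closed by the caller, pass one
%% token around it, and return {MicrosPerSpawn, MicrosPerHop}.
run(N) when is_integer(N), N > 0 ->
    Self = self(),
    {SpawnUs, First} = timer:tc(fun() -> build(N, Self) end),
    {HopUs, ok} = timer:tc(fun() ->
                                   First ! token,
                                   receive token -> ok end
                           end),
    %% N + 1 sends in total: caller -> first, N forwards, last -> caller.
    {SpawnUs / N, HopUs / (N + 1)}.

%% Each new process forwards to the previously created one; the first
%% process created forwards back to the caller.
build(0, Next) ->
    Next;
build(N, Next) ->
    Pid = spawn(fun() -> erlang:hibernate(?MODULE, node_loop, [Next]) end),
    build(N - 1, Pid).

%% Entered when a hibernating process is woken by an incoming message.
node_loop(Next) ->
    receive
        token -> Next ! token
    end.
```

Calling `ring_sketch:run(100000).` returns a pair of floats in microseconds. Reaching process counts in the millions would additionally require raising the emulator's process limit (the `+P` flag to `erl`).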

Joachim Durchholz

Nov 14, 2005, 2:24:24 PM
Ulf Wiger schrieb:

> Just thought I'd report some benchmarks running 64-bit
> erlang on a 1.5 GHz SPARC with 16 GB RAM.
>
> Running a simple ring benchmark, I was able to create
> a ring of 20 million processes and pass a message around
> the ring.

Arrrgh... 20 million processes... I'm dazzled :-)

Um... does that mean that Erlang has a per-process overhead of roughly
589 Bytes? (Ignoring negligible quantities like operating system memory
and such *ggg*)

Regards,
Jo
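
The post does not say how the per-process figure was derived. A back-of-envelope upper bound, assuming all 16 GB were available to the processes, is simply total memory divided by process count. The module name `mem_estimate` is hypothetical.

```erlang
-module(mem_estimate).
-export([bytes_per_process/2]).

%% Upper bound on per-process memory: total bytes divided by the number
%% of processes (integer division, ignoring everything else in the OS).
bytes_per_process(TotalBytes, NumProcesses)
  when is_integer(TotalBytes), is_integer(NumProcesses), NumProcesses > 0 ->
    TotalBytes div NumProcesses.
```

With 16 GB read as 16 * 2^30 bytes, `mem_estimate:bytes_per_process(16 * 1024 * 1024 * 1024, 20000000)` gives 858; read as 16 * 10^9 bytes it gives 800.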

Marcin 'Qrczak' Kowalczyk

Nov 14, 2005, 3:16:34 PM
Joachim Durchholz <j...@durchholz.org> writes:

My 32-bit PC has 32 times less memory. 32 times fewer threads is 625k.
I managed to run 1M threads with my Kogut compiler and send a message
through the ring 3 times. The OS process used 1.3G of virtual memory.

I only had to override the maximum heap size with an environment
variable, because by default the runtime limits it to physical memory.

This means that a thread used 1.3k of memory. This includes:
- 128 bytes of a thread descriptor
- 256 bytes of the initial stack (malloc'ed)
- structures for the ring and for thread bodies (a mutex,
two condition variables, a mutable variable, some closures,
some other objects)
- doubling of most objects during GC (copying)

--
__("< Marcin Kowalczyk
\__/ qrc...@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

Marcin 'Qrczak' Kowalczyk

Nov 14, 2005, 3:20:20 PM
Joachim Durchholz <j...@durchholz.org> writes:

My 32-bit PC has 32 times less memory. 32 times fewer threads is 625k.

I managed to run 1M threads with my Kogut compiler and send a message
through the ring 3 times. The OS process used 1.3G of virtual memory.

I only had to override the maximum heap size with an environment
variable, because by default the runtime limits it to physical memory.

This means that a thread used 1.3k of memory. This includes:
- 128 bytes of a thread descriptor
- 256 bytes of the initial stack (malloc'ed)
- structures for the ring and for thread bodies (a mutex,
two condition variables, a mutable variable, some closures,
some other objects)
- doubling of most objects during GC (copying)

The runtime works on 64-bit Unices too, taking advantage of it where
it makes sense, e.g. for array sizes. Most objects take twice as much
memory though.

Ulf Wiger

Nov 15, 2005, 4:00:20 AM
Joachim Durchholz <j...@durchholz.org> writes:

> Arrrgh... 20 million processes... I'm dazzled :-)
>
> Um... does that mean that Erlang has a per-process overhead of roughly
> 589 Bytes? (Ignoring negligible quantities like operating system
> memory and such *ggg*)

The default size of a process in 32-bit erlang is ca 4 Kbytes.
On 64-bit Erlang, most sizes double (which btw means 16 bytes
per character in a string... aargh!)

The 'hibernate' trick certainly reduces the memory footprint
of a process. Below is a shell dialogue from running a process
that just samples its own process metadata and sends it back to
the parent (example using 32-bit Erlang):

Eshell V5.4.9 (abort with ^G)
1> hibbe:spawn().
<0.41.0>
2> process_info(v(-1)).
[{current_function,{erlang,hibernate,3}},
{initial_call,{erlang,apply,2}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[]},
{dictionary,[{dict_item,dict_value}]},
{trap_exit,false},
{error_handler,error_handler},
{priority,high},
{group_leader,<0.23.0>},
{heap_size,5},
{stack_size,0},
{reductions,11},
{garbage_collection,[{fullsweep_after,17}]}]
3> v(1) ! hi.
hi
4> flush().
Shell got {<0.41.0>,
[{current_function,{hibbe,init,1}},
{initial_call,{erlang,apply,2}},
{status,running},
{message_queue_len,0},
{messages,[]},
{links,[]},
{dictionary,[{dict_item,dict_value}]},
{trap_exit,false},
{error_handler,error_handler},
{priority,high},
{group_leader,<0.23.0>},
{heap_size,233},
{stack_size,3},
{reductions,6},
{garbage_collection,[{fullsweep_after,17}]}]}
Shell got {<0.41.0>,
[{current_function,{hibbe,cont,1}},
{initial_call,{erlang,apply,2}},
{status,running},
{message_queue_len,1},
{messages,[hi]},
{links,[]},
{dictionary,[{dict_item,dict_value}]},
{trap_exit,false},
{error_handler,error_handler},
{priority,high},
{group_leader,<0.23.0>},
{heap_size,233},
{stack_size,3},
{reductions,14},
{garbage_collection,[{fullsweep_after,17}]}]}
ok


As you can tell, even 'initial_call' is reset. What is kept
is only the most needed stuff (anything in the process dictionary,
parameters controlling heap size and garbage collection, etc.).

Oh, and here's the code for the example program:

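-module(hibbe).

-export([spawn/0, cont/1]).

%% Spawn a high-priority process that reports its metadata to the caller.
spawn() ->
    Me = self(),
    spawn_opt(fun() ->
                      init(Me)
              end, [{priority,high}, {fullsweep_after, 17}]).

%% Report process_info/1 once, then hibernate until the next message.
init(Parent) ->
    put(dict_item, dict_value),
    Parent ! {self(), process_info(self())},
    erlang:hibernate(?MODULE, cont, [Parent]).

%% Entered on wake-up; the waking message is still in the queue.
cont(Parent) ->
    Parent ! {self(), process_info(self())}.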
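
To put a number on the footprint reduction rather than reading it off heap_size, one could compare `process_info(Pid, memory)` before and after the process hibernates. The following is only a sketch under that assumption; the module name `hib_mem` and its helpers are hypothetical, and the byte counts depend on the Erlang version and word size.

```erlang
-module(hib_mem).
-export([measure/0, wake/0]).

%% Return {BytesBefore, BytesAfter}: the 'memory' item of process_info/2
%% for a fresh process, sampled before and after it hibernates.
measure() ->
    Parent = self(),
    Pid = spawn(fun() ->
                        Parent ! {self(), ready},
                        receive go -> ok end,
                        erlang:hibernate(?MODULE, wake, [])
                end),
    receive {Pid, ready} -> ok end,
    {memory, Before} = erlang:process_info(Pid, memory),
    Pid ! go,
    ok = wait_for_hibernate(Pid, 1000),
    {memory, After} = erlang:process_info(Pid, memory),
    Pid ! stop,
    {Before, After}.

%% Entered when the hibernating process is woken; consume 'stop' and exit.
wake() ->
    receive stop -> ok end.

%% Poll until the process reports erlang:hibernate/3 as its current
%% function, as in the shell dialogue above, or give up after Retries.
wait_for_hibernate(_Pid, 0) ->
    timeout;
wait_for_hibernate(Pid, Retries) ->
    case erlang:process_info(Pid, current_function) of
        {current_function, {erlang, hibernate, 3}} ->
            ok;
        _ ->
            timer:sleep(1),
            wait_for_hibernate(Pid, Retries - 1)
    end.
```

`hib_mem:measure().` returns the two byte counts as a tuple; the second is expected to be the smaller one.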

Ulf Wiger

Nov 15, 2005, 4:14:53 AM
Ulf Wiger <ulf....@CUT-ericsson.com> writes:

> Joachim Durchholz <j...@durchholz.org> writes:
>
> > Arrrgh... 20 million processes... I'm dazzled :-)
> >
> > Um... does that mean that Erlang has a per-process overhead of roughly
> > 589 Bytes? (Ignoring negligible quantities like operating system
> > memory and such *ggg*)
>
> The default size of a process in 32-bit erlang is ca 4 Kbytes.

Hmpf! It's of course ca 1 KByte on a 32-bit system - not 4.
Sorry about that.

It's mainly the heap size setting that determines this.
The smallest heap size allowed is 233 words, on both
32-bit and 64-bit erlang.

/Uffe
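
The "ca 1 KByte" figure follows from the 233-word minimum heap multiplied by the word size. A small sketch of that arithmetic, with a hypothetical module name `min_heap` and the value 233 taken from the post rather than queried from the runtime:

```erlang
-module(min_heap).
-export([min_heap_bytes/0, min_heap_bytes/1]).

%% Minimum heap in bytes on the running emulator, using its word size.
min_heap_bytes() ->
    min_heap_bytes(erlang:system_info(wordsize)).

%% Minimum heap in bytes for a given word size (4 on 32-bit, 8 on 64-bit).
%% 233 is the smallest heap size in words, as stated in the post.
min_heap_bytes(WordSize) when is_integer(WordSize), WordSize > 0 ->
    233 * WordSize.
```

`min_heap:min_heap_bytes(4)` gives 932 bytes and `min_heap:min_heap_bytes(8)` gives 1864 bytes, which is the heap alone and excludes the rest of the process structure.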
