[erlang-questions] Why Beam.smp crashes when memory is over?

Max Lapshin

Nov 8, 2009, 3:56:05 PM11/8/09
to Erlang-Questions Questions
Why does Erlang crash when physical memory runs out?

Error is:

beam.smp(5454,0xb01aa000) malloc: *** mmap(size=41943040) failed (error code=12)

*** error: can't allocate region

*** set a breakpoint in
malloc_error_break to debug


Isn't it possible to intercept this error and stop the Erlang process
that requested the memory?

________________________________________________________________
erlang-questions mailing list. See http://www.erlang.org/faq.html
erlang-questions (at) erlang.org

Jayson Vantuyl

Nov 8, 2009, 3:57:45 PM11/8/09
to Max Lapshin, Erlang-Questions Questions
From within Erlang, I don't believe so.

From outside, maybe, with the right debugger.

In general, I believe the heartbeat functionality (i.e. erl -heart) is
supposed to handle this by letting it crash and restarting the whole
application.


--
Jayson Vantuyl
kag...@souja.net

Max Lapshin

Nov 8, 2009, 4:02:31 PM11/8/09
to Jayson Vantuyl, Erlang-Questions Questions
On Sun, Nov 8, 2009 at 11:57 PM, Jayson Vantuyl <kag...@souja.net> wrote:
> From within Erlang, I don't believe so.

And what are the problems? The OS never crashes when memory runs out; the
OOM killer does the job well.
Why should the Erlang VM die?

Jayson Vantuyl

Nov 8, 2009, 4:26:43 PM11/8/09
to Max Lapshin, Erlang-Questions Questions
Erlang needs to allocate memory in any number of situations. For
example, assume that Erlang tried to tell your code about the failure.
Should it send a message? Should it call a function? Should it raise
an exception? Guess what all of these have in common? They allocate
memory (which isn't there).

You can try to work around this. You could reserve memory just
for this case. However, there's still no clue as to where the error
should surface. There is little chance that it will happen in the
process that has all of the memory allocated. If you take the Linux
OOM approach, you would have to scan all of the processes, weigh them,
mix in some randomness, and pick a victim. There's no memory left for
the VM to think much about the problem. Even if you killed a process,
that would just trigger its supervisor to restart it, even though we
may not have actually stopped the memory leak.

Worse, this means that an "out of memory error" can happen anywhere
and must be handled everywhere, even the supervisors. Patching the
supervisors to reliably handle this would be insane. Suddenly,
reliability under load becomes impossible to guarantee.

Even if you emulate Linux and provide an OOM-killer (i.e. kill
processes based on randomness + heuristics to detect runaway
processes), you introduce tons of random behavior into the VM, when a
VM restart would be recognizable, loggable, and generally easier to
debug.

Exposing those errors creates an ugly situation. This extra error
handling would cause an explosion of corner cases, decreases in
reliability, and volumes of code (i.e. where bugs live). Inside of
Erlang, the philosophy is to use supervisors and write daemons that
can recover from a restart. Heartbeat gives the same behavior for
the entire VM. It's a philosophical design choice to surface
critical faults rather than mask them. It's really better
than trying to handle this.

It seems obvious that there should be a better way to handle OOM,
but it is all devilishly difficult to do in any meaningfully
portable (or useful) way.

--
Jayson Vantuyl
kag...@souja.net

Tony Arcieri

Nov 8, 2009, 6:25:17 PM11/8/09
to Jayson Vantuyl, Max Lapshin, Erlang-Questions Questions
On Sun, Nov 8, 2009 at 1:57 PM, Jayson Vantuyl <kag...@souja.net> wrote:

> In general, I believe the heartbeat functionality (i.e. erl -heart) is
> supposed to handle this by letting it crash and restarting the whole
> application.


Yes, the design philosophy of Erlang is that your application should be
stateless aside from things residing in dets tables or other external,
stateful resources.

Following the fail early, fail often approach, Erlang recovers from an out
of memory condition by crashing the entire Erlang interpreter, restarting it
via heartbeat, and reloading your application.

--
Tony Arcieri
Medioh/Nagravision

Max Lapshin

Nov 8, 2009, 11:56:38 PM11/8/09
to Jayson Vantuyl, Erlang-Questions Questions
I understand your arguments. Yet, it would be great to have an
instrument that tells a supervisor that a child
is eating too much memory. Look, if we are speaking about a reliable
platform, then it is better to have some
predictable behaviour.

Perhaps some startup option, like a memory threshold for an OOM killer?
A supervisor will never grow to 20 GB,
but our production RabbitMQ instance's error_logger used to grow to 20 GB+.

Max Lapshin

Nov 9, 2009, 12:08:58 AM11/9/09
to Jayson Vantuyl, Erlang-Questions Questions
Look, I have a program that reads an MPEG TS stream from the network.
There is some bug in my code
and my ts_lander crashes beam. How is it possible to store the state of
an external HTTP stream??

If the Erlang VM didn't crash, but only killed the leaking process, I
could buffer the stream in another process and feed a
new MPEG TS lander after the restart. But it is impossible =(

Tony Arcieri

Nov 9, 2009, 12:43:25 AM11/9/09
to Max Lapshin, Erlang-Questions Questions
On Sun, Nov 8, 2009 at 9:56 PM, Max Lapshin <max.l...@gmail.com> wrote:

> I understand your arguments. Yet, it would be great to have an
> instrument that tells a supervisor that a child
> is eating too much memory. Look, if we are speaking about a reliable
> platform, then it is better to have some
> predictable behaviour.
>

I'm not really sure if we're speaking about the same thing then. Erlang is
best utilized as a stateless platform, with a "fail early fail often"
philosophy, and crashing the entire interpreter in the event of some
underlying problem with the OS fits that entirely. You seem to be seeking a
platform that lets you program defensively, handling errors rather than
crashing and restoring with a clean state. In that regard, Erlang may not
be the best language for you to work with, but then again the problem of
successfully recovering from an out-of-resource condition is an extremely
difficult one to handle well without scrapping it all and trying to restart
with a clean state.

I'm not really sure what alternative behavior you're proposing, either?

--
Tony Arcieri
Medioh/Nagravision

Andrew Thompson

Nov 9, 2009, 1:48:11 AM11/9/09
to Tony Arcieri, Max Lapshin, Erlang-Questions Questions
On Sun, Nov 08, 2009 at 10:43:25PM -0700, Tony Arcieri wrote:
> I'm not really sure if we're speaking about the same thing then. Erlang is
> best utilized as a stateless platform, with a "fail early fail often"
> philosophy, and crashing the entire interpreter in the event of some
> underlying problem with the OS fits that entirely. You seem to be seeking a
> platform that lets you program defensively, handling errors rather than
> crashing and restoring with a clean state. In that regard, Erlang may not
> be the best language for you to work with, but then again the problem of
> successfully recovering from an out-of-resource condition is an extremely
> difficult one to handle well without scrapping it all and trying to restart
> with a clean state.
>

I agree that Erlang is best used when your application is as stateless
as possible, or has clear transfer of responsibility (HTTP server, SMTP
server, etc), but sometimes Erlang still has a lot of benefits even if
you're lugging some state around, and that's when the current OOM
behaviour is a little annoying. I don't say that the current behaviour
is not the best in a lot of situations, but it'd be nice to have an
alternative in cases where nuking the VM and having to recover or simply
discard a lot of state is unpleasant.

In an ideal world I'd like to see an *optional* OOM behaviour similar to
the following:

On startup, the VM pre-allocates enough memory to be able to attempt OOM
recovery without additional malloc calls. If and when an OOM condition
is detected, freeze the VM, iterate the process list and simply
exit(FatPid, oom) the process with the most memory allocated (obviously
this strategy is full of holes, especially with regard to off-heap
binaries), but *sometimes* it's better than just blowing away the whole
thing. Then, if the malloc fails again, revert to the current behaviour.

You could, of course try to be smarter about it (look at the process
calling malloc, look at the reference counted binaries, try to avoid
killing SASL or system processes), but the point is that sometimes the
obvious offender is the only one you need to kill to restore a system's
functionality.

Of course, I'm no expert on this; these are merely musings from having
(entirely due to my own fault) triggered plenty of OOM VM kills that
would have been easily solved if a single process had been killed.

Andrew

Zoltan Lajos Kis

Nov 9, 2009, 2:35:53 AM11/9/09
to Max Lapshin, Erlang-Questions Questions
Why don't you create a process which regularly checks the memory usage
of those processes, and kills them when they begin to eat up too much
memory?
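
For what it's worth, such a watchdog only needs documented BIFs (erlang:process_info/2 and exit/2). A minimal sketch; the module name, poll interval, and limits are made up for illustration:

```erlang
-module(mem_watchdog).
-export([start/2]).

%% Poll the given processes every second and kill any whose total
%% memory (as reported by process_info) exceeds MaxBytes.
start(Pids, MaxBytes) ->
    spawn(fun() -> loop(Pids, MaxBytes) end).

loop(Pids, MaxBytes) ->
    [check(P, MaxBytes) || P <- Pids, is_process_alive(P)],
    timer:sleep(1000),
    loop(Pids, MaxBytes).

check(Pid, MaxBytes) ->
    case erlang:process_info(Pid, memory) of
        {memory, Bytes} when Bytes > MaxBytes ->
            exit(Pid, oom);  %% unceremonious, like an OOM killer
        _ ->
            ok
    end.
```

The same process_info/2 call can also return message_queue_len, so the loop could catch runaway mailboxes in the same way.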

Zoltan.

Max Lapshin

Nov 9, 2009, 2:37:32 AM11/9/09
to Zoltan Lajos Kis, Erlang-Questions Questions
Yes, there are techniques for writing watchdogs, but my question was: is
it possible to prevent the Erlang VM from crashing?

Robert Virding

Nov 9, 2009, 3:16:44 AM11/9/09
to Max Lapshin, Zoltan Lajos Kis, Erlang-Questions Questions
No.

There is a major difference between handling OOM in an OS and in the BEAM.
In an OS it is usually at a per-process level that memory runs out, so it is
easy to decide which process to kill so that the OS can continue. In the
BEAM, however, it is the VM as a whole which has run out of memory, not a
specific process; it is therefore much more difficult to work out which process
is the culprit and to decide what to do. For example, it might be that the
process which causes the OOM is not the actual problem process; it might
just be the last straw. Or the actual cause may be that the whole app is
generating large binaries too quickly. Or it might be that the whole app is
spawning too many processes without any one process being the cause. Or ...
In all these cases killing the process which triggered the OOM would be the
Wrong Thing. We found that it was difficult to work out a reasonable
strategy to handle the actual cause, so we decided not to handle it.

"Don't catch an error which you can't handle" as the bard put it.

Robert

2009/11/9 Max Lapshin <max.l...@gmail.com>

Max Lapshin

Nov 9, 2009, 3:19:12 AM11/9/09
to Robert Virding, Zoltan Lajos Kis, Erlang-Questions Questions
OK, so your advice is just to watch the memory usage of "dangerous" processes?

Robert Virding

Nov 9, 2009, 3:42:03 AM11/9/09
to Max Lapshin, Zoltan Lajos Kis, Erlang-Questions Questions
That would be a start; it very much depends on your app. I am not really the
right person to ask about details of chasing the culprit, but I do remember
the discussions we had long ago on trying to work out a good solution. Which
we couldn't.

2009/11/9 Max Lapshin <max.l...@gmail.com>

Tony Rogvall

Nov 9, 2009, 3:45:10 AM11/9/09
to Robert Virding, Max Lapshin, Zoltan Lajos Kis, Erlang-Questions Questions
Interesting discussion!

I have been working on a resource system for Erlang for nearly two
years now.
I have a working (tm) prototype where you can set resource limits like
max_processes/max_ports/max_memory/max_time/max_reductions ...
The limits are passed with spawn_opt and are inherited by the
processes spawned.
This means that if you spawn_opt(M, F, A, [{max_memory, 1024*1024}]) the
process
will be able to use 1M words for itself and its "subprocesses". This
also means
that the spawner will get 1M less to use (as designed right now). If a
resource limit
is reached the process crashes with a system_limit reason.
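
To make the shape of that API concrete, usage might look roughly like this (pseudocode: {max_memory, ...} belongs to this unreleased prototype, not stock OTP, and the worker module is hypothetical):

```erlang
%% Pseudocode sketch of the unreleased prototype API described above.
%% {max_memory, Words} is NOT a stock OTP spawn_opt option.
Pid = spawn_opt(my_worker, run, [Args],
                [link, {max_memory, 1024*1024}]),
%% If Pid (or anything it spawns) exceeds its 1M-word budget, it exits
%% with reason system_limit, which a linked supervisor can observe and
%% turn into an ordinary restart.
```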

There are still some details to work out before a release, but I will
try to get it ready before
the end of this year.

/Tony

Kenneth Lundin

Nov 9, 2009, 4:53:10 AM11/9/09
to Robert Virding, Max Lapshin, Zoltan Lajos Kis, Erlang-Questions Questions
On Mon, Nov 9, 2009 at 9:16 AM, Robert Virding <rvir...@gmail.com> wrote:
> No.
>
> There is a major difference between handling OOM in an OS and in the BEAM.
> In an OS it is usually at a per-process level that memory runs out, so it is
> easy to decide which process to kill so that the OS can continue. In the
> BEAM, however, it is the VM as a whole which has run out of memory, not a
> specific process; it is therefore much more difficult to work out which process
> is the culprit and to decide what to do. For example, it might be that the
> process which causes the OOM is not the actual problem process; it might
> just be the last straw. Or the actual cause may be that the whole app is
> generating large binaries too quickly. Or it might be that the whole app is
> spawning too many processes without any one process being the cause. Or ...
> In all these cases killing the process which triggered the OOM would be the
> Wrong Thing. We found that it was difficult to work out a reasonable
> strategy to handle the actual cause, so we decided not to handle it.
>
> "Don't catch an error which you can't handle" as the bard put it.
>
> Robert

We have discussed this many times and have always come to the same
conclusion as Robert explains above: we can't know the right thing
to do, so we just do nothing and terminate the VM.

What we could do is make it easier for the user to prevent OOM
situations and also to
let him take the decision when it occurs, or rather before it occurs.

One way would be to let the user set a memory quota on a process with
options at spawn time. When the process reaches its quota it can be
automatically killed, or the user can
be notified in some way and take action.

We have also thought about ways for the user to monitor memory consumption and
thereby take action before the VM runs out of memory.
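
The monitoring part is arguably already within reach: the documented erlang:system_monitor/2 BIF delivers a message whenever any process garbage-collects with a heap above a word threshold. A sketch (module name, threshold, and reporting target are illustrative):

```erlang
-module(heap_monitor).
-export([start/2]).

%% Ask the runtime to report any process that garbage-collects with a
%% heap larger than Words, and forward a summary to ReportTo.
start(Words, ReportTo) ->
    spawn(fun() ->
        erlang:system_monitor(self(), [{large_heap, Words}]),
        loop(ReportTo)
    end).

loop(ReportTo) ->
    receive
        {monitor, Pid, large_heap, Info} ->
            ReportTo ! {large_heap, Pid, Info},
            loop(ReportTo)
    end.
```

A receiver of such reports could log them, raise an alarm, or kill the offender, which is exactly the "take action" part left to the user.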

/Kenneth, Erlang/OTP Ericsson

Decker, Nils

Nov 9, 2009, 5:31:19 AM11/9/09
to Jayson Vantuyl, Max Lapshin, Erlang-Questions Questions
Hello,

I had a few occasions during development when a runaway process caused my machine to swap and grind to a halt. (I should have used ulimit on beam.)

I wondered why there is no way to limit the size of a single process. It could be a simple option to spawn to limit the heap size of a process (like ulimit).
The process gets killed if it ever grows beyond the limit. A limit on the size of the message queue would be nice too, because a process with a few thousand entries is (most of the time) not going to cope with them anyway.
A similar limit for ETS tables would be possible too. A memory limit for shared resources (large binaries) would be more difficult.
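
The ulimit approach can be applied today from the shell that starts the node; the limit is inherited by beam and turns an allocator failure into an ordinary VM crash that heart can restart. A launcher fragment (the value and node name are only illustrative):

```shell
# Cap the VM's virtual memory at ~2 GB (value in KB), then start a
# heart-supervised node; the limit applies to this shell's children.
ulimit -v 2097152
erl -sname limited -heart
```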

Nils Decker

Dmitry Belyaev

Nov 9, 2009, 6:05:41 AM11/9/09
to Max Lapshin, Jayson Vantuyl, Erlang-Questions Questions
What if you had different nodes for different tasks instead of
processes?

Start one node to read the stream and one (or many) other nodes to process
the stream parts.

Angel Alvarez

Nov 9, 2009, 6:43:02 AM11/9/09
to erlang-q...@erlang.org

Well, please let me say something.

I'm brand new, but some things are pretty clear to me.


The beauty of the Erlang concept is "let it crash", "don't program defensively",
so the VM and the underlying hardware are entities that can fail; that's it.

What's the problem, then?

Joe said...

If you want failure tolerance you need at least two nodes...

From J.A.'s thesis:
"...Schneider [60, 59] answered this question by giving three properties
that he thought a hardware system should have in order to be suitable for
programming a fault-tolerant system. These properties Schneider called:..."

1. Halt on failure — in the event of an error a processor should halt
instead of performing a possibly erroneous operation.

So on memory exhaustion the VM has to die, and another (Erlang) node will do the recovery.

That's the role of distribution: not only to spread computations over several nodes to enhance performance,
but to provide resilience in the presence of fatal (non-correctable) errors.

As an OS process the VM has to compete with other OS processes, so in a shared deployment (a VM running
on a server or a desktop) you can't be safe against an OOM triggered by other entities.

Such a resource control scheme will only increase process overhead and context switching in the VM.

People new to Erlang will be attracted to this hierarchical decomposition of tasks, as Joe stated in his thesis:
"If you can't run a certain task, try doing something simpler."

Many languages and VMs are incorporating Erlang's good "multicore" features but not Erlang's powerful error
handling concept, and you guys want to kill the simplicity by incorporating many defensive capabilities to
avoid fatality instead of just organizing code to handle such fatality.

What's next? A mailbox maximum message queue control?


Well, that's all I have to say about that, Forrest Gump.


This email contains no drawings. The strange shapes on the screen are letters.
__________________________________________

Clist UAH a.k.a Angel
__________________________________________
Warning: Microsoft_bribery.ISO contains OOXML code.

Tony Rogvall

Nov 9, 2009, 8:11:34 AM11/9/09
to Angel Alvarez, erlang-q...@erlang.org

It is still the same "let it crash" concept using the resource limit
system I am designing.
But you can limit the crash in a more controlled way. Also you will be
able to report
interesting information about what is crashing and when.

There is sometimes an issue when big systems crash. The restart may
take a lot of time.
Nodes must be synchronised, database tables must be repaired, etc.
I guess you can design this to be light and easy, but it is not always
the case.

/Tony

Angel Alvarez

Nov 9, 2009, 8:54:10 AM11/9/09
to Tony Rogvall, erlang-q...@erlang.org
Well, there are still many issues with this new approach.

Where are the mailboxes of processes located?

With a heap per process...

Couldn't you trigger a memory exception in a remote process by just sending one message
when the process has almost consumed its reserved memory?

Systems other than embedded Erlang deployments (from the current Erlang movement as a server/desktop platform)
will suffer from resource contention between the Erlang VM and other OS processes.

Port programs also need system resources...

Well, in the end your approach is still very interesting as a framework for continuous Erlang VM innovations...

But please correct me if I'm wrong: I saw that memory carriers allow setting several options at Erlang VM start-up, so

is it still possible to patch those carriers to allow a safe memory reservation, to let the VM properly manage a
memory-full condition by killing the offending process (a sort of OOM killer for the VM)?

Just tell the VM not to "kill system processes" and let the supervisors do the work...

/Angel

--
Do not print this email unless necessary. The environment is in our hands.
__________________________________________

Clist UAH a.k.a Angel
__________________________________________

China 'cleans up' Tibet for the Olympics.

Thomas Lindgren

Nov 9, 2009, 8:25:09 AM11/9/09
to Erlang-Questions Questions

----- Original Message ----
> From: Andrew Thompson <and...@hijacked.us>
> ... it'd be nice to have an
> alternative in cases where nuking the VM and having to recover or simply
> discard a lot of state is unpleasant.
>
> In an ideal world I'd like to see an *optional* OOM behaviour similar to
> the following:
>
> On startup, the VM pre-allocates enough memory to be able to attempt OOM
> recovery without additional malloc calls. If and when an OOM condition
> is detected, freeze the VM, iterate the process list and simply
> exit(FatPid, oom) the process with the most memory allocated (obviously
> this strategy is full of holes, especially with regard to off-heap
> binaries), but *sometimes* it's better than just blowing away the whole
> thing. Then, if the malloc fails again, revert to the current behaviour.


Another way of doing that might be to hand the decision to (a modified version of) the supervisor hierarchy (if any). The supervisor could get info on sibling priority (if any) and/or memory usage as part of the request perhaps. Restart up the supervisor hierarchy to suit. That might give more structure to the restarting.

Or you could do the OOM-killing in distributed erlang + unix instead. Run the buggy, memory-hungry stuff in a separate VM with a memory limit on the process, and manage the work from a "supervisor VM" using monitoring and rpc. Not quite as elegant, of course.
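
A minimal sketch of that "supervisor VM" idea using only the standard rpc module (the module name, retry policy, and worker MFA are made up; the worker node would be started separately, e.g. under a ulimit):

```erlang
-module(remote_super).
-export([run/3]).

%% Run {M, F, A} on a sacrificial worker node via rpc, retrying a
%% bounded number of times if the node dies (rpc returns {badrpc, _}).
run(_Node, _MFA, 0) ->
    {error, too_many_restarts};
run(Node, {M, F, A} = MFA, Restarts) ->
    case rpc:call(Node, M, F, A) of
        {badrpc, _Reason} ->
            %% The worker VM was OOM-killed or crashed; try again.
            run(Node, MFA, Restarts - 1);
        Result ->
            Result
    end.
```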

Best,
Thomas

Tony Rogvall

Nov 9, 2009, 9:38:02 AM11/9/09
to Angel Alvarez, erlang-q...@erlang.org
Hi!

On 9 nov 2009, at 14.54, Angel Alvarez wrote:

> Well still there are many issues with this new approach
>

Yes! But it does not scare me ;-)

> Where are the mailboxes of processes located?
>
> With a heap per process...

Depends on the implementation. But in general you could do something
like: if the data is shared then
you split the share (memory_size / ref_count). If the data is copied
then you must count it in.

>
> Couldn't you trigger a memory exception in a remote process by just
> sending one message
> when the process has almost consumed its reserved memory?
>

Yes. But that is the point. If it passes the limit the process will die.
There are many special cases where you could think of using the memory
in a better and more optimal way.
Let's say you are reaching the memory limit: you might switch to a
compression algorithm for heap memory!?
But let's keep it simple for the prototype, and see if it is useful.


> Systems other than embedded Erlang deployments (from the current Erlang
> movement as a server/desktop platform)
> will suffer from resource contention between the Erlang VM and other OS
> processes.
>
> Port programs also need system resources...
>

For loadable drivers using driver_alloc, one could possibly do
something; otherwise it will be
up to the driver designer to handle it. There is a max_ports in the
prototype that limits the number of
open ports. If sockets/files are mapped to single ports then it may
help a bit.


> Well, in the end your approach is still very interesting as a
> framework for continuous Erlang VM innovations...
>

Thanks.

> But please correct me if I'm wrong: I saw that memory carriers
> allow setting several options at Erlang VM start-up, so
>

I am not sure what you mean here?

/Tony

Angel Alvarez

Nov 9, 2009, 11:21:41 AM11/9/09
to erlang-q...@erlang.org

Well, some bits can be controlled as you show with spawn_opt. On the other hand, I mean that VM safe memory,
as suggested by others (Andrew, Tom ...), should be controlled in the memory carriers ( erts_alloc(3) ).

In essence:

- A new emergency memory allocator in the alloc_util framework, just in case the VM needs memory to recover from an OOM.

or perhaps

<WARNING speculative mode ON>

- Using the segment allocator on mmap-supported architectures to allow fast recovery of the full Erlang VM (a sort of checkpointing):
using a special BIF you could instruct memory carriers to checkpoint the entire VM using this allocator, just in case the VM crashes,
so on the next run the entire VM can be recovered by parsing the last checkpoint.

</WARNING>


Still, I think it is better and cleaner to have two or more instances than to bulletproof the VM.

/Angel

--

__________________________________________

Clist UAH a.k.a Angel
__________________________________________

Artist -- (internet) --> End user. That way artists earn more and talk less nonsense about what they think piracy is.

Jayson Vantuyl

Nov 9, 2009, 12:38:59 PM11/9/09
to Decker, Nils, Max Lapshin, Erlang-Questions Questions
This would be a very useful tunable.

On Nov 9, 2009, at 2:31 AM, Decker, Nils wrote:

> Hello,
>
> I had a few occasions during development when a runaway process
> caused my machine to swap and grind to a halt. (I should have used
> ulimit on beam.)
>
> I wondered why there is no way to limit the size of a single
> process. It could be a simple option to spawn to limit the heap size
> of a process (like ulimit).
> The process gets killed if it ever grows beyond the limit. A limit
> on the size of the message queue would be nice too, because a
> process with a few thousand entries is (most of the time) not going
> to cope with them anyway.
> A similar limit for ETS tables would be possible too. A memory limit
> for shared resources (large binaries) would be more difficult.
>
> Nils Decker

--
Jayson Vantuyl
kag...@souja.net

Jayson Vantuyl

Nov 9, 2009, 12:40:55 PM11/9/09
to Max Lapshin, Erlang-Questions Questions
You could run two Erlang VMs (thus keeping the HTTP connection and
its state separate). Then the crasher wouldn't take the other one
down. One with the processing and one with the networking, and have
them communicate via distributed Erlang. You should still ulimit beam
to keep the other one from getting OOM-killed, though.

Assuming you can get the throughput you need, this also would
decompose some of the components, which might help scale horizontally
later.

Jayson Vantuyl

Nov 9, 2009, 12:42:39 PM11/9/09
to Max Lapshin, Erlang-Questions Questions
The problem is that there is no way for the VM to really know which
process is leaking. The Linux OOM-killer has gone through a bunch of
rewrites and it still doesn't always get it right.

It might be possible to have a threshold trigger that would set off some
sort of global alarm that could be application specific. That's
probably the closest you could get, though.
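
For system-wide memory, something close to that global alarm already exists in OTP's os_mon application: memsup raises the system_memory_high_watermark alarm via alarm_handler when total usage passes a configurable fraction. A configuration sketch (the 0.80 threshold is illustrative):

```erlang
%% Requires the os_mon application (and its sasl dependency) to be running.
ok = application:start(sasl),
ok = application:start(os_mon),
%% Raise the system_memory_high_watermark alarm at 80% of system memory.
ok = memsup:set_sysmem_high_watermark(0.80).
%% An application-specific gen_event handler added to alarm_handler then
%% receives {set_alarm, {system_memory_high_watermark, _}} and can react.
```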

On Nov 8, 2009, at 9:08 PM, Max Lapshin wrote:

--
Jayson Vantuyl
kag...@souja.net

Richard O'Keefe

Nov 9, 2009, 11:36:44 PM11/9/09
to Kenneth Lundin, Robert Virding, Max Lapshin, Zoltan Lajos Kis, Erlang-Questions Questions

On Nov 9, 2009, at 10:53 PM, Kenneth Lundin wrote:
> What we could do is to make it easier for the user to prevent OOM
> situations and also to
> let him take the decision when it occurs or rather before it occurs.
>
> One way would be to let the user set a memory quota on a process with
> options at spawn time. When the process reaches it quota it can be
> automatically killed or the user can
> be notified in some way and take actions.

One of the reasons this hasn't been done is, I presume, the fact that
it is quite difficult for a programmer to determine what the memory
quota should be. It depends on
- the nature of the code being run
- the data it is processing (data loads tend to increase with the
years)
- the cleverness of the compiler
- the data representation used by the Erlang system (e.g., whether
pointers are 32 bits or 64, whether there are special tags for small
arrays)
- the data representation used by library modules the code calls
(change the representation of ordered sets and a bound that might
have worked might not any more)
- a system policy on whether it's better to be tight on memory and
tolerate some processes crashing and needing to be restarted or
whether it's best to keep more processing running at the cost of
using more memory, a thing that might change at run time.

I have long bemoaned the (measured!) fact that the size of a stack
frame in C can vary by a factor of 10, so that determining the sizes
for POSIX thread stacks is a game of Russian Roulette.

You can determine suitable sizes for a particular release by running
experiments, but the sizes thus determined are ONLY reliable for the
specific release or releases you tried on the machine or machines
that you tried it or them on. Set a limit, and you can *EXPECT*
working processes to be crashed as a *NORMAL* outcome.

Once some feature goes in, I expect to see messages in this mailing
list "My program gets lots of killed processes due to Out-Of-Memory
but it used to work and I haven't changed it."

Angel Alvarez

Nov 10, 2009, 4:08:22 AM11/10/09
to erlang-q...@erlang.org
I agree with you, Richard.

Most of our production Java code throws millions of null-pointer exception messages,
and management staff and developers don't care, as long as users are happy.

I presume many subtle errors come from those null-pointer errors, but people got used
to having those messages when they moved from other platforms to Java, and now it is
the norm among Java developers.

It is not acceptable, for a platform that focuses on highly scalable, survivable and error-tolerant
systems, that programmers become accustomed to seeing a lot of memory errors.

Everyone here knows how to measure memory available at program life-time (I'm new but, I think, I know); there is no need
to cover such poor memory management with artificial limits; this is not C and the like.

Virtual memory and garbage collection should guarantee enough memory, provided the programmers know something about the
algorithms and data structures (good unit testing).

Imagine someone asks a question on the list, provides the offending code, and everyone who is helping sees lots of memory
errors in his code...

Should I waste my time helping someone when his code spurts so many memory errors and he/she doesn't care?

Should newbies like me be exposed to such poor design and coding style?

What's better: counting bits properly, or just setting the limit and waiting for the process to crash on memory??

How do you manage different compilers' code size, or 32/64-bit overhead? IMHO it's up to the programmer to have concise
memory usage expectations, and this work has to be done in the design phase, not left to some black magic; I presume
this will not be portable across many Erlang versions and setups...

Clearly, for memory battles over processes/threads and the like, we already have many good languages like C,
despite most people here seeming to want the VM to become a better OS...

Classic behaviour (let the VM crash) should be preserved for legacy systems and for academic and marketing stuff...

I agree: shield the VM against many error conditions and you will probably generate just as many poor programmers.


/Angel

PS: I agree with Joe; Erlang's best feature is a new mental model for dealing with faulty conditions, more than the "multicore" promises.


--

__________________________________________

Clist UAH a.k.a Angel
__________________________________________

HONEY BUNNY - Any of you *uckin' pricks move and I'll execute every mother*ucking last one of you.

Chandru

Nov 10, 2009, 6:18:04 AM11/10/09
to Tony Rogvall, Erlang-Questions Questions
2009/11/9 Tony Rogvall <to...@rogvall.se>

> Interesting discussion!
>
> I have been working on a resource system for Erlang for nearly two years
> now.
> I have a working (tm) prototype where you can set resource limits like
> max_processes/max_ports/max_memory/max_time/max_reductions ...
>

You don't happen to have a max_message_q_size option, do you? Or does
max_memory implicitly enforce this?

Chandru

Ulf Wiger

Nov 10, 2009, 7:10:24 AM11/10/09
to Richard O'Keefe, Erlang-Questions Questions
Richard O'Keefe wrote:
>> One way would be to let the user set a memory quota on a process with
>> options at spawn time. When the process reaches it quota it can be
>> automatically killed or the user can
>> be notified in some way and take actions.
>
> One of the reasons this hasn't been done is, I presume, the fact that
> it is quite difficult for a programmer to determine what the memory
> quota should be. It depends on
> ...

I implemented resource limits in erlhive - at the Erlang level rather
than in the VM. The purpose was to be able to run foreign code safely
in a hosted environment. Eliminating the possibility to do damage
through traditional side-effects was relatively easy with a code
transform, but two ways of staging a DoS attack would be to gobble
RAM or CPU capacity. I approached this by inserting calls to a check
function that sampled heap size, and started a "watchdog" process that
would unceremoniously kill the program after a certain time.
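A minimal sketch of that scheme (module and function names are mine, not erlhive's): the code transform inserts calls to a check function at safe points, and a separate watchdog enforces a wall-clock limit.

```erlang
-module(limit_sketch).
-export([check/1, start_watchdog/2]).

%% Called from instrumented code: crash the caller if its own
%% heap has grown beyond Limit words.
check(Limit) ->
    {heap_size, Words} = erlang:process_info(self(), heap_size),
    if Words > Limit -> exit(heap_quota_exceeded);
       true          -> ok
    end.

%% Unceremoniously kill Pid after Millis milliseconds, unless it
%% has already terminated on its own.
start_watchdog(Pid, Millis) ->
    spawn(fun() ->
              Ref = erlang:monitor(process, Pid),
              receive
                  {'DOWN', Ref, process, Pid, _Reason} -> ok
              after Millis ->
                  exit(Pid, watchdog_timeout)
              end
          end).
```

(A production version would probably sample total_heap_size as well, since heap_size covers only the current heap.)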

In short, I can see a need for such limits, and would like to include
a reduction ceiling. The limits could be set after careful testing
and high enough that they protect against runaway processes. A reduction
limit could be checked at the end of each slice, perhaps.

In my experience, per-process memory usage is fairly predictable in
erlang. Does anyone have a different experience?
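Sampling it is cheap, which is what makes such measurement routine; e.g. (illustrative function name):

```erlang
%% Snapshot of a process's memory footprint, in the units
%% process_info/2 reports (bytes for 'memory', words for the heaps).
measure(Pid) ->
    erlang:process_info(Pid, [memory, heap_size,
                              total_heap_size, message_queue_len]).
```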

BR,
Ulf W
--
Ulf Wiger
CTO, Erlang Training & Consulting Ltd
http://www.erlang-consulting.com

Chandru

Nov 10, 2009, 7:25:54 AM
to Tony Rogvall, Erlang-Questions Questions
2009/11/10 Tony Rogvall <to...@rogvall.se>

> Not yet.
>
> What kind of flavor do you have in mind?
>
> One proposal is to let sender crash when receiver in box is full.
>

A few strategies to choose from would be useful.

* Crash sender
* Crash receiver
* Discard message

cheers
Chandru

> I would like to test a blocking version as well. This may sound utterly
> crazy, but we already have the infamous busy_port ;-)
> Busy ports do block senders (mostly distribution stuff).
>
> Another flavor is to drop the overflow message. This can however be done
> by catch (Pid ! Message) if the crash semantics are implemented.
>
> A suggestion from Patrik Winroth is to be able to set thresholds on
> resource limits, say as a percentage of the max limit. When a threshold
> is reached, a warning message is sent to a logger process.
>
> The max_memory limit currently controls the number of words (!) that can
> be allocated for heaps. This is the sum of the sizes of all heaps for the
> process and its "sub"-processes (spawned by the process that had the
> limit set in the first place). It is implementation-specific whether the
> message queue and/or the messages themselves are located on the process
> heap. In the case where messages are not shared and are located in
> separate heap fragments, this should be accounted for.
>
> /Tony
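For what it's worth, the "discard message" strategy can be approximated at the library level today (illustrative code; the check is racy and local-node only, which is why a real VM-level limit would still be preferable):

```erlang
%% Drop the message instead of sending it when the receiver's
%% mailbox is already at least MaxQLen long.
send_bounded(Pid, Msg, MaxQLen) ->
    case erlang:process_info(Pid, message_queue_len) of
        {message_queue_len, N} when N >= MaxQLen ->
            {error, dropped};
        {message_queue_len, _} ->
            Pid ! Msg,
            ok;
        undefined ->
            {error, noproc}   % receiver is dead
    end.
```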

Joe Armstrong

Nov 10, 2009, 7:45:17 AM
to Ulf Wiger, Richard O'Keefe, Erlang-Questions Questions
This is a very interesting problem. If processes have quotas, then how
could you set the quota value?

A perfectly correct process might just have a very deep stack, just once in its
life and otherwise be fine. Whether to crash this process or not would
depend upon
what the other processes in the system happened to be doing at the
time. This would
be very unfortunate - it's like your program being hit by a cosmic ray
- nasty. It creates a random non-deterministic coupling between things
that are supposed to be independent.

A possibility that just occurred to me might be to suspend processes
that appear to be
running wild until such a time as the overall memory situation looks good.

Imagine two scheduler queues: one for well-behaved programs, one for
programs whose stacks and heaps are growing too fast. If memory is no
problem, we run programs from both queues. If memory is tight, we run
processes in the "problem" queue less often and with frequent garbs
(garbage collections).

Killing a program with a large stack and heap, just because there
happens to be a temporary memory problem, seems horrible, especially
since the problem might go away if we wait a few milliseconds.

Suspending a memory-hungry process for a while, until memory is
available, seems less objectionable. Perhaps it could be swapped out to
disk and pulled in a lot later. Killing things at random in the hope it
might help sounds like a really bad idea. Process migration could solve
this - move the process to a machine that has more memory.

Suspending things seems ok - you might even suspend an errant process
forever and reclaim the memory - but not kill it. Some other process
could detect that the process is not responding and kill it, and thus
all the semantics of the application would be obeyed (processes are
allowed to be unresponsive, that's fine) and the semantics of the error
recovery should say what to do in this case.

Just killing processes when they have done nothing wrong is not a good idea.

/Joe

Tony Rogvall

Nov 10, 2009, 7:04:48 AM
to Chandru, Erlang-Questions Questions
Not yet.

What kind of flavor do you have in mind?

One proposal is to let sender crash when receiver in box is full.

I would like to test a blocking version as well. This may sound like

/Tony

Kenneth Lundin

Nov 10, 2009, 8:13:02 AM
to Joe Armstrong, Ulf Wiger, Richard O'Keefe, Erlang-Questions Questions
On Tue, Nov 10, 2009 at 1:45 PM, Joe Armstrong <erl...@gmail.com> wrote:
> This is a very interesting problem. If processes have quotas, then how
> could you set the quota value?

If you, for example, have processes representing subscribers,
half-calls, or mobile phones, it is pretty easy to determine a
reasonable size for such a process by measurement. It should be easy to
say whether it can grow up to 100 kwords, 1 Mwords, 10 Mwords, etc.
Of course you should not try to set the quota as tight as possible. The
quota on process size is just a way to protect the system from total
disaster if one of these processes happens to grow beyond all
reasonable limits because of a bug.

/Kenneth Erlang/OTP Ericsson

Ulf Wiger

Nov 10, 2009, 8:19:55 AM
to Tony Rogvall, Chandru, Erlang-Questions Questions
Tony Rogvall wrote:
> Not yet.
>
> What kind of flavor do you have in mind?
>
> One proposal is to let sender crash when receiver in box is full.

I would prefer it if messages are simply dropped. This would be
symmetric with the distributed case - you can hardly crash a
remote process, or a port in active mode. :)

BR,
Ulf W
--
Ulf Wiger
CTO, Erlang Training & Consulting Ltd
http://www.erlang-consulting.com

Ulf Wiger

Nov 10, 2009, 8:22:52 AM
to Tony Rogvall, Chandru, Erlang-Questions Questions
Tony Rogvall wrote:
> Not yet.
>
> What kind of flavor do you have in mind?
>
> One proposal is to let sender crash when receiver in box is full.
>
> I would like to test a blocking version as well. This may sound like
> utterly crazy, but we already have the infamous busy_port ;-)

...and the punishing of senders if the receiver has a long
message queue. I think these are abominations in the multicore
world, and should be phased out.


> A suggestion from Patrik Winroth is to be able to set thresholds on
> resource limits, say in percent from max limit.

This could be done as an extension of erlang:system_monitor().
It already has the ability to trigger on large heap and long
GC, etc.
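For reference, the existing hook looks like this: a monitoring process receives a message whenever some process's heap passes the threshold (sizes in words, GC times in milliseconds).

```erlang
start_monitor(HeapWords) ->
    Watcher = spawn(fun watch/0),
    erlang:system_monitor(Watcher, [{large_heap, HeapWords},
                                    {long_gc, 100}]),
    Watcher.

watch() ->
    receive
        {monitor, Pid, large_heap, Info} ->
            error_logger:warning_msg("large heap in ~p: ~p~n", [Pid, Info]),
            watch();
        {monitor, Pid, long_gc, Info} ->
            error_logger:warning_msg("long gc in ~p: ~p~n", [Pid, Info]),
            watch()
    end.
```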

BR,
Ulf W
--
Ulf Wiger
CTO, Erlang Training & Consulting Ltd
http://www.erlang-consulting.com

Ulf Wiger

Nov 10, 2009, 8:40:52 AM
to Joe Armstrong, Richard O'Keefe, Erlang-Questions Questions
Joe Armstrong wrote:
>
> Just killing processes when they have done nothing wrong is not a good idea.

Well, it's optional, of course. :)

Imagine, OTOH, a well-tested system where memory characteristics
have been well charted for the foreseeable cases. It might be
defensible to set resource limits so that everything we expect
to see falls well within the limit, and stuff that we don't
expect might trigger the killing of some process. If this is
done on temporary processes, we should be able to accept it
as long as the number of spurious kills is low.

This is not much stranger than things that we do routinely in
other cases:

- If dets or disk_log notice that a file hasn't been properly
closed, it 'repairs' the file - that is, it repairs the index.
Corrupt objects are simply discarded, not repaired.

- Replication in the AXD 301 and similar products was asynchronous
with a bulking factor. Some failure cases could lead to dropped
calls, but as long as they were few, it was acceptable.

- Some complex state machines would bail out for unexpected
sequences (I showed an example of this in my Structured Network
Programming talk at EUC). This was a form of "complexity
overload", and hugely unfair to the poor process running the
code, as it was probably not a real failure case.

- Mnesia's deadlock prevention algorithm, or indeed any deadlock
prevention algo, will restart transactions if there is even
the smallest chance of deadlock. Granted, this should be
transparent if the transaction fun is well written, but there
will be false positives, and this will affect performance.

On the other hand, there can be situations where a rogue process
gobbles up all available memory, rendering the VM unresponsive
for several minutes (e.g. due to the infamous "loss of sharing"),
or cases where a number of unexpectedly large processes "gang up"
and kill the VM in one big memory spike. Or a difficult-to-reproduce
bug that sends some application into an infinite retry situation
rendering the system unusable. In all these cases, killing off
the poor culprits, guilty or not, may well result in a less deadly
disturbance for the system as a whole.

Jayson Vantuyl

Nov 10, 2009, 3:22:12 PM
to Joe Armstrong, Ulf Wiger, Richard O'Keefe, Erlang-Questions Questions
Dropping messages, suspending processes, and crashing processes are all
bad ideas.

Erlang's messaging is not "left-guarded" in the sense described by
Hoare. That means that any behavior that suspends processes when the
remote mailbox is full can either exhaust memory (i.e. what we have
now) or arbitrarily deadlock the system. The only requirement for a
deadlock is a loop in messaging anywhere, and we have more than a few
of those.

Dropping messages is probably the least disruptive, but (in
applications that use OTP behaviors at least) it would just translate
into gen_server:call timeouts, and we're back to defensive programming
again.

I'm all for a heuristic that sends a signal to a program when it's
getting close (i.e. memory allocation failed and "reserved" memory is
starting to be consumed so trigger an alarm of some sort), but
anything proposed so far compromises the current behavior of the VM in
ways that are awfully unpredictable.
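A coarse version of such an alarm already exists at the node level: the os_mon application's memsup server raises an alarm when allocated memory passes a configurable watermark (details vary by OTP release).

```erlang
%% After os_mon is started, memsup raises/clears alarms such as
%% 'system_memory_high_watermark'; current alarms can be inspected:
check_memory_alarms() ->
    application:start(sasl),
    application:start(os_mon),
    alarm_handler:get_alarms().
```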

Richard O'Keefe

Nov 10, 2009, 4:52:07 PM
to Ulf Wiger, Erlang-Questions Questions

On Nov 11, 2009, at 1:10 AM, Ulf Wiger wrote:
>
> In short, I can see a need for such limits, and would like to include
> a reduction ceiling. The limits could be set after careful testing
> and high enough that they protect against runaway processes. A
> reduction
> limit could be checked at the end of each slice, perhaps.
>
> In my experience, per-process memory usage is fairly predictable in
> erlang. Does anyone have a different experience?

I am not denying the *need* for limits. (Anyone else remember "engines"
in Scheme?) I've had enough functions in enough languages go into
infinite recursion that I can see the point of stopping them.

However, Ulf Wiger has not addressed these points:

- if the tagging scheme is changed (and it has in the past),
memory requirements may change. They may either decrease
(if, say, a special tag for 2-element tuples were introduced)
or increase (if, say, list cells were made bigger).

In particular, you may recall an EEP from me suggesting that
there should be two representations for atoms: one physically
unique as now and the other only logically unique as in LOGIX.
There are three aims in that proposal: (1) reduce the need for
symbol table locking when creating atoms, (2) reduce the impact
of the symbol table size limit by making it possible for a
process to create as many LOGIX-type atoms as it wants, and (3)
make it possible to garbage-collect such atoms.

If a process creates an atom now, the atom goes in shared storage
and presumably is not "billed to" that process. With the
LOGIX-style atoms proposal, many atoms would remain in the
process's private heap and _would_ be billed to that process.

If memory serves me correctly, the representation of binaries has
changed in the past. There are certainly thresholds in binaries;
as the performance characteristics of processors and the abilities
of the Erlang compiler and HiPE change, they might well change,
and if they do, the space needs of processes using binaries change.

- if a data structure library changes its representation,
memory requirements may change. One of the advantages of using
a library is NOT having to know how it works inside. Suppose
I need to keep track of N accounts. If I use gb_sets, from
having looked at the source code I can tell that it will take
something like 4N or 5N words, but without looking at the
code, I would not know that. Looking at the code NOW tells me
what the space cost is NOW. It does NOT guarantee me that the
same space cost will apply in the next release. I can think
of plausible reasons why the space cost might go up and plausible
reasons why it might go down.

- switching from one library module to another may change the
space needs. For example, gb_sets and sets offer very similar
interfaces. gb_sets uses binary search, sets uses hashing.
In the course of maintenance, you might very well want to change
from one to another. I haven't the faintest idea how much space
N items in a sets: set would take, especially as there are
tuning parameters in sets.erl which could very well be changed.

It's fatally easy, when doing such maintenance, to forget to
maintain the space bounds as well.

- there are library modules where I have no idea how much space
is needed. Come to think of it, that's nearly all of them.
I certainly haven't the least notion what the space costs of
'supervisor' are. I've just assumed that they were small enough
that I didn't need to bother.

- Erlang data structures are made of things whose size is naturally
measured in bytes (binaries) and things whose size is naturally
measured in words (list cells, tuples), and other stuff which I'll
ignore. When you switch from a 32-bit machine to a 64-bit machine
(or vice versa) the size of *words* changes, but the size of
*bytes* does not. This means that *right now* if you want your
size bounds to port between 32-bit and 64-bit machines, even
assuming everything else to remain unaltered, your size bounds
*must* be expressed as B*bytes + W*words, not as bytes nor as words
alone.

Now it's open to someone to propose that space bounds (stack, heap,
mailbox, individual messages sent, ...) *should* be expressed as
{B,W} pairs, and such a proposal will get a respectful hearing from
me.

At any rate, I am not saying that space limits are an essentially
bad idea, or that some means of providing such limits shouldn't be
added to Erlang. What I'm saying is that
- GETTING the limits right is never going to be easy
- KEEPING the limits right is never going to be easy
- EXPLAINING how to determine appropriate limits in fairly simple
terms is going to be a very important part of Erlang documentation
- CHECKING the limits, perhaps with some sort of load testing tools,
is going to become an important part of development and
maintenance.

Ulf Wiger

Nov 10, 2009, 5:26:34 PM
to Richard O'Keefe, Erlang-Questions Questions
Richard O'Keefe wrote:
>

> I am not denying the *need* for limits. (Anyone else remember "engines"
> in Scheme?) I've had enough functions in enough languages go into
> infinite recursion that I can see the point of stopping them.
>
> However, Ulf Wiger has not addressed these points:
>
> - if the tagging scheme is changed (and it has in the past),
> memory requirements may change.

> ...

Perhaps I'm colored by having worked on systems where every
product release was preceded with months of relatively
thorough testing, where monitoring memory usage during
important operating conditions was routine.

We would also tune the system by setting process heap size
for important processes in order to minimize garbage
collection. This obviously needed to be reviewed anytime
something changed substantially.

Another type of tuning is to limit the number of jobs of
a given type that can execute concurrently. Again, this is
dependent on the actual memory use of a process.
When pushing high-throughput messaging systems to their
limits, these activities tend to become necessary, and they
have to be repeated for each significant revision of the
software.

A near-trivial case where memory requirements indeed will
change is if one switches from 32-bit to 64-bit Erlang.
This can cause memory usage in many Erlang applications to
roughly double (depending on how much binaries are used).


> At any rate, I am not saying that space limits are an essentially
> bad idea, or that some means of providing such limits shouldn't be
> added to Erlang. What I'm saying is that
> - GETTING the limits right is never going to be easy
> - KEEPING the limits right is never going to be easy
> - EXPLAINING how to determine appropriate limits in fairly simple
> terms is going to be a very important part of Erlang documentation
> - CHECKING the limits, perhaps with some sort of load testing tools,
> is going to become an important part of development and maintenance.

Agreed.

BR,
Ulf W
--
Ulf Wiger
CTO, Erlang Training & Consulting Ltd
http://www.erlang-consulting.com

Richard O'Keefe

Nov 10, 2009, 6:13:01 PM
to Joe Armstrong, Ulf Wiger, Erlang-Questions Questions

On Nov 11, 2009, at 1:45 AM, Joe Armstrong wrote:

> A perfectly correct process might just have a very deep stack, just
> once in its
> life and otherwise be fine. Whether to crash this process or not would
> depend upon
> what the other processes in the system happened to be doing at the
> time.

Surely with per-process quotas the process would be crashed on exceeding
its limit no matter what any other processes might be doing?

I may be misunderstanding you. One thing that has plagued the UNIX
world for years is over-commitment. A bunch of UNIX processes ask
for lots of virtual memory each, and the operating system says yes
to all of them, but doesn't actually give them the memory. When
they try to touch the memory, it's then that the operating system
actually allocates the pages. And it's then that the operating
system discovers that "oops, each process is within its quota, but
I don't actually have enough memory to go around." And it kills
processes until it does. Quite often, the wrong ones.

Some UNIX programmers (me amongst others) say this is ridiculous,
a well written process won't ask for memory if it doesn't think it's
going to need it, and if it can't have it, it should be told right
now. Others say, no, it's normal practice to reserve a lot more than
you are really likely to need, overcommitment is essential support.

So if a process usually needs 10K but *might* need 10M, and there
are process quotas, it has to ask for all 10M. And either the VM
refuses to create new processes when the sum of the quotas exceeds
available memory, in which case most of the memory might be lying
idle because it was worst case requests, or the VM allows new
processes to be created even so, in which case you can end up with
a bunch of processes all within their quotas, but the VM runs out
of memory anyway.

Erlang already has 'hibernate', where a process can allow the VM
to reclaim most of its memory. This suggests something that does
the opposite. If a process can *tell* when it's going to need a
lot more memory than usual, there could be an operation that
says "please increase my quota to B bytes + W words, and if you
can't do that just now, suspend me until you can".

> A possibility that just occurred to me might be to suspend processes
> that appear to be
> running wild until such a time as the overall memory situation looks
> good.

A computation might use a lot of memory by setting up one large
process.

A computation might use a lot of memory by setting up a large
number of small processes.

The VM could be running out of memory, and there could be a large
process, but _that_ process might be completely innocent.
200 (small) sheep weigh more than one (large) elephant and they
breed a lot faster.

A process might not be growing its own stack or heap at all, but
none the less might be (indirectly) responsible for increasing
demands on memory.

Someone has already proposed a system whereby a process has to
share its memory quota with the processes it spawns.


By the way, I note that Java has precisely the same kind of
problem. You can create a thread with a bound on its stack
size, but the documentation for the relevant Thread constructor
says:
Allocates a new Thread object so that it ...
has the specified stack size. ...
The stack size is the approximate number of bytes
of address space that the virtual machine is to
allocate for this thread's stack.
The effect of the stackSize parameter, if any,
is highly platform dependent.
[That sentence is bold in the original.]
On some platforms, specifying a higher value
for the stackSize parameter may allow a thread
to achieve greater recursion depth before throwing
a StackOverflowError. Similarly, specifying a
lower value may allow a greater number of threads
to exist concurrently without throwing an
OutOfMemoryError (or other internal error).
The details of the relationship between the value
of the stackSize parameter and the maximum
recursion depth and concurrency level are
platform-dependent. On some platforms, the value
of the stackSize parameter may have no effect whatsoever.
[That sentence is bold in the original.]

What happens when Java runs out of memory because the heap fills
up? It's not completely clear from the documentation, but it
appears that whichever process was running when an allocation
attempt fails gets an OutOfMemory exception, even if that's the
smallest process in the whole system. This exception doesn't
seem to be handled in very many Java programs, at any rate I've
often found perfectly good Java programs to crash with an
unhandled OutOfMemory exception on my 4GB laptop because the
default is 64MB.

So, the Erlang VM crashes when memory runs out?
That is, in practice, exactly what happens in Java.
Can it be that Java programmers just don't expect the
same quality of service from VMs as Erlang programmers do?
(:-) (:-) (:-)

Richard O'Keefe

Nov 10, 2009, 6:28:22 PM
to Kenneth Lundin, Joe Armstrong, Ulf Wiger, Erlang-Questions Questions

On Nov 11, 2009, at 2:13 AM, Kenneth Lundin wrote:
> If you, for example, have processes representing subscribers,
> half-calls, or mobile phones, it is pretty easy to determine a
> reasonable size for such a process by measurement.

For THIS version of the program
with THIS version of the Erlang libraries, compiler, VM
and THIS build (because C space requirements can depend on C
compiler settings)
on THIS hardware
with THIS workload.

Measurement is a good thing, but when any of these factors changes,
it's advisable to measure again. In particular, if I develop something
on my box and measure and set an appropriate bound, and I send you the
code, you have no reason to believe that the bound is appropriate for
your circumstances until you have measured.

Only the other day I was explaining to some students that the reason
the SPARK subset of Ada can be used safely in embedded systems is that
the SPARK subset excludes
- recursion
- pointers and dynamically allocated objects
- dynamic array sizes
so that the compiler can be sure everything fits. Erlang has all
the things that make this hard.

> It should be easy to say if it can grow up to 100 kwords, 1Mwords, 10
> Mwords etc.

I can't do that for small C programs. I've seen C programs where
the stack space varied by a factor of 10 between different hardware/
compiler/options choices. Come to that, with some C compilers
supporting tail call optimisation and some not, it's quite easy to
construct artificial examples where the memory needed by a thread
could be as low as 1 KB or as high as 1 GB.

> Of course you should not try to set quota as tight as possible. The
> quota on process
> size is just a way to protect the system from total disaster if one of
> these processes
> happens to grow over all reasonable limits because of a bug.

There are two separable issues.
A. How do we detect a runaway process and stop it ruining things
for everyone else? Loose quotas are just fine for this. Even
a factor of 10 too large is good enough. (Maybe.)
B. How do we handle memory exhaustion without killing the VM
*when no process has exceeded its quota*. Loose quotas actually
create this problem. (In the present system, the quotas are as
loose as they could possibly be...)

I have no quarrel with optional loose quotas as a debugging/safety
measure, a solution to problem A. It's just that we shouldn't
mistake them for a solution to problem B, which is what I thought
this thread was about. Tackling problem B should make a nice PhD
for someone. (While there are some, um, questionable decisions in
the design of the Java _language_, a lot of smart people have
worked on Java _implementation_, and if there were a good solution
to problem B I think some Java vendor would be boasting about it.)

Richard O'Keefe

Nov 10, 2009, 7:29:23 PM
to Ulf Wiger, Erlang-Questions Questions

On Nov 11, 2009, at 11:26 AM, Ulf Wiger wrote:

> Richard O'Keefe wrote:
>
>> I am not denying the *need* for limits. (Anyone else remember
>> "engines"
>> in Scheme?) I've had enough functions in enough languages go into
>> infinite recursion that I can see the point of stopping them.
>> However, Ulf Wiger has not addressed these points:
>> - if the tagging scheme is changed (and it has in the past),
>> memory requirements may change.
> > ...
>
> Perhaps I'm colored by having worked on systems where every
> product release was preceded with months of relatively
> thorough testing, where monitoring memory usage during
> important operating conditions was routine.

Let's see if we can reach consensus here.

Ulf, I'm *not* saying that you can't do what you DID do.
That would be rude and stupid.

What I'm saying is that you have to *KEEP* doing it,
as indeed you did ("every product release"), and that
you have to keep doing it even if YOUR code didn't
change at all.

How you do that thorough testing is something that needs
to be written up clearly in some new Erlang book, which
I wish you would write. I would certainly buy it.

> A near-trivial case where memory requirements indeed will
> change is if one switches from 32-bit to 64-bit Erlang.
> This can cause memory usage in many Erlang applications to
> roughly double (depending on how much binaries are used).

I thought I already said that.

This whole thread is rather interesting because it shows up
the difference between *language* issues and *system* issues.
There have, for example, been some interesting changes to
the Java *language* over the years, but for many purposes
the instrumentation interfaces and things like MX Beans
are much more important.

Erlang *has* monitoring tools, but I don't think they are
anywhere near as well understood by many Erlang programmers
as they should be, including me. I'm serious about the book!

Ulf Wiger

Nov 11, 2009, 5:38:47 AM
to Richard O'Keefe, Erlang-Questions Questions
Richard O'Keefe wrote:
>
> On Nov 11, 2009, at 11:26 AM, Ulf Wiger wrote:
>
>> Richard O'Keefe wrote:
>>
>>> I am not denying the *need* for limits. (Anyone else remember "engines"
>>> in Scheme?) I've had enough functions in enough languages go into
>>> infinite recursion that I can see the point of stopping them.
>>> However, Ulf Wiger has not addressed these points:
>>> - if the tagging scheme is changed (and it has in the past),
>>> memory requirements may change.
>> > ...
>>
>> Perhaps I'm colored by having worked on systems where every
>> product release was preceded with months of relatively
>> thorough testing, where monitoring memory usage during
>> important operating conditions was routine.
>
> Let's see if we can reach consensus here.

I think we do agree. Hopefully the volley will clarify things
to the rest of the crowd - let's go another round and see if
people will start groaning. :)

> What I'm saying is that you have to *KEEP* doing it,
> as indeed you did ("every product release"), and that
> you have to keep doing it even if YOUR code didn't
> change at all.

Indeed. For high-availability products, this ought to
be a given. But as Erlang is also used in many other settings,
I would like to add an observation - perhaps obvious to all:

Erlang has some limits /today/ that programmers have to live
with. In many cases, the limits are high enough that they
are no cause for concern. Some of the better-known limits are:

- the number of simultaneous processes in the system
(default: 32768, but can be raised up to 268435456)
- the number of ETS tables (default: 1400, can be raised considerably)
- the number of open ports and file descriptors (basically
an OS limit)

Other system limits can be found in
http://www.erlang.org/doc/efficiency_guide/part_frame.html
and most are implementation details and subject to change.

What happens if you run into a system limit, e.g. if you
try creating too many ets tables, is that the operation
fails with an exception. Out of memory is a rather special
case, in that it brings down the entire VM.
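This is also why the offending process can deal with such failures locally; for example, exceeding the ETS table limit raises a system_limit error that can be caught:

```erlang
%% ets:new/2 fails with error:system_limit when the table limit
%% (default 1400, raisable via ERL_MAX_ETS_TABLES) is exceeded.
new_table_safe(Name) ->
    try
        {ok, ets:new(Name, [])}
    catch
        error:system_limit ->
            {error, too_many_tables}
    end.
```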

Adding a few more limits that affect individual processes
rather than suffering the node-global disaster of OOM,
is thus nothing new, and in that sense shouldn't be
terribly controversial in itself. The reasonable /default/ is
of course that processes behave as today, i.e. the heap
and message queue length limits et al would be /optional/.

Having said this, it is certainly important to discuss which
such limits would actually be useful in practice.

> How you do that thorough testing is something that needs
> to be written up clearly in some new Erlang book, which
> I wish you would write. I would certainly buy it.

Ok, at least one potential customer. I'll think about it. :)

BR,
Ulf W
--
Ulf Wiger
CTO, Erlang Training & Consulting Ltd
http://www.erlang-consulting.com

John-Olof Bauner

Nov 11, 2009, 10:41:35 AM
to Ulf Wiger, Tony Rogvall, Chandru, Erlang-Questions Questions
> Ulf Wiger wrote:

>> Tony Rogvall wrote:
>>
>> What kind of flavor do you have in mind?
>>
>> One proposal is to let sender crash when receiver in box is full.
>>
>> I would like to test a blocking version as well. This may sound like
>> utterly crazy, but we already have the infamous busy_port ;-)

> ...and the punishing of senders if the receiver has a long message
> queue. I think these are abominations in the multicore world, and should
> be phased out.

Reminds me of CHILL, a telecom language from the Middle Ages, where the
sender hangs if the receiver's inbox is full. Nice feature for its time,
though.

J-O

Michael McDaniel

Nov 11, 2009, 6:56:14 PM
to erlang-q...@erlang.org

yes, please ... two customers

~Michael




--
Michael McDaniel
Portland, Oregon, USA
http://trip.autosys.us

Angel Alvarez

Nov 12, 2009, 4:41:54 AM
to erlang-q...@erlang.org

3 customers!!


/Angel

--

This email contains no drawings. The strange shapes on the screen are letters.
__________________________________________

Clist UAH a.k.a Angel
__________________________________________

I wouldn't give Coca-Cola Zero even to my worst enemy. For that there is mustard gas, which is more merciful.

ERLANG

Nov 12, 2009, 4:49:50 AM
to Angel Alvarez, erlang-q...@erlang.org
5 customers (2 for my team)

Y.

On 12 Nov 2009, at 10:41, Angel Alvarez wrote:
