
How can I avoid my process being killed while using as much memory as I can?


Wahid Chrabakh

Sep 8, 2002, 3:10:01 PM
Hello everybody:
I am writing a program that uses a lot of memory. It actually runs on
many machines. I cannot change this; it is a research
project that requires a lot of memory and runs on a lot of machines.
The program's memory usage
grows, and I want to handle out-of-memory cases gracefully by trying
to split the program after I cannot get any more memory.
I am aware that the Linux kernel overcommits memory. What can I do to
avoid being killed by the infamous OOM killer?
I wrote a test program that allocates a lot of memory using malloc.
malloc sometimes returns a null pointer (why, even though there is
overcommit? Is it because we exceed the 4 GB maximum virtual address
space?), and then at some point the program is killed. I want to be
able to know when I cannot allocate memory so that I can split the
problem with another machine on the network.

Are there system calls I can use?
Also, can a different custom malloc remedy this problem? I looked at
glibc malloc and it seems to check the return values from sbrk
correctly.

thank you all,
-Wahid.

Kasper Dupont

Sep 9, 2002, 5:45:19 AM

First of all, you are not going to find the problem by looking in
glibc, because it is in the kernel. Although the kernel has
overcommitment, it has an option to switch between two behaviours.
AFAIR one will always allow all allocations, the other uses
some heuristic. But both do overcommit.

There are two possible solutions to your problem:

1) Add more swap.
2) Use another kernel.

The latest -ac kernels can do accounting and thus completely
prevent overcommitment. It has five different modes, check the
kernel mailing list archives for a description of the modes. You
can download the patch from kernel.org.
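
As a rough illustration of checking which mode a machine is in: the sketch below assumes the interface found on current mainline Linux kernels, where the policy is exposed in /proc/sys/vm/overcommit_memory (0 = heuristic, 1 = always allow, 2 = strict accounting). That is not necessarily the interface of the -ac patch discussed here, and the function name read_overcommit_mode is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Read a single integer from a procfs-style file.
   Returns the value, or -1 if the file is missing or unreadable
   (for example on a non-Linux system or in a restricted container). */
int read_overcommit_mode(const char *path)
{
    FILE *f;
    int mode = -1;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

A caller would pass "/proc/sys/vm/overcommit_memory" and treat -1 as "unknown" rather than as a policy value.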

(There is no point in writing to fa.linux.kernel, it works only
one way. Kernel mailing list gets duplicated to fa.linux.kernel,
not the other way around.)

--
Kasper Dupont -- der bruger for meget tid på usenet.
For sending spam use mailto:aaa...@daimi.au.dk
or mailto:mcxumhv...@skrammel.yaboo.dk

Eric Braeden

Sep 9, 2002, 12:19:54 PM
Not to be critical, but are you sure you don't have multiple
memory leaks in your code? Also, memory allocation
failures should not kill a well-written program. You
do have this programmed, right?


David Schwartz

Sep 9, 2002, 4:47:29 PM

His problem is either:

1) His program does not properly handle the case where 'malloc' returns
NULL, or

2) The machine he is using is misconfigured.

If 1, he needs to fix his program. If 2, he needs to get the
administrator to set sensible resource limits or he can impose them
himself on his own program.
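
Imposing a limit on one's own program might look like the sketch below. It assumes a POSIX system that supports RLIMIT_AS; the helper name limit_address_space is hypothetical. Note that RLIMIT_AS caps virtual address space, not resident memory, so it is only an approximation of "how much memory the program may use".

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Lower this process's soft address-space limit to max_bytes.
   Returns 0 on success, -1 on error (for example if max_bytes
   exceeds the hard limit). */
int limit_address_space(size_t max_bytes)
{
    struct rlimit rl;

    assert(max_bytes > 0);
    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return -1;
    rl.rlim_cur = (rlim_t)max_bytes;   /* leave the hard limit untouched */
    return setrlimit(RLIMIT_AS, &rl);
}
```

Once the cap is in place, an allocation that would push the address space past it makes malloc return NULL, which the program can then handle instead of relying on the kernel's overcommit behaviour.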

Worst case, you can modify the program to keep track of how much
memory/swap is available and simulate 'malloc' failures rather than
letting the machine overcommit.
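
One way to simulate 'malloc' failures is a thin wrapper that enforces a byte budget. The sketch below is a simplification: it uses a fixed budget chosen by the program rather than querying how much memory/swap the system has, and all names (budget_set, budget_malloc, budget_free) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static size_t budget_limit = 0;  /* most bytes the program may hold at once */
static size_t budget_used  = 0;  /* bytes currently handed out */

/* Set the budget and reset the usage counter. */
void budget_set(size_t limit)
{
    budget_limit = limit;
    budget_used = 0;
}

/* Behaves like malloc, but returns NULL once the budget would be exceeded. */
void *budget_malloc(size_t size)
{
    void *p;

    assert(budget_used <= budget_limit);
    if (size > budget_limit - budget_used)
        return NULL;                 /* simulated allocation failure */
    p = malloc(size);
    if (p != NULL)
        budget_used += size;
    return p;
}

/* The caller passes the size back so that no per-block header is needed. */
void budget_free(void *p, size_t size)
{
    if (p == NULL)
        return;
    assert(size <= budget_used);
    free(p);
    budget_used -= size;
}
```

A NULL from budget_malloc is then the signal to stop growing and hand part of the work to another machine.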

By the way, professionally-written programs should already do all of
this. They should have an emergency pool to handle 'malloc' failures,
and they should be able to substitute other ways of getting free space
(such as mapping a file) if they can't handle a resource shortage any
other way.
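
A minimal sketch of the emergency pool idea follows. The names emergency_pool_init and alloc_with_reserve are hypothetical, and the approach shown (release the reserve once, then retry) is only one possible reading of "emergency pool". The reserve is written to with memset because, under overcommit, untouched pages may not be backed by real memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static void *emergency_pool = NULL;

/* Reserve `bytes` up front. Returns 0 on success, -1 on failure. */
int emergency_pool_init(size_t bytes)
{
    assert(bytes > 0);
    free(emergency_pool);
    emergency_pool = malloc(bytes);
    if (emergency_pool == NULL)
        return -1;
    memset(emergency_pool, 0, bytes);  /* touch the pages so they are really backed */
    return 0;
}

/* Try malloc; if it fails, release the reserve once and retry.
   *used_reserve is set to 1 when the reserve was consumed, so the
   caller knows it is time to shed work. */
void *alloc_with_reserve(size_t size, int *used_reserve)
{
    void *p = malloc(size);

    *used_reserve = 0;
    if (p == NULL && emergency_pool != NULL) {
        free(emergency_pool);
        emergency_pool = NULL;
        *used_reserve = 1;
        p = malloc(size);
    }
    return p;
}
```

The point of the reserve is to leave enough headroom for the cleanup path itself (saving state, contacting another machine) to run after the first failure.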

DS

Mark Hahn

Sep 9, 2002, 8:54:39 PM
> Are there system calls I can use?

no. your design is broken; that is, it could only possibly work
on a uniprocessing machine (dos). even with only one user,
a multiprocessing OS will have drastically changing amounts
of memory available to user-level programs, so you'd need to
implement some kind of kernel-assisted garbage collection in order
to give pages back to the OS when it wants them.

> Also can a different custom malloc remedy this problem? I looked at

the kernel-gc approach is simply not going to fly; if you insist on
this sort of memory-hog behavior, the best you could do is periodically
monitor the state of the system, and back off on the number of pages
you use when there's a crunch. mmap is probably the best way to do
this, and it would also let you set a max size that doesn't necessarily
interact with swap (ie, allocate your pages out of a mmaped file,
which you truncate to a fixed size).
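
the mmaped-file idea could be sketched roughly as below. this is an illustration only: the struct and function names (file_arena, arena_open, arena_alloc, arena_close) are hypothetical, the allocator is a simple bump allocator with no per-block free, and it assumes a POSIX system.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

struct file_arena {
    unsigned char *base;  /* start of the mapping */
    size_t size;          /* fixed size of the backing file */
    size_t used;          /* bytes handed out so far */
};

/* Create (or reuse) the file at `path`, set it to exactly `size` bytes
   and map it. Returns 0 on success, -1 on failure. */
int arena_open(struct file_arena *a, const char *path, size_t size)
{
    int fd;
    void *p;

    assert(a != NULL && path != NULL && size > 0);
    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                        /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;
    a->base = p;
    a->size = size;
    a->used = 0;
    return 0;
}

/* Hand out n bytes (rounded up to 16) or NULL when the arena is full. */
void *arena_alloc(struct file_arena *a, size_t n)
{
    size_t aligned = (n + 15) & ~(size_t)15;
    void *p;

    if (aligned < n || aligned > a->size - a->used)
        return NULL;                  /* overflow, or hard cap reached */
    p = a->base + a->used;
    a->used += aligned;
    return p;
}

void arena_close(struct file_arena *a)
{
    munmap(a->base, a->size);
    a->base = NULL;
    a->size = a->used = 0;
}
```

here the cap is the file size, so running out shows up as a NULL from arena_alloc. one caveat: the file is sparse after ftruncate, so a write can still fault if the filesystem itself runs out of space.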

of course, the first time someone tries to run two copies of your hog,
it'll all fall apart, but we already covered that under "is broken"...

Mirko

Sep 10, 2002, 8:58:12 AM
Why don't you make a cluster?

Kasper Dupont

Sep 10, 2002, 10:25:49 AM
Mirko wrote:
>
> Why don't you make a cluster?

Mirko, could you please fix your mailer.
The References header is missing.
