Enforcement of job limits on native resources


Jonas Bardino

Feb 22, 2011, 11:14:34 AM
to migrid
Hi list

We have recently introduced basic enforcement of job memory and disk
limits on native resources to avoid DoS from stray jobs. This means
that even the resources without a Local Resource Management System
(like PBS/SGE/LoadLeveler/...) now limit jobs to the amount of memory
and disk they requested.
I.e. if your MiG job requested 32 megs of memory and your commands
actually try to allocate significantly more memory, the allocation
will fail!
Typically this leads to applications terminating with an out of memory
error message, but less polished applications may just crash because
they try to use memory that couldn't be allocated.
As an example, just starting a simple python interpreter easily
requires 25 megs of memory, so a job requesting, say, 1 meg and then
running python will fail with something like:
python: error while loading shared libraries: libc.so.6: failed to map
segment from shared object: Cannot allocate memory
If the memory limit is instead hit during the actual python program it
will result in python throwing a MemoryError.
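
To get a feel for the second case before submitting a job, here is a
minimal sketch that imitates a memory limit locally on Linux using the
standard resource module. It is only an illustration: that the limit on
a MiG resource behaves like RLIMIT_AS is an assumption, and the helper
name run_with_limit and the sizes used are made up for this example.

```python
import subprocess
import sys

# Child script: lower its own address-space limit, then try to allocate.
# Running it in a child process keeps the limit away from the caller.
CHILD = """
import resource, sys
limit, size = int(sys.argv[1]), int(sys.argv[2])
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
if hard != resource.RLIM_INFINITY:
    limit = min(limit, hard)  # the soft limit may not exceed the hard one
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
try:
    buf = bytearray(size)  # the allocation that may hit the limit
    print("allocated")
except MemoryError:
    print("MemoryError")
"""


def run_with_limit(limit_bytes, alloc_bytes):
    """Hypothetical helper: allocate alloc_bytes under a limit_bytes cap."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(limit_bytes), str(alloc_bytes)],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    gib = 1024 ** 3
    print(run_with_limit(gib, 4 * gib))      # request far above the limit
    print(run_with_limit(gib, 1024 * 1024))  # request well below the limit
```

A program that catches MemoryError like this can at least exit with a
readable message instead of a traceback or a crash.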

A small margin for job overhead is added automatically, but please
make sure you request enough resources, memory in particular, to avoid
such problems.
The disk limit is less strict and only kicks in once a job uses one gig
more than it requested, so it should only be a problem in rare cases.

Cheers, Jonas