[slurm-users] How to deal with user running stuff in frontend node?


Manuel Rodríguez Pascual

unread,
Feb 15, 2018, 10:12:53 AM2/15/18
to Slurm User Community List
Hi all, 

Although this is not strictly related to Slurm, maybe you can recommend some actions for dealing with a particular user.

On our small cluster there are currently no limits on running applications on the frontend. This is sometimes really useful for some users, for example to have scripts monitoring the execution of jobs and making decisions depending on the partial results.

However, we have one user who keeps abusing this: when the job queue is long and there is a significant wait, he sometimes runs his jobs on the frontend, resulting in a CPU load of 100% and delays in the things the node is supposed to serve (user login, monitoring and so on).

Have you faced the same issue? Is there any solution? I am thinking about using ulimit to limit the execution time of these jobs on the frontend to 5 minutes or so. This does not look very elegant, though, as other users could perform the same abuse in the future, and he should still be able to run low CPU-consuming jobs for a longer period. However, I am not an experienced sysadmin, so I am completely open to suggestions or different ways of approaching this issue.

Any thoughts?

cheers, 




Manuel

Patrick Goetz

unread,
Feb 15, 2018, 10:25:39 AM2/15/18
to slurm...@lists.schedmd.com
The simple solution is to tell people not to do this -- that's what I
do. And if that doesn't work, threaten to kick them off the system.

Paul Edmon

unread,
Feb 15, 2018, 10:27:54 AM2/15/18
to slurm...@lists.schedmd.com
We have an automated script, pcull, which goes through and finds abusive
processes:

https://github.com/fasrc/pcull

-Paul Edmon-

Bill Barth

unread,
Feb 15, 2018, 10:28:16 AM2/15/18
to Slurm User Community List
We kick them off and lock them out until they respond. Disconnections are common enough that they don't always get the user's attention. Inability to log back in always does.

Best,
Bill.

Sent from my phone.

Jeffrey Frey

unread,
Feb 15, 2018, 10:30:03 AM2/15/18
to Slurm User Community List
Every cluster I've ever managed has this issue.  Once cgroup support arrived in Linux, the path we took (on CentOS 6) was to use the 'cgconfig' and 'cgred' services on the login node(s) to set up containers for regular users and quarantine them therein.  The config leaves 4 CPU cores unused by regular users (cpuset config) and allows them to use up to 100% of the 16 cores they are granted, yielding cycles as other users demand them (cpu config).  The config also keeps a small amount of RAM off-limits to regular users and limits each regular user to a couple of GB.

The cgrules.conf works on a first-match basis, so at the top we make sure root and sysadmins don't have any limits.  Support staff get the overall limits for regular users, and everyone else who's not a daemon user, etc., gets a personal cgroup with the most stringent limits.




/etc/cgconfig.conf:
mount {
  cpuset = /cgroup/cpuset;
  cpu = /cgroup/cpu;
  #cpuacct = /cgroup/cpuacct;
  memory = /cgroup/memory;
  #devices = /cgroup/devices;
  #freezer = /cgroup/freezer;
  #net_cls = /cgroup/net_cls;
  #blkio = /cgroup/blkio;
}

# Overall pool shared by all regular users: cores 4-19, up to 48 GB RAM
group regular_users {
  cpu {
    cpu.shares=100;
  }
  cpuset {
    cpuset.cpus=4-19;
    cpuset.mems=0-1;
  }
  memory {
    memory.limit_in_bytes=48G;
    memory.soft_limit_in_bytes=48G;
    memory.memsw.limit_in_bytes=60G;
  }
}

# Per-user cgroup (%U expands to the username): at most 4 GB RAM each
template regular_users/%U {
  cpu {
    cpu.shares=100;
  }
  cpuset {
    cpuset.cpus=4-19;
    cpuset.mems=0-1;
  }
  memory {
    memory.limit_in_bytes=4G;
    memory.soft_limit_in_bytes=2G;
    memory.memsw.limit_in_bytes=6G;
  }
}


/etc/cgrules.conf:
#
# Include an explicit rule for root, otherwise commands with
# the setuid bit set on them will inherit the original user's
# gid and probably wind up under @everyone:
#
root cpuset,cpu,memory /

#
# sysadmin
#
user1 cpuset,cpu,memory /
user2 cpuset,cpu,memory /

#
# sysstaff
#
user3 cpuset,cpu,memory regular_users/
user4 cpuset,cpu,memory regular_users/

#
# workgroups:
#
@everyone cpuset,cpu,memory regular_users/%U/
@group1 cpuset,cpu,memory regular_users/%U/
@group2 cpuset,cpu,memory regular_users/%U/
  :






::::::::::::::::::::::::::::::::::::::::::::::::::::::
Jeffrey T. Frey, Ph.D.
Systems Programmer V / HPC Management
Network & Systems Services / College of Engineering
University of Delaware, Newark DE  19716
Office: (302) 831-6034  Mobile: (302) 419-4976
::::::::::::::::::::::::::::::::::::::::::::::::::::::




Pablo Escobar

unread,
Feb 15, 2018, 10:33:39 AM2/15/18
to Slurm User Community List
Hi Manuel,

A possible workaround is to configure a per-user cgroup limit on the frontend node so that a single user cannot allocate more than 1 GB of RAM (or whatever value you prefer). The user would still be able to abuse the machine, but as soon as his memory usage goes above the limit his processes will be killed by the cgroup, and this should not affect the well-behaved users too much.
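For illustration, a minimal sketch of such a per-user cap on a systemd-based login node (the 1 GB figure and the drop-in path are assumptions; templated user-.slice.d drop-ins need a reasonably recent systemd, roughly v239 or later, and on older systems the cgconfig templates or per-user "systemctl set-property" calls shown elsewhere in this thread achieve the same thing):

/etc/systemd/system/user-.slice.d/50-memory.conf:
[Slice]
# Hard cap applied to every user-<UID>.slice, i.e. to each logged-in user;
# processes in the slice get OOM-killed once the cap is exceeded.
# On cgroup v1 / older systemd the equivalent property is MemoryLimit=.
MemoryMax=1G

Run "systemctl daemon-reload" after creating the drop-in so it applies to new sessions.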

In any case the best solution I know is a non-technical one. When a user abuses the system we close the account. He quickly sends an email asking what happened and why he cannot log in, and we reply that, as he abused the system, we won't reopen the account until his boss contacts us and asks us to. Once the user has had to explain the "problem" to his/her boss, they don't abuse the system again ;)

regards,
Pablo.

Loris Bennett

unread,
Feb 15, 2018, 10:59:41 AM2/15/18
to Manuel Rodríguez Pascual, Slurm User Community List
Hi Manuel,
You can use cgroups to limit users, but you have to modify the
configuration when new users are set up. We have also had problems
automating the restart of the two daemons affected, so it is not that
elegant a solution. However, it works for us, as we only set up a few new
users a week.
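For concreteness, a hypothetical sketch of what that per-new-user step can look like with the cgconfig/cgred setup described earlier in this thread (the script name is made up, the service names follow CentOS 6 conventions, and the rule line assumes the regular_users/%U template shown above):

#!/bin/bash
# add-cgroup-user.sh (hypothetical) -- run once per new account
NEWUSER="$1"
# Put the new user into the throttled per-user cgroup ...
echo "${NEWUSER} cpuset,cpu,memory regular_users/%U/" >> /etc/cgrules.conf
# ... then restart the two affected daemons so the rule is picked up.
service cgconfig restart
service cgred restart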

Cheers,

Loris

--
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin Email loris....@fu-berlin.de

John Hanks

unread,
Feb 15, 2018, 11:04:26 AM2/15/18
to Slurm User Community List
I've used this with some success: https://github.com/JohannesBuchner/verynice. For CPU-intensive things it works great, but you also have to set some memory limits in limits.conf if users do any large-memory work. Otherwise I just use a problem process as a chance to start a conversation with that user to see what they are working on; it seems to make people happier when you talk to them and try to help, rather than just killing their work and scolding them.
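As a rough sketch of the limits.conf side (the 8 GB value is just an example; 'as' is a per-process address-space cap in KB, so it complements rather than replaces a per-user cgroup limit, and the '*' wildcard does not apply to root):

/etc/security/limits.conf:
# cap each process's virtual address space at 8 GB (value is in KB)
*    hard    as    8388608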

jbh


Michael Jennings

unread,
Feb 15, 2018, 11:07:09 AM2/15/18
to Slurm User Community List
I don't do this at my current job, but at my previous one, I used NHC
(https://github.com/mej/nhc) with a special config context I called
"patrol." I ran "nhc-patrol" (symlinked to /usr/sbin/nhc) with the
following /etc/nhc/nhc-patrol.conf:

### Kill ANY user processes consuming 98% or more of a CPU core or 20+% of RAM
ln* || check_ps_cpu -a -l -s -u '!root' -m '!/(^|\/)((mpi|i|g)?(cc|CC|fortran|f90))$/' 98%
ln* || check_ps_physmem -a -l -s -u '!root' -m '!/(^|\/)((mpi|i|g)?(cc|CC|fortran|f90))$/' 20%

### Ban certain processes on login nodes, like OpenMPI's "orted"
### or various file transfer tools which belong on the DTN.
ln* || check_ps_mem -a -k -u '!root' -m '/(^|\/)(scp|sftp-server|bbcp|ftp|lftp|ncftp|sftp|unison|rsync)$/' 1k
ln* || check_ps_mem -a -k -u '!root' -m '/(^|\/)(orted|mpirun|mpiexec|MATLAB)$/' 1k

### Ban certain misbehaving Python scripts and known application binaries
ln* || check_ps_mem -a -l -s -u '!root' -m '/(\.cplx\.x|xas\.x|volume.py|Calculate|TOUGH_|mcnpx|main|eco2n)/' 1G
ln* || check_ps_mem -a -l -s -u '!root' -f -m '/(^|\/)(([^b][^a][^s][^h].*|)Calculate(Ratio|2PartCorr)|.*main config\.lua|.*python essai\.py|.*python \.?\/?input_hj\.py|.*python .*/volume\.py|java -jar.*(SWORD|MyProxyLogon).*)/' 1G
ln* || check_ps_time -a -l -s -k -u '!root' -m '/(^|\/)(projwfc.x|xas_para.x|pp.x|pw.x|new_pw.x|xi0.cplx.x|sigma.cplx.x|xcton.cplx.x|diag.cplx.x|sapo.cplx.x|plotxct.cplx.x|forces.cplx.x|absorption.cplx.x|shirley_xas.x|wannier90.x|pw2bgw.x|metal_qp_single.x|epsbinasc.cplx.x|inteqp.cplx.x|summarize_eigenvectors.cplx.x|epsomega.cplx.x|epsilon.cplx.x|volume.py|ccsm.exe|pgdbg|pgserv|paratec.mpi|abinip|puppet|pyMPI|real.exe|denchar|nearBragg_mpi_test_8|vasp|cam|qcprog.exe|viewer|elk|bertini|namd2|Calculate2PartCorr|TOUGH_Shale|t2eos3_mp|pho|lmp_mftheory|t101|mcnpx.ngsi|chuk_code_mpi.exe|cape_calc|test_fortran_c_mixer|g_wham_d_mpi|tr2.087_eco2n_lnx|scph|phon|TRGw|tt2)$/' 1s

### Ban certain programs from specific naughty users
ln* || check_ps_physmem -a -l -s -k -u baduser1 -m '*test' 1k
ln* || check_ps_time -a -l -s -k -u baduser2 -m R 10s

### Prohibit non-file-transfer tools on the DTNs
xfer* || check_ps_time -a -l -s -k -u '!root' -f -m '/ssh ln/' 1s
xfer* || check_ps_time -a -l -s -u '!root' -f -m '!/\b(sshd: |ssh-agent|rsync|scp|ftp|htar|hsi|-?t?csh|-?bash|portmap|rpc|dbus-daemon|ntpd|xfs)\b/' 1s

---------------------------------------

Obviously this is something you'd need to tweak for your needs, and
some of these things are now better done via CGroups. But not only
did it help me keep the login nodes clean, once upon a time, it also
helped me catch a would-be hacker! :-)
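(For reference, a patrol run like this can simply be driven from cron -- the interval and path below are assumptions, not the original setup; the "nhc-patrol" name is what makes NHC read /etc/nhc/nhc-patrol.conf as its config context:)

/etc/cron.d/nhc-patrol:
# sweep the login nodes every 10 minutes
*/10 * * * *  root  /usr/sbin/nhc-patrol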

HTH,
Michael

--
Michael E. Jennings <m...@lanl.gov>
HPC Systems Team, Los Alamos National Laboratory
Bldg. 03-2327, Rm. 2341 W: +1 (505) 606-0605

Ryan Cox

unread,
Feb 15, 2018, 4:06:30 PM2/15/18
to Slurm User Community List
Manuel,

We set up cgroups and also do cputime limits (60 minutes in our case) in
limits.conf.  Before libcgroup had support for a more generic "apply to
each user" kind of thing, I created a PAM module that handles all of
that, and it still works well for creating per-user limits.  We also have
something that whitelists various file transfer programs so they aren't
subject to the cputime limits.  We include an OOM notifier daemon so that
users are alerted when their cgroup runs out of memory, since many people
would otherwise have a tough time figuring out the exact cause of the
"Killed" message.  All of this is available at
https://github.com/BYUHPC/uft (see the "Recommended Configuration"
section of the README.md for "Login Nodes").

We've had this in place for years and pretty much don't even have to
think about this anymore.  No complaints either.

If I had a user abusing the system after a warning, I would probably
kick him off for a cooling-off period and/or implement a very strict
cputime limit (10 minutes?) in limits.conf just for him. Just my $0.02.
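(A minimal sketch of that per-user clamp, with a placeholder username; the 'cpu' item in limits.conf is CPU time in minutes:)

/etc/security/limits.conf:
# hard 10-minute CPU-time limit for the offending account only
baduser    hard    cpu    10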

Ryan
--
Ryan Cox
Operations Director
Fulton Supercomputing Lab
Brigham Young University


Nicholas McCollum

unread,
Feb 15, 2018, 5:13:29 PM2/15/18
to Slurm User Community List
I had previously contacted Ryan Cox about his solution and worked with
it a little to implement it on our CentOS 7 cluster. While I liked his
solution, I felt it was a little complex for our needs.

I'm a big fan of keeping stuff real simple, so I came up with two simple
shell scripts to solve the issue.

I have ulimits set to 10 minutes of cpu time.

One script runs continuously and runs:
# systemctl set-property user-$userid.slice CPUQuota=200%
# systemctl set-property user-$userid.slice CPUShares=256
# systemctl set-property user-$userid.slice MemoryLimit=4294967296

... for each user that it discovers logged in. This essentially sets the
maximum number of CPU cores and the amount of memory that the user can use.

The other script runs every 5 minutes and looks through /proc to find
any processes that I want to allow to exceed the ulimit. I have an array
of processes (tar, bzip2, rsync, scp, etc.) for which I don't mind the
user exceeding 10 minutes of CPU time. The script looks for these
processes and runs:

# prlimit --pid $PID --cpu=unlimited

That way ulimits don't apply to those applications.
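A hypothetical sketch of that whitelist pass (not the actual script; the process names and the use of pgrep/prlimit are assumptions based on the description above):

#!/bin/bash
# Run from cron every 5 minutes: lift the CPU-time ulimit for
# whitelisted long-running but legitimate tools.
WHITELIST=(tar bzip2 gzip rsync scp sftp)
for name in "${WHITELIST[@]}"; do
    for pid in $(pgrep -x "$name"); do
        prlimit --pid "$pid" --cpu=unlimited
    done
done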

It's actually worked so well that I had totally forgotten about it until
I saw this thread. If you'd like a copy of the shell scripts, just send
me an e-mail.

---

Nicholas McCollum - HPC Systems Expert
Alabama Supercomputer Authority - CSRA

Petersen, Dirk

unread,
Feb 15, 2018, 7:12:48 PM2/15/18
to slurm...@lists.schedmd.com

I think cgroups is probably more elegant, but here is another script:

 

https://github.com/FredHutch/IT/blob/master/py/loadwatcher.py#L59

 

The email text is hard coded, so please change it before using.  We put this in place in October 2017, when things were getting out of control because folks were using much more multithreaded software than before.  Since then we have had 95 users removed from one of the login nodes and several hundred warnings sent.  The "killall -9 -v -g -u username" has been very effective.  We have 3 login nodes with 28 cores and almost 400G RAM.

 

Dirk

 

-----Original Message-----
From: hpcx...@lists.fhcrc.org [mailto:hpcxxxx...@lists.fhcrc.org] On Behalf Of loadwatchx...@fhcrc.org
Sent: Tuesday, November 14, 2017 11:45 AM
To: Doe, John <xxxxxxxxx @fredhutch.org>
Subject: [hpcpol] RHINO3: Your jobs have been removed!

 

This is a notification message from loadwatcher.py, running on host RHINO3. Please review the following message:

 

jdoe, your CPU utilization on rhino3 is currently 4499 %!

 

For short term jobs you can use no more than 400 % or 4.0 CPU cores on the Rhino machines.

We have removed all your processes from this computer.

Please try again and submit batch jobs

or use the 'grabnode' command for interactive jobs.

 

see http://scicomp.fhcrc.org/Gizmo%20Cluster%20Quickstart.aspx

or http://scicomp.fhcrc.org/Grab%20Commands.aspx

or http://scicomp.fhcrc.org/SciComp%20Office%20Hours.aspx

 

If output is being captured, you may find additional information in your logs.


Dirk Petersen
Scientific Computing Director
Fred Hutch

1100 Fairview Ave. N.

Mail Stop M4-A882
Seattle, WA 98109
Phone: 206.667.5926

Skype: internetchen


 
