[slurm-dev] Expanding TotalCPU to include child processes


Scott Yockel

Mar 3, 2015, 8:20:28 PM
to slurm-dev
Slurm-Dev,

Is there anything in the works to extend TotalCPU so that it also tracks the user and system time of child processes? I see that TotalCPU is currently defined as: "provides a measure of the task’s parent process and does not include CPU time of child processes." I ask because it would be nice to profile how well multi-core jobs are using the system, a sort of parallel efficiency measure. One could compare (wall time * cpus) to (FullCPUTotal) and see whether users were "hogging" cores. Back in my LSF days, they had a Hog Factor that was something like this. Right now the only way I see to catch this is while the job is running on the cluster, not after it completes.
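The comparison described above can be sketched in shell. This is a minimal sketch, not anything Slurm ships: the function names `to_seconds` and `cpu_efficiency` are hypothetical, the time formats handled (MM:SS.mmm, HH:MM:SS, D-HH:MM:SS) are assumed from typical sacct output, and the core count is supplied by the caller (for example from sacct's AllocCPUS field).

```shell
#!/bin/bash

# Hypothetical helper: convert a sacct-style time string to seconds.
# Handles MM:SS.mmm, HH:MM:SS and D-HH:MM:SS (formats are an assumption).
to_seconds() {
    awk -v t="$1" 'BEGIN {
        days = 0
        # An optional "D-" prefix carries whole days.
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        n = split(t, p, ":")
        secs = 0
        # Fold the colon-separated fields base 60, left to right.
        for (i = 1; i <= n; i++) secs = secs * 60 + p[i]
        printf "%.3f\n", days * 86400 + secs
    }'
}

# Hypothetical helper: efficiency in percent = TotalCPU / (Elapsed * cores).
# Arguments: TotalCPU string, Elapsed string, number of allocated cores.
cpu_efficiency() {
    eff_cpu=$(to_seconds "$1")
    eff_wall=$(to_seconds "$2")
    awk -v c="$eff_cpu" -v e="$eff_wall" -v n="$3" 'BEGIN {
        if (e * n == 0) { print "n/a"; exit }
        printf "%.1f\n", 100 * c / (e * n)
    }'
}
```

For instance, `cpu_efficiency 02:23.416 00:02:24 1` prints `99.6`. The result is only as good as TotalCPU itself, so it undercounts whenever child-process time is missing from the accounting record.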

Cheers,

~Scott
==================================
Dr. Scott Yockel | Senior Team Lead of HPC
FAS Research Computing | Harvard University
38 Oxford Street Cambridge, MA
Office: 211A | Phone: 617-496-7468
==================================

Bjørn-Helge Mevik

Mar 4, 2015, 9:47:48 AM
to slurm-dev

Scott Yockel <syo...@g.harvard.edu> writes:

> Slurm-Dev,
>
> Is there anything in the works to add the capacity of TotalCPU to also
> track the child process user and system time? I see that currently
> TotalCPU is defined: "provides a measure of the task’s parent process
> and does not include CPU time of child processes."

In my experience, that description might not be accurate. It seems that
child processes are also included, as long as the job doesn't time out.
Here is an email I wrote about it last year:


From: Bjørn-Helge Mevik <b.h....@usit.uio.no>
Subject: [slurm-dev] UserCPU etc. for subprocesses not registered when a job times out.
To: slurm-dev <slur...@schedmd.com>
Date: Fri, 12 Sep 2014 06:25:16 -0700
Reply-To: slurm-dev <slur...@schedmd.com>


We would like to use the UserCPU, SystemCPU and TotalCPU values from
sacct to assess the efficiency of jobs. When a job exits normally,
these values are reported for the batch script step and include the
time spent by sub-processes.

However, if the job times out, these values only include the CPU time
spent by the batch script process itself, not its sub-processes. See
below for an illustration.

Is this intended behaviour? If so, is there any other way to gather CPU
times from jobs, even when they don't exit normally?


Illustration:

407 (1) $ cat shell-loop.sh
#!/bin/bash

echo Starting loop

## Loop that only uses shell builtins:
while true; do echo -n ; done

echo This is the end...

408 (1) $ cat timeout-in-subprocess.sm
#!/bin/bash
#SBATCH --account=staff
#SBATCH --time=0:2:0 --mem-per-cpu=500
#SBATCH --output=out/timeout-in-subprocess-%j.out

## Execute shell-loop.sh in subprocess:
./shell-loop.sh

409 (1) $ cat timeout-in-shell.sm
#!/bin/bash
#SBATCH --account=staff
#SBATCH --time=0:2:0 --mem-per-cpu=500
#SBATCH --output=out/timeout-in-shell-%j.out

## Run shell-loop.sh in this shell:
source shell-loop.sh

410 (1) $ sbatch timeout-in-subprocess.sm
Submitted batch job 40
411 (1) $ sbatch timeout-in-shell.sm
Submitted batch job 41

[... after a couple of minutes ...]

412 (1) $ sacct -o jobid,state,elapsed,usercpu,systemcpu,totalcpu -j 40,41
       JobID      State    Elapsed    UserCPU  SystemCPU   TotalCPU
------------ ---------- ---------- ---------- ---------- ----------
40              TIMEOUT   00:02:27  00:00.001  00:00.001  00:00.003
40.batch      CANCELLED   00:02:27  00:00.001  00:00.001  00:00.003
41              TIMEOUT   00:02:24  02:23.415   00:00:00  02:23.416
41.batch      CANCELLED   00:02:24  02:23.415   00:00:00  02:23.416

i.e., time spent in subprocesses is not reported.
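Jobs affected by this can be spotted after the fact by looking for timed-out jobs whose TotalCPU is tiny compared to Elapsed. The following is only a sketch: the function name `flag_suspect_jobs` and the default threshold of 0.5 are made up for illustration, and the input is assumed to be pipe-delimited `JobID|State|Elapsed|TotalCPU` records such as sacct's parsable output. A low ratio can also mean the job really was idle, so a flagged job is a candidate for inspection, not proof of lost accounting.

```shell
#!/bin/bash

# Hypothetical helper: read "JobID|State|Elapsed|TotalCPU" records on stdin
# and print TIMEOUT jobs whose TotalCPU/Elapsed ratio is below a threshold.
# Argument 1: threshold (optional, defaults to 0.5, an arbitrary choice).
flag_suspect_jobs() {
    awk -F'|' -v thr="${1:-0.5}" '
    # Convert MM:SS.mmm, HH:MM:SS or D-HH:MM:SS to seconds.
    function secs(t,    d, p, n, i, s, days) {
        days = 0
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        n = split(t, p, ":")
        s = 0
        for (i = 1; i <= n; i++) s = s * 60 + p[i]
        return days * 86400 + s
    }
    # Only job-level TIMEOUT records are considered; step lines such as
    # "40.batch" carry a different state and are skipped.
    $2 == "TIMEOUT" {
        e = secs($3)
        ratio = (e > 0) ? secs($4) / e : 0
        if (ratio < thr)
            printf "%s %s cpu/elapsed=%.2f\n", $1, $2, ratio
    }'
}
```

A possible invocation, assuming sacct's `-n` (no header) and `-P` (parsable, pipe-delimited) options:

    sacct -n -P -o jobid,state,elapsed,totalcpu -j 40,41 | flag_suspect_jobs 0.5

With the values from the illustration, job 40 would be flagged and job 41 would not.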


Doing the same thing, but now with loops that terminate so the jobs
don't time out, we get:

416 (1) $ sbatch work-in-subprocess.sm
Submitted batch job 42
417 (1) $ sbatch work-in-shell.sm
Submitted batch job 43

[... after a couple of minutes ...]

418 (1) $ sacct -o jobid,state,elapsed,usercpu,systemcpu,totalcpu -j 42,43
       JobID      State    Elapsed    UserCPU  SystemCPU   TotalCPU
------------ ---------- ---------- ---------- ---------- ----------
42            COMPLETED   00:01:07  01:03.980  00:02.207  01:06.187
42.batch      COMPLETED   00:01:07  01:03.980  00:02.207  01:06.187
43            COMPLETED   00:01:08  01:05.230  00:02.173  01:07.403
43.batch      COMPLETED   00:01:08  01:05.230  00:02.173  01:07.403

i.e., time spent in subprocesses is reported.


--
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo


--
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo