[slurm-users] Fwd: sreport cluster UserUtilizationByaccount Used result versus sreport job SizesByAccount or sacct: inconsistencies

KK via slurm-users

Apr 15, 2024, 9:57:14 PM
to slurm...@lists.schedmd.com


---------- Forwarded message ---------
From: KK <daijian...@gmail.com>
Date: Mon, Apr 15, 2024, 13:25
Subject: sreport cluster UserUtilizationByaccount Used result versus sreport job SizesByAccount or sacct: inconsistencies
To: <slurm...@schedmd.com>


I want to determine the CPU core time used by users dj1 and dj individually. I have tested sreport cluster UserUtilizationByAccount, sreport job SizesByAccount, and sacct. It appears that sreport cluster UserUtilizationByAccount reports the total core time used by the entire account rather than the individual user's CPU time. Here are the specifics:

Users dj and dj1 are both under the account mehpc.

Between 2024-04-12 and 2024-04-15, dj1 used approximately 10 minutes of core time, while dj used about 4 minutes. However, "sreport Cluster UserUtilizationByAccount user=dj1 start=2024-04-12 end=2024-04-15" shows 14 minutes of usage. Similarly, "sreport Cluster UserUtilizationByAccount user=dj start=2024-04-12 end=2024-04-15" shows about 14 minutes.
By contrast, "sreport job SizesByAccount Users=dj1 start=2024-04-12 end=2024-04-15" or "sacct -u dj1 -S 2024-04-12 -E 2024-04-15 -o "jobid,partition,account,user,alloccpus,cputimeraw,state,workdir%60" -X |awk 'BEGIN{total=0}{total+=$6}END{print total}'" yields the correct per-user value, which is around 10 minutes for dj1. Here are the details:

[root@ood-master ~]# sacctmgr list assoc format=cluster,user,account,qos
   Cluster       User    Account                  QOS
---------- ---------- ---------- --------------------
     mehpc                  root               normal
     mehpc       root       root               normal
     mehpc                 mehpc               normal
     mehpc         dj      mehpc               normal
     mehpc        dj1      mehpc               normal


[root@ood-master ~]# sacct -X -u dj1 -S 2024-04-12 -E 2024-04-15 -o jobid,ncpus,elapsedraw,cputimeraw
JobID             NCPUS ElapsedRaw CPUTimeRAW
------------ ---------- ---------- ----------
4                     1         60         60
5                     2        120        240
6                     1         61         61
8                     2        120        240
9                     0          0          0

[root@ood-master ~]# sacct -X -u dj -S 2024-04-12 -E 2024-04-15 -o jobid,ncpus,elapsedraw,cputimeraw
JobID             NCPUS ElapsedRaw CPUTimeRAW
------------ ---------- ---------- ----------
7                     2        120        240
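
As a cross-check, the CPUTimeRAW values in the two sacct tables above can be totalled by hand. The sketch below hardcodes those values (the variable and function names are illustrative and not part of Slurm) and suggests that dj1 and dj together account for about 14 CPU minutes, which is the figure sreport cluster UserUtilizationByAccount prints for each user individually.

```shell
#!/bin/sh
# CPUTimeRAW values in seconds, copied from the two sacct tables above.
dj1_jobs="60 240 61 240 0"   # jobs 4, 5, 6, 8, 9
dj_jobs="240"                # job 7

# Sum a space-separated list of integers (illustrative helper).
sum_list() {
    total=0
    for v in $1; do
        total=$((total + v))
    done
    echo "$total"
}

dj1_secs=$(sum_list "$dj1_jobs")
dj_secs=$(sum_list "$dj_jobs")
account_secs=$((dj1_secs + dj_secs))

# Integer division truncates; the remainders here are only 0 or 1 second.
echo "dj1:   ${dj1_secs}s = $((dj1_secs / 60)) min"
echo "dj:    ${dj_secs}s = $((dj_secs / 60)) min"
echo "mehpc: ${account_secs}s = $((account_secs / 60)) min"
```

The per-user totals (601 s and 240 s) line up with the SizesByAccount and sacct results, while the combined 841 s lines up with the 14 minutes shown by UserUtilizationByAccount.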


[root@ood-master ~]# sreport job SizesByAccount Users=dj1 start=2024-04-12 end=2024-04-15
--------------------------------------------------------------------------------
Job Sizes 2024-04-12T00:00:00 - 2024-04-14T23:59:59 (259200 secs)
Time reported in Minutes
--------------------------------------------------------------------------------
  Cluster   Account     0-49 CPUs   50-249 CPUs  250-499 CPUs  500-999 CPUs  >= 1000 CPUs % of cluster
--------- --------- ------------- ------------- ------------- ------------- ------------- ------------
    mehpc      root            10             0             0             0             0      100.00%


[root@ood-master ~]# sreport job SizesByAccount Users=dj start=2024-04-12 end=2024-04-15
--------------------------------------------------------------------------------
Job Sizes 2024-04-12T00:00:00 - 2024-04-14T23:59:59 (259200 secs)
Time reported in Minutes
--------------------------------------------------------------------------------
  Cluster   Account     0-49 CPUs   50-249 CPUs  250-499 CPUs  500-999 CPUs  >= 1000 CPUs % of cluster
--------- --------- ------------- ------------- ------------- ------------- ------------- ------------
    mehpc      root             4             0             0             0             0      100.00%


[root@ood-master ~]# sreport Cluster UserUtilizationByAccount user=dj1 start=2024-04-12 end=2024-04-15
--------------------------------------------------------------------------------
Cluster/User/Account Utilization 2024-04-12T00:00:00 - 2024-04-14T23:59:59 (259200 secs)
Usage reported in CPU Minutes
--------------------------------------------------------------------------------
  Cluster     Login     Proper Name         Account     Used   Energy
--------- --------- --------------- --------------- -------- --------
    mehpc       dj1         dj1 dj1           mehpc       14        0



[root@ood-master ~]# sreport Cluster UserUtilizationByAccount user=dj start=2024-04-12 end=2024-04-15
--------------------------------------------------------------------------------
Cluster/User/Account Utilization 2024-04-12T00:00:00 - 2024-04-14T23:59:59 (259200 secs)
Usage reported in CPU Minutes
--------------------------------------------------------------------------------
  Cluster     Login     Proper Name         Account     Used   Energy
--------- --------- --------------- --------------- -------- --------
    mehpc        dj           dj dj           mehpc       14        0


[root@ood-master ~]# sacct -u dj1 -S 2024-04-12 -E 2024-04-15 -o "jobid,partition,account,user,alloccpus,cputimeraw,state,workdir%60" -X |awk 'BEGIN{total=0}{total+=$6}END{print total}'
601


[root@ood-master ~]# sacct -u dj -S 2024-04-12 -E 2024-04-15 -o "jobid,partition,account,user,alloccpus,cputimeraw,state,workdir%60" -X |awk 'BEGIN{total=0}{total+=$6}END{print total}'
240
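
The awk one-liners above also feed sacct's header and separator rows into the sum and pick the column by position, which could shift if a fixed-width field were empty. A slightly more defensive sketch, assuming sacct is run with its --noheader (-n) and --parsable2 (-P) options so that fields are pipe-delimited, is shown below; sum_cputime is a hypothetical helper name, not a Slurm command.

```shell
#!/bin/sh
# Sum the CPUTimeRAW column of pipe-delimited sacct output read from stdin.
# Expects two fields per line, JobID|CPUTimeRAW, with no header row.
sum_cputime() {
    awk -F'|' '
        $2 ~ /^[0-9]+$/ { secs += $2 }   # ignore blank or malformed rows
        END { printf "%d seconds (%.1f CPU minutes)\n", secs, secs / 60 }
    '
}

# Intended usage on a host with Slurm accounting (not run here):
#   sacct -n -P -X -u dj1 -S 2024-04-12 -E 2024-04-15 -o jobid,cputimeraw | sum_cputime
```

Requesting only jobid and cputimeraw keeps the field position fixed, so the sum does not depend on how wide or how empty the other columns are.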


[root@ood-master ~]# sreport cluster AccountUtilizationByUser accounts=mehpc start=2024-01-01 end=2024-04-15
--------------------------------------------------------------------------------
Cluster/Account/User Utilization 2024-01-01T00:00:00 - 2024-04-14T23:59:59 (9072000 secs)
Usage reported in CPU Minutes
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name     Used   Energy
--------- --------------- --------- --------------- -------- --------
    mehpc           mehpc                                 14        0
    mehpc           mehpc        dj           dj dj       14        0
    mehpc           mehpc       dj1         dj1 dj1       14        0