[slurm-dev] sreport not reporting gpu info, but sacct does


Tim Carlson

Oct 6, 2017, 3:13:57 PM
to slurm-dev
Background: I recently installed a new cluster, which I started on 14.03 but then upgraded to 17.02 to get better/more GRES/TRES information.

On my other clusters I use sreport heavily for billing. This new cluster is GPU based, and I want to bill off of GPU time consumed. My assumption was I could use something like:

 sreport cluster AccountUtilizationByUser -T cpu,gpu Start=2017-10-06 End=2017-10-08 

I think I have slurm.conf configured correctly:

# grep -i gres /etc/slurm/slurm.conf
AccountingStorageTRES=gres/gpu
GresTypes=gpu
NodeName=dl[01-25] Gres=gpu:2 Feature=ml01 Procs=16 State=UNKNOWN

And sacct seems to report the gres/tres utilization.

# sacct -X -u tim --format=jobid,elapsed,ReqTRES%30,ReqGRES --starttime=2017-10-06 | tail
314            00:05:27        cpu=1,node=1,gres/gpu=1        gpu:1
315            00:05:27        cpu=1,node=1,gres/gpu=1        gpu:1
316            00:05:27        cpu=1,node=1,gres/gpu=1        gpu:1
317            00:05:27        cpu=1,node=1,gres/gpu=1        gpu:1
318            00:05:27        cpu=1,node=1,gres/gpu=1        gpu:1
319            00:05:27        cpu=1,node=1,gres/gpu=1        gpu:1
320            00:05:27        cpu=1,node=1,gres/gpu=1        gpu:1
321            00:05:24        cpu=1,node=1,gres/gpu=1        gpu:1
322            00:05:24        cpu=1,node=1,gres/gpu=1        gpu:1
323            00:05:24        cpu=1,node=1,gres/gpu=1        gpu:1


# sreport cluster AccountUtilizationByUser -T gpu Start=2017-10-06 End=2017-10-08
--------------------------------------------------------------------------------
Cluster/Account/User Utilization 2017-10-06T00:00:00 - 2017-10-06T11:59:59 (43200 secs)
Use reported in TRES Minutes
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name      TRES Name     Used
--------- --------------- --------- --------------- -------------- --------

Yet if I ask for CPU time via the TRES field, I get what I want.

# sreport cluster AccountUtilizationByUser -T cpu,gpu Start=2017-10-06 End=2017-10-08
--------------------------------------------------------------------------------
Cluster/Account/User Utilization 2017-10-06T00:00:00 - 2017-10-06T11:59:59 (43200 secs)
Use reported in TRES Minutes
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name      TRES Name     Used
--------- --------------- --------- --------------- -------------- --------
 marianas            root                                      cpu     143
 marianas             ops                                      cpu      143
 marianas             ops       tim     Tim Carlson            cpu      143

Bottom line: what am I missing to get sreport to report GPU time?

Daniel Barker

Oct 6, 2017, 3:22:59 PM
to slurm-dev
Tim,
I believe you have to refer to the GPU as gres/gpu.

[root@slurm-login ~]# sreport -T CPU,mem,gres/gpu cluster AccountUtilizationByUser Start=2017-01-01 End=2017-12-31
--------------------------------------------------------------------------------
Cluster/Account/User Utilization 2017-01-01T00:00:00 - 2017-10-06T14:59:59 (24069600 secs)
Use reported in TRES Minutes
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name      TRES Name     Used 
--------- --------------- --------- --------------- -------------- -------- 
 deadpool            root                                      cpu      228 
 deadpool            root                                      mem   192008 
 deadpool            root                                 gres/gpu      259 
 deadpool        hpcstaff                                      cpu      228 
 deadpool        hpcstaff                                      mem   192008 
 deadpool        hpcstaff                                 gres/gpu      259 
 deadpool        hpcstaff  danbarke   Daniel Barker            cpu      198 
 deadpool        hpcstaff  danbarke   Daniel Barker            mem   160947 
 deadpool        hpcstaff  danbarke   Daniel Barker       gres/gpu      259 

-Dan
--
Dan Barker
ARC-TS

Tim Carlson

Oct 6, 2017, 3:30:41 PM
to slurm-dev
Perfect!  Thanks!

# sreport cluster AccountUtilizationByUser -T cpu,gres/gpu Start=2017-10-06 End=2017-10-08
--------------------------------------------------------------------------------
Cluster/Account/User Utilization 2017-10-06T00:00:00 - 2017-10-06T11:59:59 (43200 secs)
Use reported in TRES Minutes
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name      TRES Name     Used
--------- --------------- --------- --------------- -------------- --------
 marianas            root                                      cpu     143
 marianas            root                                 gres/gpu     143
 
 marianas             ops       tim     Tim Carlson            cpu      143
 marianas             ops       tim     Tim Carlson       gres/gpu      143

Merlin Hartley

Oct 9, 2017, 7:03:24 AM
to slurm-dev
That’s what I’ve been looking for too!

Though now I see that my configuration must be wrong. I am trying to make the use of a GPU cost the same as 160 CPUs, so I have this config:

<snip>
PartitionName=DEFAULT  DefaultTime=24:0:0 MaxTime=14-0:0:0 MaxNodes=4 TRESBillingWeights="CPU=1.0,Mem=0.25G,GRES/gpu=160.0"
NodeName=pascal[01-03] Sockets=2 CoresPerSocket=8  ThreadsPerCore=2 RealMemory=232000 Gres=gpu:pascal:4
PartitionName=pascal   Default=NO  State=UP Nodes=pascal[01-03] MaxNodes=1
<snip>

But sreport tells me
   mbu    <user info>            cpu   124828
   mbu    <user info>       gres/gpu    49369

For a user who exclusively uses GPU machines (4 GPUs and 16 CPUs per machine).
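One back-of-envelope check (my own arithmetic, not from the thread): the -T columns in sreport are raw TRES-minutes, not weighted by TRESBillingWeights, so for whole-node jobs on these machines one would expect roughly 4 CPU-minutes per GPU-minute (16 CPUs / 4 GPUs). Comparing that against the numbers above:

```python
# Assumptions: whole-node allocation, 16 CPUs and 4 GPUs per node
# as stated above; the raw TRES-minutes come from the sreport output.
cpus_per_node = 16
gpus_per_node = 4
expected_ratio = cpus_per_node / gpus_per_node   # 4.0

cpu_minutes = 124828
gpu_minutes = 49369
observed_ratio = cpu_minutes / gpu_minutes

print(f"expected {expected_ratio:.2f}, observed {observed_ratio:.2f}")
# Observed is about 2.53, i.e. roughly 2.5 CPUs per allocated GPU,
# which would suggest the jobs are not allocating whole nodes.
```

If that reading is right, the mismatch is in what the jobs request, not in the TRESBillingWeights line itself, but I may be missing something.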

Any idea what I’ve missed?

Thanks


Merlin

--
Merlin Hartley
Computer Officer
MRC Mitochondrial Biology Unit
Cambridge, CB2 0XY
United Kingdom

Bill Broadley

Oct 28, 2017, 12:01:19 AM
to slurm-dev


I noticed crazy high numbers in my reports, e.g. from sreport user top:

Top 10 Users 2017-10-20T00:00:00 - 2017-10-26T23:59:59 (604800 secs)
Use reported in Percentage of Total
--------------------------------------------------------------------------------
  Cluster     Login     Proper Name         Account        Used   Energy
--------- --------- --------------- --------------- ----------- --------
  MyClust   JoeUser        Joe User            jgrp    3710.15%    0.00%

This was during a period when JoeUser hadn't submitted a single job.

We have been through some Slurm upgrades, so I figured one of the schema tweaks had
confused things. I looked in the Slurm accounting database and found the
job_table. There were 80,000 jobs with no end_time that weren't actually running,
so I set end_time = start time for those 80,000 jobs. It didn't help the
reports.

I then tried deleting all 80,000 jobs from the job_table and that didn't help
either.

Is there a way to rebuild the accounting data from the information in the job_table?

Or any other suggestion for getting some sane numbers out?

Doug Meyer

Oct 28, 2017, 12:11:01 PM
to slurm-dev
Look up orphan jobs and lost.pl (quick script to find orphans) in https://groups.google.com/forum/#!forum/slurm-devel.

Battling this myself right now.

Thank you,
Doug

Douglas Jacobsen

Oct 28, 2017, 12:18:26 PM
to slurm-dev
Once you've got the end times fixed, you'll need to manually update the timestamps in the <cluster>_last_ran table to some time point before the start of the earliest job you fixed.  Then on the next hour mark, it'll start re-rolling up the past data to reflect the new reality you've set in the database.

Unfortunately I'm away from a keyboard right now so I'm not 100% certain of the table name.

Douglas Jacobsen

Oct 28, 2017, 12:35:30 PM
to slurm-dev
A more complete response would be something like:

MariaDB [slurm_acct_db]> select * from <cluster>_last_ran_table;
+---------------+--------------+----------------+
| hourly_rollup | daily_rollup | monthly_rollup |
+---------------+--------------+----------------+
|    1509206400 |   1509174000 |     1506841200 |
+---------------+--------------+----------------+
1 row in set (0.00 sec)

MariaDB [slurm_acct_db]> update <cluster>_last_ran_table set hourly_rollup=UNIX_TIMESTAMP('2017-01-01 00:00:00'),daily_rollup=UNIX_TIMESTAMP('2017-01-01 00:00:00'),monthly_rollup=UNIX_TIMESTAMP('2017-01-01 00:00:00');
Query OK, 1 row affected (0.05 sec)
Rows matched: 1  Changed: 1  Warnings: 0

MariaDB [alva_slurm_acct_db]> select * from <cluster>_last_ran_table;
+---------------+--------------+----------------+
| hourly_rollup | daily_rollup | monthly_rollup |
+---------------+--------------+----------------+
|    1483257600 |   1483257600 |     1483257600 |
+---------------+--------------+----------------+
1 row in set (0.01 sec)

MariaDB [slurm_acct_db]> quit

Adjust the timestamps and "<cluster>" as appropriate.

Obviously mucking with the database is dangerous, so be careful.
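One detail worth knowing before running those updates: MySQL's UNIX_TIMESTAMP() interprets the datetime in the server's time zone, not UTC. A quick sketch of computing the epoch values outside MySQL, using the same 2017-01-01 date as above:

```python
import calendar
import time

# Epoch seconds for 2017-01-01 00:00:00 interpreted as UTC.
# MySQL's UNIX_TIMESTAMP() would instead use the server's time zone.
epoch = calendar.timegm(time.strptime("2017-01-01 00:00:00",
                                      "%Y-%m-%d %H:%M:%S"))
print(epoch)  # 1483228800

# The value read back in the session above (1483257600) differs from
# UTC midnight by the server's UTC offset:
offset_hours = (1483257600 - 1483228800) / 3600
print(offset_hours)  # 8.0, i.e. the server presumably sits at UTC-8
```

So if you compute epoch values by hand, make sure you use the same time zone the database server does, or the rollup boundaries will be shifted by a few hours.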

----
Doug Jacobsen, Ph.D.
NERSC Computer Systems Engineer

------------- __o
---------- _ '\<,_
----------(_)/  (_)__________________________


Bill Broadley

Oct 30, 2017, 8:15:30 PM
to slurm-dev


The user account with the anomalously high usage hasn't run any jobs in the last
year (the period I'm interested in), so I deleted all jobs by that user.

So that includes all jobs with that id_user, id_group, and any of their
associations:

mysql> select user,partition,acct,id_assoc from <cluster>_assoc_table where user="JoeUser";
+---------+-----------+-------+----------+
| user    | partition | acct  | id_assoc |
+---------+-----------+-------+----------+
| JoeUser | high      | avgrp |       91 |
| JoeUser | lo        | avgrp |       89 |
| JoeUser | low       | avgrp |      271 |
| JoeUser | med       | avgrp |       90 |
+---------+-----------+-------+----------+

I also found some jobs with id_user=0 or id_group=0; we don't run jobs as root,
so I nuked those as well.

Then I set the last_ran table to Jan 1st 2015:

update <cluster>_last_ran_table set
  hourly_rollup=UNIX_TIMESTAMP('2015-01-01 00:00:00'),
  daily_rollup=UNIX_TIMESTAMP('2015-01-01 00:00:00'),
  monthly_rollup=UNIX_TIMESTAMP('2015-01-01 00:00:00');

Nothing happened, so I restarted the slurmdbd daemon, and it ran at 100% for an
hour or so rebuilding the tables.

Unfortunately sreport still shows super high numbers for root and the user in
question, even for time periods in the last year.

ro...@nas-7-0.MyCluster:~# sreport cluster AccountUtilizationByUser Start=2017-01-01 End=2018-01-01 -t percent
Cluster/Account/User Utilization 2017-01-01T00:00:00 - 2017-10-30T16:59:59 (26150400 secs)
Use reported in Percentage of Total
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name          Used   Energy
--------- --------------- --------- --------------- ------------- --------
MyCluster            root                                 3762.30%    0.00%
MyCluster            root      root            root        0.00%    0.00%
MyCluster           avgrp                                 3643.77%    0.00%
MyCluster           avgrp   JoeUser        Joe User       3388.96%    0.00%
MyCluster           avgrp   JoeUser        Joe User        254.76%    0.00%

Any idea what to look for? Or any other way to rebuild the accounting data for
the last year?

I ran lost.pl and found nothing there either.