The user account with the anomalously high usage hasn't run any jobs in the last
year (the period I'm interested in), so I deleted all jobs by that user.
That includes all jobs with that id_user, that id_group, or any of the user's
associations:
mysql> select user,partition,acct,id_assoc from <cluster>_assoc_table where user="JoeUser";
+---------+-----------+-----------+----------+
| user    | partition | acct      | id_assoc |
+---------+-----------+-----------+----------+
| JoeUser | high      | avgrp     |       91 |
| JoeUser | lo        | avgrp     |       89 |
| JoeUser | low       | avgrp     |      271 |
| JoeUser | med       | avgrp     |       90 |
+---------+-----------+-----------+----------+
I also found some jobs with id_user=0 or id_group=0. We don't run jobs as root,
so I deleted those as well.
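For anyone following along, the deletion logic described above amounts to
roughly the following. This is only a toy sketch: it uses an in-memory SQLite
database as a stand-in for the real MySQL tables, the table names, uid/gid
values and job rows are all made up, and combining the conditions with OR is
one reading of the description, not the exact statements that were run.

```python
import sqlite3

# Toy stand-in for the accounting DB: in-memory SQLite instead of MySQL.
# Table names and all rows are hypothetical; only the column names
# id_user, id_group, id_assoc and user come from the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assoc_table (id_assoc INTEGER, user TEXT);
    CREATE TABLE job_table (job_id INTEGER, id_user INTEGER,
                            id_group INTEGER, id_assoc INTEGER);
    INSERT INTO assoc_table VALUES (89, 'JoeUser'), (90, 'JoeUser'),
                                   (91, 'JoeUser'), (271, 'JoeUser'),
                                   (300, 'OtherUser');
    INSERT INTO job_table VALUES
        (1, 1234, 500, 91),   -- hypothetical uid/gid for JoeUser
        (2, 1234, 500, 271),
        (3, 0,    0,   300),  -- recorded as root
        (4, 5678, 600, 300);  -- unrelated user, must survive
""")

JOE_UID = 1234  # hypothetical numeric uid for JoeUser
JOE_GID = 500   # hypothetical numeric gid for JoeUser


def delete_suspect_jobs(conn, uid, gid, user):
    """Delete jobs matching the uid or gid, tied to the user's
    associations, or recorded with id_user=0 / id_group=0."""
    cur = conn.execute(
        """
        DELETE FROM job_table
        WHERE id_user = ? OR id_group = ?
           OR id_assoc IN (SELECT id_assoc FROM assoc_table WHERE user = ?)
           OR id_user = 0 OR id_group = 0
        """,
        (uid, gid, user),
    )
    conn.commit()
    return cur.rowcount  # number of rows removed


deleted = delete_suspect_jobs(conn, JOE_UID, JOE_GID, "JoeUser")
remaining = [row[0] for row in conn.execute("SELECT job_id FROM job_table")]
print(deleted, remaining)
```

Note that matching on id_group alone would also remove jobs from other members
of the same group, so on a real database it is worth running the same WHERE
clause as a SELECT first.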
Then I reset the rollup timestamps in the last_ran table to January 1, 2015:
update <cluster>_last_ran_table set
  hourly_rollup=UNIX_TIMESTAMP('2015-01-01 00:00:00'),
  daily_rollup=UNIX_TIMESTAMP('2015-01-01 00:00:00'),
  monthly_rollup=UNIX_TIMESTAMP('2015-01-01 00:00:00');
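As a cross-check on what that statement stores: MySQL's UNIX_TIMESTAMP()
interprets the string in the session time zone, so the resulting epoch value
depends on the server settings. The sketch below computes the value for UTC and
for a UTC-8 offset; the helper name and the UTC-8 example are assumptions for
illustration, since the post does not say which time zone the server uses.

```python
from datetime import datetime, timedelta, timezone


def unix_timestamp(text, tz=timezone.utc):
    """Epoch seconds for a 'YYYY-MM-DD HH:MM:SS' string interpreted in tz."""
    naive = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return int(naive.replace(tzinfo=tz).timestamp())


# Value stored if the session time zone is UTC.
reset_utc = unix_timestamp("2015-01-01 00:00:00")

# Value stored if the session time zone were UTC-8 (hypothetical example).
reset_minus8 = unix_timestamp("2015-01-01 00:00:00",
                              timezone(timedelta(hours=-8)))

print(reset_utc)     # 1420070400
print(reset_minus8)  # 1420099200
```

Comparing one of these against a plain select on the last_ran table is a quick
way to confirm the update actually landed.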
Nothing happened, so I restarted the slurmdbd daemon, and it ran at 100% CPU for
an hour or so rebuilding the tables.
Unfortunately sreport still shows super high numbers for root and the user in
question, even for time periods in the last year.
ro...@nas-7-0.MyCluster:~# sreport cluster AccountUtilizationByUser Start=2017-01-01 End=2018-01-01 -t percent
Cluster/Account/User Utilization 2017-01-01T00:00:00 - 2017-10-30T16:59:59 (26150400 secs)
Use reported in Percentage of Total
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name          Used   Energy
--------- --------------- --------- --------------- ------------- --------
MyCluster            root                                3762.30%    0.00%
MyCluster            root      root            root         0.00%    0.00%
MyCluster           avgrp                                3643.77%    0.00%
MyCluster           avgrp   JoeUser        Joe User      3388.96%    0.00%
MyCluster           avgrp   JoeUser        Joe User       254.76%    0.00%
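A rough sanity check on these numbers, using only figures from the output
above. It assumes that 100% in the percent view corresponds to the cluster's
total capacity over the reporting window, and that the 3600-second difference
between the wall-clock window and the reported length comes from a
daylight-saving shift; both are assumptions, and the variable names are just
for illustration.

```python
from datetime import datetime

# Figures copied from the sreport output.
start = datetime(2017, 1, 1, 0, 0, 0)
end = datetime(2017, 10, 30, 16, 59, 59)
reported_secs = 26150400

# Wall-clock length of the window, counting the final second as inclusive.
naive_secs = int((end - start).total_seconds()) + 1
dst_gap = naive_secs - reported_secs  # 3600 would match one DST shift

# Percentages from the "Used" column.
joe_rows = [3388.96, 254.76]
avgrp_total = 3643.77
root_total = 3762.30

joe_sum = round(sum(joe_rows), 2)                    # both JoeUser rows
unattributed = round(avgrp_total - joe_sum, 2)       # rest of avgrp
other_accounts = round(root_total - avgrp_total, 2)  # everything outside avgrp
capacity_multiple = round(root_total / 100, 2)       # multiples of capacity

print(naive_secs, dst_gap)
print(joe_sum, unattributed, other_accounts, capacity_multiple)
```

Under those assumptions the two JoeUser rows account for nearly all of avgrp,
and the root total is about 37.6 times what the cluster could deliver in the
window, so the rollup data is still carrying usage that cannot be real.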
Any idea what to look for? Or is there another way to rebuild the accounting
data for the last year?
I also ran lost.pl and found nothing there either.