purging deleted jobs

Eric Wajnberg

Oct 6, 2021, 3:03:59 AM
to diracgrid-forum

Hi there. I have a question regarding deleted/killed jobs.

I'm deleting a whole bunch of jobs (with the script dirac-wms-job-delete), but on the DIRAC web interface I can still see them, as either "Killed" or "Deleted", and they sometimes remain there for days. How is this possible? The point is that they accumulate there (I now have several thousand), which sometimes causes a time-out error when I launch the script dirac-wms-job-delete.

Can this be solved?

Thanks for any help on that.

Eric.

Federico Stagni

Oct 6, 2021, 5:59:47 AM
to Eric Wajnberg, diracgrid-forum
Hi Eric,
the script you mention will indeed not "physically" remove the jobs, but rather mark them as "Killed" or "Deleted". The final removal will be done server side by the JobCleaningAgent.

Cheers,
Federico

Eric Wajnberg

Oct 6, 2021, 6:17:27 AM
to diracgrid-forum
Thanks Federico,

Yes, I knew that (sorry if I did not explain it clearly). My point was rather that killed or deleted jobs take ages to disappear. By my own estimate, about 1500 jobs disappear per hour. I thus sometimes have to wait days before all jobs have been purged, and this sometimes causes problems, such as timeouts.
My question was whether this can be solved.

Cheers, Eric.

Federico Stagni

Oct 6, 2021, 6:29:23 AM
to Eric Wajnberg, diracgrid-forum
Hi Eric,
which DIRAC version are you running? Can you get a log of the running agent?

Cheers,
Federico

Eric Wajnberg

Oct 6, 2021, 7:21:00 AM
to diracgrid-forum
Grazie Federico,

DIRAC version: v6r22p13 (but why would changing the version help?).

Log of running agent: What information would you like me to provide?

Right now I have about 17,000 jobs waiting to disappear, so I estimate this will happen within the next 10 h.

Cheers, Eric.

Federico Stagni

Oct 6, 2021, 7:56:32 AM
to Eric Wajnberg, diracgrid-forum
The DIRAC version that you are using is really old and no longer supported; I would strongly suggest upgrading it. The JobCleaningAgent was largely reworked in later versions.

Cheers,
Federico

Eric Wajnberg

Oct 6, 2021, 8:56:17 AM
to diracgrid-forum
Ok, will do that. Thanks!

Daniela Bauer

Oct 6, 2021, 9:38:20 AM
to Eric Wajnberg, diracgrid-forum
Hi Eric,

You can tune the settings for this Agent in Systems/WorkloadManagement/Agents/JobCleaningAgent

You probably want to limit the MaxJobsAtOnce to something sensible (ours is 5000) to avoid overloading the system.
Our agent runs 4x per hour instead of the default once. We also remove jobs in Any state after a while, so the system doesn't get clogged up with jobs that will never run.
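
To get a rough feel for what these two settings mean for a backlog, here is a small back-of-the-envelope sketch in Python. It is not DIRAC code: the function hours_to_drain is a made-up helper, and it assumes the best case where every agent cycle really removes a full MaxJobsAtOnce batch, which a real server may not manage. The numbers are the ones quoted in this thread.

```python
import math


def hours_to_drain(backlog: int, max_jobs_at_once: int, runs_per_hour: float) -> float:
    """Best-case hours needed to purge `backlog` jobs.

    Hypothetical helper, not part of DIRAC. Assumes each agent cycle
    removes a full batch of `max_jobs_at_once` jobs.
    """
    if backlog < 0 or max_jobs_at_once <= 0 or runs_per_hour <= 0:
        raise ValueError("backlog must be >= 0; batch size and run rate must be > 0")
    cycles = math.ceil(backlog / max_jobs_at_once)  # whole cycles needed
    return cycles / runs_per_hour


backlog = 17_000  # jobs waiting to be purged, as reported above

# Rate observed by Eric: about 1500 jobs purged per hour.
observed_hours = backlog / 1_500

# Settings quoted above: MaxJobsAtOnce = 5000, agent run 4 times per hour.
tuned_hours = hours_to_drain(backlog, max_jobs_at_once=5_000, runs_per_hour=4)

print(f"at the observed rate: about {observed_hours:.1f} h")
print(f"with the tuned settings (best case): about {tuned_hours:.1f} h")
```

Treat the second figure as a lower bound only: if a cycle takes longer than the interval between runs, or the database is the bottleneck, the real purge will be slower.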

Regards,
Daniela



--
Sent from my guinea pig enhanced living room

-----------------------------------------------------------
daniel...@imperial.ac.uk
HEP Group/Physics Dep
Imperial College
London, SW7 2BW
Tel: Working from home, please use email.
http://www.hep.ph.ic.ac.uk/~dbauer/