Jobs keep being rescheduled


Ar Kr

Jan 11, 2022, 7:15:04 AM
to diracgrid-forum
Hi,
I'm helping a user run jobs on the biomed VO. I'm using DIRAC version 453f4f52e (2021-07-11 21:23:06 +0200), installed by our admin.

My problem is that the jobs (here a simple "hello world") are repeatedly rescheduled and never executed.

$ dirac-wms-job-logging-info 118108918
Source                Status    MinorStatus             ApplicationStatus              DateTime
============================================================================================================
JobManager            Received  Job accepted            Unknown                        2022-01-06 09:38:50
JobPath               Checking  JobSanity               Unknown                        2022-01-06 09:38:50
JobSanity             Checking  JobScheduling           Unknown                        2022-01-06 09:38:50
JobScheduling         Waiting   Pilot Agent Submission  Unknown                        2022-01-06 09:38:50
Matcher               Matched   Assigned                Unknown                        2022-01-06 09:41:25
JobA...@EGI.UKIM.uk  Matched   Job Received by Agent   Unknown                        2022-01-06 09:41:25
JobA...@EGI.UKIM.uk  Matched   Submitting To CE        Unknown                        2022-01-06 09:41:28
JobPath               Checking  JobSanity               Unknown                        2022-01-06 11:44:01
JobSanity             Checking  JobScheduling           Unknown                        2022-01-06 11:44:01
JobScheduling         Checking  JobScheduling           On Hold: after rescheduling 1  2022-01-06 11:44:01
JobScheduling         Waiting   Pilot Agent Submission  Unknown                        2022-01-06 11:47:11
Matcher               Matched   Assigned                Unknown                        2022-01-06 11:51:48
JobA...@EGI.CPPM.fr  Matched   Job Received by Agent   Unknown                        2022-01-06 11:51:48
JobA...@EGI.CPPM.fr  Matched   Submitting To CE        Unknown                        2022-01-06 11:51:49
JobPath               Checking  JobSanity               Unknown                        2022-01-06 13:54:01
JobSanity             Checking  JobScheduling           Unknown                        2022-01-06 13:54:01
JobScheduling         Checking  JobScheduling           On Hold: after rescheduling 2  2022-01-06 13:54:01
JobScheduling         Waiting   Pilot Agent Submission  Unknown                        2022-01-06 13:59:11
Matcher               Matched   Assigned                Unknown                        2022-01-06 14:02:21
JobA...@EGI.UKIR.uk  Matched   Job Received by Agent   Unknown                        2022-01-06 14:02:21
JobA...@EGI.UKIR.uk  Matched   Submitting To CE        Unknown                        2022-01-06 14:02:27
JobPath               Checking  JobSanity               Unknown                        2022-01-06 16:04:01
JobSanity             Checking  JobScheduling           Unknown                        2022-01-06 16:04:02
JobScheduling         Checking  JobScheduling           On Hold: after rescheduling 3  2022-01-06 16:04:02
JobScheduling         Waiting   Pilot Agent Submission  Unknown                        2022-01-06 16:14:11
Matcher               Matched   Assigned                Unknown                        2022-01-06 16:18:02
JobA...@EGI.CPPM.fr  Matched   Job Received by Agent   Unknown                        2022-01-06 16:18:02
JobA...@EGI.CPPM.fr  Matched   Submitting To CE        Unknown                        2022-01-06 16:18:06
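
One thing the log above does allow from the user side is measuring how long the job sits in "Submitting To CE" before it goes back to JobPath. The sketch below is only an illustration: the rows are copied from the output above, and the helper names (parse_timestamp, stall_durations) are made up for this example, not part of DIRAC. It prints the gap per site, which comes out at roughly two hours each time across three different sites. That pattern might point to a fixed server-side timeout rather than a site-specific failure, but this is a guess that only the server logs could confirm.

```python
from datetime import datetime

# Rows copied from the dirac-wms-job-logging-info output above
# (only the "Submitting To CE" rows and the JobPath rows that follow them).
LOG = """\
JobA...@EGI.UKIM.uk  Matched   Submitting To CE        Unknown                        2022-01-06 09:41:28
JobPath               Checking  JobSanity               Unknown                        2022-01-06 11:44:01
JobA...@EGI.CPPM.fr  Matched   Submitting To CE        Unknown                        2022-01-06 11:51:49
JobPath               Checking  JobSanity               Unknown                        2022-01-06 13:54:01
JobA...@EGI.UKIR.uk  Matched   Submitting To CE        Unknown                        2022-01-06 14:02:27
JobPath               Checking  JobSanity               Unknown                        2022-01-06 16:04:01
"""


def parse_timestamp(line):
    # The timestamp is the last two whitespace-separated fields of a row.
    date, time = line.split()[-2:]
    return datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")


def stall_durations(log_text):
    """Return (site, timedelta) for each 'Submitting To CE' -> next JobPath gap."""
    gaps = []
    pending = None  # (site, time) of the most recent 'Submitting To CE' row
    for line in log_text.splitlines():
        if not line.strip():
            continue
        source = line.split()[0]
        if "Submitting To CE" in line:
            pending = (source.split("@")[-1], parse_timestamp(line))
        elif source == "JobPath" and pending is not None:
            site, submitted = pending
            gaps.append((site, parse_timestamp(line) - submitted))
            pending = None
    return gaps


for site, gap in stall_durations(LOG):
    print(site, gap)
```

To check a different job, paste the full output of dirac-wms-job-logging-info into LOG; rows that are neither "Submitting To CE" nor JobPath entries are ignored.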


Do you have any idea how to investigate this problem?
Thanks,
A.K.

Daniela Bauer

Jan 11, 2022, 9:32:37 AM
to Ar Kr, diracgrid-forum
Hi AK,

I realise this is probably not what you want to hear, but unless you are using a production version of DIRAC all bets are off.
I don't know why you are attempting to use a random commit, but as a first measure I would ask your admin to move the server (and pilot version) to a production version.
Otherwise, without access to the server logs, I don't think there is anything you can do from a user perspective.
Checking that there are no formatting issues and/or impossible conditions in the JDL might help, but as you say it's a hello world job, that seems unlikely.

Regards,
Daniela



--
Sent from my guinea pig enhanced living room

-----------------------------------------------------------
daniel...@imperial.ac.uk
HEP Group/Physics Dep
Imperial College
London, SW7 2BW
Tel: Working from home, please use email.
http://www.hep.ph.ic.ac.uk/~dbauer/