Dear all,
I am writing to ask you a question.
Is it possible to retrieve the status of cleared jobs (i.e. jobs that have completed, whether successfully or with a failure) from the Slurm REST interface?
Once a job (job ID 131 in the example below) has been cleared, which happens some time after completion, the REST interface returns this:
{"meta":{"plugin":{"type":"openapi/v0.0.36","name":"REST v0.0.36"},"Slurm":{"version":{"major":20,"micro":7,"minor":11},"release":"20.11.7"}},"errors":[{"error":"_handle_job_get: unknown job 131","error_code":0}],"jobs":[]}
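To make the symptom concrete, here is a minimal Python sketch of how a client could recognise this reply. The function name `job_is_unknown` is made up for illustration, and matching on the error text is an assumption based only on the reply above (the `error_code` is 0, so it cannot be used to tell the cases apart).

```python
import json

# Reply body copied verbatim from the REST interface output above.
RESPONSE = (
    '{"meta":{"plugin":{"type":"openapi/v0.0.36","name":"REST v0.0.36"},'
    '"Slurm":{"version":{"major":20,"micro":7,"minor":11},"release":"20.11.7"}},'
    '"errors":[{"error":"_handle_job_get: unknown job 131","error_code":0}],'
    '"jobs":[]}'
)


def job_is_unknown(body: str, job_id: int) -> bool:
    """Return True if the reply says the controller no longer knows job_id.

    Hypothetical helper: it relies on the error text seen above, which is
    not a documented contract and may differ in other Slurm versions.
    """
    reply = json.loads(body)
    needle = f"unknown job {job_id}"
    # endswith() avoids matching job 13 against "unknown job 131".
    has_error = any(
        str(err.get("error", "")).endswith(needle)
        for err in reply.get("errors", [])
    )
    # An empty "jobs" list plus the error means the job has been cleared.
    return not reply.get("jobs") and has_error


print(job_is_unknown(RESPONSE, 131))  # prints True
```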
I have enabled job accounting storage in MySQL, and sacct does report the job:
sacct -j 131
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
131          testjob.sh     cirasa                     2  COMPLETED      0:0
131.batch         batch                                2  COMPLETED      0:0
131.0          hostname                                2  COMPLETED      0:0
131.1             sleep                                2  COMPLETED      0:0
but the REST service does not seem to pick up the status from it.
Do you have any hints?
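In the meantime, one workaround I can think of is to fall back to sacct whenever the REST interface no longer knows the job. Below is a rough Python sketch under that assumption. The helper names `parse_sacct` and `sacct_status` are hypothetical, and `SAMPLE` is hand-written from the table above rather than captured output. It uses the sacct options `-j`, `--parsable2`, `--noheader` and `--format`.

```python
import subprocess


def parse_sacct(text: str) -> dict:
    """Map JobID -> (State, ExitCode) from pipe-delimited sacct output."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        job_id, state, exit_code = line.split("|")
        result[job_id] = (state, exit_code)
    return result


def sacct_status(job_id: int) -> dict:
    """Query the accounting database via sacct (needs sacct on PATH)."""
    out = subprocess.run(
        ["sacct", "-j", str(job_id), "--parsable2", "--noheader",
         "--format=JobID,State,ExitCode"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_sacct(out)


# Hand-written sample in --parsable2 layout, based on the table above.
SAMPLE = (
    "131|COMPLETED|0:0\n"
    "131.batch|COMPLETED|0:0\n"
    "131.0|COMPLETED|0:0\n"
    "131.1|COMPLETED|0:0\n"
)

print(parse_sacct(SAMPLE)["131"])  # prints ('COMPLETED', '0:0')
```

This only helps when the client runs on a host that can call sacct, which is exactly what I was hoping to avoid by using the REST interface.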
Two further questions, to understand this better:
- For how many seconds does a completed job remain available to squeue or the REST API methods? Can this "time-to-live-before-cleanup" be configured, and possibly increased a bit? This would be useful to avoid polling the status very frequently.
- Is there a push mechanism to send job status updates to external web services, rather than polling for them with the REST API methods?
Thanks very much for your help,
Cheers,
Simone
PS: I am using Slurm v20.11.7 on CentOS 7.