DIR fails to cancel job on SD

Andrei Brezan

May 19, 2020, 4:23:11 AM
to bareos-users
Hi,

I'm running Bareos 18.2 and have a failing backup job that eventually hogs all the job slots on the SD. This is what I see in the job log:

19-May 06:47 bareos-sd JobId 3693: Releasing device "AWS_S3_1-01" (AWS S3 Storage).
19-May 06:47 bareos-dir-alxp JobId 3693: Fatal error: Director's comm line to SD dropped.

After that, AWS_S3_1-01 remains locked by this job. If more jobs fail the same way, all the slots on the SD eventually end up occupied by failed jobs that are no longer running.

Is there anything that we can do to mitigate this or investigate further?

Thanks,
Andrei

Andrei Brezan

May 19, 2020, 5:35:05 AM
to bareos-users
I tried to cancel the job on the storage:
*cancel storage=S3_Object_aws jobid=3693
3000 JobId=3693 Job="client1.2020-05-19_06.34.03_30" marked to be canceled.

But the job is still listed on the SD as:
Writing: Full Backup job client1 JobId=3693 Volume="AWS-Yearly-1768"
    pool="AWS-Yearly" device="AWS_S3_1-01" (AWS S3 Storage)
    spooling=0 despooling=0 despool_wait=0
    Files=1 Bytes=1,334 AveBytes/sec=0 LastBytes/sec=0
    FDReadSeqNo=29 in_msg=20 out_msg=6 fd=367
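
To spot entries like this before the SD runs out of slots, the status text could be scanned for writing jobs that report no throughput. Below is a minimal Python sketch of that idea, not an official Bareos tool. The function name `find_stalled_jobs` is hypothetical, the sample is the output quoted above, and the assumption is that the text comes from something like `status storage=S3_Object_aws` in the console. A zero rate is only a hint: a job that has just started can look the same.

```python
import re

# Sample copied from the SD status output quoted above.
SAMPLE = """\
Writing: Full Backup job client1 JobId=3693 Volume="AWS-Yearly-1768"
    pool="AWS-Yearly" device="AWS_S3_1-01" (AWS S3 Storage)
    spooling=0 despooling=0 despool_wait=0
    Files=1 Bytes=1,334 AveBytes/sec=0 LastBytes/sec=0
    FDReadSeqNo=29 in_msg=20 out_msg=6 fd=367
"""

RATE_RE = re.compile(r"AveBytes/sec=([\d,]+) LastBytes/sec=([\d,]+)")


def find_stalled_jobs(status_text):
    """Return (jobid, device) pairs for writing jobs whose average
    and last transfer rates are both zero (hypothetical helper)."""
    stalled = []
    current = None  # entry being parsed, or None when outside one
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("Writing:"):
            # Assumption: every running-job entry starts like this.
            m = re.search(r"JobId=(\d+)", line)
            current = {"jobid": int(m.group(1))} if m else None
        elif current is not None:
            dev = re.search(r'device="([^"]+)"', line)
            if dev:
                current["device"] = dev.group(1)
            rate = RATE_RE.search(line)
            if rate:
                ave = int(rate.group(1).replace(",", ""))
                last = int(rate.group(2).replace(",", ""))
                if ave == 0 and last == 0:
                    stalled.append((current["jobid"], current.get("device")))
                current = None  # the rate line ends what we need
    return stalled


if __name__ == "__main__":
    for jobid, device in find_stalled_jobs(SAMPLE):
        print(f"JobId {jobid} on {device} shows no throughput")
```

Anything it reports would still need a manual check against the Director's job list before cancelling.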

--
Andrei