Larger Backups FD crashing


Sebastian Scherer

Aug 27, 2024, 5:18:51 PM
to bareos-users
Hi,
On larger backups, the file daemon crashes fairly reliably on several clients in my setup, typically after 4-5 hours. Smaller backups complete successfully. I added a heartbeat because I initially suspected a dropped connection, but it seems the file daemon simply crashes after some time.

The attached PDF shows the error the file daemon outputs before crashing. I put it in a PDF file because I couldn't get the table formatting to look correct. Unfortunately I don't have more detailed logs, but I will try to capture them next time. Bareos is version 23.0.3 (which seems to be the latest version).

Thank you


Sebastian




Logs.pdf

Sebastian Scherer

Aug 27, 2024, 5:39:08 PM
to bareos-users
I have found some more information on two of the computers where it is crashing, but it does not seem very useful to me. I am running in Docker:

computer 1:

Attempt to dump current JCRs. njcrs=1
threadid=0x000079abfde00640 killable=0 JobId=480 JobStatus=R jcr=0x79abf8000d50 name=airlab-share-01-backup.2024-08-12_21.55.20_20
UseCount=1
JobType=B JobLevel=F
sched_time=12-Aug-2024 21:55 start_time=12-Aug-2024 21:55
end_time=01-Jan-1970 00:00 wait_time=01-Jan-1970 00:00
db=(nil) db_batch=(nil) batch_started=0
dumping of jcrs finished. number of dumped = 1
Attempt to dump current JCRs. njcrs=1
threadid=0x00007edc01a00640 killable=0 JobId=486 JobStatus=R jcr=0x7edbfc000d50 name=airlab-share-01-backup.2024-08-13_02.00.00_19
UseCount=1
JobType=B JobLevel=F
sched_time=13-Aug-2024 02:34 start_time=13-Aug-2024 02:34
end_time=01-Jan-1970 00:00 wait_time=01-Jan-1970 00:00
db=(nil) db_batch=(nil) batch_started=0
dumping of jcrs finished. number of dumped = 1

computer 2:

Attempt to dump current JCRs. njcrs=1
threadid=0x00007f8dc63a8640 killable=0 JobId=583 JobStatus=R jcr=0x7f8dc00057e0 name=synology-nas-backup-set-1.2024-08-25_02.00.00_51
UseCount=1
JobType=B JobLevel=D
sched_time=25-Aug-2024 02:00 start_time=25-Aug-2024 02:00
end_time=01-Jan-1970 00:00 wait_time=01-Jan-1970 00:00
db=(nil) db_batch=(nil) batch_started=0
dumping of jcrs finished. number of dumped = 1
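
For anyone comparing dumps like these across clients, here is a minimal Python sketch that turns the JCR dump text into one dictionary per job. It is only an illustration based on the output pasted above: the function name `parse_jcr_dump` is hypothetical, and the key=value layout is assumed from these lines, not from any Bareos specification.

```python
import re

# Matches key=value tokens such as JobId=480 or sched_time=12-Aug-2024 21:55.
# A value runs until the next " key=" token or the end of the line.
# (Layout assumed from the dump above, not from Bareos documentation.)
PAIR = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_jcr_dump(text):
    """Return one dict per JCR found in a file daemon JCR dump (hypothetical helper)."""
    jobs = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("threadid="):
            # A threadid line starts the record for a new job.
            current = {}
            jobs.append(current)
        elif line.startswith(("Attempt to dump", "dumping of jcrs")):
            # Header and footer lines do not belong to any job record.
            current = None
            continue
        if current is not None:
            for key, value in PAIR.findall(line):
                current[key] = value
    return jobs


sample = """\
Attempt to dump current JCRs. njcrs=1
threadid=0x00007f8dc63a8640 killable=0 JobId=583 JobStatus=R jcr=0x7f8dc00057e0 name=synology-nas-backup-set-1.2024-08-25_02.00.00_51
UseCount=1
JobType=B JobLevel=D
sched_time=25-Aug-2024 02:00 start_time=25-Aug-2024 02:00
end_time=01-Jan-1970 00:00 wait_time=01-Jan-1970 00:00
db=(nil) db_batch=(nil) batch_started=0
dumping of jcrs finished. number of dumped = 1
"""

for job in parse_jcr_dump(sample):
    print(job["JobId"], job["JobLevel"], job["start_time"])
```

With the fields in dictionaries, it is easy to line up JobId, JobLevel, and start_time from several crashes side by side.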

Sebastian Sura

Aug 28, 2024, 12:38:42 AM
to bareos...@googlegroups.com

Hi Sebastian,

Would it be possible for you to install gdb and our debuginfo packages? If you do, the FD will create a very helpful traceback file
(not to be confused with the bactrace file) if it crashes.

Kind Regards
Sebastian Sura

On 27.08.24 at 23:39, Sebastian Scherer wrote:
--
You received this message because you are subscribed to the Google Groups "bareos-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bareos-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bareos-users/a794dc08-b77b-4fac-8b02-51562268d33bn%40googlegroups.com.
-- 
 Sebastian Sura                  sebasti...@bareos.com
 Bareos GmbH & Co. KG            Phone: +49 221 630693-0
 https://www.bareos.com
 Registered office: Köln | District court (Amtsgericht) Köln: HRA 29646
 General partner: Bareos Verwaltungs-GmbH
 Managing directors: Stephan Dühr, Jörg Steffens, Philipp Storz

Sebastian Scherer

Aug 28, 2024, 4:59:51 PM
to Sebastian Sura, bareos...@googlegroups.com
Hi,
I was able to get the debugging set up and will report back when it crashes.
Thank you

Sebastian


Sebastian Scherer

Sep 7, 2024, 6:27:24 PM
to Sebastian Sura, bareos...@googlegroups.com
Here are the backtrace files
airlab-storage.1.bactrace
bareos-fd.9102.state
bareos.1.traceback

Sebastian Scherer

Sep 8, 2024, 4:30:58 PM
to bareos-users
Here are traceback files from a second crash.
second-crash.1.traceback
second-crash.1.bactrace

Sebastian Sura

Sep 9, 2024, 2:31:17 AM
to bareos...@googlegroups.com

Hi Sebastian

Thanks for the tracebacks! The issue you are experiencing is very weird. Alas, I am not able to reproduce it.
Could you check whether this crash still occurs with the next Bareos version (i.e. https://download.bareos.org/next/)?

Kind Regards,

Sebastian Sura


Sebastian Scherer

Sep 9, 2024, 10:30:18 AM
to Sebastian Sura, bareos...@googlegroups.com