Problems with Restore functionality


Fredrik Jansson

Mar 13, 2023, 7:40:28 AM
to bacularis
Hi,
I have problems with the Restore functionality using the Web GUI.

When I restore a limited number of files it works fine.

But when I try to restore a large number of files I get this error in my web browser.

Gateway Timeout
The gateway did not receive a timely response from the upstream server or application.

If I try doing the same restore using the bconsole it works fine.

Any ideas?

BR
Fredrik

Marcin Haba

Mar 13, 2023, 11:51:28 PM
to Fredrik Jansson, bacularis
Hello Fredrik,

Welcome to the Bacularis user list.

Could you tell us at which wizard step you experience the timeout?

In Bacula we have two ways of doing restore:

 1) traditional restore using bconsole restore command
 2) Bacula Bvfs restore using Bvfs cache

Bacularis uses the second method.
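
As a rough sketch of what the Bvfs method involves at the bconsole level, the sequence below follows the commands listed in the Bacula Bvfs API documentation. It is an assumption that Bacularis issues something close to this; the table name b2001 and the file ids are hypothetical placeholders, and the job id, client and restore path are borrowed from the log further down this thread.

```
@# Hypothetical bconsole session, for illustration only.
@# Build or refresh the Bvfs cache for the backup job.
.bvfs_update jobid=178
@# Store the selected files in a temporary table (b2 followed by digits).
.bvfs_restore path=b2001 fileid=1001,1002,1003
@# Start the restore job from that table.
restore file=?b2001 client=client-fd where=/tmp/restore yes
@# Drop the temporary table afterwards.
.bvfs_cleanup path=b2001
```

The .bvfs_restore step is the one that has to resolve every selected file in the catalog, so its run time grows with the size of the selection.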

This timeout can happen because, when many files are involved, Bacula needs some time to prepare and manage the Bvfs cache. In this ticket I gave some advice on the timeout problem:

https://gitlab.bacula.org/bacula-community-edition/bacula-community/-/issues/2648

The most important, in my opinion, is database tuning. Building the cache asynchronously from the restore wizard (e.g. with a Bacula Runscript) is important as well.

Please let us know if it helped to solve this problem.

Best regards,
Marcin Haba (gani)

--
You received this message because you are subscribed to the Google Groups "bacularis" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bacularis+...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/bacularis/d19cbdd0-1977-4b72-a47a-d0f4117b31c9n%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
"Greater love hath no man than this, that a man lay down his life for his friends." Jesus Christ

"Większej miłości nikt nie ma nad tę, jak gdy kto życie swoje kładzie za przyjaciół swoich." Jezus Chrystus

Fredrik Jansson

Mar 14, 2023, 7:28:22 AM
to bacularis
Hi,
Thanks for the suggestions. I am using MySQL as the database. I am no database guy, so I am not sure whether the default
values are badly wrong or what to change there, but I tried the suggestion with the run script.

I added this "Run script" for the "BackupCatalog" job after looking at this page:
https://docs.bareos.org/IntroductionAndTutorial/InstallingBareosWebui.html#configure-updating-the-bvfs-cache-frequently

Job {
  Name = "BackupCatalog"
...
  Run Script {
    Console = ".bvfs_update"
    RunsWhen = After
    RunsOnClient = No
  }
}

I then ran the BackupCatalog job manually to test whether it made any difference.

But I still get the error message in the Bacularis web GUI almost exactly 60 seconds after I hit the "Run restore" button in the restore wizard.


"Gateway Timeout
The gateway did not receive a timely response from the upstream server or application."

At the same time I was watching the Bacula server log.

The entries look OK to me, but they are actually written only after those 60 seconds have passed:

14-Mar 11:18 bacula01-dir JobId 198: Start Restore Job RestoreFiles.2023-03-14_11.18.38_10
14-Mar 11:18 bacula01-dir JobId 198: Restoring files from JobId(s) 178
14-Mar 11:18 bacula01-dir JobId 198: Connected to Storage "File1" at 10.10.10.50:9103 with TLS
14-Mar 11:18 bacula01-dir JobId 198: Using Device "FileChgr1-Dev1" to read.
14-Mar 11:18 bacula01-dir JobId 198: Connected to Client "client-fd" at 10.10.10.181:9102 without encryption
14-Mar 11:18 bacula01-sd JobId 198: Ready to read from volume "File1-365d0014" on File device "FileChgr1-Dev1" (/bacula1).
14-Mar 11:18 bacula01-sd JobId 198: Forward spacing Volume "File1-365d0014" to addr=10326118674
14-Mar 11:18 bacula01-sd JobId 198: Elapsed time=00:00:01, Transfer rate=18.65 M Bytes/second
14-Mar 11:18 bacula01-dir JobId 198: Bacula bacula01-dir 13.0.1 (19Jan23):
  Build OS:               x86_64-pc-linux-gnu redhat (Green
  JobId:                  198
  Job:                    RestoreFiles.2023-03-14_11.18.38_10
  Restore Client:         "client-fd" 9.4.2 (04Feb19) x86_64-pc-linux-gnu,ubuntu,20.04
  Where:                  /tmp/restore
  Replace:                Always
  Start time:             14-Mar-2023 11:18:40
  End time:               14-Mar-2023 11:18:41
  Elapsed time:           1 sec
  Files Expected:         195
  Files Restored:         195
  Bytes Restored:         18,624,462 (18.62 MB)
  Rate:                   18624.5 KB/s
  FD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Restore OK

In this case there were not many files, just 195.

Restoring individual files works fine; in that case the bacula.log file is updated within a couple of seconds and the Bacularis web GUI is "happy".

When I try with many more files (50,000), I don't see anything in bacula.log after initiating the restore from Bacularis.

BR
Fredrik

Marcin Haba

Mar 15, 2023, 1:39:58 AM
to Fredrik Jansson, bacularis
Hello Fredrik

Thanks for all the details.

So, the problem is not with generating the Bvfs cache but with starting the restore job. That is why I asked at which wizard step you experience this timeout: it happens after clicking the 'Restore run' button. If you look at the system process list during this one minute, you will see that the process using the most resources is mysqld. This is because the .bvfs_restore bconsole command uses the database intensively when there are many files.

I did a test on my side with 1 million files to restore (with MySQL 8.0.26). The restore start took 66 seconds and did not time out. In this test I used the Lighttpd web server. The timeout setting can be different in different web servers.

I don't know how we could optimize the .bvfs_restore command for MySQL. Maybe as a workaround you could try to increase the web server timeout to 2 minutes, for example.

Preparing the restore using bconsole can go faster because it does not use the Bvfs interface.

For now I would propose increasing the timeout in the web server config to see whether the restore starts in a reasonable time. Please also observe the process list during the start. Because we have no influence over the .bvfs_restore command, one idea could be to add an option in the last restore wizard step to start the restore in the background. That would keep the web server from closing the process when the timeout occurs. I need to consider it.

Good luck.


Best regards,
Marcin Haba (gani)


Fredrik Jansson

Mar 15, 2023, 4:30:43 AM
to bacularis
Hi Marcin,
You are correct. I checked with the top command, and when starting a restore job
my MySQL (8.0.30) really uses a lot of CPU.

I increased the webserver timeout to 120 sec (Apache/2.4.37 (rocky))

Now I am able to restore 150,000 files in my test.
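
For anyone hitting the same limit, here is a minimal sketch of what such a change could look like in Apache httpd. The thread does not say which directive was actually edited, so this is an assumption: Timeout defaults to 60 seconds in Apache 2.4, which matches the cutoff described above, and ProxyTimeout only matters when PHP is reached through mod_proxy (for example php-fpm via proxy_fcgi). The file location is hypothetical.

```
# Hypothetical drop-in file, e.g. somewhere under /etc/httpd/conf.d/
# Raise the general request timeout from the 60 second default.
Timeout 120

# Only applies when requests are proxied to the PHP backend.
# ProxyTimeout falls back to the value of Timeout when it is not set.
<IfModule mod_proxy.c>
    ProxyTimeout 120
</IfModule>
```

The web server has to be reloaded for the new values to take effect.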

Thanks for the fast answers and a great project!

BR
Fredrik

Marcin Haba

Mar 16, 2023, 2:22:46 AM
to Fredrik Jansson, bacularis
Hi Fredrik,

Great. Thanks.

Good to hear that it started working on your side.


Best regards,
Marcin Haba (gani)
