Hello Marcin,
I've run a debug trace while verifying a full backup.
Unfortunately, there is activity in the trace for every second of the trace. It's not like I can look at the trace and see that there are seconds of total inactivity. So that it is obvious that the execution is "pausing" at some function.
I ran the verify for 90 seconds.
During that 90 seconds 662 files were verified.
When the full backup was performed, 170,493 files were backed up in 4m 7 seconds.
So in 90 seconds, on a linear basis (assuming that the verify should take as long as the backup) around 60,000 files should have been verified.
I've had a look at the Bacula doc and it's not clear to me what the other debug levels are, and whether debugging at a higher level might help.
From my perspective, I think a strong case can be made that the answer is in front of me already.
If the verify should take around the same time as the backup. Then the number of files that were verified during the trace should be way higher than what was processed.
The bottom-line is the same, the throughput on a full verify is poor. There should have been around 60,000 files processed give or take maybe 5,000 - 10,000 files. But only 600 files were verified.
So the issue is the thruput, there should have been way more files processed per second than what was reported.
I'm happy to run more traces if there is a basis on which I can review the trace and attempt to pinpoint the bottleneck.
But me trying to eyeball the current trace and say the problem is 'here', isn't going to happen.
The only observation I can make, is that looking at the verify on the client, as files are being verified here are the functions that are being called
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:315-5712 Got hdr: FilInx=7 FullStream=20 Stream>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:333-5712 Got stream data Encrypted File data, l>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:315-5712 Got hdr: FilInx=7 FullStream=20 Stream>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:333-5712 Got stream data Encrypted File data, l>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:315-5712 Got hdr: FilInx=7 FullStream=20 Stream>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:333-5712 Got stream data Encrypted File data, l>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:315-5712 Got hdr: FilInx=7 FullStream=20 Stream>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:333-5712 Got stream data Encrypted File data, l>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:315-5712 Got hdr: FilInx=7 FullStream=20 Stream>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:333-5712 Got stream data Encrypted File data, l>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:315-5712 Got hdr: FilInx=7 FullStream=20 Stream>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:333-5712 Got stream data Encrypted File data, l>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:315-5712 Got hdr: FilInx=7 FullStream=20 Stream>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:333-5712 Got stream data Encrypted File data, l>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:315-5712 Got hdr: FilInx=7 FullStream=20 Stream>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:333-5712 Got stream data Encrypted File data, l>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:315-5712 Got hdr: FilInx=7 FullStream=20 Stream>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:333-5712 Got stream data Encrypted File data, l>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:315-5712 Got hdr: FilInx=7 FullStream=20 Stream>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:333-5712 Got stream data Encrypted File data, l>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:315-5712 Got hdr: FilInx=7 FullStream=20 Stream>
05-Sep-2024 19:12:00 CyberX_testclient5-fd: verify_vol.c:333-5712 Got stream data Encrypted File data, l
So potentially, the bottleneck is in those functions, as the number of files that should have been verified during the period of the trace should have been a lot higher than what was processed.
But as I mentioned above, it is not the case that in trace there are periods where nothing is happening. The problem is that there should be a lot more files being processed than what is being processed.