Understanding VirtualFulls and target volumes retention for AI Jobs

Brock Palen

Sep 7, 2019, 11:36:11 PM

to bareos-users

We started using Always Incremental (AI) jobs on our single tape + disk volume system.

The disk volumes cannot hold all jobs, so we set up a series of Copy and Migrate jobs to quickly move the finished Consolidate jobs from disk to tape. We do this because we cannot write to and read from tape at the same time, so Consolidate jobs only read from tape (the copied/migrated jobs), and the on-disk pool that is the target of the consolidation has a short volume retention time. The Copy/Migrate jobs run every night after Consolidate.

The problem: the disk Consolidate volumes are being recycled well before we expect. I am wondering whether the Volume Retention parameter for the pool is based not on the last write but on the date applied to the jobs. Since these are VirtualFulls, their 'virtual' date is far in the past relative to our retention, so when the next VirtualFull in the queue from the same Consolidate run starts, Bareos purges everything from the volume because it looks older than the retention period.
Is this a correct assessment? Is there a way to have Volume Retention use the job's real end time (RealEndTime in the listing below) instead? For example:

*llist jobid=14485
JobId: 14,485
Job: sch-hp-desktop-Users-Pictures.2019-09-07_09.47.04_00
Name: sch-hp-desktop-Users-Pictures
PurgedFiles: 0
Type: B
Level: F
ClientId: 7
Client: sch-hp-desktop-fd
JobStatus: T
SchedTime: 2019-09-07 09:47:04
StartTime: 2019-08-24 03:03:20
EndTime: 2019-08-24 03:05:25
RealEndTime: 2019-09-07 11:25:04
JobTDate: 1,566,630,325
VolSessionId: 626
VolSessionTime: 1,567,135,546
JobFiles: 26,282
JobBytes: 265,053,495,459
JobErrors: 0
JobMissingFiles: 0
PoolId: 10
PoolName: AI-Consolidated
PriorJobId: 0
FileSetId: 17
FileSet: Windows All Users Pictures
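
For reference, the JobTDate in this listing appears to be the virtual EndTime expressed as a Unix timestamp, not the RealEndTime. A minimal Python sketch of that check follows. The UTC-4 offset (US Eastern daylight time) is an assumption; the post does not state the server's time zone.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the director's clock is on US Eastern daylight time (UTC-4).
ASSUMED_TZ = timezone(timedelta(hours=-4))

# JobTDate from the llist output above, with the thousands separators removed.
job_tdate = 1_566_630_325

# Convert the Unix timestamp to wall-clock time in the assumed zone.
as_local = datetime.fromtimestamp(job_tdate, tz=ASSUMED_TZ)
formatted = as_local.strftime("%Y-%m-%d %H:%M:%S")
print(formatted)
```

Under that assumption the script prints 2019-08-24 03:05:25, which matches EndTime rather than the RealEndTime of 2019-09-07 11:25:04.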

Here are examples of two jobs where a volume in AI-Consolidated was recycled even though a job had written to it less than a day earlier (hence my confusion about why it is blowing away my jobs). This causes major problems: if our Offsite tapes fill up and Consolidate runs, it destroys jobs that have not been copied yet, and we end up with huge holes in our backups until the next job runs and pulls far more data than it should need, which defeats the point of AI backups.

# a successful VirtualFull (consolidate) job that wrote to volume AI-Consolidated-1581 @ 6-Sep-21:00
06-Sep 20:54 myth-dir JobId 14240: Start Virtual Backup JobId 14240, Job=macfu-Users.2019-09-06_20.24.03_52
06-Sep 20:54 myth-dir JobId 14240: Consolidating JobIds 12322,12543,13043,13213,14178,13522
06-Sep 21:00 myth-dir JobId 14240: Bootstrap records written to /var/lib/bareos/myth-dir.restore.135.bsr
06-Sep 21:00 myth-dir JobId 14240: Connected Storage daemon at myth.sheptechllc.com:9103, encryption: TLS_CHACHA20_POLY1305_SHA256
06-Sep 21:00 myth-dir JobId 14240: Using Device "FileStorage" to read.
06-Sep 21:00 myth-dir JobId 14240: Using Device "FileStorage2" to write.
06-Sep 21:00 myth-sd JobId 14240: stored/acquire.cc:151 Changing read device. Want Media Type="LTO5" have="File"
device="FileStorage" (/mnt/bacula)
06-Sep 21:00 myth-sd JobId 14240: Releasing device "FileStorage" (/mnt/bacula).
06-Sep 21:00 myth-sd JobId 14240: Media Type change. New read device "Tand-LTO5" (/dev/nst1) chosen.
06-Sep 21:00 myth-sd JobId 14240: Ready to read from volume "A00023L5" on device "Tand-LTO5" (/dev/nst1).
06-Sep 21:00 myth-sd JobId 14240: Volume "AI-Consolidated-1581" previously written, moving to end of data.
06-Sep 21:00 myth-sd JobId 14240: Ready to append to end of Volume "AI-Consolidated-1581" size=21363283396
06-Sep 21:00 myth-sd JobId 14240: Spooling data …
<snip>
Scheduled time: 06-Sep-2019 20:24:03
Start time: 23-Aug-2019 18:00:11
End time: 23-Aug-2019 18:52:02

# the job that ran at 07-Sep 10:42 (about 14 hours later) and then recycled AI-Consolidated-1581, which we did not expect
07-Sep 10:42 myth-dir JobId 14485: There are no more Jobs associated with Volume "AI-Consolidated-1581". Marking it purged.
07-Sep 10:42 myth-dir JobId 14485: All records pruned from Volume "AI-Consolidated-1581"; marking it "Purged"
07-Sep 10:42 myth-dir JobId 14485: Recycled volume "AI-Consolidated-1581"
07-Sep 10:42 myth-sd JobId 14485: Recycled volume "AI-Consolidated-1581" on device "FileStorage3" (/mnt/bacula), all previous data lost.
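
The timestamps in the two excerpts above can be compared against the 14-day Volume Retention of the AI-Consolidated pool. The sketch below is only arithmetic on the logged values. It assumes the year 2019 and zero seconds for the log lines that show only hours and minutes, and it does not establish how Bareos actually evaluates retention.

```python
from datetime import datetime, timedelta

# Volume Retention configured on the AI-Consolidated pool.
RETENTION = timedelta(days=14)

# "End time" reported for the VirtualFull JobId 14240 (its virtual date).
virtual_end = datetime(2019, 8, 23, 18, 52, 2)
# When JobId 14240 actually appended to AI-Consolidated-1581 (seconds assumed 0).
real_write = datetime(2019, 9, 6, 21, 0)
# When JobId 14485 marked the volume purged (seconds assumed 0).
purged_at = datetime(2019, 9, 7, 10, 42)

# Age of the job on the volume, measured two different ways.
age_by_virtual_end = purged_at - virtual_end
age_by_real_write = purged_at - real_write

print("age by virtual end time:", age_by_virtual_end, "expired:", age_by_virtual_end > RETENTION)
print("age by real write time: ", age_by_real_write, "expired:", age_by_real_write > RETENTION)
```

Measured from the virtual end time the job is a little over 14 days old, so it is past the retention period. Measured from the actual write it is under 14 hours old. The numbers are at least consistent with pruning that follows the job's virtual date.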

Here is config of pools and jobs

# disk pools
Pool {
  Name = AI-Incremental
  Pool Type = Backup
  Recycle = yes                       # Bareos can automatically recycle Volumes
  Auto Prune = yes                    # Prune expired volumes
  Volume Retention = 3 months         # How long should jobs be kept?
  Maximum Volume Bytes = 50G          # Limit Volume size to something reasonable
  Label Format = "AI-Incremental-"
  Volume Use Duration = 7d
  Storage = File
  Next Pool = AI-Consolidated         # consolidated jobs go to this pool
  Action On Purge = Truncate
}

Pool {
  Name = AI-Consolidated
  Pool Type = Backup
  Recycle = yes                       # Bareos can automatically recycle Volumes
  Auto Prune = yes                    # Prune expired volumes
  Volume Retention = 14 days          # How long should jobs be kept?
  Maximum Volume Bytes = 50G          # Limit Volume size to something reasonable
  Label Format = "AI-Consolidated-"
  Volume Use Duration = 7 days
  Storage = File
  Action On Purge = Truncate
  Migration Time = 7 days
}

Pool {
  Name = Offsite
  Pool Type = Backup
  Recycle = no
  Recycle Pool = Scratch
  Auto Prune = yes
  Volume Retention = 3 months
  Volume Use Duration = 1 months
  Storage = Tand-LTO5-Lib
}

# copy job to long term tape
Job {
  Name = "Copy To Offsite AI-Consolidated"
  Type = Copy
  Pool = AI-Consolidated
  Next Pool = Offsite
  Schedule = ServerCycle
  Allow Duplicate Jobs = no
  Priority = 4                        # before catalog dump
  Messages = Standard
  Selection Type = PoolUncopiedJobs
  Selection Pattern = "."
}

Brock Palen
1 (989) 277-6075
bro...@mlds-networks.com
www.mlds-networks.com
Websites, Linux, Hosting, Joomla, Consulting

Dan

Sep 24, 2019, 2:03:58 PM

to bareos-users

Brock -

I've posted a number of times about Always Incremental configurations and issues. Specific to the question you asked, I would start here ... https://groups.google.com/forum/#!topic/bareos-users/KJAM5xDL2Ko

I have my volume retention on both the AI-Incremental pool and the AI-Consolidated pool set to 1 year. If your consolidation is working properly you'll never have a volume approach anywhere near that before it is recycled. If your consolidation isn't working properly this long retention will prevent you from unknowingly blowing away your only copy of critical backup data. My thought is that I'd rather run out of storage space than lose backup data.

Volume retention time is based on the last write. However, the retention is only applied (and only meaningful) if there is still data on that volume. If a volume is purged (no longer contains any data), it is available to recycle regardless of the retention period. If consolidation is working correctly, data is moved off of multiple source volumes onto one or more destination volumes during the consolidation process. Once the source volumes are empty (i.e. no catalog entries point to them), they are recycled as space is required for future backup operations.
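
The rule described above can be written down as a small toy model. This is a sketch of the description in this reply only, not Bareos source code. The function name `may_recycle` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def may_recycle(jobs_on_volume: int, last_written: datetime,
                now: datetime, retention: timedelta) -> bool:
    """Toy model of the rule described in this reply (hypothetical helper)."""
    if jobs_on_volume == 0:
        # No catalog entries point at the volume: it counts as purged and
        # can be reused at once, whatever the retention period says.
        return True
    # The volume still holds jobs, so it stays protected until the
    # retention period has elapsed since the last write.
    return now - last_written >= retention


# Values taken from the logs in the original post (seconds assumed to be 0).
now = datetime(2019, 9, 7, 10, 42)
last_written = datetime(2019, 9, 6, 21, 0)
retention = timedelta(days=14)

empty_volume = may_recycle(0, last_written, now, retention)
volume_with_job = may_recycle(1, last_written, now, retention)
print(empty_volume, volume_with_job)
```

In this model a volume written the previous evening is reusable only if no jobs remain on it, which is why the message "There are no more Jobs associated with Volume" in the original log is the part worth investigating.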
