RFC: Optimization for virtual full jobs on disk storage

Burkhard Linke

Nov 2, 2022, 11:00:54 AM
to bareos-devel
Hi,

we are running bareos with disk based storage (maybe adding tapes later for long term archives). It currently hosts ~220 million files with roughly 900 TB of data in about 150 jobs. Job sizes range from a few files/bytes to several hundred TB. We are using a standard textbook always incremental scheme, with an extra long retention time for the initial full backup.

Example:

Job {
  Name = "volume-adm"
  Description = "Backup of volume adm"
  Accurate = true
  Allow Duplicate Jobs = false
  Always Incremental = true
  Always Incremental Job Retention = 6 months
  Always Incremental Keep Number = 60
  Always Incremental Max Full Age = 5 years
  Catalog = "bcf"
  Client = "volume-backup"
  File Set = "volume-adm"
  Full Backup Pool = "AI-Consolidated"
  Maximum Concurrent Jobs = 16
  Messages = "Standard"
  Pool = "AI-Incremental"
  Priority = 10
  Schedule = "Nightly"
  Storage = "CEPH-Backup"
  Type = backup
  Write Bootstrap = "/var/lib/bareos/%c.bsr"
}

This works fine, but the virtual full jobs triggered by the consolidation job need to be optimized.

In the current implementation (correct me if I'm wrong), the virtual full job reads all data from the jobs to be consolidated and stores it in the full pool. With the configuration above, this is fine for the first run of a virtual full job. It will process two incremental runs and store their content (minus overwritten / deleted files) in the 'AI-Consolidated' pool. On the next run, it will read the data from the previous virtual full run (in pool 'AI-Consolidated') plus the data from the next incremental run (in pool 'AI-Incremental') and store it in the 'AI-Consolidated' pool. So data already stored in the correct pool is read and written again. This is fine for tape based backups, but in the case of disk based backups data is copied unnecessarily. For large jobs (think 100-200 TB) it might even become unfeasible, since a virtual full run will take days or weeks.

I don't know the details of the volume header format, but it should be possible to implement the following method:

for each file:
1. if source and target pool are different, use standard copy method
2. if source and target pool are not disk based, use standard copy method
3. update header in existing volume(s) to reflect changes (e.g. different job id)
4. update database to reflect changes
5. in case of pruned files, truncate corresponding chunk in the volume(s)

It might be tricky to ensure atomicity of steps 3 + 4 to avoid inconsistencies. Most filesystems should be able to handle sparse files correctly, so an extra "defragmentation" step seems to be unnecessary.
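
To make the intent a bit more concrete, here is a rough, purely illustrative Python sketch of the per-file decision. None of the helper names exist in Bareos; they only stand in for the storage daemon internals such a change would need:

# Purely illustrative sketch of the proposed per-file decision logic.
# is_disk_based() is made up for this example and does not exist in Bareos.

def is_disk_based(pool: str) -> bool:
    # Assumption for this example: both AI pools are backed by disk storage.
    return pool in {"AI-Incremental", "AI-Consolidated"}

def consolidate_file(src_pool: str, dst_pool: str, entry: dict, new_job_id: int) -> str:
    # Steps 1 + 2: anything that is not disk-to-same-pool keeps the existing
    # copy-based virtual full behaviour.
    if src_pool != dst_pool or not (is_disk_based(src_pool) and is_disk_based(dst_pool)):
        return "copy"
    # Steps 3 + 4: rewrite the volume header and the catalog entry in place;
    # both updates would have to be atomic (or at least safely resumable).
    entry["jobid"] = new_job_id
    # Step 5: pruned files leave a hole in the (sparse) volume file instead of
    # the remaining data being copied around.
    if entry.get("pruned"):
        return "punch-hole"
    return "in-place"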

This handling of virtual full jobs needs to be configurable, since it is suitable for certain workloads only (few changed files, many new files added per job).

Any comments on this? Are there any obvious showstoppers?

Best regards,
Burkhard Linke

Andreas Rogge

Nov 3, 2022, 4:40:41 PM
to bareos...@googlegroups.com
On 02.11.22 at 16:00, Burkhard Linke wrote:
> Hi,
>
> we are running bareos with disk based storage (maybe adding tapes later
> for long term archives). It currently hosts ~220 million files with
> roughly 900 TB of data in about 150 jobs. Job sizes range from a few
> files/bytes to several hundred TB. We are using a standard textbook
> always incremental scheme, with an extra long retention time for the
> initial full backup.
>
> Example:
>
> Job {
...

>   Always Incremental = true
>   Always Incremental Job Retention = 6 months
>   Always Incremental Keep Number = 60
>   Always Incremental Max Full Age = 5 years
...
> }
>
> This works fine, but the virtual full jobs triggered by the
> consolidation job need to be optimized.

When I reverse-engineer your settings, it means:
- I want to keep every Incremental that was made in the past 6 months
- I want to keep at least 60 of these Incrementals
- I want to keep a full backup around that isn't older than 5 years

Assuming that you're doing daily backups, you'll end up with:
- 6 * 30 = 180 Incrementals for the last 180 days
- 1 Full that is on average 2.75 years old

As long as your Full isn't older than 5 years, consolidation will take
the oldest Incremental, merge it with the second oldest Incremental into
a new Incremental (which is now considered the oldest one).

When your Full is 5 years old, consolidation will then take the Full,
the oldest Incremental and the second oldest Incremental and merge them
into a new Full.

So in your setup, the oldest Incremental will grow for 4.5 years until
it gets merged into your Full. During that period it will get bigger and
bigger, which will make the consolidation take longer and longer.

Long story short: you can probably save a lot of time moving data around
if you decrease AI Max Full Age to maybe 9 months or so, effectively
producing a new Full every 3 months and keeping the daily consolidation
a lot smaller.
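
As a back-of-the-envelope check (assuming one backup per day and 30-day months; the values below are only illustrative):

# Rough arithmetic behind the suggestion above.
job_retention_days = 6 * 30       # Always Incremental Job Retention
max_full_age_days = 5 * 365       # Always Incremental Max Full Age

incrementals_kept = job_retention_days                  # ~180 daily Incrementals
growth_period = max_full_age_days - job_retention_days
print(growth_period / 365)                              # ~4.5 years of growth for the
                                                        # oldest Incremental

# Lowering Max Full Age to ~9 months shrinks that cycle:
print(9 * 30 - job_retention_days)                      # ~90 days, i.e. a new Full
                                                        # roughly every 3 months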

> In the current implementation (correct me if I'm wrong), the virtual
> full job reads all data from the jobs to be consolidated and stores
> it in the full pool. With the configuration above, this is fine for
> the first run of a virtual full job. It will process two incremental
> runs and store their content (minus overwritten / deleted files)
> in the 'AI-Consolidated' pool. On the next run, it will read the data
> from the previous virtual full run (in pool 'AI-Consolidated') plus the
> data from the next incremental run (in pool 'AI-Incremental') and store
> it in the 'AI-Consolidated' pool. So data already stored in the correct
> pool is read and written again. This is fine for tape based backups, but
> in the case of disk based backups data is copied unnecessarily. For large
> jobs (think 100-200 TB) it might even become unfeasible, since a virtual
> full run will take days or weeks.
I agree that AI (and Virtual Full in general) is pretty I/O heavy for
the SD. However, that's simply how it works right now.
If you need to consolidate large jobs (as you said, 100-200 TB), you'll
need unreasonably fast storage (i.e. a lot more than 1 GB/s) to finish
within a day.
The only workaround is to cut these jobs into pieces and configure Max
Full Consolidations to spread the consolidation into a new full backup
across several days.

> I don't know the details of the volume header format, but it should be
> possible to implement the following method:
>
> for each file:
> 1. if source and target pool are different, use standard copy method
> 2. if source and target pool are not disk based, use standard copy method
> 3. update header in existing volume(s) to reflect changes (e.g.
> different job id)
> 4. update database to reflect changes
> 5. in case of pruned files, truncate corresponding chunk in the volume(s)
>
> It might be tricky to ensure atomicity of steps 3 + 4 to avoid
> inconsistencies. Most filesystems should be able to handle sparse files
> correctly, so an extra "defragmentation" step seems to be unnecessary.

> Any comments on this? Are there any obvious showstoppers?
To make that work we would need to
1. Add a new job to the catalog that references the ranges of the jobs
to be consolidated
2. Change all the block headers in existing volumes so that they belong
to the consolidated job
3. Change all the record headers so they have file ids that are strictly
increasing (from the new job's point of view)
4. Mark records that are no longer needed in some way
5. Rewrite the first SOS record, remove all other SOS records and all but
the last EOS record, and overwrite the last EOS record.
6. Remove all blocks that consist only of records that are no longer
needed (and make sure the SD and all tools can read volumes with nulled
blocks in them)
7. Remove the original jobs from the catalog
8. Provide a 100% failsafe way to resume operations 2 to 6, otherwise a
crash during that operation would leave all data in the job unreadable.

Sounds like quite an agenda. With the current on-disk format, I wouldn't
dare try it. There's just too much that can go wrong in the process.

We're planning to introduce another file-based storage backend with a
different on-disk format next year. That would theoretically allow doing
virtual full backups with zero-copy for the payload (i.e. it would still
read and write the block and record headers, but wouldn't copy the
payload anymore).
However, that backend is still vaporware today and zero-copy has not
even made it to the agenda yet.

Best Regards,
Andreas

--
Andreas Rogge andrea...@bareos.com
Bareos GmbH & Co. KG Phone: +49 221-630693-86
http://www.bareos.com

Registered office: Köln | Local court Köln: HRA 29646
General partner: Bareos Verwaltungs-GmbH
Managing directors: S. Dühr, M. Außendorf, J. Steffens, Philipp Storz

Burkhard Linke

Nov 4, 2022, 6:50:55 AM
to bareos-devel
Hi,

the volume content is user controlled, so splitting it up into multiple jobs automatically might not solve the problem (and would result in way more jobs...). Doing it manually is not feasible due to limited manpower.

I think the best solution is extending the AI scheme, similar to the data handling in librrd (RRDtool) for monitoring data:
- create an initial full (already done)
- run nightly incrementals (also done)
- define "stages" for incrementals, e.g.
  - monthly
  - half year
  - year
- merge all incrementals based on stages
- merge the highest stage with the full
 
It somewhat resembles a full-differential-incremental scheme, but with merged incrementals instead of real differential runs.

An example would be keeping daily incrementals for the last two months, weekly merged incrementals for the next 4 months, and monthly incrementals for the next 6 months. This would cover a whole year and result in (1 + 6 + 4*4 + 2*30 =) 83 backup datasets per job, assuming 4 weeks and 30 days per month. The virtual full job merging the full and the oldest incremental set can be scheduled to run at certain times of the year with low activity.
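
For reference, the count works out like this (same 4-weeks/30-days assumptions):

full = 1
monthly = 6        # monthly merged incrementals covering 6 months
weekly = 4 * 4     # weekly merged incrementals covering 4 months
daily = 2 * 30     # daily incrementals for the last 2 months
print(full + monthly + weekly + daily)   # 83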

We would implement this scheme as an external script and disable the standard consolidate job. Does the python binding allow specifying which jobs should be merged in a virtual full job, similar to the code in the consolidate job?

Best regards,
Burkhard

Andreas Rogge

Nov 10, 2022, 8:33:31 AM
to bareos...@googlegroups.com
Hi,
On 04.11.22 at 11:50, Burkhard Linke wrote:
>
> An example would be keeping daily incrementals for the last two months,
> weekly merged incrementals for the next 4 months, and monthly
> incrementals for the next 6 months. This would cover a whole year and
> result in (1 + 6 + 4*4 + 2*30 =) 83 backup datasets per job, assuming
> 4 weeks and 30 days per month. The virtual full job merging the full and
> the oldest incremental set can be scheduled to run at certain times of
> the year with low activity.

That sounds feasible and like something that AI should be able to do at
some point in the future. I'm just not (yet) sure how to get this
working as a general approach that we could ship in core.

> We would implement this scheme as an external script and disable the
> standard consolidate job. Does the python binding allow specifying which
> jobs should be merged in a virtual full job, similar to the code in the
> consolidate job?

Yes and no.
What the core does is run a VirtualFull explicitly stating the
jobids. Basically like running 'run job=your-job level=VirtualFull
jobid=1,2,3 accurate=yes'. This will produce a backup with the level of
the first job mentioned (in your case an Incremental).
If the job has AlwaysIncremental and AlwaysIncrementalJobRetention
configured, it will also prune the jobs that were consolidated.
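
For an external script, something along these lines should work with the python-bareos console module (just a sketch: the address, password and jobids are placeholders, and you should check the calls against the python-bareos version you use):

import bareos.bsock

# Placeholder connection details for the Director console.
password = bareos.bsock.Password("secret")
director = bareos.bsock.DirectorConsole(address="bareos-dir.example.com",
                                        port=9101,
                                        password=password)

# Jobids selected by your own consolidation logic (placeholder values).
jobids = "1,2,3"
result = director.call(
    "run job=volume-adm level=VirtualFull jobid={} accurate=yes yes".format(jobids)
)  # the trailing 'yes' skips the interactive confirmation
print(result)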

So you can implement that if (and that's the hard part) you can reliably
figure out which jobids should be consolidated together.
If you can come up with a good solution for that, maybe you could share
it and maybe we can implement it in core then.