Long-term archiving of cold data


Olivier Ardouin

Jul 5, 2021, 5:22:52 AM
to bareos-users
Hi,

I'm new to bareos; in a previous job I used Veritas (now Symantec) NetBackup.

I plan, among other things, to back up a volume (a 34 TB NAS) which only contains cold data that will never be modified (DNA sequencing runs for medical purposes); the only change to this volume will be the addition of new data (new folders).

I wonder if I can rely on the Always Incremental backup scheme described here: https://docs.bareos.org/TasksAndConcepts/AlwaysIncrementalBackupScheme.html ?

My concern is that I only have one SD (an Overland T24 autochanger with just one drive), so I will not be able to read AND write at the same time on this SD.

I wonder if there is a way in bareos to define an "Archive" pool with a very long retention (20 years due to medical rules) and only add the new data to it? And maybe consolidate the catalog? As I said, the data will never be modified once in this archive.

The easy way would be to do normal backups of my NAS with full, incremental and differential levels, with tape rotation, but I would back up the same unchanged data every time.

Thanks

--Olivier

Frank Kohler

Jul 5, 2021, 7:50:24 AM
to bareos...@googlegroups.com
Hello Olivier,

If I understand it correctly, this article (page 3) should help:
https://translate.google.com/translate?sl=auto&tl=fr&u=https://www.admin-magazin.de/Das-Heft/2020/02/Always-Incremental-mit-Bareos/%28offset%29/6


best,
Frank

--
Frank Kohler
Bareos GmbH & Co. KG
Registered office: Cologne | Cologne District Court: HRA 29646
General partner: Bareos Verwaltungs-GmbH
Managing directors: Stephan Dühr, M. Außendorf,
J. Steffens, P. Storz

Olivier Ardouin

Jul 8, 2021, 11:50:47 AM
to bareos-users
Hi Frank,

Thank you for this article, but it's not applicable in my case.

I have only one client (for this archive task) and my data, once archived, is frozen; there is just some new data to append from time to time. The total volume of data (more than 34 TB now and still growing) is not compatible with consolidation jobs (and there is only one drive in the autoloader).
In fact it could be done without bareos by a simple tar command, appending each new folder to tape manually. I may end up with this solution and keep track of which folders are on which tape. Restores would be more manual because I would not be able to rely on the bareos catalog. I guess this function belongs in Pool Type = Archive, which is not yet implemented in Bareos.

A workaround may be to specify a fileset for the "new" data, do regular full backups of this fileset with a 30-year retention time and a specific "archive" pool, and automatically move the data after each job into an "archived" folder excluded from the fileset.
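Something like this is what I have in mind (a rough, untested sketch; every resource name and path below is a placeholder I made up):

    Job {
      Name = "archive-new-data"          # placeholder name
      Type = Backup
      Level = Full
      Client = nas-fd                    # placeholder client
      FileSet = "archive-new-fs"         # covers only the "new" staging folder
      Pool = LongTermArchive             # the long-retention "archive" pool
      Storage = Tape                     # placeholder storage resource
      Messages = Standard
      RunScript {
        RunsWhen = After
        RunsOnClient = yes
        # a RunScript runs only on success by default (RunsOnFailure = no),
        # so the folders are moved out of the staging area only after a
        # good backup; paths are placeholders
        Command = "/bin/sh -c 'mv /nas/new/* /nas/archived/'"
      }
    }

The fileset for this job would include only /nas/new, so the "archived" folder is never rescanned.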

Does anyone know if the Archive pool type will be implemented soon?

Thanks

spadaj...@gmail.com

Jul 8, 2021, 1:09:18 PM
to bareos-users

I had a similar project a few years ago. I cannot post the detailed solution due to legal restrictions, but I can describe the general idea.
The situation was that a piece of software running on a server would create archive files in a given directory. Those files were to be archived in sufficiently many copies and then removed.
So I ran a full backup job with a fileset created dynamically by a script, which would scan the directory, then run a query against the bareos catalog, and only back up those files which:
1) Weren't backed up sufficiently many times
and
2) Weren't backed up on this medium yet.
Apart from that, I ran a completely asynchronous cron job which would scan the directory and remove the files that had already been backed up enough times.
This way I made sure that each archive file got backed up on several different media, and only after that was it removed from the source server.
In my case it was an all-in-one bareos installation, so I had easy access to both the director database and the FD's directory contents. If you have those components separated, you might need to do some access-rights juggling, of course.
Of course you need to set the retention values to some ridiculously high periods so the files don't get pruned from the database.
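If you want to try the same approach: bareos can build the file list at job start by running a program on the director, which is where the catalog query can live. A minimal sketch, with a hypothetical script name:

    FileSet {
      Name = "dynamic-archive-fs"        # placeholder name
      Include {
        Options {
          Signature = SHA1
        }
        # A leading "|" makes the director execute the program when the
        # job starts and use its stdout (one path per line) as the list
        # of files to back up.  The script itself is hypothetical: it
        # scans the source directory, queries the catalog, and prints
        # only the files that do not yet have enough copies.
        File = "|/usr/local/bin/archive-candidates.sh"
      }
    }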

Hope this helps

rivim...@gmail.com

Aug 5, 2021, 11:31:23 AM
to bareos-users
Olivier, I'm not really understanding why you are seeing a problem... is this not a case of having a pool with the very long retention time mentioned, autoprune disabled, and only ever running Incremental jobs? The Always Incremental option will, I believe, stop bareos from upgrading Incrementals to Diff/Full jobs, so I would think it a useful addition to prevent surprises, though I don't think it is strictly necessary. The important thing is to set long enough retention times.

You would also need to ensure the job records in the database are not deleted, or bareos will lose track of what is on tape even though it won't overwrite it.
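Roughly, the pool I have in mind looks like this (an untested sketch; the resource name is a placeholder, but the retention directives are the point):

    Pool {
      Name = LongTermArchive        # placeholder name
      Pool Type = Backup
      Volume Retention = 20 years   # keep volume records for the legal period
      AutoPrune = no                # never prune catalog records automatically
      Recycle = no                  # never reuse or overwrite these volumes
    }

You would also want to raise File Retention and Job Retention on the Client resource to a similar period, since those control when file and job records are pruned from the catalog.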

It might be worth thinking about creating a number of filesets that cover groups of folders. For example, if your folders are named by date (2021-08-10, say), then make filesets for all folders starting 2020, 2021, etc. This would result in jobs becoming "complete", with no more backups (of any level) needing to be done. It would, though, mean any restore has to "know" which fileset the required data is in, which might be problematic.
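For example (placeholder paths; one such fileset per year group):

    FileSet {
      Name = "archive-2020"              # covers only the frozen 2020 folders
      Include {
        Options {
          Signature = SHA1
        }
        # wildcards are not expanded in File lines, so each folder in the
        # group is listed explicitly (placeholder names)
        File = /nas/2020-01-15
        File = /nas/2020-03-02
        File = /nas/2020-11-27
      }
    }

Once every folder in a year's fileset is on tape, that job never needs to run again.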