show volumes, that only have data from specific client. cleanup database and volumes

Miguel da Silva

May 19, 2020, 4:57:14 AM5/19/20
to bareos-users
Hello,

I have a large Bareos setup, and one of my clients (let's call him "client1")
has had backup errors and slow backups. So I investigated and found the
problem on client1.

Now I want to remove all old traces of client1's data from the database and
from the actual volumes on the storage. Since the first storage (600 TB) is
almost full, I want to regain some space.

Is there a neat way to get volume names that only have data from "client1"?

I guess after that I would have to:
1. delete the volume files from the storage,
2. run bconsole > delete volume=volname1,
3. somehow clean the database, which has also grown to around 400 GB when dumped.

Help would be much appreciated.

I don't want to make any mistakes or delete the wrong volumes, since 200 clients are currently being backed up and 200 more are coming when the next storage arrives.

Thank you!

Spadajspadaj

May 19, 2020, 5:34:36 AM5/19/20
to bareos...@googlegroups.com
There are two approaches:

1) The Bareos way: purge the jobs associated with the given client and let
Bareos do its job. That's the approach I'd recommend. Purge the jobs from
the given client, make sure you have "purge volume action=truncate" set,
and then prune all volumes. If volumes are no longer associated with any
jobs (because you purged the jobs from that particular client), they get
purged. When they get purged, they are truncated. Et voilà.
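For reference, those steps would look something like this in bconsole (the client and volume names are placeholders; check `list volumes` and your client names first):

```
# Purge the catalog records of all jobs from the problem client
purge jobs client=client1-fd

# Prune a volume: Bareos purges it (and, with the truncate action,
# truncates it) only if no unexpired jobs remain on it
prune volume=Full-0001 yes
```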

2) The manual fiddling way: you get the list of volumes containing only
jobs from the given client with a clever SQL query (say, an inner join on
jobmedia and job with a WHERE on the particular client, then a GROUP BY on
media id and a COUNT to select only the volumes containing this single
client's jobs), then you manually delete the volumes and manually delete
the files from storage. It can be done, but it's not something I'd
recommend, since the previous approach is IMO much safer.
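For what it's worth, such a query could be sketched roughly like this, assuming the standard Bareos PostgreSQL catalog schema (lowercase `media`, `jobmedia`, `job` and `client` tables) and a placeholder client name; verify the output against the catalog before deleting anything:

```sql
-- Volumes whose jobs all belong to one client, and that client is client1-fd
SELECT m.volumename
FROM media m
JOIN jobmedia jm ON jm.mediaid = m.mediaid
JOIN job j       ON j.jobid    = jm.jobid
GROUP BY m.mediaid, m.volumename
HAVING COUNT(DISTINCT j.clientid) = 1
   AND MIN(j.clientid) = (SELECT clientid FROM client WHERE name = 'client1-fd');
```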

You didn't tell us whether you're using a one-job-per-volume setup or any
other settings, because that could make your task a little easier.

The database issue is another story. Depending on the database type
(MySQL? Postgres? I don't suppose you're running this installation on
SQLite) and even the database configuration (the MySQL table engine and,
for example, the file-per-table setting in the case of InnoDB), you might
need to do different things. Of course a dump/restore will give you a
"shrunk" database, but that's a very radical approach. You might get away
with vacuuming the Postgres database, but keep in mind that the vacuuming
process temporarily needs additional storage (my database shrank from
3.3 GB to 2.6 GB after vacuuming but needed over 5 GB during vacuuming),
and the process itself is time-consuming.
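If you do go the radical dump/restore route, a rough sketch (the database name `bareos` is the common default; stop the Director first, and keep the dump somewhere safe before dropping anything):

```
pg_dump -Fc bareos > /tmp/bareos-catalog.dump
dropdb bareos
createdb -O bareos bareos
pg_restore -d bareos /tmp/bareos-catalog.dump
```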


Best regards

MK

Miguel da Silva

May 19, 2020, 7:08:33 AM5/19/20
to bareos-users
> There are two approaches:
>
> 1) The Bareos way: purge the jobs associated with the given client and let
> Bareos do its job. That's the approach I'd recommend. Purge the jobs from
> the given client, make sure you have "purge volume action=truncate" set,
> and then prune all volumes. If volumes are no longer associated with any
> jobs (because you purged the jobs from that particular client), they get
> purged. When they get purged, they are truncated. Et voilà.
 
I have an always incremental setup; would it prune/truncate the volumes anyway? The retention time for the last possible backup is set to 1 year; would I have to wait that long?
 
> 2) The manual fiddling way: you get the list of volumes containing only
> jobs from the given client with a clever SQL query (say, an inner join on
> jobmedia and job with a WHERE on the particular client, then a GROUP BY on
> media id and a COUNT to select only the volumes containing this single
> client's jobs), then you manually delete the volumes and manually delete
> the files from storage. It can be done, but it's not something I'd
> recommend, since the previous approach is IMO much safer.
 
You are right, it would be safer to let Bareos do its thing.
But I don't really want to wait a year. I'm not really fluent in Postgres; I mostly work with MySQL.
 
> You didn't tell us whether you're using a one-job-per-volume setup or any
> other settings, because that could make your task a little easier.
 
It's a pretty standard always incremental setup from the documentation, with 4 pools: 1. Full, 2. AI-Incremental, 3. AI-Consolidated and 4. AI-LongTerm. Jobs can get mixed in the volumes.
 
> The database issue is another story. Depending on the database type
> (MySQL? Postgres? I don't suppose you're running this installation on
> SQLite) and even the database configuration (the MySQL table engine and,
> for example, the file-per-table setting in the case of InnoDB), you might
> need to do different things. Of course a dump/restore will give you a
> "shrunk" database, but that's a very radical approach. You might get away
> with vacuuming the Postgres database, but keep in mind that the vacuuming
> process temporarily needs additional storage (my database shrank from
> 3.3 GB to 2.6 GB after vacuuming but needed over 5 GB during vacuuming),
> and the process itself is time-consuming.

I use PostgreSQL. Right now I have 64 TB remaining on the storage and 12 TB on the Director, where the database lives.
Can I vacuum while the Bareos services are running, or should I stop them?

> Best regards
>
> MK

Thank you very much.

Spadajspadaj

May 19, 2020, 8:16:10 AM5/19/20
to bareos...@googlegroups.com
On 19.05.2020 13:08, Miguel da Silva wrote:
>> There are two approaches:
>>
>> 1) The Bareos way: purge the jobs associated with the given client and let
>> Bareos do its job. That's the approach I'd recommend. Purge the jobs from
>> the given client, make sure you have "purge volume action=truncate" set,
>> and then prune all volumes. If volumes are no longer associated with any
>> jobs (because you purged the jobs from that particular client), they get
>> purged. When they get purged, they are truncated. Et voilà.
>
> I have an always incremental setup; would it prune/truncate the volumes anyway? The retention time for the last possible backup is set to 1 year; would I have to wait that long?


There are two operations. One is prune, which means "clear the information about volumes/jobs/files/whatever as long as it's safe" (i.e. there are no more jobs on a volume, or the retention period has already expired).

The other is purge, which means "do as I say; don't mind the volume contents, retention periods and so on".

That's why I'm suggesting _purging_ the jobs, because that's what you said you wanted to do: just remove the jobs without waiting for the retention periods.

But all other operations should be prune. If you prune a volume, Bareos checks the retention periods and only "clears" the media if it can safely do so. That's why, if you purge the jobs first, pruning the media will only "clear" the media that have no non-expired jobs left associated with them.

And as the empty volumes get purged (pruning causes Bareos to purge volumes if it's safe to do so), the action defined in the "purge volume action" setting is invoked. If it is set to "truncate", the volume file is truncated to a bare header.

Hope that's clearer now.
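As a sketch of the configuration side: the truncate-on-purge behaviour lives in the Pool resource of the Director configuration (the pool name here is just an example from your setup; check the Bareos documentation for the exact directive spelling in your version):

```
Pool {
  Name = AI-Consolidated
  Pool Type = Backup
  # Truncate the volume file down to its header once the volume is purged
  Action On Purge = Truncate
}
```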

 

>> 2) The manual fiddling way: you get the list of volumes containing only
>> jobs from the given client with a clever SQL query (say, an inner join on
>> jobmedia and job with a WHERE on the particular client, then a GROUP BY on
>> media id and a COUNT to select only the volumes containing this single
>> client's jobs), then you manually delete the volumes and manually delete
>> the files from storage. It can be done, but it's not something I'd
>> recommend, since the previous approach is IMO much safer.
>
> You are right, it would be safer to let Bareos do its thing.
> But I don't really want to wait a year. I'm not really fluent in Postgres; I mostly work with MySQL.
 
>> You didn't tell us whether you're using a one-job-per-volume setup or any
>> other settings, because that could make your task a little easier.
>
> It's a pretty standard always incremental setup from the documentation, with 4 pools: 1. Full, 2. AI-Incremental, 3. AI-Consolidated and 4. AI-LongTerm. Jobs can get mixed in the volumes.

Indeed, it'll be easier to let Bareos decide which volumes it can scrap then.

>> The database issue is another story. Depending on the database type
>> (MySQL? Postgres? I don't suppose you're running this installation on
>> SQLite) and even the database configuration (the MySQL table engine and,
>> for example, the file-per-table setting in the case of InnoDB), you might
>> need to do different things. Of course a dump/restore will give you a
>> "shrunk" database, but that's a very radical approach. You might get away
>> with vacuuming the Postgres database, but keep in mind that the vacuuming
>> process temporarily needs additional storage (my database shrank from
>> 3.3 GB to 2.6 GB after vacuuming but needed over 5 GB during vacuuming),
>> and the process itself is time-consuming.
>
> I use PostgreSQL. Right now I have 64 TB remaining on the storage and 12 TB on the Director, where the database lives.
> Can I vacuum while the Bareos services are running, or should I stop them?

https://www.postgresql.org/docs/9.1/sql-vacuum.html

"Normal" vacuuming only frees up space within existing tables for future reuse. It doesn't free up disk space but gives you more space within existing database installation so you can add data without growing the database on disk. And it can be run on fully operating database.

But VACUUM FULL, which rewrites the database files and reclaims disk space, needs an exclusive lock on the tables, so it might interfere with client (in our case, Bareos) operation.
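Concretely, the two variants (the `file` table is usually by far the largest one in a Bareos catalog, so it's a reasonable place to start; run as the database owner):

```sql
-- Safe while Bareos is running: reclaims space for reuse inside the table files
VACUUM ANALYZE file;

-- Returns space to the OS, but takes an ACCESS EXCLUSIVE lock on the table:
-- schedule it for a window when no Bareos jobs are touching the catalog
VACUUM FULL file;
```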

Best regards

MK
