The disk space on my Bareos server is full. I have deleted some big (old) volumes in Bareos (bconsole), but the space has not been freed in the filesystem.
Why?
with best regards
sven
I added the line:
Action On Purge = Truncate
to my pool resources (Full, Diff, Incr) and restarted the director. What is the correct next step to delete a volume, e.g. "Full-0218"?
-> bconsole
*purge volume=Full-0218 action=truncate
I get the error:
This command can be DANGEROUS!!!
It purges (deletes) all Files from a Job,
JobId, Client or Volume; or it purges (deletes)
all Jobs from a Client or Volume without regard
to retention periods. Normally you should use the
PRUNE command, which respects retention periods.
Volume "Full-0218" has VolStatus "Purged" and cannot be purged.
The VolStatus must be: Append, Full, Used, or Error to be purged.
Automatically selected Storage: File
Using Catalog "MyCatalog"
The defined Pool resources are:
1: Full
2: Diff
3: Incr
4: Scratch
Select Pool resource (1-4): 1
Connecting to Storage daemon File at kvm01.peka.lan:9103 ...
The option "Action On Purge = Truncate" was not defined in the Pool resource.
Unable to truncate volume "Full-0218"
What's wrong?
I've never configured for ActionOnPurge=Truncate. I just did that and purged a few volumes and none of them truncated. I'm running 16.2.4. So either this is another bug or I also have it configured incorrectly. I'm not going to spend much time on it, but I'll take a look at the source code and see if I can figure out where it's broken.
Having said that, the reason I've never configured it is that it didn't look very useful to me. Maybe because of my configuration? I can see where this could save precious storage space in a system that had lots of large, expired/empty volumes sitting around. If the AI consolidation had been properly pruning the volumes, I'm guessing that would not have been the case. That's how I also discovered that bug, although I didn't run out of space because I had set the max volume limit at what I thought was a reasonable level and it hit that limit rather quickly.
Particularly if you are set up to consolidate every day, you are constantly creating stale volumes that take up space and don't get recycled. My recommendations ...
1) Implement that bug fix script to prune the volumes after every Consolidate job. That will get the volume recycling working properly and you won't be left with a bunch of stale volumes taking up space.
2) Use a relatively small Max Volume Bytes in your AI-Incremental and AI-Consolidated pools (see the sketch below). I use 5G. You may settle on something bigger, but bigger is not always better. Smaller will mean more volumes, but probably less wasted space. And with disk storage there's very little performance overhead with mounting and unmounting volumes.
3) Don't consolidate every day. Consolidation is great for recovering unused space on storage volumes, but doing that daily is overkill. I do it every 4 days, for reasons that probably aren't worth discussing, but find something that makes sense to you. With tape volumes consolidation is a great way to reduce restore/recovery times because it will reduce volume mounts and tape streaming. With random access disk volumes it really doesn't matter much.
If you do these things, you'll find that your system will settle in on just the right number of volumes that it needs, with very little wasted space. The amount of space consumed on a temporary basis by not truncating the volume at the time it is purged should be down in the noise.
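To make (2) a little more concrete, here is a rough sketch of the kind of pool resource I mean. The names and values are only illustrative (not copied from my actual config), so adjust them for your environment:
Pool {
  Name = AI-Consolidated
  Pool Type = Backup
  Storage = File
  Maximum Volume Bytes = 5G      # relatively small volumes, less wasted space
  Recycle = yes
  Auto Prune = yes
  Action On Purge = Truncate     # only if you want the truncate behavior discussed in this thread
}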
My 2 cents.
Dan
I looked at the source code this morning. To purge a volume named 'AI-Consolidated-100' you can simply enter ...
purge volume=AI-Consolidated-100
If you want it to be truncated, there are 3 more required arguments ...
purge volume=AI-Consolidated-100 action=Truncate pool=AI-Consolidated storage=File
where 'AI-Consolidated' is the name of the pool my volume resides in and 'File' is the name of the Storage resource associated with my AI-Consolidated pool.
The volume will then be truncated IF ...
1) Action on Purge = 'Truncate' for the volume
2) VolStatus is Append, Full, Used, or Error
3) Recycle is enabled for the volume
4) The volume size is currently > 10kB
So this resolves the issue where your command line truncation is not working. I did not find anywhere in the code where the truncate on purge function was implemented outside the console command line. So it appears it will not, as we noticed, truncate the volumes when they are automatically purged.
If you want to use this console command for volumes that you have already purged, you'll have to first set them back to one of the valid VolStatus values above.
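For example, to truncate the volume from earlier in this thread that is already marked Purged (assuming its pool is 'Full' and its storage is 'File'), something like this should do it - treat it as a sketch, since I haven't tried every status transition:
*update volume=Full-0218 volstatus=Used
*purge volume=Full-0218 action=Truncate pool=Full storage=File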
Dan
Russell -
I'm not following the logic here. You should absolutely run the AI backup every day; that drives your RPO. The Consolidate job is purely administrative and can be run as often or as seldom as you like. It serves three purposes: reducing the total number of volumes in your pools (critical for physical tape, but not a big deal for disk storage), reducing wasted space in your pools (caused by jobs that have expired from a volume or files that have been deleted from the backup set, which is not generally a day-to-day issue but can be over long periods), and reducing the number of volumes required for a restore (again critical for physical tape, but not a big deal for disk storage).
I ended up consolidating every 4th day based on factors like how many clients I have, how many virtual fulls there are per consolidate, how long I wanted the consolidate job to take, how my offsite data is written, what my offsite expiration time is, etc. My objective was more or less to do it as infrequently as I could get away with.
Objectives and solutions vary by user ;-)
The code specifically requires that truncate be both enabled on the volume AND specified on the command line. So it isn't clear what the design intent was. It isn't hard to work with it either way once you know what the rules are.
> Exactly. Should this be filed as a new bug? A functionality that has a
> documented option that doesn't work? I would think that Always Incremental
> doesn't work without this option.
I guess this is up to you. I don't use the truncate option and I've been running AI backups for a long while, so it isn't true that it won't work. The severity of the issue is clearly higher for you than for me. It isn't clear from the code what the design intent was, i.e. it isn't code that throws an error but rather code that doesn't exist.
BTW ... another random thought that occurs to me ... The script that I provided as a workaround for consolidate jobs not pruning the volumes executes the PRUNE command via console. Since it executes on a list of volumes that do not have any associated jobs (the query), there's no reason you can't run the PURGE command instead. You already know that the volume has no associated backup data so the safety of using the prune command isn't needed. If you write it that way you can just include the truncate option and your problem is completely solved. You just need to convince yourself that you can never accidentally purge a volume with active backup data on it.
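In other words, where the workaround script currently sends bconsole something like
prune volume=$volName yes
it would instead send
purge volume=$volName action=Truncate pool=$poolName storage=$storageName yes
with the pool and storage looked up per volume, which is roughly what the version of the script I post below does.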
I assume you add this to the jobdef as a RunAfter script, but do you set Runs on Client to No, since this needs to be executed on the server?
Thanks.
I do have 'Runs on Client = No'.
You can set it up as a RunAfter script. I actually set it up as a separate job. My only reason for doing that was for failure reporting. I want a script error to be reported as a failure, but I don't want to confuse that with a failure of the actual consolidate job. Having said that, I've never had a failure so it really doesn't matter.
Dan
Thanks.
bareos-fd runs as 'root' because it requires access to all files for backup. bareos-sd and bareos-dir should both be running as 'bareos'.
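If you want to double-check which user each daemon is running as, something like this will show it:
ps -eo user,comm | grep bareos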
My job definition ......
Job {
  Name = DR-ConsBugFixes
  JobDefs = DefaultJob
  Client = qco-util-fd
  Type = Admin
  Priority = 30
  Schedule = AdminBugFixSched
  Enabled = yes
  Run Script {
    Runs on Success = Yes
    Runs on Failure = Yes
    Runs on Client = No
    Runs When = Before
    Fail Job On Error = Yes
    Command = "/etc/bareos/bareos-dir.d/job/scripts/drConsBugFixes.sh"
  }
}
My script file ......
#!/bin/bash
# grab the database credentials from existing configuration files
catalogFile=`find /etc/bareos/bareos-dir.d/catalog/ -type f`
dbUser=`grep dbuser $catalogFile | grep -o '".*"' | sed 's/"//g'`
dbPwd=`grep dbpassword $catalogFile | grep -o '".*"' | sed 's/"//g'`
# Make sure all DR-Copy jobs that are in the FileCopy pool are properly set in the database as Copy (C) jobs. This is to work around a Bareos bug where consolidation removes a job and Bareos promotes the Copy job to a Backup (B) job. That 'copy' job then gets consolidated a second time and copied again.
/usr/bin/mysql bareos -u $dbUser -p$dbPwd -se "UPDATE Job J SET J.Type = 'C' WHERE J.Type <> 'C' AND EXISTS (SELECT 1 FROM Media M, JobMedia JM WHERE JM.JobId = J.JobId AND M.MediaId = JM.MediaID AND M.MediaType = 'FileCopy');"
# Get a list of volumes no longer in use and submit them to the console for pruning. This is to work around a bug where Bareos does not prune volumes after a Consolidate action.
# Query for a list of volumes (exclude DR copy volumes)
emptyVols=$(mysql bareos -u $dbUser -p$dbPwd -se "SELECT m.VolumeName FROM bareos.Media m where m.MediaType <> 'FileCopy' and m.VolStatus not in ('Append','Purged') and not exists (select 1 from bareos.JobMedia jm where jm.MediaId=m.MediaId);")
# Submit volumes to bconsole for pruning
for volName in $emptyVols
do
  poolName=$(mysql bareos -u $dbUser -p$dbPwd -se "SELECT p.Name FROM bareos.Pool p where p.PoolId = (select m.PoolId from bareos.Media m where m.VolumeName='$volName');")
  storageName=$(mysql bareos -u $dbUser -p$dbPwd -se "SELECT s.Name FROM bareos.Storage s where s.StorageId = (select m.StorageId from bareos.Media m where m.VolumeName='$volName');")
  /bin/bconsole << EOD
purge volume=$volName action=Truncate pool=$poolName storage=$storageName yes
quit
EOD
done
exit
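If you want to try the script by hand before wiring it into the job, run it as the same user the director runs under (bareos in my case), e.g.:
sudo -u bareos /etc/bareos/bareos-dir.d/job/scripts/drConsBugFixes.sh
That path is just where I happen to keep it; use whatever you put in the Command line of the Run Script.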
So is this script run by the director then, or by the fd? My dir and sd are running as bareos and fd is root, according to ps aux.
For some reason the script runs fine when run from the command line (I'm root there) but not when run from the job. I'll keep digging.
emptyVols=$(su postgres -c "psql -d bareos -t -c \"select m.VolumeName from Media m where m.VolStatus not in ('Append','Purged') and not exists (Select 1 from JobMedia jm where jm.MediaId=m.MediaID);\"")
Running ./consolidatefix.sh from the command line in my scripts directory works perfectly; the su to postgres works and it runs with no password error.
The script runs as the bareos user.
At this point I'll just slap it in a cron job and forget about having Bareos run it. I just can't get it working from the job, even though it should work just fine.
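For the record, the cron entry I have in mind is just something along these lines in /etc/cron.d (the time and path are placeholders; pick a slot after your consolidate runs):
# run the consolidate cleanup well after the consolidate job
0 14 * * * root /path/to/consolidatefix.sh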
Thanks for the help.
emptyVols=$(psql -d bareos -t -c "select m.VolumeName from Media m where m.VolStatus not in ('Append','Purged') and not exists (Select 1 from JobMedia jm where jm.MediaId=m.MediaID);")
Best wishes,
Thomas
#!/bin/bash
#
# PURPOSE
# Bash script to purge old Bareos volumes both from the Catalog of Bareos as well as from physical disk,
# as Bareos unfortunately does not delete old AlwaysIncremental volumes of type VirtualFull automatically.
#
# DISCLAIMER
# This script is used at your own risk, without any warranty whatsoever!
#
# USE
# First save the script as /usr/local/bin/bareos-purge-old-volumes.sh
# Then make it owned by root and the group bareos and make it executable:
# chown root.bareos /usr/local/bin/bareos-purge-old-volumes.sh
# chmod 770 /usr/local/bin/bareos-purge-old-volumes.sh
#
# Then make the script run as a Bareos job:
#
#Job {
# Name = "PurgeOldVolumes"
# Type = "Admin"
# Description = "Purges old volumes left behind from AI consolidations"
# Schedule = "DailyPurgeOldVolumes"
# JobDefs = "DefaultJob"
# Client = bareos-fd
# RunAfterJob = "/usr/local/bin/bareos-purge-old-volumes.sh"
# Priority = 31 # run after consolidation
# }
#
# Also make a new schedule for the job, which should be prioritised below the priority of the Consolidate job
# and run for example one hour after the Consolidate job:
#
# Schedule {
# Name = "DailyPurgeOldVolumes" # Purges old volumes left behind from AI consolidations (priority 31)
# Run = Incremental mon-sun at 13:00
# }
#
# REQUIREMENTS
# Bareos version 16.2.5 or later installed with PostgreSQL as database
# Bash
#
# PLEASE NOTE
# In order for this script to delete old volumes from disk you may need to set these 3 directives in your Bareos Pool resource(s):
# Recycle = yes # Bareos can automatically recycle Volumes
# Auto Prune = yes # Prune expired volumes
# Action on Purge = Truncate # Delete old backups
# Please read more about these directives in the Bareos manual at https://docs.bareos.org/ to understand them before making these changes.
# If you choose to set these 3 directives, then update old volumes with the new settings from the Pool(s) through running bconsole:
# bconsole <Enter>
# update volume <Enter>
# 14 <Enter> to choose 14: All Volumes from all Pools
#
# AUTHOR AND VERSION
# Written for PostgreSQL by Thomas Hojemo. Revised 2019-05-02.
# Based on original script for MySQL from dpcushing @ https://bugs.bareos.org/view.php?id=779
#
# ---- START OF SETTINGS TO CONFIGURE ------------------------------------------------------------------------------------
#
# RUN SCRIPT AS USER
# In case this script is run as root you need to set the username root should su into below - should normally be postgres:
suUser='postgres'
#
# SET DATABASE NAME
# The Catalog normally resides in the database bareos. Change below if you use another database name:
database='bareos'
#
# SET BAREOS STORAGE LOCATION
# From where the files on disk will be deleted. It is normally /var/lib/bareos/storage/. Change the directory below if needed:
dirName='/var/lib/bareos/storage/'
#
# ---- END OF SETTINGS TO CONFIGURE --------------------------------------------------------------------------------------
# DEBUGGING
# Should normally be turned off, i.e. commented out:
#set -x
# SCRIPT USER
# Check which user we are now (does not need to be changed).
actualUser="$LOGNAME"
# SET WORKING DIRECTORY
# Change to /tmp directory in order to not have problems with directory permissions.
cd /tmp
# QUERY FOR VOLUMES THAT ARE NOT IN USE ANYMORE
#
if [ "$actualUser" == "root" ] # If we are root we need to su to access PostgreSQL
then # Run this SQL query in PostgreSQL via su to the PostgreSQL user
emptyVols=$(su $suUser -c "psql -d $database -t -c \"SELECT DISTINCT media.volumename FROM media WHERE media.volstatus NOT IN ('Recycle','Append','Purged') AND media.volumename NOT IN (SELECT DISTINCT volumename FROM media,jobmedia,job WHERE media.mediaid = jobmedia.mediaid AND jobmedia.jobid = job.jobid AND job.jobstatus NOT IN ('T','E','e','f','A', 'W')) AND NOT EXISTS (SELECT 1 FROM jobmedia WHERE jobmedia.mediaid=media.mediaid)\"")
else # Else - in case we are a user with direct PostgreSQL access - we can run the SQL query directly
emptyVols=$(psql -d $database -t -c "SELECT DISTINCT media.volumename FROM media WHERE media.volstatus NOT IN ('Recycle','Append','Purged') AND media.volumename NOT IN (SELECT DISTINCT volumename FROM media,jobmedia,job WHERE media.mediaid = jobmedia.mediaid AND jobmedia.jobid = job.jobid AND job.jobstatus NOT IN ('T','E','e','f','A', 'W')) AND NOT EXISTS (SELECT 1 FROM jobmedia WHERE jobmedia.mediaid=media.mediaid)")
fi
# Trim any whitespace before and after string
emptyVols=$(echo $emptyVols | xargs)
# GIVE CHANCE TO ABORT SCRIPT
# If there are volumes to purge and delete give chance to abort script
if [ -n "$emptyVols" ]
then
echo "WARNING: These volumes will be purged from Bareos and deleted from disk:"
echo $emptyVols
echo "Press Ctrl+C within 10 seconds to abort."
sleep 10
# PURGE AND DELETE VOLUMES
# Get pool name and storage name for each volume
for volName in $emptyVols # Loop through each volume name in the list we extracted via the SQL query above
do
if [ "$actualUser" == "root" ] # If we are root we need to su to access PostgreSQL
then # Run this SQL query in PostgreSQL via su to the PostgreSQL user
poolName=$(su $suUser -c "psql -d $database -t -c \"SELECT pool.name FROM pool WHERE pool.poolid = (SELECT media.poolid FROM media where media.volumename='$volName');\"")
storageName=$(su $suUser -c "psql -d $database -t -c \"SELECT storage.name FROM storage where storage.storageid = (SELECT media.storageid FROM media where media.volumename='$volName');\"")
else # Else - in case we are a user with direct PostgreSQL access - we can run the SQL query directly
poolName=$(psql -d $database -t -c "SELECT pool.name FROM pool WHERE pool.poolid = (SELECT media.poolid FROM media where media.volumename='$volName');")
storageName=$(psql -d $database -t -c "SELECT storage.name FROM storage where storage.storageid = (SELECT media.storageid FROM media where media.volumename='$volName');")
fi
# Trim any whitespace before and after string
poolName=$(echo $poolName | xargs)
storageName=$(echo $storageName | xargs)
fileName="$dirName$volName"
# Run bconsole command to purge, truncate and delete volumes
bconsole << EOD
purge volume=$volName pool=$poolName storage=$storageName yes
truncate volstatus=Purged volume=$volName pool=$poolName storage=$storageName yes
quit
EOD
# Delete file from disk
rm $fileName
done
fi
Best wishes,
Thomas
dirName='/backup/bareos/storage/'
This would make a great gist