
backup directory/file exclusion pattern list for borgbackup


Default User

Sep 25, 2021, 6:30:03 PM
Hello!

I want to try using borgbackup to do backups of my (only) user directory:
/home/debian-user

I just want to do so using Vorta, a GUI for borgbackup.

But I just need a good, general list of directory and file type
exclusions that I can just cut and paste into the Exclude Patterns
window in Vorta. Something like the default list of exclusions that
appears by default in the Backintime backup program.

Note 1: borgbackup uses a matching pattern called "Fnmatch" with which
I am not familiar, and don't want to learn by trial and error, losing
data in the process. Which is why I am looking for a "drop-in" basic
exclude list.

Note 2: I am not intending to use borgbackup to back up the whole
system; just /home/debian-user and its subdirectories. I am using
timeshift to back up the rest of the system. Timeshift uses a huge
amount of disk space, but it . . . works.

Note 3: I am aware that some use backintime to back up user data, and
I have tried it myself. But it just seems to have some "problems".
For example, the built-in "diff" utility does not seem to do anything.
It seems old and gives the impression of not being heavily developed.
The documentation is "adequate" but mediocre. And what really grinds
my gears about backintime, a problem apparently known as far back as
2014:

"Warning: A recent security audit revealed several possible attack
vectors for EncFs.

From https://defuse.ca/audits/encfs.htm:

EncFS is probably safe as long as the adversary only gets one copy of
the ciphertext and nothing more. EncFS is not safe if the adversary
has the opportunity to see two or more snapshots of the ciphertext at
different times. EncFS attempts to protect files from malicious
modification, but there are serious problems with this feature.

This might be a problem with Back In Time snapshots."

Gee . . . think so?

Kushal Kumaran

Sep 25, 2021, 8:10:03 PM
On Sat, Sep 25 2021 at 06:24:12 PM, Default User <hungupo...@gmail.com> wrote:
> Hello!
>
> I want to try using borgbackup to do backups of my (only) user directory:
> /home/debian-user
>
> I just want to do so using Vorta, a GUI for borgbackup.
>
> But I just need a good, general list of directory and file type
> exclusions that I can just cut and paste into the Exclude Patterns
> window in Vorta. Something like the default list of exclusions that
> appears by default in the Backintime backup program.
>

I don't understand what a general list of exclusions would look like.
Do you have examples of what backintime excludes by default? My own
borgbackup runs back up everything on disk; I don't feel the need to
exclude anything.

> Note 1: borgbackup uses a matching pattern called "Fnmatch" with which
> I am not familiar, and don't want to learn by trial and error, losing
> data in the process. Which is why I am looking for a "drop-in" basic
> exclude list.
>

Run "borg help patterns" to see an explanation of how borgbackup deals with
patterns. It has this to say about fnmatch:

This is the default style for --exclude and --exclude-from. These
patterns use a variant of shell pattern syntax, with '*' matching
any number of characters, '?' matching any single character, '[...]'
matching any single character specified, including ranges, and
'[!...]' matching any character not specified. For the purpose of
these patterns, the path separator (backslash for Windows and '/' on
other systems) is not treated specially. Wrap meta-characters in
brackets for a literal match (i.e. [?] to match the literal
character ?). For a path to match a pattern, the full path must
match, or it must match from the start of the full path to just
before a path separator. Except for the root path, paths will never
end in the path separator when matching is attempted. Thus, if a
given pattern ends in a path separator, a '*' is appended before
matching is attempted.
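
A rough way to build intuition for the rules quoted above, without
touching any backup, is to mimic them with the shell's own pattern
matching. This is only a sketch and not borg's actual matcher:
borg_like_match is a made-up helper name, and it assumes bash, where
'*' in a case pattern also matches '/'.

```shell
#!/usr/bin/env bash
# Approximation of the fnmatch-style rules quoted above.
# borg_like_match is a hypothetical helper, not part of borg.
borg_like_match() {
    local pattern=$1 path=$2
    # A pattern ending in a path separator gets '*' appended.
    case $pattern in
        */) pattern="${pattern}*" ;;
    esac
    # In a case statement '*' also matches '/', so the path separator
    # is not treated specially, as in the quoted description.
    case $path in
        $pattern)   echo "match" ;;     # the full path matches
        $pattern/*) echo "match" ;;     # matches up to just before a '/'
        *)          echo "no match" ;;
    esac
}

# Contents of a directory excluded with a trailing slash:
borg_like_match '/home/*/.cache/' '/home/debian-user/.cache/mozilla/x'   # prints "match"
# The directory itself is not matched by the same pattern:
borg_like_match '/home/*/.cache/' '/home/debian-user/.cache'             # prints "no match"
# '*' spans directory levels:
borg_like_match '*.backup*' '/home/debian-user/notes.backup.txt'         # prints "match"
```

Anything this sketch reports should still be confirmed against borg
itself, since borg's real implementation may differ in details.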

> Note 2: I am not intending to use borgbackup to back up the whole
> system; just /home/debian-user and its subdirectories. I am using
> timeshift to back up the rest of the system. Timeshift uses a huge
> amount of disk space, but it . . . works.
>

I don't know how timeshift stores backups. borg uses deduplicated
storage that avoids storing identical data multiple times. My own borg
backups result in ~1G of new data every week (and about the same amount
being deleted from expiring backups). There is no significant increase
in repository size week-over-week. That obviously would not be the same
for everyone, but if you're bothered by the amount of disk space used
you can try it out.

> Note 3: I am aware that some use backintime to back up user data, and
> I have tried it myself. But it just seems to have some "problems".
> For example, the built-in "diff" utility does not seem to do anything.
> It seems old and gives the impression of not being heavily developed.
> The documentation is "adequate" but mediocre. And what really grinds
> my gears about backintime, a problem apparently known as far back as
> 2014:
>
> "Warning: A recent security audit revealed several possible attack
> vectors for EncFs.
>
> From https://defuse.ca/audits/encfs.htm:
>
> EncFS is probably safe as long as the adversary only gets one copy of
> the ciphertext and nothing more. EncFS is not safe if the adversary
> has the opportunity to see two or more snapshots of the ciphertext at
> different times. EncFS attempts to protect files from malicious
> modification, but there are serious problems with this feature.
>
> This might be a problem with Back In Time snapshots."
>
> Gee . . . think so?

That report talks about issues with encfs design. There is nothing
backintime can do to fix those.

borg can encrypt its backup images, and it recommends enabling that.
So an adversary would not get access to the encfs ciphertext directly.
They could get access to borg ciphertext instead, which may or may not
be vulnerable to the same problems. AFAIK there hasn't been a security
audit of borgbackup itself. The page at
https://borgbackup.readthedocs.io/en/stable/internals/security.html#borgcrypto
describes the design of borg security.

--
regards,
kushal

Default User

Sep 25, 2021, 9:10:04 PM
Hi, Kushal.

In Vorta, under the "Sources" tab, there is an area (window) for input
into which you can type or paste text, such as:

**/.cache

to denote exclusions, that is, things you do not want to back up.
This is from /home/debian_user/.config/backintime/config:

. . .
profile1.snapshots.exclude.1.value=.gvfs
profile1.snapshots.exclude.2.value=.cache/*
profile1.snapshots.exclude.3.value=.thumbnails*
profile1.snapshots.exclude.4.value=.local/share/[Tt]rash*
profile1.snapshots.exclude.5.value=*.backup*
profile1.snapshots.exclude.6.value=*~
profile1.snapshots.exclude.7.value=.dropbox*
profile1.snapshots.exclude.8.value=/proc/*
profile1.snapshots.exclude.9.value=/sys/*
profile1.snapshots.exclude.10.value=/dev/*
profile1.snapshots.exclude.11.value=/run/*
profile1.snapshots.exclude.12.value=/etc/mtab
profile1.snapshots.exclude.13.value=/var/cache/apt/archives/*.deb
profile1.snapshots.exclude.14.value=lost+found/*
profile1.snapshots.exclude.15.value=/tmp/*
profile1.snapshots.exclude.16.value=/var/tmp/*
profile1.snapshots.exclude.17.value=/var/backups/*
profile1.snapshots.exclude.18.value=.Private
. . .

Of course that is expressed in backintime's own configuration
"language", and would probably need to be translated into borgbackup's
equivalent "language".
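
As a first step toward such a translation, the pattern part of each
line can be pulled out of the backintime config with sed. This is a
minimal sketch: extract_excludes is a made-up helper name, and the key
layout is assumed to be exactly as in the excerpt above. rsync-style
patterns and borg's fnmatch style are not guaranteed to match the same
paths, so the output is only a starting point to review.

```shell
#!/usr/bin/env bash
# Print only the pattern part of "profileN.snapshots.exclude.M.value=" lines.
# extract_excludes is a hypothetical helper name.
extract_excludes() {
    sed -n 's/^profile[0-9]*\.snapshots\.exclude\.[0-9]*\.value=//p' "$1"
}

# Demonstration on a small sample instead of the real config file:
sample=$(mktemp)
cat > "$sample" <<'EOF'
profile1.snapshots.exclude.1.value=.gvfs
profile1.snapshots.exclude.2.value=.cache/*
profile1.snapshots.exclude.8.value=/proc/*
EOF
extract_excludes "$sample"
rm -f "$sample"
```

Pointing the function at the real config file and redirecting the
output to a text file would give a list that can be edited by hand
before pasting it anywhere.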

Something like that is what I was sort of looking for. And it is not
just for efficiency. Consider this, from the Arch wiki article on
rsync:

----------

"Run the following command as root to make sure that rsync can access
all system files and preserve the ownership:

# rsync -aAXHv --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} / /path/to/backup

By using the -aAX set of options, the files are transferred in archive
mode which ensures that symbolic links, devices, permissions,
ownerships, modification times, ACLs, and extended attributes are
preserved, assuming that the target file system supports the feature.
The option -H preserves hard links, but uses more memory.

The --exclude option causes files that match the given patterns to be
excluded. The directories /dev, /proc, /sys, /tmp, and /run are
included in the above command, but the contents of those directories
are excluded. This is because they are populated on boot, but the
directories themselves are not created. /lost+found is
filesystem-specific. The command above depends on brace expansion
available in both the bash and zsh shells. When using a different
shell, --exclude patterns should be repeated manually. Quoting the
exclude patterns will avoid expansion by the shell, which is
necessary, for example, when backing up over SSH. Ending the excluded
paths with * ensures that the directories themselves are created if
they do not already exist.

Note:

- If you plan on backing up your system somewhere other than /mnt or
  /media, do not forget to add it to the list of exclude patterns to
  avoid an infinite loop.
- If there are any bind mounts in the system, they should be excluded as
  well so that the bind mounted contents is copied only once.
- If you use a swap file, make sure to exclude it as well.
- Consider if you want to backup the /home/ directory. If it contains
  your data it might be considerably larger than the system. Otherwise
  consider excluding unimportant sub-directories such as
  /home/*/.thumbnails/*, /home/*/.cache/mozilla/*,
  /home/*/.cache/chromium/*, and /home/*/.local/share/Trash/*, depending
  on software installed on the system.
- If GVFS is installed, /home/*/.gvfs must be excluded to prevent rsync errors.
- If Dhcpcd ≥ 9.0.0 is installed, exclude the /var/lib/dhcpcd/*
  directory as it mounts several system directories as sub-directories
  there.

You may want to include additional rsync options, or remove some, such
as the following. See rsync(1) for the full list.

- If you run on a system with very low memory, consider removing the -H
  option; however, it should be no problem on most modern machines.
  There can be many hard links on the file system depending on the
  software used (e.g. if you are using Flatpak). Many hard links reside
  under the /usr/ directory.
- You may want to add rsync's --delete option if you are running this
  multiple times to the same backup directory. In this case make sure
  that the source path does not end with /*, or this option will only
  have effect on the files inside the subdirectories of the source
  directory, but it will have no effect on the files residing directly
  inside the source directory.
- If you use any sparse files, such as virtual disks, Docker images and
  similar, you should add the -S option.
- The --numeric-ids option will disable mapping of user and group names;
  instead, numeric group and user IDs will be transferred. This is useful
  when backing up over SSH or when using a live system to back up a
  different system disk.
- Choosing the --info=progress2 option instead of -v will show the overall
  progress info and transfer speed instead of the list of files being
  transferred.
- To avoid crossing a filesystem boundary when recursing, add the option
  -x/--one-file-system. This will prevent backing up any mount point in
  the hierarchy."

----------

And that isn't even (afaik) Fnmatch!
(BTW, I have read what you referenced as ' Run "borg help patterns" '.
I'm afraid it wasn't very helpful to me.)

Timeshift (system files only) currently takes up about 10Gb backing up
a relatively lean system.

Borg takes up about 4.2 Gb of user data only.

Backintime uses about 4.4Gb to back up the same user data. It seems
to be just a fancy GUI, that appears to use rsync as a backend, to
take "snapshots".

I shall take for granted that backintime developers do not code encfs.
Fine. But after 7 years (at least), why haven't they replaced encfs
with a "safer" encryption scheme, or at least just removed it and
simply not replaced it at all? IMHO, either option would seem far
better than the status quo.

I'm sure someone is saying, "Well, you don't HAVE TO use the built-in encryption."
Believe me, I don't. And won't.

As you noted, borg seems to take encryption much more seriously.
Which I think is a good thing, as I consider data integrity and
security to be very important.

Andrei POPESCU

Sep 26, 2021, 5:50:04 AM
On Sat, 25 Sep 21, 21:03:37, Default User wrote:
>
> to denote exclusions, that is, things you do not want to back up.
> This is from /home/debian_user/.config/backintime/config:
>
> . . .
> profile1.snapshots.exclude.1.value=.gvfs
> profile1.snapshots.exclude.2.value=.cache/*
> profile1.snapshots.exclude.3.value=.thumbnails*
> profile1.snapshots.exclude.4.value=.local/share/[Tt]rash*
> profile1.snapshots.exclude.5.value=*.backup*
> profile1.snapshots.exclude.6.value=*~
> profile1.snapshots.exclude.7.value=.dropbox*
> profile1.snapshots.exclude.8.value=/proc/*
> profile1.snapshots.exclude.9.value=/sys/*
> profile1.snapshots.exclude.10.value=/dev/*
> profile1.snapshots.exclude.11.value=/run/*
> profile1.snapshots.exclude.12.value=/etc/mtab
> profile1.snapshots.exclude.13.value=/var/cache/apt/archives/*.deb
> profile1.snapshots.exclude.14.value=lost+found/*
> profile1.snapshots.exclude.15.value=/tmp/*
> profile1.snapshots.exclude.16.value=/var/tmp/*
> profile1.snapshots.exclude.17.value=/var/backups/*
> profile1.snapshots.exclude.18.value=.Private
> . . .

Half of those are system directories, so they are irrelevant for your
use case (backing up a /home directory).

> Borg takes up about 4.2 Gb of user data only.
>
> Backintime uses about 4.4Gb to back up the same user data.

If I understand correctly, Borg takes less space *without* exclusions to
back up the same data than Backintime does *with* exclusions.

What problem are you trying to solve?

Kind regards,
Andrei
--
http://wiki.debian.org/FAQsFromDebianUser

Marco Möller

Sep 26, 2021, 11:20:03 AM
The following is an example of what could be in an exclude file for the
borg command, used at the CLI with the option:
--exclude-from myExcludeFile

# The following items will be excluded from the borg backup
# use absolute paths like in: "borg create repo::archive /home/someUserName"
# do NOT use relative paths like in "borg create repo::archive ."
#
# a slash as the last character excludes all contents but not the dir
# name itself
# like this the softlinks are preserved


/home/someUserName/.cache/
/home/someUserName/Downloads/
/home/someUserName/TEMP/

/home/someUserName/.julia/artifacts/
/home/someUserName/.julia/compiled/
/home/someUserName/.julia/conda/
/home/someUserName/.julia/packages/
/home/someUserName/.julia/registries/


/home/someUserName/.opt/



You will see that I personally decided not to include some quite
common folders in my backup:
.cache
Downloads
TEMP

You will also see that I did not exclude any particular single file, only
complete directories. You could also include particular files in the
list, if that is of interest to you.

I then have some folders into which I install software relevant only to
this user. As this user could reinstall this software at any time, and
these folders do not contain user data or configuration data of
importance, I decided not to fill my backups with their huge and often
changing content:

.opt
.julia/selectionOfReinstallableJuliaFolders

Note that I did not exclude the complete tree ".julia", because other
sub-directories of .julia contain important user data and configuration
data which I do want to be backed up!

I don't know of a general recommendation about which folders should be
excluded by default. You will have to make the effort and decide
for your own situation.
If you have an exclusion list which you are happy with from other
software (you mentioned backintime), maybe you can learn the borg syntax
from my example above and reuse the exclusions which backintime
has configured for you?
Note that my example above includes comment lines: each line starting
with the "#" sign is a comment, which ends at the end of the line.
These lines can be part of the exclude file and do no harm; they are
simply ignored when borg reads the entries for files or directories to
exclude. These comments point out some details that are frequently of
particular interest when populating a borg exclusion file.
However, nothing beats reading the original documentation.

Good Luck!
Marco

Kushal Kumaran

Sep 26, 2021, 12:00:04 PM
[slightly re-arranged segments below]
> ...
> And that isn't even (afaik) Fnmatch!
> (BTW, I have read what you referenced as ' Run "borg help patterns" '.
> I'm afraid it wasn't very helpful to me.)

I mean you should run the command "borg help patterns" (without quotes)
from a terminal. That produces a detailed explanation of the kinds of
patterns borg supports, including examples. The same content is also
available in the borg-patterns manpage.

It also mentions the --dry-run option that you can use to try your
patterns out. fnmatch is similar to shell pattern matching, like the
backintime configuration fragment you've shown above. You can take
those as-is if you want all of those to apply. The patterns above that
start with / will not apply to your scenario, where you're backing up
/home/debian-user.
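
A sketch of such a trial run follows. The options --dry-run, --list
and --exclude-from belong to "borg create"; the repository path, the
archive name "pattern-test", the exclude file name and the wrapper
borg_preview are placeholders made up for illustration. By default the
wrapper only prints the command so it can be reviewed; it runs borg
only when RUN=1 is set.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around a borg dry run.
# All paths and the archive name are placeholders, not values from this thread.
borg_preview() {
    local repo=$1 exclude_file=$2 source_dir=$3
    # --dry-run: do not create an archive; --list: print the items considered.
    local cmd=(borg create --dry-run --list
               --exclude-from "$exclude_file"
               "${repo}::pattern-test" "$source_dir")
    if [ "${RUN:-0}" = 1 ]; then
        "${cmd[@]}"                    # actually invoke borg
    else
        printf '%s\n' "${cmd[*]}"      # only show what would be run
    fi
}

borg_preview /path/to/repo "$HOME/borg_exclude_list.txt" /home/debian-user
```

Comparing the listing from the real run against what you expect to be
kept or skipped is a way to test patterns without risking any data.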

>> [https://defuse.ca/audits/encfs.htm] talks about issues with encfs
>> design. There is nothing backintime can do to fix those.
>>
>> borg can encrypt its backup images, and it recommends enabling that.
>> So an adversary would not get access to the encfs ciphertext directly.
>> They could get access to borg ciphertext instead, which may or may not
>> be vulnerable to the same problems. AFAIK there hasn't been a security
>> audit of borgbackup itself. The page at
>> https://borgbackup.readthedocs.io/en/stable/internals/security.html#borgcrypto
>> describes the design of borg security.
>>
>
> I shall take for granted that backintime developers do not code encfs.
> Fine. But after 7 years (at least), why haven't they replaced encfs
> with a "safer" encryption scheme, or at least just removed it and
> simply not replaced it at all? IMHO, either option would seem far
> better than the status quo.

I had misunderstood the scenario. I'd read it as you using backintime
to back up encfs-encrypted content, not realizing that backintime uses
encfs to provide encrypted backups.

--
regards,
kushal

deloptes

Sep 28, 2021, 2:00:06 AM
Default User wrote:

> Hello!
>
> I want to try using borgbackup to do backups of my (only) user directory:
> /home/debian-user
>
> I just want to do so using Vorta, a GUI for borgbackup.
>
> But I just need a good, general list of directory and file type
> exclusions that I can just cut and paste into the Exclude Patterns
> window in Vorta. Something like the default list of exclusions that
> appears by default in the Backintime backup program.
>

I use this. I don't know about file types.

borg create --progress --stats --compression zstd,10 \
-e 'pp:/sys' \
-e 'pp:/proc' \
-e 'pp:/dev' \
-e 'pp:/run' \
-e 'pp:/tmp' \
-e 'pp:/var/tmp' \
-e 'pp:/var/log'

Default User

Oct 2, 2021, 8:00:05 PM
-------------------------------------------

Hi!

Just an update.

Here is what I came up with, cobbled together, from a number of sources:
*~
*.backup*
**/.cache
/boot/*
/BORG/*
/BORG/.?*
/dev
/dev/*
/dev/.?*
/etc/mtab
/home/*/.cache/
# /home/*/.cache/chromium/*
/home/*/.cache/mozilla/*
/home/*/.cache/mozilla/firefox/*
# /home/*/.claws-mail/tmp/*
/home/*/.gvfs
/home/*/.gvfs/*
/home/*/.gvfs/.?*
# /home/*/.googleearth/Cache/*
/home/*/.opt/
/home/*/.thumbnails/*
/lost+found
/lost+found/*
/lost+found/.?*
/media
/media/*
/mnt
/mnt/*
/proc
/proc/*
/root/.gvfs/*
/root/.gvfs/.?*
/run
/run/*
/sys
/sys/*
/tmp
/tmp/*
/usr/tmp/*
/var/backups/*
/var/cache/*
/var/cache/apt/archives/*.deb
/var/lib/dhcpcd/*
/var/tmp/*
~/.adobe/Flash_Player/AssetCache
~/.cache
~/.ccache
~/.gvfs
# ~/.local/share/Steam
~/.Private
~/.recent-applications.xbel
~/.recently-used.xbel
# ~/snap/*/*/.cache
# ~/.steam/root
~/.thumbnails
~/.var/app/*/cache
~/.xsession-errors
.cache
.cache/*
# .dropbox*
.gvfs
lost+found/*
.Private
.thumbnails*

This is saved as a text file, which can be altered as needed at any
time, and is copied and pasted into the Exclude Patterns window under
the Sources tab of Vorta. I want to use Vorta, at least for now,
because it is much more user-friendly than raw borgbackup.

Note that some of the entries are commented out, as they are not
needed currently, but are there to be uncommented as needed.

Also, there are probably more entries than are really necessary. I
hope to improve the exclude list over time.

Also note that there are directory entries that probably do not apply
to my immediate use case (backing up user stuff, from
/home/debian-user on down). For example:
/dev
/dev/*
/dev/.?*
/media
/media/*
/mnt
/mnt/*
/proc
/proc/*
/run
/run/*
/sys
/sys/*
/tmp
/tmp/*
/usr/tmp/*

And some things I just don't know if I need to back them up or not. Examples:

/BORG/*
/BORG/.?*
/etc/mtab
/home/*/.opt/
/lost+found
/lost+found/*
/lost+found/.?*
lost+found/*
.Private
~/.xsession-errors

Further, the Archlinux wiki rsync article states:
"If GVFS is installed [it is!], /home/*/.gvfs must be excluded to
prevent rsync errors." Thus, these are on the exclude list:
/home/*/.gvfs
/home/*/.gvfs/*
/home/*/.gvfs/.?*
/root/.gvfs/*
/root/.gvfs/.?*
~/.gvfs
.gvfs

And . . . I make a special point of excluding:
/media
/media/*
/mnt
/mnt/*

Why?

Because long ago, when I was just learning to use rsync, I tried to
use it to do a full-system backup. Since nobody told me that /mnt and
/media had to be specifically excluded, rsync did exactly what it was
told to do, recursively backing up everything, filling up an entire
1Tb hard drive, stopping only when it ran out of room!

I do not ever want that to happen again.

Fun fact:
I can use the same exclude list to mirror the same user stuff to a
different backup directory, with obsessive frequency, using rsync:

sudo rsync -avHx --delete --stats \
  --exclude-from "/home/debian-user/rsync_exclude_list.txt" \
  /home \
  /media/debian-user/backup_drive/backups_of_host_home_directory_only

Finally, it has been suggested that I could do full system backups
using borgbackup, and do full system restores, if needed, using
borgbackup from a live SystemRescue CD/USB.
I may try that in the future, as I learn more about borg. It is
complicated, or at least complex! But for now I feel secure knowing
that I can restore my system (non-data stuff) at any time, using
Timeshift. And +1 here also, for SystemRescue.

Well, I don't know if any of this was helpful or even interesting.
But here it is, FWIW.

Thanks for the replies!

Andrei POPESCU

Oct 3, 2021, 3:00:06 AM
On Sat, 02 Oct 21, 19:56:02, Default User wrote:
>
> And . . . I make a special point of excluding:
> /media
> /media/*
> /mnt
> /mnt/*
>
> Why?
>
> Because long ago, when I was just learning to use rsync, I tried to
> use it to do a full-system backup. Since nobody told me that /mnt and
> /media had to be specifically excluded, rsync did exactly what it was
> told to do, recursively backing up everything, filling up an entire
> 1Tb hard drive, stopping only when it ran out of room!
>
> I do not ever want that to happen again.
>
> Fun fact:
> I can use the same exclude list to mirror the same user stuff to a
> different backup directory, with obsessive frequency, using rsync:
>
> sudo rsync -avHx --delete --stats --exclude-from
> "/home/debian-user/rsync_exclude_list.txt" /home
> /media/debian-user/backup_drive/backups_of_host_home_directory_only

In case of system backups the top-level directories themselves should
probably be included because they are needed as mount points. rsync's -x
parameter should already take care of excluding the (special) mounts
*under* /mnt, /media, /dev, /sys, /run etc.


Hope this helps,
Andrei
--
http://wiki.debian.org/FAQsFromDebianUser

Default User

Oct 3, 2021, 10:40:05 AM
Hi Andrei, thanks for the tip!