bup save: error on vanished file?

Thomas Lotze

Feb 6, 2022, 8:18:24 AM
to bup-...@googlegroups.com
Hi all,

I've noticed a behaviour of bup save that I don't really see the point of.
If there is a good reason, I'd like to find out, otherwise I might be
tempted to provide a patch.

So, when running first bup index and then bup save, the indexed content may
change in the meantime. While further changes to files indexed as changed
are no problem and files changed only after indexing will go in the next
backup run, vanished files will cause bup save to exit with an error:

[Errno 2] No such file or directory: ...
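
The race behind this message can be reproduced with nothing but the Python standard library. The sketch below is not bup code: the list `indexed` merely stands in for the index phase and the `open()` call for the save phase, and the function name is made up for illustration.

```python
import errno
import os
import tempfile


def index_then_save():
    """Mimic the index/save race: record a file, remove it, then try to read it."""
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "note.txt")
        with open(path, "w") as f:
            f.write("hello")

        indexed = [path]   # stand-in for the index phase: remember what exists
        os.remove(path)    # the file vanishes before the save phase starts

        try:
            with open(indexed[0], "rb") as f:   # stand-in for the save phase
                f.read()
        except OSError as e:
            # A vanished file surfaces as ENOENT (errno 2).
            return e.errno, str(e)
    return None, None


err, msg = index_then_save()
print(err == errno.ENOENT, msg)
```

The returned message starts with the same "[Errno 2] No such file or directory" text that is quoted above.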

Of course, there is a point in bup save considering it an error if it cannot
do something it is supposed to (save a file which it cannot find). But in
the context of backing up a constantly changing set of files, a file
vanishing is just as valid a change as a file's contents changing or one
appearing. I think bup save should just accept it, maybe report a warning.
At least I think there should be an option for such a behaviour.

What do you think?

Thanks and cheers
Thomas

Nix

Feb 12, 2022, 9:19:33 AM
to Thomas Lotze, bup-...@googlegroups.com
On 6 Feb 2022, Thomas Lotze told this:

> Of course, there is a point in bup save considering it an error if it cannot
> do something it is supposed to (save a file which it cannot find). But in
> the context of backing up a constantly changing set of files, a file
> vanishing is just as valid a change as a file's contents changing or one
> appearing. I think bup save should just accept it, maybe report a warning.
> At least I think there should be an option for such a behaviour.
>
> What do you think?

I agree, FWIW. I'd prefer it if bup only reported errors if the save
actually failed to save files which *are* there: indexed files
disappearing are just a consequence of normal activity and should not be
considered errors.

--
NULL && (void)

Rob Browning

Feb 19, 2022, 1:52:32 PM
to Thomas Lotze, bup-...@googlegroups.com
Thomas Lotze <tho...@thomas-lotze.de> writes:

> Of course, there is a point in bup save considering it an error if it cannot
> do something it is supposed to (save a file which it cannot find). But in
> the context of backing up a constantly changing set of files, a file
> vanishing is just as valid a change as a file's contents changing or one
> appearing. I think bup save should just accept it, maybe report a warning.
> At least I think there should be an option for such a behaviour.
>
> What do you think?

I tend to agree, though I think the final answer depends on the
semantics we want, which it'd be nice to have clearly specified at some
point.

For example, if conceptually the index is just supposed to be a
metadata/state cache, then it would make perfect sense to ignore an
error like this. But if it's supposed to be a specification of what
should/shouldn't be in the save (which the --exclude options suggest),
then I could imagine someone might want to know when files are missing.

I suppose in many of my cases, where I'm indexing/saving lvm snapshots,
I'd be happy to know, since that would be very surprising and might
suggest more serious trouble.

This ties in to broader questions about error-related behavior. I've
thought for a while that I'd like to have clearer
intentions/expectations with respect to the handling of
errors/warnings/info, and I could, for example, imagine that we might
want to provide some kind of control:

bup ... save --missing-paths {ignore,warn,error} /home

Though I'm not at all sure we'd want to structure things exactly like
that, nor how fine grain we'd want the control to be.
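
As a rough illustration of what such a control could mean, here is a minimal Python sketch. It assumes the three values from the proposed (not existing) --missing-paths option; `MissingPaths` and `save_paths` are hypothetical names, and reading a file into a dict only stands in for the real save logic.

```python
import sys
from enum import Enum


class MissingPaths(Enum):
    """Hypothetical policy values, mirroring the proposed option."""
    IGNORE = "ignore"
    WARN = "warn"
    ERROR = "error"


def save_paths(paths, policy, log=sys.stderr):
    """Read each path and apply the policy to paths that have vanished.

    Returns (saved, exit_code): saved maps path -> bytes, and exit_code is
    nonzero only if policy is ERROR and at least one path was missing.
    """
    saved = {}
    exit_code = 0
    for path in paths:
        try:
            with open(path, "rb") as f:
                saved[path] = f.read()
        except FileNotFoundError:
            if policy is MissingPaths.IGNORE:
                continue
            # Both WARN and ERROR report the path; only ERROR fails the run.
            print(f"warning: {path} vanished since indexing", file=log)
            if policy is MissingPaths.ERROR:
                exit_code = 1
    return saved, exit_code
```

A command-line value could be mapped with MissingPaths("warn"). Only FileNotFoundError is treated as a disappearance here; other failures such as permission errors still propagate.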

In any case, I'll want to think about it, but in this particular
situation, it might well make sense to group disappearances with content
changes and unexpected appearances, and just ignore them by default.

Then, if desired, we could provide some way to specify when you actually
do want omissions to be treated as errors.

Thanks
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

Greg Troxel

Feb 20, 2022, 7:54:49 AM
to Rob Browning, Thomas Lotze, bup-...@googlegroups.com

Rob Browning <r...@defaultvalue.org> writes:

> In any case, I'll want to think about it, but in this particular
> situation, it might well make sense to group disappearances with content
> changes and unexpected appearances, and just ignore them by default.
>
> Then, if desired, we could provide some way to specify when you actually
> do want omissions to be treated as errors.

This sounds right to me, except that I lean to a warning that doesn't
change the return code, as being more useful averaged over all users.
But I don't think it matters much.



Alek Paunov

Mar 1, 2022, 5:35:03 PM
to Rob Browning, Thomas Lotze, bup-...@googlegroups.com
On 2/19/22 20:52, Rob Browning wrote:
> I tend to agree, though I think the final answer depends on the
> semantics we want, which it'd be nice to have clearly specified at some
> point.
>
> For example, if conceptually the index is just supposed to be a
> metadata/state cache, then it would make perfect sense to ignore an
> error like this. But if it's supposed to be a specification of what
> should/shouldn't be in the save (which the --exclude options suggest),
> then I could imagine someone might want to know when files are missing.
>

I also always use LVM (thin) snapshots for data VM volumes and FSs and
Volume shadow copy (via diskshadow.exe) under Cygwin.

I think we should strongly recommend that users not back up "hot" and
possibly inconsistent files because, contrary to the situation 15 years ago,
snapshots are widely available these days (btrfs has them built in, xfs/ext
sit on top of some LVM scheme by default in most distributions, and snapper
is a ready-to-use, convenient tool in both cases).

But if we assume a cold snapshot, one may ask why we have separate index
and save phases/ops. Because, per the original idea (DESIGN), the built-in
indexer should be only one of the ways to prepare this recipe.

For example, many DB engines do not change anything stat-able on their
files, so you have to add these to the index somehow; once the index
moves to SQLite this will even be easy :-).

Another example - duc is a project solely intended for extremely fast FS
indexing - one could modify duc with an option to generate bup indexes
some day.

Kind regards,
Alek

Nix

Mar 3, 2022, 12:24:39 PM
to Alek Paunov, Rob Browning, Thomas Lotze, bup-...@googlegroups.com
On 1 Mar 2022, Alek Paunov spake thusly:
> I also always use LVM (thin) snapshots for data VM volumes and FSs and
> Volume shadow copy (via diskshadow.exe) under Cygwin.
>
> I think we should strongly recommend users to not backup "hot" and
> possibly inconsistent files, because contrary to situation 15 years
> ago, snapshots are widely available these days (btrfs - builtin,
> xfs/ext are over some LVM schema by default in most distributions,

Yes... but LVM snapshots of the root filesystem (more generally, the fs
with /etc, /lib and /bin on it) are not recommended: they are (still)
decidedly deadlock-prone. Because I don't like backups hanging my entire
system irrecoverably, I routinely run backups without using snapshots
(and basically avoid LVM snapshots like the plague for almost any
reason: they're just too risky).

> Another example - duc is a project solely intended for extreme fast FS
> indexing

I don't see any real way to do that without fs assistance, at least not
on spinning rust. (XFS has BULKSTAT ioctls which should speed up
statting if you're indexing a whole filesystem, but I've never done
anything about it...)

--
NULL && (void)

Rob Browning

Mar 4, 2022, 7:26:26 PM
to Nix, Alek Paunov, Thomas Lotze, bup-...@googlegroups.com
Nix <n...@esperi.org.uk> writes:

> Yes... but LVM snapshots of the root filesystem (more generally, the fs
> with /etc, /lib and /bin on it) are not recommended: they are (still)
> decidedly deadlock-prone. Because I don't like backups hanging my entire
> system irrecoverably, I routinely run backups without using snapshots
> (and basically avoid LVM snapshots like the plague for almost any
> reason: they're just too risky).

Hmm didn't know that -- I don't think I've had any trouble across a
number of (Debian) machines (all ext4 if it matters, and all backing up
/). Perhaps I've just been lucky.

Nix

Mar 6, 2022, 9:35:34 AM
to Rob Browning, Alek Paunov, Thomas Lotze, bup-...@googlegroups.com
On 5 Mar 2022, Rob Browning told this:

> Nix <n...@esperi.org.uk> writes:
>
>> Yes... but LVM snapshots of the root filesystem (more generally, the fs
>> with /etc, /lib and /bin on it) are not recommended: they are (still)
>> decidedly deadlock-prone. Because I don't like backups hanging my entire
>> system irrecoverably, I routinely run backups without using snapshots
>> (and basically avoid LVM snapshots like the plague for almost any
>> reason: they're just too risky).
>
> Hmm didn't know that -- I don't think I've had any trouble across a
> number of (Debian) machines (all ext4 if it matters, and all backing up
> /). Perhaps I've just been lucky.

Yeah, you need to be resource-constrained enough that paging will kick
in, and unlucky in that LVM's attempts to mlock itself into memory
didn't work (which can still happen: it happens rarely enough and the
system is unresponsive enough afterwards, with the rootfs suspended
until reboot, that nobody has yet been able to track it down or even
confirm whether it is actually fixed or not).

Since I prefer non-deadlocks -- and have a nasty complex mass of
bind-mounts and semioverlapping filesystems and the like to back up --
I'm just doing boring no-snapshot backups. Disappearances of files
between index and backup, particularly under ~/.config and maildirs, are
fairly common. (They all get resolved in the next backup anyway, and for
homedirs that's only three hours away for me.)

--
NULL && (void)

Greg Troxel

Mar 6, 2022, 10:52:05 AM
to bup-...@googlegroups.com

I also run backups without snapshots. My basic view is that if a file
is created or deleted during a backup run, it doesn't really matter if
it gets backed up or not. As long as my backup includes any file that
exists before I start and is not removed/moved during the backup, I'm
happy.
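
Stated as a check over path sets, that criterion might look like the following minimal Python sketch. The function name and the three set arguments are assumptions made for illustration, and a file that is deleted and then recreated during the run is simply treated as stable here.

```python
def backup_is_acceptable(before, after, saved):
    """True if every path present both before and after the run was saved.

    before: paths that existed when the backup started
    after:  paths that still exist when it finished
    saved:  paths that actually ended up in the backup
    """
    stable = set(before) & set(after)   # neither removed nor moved during the run
    return stable <= set(saved)


# A file that vanished mid-run ("tmp") may be missing from the backup;
# a file that stayed put ("mail") may not.
print(backup_is_acceptable({"mail", "tmp"}, {"mail"}, {"mail"}))
print(backup_is_acceptable({"mail", "tmp"}, {"mail", "tmp"}, {"mail"}))
```

Files created during the run do not affect the result either way, which matches the view that it doesn't matter whether they get backed up.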

I'm not trying to be negative about people using filesystem snapshots.
Just that bup seems to work well without that.

Alek Paunov

Mar 10, 2022, 7:22:09 PM
to Nix, Thomas Lotze, bup-...@googlegroups.com
On 3/6/22 16:35, Nix wrote:
> On 5 Mar 2022, Rob Browning told this:
>
>> Nix <n...@esperi.org.uk> writes:
>>
>>> Yes... but LVM snapshots of the root filesystem (more generally, the fs
>>> with /etc, /lib and /bin on it) are not recommended: they are (still)
>>> decidedly deadlock-prone. Because I don't like backups hanging my entire
>>> system irrecoverably, I routinely run backups without using snapshots
>>> (and basically avoid LVM snapshots like the plague for almost any
>>> reason: they're just too risky).
>>
>> Hmm didn't know that -- I don't think I've had any trouble across a
>> number of (Debian) machines (all ext4 if it matters, and all backing up
>> /). Perhaps I've just been lucky.
>
> Yeah, you need to be resource-constrained enough that paging will kick
> in, and unlucky in that LVM's attempts to mlock itself into memory
> didn't work (which can still happen: it happens rarely enough and the
> system is unresponsive enough afterwards, with the rootfs suspended
> until reboot, that nobody has yet been able to track it down or even
> confirm that is actually fixed or not).

Could you point me to that bug report, please? (I consider this a bug,
because snapshotting a mounted filesystem is an officially supported
feature of LVM.)

As I said, I have been using SUSE's snapper [1] utility on dozens of
Fedora instances for at least 7 years (operating on mounted filesystems
on thin LVM volumes) without any issues so far:

/etc/snapper/configs/root:
SUBVOLUME="/"
FSTYPE="lvm(ext4)"
...

Kind regards,
Alek

P.S. Sorry for the late reply!

[1] http://snapper.io/
https://www.mankier.com/8/snapper

Nix

Mar 11, 2022, 10:05:39 AM
to Alek Paunov, Thomas Lotze, bup-...@googlegroups.com
On 11 Mar 2022, Alek Paunov uttered the following:

> On 3/6/22 16:35, Nix wrote:
>> On 5 Mar 2022, Rob Browning told this:
>>
>>> Nix <n...@esperi.org.uk> writes:
>>>
>>>> Yes... but LVM snapshots of the root filesystem (more generally, the fs
>>>> with /etc, /lib and /bin on it) are not recommended: they are (still)
>>>> decidedly deadlock-prone. Because I don't like backups hanging my entire
>>>> system irrecoverably, I routinely run backups without using snapshots
>>>> (and basically avoid LVM snapshots like the plague for almost any
>>>> reason: they're just too risky).
>>>
>>> Hmm didn't know that -- I don't think I've had any trouble across a
>>> number of (Debian) machines (all ext4 if it matters, and all backing up
>>> /). Perhaps I've just been lucky.
>> Yeah, you need to be resource-constrained enough that paging will kick
>> in, and unlucky in that LVM's attempts to mlock itself into memory
>> didn't work (which can still happen: it happens rarely enough and the
>> system is unresponsive enough afterwards, with the rootfs suspended
>> until reboot, that nobody has yet been able to track it down or even
>> confirm that is actually fixed or not).
>
> Could you point to that bug report somewhere, please (I am considering this as a bug, because the snapshotting of mounted filesystem
> is a officially supported feature of the LVM).

The bug report was a conversation with Alasdair Kergon years ago over a
coffee, I'm afraid. Not really too helpful. Since then I've
occasionally run into reports of others' deadlocking, without ever
keeping track of them... it was just enough to keep me from trying
myself.

> As I said, I am using SUSE's snapper [1] utility program on tens of Fedora instances at least from 7 years (operating to mounted
> filesystems on thin LVMs) without any issues so far:

If it works for you, then great! :) It's much more likely to go wrong
under memory pressure, and most non-cloud-servery, non-mobile systems
these days don't spend very much time under memory pressure.