
deduplicating file systems: VDO with Debian?


hw

Nov 6, 2022, 9:10:06 PM
Hi,

I discovered that Red Hat has VDO[1] to take care of deduplicating file systems.
Aptitude didn't find any packages for it.

Is there no VDO in Debian, and what would be good to use for deduplication with
Debian? Why isn't VDO in the standard kernel? Or is it?

I'm not looking for deduplication that happens some time after files have
already been written, as btrfs would allow: there is no point in deduplicating
backups after they're done, because I don't need to save disk space for them
when I can fit them in the first place.


[1]:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/deduplicating_and_compressing_storage/deploying-vdo_deduplicating-and-compressing-storage#doc-wrapper

Anders Andersson

Nov 7, 2022, 3:20:06 AM
You could always buy a Red Hat Enterprise Linux license, sign up for a support contract, and ask if they could start supporting other operating systems? ("Each branch on this project is intended to work with a specific release of Enterprise Linux").

I would be more worried if my backup storage didn't have enough room for at least one full, fresh, unique backup from one client.
 - If it doesn't, and something unexpected happens (a user fills the whole disk with something, malware encrypts all data = changes everything to unique files, etc.), then it will fill up the disk and ruin every other backup.
 - If you *do* have room for one client but not many more, you can always deduplicate after each client backup, which should reclaim everything if nothing changed.

hw

Nov 7, 2022, 4:40:05 AM
On Mon, 2022-11-07 at 09:14 +0100, Anders Andersson wrote:
> On Mon, Nov 7, 2022 at 3:04 AM hw <h...@adminart.net> wrote:
>
> > Hi,
> >
> > I discovered that Redhat has VDO[1] to take care of deduplicating file
> > systems.
> > Aptitude didn't find any packages towards that.
> >
> > Is there no VDO in Debian, and what would be good to use for deduplication
> > with
> > Debian?  Why isn't VDO in the standard kernel? Or is it?
> >
> > I'm not looking for deduplication that happens some time after files have
> > already been written like btrfs would allow: There is no point in
> > deduplicating
> > backups after they're done because I don't need to save disk space for
> > them when
> > I can fit them in the first place.
> >
> >
> > [1]:
> >
> > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/deduplicating_and_compressing_storage/deploying-vdo_deduplicating-and-compressing-storage#doc-wrapper
> >
> >
> You could always buy Red Hat Enterprise Linux license, sign up for a
> support contract, and ask if they could start supporting other operating
> systems? ("Each branch on this project is intended to work with a specific
> release of Enterprise Linux").

Huh? What would that accomplish?

> I would be more worried if my backup storage didn't have enough room for at
> least a full fresh and unique backup from one client.
>  - If it doesn't and something unexpected happens (user fills the whole
> disk with something, malware encrypts all data = changes everything to
> unique files, etc) then it will fill up the disk and ruin every other
> backup.
>  - If you *do* have room for one client but not many more, you can always
> deduplicate after each client backup which should regain everything if
> nothing changed.

None of this applies in this case. Are you saying that deduplication is not
possible with Debian?

didier gaumet

Nov 7, 2022, 5:40:05 AM
On 07/11/2022 at 10:30, hw wrote:

Hello,

Disclaimer: I am really almost ignorant about deduplication

> On Mon, 2022-11-07 at 09:14 +0100, Anders Andersson wrote:
>> On Mon, Nov 7, 2022 at 3:04 AM hw <h...@adminart.net> wrote:
[...]
>> You could always buy Red Hat Enterprise Linux license, sign up for a
>> support contract, and ask if they could start supporting other operating
>> systems? ("Each branch on this project is intended to work with a specific
>> release of Enterprise Linux").
>
> Huh? What would that accomplish?

I think what Anders is trying to point out is that each VDO release is
specific to Red Hat, and further, specific to a particular Red Hat
release. By definition that would complicate potential VDO integration
into Debian.

[...]
> Are you saying that deduplication is not
> possible with Debian?

I may be mistaken, but I think there is a confusion here between
deduplication at the filesystem level and at the backup-tool level.

At the (Linux) filesystem level, I think in-line deduplication is only
provided by ZFS (and perhaps, out-of-tree, btrfs).

I do not know your use case precisely, but if it is to prevent
duplication during backup, just use a deduplicating backup tool; it does
just that: it avoids duplicating backup objects before the duplication
can occur. Searching for deduplicating software packaged in Debian
('apt search dedup' in a terminal) and sorting out the backup tools
would give you clues.

Curt

Nov 7, 2022, 8:10:05 AM
On 2022-11-07, hw <h...@adminart.net> wrote:
>
> None of this applies in this case. Are you saying that deduplication is not
> possible with Debian?
>
>

curty@einstein:~$ apt-cache search deduplication
backuppc - high-performance, enterprise-grade system for backing up PCs
borgbackup - deduplicating and compressing backup program
borgbackup-doc - deduplicating and compressing backup program (documentation)
jdupes - identify and delete or link duplicate files
libsxclient-dev - Scalable public and private cloud storage
libsxclient3 - Scalable public and private cloud storage
sx - Scalable public and private cloud storage
zbackup - Versatile deduplicating backup tool


I had to look up the word deduplicate (I was going to say, "That isn't even a
word!"), which reveals my extensive knowledge of the matter.

Dan Ritter

Nov 7, 2022, 9:50:05 AM
didier gaumet wrote:
>
> I may be mistaken, but I think there is a confusion here about a
> deduplication at filesystem level and at backup tool level.
>
> At (linux) filesystem level, I think in-line deduplication is only provided
> by ZFS (and perhaps, out-of-tree, BTRFS)

ZFS deduplication is a special beast that usually does not make
people happy. It is an enterprise feature that really only works
for special cases, and requires a lot of RAM - 1GB per 1TB of
storage - to work. Worst of all, it cannot be gracefully turned
off.

As you say, deduplication in backup systems is quite common, and works
pretty well. There's also an on-disk non-filesystem utility, rdfind,
which is packaged in Debian. It can discover identical files and make
them hardlinks.
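A minimal rdfind session might look like this (a sketch, assuming rdfind is installed; the demo directory and its files are created on the fly rather than taken from a real backup tree):

```shell
# Requires the 'rdfind' package; exit quietly if it is not there.
command -v rdfind >/dev/null || { echo "rdfind not installed"; exit 0; }

# Two files with identical content but separate inodes.
demo=$(mktemp -d)
printf 'same bytes\n' > "$demo/a.txt"
printf 'same bytes\n' > "$demo/b.txt"
cd "$demo"   # rdfind writes its results file into the current directory

rdfind -dryrun true -makehardlinks true "$demo"   # report only, changes nothing
rdfind -makehardlinks true "$demo"                # actually hardlink duplicates

# Both names now point at the same inode:
stat -c '%i' "$demo/a.txt" "$demo/b.txt"
```

The dry-run first is the sensible order on real data: rdfind's report shows what it would collapse before any hardlinking happens.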

-dsr-

hede

Nov 7, 2022, 10:30:05 AM
On 07.11.2022 02:57, hw wrote:
> Hi,
>
> Is there no VDO in Debian, and what would be good to use for
> deduplication with
> Debian? Why isn't VDO in the standard kernel? Or is it?

I used VDO on Debian some time ago and don't remember big
problems. AFAIR I compiled it myself - there were no prebuilt packages.

I switched to btrfs for other reasons, not for performance. The VDO
layer eats performance, yes, but compared to naked ext4 even btrfs is
slow.

> I'm not looking for deduplication that happens some time after files
> have
> already been written like btrfs would allow: There is no point in
> deduplicating
> backups after they're done because I don't need to save disk space for
> them when
> I can fit them in the first place.

That's only one point, and not really a valid one, I think, as
you typically do not run into space problems with a single action
(YMMV). Running multiple sessions with out-of-band deduplication between
them works for me.

In-band deduplication (the kind you want) has some drawbacks, too:
high resource usage. You need plenty of RAM (up to several gigabytes
per terabyte of storage), and write success is delayed (-> slow direct I/O).

For out-of-band deduplication there are multiple different
implementations. File-based dedup on a directory basis can be very fast
and economical with resources, for example via rdfind or jdupes.
Block-based dedup, like bees for btrfs (the one I use), is closer to
in-band deduplication (including the high RAM usage). Bees can be
switched off and on at any time (for example on a small home system
which runs more demanding tasks from time to time), and switching it on
again resumes at the last state (it starts at the last transaction id
which was processed -> btrfs knows its transactions).
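For the file-based case, a jdupes run can be sketched like this (assuming jdupes is installed and that I recall its -L/--linkhard option correctly; the demo directory is created on the fly):

```shell
# Requires the 'jdupes' package; exit quietly if it is not there.
command -v jdupes >/dev/null || { echo "jdupes not installed"; exit 0; }

demo=$(mktemp -d)
printf 'same bytes\n' > "$demo/a.txt"
printf 'same bytes\n' > "$demo/b.txt"

jdupes -r "$demo"       # -r: recurse; prints groups of identical files
jdupes -r -L "$demo"    # -L: replace duplicates with hardlinks

stat -c '%i' "$demo/a.txt" "$demo/b.txt"   # same inode number twice afterwards
```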

regards
hede

hede

Nov 7, 2022, 11:10:05 AM
On 07.11.2022 16:29, hede wrote:
> On 07.11.2022 02:57, hw wrote:
>> Hi,
>>
>> Is there no VDO in Debian, and what would be good to use for
>> deduplication with
>> Debian? Why isn't VDO in the standard kernel? Or is it?
>
> I have used vdo in Debian some time ago and didn't remember big
> problems.

Btw., please keep in mind: VDO is transparent to the filesystem on top,
and deduplication (likewise compression) is a non-deterministic task.

Where btrfs' calculation of the real free space is tricky when compression
and/or dedup is in use, it's quite impossible for a filesystem on top of
VDO. It's much worse with VDO: the filesystem on top sees a "virtual"
size of the device which is a vague guess at best and is fixed at
creation time. You need to carefully monitor the actual disk usage of
the VDO device and stop writing data to the filesystem if it fills up.
It stalls if the filesystem wants to write more data than is available.
(At least if I remember correctly. Please correct me if I'm wrong here.)
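A monitoring sketch for that, assuming the RHEL-style VDO userspace tools are installed and a made-up volume name 'vdo0' (the column layout of vdostats is an assumption on my side, so check it on your system first):

```shell
# Needs the VDO userspace tools (RHEL-like systems); exit quietly otherwise.
command -v vdostats >/dev/null || { echo "VDO tools not installed"; exit 0; }

# Physical vs. logical usage of the (made-up) volume 'vdo0'.
vdostats --human-readable vdo0 || exit 0

# Crude cron-able guard: warn once physical usage exceeds 80%.
# Assumption: column 5 of the data row is the Use% figure.
used=$(vdostats vdo0 | awk 'NR==2 {gsub(/%/,"",$5); print $5}')
if [ "$used" -gt 80 ]; then
    echo "vdo0 is ${used}% full -- stop writing to it!"
fi
```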

So if you are expecting issues with space, there's some risk of damaging
your (file)system.

With something like btrfs or ZFS there's less risk of that. Both know
their free space, and even if this was indeed a problem in the early
days*, rebalancing and filled-up filesystems are (AFAIK) no longer a
problem with btrfs.

*) Running out of space on btrfs could render the filesystem read-only,
and deleting files was no longer possible: COW means even deleting a
file needs some space, so that broke. This is AFAIK resolved; for
deleting files there's now always some reserved space.

regards
hede

rhkr...@gmail.com

Nov 7, 2022, 2:00:05 PM
> didier gaumet wrote:
> > I may be mistaken, but I think there is a confusion here about a
> > deduplication at filesystem level and at backup tool level.

I didn't (and don't) know much about deduplication (beyond what you might
deduce from the name), so I googled and found this article, which was helpful
to me:

* Lets know about VDO (virtual data optimizer):
https://www.linkedin.com/pulse/lets-know-vdo-virtual-data-optimizer-ganesh-gaikwad


--
rhk

If you reply: snip, snip, and snip again; leave attributions; avoid HTML;
avoid top posting; and keep it "on list". (Oxford comma included at no
charge.) If you change topics, change the Subject: line.

Writing is often meant for others to read and understand (legal agreements
excepted?) -- make it easier for your reader by various means, including
liberal use of whitespace and minimal use of (obscure?) jargon, abbreviations,
acronyms, and references.

If someone else has already responded to a question, decide whether any
response you add will be helpful or not ...

A picture is worth a thousand words -- divide by 10 for each minute of video
(or audio) or create a transcript and edit it to 10% of the original.

hw

Nov 7, 2022, 10:50:05 PM
On Mon, 2022-11-07 at 11:32 +0100, didier gaumet wrote:
> On 07/11/2022 at 10:30, hw wrote:
>
> Hello,
>
> Disclaimer: I am really almost ignorant about deduplication
>
> > On Mon, 2022-11-07 at 09:14 +0100, Anders Andersson wrote:
> > > On Mon, Nov 7, 2022 at 3:04 AM hw <h...@adminart.net> wrote:
> [...]
> > > You could always buy Red Hat Enterprise Linux license, sign up for a
> > > support contract, and ask if they could start supporting other operating
> > > systems? ("Each branch on this project is intended to work with a specific
> > > release of Enterprise Linux").
> >
> > Huh?  What would that accomplish?
>
> I think that what Anders tries to underline is that each VDO release is
> specific to Redhat, and further, is specific to a particular Redhat
> release. By definition that would complicate potential VDO integration
> in Debian.

At least in theory, it should be in CentOS, but if it's so specific, who knows
if it causes compatibility issues ...

> [...]
> > Are you saying that deduplication is not
> > possible with Debian?
>
> I may be mistaken, but I think there is a confusion here about a
> deduplication at filesystem level and at backup tool level.
>
> At (linux) filesystem level, I think in-line deduplication is only
> provided by ZFS (and perhaps, out-of-tree, BTRFS)

That's what it seems like, except for VDO. Unfortunately, ZFS is said to need
5--6GB of RAM for each 1TB of data, and that would require upgrading my server.

> I do not know precisely your usecase, but if it is to prevent
> duplication during backup, just use a deduplicating backup tool, it just
> do that: avoid duplicating backup objects before it could occur.
> searching for deduplicating software packaged in Debian ('apt search
> dedup' in a terminal) and sorting backup ones would give you clues.

Actually that's a good idea I didn't think of. But thinking about it, is it
a good idea?

When I want to have 2 (or more) generations of backups, do I actually want
deduplication? It leaves me with only one actual copy of the data, which seems
to defeat the idea of having multiple generations of backups, at least to some
extent.

The question is then whether it makes a difference. It also raises the question
whether I need (want) multiple generations of backups, especially when I end up
with only one copy anyway. Hmm ...

hw

Nov 7, 2022, 11:10:06 PM
On Mon, 2022-11-07 at 09:30 -0500, Dan Ritter wrote:
> didier gaumet wrote:
> >
> > I may be mistaken, but I think there is a confusion here about a
> > deduplication at filesystem level and at backup tool level.
> >
> > At (linux) filesystem level, I think in-line deduplication is only provided
> > by ZFS (and perhaps, out-of-tree, BTRFS)
>
> ZFS deduplication is a special beast that usually does not make
> people happy. It is an enterprise feature that really only works
> for special cases, and requires a lot of RAM - 1GB per 1TB of
> storage - to work. Worst of all, it cannot be gracefully turned
> off.

Only 1GB/1TB? The FreeBSD handbook says 5--6GB per 1TB. I could live with 1:1,
and I wouldn't need to turn it off. The idea, in this case, is to make two
generations of backups of the "same" data without having all the disk space
needed for both of them.

> As you say, deduplication in backup systems is quite common, and works
> pretty well. There's also an on-disk non-filesystem utility, rdfind,
> which is packaged in Debian. It can discover identical files and make
> them hardlinks.

Well, if I had all the disk space to hold 2 full copies of the data to be able
to deduplicate it only later, I wouldn't need to deduplicate anything.

And how would pretending there are two backups, while there's actually only one
because it got deduplicated, be better than having only one backup to begin with?
(Yeah, I hadn't thought of that before ...)

Maybe use a snapshot to create the 2nd backup? Or what?

hw

Nov 7, 2022, 11:20:05 PM
On Mon, 2022-11-07 at 13:57 -0500, rhkr...@gmail.com wrote:
>
>
> I didn't (and don't) know much about deduplication (beyond what you might
> deduce from the name), so I google and found this article which was helpful to
> me:
>
>    * Lets know about VDO (virtual data optimizer):
> https://www.linkedin.com/pulse/lets-know-vdo-virtual-data-optimizer-ganesh-gaikwad

That's a good pointer, but I still wonder how VDO actually works. For example,
if I have a volume with 5TB of data on it and I write a 500kB file to that
volume a week later or whenever, and the file I'm writing is identical to
another file somewhere within the 5TB of data already on the volume, how does
VDO figure out that both files are identical? ZFS does it by keeping lots of
data in memory so it can look it up right away, but VDO? Will it write the new
file at first and check it later in the background and re-use the space later,
or will it delay the write to check it first? Or does it do something else?

hw

Nov 7, 2022, 11:40:05 PM
On Mon, 2022-11-07 at 16:29 +0100, hede wrote:
> On 07.11.2022 02:57, hw wrote:
> > Hi,
> >
> > Is there no VDO in Debian, and what would be good to use for
> > deduplication with
> > Debian?  Why isn't VDO in the standard kernel? Or is it?
>
> I have used vdo in Debian some time ago and didn't remember big
> problems. AFAIR I did compile it myself - no prebuild packages.

Cool, I could give that a try, ty.

> I switched to btrfs for other reasons. Not even for performance. The VDO
> Layer eats performance, yes, but compared to naked ext4 even btrfs is
> slow.

Really? I never noticed that btrfs was slow. But then, it's been a long
time since I used ext4 ...

> > There is no point in
> > deduplicating
> > backups after they're done because I don't need to save disk space for
> > them when
> > I can fit them in the first place.
>
> That's only one point.

What are the others?

> And it's not really some valid one, I think, as
> you do typically not run into space problems with one single action
> (YMMV). Running multiple sessions and out-of-band deduplication between
> them works for me.

That still requires you to have enough disk space for at least two full backups.
I can see it working for three backups because you can deduplicate the first
two, but not for two. And why would I deduplicate when I have sufficient disk
space?

> In-band deduplication (that's the one you want) has some drawbacks, too:
> High Ressource usage. You need plenty of RAM (up to several Gigabytes
> per Terabyte Storage) and write success is delayed (-> slow direct i/o).

Well, if it takes 5 days or so to make a backup, that won't be very useful. It
already takes more than long enough because my disks can only sustain so much.

> For Out-of-Band deduplication there are multiple different
> implementations. File based dedup on directory basis can be very fast
> and resource economical, for example via rdfind or jdupes. Block based
> like via bees for btrfs (that's the one I use) is more close to in-band
> deduplication (including high RAM usage). Bees can be switched off and
> on at any time (for example if it's a small home-system which runs more
> demanding tasks from time to time) and switching it on again resumes at
> the last state (it starts at the last transaction id which was processed
> -> btrfs knows its transactions).

Hm. I wouldn't mind running it from time to time, though I don't know that I
would have a lot of duplicate data other than backups. How much space might I
expect to gain from using bees, and how much memory does it require to run?

hw

Nov 7, 2022, 11:50:05 PM
On Mon, 2022-11-07 at 17:01 +0100, hede wrote:
> On 07.11.2022 16:29, hede wrote:
> > On 07.11.2022 02:57, hw wrote:
> > > Hi,
> > >
> > > Is there no VDO in Debian, and what would be good to use for
> > > deduplication with
> > > Debian?  Why isn't VDO in the standard kernel? Or is it?
> >
> > I have used vdo in Debian some time ago and didn't remember big
> > problems.
>
> Btw. please keep in mind: VDO is transparent to the filesystem on-top.
> And deduplication (likewise compression) is some non-deterministic task.
>
> Where btrfs' calculation of the real free space is tricky if compression
> and/or dedup is in use, it's quite impossible for a filesystem ontop of
> VDO. It's much worse with VDO. The filesystem on top sees a "virtual"
> size of the device which is a vague guess at best and is predefined on
> creation time. You need to carefully monitor the actual disk usage of
> the VDO device and stop writing data to the filesystem if it fills up.

Yes, I figured that would be a problem.

> It stalls if the filesystem wants to write more data than is available.
> (At least if I remember correctly. Please correct me if I'm wrong here.)

Like NFS? And then what? There isn't really a way to resolve that problem once
you've run into it, is there?

> So if you are expecting issues with space, there's some risk in damaging
> your (file-)system.

Even damage it? That would really suck.

> With something like btrfs or ZFS there's less risk in that. Both do know
> the free space and even if this was indeed a Problem in first days*,
> rebalancing and filled up filesystems are (AFAIK) no longer a problem
> with btrfs.

Well, I'm finding btrfs somewhat disappointing since it doesn't support
deduplication like ZFS does, and even RAID56 is still broken. It feels like the
available file systems haven't been up to the task for almost a decade and might
never catch up.

David Christensen

Nov 8, 2022, 12:50:06 AM
On 11/7/22 19:49, hw wrote:
> On Mon, 2022-11-07 at 11:32 +0100, didier gaumet wrote:

>> At (linux) filesystem level, I think in-line deduplication is only
>> provided by ZFS (and perhaps, out-of-tree, BTRFS)
>
> That's what it seems like, except VDO. Unfortunately, ZFS is said to need 5--
> 6GB of RAM for each 1TB of data, and that would require upgrading my server.


On my ZFS storage and backup servers, ZFS seems to grab the majority of
available memory. I have been unable to figure out a way to measure
memory consumed by deduplication.


> When I want to have 2 (or more) generations of backups, do I actually want
> deduplication? It leaves me with only one actual copy of the data which seems
> to defeat the idea of having multiple generations of backups at least to some
> extent.
>
> The question is then if it makes a difference. It also creates the question if
> I need (want) multiple generations of backups, especially when I end up with
> only one copy anyway. Hmm ...


I put rsync-based backups on ZFS storage with compression and
de-duplication. du(1) reports 33 GiB for the current backups (i.e. the
uncompressed and/or duplicated size). zfs-auto-snapshot takes snapshots
of the backup filesystems daily and monthly, and I take snapshots
manually every week. I have 78 snapshots going back ~6 months. du(1)
reports ~3.5 TiB for the snapshots. 'zfs list' reports 86.2 GiB of
actual disk usage for all 79 backups. So, ZFS de-duplication and
compression leverage my backup storage by 41:1.


ZFS compression and de-duplication also works well for jails/ VM's.


For general data, I use compression alone.


For compressed and/or encrypted archives, images, etc., I do not use
compression or de-duplication.


The key is to only use de-duplication when there is a lot of duplication.


And, to a lesser extent, to only use compression on uncompressed data
(lz4 detects compressed data and does not try to compress it further).


My ZFS pools are built with HDD's. I recently added an SSD-based vdev
as a dedicated 'dedup' device, and write performance improved
significantly when receiving replication streams.
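For reference, the ratios and the dedup vdev described above can be inspected with standard OpenZFS commands; a sketch, with the pool name 'tank' and the device path made up:

```shell
# Needs OpenZFS and a pool named 'tank'; exit quietly otherwise.
command -v zpool >/dev/null || { echo "OpenZFS not installed"; exit 0; }
zpool list tank >/dev/null 2>&1 || { echo "no pool 'tank'"; exit 0; }

zpool get dedupratio tank      # overall dedup ratio of the pool
zfs get compressratio tank     # compression ratio of a dataset

# Attaching a dedicated SSD-backed dedup vdev (OpenZFS 0.8+ allocation
# classes); left commented out because it changes the pool:
# zpool add tank dedup /dev/disk/by-id/some-ssd
```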


David

hw

Nov 8, 2022, 2:20:05 AM
On Mon, 2022-11-07 at 21:46 -0800, David Christensen wrote:
> On 11/7/22 19:49, hw wrote:
> > On Mon, 2022-11-07 at 11:32 +0100, didier gaumet wrote:
>
> > > At (linux) filesystem level, I think in-line deduplication is only
> > > provided by ZFS (and perhaps, out-of-tree, BTRFS)
> >
> > That's what it seems like, except VDO.  Unfortunately, ZFS is said to need
> > 5--
> > 6GB of RAM for each 1TB of data, and that would require upgrading my server.
>
>
> On my ZFS storage and backup servers, ZFS seems to grab the majority of
> available memory.  I have been unable to figure out a way to measure
> memory consumed by deduplication.

Are you deduplicating? Apparently some people say bad things happen when ZFS
runs out of memory from deduplication.

> > The question is then if it makes a difference.  It also creates the question
> > if
> > I need (want) multiple generations of backups, especially when I end up with
> > only one copy anyway.  Hmm ...
>
>
> I put rsync based backups on ZFS storage with compression and
> de-duplication.  du(1) reports 33 GiB for the current backups (e.g.
> uncompressed and/or duplicated size).  zfs-auto-snapshot takes snapshots
> of the backup filesystems daily and monthly, and I take snapshots
> manually every week.  I have 78 snapshots going back ~6 months.  du(1)
> reports ~3.5 TiB for the snapshots.  'zfs list' reports 86.2 GiB of
> actual disk usage for all 79 backups.  So, ZFS de-duplication and
> compression leverage my backup storage by 41:1.

I'm unclear as to how snapshots come in when it comes to making backups. What
if you have a bunch of snapshots and want to get a file from 6 generations of
backups ago? I never figured out how to get something out of an old snapshot
and found it all confusing, so I don't even use them.

33GB in backups is far from a terabyte. I have a lot more than that.

> ZFS compression and de-duplication also works well for jails/ VM's.
>
>
> For general data, I use compression alone.
>
>
> For compressed and/or encrypted archives, image, etc., I do not use
> compression or de-duplication

Yeah, they wouldn't compress. Why no deduplication?

> The key is to only use de-duplication when there is a lot of duplication.

How do you know if there's much to deduplicate before deduplicating?

> And, to a lesser extend, to only use compression on uncompressed data
> (lz4 detects compressed data and does not try to compress it further).
>
>
> My ZFS pools are built with HDD's.  I recently added an SSD-based vdev
> as a dedicated 'dedup' device, and write performance improved
> significantly when receiving replication streams.

Hm, with the ZFS I set up a couple years ago, the SSDs wore out and removing
them without any replacement didn't decrease performance.

I'm not too fond of ZFS, especially not when considering performance. But for
backups, it won't matter.

didier gaumet

Nov 8, 2022, 4:10:06 AM
There are elements of an answer in the Red Hat documentation:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/vdo-integration
and blog, that exposes performance trade-off:
https://www.redhat.com/en/blog/look-vdo-new-linux-compression-layer

From what I understand, VDO was designed as a layer in kernel space to
provide deduplication and compression features to local or distributed
filesystems that lack them. The goal was primarily to optimize storage
space for a provider of networked virtual machines to entities or customers.

didier gaumet

Nov 8, 2022, 4:30:05 AM
On 08/11/2022 at 04:49, hw wrote:
[...]
> When I want to have 2 (or more) generations of backups, do I actually want
> deduplication? It leaves me with only one actual copy of the data which seems
> to defeat the idea of having multiple generations of backups at least to some
> extent.
[...]

I would think there is also a confusion here (in my opinion, but I may
be wrong):

- Deduplication(1) is the action of preventing or correcting multiple
occurrences of an object. The criterion here is: are the objects identical?

- Incremental/differential backup(2) is the action of backing up only
objects (or deltas of objects) that have changed between backups, thus
forbidding duplicates (on the target storage) of objects that have not
changed.
But that definitely does not suppress duplicates on the source storage
(the one you want to back up), nor prevent backing up those duplicates,
thus producing duplicates on the target storage.


(1) Wikipedia article on deduplication
https://en.wikipedia.org/wiki/Data_deduplication
(2) Wikipedia article on backups, with incremental, differential, and
deduplication explanations:
https://en.wikipedia.org/wiki/Backup#Deduplication
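The hardlink flavour of such incremental backups can be sketched with rsync's --link-dest: unchanged files in the new generation become hardlinks into the previous one, so only changed objects consume new space (a sketch, assuming rsync is installed; the directories here are created on the fly as stand-ins for real data and backup trees):

```shell
# Requires rsync; exit quietly if it is not there.
command -v rsync >/dev/null || { echo "rsync not installed"; exit 0; }

src=$(mktemp -d); dest=$(mktemp -d)   # stand-ins for real data/backup dirs
echo "some data" > "$src/file.txt"

# First generation: a full copy.
rsync -a "$src/" "$dest/gen1/"

# Second generation: files unchanged since gen1 are stored as hardlinks.
rsync -a --link-dest="$dest/gen1" "$src/" "$dest/gen2/"

# Same inode number twice -- the file occupies space only once:
stat -c '%i' "$dest/gen1/file.txt" "$dest/gen2/file.txt"
```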

Thomas Schmitt

Nov 8, 2022, 5:20:05 AM
Hi,

hw wrote:
> I still wonder how VDO actually works.

There is a comparer/decider named UDS which holds an index of the valid
storage blocks, and a device driver named VDO which performs the
deduplication and hides its internals from the user by providing a
block device on top of the real storage device file.
https://www.marksei.com/vdo-linux-deduplication/


> if I have a volume with 5TB of data on it and I write a 500kB file to that
> volume a week later or whenever, and the file I'm writing is identical to
> another file somewhere within the 5TB of data alreading on the volume, how
> does VDO figure out that both files are identical?

I understand that it chops your file into 4 KiB blocks
https://github.com/dm-vdo/kvdo/issues/18
and lets UDS look up the checksum of each such block in the index. If a
match is found, then the new block is not stored as itself but only as a
reference to the found block.


This might yield deduplication more often than if the file were considered
as a whole. But I still have doubts that it would yield much advantage with
my own data.
The main obstacle for partial matches is probably the demand for 4 KiB
alignment. Neither text-oriented files nor compressed files will necessarily
hold their identical file parts at that alignment. Any shift of not exactly
4 KiB would make the similarity invisible to UDS/VDO.
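That 4 KiB chunking can be mimicked with plain coreutils to estimate how much block-level dedup could save on a given file; a rough sketch (VDO's real index and hashing are of course more involved, and the demo file here is generated on the fly):

```shell
tmp=$(mktemp -d)

# Build a demo file of eight 4 KiB blocks: six identical, two random.
dd if=/dev/zero    of="$tmp/f" bs=4096 count=6 2>/dev/null
dd if=/dev/urandom bs=4096 count=2 2>/dev/null >> "$tmp/f"

# Cut the file into 4096-byte chunks, checksum each, count unique chunks;
# duplicate chunks would be stored only once by a block-level deduplicator.
split -b 4096 -a 6 "$tmp/f" "$tmp/chunk."
total=$(ls "$tmp"/chunk.* | wc -l)
unique=$(md5sum "$tmp"/chunk.* | awk '{print $1}' | sort -u | wc -l)
echo "$total blocks, $unique unique: dedup would save $((total - unique)) blocks"

rm -rf "$tmp"
```

Shifting the same content by anything other than a multiple of 4096 bytes changes every chunk checksum, which is exactly the alignment problem described above.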


didier gaumet wrote:
> The goal being primarily to optimize storage space
> for a provider of networked virtual machines to entities or customers

Deduplicating over several nearly identical filesystem images might indeed
bring good size reduction.


hw wrote:
> When I want to have 2 (or more) generations of backups, do I actually want
> deduplication?

Deduplication reduces uncontrolled redundancy, while backups create
controlled redundancy. So the two are not exactly contrary in their goals,
but they surely need to be coordinated.

In the case of VDO I expect that you need to use different deduplicating
devices to get controlled redundancy.
I do something similar with incremental backups at file granularity. My
backup Blu-rays hold 200+ sessions which mostly re-use the file data storage
of previous sessions. If a bad spot damages file content, then it is
damaged in all sessions which refer to it.
To reduce the probability of such a loss, I run several backups per day,
each on a separate BD disc.

From time to time I make verification runs on the backup discs in order
to check for any damage. It is extremely rare to find a bad spot after the
written session was verified directly after being written.
(The verification is based on MD5 checksums, which I deem sufficient,
because my use case avoids the birthday paradox of probability theory.
UDS/VDO looks like a giant birthday party, so I assume that it uses larger
checksums or verifies content identity when checksums match.)


Have a nice day :)

Thomas

Dan Ritter

Nov 8, 2022, 7:40:05 AM
hw wrote:
> > As you say, deduplication in backup systems is quite common, and works
> > pretty well. There's also an on-disk non-filesystem utility, rdfind,
> > which is packaged in Debian. It can discover identical files and make
> > them hardlinks.
>
> Well, if I had all the disk space to hold 2 full copies of the data to be able
> to deduplicate it only later, I wouldn't need to deduplicate anything.

Only two copies? That's not a good use case for any of the
deduplicators.

The point of rdfind is to use it in a cron job while some process is
generating duplicate files. For example, a backup process that copies a
filesystem every six hours will generate four identical copies of almost
every file each day. (rsnapshot would do a better job, here.)


> And how would pretending there are two backups while there's actually only one
> because it got deduplicated be better than having only one backup to begin with?
> (Yeah I haven't thought of that before ...)

It's not two backups, it's two very similar backups taken at
different times, so the majority of the files are the same but
some are different. If you want a second backup, it needs to go
on a different machine, preferably in a different location.

Maybe you should tell us what your actual use case is rather
than asking about realtime deduplication? It could be that
there's a completely different solution which would make you
happy.

-dsr-

hede

Nov 8, 2022, 9:10:06 AM
to
On 08.11.2022 05:31, hw wrote:
> That still requires you to have enough disk space for at least two full
> backups.

Correct: if you always do full backups, then the second run will consume
a full backup's worth of space in the first place. (Not fully correct with bees
running -> *)

That would be the first thing I'd address. Even the simplest backup
solutions (e.g. based on rsync) make use of destination rotation and
only transmit changes to the backup (-> incremental or differential
backups). I never considered successive full backups a backup
"solution".

For me only the first backup is a full backup, every other backup is
incremental.

Regarding deduplication, I see benefits when the user moves files
from one directory to some other directory, with partly changed files
(my backup solution dedupes on a file basis via hardlinks only), and
with system backups of several different machines.

I prefer file-based backups, so my backup solution's deduplication skills
are really limited. But a good block-based backup solution can handle
all these cases by itself. Then no filesystem-based deduplication is
needed.

If your problem is only backup related and you are flexible regarding
your backup solution, then choosing a backup solution with a good
deduplication feature is probably your best choice. The solution
doesn't have to be complex. Even simple backup solutions like borg
are fine here (borg: chunk-based deduplication, even of parts of files,
across several backups of several different machines). Even your
criterion of not writing duplicate data in the first place is fulfilled
here.

(see borgbackup in Debian repository; disclaimer: I do not have personal
experience with borg as I'm using other solutions)
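The chunk-based scheme described for borg can be illustrated with a toy content-addressed store (fixed-size chunks and an in-memory dict are assumptions for brevity; borg itself uses content-defined chunk boundaries plus compression and encryption):

```python
import hashlib

class ChunkStore:
    """Toy content-addressed store in the spirit of borg: data is split
    into fixed-size chunks, each unique chunk is stored once under its
    hash, and a backup is just an ordered list of chunk hashes."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}          # hash -> chunk bytes (stored once)
        self.bytes_deduped = 0    # duplicate bytes never written again

    def backup(self, data):
        manifest = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest in self.chunks:
                self.bytes_deduped += len(chunk)  # seen before: store nothing
            else:
                self.chunks[digest] = chunk
            manifest.append(digest)
        return manifest

    def restore(self, manifest):
        return b"".join(self.chunks[d] for d in manifest)
```

A second backup of unchanged data costs only its manifest: every chunk is already in the store, which is exactly the "do not write duplicate data in the first place" property.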

> I wouldn't mind running it from time to time, though I don't know that
> I
> would have a lot of duplicate data other than backups. How much space
> might I
> expect to gain from using bees, and how much memory does it require to
> run?

Bees should run as a service 24/7 and catch all written data right
after it gets written. That's comparable to in-band deduplication even
if it's out-of-band by definition. (*) This way, writing many duplicate
files will potentially result in removing duplicates even before all
the data has been written to disk.

Therefore memory consumption is also like that of in-band deduplication
(ZFS...), which means you should reserve more than 1 GB of RAM per 1 TB
of data. But it's flexible: even less memory is usable, though then it
cannot find all duplicates, as the hash table of all the data doesn't fit
into memory. (Nevertheless, even then deduplication is more efficient than
expected: if it finds some duplicate block, it looks at the blocks
around it, so for big files a single match in the hash table is
sufficient to deduplicate the whole file.)
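The per-terabyte rule of thumb is just arithmetic on the size of the block index; the bytes-per-entry constants below are rough, commonly quoted figures, not exact values for any implementation:

```python
def dedup_index_ram(data_bytes, block_size, entry_bytes):
    """RAM needed to hold one index entry per block of unique data."""
    return (data_bytes // block_size) * entry_bytes

TIB = 1 << 40

# 1 TiB of unique 4 KiB blocks at ~320 B/entry (a figure often quoted
# for ZFS DDT entries): about 80 GiB of index.
zfs_like = dedup_index_ram(TIB, 4096, 320)

# The same data at 128 KiB records needs 32x less, about 2.5 GiB.
zfs_128k = dedup_index_ram(TIB, 128 * 1024, 320)

# A compact 16 B/entry table over 4 KiB blocks: about 4 GiB per TiB.
compact = dedup_index_ram(TIB, 4096, 16)
```

This is why small dedup blocks are so memory-hungry: halving the block size doubles the number of index entries for the same amount of data.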

regards
hede

Curt

Nov 8, 2022, 9:40:06 AM
to
On 2022-11-08, DdB <debia...@potentially-spam.de-bruyn.de> wrote:
>>
> Your wording likely confuses 2 different concepts:
>
> Deduplication avoids storing identical data more than once.
> whereas
> Redundancy stores information on more than one place on purpose to avoid
> loss of data in case of havoc.

So they're antithetical concepts? Redundancy sounds a lot like a
backup.

There always seems to be havoc, BTW, sooner or later.

Nicolas George

Nov 8, 2022, 9:40:06 AM
to
Curt (12022-11-08):
> Redundancy sounds a lot like a back up.

RAID also sounds a lot like a backup, and the R means redundant.

Yet RAID is not a backup.

--
Nicolas George

The Wanderer

Nov 8, 2022, 10:11:33 AM
to
On 2022-11-08 at 09:36, Nicolas George wrote:

> Curt (12022-11-08):
>
>> Redundancy sounds a lot like a back up.
>
> RAID also sounds a lot like a backup, and the R means redundant.
>
> Yet raid is not a backup.

That depends on which sense of the word "backup" you are using.

No, it's not a "backup" in the technical "back it up to tape" sense of
the word. There are many types of data-loss scenarios in which it will
not protect you at all.

But it does mean that if one drive fails, you can still fall back to the
copy on the other drive, and thus that copy is serving as a backup to
the copy on the first drive. There are some data-loss scenarios in which
RAID will protect you.

That more general sense of "backup" as in "something that you can fall
back on" is no less legitimate than the technical sense given above, and
it always rubs me the wrong way to see the unconditional "RAID is not a
backup" trotted out blindly as if that technical sense were the only one
that could possibly be considered applicable, and without any
acknowledgment of the limited sense of "backup" which is being used in
that statement.

--
The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man. -- George Bernard Shaw


Dan Ritter

Nov 8, 2022, 10:40:05 AM
to
Curt wrote:
> On 2022-11-08, DdB <debia...@potentially-spam.de-bruyn.de> wrote:
> >>
> > Your wording likely confuses 2 different concepts:
> >
> > Deduplication avoids storing identical data more than once.
> > whereas
> > Redundancy stores information on more than one place on purpose to avoid
> > loss of data in case of havoc.
>
> So they're antithetical concepts? Redundancy sounds a lot like a back
> up.


Think of it this way:

You have some data that you want to protect against the machine
dying.

So you copy it to another machine. Now you have a backup.

You need to do this repeatedly, or else your backup is stale:
lacking information that was recently changed.

If you do it repeatedly to the same target, that's a lot of
information. Maybe you can only send the changes? rsync, ZFS
send, and some other methods make that pretty easy.

But what if you accidentally deleted a file a week ago, and the
backups are done every night? You're out of luck... unless you
have somehow got a record of all the changes that you saved, or
you have a second backup that happened before the deletion.

Snapshots (rsnapshot, ZFS snapshots, others...) make it easy to
go back in time to any snapshot and retrieve the state of the
data then, while not storing full copies of all the data all the
time.

Now, let's suppose that you want your live data -- the source --
to withstand a disk dying. If all the data is on one disk,
that's not going to happen. You can stripe the data on N disks,
but since there's only one copy of any given chunk of data, that
doesn't help with resiliency to a disk failure.

Instead, you can make multiple complete copies every time you do
a write: disk mirroring, or RAID 1. This is very fast, but eats
twice the disk space.

If you can accept slower performance, you can write the data in
chunks to N disks, and write parity calculated from that data
to M disks, such that any 1 disk of the N+M can fail and you can
still reconstruct the whole data. That's RAID 5.

A slightly more complicated calculation withstands the failure of any 2
disks of the N+M: RAID 6. ZFS even has a three-disk resiliency mode (raidz3).
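The parity idea above boils down to XOR: the parity block is the XOR of the data blocks, so any one missing block is the XOR of all the survivors. A miniature sketch (real arrays rotate parity across disks, and RAID 6 adds a second, differently computed syndrome):

```python
def xor_blocks(*blocks):
    """Bytewise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

# RAID 5 in miniature: data striped over three "disks", parity on a fourth.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# Disk 1 dies; its contents are the XOR of everything that survived.
rebuilt_d1 = xor_blocks(d0, d2, parity)
```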

Depending on your risk tolerance and performance needs, you
might use RAID 10 (striping and mirroring) on your main machine,
and backup to a more efficient but slower RAID 6 on your backup
target.

What we've left out is compression and deduplication.

On modern CPUs, compression is really fast. So fast that it
usually makes sense for the filesystem to try compressing all
the data it is about to write, and store the compressed data
with a flag that says it will need to be uncompressed when read.
This not only increases your available storage capacity, it can
make some reads and writes faster, because less has to be
transferred to/from the relatively slow disk. There is more of an
impact on rotating disks than on SSDs.
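That try-compress-and-flag behaviour can be sketched like this (zlib stands in here for the lz4/zstd codecs real filesystems use):

```python
import zlib

# Filesystems like ZFS try compressing each record and keep the
# compressed form only if it actually got smaller.
record = b"GET /index.html HTTP/1.1\r\n" * 150   # text-like, repetitive

compressed = zlib.compress(record)
if len(compressed) < len(record):
    stored, was_compressed = compressed, True    # store + "compressed" flag
else:
    stored, was_compressed = record, False       # incompressible: as-is

ratio = len(record) / len(stored)
# On read, the flag tells the filesystem whether to decompress.
restored = zlib.decompress(stored) if was_compressed else stored
```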

Deduplication tries to match data that has already been written
and store a pointer to the existing data instead. This is an
easy problem as long as you have two things: a fast way to match
the data perfectly, and a very fast way to look up everything
that has previously been written.

It turns out that both of those subproblems scale badly. The
main use case is for storing multiple virtual machine instances,
or something similar, where you can expect every one of them to
have a large percentage of identical files stemming from the
operating system installation.

-dsr-

Stefan Monnier

Nov 8, 2022, 5:50:05 PM
to
> I had to look up the word deduplicate (I was going to say, "That isn't even a
> word!"), which reveals my extensive knowledge of the matter.

It was originally called to "duplicate duplicate", but then
self-application came in and the rest is history.


Stefan

David Christensen

Nov 8, 2022, 8:40:05 PM
to
On 11/7/22 23:13, hw wrote:
> On Mon, 2022-11-07 at 21:46 -0800, David Christensen wrote:

> Are you deduplicating?


Yes.


> Apparently some people say bad things happen when ZFS
> runs out of memory from deduplication.


Okay.


16 GiB seems to be enough for my SOHO server.


>> I put rsync based backups on ZFS storage with compression and
>> de-duplication.  du(1) reports 33 GiB for the current backups (e.g.
>> uncompressed and/or duplicated size).  zfs-auto-snapshot takes snapshots
>> of the backup filesystems daily and monthly, and I take snapshots
>> manually every week.  I have 78 snapshots going back ~6 months.  du(1)
>> reports ~3.5 TiB for the snapshots.  'zfs list' reports 86.2 GiB of
>> actual disk usage for all 79 backups.  So, ZFS de-duplication and
>> compression leverage my backup storage by 41:1.
>
> I'm unclear as to how snapshots come in when it comes to making backups.


I run my backup script each night. It uses rsync to copy files and
directories from various LAN machines into ZFS filesystems named after
each host -- e.g. pool/backup/hostname (ZFS namespace) and
/var/local/backup/hostname (Unix filesystem namespace). I have a
cron(8) job that runs zfs-auto-snapshot once each day and once each month
to take a recursive snapshot of the pool/backup filesystems. Their
contents are then available via Unix namespace at
/var/local/backup/hostname/.zfs/snapshot/snapshotname. If I want to
restore a file from, say, two months ago, I use Unix filesystem tools to
get it.



> What
> if you have a bunch of snapshots and want to get a file from 6 generations of
> backups ago?


Use Unix filesystem tools to copy it out of the snapshot tree. For
example, a file from two months ago:

cp
/var/local/backup/hostname/.zfs/snapshot/zfs-auto-snap_m-2022-09-01-03h21/path/to/file
~/restored-file


> I never figured out how to get something out of an old snapshot
> and found it all confusing, so I don't even use them.


Snapshots are a killer feature. You want to figure them out. I found
the Lucas books to be very helpful:

https://mwl.io/nonfiction/os#fmzfs

https://mwl.io/nonfiction/os#fmaz


> 33GB in backups is far from a terabyte. I have a lot more than that.


I have 3.5 TiB of backups.


>> For compressed and/or encrypted archives, image, etc., I do not use
>> compression or de-duplication
>
> Yeah, they wouldn't compress. Why no deduplication?


Because I very much doubt that there will be duplicate blocks in such files.
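That doubt is easy to confirm with a small experiment: data with heavy internal duplication dedupes well in raw form, but the compressed stream of the same data contains essentially no repeated blocks. Here a 64 KiB pseudo-random buffer is repeated 8 times, too far apart for deflate's 32 KiB window to collapse:

```python
import hashlib
import random
import zlib

def unique_4k_blocks(data):
    """(unique, total) count of 4 KiB blocks, compared by SHA-256."""
    hashes = [hashlib.sha256(data[i:i + 4096]).digest()
              for i in range(0, len(data), 4096)]
    return len(set(hashes)), len(hashes)

base = random.Random(42).randbytes(64 * 1024)   # incompressible 64 KiB
plain = base * 8                                # 512 KiB, 8x duplicated

# Raw data: 128 blocks, only 16 of them unique -- great for dedup.
uniq_raw, total_raw = unique_4k_blocks(plain)

# The compressed stream: the repeats are gone or misaligned, so
# essentially every 4 KiB block is unique -- nothing to dedup.
uniq_zip, total_zip = unique_4k_blocks(zlib.compress(plain))
```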


>> The key is to only use de-duplication when there is a lot of duplication.
>
> How do you know if there's much to deduplicate before deduplicating?


Think about the files and how often they change ("churn"). If I'm
rsync'ing the root filesystem of a half dozen FreeBSD and Linux machines
to a backup directory once a day, most of the churn will be in /home,
/tmp, and /var. When I update the OS and/or packages, install software,
etc., there will be more churn that day.


If you want hard numbers, fdupes(1), jdupes(1), or other tools should
be able to tell you.



>> My ZFS pools are built with HDD's.  I recently added an SSD-based vdev
>> as a dedicated 'dedup' device, and write performance improved
>> significantly when receiving replication streams.
>
> Hm, with the ZFS I set up a couple years ago, the SSDs wore out and removing
> them without any replacement didn't decrease performance.


My LAN has Gigabit Ethernet. I have operated with a degraded ZFS pool
in my SOHO server, and did not notice a performance drop on my client.
If I had run benchmarks on the server before and after losing a
redundant device, I expect the performance drop would be obvious. But,
losing redundant device means increased risk of losing all of the data
in the pool.


> I'm not too fond of ZFS, especially not when considering performance. But for
> backups, it won't matter.


Learn more about ZFS and invest in hardware to get performance.


David

to...@tuxteam.de

Nov 9, 2022, 1:00:06 AM
to
And then there are those (human) languages which mark plural by
duplicating a noun.

Cheers
--
t

hw

Nov 9, 2022, 3:30:05 AM
to
On Tue, 2022-11-08 at 17:30 -0800, David Christensen wrote:
> On 11/7/22 23:13, hw wrote:
> > On Mon, 2022-11-07 at 21:46 -0800, David Christensen wrote:
>
> > Are you deduplicating? 
>
>
> Yes.
>
>
> > Apparently some people say bad things happen when ZFS
> > runs out of memory from deduplication.
>
>
> Okay.
>
>
> 16 GiB seems to be enough for my SOHO server.

Hmm, when you can backup like 3.5TB with that, maybe I should put FreeBSD on my
server and give ZFS a try. Worst thing that can happen is that it crashes and
I'd have made an experiment that wasn't successful. Best thing, I guess, could
be that it works and backups are way faster because the server doesn't have to
actually write so much data because it gets deduplicated and reading from the
clients is faster than writing to the server.

> > > I put rsync based backups on ZFS storage with compression and
> > > de-duplication.  du(1) reports 33 GiB for the current backups (e.g.
> > > uncompressed and/or duplicated size).  zfs-auto-snapshot takes snapshots
> > > of the backup filesystems daily and monthly, and I take snapshots
> > > manually every week.  I have 78 snapshots going back ~6 months.  du(1)
> > > reports ~3.5 TiB for the snapshots.  'zfs list' reports 86.2 GiB of
> > > actual disk usage for all 79 backups.  So, ZFS de-duplication and
> > > compression leverage my backup storage by 41:1.
> >
> > I'm unclear as to how snapshots come in when it comes to making backups.
>
>
> I run my backup script each night.  It uses rsync to copy files and

Aww, I can't really do that because my server eats like 200-300W because it has
so many disks in it. Electricity is outrageously expensive here.

> directories from various LAN machines into ZFS filesystems named after
> each host -- e.g. pool/backup/hostname (ZFS namespace) and
> /var/local/backup/hostname (Unix filesystem namespace).  I have a
> cron(8) that runs zfs-auto-snapshot once each day and once each month
> that takes a recursive snapshot of the pool/backup filesystems.  Their
> contents are then available via Unix namespace at
> /var/local/backup/hostname/.zfs/snapshot/snapshotname.  If I want to
> restore a file from, say, two months ago, I use Unix filesystem tools to
> get it.

Sounds like a nice setup. Does that mean you use snapshots to keep multiple
generations of backups and make backups by overwriting everything after you made
a snapshot?

In that case, is deduplication that important/worthwhile? You're not
duplicating it all by writing another generation of the backup, but storing only
what's different by making use of the snapshots.

> > What
> > if you have a bunch of snapshots and want to get a file from 6 generations
> > of
> > backups ago? 
>
>
> Use Unix filesystem tools to copy it out of the snapshot tree.  For
> example, a file from two months ago:
>
>      cp
> /var/local/backup/hostname/.zfs/snapshot/zfs-auto-snap_m-2022-09-01-
> 03h21/path/to/file
> ~/restored-file
>

cool

> > I never figured out how to get something out of an old snapshot
> > and found it all confusing, so I don't even use them.
>
>
> Snapshots are a killer feature.  You want to figure them out.  I found
> the Lucas books to be very helpful:
>
> https://mwl.io/nonfiction/os#fmzfs
>
> https://mwl.io/nonfiction/os#fmaz

I know, I only never got around to figure it out because I didn't have the need.
But it could also be useful for "little" things like taking a snapshot of the
root volume before updating or changing some configuration and being able to
easily undo that.

> > 33GB in backups is far from a terabyte. I have a lot more than that.
>
>
> I have 3.5 TiB of backups.
>
>
> > > For compressed and/or encrypted archives, image, etc., I do not use
> > > compression or de-duplication
> >
> > Yeah, they wouldn't compress.  Why no deduplication?
>
>
> Because I very much doubt that there will be duplicate blocks in such files.

Hm, would it hurt?

> > > The key is to only use de-duplication when there is a lot of duplication.
> >
> > How do you know if there's much to deduplicate before deduplicating?
>
>
> Think about the files and how often they change ("churn").  If I'm
> rsync'ing the root filesystem of a half dozen FreeBSD and Linux machines
> to a backup directory once a day, most of the churn will be in /home,
> /tmp, and /var.  When I update the OS and/or packages, install software,
> etc., there will be more churn that day.
>
>
> If you want hard numebers, fdupes(1), jdupes(1), or other tools should
> be able to tell you.

ok, ty

> > > My ZFS pools are built with HDD's.  I recently added an SSD-based vdev
> > > as a dedicated 'dedup' device, and write performance improved
> > > significantly when receiving replication streams.
> >
> > Hm, with the ZFS I set up a couple years ago, the SSDs wore out and removing
> > them without any replacement didn't decrease performance.
>
>
> My LAN has Gigabit Ethernet.  I have operated with a degraded ZFS pool
> in my SOHO server, and did not notice a performance drop on my client.
> If I had run benchmarks on the server before and after losing a
> redundant device, I expect the performance drop would be obvious.  But,
> losing redundant device means increased risk of losing all of the data
> in the pool.

Oh, it's not about performance when degraded, but about performance in general. IIRC when
you have a ZFS pool that uses the equivalent of RAID5, you're still limited to
the speed of a single disk. When you have a mysql database on such a ZFS
volume, it's dead slow, and removing the SSD cache when the SSDs failed didn't
make it any slower. Obviously, it was a bad idea to put the database there, and
I wouldn't do it again if I can avoid it. I also had my data on such a volume
and found that the performance with 6 disks left much to be desired.

>
> > I'm not too fond of ZFS, especially not when considering performance.  But
> > for
> > backups, it won't matter.
>
>
> Learn more about ZFS and invest in hardware to get performance.

Hardware like? In theory, using SSDs for cache with ZFS should improve
performance. In practice, it only wore out the SSDs after a while, and now it's
not any faster without the SSD cache.

to...@tuxteam.de

Nov 9, 2022, 3:50:05 AM
to
On Wed, Nov 09, 2022 at 09:39:45AM +0100, hw wrote:

[...]

> When you keep N full generations of backups it's different. Using rsync, you'll
> only write the changes anyway, switching between the generations. Most of the
> data is being stored N times.

Perhaps you don't know about rsync's --link-dest option: you can, with rsync,
keep generations without duplicating data between them.
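The --link-dest idea can be sketched in a few lines: unchanged files in the new generation become hardlinks into the previous one, so each extra generation only costs the changed files. (A deliberate simplification: real rsync compares size/mtime first, handles whole trees, and transfers deltas.)

```python
import os
import shutil

def _same_content(a, b):
    with open(a, "rb") as fa, open(b, "rb") as fb:
        return fa.read() == fb.read()

def backup_generation(source, dest, link_dest=None):
    """Copy the flat directory `source` into `dest`. Files whose
    content is unchanged relative to the previous generation in
    `link_dest` are hardlinked instead of copied -- the core idea of
    rsync --link-dest."""
    os.makedirs(dest, exist_ok=True)
    for name in sorted(os.listdir(source)):
        src = os.path.join(source, name)
        new = os.path.join(dest, name)
        old = os.path.join(link_dest, name) if link_dest else None
        if old and os.path.isfile(old) and _same_content(old, src):
            os.link(old, new)        # unchanged: costs only a dir entry
        else:
            shutil.copy2(src, new)   # new or changed: a real copy
```

Each generation looks like a complete backup, yet the disk only holds one copy of each unchanged file, shared via its link count.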

But, as others have said, deduplication at the file system level (or below,
as VDO does) is mainly interesting where you have a whole herd of VMs or
containers which are constantly being cloned and too few sysadmins. That's
where those solutions shine (i.e. dedup across "space", not "time").

Cheers
--
t

hw

Nov 9, 2022, 3:50:06 AM
to
On Tue, 2022-11-08 at 10:26 +0100, didier gaumet wrote:
> Le 08/11/2022 à 04:49, hw a écrit :
> [...]
> > When I want to have 2 (or more) generations of backups, do I actually want
> > deduplication?  It leaves me with only one actual copy of the data which
> > seems
> > to defeat the idea of having multiple generations of backups at least to
> > some
> > extent.
> [...]
>
> I would think there is also a confusion here (in my opinion, but I may
> be wrong):
>
> - deduplication is the action of preventing or correcting an object from
> having multiple occurrences. The criterion here is: are objects identical?
>
> - incremental/differential backup is the action of backing up only
> objects (or deltas of objects) that have varied between backups, thus
> forbidding duplicates (on the target storage) of objects that have not
> varied.
> But that definitely does not suppress duplicates on the source storage
> (that you want to back up) nor prevent backing up these duplicates, thus
> creating duplicates on the target storage

When you keep N full generations of backups it's different. Using rsync, you'll
only write the changes anyway, switching between the generations. Most of the
data is being stored N times.

Now the question is if it makes sense to keep N full generations of backups when
you can use snapshots and/or deduplication to save space. Since the data isn't
stored N times anymore, you save space but you have only one copy of most of the
data.

Do you actually need these N copies? With backups on tapes that you can rotate
between, or with backups on multiple machines, it's easily an advantage to have
N copies. But when you have it all on the same machine, is there an advantage
to having N copies?

One reason for having N copies would be to be able to go back in time. But you
can do that with snapshots and that reason goes away.

Another reason is that the single copy may get damaged. But when it's all on
the same machine anyway, does it matter?

hw

Nov 9, 2022, 4:30:05 AM
to
On Tue, 2022-11-08 at 10:04 +0100, didier gaumet wrote:
> Le 08/11/2022 à 05:13, hw a écrit :
> > On Mon, 2022-11-07 at 13:57 -0500, rhkr...@gmail.com wrote:
> > >
> > >
> > > I didn't (and don't) know much about deduplication (beyond what you might
> > > deduce from the name), so I google and found this article which was
> > > helpful to
> > > me:
> > >
> > >     *
> > > [[https://www.linkedin.com/pulse/lets-know-vdo-virtual-data-optimizer-
> > > ganesh-gaikwad][Lets know about VDO (virtual data optimizer)]]
> >
> > That's a good pointer, but I still wonder how VDO actually works.


> [...]

> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/vdo-integration
> and blog, that exposes performance trade-off:
> https://www.redhat.com/en/blog/look-vdo-new-linux-compression-layer
>
> from what I understand, VDO was designed as a layer in kernel space to
> provide deduplication and compression features to local or distributed
> filesystems that lack it. The goal being primarily to optimize storage
> space for a provider of networked virtual machines to entities or customers
>

Yes, I've seen those. I can only wonder how much performance impact VDO would
have for backups. And I wonder why it doesn't require as much memory as ZFS
seems to need for deduplication.

didier gaumet

Nov 9, 2022, 5:10:05 AM
to
Le 09/11/2022 à 10:27, hw a écrit :
[...]
> Yes, I've seen those. I can only wonder how much performance impact VDO would
> have for backups. And I wonder why it doesn't require as much memory as ZFS
> seems to need for deduplication.

It's *only* a hypothesis, but I would suppose that ZFS was designed
(originally by Sun, a hardware vendor) primarily with performance in
mind, at the expense of heavy hardware requirements, while Red Hat
(primarily a software editor before its acquisition by IBM) designed VDO
more with TCO and integration into customers' existing infrastructure in
mind, at the expense of raw performance.

hw

Nov 9, 2022, 5:20:06 AM
to
On Tue, 2022-11-08 at 11:11 +0100, Thomas Schmitt wrote:
> Hi,
>
> hw wrote:
> > I still wonder how VDO actually works.
>
> There is a comparer/decider named UDS which holds an index of the valid
> storage blocks, and a device driver named VDO which performs the
> deduplication and hides its internals from the user by providing a
> block device on top of the real storage device file.
>   https://www.marksei.com/vdo-linux-deduplication/
>

And how come that it doesn't require as much memory as ZFS seems to need for
deduplication? Apparently, ZFS uses either 128kB or variable block sizes[1] and
could use much less memory than VDO would have to because VDO uses much smaller
blocks.


[1]: https://en.wikipedia.org/wiki/ZFS#Variable_block_sizes

> > if I have a volume with 5TB of data on it and I write a 500kB file to that
> > volume a week later or whenever, and the file I'm writing is identical to
> > another file somewhere within the 5TB of data alreading on the volume, how
> > does VDO figure out that both files are identical?
>
> I understand that it chops your file into 4 KiB blocks
>   https://github.com/dm-vdo/kvdo/issues/18
> and lets UDS look up the checksum of each such block in the index. If a
> match is found, then the new block is not stored as itself but only as
> reference to the found block.

So the VDO ppl say 4kB is a good block size and larger blocks would suck for
performance. Does ZFS suck for performance because it uses larger block sizes,
and why doesn't ZFS use the smaller block sizes when those are the most
advantageous ones?

> This might yield more often deduplication than if the file was looked as a
> whole. But i still have doubts that this would yield much advantage with
> my own data.
> Main obstacle for partial matches is probably the demand for 4 KiB alignment.
> Neither text oriented files nor compressed files will necessarily hold their
> identical file parts with that alignment. Any shift of not exactly 4 KiB
> would make the similarity invisible to UDS/VDO.

Deduplication doesn't work when files aren't sufficiently identical, no matter
what block size is used for comparing. It seems to make sense that the larger
the blocks are, the lower chances are that two blocks are identical.

So how come that deduplication with ZFS works at all? The large block sizes
would prevent that. Maybe it doesn't work well enough to be worth it?

Is ZFS compressing blocks or files when compression is enabled? Using variable
block sizes when compression is enabled might indicate that it compresses
blocks.

> didier gaumet wrote:
> > The goal being primarily to optimize storage space
> > for a provider of networked virtual machines to entities or customers
>
> Deduplicating over several nearly identical filesystem images might indeed
> bring good size reduction.

Well, it's independent of the file system. For VM images on whatever file
system, or for N copies of the same backup differing only by the time the backup
was made, I don't see why both shouldn't work well.

> hw wrote:
> > When I want to have 2 (or more) generations of backups, do I actually want
> > deduplication?
>
> Deduplication reduces uncontrolled redundancy, backups shall create
> controlled redundancy. So both are not exactly contrary in their goal
> but surely need to be coordinated.

That's a really nice way to put it. Do I want/need controlled redundancy with
backups on the same machine, or is it better to use snapshots and/or
deduplication to reduce the controlled redundancy?

> In case of VDO i expect that you need to use different deduplicating
> devices to get controlled redundancy.

How would the devices matter? It's the volume residing on devices that gets
deduplicated, not the devices.

> I do similar with incremental backups with file granularity. My backup
> Blu-rays hold 200+ sessions which mostly re-use the file data storage
> of previous sessions. If a bad spot damages file content, then it is
> damaged in all sessions which refer to it.
> To reduce the probability of such a loss, i run several backups per day,
> each on a separate BD disc.
>
> From time to time I make verification runs on the backup discs in order
> to check for any damage. It is extremely rare to find a bad spot after the
> written session was verified directly after being written.
> (The verification is based on MD5 checksums, which I deem sufficient,
> because my use case avoids the birthday paradox of probability theory.
> UDS/VDO looks like a giant birthday party. So I assume that it uses larger
> checksums or verifies content identity when checksums match.)

How can you make backups on Blu-rays? They hold only 50GB or so and I'd need
thousands of them. Do you have an automatic changer that juggles 1000
DVDs or so? :)

hw

Nov 9, 2022, 5:20:06 AM
to
On Wed, 2022-11-09 at 09:46 +0100, to...@tuxteam.de wrote:
> On Wed, Nov 09, 2022 at 09:39:45AM +0100, hw wrote:
>
> [...]
>
> > When you keep N full generations of backups it's different.  Using rsync,
> > you'll
> > only write the changes anyway, switching between the generations.  Most of
> > the
> > data is being stored N times.
>
> Pehaps you don't know about rsync's --link-dest option: you can, with rsync,
> keep generations without duplicating between them.

No, I didn't know that. My intention has always been to create N copies. I'm
trying to figure out if I should change that.

> But, as others have said, deduplication at the file system level (or below,
> as VDO does) is mainly interesting where you have a whole herd of VMs or
> containers which are constantly being cloned and too few sysadmins. That's
> where those solutions shine (i.e. dedup across "space", not "time").

Well, I'm undecided about that when it comes to backups ... Why would I clone
VMs or containers all the time?

didier gaumet

Nov 9, 2022, 5:40:05 AM
to

I am no expert (in Linux, backups or anything else) and cannot give
viable advice about what your backup plan should be. You are in a better
position to evaluate your needs and your means, and to design a satisfying
backup plan accordingly.

What I was underlining is that, in my opinion, you are confusing
deduplication during backup with incremental/differential backups.
(Perhaps in your context that has no consequences and is thus unimportant.)
To *me*, what you are talking about is incremental/differential backups,
not deduplicating backups.

For example, I am myself using Deja-Dup (based upon Duplicity, itself
based upon librsync) for my basic home laptop backups: it does incremental
backups, but I would not call it deduplicating backups.

The "Deduplication" paragraph of the Wikipedia backup article has an
example of 100 identical workstations whose backup storage need is
divided by 100 by deduplication.

Wikipedia "Deduplication" paragraph of the backup article:
https://en.wikipedia.org/wiki/Backup#Deduplication
Wikipedia "Incremental" paragraph of the backup article:
https://en.wikipedia.org/wiki/Backup#Incremental
Wikipedia "Differential" paragraph of the backup article:
https://en.wikipedia.org/wiki/Backup#Differential

Thomas Schmitt

Nov 9, 2022, 6:10:05 AM
to
Hi,

I wrote:
> >   https://github.com/dm-vdo/kvdo/issues/18

hw wrote:
> So the VDO ppl say 4kB is a good block size

They actually say that it's the only size which they support.


> Deduplication doesn't work when files aren't sufficiently identical,

The definition of "sufficiently identical" probably differs considerably
between VDO and ZFS.
ZFS has more knowledge about the files than VDO has, so it might be
worthwhile for it to hold more info in memory.


> It seems to make sense that the larger
> the blocks are, the lower chances are that two blocks are identical.

Especially if the filesystem's block size is smaller than the VDO
block size, or if the filesystem does not align file content intervals
to block size, like ReiserFS does.


> So how come that deduplication with ZFS works at all?

Inner magic and knowledge about how blocks of data form a file object.
A filesystem does not have to hope that identical file content is
aligned to a fixed block size.


didier gaumet wrote:
> > > The goal being primarily to optimize storage space
> > > for a provider of networked virtual machines to entities or customers

I wrote:
> > Deduplicating over several nearly identical filesystem images might indeed
> > bring good size reduction.

hw wrote:
> Well, it's independent of the file system.

Not entirely. As stated above, i would expect VDO to not work well for
ReiserFS, with its habit of squeezing data into unused parts of storage blocks.
(This made it great for storing many small files, but also led to some
performance loss through more fragmentation.)


> Do I want/need controlled redundancy with
> backups on the same machine, or is it better to use snapshots and/or
> deduplication to reduce the controlled redundancy?

First and foremost, i would want several independent backups.

The highest risk for backup is when a backup storage gets overwritten or
updated. So i want several backups still untouched and valid when the
storage hardware or the backup software begins to spoil things.

Deduplication increases the risk that a partial failure of the backup
storage damages more than one backup. On the other hand it decreases the
work load on the storage and the time window in which the backed-up data
can become inconsistent on the application level.
A snapshot before backup reduces that window size to 0. But this still
does not prevent application-level inconsistencies if the application is
caught in the act of reworking its files.

So i would use at least four independent storage facilities interchangeably.
I would make snapshots, if the filesystem supports them, and backup those
instead of the changeable filesystem.
I would try to reduce the activity of applications on the filesystem when
the snapshot is made.
I would allow each independent backup storage to do its own deduplication,
not sharing it with the other backup storages.
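A minimal sketch of the snapshot-then-backup step described above, assuming an LVM setup (volume group `vg0`, logical volume `data`; names, sizes, and paths are illustrative, not from the thread):

```shell
# Freeze a point-in-time view of the live volume; needs free extents in vg0
lvcreate --snapshot --size 5G --name data-snap /dev/vg0/data

# Back up the frozen snapshot instead of the changeable filesystem
mount -o ro /dev/vg0/data-snap /mnt/snap
rsync -a /mnt/snap/ /backup/$(date +%F)/

# Drop the snapshot once the backup is done
umount /mnt/snap
lvremove -y /dev/vg0/data-snap
```

On btrfs or ZFS the same idea would use `btrfs subvolume snapshot` or `zfs snapshot` instead of `lvcreate --snapshot`.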


> > In case of VDO i expect that you need to use different deduplicating
> > devices to get controlled redundancy.

> How would the devices matter? It's the volume residing on devices that gets
> deduplicated, not the devices.

I understand that one VDO device implements one deduplication.
So if no sharing of deduplication is desired between the backups, then i
expect that each backup storage needs its own VDO device.
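As a sketch of that layout (assuming Red Hat's `vdo` tooling as in the documentation linked earlier; device names and sizes are illustrative, untested here):

```shell
# One VDO device per backup store, so no deduplication is shared between them
vdo create --name=vdo_backup1 --device=/dev/sdb --vdoLogicalSize=10T
vdo create --name=vdo_backup2 --device=/dev/sdc --vdoLogicalSize=10T

# Give each its own filesystem; -K skips discards so mkfs does not try to
# trim the whole (thinly provisioned) logical space
mkfs.xfs -K /dev/mapper/vdo_backup1
mkfs.xfs -K /dev/mapper/vdo_backup2
```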


> How can you make backups on Blu-rays? They hold only 50GB or so and I'd
> need thousands of them.

My backup needs are much smaller than yours, obviously.
I have an active $HOME tree of about 4 GB and some large but less agile
data hoard of about 500 GB.
The former gets backed up 5 times per day on appendable 25 GB BD media
(as stated, 200+ days fit on one BD).
The latter gets an incremental update on a single-session 25 GB BD every
other day. A new base backup needs about 20 BD media. Each time the
single update BD is full, it joins the base backup in its cake box and a
new incremental level gets started.

If you have much more valuable data to backup then you will probably
decide for rotating magnetic storage. Not only for capacity but also for
the price/capacity ratio.
But you should consider having at least some of your backups on
removable media, e.g. hard disks in USB boxes. Only those can be isolated
from the risks of daily operation, which i deem crucial for safe backup.

to...@tuxteam.de
Nov 9, 2022, 6:20:05 AM

On Wed, Nov 09, 2022 at 11:15:15AM +0100, hw wrote:
> On Wed, 2022-11-09 at 09:46 +0100, to...@tuxteam.de wrote:

[...]

> > Pehaps you don't know about rsync's --link-dest option: you can, with rsync,
> > keep generations without duplicating between them.
>
> No, I didn't know that. My intention has always been to create N copies. I'm
> trying to figure out if I should change that.

It's nifty, believe me :)

> > But, as others have said, deduplication at the file system level (or below,
> > as VDO does) is mainly interesting where you have a whole herd of VMs [...]

> Well, I'm undecided about that when it comes to backups ... Why I would clone
> VMs or containers all the time?

Then you're better served with deduplication built into the backup, not
into the file system or (worse!) into the block layer. Either the simple
rsync --link-dest approach or the more "pro" tools à la BackupPC et al.
(most of them do have deduplication).

Cheers
--
t

hw
Nov 9, 2022, 6:30:06 AM

On Tue, 2022-11-08 at 07:19 -0500, Dan Ritter wrote:
> hw wrote:
> > > As you say, deduplication in backup systems is quite common, and works
> > > pretty well. There's also an on-disk non-filesystem utility, rdfind,
> > > which is packaged in Debian. It can discover identical files and make
> > > them hardlinks.
> >
> > Well, if I had all the disk space to hold 2 full copies of the data to be
> > able
> > to deduplicate it only later, I wouldn't need to deduplicate anything.
>
> Only two copies? That's not a good use case for any of the
> deduplicators.

Why not?

> The point of rdfind is to use it in a cron job while some process is
> generating duplicate files. For example, a backup process that copies a
> filesystem every six hours will generate four identical copies of almost
> every file each day. (rsnapshot would do a better job, here.)

That only works when you can make the backups fast enough and have sufficient
disk space to create so many copies.
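For illustration, the file-level hardlinking that rdfind performs can be sketched with plain coreutils (a toy version, not rdfind itself: it assumes paths without spaces and uses MD5 only to spot candidates):

```shell
#!/bin/sh
# Hash every file under the given directory, keep the first file seen for
# each hash, and replace later duplicates with hard links to it.
set -e
dedup_tree() {
    find "$1" -type f -exec md5sum {} + | sort -k1,1 |
    while read -r hash path; do
        if [ "$hash" = "$prev_hash" ]; then
            ln -f "$prev_path" "$path"    # duplicate content -> hard link
        else
            prev_hash=$hash
            prev_path=$path
        fi
    done
}
```

After `dedup_tree /some/backup/tree`, files with identical content share an inode, like rdfind's `-makehardlinks true` mode.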

> > And how would pretending there are two backups while there's actually only
> > one
> > because it got deduplicated be better than having only one backup to begin
> > with?
> > (Yeah I haven't thought of that before ...)
>
> It's not two backups, it's two very similar backups taken at
> different times, so the majority of the files are the same but
> some are different.

right

> If you want a second backup, it needs to go
> on a different machine, preferably in a different location.

That would certainly be an advantage, and I wouldn't want to deduplicate the
copies.

> Maybe you should tell us what your actual use case is rather
> than asking about realtime deduplication? It could be that
> there's a completely different solution which would make you
> happy.

The use case comes down to making backups once in a while. When making another
backup, at least the latest previous backup must not be overwritten. Sooner or
later, there won't be enough disk space to keep two full backups. With disk
prices as crazy high as they currently are, I might even move disks from the
backup server to the active server when it runs out of space before I move data
into archive (without backup) or start deleting stuff. All prices keep going
up, so I don't expect disk prices to go down.

Deduplication is only one possible way to go about it. I'm undecided whether
it's better to have only one full backup and to use snapshots instead.
Deduplicating the backups would kinda turn two copies into only one for
whatever gets deduplicated, so that might not be better than snapshots. Or I
could use both and perhaps save even more space.

hw
Nov 9, 2022, 6:50:04 AM

On Wed, 2022-11-09 at 10:35 +0100, DdB wrote:
> Am 09.11.2022 um 09:24 schrieb hw:
> > > Learn more about ZFS and invest in hardware to get performance.
> > Hardware like?  In theory, using SSDs for cache with ZFS should improve
> > performance.  In practise, it only wore out the SSDs after a while, and now
> > it's
> > not any faster without SSD cache.
> >
> >
>
Me too: I had the unpleasant experience of a worn-out SSD that had been
used as a ZFS cache and ZIL device. After that, my next comp got huge
amounts of ECC RAM, freeing up the SSD for the OS and stuff.

I don't have anything without ECC RAM, and my server was never meant for ZFS.

> Also i did
> change the geometry of the main pool to a collection of mirrors (much
> faster than raid) and left the raid only on the slower backup server.

But then you have less capacity ...

> Due to snapshots and increments, i am now backing up only once in 2
> weeks, which takes somewhat around 1 hour bcoz of a slow connection. But
> i am satisfied with zfs performance from spinning rust, if i dont fill
> up the pool too much, and defrag after a while ... ;-)

With mirroring, I could fit only one backup, not two.

In any case, I'm currently tending to think that putting FreeBSD with ZFS on my
server might be the best option. But then, apparently I won't be able to
configure the controller cards, so that won't really work. And ZFS with Linux
isn't so great because it keeps fuse in between.

hw
Nov 9, 2022, 7:00:05 AM

On Wed, 2022-11-09 at 12:13 +0100, to...@tuxteam.de wrote:
> On Wed, Nov 09, 2022 at 11:15:15AM +0100, hw wrote:
> > On Wed, 2022-11-09 at 09:46 +0100, to...@tuxteam.de wrote:
>
> [...]
> > > But, as others have said, deduplication at the file system level (or
> > > below,
> > > as VDO does) is mainly interesting where you have a whole herd of VMs
> > > [...]
>
> > Well, I'm undecided about that when it comes to backups ...  Why I would
> > clone
> > VMs or containers all the time?
>
> Then you're better served with deduplication built into the backup, not
> into the file system or (worse!) into the block layer. Either the simple
> rsync --link-dest or the more "pro" à la backuppc et al (most of them do
> have deduplication).

Both only work when I don't keep two copies. Aren't snapshots better than
using hardlinks like that?

hw
Nov 9, 2022, 7:20:05 AM

On Wed, 2022-11-09 at 11:37 +0100, didier gaumet wrote:
>
> I am no expert (in Linux, backporting or anything else) and cannot emit
> a viable advice about what your backup plan should be. You are in better
> position to evaluate your needs, your means and design a satisfying
> backup plan accordingly.
>
> What I was underlining is that in my opinion you are confusing
> deduplicating during backup and incremental/differential backups.
> (Perhaps in your context that has no consequences and is thus unimportant).
> To *me* what you are talking about is incremental/differential backups,
> not deduplicating backups.

I don't know why you think that. To clarify, I haven't been making incremental
backups. Instead, I keep two full backups that were created at different times.
Making incremental backups through snapshots or other means is very different.
Perhaps it's a better solution because it needs less disk space.

> For example, I am myself using Deja-Dup (based upon Duplicity, itself
> based upon librsync) for my basic home laptop backup: it's incremental
> backups but I would not call it deduplicating backups.
>
> The Wikipedia deduplication paragraph of their backup article has an
> example of 100 identical workstations having a backup storage need
> divided by 100 by deduplication.
>
> Wikipedia "deduplication" paragraph of their backup article:
> https://en.wikipedia.org/wiki/Backup#Deduplication
> Wikipedia "incremental" paragraph of their backup article:
> https://en.wikipedia.org/wiki/Backup#Incremental
> Wikipedia "differential" paragraph of their backup article:
> https://en.wikipedia.org/wiki/Backup#Differential

I've made it simple and haven't done any of this :) It's only two full copies
that get updated interchangeably.

This isn't to say that it would be better or worse. It's just that when you
have two copies of almost the same data, it's easy to think it would make
sense to deduplicate them. What's confusing about that?

hw
Nov 9, 2022, 7:30:05 AM

On Tue, 2022-11-08 at 09:52 +0100, DdB wrote:
> Am 08.11.2022 um 05:31 schrieb hw:
> > > That's only one point.
> > What are the others?
> >
> > >  And it's not really some valid one, I think, as
> > > you do typically not run into space problems with one single action
> > > (YMMV). Running multiple sessions and out-of-band deduplication between
> > > them works for me.
> > That still requires you to have enough disk space for at least two full
> > backups.
> > I can see it working for three backups because you can deduplicate the first
> > two, but not for two.  And why would I deduplicate when I have sufficient
> > disk
> > space.
> >
> Your wording likely confuses 2 different concepts:

Noooo, I'm not confusing that :) Everyone says so and I don't know why ...

> Deduplication avoids storing identical data more than once.
> whereas
> Redundancy stores information in more than one place on purpose to avoid
> loss of data in case of havoc.
> ZFS can do both, as it combines the features of a volume manager with
> those of a filesystem and a software RAID.( I am using zfsonlinux since
> its early days, for over 10 years now, but without dedup. )
>
> In the past, i used shifting/rotating external backup media for that
> purpose, because, as the saying goes: RAID is NOT a backup! Today, i
> have a second server only for the backups, using zfs as well, which
> allows for easy incremental backups, minimizing traffic and disk usage.
>
> but you should be clear as to what you want: redundancy or deduplication?

The question is rather if it makes sense to have two full backups on the same
machine for redundancy and to be able to go back in time, or if it's better to
give up on redundancy and to have only one copy and use snapshots or whatever to
be able to go back in time.

Of course it would better to have more than one machine, but I don't have that.

hw
Nov 9, 2022, 7:50:05 AM

On Tue, 2022-11-08 at 10:04 -0500, The Wanderer wrote:
> On 2022-11-08 at 09:36, Nicolas George wrote:
>
> > Curt (12022-11-08):
> >
> > > Redundancy sounds a lot like a back up.
> >
> > RAID also sounds a lot like a backup, and the R means redundant.
> >
> > Yet RAID is not a backup.
>
> That depends on which sense of the word "backup" you are using.
>
> No, it's not a "backup" in the technical "back it up to tape" sense of
> the word. There are many types of data-loss scenarios in which it will
> not protect you at all.
>
> But it does mean that if one drive fails, you can still fall back to the
> copy on the other drive, and thus that copy is serving as a backup to
> the copy on the first drive. There are some data-loss scenarios in which
> RAID will protect you.
>
> That more general sense of "backup" as in "something that you can fall
> back on" is no less legitimate than the technical sense given above, and
> it always rubs me the wrong way to see the unconditional "RAID is not a
> backup" trotted out blindly as if that technical sense were the only one
> that could possibly be considered applicable, and without any
> acknowledgment of the limited sense of "backup" which is being used in
> that statement.

RAID is like the incarnation of (having) something to fall back on. Backups
aren't.

Dan Ritter
Nov 9, 2022, 8:00:06 AM

hw wrote:
>
> The question is rather if it makes sense to have two full backups on the same
> machine for redundancy and to be able to go back in time, or if it's better to
> give up on redundancy and to have only one copy and use snapshots or whatever to
> be able to go back in time.


And for this, we have a clear answer: if you have two full
backups on one machine, it is very likely that a disaster will
kill them both. Therefore, you should consider it one full
backup.

Therefore, improving your functionality with some snapshot
mechanism will make your life better without increasing your
risk.

You might consider identifying some subset of the data that you
really, really care about, and using a locally encrypting remote
backup service to make a second copy elsewhere -- or, if it
doesn't change much, occasionally attaching a storage device,
copying it, and moving the storage device to some safe location
elsewhere.

-dsr-

hw

unread,
Nov 9, 2022, 8:00:06 AM11/9/22
to
On Tue, 2022-11-08 at 15:07 +0100, hede wrote:
> On 08.11.2022 05:31, hw wrote:
> > That still requires you to have enough disk space for at least two full
> > backups.
>
> Correct: if you always do full backups, then the second run will consume
> full backup space in the first place. (Not fully correct with bees
> running -> *.)

Does that work? Does bees run as long as there's something to deduplicate and
only stop when there isn't? I thought you start it when the data is in place
and not before that.

> That would be the first thing I'd address. Even the simplest backup
> solutions (i.e. based on rsync) do make use of destination rotation and
> only submitting changes to the backup (-> incremental or differential
> backups). I never considered successive full backups as a backup
> "solution".

You can easily make changes to two full copies --- "make changes" meaning that
you only change what has been changed since the last time you made the backup.

> For me only the first backup is a full backup, every other backup is
> incremental.

When you make a second full backup, that second copy is not incremental. It's a
full backup.

> Regarding deduplication, I do see benefits when the user moves files
> from one directory to some other directory, with partly changed files
> (my backup solution dedupes on a file basis via hardlinks only), and
> with system backups of several different machines.

But not with copies?

> I prefer file-based backups, so my backup solution's deduplication skills
> are really limited. But a good block-based backup solution can handle
> all these cases by itself. Then no filesystem-based deduplication is
> needed.

What difference does it make whether the deduplication is block-based or
somehow file-based (whatever that means)?

> If your problem is only backup-related and you are flexible regarding
> your backup solution, then choosing a backup solution with a
> good deduplication feature is probably your best choice. The solution
> doesn't have to be complex. Even simple backup solutions like borg backup
> are fine here (borg: chunk-based deduplication, even of parts of files,
> across several backups of several different machines). Even your
> criterion of not writing duplicate data in the first place is fulfilled
> here.
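For reference, a minimal borg workflow looks roughly like this (a sketch assuming the Debian borgbackup package; the repository path and source directories are illustrative):

```shell
# Create a deduplicating repository once
borg init --encryption=repokey /backup/repo

# Each run stores only chunks not already in the repository, so a second
# "full" backup of mostly unchanged data costs very little extra space
borg create --stats /backup/repo::'{hostname}-{now}' /home /etc

# Keep a bounded number of generations
borg prune --keep-daily 7 --keep-weekly 4 /backup/repo
```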

I'm flexible, but I distrust "backup solutions".

> (see borgbackup in Debian repository; disclaimer: I do not have personal
> experience with borg as I'm using other solutions)
>
> >  I wouldn't mind running it from time to time, though I don't know that
> > I
> > would have a lot of duplicate data other than backups.  How much space
> > might I
> > expect to gain from using bees, and how much memory does it require to
> > run?
>
> Bees should run as a service 24/7 and catch all written data right
> after it gets written. That's comparable to in-band deduplication even
> if it's out-of-band by definition. (*) This way, writing many duplicate
> files will potentially result in removing duplicates even before all the
> data has been written to disk.
>
> Therefore memory consumption is also like with in-band deduplication
> (ZFS...), which means you should reserve more than 1 GB RAM per 1 TB of
> data. But it's flexible: even less memory is usable, but then it cannot
> find all duplicates, as the hash table of all the data doesn't fit into
> memory. (Nevertheless, even then deduplication is more efficient than
> expected: if it finds some duplicate block, it looks at the blocks
> around that block, so for big files a single match in the hash table is
> sufficient to deduplicate the whole file.)

Sounds good. Before I try it, I need to make a backup in case something goes
wrong.

didier gaumet
Nov 9, 2022, 8:20:06 AM

Le 09/11/2022 à 13:12, hw a écrit :
> On Wed, 2022-11-09 at 11:37 +0100, didier gaumet wrote:
[...]
>> in my opinion you are confusing
>> deduplicating during backup and incremental/differential backups.
[...]
> I don't know why you think that.[...]

Because earlier in a previous message you stated:
"When I want to have 2 (or more) generations of backups, do I actually
want deduplication? It leaves me with only one actual copy of the data
which seems to defeat the idea of having multiple generations of backups
at least to some extent."

To me you are considering that this deduplication leaves only one backup
object, derived from multiple objects across time (the multiple variations
of a single object). And I do not agree: if you do two full (not
differential nor incremental) backups with a deduplicating backup tool,
you will obtain 2 backup objects from one source object, having two
different states of evolution.

So I think that you are using the word "deduplication" but are really
talking about incremental or differential backup features. But I am
perhaps nitpicking here and that is probably not important in your
context :-)

Nicolas George
Nov 9, 2022, 8:30:06 AM

hw (12022-11-08):
> When I want to have 2 (or more) generations of backups, do I actually want
> deduplication? It leaves me with only one actual copy of the data which seems
> to defeat the idea of having multiple generations of backups at least to some
> extent.

The idea of having multiple generations of backups is not to have the
data physically present in multiple places, this is the role of RAID.

The idea of having multiple generations of backups is that if you
accidentally overwrite half your almost-completed novel with lines of
ALL WORK AND NO PLAY MAKES JACK A DULL BOY and the backup tool runs
before you notice it, you still have the precious data in the previous
generation.

Regards,

--
Nicolas George

hw
Nov 9, 2022, 8:30:07 AM

On Wed, 2022-11-09 at 11:05 +0100, didier gaumet wrote:
> Le 09/11/2022 à 10:27, hw a écrit :
> [...]
> > Yes, I've seen those.  I can only wonder how much performance impact VDO
> > would
> > have for backups.  And I wonder why it doesn't require as much memory as ZFS
> > seems to need for deduplication.
>
> It's *only* an hypothesis, but I would suppose that ZFS was designed
> (originally by Sun, hardware vendor) primarily with performances in
> mind,

I don't think it was, see https://docs.freebsd.org/en/books/handbook/zfs/

It does mention performance, but I remember other statements saying that it
was designed for arrays with 40+ disks and, besides data integrity, with ease
of use in mind. Performance doesn't seem paramount. Also see
https://wiki.gentoo.org/wiki/ZFS

> at the expense of strong hardware needs, while RedHat (primarily
> software editor before its acquisition by IBM) designed VDO more with
> TCO and integration of already existant customer infrastructure in mind,
> at the expense of pure performances.

Well, the question is what you mean by performance. Maybe ZFS can deduplicate
faster than VDO, but eating tons of RAM and/or having to replace all the
hardware may not be a kind of performance one would be looking for.

didier gaumet
Nov 9, 2022, 8:40:05 AM

Le 09/11/2022 à 12:41, hw a écrit :
[...]
> In any case, I'm currently tending to think that putting FreeBSD with ZFS on my
> server might be the best option. But then, apparently I won't be able to
> configure the controller cards, so that won't really work. And ZFS with Linux
> isn't so great because it keeps fuse in between.

I am really not so well aware of the ZFS state, but my impression was that:
- the FUSE implementation of ZoL (ZFS on Linux) is deprecated and,
Ubuntu excepted (classic module?), ZFS is now integrated via a DKMS module
- *BSDs integrate ZFS directly because there are no licence conflicts
- *BSDs nowadays have departed from the old ZFS code and use the same source
code stack as Linux (OpenZFS)
- Linux distros don't directly integrate ZFS because they generally
consider there are licence conflicts. The notable exception is
Ubuntu, which considers that after legal review the situation is clear and
there are no licence conflicts.

didier gaumet
Nov 9, 2022, 8:50:05 AM

Le 09/11/2022 à 14:25, hw a écrit :

> I don't think it was, see https://docs.freebsd.org/en/books/handbook/zfs/
>
> I does mention performance, but I remember other statements saying that was
> designed for arrays with 40+ disks and, besides data integrity, with ease of use
> in mind. Performance doesn't seem paramount. Also see
> https://wiki.gentoo.org/wiki/ZFS

> Well, the question is what you mean by performance. Maybe ZFS can deduplicate
> faster than VDO, but eating tons of RAM and/or having to replace all the
> hardware may not be a kind of performance one would be looking for.

My bad: I'm French and my English is not as fluent as I would like it to
be ;-)

I was using the word "performance" here as I would have in French (same
word), thinking of technical abilities (speed, scalability and so on),
without realizing that in English, in the particular context of computer
science, it primarily means speed (if I understand correctly) :-)

hw
Nov 9, 2022, 11:40:05 AM

Hm, no, performance refers to how well something measures up to something,
like fulfilling some requirements or achieving some goal. If something is
fast, it can be performant. If something doesn't require much RAM, it can be
performant as well, whether it is slow or not. It all depends on what you
want from something.

hw
Nov 9, 2022, 12:20:05 PM

On Wed, 2022-11-09 at 14:29 +0100, didier gaumet wrote:
> Le 09/11/2022 à 12:41, hw a écrit :
> [...]
> > In any case, I'm currently tending to think that putting FreeBSD with ZFS on
> > my
> > server might be the best option.  But then, apparently I won't be able to
> > configure the controller cards, so that won't really work.  And ZFS with
> > Linux
> > isn't so great because it keeps fuse in between.
>
> I am really not so well aware of ZFS state but my impression was that:
> - FUSE implementation of ZoL (ZFS on Linux) is deprecated and that,
> Ubuntu excepted (classic module?), ZFS is now integrated by a DKMS module

Hm that could be. Debian doesn't seem to have it as a module.

> - *BSDs integrate directly ZFS because there are no licences conflicts
> - *BSDs nowadays have departed from old ZFS code and use the same source
> code stack as Linux (OpenZFS)
> - Linux distros don't directly integrate ZFS because they generally
> consider there are licences conflicts. The notable exception being
> Ubuntu that considers that after legal review the situation is clear and
> there is no licence conflicts.

Well, I'm not touching Ubuntu. I want to get away from Fedora because of their
hostility, and that includes CentOS since it has become a derivative of it.
FreeBSD has ZFS but can't even configure the disk controllers, so that won't
work. I don't want to go with Gentoo because updating is a nightmare to the
point where you suddenly find yourself unable to update at all because they
broke something. Arch is apparently for masochists, and I don't want
derivatives, especially not Ubuntu, and that leaves only Debian. I don't want
Debian either, because when they introduced their brokenarch, they managed to
make it so that NVIDIA drivers didn't work anymore with no fix in sight, and
broke other stuff as well; you can't let your users down like that. But
what's the alternative?

However, Debian apparently has bad ZFS support (apparently still only Gentoo
actually supports it), so I'd go with btrfs. Now that's gonna suck, because
I'd have to use mdadm to create a RAID5 (or use the hardware RAID, but that
isn't fun after I've seen the hardware RAID refusing to rebuild a volume after
a failed disk was replaced) and put btrfs on that, because btrfs doesn't even
support RAID5.

Or what else?
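The mdadm-underneath-btrfs layering described here would look roughly like this (an untested sketch; device names and mount points are illustrative):

```shell
# Software RAID5 over three disks
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd

# btrfs on top of the md device; btrfs still provides checksums, subvolumes
# and snapshots, while mdadm handles the parity RAID
mkfs.btrfs /dev/md0
mount /dev/md0 /srv/backup
btrfs subvolume create /srv/backup/current
btrfs subvolume snapshot -r /srv/backup/current /srv/backup/snap-$(date +%F)
```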

hw
Nov 9, 2022, 12:30:05 PM

On Wed, 2022-11-09 at 17:29 +0100, DdB wrote:
> Am 09.11.2022 um 12:41 schrieb hw:
> > In any case, I'm currently tending to think that putting FreeBSD with ZFS on
> > my
> > server might be the best option.  But then, apparently I won't be able to
> > configure the controller cards, so that won't really work.  And ZFS with
> > Linux
> > isn't so great because it keeps fuse in between.
>
> NO fuse, neither FreeBSD nor debian would need the outdated zfs-fuse,
> use the in.kernel modules from zfsonlinux.org (packages for debian are
> in contrib IIRC).
>

Ok, all the better --- I only looked at the package management. Ah, let me see,
I have a Debian VM and no contrib ... hm, zfs-dkms and such? That's promising,
thank you :)

Christoph Brinkhaus
Nov 9, 2022, 12:40:06 PM

Am Wed, Nov 09, 2022 at 06:11:34PM +0100 schrieb hw:

Hi hw,

> On Wed, 2022-11-09 at 14:29 +0100, didier gaumet wrote:
> > Le 09/11/2022 à 12:41, hw a écrit :
> > [...]
> > > In any case, I'm currently tending to think that putting FreeBSD with ZFS on
> > > my
> > > server might be the best option.  But then, apparently I won't be able to
> > > configure the controller cards, so that won't really work.  And ZFS with
> > > Linux
> > > isn't so great because it keeps fuse in between.
> >
> > I am really not so well aware of ZFS state but my impression was that:
> > - FUSE implementation of ZoL (ZFS on Linux) is deprecated and that,
> > Ubuntu excepted (classic module?), ZFS is now integrated by a DKMS module
>
> Hm that could be. Debian doesn't seem to have it as a module.
>
> > - *BSDs integrate directly ZFS because there are no licences conflicts
> > - *BSDs nowadays have departed from old ZFS code and use the same source
> > code stack as Linux (OpenZFS)
> > - Linux distros don't directly integrate ZFS because they generally
> > consider there are licences conflicts. The notable exception being
> > Ubuntu that considers that after legal review the situation is clear and
> > there is no licence conflicts.
>
> Well, I'm not touching Ubuntu. I want to get away from Fedora because of their
> hostility and that includes Centos since that has become a derivative of it.
> FreeBSD has ZFS but can't even configure the disk controllers, so that won't
> work.

If I understand you right, you mean RAID controllers?
To my knowledge, ZFS should be used without any RAID
controllers. Disks, or better, partitions are fine.

> I don't want to go with Gentoo because updating is a nightmare to the
> point where you suddenly find yourself unable to update at all because they
> broke something. Arch is apparently for machosists, and I don't want
> derivatives, especially not Ubuntu, and that leaves only Debian. I don't want
> Debian either because when they introduced their brokenarch, they managed to
> make it so that NVIDIA drivers didn't work anymore with no fix in sight and
> broke other stuff as well, and you can't let your users down like that. But
> what's the alternative?
>
> However, Debian has apparently bad ZFS support (apparently still only Gentoo
> actually supports it), so I'd go with btrfs.

I have no knowledge about the status of ZFS on Linux distributions,
just about FreeBSD.

> Now that's gona suck because I'd
> have to use mdadm to create a RAID5 (or use the hardware RAID but that isn't fun
> after I've seen the hardware RAID refusing to rebuild a volume after a failed
> disk was replaced) and put btrfs on that because btrfs doesn't even support
> RAID5.
>
> Or what else?
>
Kind regards,
Christoph

Linux-Fan
Nov 9, 2022, 1:30:05 PM

hw writes:

> On Wed, 2022-11-09 at 14:29 +0100, didier gaumet wrote:
> > Le 09/11/2022 à 12:41, hw a écrit :

[...]

> > I am really not so well aware of ZFS state but my impression was that:
> > - FUSE implementation of ZoL (ZFS on Linux) is deprecated and that,
> > Ubuntu excepted (classic module?), ZFS is now integrated by a DKMS module
>
> Hm that could be. Debian doesn't seem to have it as a module.

As already mentioned by others, zfs-dkms is readily available in the contrib
section along with zfsutils-linux. Here is what I noted down back when I
installed it:

https://masysma.net/37/zfs_commands_shortref.xhtml
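In short, the install boils down to something like this (a sketch; it assumes a Debian release with the contrib component already enabled in sources.list):

```shell
# Install the DKMS-built ZFS module and the userland tools from contrib
sudo apt-get update
sudo apt-get install -y linux-headers-amd64 zfs-dkms zfsutils-linux
sudo modprobe zfs
zfs version    # confirm that module and tools agree
```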

I have been using ZFS on Linux on Debian since end of 2020 without any
issues. In fact, the dkms-based approach has run much more reliably than
my previous experiences with out-of-tree modules would have suggested...

My setup works with a mirrored zpool and no deduplication, I did not need
nor test anything else yet.

> > - *BSDs integrate directly ZFS because there are no licences conflicts
> > - *BSDs nowadays have departed from old ZFS code and use the same source
> > code stack as Linux (OpenZFS)
> > - Linux distros don't directly integrate ZFS because they generally
> > consider there are licences conflicts. The notable exception being
> > Ubuntu that considers that after legal review the situation is clear and
> > there is no licence conflicts.

[...]

> broke something. Arch is apparently for masochists, and I don't want
> derivatives, especially not Ubuntu, and that leaves only Debian. I don't
> want
> Debian either because when they introduced their brokenarch, they managed to
> make it so that NVIDIA drivers didn't work anymore with no fix in sight and
> broke other stuff as well, and you can't let your users down like that. But
> what's the alternative?

Nvidia drivers have been working for me in all releases from Debian 6 to 10 both
inclusive. I did not have any need for them on Debian 11 yet, since I have
switched to an AMD card for my most recent system.

> However, Debian has apparently bad ZFS support (apparently still only Gentoo
> actually supports it), so I'd go with btrfs. Now that's gonna suck because

You can use ZFS on Debian (see link above). Of course it remains your choice
whether you want to trust your data to the older, but less-well-integrated
technology (ZFS) or to the newer, but more easily integrated technology
(BTRFS).

> I'd
> have to use mdadm to create a RAID5 (or use the hardware RAID but that isn't

AFAIK BTRFS also includes some integrated RAID support such that you do not
necessarily need to pair it with mdadm. Using it for RAID 5 or 6 is advised
against even in the most recent Linux kernels, though:

https://btrfs.readthedocs.io/en/latest/btrfs-man5.html#raid56-status-and-recommended-practices

RAID 5 and 6 have their own issues you should be aware of even when running
them with the time-proven and reliable mdadm stack. You can find a lot of
interesting results by searching for “RAID5 considered harmful” online. This
one is the classic that does not seem to make it to the top results, though:

https://www.baarf.dk/BAARF/RAID5_versus_RAID10.txt

If you want to go with mdadm (irrespective of RAID level), you might also
consider running ext4 and trade the complexity and features of the advanced
file systems for a good combination of stability and support.
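As a rough sketch of that combination (device names are placeholders, not a recommendation for your hardware):

```shell
# Sketch: RAID5 over four disks with mdadm, plain ext4 on top
sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde
cat /proc/mdstat                  # watch the initial sync progress
sudo mkfs.ext4 -L backup /dev/md0
sudo mount /dev/md0 /srv/backup
```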

> fun
> after I've seen the hardware RAID refusing to rebuild a volume after a failed
> disk was replaced) and put btrfs on that because btrfs doesn't even support
> RAID5.

YMMV
Linux-Fan


hw

Nov 9, 2022, 10:50:05 PM
On Wed, 2022-11-09 at 18:26 +0100, Christoph Brinkhaus wrote:
> Am Wed, Nov 09, 2022 at 06:11:34PM +0100 schrieb hw:
> [...]
> > FreeBSD has ZFS but can't even configure the disk controllers, so that won't
> > work. 
>
> If I understand you right you mean RAID controllers?

yes

> According to my knowledge ZFS should be used without any RAID
> controllers. Disks or better partions are fine.

I know, but it's what I have. JBOD controllers are difficult to find. And it
doesn't really matter because I can configure each disk as a single disk ---
still RAID though. It may even be an advantage because the controllers have 1GB
cache each and the computer's CPU doesn't need to do command queuing.

And I've been reading that when using ZFS, you shouldn't make volumes with more
than 8 disks. That's very inconvenient.

Why would partitions be better than the block device itself? They're like an
additional layer and what could be faster and easier than directly using the
block devices?

hw

Nov 10, 2022, 12:00:05 AM
On Wed, 2022-11-09 at 19:17 +0100, Linux-Fan wrote:
> hw writes:
>
> > On Wed, 2022-11-09 at 14:29 +0100, didier gaumet wrote:
> > > Le 09/11/2022 à 12:41, hw a écrit :
>
> [...]
>
> > > I am really not so well aware of ZFS state but my impression was that:
> > > - FUSE implementation of ZoL (ZFS on Linux) is deprecated and that,
> > > Ubuntu excepted (classic module?), ZFS is now integrated by a DKMS module
> >
> > Hm that could be.  Debian doesn't seem to have it as a module.
>
> As already mentioned by others, zfs-dkms is readily available in the contrib 
> section along with zfsutils-linux. Here is what I noted down back when I 
> installed it:
>
> https://masysma.net/37/zfs_commands_shortref.xhtml

Thanks, that's good information.

> I have been using ZFS on Linux on Debian since end of 2020 without any 
> issues. In fact, the dkms-based approach has run much more reliably than 
> my previous experiences with out-of-tree modules would have suggested...

Hm, issues? I have one:


ls -la
total 5
drwxr-xr-x 3 namefoo namefoo 3 Aug 16 22:36 .
drwxr-xr-x 24 root root 4096 Nov 1 2017 ..
drwxr-xr-x 2 namefoo namefoo 2 Jan 21 2020 ?
namefoo@host /srv/datadir $ ls -la '?'
ls: cannot access '?': No such file or directory
namefoo@host /srv/datadir $


This directory named ? appeared on a ZFS volume for no reason and I can't access
it and can't delete it. A scrub doesn't repair it. It doesn't seem to do any
harm yet, but it's annoying.

Any idea how to fix that?

> Nvidia drivers have been working for me in all releases from Debian 6 to 10
> both 
> inclusive. I did not have any need for them on Debian 11 yet, since I have 
> switched to an AMD card for my most recent system.
>

Maybe it was longer ago. I recently switched to AMD, too. NVIDIA remains
uncooperative and their drivers are a hassle, so why would I support NVIDIA by
buying their products? It was a good choice and it just works out of the box.

I can't get the 2nd monitor to work, but that's probably not an AMD issue.

> > However, Debian has apparently bad ZFS support (apparently still only Gentoo
> > actually supports it), so I'd go with btrfs.  Now that's gonna suck because
>
> You can use ZFS on Debian (see link above). Of course it remains your choice 
> whether you want to trust your data to the older, but less-well-integrated 
> technology (ZFS) or to the newer, but more easily integrated technology 
> (BTRFS).
>
>

It's fine when using the kernel module. This isn't about newer, and ZFS seems
more mature than btrfs. Somehow, development of btrfs is excruciatingly slow.

If it doesn't work out, I can always do something else and make a new backup.

> > I'd
> > have to use mdadm to create a RAID5 (or use the hardware RAID but that isn't
>
> AFAIK BTRFS also includes some integrated RAID support such that you do not 
> necessarily need to pair it with mdadm.

Yes, but RAID56 is broken in btrfs.

> Using it for RAID 5 or 6 is advised
> against even in the most recent Linux kernels, though:
>
> https://btrfs.readthedocs.io/en/latest/btrfs-man5.html#raid56-status-and-recommended-practices
>

Yes, that's why I would have to use btrfs on mdadm when I want to make a RAID5.
That kinda sucks.

> RAID 5 and 6 have their own issues you should be aware of even when running 
> them with the time-proven and reliable mdadm stack. You can find a lot of 
> interesting results by searching for “RAID5 considered harmful” online. This 
> one is the classic that does not seem to make it to the top results, though:

Hm, really? The only time that RAID5 gave me trouble was when the hardware RAID
controller steadfastly refused to rebuild the array after a failed disk was
replaced. How often does that happen?

So yes, there are people saying that RAID5 is so bad, and I think it's exaggerated.
At the end of the day, for all I know lightning could strike the server and
burn out all the disks and no alternative to RAID5 could prevent that. So all
variants of RAID are bad and ZFS and btrfs and whatever are all just as bad and
any way of storing data is bad because something could happen to the data.
Gathering data is actually bad to begin with and getting worse all the time.
The less data you have, the better, because less data is less unwieldy.

There is a write hole with RAID5? Well, I have a UPS and the controllers have
backup batteries. So is there really gonna be a write hole? When I use mdadm, I
don't have a backup battery. Then what? Do JBOD controllers have backup
batteries or are you forced to use file systems that make them unnecessary?
Bits can flip and maybe whatever controls the RAID may not be able to tell which
copy is the one to use. The checksums ZFS and btrfs use may be insufficient and
then what. ZFS and btrfs may not be a good idea to use because the software,
like Centos 7, is too old and prefers xfs instead. Now what? Rebuild the
server like every year or so to use the latest and greatest? Oh no, the latest
and greatest may be unstable ...

More than one disk can fail? Sure can, and it's one of the reasons why I make
backups.

You also have to consider costs. How much do you want to spend on storage and
and on backups? And do you want make yourself crazy worrying about your data?

> https://www.baarf.dk/BAARF/RAID5_versus_RAID10.txt
>
> If you want to go with mdadm (irrespective of RAID level), you might also 
> consider running ext4 and trade the complexity and features of the advanced 
> file systems for a good combination of stability and support.
>

Is anyone still using ext4? I'm not saying it's bad or anything, it only seems
that it has gone out of fashion.

I'm considering using snapshots. Ext4 didn't have those last time I checked.

David Christensen

Nov 10, 2022, 12:40:05 AM
On 11/9/22 05:29, didier gaumet wrote:

> - *BSDs nowadays have departed from old ZFS code and use the same source
> code stack as Linux (OpenZFS)


AIUI FreeBSD 12 and prior use ZFS-on-Linux code, while FreeBSD 13 and
later use OpenZFS code.



On 11/9/22 05:44, didier gaumet wrote:

> I was using the word "performance" here as I would have in french (same
> word), thinking of technical abilities (speed, scalability and so on)
> without realizing that in english in the particular context of computer
> science that means primarily speed (if I understand correctly) :-)


I tend to use the term "performance" to mean minimum processor cycles,
minimum memory consumption, minimum latency, and/or maximum data
transfer per unit time.


David

David Christensen

Nov 10, 2022, 12:40:05 AM
On 11/9/22 03:08, Thomas Schmitt wrote:

> So i would use at least four independent storage facilities interchangeably.
> I would make snapshots, if the filesystem supports them, and backup those
> instead of the changeable filesystem.
> I would try to reduce the activity of applications on the filesystem when
> the snapshot is made.
> I would allow each independent backup storage to do its own deduplication,
> not sharing it with the other backup storages.


+1


David

David Christensen

Nov 10, 2022, 12:40:05 AM
On 11/9/22 00:24, hw wrote:
> On Tue, 2022-11-08 at 17:30 -0800, David Christensen wrote:

> Hmm, when you can backup like 3.5TB with that, maybe I should put FreeBSD on
> my server and give ZFS a try. Worst thing that can happen is that it crashes
> and I'd have made an experiment that wasn't successful. Best thing, I guess,
> could be that it works and backups are way faster because the server doesn't
> have to actually write so much data because it gets deduplicated and reading
> from the clients is faster than writing to the server.


Be careful that you do not confuse a ~33 GiB full backup set, and 78
snapshots over six months of that same full backup set, with a full
backup of 3.5 TiB of data. I would suggest a 10 TiB pool to back up the
latter.


Writing to a ZFS filesystem with deduplication is much slower than
simply writing to, say, an ext4 filesystem -- because ZFS has to hash
every incoming block and see if it matches the hash of any existing
block in the destination pool. Storing the existing block hashes in a
dedicated dedup virtual device will expedite this process.
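For reference, such a dedicated dedup vdev is declared at pool creation time. A sketch with placeholder devices, not a tested layout:

```shell
# Sketch: raidz data vdev plus a mirrored dedup vdev to hold the dedup
# table on fast storage, then a filesystem with dedup enabled
sudo zpool create tank raidz /dev/sda /dev/sdb /dev/sdc \
    dedup mirror /dev/nvme0n1 /dev/nvme1n1
sudo zfs create -o dedup=on -o compression=on tank/backup
zpool status -D tank    # -D prints dedup table statistics
```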


>> I run my backup script each night. It uses rsync to copy files and
>
> Aww, I can't really do that because my server eats like 200-300W
because it has
> so many disks in it. Electricity is outrageously expensive here.


Perhaps platinum rated power supplies? Energy efficient HDD's/ SSD's?


>> directories from various LAN machines into ZFS filesystems named after
>> each host -- e.g. pool/backup/hostname (ZFS namespace) and
>> /var/local/backup/hostname (Unix filesystem namespace). I have a
>> cron(8) that runs zfs-auto-snapshot once each day and once each month
>> that takes a recursive snapshot of the pool/backup filesystems. Their
>> contents are then available via Unix namespace at
>> /var/local/backup/hostname/.zfs/snapshot/snapshotname. If I want to
>> restore a file from, say, two months ago, I use Unix filesystem tools to
>> get it.
>
> Sounds like a nice setup. Does that mean you use snapshots to keep multiple
> generations of backups and make backups by overwriting everything after you
> made a snapshot?


Yes.


> In that case, is deduplication that important/worthwhile? You're not
> duplicating it all by writing another generation of the backup but store
> only what's different through making use of the snapshots.


Without deduplication or compression, my backup set and 78 snapshots
would require 3.5 TiB of storage. With deduplication and compression,
they require 86 GiB of storage.


> ... I only never got around to figure [ZFS snapshots] out because I didn't
> have the need.


I accidentally trash files on occasion. Being able to restore them
quickly and easily with a cp(1), scp(1), etc., is a killer feature.
Users can recover their own files without needing help from a system
administrator.
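With the layout described above, such a restore is just a cp. A sketch (the hostname and snapshot name here are invented for illustration):

```shell
# List the snapshots available for one host's backup filesystem
ls /var/local/backup/myhost/.zfs/snapshot/
# Copy a file back out of a two-month-old monthly snapshot
cp /var/local/backup/myhost/.zfs/snapshot/zfs-auto-snap_monthly-2022-09-01-0000/etc/fstab \
   /tmp/fstab.restored
```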


> But it could also be useful for "little" things like taking a snapshot of
> the root volume before updating or changing some configuration and being
> able to easily undo that.


FreeBSD with ZFS-on-root has a killer feature called "Boot Environments"
that has taken that idea to the next level:

https://klarasystems.com/articles/managing-boot-environments/
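In outline, it works like this (a sketch using FreeBSD's bectl; the environment name is arbitrary):

```shell
# Snapshot the running system into a new boot environment before a risky change
bectl create before-upgrade
# ... perform the upgrade or configuration change ...
bectl list                      # show environments and which one is active
bectl activate before-upgrade   # roll back: boot the old environment next reboot
```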


>> I have 3.5 TiB of backups.


It is useful to group files with similar characteristics (size,
workload, compressibility, duplicates, backup strategy, etc.) into
specific ZFS filesystems (or filesystem trees). You can then adjust ZFS
properties and backup strategies to match.


>>>> For compressed and/or encrypted archives, image, etc., I do not use
>>>> compression or de-duplication
>>>
>>> Yeah, they wouldn't compress. Why no deduplication?
>>
>>
>> Because I very much doubt that there will be duplicate blocks in such
>> files.
>
> Hm, would it hurt?


Yes. ZFS deduplication is resource intensive.


> Oh it's not about performance when degraded, but about performance. IIRC
> when you have a ZFS pool that uses the equivalent of RAID5, you're still
> limited to the speed of a single disk. When you have a mysql database on
> such a ZFS volume, it's dead slow, and removing the SSD cache when the SSDs
> failed didn't make it any slower. Obviously, it was a bad idea to put the
> database there, and I wouldn't do it again when I can avoid it. I also had
> my data on such a volume and I found that the performance with 6 disks left
> much to be desired.


What were the makes and models of the 6 disks? Of the SSD's? If you
have a 'zpool status' console session from then, please post it.


Constructing a ZFS pool to match the workload is not easy. STFW there
are plenty of articles. Here is a general article I found recently:

https://klarasystems.com/articles/choosing-the-right-zfs-pool-layout/


MySQL appears to have the ability to use raw disks. Tuned correctly,
this should give the best results:

https://dev.mysql.com/doc/refman/8.0/en/innodb-system-tablespace.html#innodb-raw-devices


If ZFS performance is not up to your expectations, and there are no
hardware problems, next steps include benchmarking, tuning, and/or
adding or adjusting the hardware and its usage.


>> ... invest in hardware to get performance.

> Hardware like?


Server chassis, motherboards, chipsets, processors, memory, disk host
bus adapters, disk racks, disk drives, network interface cards, etc..



> In theory, using SSDs for cache with ZFS should improve performance. In
> practice, it only wore out the SSDs after a while, and now it's not any
> faster without SSD cache.


Please run 'zpool status' and post the console session (prompt, command
entered, output displayed). Please correlate the vdev's to disk drive
makes and models.


On 11/9/22 03:41, hw wrote:

> I don't have anything without ECC RAM,


Nice.


> and my server was never meant for ZFS.


What is the make and model of your server?


> With mirroring, I could fit only one backup, not two.


Add another mirror to your pool. Or, use a process of substitution and
resilvering to replace existing drives with larger capacity drives.
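Either route is only a command or two (device names are placeholders):

```shell
# Option 1: add a second mirror vdev, growing the pool's capacity
sudo zpool add tank mirror /dev/sdd /dev/sde
# Option 2: swap in bigger disks one at a time; once every member of a
# vdev is larger, the pool can expand automatically
sudo zpool set autoexpand=on tank
sudo zpool replace tank /dev/sdb /dev/sdf   # wait for resilver, then repeat
```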


> In any case, I'm currently tending to think that putting FreeBSD with ZFS on my
> server might be the best option. But then, apparently I won't be able to
> configure the controller cards, so that won't really work.


What is the make and model of your controller cards?


> And ZFS with Linux
> isn't so great because it keeps fuse in between.


+1

https://packages.debian.org/bullseye/zfs-dkms

https://packages.debian.org/bullseye/zfsutils-linux


On 11/9/22 03:20, hw wrote:

> The use case comes down to making backups once in a while. When making
> another backup, at least the latest previous backup must not be overwritten.
> Sooner or later, there won't be enough disk space to keep two full backups.
> With disk prices as crazy high as they currently are, I might even move
> discs from the backup server to the active server when it runs out of space
> before I move data into archive (without backup) or start deleting stuff.
> All prices keep going up, so I don't expect disc prices to go down.
>
> Deduplication is only one possible way to go about it. I'm undecided if it's
> better to have only one full backup and to use snapshots instead.
> Deduplicating the backups would kinda turn two copies into only one for
> whatever gets deduplicated, so that might not be better than snapshots. Or I
> could use both and perhaps save even more space.

On 11/9/22 04:28, hw wrote:
> Of course it would be better to have more than one machine, but I don't
> have that.


If you already have a ZFS pool, the way to back it up is to replicate
the pool to another pool. Set up an external drive with a pool and
replicate your server pool to that periodically.
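In sketch form (pool and snapshot names are placeholders):

```shell
# One-time: create a pool on the external drive
sudo zpool create extbak /dev/sdX
# Each backup run: snapshot recursively, then send the stream
sudo zfs snapshot -r tank@2022-11-10
sudo zfs send -R tank@2022-11-10 | sudo zfs receive -Fu extbak/tank
# Later runs send only the delta between two snapshots:
sudo zfs send -R -i tank@2022-11-10 tank@2022-11-17 | sudo zfs receive -Fu extbak/tank
```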


David

David Christensen

Nov 10, 2022, 12:40:05 AM
On 11/9/22 01:35, DdB wrote:
> > But
> i am satisfied with zfs performance from spinning rust, if i dont fill
> up the pool too much, and defrag after a while ...


What is your technique for defragmenting ZFS?


David

gene heskett

Nov 10, 2022, 2:20:06 AM
Which brings up another suggestion in two parts:

1: use amanda, with tar and compression to reduce the size of the
backups. And use a backup cycle of a week or 2, because amanda, when
advancing a level, will only back up that which has been changed since the
last backup. On a quiet system, a level 3 backup for a 50gb network of
several machines can be under 100 megs. More on a busy system of course.
Amanda keeps track of all that automatically.

2: As disks fail, replace them with SSD's which use much less power than
spinning rust. And they are typically 5x faster than commodity spinning
rust.

Here, and historically with spinning rust, backing up 5 machines, at 3am
every morning is around 10gb total and under 45 minutes. This includes
the level 0's it does by self adjusting the schedule to spread the level
0's, AKA the fulls, out over the backup cycle so the amount of storage
used for any one backup run is fairly consistent.

Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
- Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/>

Christoph Brinkhaus

Nov 10, 2022, 4:40:06 AM
Using the block device is no issue until you have a mirror or so.
In case of a mirror ZFS will use the capacity of the smallest drive.

I have read that a disk sold as, for example, 100GB might actually be
slightly larger than 100GB. When you want to replace such a disk with a
spare that is slightly smaller than the original, the pool will not fit
on the new disk and the replacement fails.

With partitions you can specify the exact space. It does not hurt if a
few MB are left unallocated, but then the partitions on the disks have
exactly the same size.
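That sizing trick looks roughly like this (a sketch with parted and placeholder devices):

```shell
# Create identically sized partitions a bit below the nominal disk size,
# so a marginally smaller replacement disk can still hold one
sudo parted -s /dev/sdb mklabel gpt mkpart zfs 1MiB 99GiB
sudo parted -s /dev/sdc mklabel gpt mkpart zfs 1MiB 99GiB
sudo zpool create tank mirror /dev/sdb1 /dev/sdc1
```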

Kind regards,
Christoph

hede

Nov 10, 2022, 5:50:05 AM
On Wed, 09 Nov 2022 13:52:26 +0100 hw <h...@adminart.net> wrote:

> Does that work? Does bees run as long as there's something to
> deduplicate and
> only stops when there isn't?

Bees is a service (daemon) which runs 24/7 watching btrfs transaction
state (the checkpoints). If there are new transactions then it kicks in.
But it's a niced service (man nice, man ionice). If your backup process
has higher priority than "idle" (which is typically the case) and
produces high load it will potentially block out bees until the backup
is finished (maybe!).

> I thought you start it when the data is in place and
> not before that.

That's the case with fdupes, duperemove, etc.
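With those, the run happens after the fact; a one-shot pass over finished backups might look like this (a sketch; the path and hashfile location are placeholders):

```shell
# Block-based, after-the-fact dedup on a btrfs (or XFS) filesystem.
# -r recurse, -d actually submit the dedup requests, -h human-readable
# sizes; --hashfile caches block hashes so reruns are cheap
sudo duperemove -rdh --hashfile=/var/tmp/backup.hash /srv/backups
```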

> You can easily make changes to two full copies --- "make changes"
> meaning that
> you only change what has been changed since last time you made the
> backup.

Do you mean to modify (make changes) to one of the backups? I never
considered making changes to my backups. I do make changes to the live
data and next time (when the incremental backup process runs) these
changes do get into backup storage. Making changes to some backups ... I
won't call that backups anymore.

Or do you mean you have two copies and alternatively "update" these
copies to reflect the live state? I do not see a benefit in this. At
least if both reside on the same storage system. There's a waste in
storage space (doubled files). One copy with many incremental backups
would be better. And if you plan to deduplicate both copies, simply use a
backup solution with incremental backups.

Syncing two adjacent copies means to submit all changes a second time,
which was already transferred for the first copy. The second copy is
still on some older state the moment you update this one.

Yet again I do prefer a single process for having one[sic] consistent
backup storage with a working history.

Two copies on two different locations is some other story, that indeed
can have benefits.

> > For me only the first backup is a full backup, every other backup is
> > incremental.
>
> When you make a second full backup, that second copy is not
> incremental. It's a
> full backup.

correct. That's the reason I do make incremental backups. And with
incremental backups I do mean that I can restore "full" backups for
several days: every day of the last week, one day for every month of the
year, even several days of past years and so on. But the whole backup of
all those "full" backups is not even two full backups in size. It's less
in size but offers more.

For me a single full backup needs several days (Terabytes via DSL upload
to the backup location) while incremental backups are MUCH faster
(typically a few minutes if there wasn't changed that much). So I use
the later one.

> What difference does it make wether the deduplication is block based or
> somehow
> file based (whatever that means).

File-based deduplication means files get compared as a whole. The
result: two big, nearly identical files both need to get stored in
full, because they differ.
Say for example a backup of a virtual machine image which got started
between two backup runs. More than 99% of the image is the same as
before, but because there's some log written inside the VM image they do
differ. Those files are nearly identical, even in position of identical
data.

Block based deduplication can find parts of a file to be exclusive
(changed blocks) and other parts to set shared (blocks with same
content):

#####
# btrfs fi du file1 file2

Total Exclusive Set shared Filename
2.30GiB 23.00MiB 2.28GiB file1
2.30GiB 149.62MiB 2.16GiB file2
#####
here both files share data but do also have their exclusive data.

> I'm flexible, but I distrust "backup solutions".

I would say, it depends on. I do also distrust everything, but some sane
solution maybe I do distrust a little less than my "self-built" one. ;-)

Don't trust your own solution more than others "on principle", without
some real reasons for distrust.

> Sounds good. Before I try it, I need to make a backup in case
> something goes
> wrong.

;-)

regards
hede

Greg Wooledge

Nov 10, 2022, 7:10:06 AM
On Thu, Nov 10, 2022 at 05:54:00AM +0100, hw wrote:
> ls -la
> total 5
> drwxr-xr-x 3 namefoo namefoo 3 Aug 16 22:36 .
> drwxr-xr-x 24 root root 4096 Nov 1 2017 ..
> drwxr-xr-x 2 namefoo namefoo 2 Jan 21 2020 ?
> namefoo@host /srv/datadir $ ls -la '?'
> ls: cannot access '?': No such file or directory
> namefoo@host /srv/datadir $
>
>
> This directory named ? appeared on a ZFS volume for no reason and I can't access
> it and can't delete it. A scrub doesn't repair it. It doesn't seem to do any
> harm yet, but it's annoying.
>
> Any idea how to fix that?

ls -la might not be showing you the true name. Try this:

printf %s * | hd

That should give you a hex dump of the bytes in the actual filename.

If you misrepresented the situation, and there's actually more than one
file in this directory, then use something like this instead:

shopt -s failglob
printf '%s\0' ? | hd

Note that the ? is *not* quoted here, because we want it to match any
one-character filename, no matter what that character actually is. If
this doesn't work, try ?? or * as the glob, until you manage to find it.

If it turns out that '?' really is the filename, then it becomes a ZFS
issue with which I can't help.
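One more avenue: if the name contains unprintable bytes, you can operate on the entry by inode number without ever typing the name. A self-contained demonstration (the bad filename here is fabricated):

```shell
# Scratch directory with a file whose name contains an invalid byte
cd "$(mktemp -d)"
touch "$(printf 'bad\270name')"
ls -lib                                   # -b escapes odd bytes, -i shows inodes
inum=$(ls -i | awk '{print $1; exit}')    # inode of the (only) entry
find . -maxdepth 1 -inum "$inum" -delete  # remove it by inode, not by name
ls -A                                     # nothing left
```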

hw

Nov 10, 2022, 8:20:05 AM
On Wed, 2022-11-09 at 12:08 +0100, Thomas Schmitt wrote:
> Hi,
>
> i wrote:
> > >   https://github.com/dm-vdo/kvdo/issues/18
>
> hw wrote:
> > So the VDO ppl say 4kB is a good block size
>
> They actually say that it's the only size which they support.
>
>
> > Deduplication doesn't work when files aren't sufficiently identical,
>
> The definition of sufficiently identical probably differs much between
> VDO and ZFS.
> ZFS has more knowledge about the files than VDO has. So it might be worth
> for it to hold more info in memory.

Dunno, apparently they keep checksums of blocks in memory. More checksums, more
memory ...

> > It seems to make sense that the larger
> > the blocks are, the lower chances are that two blocks are identical.
>
> Especially if the filesystem's block size is smaller than the VDO
> block size, or if the filesystem does not align file content intervals
> to block size, like ReiserFS does.

That would depend on the files.

> > So how come that deduplication with ZFS works at all?
>
> Inner magic and knowledge about how blocks of data form a file object.
> A filesystem does not have to hope that identical file content is
> aligned to a fixed block size.

No, but when it uses large blocks it can store more files in a block and won't
be able to deduplicate the identical files in a block because the blocks are
atoms in deduplication. The larger the blocks are, the less likely it seems
that multiple blocks are identical.

> didier gaumet wrote:
> > > > The goal being primarily to optimize storage space
> > > > for a provider of networked virtual machines to entities or customers
>
> I wrote:
> > > Deduplicating over several nearly identical filesystem images might indeed
> > > bring good size reduction.
>
> hw wrote:
> > Well, it's independent of the file system.
>
> Not entirely. As stated above, i would expect VDO to work not well for
> ReiserFS with its habit to squeeze data into unused parts of storage blocks.
> (This made it great for storing many small files, but also led to some
> performance loss by more fragmentation.)

VDO is independent of the file system, and 4k blocks are kinda small. It
doesn't matter how files are aligned to blocks of a file system because VDO
always uses chunks of 4k each and compares them and always works the same. You
can always create a file system with an unlucky block size for the files on it
or even one that makes sure that all the 4k blocks are not identical. We could
call it spitefs maybe :)
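The fixed 4k chunking is easy to illustrate with coreutils alone: split a file into 4 KiB pieces and hash them, the way a fixed-block deduplicator would (a toy illustration, not VDO's actual internals):

```shell
# Three 4 KiB zero blocks: a fixed-block deduplicator would store one copy
cd "$(mktemp -d)"
head -c 8192 /dev/zero > a        # file a = two identical 4 KiB blocks
head -c 4096 /dev/zero > b        # file b = one more identical block
cat a b > data
split -b 4096 data blk_           # -> blk_aa blk_ab blk_ac
sha256sum blk_* | awk '{print $1}' | sort | uniq -c   # one hash, count 3
```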

> > Do I want/need controlled redundancy with
> > backups on the same machine, or is it better to use snapshots and/or
> > deduplication to reduce the controlled redundancy?
>
> I would want several independent backups on the first hand.

Independent? Like two full copies like I'm making?

> The highest risk for backup is when a backup storage gets overwritten or
> updated. So i want several backups still untouched and valid, when the
> storage hardware or the backup software begin to spoil things.

That's what I thought, but I'm about to run out of disk space for multiple full
copies.

> Deduplication increases the risk that a partial failure of the backup
> storage damages more than one backup. On the other hand it decreases the
> work load on the storage

It may make all backups unusable because the single copy that deduplication
has left has been damaged. However, how likely is a partial failure of a
storage volume, and how relevant is it? How often does a storage volume --- the
underlying media doesn't necessarily matter; for example, when a disk goes bad
in a RAID, you replace it and keep going --- go bad in only one place? When
the volume has gone away, so have all the copies.

> and the time window in which the backuped data
> can become inconsistent on the application level.

Huh?

> Snapshot before backup reduces that window size to 0. But this still
> does not prevent application level inconsistencies if the application is
> caught in the act of reworking its files.

You make the snapshot of the backup before starting to make a backup, not while
making one.

Or are you referring to the data being altered while a backup is in progress?

> So i would use at least four independent storage facilities interchangeably.
> I would make snapshots, if the filesystem supports them, and backup those
> instead of the changeable filesystem.
> I would try to reduce the activity of applications on the filesystem when
> the snapshot is made.

right

> I would allow each independent backup storage to do its own deduplication,
> not sharing it with the other backup storages.

If you have them on different machines or volumes, it would be difficult to do
it otherwise.

> > > In case of VDO i expect that you need to use different deduplicating
> > > devices to get controlled redundancy.
>
> > How would the devices matter?  It's the volume residing on devices that gets
> > deduplicated, not the devices.
>
> I understand that one VDO device implements one deduplication.
> So if no sharing of deduplication is desired between the backups, then i
> expect that each backup storage needs its own VDO device.

right

Would you even make so many backups on the same machine?

> > How can you make backups on Bluerays? They hold only 50GB or so and I'd
> > need thousands of them.
>
> My backup needs are much smaller than yours, obviously.
> I have an active $HOME tree of about 4 GB and some large but less agile
> data hoard of about 500 GB.
> The former gets backuped 5 times per day on appendable 25 GB BD media
> (as stated, 200+ days fit on one BD).

That makes it a lot easier. Isn't 5 times a day a bit much? And it's an odd
number.

> The latter gets an incremental update on a single-session 25 GB BD every
> other day. A new base backup needs about 20 BD media. Each time the
> single update BD is full, it joins the base backup in its cake box and a
> new incremental level gets started.
>
> If you have much more valuable data to backup then you will probably
> decide for rotating magnetic storage. Not only for capacity but also for
> the price/capacity ratio.

Yes, I'm re-using the many small hard discs that have accumulated over the
years. It's much easier and way more efficient to use few large discs for the
active data than many small ones, and using the small ones for backups is way
better than just having them laying around unused.

I wish we could still (relatively) easily make backups on tapes. Just change
the tape every day and you can have a reasonable number of full backups. Of
course, spooling and seeking tapes kinda sucks, but how often do you need to do
that.

> But you should consider to have at least some of your backups on
> removable media, e.g. hard disks in USB boxes. Only those can be isolated
> from the risks of daily operation, which i deem crucial for safe backup.

The backup server is turned off unless I'm making backups and its PDU port is
switched off, so not much will happen to it. I'm bad because I'm making them
only once in a while, and last time was very long ago ...

Once I've figured out what to do, I'll make a backup. A full new backup takes
ages and I need to stop modifying stuff and not start all over again all the
time. I think last time I created a btrfs RAID5, being unaware that that's a
bad idea ...

hw

Nov 10, 2022, 8:30:05 AM
On Thu, 2022-11-10 at 10:34 +0100, Christoph Brinkhaus wrote:
> Am Thu, Nov 10, 2022 at 04:46:12AM +0100 schrieb hw:
> > On Wed, 2022-11-09 at 18:26 +0100, Christoph Brinkhaus wrote:
> > > Am Wed, Nov 09, 2022 at 06:11:34PM +0100 schrieb hw:
> > > [...]
> [...]
> > >
> >
> > Why would partitions be better than the block device itself?  They're like
> > an
> > additional layer and what could be faster and easier than directly using the
> > block devices?
>  
>  Using the block device is no issue until you have a mirror or so.
>  In case of a mirror ZFS will use the capacity of the smallest drive.

But you can't make partitions larger than the drive.

>  I have read that, for example, a 100GB disk might be slightly larger
>  than 100GB. When you want to replace a 100GB disk with a spare one
>  which is slightly smaller than the original one, the pool will not fit
>  on the disk and the replacement fails.

Ah yes, right! I kinda did that a while ago for spinning disks that might be
replaced by SSDs eventually and wanted to make sure that the SSDs wouldn't be
too small. I forgot about that, my memory really isn't what it used to be ...

>  With partitions you can specify the space. It does not hurt if there
>  are a few MB unallocated. But then the partitions of the disks have
>  exactly the same size.

yeah

hw

Nov 10, 2022, 8:30:05 AM
On Thu, 2022-11-10 at 10:59 +0100, DdB wrote:
> Am 10.11.2022 um 04:46 schrieb hw:
> > On Wed, 2022-11-09 at 18:26 +0100, Christoph Brinkhaus wrote:
> > > Am Wed, Nov 09, 2022 at 06:11:34PM +0100 schrieb hw:
> > > [...]
> [...]
> > >
> > Why would partitions be better than the block device itself?  They're like
> > an
> > additional layer and what could be faster and easier than directly using the
> > block devices?
> >
> >
> hurts my eyes to see such disinformation circulating.

What's wrong about it?

Curt

Nov 10, 2022, 8:50:05 AM
On 2022-11-08, The Wanderer <wand...@fastmail.fm> wrote:
>
> That more general sense of "backup" as in "something that you can fall
> back on" is no less legitimate than the technical sense given above, and
> it always rubs me the wrong way to see the unconditional "RAID is not a
> backup" trotted out blindly as if that technical sense were the only one
> that could possibly be considered applicable, and without any
> acknowledgment of the limited sense of "backup" which is being used in
> that statement.
>

Maybe it's a question of intent more than anything else. I thought RAID
was intended for a server scenario where if a disk fails, your down
time is virtually null, whereas a backup is intended to prevent data
loss. RAID isn't ideal for the latter because it doesn't ship the saved
data off-site from the original data (or maybe a RAID array is
conceivable over a network and a distance?).

Of course, I wouldn't know one way or another, but the complexity (and
substantial verbosity) of this thread seem to indicate that that all
these concepts cannot be expressed clearly and succinctly, from which I
draw my own conclusions.

hw

Nov 10, 2022, 8:50:05 AM
On Thu, 2022-11-10 at 07:03 -0500, Greg Wooledge wrote:
> On Thu, Nov 10, 2022 at 05:54:00AM +0100, hw wrote:
> > ls -la
> > insgesamt 5
> > drwxr-xr-x  3 namefoo namefoo    3 16. Aug 22:36 .
> > drwxr-xr-x 24 root    root    4096  1. Nov 2017  ..
> > drwxr-xr-x  2 namefoo namefoo    2 21. Jan 2020  ?
> > namefoo@host /srv/datadir $ ls -la '?'
> > ls: Zugriff auf ? nicht möglich: Datei oder Verzeichnis nicht gefunden
> > namefoo@host /srv/datadir $
> >
> >
> > This directory named ? appeared on a ZFS volume for no reason and I can't
> > access
> > it and can't delete it.  A scrub doesn't repair it.  It doesn't seem to do
> > any
> > harm yet, but it's annoying.
> >
> > Any idea how to fix that?
>
> ls -la might not be showing you the true name.  Try this:
>
> printf %s * | hd
>
> That should give you a hex dump of the bytes in the actual filename.

good idea:

printf %s * | hexdump
0000000 77c2 6861 0074
0000005

> If you misrepresented the situation, and there's actually more than one
> file in this directory, then use something like this instead:
>
> shopt -s failglob
> printf '%s\0' ? | hd

shopt -s failglob
printf '%s\0' ? | hexdump
0000000 00c2
0000002

> Note that the ? is *not* quoted here, because we want it to match any
> one-character filename, no matter what that character actually is.  If
> this doesn't work, try ?? or * as the glob, until you manage to find it.

printf '%s\0' ?? | hexdump
-bash: Keine Entsprechung: ??

(meaning something like "no equivalent")


printf '%s\0' * | hexdump
0000000 00c2 6177 7468 0000
0000007


> If it turns out that '?' really is the filename, then it becomes a ZFS
> issue with which I can't help.

I would think it is. Is it?

perl -e 'print chr(0xc2) . "\n"'

... prints a blank line. What's 0xc2? I guess that should be UTF8 ...
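For what it's worth, 0xc2 by itself is only the lead byte of a two-byte UTF-8
sequence, so on its own it is invalid UTF-8, which is presumably why ls falls
back to displaying '?'. A quick check:

```shell
# 0xc2 alone is an incomplete UTF-8 sequence; followed by a continuation
# byte (0x80-0xbf) it encodes U+0080..U+00BF (0xc2 0xa0 is a no-break space)
printf '\xc2'     | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1 \
  && echo valid || echo invalid    # invalid
printf '\xc2\xa0' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1 \
  && echo valid || echo invalid    # valid
```

So the directory's name is most likely the first half of some two-byte
character that got truncated somewhere along the way.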


printf %s *
aht

What would you expect it to print after shopt?

The Wanderer

Nov 10, 2022, 9:00:06 AM
On 2022-11-10 at 08:40, Curt wrote:

> On 2022-11-08, The Wanderer <wand...@fastmail.fm> wrote:
>
>> That more general sense of "backup" as in "something that you can
>> fall back on" is no less legitimate than the technical sense given
>> above, and it always rubs me the wrong way to see the unconditional
>> "RAID is not a backup" trotted out blindly as if that technical
>> sense were the only one that could possibly be considered
>> applicable, and without any acknowledgment of the limited sense of
>> "backup" which is being used in that statement.
>
> Maybe it's a question of intent more than anything else. I thought
> RAID was intended for a server scenario where if a disk fails, your
> down time is virtually null, whereas a backup is intended to
> prevent data loss.

If the disk fails, the data stored on the disk is lost (short of
forensic-style data recovery, anyway), so anything that ensures that
that data is still available serves to prevent data loss.

RAID ensures that the data is still available even if the single disk
fails, so it qualifies under that criterion.

> RAID isn't ideal for the latter because it doesn't ship the saved
> data off-site from the original data (or maybe a RAID array is
> conceivable over a network and a distance?).

Shipping the data off-site is helpful to protect against most possible
causes for data loss, such as damage to or theft of the on-site
equipment. (Or, for that matter, accidental deletion of the live data.)

It's not necessary to protect against some causes, however, such as
failure of a local disk. For that cause, RAID fulfills the purpose just
fine.

RAID does not protect against most of those other scenarios, however, so
there's certainly still a role for - and a reason to recommend! -
off-site backup. It's just that the existence of those options does not
mean RAID does not have a role to play in avoiding data loss, and
thereby a valid sense in which it can be considered to provide something
to fall back on, which is the approximate root meaning of the
nontechnical sense of "backup".

--
The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man. -- George Bernard Shaw


Nicolas George

Nov 10, 2022, 9:00:06 AM
Curt (12022-11-10):
> Maybe it's a question of intent more than anything else. I thought RAID
> was intended for a server scenario where if a disk fails, you're down
> time is virtually null, whereas as a backup is intended to prevent data
> loss.

Maybe just use common sense. RAID means your data is present on several
drives. You can just deduce what it can help for:

one drive fails → you can replace it immediately, no downtime

one drive fails → the data is present elsewhere, no data loss

several¹ drive fail → downtime and data loss²

1: depending on RAID level
2: or not if you have backups too

> RAID isn't ideal for the latter because it doesn't ship the saved
> data off-site from the original data (or maybe a RAID array is
> conceivable over a network and a distance?).

It is always a matter of compromise. You cannot duplicate your data
off-site at the same rate as you duplicate it on a second local drive.

That means your off-site data will survive an EMP, but you will lose
minutes / hours / days of data prior to the EMP. OTOH, RAID will not
survive an EMP, but it will prevent all data loss caused by isolated
hardware failure.

--
Nicolas George

Dan Ritter

Nov 10, 2022, 9:10:05 AM
hw wrote:
> And I've been reading that when using ZFS, you shouldn't make volumes with more
> than 8 disks. That's very inconvenient.


Where do you read these things?

The number of disks in a vdev can be optimized, depending on
your desired redundancy method, total number of drives, and
tolerance for reduced performance during resilvering.

Multiple vdevs together form a zpool. Filesystems are allocated from
a zpool.

8 is not a magic number.

-dsr-

Dan Ritter

Nov 10, 2022, 9:30:05 AM
Curt wrote:
> On 2022-11-08, The Wanderer <wand...@fastmail.fm> wrote:
> >
> > That more general sense of "backup" as in "something that you can fall
> > back on" is no less legitimate than the technical sense given above, and
> > it always rubs me the wrong way to see the unconditional "RAID is not a
> > backup" trotted out blindly as if that technical sense were the only one
> > that could possibly be considered applicable, and without any
> > acknowledgment of the limited sense of "backup" which is being used in
> > that statement.
> >
>
> Maybe it's a question of intent more than anything else. I thought RAID
> was intended for a server scenario where if a disk fails, your down
> time is virtually null, whereas a backup is intended to prevent data
> loss. RAID isn't ideal for the latter because it doesn't ship the saved
> data off-site from the original data (or maybe a RAID array is
> conceivable over a network and a distance?).

RAID means "redundant array of inexpensive disks". The idea, in the
name, is to bring together a bunch of cheap disks to mimic a single more
expensive disk, in a way which hopefully is more resilient to failure.

If you need a filesystem that is larger than a single disk (that you can
afford, or that exists), RAID is the name for the general approach to
solving that.

The three basic technologies of RAID are:

striping: increase capacity by writing parts of a data stream to N
disks. Can increase performance in some situations.

mirroring: increase resiliency by redundantly writing the same data to
multiple disks. Can increase performance of reads.

checksums/erasure coding: increase resilency by writing data calculated
from the real data (but not a full copy) that allows reconstruction of
the real data from a subset of disks. RAID5 allows one failure, RAID6
allows recovery from two simultaneous failures, fancier schemes may
allow even more.

You can work these together, or separately.
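The parity idea can be sketched in a few lines of shell arithmetic (a toy with
one byte per "disk", not anything like a real implementation):

```shell
# RAID-5-style parity: the parity block is the XOR of the data blocks,
# so any single lost block can be rebuilt from the survivors
d1=170                      # data block on disk 1 (one byte as a number)
d2=204                      # data block on disk 2
p=$(( d1 ^ d2 ))            # parity block stored on disk 3
# disk 1 dies; rebuild its block from disk 2 and the parity:
echo $(( p ^ d2 ))          # prints 170
```

Real implementations do this per stripe across whole sectors, and RAID6 adds a
second, differently computed syndrome so two failures are survivable.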

Now, RAID is not a backup because it is a single store of data: if you
delete something from it, it is deleted. If you suffer a lightning
strike to the server, there's no recovery from molten metal.

Some filesystems have snapshotting. Snapshotting can protect you from
the accidental deletion scenario, by allowing you to recover quickly,
but does not protect you from lightning.

The lightning scenario requires a copy of the data in some other
location. That's a backup.

You can store the backup on a RAID. You might need to store the backup
on a RAID, or perhaps by breaking it up into pieces to store on tapes or
optical disks or individual hard disks. The kind of RAID you choose for
the backup is not related to the kind of RAID you use on your primary
storage.

> Of course, I wouldn't know one way or another, but the complexity (and
> substantial verbosity) of this thread seem to indicate that that all
> these concepts cannot be expressed clearly and succinctly, from which I
> draw my own conclusions.

The fact that many people talk about things that they don't understand
does not restrict the existence of people who do understand it. Only
people who understand what they are talking about can do so clearly and
succinctly.

-dsr-

Greg Wooledge

Nov 10, 2022, 9:40:05 AM
On Thu, Nov 10, 2022 at 02:48:28PM +0100, hw wrote:
> On Thu, 2022-11-10 at 07:03 -0500, Greg Wooledge wrote:
> good idea:
>
> printf %s * | hexdump
> 0000000 77c2 6861 0074
> 0000005

Looks like there might be more than one file here.

> > If you misrepresented the situation, and there's actually more than one
> > file in this directory, then use something like this instead:
> >
> > shopt -s failglob
> > printf '%s\0' ? | hd
>
> shopt -s failglob
> printf '%s\0' ? | hexdump
> 0000000 00c2
> 0000002

OK, that's a good result.

> > Note that the ? is *not* quoted here, because we want it to match any
> > one-character filename, no matter what that character actually is.  If
> > this doesn't work, try ?? or * as the glob, until you manage to find it.
>
> printf '%s\0' ?? | hexdump
> -bash: Keine Entsprechung: ??
>
> (meaning something like "no equivalent")

The English version is "No match".

> printf '%s\0' * | hexdump
> 0000000 00c2 6177 7468 0000
> 0000007

I dislike this output format, but it looks like there are two files
here. The first is 0xc2, and the second is 0x77 0x61 0x68 0x74 if
I'm reversing and splitting the silly output correctly. (This spells
"waht", if I got it right.)
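The silliness is plain hexdump's default format, which prints little-endian
16-bit words rather than bytes; `hexdump -C` or `od -An -tx1` shows the bytes
in file order. A quick illustration with the same five bytes:

```shell
# the five bytes 0xc2 'w' 'a' 'h' 't' under the two output formats
printf '\xc2waht' | hexdump        # 0000000 77c2 6861 0074  (LE words)
printf '\xc2waht' | od -An -tx1    # c2 77 61 68 74          (file order)
```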

> > If it turns out that '?' really is the filename, then it becomes a ZFS
> > issue with which I can't help.
>
> I would think it is. Is it?

The file in question appears to have a name which is the single byte 0xc2.
Since that's not a valid UTF-8 character, ls chooses something to display
instead. In your case, it chose a '?' character. I'm guessing this is on
an older release of Debian.

In my case, it does this:

unicorn:~$ mkdir /tmp/x && cd "$_"
unicorn:/tmp/x$ touch $'\xc2'
unicorn:/tmp/x$ ls -la
total 80
-rw-r--r-- 1 greg greg 0 Nov 10 09:21 ''$'\302'
drwxr-xr-x 2 greg greg 4096 Nov 10 09:21 ./
drwxrwxrwt 20 root root 73728 Nov 10 09:21 ../

In my version of ls, there's a --quoting-style= option that can help
control what you see. But that's a tangent you can explore later.

Since we know the actual name of the file (subdirectory) now, let's just
rename it to something sane.

mv $'\xc2' subdir

Then you can investigate it, remove it, or do whatever else you want.
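Putting the whole procedure together in a throwaway directory (a sketch; the
target name "recovered" is just an example):

```shell
# end-to-end: create a file whose name is the lone byte 0xc2, recover
# the real name via the shell glob, and rename it to something sane
dir=$(mktemp -d) && cd "$dir"
touch $'\xc2'
name=$(printf '%s' *)               # the glob expands to the true name
printf '%s' "$name" | od -An -tx1   # shows: c2
mv "$name" recovered
ls                                  # recovered
```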

Thomas Schmitt

Nov 10, 2022, 9:40:05 AM
Hi,

i wrote:
> > the time window in which the backuped data
> > can become inconsistent on the application level.

hw wrote:
> Or are you referring to the data being altered while a backup is in
> progress?

Yes. Data of different files or at different places in the same file
may have relations which may become inconsistent during change operations
until the overall change is complete.
If you are unlucky you can even catch a plain text file that is only half
stored.

The risk for this is not 0 with filesystem snapshots, but it grows further
if there is a time interval during which changes may or may not be copied
into the backup, depending on filesystem internals and bad luck.


> Would you even make so many backups on the same machine?

It depends on the alternatives.
If you have other storage systems which can host backups, then it is of
course good to use them for backup storage. But if you have less separate
storage than independent backups, then it is still worthwhile to put more
than one backup on the same storage.


> Isn't 5 times a day a bit much?

It depends on how much you are willing to lose in case of a mishap.
My $HOME backup runs last about 90 seconds each. So it is not overly
cumbersome.


> And it's an odd number.

That's because the early afternoon backup is done twice. (A tradition
which started when one of my BD burners began to become unreliable.)


> Yes, I'm re-using the many small hard discs that have accumulated over the
> years.

If it's only their size which disqualifies them for production purposes,
then it's ok. But if they are nearing the end of their life time, then
i would consider to decommission them.


> I wish we could still (relatively) easily make backups on tapes.

My personal endeavor with backups on optical media began when a customer
had a major data mishap and all backup tapes turned out to be unusable.
Frequent backups had been made and allegedly check-read. But in the
end it was big drama.
I then proposed to use a storage where the boss of the department can
make random tests with the applications which made and read the files.
So i came to writing backup scripts which used mkisofs and cdrecord
for CD-RW media.


> Just change
> the tape every day and you can have a reasonable number of full backups.

If you have thousandfold the size of Blu-rays worth of backup, then
probably a tape library would be needed. (I find LTO tapes with up to
12 TB in the web, which is equivalent to 480 BD-R.)


> A full new backup takes ages

It would help if you could divide your backups into small agile parts and
larger parts which don't change often.
The agile ones need frequent backup, whereas the lazy ones would not suffer
so much damage if the newest available backup is a few days old.


> I need to stop modifying stuff and not start all over again

The backup part of a computer system should be its most solid and artless
part. No shortcuts, no fancy novelties, no cumbersome user procedures.


Have a nice day :)

Thomas

The Wanderer

Nov 10, 2022, 9:40:06 AM
On 2022-11-10 at 09:06, Dan Ritter wrote:

> Now, RAID is not a backup because it is a single store of data: if
> you delete something from it, it is deleted. If you suffer a
> lightning strike to the server, there's no recovery from molten
> metal.

Here's where I find disagreement.

Say you didn't use RAID, and you had two disks in the same machine.

In order to avoid data loss in the event that one of the disks failed,
you engaged in a practice of copying all files from one disk onto the other.

That process could, and would, easily be referred to as backing up the
files. It's not a very distant backup, and it wouldn't protect against
that lightning strike, but it's still a separate backed-up copy.

But copying those files manually is a pain, so you might well set up a
process to automate it. That then becomes a scheduled backup, from one
disk onto another.

That scheduled process means that you have periods where the most
recently updated copy of the live data hasn't made it into the backup,
so there's still a time window where you're at risk of data loss if the
first disk fails. So you might set things up for the automated process
to in effect run continuously, writing the data to both disks in
parallel as it comes in.

And at that point you've basically reinvented mirroring RAID.

You've also lost the protection against "if you delete something from
it"; unlike deeper, more robust forms of backup, RAID does not protect
against accidental deletion. But you still have the protection against
"if one disk fails" - and that one single layer of protection against
one single cause of data loss is, I contend, still valid to refer to as
a "backup" just as much as the original manually-made copies were.

> Some filesystems have snapshotting. Snapshotting can protect you
> from the accidental deletion scenario, by allowing you to recover
> quickly, but does not protect you from lightning.
>
> The lightning scenario requires a copy of the data in some other
> location. That's a backup.

There are many possible causes of data loss. My contention is that
anything that protects against *any* of them qualifies as some level of
backup, and that there are consequently multiple levels / tiers /
degrees / etc. of backup.

RAID is not an advanced form of protection against data loss; it only
protects against one type of cause. But it still does protect against
that one type, and thus it is not valid to kick it out of that circle
entirely.

Curt

Nov 10, 2022, 9:50:05 AM
On 2022-11-10, Nicolas George <geo...@nsup.org> wrote:
> Curt (12022-11-10):
>> Maybe it's a question of intent more than anything else. I thought RAID
>> was intended for a server scenario where if a disk fails, you're down
>> time is virtually null, whereas as a backup is intended to prevent data
>> loss.
>
> Maybe just use common sense. RAID means your data is present on several
> drives. You can just deduce what it can help for:
>
> one drive fails → you can replace it immediately, no downtime

That's precisely what I said, so I'm baffled by the redundancy of your
words. Or are you a human RAID?

Nicolas George

Nov 10, 2022, 10:00:06 AM
Curt (12022-11-10):
> > one drive fails → you can replace it immediately, no downtime
> That's precisely what I said,

I was not stating that THIS PART of what you said was wrong.

> so I'm baffled by the redundancy of your
> words.

Hint: my mail did not stop at the line you quoted. Reading mails to the
end is usually a good practice to avoid missing information.

--
Nicolas George

Nicolas George

Nov 10, 2022, 10:10:06 AM
Curt (12022-11-10):
> Why restate it then needlessly?

To NOT state that you were wrong when you were not.

This branch of the discussion bores me. Goodbye.

--
Nicolas George

Curt

Nov 10, 2022, 10:10:06 AM
On 2022-11-10, Nicolas George <geo...@nsup.org> wrote:
> Curt (12022-11-10):
>> > one drive fails → you can replace it immediately, no downtime
>> That's precisely what I said,
>
> I was not stating that THIS PART of what you said was srong.

Why restate it then needlessly?

>> so I'm baffled by the redundancy of your
>> words.
>
> Hint: my mail did not stop at the line you quoted. Reading mails to the
> end is usually a good practice to avoid missing information.
>

It's also an insect repellent.

Curt

Nov 10, 2022, 10:20:05 AM
On 2022-11-10, Nicolas George <geo...@nsup.org> wrote:
> Curt (12022-11-10):
>> Why restate it then needlessly?
>
> To NOT state that you were wrong when you were not.
>
> This branch of the discussion bores me. Goodbye.
>

This isn't solid enough for a branch. It couldn't support a hummingbird.
And me too! That old ennui! Adieu!

Dan Ritter

Nov 10, 2022, 10:30:05 AM
Brad Rogers wrote:
> On Thu, 10 Nov 2022 08:48:43 -0500
> Dan Ritter <d...@randomstring.org> wrote:
>
> Hello Dan,
>
> >8 is not a magic number.
>
> Clearly, you don't read Terry Pratchett. :-)

In the context of ZFS, 8 is not a magic number.

May you be ridiculed by Pictsies.

-dsr-

d...@chris.oldnest.ca

Nov 10, 2022, 10:50:05 AM
On Wed, 09 Nov 2022 13:28:46 +0100
hw <h...@adminart.net> wrote:

> On Tue, 2022-11-08 at 09:52 +0100, DdB wrote:
> > Am 08.11.2022 um 05:31 schrieb hw:
> > > > That's only one point.
> > > What are the others?
> > >
> > > >  And it's not really some valid one, I think, as
> > > > you do typically not run into space problems with one single
> > > > action (YMMV). Running multiple sessions and out-of-band
> > > > deduplication between them works for me.
> > > That still requires you to have enough disk space for at least
> > > two full backups.
> > > I can see it working for three backups because you can
> > > deduplicate the first two, but not for two.  And why would I
> > > deduplicate when I have sufficient disk
> > > space.
> > >
> > Your wording likely confuses 2 different concepts:
>
> Noooo, I'm not confusing that :) Everyone says so and I don't know
> why ...
>
> > Deduplication avoids storing identical data more than once.
> > whereas
> > Redundancy stores information on more than one place on purpose to
> > avoid loos of data in case of havoc.
> > ZFS can do both, as it combines the features of a volume manager
> > with those of a filesystem and a software RAID.( I am using
> > zfsonlinux since its early days, for over 10 years now, but without
> > dedup. )
> >
> > In the past, i used shifting/rotating external backup media for that
> > purpose, because, as the saying goes: RAID is NOT a backup! Today, i
> > have a second server only for the backups, using zfs as well, which
> > allows for easy incremental backups, minimizing traffic and disk
> > usage.
> >
> > but you should be clear as to what you want: redundancy or
> > deduplication?
>
> The question is rather if it makes sense to have two full backups on
> the same machine for redundancy and to be able to go back in time, or
> if it's better to give up on redundancy and to have only one copy and
> use snapshots or whatever to be able to go back in time.

And the answer is no. The redundancy you gain from this is almost,
though not quite, meaningless, because of the large set of common
data-loss scenarios against which it offers no protection. You've made
it clear that the cost of storage media is a problem in your situation.
Doubling your backup server's requirement for scarce and expensive disk
space in order to gain a tiny fraction of the resiliency that's
normally implied by "redundancy" doesn't make sense. And being able to
go "back in time" can be achieved much more efficiently by using a
solution (be it off-the-shelf or roll-your-own) that starts with a full
backup and then just stores deltas of changes over time (aka incremental
backups). None of this, for the record, is "deduplication", and I
haven't seen any indication in this thread so far that actual
deduplication is relevant to your use case.

> Of course it would better to have more than one machine, but I don't
> have that.

Fine, just be realistic about the fact that this means you cannot in
any meaningful sense have "two full backups" or "redundancy". If and
when you can some day devote an RPi tethered to some disks to the job,
then you can set it up to hold a second, completely independent,
store of "full backup plus deltas". And *then* you would have
meaningful redundancy that offers some real resilience. Even better if
the second one is physically offsite.

In the meantime, storing multiple full copies of your data on one
backup server is just a way to rapidly run out of disk space on your
backup server for essentially no reason.


Cheers!
-Chris

hw

Nov 10, 2022, 10:50:06 AM
On Wed, 2022-11-09 at 21:36 -0800, David Christensen wrote:
> On 11/9/22 00:24, hw wrote:
>  > On Tue, 2022-11-08 at 17:30 -0800, David Christensen wrote:
>
>  > Hmm, when you can backup like 3.5TB with that, maybe I should put
> FreeBSD on my
>  > server and give ZFS a try.  Worst thing that can happen is that it
> crashes and
>  > I'd have made an experiment that wasn't successful.  Best thing, I
> guess, could
>  > be that it works and backups are way faster because the server
> doesn't have to
>  > actually write so much data because it gets deduplicated and reading
> from the
>  > clients is faster than writing to the server.
>
>
> Be careful that you do not confuse a ~33 GiB full backup set, and 78
> snapshots over six months of that same full backup set, with a full
> backup of 3.5 TiB of data.  I would suggest a 10 TiB pool to backup the
> latter.

The full backup isn't deduplicated?

> Writing to a ZFS filesystem with deduplication is much slower than
> simply writing to, say, an ext4 filesystem -- because ZFS has to hash
> every incoming block and see if it matches the hash of any existing
> block in the destination pool.  Storing the existing block hashes in a
> dedicated dedup virtual device will expedite this process.

But when it needs to write almost nothing because almost everything gets
deduplicated, can't it be faster than having to write everything?
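Whether that pays off can be estimated without ZFS at all: hashing fixed-size
blocks in userspace mimics the per-block check described above (a rough sketch,
not a ZFS tool; the input path is made up):

```shell
# count total vs unique 128 KiB blocks in a file to estimate how much a
# block-level dedup could save on it
f=/path/to/backup.img                       # hypothetical input file
split -b 131072 --filter='sha256sum' "$f" > /tmp/block-hashes
total=$(wc -l < /tmp/block-hashes)
unique=$(sort -u /tmp/block-hashes | wc -l)
echo "$total blocks, $unique unique"
```

If "unique" is close to "total", dedup would buy little for that data and only
add hashing overhead on every write.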

>  >> I run my backup script each night.  It uses rsync to copy files and
>  >
>  > Aww, I can't really do that because my server eats like 200-300W
> because it has
>  > so many disks in it.  Electricity is outrageously expensive here.
>
>
> Perhaps platinum rated power supplies?  Energy efficient HDD's/ SSD's?

If you pay for it ... :)

Running it once in a while for some hours to make backups is still possible.
Replacing the hardware is way more expensive.

> [...]
>  > Sounds like a nice setup.  Does that mean you use snapshots to keep
> multiple
>  > generations of backups and make backups by overwriting everything
> after you made
>  > a snapshot?
>
> Yes.

I start thinking more and more that I should make use of snapshots.

>  > In that case, is deduplication that important/worthwhile?  You're not
>  > duplicating it all by writing another generation of the backup but
> store only
>  > what's different through making use of the snapshots.
>
> Without deduplication or compression, my backup set and 78 snapshots
> would require 3.5 TiB of storage.  With deduplication and compression,
> they require 86 GiB of storage.

Wow that's quite a difference! What makes this difference, the compression or
the deduplication? When you have snapshots, you would store only the
differences from one snapshot to the next, and that would mean that there aren't
so many duplicates that could be deduplicated.

>  > ... I only never got around to figure [ZFS snapshots] out because I
> didn't have the need.
>
>
> I accidentally trash files on occasion.  Being able to restore them
> quickly and easily with a cp(1), scp(1), etc., is a killer feature.

indeed

> Users can recover their own files without needing help from a system
> administrator.

You have users who know how to get files out of snapshots?

>  > But it could also be useful for "little" things like taking a
> snapshot of the
>  > root volume before updating or changing some configuration and being
> able to
>  > easily to undo that.
>
>
> FreeBSD with ZFS-on-root has a killer feature called "Boot Environments"
> that has taken that idea to the next level:
>
> https://klarasystems.com/articles/managing-boot-environments/

That's really cool. Linux is missing out on a lot by treating ZFS as an alien.

I guess btrfs could, in theory, make something like boot environments possible,
but you can't really boot from btrfs because it will fail to boot as soon as
the boot volume is degraded, like when a disk has failed.  Then you're screwed
because you can't log in through ssh to fix anything and have to physically go
to the machine to get it back up.  That's a non-option, so you have to use
something other than btrfs to boot from.

>  >> I have 3.5 TiB of backups.
>
>
> It is useful to group files with similar characteristics (size,
> workload, compressibility, duplicates, backup strategy, etc.) into
> specific ZFS filesystems (or filesystem trees).  You can then adjust ZFS
> properties and backup strategies to match.

That's a good idea.

>  >>>> For compressed and/or encrypted archives, image, etc., I do not use
>  >>>> compression or de-duplication
>  >>>
>  >>> Yeah, they wouldn't compress.  Why no deduplication?
>  >>
>  >>
>  >> Because I very much doubt that there will be duplicate blocks in
> such files.
>  >
>  > Hm, would it hurt?
>
>
> Yes.  ZFS deduplication is resource intensive.

But you're using it already.

>  > Oh it's not about performance when degraded, but about performance.
> IIRC when
>  > you have a ZFS pool that uses the equivalent of RAID5, you're still
> limited to
>  > the speed of a single disk.  When you have a mysql database on such a ZFS
>  > volume, it's dead slow, and removing the SSD cache when the SSDs
> failed didn't
>  > make it any slower.  Obviously, it was a bad idea to put the database
> there, and
>  > I wouldn't do again when I can avoid it.  I also had my data on such
> a volume
>  > and I found that the performance with 6 disks left much to desire.
>
>
> What were the makes and models of the 6 disks?  Of the SSD's?  If you
> have a 'zpool status' console session from then, please post it.

They were (and still are) 6x4TB WD Red (though one or two have failed over time)
and two Samsung 850 PRO, IIRC. I don't have an old session anymore.

These WD Red are slow to begin with.  IIRC, both SSDs failed and I removed them.

The other instance didn't use SSDs but 6x2TB HGST Ultrastar. Those aren't
exactly slow but ZFS is slow.

> Constructing a ZFS pool to match the workload is not easy.

Well, back then there wasn't much information because ZFS was a pretty new
thing.

>   STFW there
> are plenty of articles.  Here is a general article I found recently:
>
> https://klarasystems.com/articles/choosing-the-right-zfs-pool-layout/

Thanks! If I make a zpool for backups (or anything else), I need to do some
reading beforehand anyway.

> MySQL appears to have the ability to use raw disks.  Tuned correctly,
> this should give the best results:
>
> https://dev.mysql.com/doc/refman/8.0/en/innodb-system-tablespace.html#innodb-raw-devices

Could mysql 5.6 already do that? I'll have to see if mariadb can do that now
...

> If ZFS performance is not up to your expectations, and there are no
> hardware problems, next steps include benchmarking, tuning, and/or
> adding or adjusting the hardware and its usage.

In theory, yes :)

I'm very reluctant to mess with the default settings of file systems.  When xfs
became available for Linux sometime in the '90s, I managed to lose data when an
xfs file system got messed up.  Fortunately, I was able to recover almost all of
it from backups and from the file system.  I never really found out what caused
it, but a long time later I figured that I probably hadn't used the mount
options I should have used.  I had messed with the defaults for some reason I
don't remember.  That taught me a lesson.

>  >> ... invest in hardware to get performance.
>
>  > Hardware like?
>
>
> Server chassis, motherboards, chipsets, processors, memory, disk host
> bus adapters, disk racks, disk drives, network interface cards, etc..

Well, who's gonna pay for that?

>  > In theory, using SSDs for cache with ZFS should improve
>  > performance.  In practise, it only wore out the SSDs after a while,
> and now it's
>  > not any faster without SSD cache.
>
>
> Please run 'zpool status' and post the console session (prompt, command
> entered, output displayed).  Please correlate the vdev's to disk drive
> makes and models.

See above ... The pool is a raidz1-0 with the 6x4TB Red drives, and no SSDs are
left.

> On 11/9/22 03:41, hw wrote:
>
> > I don't have anything without ECC RAM,
>
>
> Nice.

Yes :)  Buying used has its advantages.  You don't get the fastest, but you get
tons of ECC RAM and awesome CPUs and reliability.

> > and my server was never meant for ZFS.
>
>
> What is the make and model of your server?

I put it together myself. The backup server uses a MSI mainboard with the
designation S0121 C204 SKU in a Chenbro case that has a 16xLFF backplane. It
has only 16GB RAM and would max out at 32GB. Unless you want ZFS with
deduplication, that's more than enough to make backups :)

I could replace it with a Dell R720 to get more RAM, but those can have only
12xLFF.  I could buy a new Tyan S7012 WGM4NR for EUR 50 before they're sold out
and stuff at least 48GB RAM into it plus two X5690 Xeons (which are supposed to
go into a Z800 I have sitting around and could try to sell, but I'm lazy), but
then I'd probably have to buy CPU coolers for it (I'm not sure the coolers of
the Z800 fit) and a new UPS because it would need so much power.  (I also have
the 48GB because they came in a server I bought for the price of the X5690s (to
get the X5690s) and another 48GB in the Z800, but not all of it might fit ...)

It would be fun, but I don't really feel like throwing money at technology that
old just for making a backup once in a while.  If you can tell me that the
coolers of the Z800 definitely fit the Tyan board, I'll buy one and throw it
into my server.  It would be worth spending the EUR 50.  Hm, maybe I should find
out, but that'll be difficult ... and the fan connectors won't fit even if the
coolers do.  They're 4-pin.  ... Ok, I could replace the fans, but I don't have
any 90mm fans.  Those shouldn't cost too much, though.

> > With mirroring, I could fit only one backup, not two.
>
>
> Add another mirror to your pool.  Or, use a process of substitution and
> resilvering to replace existing drives with larger capacity drives.

Lol, I can't create a pool in thin air. Wouldn't it be great if ZFS could do
that? :) Use ambient air for storage ... just make sure the air doesn't escape
;)

There's nothing to resilver, the backup server is currently using btrfs.

Have you checked disk prices recently?  Maybe I'll get lucky on Black Friday,
but if I get some, they'll go into my active server.

> > In any case, I'm currently tending to think that putting FreeBSD with ZFS on
> > my
> > server might be the best option.  But then, apparently I won't be able to
> > configure the controller cards, so that won't really work.
>
>
> What is the make and model of your controller cards?

They're HP smart array P410. FreeBSD doesn't seem to support those.

> [...]
> I have a Debian VM and no contrib ... hm, zfs-dkms and such?  That's
> promising,
>
>
> +1
>
> https://packages.debian.org/bullseye/zfs-dkms
>
> https://packages.debian.org/bullseye/zfsutils-linux

yeah

> [...]
> If you already have a ZFS pool, the way to back it up is to replicate
> the pool to another pool.  Set up an external drive with a pool and
> replicate your server pool to that periodically.

No, the data to back up is mostly (or even all) on btrfs.  IIRC, btrfs has a
send feature, but I'd rather not do anything complicated and just copy the
files over with rsync.  It's not like I could replicate some volume/pool because
the data comes from different machines and all backs up to one volume.
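A pull-style rsync run of that sort can be sketched roughly like this (the host
names and paths here are made up):

```shell
# Pull each machine's data into its own subdirectory on the backup volume.
# -a: archive mode, -H: preserve hard links, -A/-X: preserve ACLs and xattrs.
for host in alpha beta gamma; do
    rsync -aHAX --delete --numeric-ids \
        "root@${host}:/home/" "/srv/backup/${host}/home/"
done
```

Note that --delete prunes files that disappeared on the source, so it pairs
naturally with taking a snapshot of the backup volume before each run if you
want older generations.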

hw

unread,
Nov 10, 2022, 11:40:05 AM11/10/22
to
On Thu, 2022-11-10 at 10:47 +0100, DdB wrote:
> Am 10.11.2022 um 06:38 schrieb David Christensen:
> > What is your technique for defragmenting ZFS?
> well, that was meant more or less a joke: there is none apart from
> offloading all the data, destroying and rebuilding the pool, and filling
> it again from the backup. But i do it from time to time if fragmentation
> got high, the speed improvements are obvious. OTOH the process takes
> days on my SOHO servers
>

Does the faster access afterwards save you more time than the days you spend on
defragmenting?

Perhaps after so many days of not defragging, but how many days?

Maybe use an archive pool that doesn't get deleted from?

hw

unread,
Nov 10, 2022, 11:40:06 AM11/10/22
to
On Thu, 2022-11-10 at 02:19 -0500, gene heskett wrote:
> On 11/10/22 00:37, David Christensen wrote:
> > On 11/9/22 00:24, hw wrote:
> >  > On Tue, 2022-11-08 at 17:30 -0800, David Christensen wrote:
>
> [...]
> Which brings up another suggestion in two parts:
>
> 1: use amanda, with tar and compression to reduce the size of the
> backups.  And use a backup cycle of a week or 2 because amanda will if
> advancing a level, only backup that which has been changed since the
> last backup. On a quiet system, a level 3 backup for a 50gb network of
> several machines can be under 100 megs. More on a busy system of course.
> Amanda keeps track of all that automatically.

Amanda is nice, yet quite unwieldy (try getting a file out of the backups ...).
I used it a long time ago (with tapes), and I'd have to remember or re-learn how
to use amanda to back up particular directories and such ...

I think I might be better off learning more about snapshots.

> 2: As disks fail, replace them with SSD's which use much less power than
> spinning rust. And they are typically 5x faster than commodity spinning
> rust.

Is this a joke?

https://www.dell.com/en-us/shop/visiontek-16tb-class-qlc-7mm-25-ssd/apd/ab329068/storage-drives-media

Cool, a 30% discount on Black Friday saves you $2280 for every pair of disks,
and it even starts right now. (Do they really mean that? What if I had a datacenter
and ordered 512 or so of them? I'd save almost $1.2 million, what a great
deal!)

And mind you, SSDs are *designed to fail*: the more data you write to them, the
sooner they fail.  They have their uses, maybe even for storage if you're so
desperate, but not for backup storage.

> Here, and historically with spinning rust, backing up 5 machines, at 3am
> every morning is around 10gb total and under 45 minutes. This includes
> the level 0's it does by self adjusting the schedule to spread the level
> 0's, AKA the fulls, out over the backup cycle so the amount of storage
> used for any one backup run is fairly consistent.

That's almost half a month for 4TB. Why does it take so long?

Michael Stone

unread,
Nov 10, 2022, 12:00:05 PM11/10/22
to
On Thu, Nov 10, 2022 at 05:34:32PM +0100, hw wrote:
>And mind you, SSDs are *designed to fail* the sooner the more data you write to
>them. They have their uses, maybe even for storage if you're so desperate, but
>not for backup storage.

It's unlikely you'll "wear out" your SSDs faster than you wear out your
HDs.

hw

unread,
Nov 10, 2022, 12:30:05 PM11/10/22
to
On Wed, 2022-11-09 at 14:22 +0100, Nicolas George wrote:
> hw (12022-11-08):
> > When I want to have 2 (or more) generations of backups, do I actually want
> > deduplication?  It leaves me with only one actual copy of the data which
> > seems
> > to defeat the idea of having multiple generations of backups at least to
> > some
> > extent.
>
> The idea of having multiple generations of backups is not to have the
> data physically present in multiple places, this is the role of RAID.
>
> The idea if having multiple generations of backups is that if you
> accidentally overwrite half your almost-completed novel with lines of
> ALL WORK AND NO PLAY MAKES JACK A DULL BOY and the backup tool runs
> before you notice it, you still have the precious data in the previous
> generation.

Nicely put :)

Let me rephrase a little:

How likely is it that a storage volume (not the underlying media, like disks in
a RAID array) would become unreadable in only some places, so that it could be
an advantage to have multiple copies of the same data on the volume?

It's like I can't help unconsciously thinking that it's an advantage to have
multiple copies on a volume for some reason other than not overwriting the
almost-complete novel.  At the same time, I find it difficult to imagine how
a volume could get damaged only in some places, and I don't see other reasons
than that.

Ok, another reason to keep multiple full copies on a volume is making things
simple, easy and thus perhaps more reliable than more complicated solutions. At
least that's an intention. But it costs a lot of disk space.

hw

unread,
Nov 10, 2022, 1:00:06 PM11/10/22
to
I have already done that.

hw

unread,
Nov 10, 2022, 1:00:06 PM11/10/22
to
On Thu, 2022-11-10 at 09:30 -0500, Greg Wooledge wrote:
> On Thu, Nov 10, 2022 at 02:48:28PM +0100, hw wrote:
> > On Thu, 2022-11-10 at 07:03 -0500, Greg Wooledge wrote:
>
> [...]
> > printf '%s\0' * | hexdump
> > 0000000 00c2 6177 7468 0000                   
> > 0000007
>
> I dislike this output format, but it looks like there are two files
> here.  The first is 0xc2, and the second is 0x77 0x61 0x68 0x74 if
> I'm reversing and splitting the silly output correctly.  (This spells
> "waht", if I got it right.)
> >

Ah, yes. I tricked myself because I don't have hd installed, so I redirected
the output of printf into a file --- which I wanted to name 'what' but I
mistyped as 'waht' --- so I could load it into emacs and use hexl-mode. But the
display kinda sucked and I found I have hexdump installed and used that.
Meanwhile I totally forgot about the file I had created.

> [...]
> >
> The file in question appears to have a name which is the single byte 0xc2.
> Since that's not a valid UTF-8 character, ls chooses something to display
> instead.  In your case, it chose a '?' character.

I'm the only one who can create files there, and I didn't create that.  Using
0xc2 as a file name speaks loudly against the idea that I'd have created it
accidentally.

>   I'm guessing this is on
> an older release of Debian.

It's an ancient Gentoo which couldn't be updated in years because they broke the
update process. Back then, Gentoo was the only Linux distribution that didn't
need fuse for ZFS that I could find.

> In my case, it does this:
>
> unicorn:~$ mkdir /tmp/x && cd "$_"
> unicorn:/tmp/x$ touch $'\xc2'
> unicorn:/tmp/x$ ls -la
> total 80
> -rw-r--r--  1 greg greg     0 Nov 10 09:21 ''$'\302'
> drwxr-xr-x  2 greg greg  4096 Nov 10 09:21  ./
> drwxrwxrwt 20 root root 73728 Nov 10 09:21  ../
>
> In my version of ls, there's a --quoting-style= option that can help
> control what you see.  But that's a tangent you can explore later.
>
> Since we know the actual name of the file (subdirectory) now, let's just
> rename it to something sane.
>
> mv $'\xc2' subdir
>
> Then you can investigate it, remove it, or do whatever else you want.

Cool, I've renamed it, thank you very much :)  I'm afraid that the file system
will crash when I remove it ...  It's an empty directory.  Ever since I noticed
it, I couldn't do anything with it and thought it was some bug in the file
system.

Greg Wooledge

unread,
Nov 10, 2022, 1:10:05 PM11/10/22
to
On Thu, Nov 10, 2022 at 06:54:31PM +0100, hw wrote:
> Ah, yes. I tricked myself because I don't have hd installed,

It's just a symlink to hexdump.

lrwxrwxrwx 1 root root 7 Jan 20 2022 /usr/bin/hd -> hexdump

unicorn:~$ dpkg -S usr/bin/hd
bsdextrautils: /usr/bin/hd
unicorn:~$ dpkg -S usr/bin/hexdump
bsdextrautils: /usr/bin/hexdump

> It's an ancient Gentoo

Ahhhh. Anyway, from the Debian man page:

-C, --canonical
    Canonical hex+ASCII display.  Display the input offset in hexadecimal,
    followed by sixteen space-separated, two-column, hexadecimal bytes,
    followed by the same sixteen bytes in %_p format enclosed in '|'
    characters.  Invoking the program as hd implies this option.

Why on earth the default format of "hexdump" uses that weird 16-bit
little endian nonsense is beyond me.
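A quick way to see the difference, assuming the bsdextrautils hexdump, is to
feed it the same four bytes both ways:

```shell
# Default format: 16-bit little-endian words, so byte pairs appear swapped
printf 'waht' | hexdump
# Canonical format: one byte at a time, plus an ASCII column
printf 'waht' | hexdump -C
```

In the default output 'waht' shows up as the words 6177 7468 (as in the thread
above); with -C it is the plain byte sequence 77 61 68 74 followed by |waht|.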

Linux-Fan

unread,
Nov 10, 2022, 4:41:42 PM11/10/22
to
hw writes:

> On Wed, 2022-11-09 at 19:17 +0100, Linux-Fan wrote:
> > hw writes:
> > > On Wed, 2022-11-09 at 14:29 +0100, didier gaumet wrote:
> > > > Le 09/11/2022 à 12:41, hw a écrit :

[...]

> > > I'd
> > > have to use mdadm to create a RAID5 (or use the hardware RAID but that
> > > isn't
> >
> > AFAIK BTRFS also includes some integrated RAID support such that you do
> > not necessarily need to pair it with mdadm.
>
> Yes, but RAID56 is broken in btrfs.
>
> > It is advised against using for RAID 
> > 5 or 6 even in most recent Linux kernels, though:
> >
> > https://btrfs.readthedocs.io/en/latest/btrfs-man5.html#raid56-status-and-recommended-practices
>
> Yes, that's why I would have to use btrfs on mdadm when I want to make a
> RAID5.
> That kinda sucks.
>
> > RAID 5 and 6 have their own issues you should be aware of even when
> > running 
> > them with the time-proven and reliable mdadm stack. You can find a lot of 
> > interesting results by searching for “RAID5 considered harmful” online.
> > This 
> > one is the classic that does not seem to make it to the top results,
> > though:
>
> Hm, really? The only time that RAID5 gave me trouble was when the hardware

[...]

I have never used RAID5 so how would I know :)

I think the arguments of the RAID5/6 critics summarized were as follows:

* Running a RAID at level 5 or 6 significantly degrades performance
while a disk is offline. RAID 10 keeps most of its speed and
RAID 1 only degrades slightly for most use cases.

* During a rebuild, RAID 5 and 6 are known to degrade performance more
than the other RAID levels do.

* Disk space has become so cheap that the savings of RAID5 may
no longer justify the performance and reliability degradation
compared to RAID1 or 10.

All of these arguments come from a “server” point of view where it is
assumed that

(1) You win something by running the server so you can actually
tell that there is an economic value in it. This allows for
arguments like “storage is cheap” which may not be the case at
all if you are using up some tightly limited private budget.

(2) Uptime and delivering the service is paramount. Hence there
are some considerations regarding the online performance of
the server while the RAID is degraded and while it is restoring.
If you are fine to take your machine offline or accept degraded
performance for prolonged times then this does not apply of
course. If you do not value the uptime making actual (even
scheduled) copies of the data may be recommendable over
using a RAID because such schemes may (among other advantages)
protect you from accidental file deletions, too.

Also note that in today's computing landscape, not all unwanted file
deletions are accidental. With the advent of “crypto trojans” adversaries
exist that actually try to encrypt or delete your data to extort a ransom.

> More than one disk can fail? Sure can, and it's one of the reasons why I
> make
> backups.
>
> You also have to consider costs. How much do you want to spend on storage
> and
> and on backups? And do you want make yourself crazy worrying about your
> data?

I am pretty sure that if I break my PC down into GPU, CPU, RAM and storage, I
actually spent the most on storage. Well-established schemes of redundancy and
backups make me worry less about my data.

I still worry enough about backups to have written my own software:
https://masysma.net/32/jmbb.xhtml
and that I am also evaluating new developments in that area to probably
replace my self-written program by a more reliable (because used by more
people!) alternative:
https://masysma.net/37/backup_tests_borg_bupstash_kopia.xhtml

> > https://www.baarf.dk/BAARF/RAID5_versus_RAID10.txt
> >
> > If you want to go with mdadm (irrespective of RAID level), you might also 
> > consider running ext4 and trade the complexity and features of the
> > advanced file systems for a good combination of stability and support.
>
> Is anyone still using ext4? I'm not saying it's bad or anything, it only
> seems that it has gone out of fashion.

IIRC it's still Debian's default. It's my file system of choice unless I have
very specific reasons against it. I have never seen it fail outside of
hardware issues. Performance of ext4 is quite acceptable out of the box.
E.g. it seems to be slightly faster than ZFS for my use cases.
Almost every Linux live system can read it. There are no problematic
licensing or stability issues whatsoever. By its popularity it's probably one
of the most widely-deployed Linux file systems, which may enhance the chance
that whatever problem you incur with ext4 someone else has had before...

> I'm considering using snapshots. Ext4 didn't have those last time I checked.

Ext4 still does not offer snapshots. The traditional way to do snapshots
outside of fancy BTRFS and ZFS file systems is to add LVM to the equation
although I do not have any useful experience with that. Specifically, I am
not using snapshots at all so far, besides them being readily available on
ZFS :)
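For what it's worth, the LVM route would look roughly like this (untested by
me; the volume group and LV names are hypothetical, and the snapshot needs
enough reserved space to absorb the changes made while it exists):

```shell
# Take a copy-on-write snapshot of an existing logical volume
lvcreate --size 5G --snapshot --name root-snap /dev/vg0/root
# Mount it read-only to pull files out of it
mount -o ro /dev/vg0/root-snap /mnt/snap
# Remove it once it is no longer needed
umount /mnt/snap
lvremove /dev/vg0/root-snap
```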

HTH and YMMV
Linux-Fan

öö

Dan Ritter

unread,
Nov 10, 2022, 8:50:05 PM11/10/22
to
Linux-Fan wrote:
> I think the arguments of the RAID5/6 critics summarized were as follows:
>
> * Running in a RAID level that is 5 or 6 degrades performance while
> a disk is offline significantly. RAID 10 keeps most of its speed and
> RAID 1 only degrades slightly for most use cases.
>
> * During restore, RAID5 and 6 are known to degrade performance more compared
> to restoring one of the other RAID levels.

* RAID 5 and 6 restoration incurs additional stress on the other
disks in the RAID which makes it more likely that one of them
will fail. The advantage of RAID 6 is that it can then recover
from that...

* RAID 10 gets you better read performance in terms of both
throughput and IOPS relative to the same number of disks in
RAID 5 or 6. Most disk activity is reading.

> * Disk space has become so cheap that the savings of RAID5 may
> no longer rectify the performance and reliability degradation
> compared to RAID1 or 10.

I think that's a case-by-case basis. Every situation is
different, and should be assessed for cost, reliability and
performance concerns.

> All of these arguments come from a “server” point of view where it is
> assumed that
>
> (1) You win something by running the server so you can actually
> tell that there is an economic value in it. This allows for
> arguments like “storage is cheap” which may not be the case at
> all if you are using up some thightly limited private budget.
>
> (2) Uptime and delivering the service is paramount. Hence there
> are some considerations regarding the online performance of
> the server while the RAID is degraded and while it is restoring.
> If you are fine to take your machine offline or accept degraded
> performance for prolonged times then this does not apply of
> course. If you do not value the uptime making actual (even
> scheduled) copies of the data may be recommendable over
> using a RAID because such schemes may (among other advantages)
> protect you from accidental file deletions, too.

Even in household situations, knowing that you could have traded $100
last year for a working computer right now is an incentive to set up
disk mirroring. If you're storing lots of data that other
people in the household depend on, that might factor in to your
decisions, too.

Everybody has a budget. Some have big budgets, and some have
small. The power of open source software is that we can make
opportunities open to people with small budgets that are
otherwise reserved for people with big budgets.

Most of the computers in my house have one disk. If I value any
data on that disk, I back it up to the server, which has 4 4TB
disks in ZFS RAID10. If a disk fails in that, I know I can
survive that and replace it within 24 hours for a reasonable
amount of money -- rather more reasonable in the last few
months.

> > Is anyone still using ext4? I'm not saying it's bad or anything, it
> > only seems that it has gone out of fashion.
>
> IIRC its still Debian's default. Its my file system of choice unless I have
> very specific reasons against it. I have never seen it fail outside of
> hardware issues. Performance of ext4 is quite acceptable out of the box.
> E.g. it seems to be slightly faster than ZFS for my use cases. Almost every
> Linux live system can read it. There are no problematic licensing or
> stability issues whatsoever. By its popularity its probably one of the most
> widely-deployed Linux file systems which may enhance the chance that
> whatever problem you incur with ext4 someone else has had before...

All excellent reasons to use ext4.

-dsr-

Stefan Monnier

unread,
Nov 10, 2022, 10:00:06 PM11/10/22
to
>> Or are you referring to the data being altered while a backup is in
>> progress?
> Yes. Data of different files or at different places in the same file
> may have relations which may become inconsistent during change operations
> until the overall change is complete.

Arguably this can be considered as a bug in the application (because
a failure in the middle could thus result in an inconsistent state).

> If you are unlucky you can even catch a plain text file that is only half
> stored.

Indeed, many such files are written in a non-atomic way.

> The risk for this is not 0 with filesystem snapshots, but it grows further
> if there is a time interval during which changes may or may not be copied
> into the backup, depending on filesystem internals and bad luck.

With snapshots, such problems can be considered application bugs, but if
you don't use snapshots, then your backup will not see "the state at time
T" but instead the state of different files at different times,
and in that case you can very easily see an inconsistent state even
without any bug in an application: the bug is in the backup
process itself.

If some part of your filesystem is frequently/constantly being modified,
then such inconsistent backups can be very common.


Stefan

Michael Stone

unread,
Nov 10, 2022, 11:10:06 PM11/10/22
to
Then you're either well into "not normal" territory and need to buy an
SSD with better write longevity (which I seriously doubt for a backup
drive) or you just got unlucky and got a bad copy (happens with
anything) or you've misdiagnosed some other issue.

Michael Stone

unread,
Nov 10, 2022, 11:20:05 PM11/10/22
to
On Thu, Nov 10, 2022 at 08:32:36PM -0500, Dan Ritter wrote:
>* RAID 5 and 6 restoration incurs additional stress on the other
> disks in the RAID which makes it more likely that one of them
> will fail.

I believe that's mostly apocryphal; I haven't seen science backing that
up, and it hasn't been my experience either.

> The advantage of RAID 6 is that it can then recover
> from that...

The advantage to RAID 6 is that it can tolerate a double disk failure.
With RAID 1 you need 3x your effective capacity to achieve that and even
though storage has gotten cheaper, it hasn't gotten that cheap. (e.g.,
an 8 disk RAID 6 has the same fault tolerance as an 18 disk RAID 1 of
equivalent capacity, ignoring pointless quibbling over probabilities.)
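The disk-count arithmetic above can be sanity-checked in a few lines (assuming
a 3-way mirror is what it takes for RAID 1 to survive a double disk failure):

```python
def raid6_disks(effective: int, parity: int = 2) -> int:
    """Disks needed for RAID 6: effective capacity plus two parity disks."""
    return effective + parity

def mirror_disks(effective: int, copies: int = 3) -> int:
    """Disks for an n-way mirror with the same double-failure tolerance."""
    return effective * copies

# An 8-disk RAID 6 yields 6 disks of capacity; matching that capacity with
# double-fault-tolerant mirrors takes 18 disks.
print(raid6_disks(6))   # 8
print(mirror_disks(6))  # 18
```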

David Christensen

unread,
Nov 11, 2022, 12:20:05 AM11/11/22
to
On 11/10/22 07:44, hw wrote:
> On Wed, 2022-11-09 at 21:36 -0800, David Christensen wrote:
>> On 11/9/22 00:24, hw wrote:
>>  > On Tue, 2022-11-08 at 17:30 -0800, David Christensen wrote:

>> Be careful that you do not confuse a ~33 GiB full backup set, and 78
>> snapshots over six months of that same full backup set, with a full
>> backup of 3.5 TiB of data.

> The full backup isn't deduplicated?


"Full", "incremental", etc., occur at the backup utility level -- e.g.
on top of the ZFS filesystem. (All of my backups are full backups using
rsync.) ZFS deduplication occurs at the block level -- e.g. the bottom
of the ZFS filesystem. If your backup tool is writing to, or reading
from, a ZFS filesystem, the backup tool is oblivious to the internal
operations of ZFS (compression or none, deduplicaton or none, etc.) so
long as the filesystem "just works".
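As a sketch (pool and dataset names hypothetical), deduplication and
compression are per-dataset properties, and they only affect blocks written
after the properties are set:

```shell
# Enable dedup and compression on a dataset used for backups
zfs create tank/backups
zfs set dedup=on tank/backups
zfs set compression=lz4 tank/backups
# Inspect the deduplication table and the achieved ratios
zpool status -D tank
zfs get compressratio,used,logicalused tank/backups
```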


>> Writing to a ZFS filesystem with deduplication is much slower than
>> simply writing to, say, an ext4 filesystem -- because ZFS has to hash
>> every incoming block and see if it matches the hash of any existing
>> block in the destination pool.  Storing the existing block hashes in a
>> dedicated dedup virtual device will expedite this process.
>
> But when it needs to write almost nothing because almost everthing gets
> deduplicated, can't it be faster than having to write everthing?


There are many factors that affect how fast ZFS can write files to disk.
You will get the best answers if you run benchmarks using your
hardware and data.


>>  >> I run my backup script each night.  It uses rsync to copy files and
>>  >
>>  > Aww, I can't really do that because my servers eats like 200-300W
>> because it has
>>  > so many disks in it.  Electricity is outrageously expensive here.
>>
>>
>> Perhaps platinum rated power supplies?  Energy efficient HDD's/ SSD's?
>
> If you pay for it ... :)
>
> Running it once in a while for some hours to make backups is still possible.
> Replacing the hardware is way more expensive.


My SOHO server has ~1 TiB of data. A ZFS snapshot takes a few seconds.
ZFS incremental replication to the backup server proceeds at anywhere
from 0 to 50 MB/s, depending upon how much content is new or has changed.
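That replication step can be sketched like this (pool names, host name, and
snapshot dates are hypothetical):

```shell
# Take a recursive snapshot, then send only the delta since the previous one
zfs snapshot -r tank@2022-11-10
zfs send -R -i tank@2022-11-09 tank@2022-11-10 | \
    ssh backuphost zfs receive -du backuppool
```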



>>  > Sounds like a nice setup.  Does that mean you use snapshots to keep
>> multiple
>>  > generations of backups and make backups by overwriting everything
>> after you made
>>  > a snapshot?
>>
>> Yes.
>
> I start thinking more and more that I should make use of snapshots.


Taking snapshots is fast and easy. The challenge is deciding when to
destroy them.


zfs-auto-snapshot can do both automatically:

https://packages.debian.org/bullseye/zfs-auto-snapshot

https://manpages.debian.org/bullseye/zfs-auto-snapshot/zfs-auto-snapshot.8.en.html
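Done by hand, the lifecycle is just (dataset and snapshot names hypothetical):

```shell
zfs snapshot tank/home@2022-11-10      # create
zfs list -t snapshot -r tank/home      # list
# restore a single file by copying it out of the hidden .zfs directory
cp /tank/home/.zfs/snapshot/2022-11-10/novel.txt /tank/home/
zfs destroy tank/home@2022-11-10       # destroy when no longer needed
```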


>> Without deduplication or compression, my backup set and 78 snapshots
>> would require 3.5 TiB of storage.  With deduplication and compression,
>> they require 86 GiB of storage.
>
> Wow that's quite a difference! What makes this difference, the compression or
> the deduplication?


Deduplication.


> When you have snapshots, you would store only the
> differences from one snapshot to the next,
> and that would mean that there aren't
> so many duplicates that could be deduplicated.


I do not know -- I have not crawled the ZFS code; I just use it.


>> Users can recover their own files without needing help from a system
>> administrator.
>
> You have users who know how to get files out of snapshots?


Not really; but the feature is there.


>>  >>>> For compressed and/or encrypted archives, image, etc., I do not use
>>  >>>> compression or de-duplication
>>  >>>
>>  >>> Yeah, they wouldn't compress.  Why no deduplication?
>>  >>
>>  >>
>>  >> Because I very much doubt that there will be duplicate blocks in
>> such files.
>>  >
>>  > Hm, would it hurt?
>>
>> Yes.  ZFS deduplication is resource intensive.
>
> But you're using it already.


I have learned the hard way to only use deduplication when it makes sense.


>> What were the makes and models of the 6 disks?  Of the SSD's?  If you
>> have a 'zpool status' console session from then, please post it.
>
> They were (and still are) 6x4TB WD Red (though one or two have failed over time)
> and two Samsung 850 PRO, IIRC. I don't have an old session anymore.
>
> These WD Red are slow to begin with. IIRC, both SSDs failed and I removed them.
>
> The other instance didn't use SSDs but 6x2TB HGST Ultrastar. Those aren't
> exactly slow but ZFS is slow.


Those HDD's should be fine with ZFS; but those SSD's are desktop drives,
not cache devices. That said, I am making the same mistake with Intel
SSD 520 Series. I have considered switching to one Intel Optane Memory
Series and a PCIe 4x adapter card in each server.


>> MySQL appears to have the ability to use raw disks.  Tuned correctly,
>> this should give the best results:
>>
>> https://dev.mysql.com/doc/refman/8.0/en/innodb-system-tablespace.html#innodb-raw-devices
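Per that page, raw devices are configured in my.cnf roughly like this (the partition path and size are assumptions; "newraw" is changed to "raw" after InnoDB has initialized the partition on first start):

```ini
[mysqld]
innodb_data_home_dir=
innodb_data_file_path=/dev/sdb1:3Gnewraw
```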
>
> Could mysql 5.6 already do that? I'll have to see if mariadb can do that now
> ...


I do not know -- I do not run MySQL or Maria.


>> Please run 'zpool status' and post the console session (prompt, command
>> entered, output displayed).  Please correlate the vdev's to disk drive
>> makes and models.
>
> See above ... The pool is a raidz1-0 with the 6x4TB Red drives, and no SSDs are
> left.


Please run and post the relevant command for LVM, btrfs, whatever.
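For reference, rough equivalents of 'zpool status' on the other stacks (the mount point is an assumption):

```shell
btrfs filesystem show            # devices backing each btrfs filesystem
btrfs device stats /srv/backup   # per-device error counters
pvs; vgs; lvs -a                 # LVM physical volumes, volume groups, LVs
cat /proc/mdstat                 # Linux software RAID status, if any
```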



>> On 11/9/22 03:41, hw wrote:

>> What is the make and model of your server?
>
> I put it together myself. The backup server uses an MSI mainboard with the
> designation S0121 C204 SKU in a Chenbro case that has a 16xLFF backplane. It
> has only 16GB RAM and would max out at 32GB.

> ... the backup server is currently using btrfs.


Okay.


>> What is the make and model of your controller cards?
>
> They're HP smart array P410. FreeBSD doesn't seem to support those.


I use the LSI 9207-8i with "IT Mode" firmware (e.g. host bus adapter,
not RAID):

https://www.ebay.com/sch/i.html?_from=R40&_trksid=p2380057.m570.l1313&_nkw=lsi+9207&_sacat=0


> ... the data to back up is mostly (or even all) on btrfs. ... copy the
> files over with rsync. ...
> the data comes from different machines and all backs up to one volume.


I suggest:

1.  Create a ZFS pool with a mirror vdev of two HDD's.  If you can get
    past your dislike of SSD's, add a mirror of two SSD's as a
    dedicated dedup vdev.  (These will not see the hard usage that
    cache devices get.)
2.  Create a filesystem 'backup'.
3.  Create child filesystems, one for each host.
4.  Create grandchild filesystems, one for the root filesystem on each
    host.
5.  Set up daily rsync backups of the root filesystems on the various
    hosts to the ZFS grandchild filesystems.
6.  Set up zfs-auto-snapshot to take daily snapshots of everything,
    and retain 10 snapshots.

Then watch what happens.


David

David Christensen

Nov 11, 2022, 12:30:05 AM
to
>> On Thu, Nov 10, 2022 at 05:54:00AM +0100, hw wrote:
>>> ls -la
>>> total 5
>>> drwxr-xr-x  3 namefoo namefoo    3 16. Aug 22:36 .
>>> drwxr-xr-x 24 root    root    4096  1. Nov 2017  ..
>>> drwxr-xr-x  2 namefoo namefoo    2 21. Jan 2020  ?
>>> namefoo@host /srv/datadir $ ls -la '?'
>>> ls: cannot access ?: No such file or directory
>>> namefoo@host /srv/datadir $
>>>
>>>
>>> This directory named ? appeared on a ZFS volume for no reason and I can't
>>> access
>>> it and can't delete it.  A scrub doesn't repair it.  It doesn't seem to do
>>> any
>>> harm yet, but it's annoying.
>>>
>>> Any idea how to fix that?


2022-11-10 21:24:23 dpchrist@f3 ~/foo
$ freebsd-version ; uname -a
12.3-RELEASE-p7
FreeBSD f3.tracy.holgerdanske.com 12.3-RELEASE-p6 FreeBSD
12.3-RELEASE-p6 GENERIC amd64

2022-11-10 21:24:45 dpchrist@f3 ~/foo
$ bash --version
GNU bash, version 5.2.0(3)-release (amd64-portbld-freebsd12.3)
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

2022-11-10 21:24:52 dpchrist@f3 ~/foo
$ ll
total 13
drwxr-xr-x 2 dpchrist dpchrist 2 2022/11/10 21:24:21 .
drwxr-xr-x 14 dpchrist dpchrist 30 2022/11/10 21:24:04 ..

2022-11-10 21:25:03 dpchrist@f3 ~/foo
$ touch '?'

2022-11-10 21:25:08 dpchrist@f3 ~/foo
$ ll
total 14
drwxr-xr-x 2 dpchrist dpchrist 3 2022/11/10 21:25:08 .
drwxr-xr-x 14 dpchrist dpchrist 30 2022/11/10 21:24:04 ..
-rw-r--r-- 1 dpchrist dpchrist 0 2022/11/10 21:25:08 ?

2022-11-10 21:25:11 dpchrist@f3 ~/foo
$ rm '?'
remove ?? y

2022-11-10 21:25:19 dpchrist@f3 ~/foo
$ ll
total 13
drwxr-xr-x 2 dpchrist dpchrist 2 2022/11/10 21:25:19 .
drwxr-xr-x 14 dpchrist dpchrist 30 2022/11/10 21:24:04 ..


David
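When a name is merely hard to type (a "?" in ls output often stands for unprintable characters), deleting by inode number sometimes works where deleting by name does not. A sketch, using a harmlessly named stand-in file; note that in the report above even 'ls' could not stat the entry, which suggests on-disk corruption rather than an odd name, and this trick will not help there:

```shell
# Remove a directory entry by inode number instead of by name.
mkdir -p /tmp/inum-demo && cd /tmp/inum-demo
touch '?'                             # stand-in for the awkward entry
ls -lai                               # the inode number is the first column
inode=$(stat -c %i '?')               # or read it off the ls output
find . -xdev -inum "$inode" -delete   # match by inode, not by name
```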