Possible to disable deduplication?

Matt Van Mater

Feb 11, 2015, 10:13:19 AM
to zba...@googlegroups.com
This may seem like a stupid question, but is it possible to disable deduplication when performing backup jobs?

The data I want to back up is over 900 GB in ~140,000 files, stored on a Windows Server 2012 machine.  Most of that capacity is home videos and other content that I know will not benefit from deduplication.  The data changes very little from day to day, but I want to back it up daily over the internet, probably over SSH unless there is a good reason not to.

I have tried several backup programs and they all have problems one way or another.  In most cases my backup setup consists of a Linux virtual machine that mounts the SMB share and then backs up the data to a remote location.  I ran the initial backups "locally" over the LAN before moving the hard drive to the remote location, so I am not pushing the entire ~900 GB over the Internet.

I like the concept of zbackup because it natively supports encryption and remote SSH data stores, which saves me from having to use LUKS or TrueCrypt for encryption and simplifies the use of the remote datastore.

However, I don't want to allocate 2+ GB of RAM to this VM just so it can hold a deduplication hash table that I know will not be beneficial.  Will zbackup start to cache the deduplication hash table to disk if it runs out of RAM?  I don't care if the process takes a long time to complete, as long as it completes, fails gracefully, and can resume where it left off.

Also, I'm not clear whether other people use zbackup on datasets of similar size.  Based on the comments on the webpage I think that might not be the case, but please correct me if I am wrong.

I have tried a number of other open source backup programs and they all fail for one reason or another.  Here are a few examples:
rdiff-backup -- runs very slowly, and if it has a problem with a single file in the 140k list, it wants to roll back the entire backup operation (which itself can take hours).  It does not natively support encryption, so I have to use another underlying method.  I like the fact that the filesystem is natively browseable, but I think it has problems creating so many hardlinks over a long period of time.  I ran this for a long time but it is not reliable, and it looks like an abandoned project, which doesn't inspire confidence.
backuppc -- does not have a concept of a remote data store; everything is "local".  The only solution is to put the /var/lib/backuppc directory on a remotely mounted sshfs partition that is encrypted with another underlying method.  However, I have had problems using this method (you can't create unix sockets over sshfs, and backuppc depends upon a socket for some reason).
duplicati -- nice friendly interface and runs natively on the Windows server, but dies after only 200 MB of data with no error logs or indications of failure.
rsync -- runs great in all scenarios but of course has no history or rollback capability in the event of an accidental deletion.

Attic looks nice, but it doesn't have a Debian/Ubuntu package and I'm getting lazy, so I haven't tried it :) That's why I wanted to check out zbackup.

Vladimir Stackov

Feb 11, 2015, 1:34:00 PM
to Matt Van Mater, zba...@googlegroups.com
No, it's not possible to disable deduplication, because deduplication
is the core of zbackup.

With the current master branch you can use a different chunk size and
bundle payload size to reduce deduplication overhead, but you can't
disable it.
You also can't disable compression yet, but that will be implemented
in the near future.
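
To get a rough feel for why a larger chunk size reduces the overhead, here is a back-of-the-envelope sketch. It assumes the index holds one entry per stored chunk and that every chunk is full-size and unique (which is roughly the situation with video files). The 64-byte entry size and the candidate chunk sizes are made-up illustration values, not figures taken from zbackup.

```python
from typing import Tuple

GIB = 1024 ** 3
KIB = 1024


def estimate_index(data_bytes: int, chunk_bytes: int, entry_bytes: int) -> Tuple[int, int]:
    """Return (chunk_count, index_bytes) for data split into full-size chunks.

    entry_bytes is a hypothetical per-chunk index cost, not a zbackup constant.
    """
    chunks = -(-data_bytes // chunk_bytes)  # ceiling division
    return chunks, chunks * entry_bytes


if __name__ == "__main__":
    # Compare a few candidate chunk sizes for a ~900 GiB dataset.
    for chunk_kib in (64, 256, 1024):
        chunks, index_bytes = estimate_index(900 * GIB, chunk_kib * KIB, entry_bytes=64)
        print(f"{chunk_kib:>5} KiB chunks: {chunks:>12,} entries, "
              f"~{index_bytes / GIB:.2f} GiB of index")
```

Whatever the real per-entry cost is, the entry count scales inversely with the chunk size, so quadrupling the chunk size cuts this estimate to a quarter.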

As for your case: why not use plain rsync? It seems you don't need
anything else, unless I have misunderstood you.

--
Kind regards,
Vladimir.

Matt Van Mater

Feb 11, 2015, 1:56:05 PM
to Vladimir Stackov, zba...@googlegroups.com
Thanks for the clarification, that's what I suspected.

Regarding rsync, it works OK (coupled with an encrypted filesystem) but I like the idea of having multiple checkpoints in the past so I can restore the data as it appeared at a particular date/time.  It helps on the rare occasion where I have an accidental deletion.
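
For what it's worth, plain rsync can approximate dated checkpoints with its `--link-dest` option, which hard-links unchanged files against a previous snapshot directory. A minimal sketch of building such a command follows. The `snapshot_command` helper, the paths, and the directory naming are hypothetical, and it assumes the destination filesystem supports hard links.

```python
import datetime
import pathlib
from typing import List, Optional


def snapshot_command(source: str, dest_root: str, now: datetime.datetime,
                     previous: Optional[str] = None) -> List[str]:
    """Build an rsync command that writes a new dated snapshot directory.

    'previous' should be the absolute path of the last snapshot, if any.
    """
    target = str(pathlib.PurePosixPath(dest_root) / now.strftime("%Y-%m-%d_%H%M%S"))
    cmd = ["rsync", "-a"]
    if previous is not None:
        # Files unchanged since 'previous' are hard-linked instead of copied.
        cmd.append(f"--link-dest={previous}")
    # Trailing slash on the source copies its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", target + "/"]
    return cmd


if __name__ == "__main__":
    example = snapshot_command("/mnt/share", "/backup",
                               datetime.datetime.now(),
                               previous="/backup/2015-02-10_101319")
    print(" ".join(example))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Each snapshot then looks like a full copy, while unchanged files take no extra space.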

One other feature I would really like to have is automated email notification in the event a backup fails.  Relatively few open source backup programs have this, and I consider it a very useful feature so I don't have to babysit automated backups to find out whether they succeeded.
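
Until a tool offers this natively, a small wrapper can cover it. This is only a sketch: it assumes the backup command signals failure with a non-zero exit code and that an SMTP relay is reachable on `localhost`. The function names and addresses are placeholders.

```python
import smtplib
import subprocess
from email.message import EmailMessage
from typing import List


def failure_report(cmd: List[str], returncode: int, stderr: str,
                   sender: str, recipient: str) -> EmailMessage:
    """Build the notification mail for a failed backup command."""
    msg = EmailMessage()
    msg["Subject"] = f"Backup FAILED (exit {returncode}): {cmd[0]}"
    msg["From"] = sender
    msg["To"] = recipient
    # Only the tail of stderr is included so the mail stays small.
    msg.set_content("Command: " + " ".join(cmd)
                    + "\n\nLast stderr output:\n" + stderr[-2000:])
    return msg


def run_and_notify(cmd: List[str], sender: str, recipient: str,
                   smtp_host: str = "localhost") -> int:
    """Run cmd; send a mail through smtp_host only if it exits non-zero."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        msg = failure_report(cmd, result.returncode, result.stderr, sender, recipient)
        with smtplib.SMTP(smtp_host) as smtp:
            smtp.send_message(msg)
    return result.returncode
```

Calling `run_and_notify([...backup command...], "backup@example.org", "me@example.org")` from cron would then stay silent on success and mail on failure.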

niz...@gmail.com

Mar 24, 2016, 12:19:08 PM
to ZBackup general discussion, amigo...@gmail.com, matt.v...@gmail.com
> Regarding rsync, it works OK (coupled with an encrypted filesystem) but I like the idea of having multiple checkpoints in the past so I can restore the data as it appeared at a particular date/time.  It helps on the rare occasion where I have an accidental deletion.

You should consider rdiff-backup instead of rsync. It will give you incremental-forever backups, with a FUSE layer for added ease. It should fit your needs if you encrypt the underlying filesystem yourself.

Regards,
Martin