rdiff-backup performance question


Duane Abrames

May 28, 2025, 1:37:37 PM
to rdiffweb
I know this is not a forum for general rdiff-backup discussions, and feel free to tell me to pound sand or point me to a more appropriate place to ask my question.

My question has to do with "many small files" performance. I have a folder of media files: 1.31 TB in 13,241 folders and 177,501 files. This is just one folder in my backup. The other top-level folders mostly contain larger files, split across fewer folders. I am still new to rdiff-backup, and my initial backup for this server has been running for 6 days now. The concerning part is that this folder is the last one to back up, and it has been running for 3.5 days. The total data set is about 25 TB, which means that this folder has used half the time to do 5% of the data. Both servers are Unraid, with spinning disks formatted with XFS. What I currently have configured is rdiff-backup running in a Docker container on the target server, with the source server's disk exposed via NFS and volume-mapped into the container. I have started spinning up a new container to put on the source server with ssh and rdiff-backup. Do the veterans here think that rdiff-backup to rdiff-backup over ssh will provide better performance than the NFS mounts? I should be able to arrange it so that the relative paths stay the same, so I don't think I'll need to do another initial backup if I switch to ssh.

Thanks for employing your brain cells on my behalf. 

Duane Abrames

May 28, 2025, 1:39:12 PM
to rdiffweb
In re-reading my post, I realize I forgot to mention these servers are about 8 feet apart, communicating via gigabit ethernet.

Patrik Dufresne

May 28, 2025, 2:54:19 PM
to Duane Abrames, rdiffweb
Hello Duane,

> I know this is not a forum for general rdiff-backup discussions, and feel free to tell me to pound sand or point me to a more appropriate place to ask my question.

Indeed, this is not the best place to ask technical questions about rdiff-backup. The best place is the rdiff-backup mailing list, which I also monitor: https://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Feel free to forward my answer to the rdiff-backup mailing list.

I will break your message into smaller pieces and answer each one.

> My question has to do with "many small files" performance. I have a folder of media files, 1.31 TB in 13,241 folders and 177,501 files.

This seems reasonable to me. I have a couple of repositories of 2 TiB or more with 1 million files, without any issues.

> This is just one folder in my backup.  The other top-level folders mostly include files which are larger, and are split into fewer folders.  I am still new to rdiff-backup, and my initial backup for this server has been running for 6 days now. The concerning part is that this folder is the last one to back up, and it has been running for 3.5 days. The total data set is about 25TB, which means that this folder has used half the time to do 5% of the data.

Transferring 25 TiB will take a while even if the servers could use the 1 Gbps network interface at full speed, and rdiff-backup does not reach that speed: there is always latency spent waiting for I/O.
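To put rough numbers on this, here is a back-of-the-envelope sketch in Python using only the figures quoted in this thread. It assumes TB means 10**12 bytes, treats the link as an ideal 1 Gbps with no protocol overhead, and pretends the media folder is nearly finished after 3.5 days, which is not confirmed. The function names are made up for illustration.

```python
# Rough estimate only; all inputs are the figures quoted in the thread.
TB = 10**12                 # assumption: decimal terabytes
LINK_BITS_PER_SEC = 1e9     # nominal gigabit Ethernet, overhead ignored
SECONDS_PER_DAY = 86400


def days_at_line_rate(total_bytes, bits_per_sec=LINK_BITS_PER_SEC):
    """Lower bound on transfer time if the link were the only bottleneck."""
    return total_bytes * 8 / bits_per_sec / SECONDS_PER_DAY


def mb_per_sec(num_bytes, days):
    """Average throughput in MB/s (10**6 bytes) over the given number of days."""
    return num_bytes / 1e6 / (days * SECONDS_PER_DAY)


total_bytes = 25 * TB
media_bytes = 1.31 * TB
media_days = 3.5
media_files = 177_501

print(f"25 TB at line rate: {days_at_line_rate(total_bytes):.2f} days")
print(f"media folder so far: {mb_per_sec(media_bytes, media_days):.2f} MB/s")
print(f"media folder so far: {media_files / (media_days * SECONDS_PER_DAY):.2f} files/s")
```

Under those assumptions the full data set needs a little over 2.3 days at line rate, while the media folder averages only about 4 MB/s and well under one file per second. That points to per-file latency as the limit, not raw bandwidth.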
To get a better understanding of what is happening, you might want to run "strace -p <pid>". It is a Linux utility that shows the system calls a running program makes. You should see file reads and writes.
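If you want to script the strace suggestion, a minimal sketch follows. The helper name `build_strace_command` is hypothetical and the PID 12345 is a placeholder. It uses only standard strace options: `-p` to attach, `-f` to follow child processes, `-T` to show time spent in each call, `-e trace=` to filter, and `-c` for a summary table printed on detach. Attaching normally requires root or the same user, and inside a container it may need extra privileges.

```python
import shlex


def build_strace_command(pid, syscalls=("openat", "read", "write"), summary=False):
    """Return an argv list that attaches strace to an already running process."""
    cmd = ["strace", "-f", "-p", str(pid)]
    if summary:
        # -c: count calls and time per syscall, reported when strace detaches
        cmd.append("-c")
    else:
        # -T: time spent in each call; -e trace=: only show the listed syscalls
        cmd += ["-T", "-e", "trace=" + ",".join(syscalls)]
    return cmd


# Print the command lines so they can be reviewed before running them by hand.
print(shlex.join(build_strace_command(12345)))
print(shlex.join(build_strace_command(12345, summary=True)))
```

Long gaps between calls, or large `-T` timings on individual reads of small files, would suggest the backup is waiting on disk or NFS latency.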

> Both servers are Unraid, with spinning disks formatted with XFS. What I currently have configured is rdiff-backup running in a docker on the target server, with the source server's disk exposed via NFS and volume-mapped into the container.  I have started spinning up a new container to put on the source server with ssh and rdiff-backup.  Do the veterans here think that rdiff-backup to rdiff-backup over ssh will provide better performance than the NFS mounts?  I should be able to arrange it so that the relative paths stay the same, so I don't think I'll need to do another initial backup if I switch to ssh.

In my experience, it is better to back up over SSH than through an NFS mount point. It is faster and more stable.




--
** For efficiency, I check my email once a day.
IKUS Software