Reduce calls to mkdir by caching

Daniel Hahler

Nov 29, 2012, 9:56:02 AM
to back...@googlegroups.com
While stracing a running backshift process, I see a lot of calls to mkdir, like this:

stat("files/1354189261.62_host_slash_thu-nov-29-12-41-01-2012_8f4e075b684ce05f/dir-etc/dir-.git/dir-objects/dir-2a", 0xd27e150) = -1 ENOENT (No such file or directory)
mkdir("files", 0755)                    = -1 EEXIST (File exists)
mkdir("files/1354189261.62_host_slash_thu-nov-29-12-41-01-2012_8f4e075b684ce05f", 0755) = -1 EEXIST (File exists)
mkdir("files/1354189261.62_host_slash_thu-nov-29-12-41-01-2012_8f4e075b684ce05f/dir-etc", 0755) = -1 EEXIST (File exists)
mkdir("files/1354189261.62_host_slash_thu-nov-29-12-41-01-2012_8f4e075b684ce05f/dir-etc/dir-.git", 0755) = -1 EEXIST (File exists)
mkdir("files/1354189261.62_host_slash_thu-nov-29-12-41-01-2012_8f4e075b684ce05f/dir-etc/dir-.git/dir-objects", 0755) = -1 EEXIST (File exists)
mkdir("files/1354189261.62_host_slash_thu-nov-29-12-41-01-2012_8f4e075b684ce05f/dir-etc/dir-.git/dir-objects/dir-2a", 0755) = 0
open("files/1354189261.62_host_slash_thu-nov-29-12-41-01-2012_8f4e075b684ce05f/dir-etc/dir-.git/dir-objects/dir-2a/entries", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/.git/objects/2a/494214459abb11c0b5672d4c3077d442e94a86", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=45, ...}) = 0

I think this could be optimized by remembering which directories already exist (because backshift accessed or created them during the current run) and skipping the mkdir calls for those.

This should make backshift behave far better on a high-latency filesystem (e.g. sshfs).

Besides, what is the first call to open() meant to do? The file cannot exist (because the parent directories did not exist), and backshift does not appear to create the file there anyway.


Regards,
Daniel.

Dan Stromberg

Nov 29, 2012, 4:17:34 PM
to back...@googlegroups.com

Good point.  Want to create a patch?  I wonder if we could just memoize the mkdir code.
--
Dan Stromberg

Daniel Hahler

Nov 29, 2012, 6:45:43 PM
to back...@googlegroups.com
Memoizing seems like a good idea at first, but my_mkdir would then have to call itself recursively for every path component except the last, because after calling my_mkdir("/foo/bar"), my_mkdir("/foo/baz") would not be memoized and would still attempt mkdir("/foo").
That would add quite a bit of overhead (and, with the current implementation, another stat() call via os.path.isdir).

It seems my_mkdir should be wrapped in a class that remembers the created/existing dirs in a class variable - this might then even let us skip the call to os.path.isdir - but that is probably cached well enough at the filesystem layer anyway?!
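
Roughly, a sketch of what I have in mind (the class and method names here are only illustrative, not backshift's actual code):

import errno
import os


class DirCache(object):
    """Remember directories that were already created (or found to exist),
    so repeated calls for the same path skip the mkdir()/stat() round-trips."""

    def __init__(self):
        self.known_dirs = set()

    def my_mkdir(self, path):
        if path in self.known_dirs:
            return
        try:
            os.makedirs(path)
        except OSError as exception:
            # The directory already existed; anything else is a real error.
            if exception.errno != errno.EEXIST:
                raise
        self.known_dirs.add(path)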


Regards,
Daniel

Dan Stromberg

Nov 29, 2012, 7:58:25 PM
to back...@googlegroups.com

Sounds good.

There should probably be a cache of recently-needed directories too, so we don't save every directory ever needed and possibly overflow RAM.  That is, directories that haven't been needed in a while could be dumped from the cache.
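
Purely as a sketch (an OrderedDict used as a crude LRU; none of this is in backshift yet):

import collections
import errno
import os


class BoundedDirCache(object):
    """Like a plain directory cache, but only the most recently needed
    directories are kept, so the cache cannot grow without bound."""

    def __init__(self, max_entries=10000):
        self.max_entries = max_entries
        self.known_dirs = collections.OrderedDict()

    def my_mkdir(self, path):
        if path in self.known_dirs:
            # Move this path to the most-recently-used end.
            self.known_dirs.pop(path)
            self.known_dirs[path] = True
            return
        try:
            os.makedirs(path)
        except OSError as exception:
            if exception.errno != errno.EEXIST:
                raise
        self.known_dirs[path] = True
        while len(self.known_dirs) > self.max_entries:
            # Dump the directory that hasn't been needed for the longest time.
            self.known_dirs.popitem(last=False)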
--
Dan Stromberg

Dan Stromberg

Dec 1, 2012, 8:14:08 PM
to back...@googlegroups.com

This is a lot like the already-extant cacher_mod.py, which is used as a cache of dohdbm tables...
--
Dan Stromberg

Daniel Hahler

Dec 5, 2012, 11:23:43 AM
to back...@googlegroups.com
I am attaching a patch I came up with; it is only a proof of concept and should be adapted, documented and tested if taken.

It adds an ExistingDirs object holding the list of known existing dirs (from my_mkdir).

While it reduces a lot of calls to mkdir, backshift is still rather slow (especially via sshfs).

Is anything planned to skip the calls to utime("chunks/...", NULL)?

backshift could keep the chunk timestamps in a single file or database and use that instead of hitting the file system for every chunk.
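
For example, something along these lines (hypothetical - the table layout and helper names are made up, and backshift has nothing like this today):

import sqlite3
import time


def open_times_db(path):
    # Open (or create) a small database mapping chunk name -> last-used time.
    conn = sqlite3.connect(path)
    conn.execute('CREATE TABLE IF NOT EXISTS chunk_times '
                 '(chunk TEXT PRIMARY KEY, mtime REAL)')
    return conn


def touch_chunk(conn, chunk):
    # Record that a chunk was referenced, instead of calling utime() on it.
    conn.execute('INSERT OR REPLACE INTO chunk_times VALUES (?, ?)',
                 (chunk, time.time()))
    conn.commit()


def chunk_mtime(conn, chunk):
    # Look up when a chunk was last referenced, without stat()ing the file.
    row = conn.execute('SELECT mtime FROM chunk_times WHERE chunk = ?',
                       (chunk,)).fetchone()
    return row[0] if row else None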


Regards,
Daniel.
dirops_mod.py.diff

Dan Stromberg

Dec 9, 2012, 12:34:46 PM
to back...@googlegroups.com

Thanks for the patch.  I modified it a little - got it passing pylint and pep8, decreased the size of the cache (since it's linear), etc.  It's passing the automated tests nicely, and it does appear to be reducing the number of mkdir calls in an strace.

What sort of copyright/license do you want assigned to this code?

The easiest thing would be to just put it under my copyright, GPLv3, but if you prefer, I can create another copyright file.

Using a database is a good idea, and one that I've played with a little.  Simple-minded database approaches would kill the parallelism of the system (right now, any number of systems can back up at the same time), but something like MySQL or MongoDB would be interesting.  However, I'd want to preserve the ability to back up to a simple filesystem, as I tend to contract a lot, and many places I've contracted have only had a CIFS or NFS share for backups.

I believe backshift is slow because the deduplication does a lot of random I/O, which translates to inter-track seeks, and those are slow.

Thanks!

--
Dan Stromberg

Dan Stromberg

Dec 9, 2012, 1:10:57 PM
to back...@googlegroups.com
On Sun, Dec 9, 2012 at 9:34 AM, Dan Stromberg <stro...@gmail.com> wrote:
> I believe backshift is slow because the deduplication does a lot of random I/O, which translates to inter-track seeks, and those are slow.
 
Also, backshift uses xz compression by default, which compresses hard, but takes a while.  If it can't find xz, it'll fall back on bz2, which is a bit faster but doesn't compress as hard.
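
The fallback is conceptually something like this (a simplified sketch using Python's lzma and bz2 modules; backshift's real compression code may look different):

# Prefer xz (LZMA) compression; fall back to bz2 if it isn't available.
try:
    import lzma  # in the standard library from Python 3.3 on

    def compress(data):
        # xz: better ratio, but slower
        return lzma.compress(data, preset=9)
except ImportError:
    import bz2

    def compress(data):
        # bz2: a bit faster, but doesn't compress as hard
        return bz2.compress(data, 9)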
 
--
Dan Stromberg

Daniel Hahler

Dec 12, 2012, 9:28:49 AM
to back...@googlegroups.com
Your copyright and license are fine with me for this patch.

Regarding the database, it would be stored on the file system itself (e.g. SQLite or some custom format) and would therefore work with a CIFS or NFS share, too.
Would this allow bypassing the calls to utime()?

My intention is to make incremental backups much faster when not much has changed.
Currently a run takes several hours on my setup (CIFS share). It could be made faster by not having to look up every file on the remote system, and instead only comparing it to the "mtime" stored in the database.
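
For instance (a hypothetical helper - "times_db" would be whatever dict-like store ends up holding the recorded mtimes):

import os


def probably_unchanged(path, times_db):
    # True if the file's mtime matches what was recorded last run,
    # so the file doesn't have to be looked up on the remote share.
    return times_db.get(path) == os.lstat(path).st_mtime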


Thanks,
Daniel.

Daniel Hahler

Dec 12, 2012, 10:46:43 AM
to back...@googlegroups.com
On Wednesday, December 12, 2012 at 3:28:49 PM UTC+1, Daniel Hahler wrote:
 
> My intention is to make incremental backups much faster when not much has changed.
> Currently a run takes several hours on my setup (CIFS share). It could be made faster by not having to look up every file on the remote system, and instead only comparing it to the "mtime" stored in the database.

I just had the idea of using a timestamp file with find's "-newer" option to pass only the files changed since the last run to backshift.
However, this would miss any newly created files with an older timestamp (e.g. from newly installed packages).


Cheers,
Daniel.

Dan Stromberg

Dec 12, 2012, 6:23:53 PM
to back...@googlegroups.com
On Wed, Dec 12, 2012 at 6:28 AM, Daniel Hahler <google-gr...@thequod.de> wrote:
> Your copyright and license are fine with me for this patch.
Great.  Thanks.
 
> Regarding the database, it would be stored on the file system itself (e.g. SQLite or some custom format) and would therefore work with a CIFS or NFS share, too.
> Would this allow bypassing the calls to utime()?
SQLite or gdbm or bsddb should work, but wouldn't be very parallel.  Right now, things are very parallel.

I think there are three things worth thinking about:
1) Filesystem only, what we have now
2) A simple, non-parallel database like SQLite, gdbm or bsddb.
3) A parallel database like MySQL or MongoDB.

I could imagine having an option to select among the three.

The main fly in the ointment for #2 above is that if you start out using the system for a single host, #2 would be a great starting point, but it could be a pain to switch away from later (to #1 or #3) unless some sort of migration code is also available.
 
> My intention is to make incremental backups much faster when not much has changed.
> Currently a run takes several hours on my setup (CIFS share). It could be made faster by not having to look up every file on the remote system, and instead only comparing it to the "mtime" stored in the database.
I agree that incrementals could be faster.
