Storing metadata in bup

Avery Pennarun

Jan 23, 2010, 8:48:11 PM
To: bup-list
Hi all,

I'm back from Mexico and I'm thinking about the best way to go about
storing file metadata (attributes, ownership, etc) in bup. As you
probably know, git's basic trees/blobs format is pretty useless for
this, other than basic settings like "read only" vs. "read write" and
"normal file" vs. "symlink." Those are pretty much all you need when
storing source code, which is what git was designed to do. But for
backing up a complete filesystem (in such a way that it'll be usable if
you restore it later), you need much more.

Attributes we should store right away:

- mtime (and maybe ctime and atime? but you can't restore ctime, and
atime changes just from doing the backup...)

- exact file mode (as opposed to git's "normalized" file mode),
including "device files" and named pipes

- uid *and* username (because sometimes you want to restore on systems
where apenwarr != 1001)

- gid *and* groupname
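
For the uid/gid items above, a minimal sketch of the name lookups using
Python's pwd and grp modules (the None fallback for unknown ids is just
illustrative):

import os, pwd, grp

def owner_info(path):
    """Return (uid, username, gid, groupname); names are None if unknown."""
    st = os.lstat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = None   # uid not in the passwd database
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = None  # gid not in the group database
    return st.st_uid, user, st.st_gid, group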

Attributes we should keep in mind for later:

- hard links

- Linux/POSIX ACLs

- Linux extended attributes (like immutable and selinux stuff)

- MacOS resource forks (do they still use those?)

- Windows extended attributes

The "for later" list is stuff that I basically never use, so it won't
significantly affect my life if they aren't implemented, so hopefully
someone who cares will submit patches after the basics are in place.

Now, as for the attribute storage format:

I'm thinking the best way to do it might be to store the data in one
tree, and the metadata in another tree. So the root of a backup made
by 'bup save' would be just links to data/ and meta/. The two trees
would have all the same filenames, but the "blob" contents for each
object in the meta/ tree would be a file containing the attributes,
rather than the actual file.
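
For concreteness, the root tree of such a backup might look something
like this in 'git ls-tree' format (hashes made up):

040000 tree 1a2b3c4... data
040000 tree 5d6e7f8... meta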

We'll have to store the attributes for directories somehow too, which
implies we might need some name mangling in the meta/ tree somewhere.

Other ways of organizing the tree that might or might not be better
(comments?):

- put the files at the root, as they are now, and add an *additional*
.bupmeta folder. Then we'd also need some mangling for files already
named .bupmeta, but that's not too bad. The advantage of this method
is that you can still refer to the contents of a file using 'git
show' notation as "git show branchname:path/to/file" instead of "git
show branchname:data/path/to/file", which is nicer. You could also
use 'git checkout' on the toplevel tree with relatively minor impact,
but you'd still end up checking out a bunch of extra crap in
.bupmeta.

- put a .bupmeta into *each* subtree. This sounds crazy, but it would
reduce the amount of name mangling for subdirs (every subtree's
.bupmeta could just have a root-level entry for describing the
directory itself), and it would mean that every subtree is
self-contained, so you could restore any subtree without knowing
anything about the *root* of that subtree. On the other hand, doing
"git checkout" without accidentally checking out a lot of .bupmeta
folders would be virtually impossible, and changing any file's
attributes would end up changing the data *and* metadata folders for
every folder all the way up, which seems unnecessarily stupid, so I
don't like this option much.

Thoughts on these?

As for actually encoding the attributes, here are my current thoughts.

First of all, the #1 concern should be compressibility: how much space
does it take to store these attributes? Assuming a single toplevel
meta/ tree, the minimum uncompressed space needed to store attributes
for n files would be:

n * [ len('100644 ') + len(avg filename) + 20 + len(metafile) ]
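
(For a rough sense of scale: with, say, a million files, 20-character
filenames, and 100-byte metafiles, that's 1,000,000 * (7 + 20 + 20 + 100)
bytes, or about 147 megs uncompressed.)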

Now that's a bit misleading, because git tree files are actually
compressed. That means len('100644 ') is actually just a few bits, and
filenames tend to be pretty compressible too. So let's assume those
parts are negligibly small. What you have left is the 20-byte sha1
(which is totally uncompressible, being a random number) and the
metafile itself, which has varying levels of compressibility depending
how verbose the file format is.

But thinking about this, I realized: how often will multiple files use
the *same* metafile contents? Pretty often, I think. Imagine a
typical directory full of documents or mp3s or whatever. The
attributes like permissions, ownership, ACLs, and even file types (if
you're storing MacOS resource forks or whatever) are going to be the
same for every file in the whole directory, or at worst you can divide
them into subsets.

In *that* case, the sha1sum becomes *highly* compressible: you store it
only once per tree, and every file in a tree that reuses those
attributes uses the same sha1sum, which zlib will happily compress down
to almost nothing.

There's only one catch: the mtime (and ctime/atime if we decide we need
those). Pretty much every file will have a different one of those, and
that would destroy our compressibility.

So what I'm thinking is... why not encode the mtime into the filename
of each file in the meta/ tree? That takes the ever-changing stuff out
of the metafile, and pulls it up into the tree where it can be
compressed more intelligently (since most mtimes will be *similar* to
each other, and zlib excels at handling similar bits between a bunch of
different entries).

So basically a tree (in 'git ls-tree HEAD' format), instead of looking
like this:

100644 blob 4941e35... git.py
100644 blob 7dee3d6... hashsplit.py

would look like this:

100644 blob 4941e35... git.py 1263446699
100644 blob 7dee3d6... hashsplit.py 1263446700

We could also throw in the ctime and atime if we wanted, perhaps
comma-separating them. (The attributes are appended instead of
prepended, so that filenames are in the same order as in the data/
tree.)
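
A quick sketch of what that name mangling might look like in Python
(helper names made up; this assumes the simple space-separated form
above):

def mangle_meta_name(name, mtime):
    """Append the mtime to a filename for use in the meta/ tree."""
    return '%s %d' % (name, mtime)

def demangle_meta_name(meta_name):
    """Split a meta/ tree filename back into (name, mtime)."""
    name, _, mtime = meta_name.rpartition(' ')
    return name, int(mtime)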

Of course, *that* line of reasoning makes me wonder if we wouldn't be
better off just encoding *all* the attributes into the filename and
skipping the blobs altogether. We could even skip the meta/ tree and
encode them into the filenames in the data/ section. Of course, that
wouldn't be very good for storing extended attributes, etc, but we
could reserve a toplevel .bupmeta/ tree for just *that* stuff, and it
would be a lot smaller, plus the disk access patterns would probably be
nicer.

So as you can see, there are some good reasons why we don't save file
metadata yet :)

Thoughts?

Avery

Dave Coombs

Jan 24, 2010, 3:47:41 PM
To: bup-...@googlegroups.com
Welcome back!

> Attributes we should store right away:
>
> - mtime (and maybe ctime and atime? but you can't restore ctime, and
> atime changes just from doing the backup...)

I think mtime is enough. I don't know any applications that *rely* on atime being accurate, although it's occasionally informative for humans. The only application of ctime I know is for "normal" incremental backup algorithms. So skip that. Plus, as you say, you can't restore ctime. (You could hack atime back to the original value after reading for the backup, if you decide that's important, but I really don't think it is. And that changes ctime again too.)
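
(The atime-restoring hack, for what it's worth, is roughly this in
Python -- and note that the utime() call is exactly what bumps ctime
again:)

import os

def read_preserving_atime(path):
    st = os.stat(path)
    with open(path, 'rb') as f:
        data = f.read()                         # this read updates atime
    os.utime(path, (st.st_atime, st.st_mtime))  # put atime back (bumps ctime)
    return data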

> - uid *and* username (because sometimes you want to restore on systems
> where apenwarr != 1001)
>
> - gid *and* groupname

Thank you for wanting to do this right. :)

> Attributes we should keep in mind for later:
>
> - hard links

This is admirable, but I have no idea how you'd do it. Given a file with link count > 1, I don't know if there's any way (never mind a portable way) to find out what other filenames link to the same file.

> - MacOS resource forks (do they still use those?)

I was researching this the other day, and the answer is: no-ish. It's technically possible to create and use a resource fork (or any other named fork), but Apple has entirely stopped doing so, having realized that forks make no sense on any non-HFS filesystem. I imagine most apps have stopped too.

What Apple has started doing instead is putting things in an extended attribute called "com.apple.ResourceFork". So as long as you're backing up extended attributes, you're set. (You can see these in Mac OS with ls -l@, and you can use xattr -p to view the attribute contents.)

> Now, as for the attribute storage format:

I have no particular suggestions, although there are obviously benefits and drawbacks to each of the schemes you've listed. I suggest maybe you're overengineering, though, since metadata is way smaller than file contents. As long as backups and restores wind up being halfway efficient, I don't really care how compressible the metadata usually is. (I care a bit. But I don't *really* care.) :-)

Have fun,
-Dave

Allan Wind

Jan 24, 2010, 6:42:49 PM
To: bup-list
On 2010-01-23T20:48:11, Avery Pennarun wrote:
> Now, as for the attribute storage format:
>
> I'm thinking the best way to do it might be to store the data in one
> tree, and the metadata in another tree. So the root of a backup made
> by 'bup save' would be just links to data/ and meta/. The two trees
> would have all the same filenames, but the "blob" contents for each
> object in the meta/ tree would be a file containing the attributes,
> rather than the actual file.
>
> We'll have to store the attributes for directories somehow too, which
> implies we might need some name mangling in the meta/ tree somewhere.
>
> Other ways of organizing the tree that might or might not be better
> (comments?):
>
> - put the files at the root, as they are now, and add an *additional*
> .bupmeta folder. Then we'd also need some mangling for files already
> named .bupmeta, but that's not too bad. The advantage of this method
> is that you can still refer to the contents of a file using 'git
> show' notation as "git show branchname:path/to/file" instead of "git
> show branchname:data/path/to/file", which is nicer. You could also
> use 'git checkout' on the toplevel tree with relatively minor impact,
> but you'd still end up checking out a bunch of extra crap in
> .bupmeta.

Ultimately you need to pick a path: either bup is added functionality
for git, in which case the notation is a feature, or bup stands on
its own and just happens to use git behind the scenes, which makes
the above an artificial constraint.

Have you looked into how others have tackled the meta problem?

http://www.jukie.net/~bart/blog/20070312134706
http://lists-archives.org/git/707737-new-proposal-simple-for-metadata-in-git-commits-git-meta.html
http://www.kernel.org/pub/software/scm/git/docs/git-svn.html
http://eigenclass.org/hiki/gibak-backup-system-introduction
http://joey.kitenet.net/code/etckeeper/ (http://david.hardeman.nu/software.php)

You probably want to avoid an I/O per file, and I was never a fan of
how subversion puts .svn directories all over the place.

git-notes might be interesting but you probably want to bake the
data into the change history which notes does not do.

It is fairly important that you tell the user on _backup_
which metadata you do not support.

Carbonite, btw, only tells you on _restore_ that video files are
backed up if you mark the folder specifically. We did not lose
any data, but Carbonite lost a customer.


/Allan
--
Allan Wind
Life Integrity, LLC
<http://lifeintegrity.com>

Avery Pennarun

Jan 24, 2010, 8:35:05 PM
To: Dave Coombs, bup-...@googlegroups.com
On Sun, Jan 24, 2010 at 3:47 PM, Dave Coombs <dco...@gmail.com> wrote:
> I think mtime is enough.  I don't know any applications that *rely* on atime being accurate,
> although it's occasionally informative for humans.

Well, popularity-contest uses it :)

It is indeed almost the only program I know of that does.

>> - uid *and* username (because sometimes you want to restore on systems
>>  where apenwarr != 1001)
>>
>> - gid *and* groupname
>
> Thank you for wanting to do this right. :)

I've been screwed by it before :)

>> Attributes we should keep in mind for later:
>>
>> - hard links
>
> This is admirable, but I have no idea how you'd do it.  Given a file with link count > 1, I
> don't know if there's any way (never mind a portable way) to find out what other
> filenames link to the same file.

Turns out it can be done, it's just mildly disgusting. (rsync has an
option to do it.) Basically, you keep a dict of all files with link
count > 1 (which is probably not a huge number of files), and hardlink
together the ones where the device+inode numbers from stat() are
identical.
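
Roughly, the bookkeeping looks like this (a sketch, not what bup
actually does):

import os

def find_hardlink_groups(paths):
    """Group paths by (st_dev, st_ino); each group is one set of hardlinks."""
    groups = {}
    for path in paths:
        st = os.lstat(path)
        if st.st_nlink > 1:
            groups.setdefault((st.st_dev, st.st_ino), []).append(path)
    # Only groups with two or more members need relinking on restore.
    return [g for g in groups.values() if len(g) > 1]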

Since we have a bup indexfile with the device+inode information
already in it, it's theoretically easy to generate the list of
hardlinked files. But it's not very important to me.

>> Now, as for the attribute storage format:
>

> [...] I suggest maybe you're overengineering, though, since metadata is way smaller than file contents.

This is true enough. It probably just doesn't matter overly much.

One thing I've learned from git is that it doesn't hurt to think about
your *core data structures* a bit up front, because that can make the
entire rest of your life go more smoothly. So I'd like to do this
"right." But maybe it's not worth thinking too hard.

Have fun,

Avery

Avery Pennarun

Jan 24, 2010, 8:47:00 PM
To: bup-list
On Sun, Jan 24, 2010 at 6:42 PM, Allan Wind
<allan...@lifeintegrity.com> wrote:
> On 2010-01-23T20:48:11, Avery Pennarun wrote:
>> - put the files at the root, as they are now, and add an *additional*
>>   .bupmeta folder.  Then we'd also need some mangling for files already
>>   named .bupmeta, but that's not too bad.  The advantage of this method
>>   is that you can still refer to the contents of a file using 'git
>>   show' notation as "git show branchname:path/to/file" instead of "git
>>   show branchname:data/path/to/file", which is nicer.  You could also
>>   use 'git checkout' on the toplevel tree with relatively minor impact,
>>   but you'd still end up checking out a bunch of extra crap in
>>   .bupmeta.
>
> Ultimately you need to either follow a path of bup being some
> added functionality to git in which case notation is a feature,
> or bup standing on its own that happens to use git behind the
> scenes which makes the above an artificial constraint.

I prefer to think of bup as "a different but compatible implementation
that uses the git file format." There are a (very) few places where
it calls into "real" git, but I think (if only for portability
reasons) that these should probably go away eventually.

However, just because bup doesn't call into git, I don't necessarily
want to give up on file format compatibility. File format
compatibility means more programs in the future that will be able to
read your 20-year-old backups when you really, really need to get back
that old file.

> Have you looked into how others have tackled the meta problem?

Thanks for the links. It always helps to survey other people's solutions :)

> http://www.jukie.net/~bart/blog/20070312134706

Doesn't store metadata at all.

> http://lists-archives.org/git/707737-new-proposal-simple-for-metadata-in-git-commits-git-meta.html

Stores per-commit metadata, not per-file metadata.

> http://www.kernel.org/pub/software/scm/git/docs/git-svn.html

Also uses per-commit metadata, plus a bunch of cache files that aren't
part of the actual repo, thus can't be cloned/etc.

These two both use metastore (http://david.hardeman.nu/software.php)
to create a *single* file with all the metadata in it. That's a very
interesting idea, and would certainly defeat any claims of
overengineering :)

One thing I don't like about the idea is that having it all in a
single file makes access to the backups less random-access; you have
to expand the whole metastore file every time you want to read any
file. Then again, I guess you can pack a *lot* of metadata into a
pretty small space, particularly with gzip.

Any thoughts?

> You probably want to avoid an I/O per file,

Definitely. Although since so many files will share similar
attributes, it's probably nowhere near one I/O per file in the
*normal* case.

> and I was never a fan of
> how subversion puts .svn directories all over the place.

Me neither. I hate it quite a bit, in fact :)

> git-notes might be interesting but you probably want to bake the
> data into the change history which notes does not do.

Yeah, git-notes is more appropriate for stuff you don't know at commit
time, and this stuff is already well-defined. Plus it's again for
per-commit metadata, not per-file metadata.

> It is fairly important that you tell the user on _backup_
> which meta data you do not support.
>
> Carbonite tells you on _restore_ that, btw, video files are only
> backed up if you mark the folder specifically.  We did not lose
> any data but Carbonite lost a customer.

Well, telling people what your program *can't* do is a bit tricky;
there are an infinite number of things in that list, and some types
of metadata probably haven't even been invented yet.

Not telling you at backup time that some of your files have been
ignored is pretty incredibly rude, though. To my horror, several of
the Linux git-oriented backup apps I looked at before writing bup did
something similar: they silently auto-skip all files over a certain
size. This is a FEATURE?! Why, yes, because it makes git not crash
when backing up your system (since git doesn't handle large files very
well). Um.

So yeah, I sympathize with you on that one and will try not to screw
up bup too terribly :)

Have fun,

Avery

Dave Coombs

Jan 24, 2010, 9:21:40 PM
To: bup-...@googlegroups.com
>> I think mtime is enough. I don't know any applications that *rely* on atime being accurate,
>> although it's occasionally informative for humans.
>
> Well, popularity-contest uses it :)

Your popcon reporting stats are probably the least of your worries when you get to restoring a backup. :)

>>> - uid *and* username (because sometimes you want to restore on systems
>>> where apenwarr != 1001)
>>>
>>> - gid *and* groupname
>>
>> Thank you for wanting to do this right. :)
>
> I've been screwed by it before :)

Screwed's pretty harsh... I've been inconvenienced by it before, certainly.

> Turns out it can be done, it's just mildly disgusting. (rsync has an
> option to do it.) Basically, you keep a dict of all files with link
> count > 1 (which is probably not a huge number of files), and hardlink
> together the ones where the device+inode numbers from stat() are
> identical.

OK, that makes sense. Though, as you say, whether it's worth the effort is pretty debatable. You can certainly just do it later if it becomes important, too.

>>> Now, as for the attribute storage format:
>>
>> [...] I suggest maybe you're overengineering, though, since metadata is way smaller than file contents.
>
> This is true enough. It probably just doesn't matter overly much.
>
> One thing I've learned from git is that it doesn't hurt to think about
> your *core data structures* a bit up front, because that can make the
> entire rest of your life go more smoothly. So I'd like to do this
> "right." But maybe it's not worth thinking too hard.

Sure. So, if you agree that, in this case, it's more important to make it work correctly than it is to minimize the disk space or time used by the metadata store, then I suppose you have the option (the luxury, even) of doing it the way that's most convenient to you as a developer.

Whatever that might be. :)

Have fun,
-Dave

Avery Pennarun

Jan 24, 2010, 10:14:51 PM
To: Dave Coombs, bup-...@googlegroups.com
On Sun, Jan 24, 2010 at 9:21 PM, Dave Coombs <dco...@gmail.com> wrote:
>>> I think mtime is enough.  I don't know any applications that *rely* on atime being accurate,
>>> although it's occasionally informative for humans.
>>
>> Well, popularity-contest uses it :)
>
> Your popcon reporting stats are probably the least of your worries when you get to restoring a backup. :)

Okay, true.

>>>> - uid *and* username (because sometimes you want to restore on systems
>>>>  where apenwarr != 1001)
>>>>
>>>> - gid *and* groupname
>>>
>>> Thank you for wanting to do this right. :)
>>
>> I've been screwed by it before :)
>
> Screwed's pretty harsh... I've been inconvenienced by it before, certainly.

Fair enough. I guess I didn't lose any actual data. Unless you count
file ownership and setuid permissions on a couple hundred thousand
files as data :)

>> Turns out it can be done, it's just mildly disgusting.  (rsync has an
>> option to do it.)  Basically, you keep a dict of all files with link
>> count > 1 (which is probably not a huge number of files), and hardlink
>> together the ones where the device+inode numbers from stat() are
>> identical.
>
> OK, that makes sense.  Though, as you say, whether it's worth the effort is
> pretty debatable.  You can certainly just do it later if it becomes important, too.

That was my plan :) Or even better, wait for someone else to do it.

Have fun,

Avery

Allan Wind

Jan 25, 2010, 12:07:07 AM
To: bup-list
On 2010-01-24T20:47:00, Avery Pennarun wrote:
> Thanks for the links. It always helps to survey other people's solutions :)

Yup. This was by no means exhaustive; I just knew that others had
looked into this already.

> One thing I don't like about the idea is that having it all in a
> single file makes access to the backups less random-access; you have
> to expand the whole metastore file every time you want to read any
> file. Then again, I guess you can pack a *lot* of metadata into a
> pretty small space, particularly with gzip.
>
> Any thoughts?

If you store that file in a pack you get all the benefits of delta
encoding. Which gets us back to where to store that, of course.

> Well, telling people what your program *can't* do is a bit tricky;
> there are an infinite number of things in that list, and some types
> of metadata probably haven't even been invented yet.

You tell the user about classes of problems and update it as you
learn more. "File x has acl attributes, but we do not store that
yet".

> Not telling you at backup time that some of your files have been
> ignored is pretty incredibly rude, though. To my horror, several of
> the Linux git-oriented backup apps I looked at before writing bup did
> something similar: they silently auto-skip all files over a certain
> size. This is a FEATURE?! Why, yes, because it makes git not crash
> when backing up your system (since git doesn't handle large files very
> well). Um.

Wow.

Avery Pennarun

Jan 25, 2010, 1:10:15 AM
To: bup-list
On Mon, Jan 25, 2010 at 12:07 AM, Allan Wind
<allan...@lifeintegrity.com> wrote:
> On 2010-01-24T20:47:00, Avery Pennarun wrote:
>> One thing I don't like about the idea is that having it all in a
>> single file makes access to the backups less random-access; you have
>> to expand the whole metastore file every time you want to read any
>> file.  Then again, I guess you can pack a *lot* of metadata into a
>> pretty small space, particularly with gzip.
>
> If you store that file in pack you get all the benefits of delta
> encoding.  Which gets us back to where to store that, of course.

Well, bup doesn't currently write packs using any kind of delta
encoding; the delta encoding heuristics in git are quite crazy and
could be hard to do in a high-data-size application like bup.
Moreover, delta encoding would increase the interdependencies of
different backups even further, making it very hard to prune things
later (I think).

So this file would end up near-duplicated between backups. Although I
guess we could use the usual bup file chunking algorithm for this, so
it might not actually be so bad. Hmm.

>> Well, telling people what your program *can't* do is a bit tricky;
>> there are an infinite number of things in that list, and some types
>> of metadata probably haven't even been invented yet.
>
> You tell the user about classes of problems and update it as you
> learn more.  "File x has acl attributes, but we do not store that
> yet".

Detecting the existence of extended attributes at all in a portable
way is probably about the same order of magnitude of work as
supporting them, unfortunately.

Have fun,

Avery

Rob Browning

Feb 16, 2010, 3:33:54 AM
To: Avery Pennarun, bup-list
Avery Pennarun <apen...@gmail.com> writes:

> - Linux extended attributes (like immutable and selinux stuff)
>
> - MacOS resource forks (do they still use those?)
>
> - Windows extended attributes

... and whatever else comes along later. So I think the storage format
should be extensible. At a bare minimum, some kind of metadata format
version should probably be stored somewhere, whether in each tree,
in each metadata file, or whatever.

More generally, for a backup system, I'd be inclined to care more about
the long term. So within reason, I'd be inclined to store more data
rather than less, and to choose formats that are clear and extensible,
even at some computational and storage expense. I also suspect that you
will see quite a bit of compression, even for fairly naive storage
formats.

So, for example, I would go ahead and store all the file times (or at
least make doing so configurable), even if they're not always used. I'd
prefer for a backup to make it possible to restore everything.

With respect to formats, I've spent a lot of time in the lisp world, so
I'm often inclined to reach for s-expressions when the performance is
viable -- being able to examine and adjust your data (decades later)
with sed/perl/emacs/vi is always nice. That said, I know s-expressions
are not to everyone's taste, and they're not always efficient enough,
nor an easy match for some target languages.

However, I do think it might be worth considering formats that are very
well defined, extensible, and allow you to avoid writing/maintaining a
lot of custom parsing code. That should make it easier to fold in
support for next year's new OS, FS, or fancy metadata scheme. (Although
I don't know much about them yet, I wondered if protocol buffers might
be interesting.)

All that said, I'm not opposed to manual parsing and a lot of
cleverness when justified.

In any case, I know these comments are somewhat general; I'll try to
follow up with more specific comments about your proposals a bit later.

Thanks
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

Avery Pennarun

Feb 16, 2010, 6:03:44 PM
To: Rob Browning, bup-list
On Tue, Feb 16, 2010 at 3:33 AM, Rob Browning <r...@defaultvalue.org> wrote:
> ... and whatever else comes along later.  So I think the storage format
> should be extensible.  At a bare minimum, some kind of metadata format
> version should probably be stored somewhere, whether in each tree,
> in each metadata file, or whatever.
>
> More generally, for a backup system, I'd be inclined to care more about
> the long term. So within reason, I'd be inclined to store more data
> rather than less, and to choose formats that are clear and extensible,
> even at some computational and storage expense.  I also suspect that you
> will see quite a bit of compression, even for fairly naive storage
> formats.
>
> So, for example, I would go ahead and store all the file times (or at
> least make doing so configurable), even if they're not always used.  I'd
> prefer for a backup to make it possible to restore everything.

Yeah, those are good points. I think you're right that storing
ctime/mtime/atime has little downside (even if we never restore it)
but having them *not* available when we need them would be much worse.
In particular, ctime and mtime tend to be so close to each other, most
of the time, that after gzipping the difference in space used with and
without ctime should be almost nil. Similarly with atime.

Extensibility is another big one, and I agree, it really needs to be
somehow extensible, since there will definitely be metadata that we
haven't thought of yet... never mind the metadata we *have* thought of
but aren't going to implement on day one.

I'm not totally in favour of version codes, though. I observe that
the main git data structures (blobs, trees, commits, tags) don't have
version numbers, but they're extensible nevertheless. git
packfiles/indexes have version numbers, but that's a bit different
since they can be replaced (eg. upgraded) without affecting the
structure of the data that was stored.

It looks like we can probably get away without a version code just by
having arbitrary extension fields, sort of like MIME headers or
whatever.

> With respect to formats, I've spent a lot of time in the lisp world, so
> I'm often inclined to reach for s-expressions when the performance is
> viable -- being able to examine and adjust your data (decades later)
> with sed/perl/emacs/vi is always nice.  That said, I know s-expressions
> are not to everyone's taste, and they're not always efficient enough,
> nor an easy match for some target languages.
>
> However, I do think it might be worth considering formats that are very
> well defined, extensible, and allow you to avoid writing/maintaining a
> lot of custom parsing code.  That should make it easier to fold in
> support for next year's new OS, FS, or fancy metadata scheme.  (Although
> I don't know much about them yet, I wondered if protocol buffers might
> be interesting.)
>

You can manipulate s-expressions with sed? Well, I guess you can
manipulate anything with sed, but it's not always easy... :)

My initial thoughts on file format were to use the usual modern
standbys, ie. email-header-style, xml, json, or yaml. XML sucks so I
think I'd just skip that. email-style headers are basically what git
commits use, and it works okay, but I don't think they're great for
even moderately complex data structures (such as Linux ACLs or SELinux
attributes, never mind Mac resource forks).

json and yaml don't have those flaws, but the fact that they're
parse-intensive formats kind of annoys me, since we'll be generating
them potentially millions of times during a backup. I was on a
text-only binge for a long time, but playing with git's data
structures (which are often, but not always, binary) has given me a
new appreciation for binary-formatted data.

When you suggested it to me privately earlier, I took another look at
Google's Protocol Buffers stuff. It's actually pretty cool. Well,
the data format is pretty cool (it's just a pretty,
recursively-defined, extensible, straightforward way of essentially
having binary key-value pairs). There's also a well-defined text
format that can represent exactly the same data as the binary format
does, which means it's easy to read/write binary data efficiently
while also making it easy to manipulate.

I tried out their library, though, and I don't like it; it's very
"enterprisey" (in which it has a *compiler* written in C++, which then
generates python code). The whole thing together is 1.4 megs
*compressed* of crap, which is much much bigger than all of bup. So I
wouldn't want to use their library. Luckily, the data format is easy
enough to implement ourselves anyway.

So I think protocol buffers might be the winner here. It's just that,
in the name of sanity and not creating bup dependencies on things like
C++ compilers, we'd have to implement it ourselves. That wouldn't be
so bad, I think.
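
To give a sense of how simple the wire format is, here's a rough,
untested sketch of the core of an encoder (varint keys and values
only -- no nested messages, signed ints, or any of the fancier stuff):

def encode_varint(n):
    """Encode a non-negative int as a base-128 varint (low bits first)."""
    out = bytearray()
    while True:
        b = n & 0x7f
        n >>= 7
        if n:
            out.append(b | 0x80)  # set the continuation bit
        else:
            out.append(b)
            return bytes(out)

def encode_field(field_number, value):
    """Encode one field: ints as wire type 0 (varint), bytes as type 2."""
    if isinstance(value, int):
        return encode_varint((field_number << 3) | 0) + encode_varint(value)
    return (encode_varint((field_number << 3) | 2)
            + encode_varint(len(value)) + value)

# e.g. a made-up metadata record: field 1 = mode, 2 = uid, 3 = username
record = (encode_field(1, 33188)        # 33188 == 0100644
          + encode_field(2, 1001)
          + encode_field(3, b'apenwarr'))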

...

Also, having thought about it for a while now, I'm coming out more and
more against storing all the metadata in a single file (metastore
style) and more in favour of storing it in its own tree (git style),
however we actually name that tree in the end.

That means there are basically two tasks to implementing metadata:

1) converters between a file's metadata and a git-storable blob
representing that metadata in our bup format

2) something that arranges the metadata into a tree.

These two tasks could be parallelized. Since I know a lot about how
bup organizes things into trees, probably I'm most suited to implement
#2. But if someone wants to implement the first version of #1 and
contribute it, that would be great. I have a feeling people using
other platforms (or who need fancy metadata) then won't find it too
hard to contribute patches to add the new kinds of metadata to the
existing encoder/decoder.

Have fun,

Avery

Rob Browning

Feb 17, 2010, 3:18:49 AM
To: Avery Pennarun, bup-list
Avery Pennarun <apen...@gmail.com> writes:

> It looks like we can probably get away without a version code just by
> having arbitrary extension fields, sort of like MIME headers or
> whatever.

Oh right, a version number definitely isn't critical. I just mentioned
that as one (minimal) way to avoid being painted into a corner.

> So I think protocol buffers might be the winner here. It's just that,
> in the name of sanity and not creating bup dependencies on things like
> C++ compilers, we'd have to implement it ourselves. That wouldn't be
> so bad, I think.

OK, at this point, you definitely know more about protocol buffers than
I do. So if bup used the existing protocol buffers implementation,
would it actually be a run-time dependency or just a build-time
dependency?

> Also, having thought about it for a while now, I'm coming out more and
> more against storing all the metadata in a single file (metastore
> style) and more in favour of storing it in its own tree (git style),
> however we actually name that tree in the end.

I had been thinking about something like this, but at least initially I
was thinking about having one blob in the meta tree per directory in the
data tree. That would probably cut down on the blob count and perhaps
fragmentation, but would increase random access time, and would require
writing notably more metadata when only a few files change attributes in
a large directory.

I suppose originally I was more or less thinking of something like an
extended git tree object, which we might even be able to add if we
weren't concerned about maintaining physical compatibility with git.

Avery Pennarun

Feb 17, 2010, 12:19:49 PM
To: Rob Browning, bup-list
On Wed, Feb 17, 2010 at 3:18 AM, Rob Browning <r...@defaultvalue.org> wrote:
> Avery Pennarun <apen...@gmail.com> writes:
>> So I think protocol buffers might be the winner here.  It's just that,
>> in the name of sanity and not creating bup dependencies on things like
>> C++ compilers, we'd have to implement it ourselves.  That wouldn't be
>> so bad, I think.
>
> OK, at this point, you definitely know more about protocol buffers than
> I do.  So if bup used the existing protocol buffers implementation,
> would it actually be a run-time dependency or just a build-time
> dependency?

From what I understand, the protobuf compiler (a C++ app) is a
build-time dependency. It then generates python code (writing a C++
program to generate python? Something here is already crazy) which
creates some python classes. But those python classes are actually
instances of *metaclasses* (I don't actually know what a metaclass is;
it's some fancy python feature I've never had to learn about), and the
metaclasses are provided by the protobuf runtime library. And on top
of all that, they have warnings that the protobuf python
implementation isn't really speed-optimized yet.

So all in all, not so inspiring. I don't care too much about the
speed optimizations (one tiny encoding run per file, and only if that
file has changed, isn't going to kill bup performance) but if it's not
going to be speed optimized anyway, then all that other crap is just
pointless. It would be easy to write a basic encoder using pure
python and skip the metaclass nonsense.

>> Also, having thought about it for a while now, I'm coming out more and
>> more against storing all the metadata in a single file (metastore
>> style) and more in favour of storing it in its own tree (git style),
>> however we actually name that tree in the end.
>
> I had been thinking about something like this, but at least initially I
> was thinking about having one blob in the meta tree per directory in the
> data tree.  That would probably cut down on the blob count and perhaps
> fragmentation, but would increase random access time, and would require
> writing notably more metadata when only a few files change attributes in
> a large directory.
>
> I suppose originally I was more or less thinking of something like an
> extended git tree object, which we might even be able to add if we
> weren't concerned about maintaining physical compatibility with git.

I see what you mean here. The metaphor is that git trees store some
attribute stuff, the filename, and the blobid, and there's one per
directory. So if we're going to store more metadata, maybe what we
should have is a single git tree-like file (.bupmeta?) with the
metadata for the stuff in that tree. It would be basically an
extension of the tree file itself, and metadata that's redundant
between multiple files in the same directory would be very efficient
to gzip.

I wouldn't worry about slowness during random access... it's no worse
than the existing trees, which aren't seekable anyway. And any
individual tree wouldn't be expected to have a huge number of objects
in it.

Furthermore, since the code for generating a new tree object is all in
one place in bup, it should be pretty easy to implement.

Okay, I think I'm convinced by this one... now all we need is an
implementation :)

Have fun,

Avery

Rob Browning

Feb 21, 2010, 12:01:16 AM
To: Avery Pennarun, bup-list
Avery Pennarun <apen...@gmail.com> writes:

> From what I understand, the protobuf compiler (a C++ app) is a
> build-time dependency. It then generates python code (writing a C++
> program to generate python? Something here is already crazy) which
> creates some python classes. But those python classes are actually
> instances of *metaclasses* (I don't actually know what a metaclass is;
> it's some fancy python feature I've never had to learn about), and the
> metaclasses are provided by the protobuf runtime library. And on top
> of all that, they have warnings that the protobuf python
> implementation isn't really speed-optimized yet.
>
> So all in all, not so inspiring. I don't care too much about the
> speed optimizations (one tiny encoding run per file, and only if that
> file has changed, isn't going to kill bup performance) but if it's not
> going to be speed optimized anyway, then all that other crap is just
> pointless. It would be easy to write a basic encoder using pure
> python and skip the metaclass nonsense.

While I can see that it might be nice to have a pure python
implementation, assuming you're not concerned about the possible
performance issue at the moment, then the main concern I see might be
the runtime library (if it means an extra run-time dependency). In
Debian that's not a big deal, since protobuf is already available, but
everything isn't Debian.

In any case, since you've actually played with the protobuf
implementation, you're in a better position to judge, but offhand, I'd be
tempted to set the threshold fairly high for "doing it ourselves" unless
it really is easy to duplicate their work, including the ability to
easily alter/extend the storage types, and to switch to text formats for
debugging.

More broadly, one other thing to consider is whether or not protobufs
are suitable for long-term storage -- i.e. is the format stable?, etc.

> I see what you mean here. The metaphor is that git trees store some
> attribute stuff, the filename, and the blobid, and there's one per
> directory. So if we're going to store more metadata, maybe what we
> should have is a single git tree-like file (.bupmeta?) with the
> metadata for the stuff in that tree. It would be basically an
> extension of the tree file itself, and metadata that's redundant
> between multiple files in the same directory would be very efficient
> to gzip.

Yes, that's basically what I was thinking. It originally started when I
looked at the output of "git ls-tree" and realized that all I really
wanted at the time was the ability to add more info to the line (even
if git just ignored all the extra bits).

> I wouldn't worry about slowness during random access... it's no worse
> than the existing trees, which aren't seekable anyway. And any
> individual tree wouldn't be expected to have a huge number of objects
> in it.

OK, my main concern was large directories, on the order of a few hundred
thousand files (i.e. large maildirs). Of course, if we're creating and
maintaining a new type of tree object, we can always make adjustments if
that becomes a problem, presuming we have a way to accommodate changes to
the data format -- say if we decide to add an index, for example.  The key
thing is to be able to look at a given tree object and know how to read
it.

Oh, and with respect to large tree blobs, I wondered if there might be a
way to fit them in with the "chunking" that you've been working on, so
that when you backup a large directory where only a few files have
changed, you don't have to write an entirely new tree object.

Of course, if we come up with an extended tree blob and it works well,
then we might start to ask why we need a separate meta tree (other than
trying to maintain some form of binary git compatibility).

Avery Pennarun

Feb 21, 2010, 1:32:41 AM
To: Rob Browning, bup-list
On Sun, Feb 21, 2010 at 12:01 AM, Rob Browning <r...@defaultvalue.org> wrote:
> [Google protobuf library]

> While I can see that it might be nice to have a pure python
> implementation, assuming you're not concerned about the possible
> performance issue at the moment, then the main concern I see might be
> the runtime library (if it means an extra run-time dependency).  In
> Debian that's not a big deal, since protobuf is already available, but
> everything isn't Debian.

Right. I'm not that set on a pure python implementation - bup already
isn't pure python, and I expect to need to optimize even more parts in
C eventually. But I'm rather opposed to sucking in an extra
dependency that's about 7x the current size of bup, just to implement
a simple file format.

> In any case, since you've actually played with the protobuf
> implementation, you're in a better position to judge, but offhand, I'd be
> tempted to set the threshold fairly high for "doing it ourselves" unless
> it really is easy to duplicate their work, including the ability to
> easily alter/extend the storage types, and to switch to text formats for
> debugging.
>
> More broadly, one other thing to consider is whether or not protobufs
> are suitable for long-term storage -- i.e. is the format stable?, etc.

The file format itself is at least really simple, which means it's
pretty easy to write an encoder/decoder. Of course, writing it
ourselves means the possibility of introducing bugs that will make old
versions not quite capable of reading new files (if we use protobuf's
extensibility features) just because it's hard to test that sort of
thing. But then again, I don't think that's very important. As long
as new versions can read old files, life isn't too complicated.

The protobuf data format is definitely stable and extensible. That's
one thing that makes it nice. But it's by no means the only way to do
it.

>> I wouldn't worry about slowness during random access... it's no worse
>> than the existing trees, which aren't seekable anyway.  And any
>> individual tree wouldn't be expected to have a huge number of objects
>> in it.
>
> OK, my main concern was large directories, on the order of a few hundred
> thousand files (i.e. large maildirs).  Of course, if we're creating and
> maintaining a new type of tree object, we can always make adjustments if
> that becomes a problem, presuming we have a way to accommodate changes to
> the data format -- say if we decide to add an index, for example.  The key
> thing is to be able to look at a given tree object and know how to read
> it.

Oh. Ouch. Hundreds of thousands of files in a single directory is
definitely not something I was planning for, but I see what you mean.

In the particular case of maildir (which is probably the *only* case I
can imagine with that many files in one directory), probably every
file will have pretty much the same attributes. If we stored the
attributes along with each filename, gzip would obviously have an easy
time compressing it, but it wouldn't be at all random-access. If we
wanted to do it in a random-access way, we'd have to at least store an
uncompressed version in RAM, and 100k files * 1024 bytes (say) would
be 100 megs just to store the file index for one folder. That's
pretty gross.

If we go back to my earlier suggestion - use a tree object instead,
and have it link to blobs representing the attributes - then we'd only
have to store the common attributes once, and the cost per entry in
the uncompressed tree file would be closer to 30-40 bytes (10-20 bytes
filename + 20 bytes blob sha1). It still wouldn't be inherently
seekable, but things like 'bup fuse' could just load the whole tree
into RAM and it wouldn't be totally ridiculously large. (100k * 40 =
4 megs)

Of course when just restoring or saving files (the usual cases), it's
not a big deal if it isn't seekable. We just iterate through it once.
So maybe it's not worth optimizing.

> Oh, and with respect to large tree blobs, I wondered if there might be a
> way to fit them in with the "chunking" that you've been working on, so
> that when you backup a large directory where only a few files have
> changed, you don't have to write an entirely new tree object.

Something like the chunking might work, yes. Probably something a
little more straightforward, though, like just turning the flat
filename space into a tree by breaking it every n characters of the
filename.

Of course we'd have to do this for the actual file content part of the
backup, too, which would be annoying (and make our backups even more
unlike plain git...)

> Of course, if we come up with an extended tree blob and it works well,
> then we might start to ask why we need a separate meta tree (other than
> trying to maintain some form of binary git compatibility).

I think the binary git compatibility is still valuable. I can imagine
my opinion changing eventually, but it would have to be for a good
reason :)

Have fun,

Avery

Rob Browning

Feb 23, 2010, 1:55:59 AM
To: bup-...@googlegroups.com
Avery Pennarun <apen...@gmail.com> writes:

> In the particular case of maildir (which is probably the *only* case I
> can imagine with that many files in one directory), probably every
> file will have pretty much the same attributes. If we stored the
> attributes along with each filename, gzip would obviously have an easy
> time compressing it, but it wouldn't be at all random-access. If we
> wanted to do it in a random-access way, we'd have to at least store an
> uncompressed version in RAM, and 100k files * 1024 bytes (say) would
> be 100 megs just to store the file index for one folder. That's
> pretty gross.

Though assuming you really only care about random access based on path
names, then you could store a path->offset index followed by the data,
and then sync the gzipped output every so often. That should allow
quick access to any given point (after loading the index), and support
streaming from there. That said, I'm not sure such an arrangement is
really what we'd want.

> If we go back to my earlier suggestion - use a tree object instead,
> and have it link to blobs representing the attributes - then we'd only
> have to store the common attributes once, and the cost per entry in
> the uncompressed tree file would be closer to 30-40 bytes (10-20 bytes
> filename + 20 bytes blob sha1). It still wouldn't be inherently
> seekable, but things like 'bup fuse' could just load the whole tree
> into RAM and it wouldn't be totally ridiculously large. (100k * 40 =
> 4 megs)

> Of course when just restoring or saving files (the usual cases), it's
> not a big deal if it isn't seekable. We just iterate through it once.
> So maybe it's not worth optimizing.

The case I was thinking about is when you accidentally delete ".../foo",
and want to restore it, but unless you're trying to restore some
completely scattered set of files, I doubt it will be all that
expensive, even if we don't tune the representation for that case.

> Something like the chunking might work, yes. Probably something a
> little more straightforward, though, like just turning the flat
> filename space into a tree by breaking it every n characters of the
> filename.
>
> Of course we'd have to do this for the actual file content part of the
> backup, too, which would be annoying (and make our backups even more
> unlike plain git...)

Hmm, I'm not sure I understand exactly what you're suggesting.

What I was thinking about was a way to just cut down on the cost of
small changes in large directories (say adding/deleting a handful of
files, touching a file or two, etc.).

For example, you might make the metadata tree object a list of sha1s to
chunks, where each chunk contains the metadata (directly or indirectly)
for a thousand files. That way, if you change one file, you just have
to re-write the relevant chunk and all the parent chunk-lists on the
path back to the root. You could also consider only applying this
approach when the number of files in a directory is greater than some
threshold.

Avery Pennarun

Feb 23, 2010, 5:54:00 PM
To: Rob Browning, bup-...@googlegroups.com
On Tue, Feb 23, 2010 at 1:55 AM, Rob Browning <r...@defaultvalue.org> wrote:
> Avery Pennarun <apen...@gmail.com> writes:
>> In the particular case of maildir (which is probably the *only* case I
>> can imagine with that many files in one directory), probably every
>> file will have pretty much the same attributes.  If we stored the
>> attributes along with each filename, gzip would obviously have an easy
>> time compressing it, but it wouldn't be at all random-access.  If we
>> wanted to do it in a random-access way, we'd have to at least store an
>> uncompressed version in RAM, and 100k files * 1024 bytes (say) would
>> be 100 megs just to store the file index for one folder.  That's
>> pretty gross.
>
> Though assuming you really only care about random access based on path
> names, then you could store a path->offset index followed by the data,
> and then sync the gzipped output every so often.  That should allow
> quick access to any given point (after loading the index), and support
> streaming from there.  That said, I'm not sure such an arrangement is
> really what we'd want.

It's rather un-git-like: such an index would affect the sha1sum of the
tree itself, if it were stored inside the tree object. (Of course,
the existing bup chunk encoding also has that problem. That also
annoys me, but it's also fundamental to why the chunking system works
at all - the chunks *do* need to have their own names - so it's at
least partly justified.)

I think we can summarize the git object storage model as "just store
exactly what the object wants to express, no more, no less." So pure
performance optimizations like indexes are right out. We could store
those *alongside* the object as a cache, but that's a bit gross too.

>> Of course when just restoring or saving files (the usual cases), it's
>> not a big deal if it isn't seekable.  We just iterate through it once.
>>  So maybe it's not worth optimizing.
>
> The case I was thinking about is when you accidentally delete ".../foo",
> and want to restore it, but unless you're trying to restore some
> completely scattered set of files, I doubt it will be all that
> expensive, even if we don't tune the representation for that case.

True. Iterating through a hundred thousand files isn't very
expensive, really, and that's in the absolutely horrible case of a
maildir with a hundred thousand files. (Last I checked, mutt would
take *forever* to start up if you had anything like that, so bup is
the least of your worries :))

>> Something like the chunking might work, yes.  Probably something a
>> little more straightforward, though, like just turning the flat
>> filename space into a tree by breaking it every n characters of the
>> filename.
>>
>> Of course we'd have to do this for the actual file content part of the
>> backup, too, which would be annoying (and make our backups even more
>> unlike plain git...)
>
> Hmm, I'm not sure I understand exactly what you're suggesting.
>
> What I was thinking about was a way to just cut down on the cost of
> small changes in large directories (say adding/deleting a handful of
> files, touching a file or two, etc.).
>
> For example, you might make the metadata tree object a list of sha1s to
> chunks, where each chunk contains the metadata (directly or indirectly)
> for a thousand files.  That way, if you change one file, you just have
> to re-write the relevant chunk and all the parent chunk-lists on the
> path back to the root.  You could also consider only applying this
> approach when the number of files in a directory is greater than some
> threshold.

Yes, exactly. I'm just saying that the chunk divisions for a tree
object can be decided using a different formula than the one we use
for blobs. For example, you might take a file named
"abcdefg-hooga-17" and split it into trees by pretending its name is
"ab/cd/efg-hooga-17" or something (with more subdivisions if there are
more objects). That's less arbitrary than a rolling checksum based
scheme, and it would do useful things like only change one single
branch of the tree if you added a file with the same prefix as some
other file. It would also guarantee that each chunk would be
parseable from its first byte.
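
A sketch of that splitting rule (the two-characters-per-level choice
here is arbitrary):

def split_name(name, chars_per_level=2, levels=2):
    """Turn 'abcdefg-hooga-17' into ['ab', 'cd', 'efg-hooga-17']."""
    parts = []
    for _ in range(levels):
        if len(name) <= chars_per_level:
            break
        parts.append(name[:chars_per_level])
        name = name[chars_per_level:]
    parts.append(name)
    return parts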

But it's sounding more and more like this situation isn't worth
worrying about right now. A single per-tree blob to store metadata
should handle all normal situations reasonably, and edge cases (like a
single directory with 100k+ files) with only a bit of slowness. And
we can always straightforwardly add chunking for the metadata file
later if it turns out to be a problem.

Have fun,

Avery

Rob Browning

Mar 7, 2010, 4:18:38 PM
To: Avery Pennarun, bup-list
Avery Pennarun <apen...@gmail.com> writes:

> The protobuf data format is definitely stable and extensible. That's
> one thing that makes it nice. But it's by no means the only way to do
> it.

Indeed. I'm actually considering alternatives. I think one of the
biggest benefits would come from having help maintaining the
implementation, and without that, the case for protobuf is weaker.

> If we go back to my earlier suggestion - use a tree object instead,
> and have it link to blobs representing the attributes - then we'd only
> have to store the common attributes once, and the cost per entry in
> the uncompressed tree file would be closer to 30-40 bytes (10-20 bytes
> filename + 20 bytes blob sha1). It still wouldn't be inherently
> seekable, but things like 'bup fuse' could just load the whole tree
> into RAM and it wouldn't be totally ridiculously large. (100k * 40 =
> 4 megs)

Recently, I've also been thinking that we might just start with the "one
metadata blob per file" approach, and see how it goes, but I'm wondering
a bit about that now.

One concern, which I'd already thought about, was with file times,
especially if atime is enabled. If we store all of a file's attributes
in a single blob, then I suspect that the variance in times for
different paths might leave us with much less structure sharing than we
might like (though we would still share structure across subsequent
backups when nothing else has changed).

This is less a concern if compression is layered on top of the blob
store, so that we compress some number of metadata blobs together. Do
things already work that way? i.e. can we store many blobs in the same
compressed "chunk"?

Of course we could also consider ways to store the more volatile
information separately, but then the overhead of additional blobs per
file might make it hard to win.

> Of course when just restoring or saving files (the usual cases), it's
> not a big deal if it isn't seekable. We just iterate through it once.
> So maybe it's not worth optimizing.

Perhaps not. For now, what I really want is to extract a small set of
files much more quickly than tar. So I don't care if we have to read
through tens of MB to extract 1 MB from a GB archive, as long as we
never have to read through nearly a GB.

Avery Pennarun

Mar 7, 2010, 4:25:59 PM
To: Rob Browning, bup-list
On Sun, Mar 7, 2010 at 4:18 PM, Rob Browning <r...@defaultvalue.org> wrote:
> Avery Pennarun <apen...@gmail.com> writes:
>> The protobuf data format is definitely stable and extensible.  That's
>> one thing that makes it nice.  But it's by no means the only way to do
>> it.
>
> Indeed.  I'm actually considering alternatives.  I think one of the
> biggest benefits would come from having help maintaining the
> implementation, and without that, the case for protobuf is weaker.

True.

> Recently, I've also been thinking that we might just start with the "one
> metadata blob per file" approach, and see how it goes, but I'm wondering
> a bit about that now.
>
> One concern, which I'd already thought about was with file times,
> especially if atime is enabled.  If we store all of a file's attributes
> in a single blob, then I suspect that the variance in times for
> different paths might leave us with much less structure sharing than we
> might like (though we would still share structure across subsequent
> backups when nothing else has changed).
>
> This is less a concern if compression is layered on top of the blob
> store, so that we compress some number of metadata blobs together.  Do
> things already work that way?  i.e. can we store many blobs in the same
> compressed "chunk"?
>
> Of course we could also consider ways to store the more volatile
> information separately, but then the overhead of additional blobs per
> file might make it hard to win.

If we wanted one metadata blob per file, we would definitely have to
store the ctime/mtime/atime in the tree that *points* at those blobs,
or you're right, we'd get basically zero sharing between blobs and it
wouldn't accomplish anything.

If we want to compress multiple metadata blobs together, I think we
might as well just do it on a per-tree basis. That is, one file per
tree object. Most trees won't be stupidly large, and for the few that
are, I think we can just live with it. (We'll already be living with
the stupidly large tree itself :))

>> Of course when just restoring or saving files (the usual cases), it's
>> not a big deal if it isn't seekable.  We just iterate through it once.
>>  So maybe it's not worth optimizing.
>
> Perhaps not.  For now, what I really want is to extract a small set of
> files much more quickly than tar.  So I don't care if we have to read
> through tens of MB to extract 1 MB from a GB archive, as long as we
> never have to read through nearly a GB.

Agreed. I think the case of a directory containing hundreds of
thousands of files is not worth optimizing for. At least not right
now :)

Have fun,

Avery

Rob Browning

Mar 7, 2010, 5:10:01 PM
To: Avery Pennarun, bup-list
Avery Pennarun <apen...@gmail.com> writes:

> If we wanted one metadata blob per file, we would definitely have to
> store the ctime/mtime/atime in the tree that *points* at those blobs,
> or you're right, we'd get basically zero sharing between blobs and it
> wouldn't accomplish anything.

So I think I'm a bit confused, and I just want to make sure I haven't
misunderstood something. Would that be possible without changing the
git tree format? I suppose you could use some kind of path encoding.

> If we want to compress multiple metadata blobs together, I think we
> might as well just do it on a per-tree basis. That is, one file per
> tree object. Most trees won't be stupidly large, and for the few that
> are, I think we can just live with it. (We'll already be living with
> the stupidly large tree itself :))

OK, so you mean one meta/ file per data/ tree object?

Thanks

Avery Pennarun

Mar 8, 2010, 4:16:54 PM
To: Rob Browning, bup-list
On Sun, Mar 7, 2010 at 5:10 PM, Rob Browning <r...@defaultvalue.org> wrote:
> Avery Pennarun <apen...@gmail.com> writes:
>> If we wanted one metadata blob per file, we would definitely have to
>> store the ctime/mtime/atime in the tree that *points* at those blobs,
>> or you're right, we'd get basically zero sharing between blobs and it
>> wouldn't accomplish anything.
>
> So I think I'm a bit confused, and I just want to make sure I haven't
> misunderstood something.  Would that be possible without changing the
> git tree format?  I suppose you could use some kind of path encoding.

Yes, this would be doable.  That's what I was thinking.  Instead of
encoding tree elements like this:

blob 100644 passwd\0%s\0

encode them like this:

blob 100644 passwd (1268082694,1268082694,1268082694).bup\0%s\0

Which is kind of disgusting, but would do the job nicely.  The %s is
the binary sha1 of the *metadata* blob.

Since tree file entries are actually just plaintext, this would be
storable and less disgusting:

blob 100644/1268082694/1268082694/1268082694 passwd\0%s\0

...but git would presumably reject it, so we can't really do that.
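
In raw tree-entry terms (mode in ASCII octal, space, name, NUL, 20-byte
binary sha1), building such an entry might look roughly like this -- the
name mangling and the order of the three times are just the hypothetical
convention sketched above:

def meta_tree_entry(mode, name, mtime, atime, ctime, meta_sha):
    """Build one raw tree entry whose name carries the file times.

    meta_sha is the 20-byte binary sha1 of the metadata blob.
    """
    mangled = '%s (%d,%d,%d).bup' % (name, mtime, atime, ctime)
    return ('%o %s\0' % (mode, mangled)).encode('utf-8') + meta_sha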

Note that this method is only needed if we create a metadata *tree*
for each data tree, which is the opposite of:

>> If we want to compress multiple metadata blobs together, I think we
>> might as well just do it on a per-tree basis.  That is, one file per
>> tree object.  Most trees won't be stupidly large, and for the few that
>> are, I think we can just live with it.  (We'll already be living with
>> the stupidly large tree itself :))
>
> OK, so you mean one meta/ file per data/ tree object?

Yes, in the above I'm talking about one .bupmeta file stored as a
member of each tree.

The more I think about it, the more I realize this way is probably for
the best.  Unless I'm wrong, of course :)

Have fun,

Avery

Rob Browning

Mar 9, 2010, 12:41:38 AM
To: Avery Pennarun, bup-list
Avery Pennarun <apen...@gmail.com> writes:

> Yes, in the above I'm talking about one .bupmeta file stored as a
> member of each tree.
>
> The more I think about it, the more I realize this way is probably for
> the best.  Unless I'm wrong, of course :)

That's the direction I'm leaning right now too.
