Metadata support in bup-fuse
18 messages
From: Gabriele Santilli <santilli.gabri...@gmail.com>
Date: Wed, 11 Jul 2012 12:20:21 +0200
Subject: Metadata support in bup-fuse
I don't know any python, but maybe I can try doing this if someone's
willing to help me.

What would it take to add metadata support to bup-fuse? Where should I
look? Any pointers and suggestions?


 
From: "Oei, YC" <oei.yungc...@gmail.com>
Date: Wed, 11 Jul 2012 13:55:47 +0100
Subject: Re: Metadata support in bup-fuse
This would be brilliant to have - I'd very much like to be able to, e.g.,
verify backups without the need for a restore (nor the space for
that).

On 11 July 2012 11:20, Gabriele Santilli <santilli.gabri...@gmail.com> wrote:

> I don't know any python, but maybe I can try doing this if someone's
> willing to help me.

I'm terrible at Python despite working with it every day, but would be
keen to work on this.

For the fuse side of things, I think it's "just" a matter of defining
some more methods in BupFs, for instance getxattr(). Also some things
currently have a dummy implementation, e.g. if you follow st_nlink it's
eventually just a "return 1". I figure the tricky bit is in the
metadata "backend" side of things, where you'd have to consult the
.bupm files all the time (and should probably do some sort of
caching).
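
As a rough illustration of the "define some more methods" idea, here is a
minimal sketch. It is not bup's code and is not tied to any particular FUSE
binding: FakeMeta, SketchFs, the path-keyed dictionary, and the choice to
signal errors with OSError are all assumptions made for the example. It only
shows getxattr(), listxattr(), and a link count being answered from a stored
metadata record instead of from dummy values.

```python
import errno

# ENODATA is the usual "no such attribute" errno on Linux; fall back to
# ENOENT on platforms whose errno module lacks it.
NO_ATTR = getattr(errno, "ENODATA", errno.ENOENT)


class FakeMeta:
    """Hypothetical per-file metadata record (not bup's Metadata class)."""

    def __init__(self, xattrs=None, nlink=1):
        self.xattrs = dict(xattrs or {})
        self.nlink = nlink


class SketchFs:
    """Toy front-end keyed by path; a stand-in for a FUSE class such as BupFs."""

    def __init__(self, meta_by_path):
        self._meta_by_path = meta_by_path

    def _lookup(self, path):
        try:
            return self._meta_by_path[path]
        except KeyError:
            raise OSError(errno.ENOENT, "no such file", path)

    def getxattr(self, path, name):
        # Answer from the stored record; unknown attributes are an error.
        meta = self._lookup(path)
        if name not in meta.xattrs:
            raise OSError(NO_ATTR, "no such attribute", name)
        return meta.xattrs[name]

    def listxattr(self, path):
        return sorted(self._lookup(path).xattrs)

    def nlink(self, path):
        # Report the stored link count instead of a hard-coded 1.
        return self._lookup(path).nlink
```

How errors have to be reported (exception versus negative errno) depends on
the FUSE binding in use, so that part would need adapting.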

I'm optimistic, mostly out of ignorance. Probably Rob's the best
person to say if this has any chance of success?

YC


 
From: Gabriele Santilli <santilli.gabri...@gmail.com>
Date: Wed, 11 Jul 2012 15:19:15 +0200
Subject: Re: Metadata support in bup-fuse

On Wed, Jul 11, 2012 at 2:55 PM, Oei, YC <oei.yungc...@gmail.com> wrote:
> For the fuse side of things, I think it's "just" a matter of defining
> some more methods in BupFs, for instance getxattr().

That was my hope...

> Also some things
> currently have a dummy implementation, e.g. if you follow st_nlink it's
> eventually just a "return 1". I figure the tricky bit is in the
> metadata "backend" side of things, where you'd have to consult the
> .bupm files all the time (and should probably do some sort of
> caching).

Right, and here is where I don't know how much work needs to be done.
I've been holding off attempting this for a long time, because I don't
really have free time and I don't know python. But... having this
feature might save me quite a bit of time (if my assumption about
something else is correct).

So my plan, I guess, is that I'll be throwing bad code to the list in
the hope that someone will review and fix it. :/


 
From: Gabriel Filion <lelu...@gmail.com>
Date: Wed, 11 Jul 2012 12:40:08 -0400
Subject: Re: Metadata support in bup-fuse
Hello,

On 12-07-11 09:19 AM, Gabriele Santilli wrote:

> On Wed, Jul 11, 2012 at 2:55 PM, Oei, YC <oei.yungc...@gmail.com> wrote:

>> For the fuse side of things, I think it's "just" a matter of defining
>> some more methods in BupFs, for instance getxattr().

> That was my hope...

I started work last week (before being buried by a ton and a half of
work) on "bup ls -l", which would be exposing permissions and other
stuff that ls -l exposes.

I hit the point where I need to expose metadata in vfs.py ... and that
would make exposing them in fuse trivial! :)

so if we have this last key part, we can easily implement fuse and ls -l
exposition.

we'd probably have to mirror what bup-restore is doing to fetch metadata
for a single file and move that inside vfs.py to expose metadata in Node
objects.

>> Also some things
>> currently have a dummy implementation, e.g. if you follow st_nlink it's
>> eventually just a "return 1". I figure the tricky bit is in the
>> metadata "backend" side of things, where you'd have to consult the
>> .bupm files all the time (and should probably do some sort of
>> caching).

> Right, and here is where I don't know how much work needs to be done.
> I've been holding off attempting this for a long time, because I don't
> really have free time and I don't know python. But... having this
> feature might save me quite a bit of time (if my assumption about
> something else is correct).

caching can be implemented in a second phase (it will probably help a
bunch). but we can start by exposing metadata, however slow that might
prove to be, and then add caching to that to optimize performance.

--
Gabriel Filion


 
From: Gabriele Santilli <santilli.gabri...@gmail.com>
Date: Wed, 11 Jul 2012 18:47:35 +0200
Subject: Re: Metadata support in bup-fuse

On Wed, Jul 11, 2012 at 6:40 PM, Gabriel Filion <lelu...@gmail.com> wrote:
> I hit the point where I need to expose metadata in vfs.py ... and that
> would make exposing them in fuse trivial! :)

Cool, let us know how things proceed then. :)

> caching can be implemented in a second phase (it will probably help a
> bunch). but we can start by exposing metadata, however slow that might
> prove to be, and then add caching to that to optimize performance.

Agreed.

 
From: Gabriel Filion <lelu...@gmail.com>
Date: Wed, 11 Jul 2012 13:01:28 -0400
Subject: Re: Metadata support in bup-fuse
On 12-07-11 12:47 PM, Gabriele Santilli wrote:

>> I hit the point where I need to expose metadata in vfs.py ... and that
>> would make exposing them in fuse trivial! :)
> Cool, let us know how things proceed then. :)

I hope to try my luck at it soon, but I can't promise when: I've had
multiple power failures in datacenters so I'm going to be moving things
out of there for some time now... bleh

I'll keep you posted. If HOPE 9 gets boring at some point, maybe I'll
have some time this weekend ;)

(p.s.: if some of you guys are going to HOPE, maybe we can arrange [with
private mails] a meeting there ^-^ -- sorry for this _totally_ off topic
message)

--
Gabriel Filion


 
From: Gabriele Santilli <santilli.gabri...@gmail.com>
Date: Wed, 11 Jul 2012 19:04:18 +0200
Subject: Re: Metadata support in bup-fuse

On Wed, Jul 11, 2012 at 7:01 PM, Gabriel Filion <lelu...@gmail.com> wrote:
> I hope to try my luck at it soon, but I can't promise when: I've had
> multiple power failures in datacenters so I'm going to be moving things
> out of there for some time now... bleh

No worries, it would probably take me more time anyway...

 
From: Rob Browning <r...@defaultvalue.org>
Date: Wed, 11 Jul 2012 16:07:34 -0500
Subject: Re: Metadata support in bup-fuse

Gabriel Filion <lelu...@gmail.com> writes:
> caching can be implemented in a second phase (it will probably help a
> bunch). but we can start by exposing metadata, however slow that might
> prove to be, and then add caching to that to optimize performance.

I haven't thought about it carefully yet, but one very simple approach
could be to just change vfs.py so that we have a Node._metadata, rather
than a Dir._metadata, and so that Node._metadata stores the actual
Metadata object for *that* Node, rather than a File object for the Dir's
.bupm.

Then we could add a Dir.metadata() like this:

  def metadata(self):
      if self._subs is None:
          self._mksubs()
      return self._metadata

and have Dir's _mksubs() populate _metadata for itself and all of its
immediate children.  Non-dirs would just return self._metadata.
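
To make the shape of that proposal concrete, here is a minimal,
self-contained sketch. It is not bup's vfs.py: the (name, uid) record list,
the sub() helper, and the Metadata.created counter are invented for
illustration, and real Metadata objects carry far more than a uid. The
counter is there only to show that the first metadata() call on a directory
builds one Metadata object for the directory and one per child.

```python
class Metadata:
    """Hypothetical decoded metadata record; counts how many were built."""

    created = 0

    def __init__(self, uid):
        self.uid = uid
        Metadata.created += 1


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self._metadata = None

    def metadata(self):
        # Non-dirs just return whatever their parent Dir stored on them.
        return self._metadata


class Dir(Node):
    def __init__(self, name, records, parent=None):
        super().__init__(name, parent)
        # records: list of (child_name, uid); the "" entry is the dir itself.
        self._records = records
        self._subs = None

    def _mksubs(self):
        # Populate _metadata for this Dir and all immediate children at once.
        self._subs = {}
        for child_name, uid in self._records:
            meta = Metadata(uid)
            if child_name == "":
                self._metadata = meta
            else:
                child = Node(child_name, parent=self)
                child._metadata = meta
                self._subs[child_name] = child

    def metadata(self):
        if self._subs is None:
            self._mksubs()
        return self._metadata

    def sub(self, name):
        if self._subs is None:
            self._mksubs()
        return self._subs[name]
```

With three records, a single d.metadata() call leaves Metadata.created at 3,
and later lookups of children reuse those objects.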

Offhand, I think this should be fairly easy to implement, and if we
like, I could probably handle it immediately, but I suppose it might be too
expensive as a default.

Thoughts?
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4


 
From: Rob Browning <r...@defaultvalue.org>
Date: Thu, 12 Jul 2012 12:39:19 -0500
Subject: Re: Metadata support in bup-fuse

Rob Browning <r...@defaultvalue.org> writes:
> Offhand, I think this should be fairly easy to implement, and if we
> like, I could probably handle it immediately, but I suppose it might be too
> expensive as a default.

...and as a bit of example, imagine the case where a directory has 100k+
files (i.e. debian-devel maildir).  Calling metadata() for that
directory or any file in it, or calling _mksubs() indirectly, would
result in the immediate creation of 100k+ Metadata objects.

--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4


 
From: Gabriel Filion <lelu...@gmail.com>
Date: Thu, 12 Jul 2012 13:44:16 -0400
Subject: Re: Metadata support in bup-fuse
On 12-07-11 05:07 PM, Rob Browning wrote:

> I haven't thought about it carefully yet, but one very simple approach
> could be to just change vfs.py so that we have a Node._metadata, rather
> than a Dir._metadata, and so that Node._metadata stores the actual
> Metadata object for *that* Node, rather than a File object for the Dir's
> .bupm.

> Then we could add a Dir.metadata() like this:

>   def metadata(self):
>       if self._subs is None:
>           self._mksubs()
>       return self._metadata

that function is already in the Dir class (named metadata_file() ). So
we only need to implement something that exposes metadata in File and
Symlink.
Then we can retrieve the metadata of each object when iterating
over them in "front-end" commands like bup ls, bup fuse, bup web.

--
Gabriel Filion


 
From: Rob Browning <r...@defaultvalue.org>
Date: Thu, 12 Jul 2012 14:31:35 -0500
Subject: Re: Metadata support in bup-fuse

Gabriel Filion <lelu...@gmail.com> writes:
> that function is already in the Dir class (named metadata_file() ). So
> we only need to implement something that exposes metadata in File and
> Symlink.

Right, but metadata_file() is the file containing the entire set of
metadata entries (in encoded format) for all of the items in the
directory, including the directory itself.  If that's all you need, (and
plan to linearly search it for a particular entry, or will normally just
be iterating over them all -- which is efficient) then that's fine, but
I was assuming you might need to be able to retrieve the metadata for
each individual file/dir/symlink quickly.
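
The cost of the linear-search option can be seen in a toy example. The
record layout below (one tab-separated "name, uid, octal mode" line per
entry) is purely an assumption for the sketch and is not the real .bupm
encoding; find_record() is likewise a made-up helper. The point is only that
finding one entry means decoding every entry before it.

```python
import io


def find_record(fileobj, wanted):
    """Scan a toy metadata stream from the start until `wanted` is found."""
    fileobj.seek(0)
    for raw in fileobj:
        name, uid, mode = raw.decode("utf-8").rstrip("\n").split("\t")
        if name == wanted:
            return {"name": name, "uid": int(uid), "mode": int(mode, 8)}
    return None


# Toy stream: the "." entry stands for the directory itself.
stream = io.BytesIO(b".\t0\t755\na.txt\t1000\t644\nb.txt\t1000\t600\n")
print(find_record(stream, "b.txt"))
```

Iterating over all entries in order costs the same single pass, which is why
that case is efficient while repeated single-entry lookups are not.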

--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4


 
From: Gabriel Filion <lelu...@gmail.com>
Date: Sat, 14 Jul 2012 16:16:28 -0400
Subject: Re: Metadata support in bup-fuse
On 12-07-12 03:31 PM, Rob Browning wrote:

> Gabriel Filion <lelu...@gmail.com> writes:
>> that function is already in the Dir class (named metadata_file() ). So
>> we only need to implement something that exposes metadata in File and
>> Symlink.

> Right, but metadata_file() is the file containing the entire set of
> metadata entries (in encoded format) for all of the items in the
> directory, including the directory itself.  If that's all you need, (and
> plan to linearly search it for a particular entry, or will normally just
> be iterating over them all -- which is efficient) then that's fine, but
> I was assuming you might need to be able to retrieve the metadata for
> each individual file/dir/symlink quickly.

right, I'd want that. but for now the .bupm files don't have an index,
so we need to iterate over it to find metadata for one particular file,
don't we?

we'd need to either make the code cache metadata entries, or add an
index to the .bupm file.

the former sounds like it can suffer on performance and memory footprint
in cases where directories contain a lot of files.

--
Gabriel Filion


 
From: Rob Browning <r...@defaultvalue.org>
Date: Sat, 14 Jul 2012 15:38:48 -0500
Subject: Re: Metadata support in bup-fuse

Gabriel Filion <lelu...@gmail.com> writes:
> right, I'd want that. but for now the .bupm files don't have an index,
> so we need to iterate over it to find metadata for one particular file,
> don't we?

Right.

> we'd need to either make the code cache metadata entries, or add an
> index to the .bupm file.

Right -- the former was what I was initially talking about.  I think I
can add that easily (and probably fairly immediately), but I wasn't sure
if it was reasonable.

> the former sounds like it can suffer on performance and memory footprint
> in cases where directories contain a lot of files.

Exactly.

Though now that I think about it -- I wonder if we could just implement
the dumb caching approach for now, and add indexing (or something else
smarter) when we get a chance.

For example, if the fs_item.metadata() API seems reasonable, then for
the moment, I could just implement it via (potentially expensive)
on-demand caching, and later, we could be smarter.

Come to think of it, if we do end up with indexes, I suppose as a first
pass, they could also be generated on-demand, the first time we retrieve
an object from that .bupm.
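
A minimal sketch of that on-demand index idea, under loud assumptions: the
tab-separated record format is invented (it is not the .bupm encoding), and
LazyIndexedRecords with its scans counter is a hypothetical helper, not
anything in bup. The first lookup scans the stream once and remembers each
record's byte offset; later lookups seek straight to the record.

```python
import io


class LazyIndexedRecords:
    """Toy metadata stream whose name -> offset index is built on first use."""

    def __init__(self, fileobj):
        self._f = fileobj
        self._offsets = None  # built lazily
        self.scans = 0        # number of full scans, for illustration only

    def _build_index(self):
        self._offsets = {}
        self._f.seek(0)
        self.scans += 1
        while True:
            offset = self._f.tell()
            line = self._f.readline()
            if not line:
                break
            name = line.decode("utf-8").split("\t", 1)[0]
            self._offsets[name] = offset

    def lookup(self, name):
        if self._offsets is None:
            self._build_index()
        offset = self._offsets.get(name)
        if offset is None:
            return None
        # Jump directly to the record and decode only that one line.
        self._f.seek(offset)
        fields = self._f.readline().decode("utf-8").rstrip("\n").split("\t")
        return {"name": fields[0], "uid": int(fields[1]), "mode": int(fields[2], 8)}
```

Only the offsets are kept in memory here, which is smaller than caching every
decoded entry but still grows with the number of files in the directory.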

Thoughts?
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4


 
From: Gabriel Filion <lelu...@gmail.com>
Date: Sat, 14 Jul 2012 16:50:12 -0400
Subject: Re: Metadata support in bup-fuse
On 12-07-14 04:38 PM, Rob Browning wrote:

> Though now that I think about it -- I wonder if we could just implement
> the dumb caching approach for now, and add indexing (or something else
> smarter) when we get a chance.

> For example, if the fs_item.metadata() API seems reasonable, then for
> the moment, I could just implement it via (potentially expensive)
> on-demand caching, and later, we could be smarter.

> Come to think of it, if we do end up with indexes, I suppose as a first
> pass, they could also be generated on-demand, the first time we retrieve
> an object from that .bupm.

sounds like a reasonable plan. we'll have at least something to use for
exposing data in the front-ends and we can optimize later, since adding
an index means more thinking and more work.

--
Gabriel Filion


 
From: Rob Browning <r...@defaultvalue.org>
Date: Sat, 14 Jul 2012 16:00:58 -0500
Subject: Re: Metadata support in bup-fuse

Gabriel Filion <lelu...@gmail.com> writes:
> sounds like a reasonable plan. we'll have at least something to use for
> exposing data in the front-ends and we can optimize later, since adding
> an index means more thinking and more work.

OK, well, the newer metadata bits are still in their own branch for now
anyway, so I think I'll implement the simple approach I initially
mentioned, and push it so those interested can try it out.  We can
always rework whatever's broken.

I've also nearly finished some preliminary support for "save [[--ignore
<pattern>] ...]" which works more or less like top-level gitignore
patterns.  I'll probably post that to the list soon, so people can take
a look.  At the moment, it requires a system-level (i.e. non-python)
fnmatch().

Thanks
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4


 
From: Rob Browning <r...@defaultvalue.org>
Date: Mon, 16 Jul 2012 21:28:48 -0500
Subject: Re: Metadata support in bup-fuse

Gabriel Filion <lelu...@gmail.com> writes:
> sounds like a reasonable plan. we'll have at least something to use for
> exposing data in the front-ends and we can optimize later, since adding
> an index means more thinking and more work.

OK, I got it working at debconf.  I'll probably push a patch for
evaluation in the next week.  The interface is just meta =
vfs_obj.metadata(), and then you can call meta.uid, etc.
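
For a sense of how a front-end might consume that interface, here is a small
sketch of an "ls -l"-style line built from meta. Only metadata() and meta.uid
come from the description above; FakeVfsObj, the name attribute, and the
mode, gid, and size attributes are assumptions made so the example can run on
its own.

```python
import stat
from types import SimpleNamespace


class FakeVfsObj:
    """Hypothetical stand-in for a vfs node exposing metadata()."""

    def __init__(self, name, meta):
        self.name = name
        self._meta = meta

    def metadata(self):
        return self._meta


def long_listing(vfs_obj):
    """Format one listing line from the object's metadata."""
    meta = vfs_obj.metadata()
    return "%s %5d %5d %8d %s" % (
        stat.filemode(meta.mode), meta.uid, meta.gid, meta.size, vfs_obj.name)


node = FakeVfsObj("notes.txt", SimpleNamespace(
    mode=stat.S_IFREG | 0o644, uid=1000, gid=1000, size=42))
print(long_listing(node))
```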

--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4


 
From: Gabriel Filion <lelu...@gmail.com>
Date: Tue, 17 Jul 2012 19:25:12 -0400
Subject: Re: Metadata support in bup-fuse
On 12-07-16 10:28 PM, Rob Browning wrote:

> Gabriel Filion <lelu...@gmail.com> writes:
>> sounds like a reasonable plan. we'll have at least something to use for
>> exposing data in the front-ends and we can optimize later, since adding
>> an index means more thinking and more work.

> OK, I got it working at debconf.  I'll probably push a patch for
> evaluation in the next week.  The interface is just meta =
> vfs_obj.metadata(), and then you can call meta.uid, etc.

very nice, I'll be checking it out when you can send it.

I couldn't work on anything useful during hope. but once you send your
patch, I'll make sure to try and allocate some time to test it out and
to base my work on it to finish bup ls -l

--
Gabriel Filion


 
From: Rob Browning <r...@defaultvalue.org>
Date: Sat, 21 Jul 2012 16:56:27 -0500
Subject: Re: Metadata support in bup-fuse

Gabriel Filion <lelu...@gmail.com> writes:
> very nice, I'll be checking it out when you can send it.

> I couldn't work on anything useful during hope. but once you send your
> patch, I'll make sure to try and allocate some time to test it out and
> to base my work on it to finish bup ls -l

OK, I've pushed node_dir_file_or_symlink.metadata() to tmp/pending/meta
here:

    http://git.debian.org/?p=users/rlb/bup.git
    clone URLs:
      git://anonscm.debian.org/users/rlb/bup.git
      git+ssh://git.debian.org/git/users/rlb/bup.git
      http://anonscm.debian.org/git/users/rlb/bup.git

Consider it an initial attempt, and let me know if you have any
trouble.

I've also finished (but not included) initial support for
a gitignore(5)-style "bup index --ignore ..." facility:

  bup index --ignore '/mnt/*' --ignore '/proc/*' --ignore '*.tmp' ...

The behavior is intended to be mostly identical to gitignore,
except that a trailing "/*" will ignore the contents of a directory, but
not the directory itself.  The current matcher probably needs help -- it
just splits complex paths on '/' and otherwise relies on python's
fnmatch.
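
A matcher along those lines might look roughly like the sketch below. This
is not the actual patch: is_ignored() and _matches_at() are hypothetical
names, and treating a leading "/" as "anchor at the root" while letting
other patterns match at any depth is a simplifying assumption. It does
reproduce the stated rule that a trailing "/*" ignores a directory's
contents but not the directory itself.

```python
from fnmatch import fnmatchcase


def _matches_at(path_parts, pat_parts, start):
    """True if every pattern component matches the path components at start."""
    window = path_parts[start:start + len(pat_parts)]
    return len(window) == len(pat_parts) and all(
        fnmatchcase(comp, pat) for comp, pat in zip(window, pat_parts))


def is_ignored(path, patterns):
    """Split path and patterns on '/' and compare component by component."""
    path_parts = [p for p in path.split("/") if p]
    for pattern in patterns:
        pat_parts = [p for p in pattern.split("/") if p]
        if not pat_parts:
            continue
        # Patterns with a leading '/' may only match from the root.
        starts = [0] if pattern.startswith("/") else range(len(path_parts))
        if any(_matches_at(path_parts, pat_parts, s) for s in starts):
            return True
    return False


patterns = ["/mnt/*", "/proc/*", "*.tmp"]
for candidate in ["/mnt", "/mnt/usb/file", "/home/me/a.tmp", "/home/mnt/x"]:
    print(candidate, is_ignored(candidate, patterns))
```

Because a matched prefix is enough, anything below an ignored entry is
ignored as well, while "/mnt" itself has too few components to match
"/mnt/*" and is kept.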

I'll post separately about that soon (and send the patch to the list,
since it's not metadata-specific).

I think that may be the last thing I really needed (though "bup gc"
would be nice), before starting to test with real backups here.

Thanks
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4


 