MogileFS::Server 2.60 going up (checksum support!)


dormando

Mar 30, 2012, 6:33:49 PM
to mog...@googlegroups.com
Yo,

MogileFS::Client 1.16, MogileFS::Utils 2.23, and ::Server 2.60 are going
up to CPAN right now.

2.60 is the "checksums" release, which Eric Wong has graciously
contributed to the project. He's been tirelessly prodding at it since...
November, I think :)

I've spent some time hacking at it as well, but please be wary as with all
new batches of code.

For the next few releases, we reserve the right to make significant
changes to the checksums code, or how it operates. We want to hear your
feedback on its functionality and if it is flexible enough for your uses.

There is some preliminary documentation in docs/checksums.txt. Eric will
be posting a wiki page when he gets back (I hope!).

The one thing worth noting that's not in the file is the "fsck_checksum"
option:

$ mogadm settings set fsck_checksum [off|class|MD5]

If set to "off", it will force fsck to not test checksums even if they are
specified in a class.

If set to "MD5", fsck will compare checksums on all files even if you
store none in your database. If you wish to evaluate the checksums feature,
this is a great way to take a look. It will log errors about any
inconsistencies with the checksums of your stored files. If you find
enough, maybe consider enabling it via the class.

Checksums adds a new table, so it will store one extra row of information
per fid stored in the database. This may be too costly for many users, so
the feature is entirely optional. If not enabled, it does not store
any extra data.

have fun,
-Dormando

dormando

Mar 30, 2012, 6:38:53 PM
to mog...@googlegroups.com
Looks like he posted some preliminary documentation:

http://code.google.com/p/mogilefs/wiki/Checksums

I hope he doesn't mind me linking it here :P May be slightly inaccurate or
otherwise missing data.

Eric Wong

Apr 11, 2012, 6:49:11 PM
to mog...@googlegroups.com
dormando <dorm...@rydia.net> wrote:
> Looks like he posted some preliminary documentation:
>
> http://code.google.com/p/mogilefs/wiki/Checksums
>
> I hope he doesn't mind me linking it here :P May be slightly inaccurate or
> otherwise missing data.

I don't mind at all :)

I think it includes everything necessary for Perl users, though the Perl
API could probably be made easier-to-use wrt checksums.

I mainly use the Ruby client bindings, and I think I still need to
document how that uses checksums though it's had checksum support for a
while...

I'll let others deal with Python/PHP/Java/etc clients since I don't use
them.

dormando

Apr 12, 2012, 1:09:06 AM
to mog...@googlegroups.com
> dormando <dorm...@rydia.net> wrote:
> > Looks like he posted some preliminary documentation:
> >
> > http://code.google.com/p/mogilefs/wiki/Checksums
> >
> > I hope he doesn't mind me linking it here :P May be slightly inaccurate or
> > otherwise missing data.
>
> I don't mind at all :)

Welcome back!

> I think it includes everything necessary for Perl users, though the Perl
> API could probably be made easier-to-use wrt checksums.

Suggestions?

> I mainly use the Ruby client bindings, and I think I still need to
> document how that uses checksums though it's had checksum support for a
> while...

yay!

> I'll let others deal with Python/PHP/Java/etc clients since I don't use
> them.

You want to link it into the rest of the wiki? or would you like me to
decide where that goes?

Eric Wong

Apr 12, 2012, 3:40:28 AM
to mog...@googlegroups.com
dormando <dorm...@rydia.net> wrote:
> Welcome back!

Thanks, still a bit deaf and disoriented and will likely remain
so for a bit :x

> > I think it includes everything necessary for Perl users, though the Perl
> > API could probably be made easier-to-use wrt checksums.
>
> Suggestions?

The Ruby client library can automatically calculate the checksum
as it uploads the file, so there's no need to manually checksum
files yourself.

It can send Content-MD5 as an HTTP trailer (if configured to do so, since
few HTTP servers support it), too. This is very useful for things that
are streamed (or very large), since you only need to read them once.

> > I'll let others deal with Python/PHP/Java/etc clients since I don't use
> > them.
>
> You want to link it into the rest of the wiki? or would you like me to
> decide where that goes?

I'll let you decide, I have enough trouble deciding how to organize
code already :>

David Birdsong

Jan 12, 2015, 3:34:27 PM
to mog...@googlegroups.com
Eric, what got decided about being able to send an HTTP trailer to cmogstored? I'm streaming writes and I'd prefer not to buffer the files in service of calculating checksums. Sending a trailer would be great.

Dave Lambley

Jan 12, 2015, 5:39:31 PM
to mog...@googlegroups.com
On 12 January 2015 at 20:34, David Birdsong <david.b...@gmail.com> wrote:
> Eric, what got decided about being able to send an HTTP trailer to
> cmogstored? I'm streaming writes and I'd prefer not to buffer the files in
> service of calculating checksums. Sending a trailer would be great.


To get files protected by the MogileFS checksum, you can supply the checksum in the create_close message, which is sent after the completion of the HTTP upload. You do not need to use the Content-MD5 header, handy though it is. This bit of Perl performs the checksum in a child process while uploading:

https://metacpan.org/source/DLAMBLEY/MogileFS-Client-Async-0.030/lib/MogileFS/Client/CallbackFile.pm#L355

... but you could alternatively run every block through MD5 before write()ing it to the socket.  You can optionally make create_close block on checksumming, depending on your attitude to risk.  Beware that if it takes longer than the command timeout to checksum your file server-side, blocking will cause create_close to apparently fail.
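
The "run every block through MD5 before write()ing it" approach can be sketched roughly as follows. This is a minimal Python illustration, not code from any MogileFS client: `HashingWriter` and `copy_with_md5` are hypothetical names, `io.BytesIO` stands in for the upload socket, and the `MD5:<hex>` string printed at the end is an assumption about how the checksum argument to create_close is formatted (check docs/checksums.txt for your version).

```python
import hashlib
import io


class HashingWriter:
    """Wrap a writable object and update an MD5 digest for every block written."""

    def __init__(self, sink):
        self._sink = sink
        self._md5 = hashlib.md5()

    def write(self, block):
        # Hash the block first, then pass it on unchanged.
        self._md5.update(block)
        return self._sink.write(block)

    def hexdigest(self):
        return self._md5.hexdigest()


def copy_with_md5(source, sink, block_size=64 * 1024):
    """Stream source to sink in blocks; return the hex MD5 of what was sent."""
    writer = HashingWriter(sink)
    while True:
        block = source.read(block_size)
        if not block:
            break
        writer.write(block)
    return writer.hexdigest()


if __name__ == "__main__":
    # io.BytesIO stands in for the upload socket (hypothetical setup).
    sink = io.BytesIO()
    digest = copy_with_md5(io.BytesIO(b"hello world"), sink)
    # Assumed format of the checksum value handed to create_close.
    print("MD5:" + digest)
```

The file is read only once and nothing is buffered beyond a single block. With a real socket, wrapping it via `socket.makefile("wb")` would give an object with a compatible `write()`.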

Dave

David Birdsong

Jan 12, 2015, 6:37:38 PM
to mog...@googlegroups.com
This part, 'you could alternatively run every block through MD5 before write()ing it to the socket', is exactly what I'm doing while writing to (c)mogstored. If I buffer the request and calculate the checksum, I can get (c)mogstored to refuse any files I try to PUT to it if they're corrupted. I don't recall which 4xx error it is, but it's the exact behavior I'd like.

If I could attach a trailer, I could avoid the buffering.

The rest about supplying the checksum to create_close I'm familiar with.




Eric Wong

Jan 12, 2015, 7:05:44 PM
to mog...@googlegroups.com
David Birdsong <david.b...@gmail.com> wrote:
> Eric, what got decided about being able to send an HTTP trailer to
> cmogstored? I'm streaming writes and I'd prefer not to buffer the files in
> service of calculating checksums. Sending a trailer would be great.

cmogstored has always had trailer support. There was a bug in the early
days, but that was fixed in the 0.7.0 release.

David Birdsong

Jan 12, 2015, 7:11:58 PM
to mog...@googlegroups.com
I'm new to HTTP trailers. Since I've got your attention, one more QQ.

Does this require chunked-encoding PUTs?

Eric Wong

Jan 12, 2015, 8:20:22 PM
to mog...@googlegroups.com
David Birdsong <david.b...@gmail.com> wrote:
> I'm new to HTTP trailers. Since I've got your attention, one more QQ.
>
> Does this require chunked-encoding PUTs?

Yes, it's in the HTTP 1.x specs.
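
For anyone else new to trailers, here is a rough Python sketch of what a chunked PUT body with a Content-MD5 trailer looks like on the wire. It is only an illustration of the framing, not code from any MogileFS client: the function names are made up, and the host and path passed to `request_head` are placeholders you would replace with whatever the tracker hands back.

```python
import base64
import hashlib
import io


def request_head(host, path):
    """Build the request line and headers announcing a chunked body and a trailer."""
    return (
        "PUT {} HTTP/1.1\r\n"
        "Host: {}\r\n"
        "Transfer-Encoding: chunked\r\n"
        "Trailer: Content-MD5\r\n"
        "\r\n"
    ).format(path, host).encode("ascii")


def chunked_body_with_md5_trailer(source, block_size=64 * 1024):
    """Yield the bytes of a chunked request body ending in a Content-MD5 trailer."""
    md5 = hashlib.md5()
    while True:
        block = source.read(block_size)
        if not block:
            break
        md5.update(block)
        # Each chunk: size in hex, CRLF, data, CRLF.
        yield b"%x\r\n" % len(block) + block + b"\r\n"
    # Content-MD5 carries the base64 of the binary digest (RFC 1864).
    content_md5 = base64.b64encode(md5.digest())
    # Last chunk (size 0), then the trailer field, then a blank line.
    yield b"0\r\n" + b"Content-MD5: " + content_md5 + b"\r\n\r\n"


if __name__ == "__main__":
    # Hypothetical host and path, for illustration only.
    print(request_head("storage.example", "/placeholder/path").decode("ascii"))
    for piece in chunked_body_with_md5_trailer(io.BytesIO(b"hello world"), 5):
        print(repr(piece))
```

The digest is only known after the last block has been read, which is why it has to travel after the body, and why the body has to be chunked rather than sent with a Content-Length.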