Resumable pk put?


Ian Denhardt

May 3, 2019, 2:13:44 PM
to per...@googlegroups.com
Hey All,

I have about 2TB of files that I'm looking at importing into perkeep. I
have a couple questions.

First, do others have experience they can share re: how perkeep performs
holding this much data? From what I've read it sounds like
architecturally it should be manageable, but I'd like to know if anyone
can say how that's worked out in practice for them.

Assuming this is realistic, I have some logistical questions about
getting the data in there in the first place.

I left a pk-put going on a large sub-tree last night and came back to
it today. It had spent about 12 hours copying things before finally
running into some hiccough uploading a particular file (I don't have
the error message recorded, but it was something along the lines of
"server did not receive blob"). Trying to upload that file again worked
fine, so I assume it was some transient thing.

During the transfer, usage on the drives holding the blobs grew by about
80 GiB. This is transferring data between two hard drives connected to
the same machine via USB 3.0. Questions:

1. Is that kind of performance normal for pk-put?
2. Is there currently any way to do a "resumable" version of pk-put,
where it can quickly pick up where it left off?

If the answer to (2) is no, I might be interested in contributing such a
feature, and would appreciate pointers as to where to start.

Thanks.

-Ian

Brad Fitzpatrick

May 3, 2019, 3:16:40 PM
to per...@googlegroups.com
Perkeep in general is very (too) aggressive about fsyncing per blob, and it cuts files up into lots of small blobs, so importing lots of data is slow. There's a plan to fix this, but life (baby) got in the way, so it's kinda on hold until I find a few minutes to think.

The two high-level plans are to let clients specify transactions implicitly or explicitly. Implicitly: one multipart/mime POST of a bunch of blobs is one transaction, so it should be 1 fsync, not 1 per blob, serially. The more complex, explicit one involves API changes and lets clients create their own transactions and associate, say, a whole file or directory upload with that transaction, and then wait on all the associated blobs to be committed (fsynced, or whatever the blob storage impl requires) before noting that it's good locally.

As for (2), though, pk-put won't repeat any work it's already done. It'll still walk your local filesystem to see what's there, but it'll learn what's already been uploaded, from either its local cache or from the server, before it uploads any chunks again.

So it might be slow (throughput-wise), but holding 2TB should be no problem, and auto-resume should work. If you run pk-put with the verbose option, it'll show lots of stats about where the various phases are at.





--
You received this message because you are subscribed to the Google Groups "Perkeep" group.
To unsubscribe from this group and stop receiving emails from it, send an email to perkeep+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Eric Drechsel

May 3, 2019, 3:17:14 PM
to per...@googlegroups.com
As I understand it, puts of existing blobs don't actually transfer the bytes, but since most of the time (with a local transfer) is taken by hashing, that doesn't speed things up much.

The only way I can think of to speed that up would be to somehow cache the file hashes (doesn't ZFS support storing hashes? Maybe that could be used as a fast path for hashing?)



--
best, Eric
eric.pdxhub.org

Brad Fitzpatrick

May 3, 2019, 3:18:26 PM
to per...@googlegroups.com
There's a local cache for the local hashing too, though. If the file's stat metadata doesn't change at all (inode, mtime, size, ctime, etc) then it's not re-digested.

Eric Drechsel

May 3, 2019, 3:19:35 PM
to per...@googlegroups.com
Ah, good to know. So it should resume fast then?

Brad Fitzpatrick

May 3, 2019, 3:36:24 PM
to per...@googlegroups.com
It still needs to do some stats and readdirs and local key/value database lookups to find out where to resume, but yeah.

Ralph Corderoy

May 4, 2019, 4:53:43 AM
to per...@googlegroups.com
Hi Ian,

> This is transferring data between two hard drives connected to the
> same machine via USB 3.0.

That could well be a bottleneck. Though parts of the equipment might be
USB 3.0, that doesn't mean the data path all the way through to the
drives uses USB 3.0's SuperSpeed transfer type at 5 Gbit/s. `lsusb -t'
is one way to check, following the ancestors up from the drive; the
speed is at the end of each line: 5000M, 480M, or 12M.

--
Cheers, Ralph.

clive.boulton

May 4, 2019, 1:59:03 PM
to Perkeep
The ZFS best-practices talk at LinuxFest Northwest last weekend was packed; I recall many tips shared for ZFS throughput. Jim Salter (@jrssnet) also admins the ZFS community on Reddit.
https://jrs-s.net/presentations/2019-LFNW-ZFS-Best-Practices/img0.html

Ian Denhardt

May 6, 2019, 7:24:00 PM
to Brad Fitzpatrick, per...@googlegroups.com
Thanks for the pointers. I've managed to solve the performance issue
through two things:

1. I wrote a simple seccomp wrapper that just silently ignores calls to
fsync & sync. Obviously I don't have any intention of using this
after the initial import; I'm not crazy. But as expected, this sped
things up a lot.
2. The bigger difference came from switching to diskpacked storage. Is
there a reason this isn't the default?

It managed to get through copying about 20 GiB of data while I was in
the shower, so I think this solves my immediate issue.

Thanks again,

-Ian


ta...@gulacsi.eu

May 7, 2019, 12:15:33 AM
to Brad Fitzpatrick, per...@googlegroups.com
Diskpacked uses a few big files, so it is gentler to the filesystem.

The best combination is blobpacked storage, with filepacked small- and diskpacked large-blob backends.

This ensures that the blobs are packed, zipped, and close together, but uses fewer files.

As far as I know, this is only achievable through the low-level config right now.
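For reference, the wiring described above would live under the blob-server prefix of the low-level config, along these lines. Caveat: the key names and layout below are from memory of the blobpacked handler and are best verified against pkg/blobserver/blobpacked and a server-generated config before use:

```json
"/bs/": {
    "handler": "storage-blobpacked",
    "handlerArgs": {
        "smallBlobs": "/bs-loose/",
        "largeBlobs": "/bs-packed/",
        "metaIndex": {
            "type": "leveldb",
            "file": "/home/you/var/perkeep/packindex.leveldb"
        }
    }
}
```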

Tamás Gulácsi





Brad Fitzpatrick

May 7, 2019, 1:23:26 AM
to per...@googlegroups.com
On Mon, May 6, 2019 at 4:24 PM Ian Denhardt <i...@zenhack.net> wrote:
> 2. The bigger difference came from switching to diskpacked storage. Is
>    there a reason this isn't the default?

diskpacked is good for write throughput, but not great for reads (often bad locality). blobpacked (the default) has perfect locality for files and has fast paths that cut through a bunch of the layering for sequential reads (e.g. downloading a file that is otherwise in thousands of logical blobs), but it does more work on uploads. When your data grows slowly over time, that makes sense. When you're mass-importing data into it, it's not ideal.


Ian Denhardt

May 7, 2019, 10:32:14 AM
to per...@googlegroups.com
Quoting Brad Fitzpatrick (2019-05-07 01:23:13)


Thanks for the explanation, this makes sense.

-Ian