Thanks for the pointers. I've managed to solve the performance issue,
through two things:
1. I wrote a simple seccomp wrapper that just silently ignores calls to
fsync & sync. Obviously I don't have any intention of using this
after the initial import; I'm not crazy. But as expected, this sped
things up a lot.
2. The bigger difference came from switching to diskpacked storage. Is
there a reason this isn't the default?
It managed to get through copying about 20GiB of data while I was in the
shower, so I think this solves my immediate issue.
Thanks again,
-Ian
Quoting Brad Fitzpatrick (2019-05-03 15:16:25)
> Perkeep in general is very (too) aggressive at fsyncing per blob, and
> it cuts up files into lots of small blobs, so importing lots of data is
> slow. There's a plan to fix this, but life (baby) got in the way, so
> it's kinda on hold until I find a few minutes to think. The two
> high-level plans are to let clients specify transactions implicitly or
> explicitly: implicitly = one multipart/mime POST of a bunch of blobs is
> one transaction so should be 1 fsync, not 1 per blob, serially. The
> more complex one involves API changes and lets clients create their own
> transactions and associate, say, a whole file or directory upload with
> that transaction, and then wait on all the associated blobs to be
> committed (fsynced, or whatever blob storage impl requires) before
> noting that it's good locally.
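
A rough illustration of the implicit-transaction idea above, not Perkeep's
actual code: append a batch of blobs to a single pack file and fsync either
after every blob or once for the whole batch. The names append_blobs and demo,
and the single-pack-file layout, are assumptions made for this sketch.

```python
import os
import tempfile


def append_blobs(pack_path, blobs, fsync_per_blob):
    """Append blobs to one pack file; return how many fsync calls were made.

    Hypothetical helper for illustration only; this is not Perkeep's code.
    """
    fsyncs = 0
    with open(pack_path, "ab") as pack:
        for blob in blobs:
            pack.write(blob)
            if fsync_per_blob:
                # Durable after every blob: flush Python's buffer, then fsync.
                pack.flush()
                os.fsync(pack.fileno())
                fsyncs += 1
        if not fsync_per_blob and blobs:
            # Treat the whole batch as one transaction: a single fsync.
            pack.flush()
            os.fsync(pack.fileno())
            fsyncs += 1
    return fsyncs


def demo():
    # 100 small blobs of 1 KiB each, written both ways into a temp directory.
    blobs = [bytes([i % 256]) * 1024 for i in range(100)]
    with tempfile.TemporaryDirectory() as tmp:
        path_a = os.path.join(tmp, "a.pack")
        path_b = os.path.join(tmp, "b.pack")
        per_blob = append_blobs(path_a, blobs, fsync_per_blob=True)
        batched = append_blobs(path_b, blobs, fsync_per_blob=False)
        sizes = (os.path.getsize(path_a), os.path.getsize(path_b))
    return per_blob, batched, sizes
```

Both variants write the same bytes; only the number of fsync calls differs,
which is where the speedup from batching would be expected to come from. The
batched variant only promises durability once the final fsync has returned.
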
> As for (2), though, pk-put won't repeat any work it's done. It'll still
> walk your local filesystem to see what's there, but it'll learn that
> it's already uploaded from either its local cache or from the server
> before it uploads chunks again.
> So it might be slow (throughput wise) but holding 2TB should be no
> problem, and auto-resume should work. If you run with the pk-put
> verbose option it'll show lots of stats about where each phase is
> at.
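
The resume behaviour described above can be sketched roughly like this:
address each chunk by its content hash, and skip the upload when either a
local cache or the server already has that hash. Everything here
(upload_missing, the server_has and upload callbacks, the "sha224-" prefix)
is a hypothetical stand-in for illustration, not pk-put's real code or API.

```python
import hashlib


def blob_ref(data):
    # Content address of a chunk. The hash choice is an assumption here.
    return "sha224-" + hashlib.sha224(data).hexdigest()


def upload_missing(chunks, local_cache, server_has, upload):
    """Upload only chunks that neither the cache nor the server knows about.

    local_cache is a set of refs, server_has(ref) returns a bool, and
    upload(ref, data) stores the chunk. Returns the number of uploads done.
    """
    uploaded = 0
    for data in chunks:
        ref = blob_ref(data)
        if ref in local_cache:
            continue  # Known locally: no network round trip needed.
        if server_has(ref):
            local_cache.add(ref)  # Remember it so the next run is cheaper.
            continue
        upload(ref, data)
        local_cache.add(ref)
        uploaded += 1
    return uploaded
```

For example, with a dict standing in for the server, a second run over the
same chunks uploads nothing, even with an empty local cache:

    server = {}
    upload_missing([b"a", b"b"], set(), server.__contains__, server.__setitem__)
    upload_missing([b"a", b"b"], set(), server.__contains__, server.__setitem__)
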
>
> On Fri, May 3, 2019 at 11:13 AM Ian Denhardt <[1] i...@zenhack.net>
>
> --
> You received this message because you are subscribed to the Google
> Groups "Perkeep" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to [4] perkeep+u...@googlegroups.com.
> For more options, visit [5] https://groups.google.com/d/optout.
>
> References
>
> 1. mailto:i...@zenhack.net
> 2. mailto:perkeep%2Bunsu...@googlegroups.com
> 3. https://groups.google.com/d/optout
> 4. mailto:perkeep+u...@googlegroups.com
> 5. https://groups.google.com/d/optout