multiple concurrent bup-splits fail, bup-midx issue


Joey Hess

Aug 8, 2022, 11:23:16 AM
to bup-list
Hi, I'm forwarding a bup problem that a user reported as a bug on git-annex, which can use bup as one of its storage methods. The user was running a git-annex command that uses bup-split. They had enabled two concurrent jobs. With this concurrency, bup-split often failed with a backtrace.

You can see the backtraces here


I'm thinking this is a bup-midx issue, because the backtraces each contain "midx", although there are several different patterns of the backtrace.

Also, to the limited extent I understand bup-midx, it sounds like it might delete idx files and write midx files, which might not work well if there are two processes doing that at the same time.

I am probably going to disable concurrent calls to bup-split in git-annex, but I wanted to report this bug to you in case there is an easy fix.

Greg Troxel

Aug 8, 2022, 12:15:37 PM
to Joey Hess, bup-list

Joey Hess <joey...@gmail.com> writes:

> I am probably going to disable concurrent calls to bup-split in git-annex,
> but I wanted to report this bug to you in case there is an easy fix.

Rob will surely explain better but:

IMHO concurrency is a weak spot in bup and most users don't run bup in
parallel. I have constructed my backup scripts to avoid it.

there aren't clear specs for when this is safe

I'm not at all sure there is locking against what is documented to be
safe

your plan of only invoking anything bup once, basically putting a lock
around any bup invocations so only one happens at a time, sounds like
exactly the right thing to do
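
The "lock around any bup invocations" idea above could be sketched roughly as follows. This is only an illustration, not something bup or git-annex provides: the helper names `exclusive_lock` and `run_serialized` and the lock file path are assumptions made for the example, and it relies on `flock`, which is advisory and may not behave reliably on some remote filesystems.

```python
import fcntl
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block.

    Hypothetical helper for illustration; not part of bup.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until no other process holds a lock on this file.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)


def run_serialized(argv, lock_path):
    """Run argv as a subprocess while holding the lock; return its exit code."""
    with exclusive_lock(lock_path):
        return subprocess.run(argv).returncode
```

A caller would pass whatever bup command line it already uses as `argv`, and every caller sharing the same repository would have to agree on the same `lock_path` for the serialization to have any effect.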


Greg Troxel

Aug 8, 2022, 12:17:41 PM
to Joey Hess, bup-list

Greg Troxel <g...@lexort.com> writes:

> I'm not at all sure there is locking against what is documented to be
> safe

Sorry, that was incoherent. I meant that in an ideal world, if two bups
accessing a bupdir were unsafe, there would be a big lock
acquired/released around each invocation, to brute-force our way back to
soundness even if it was really slow, but I don't think such a lock
exists. (It's also hard with remote filesystems, but the main bup path
is to run a bup server near the remote disk.)

Joey Hess

Aug 8, 2022, 4:09:50 PM
to bup-list
Thanks for confirming that bup-split is probably not concurrency safe.

Do you think that an operation like bup-join that (I assume) only reads
existing data is likely to be safe to run several instances concurrently?

--
see shy jo

Greg Troxel

Aug 8, 2022, 8:10:57 PM
to Joey Hess, bup-list

Joey Hess <i...@joeyh.name> writes:

> Thanks for confirming that bup-split is probably not concurrency safe.

I'm only 97% sure it's not safe, but yeah.

> Do you think that an operation like bup-join that (I assume) only reads
> existing data is likely to be safe to run several instances concurrently?

I think that is going to be ok. I would say if bup join writes at all,
that's a bug, and I can't see how read-read would go wrong.

So one writer, or N readers, but a writer and some readers seems not
ok.
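
The "one writer, or N readers" rule is the classic shared/exclusive lock policy. A minimal sketch of how a wrapper could enforce it, assuming a lock file that all callers agree on (bup itself does not do this as far as this thread establishes; `repo_lock` and `try_lock` are hypothetical names):

```python
import fcntl
import os
from contextlib import contextmanager


def _mode(writer):
    # Writers need exclusive access; readers can share the lock with each other.
    return fcntl.LOCK_EX if writer else fcntl.LOCK_SH


@contextmanager
def repo_lock(lock_path, writer):
    """Block until a shared (reader) or exclusive (writer) lock is held."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, _mode(writer))
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def try_lock(lock_path, writer):
    """Non-blocking probe: True if the lock could be taken right now.

    The lock is released again before returning.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, _mode(writer) | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False
    finally:
        os.close(fd)
```

Under this scheme a read-only operation would run inside `repo_lock(path, writer=False)` and a writing one inside `repo_lock(path, writer=True)`, so readers overlap freely while a writer excludes everyone else.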

I suspect the fundamental issue is that as bup saves, it

writes a tmp packfile, and this should be ~ok because the random-ish
names will ~never collide

moves that into place, and that should be ok given naming

creates (sometimes, not sure) midx files that aggregate the idx files
to make object lookup work. This involves reading all the idx files
and writing a midx, and that seems to be what is colliding. The obvious
plan is to have a lock for writing to midx, and I think the scary part
is that locking is not universally reliable -- but I think that if it
worked when POSIX locks work, it would be a huge step forward.

Keep in mind that I am a little fuzzy on all of this.
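
To make the three steps above concrete, here is a toy model of the pattern: write to a randomly named temp file, rename it into place, then rebuild an aggregate file while holding a lock. This is not bup's code or its file formats. The `.dat` files stand in for packfiles/idx files, `summary.txt` stands in for a midx, and all function and file names are assumptions for illustration.

```python
import fcntl
import os
import tempfile


def add_file_atomically(directory, data, suffix=".dat"):
    """Write data to a randomly named temp file, then rename it into place."""
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    # Keep the random stem, swap the extension; readers never see a partial file.
    final_path = tmp_path[: -len(".tmp")] + suffix
    os.rename(tmp_path, final_path)
    return final_path


def rebuild_summary(directory, suffix=".dat", summary_name="summary.txt"):
    """Aggregate the names of all data files into one summary file, under a lock."""
    lock_fd = os.open(
        os.path.join(directory, "summary.lock"), os.O_RDWR | os.O_CREAT, 0o600
    )
    try:
        # Only one process at a time may scan the directory and rewrite the summary.
        fcntl.flock(lock_fd, fcntl.LOCK_EX)
        names = sorted(n for n in os.listdir(directory) if n.endswith(suffix))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w") as f:
            f.write("\n".join(names))
        # Replace the old summary in one step so readers see old or new, never half.
        os.replace(tmp_path, os.path.join(directory, summary_name))
        return names
    finally:
        os.close(lock_fd)  # closing the descriptor releases the lock
```

The first two steps need no lock because the random names keep writers out of each other's way; only the aggregate step, where two processes would otherwise read and rewrite the same file, is serialized.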

Rob Browning

Aug 9, 2022, 9:12:35 PM
to Greg Troxel, Joey Hess, bup-list
Greg Troxel <g...@lexort.com> writes:

> Joey Hess <i...@joeyh.name> writes:
>
>> Thanks for confirming that bup-split is probably not concurrency safe.
>
> I'm only 97% sure it's not safe, but yeah.

I don't think we promise anything there yet, though I'd love to have
time to review and make very clear what's supported in general (and/or
just make it safe where possible). For now, as Greg says, I'd assume
it's possibly not.

>> Do you think that an operation like bup-join that (I assume) only reads
>> existing data is likely to be safe to run several instances concurrently?
>
> I think that is going to be ok. I would say if bup join writes at all,
> that's a bug, and I can't see how read-read would go wrong.

I also think it's quite likely, though without careful checking, I can't
say for absolute certain. The packfiles and packfile indexes should
definitely be immutable, excepting any obviously destructive operations
like rm or gc, so if join doesn't/can't refer to anything else, it
should be safe.

--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4