Re: Question about "stale file handle error"


Martin Fick

Dec 7, 2016, 6:32:46 PM
to repo-d...@googlegroups.com, Hugo Arès
On Wednesday, December 07, 2016 04:19:15 PM you wrote:
> Hi Martin,
>
> I know we discussed that at the last hackathon, but I do
> not remember how you fixed that one on your side.
>
> How did you fix the "stale file handle error" happening on
> slaves while uploading a pack? I mean when the upload is
> already started.
>
> Thanks
> Hugo

Hugo,

What we do is rename/move packfiles that are supposed to be
deleted to preserve them for one extra repacking iteration
before deleting them.

Specifically, we added (and use on every invocation) the
following two switches to the git repack shell script (we
use an old version of git when it still was a shell script).

-p  prune oldpack files from old-packs subdir
-k  keep oldpack files around in old-packs subdir

-p happens first, and it deletes the contents of the
$PACKDIR/old-packs directory.

-k happens next, and it moves the pack (and index) files to be
deleted to $PACKDIR/old-packs and alters their extensions to
be prefixed with "old-", so ".old-pack" and ".old-idx". By
renaming the extensions, the old files avoid being captured
by a "find" looking for files with the original extensions.
Placing the "old-packs" dir under the normal $PACKDIR helps
ensure the move will not cross filesystems.
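
To make the ordering of the two steps concrete, here is a minimal
Python sketch of the same idea. It is not our actual patch (that is
a change to the git repack shell script); the function names
prune_old_packs, preserve_pack and demo, and the pack name
"pack-abc", are made up for illustration.

```python
import os
import tempfile
from pathlib import Path

OLD_DIR_NAME = "old-packs"  # subdir of the pack directory, as in the description above


def prune_old_packs(pack_dir: Path) -> None:
    """The -p step: delete whatever the previous run preserved."""
    old_dir = pack_dir / OLD_DIR_NAME
    if old_dir.is_dir():
        for entry in old_dir.iterdir():
            if entry.is_file():
                entry.unlink()


def preserve_pack(pack_dir: Path, pack_path: Path) -> Path:
    """The -k step: move a redundant file into old-packs instead of deleting it.

    ".pack" becomes ".old-pack" and ".idx" becomes ".old-idx", so a scan
    for the original extensions no longer finds the file.
    """
    old_dir = pack_dir / OLD_DIR_NAME
    old_dir.mkdir(exist_ok=True)
    new_suffix = ".old-" + pack_path.suffix.lstrip(".")
    target = old_dir / (pack_path.stem + new_suffix)
    # Same parent filesystem, so this is a plain rename and not a copy.
    os.replace(pack_path, target)
    return target


def demo() -> list:
    """Run prune-then-preserve once on a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        pack_dir = Path(tmp)
        for name in ("pack-abc.pack", "pack-abc.idx"):
            (pack_dir / name).write_bytes(b"data")
        prune_old_packs(pack_dir)  # -p first
        for name in ("pack-abc.pack", "pack-abc.idx"):
            preserve_pack(pack_dir, pack_dir / name)  # then -k
        return sorted(p.name for p in (pack_dir / OLD_DIR_NAME).iterdir())


if __name__ == "__main__":
    print(demo())
```

Pruning before preserving is what gives each old pack exactly one
extra repacking iteration of lifetime.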

This approach keeps pack files around a bit after they would
otherwise be deleted, but in a location where new git
operations will not use them. This avoids NFS stale file handle
errors for git read operations which may already have them
open before the move, as long as they complete before
repacking is run again (and the old pack files actually get
deleted). This ends up trading off some extra disk space in
order to prevent failing git operations during repacking.
Without this, running git gc on a server is inherently
problematic and likely to cause a few failed operations.
While some operations can be retried by git when a stale FH
exception occurs, there are some (such as sending a
compressed deltified object once part of it has already been
sent) that are not practical to retry. This solution should
help with all operations.
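
The reason the move is safe for readers is that a rename leaves the
underlying file in place, so a handle opened before the move stays
valid. The sketch below only illustrates that on a local POSIX
filesystem; it is not an NFS reproduction, and the function name
read_across_move and the file contents are invented for the example.

```python
import os
import tempfile
from pathlib import Path


def read_across_move() -> bytes:
    """Start reading a pack, move it to old-packs mid-read, keep reading."""
    with tempfile.TemporaryDirectory() as tmp:
        pack_dir = Path(tmp)
        pack = pack_dir / "pack-abc.pack"
        pack.write_bytes(b"first-half|second-half")
        old_dir = pack_dir / "old-packs"
        old_dir.mkdir()

        with open(pack, "rb") as reader:
            head = reader.read(11)  # the reader has already started
            # The "repack" moves the file out of the way instead of deleting it.
            os.replace(pack, old_dir / "pack-abc.old-pack")
            tail = reader.read()  # the open handle still refers to the same file
        return head + tail


if __name__ == "__main__":
    print(read_across_move())
```

The reader only gets into trouble once the file is actually
deleted, which with this scheme does not happen until the next
repack runs.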

I hope that helps. My coworker James should be working soon
(this week) on porting this to jgit.

Any chance you could share your NFS "noac" test script?

Thanks,

-Martin

Martin Fick

Jan 3, 2017, 5:22:07 PM
to repo-d...@googlegroups.com, Hugo Arès
On Wednesday, December 07, 2016 04:32:41 PM Martin Fick
wrote:
> On Wednesday, December 07, 2016 04:19:15 PM you wrote:
> > How did you fix the "stale file handle error" happening
> > on slaves while uploading a pack? I mean when the
> > upload is already started.
>
> What we do is rename/move packfiles that are supposed to
> be deleted to preserve them for one extra repacking
> iteration before deleting them.

My co-worker, James Melvin, uploaded a version of our fix
ported to jgit with slightly different (hopefully better)
naming of the switches. See this thread:

https://groups.google.com/forum/?hl=en#!topic/repo-discuss/EQThRr4odV8

I hope it works for you.

Thanks,

-Martin

--
The Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, hosted by The Linux Foundation

Hugo Arès

Jan 4, 2017, 7:09:29 AM
to Repo and Gerrit Discussion, hug...@gmail.com
Thanks, we will review the change and test it.