Control: reassign -1 git
Control: found -1 1:2.1.4-2.1+deb8u2
Hi, git maintainers. I'm having a bit of a conundrum with the dgit
git server and I hope you can help.
I don't know that this is a bug in git. I'm reassigning this bug to
git so as to ask for your input, and I expect that after we are done
with triage this bug will be split and/or reassigned and/or made into
DSA tickets or something.
So, on to the problem. I see this:
mariner:d> git clone https://git.dgit.debian.org/_test_botch.git
Cloning into '_test_botch'...
remote: Counting objects: 1114, done.
remote: warning: suboptimal pack - out of memory
remote: fatal: Out of memory? mmap failed: Cannot allocate memory
remote: aborting due to possible repository corruption on the remote side.
fatal: protocol error: bad pack header
mariner:d>
The server (a VM, cgi-grnet-01.debian.org) has 1G of RAM and 500M of
swap. The repo _test_botch.git is a copy of botch.git, which is the
botch repo from the dgit-repos server for Debian as mirrored to the
git.dgit.d.o server using rsync. The server is running cgit for web
access on browse.dgit.debian.org, and the git http smart transport on
git.dgit.d.o.
I have put a copy of the repo, in tarball form, here:
http://www.chiark.greenend.org.uk/~ijackson/quicksand/2016/_test_botch.git.tar.gz
(NB 210 Mby!)
Searching the intertubes for the error messages produced a lot of
people suggesting `oh just run git-repack' (with various options).
I ran `git repack' (with no options) and it made no difference.
On a hunch I ran `git-gc' on the source repo's actual live copy of
botch.git, and re-mirrored it. Now it works. (That is, `git clone
https://git.dgit.debian.org/botch.git' works, so this problem is no
longer affecting the production copy.)
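For anyone trying to reproduce this, the difference between the failing
copy and the gc'd copy can probably be seen in the pack statistics. The
following is a minimal sketch, not a record of the commands actually
run above; the helper name pack_summary and the path in the usage
comment are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: summarise how a bare repository's objects are stored.
# pack_summary is a hypothetical helper name; $1 is the path to a bare repo.
pack_summary() {
    # count-objects -v reports, among other things, the number of loose
    # objects ("count"), packed objects ("in-pack") and packfiles ("packs").
    git --git-dir="$1" count-objects -v
}

# Possible use (the path is a placeholder):
#   pack_summary /path/to/botch.git       # before
#   git --git-dir=/path/to/botch.git gc
#   pack_summary /path/to/botch.git       # after
```

Comparing the "count" and "packs" lines before and after would show
whether the unrepacked copy had many loose objects or many small packs.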
It is possible that there is actually something wrong with the way I'm
handling my repo. Perhaps I need to explicitly run git-gc
occasionally, or something. I was under the impression that git would
do this automatically when it felt it was appropriate. I find
git-gc(1) slightly unclear on the question.
Can you please advise which of the following you think apply:
- This is a bug of some kind in git.
- The server should be provisioned with more RAM and/or swap.
- The dgit repos should be subjected to more background activity to
`tidy them up'. In this case, I would appreciate any advice you
had about: what the appropriate periodic activity is; how often it
should be run; and whether I need to lock against concurrent
updates by other programs.
- I am confused.
- Better documentation in git might help.
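To make the third option concrete, the kind of periodic activity in
question might look something like the sketch below. It is only an
illustration under assumptions: the helper name gc_all_repos, the
directory layout (bare repos named *.git under one root) and the choice
of `git gc --auto' are not taken from the existing dgit setup, and the
sketch deliberately does nothing about the locking question.

```shell
#!/bin/sh
# Sketch of a periodic tidy-up job. All names and paths are assumptions.
# gc_all_repos ROOT: run "git gc --auto" in each *.git directory under ROOT.
gc_all_repos() {
    root=$1
    for repo in "$root"/*.git; do
        [ -d "$repo" ] || continue    # skip if the glob matched nothing
        # --auto lets git decide whether there is enough clutter to be
        # worth repacking; --quiet suppresses progress output.
        git --git-dir="$repo" gc --auto --quiet ||
            echo "gc failed: $repo" >&2
    done
}
```

How often such a job should run, and whether it is safe alongside
concurrent updates, are exactly the open questions above.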
I am, of course, happy to supply more information, and I can do tests
etc. as well if that's helpful.
FYI the dgit-repos are handled by the dgit server in a slightly
unusual way. I don't think this is relevant, because ultimately it
means that the way the actual real repo is dealt with is fairly
conventional, at least from the point of view of the object store,
but:
The usual approach by the dgit server is to make a temporary repo
which is a hardlink farm to the real repo, and receive pushes into the
temporary repo. They are then inspected. If the push is considered
bad, the temporary repo is destroyed. If the push is considered good,
the relevant updates are pushed from the temporary repo to the real
repo. This is all achieved with a wrapper for git-receive-pack as
well as some quite exciting hooks. The purpose of this is to avoid
adding objects from bad pushes (which might include unauthorised
pushes of harmful objects) to the real repo's object store.
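The staging scheme just described can be sketched roughly as follows.
This is not the dgit server's actual code: the helper names are
hypothetical, the inspection step is omitted entirely, and using
`cp -al' (GNU cp) to build the hardlink farm is an assumption made for
illustration.

```shell
#!/bin/sh
# Sketch of the staging-repo idea. Helper names are hypothetical, and
# "cp -al" (GNU cp) is an assumption about how to build the link farm.

# make_staging REAL TMP: create TMP as a tree of hard links to REAL.
# Both paths must be on the same filesystem for hard links to work.
make_staging() {
    cp -al "$1" "$2"
}

# accept_push TMP REAL REF: propagate one vetted ref from the staging
# repo to the real repo, along with the objects it needs.
accept_push() {
    git --git-dir="$1" push --quiet "$2" "$3:$3"
}

# reject_push TMP: discard the staging repo; objects received only into
# TMP are never added to the real repo's object store.
reject_push() {
    rm -rf "$1"
}
```

New objects received into the temporary repo are written as new files
there, so the real repo's object store only gains objects when
accept_push is run.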
Thanks for your attention.
Ian.
--
Ian Jackson <ijac...@chiark.greenend.org.uk>   These opinions are my own.
If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.