On Wed, Nov 7, 2012 at 8:44 PM, "ken1 (Kenneth Ölwing)"
<kenneth...@klarna.com> wrote:
> On 2012-11-06 16:33, Sitaram Chamarty wrote:
>> Git itself won't be affected at all. Since it is a content-addressed
>> system, any object can only be overwritten by *itself* if at all. So
>> it'll all work fine.
>
> Yes, that makes sense; although I'd guess finer points occur if things like
> git gc/repack etc enter the equation, as they muck around under objects. But
> I'd expect that git handles possible concurrency for that.
No. The name of the packfile is itself a SHA, and in turn uniquely
describes the contents. Same logic applies.
The only time you *may* have trouble, if I recall, is on some kinds of
non-Unix file systems like CIFS. I'll have to dig up the link; sorry,
I don't have it handy.
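To make the content-addressing argument concrete, here is a minimal Python sketch. The blob id calculation (SHA-1 over "blob <size>\0" plus the contents) is how git names blobs; the store() function and its flat directory layout are hypothetical simplifications for illustration, not git's actual on-disk format (git compresses objects and fans them out into subdirectories).

```python
import hashlib
import os
import tempfile


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 that git assigns to a blob with these contents."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def store(objects_dir: str, data: bytes) -> str:
    """Hypothetical store: the file name is derived purely from the contents."""
    path = os.path.join(objects_dir, git_blob_id(data))
    with open(path, "wb") as f:
        f.write(data)
    return path


if __name__ == "__main__":
    objects_dir = tempfile.mkdtemp()
    first = store(objects_dir, b"hello\n")
    # A second writer with the same contents lands on the same name and
    # can only replace the file with identical bytes.
    second = store(objects_dir, b"hello\n")
    assert first == second
    print(os.listdir(objects_dir))
```

Two writers can only collide on a name if they are writing the same bytes, which is why an overlapping write does no harm.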
>> However, my best attempts at making something nasty happen have failed
>> so far. I tried with about 6000 repositories, at which point a
>> 'gitolite compile' takes about 6 seconds (on my laptop). I then ran
That 6 seconds is wrong, sorry about that. The critical part is 0.3
seconds for 11,000 (yes, eleven thousand) repos. See below for
details.
>> multiple 'gitolite compile' commands, as many as 4 overlapping runs.
>> The end result (the compiled file) was still produced correctly.
>
> As I described, with compile times of 10 minutes, the 'may' might be a bit
I seem to recall you have a few hundred repos, so I don't understand
why this is taking 10 minutes. The worst case timing for 500 repos on
my laptop is 12 seconds on v3. (Even if I have to *create* the repos
it's about 2 minutes, but that's a one-time cost so I am not counting
that.)
If you're using v2 but without GL_BIG_CONFIG, please don't even bother
replying. I didn't spend time coding it for people to ignore it and
go into hypothetical situations that I don't want to deal with.
----------
Details on time taken by 'git push' on the admin repo, with a conf
containing 11,000 repos. All timings are on my lenovo X201 laptop.
Let's get these things out of the way first:
(1) This is all for v3. The total timings for v2 with GL_BIG_CONFIG
should be similar, although it is not easy to determine the breakdown
because v2 was not modular enough.
(2) When you add a new repo, some extra things happen. You won't
notice unless you add a few hundred repos in one shot. I'm going to
assume we're talking about a push where the number of repos is not
significantly changed but perhaps some users were added or their
access was changed.
(3) The number of users is irrelevant for timing purposes on the push.
Your ~/.ssh/authorized_keys may become so big that sshd takes time to
log you in, but that's not *inside* gitolite. (My laptop adds about 1
second per 2500-3000 lines in the authkeys file, so don't worry about
it unless you have more than a thousand or so users).
Now for the timings...
(1) a 'git push' on the admin repo causes two things to happen:
(1.1) gitolite compile
This parses the config, converts it into a bunch of perl hashes, and
writes them to files. There is one common "gitolite.conf-compiled.pm"
in ~/.gitolite/conf which contains everything that is not specific to
an actual repo, and then each actual, named repo has a "gl-conf"
file in it. (See http://sitaramc.github.com/gitolite/g2/bc.html for
some details; although it's a v2 document the basic idea is the same
in v3).
So there's one common file, and 11,000 repo-specific files to write.
    Parse:                  30 seconds (cold cache time; if you repeat
                            it this goes down to 7 seconds)
    Write 11,000 files:     5 to 7 seconds
    Write the common file:  0.2 to 0.3 seconds
Potential race conditions are certainly possible in theory, and yes
it's trivial to use a lock file at the top of the post-update hook
code if you need to, but that's not the point here. If you're firing
off admin pushes at a rate that makes it likely you will hit two of
them within the same 0.2 second slot, you have something seriously
wrong in your setup or your understanding of gitolite.
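For anyone who does want the lock file, the idea can be sketched roughly as follows. This is in Python purely for illustration (the real hook is not Python), it is Unix-only, and the lock path and the run_compile() placeholder are hypothetical names, not anything gitolite ships.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path while the block runs."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until any other holder lets go
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def run_compile():
    """Placeholder for the hook's real work (e.g. running 'gitolite compile')."""
    return "compiled"


if __name__ == "__main__":
    # Hypothetical lock location; any path writable by the hosting user works.
    lock_path = os.path.join(tempfile.gettempdir(), "gitolite-compile.lock")
    with exclusive_lock(lock_path):
        run_compile()
```

A second push that arrives while the first is still compiling simply waits at the flock() call and then runs its own compile afterwards.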
(1.2) gitolite trigger POST_COMPILE
This does all the non-core stuff like setting up permissions for
gitweb and git-daemon and acting upon 'config' lines to turn them into
'git config ...' for each repo as needed.
This takes a *lot* of time for 11,000 repos:
    update-git-configs:        about 7 minutes
    update gitweb access:      29 minutes
    update git-daemon access:  21 minutes
However, if you're not interested in gitweb/daemon, you can remove
those lines from the POST_COMPILE list (as well as the POST_CREATE
list) in the rc file. Poof; all gone.
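To put numbers on that, here is a small tally of the POST_COMPILE timings above, in Python. The dictionary keys are just the labels used in this mail, not actual trigger names, and the 7 minute figure is approximate.

```python
# Minutes per POST_COMPILE step for 11,000 repos, as reported above.
post_compile_minutes = {
    "update-git-configs": 7,  # "about 7 minutes"
    "update gitweb access": 29,
    "update git-daemon access": 21,
}


def total_minutes(steps, skip=()):
    """Sum the step timings, leaving out any step named in skip."""
    return sum(minutes for name, minutes in steps.items() if name not in skip)


everything = total_minutes(post_compile_minutes)
no_gitweb_daemon = total_minutes(
    post_compile_minutes,
    skip=("update gitweb access", "update git-daemon access"),
)
print(everything, no_gitweb_daemon)
```

So dropping the gitweb and git-daemon steps takes POST_COMPILE from nearly an hour down to roughly the 7 minutes spent on update-git-configs.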