Maintaining mirrors of external repos


Luke Lu

Mar 7, 2012, 7:32:17 PM
to gitolite
We use a very simple way to maintain mirrors of external repos:

git clone --mirror git://external/repo.git

and then in a cron job we do

cd /<path>/repo.git && git fetch -q
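
A minimal sketch of how those two commands could be folded into one script that cron calls. MIRROR_ROOT and both function names are assumptions made for illustration; they do not come from the post.

```shell
#!/usr/bin/env bash
# Sketch only: MIRROR_ROOT and the helper names are assumptions.
MIRROR_ROOT=${MIRROR_ROOT:-/srv/mirrors}

# Derive the local directory name from an upstream URL,
# e.g. git://external/repo.git -> repo.git
mirror_dir_for() {
    local name="${1##*/}"     # strip everything up to the last slash
    echo "${name%.git}.git"   # normalize so the name always ends in .git
}

# Clone the mirror on first use, fetch quietly on later runs.
sync_mirror() {
    local url="$1"
    local dir
    dir="$MIRROR_ROOT/$(mirror_dir_for "$url")"
    if [ -d "$dir" ]; then
        git --git-dir="$dir" fetch -q
    else
        git clone -q --mirror "$url" "$dir"
    fi
}

# From cron, something like: sync_mirror git://external/repo.git
```

Deriving the directory from the URL keeps the cron entry to one line per upstream, at the cost of assuming no two upstreams share a basename.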

What's the right way to maintain mirrors of external repos in
gitolite?

Thanks!

__Luke

milk

Mar 7, 2012, 8:25:33 PM
to Luke Lu, gitolite
It's in the docs:
http://sitaramc.github.com/gitolite/mirroring.html
--
Hope is a dimension of the spirit. It is not outside us, but within us. When you lose it, you must seek it again within yourself and in people around you -- not in objects or even events.
--Vaclav Havel

Luke Lu

Mar 7, 2012, 8:57:08 PM
to gitolite
Note, I've read the gitolite mirroring docs. They do not seem to address
external repos to which we only have read-only access, say the Linux
kernel repository.

Sitaram Chamarty

Mar 7, 2012, 9:00:36 PM
to Luke Lu, gitolite
On Thu, Mar 8, 2012 at 7:27 AM, Luke Lu <vic...@gmail.com> wrote:
> Note, I've read the gitolite mirroring docs. They do not seem to address
> external repos to which we only have read-only access, say the Linux
> kernel repository.

Gitolite does not do that. If you need *caching* of external repos,
try 'gitpod': https://github.com/sitaramc/gitpod

Sitaram Chamarty

Mar 7, 2012, 9:03:28 PM
to Luke Lu, gitolite

damn... gitolite *can* do that. My instructions for gitpod say how to
make it work!

sweet... (and I'm getting old[er], sigh!)

--
Sitaram

Luke Lu

Mar 8, 2012, 12:18:29 AM
to Sitaram Chamarty, gitolite
Appreciate the reply (and the work of course :), Sitaram!

It would be great if this feature could be integrated into gitolite. I want
to keep open source repos and our private branches (forks of the public
repos) in the same repos, so that changes can be diffed/merged/rebased
easily when upstream changes. I want to allow write access to the private
branches (under a prefix) in these mirrored repos and keep the original
branches read-only. Thanks to the gl-pre-git tip, I think I can check for
remote.origin.mirror and maybe a gitolite.mirror.ttl (to avoid fetching
every time or needing a cron job) in the repo config, and do the git fetch
there.

IMO, this logic should be part of gitolite. For normal repos, it does
nothing. For repos cloned with --mirror, it works automagically.
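
The check described above might look roughly like the sketch below. gitolite.mirror.ttl is the hypothetical key proposed in this post (read here as minutes), both function names are invented for illustration, and the age test assumes a find that supports -mmin (GNU and BSD find do).

```shell
#!/usr/bin/env bash
# Sketch only: gitolite.mirror.ttl is a hypothetical config key and
# both function names are made up for illustration.

# Succeed (exit 0) when an upstream fetch is due.
# $1 = path to the bare repo, $2 = minimum minutes between fetches.
fetch_is_due() {
    local repo_dir="$1" ttl_min="$2"
    # No FETCH_HEAD means this repo has never fetched from upstream.
    [ -e "$repo_dir/FETCH_HEAD" ] || return 0
    # find prints the file only if it changed less than ttl_min minutes ago.
    if find "$repo_dir/FETCH_HEAD" -mmin "-$ttl_min" | grep -q .; then
        return 1    # fetched recently, so skip this round
    fi
    return 0
}

# Do nothing for normal repos; fetch for mirror clones whose ttl has expired.
maybe_fetch_upstream() {
    local repo_dir="$1" ttl
    # "git clone --mirror" sets remote.origin.mirror=true.
    [ "$(git --git-dir="$repo_dir" config --get remote.origin.mirror)" = "true" ] || return 0
    # A missing ttl is treated as 0, meaning "fetch every time".
    ttl=$(git --git-dir="$repo_dir" config --get gitolite.mirror.ttl) || ttl=0
    if fetch_is_due "$repo_dir" "$ttl"; then
        git --git-dir="$repo_dir" fetch -q
    fi
}
```

Using the modification time of FETCH_HEAD avoids keeping a separate timestamp file, since git rewrites it on each fetch.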

Sitaram Chamarty

Mar 8, 2012, 12:30:26 AM
to Luke Lu, gitolite
On Thu, Mar 8, 2012 at 10:48 AM, Luke Lu <vic...@gmail.com> wrote:
> Appreciate the reply (and the work of course :), Sitaram!
>
> It would be great if this feature could be integrated into gitolite. I want
> to keep open source repos and our private branches (forks of the public
> repos) in the same repos, so that changes can be diffed/merged/rebased
> easily when upstream changes. I want to allow write access to the private
> branches (under a prefix) in these mirrored repos and keep the original
> branches read-only. Thanks to the gl-pre-git tip, I think I can check for
> remote.origin.mirror and maybe a gitolite.mirror.ttl (to avoid fetching
> every time or needing a cron job) in the repo config, and do the git fetch
> there.
>
> IMO, this logic should be part of gitolite. For normal repos, it does
> nothing. For repos cloned with --mirror, it works automagically.

There is nothing here that cannot be done by a gl-pre-git hook.

Write one and I'll add it to "contrib" but it's not going in "core". Sorry :)

Sitaram Chamarty

Jun 5, 2012, 1:01:12 PM
to Luke Lu, gitolite
That last reply of mine was sent in the g2 days.

With g3 (v3.x) this is just as easy, except instead of "gl-pre-git"
hook, we now say "PRE_GIT trigger" :-)

I just pushed a little script called "upstream" that does this. It's
just a few lines of shell (as usual, more comments/doc than code).

Every time a client fetches such a repo, an upstream fetch happens,
then the client fetch is completed. Optionally, you can be "nice" and
impose a minimum delay between 2 upstream fetches.

Might help someone...

--
Sitaram

Andreas Stenius

Jun 5, 2012, 3:45:53 PM
to Sitaram Chamarty, Luke Lu, gitolite
I didn't look too closely at the PRE_GIT trigger, so I'm not sure what it does or how it works.

But here's what I did to keep mirrors of remote repositories (git, svn and cvs). Since they are mirrors of public repos, I've also made my mirrors public, here: https://git.astekk.se

Excerpt from gitolite.conf:

# gitolite mirror
@mirror = mirror/gitolite
repo mirror/gitolite
     config remote.origin.url = git://github.com/sitaramc/gitolite.git
     config remote.github.url = g...@github.com:kaos/gitolite.git

# cgit mirror
@mirror = mirror/cgit
repo mirror/cgit
     config remote.origin.url = git://hjemli.net/pub/git/cgit
     config remote.github.url = g...@github.com:kaos/cgit.git

# zotonic mirror
@mirror = mirror/zotonic
repo mirror/zotonic
     config remote.origin.url = git://github.com/zotonic/zotonic.git
     config remote.github.url = g...@github.com:kaos/zotonic.git

## default config for all git-based mirrors
repo @mirror
     R          = @all
     config gitweb.owner =
     config remote.origin.fetch = +refs/heads/*:refs/heads/*
     config remote.github.mirror = true
     option cgit.section = Mirrors
     option cgit.visible-for = *

## subversion mirror of cpputest
repo mirror/cpputest
     R          = @all
     config gitweb.owner =
     option cgit.section = Mirrors
     option cgit.visible-for = *
     config svn-remote.svn.url = https://cpputest.svn.sourceforge.net/svnroot/cpputest
     config svn-remote.svn.fetch = trunk:refs/heads/master
     config svn-remote.svn.branches = branches/*:refs/heads/*
     config svn-remote.svn.tags = tags/*:refs/tags/*
     config remote.github.url = g...@github.com:kaos/cpputest.git
     config remote.github.mirror = true

## cvs mirror of ecos
repo mirror/ecos
     R          = @all
     config gitweb.owner =
     option cgit.section = Mirrors
     option cgit.visible-for = *
     config cvsimport.module = ecos
     config cvsimport.d = :pserver:ano...@ecos.sourceware.org:/cvs/ecos
     config remote.github-mirror.url = g...@github.com:kaos/ecos.git
     config remote.github-mirror.mirror = true


Then I use cron to run some fetches for those mirrors:

kaos@ganesha:~$ sudo crontab -l git
Password:
0 8-22 * * 1-5 /home/git/bin/update-mirrors.sh
0 22 * * 0 /home/git/bin/update-ecos-mirror.sh

I don't sync the ecos (cvs) mirror very often, so as not to load the cvs server too much...


My update mirrors script:

kaos@ganesha:~$ cat /home/git/bin/update-mirrors.sh
#!/usr/bin/env bash

# setup
GL=/home/git/bin/gitolite
GIT=/usr/gnu/bin/git

# base directory that holds all gitolite repos
RB=$("$GL" query-rc GL_REPO_BASE)

# for each repo
for r in $("$GL" list-phy-repos) ; do
    # skip a repo whose directory is missing rather than run git elsewhere
    cd "$RB/$r.git" || continue

    # fetch from git remotes
    # (patterns are quoted so the backslashes reach gitolite intact)
    if "$GL" git-config -q -r "$r" 'remote\..+\.fetch' ; then
        "$GIT" fetch --all --prune --quiet
    fi

    # fetch from svn remotes
    if "$GL" git-config -q -r "$r" 'svn-remote\..+\.fetch' ; then
        "$GIT" svn fetch --quiet
    fi

    # push to github
    if "$GL" git-config -q -r "$r" 'remote\.github\.url' ; then
        "$GIT" push github --quiet
    fi
done

I handle the ecos mirror separately (as it is the only cvs mirror, I didn't bother making it as generic as the others):

kaos@ganesha:~$ cat /home/git/bin/update-ecos-mirror.sh
#!/usr/bin/env bash

# setup
GL=/home/git/bin/gitolite
GIT=/usr/gnu/bin/git

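# base directory that holds all gitolite repos
RB=$("$GL" query-rc GL_REPO_BASE)

# sync eCos from upstream cvs server
# (stop here if the repo is missing, so cvsimport never runs elsewhere)
cd "$RB/mirror/ecos.git" || exit 1
export GIT_DIR=.
"$GIT" cvsimport -i

# push to github
"$GIT" push github-mirror --quiet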


I set up both the svn and cvs based mirrors outside of gitolite first, then moved the .git dir to the right place, since I wanted some special treatment when initializing those repos and couldn't be bothered to get those args into the config file and handled properly that way.

But perhaps this may prove helpful to someone... 

Cheers,
Andreas


2012/6/5 Sitaram Chamarty <sita...@gmail.com>

Andreas Stenius

Jun 6, 2012, 4:27:53 PM
to Luke Lu, gitolite
2012/6/5 Luke Lu <vic...@gmail.com>
> On Tue, Jun 5, 2012 at 12:45 PM, Andreas Stenius <g...@astekk.se> wrote:
> > I didn't look too closely at the PRE_GIT trigger, so I'm not sure what it
> > does or how it works.
>
> Using the PRE_GIT trigger with a per-repo configurable nice/ttl is easier
> and more versatile than having to configure cron jobs. It also does not
> waste resources (CPU/bandwidth) on less-used repos.


Thanks for the explanation. I now know it doesn't quite do what I want.

I don't need to touch the cron job after the initial setup. Also, adding new mirrors is easily done with just a config update (for git mirrors, at least). So the main differences seem to be the nice setting and fetching from upstream on request.

Saving bandwidth is another topic, though. My motivation for keeping local mirrors is to pull in changes automatically without having to pull them myself. Even if the fetch from the remote repo is triggered by a local pull, I won't see that there is anything to pull from the mirror until I try it...

//Andreas

Luke Lu

Jun 7, 2012, 6:17:18 PM
to Andreas Stenius, gitolite
If you want to change the update interval for a particular repo, then
you have to edit the cron job. A cron job also doesn't scale if you have
many (say >100) upstream repos but only a few of them are under active
development, each by more than one dev (so most of the time only one of
them will wait a few seconds for the upstream fetch).

Andreas Stenius

Jun 8, 2012, 3:32:48 AM
to Luke Lu, gitolite
Sure enough. This is for my use case, which won't have that many repos...
It's not a silver bullet for every case, but perhaps useful for others in a similar situation.

To be sure, I find the hook-based solution more pleasing, and with some tweaking I'm sure
it'll do what I do with cron and scripts, and better too. But I'm not going to invest time in it
when what I already have works fine for me.

2012/6/8 Luke Lu <vic...@gmail.com>

Sitaram Chamarty

Aug 22, 2012, 10:12:31 AM
to Erwin, gito...@googlegroups.com, Luke Lu
On Wed, Aug 22, 2012 at 6:27 PM, Erwin <junkmail...@arcor.de> wrote:
> Thanks a lot! The upstream trigger was precisely what I was looking for in a
> long search. Unfortunately, it was not mentioned in the gitolite docs (e.g.,
> next to partial-copy in 'non-core').

As http://sitaramc.github.com/gitolite/non-core.html says, in the
triggers section, "In general, the source code for each trigger will
tell you what it is doing and which trigger list you should add it
to."

I agree it's debatable whether every single trigger should be added to
the documentation; at the moment I am choosing not to.

> I think the script needs a little fix, see below. It should not check for
> the first argument passed to the trigger script being 'PRE_GIT', but for
> the last argument. Action is only required if this argument corresponds to
> 'git-upload-pack'.

I don't think you understood how it is supposed to behave.

You are making a distinction between a fetch and a push (from a
gitolite client).

I am making one between "automatically invoked by gitolite by dint of
being listed in PRE_GIT trigger chain" versus "manually invoked by
cron or command line".
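
To illustrate that distinction, here is a rough sketch based only on the `[ "$1" != "fetch" ]` test visible in the quoted diff. honors_nice is a made-up helper name and is not part of the upstream script.

```shell
#!/usr/bin/env bash
# Sketch only: honors_nice is a hypothetical helper, not part of "upstream".

# Succeed when the minimum-delay ("nice") check should apply.
# $1 = the first argument the script was called with.
honors_nice() {
    # A manual run passes "fetch" and always goes to the upstream repo.
    # Any other first argument (such as the trigger name that gitolite
    # passes) means the run respects the configured delay.
    [ "$1" != "fetch" ]
}
```

So the test on the first argument separates the two ways of invoking the script; it says nothing about whether the client is fetching or pushing.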

>
> Regards,
> Erwin
>
> diff --git a/src/triggers/upstream b/src/triggers/upstream
> index b66ae87..7906cea 100755
> --- a/src/triggers/upstream
> +++ b/src/triggers/upstream
> @@ -9,7 +9,7 @@ url=$(gitolite git-config $repo gitolite-options.upstream.url)
>
> cd $GL_REPO_BASE/$repo.git || exit 1
>
> -[ "$1" != "fetch" ] && {
> +[ "$6" != "git-upload-pack" ] && {
> nice=$(gitolite git-config $repo gitolite-options.upstream.nice)
> [ -n "$nice" ] && find FETCH_HEAD -mmin -$nice | grep . >/dev/null && exit 0
--
Sitaram

Erwin

Aug 23, 2012, 10:03:28 AM
to gito...@googlegroups.com, Luke Lu
Thanks for the quick and helpful response. I indeed overlooked the difference in the triggered and manual invocation.

As a suggestion: you might add a short note in the documentation that there are more non-core hooks in the sources and where to find them. Currently, only partial-copy is mentioned, and thus I didn't realise that there are more.

Regards,
Erwin

Sitaram Chamarty

Aug 23, 2012, 10:37:11 AM
to Erwin, gito...@googlegroups.com, Luke Lu
On Thu, Aug 23, 2012 at 7:33 PM, Erwin <junkmail...@arcor.de> wrote:
> Thanks for the quick and helpful response. I indeed overlooked the
> difference in the triggered and manual invocation.
>
> As a suggestion: you may add a short note in the documentation that there
> are more non-core hooks in the sources and where to find them. Currently,
> only partial-copy is mentioned and thus I didn't realise that there is
> more.

sure...