partial-copy and mirroring

Jan Remmet

Jan 27, 2016, 10:17:54 AM
to gito...@googlegroups.com
Hello,
I tried to mirror a partial-copy repo. The manual only says that this is not
tested, but it is also not working ;)

There are no real pushes to the partial-copy repos. They are filled by
fetching in the PRE_GIT trigger.
Can we raise a "gitolite trigger POST_GIT repo nouser W any no-command" there
to trigger the mirroring?

Jan

milk

Jan 28, 2016, 12:13:16 AM
to Jan Remmet, gitolite
On Wed, Jan 27, 2016 at 7:17 AM, Jan Remmet <J.Re...@phytec.de> wrote:
> Hello,
> I tried to mirror a partial-copy repo. The manual only says that this is not
> tested, but it is also not working ;)

Can you illustrate the "not working" part with a transcript?

The steps I imagine mirroring goes through:
1. (master) mirror push
2. (master) git push --mirror
3. (slave) INPUT trigger in Mirroring.pm sets GL_BYPASS_CHECKS
4. (slave) receives git push and populates local repo

So, with this, I don't expect any issues.

For your suggestion, a manual POST_GIT trigger will not lead to a PRE_GIT trigger. You can just as easily operate on the repo with git ls-remote or such.

Perhaps you don't need to mirror the repo explicitly. As long as it's defined in the slave's gitolite-admin, the repo will be created and will act as a normal partial-copy.

-milki

Sitaram Chamarty

Jan 28, 2016, 12:41:07 AM
to milk, Jan Remmet, gitolite
On 28/01/16 10:43, milk wrote:
> On Wed, Jan 27, 2016 at 7:17 AM, Jan Remmet <J.Re...@phytec.de> wrote:
>
> Hello,
> I tried to mirror a partial-copy repo. The manual only says that this is not
> tested, but it is also not working ;)
>
>
> Can you illustrate the "not working" part with a transcript?

Milki:

It's simple. Push to the partial-copy. It will cause an update to the
main repo. This update won't carry to the slaves.

Look at it this way (use a mail reader with monospace fonts):

            MASTER                           SLAVE
            ------                           -----

--(1)---> partial-copy -------(3)-------> partial-copy
               |                               .
               |                               .
              (2)                             (5)
               |                               .
               v                               v
              main . . . . . (4) . . . . .>   main


#1 user pushes a ref
#2 the partial-copy VREF (VREF, not trigger!) pushes the ref to 'main'
#3 the mirroring trigger pushes partial-copy repo to the slave

#4 doesn't happen, since main is not the actual repo involved; the
mirroring triggers don't fire
#5 doesn't happen, since this is from a VREF (and VREFs don't run on
slaves, due to GL_BYPASS...)

I'll look at this on the weekend. Most likely I'll force #4 manually
within the same VREF that does #2. I'm not sure invoking the POST_GIT
trigger is a good idea; I'd rather do something like:

    for m in `gitolite mirror list slaves $GL_REPO`; do
        gitolite mirror push $m $GL_REPO
    done

regards
sitaram
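
A minimal sketch of the loop above as a reusable shell function, for readers who want to try it. It assumes `gitolite mirror list slaves` and `gitolite mirror push` behave as described in this thread; `push_to_slaves` is a hypothetical helper name, not part of gitolite, and the warn-and-continue behaviour is one possible choice rather than what gitolite itself does.

```shell
#!/bin/sh
# Sketch only: push one repo to every slave that gitolite lists for it.
# push_to_slaves is a hypothetical helper name, not part of gitolite.
push_to_slaves() {
    repo=$1
    failed=0
    # Word-splitting of the command output is intentional: it copes with
    # slave names separated by spaces or by newlines.
    for m in $(gitolite mirror list slaves "$repo"); do
        if ! gitolite mirror push "$m" "$repo"; then
            echo "WARNING: mirror push of $repo to $m failed" >&2
            failed=1
        fi
    done
    # Non-zero if at least one push failed, so the caller can decide
    # whether that is fatal.
    return "$failed"
}
```

Inside a hook or trigger this would be called as `push_to_slaves "$GL_REPO"`. Continuing past a failed slave means one unreachable mirror does not block the others.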

Jan Remmet

Jan 28, 2016, 4:14:40 AM
to Sitaram Chamarty, milk, gitolite
Is "main" here the repository with the subset of refs from "partial-copy"?

In our setup we have a repository with the clean code, but also the WIP and
HACK stuff. The clean code also goes via partial-copy into another repo.
We now want to mirror the clean-code-only repo to a machine with gitweb or
cgit on it.
So we are missing even transition #3.

>
> I'll look at this on the weekend. Most likely I'll force #4 manually
> within the same VREF that does #2. I'm not sure invoking the POST_GIT
> trigger is a good idea; I'd rather do something like:
>
> for m in `gitolite mirror list slaves $GL_REPO`; do
> gitolite mirror push $m $GL_REPO
> done

Inside src/triggers/partial-copy I only get the usage message if I run
    gitolite mirror list slaves $repo
Usage 1: gitolite mirror push <slave> <repo>
Usage 2: ssh git@master-server mirror push <slave> <repo>
....

$repo is pointing to the right repo.
>
> regards
> sitaram

Sitaram Chamarty

Jan 28, 2016, 4:25:48 AM
to Jan Remmet, milk, gitolite
"partial-copy" (as the word "partial" indicates) is the subset. "main"
is the superset repo.

> In our setup we have a repository with the clean code, but also the WIP and
> HACK stuff. The clean code also goes via partial-copy into another repo.
> We now want to mirror the clean-code-only repo to a machine with gitweb or
> cgit on it.

You've lost me. I'm not even sure "partial-copy" is what you want or
should be using.

> So we are even missing the transition 3

Hmm that works fine here, though if we manage to get #4 working, then #3
will become superfluous anyway.

>
>>
>> I'll look at this on the weekend. Most likely I'll force #4 manually
>> within the same VREF that does #2. I'm not sure invoking the POST_GIT
>> trigger is a good idea; I'd rather do something like:
>>
>> for m in `gitolite mirror list slaves $GL_REPO`; do
>> gitolite mirror push $m $GL_REPO
>> done
>
> Inside src/triggers/partial-copy I only get the usage if I run
> gitolite mirror list slaves $repo
> Usage 1: gitolite mirror push <slave> <repo>
> Usage 2: ssh git@master-server mirror push <slave> <repo>
> ....
>
> $repo is pointing to right repo.

You may need to upgrade; the "list slaves" feature may be fairly recent.

regards
sitaram

Jan Remmet

Feb 1, 2016, 10:53:32 AM
to Sitaram Chamarty, milk, gitolite
In our setup we push only to the main, and not to the partial-copy. I was
confused by #1. Maybe I missed something about pushing to the partial-copy.

> > In our setup we have a repository with the clean code, but also the WIP and
> > HACK stuff. The clean code also goes via partial-copy into another repo.
> > We now want to mirror the clean-code-only repo to a machine with gitweb or
> > cgit on it.
>
> You've lost me. I'm not even sure "partial-copy" is what you want or
> should be using.
So far it works fine. We have these branches in the main repo:

V1.0
V1.1
V2.0
WIP/issue_1
WIP/issue_2
WIP/issue_3
HACK/something


and with partial-copy we get this:
V1.0
V1.1
V2.0


Now we want to mirror the partial-copy to another machine and make it publicly
available.
We want to mirror only the partial-copy repo, not the main repo.

Is this a bad usage of partial-copy?

>
> > So we are even missing the transition 3
>
> Hmm that works fine here, though if we manage to get #4 working, then #3
> will become superfluous anyway.
>
> >
> >>
> >> I'll look at this on the weekend. Most likely I'll force #4 manually
> >> within the same VREF that does #2. I'm not sure invoking the POST_GIT
> >> trigger is a good idea; I'd rather do something like:
> >>
> >> for m in `gitolite mirror list slaves $GL_REPO`; do
> >> gitolite mirror push $m $GL_REPO
> >> done
> >
> > Inside src/triggers/partial-copy I only get the usage if I run
> > gitolite mirror list slaves $repo
> > Usage 1: gitolite mirror push <slave> <repo>
> > Usage 2: ssh git@master-server mirror push <slave> <repo>
> > ....
> >
> > $repo is pointing to right repo.
>
> You may need to upgrade; the list slaves feature may have been recent.
I'm on v3.6.4. The problem was the GL_USER checks in the mirror command.
If I add an "unset GL_USER" to src/triggers/partial-copy, it gets the list of
slaves.
You can trigger the issue with:
    GL_USER="foo" /home/git/gitolite/src/gitolite mirror list slaves REPO

Jan
>
> regards
> sitaram
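
The GL_USER workaround above can be sketched so that the variable is hidden only for the one command that needs it. This relies on the behaviour reported in this message (on v3.6.4, a set GL_USER makes `gitolite mirror` print its usage); `list_slaves_as_admin` is a hypothetical helper name, not part of gitolite.

```shell
#!/bin/sh
# Sketch only: list the slaves for a repo with GL_USER hidden from gitolite.
# list_slaves_as_admin is a hypothetical helper name, not part of gitolite.
list_slaves_as_admin() {
    # The parentheses start a subshell, so GL_USER is unset only for this
    # one command and stays intact for the rest of the calling script.
    ( unset GL_USER; gitolite mirror list slaves "$1" )
}
```

Compared with a bare `unset GL_USER` in the trigger, the subshell keeps the variable available to anything that runs later in the same script.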

Sitaram Chamarty

Feb 1, 2016, 11:20:03 AM
to Jan Remmet, milk, gitolite
On 01/02/16 21:23, Jan Remmet wrote:

> Now we want to mirror the partial-copy to another machine and make it publicly
> available.
> We want to mirror only the partial-copy repo, not the main repo.

Sorry, that won't work, and I have no plans to make it work -- it's too
much of a special case.

Jan Remmet

Mar 9, 2016, 11:06:01 AM
to gito...@googlegroups.com
If your partial-copy gets updated by the main repo, this will not be
mirrored to slaves.

GL_USER will force usage() from the mirror command.
Unset GL_USER to get the list of mirrors.

This has been tested in a setup where only the partial-copies were
mirrored to another machine.

Signed-off-by: Jan Remmet <j.re...@phytec.de>
---
src/triggers/partial-copy | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/src/triggers/partial-copy b/src/triggers/partial-copy
index 79b4d48..e708eb8 100755
--- a/src/triggers/partial-copy
+++ b/src/triggers/partial-copy
@@ -66,4 +66,10 @@ do
     git push -f $GL_REPO_BASE/$repo.git :$ref || die "FATAL: failed to delete $ref"
 done

+# force pushes to mirrors
+unset GL_USER
+for m in `gitolite mirror list slaves $repo`; do
+    gitolite mirror push $m $repo
+done
+
 exit 0
--
1.9.1

Sitaram Chamarty

Mar 9, 2016, 10:57:04 PM
to Jan Remmet, gito...@googlegroups.com
On 09/03/16 19:46, Jan Remmet wrote:
> If your partial-copy gets updated by the main repo, this will not be
> mirrored to slaves.
>
> GL_USER will force usage() from the mirror command.
> Unset GL_USER to get the list of mirrors.
>
> This has been tested in a setup where only the partial-copies were
> mirrored to another machine.

I'm sorry, but I've already said that is not the goal of partial-copy.
I think this use case, as well as what Tony (in a different thread)
was talking about, should both be separated from partial-copy, and be
called something else, like maybe "publish" or perhaps
"partial-publish".

I'll write a longer email about this soon-ish.

regards
sitaram

Sitaram Chamarty

Mar 10, 2016, 3:37:28 AM
to Jan Remmet, gito...@googlegroups.com, Tony Finch
On 10/03/16 09:26, Sitaram Chamarty wrote:

> I'm sorry, but I've already said that is not the goal of partial-copy.
> I think this use case, as well as what Tony (in a different thread)
> was talking about, should both be separated from partial-copy, and be
> called something else, like maybe "publish" or perhaps
> "partial-publish".
>
> I'll write a longer email about this soon-ish.

(nomenclature: "PCR" - partial copy repo, "MAIN" - the main repo)

1. rationale for partial-copy

First, the rationale for the partial-copy feature was that certain
developers needed to be able to work on a repo but some of the
branches should not be visible to them.

This means pushes *will* happen on PCR, which need to be propagated
to MAIN immediately. Treating the PCR as merely a transit point
keeps things sane and simple:

* right before you access a PCR, it gets stuff from MAIN

* right after you push to a PCR, it pushes that stuff to MAIN
  (The code can even take care of situations where someone pushes
  to MAIN and someone else pushes to PCR, and these two clash!)

This becomes especially important if you need multiple PCRs. I seem
to recall one user who needed 3 off of one MAIN, due to different
jurisdictions or some such legal nonsense.

2. partial-copy with mirroring

Based on the above, the only logical way is for MAIN to be mirrored
to all slaves as soon as the "PCR -> MAIN" happens on the master
server. So that is what we will be doing (the code has not yet been
pushed; I expect to do this by this weekend or so).

In fact there is no need for the PCRs to be mirrored at all. If
they are, the mirroring will happen, but will effectively be a waste
- the next time someone fetches or clones from the slave, it will
get a fresh update from its MAIN anyway.

3. "publishing" a repo

The need for a public copy that can be accessed from outside
gitolite (by gitweb, for instance) is quite different.

In this case, we will have to assume/expect that no one will be
developing on those copies, so we don't have to worry about keeping
things in sync in both directions. A push happens to the main and
it will generate a push to the copy.

For this, the code that Tony linked to in his email [1][2] -- let's
call it "partial-publish" for the sake of clarity -- is fine, but I
suggest that the "-" rules be in the copy's ruleset, not in the main
repo's ruleset. This lets you have multiple "public" repos, with
different restrictions, if needed, without cluttering up the access
list for the main repo.

4. mirroring a "published" repo

I am pretty sure "partial-publish" can have a few more lines added
along the lines of:

    unset GL_USER
    for m in `gitolite mirror list slaves $repo`; do
        gitolite mirror push $m $repo
    done

hope this helps

regards
sitaram

[1]: https://groups.google.com/d/msg/gitolite/019OTccFwmw/cCMfVAvvEQAJ
[2]: http://article.gmane.org/gmane.comp.version-control.gitolite/4246

Tony Finch

Mar 10, 2016, 6:10:38 AM
to Sitaram Chamarty, Jan Remmet, gito...@googlegroups.com
Sitaram Chamarty <sita...@gmail.com> wrote:

> 1. rationale for partial-copy
>
> First, the rationale for the partial-copy feature was that certain
> developers needed to be able to work on a repo but some of the
> branches should not be visible to them.
>
> This means pushes *will* happen on PCR, which need to be propagated
> to MAIN immediately. Treating the PCR as merely a transit point
> keeps things sane and simple:

I think this is the right general attitude but I'm not sure the
implementation is quite up to it.

> * right before you access a PCR, it gets stuff from MAIN

I tried this with the bind9 repo (which has over 500 refs) and it went off
into the weeds for several minutes. In the end I gave up and killed it,
and scaled back to a much smaller test example.

> * right after you push to a PCR, it pushes that stuff to MAIN
> (The code can even take care of situations where someone pushes
> to MAIN and someone else pushes to PCR, and these two clash!)

I think in principle it ought to be possible to make this bidirectional,
i.e. a push to either repo is immediately propagated to the other, so
there's no need to propagate changes when a repo is read.

The tricky bit is that you (probably) want to make the set of writable
refs disjoint between the repos.

I'm not sure how to make this easy to set up and reasonably safe.

Tony.
--
f.anthony.n.finch <d...@dotat.at> http://dotat.at/
Bailey: South or southwest, veering west later, 5 to 7, increasing gale 8 at
times. Rough, becoming very rough. Occasional rain. Good, occasionally
moderate.

Sitaram Chamarty

Mar 10, 2016, 7:37:25 AM
to Tony Finch, Jan Remmet, gito...@googlegroups.com

To start with, this is a kludge, and as long as git does not change,
will remain a kludge. We're doing something that is not directly
supported by git.

That said... read on :)

On 10/03/16 16:37, Tony Finch wrote:
> Sitaram Chamarty <sita...@gmail.com> wrote:
>>
>> 1. rationale for partial-copy
>>
>> First, the rationale for the partial-copy feature was that certain
>> developers needed to be able to work on a repo but some of the
>> branches should not be visible to them.
>>
>> This means pushes *will* happen on PCR, which need to be propagated
>> to MAIN immediately. Treating the PCR as merely a transit point
>> keeps things sane and simple:
>
> I think this is the right general attitude but I'm not sure the
> implementation is quite up to it.
>
>> * right before you access a PCR, it gets stuff from MAIN
>
> I tried this with the bind9 repo (which has over 500 refs) and it went off
> into the weeds for several minutes. In the end I gave up and killed it,
> and scaled back to a much smaller test example.

See the first line of src/triggers/partial-copy :-)

I suspect we may not have to go to perl though. One of the following
improvements should help:

1. use 'gitolite access' batch mode instead of running the access
command one by one. There's info on that in 'gitolite access -h'
and an example in src/triggers/post-compile/update-git-daemon-access-list.

2. set up the repo with "alternates" (a la e430ba62, the
"create-with-reference" trigger that milki wrote). That would
eliminate the copy time.

I'd try #1 first, because #2 should only be a factor the first time a
repo is populated; after that it's "mostly there" anyway.

>> * right after you push to a PCR, it pushes that stuff to MAIN
>> (The code can even take care of situations where someone pushes
>> to MAIN and someone else pushes to PCR, and these two clash!)
>
> I think in principle it ought to be possible to make this bidirectional,
> i.e. a push to either repo is immediately propagated to the other, so
> there's no need to propagate changes when a repo is read.

If bidirectional would work in this case, it would work for multi-master
mirroring also. And by and large it will, except when it doesn't :) I
don't want to deal with those races.

> The tricky bit is that you (probably) want to make the set of writable
> refs disjoint between the repos.

That won't fly at all, in the use cases I am aware of. The main repo
will always have a fully writable superset of refs.

> I'm not sure how to make this easy to set up and reasonably safe.

Indeed!

Sitaram Chamarty

Mar 10, 2016, 9:50:58 AM
to Tony Finch, Jan Remmet, gito...@googlegroups.com
On 10/03/16 18:07, Sitaram Chamarty wrote:

> 1. use 'gitolite access' batch mode instead of running the access
> command one by one. There's info on that in 'gitolite access -h'
> and an example in src/triggers/post-compile/update-git-daemon-access-list.
>
> 2. set up the repo with "alternates" (a la e430ba62, the
> "create-with-reference" trigger that milki wrote). That would
> eliminate the copy time.
>
> I'd try #1 first, because #2 should only be a factor the first time a
> repo is populated; after that it's "mostly there" anyway.

Umm no. It turns out that is not it. I tried it with the bind9 repo.

What that script is doing is 3 things:

1. for each refs/heads/ in MAIN
       if it's allowed in PCR
           fetch it

   This took about a minute the first time (when the actual data had to
   come in), but the next time, it took about 2 seconds. As expected.

2. for each refs/heads and refs/tags in PCR
       if the ref does not exist in MAIN or we don't have access
           delete it

   This took the same time every time, about 15 seconds. We could cut
   short the access check by batching it (git-show-ref has a batch mode
   of sorts), using that to remove refs that do not exist in MAIN, then
   batching the access check for the remaining ones.

3. this one is the killer...

   for each tag in PCR
       check if it's reachable from any of the refs/heads
       if reachable from none, delete it

   This took about 2 minutes. I don't see a simple way to short
   circuit that without making some assumptions somewhere.

The push method will be superior to this only in the "when do you do all
this" sense. For an active repo with many devs on MAIN and only a few
on PCR (often the case, really), it will be just as bad.

In addition, the push method has the issue that if the admin adds more
restrictions they don't take effect immediately - until the next push to
MAIN, the PCR will happily serve up these now verboten refs.
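
The "remove refs that do not exist in MAIN" half of step 2 can be sketched as a plain list comparison, which costs one pass rather than one git call per ref. This is an illustration under stated assumptions, not the code in src/triggers/partial-copy: `refs_only_in_first` is a hypothetical helper name, and its two arguments are assumed to be files holding one ref name per line (for example the output of `git for-each-ref --format='%(refname)' refs/heads refs/tags` run in each repo).

```shell
#!/bin/sh
# Sketch only: print the refs that appear in the first list but not in the
# second. refs_only_in_first is a hypothetical helper name.
# Each argument is a file with one ref name per line.
refs_only_in_first() {
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    # comm needs both inputs sorted the same way.
    sort -u "$1" > "$tmp1"
    sort -u "$2" > "$tmp2"
    # comm -23 keeps only the lines unique to the first input.
    comm -23 "$tmp1" "$tmp2"
    rm -f "$tmp1" "$tmp2"
}
```

Called with the PCR list first and the MAIN list second, the output is the set of refs to delete from the PCR outright; only the refs that survive would still need the access check.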
