Deploying more robust hg-git replication for gecko

John O'Duinn

Aug 30, 2013, 9:29:08 PM

to dev. planning, dev-pl...@lists.mozilla.org, dev-b2g, dev-gaia, release

(cross-posting to make sure this is not missed; please respond in
dev-planning or bug#847727)

tl;dr: we're moving replication of gecko+gaia repos to more robust
systems. This email gives a quick status and a summary of the rollout plan.

In the rush to B2G 1.0 release, we had to quickly stand up replication
systems to mirror changes from github to hg.m.o and to git.m.o. These
locally-patched systems replicate 368 repos, 24x7, and have carried us
a long way, but they are fragile to network/system hiccups, and tricky
to configure for new branches. Now that v1.0 has shipped, and we're
into a more predictable development cadence, it's time to switch to more
robust systems.

Over the last few months, we've been standing up new systems alongside
our current production systems. On the new systems, we've been testing
a newer version of the hg-git conversion tools against the biggest, most
comprehensive test data we have: the entire cvs-history and hg-history
of mozilla-central. When we hit problems that broke the hg-git
conversion tools, we upstreamed fixes to the hg-git tools. Being able
to process all the edge-cases that Mozilla developers have thrown at
cvs and hg since the start of Mozilla gives us confidence that these
scripts are robust enough to use in production going forward.

The curious can see the generated, fully converted repo, at
https://github.com/escapewindow/test-beagle. It's roughly equivalent to
https://github.com/mozilla/mozilla-central, but with SHAs that match
http://git.mozilla.org/?p=releases/gecko.git;a=summary .

The "test-beagle" repo currently contains:
* mozilla-central with full CVS history
* mozilla-b2g18
* mozilla-b2g18_v1_1_0_hd
* mozilla-b2g18_v1_0_1
* mozilla-b2g18_v1_0_0
* mozilla-aurora
* mozilla-beta, with relbranches
* mozilla-release, with relbranches
* mozilla-esr17
* mozilla-inbound
* b2g-inbound
* fx-team

If you see anything that looks wrong in this repo, please do let us know.

Note:
1) Do *not* use this for any production work or real development
work... not just yet. This repo is, as the name suggests, a test repo
that we reserve the right to reset at any point as/when we find problems.
2) The branches to be included in this repo, and the final name and
location of the "test-beagle" repo, are still in flux. We're
considering having separate conversions of other repos that can be
pulled into local clones if desired. For more background, see the blog
post below.
3) Note that test-beagle and gecko.git cannot be identical repos, due to
partner requirements for gecko.git (see the blog post below).
4) All the release branches generated per release are present, but
currently no release tags. Moving tags is problematic in git; see the
blog post below for details.

Our rollout plan is to:
* enable "test-beagle" repo in mozilla account on github (using real
name of course!)
* replace/reset these repos on github
(https://github.com/mozilla/releases-mozilla-central,
https://github.com/mozilla/releases-mozilla-aurora,
https://github.com/mozilla/releases-mozilla-release) since they don't
have CVS history and have a different set of SHAs. If they're unneeded
and we can nuke 'em, even better.
* switch over http://git.mozilla.org/?p=releases/gecko.git;a=summary
to be on these new scripts. As the SHAs are the same, this should be
invisible to Mozilla devs and partners using it, and should buy us
better stability.

There's plenty of other work after this deploy is completed (l10n,
bitbucket, etc), but this is a big leap forward and exciting to see.

Hope all that makes sense - for more information, please see:
* the blog post http://escapewindow.dreamwidth.org/238476.html
* the bug https://bugzilla.mozilla.org/show_bug.cgi?id=847727

Thanks
Aki, Hal and John.

Ehsan Akhgari

Sep 3, 2013, 11:25:04 AM

to jod...@mozilla.com, dev. planning, dev-pl...@lists.mozilla.org

Thanks for the update on this, John! I have a number of questions about
this:

1. On the issue of the hg tags, can you please be more specific about the
problem that you're encountering? In my experience, git deals with updated
tags the same way as it does with any non-fast-forward merge, and I've
never experienced any problems with that.

2. How frequently are these branches updated?

3. What are the plans for adding the rest of the branches that we currently
have in https://github.com/mozilla/mozilla-central, and what is the process
for developers to request their project branches to be added to the
conversion jobs? Right now the process is pretty light-weight (they
usually just ping me on IRC or send me an email). It would be nice if we
kept that property.

4. About the interaction between gecko.git and beagle, when you say that
they cannot be identical, are you talking about the location where they're
hosted or the SHA1s for the same commits? I think we need to guarantee the
latter being identical going forward, but the former doesn't matter as much.

5. Is the git-mapfile generated from the conversion scripts available
somewhere? (Mine is currently available here:
http://git.mozilla.org/?p=users/eakh...@mozilla.com/mozilla-history-tools.git;a=history;f=updates/git-mapfile;h=5c8ecf4bf1d7339ad1dff1ea4ab62a712da2fb6c;hb=HEAD
)

6. Do you have any plans for how we're going to approach the problem of
migrating the current git mirror with beagle? Specifically, I think we
need to:
a) Figure out how github deals with a repository being deleted/recreated,
would that break people's forks, stars, etc.?
b) Are there any alternative solutions in replacing all of the
branches/tags under mozilla/mozilla-central with the new repo?
c) What is the developer work flow going to look like in rebasing the local
branches once the move has been completed?

Cheers,

--
Ehsan
<http://ehsanakhgari.org/>


Gregory Szorc

Sep 3, 2013, 1:21:28 PM

to Ehsan Akhgari, dev. planning, dev-pl...@lists.mozilla.org, jod...@mozilla.com

On 9/3/13 8:25 AM, Ehsan Akhgari wrote:
> 5. Is the git-mapfile generated from the conversion scripts available
> somewhere? (Mine is currently available here:
> http://git.mozilla.org/?p=users/eakh...@mozilla.com/mozilla-history-tools.git;a=history;f=updates/git-mapfile;h=5c8ecf4bf1d7339ad1dff1ea4ab62a712da2fb6c;hb=HEAD
> )
>
> 6. Do you have any plans for how we're going to approach the problem of
> migrating the current git mirror with beagle? Specifically, I think we
> need to:
> a) Figure out how github deals with a repository being deleted/recreated,
> would that break people's forks, stars, etc.?
> b) Are there any alternative solutions in replacing all of the
> branches/tags under mozilla/mozilla-central with the new repo?
> c) What is the developer work flow going to look like in rebasing the local
> branches once the move has been completed?

Publishing the mapfile as a raw file and possibly as a simple lookup
service would help here. I'd love to integrate this mapfile into my
"mozext" Mercurial extension [1] so things like |hg log| can easily
display the corresponding Git commits. Publishing the mapping of old Git
SHA-1's to new SHA-1's also enables easy transition/rebasing to the new
SHA-1's. tl;dr publish all the data publicly and good things happen.

[1] https://hg.mozilla.org/users/gszorc_mozilla.com/hgext-gecko-dev
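
To make the old-to-new SHA-1 mapping idea concrete, here is a minimal Python sketch. It assumes each mapfile line has the form "<git sha> <hg sha>" (an assumption about the file layout, not something confirmed in this thread), and the function names and short hashes are made up for illustration. Because both conversions start from the same hg changesets, joining the two mapfiles on the hg SHA-1 gives the old git SHA-1 to new git SHA-1 table.

```python
def parse_mapfile(text):
    """Return {hg_sha: git_sha} from mapfile text.

    Assumes one "<git sha> <hg sha>" pair per line; blank lines are skipped.
    """
    hg_to_git = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        git_sha, hg_sha = line.split()
        hg_to_git[hg_sha] = git_sha
    return hg_to_git


def old_to_new_git(old_mapfile_text, new_mapfile_text):
    """Join two mapfiles on the hg SHA-1 to get {old_git_sha: new_git_sha}."""
    old = parse_mapfile(old_mapfile_text)
    new = parse_mapfile(new_mapfile_text)
    return {old[hg]: new[hg] for hg in old if hg in new}


# Made-up short hashes, for illustration only.
OLD = "aaa111 h001\nbbb222 h002\nccc333 h003\n"
NEW = "ddd444 h001\neee555 h002\n"
MAPPING = old_to_new_git(OLD, NEW)
print(MAPPING)
```

Changesets that exist only in the old mirror (h003 here) simply drop out of the table, which is one way to spot commits that would not survive the switch.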

Aki Sasaki

Sep 3, 2013, 9:41:36 PM

(sigh, reply-to: doesn't automatically cc the newsgroup?)

On 9/3/13 8:25 AM, Ehsan Akhgari wrote:

> Thanks for the update on this, John! I have a number of questions about
> this:
>
> 1. On the issue of the hg tags, can you please be more specific about the
> problem that you're encountering? In my experience, git deals with updated
> tags the same way as it does with any non-fast-forward merge, and I've
> never experienced any problems with that.

I'm explicitly not converting most tags, and only whitelisting certain
ones to avoid the issue caused by moving tags.

To quote
https://www.kernel.org/pub/software/scm/git/docs/git-tag.html#_on_re_tagging
on moving git tags:

``Does this seem a bit complicated? It should be. There is no way that
it would be correct to just "fix" it automatically. People need to know
that their tags might have been changed.''

We move tags regularly on hg repos; this is standard operating procedure
for a release build 2, or if a build 1 has an automation hiccup. While
we *could* convert the tags automatically, then either never move them
or move them behind users' backs, users would then never get the updated
tags unless they explicitly delete and re-fetch the tag by name...
something people wouldn't typically do without prompting. In my
opinion, tags pointing at the wrong revision are worse than no tags.
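
As a rough illustration of what "explicitly delete and re-fetch" means for a user, here is a small Python sketch. It assumes you already have tag-to-SHA-1 tables for the local clone and for the remote (for example parsed from `git show-ref --tags` and `git ls-remote --tags`); the helper names, the remote name "origin", and the SHA-1 values are hypothetical. It only builds the commands, it does not run them.

```python
def moved_tags(local_tags, remote_tags):
    """Return sorted names of tags present on both sides with different SHA-1s."""
    return sorted(
        name
        for name, sha in local_tags.items()
        if name in remote_tags and remote_tags[name] != sha
    )


def refresh_commands(tag_names, remote="origin"):
    """Build the delete-and-refetch commands for each moved tag."""
    commands = []
    for name in tag_names:
        commands.append(["git", "tag", "-d", name])
        commands.append(["git", "fetch", remote, "tag", name])
    return commands


# Made-up SHA-1s; the second tag name is only an example.
LOCAL = {"FIREFOX_23_0_1_RELEASE": "1111", "FIREFOX_23_0_RELEASE": "2222"}
REMOTE = {"FIREFOX_23_0_1_RELEASE": "3333", "FIREFOX_23_0_RELEASE": "2222"}
STALE = moved_tags(LOCAL, REMOTE)
for command in refresh_commands(STALE):
    print(" ".join(command))
```

The point of the sketch is that nothing in a plain fetch tells the user that STALE is non-empty; someone has to go looking for it.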

Also, I need to limit the tags pushed to gecko.git, as there is a hard
fastforward-only rule there, and notifying partners to delete and
recreate tags seems like a non-starter. So I built in tag-limiting
whitelists for safety.
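
For readers wondering what a tag-limiting whitelist could look like, here is a minimal Python sketch using glob patterns. This is not the actual conversion script; the function name and the pattern are assumptions chosen purely for illustration.

```python
from fnmatch import fnmatchcase


def whitelisted_tags(tags, patterns):
    """Keep only the tags that match at least one whitelist glob pattern."""
    return [tag for tag in tags if any(fnmatchcase(tag, p) for p in patterns)]


# Illustrative pattern only; the real whitelist is not shown in this thread.
PATTERNS = ["FIREFOX_*_RELEASE"]
TAGS = ["FIREFOX_23_0_1_BUILD2", "FIREFOX_23_0_1_RELEASE"]
ALLOWED = whitelisted_tags(TAGS, PATTERNS)
print(ALLOWED)
```

Anything not matched is never pushed, so it can never need moving on a fastforward-only repo.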

However, there appears to be an issue with the way I'm limiting tags.
Rather than delay things further, I decided to publish as-is and see if
anyone really cares about the tags, or if they would be fine using the
tip of each relbranch instead.

Since you bring up the point of tags, can you give examples of how you
use tags, or how you have seen others use tags?

> 2. How frequently are these branches updated?

The current job is running on a 5 minute cron job. We can, of course,
change that if needed. When I add a new repo or a slew of new branches
to convert, the job can take longer to complete, but it typically
finishes in about 6 minutes.

> 3. What are the plans for adding the rest of the branches that we currently
> have in https://github.com/mozilla/mozilla-central, and what is the process
> for developers to request their project branches to be added to the
> conversion jobs? Right now the process is pretty light-weight (they
> usually just ping me on IRC or send me an email). It would be nice if we
> kept that property.

There are two concerns here: Project branches are often reset, and also
individual developers care about different subsets of branches.

Providing a core set of branches that everyone uses, and which we can
safely support at scale, seems a good set to include by default. For
users of other branches, one approach we're looking at is to provide
supported documentation that shows developers how to add
any-branch-of-their-choice to their local repo. We're still figuring out
what this default set should be, and as you have clearly expressed
opinions on this in the past, we'd of course be interested in your
current opinions here.

> 4. About the interaction between gecko.git and beagle, when you say that
> they cannot be identical, are you talking about the location where they're
> hosted or the SHA1s for the same commits? I think we need to guarantee the
> latter being identical going forward, but the former doesn't matter as much.

It's a requirement to keep the SHA1s the same, and we are confident we
can do that.
The location and naming for legacy b2g branches will be different, as
well as a hard fastforward-only rule on gecko.git.

> 5. Is the git-mapfile generated from the conversion scripts available
> somewhere? (Mine is currently available here:
> http://git.mozilla.org/?p=users/eakh...@mozilla.com/mozilla-history-tools.git;a=history;f=updates/git-mapfile;h=5c8ecf4bf1d7339ad1dff1ea4ab62a712da2fb6c;hb=HEAD
> )

I'm uploading this per run, but it's currently not going anywhere
publicly accessible.
We'll change the location of the upload to a public location before
signing off on this repo.

> 6. Do you have any plans for how we're going to approach the problem of
> migrating the current git mirror with beagle? Specifically, I think we
> need to:
> a) Figure out how github deals with a repository being deleted/recreated,
> would that break people's forks, stars, etc.?
> b) Are there any alternative solutions in replacing all of the
> branches/tags under mozilla/mozilla-central with the new repo?
> c) What is the developer work flow going to look like in rebasing the local
> branches once the move has been completed?

This is a one time hit, but obviously, we want the least churn and
disruption possible. We're still putting that plan together, so all good
questions to add to the planning criteria.
First, let's get feedback that the test-beagle repo looks good and can
be signed off.

aki

(originals to end)

Ehsan Akhgari

Sep 5, 2013, 12:51:01 PM

to mozilla.de...@lists.mozilla.org, dev-planning@lists.mozilla.org planning, Aki Sasaki, dev-pl...@lists.mozilla.org, jod...@mozilla.com

On 2013-09-03 9:39 PM, Aki Sasaki wrote:
> On 9/3/13 8:25 AM, Ehsan Akhgari wrote:
>> Thanks for the update on this, John! I have a number of questions about
>> this:
>>
>> 1. On the issue of the hg tags, can you please be more specific about the
>> problem that you're encountering? In my experience, git deals with updated
>> tags the same way as it does with any non-fast-forward merge, and I've
>> never experienced any problems with that.
>
> I'm explicitly not converting most tags, and only whitelisting certain
> ones to avoid the issue caused by moving tags.
>
> To quote
> https://www.kernel.org/pub/software/scm/git/docs/git-tag.html#_on_re_tagging
> on moving git tags:
>
> ``Does this seem a bit complicated? It should be. There is no way that
> it would be correct to just "fix" it automatically. People need to know
> that their tags might have been changed.''
>
> We move tags regularly on hg repos; this is standard operating procedure
> for a release build 2, or if a build 1 has an automation hiccup. While
> we *could* convert the tags automatically, then either never move them
> or move them behind users' backs, users would then never get the updated
> tags unless they explicitly delete and re-fetch the tag by name...
> something people wouldn't typically do without prompting. In my
> opinion, tags pointing at the wrong revision are worse than no tags.

Huh, interesting! I was actually under the impression that tags are
updated in a non-fast-forward manner similar to branches if you want,
but it seems to not be the case given the documentation.

> Also, I need to limit the tags pushed to gecko.git, as there is a hard
> fastforward-only rule there, and notifying partners to delete and
> recreate tags seems like a non-starter. So I built in tag-limiting
> whitelists for safety.
>
> However, there appears to be an issue with the way I'm limiting tags.
> Rather than delay things further, I decided to publish as-is and see if
> anyone really cares about the tags, or if they would be fine using the
> tip of each relbranch instead.
>
> Since you bring up the point of tags, can you give examples of how you
> use tags, or how you have seen others use tags?

I don't use tags myself at all. I remember someone pointing out that my
repo did not contain the hg tags a long time ago (2+ years), so I
unfortunately don't remember who that person was. That's when I started
pushing the tags as well. I just pointed this out as something that
caught my attention.

>> 2. How frequently are these branches updated?
>
> The current job is running on a 5 minute cron job. We can, of course,
> change that if needed. When I add a new repo or a slew of new branches
> to convert, the job can take longer to complete, but it typically
> finishes in about 6 minutes.

Sounds good, that's exactly the same setup and frequency that I have as
well. It seems to be working fine for most people.

>> 3. What are the plans for adding the rest of the branches that we currently
>> have in https://github.com/mozilla/mozilla-central, and what is the process
>> for developers to request their project branches to be added to the
>> conversion jobs? Right now the process is pretty light-weight (they
>> usually just ping me on IRC or send me an email). It would be nice if we
>> kept that property.
>
> There are two concerns here: Project branches are often reset, and also
> individual developers care about different subsets of branches.
>
> Providing a core set of branches that everyone uses, and which we can
> safely support at scale, seems a good set to include by default. For
> users of other branches, one approach we're looking at is to provide
> supported documentation that shows developers how to add
> any-branch-of-their-choice to their local repo. We're still figuring out
> what this default set should be, and as you have clearly expressed
> opinions on this in the past, we'd of course be interested in your
> current opinions here.

I strongly disagree that providing the current subset of branches is
enough. Firstly, to address your point about project branches being
reset, it's true and it works perfectly fine for the people who are
using those branches, since they will get a non-fast-forward merge which
signals them about this change, which they can choose to avoid if they
want. Also, if somebody gives up their project branch (such as is the
case for twigs) we can always delete the branch from the "main" remote
and doing that will not affect people who have local branches based on
that. Git does the right thing in every case.

About the fact that individual developers care about different subsets
of branches, that's precisely correct, and it's fine, since when using
git, you *always* ignore branches that you're not interested in, and are
never affected by what changes happen in those branches. We have an
extensive amount of experience with the current github mirror about both
of these points and we _know_ that neither of these two are issues here.

Part of the value of having a git mirror for developers is that we don't
require individual people to have their own mirroring infrastructure, so
just providing documentation that tells people how to set up their own
branches by setting up hg-git locally, etc. is kind of defeating the
purpose of having a mirror that helps everybody.

Here's my proposal: the RelEng-built git mirror should contain every
one of the branches that the current github mirror contains, and the
process of people requesting new branches to be mirrored in the future
should be as lightweight as possible (i.e., we should never say "no" to
a developer who asks for a new branch to be added to the mirror -- we
should trust them to know which branches are useful to them if they have
a git workflow.) This has served us quite well since the beginning of
the github mirror.

>> 4. About the interaction between gecko.git and beagle, when you say that
>> they cannot be identical, are you talking about the location where they're
>> hosted or the SHA1s for the same commits? I think we need to guarantee the
>> latter being identical going forward, but the former doesn't matter as much.
>
> It's a requirement to keep the SHA1s the same, and we are confident we
> can do that.
> The location and naming for legacy b2g branches will be different, as
> well as a hard fastforward-only rule on gecko.git.

That sounds fine.

One point to note here is that because of reasons which I have not
explored, some of our branches such as aurora and beta are regularly
updated in a non-fast-forward manner. You can see this if you run git
fetch in a local clone (there will be a "(forced update)" next to the
name of the branch in the output.) You might want to investigate why
that happens in case B2G starts to follow regular release trains in the
future (but please also note that it might just be a bug in my existing
setup.)
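
For anyone unsure what separates a normal update from a "(forced update)": a branch update is a fast-forward only when the old tip is an ancestor of the new tip. Here is a minimal Python sketch of that check over a toy commit graph; the commit names and the PARENTS table are made up, and this is a model of the rule rather than how git implements it.

```python
def is_ancestor(parents, ancestor, descendant):
    """True if `ancestor` is reachable from `descendant` via parent links."""
    stack = [descendant]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return False


def is_fast_forward(parents, old_tip, new_tip):
    """A ref update is a fast-forward when the old tip is an ancestor of the new."""
    return is_ancestor(parents, old_tip, new_tip)


# Toy history: a <- b <- c is the old branch; x replaces c without descending
# from it, which is what shows up as "(forced update)".
PARENTS = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
print(is_fast_forward(PARENTS, "b", "c"))  # ordinary update
print(is_fast_forward(PARENTS, "c", "x"))  # branch tip was replaced

# If the new tip also recorded the old tip as a parent, the same update
# would be a fast-forward.
PARENTS_WITH_LINK = dict(PARENTS, x=["a", "c"])
print(is_fast_forward(PARENTS_WITH_LINK, "c", "x"))
```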

>> 5. Is the git-mapfile generated from the conversion scripts available
>> somewhere? (Mine is currently available here:
>> http://git.mozilla.org/?p=users/eakh...@mozilla.com/mozilla-history-tools.git;a=history;f=updates/git-mapfile;h=5c8ecf4bf1d7339ad1dff1ea4ab62a712da2fb6c;hb=HEAD
>> )
>
> I'm uploading this per run, but it's currently not going anywhere
> publicly accessible.
> We'll change the location of the upload to a public location before
> signing off on this repo.

Sounds good! Just to confirm, do you end up with a single git-mapfile
in your setup? i.e., do you run hg-git only on a single local clone,
and just pull different hg branches into it, differentiated by bookmarks?

>> 6. Do you have any plans for how we're going to approach the problem of
>> migrating the current git mirror with beagle? Specifically, I think we
>> need to:
>> a) Figure out how github deals with a repository being deleted/recreated,
>> would that break people's forks, stars, etc.?
>> b) Are there any alternative solutions in replacing all of the
>> branches/tags under mozilla/mozilla-central with the new repo?
>> c) What is the developer work flow going to look like in rebasing the local
>> branches once the move has been completed?
>
> This is a one time hit, but obviously, we want the least churn and
> disruption possible. We're still putting that plan together, so all good
> questions to add to the planning criteria.

Sounds good!

> First, let's get feedback that the test-beagle repo looks good and can
> be signed off.

What kind of feedback are you interested in here? Just comparing the
contents of the repo on the tip of the branches can be performed using
md5sum.
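
A sketch of that kind of content comparison, in Python rather than with md5sum directly: it computes one MD5 digest over every file path and its contents in a checkout, skipping the .git and .hg metadata directories, so that a git checkout and an hg checkout of the same revision can be compared. The function names and the tiny temporary trees are made up for illustration.

```python
import hashlib
import os
import tempfile


def tree_digest(root, skip_dirs=(".git", ".hg")):
    """Return one MD5 hex digest covering every file path and its contents."""
    digest = hashlib.md5()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune VCS metadata in place and sort for a deterministic walk.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            relative = os.path.relpath(path, root).replace(os.sep, "/")
            digest.update(relative.encode("utf-8"))
            digest.update(b"\0")
            with open(path, "rb") as handle:
                digest.update(handle.read())
            digest.update(b"\0")
    return digest.hexdigest()


def write_file(root, relative, data):
    """Create a file (and its parent directories) under root."""
    path = os.path.join(root, relative)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as handle:
        handle.write(data)


# Two toy checkouts with the same tracked file but different VCS metadata.
with tempfile.TemporaryDirectory() as git_checkout, \
        tempfile.TemporaryDirectory() as hg_checkout:
    write_file(git_checkout, "js/src/a.cpp", b"int x;\n")
    write_file(git_checkout, ".git/HEAD", b"ref: refs/heads/master\n")
    write_file(hg_checkout, "js/src/a.cpp", b"int x;\n")
    write_file(hg_checkout, ".hg/branch", b"default\n")
    SAME = tree_digest(git_checkout) == tree_digest(hg_checkout)
print(SAME)
```

This only checks the tip contents; it says nothing about whether the history behind the tip was converted correctly.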

Cheers,
Ehsan

Nicolas B. Pierron

Sep 5, 2013, 1:34:51 PM

On 09/05/2013 09:51 AM, Ehsan Akhgari wrote:
> On 2013-09-03 9:39 PM, Aki Sasaki wrote:
>> On 9/3/13 8:25 AM, Ehsan Akhgari wrote:
>>> 2. How frequently are these branches updated?
>>
>> The current job is running on a 5 minute cron job. We can, of course,
>> change that if needed. When I add a new repo or a slew of new branches
>> to convert, the job can take longer to complete, but it typically
>> finishes in about 6 minutes.
>
> Sounds good, that's exactly the same setup and frequency that I have as
> well. It seems to be working fine for most people.

I had the same frequency before, but the problem is that when you are trying
to push, you are up to 5 minutes behind everyone else. It happened to me
that I had to rebase multiple times before being able to push anything to
inbound.

Since then, I have changed to a system which checks the pushlogs for
modifications every 10s; if there is a modification, the script updates the
modified branch.

The reason I set the timer so low is that the pushlog is served over http,
and the server is likely to cache it, knowing that it is used by tbpl (which,
by the way, uses https). I did not do it before because "hg id" still
implies that we have to establish a secure connection.
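
The polling approach described above can be sketched roughly as follows in Python. The names are hypothetical and the way the latest push id per branch is fetched is deliberately left as a caller-supplied function, since the pushlog format is not described in this thread.

```python
import time


def changed_branches(previous, current):
    """Branches whose latest push id differs from what was seen last time."""
    return sorted(
        branch
        for branch, push_id in current.items()
        if previous.get(branch) != push_id
    )


def poll(fetch_latest_push_ids, update_branch, interval=10, rounds=None,
         sleep=time.sleep):
    """Check the pushlogs every `interval` seconds and update changed branches.

    fetch_latest_push_ids: callable returning {branch: latest_push_id}.
    update_branch: callable invoked once per changed branch.
    rounds: number of polls to run, or None to loop forever.
    """
    seen = {}
    done = 0
    while rounds is None or done < rounds:
        current = fetch_latest_push_ids()
        for branch in changed_branches(seen, current):
            update_branch(branch)
        seen = dict(current)
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)


# Simulated pushlog states: inbound gets a new push between the two polls.
STATES = iter([{"inbound": 1, "central": 5}, {"inbound": 2, "central": 5}])
UPDATED = []
poll(lambda: next(STATES), UPDATED.append, rounds=2, sleep=lambda seconds: None)
print(UPDATED)
```

On the first poll every branch counts as changed, which doubles as the initial sync; after that only branches with a new push trigger a conversion.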

> Part of the value of having a git mirror for developers is that we don't
> require individual people to have their own mirroring infrastructure, so
> just providing documentation that tells people how to set up their own
> branches by setting up hg-git locally, etc. is kind of defeating the purpose
> of having a mirror that helps everybody.

I agree.

It is for this exact reason that I moved all of this process to another
computer. Having to use a special tool for pushing is a terrible git
integration.

>>> 4. About the interaction between gecko.git and beagle, when you say that
>>> they cannot be identical, are you talking about the location where they're
>>> hosted or the SHA1s for the same commits? I think we need to guarantee the
>>> latter being identical going forward, but the former doesn't matter as much.
>>
>> It's a requirement to keep the SHA1s the same, and we are confident we
>> can do that.
>> The location and naming for legacy b2g branches will be different, as
>> well as a hard fastforward-only rule on gecko.git.
>
> That sounds fine.
>
> One point to note here is that because of reasons which I have not explored,
> some of our branches such as aurora and beta are regularly updated in a
> non-fast-forward manner. You can see this if you run git fetch in a local
> clone (there will be a "(forced update)" next to the name of the branch in
> the output.) You might want to investigate why that happens in case B2G
> starts to follow regular release trains in the future (but please also note
> that it might just be a bug in my existing setup.)

I never experienced that before. I guess this might be related to the fact
that I pull all mercurial repositories into one before doing the conversion
to git.

--
Nicolas B. Pierron

Aki Sasaki

Sep 5, 2013, 10:33:40 PM

to Ehsan Akhgari, John O'Duinn

Ok. I'm going to leave the tags the way they are, until whoever that is
speaks up.

We're considering doing the following:

1) leaving test-beagle's repo list the same as it is today.
These are the core hg repos gecko developers will care about going
forward. If another core repository is added, we can add it to the list.

2) creating another RelEng-provided, CVS-history based, same-SHA repo,
which only contains project branches. This would be separate to (1)
above. If this repo gets corrupted by project resetting (or any other
reason), we'll just blow it away and recreate it.

We expect people to use (1), but if you want to use project branches,
you can add (2) as a second remote to your local repo. While this is an
additional setup step for developers wanting project branches in their
git repo, this approach also:

* speeds up the process of pushing the core repos,
* allows us to more easily make the core repo list more strict in terms
of fast-forwardability etc. while letting the project branches be much
less restricted,
* allows us to support the core repos at a higher level than the project
branches. For instance, if a project repo reset or broken commit were
to bring the entire conversion loop to its knees, it would only affect
project branches, not core repos, and
* allows us to make the project branch list much more flexible, without
worrying that we're cluttering the core repo mapfile and conversion
directories with revisions that will never again see the light of day.

> About the fact that individual developers care about different subsets
> of branches, that's precisely correct, and it's fine, since when using
> git, you *always* ignore branches that you're not interested in, and are
> never affected by what changes happen in those branches. We have an
> extensive amount of experience with the current github mirror about both
> of these points and we _know_ that neither of these two are issues here.

I agree that you're able to limit the branch list when you're pulling
from the github mirror.

I think that adding these additional branches (that are known to not be
pertinent to most developers and have a limited lifespan) could affect
the robustness and efficiency of the core repo conversion over time,
however. This robustness is important to the entire organization, so
we're choosing the safer approach.

> Part of the value of having a git mirror for developers is that we don't
> require individual people to have their own mirroring infrastructure, so
> just providing documentation that tells people how to set up their own
> branches by setting up hg-git locally, etc. is kind of defeating the
> purpose of having a mirror that helps everybody.

My proposal above doesn't involve developers having to use hg-git at all.

> Here's my proposal: the RelEng-built git mirror should contain every
> one of the branches that the current github mirror contains, and the
> process of people requesting new branches to be mirrored in the future
> should be as lightweight as possible (i.e., we should never say "no" to
> a developer who asks for a new branch to be added to the mirror -- we
> should trust them to know which branches are useful to them if they have
> a git workflow.) This has served us quite well since the beginning of
> the github mirror.

My proposal would be conducive to making the mirroring request more
lightweight, since the core repos are buffered from the risk of adding
those project branches.

>>> 4. About the interaction between gecko.git and beagle, when you say that
>>> they cannot be identical, are you talking about the location where
>>> they're
>>> hosted or the SHA1s for the same commits? I think we need to
>>> guarantee the
>>> latter being identical going forward, but the former doesn't matter
>>> as much.
>>
>> It's a requirement to keep the SHA1s the same, and we are confident we
>> can do that.
>> The location and naming for legacy b2g branches will be different, as
>> well as a hard fastforward-only rule on gecko.git.
>
> That sounds fine.
>
> One point to note here is that because of reasons which I have not
> explored, some of our branches such as aurora and beta are regularly
> updated in a non-fast-forward manner. You can see this if you run git
> fetch in a local clone (there will be a "(forced update)" next to the
> name of the branch in the output.) You might want to investigate why
> that happens in case B2G starts to follow regular release trains in the
> future (but please also note that it might just be a bug in my existing
> setup.)

Yes. I noted this in my blog post:
http://escapewindow.dreamwidth.org/238476.html#what_is_gecko_git
which I understand is tl;dr.

I have already reached out to Release Management to change their
migration process to include |hg debugsetparents|. I just spoke with
lsblakk, who confirmed that is now added to the migration scripts, and
will take effect for the Firefox 26 Aurora merge next week.

>>> 5. Is the git-mapfile generated from the conversion scripts available
>>> somewhere? (Mine is currently available here:
>>> http://git.mozilla.org/?p=users/eakh...@mozilla.com/mozilla-history-tools.git;a=history;f=updates/git-mapfile;h=5c8ecf4bf1d7339ad1dff1ea4ab62a712da2fb6c;hb=HEAD
>>>
>>> )
>>
>> I'm uploading this per run, but it's currently not going anywhere
>> publicly accessible.
>> We'll change the location of the upload to a public location before
>> signing off on this repo.
>
> Sounds good! Just to confirm, do you end up with a single git-mapfile
> in your setup? i.e., do you run hg-git only on a single local clone,
> and just pull different hg branches into it, differentiated by bookmarks?

That's correct, though in the above proposal, there would be one
git-mapfile for the core repos and another for the project branches.

gecko.git, which is partner-oriented, would have its own mapfile that
will be a subset of beagle's.

>>> 6. Do you have any plans for how we're going to approach the problem of
>>> migrating the current git mirror with beagle? Specifically, I think we
>>> need to:
>>> a) Figure out how github deals with a repository being
>>> deleted/recreated,
>>> would that break people's forks, stars, etc.?
>>> b) Are there any alternative solutions in replacing all of the
>>> branches/tags under mozilla/mozilla-central with the new repo?
>>> c) What is the developer work flow going to look like in rebasing the
>>> local
>>> branches once the move has been completed?
>>
>> This is a one time hit, but obviously, we want the least churn and
>> disruption possible. We're still putting that plan together, so all good
>> questions to add to the planning criteria.
>
> Sounds good!
>
>> First, let's get feedback that the test-beagle repo looks good and can
>> be signed off.
>
> What kind of feedback are you interested in here? Just comparing the
> contents of the repo on the tip of the branches can be performed using
> md5sum.

I can compare SHA1's against gecko.git, but only on the branches they
share in common.
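
As a sketch of that comparison: given the output of `git ls-remote --heads` for each repo (lines of SHA-1, a tab, then the ref name), the Python below reports, for each branch both repos share, whether the tips match. The function names and the sample output are made up; the branch names are only examples.

```python
def parse_ls_remote(text):
    """Return {branch: sha} from `git ls-remote --heads` style output."""
    prefix = "refs/heads/"
    tips = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split("\t", 1)
        if ref.startswith(prefix):
            tips[ref[len(prefix):]] = sha
    return tips


def compare_shared_tips(tips_a, tips_b):
    """Return {branch: tips_match} for branches present in both repos."""
    shared = sorted(set(tips_a) & set(tips_b))
    return {branch: tips_a[branch] == tips_b[branch] for branch in shared}


# Made-up SHA-1s and branch lists, for illustration only.
BEAGLE = "1111\trefs/heads/master\n2222\trefs/heads/aurora\n3333\trefs/heads/fx-team\n"
GECKO = "1111\trefs/heads/master\n9999\trefs/heads/aurora\n"
RESULT = compare_shared_tips(parse_ls_remote(BEAGLE), parse_ls_remote(GECKO))
print(RESULT)
```

Branches that exist in only one of the two repos (fx-team in this toy data) are left out of the report, which matches the limitation described above.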

The feedback needed here is: is this good enough to use, and to
consolidate all the various other gecko-based repos we are each
supporting? That's partially answered here in this email thread, but
should anyone have the time and interest, please dig deeper into the
repository for anything that's missing or different than expected.

aki

>
> Cheers,
> Ehsan

Mook

Sep 5, 2013, 11:48:45 PM9/5/13

to

On 09/05/2013 09:51 AM, Ehsan Akhgari wrote:
> On 2013-09-03 9:39 PM, Aki Sasaki wrote:

>> Since you bring up the point of tags, can you give examples of how you
>> use tags, or how you have seen others use tags?
>
> I don't use tags myself at all, I remember someone pointing out that my
> repo did not contain the hg tags a long time ago (2+ years) so I
> unfortunately don't remember who that person was. That's when I started
> pushing the tags as well. I just pointed this out as something that
> caught my attention.
>

I might have been that someone; I don't recall. My usual use case
just involves building something one-shot, though; I can live
with having a separate repo with tags (similar to mozilla-release).
Alternatively, treating tags as branches instead will also work for me.
With some work, I can probably switch to using tarballs (it's just
more annoying if I later find a reason to hack on that tree). The main
reason I preferred git was because it would actually finish cloning, in
contrast to hg (where I'd need to start from a bundle). That and `git
clone --reference` is very, very useful...

--
Mook

Aki Sasaki

Sep 6, 2013, 1:13:14 PM9/6/13

to

If you're looking for the {FIREFOX,FENNEC}_*_RELEASE type tags, those
are all created on relbranches.

For instance, FIREFOX_23_0_1_BUILD2 and FIREFOX_23_0_1_RELEASE are on
GECKO2301_2013081518_RELBRANCH :
http://hg.mozilla.org/releases/mozilla-release/graph/144022

That relbranch does exist on test-beagle:
https://github.com/escapewindow/test-beagle/tree/GECKO2301_2013081518_RELBRANCH

Aki Sasaki

Sep 6, 2013, 2:05:48 PM9/6/13

to Nicolas B. Pierron

On 9/5/13 10:32 AM, Nicolas B. Pierron wrote:
> On 09/05/2013 09:51 AM, Ehsan Akhgari wrote:
>> On 2013-09-03 9:39 PM, Aki Sasaki wrote:

>>> On 9/3/13 8:25 AM, Ehsan Akhgari wrote:
>>>> 2. How frequently are these branches updated?
>>>
>>> The current job is running on a 5 minute cron job. We can, of course,
>>> change that if needed. When I add a new repo or a slew of new branches
>>> to convert, the job can take longer to complete, but it typically
>>> finishes in about 6 minutes.
>>
>> Sounds good, that's exactly the same setup and frequency that I have as
>> well. It seems to be working fine for most people.
>

> I had the same frequency before, but the problem is that when you are
> trying to push, you are 5 minutes behind everyone else. More than once
> I had to rebase multiple times before being able to push anything to
> inbound.
>
> Since then, I have changed to a system that checks the pushlogs for
> modifications every 10s; if there is one, the script updates the
> modified branch.
>
> The reason I set the timer so low is that the pushlog is served
> over http, and the server is likely to cache it knowing that it is
> used by tbpl (which, by the way, uses https). I did not do this before
> because "hg id" still implies that we have to establish a secure
> connection.

Sounds like "It seems to be working fine for most people" + it doesn't
work for Nicolas, who has his own solution. Accurate?

I will be working on reducing the length of the process by parallelizing
what I can, but thought this was at an acceptable-enough level that I
should move towards pushing test-beagle live, and leave further
enhancements + improvements as non-blocking.

Nicolas B. Pierron

Sep 6, 2013, 2:44:33 PM9/6/13

to Aki Sasaki

A 5-minute latency is a lot compared to the time between commits during rush
hours (when the tree is open).

And yes, I have my own solution, because there was nothing else back in
December 2011. And as far as I know, I am still the only one who is
*only* using git to push to central, try, etc.

> I will be working on reducing the length of the process by parallelizing
> what I can, but thought this was at an acceptable-enough level that I
> should move towards pushing test-beagle live, and leave further
> enhancements + improvements as non-blocking.

Sure, I am not saying this should be a blocker, just saying that it is way
more useful to have something which is mirrored without latency, especially
for pushing.

If you want I can provide you the script that I made to trigger the updates
based on the pushlog modifications. By the way, I am still updating
branches sequentially, but I only mirror 3.

--
Nicolas B. Pierron

Aki Sasaki

Sep 6, 2013, 3:13:43 PM9/6/13

to Nicolas B. Pierron

Ok.
You may need to keep using it until I have time to make the changes below.

Currently I'm thinking:

* fix https://bugzilla.mozilla.org/show_bug.cgi?id=706887 to allow for
elegant parallelization
* look at your script for a currently working workflow
* add some db/json/whatever caching so I can keep track of latest-pushed
hg- and git- revisions for each head, tag, etc. to reduce wasted noop time
* parallelize what I can (git pushes, hg pulls? possibly even hg
gexport). this may involve splitting each hg repo into its own separate
conversion directory to avoid contention. I'm confident this can work
since I've already done so much full-cvs-history conversion testing in
the past few months.
* add pushlog polling per hg repo
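The caching and pushlog-polling items could be combined along these lines. This is a minimal sketch, not the script either of us runs: `fetch_pushlog` is a hypothetical stand-in that reads a local file instead of querying the hg server, the state-file layout is invented, and the loop is bounded so the example terminates.

```shell
set -eu
state_dir=$(mktemp -d)
fake_remote=$(mktemp -d)   # stands in for the hg server in this demo

# Hypothetical stand-in for fetching a repo's pushlog; a real poller would
# make an HTTP request for the repo's pushlog here instead.
fetch_pushlog() {
    cat "$fake_remote/$1.pushlog"
}

# Succeeds (exit 0) only when the pushlog differs from the last one seen,
# and records the new checksum so the next unchanged poll is a cheap no-op.
pushlog_changed() {
    repo=$1
    new_sum=$(fetch_pushlog "$repo" | cksum)
    old_sum=$(cat "$state_dir/$repo.sum" 2>/dev/null || true)
    [ "$new_sum" != "$old_sum" ] || return 1
    printf '%s\n' "$new_sum" > "$state_dir/$repo.sum"
}

updated=""
echo "push 1" > "$fake_remote/mozilla-inbound.pushlog"
for attempt in 1 2 3; do
    # Simulate a new push arriving just before the third poll.
    [ "$attempt" -ne 3 ] || echo "push 2" >> "$fake_remote/mozilla-inbound.pushlog"
    if pushlog_changed mozilla-inbound; then
        updated="$updated $attempt"    # a real loop would run the conversion here
    fi
    # A real poller would sleep (e.g. 10 seconds) between iterations.
done
echo "polls that triggered an update:$updated"
```

Only the first poll and the poll after the simulated push trigger work; the unchanged poll in between costs one fetch and one checksum comparison.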

That's not a small list, especially if I want to make it
production-level robust and bulletproof (and I do). Hopefully your
script works for you until then.

>> I will be working on reducing the length of the process by parallelizing
>> what I can, but thought this was at an acceptable-enough level that I
>> should move towards pushing test-beagle live, and leave further
>> enhancements + improvements as non-blocking.
>
> Sure, I am not saying this should be a blocker, just saying that it is
> way more useful to have something which is mirrored without latency,
> especially for pushing.

Understood, and I'm glad we agree it shouldn't block rolling out this
phase of the repo.

> If you want I can provide you the script that I made to trigger the
> updates based on the pushlog modifications. By the way, I am still
> updating branches sequentially, but I only mirror 3.

Yes, please.

Ehsan Akhgari

Sep 6, 2013, 3:52:59 PM9/6/13

to Aki Sasaki, dev-planning@lists.mozilla.org planning, John O'Duinn

Erm, the point about the repository corruption is rather surprising, since
what _can_ go wrong is a given branch not being updated -- hg-git won't
affect other branches when it's converting any given bookmark, so you can
just avoid pushing branches for which the conversion fails. If you're
talking about corruption at a lower level (such as disk corruption) then
that can affect any repository on the file system.
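The "avoid pushing branches for which the conversion fails" idea can be sketched as a loop that treats each branch independently. This is an illustration under assumptions only: `convert_branch` is a hypothetical stand-in that fails for one made-up branch name, and no real hg-git or git commands are run.

```shell
set -eu

# Hypothetical stand-in for converting one bookmark/branch; here it simply
# fails for one branch to simulate a broken commit or repo reset.
convert_branch() {
    [ "$1" != "broken-twig" ]
}

pushed=""
skipped=""
for branch in mozilla-central mozilla-inbound broken-twig mozilla-aurora; do
    if convert_branch "$branch"; then
        pushed="$pushed $branch"      # a real script would push this branch here
    else
        skipped="$skipped $branch"    # leave the last good ref in place and move on
    fi
done

echo "pushed:$pushed"
echo "skipped:$skipped"
```

Branches after the failing one are still processed, which is the property being argued for; whether a real conversion failure is always this cleanly contained is exactly the point under discussion.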

> We expect people to use (1), but if you want to use project branches,
> you can add (2) as a second remote to your local repo. While this is an
> additional setup step for developers wanting project branches in their
> git repo, this approach also:
>
> * speeds up the process of pushing the core repos,
>

How does that speed things up? You can convert different branches
separately at different times if that's needed. FWIW in my setup (on a not
particularly beefy machine) the cost of adding a branch is sub-linear to
the number of branches, since I only do work if there are changesets to be
converted, which means most of the branch updates are going to be close to
no-ops.

> * allows us to more easily make the core repo list more strict in terms
> of fast-forwardability etc. while letting the project branches be much
> less restricted,
>

The fast-forwardability or lack of it will only affect the people who are
_using_ those remote branch names, so putting those branches in a different
repository doesn't really save anyone.

> * allows us to support the core repos at a higher level than the project
> branches. For instance, if a project repo reset or broken commit were
> to bring the entire conversion loop to its knees, it would only affect
> project branches, not core repos, and
>

I don't understand how that failure scenario would happen. Again, the
hg-git conversion is per-branch, not per-repo. It could be that a branch
starts failing to update, but that doesn't need to block other branches
from being updated.

> * allows us to make the project branch list much more flexible, without
> worrying that we're cluttering the core repo mapfile and conversion
> directories with revisions that will never again see the light of day.
>

It will also make using the git-mapfile for other people more difficult
because then they need to figure out which mapfile they want, which is not
easy, or they need to manually merge the git-mapfiles every time, which can
be expensive.

I'm not convinced that this is the right thing to do, and like I said we
*do* have evidence that my proposal actually works in practice. But it
sounds like you've already made up your mind. :(

> > About the fact that individual developers care about different subsets
> > of branches, that's precisely correct, and it's fine, since when using
> > git, you *always* ignore branches that you're not interested in, and are
> > never affected by what changes happen in those branches. We have an
> > extensive amount of experience with the current github mirror about both
> > of these points and we _know_ that neither of these two are issues here.
>
>
> I agree that you're able to limit the branch list when you're pulling
> from the github mirror.
>
> I think that adding these additional branches (that are known to not be
> pertinent to most developers and have a limited lifespan) could affect
> the robustness and efficiency of the core repo conversion over time,
> however. This robustness is important to the entire organization, so
> we're choosing the safer approach.
>

That has not been my experience in running the existing conversion
infrastructure in my spare time for over 2 years. Do you have any concrete
worries? (It's OK if it's just erring on the safe side in the face of the
unknown, I'm trying to let you know about my experience to hopefully
convince you that this is going to be fine.)

> > Part of the value of having a git mirror for developers is that we don't
> > require individual people to have their own mirroring infrastructure, so
> > just providing documentation that tells people how to set up their own
> > branches by setting up hg-git locally, etc. is kind of defeating the
> > purpose of having a mirror that helps everybody.
>
> My proposal above doesn't involve developers having to use hg-git at all.
>
> > Here's my proposal: the RelEng built git mirror should contain every
> > one of the branches that the current github mirror contains, and the
> > process of people requesting new branches to be mirrored in the future
> > should be as lightweight as possible (i.e., we should never say "no" to
> > a developer who asks for a new branch to be added to the mirror -- we
> > should trust them to know which branches are useful to them if they have
> > a git workflow.) This has served us quite well since the beginning of
> > the github mirror.
>
> My proposal would be conducive to making the mirroring request more
> lightweight, since the core repos are buffered from the risk of adding
> those project branches.
>

Hopefully the above is enough to convince you that those are only perceived
risks.

> >>> 4. About the interaction between gecko.git and beagle, when you say that
> >>> they cannot be identical, are you talking about the location where they're
> >>> hosted or the SHA1s for the same commits? I think we need to guarantee the
> >>> latter being identical going forward, but the former doesn't matter as much.
> >>
> >> It's a requirement to keep the SHA1s the same, and we are confident we
> >> can do that.
> >> The location and naming for legacy b2g branches will be different, as
> >> well as a hard fastforward-only rule on gecko.git.
> >
> > That sounds fine.
> >
> > One point to note here is that because of reasons which I have not
> > explored, some of our branches such as aurora and beta are regularly
> > updated in a non-fast-forward manner. You can see this if you run git
> > fetch in a local clone (there will be a "(forced update)" next to the
> > name of the branch in the output.) You might want to investigate why
> > that happens in case B2G starts to follow regular release trains in the
> > future (but please also note that it might just be a bug in my existing
> > setup.)
>
> Yes. I noted this in my blog post:
> http://escapewindow.dreamwidth.org/238476.html#what_is_gecko_git
> which I understand is tl;dr.
>

Sorry, I have actually gone through your post at least a couple of times
(but I find it very difficult to understand everything there), but I was
_not_ talking about this happening during uplifts. I was talking about it
happening at other times.

> I have already reached out to Release Management to change their
> migration process to include |hg debugsetparents|. I just spoke with
> lsblakk, who confirmed that is now added to the migration scripts, and
> will take effect for the Firefox 26 Aurora merge next week.
>
> >>> 5. Is the git-mapfile generated from the conversion scripts available
> >>> somewhere? (Mine is currently available here:
> >>> http://git.mozilla.org/?p=users/eakh...@mozilla.com/mozilla-history-tools.git;a=history;f=updates/git-mapfile;h=5c8ecf4bf1d7339ad1dff1ea4ab62a712da2fb6c;hb=HEAD
> >>> )
> >>
> >> I'm uploading this per run, but it's currently not going anywhere
> >> publicly accessible.
> >> We'll change the location of the upload to a public location before
> >> signing off on this repo.
> >
> > Sounds good! Just to confirm, do you end up with a single git-mapfile
> > in your setup? i.e., do you run hg-git only on a single local clone,
> > and just pull different hg branches into it, differentiated by bookmarks?
>
> That's correct, though in the above proposal, there would be one
> git-mapfile for the core repos and another for the project branches.
>

That's sad. See above. :-)

> gecko.git, which is partner-oriented, would have its own mapfile that
> will be a subset of beagle's.
>

That's fine, hopefully nobody would need to use that mapfile at all.

> >>> 6. Do you have any plans for how we're going to approach the problem of
> >>> migrating the current git mirror with beagle? Specifically, I think we
> >>> need to:
> >>> a) Figure out how github deals with a repository being deleted/recreated;
> >>> would that break people's forks, stars, etc.?
> >>> b) Are there any alternative solutions in replacing all of the
> >>> branches/tags under mozilla/mozilla-central with the new repo?
> >>> c) What is the developer work flow going to look like in rebasing the
> >>> local branches once the move has been completed?
> >>
> >> This is a one time hit, but obviously, we want the least churn and
> >> disruption possible. We're still putting that plan together, so all good
> >> questions to add to the planning criteria.
> >
> > Sounds good!
> >
> >> First, let's get feedback that the test-beagle repo looks good and can
> >> be signed off.
> >
> > What kind of feedback are you interested in here? Just comparing the
> > contents of the repo on the tip of the branches can be performed using
> > md5sum.
>
> I can compare SHA1's against gecko.git, but only on the branches they
> share in common.
>

Sure, that should be enough to ensure compat with that repository.

> The feedback needed here is: is this good enough to use? to consolidate
> all the various other gecko based repos we are each supporting? That's
> partially answered here in this email thread, but should anyone have the
> time and interest, please dig deeper into the repository for anything
> that's missing or different than expected.
>

I don't think that manual investigation besides what I suggested really
buys us anything. I remember there being an obscure difference at around
the point at which you and I have grafted the CVS repository to the hg
repository (it was a long time ago -- don't remember the details) but I
remember that neither were particularly better than the other, and it
doesn't really matter to anybody anyway! :-)

jste...@gmail.com

Oct 9, 2013, 6:01:57 PM10/9/13

to

Hey guys,

Thanks for working through the details of getting these conversions into a better place! I've spent some time looking through the contents of this repo, and it looks all good to me. I know there's more to do here, but I hereby say that this stuff is good to go, let's make this live and official and move on to the next step.

Thanks,
Johnny

Aki Sasaki

Oct 11, 2013, 10:11:28 PM10/11/13

to jste...@gmail.com

On 10/9/13 3:01 PM, jste...@gmail.com wrote:
> Hey guys,
>
> Thanks for working through the details of getting these conversions into a better place! I've spent some time looking through the contents of this repo, and it looks all good to me. I know there's more to do here, but I hereby say that this stuff is good to go, let's make this live and official and move on to the next step.
>
> Thanks,
> Johnny

Thanks! Here's what I have:

gecko-dev
=========

gecko-dev, née beagle, is now mirrored in two locations [1][2]. It
contains all release-train branches and inbound branches. There are
fast-forward-only and no-deletes rules that we should also enforce with hg
pre-commit hooks, though we have a way to fix non-fast-forward commits
with |hg debugsetparents|. The mapfile, repo_update.json (status), and
logs are currently being published here [3] (a permanent location is
coming shortly).

[1] http://git.mozilla.org/?p=integration/gecko-dev.git;a=summary
[2] https://github.com/mozilla/integration-gecko-dev
[3] http://people.mozilla.org/~asasaki/vcs2vcs/gecko-dev/

gecko-projects
==============

gecko-projects now lives on github [4]. It contains all "twig"
branches, as well as several other branches under
hg.mozilla.org/projects/, and services-central (full list here [5]).
Instructions on how to use this repo are here [6]. The mapfile,
repo_update.json (status), and logs are currently being published here
[7] (a permanent location, and combined mapfiles, should be coming shortly).

[4] https://github.com/mozilla/integration-gecko-projects
[5] https://github.com/mozilla/integration-gecko-projects/branches
[6]
https://github.com/mozilla/integration-gecko-projects/blob/master/README.md
[7] http://people.mozilla.org/~asasaki/vcs2vcs/gecko-projects/

mapfiles, status json, logs
===========================

The mapfiles are each generated by their own process, and map hg to git
shas. It's possible to combine them by downloading them, then

sort --unique --field-separator=" " --key=2 mapfile1 mapfile2 > combined_mapfile

I'm planning on having this automated shortly.
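To see what that merge does, here is a minimal sketch on toy data. It assumes each mapfile line has the layout `<git sha> <hg sha>`; the file contents are made up, and the short options `-u -t ' ' -k 2` are used as the portable spelling of the long options above.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Made-up entries in the assumed "<git sha> <hg sha>" layout.
printf 'g1 h1\ng2 h2\n' > mapfile1
printf 'g2 h2\ng3 h3\n' > mapfile2

# Sort on the second field (the hg SHA) and keep one line per key.
sort -u -t ' ' -k 2 mapfile1 mapfile2 > combined_mapfile

cat combined_mapfile
```

Because uniqueness is decided on the second field only, two lines that map the same hg SHA to different git SHAs would be silently collapsed into one, so the merge is only safe if both conversions agree on shared revisions.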

The repo_update status json is likely overly verbose, but I wasn't quite
sure how to otherwise denote different update/push times for different
repos/branches. I imagine we may revisit this.

The logs are uploaded in logs/, though they'll be rotated+overwritten
every minute or so. I'm currently leaning towards a new upload
directory per run, with a latest softlink, and cron-based cleanup, once
we decide where the permanent home should be.
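The per-run directory idea might look roughly like this. It is a sketch under assumptions: the upload root is a temporary directory standing in for the undecided permanent home, and the timestamp naming and 7-day retention are illustrative choices, not decisions.

```shell
set -eu
upload_root=$(mktemp -d)          # stand-in for the eventual permanent home

# One directory per run, named by UTC timestamp (naming scheme is an assumption).
run_id=$(date -u +%Y%m%dT%H%M%SZ)
mkdir "$upload_root/$run_id"
echo "conversion log for $run_id" > "$upload_root/$run_id/run.log"

# Repoint "latest" at the newest run; -n replaces the link itself instead of
# creating a new link inside the directory it currently points to.
ln -sfn "$run_id" "$upload_root/latest"

# Cleanup that cron could run: delete run directories older than 7 days.
find "$upload_root" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +

ls -l "$upload_root"
```

The `latest` link is relative, so the whole tree can be moved or served from a different path without breaking it.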

More enhancements and documentation are in the works.

Ehsan Akhgari

Oct 15, 2013, 5:40:32 PM10/15/13

to Aki Sasaki, dev-pl...@lists.mozilla.org

Thanks Aki, this is looking great! I was skimming over the links in
your post, and I noticed that the git-mapfile on your people account is
13MB, while the one that I have on my conversion server is about 101MB
right now... Do you have any idea what may have caused that size
difference?

Thanks!
Ehsan

> _______________________________________________
> dev-planning mailing list
> dev-pl...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-planning
>

Aki Sasaki

Oct 15, 2013, 5:45:02 PM10/15/13

to Ehsan Akhgari, dev-pl...@lists.mozilla.org

Great, glad to hear!

I don't know why it's *that* large of a difference, but I know I'm
[intentionally] only converting whitelisted branches and tags, not *
like you have been. Also, gecko-projects has its own mapfile, though I
still have a combined mapfile on my todo list.

Maybe we should diff them to see more clearly what revisions are being
skipped?
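Such a diff could be done on the hg SHA column alone. A minimal sketch, assuming each mapfile line is `<git sha> <hg sha>`; the file names and entries are invented for the example.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Made-up entries in the assumed "<git sha> <hg sha>" layout.
printf 'g1 h1\ng2 h2\ng3 h3\n' > big_mapfile
printf 'g1 h1\ng3 h3\n' > small_mapfile

# comm needs sorted input; compare on the hg SHA column only.
cut -d ' ' -f 2 big_mapfile | sort > big.hg
cut -d ' ' -f 2 small_mapfile | sort > small.hg

# -23 keeps only lines unique to the first file, i.e. the revisions that
# the smaller mapfile does not contain.
missing=$(comm -23 big.hg small.hg)
echo "hg revisions missing from the smaller mapfile: $missing"
```

Swapping the two arguments to `comm` would show the opposite direction: revisions present only in the smaller mapfile.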

Gregory Szorc

Oct 15, 2013, 5:47:54 PM10/15/13

to Ehsan Akhgari, Aki Sasaki, dev-pl...@lists.mozilla.org

IIRC one of my hg-git patches removed saving certain SHA-1 mappings
because the cost of storing and the overhead of looking them up was more
expensive than recomputing them. If most (all?) the Git object types for
the missing SHA-1's are the same type, that should confirm this hypothesis.
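One way to run that check, as a rough sketch: it assumes the git SHAs missing from the smaller mapfile have been collected one per line in a file (here a hypothetical `missing_git_shas.txt`, filled from a throwaway repo so the example is self-contained) and that the objects exist in the repo where the command is run.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo hello > a.txt
git add a.txt
git commit -q -m "one commit"

# Stand-in for the list of SHAs missing from the smaller mapfile.
git rev-parse 'HEAD^{tree}' HEAD:a.txt > missing_git_shas.txt

# Ask git for the type of each object and tally the types.
types=$(git cat-file --batch-check='%(objecttype)' < missing_git_shas.txt | sort | uniq -c)
echo "$types"
```

The toy list deliberately mixes a tree and a blob to show the tally; on the real data, a single line of output (one object type) would be consistent with the hypothesis.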

On 10/15/13 2:40 PM, Ehsan Akhgari wrote:
> Thanks Aki, this is looking great! I was skimming over the links in
> your post, and I noticed that the git-mapfile on your people account is
> 13MB, while the one that I have on my conversion server is about 101MB
> right now... Do you have any idea what may have caused that size
> difference?
>
> Thanks!
> Ehsan
>
>
> On 2013-10-11 10:11 PM, Aki Sasaki wrote:

Aki Sasaki

Oct 15, 2013, 6:06:29 PM10/15/13

to Gregory Szorc, Ehsan Akhgari

Hm, that would definitely explain it, because my fork is based off your
latest hg-git patches.

Ehsan: Are the missing SHAs in the mapfile problematic? If so, I'm
confident I can re-convert from scratch and maintain the same SHAs, so I
could get a fully populated mapfile within a week or two, once we back
out the below hg-git change from my fork.

Aki Sasaki

Oct 16, 2013, 1:12:33 PM10/16/13

to

I *think* this fell off the newsgroup due to newsgroup-vs-email-list
differences.
To close the loop, this seems like expected behavior due to hg-git
changes, and we should be able to recreate the mapfile in the future if
this is an issue.
I know that people are already using my git mapfile successfully; if you
are impacted by the size difference in the mapfile, please let me know!

aki

On 10/15/13 3:26 PM, Ehsan Akhgari wrote:
> My only fear is that there may be something very big missing from your
> copy, since the size difference is kind of alarming... But your
> call. :-)
>
> --
> Ehsan
> <http://ehsanakhgari.org/>
>
>
> On Tue, Oct 15, 2013 at 6:19 PM, Aki Sasaki <a...@mozilla.com
> <mailto:a...@mozilla.com>> wrote:
>
> Ok, I'm inclined to wait until I hear from someone that this is
> actually impacting them, then.
>
>
> On 10/15/13 3:13 PM, Ehsan Akhgari wrote:
>> Hmm, I don't really have any idea. My git-mapfile is at
>> <http://git.mozilla.org/?p=users/eakh...@mozilla.com/mozilla-history-tools.git;a=blob;f=updates/git-mapfile;h=a72c98c81f27a498fb70806dc0755da0bd217262;hb=HEAD>,
>> please feel free to take a look! :-)

>>
>> Cheers,
>>
>> --
>> Ehsan
>> <http://ehsanakhgari.org/>
>>
>>

Ehsan Akhgari

Oct 18, 2013, 5:38:54 PM10/18/13

to Aki Sasaki, dev-planning@lists.mozilla.org planning, John O'Duinn

On Fri, Sep 6, 2013 at 3:52 PM, Ehsan Akhgari <ehsan....@gmail.com> wrote:

> I don't think that manual investigation besides what I suggested really
> buys us anything. I remember there being an obscure difference at around
> the point at which you and I have grafted the CVS repository to the hg
> repository (it was a long time ago -- don't remember the details) but I
> remember that neither were particularly better than the other, and it
> doesn't really matter to anybody anyway! :-)
>

It was brought to my attention that my comment here (especially the last
sentence) might have come off as offensive, and I want to clarify what I
meant here and apologize if I have unintentionally offended anyone.

A long time ago while we were working out the graft point between the CVS
and hg histories, I remember that we had a long discussion on which exact
graft point to pick. What I brought up back then was that the most
important effect of the graft point's location is on the results of git
blame on old files. We did a number of tests and even though the graft
points that I had picked were different than the graft points that RelEng
had picked, the git blame results for both repositories were identical, and
I think we concluded back then that the graft point in the RelEng
repository is as good if not better than the one that I had picked. The
thing which I argued back then doesn't matter much is the exact location of
the graft point, since there will probably be very few people if any who
will look at the git log output around the time of the switch to hg, and
nearly everyone cares about accurate blames. This is what I was referring
to when writing the above comment.

But I realize that I have made two errors: first, not describing enough of
what I was talking about, given that I was referencing a conversation from
quite a while ago; and second, forgetting that the older conversation did not
happen in this forum, so readers here would not have enough context. I did
not mean to imply that the "it" in my last
sentence refers to the work that RelEng has been doing here, and I
apologize to everybody who felt offended reading the above.

Cheers,
Ehsan

Aki Sasaki

Oct 21, 2013, 2:23:23 PM10/21/13

to Ehsan Akhgari, dev-planning@lists.mozilla.org planning, John O'Duinn

On 10/18/13 2:38 PM, Ehsan Akhgari wrote:
> On Fri, Sep 6, 2013 at 3:52 PM, Ehsan Akhgari <ehsan....@gmail.com
> <mailto:ehsan....@gmail.com>> wrote:
>
> I don't think that manual investigation besides what I suggested
> really buys us anything. I remember there being an obscure
> difference at around the point at which you and I have grafted the
> CVS repository to the hg repository (it was a long time ago --
> don't remember the details) but I remember that neither were
> particularly better than the other, and it doesn't really matter
> to anybody anyway! :-)
>
>
> It was brought up to my attention that my comment here (especially the
> last sentence) might have come off as offensive, and I want to clarify
> what I meant here and apologize if I have unintentionally offended anyone.

Thank you, Ehsan, for both the clarification + apology.
I realize this project has been tricky, both technically and politically.
Hopefully we're on track for a solution that meets everyone's hard
requirements.

Ehsan Akhgari

Oct 22, 2013, 5:54:16 PM10/22/13

to jste...@gmail.com, Aki Sasaki, dev-planning@lists.mozilla.org planning

I've been speaking with RelEng to finalize the things that we need to
finish before we can switch to the RelEng git mirror.

Firstly, my personal interest in this matter is to stop maintaining my
mirror (https://github.com/mozilla/mozilla-central) as soon as possible,
preferably by the end of this year. My goal is to make this process as
smooth as possible for all of the developers who are currently using that
mirror.

The RelEng mirror has different commit SHA1s than my mirror does, so we're
going to need an easy way to let people rebase their local branches on top
of the new repository when they decide to switch. Aki has filed bug 929338
in order to create a rebase helper script which can hopefully take all of
your branches and rebase them on top of the new commits coming from the
RelEng mirror.

The other thing that we need to figure out is what to do with the
mozilla/mozilla-central mirror on github. Currently it has a network of
hundreds of forks/stars/watches, and ideally we wouldn't leave all of those
people stranded. My ideal solution would be for us to talk to github to
figure out what's the best way to update all of the branches in that repo
to the new branches in the RelEng mirror, hopefully without losing this
network. I think that the obvious solution of deleting that repository and
recreating one with the same name will break the network.

(Note that the RelEng mirrors separate out the project branches into a
separate repository, so effectively the equivalent of my mirror is the sum
of two new mirrors, but I can live with replacing mozilla/mozilla-central
with https://github.com/mozilla/integration-gecko-dev which has most of the
branches that people want to use.)

Does anybody have any preferences over what we will end up doing with the
github repository?

Thanks!

--
Ehsan
<http://ehsanakhgari.org/>

Nicolas B. Pierron

Oct 22, 2013, 6:25:20 PM10/22/13

to

On 10/22/2013 02:54 PM, Ehsan Akhgari wrote:
> Firstly, my personal interest in this matter is to stop maintaining my
> mirror (https://github.com/mozilla/mozilla-central) as soon as possible,
> preferably by the end of this year. My goal is to make this process as
> smooth as possible for all of the developers who are currently using that
> mirror.

I did the same thing last Tuesday, when my hg-git copy got corrupted. I
will look into updating my bidirectional mirror to use gecko-dev (instead
of my own, since gecko-dev now includes inbound :) ).

> The RelEng mirror has different commit SHA1s than my mirror does, so we're
> going to need an easy way to let people rebase their local branches on top
> of the new repository when they decide to switch. Aki has filed bug 929338
> in order to create a rebase helper script which can hopefully take all of
> your branches and rebase them on top of the new commits coming from the
> RelEng mirror.

I have had this problem before, and the easiest way to handle it is to use:
git rebase --onto <new-repo-master> <old-repo-master> <my-branch>

I use the same trick to rebase my branches on top of inbound, as I am always
working on top of mozilla-central.
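For anyone who has not used `--onto` before, here is a self-contained sketch of that rebase in a throwaway repository. The branch names `old-master`, `new-master` and `my-branch` and the commit messages are invented; the two root commits stand in for the same changeset as converted by two different mirrors.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .

# "Old mirror" history: one converted commit.
git symbolic-ref HEAD refs/heads/old-master
echo base > file.txt
git add file.txt
git commit -q -m "base, as converted by the old mirror"

# "New mirror" history: same tree, different commit, hence a different SHA.
git checkout -q --orphan new-master
git commit -q -m "base, as converted by the new mirror"

# Local work that was started on top of the old mirror.
git checkout -q -b my-branch old-master
echo feature >> file.txt
git commit -q -a -m "my local work"

# Replay only the commits in old-master..my-branch onto new-master.
git rebase -q --onto new-master old-master my-branch

git log --format=%s my-branch
```

After the rebase, `my-branch` contains the local commit on top of the new mirror's history and no longer references the old mirror's commit. This only covers linear local work; branches that merged in mozilla-central, as described next, are a harder case.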

Another way is to cherry-pick each commit. The problem I have is that I
have local branches which I update by merging the latest mozilla-central
into them. Sadly, for these branches, the simplest solution I found was to
lose the history by making a diff from the latest merge and applying it on
top of the new repo.
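
The diff-and-apply fallback for merge-heavy branches could look roughly like the sandbox sketch below. It assumes that the old base commit you last merged is known (here it is simply the tip of a hypothetical old-master branch); the branch names, file names and the my-branch-flat name are invented for illustration.

```shell
#!/bin/sh
# Sketch only: flatten a merge-heavy branch into a single patch on a new base.
set -eu

demo=$(mktemp -d)            # throwaway sandbox directory
patch_file=$(mktemp)         # kept outside the work tree
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

# Old mirror: first upstream commit.
git checkout -q -b old-master
echo one > upstream.txt
git add upstream.txt
git commit -q -m "upstream 1 (old SHA)"

# Local branch with some work.
git checkout -q -b my-branch
echo work > work.txt
git add work.txt
git commit -q -m "local work"

# Upstream moves on, and the local branch merges it in.
git checkout -q old-master
echo two >> upstream.txt
git commit -q -a -m "upstream 2 (old SHA)"
git checkout -q my-branch
git merge -q --no-edit old-master
echo more >> work.txt
git commit -q -a -m "more local work"

# New mirror: same upstream content, unrelated history.
git checkout -q --orphan new-master
git rm -q -rf .
printf 'one\ntwo\n' > upstream.txt
git add upstream.txt
git commit -q -m "upstream 1+2 (new SHA)"

# Everything the branch adds beyond the last merged upstream commit.
git diff old-master my-branch > "$patch_file"

# Apply that diff as one commit on the new base; the branch history is lost.
git checkout -q -b my-branch-flat new-master
git apply --index "$patch_file"
git commit -q -m "local work, flattened onto the new mirror"

git --no-pager log --format=%s my-branch-flat
```

The resulting branch has the same file contents as the original one, but its individual commits and merges are collapsed into a single commit.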

> The other thing that we need to figure out is what to do with the
> mozilla/mozilla-central mirror on github. Currently it has a network of
> hundreds of forks/stars/watches, and ideally we wouldn't leave all of those
> people stranded. My ideal solution would be for us to talk to github to
> figure out what's the best way to update all of the branches in that repo
> to the new branches in the RelEng mirror, hopefully without losing this
> network. I think that the obvious solution of deleting that repository and
> recreating one with the same name will break the network.

This is ugly, but you can push both versions into the github repository. I
did so a while ago, and git is quite clever at keeping the result minimal:
the diffs are identical, so only the commits are duplicated.
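
As a rough sketch of what pushing both versions into one repository means, the sandbox below pushes two unrelated one-commit histories into a single bare repository under separate branch prefixes. The repository paths, the make_repo helper and the old/ and new/ prefixes are hypothetical; a real setup would push the actual mirrors to the github remote instead.

```shell
#!/bin/sh
# Sketch only: push two unrelated histories into one repository, side by side.
set -eu

demo=$(mktemp -d)            # throwaway sandbox directory

# Hypothetical helper: create a one-commit repository at $1 with message $2.
make_repo() {
    git init -q "$1"
    git -C "$1" config user.name "Demo User"
    git -C "$1" config user.email "demo@example.com"
    # Pin the branch name regardless of the local init.defaultBranch setting.
    git -C "$1" symbolic-ref HEAD refs/heads/master
    echo base > "$1/file.txt"
    git -C "$1" add file.txt
    git -C "$1" commit -q -m "$2"
}

make_repo "$demo/old-mirror" "upstream change (old SHA)"
make_repo "$demo/new-mirror" "upstream change (new SHA)"

# The shared repository that receives both histories.
git init -q --bare "$demo/combined.git"

# Each mirror's branches land under their own prefix, so nothing collides.
git -C "$demo/old-mirror" push -q "$demo/combined.git" 'refs/heads/*:refs/heads/old/*'
git -C "$demo/new-mirror" push -q "$demo/combined.git" 'refs/heads/*:refs/heads/new/*'

# Two different commits that point at one shared tree (and shared file contents).
git -C "$demo/combined.git" for-each-ref \
    --format='%(refname:short) %(objectname:short) %(tree)' refs/heads
```

Because git stores objects by content hash, the file contents and directory trees are stored once, and only the commit objects exist in two versions.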

> Does anybody have any preferences over what we will end up doing with the
> github repository?

I wonder to what extent we can use grafting on a "git clone --depth xxx" of
the new repository, to connect the old one with the new one. Sadly, I don't
think git supports pushing new branches from a shallow clone.

--
Nicolas B. Pierron

Ehsan Akhgari

Oct 22, 2013, 7:05:11 PM

to Nicolas B. Pierron, dev-pl...@lists.mozilla.org

On 2013-10-22 6:25 PM, Nicolas B. Pierron wrote:
>> The other thing that we need to figure out is what to do with the
>> mozilla/mozilla-central mirror on github. Currently it has a network of
>> hundreds of forks/stars/watches, and ideally we wouldn't leave all of
>> those
>> people stranded. My ideal solution would be for us to talk to github to
>> figure out what's the best way to update all of the branches in that repo
>> to the new branches in the RelEng mirror, hopefully without losing this
>> network. I think that the obvious solution of deleting that
>> repository and
>> recreating one with the same name will break the network.
>
> This is ugly, but you can push both versions into the github repository.
> I did so a while ago and git is quite clever at keeping the thing
> minimal as the diffs are identical, only the commits are duplicated.

Yes, I know that, but *I* don't want to end up having to run that
service forever, and RelEng disagrees that we need to have a single
repository (which is why we currently have two.) That said, I'm not
sure who this suggestion is directed at. :-)

>> Does anybody have any preferences over what we will end up doing with the
>> github repository?
>
> I wonder to what extent we can use grafting on a "git clone --depth
> xxx" of the new repository, to connect the old one with the new one.
> Sadly, I don't think git supports pushing new branches from a shallow clone.

I don't think that it does, either.

Something that I forgot to mention in my previous email: I'm mostly
interested in people's feedback about the github repo issue.
Suggestions on the rebasing script should probably happen in bug 929338
for easier tracking.

Thanks!
Ehsan

Aki Sasaki

Oct 22, 2013, 6:05:44 PM

to Ehsan Akhgari, jste...@gmail.com, dev-planning@lists.mozilla.org planning

On 10/22/13 2:54 PM, Ehsan Akhgari wrote:
> I've been speaking with RelEng to finalize the things that we need to
> finish before we can switch to the RelEng git mirror.
>

> Firstly, my personal interest in this matter is to stop maintaining my
> mirror (https://github.com/mozilla/mozilla-central) as soon as
> possible, preferably by the end of this year. My goal is to make this
> process as smooth as possible for all of the developers who are
> currently using that mirror.

+1

>
> The RelEng mirror has different commit SHA1s than my mirror does, so
> we're going to need an easy way to let people rebase their local
> branches on top of the new repository when they decide to switch. Aki
> has filed bug 929338 in order to create a rebase helper script which
> can hopefully take all of your branches and rebase them on top of the
> new commits coming from the RelEng mirror.
>

I'm not entirely sure what form this script will take, until I
investigate further.
It may be an entirely hands-off, do-everything-but-hand-you-a-beer
script, or a set of commands developers can run themselves, or something
in between.
Since I'm not sure how many different permutations or workflows people
have, I'm not sure a single script can handle every edge case, but I'll
know more when I dig further.

> The other thing that we need to figure out is what to do with the
> mozilla/mozilla-central mirror on github. Currently it has a network
> of hundreds of forks/stars/watches, and ideally we wouldn't leave all
> of those people stranded. My ideal solution would be for us to talk
> to github to figure out what's the best way to update all of the
> branches in that repo to the new branches in the RelEng mirror,
> hopefully without losing this network. I think that the obvious
> solution of deleting that repository and recreating one with the same
> name will break the network.
>

I like having https://github.com/mozilla/mozilla-central go away, and
having people switch to https://github.com/mozilla/integration-gecko-dev
explicitly for several reasons:

* The act of moving over itself is an explicit decision, rather than an
implicit decision. Since the rebasing will require action, I don't want
anything to happen without each user's knowledge.
** The behavior of the new repo will be different: different branch
lists, different people to contact.
** The SHAs will be different, as you mentioned. I want each user to
approach this with this knowledge, not have things changed from
underneath them.
* As you mentioned in another thread, we don't necessarily have ways to
contact every person who has interacted with
https://github.com/mozilla/mozilla-central to let them know things are
changing. If we replace it with new behavior and new SHAs, they'll have
to notice that and debug, then try to find who to contact about it. If
the https://github.com/mozilla/mozilla-central repo stops updating or is
deleted, they will be much more likely to ask around to see what's
happened, and we can then point them to the docs on how to move to
https://github.com/mozilla/integration-gecko-dev .
* The name matches git.mozilla.org's mirror:
http://git.mozilla.org/?p=integration/gecko-dev.git;a=summary

aki

Trevor Saunders

Oct 23, 2013, 11:50:09 AM

to dev-pl...@lists.mozilla.org

On Tue, Oct 22, 2013 at 07:05:11PM -0400, Ehsan Akhgari wrote:
> On 2013-10-22 6:25 PM, Nicolas B. Pierron wrote:

> >>The other thing that we need to figure out is what to do with the
> >>mozilla/mozilla-central mirror on github. Currently it has a network of
> >>hundreds of forks/stars/watches, and ideally we wouldn't leave all of
> >>those
> >>people stranded. My ideal solution would be for us to talk to github to
> >>figure out what's the best way to update all of the branches in that repo
> >>to the new branches in the RelEng mirror, hopefully without losing this
> >>network. I think that the obvious solution of deleting that
> >>repository and
> >>recreating one with the same name will break the network.
> >
> >This is ugly, but you can push both versions into the github repository.
> >I did so a while ago and git is quite clever at keeping the thing
> >minimal as the diffs are identical, only the commits are duplicated.
>

> Yes, I know that, but *I* don't want to end up having to run that
> service forever, and RelEng disagrees that we need to have a single
> repository (which is why we currently have two.) That said, I'm not
> sure who this suggestion is directed at. :-)

AIUI (and how I think I'd like things to work), the proposal is:
- releng keeps their two-repo setup for their uses
- releng runs a cron job or something to push both of those repos into
  the existing github.com/mozilla/mozilla-central.git repo

Then, if git is nice to us, rebasing onto the new sha1s should just be a
matter of running git rebase. I'm not sure if that will work in reality
or not.

I guess it doesn't matter too much, but imho naming your main repo
gecko-integration-dev.git is silly ;)

Trev

Ehsan Akhgari

Oct 23, 2013, 12:08:05 PM

to Trevor Saunders, dev-pl...@lists.mozilla.org

On 2013-10-23 11:50 AM, Trevor Saunders wrote:
> On Tue, Oct 22, 2013 at 07:05:11PM -0400, Ehsan Akhgari wrote:
>> On 2013-10-22 6:25 PM, Nicolas B. Pierron wrote:

>>>> The other thing that we need to figure out is what to do with the
>>>> mozilla/mozilla-central mirror on github. Currently it has a network of
>>>> hundreds of forks/stars/watches, and ideally we wouldn't leave all of
>>>> those
>>>> people stranded. My ideal solution would be for us to talk to github to
>>>> figure out what's the best way to update all of the branches in that repo
>>>> to the new branches in the RelEng mirror, hopefully without losing this
>>>> network. I think that the obvious solution of deleting that
>>>> repository and
>>>> recreating one with the same name will break the network.
>>>
>>> This is ugly, but you can push both versions into the github repository.
>>> I did so a while ago and git is quite clever at keeping the thing
>>> minimal as the diffs are identical, only the commits are duplicated.
>>

>> Yes, I know that, but *I* don't want to end up having to run that
>> service forever, and RelEng disagrees that we need to have a single
>> repository (which is why we currently have two.) That said, I'm not
>> sure who this suggestion is directed at. :-)
>
> aiui / how I think I'd like things to work, the proposal is
> - releng keeps their two-repo setup for their uses
> - releng runs a cron job or something to push both of those repos into
> the existing github.com/mozilla/mozilla-central.git repo

Previously RelEng has not accepted putting all of these branches in the
same repo.

> Then if git is nice to us rebasing onto new sha1s should just be a
> matter of running git rebase. I'm not sure if that will work in reality
> or not.

FWIW, git rebase --onto should be able to do that.

Cheers,
Ehsan

Trevor Saunders

Oct 23, 2013, 12:33:10 PM

to Ehsan Akhgari, dev-pl...@lists.mozilla.org

On Wed, Oct 23, 2013 at 12:08:05PM -0400, Ehsan Akhgari wrote:
> On 2013-10-23 11:50 AM, Trevor Saunders wrote:
> >On Tue, Oct 22, 2013 at 07:05:11PM -0400, Ehsan Akhgari wrote:
> >>On 2013-10-22 6:25 PM, Nicolas B. Pierron wrote:

> >>>>The other thing that we need to figure out is what to do with the
> >>>>mozilla/mozilla-central mirror on github. Currently it has a network of
> >>>>hundreds of forks/stars/watches, and ideally we wouldn't leave all of
> >>>>those
> >>>>people stranded. My ideal solution would be for us to talk to github to
> >>>>figure out what's the best way to update all of the branches in that repo
> >>>>to the new branches in the RelEng mirror, hopefully without losing this
> >>>>network. I think that the obvious solution of deleting that
> >>>>repository and
> >>>>recreating one with the same name will break the network.
> >>>
> >>>This is ugly, but you can push both versions into the github repository.
> >>>I did so a while ago and git is quite clever at keeping the thing
> >>>minimal as the diffs are identical, only the commits are duplicated.
> >>

> >>Yes, I know that, but *I* don't want to end up having to run that
> >>service forever, and RelEng disagrees that we need to have a single
> >>repository (which is why we currently have two.) That said, I'm not
> >>sure who this suggestion is directed at. :-)
> >
> >aiui / how I think I'd like things to work, the proposal is
> >- releng keeps their two-repo setup for their uses
> >- releng runs a cron job or something to push both of those repos into
> > the existing github.com/mozilla/mozilla-central.git repo
>
> Previously RelEng has not accepted putting all of these branches in
> the same repo.

I'm willing to accept that releng may have good reasons for not wanting
to convert all of our hg repos directly into one git repo. However, I
find it really hard to believe they can come up with a good reason such
a repo should not exist, even if none of their stuff pulls from it and it's
maintained by a releng machine that's totally separate from everything
else.

> >Then if git is nice to us rebasing onto new sha1s should just be a
> >matter of running git rebase. I'm not sure if that will work in reality
> >or not.
>
> FWIW, git rebase --onto should be able to do that.

nice, forgot I saw that discussion.

Trev

Ehsan Akhgari

Oct 23, 2013, 2:54:37 PM

to Trevor Saunders, dev-pl...@lists.mozilla.org

On 2013-10-23 12:33 PM, Trevor Saunders wrote:
> On Wed, Oct 23, 2013 at 12:08:05PM -0400, Ehsan Akhgari wrote:
>> On 2013-10-23 11:50 AM, Trevor Saunders wrote:
>>> On Tue, Oct 22, 2013 at 07:05:11PM -0400, Ehsan Akhgari wrote:
>>>> On 2013-10-22 6:25 PM, Nicolas B. Pierron wrote:

>>>>>> The other thing that we need to figure out is what to do with the
>>>>>> mozilla/mozilla-central mirror on github. Currently it has a network of
>>>>>> hundreds of forks/stars/watches, and ideally we wouldn't leave all of
>>>>>> those
>>>>>> people stranded. My ideal solution would be for us to talk to github to
>>>>>> figure out what's the best way to update all of the branches in that repo
>>>>>> to the new branches in the RelEng mirror, hopefully without losing this
>>>>>> network. I think that the obvious solution of deleting that
>>>>>> repository and
>>>>>> recreating one with the same name will break the network.
>>>>>
>>>>> This is ugly, but you can push both versions into the github repository.
>>>>> I did so a while ago and git is quite clever at keeping the thing
>>>>> minimal as the diffs are identical, only the commits are duplicated.
>>>>

>>>> Yes, I know that, but *I* don't want to end up having to run that
>>>> service forever, and RelEng disagrees that we need to have a single
>>>> repository (which is why we currently have two.) That said, I'm not
>>>> sure who this suggestion is directed at. :-)
>>>
>>> aiui / how I think I'd like things to work, the proposal is
>>> - releng keeps their two-repo setup for their uses
>>> - releng runs a cron job or something to push both of those repos into
>>> the existing github.com/mozilla/mozilla-central.git repo
>>
>> Previously RelEng has not accepted putting all of these branches in
>> the same repo.
>
> I'm willing to accept that releng may have good reasons for not wanting
> to convert all of our hg repos directly into one git repo. However, I
> find it really hard to believe they can come up with a good reason such
> a repo should not exist, even if none of their stuff pulls from it and it's
> maintained by a releng machine that's totally separate from everything
> else.

OK, please file a bug on that! :-)

Cheers,
Ehsan

Ehsan Akhgari

Oct 23, 2013, 3:01:58 PM

to Aki Sasaki, jste...@gmail.com, dev-planning@lists.mozilla.org planning

On 2013-10-22 6:05 PM, Aki Sasaki wrote:
>> The RelEng mirror has different commit SHA1s than my mirror does, so
>> we're going to need an easy way to let people rebase their local
>> branches on top of the new repository when they decide to switch. Aki
>> has filed bug 929338 in order to create a rebase helper script which
>> can hopefully take all of your branches and rebase them on top of the
>> new commits coming from the RelEng mirror.
>>
>
> I'm not entirely sure what form this script will take, until I
> investigate further.
> It may be an entirely hands-off, do-everything-but-hand-you-a-beer
> script, or a set of commands developers can run themselves, or something
> in between.
> Since I'm not sure how many different permutations or workflows people
> have, I'm not sure a single script can handle every edge case, but I'll
> know more when I dig further.

Fair enough.

>> The other thing that we need to figure out is what to do with the
>> mozilla/mozilla-central mirror on github. Currently it has a network
>> of hundreds of forks/stars/watches, and ideally we wouldn't leave all
>> of those people stranded. My ideal solution would be for us to talk
>> to github to figure out what's the best way to update all of the
>> branches in that repo to the new branches in the RelEng mirror,
>> hopefully without losing this network. I think that the obvious
>> solution of deleting that repository and recreating one with the same
>> name will break the network.
>>
>

> I like having https://github.com/mozilla/mozilla-central go away, and
> having people switch to https://github.com/mozilla/integration-gecko-dev
> explicitly for several reasons:
>
> * The act of moving over itself is an explicit decision, rather than an
> implicit decision. Since the rebasing will require action, I don't want
> anything to happen without each user's knowledge.
> ** The behavior of the new repo will be different: different branch
> lists, different people to contact.
> ** The SHAs will be different, as you mentioned. I want each user to
> approach this with this knowledge, not have things changed from
> underneath them.
> * As you mentioned in another thread, we don't necessarily have ways to
> contact every person who has interacted with
> https://github.com/mozilla/mozilla-central to let them know things are
> changing. If we replace it with new behavior and new SHAs, they'll have
> to notice that and debug, then try to find who to contact about it. If
> the https://github.com/mozilla/mozilla-central repo stops updating or is
> deleted, they will be much more likely to ask around to see what's
> happened, and we can then point them to the docs on how to move to
> https://github.com/mozilla/integration-gecko-dev .

Sure, I understand all of the above points. But you're ignoring the
existing github community around the mozilla-central projects. I don't
agree that it's fine to lose all of that community.

> * The name matches git.mozilla.org's mirror:
> http://git.mozilla.org/?p=integration/gecko-dev.git;a=summary

To be fair, that name probably doesn't mean much to people outside of
RelEng. All of our developers know what mozilla-central is. I'm not
sure if the same could be asserted about "integration-gecko-dev". :-)

But the naming of things doesn't matter much to me.

Cheers,
Ehsan

Ralph Giles

Oct 23, 2013, 4:42:03 PM

to dev-pl...@lists.mozilla.org

On 2013-10-23 12:01 PM, Ehsan Akhgari wrote:
>> * The name matches git.mozilla.org's mirror:
>> http://git.mozilla.org/?p=integration/gecko-dev.git;a=summary
>
> To be fair, that name probably doesn't mean much to people outside of
> RelEng. All of our developers know what mozilla-central is. I'm not
> sure if the same could be asserted about "integration-gecko-dev".

If you're going to change the name, I suggest just 'gecko'. The '-dev'
and 'integration' parts are redundant, and shorter urls are better.

-r

Gregory Szorc

Oct 23, 2013, 4:54:23 PM

to Ralph Giles, dev-pl...@lists.mozilla.org

Since the subject of unified repositories came up in this thread, I'm a
huge proponent of unified repositories for the release + integration
repos because I strongly feel they can result in productivity wins.
Plus, they result in less bandwidth and load on the canonical VCS
servers since clients only need to pull 1 repo instead of N.

http://hg.gregoryszorc.com/gecko/ is a unified Mercurial repo with each
official Mozilla repo tracked as bookmarks. e.g. central/default. More
at
http://gregoryszorc.com/blog/2013/10/17/alternate-mercurial-server-for-firefox-development/

I would love, love, love for Mercurial and Git unified repos to be
officially hosted by Mozilla. I'm glad to see the Git versions
materialize. But let's not forget about Mercurial.

Aki Sasaki

Oct 23, 2013, 5:02:53 PM

I'm not sure I agree that moving locations is losing the community.
It took a single clone or button-click (watch or fork) to join, right?
It's a similar single clone or button-click to move, and there will be
docs and a script to help migrate any local branches. It's an explicit
choice rather than an implicit change from underneath you.

>> * The name matches git.mozilla.org's mirror:
>> http://git.mozilla.org/?p=integration/gecko-dev.git;a=summary
>
> To be fair, that name probably doesn't mean much to people outside of
> RelEng. All of our developers know what mozilla-central is. I'm not
> sure if the same could be asserted about "integration-gecko-dev". :-)
>
> But the naming of things doesn't matter much to me.

It may make more sense in the context of the three repos:

* releases/gecko.git , which is partner-oriented, and our highest
priority to keep sane (currently lives on git.m.o only),

* integration/gecko-dev, which is developer-oriented, and we want to
offer a strong SLA for. It contains all release and inbound branches
(currently lives on both github and git.m.o), and

* integration/gecko-projects, which contains mercurial repos without
strict pre-commit hooks, and are periodically reset; RelEng reserves the
right to reset the repo if these cause vcs-sync issues.

All three share the same SHAs for shared commits.

:joduinn was planning to write a blog post about this, but I think the
above encapsulates that.

> Cheers,
> Ehsan

Aki Sasaki

Oct 23, 2013, 5:22:24 PM

If it's just a cronjob, it would be hard to provide a good reason against that.

However, I believe that this would implicitly mean that Release
Engineering would own this repository, and would be required to resolve
issues with it within some SLA. Given the fact that there is a ton of
history in this repo with incompatible SHAs that RelEng doesn't have a
way of reproducing currently, and the fact that gecko-projects contains
branches that are not strictly controlled by pre-commit hooks and
occasionally get reset, I cannot say that we can maintain this to the
degree we can gecko-dev.

This ask is not just a cronjob; it's asking Release Engineering to
maintain this repository in perpetuity. I am comfortable maintaining
gecko-dev. I am not comfortable maintaining github/mozilla/mozilla-central.

aki

Trevor Saunders

Oct 23, 2013, 5:27:34 PM

to dev-pl...@lists.mozilla.org

I'm not familiar with github, but I think ehsan isn't worried about
people developing patches, who will notice the change and accommodate it. I
think he's worried about people casually watching our repo who may not
notice the change.

> >> * The name matches git.mozilla.org's mirror:
> >> http://git.mozilla.org/?p=integration/gecko-dev.git;a=summary
> >
> > To be fair, that name probably doesn't mean much to people outside of
> > RelEng. All of our developers know what mozilla-central is. I'm not
> > sure if the same could be asserted about "integration-gecko-dev". :-)
> >
> > But the naming of things doesn't matter much to me.
>
> It may make more sense in the context of the three repos:
>
> * releases/gecko.git , which is partner-oriented, and our highest
> priority to keep sane (currently lives on git.m.o only),

but nobody other than releng and b2g partner types should care about.
Arguably that name is actively bad because it will confuse people other
than the few who actually should care about what the "main" repo is.

> * integration/gecko-dev, which is developer-oriented, and we want to
> offer a strong SLA for. It contains all release and inbound branches
> (currently lives on both github and git.m.o), and
>
> * integration/gecko-projects, which contains mercurial repos without
> strict pre-commit hooks, and are periodically reset; RelEng reserves the
> right to reset the repo if these cause vcs-sync issues.

I don't see what keeping these separate buys anyone except pain; if you
merge them you can still reset branches if you need to.

Trev

Trevor Saunders

Oct 23, 2013, 6:00:22 PM

to dev-pl...@lists.mozilla.org

No, it really is just a cron job. At any point you could recreate
mozilla/mozilla-central.git by first pushing the history of the existing
repo, and then pushing in your gecko-dev-projects repo. Sure, you can't
reproduce the history of mozilla-central.git as it is, but that's totally
irrelevant because it's static, so you can just keep tarballs in enough
places that if they all go away you have bigger problems.

Trev

Ehsan Akhgari

Oct 23, 2013, 6:13:06 PM

to Aki Sasaki, dev-pl...@lists.mozilla.org

I am not worried about people who follow these mailing lists actively,
for those people it's just a few more clicks (which sucks, but still
it's not too bad.) I am worried about people outside of those circles.
I have seen random people watching this repo on github, and those
people will never know if we moved to another URL. Likewise, I
personally watch tons of non-Mozilla projects on github, and I would be
left stranded if one of these projects decided to change their github URL.

>>> * The name matches git.mozilla.org's mirror:
>>> http://git.mozilla.org/?p=integration/gecko-dev.git;a=summary
>>
>> To be fair, that name probably doesn't mean much to people outside of
>> RelEng. All of our developers know what mozilla-central is. I'm not
>> sure if the same could be asserted about "integration-gecko-dev". :-)
>>
>> But the naming of things doesn't matter much to me.
>
> It may make more sense in the context of the three repos:
>
> * releases/gecko.git , which is partner-oriented, and our highest
> priority to keep sane (currently lives on git.m.o only),
>
> * integration/gecko-dev, which is developer-oriented, and we want to
> offer a strong SLA for. It contains all release and inbound branches
> (currently lives on both github and git.m.o), and
>
> * integration/gecko-projects, which contains mercurial repos without
> strict pre-commit hooks, and are periodically reset; RelEng reserves the
> right to reset the repo if these cause vcs-sync issues.
>
> All three share the same SHAs for shared commits.
>
> :joduinn was planning to write a blog post about this, but I think the
> above encapsulates that.

Yep, I understand all of this, but I have been in touch with RelEng
about this for 2+ years. See the part where I was talking about people
*outside* RelEng. ;-)

Cheers,
Ehsan

Aki Sasaki

Oct 23, 2013, 6:17:09 PM

On 10/23/13 3:00 PM, Trevor Saunders wrote:
> On Wed, Oct 23, 2013 at 02:22:24PM -0700, Aki Sasaki wrote:
>> On 10/23/13 9:33 AM, Trevor Saunders wrote:
>>> I'm willing to accept that releng may have good reasons for not wanting
>>> to convert all of our hg repos directly into one git repo. However, I
>>> find it really hard to believe they can come up with a good reason such
>>> a repo should not exist, even if none of their stuff pulls from it and it's
>>> maintained by a releng machine that's totally separate from everything
>>> else.
>>
>> If it's just a cronjob, that would be hard to provide a good reason against.
>>
>> However, I believe that this would implicitly mean that Release
>> Engineering would own this repository, and would be required to resolve
>> issues with it within some SLA. Given the fact that there is a ton of
>> history in this repo with incompatible SHAs that RelEng doesn't have a
>> way of reproducing currently, and the fact that gecko-projects contains
>> branches that are not strictly controlled by pre-commit hooks and
>> occasionally get reset, I cannot say that we can maintain this to the
>> degree we can gecko-dev.
>
> No, it really is just a cron job. At any point you could recreate
> mozilla/mozilla-central.git by first pushing the history of the existing
> repo, and then pushing in your gecko-dev-projects repo. Sure, you can't
> reproduce the history of mozilla-central.git as it is, but that's totally
> irrelevant because it's static, so you can just keep tarballs in enough
> places that if they all go away you have bigger problems.

If that's true, it should be trivial for a passionate community member
to set this up, correct?

Ehsan Akhgari

Oct 23, 2013, 6:17:18 PM

to Aki Sasaki, dev-pl...@lists.mozilla.org

In the interest of stating this publicly, it is possible to design
things in a way that different *branches* in the same repository are
dealt with under different SLAs.

> This ask is not just a cronjob; it's asking Release Engineering to
> maintain this repository in perpetuity. I am comfortable maintaining
> gecko-dev. I am not comfortable maintaining github/mozilla/mozilla-central.

I believe Trevor was asking about a merged "gecko-dev" and
"gecko-projects" repository.

Cheers,
Ehsan

Gregory Szorc

Oct 23, 2013, 6:50:30 PM

to Ehsan Akhgari, Aki Sasaki, dev-pl...@lists.mozilla.org

On 10/23/2013 3:13 PM, Ehsan Akhgari wrote:
> On 2013-10-23 5:02 PM, Aki Sasaki wrote:
>> On 10/23/13 12:01 PM, Ehsan Akhgari wrote:
>>> On 2013-10-22 6:05 PM, Aki Sasaki wrote:
>>> Sure, I understand all of the above points. But you're ignoring the
>>> existing github community around the mozilla-central projects. I don't
>>> agree that it's fine to lose all of that community.
>>
>> I'm not sure I agree that moving locations is losing the community.
>> It took a single clone or button-click (watch or fork) to join, right?
>> It's a similar single clone or button-click to move, and there will be
>> docs and a script to help migrate any local branches. It's an explicit
>> choice rather than an implicit change from underneath you.
>

> I am not worried about people who follow these mailing lists actively,
> for those people it's just a few more clicks (which sucks, but still
> it's not too bad.) I am worried about people outside of those circles.
> I have seen random people watching this repo on github, and those
> people will never know if we moved to another URL. Likewise, I
> personally watch tons of non-Mozilla projects on github, and I would be
> left stranded if one of these projects decided to change their github URL.

While these issues are valid and should probably be addressed to the
best extent possible, they do highlight some important things:

1) We don't have total control of how things work on 3rd party hosted
services (like GitHub)
2) We have inadequate repository watching services on mozilla.org, so
people are going elsewhere (like GitHub)

Had Git hosting and repository watching been hosted on mozilla.org from
the beginning, we arguably wouldn't be in this mess. Even if we were, it
would have been much easier to say "the GitHub mirror is unsupported,
use X on mozilla.org instead." (I'm not saying a presence on GitHub
isn't important - it is, as that's where people are.)

Speaking of repository watching, Phabricator supports it. I have
http://phabricator.gregoryszorc.com/ notifying me whenever build system
files in the tree change so I can audit for people not following the
review policy on make files :) You too can add rules to watch the tree
by logging in and going to http://phabricator.gregoryszorc.com/herald/new/.

Trev Saunders

Oct 24, 2013, 11:07:48 AM

to Aki Sasaki, dev-pl...@lists.mozilla.org

Given that's more or less what we've been doing for the past 3 years
or so, clearly yes. That said, the point of this whole exercise is to
not have to keep doing that, so doing that would more or less mean
declaring it a failure and a waste of time :/

Trev

Johnny Stenback

Oct 24, 2013, 4:39:08 PM

to Trevor Saunders, Ehsan Akhgari, dev-pl...@lists.mozilla.org

On 10/23/2013 9:33 AM, Trevor Saunders wrote:
[...]

>> Previously RelEng has not accepted putting all of these branches in
>> the same repo.
>

> I'm willing to accept that releng may have good reasons for not wanting
> to convert all of our hg repos directly into one git repo. However I
> find it really hard to believe they can come up with a good reason such
> a repo should not exist even if none of their stuff pulls from it and it's
> maintained by a releng machine that's totally separate from everything
> else.

This goes into more than answering this question here, but I wanted to
say this stuff and this thread seemed as good as any, so here we go.

Generally speaking I think what's important to keep in mind here is that
what releng is proposing here is likely not what releng would be
proposing if we were to start with git from day one. We are where we are
today due to the decisions we made in the past regarding CVS, hg, and
the setup of the various repos surrounding our development, l10n, and
release processes using hg. I suspect a lot of things would look very
different had we started with git, or if we had switched from CVS
directly to git rather than to hg. Even so, I don't think it makes
sense to change how everything is organized just because a lot of those
decisions would have been made differently.

As for whether or not we *need* releng to host a combined repo that
contains all of our branches, I wouldn't be opposed to that, but I
seriously question whether there's enough value in doing that to justify
yet another thing for them to set up and maintain long term (which of
course means they have less time to devote to other critical things that
Mozilla depends on them for). If they were to create said combined
repo then people would inevitably start depending on it and all of a
sudden it's yet another high SLA thing that they need to be on top of
(whether it's all of it or only part of it). At that point it's far from
just a cron job. If there is actual real value in having this available
to us developers, then sure, absolutely, but so far all I've heard from
talking to various git users about the difference between two separate
repos vs one combined one is two remotes vs one, which translates to git
fetch --all vs git fetch (or stick that in an alias). On top of that
most git users will likely have more than one remote anyways since
they'll likely push to their own github or other repo for backup or to
share their work etc. Plus we have the fact that the number of people
using the rentable branches that live in the other repo is relatively
small compared to the people who work off of the branches available in
gecko-dev. It's not the git thing to do, I get that, but given our
history per the above paragraph, we are where we are, and I question the
actual value.
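
To make the "two remotes vs one" comparison concrete, here is a minimal sketch that only builds the git command lines involved, without running them. The remote names, the example.invalid URLs, and the alias name `fa` are placeholders chosen for illustration, not confirmed clone URLs or anything RelEng has committed to. `git remote add`, `git fetch --all`, and `git config alias.<name>` are standard git.

```python
import shlex

# Placeholder remotes: names and URLs are illustrative assumptions.
REMOTES = {
    "gecko-dev": "https://example.invalid/integration/gecko-dev.git",
    "gecko-projects": "https://example.invalid/integration/gecko-projects.git",
}


def setup_commands(remotes):
    """Return the git commands that register each remote once."""
    return [["git", "remote", "add", name, url] for name, url in remotes.items()]


def fetch_command(remotes):
    """One remote needs a plain fetch of it; several need `fetch --all`."""
    if len(remotes) == 1:
        return ["git", "fetch", next(iter(remotes))]
    return ["git", "fetch", "--all"]


def alias_command(alias="fa"):
    """Optional alias (hypothetical name `fa`) wrapping `fetch --all`."""
    return ["git", "config", "--global", f"alias.{alias}", "fetch --all"]


if __name__ == "__main__":
    for cmd in setup_commands(REMOTES) + [fetch_command(REMOTES), alias_command()]:
        print(shlex.join(cmd))
```

The one-time cost is the extra `git remote add`; after that the day-to-day difference is the `--all` flag, or nothing at all once the alias exists.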

If what I've been hearing here, or my assumptions and statements above
are wrong, I'd love to hear it.

--
jst

Trev Saunders

Oct 24, 2013, 7:32:17 PM

to Johnny Stenback, Ehsan Akhgari, dev-pl...@lists.mozilla.org

On 10/24/13, Johnny Stenback <j...@mozilla.com> wrote:
> On 10/23/2013 9:33 AM, Trevor Saunders wrote:
> [...]
>>> Previously RelEng has not accepted putting all of these branches in
>>> the same repo.
>>
>> I'm willing to accept that releng may have good reasons for not wanting
>> to convert all of our hg repos directly into one git repo. However I
>> find it really hard to believe they can come up with a good reason such
>> a repo should not exist even if none of their stuff pulls from it and it's
>> maintained by a releng machine that's totally separate from everything
>> else.
>
> This goes into more than answering this question here, but I wanted to
> say this stuff and this thread seemed as good as any, so here we go.
>
> Generally speaking I think what's important to keep in mind here is that
> what releng is proposing here is likely not what releng would be
> proposing if we were to start with git from day one. We are where we are
> today due to the decisions we made in the past regarding CVS, hg, and
> the setup of the various repos surrounding our development, l10n, and
> release processes using hg. I suspect a lot of things would look very
> different had we started with git, or if we had switched from CVS
> directly to git rather than to hg. Even so, I don't think it makes
> sense to change how everything is organized just because a lot of those
> decisions would have been made differently.

So far as existing infrastructure goes I generally agree with leaving
it as it is even if some other config would be somewhat nicer.

However, it seems to me that how we set up git mirrors is more or less
free from the requirements that make us keep existing infrastructure
as it is. afaik nobody plans to use the git mirrors for running any
tests or doing l10n things; the git mirror would be more or less write
only for releng.

> As for whether or not we *need* releng to host a combined repo that
> contains all of our branches, I wouldn't be opposed to that, but I
> seriously question whether there's enough value in doing that to justify
> yet another thing for them to set up and maintain long term (which of

Well, I think they should just have one repo, and skip the silliness
around multiple repos since there's no good reason to do things that
way. Which would mean no extra things for them to do, but I was
trying to accommodate what they want even though I don't think it makes
much sense.

> course means they have less time to devote to other critical things that
> Mozilla depends on them for). If they were to create said combined
> repo then people would inevitably start depending on it and all of a
> sudden it's yet another high SLA thing that they need to be on top of
> (whether it's all of it or only part of it). At that point it's far from
> just a cron job. If there is actual real value in having this available

I think the question is how much work it will take them every year,
which is to say how often things will break / how hard they will be to
fix. I think the answers for this thing will be that it won't break
much if at all, and when it does fixing it will be trivial assuming
the other mirrors they're running are working.

> to us developers, then sure, absolutely, but so far all I've heard from
> talking to various git users about the difference between two separate
> repos vs one combined one is two remotes vs one, which translates to git
> fetch --all vs git fetch (or stick that in an alias). On top of that
> most git users will likely have more than one remote anyways since
> they'll likely push to their own github or other repo for backup or to
> share their work etc. Plus we have the fact that the number of people

for core contributors sure, on the other hand it's another non-obvious
thing to explain to new people.

> using the rentable branches that live in the other repo is relatively
> small compared to the people who work off of the branches available in

I'd guess it's not tiny if you include things like the tree the e10ns
people are using.

> gecko-dev. It's not the git thing to do, I get that, but given our
> history per the above paragraph, we are where we are, and I question the
> actual value.

I understand seeing the benefits as not that huge in the grand scheme
of things, and might well be fine with another approach if there were
good reasons for it. However the reasons advanced so far don't seem
to hold much water :(

> If what I've been hearing here, or my assumptions and statements above
> are wrong, I'd love to hear it.

people are absolutely free to try and convince me I'm wrong if they like :)

Trev

>
> --
> jst
>

Nicolas B. Pierron

Oct 24, 2013, 8:10:15 PM

On 10/24/2013 04:32 PM, Trev Saunders wrote:
> afaik nobody plans to use the git mirrors for running any
> tests […].
>
> people are absolutely free to try and convince me I'm wrong if they like :)

No problem.

http://escapewindow.dreamwidth.org/239854.html :
> AreWeFastYet and some developers have already switched over just fine.

Ok, this is not all AreWeFastYet slaves, but only the B2G part of it.

The reason is that we have access to the mapfile and that it mirrors
inbound. And also that my git mirror stopped working and I have no time to
investigate.
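
For readers who have not used a mapfile: a minimal sketch of the kind of lookup it enables, assuming one "<git sha> <hg sha>" pair per line, separated by whitespace. That layout is an assumption for illustration, so check it against the actual file before relying on it. The SHAs below are made-up placeholders, not real commits.

```python
def parse_mapfile(lines):
    """Build git->hg and hg->git lookup tables from mapfile lines.

    Assumes each line is "<git sha> <hg sha>"; anything else is skipped.
    """
    git_to_hg, hg_to_git = {}, {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # blank or malformed line
        git_sha, hg_sha = parts
        git_to_hg[git_sha] = hg_sha
        hg_to_git[hg_sha] = git_sha
    return git_to_hg, hg_to_git


# Placeholder data: these SHAs are invented for the example.
SAMPLE = [
    "a" * 40 + " " + "1" * 40,
    "b" * 40 + " " + "2" * 40,
    "",
]

if __name__ == "__main__":
    git_to_hg, hg_to_git = parse_mapfile(SAMPLE)
    print(hg_to_git["1" * 40])  # the git commit for a given hg changeset
```

With both tables in hand, a tool that records results against hg changesets can keep doing so while building from the git mirror.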

--
Nicolas B. Pierron

Ehsan Akhgari

Oct 25, 2013, 6:34:57 PM

to Aki Sasaki, dev-pl...@lists.mozilla.org

johns suggested something about what to do with the existing github
mirror today which may work well.

One possible option for us is to push a non-fast-forward commit to all
of the branches on mozilla/mozilla-central which removes everything and
adds a readme file saying what happened to the repository and pointing
people to the location of the new repositories, the rebase helper
scripts, etc.

This means that if people try to merge/rebase from this remote, git will
reject such requests, and then they can take a look at the new contents
and hopefully follow the instructions to update their clone.
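
A rough sketch of one way to build such a commit, tried on a throwaway local repository. It is only an illustration of the idea, not a tested migration procedure: the temporary branch name `moved-tmp`, the README wording, and the commit message are all assumptions, and nothing is pushed. It also makes no attempt to show how existing clones react, which is the part that would need checking before doing this for real.

```python
import pathlib
import shutil
import subprocess
import tempfile


def git(repo, *args):
    """Run one git command inside `repo` and return its trimmed stdout."""
    cmd = ["git", "-c", "user.name=sketch",
           "-c", "user.email=sketch@example.invalid", *args]
    done = subprocess.run(cmd, cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def replace_with_readme(repo, branch, readme_text):
    """Point `branch` at a new parentless commit whose only file is README."""
    git(repo, "checkout", "--orphan", "moved-tmp")    # branch with no history
    git(repo, "rm", "-rf", "--ignore-unmatch", ".")   # drop every tracked file
    (pathlib.Path(repo) / "README").write_text(readme_text)
    git(repo, "add", "README")
    git(repo, "commit", "-m", "This repository has moved")
    git(repo, "branch", "-M", branch)                 # take over the old name
    # Publishing would then need a forced push, e.g.:
    #   git push --force <remote> <branch>
    return git(repo, "rev-parse", "HEAD")


def demo():
    """Run the idea on a temporary repo; returns (old_sha, new_sha, files)."""
    with tempfile.TemporaryDirectory() as repo:
        git(repo, "init")
        branch = git(repo, "symbolic-ref", "--short", "HEAD")
        (pathlib.Path(repo) / "code.txt").write_text("old contents\n")
        git(repo, "add", "code.txt")
        git(repo, "commit", "-m", "old history")
        old = git(repo, "rev-parse", "HEAD")
        new = replace_with_readme(repo, branch,
                                  "Moved; see the new repositories.\n")
        files = git(repo, "ls-tree", "--name-only", "HEAD").splitlines()
        return old, new, files


if __name__ == "__main__" and shutil.which("git"):
    print(demo())
```

Because the new commit has no parent, it shares no history with the old branch tip, which is what makes the update non-fast-forward. Repeating `replace_with_readme` per branch would cover "all of the branches".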

How does this sound?

Cheers,
Ehsan

On 2013-10-23 6:13 PM, Ehsan Akhgari wrote:
> On 2013-10-23 5:02 PM, Aki Sasaki wrote:

> I am not worried about people who follow these mailing lists actively,
> for those people it's just a few more clicks (which sucks, but still
> it's not too bad.) I am worried about people outside of those circles.
> I have seen random people watching this repo on github, and those
> people will never know if we moved to another URL. Likewise, I
> personally watch tons of non-Mozilla projects on github, and I would be
> left stranded if one of these projects decided to change their github URL.
>

>>>> * The name matches git.mozilla.org's mirror:
>>>> http://git.mozilla.org/?p=integration/gecko-dev.git;a=summary
>>>
>>> To be fair, that name probably doesn't mean much to people outside of
>>> RelEng. All of our developers know what mozilla-central is. I'm not
>>> sure if the same could be asserted about "integration-gecko-dev". :-)
>>>
>>> But the naming of things doesn't matter much to me.
>>
>> It may make more sense in the context of the three repos:
>>
>> * releases/gecko.git , which is partner-oriented, and our highest
>> priority to keep sane (currently lives on git.m.o only),
>>
>> * integration/gecko-dev, which is developer-oriented, and we want to
>> offer a strong SLA for. It contains all release and inbound branches
>> (currently lives on both github and git.m.o), and
>>
>> * integration/gecko-projects, which contains mercurial repos without
>> strict pre-commit hooks, and are periodically reset; RelEng reserves the
>> right to reset the repo if these cause vcs-sync issues.
>>
>> All three share the same SHAs for shared commits.
>>
>> :joduinn was planning to write a blog post about this, but I think the
>> above encapsulates that.
>

Aki Sasaki

Oct 25, 2013, 6:38:30 PM

to Ehsan Akhgari, dev-pl...@lists.mozilla.org

That sounds like it could be a solution that addresses most, if not all,
of the migration concerns.
We should be very certain that we're all ready to switch before that
happens, and make this the very last step of the migration.

On 10/25/13 3:34 PM, Ehsan Akhgari wrote:
> johns suggested something about what to do with the existing github
> mirror today which may work well.
>
> One possible option for us is to push a non-fast-forward commit to all
> of the branches on mozilla/mozilla-central which removes everything
> and adds a readme file saying what happened to the repository and
> pointing people to the location of the new repositories, the rebase
> helper scripts, etc.
>
> This means that if people try to merge/rebase from this remote, git
> will reject such requests, and then they can take a look at the new
> contents and hopefully follow the instructions to update their clone.
>
> How does this sound?
>
> Cheers,
> Ehsan
>
> On 2013-10-23 6:13 PM, Ehsan Akhgari wrote:
>> On 2013-10-23 5:02 PM, Aki Sasaki wrote:

>> I am not worried about people who follow these mailing lists actively,
>> for those people it's just a few more clicks (which sucks, but still
>> it's not too bad.) I am worried about people outside of those circles.
>> I have seen random people watching this repo on github, and those
>> people will never know if we moved to another URL. Likewise, I
>> personally watch tons of non-Mozilla projects on github, and I would be
>> left stranded if one of these projects decided to change their github
>> URL.
>>

>>>>> * The name matches git.mozilla.org's mirror:
>>>>> http://git.mozilla.org/?p=integration/gecko-dev.git;a=summary
>>>>
>>>> To be fair, that name probably doesn't mean much to people outside of
>>>> RelEng. All of our developers know what mozilla-central is. I'm not
>>>> sure if the same could be asserted about "integration-gecko-dev". :-)
>>>>
>>>> But the naming of things doesn't matter much to me.
>>>
>>> It may make more sense in the context of the three repos:
>>>
>>> * releases/gecko.git , which is partner-oriented, and our highest
>>> priority to keep sane (currently lives on git.m.o only),
>>>
>>> * integration/gecko-dev, which is developer-oriented, and we want to
>>> offer a strong SLA for. It contains all release and inbound branches
>>> (currently lives on both github and git.m.o), and
>>>
>>> * integration/gecko-projects, which contains mercurial repos without
>>> strict pre-commit hooks, and are periodically reset; RelEng reserves
>>> the right to reset the repo if these cause vcs-sync issues.
>>>
>>> All three share the same SHAs for shared commits.
>>>
>>> :joduinn was planning to write a blog post about this, but I think the
>>> above encapsulates that.
>>