I was planning to write a blog post about this but I didn't have time,
so I just posted something here:
<https://github.com/ehsan/mozilla-history-tools/blob/master/initial_conversaion/README.md>.
I will try to set up something to keep this repository up to date with
our changes on mozilla-central.
Cheers,
Ehsan
Hm! Could we back-convert this to Mercurial and replace m-c with the
result? (I still very much prefer the Mercurial UX, and this would mean
that we got the full blame on hg.m.o.)
zw
> Hm! Could we back-convert this to Mercurial and replace m-c with the
> result? (I still very much prefer the Mercurial UX, and this would mean
> that we got the full blame on hg.m.o.)
>
I don't think so. Every Mercurial changeset ID is a cryptographic hash that
covers its ancestors, so all of the SHA1 revision ids would change. This would
invalidate dependent repositories and any hg.m.o links, and there are quite a
few of those strewn about.
Someone could, however, create a separate hg clone of mozilla-history
without too much trouble.
-bholley
> I don't think so. Every mercurial commit is a cryptographic function of its
> ancestors, so all of the SHA1 revision ids would change. This would
> invalidate dependent repositories and any hg.m.o links, and there's quite a
> few of those strewn about.
I think it should be possible to insert an extra commit, just before
the existing hg commit 0, such that the new version of commit 0 has
the same hash as the old one (though perhaps we don't want to go down
this route).
Paul
--
Paul Biggar
Compiler Geek
pbi...@mozilla.com
@paulbiggar
If I remember correctly, a revision's parent is part of the data used to
generate the SHA1 identifier for the revision. So reparenting a
revision changes its SHA1 (and consequently, all of the descendants' too).
Ehsan
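To illustrate the point above, here is a minimal sketch of how a Mercurial-style revision hash depends on both the parent IDs and the revision's own content. It is a simplified model, not Mercurial's actual code: the function name `node_id` and the sample revision texts are made up for illustration, and a real changeset's text also carries the manifest hash, author, date and so on.

```python
import hashlib

NULL_ID = b"\0" * 20  # stand-in for "no parent"


def node_id(text, p1=NULL_ID, p2=NULL_ID):
    """SHA-1 over the two parent IDs (sorted) followed by the revision text."""
    lo, hi = sorted((p1, p2))
    h = hashlib.sha1(lo)
    h.update(hi)
    h.update(text)
    return h.digest()


# Existing history: a root revision and one child.
root = node_id(b"initial import")
child = node_id(b"fix a bug", p1=root)

# Same revision texts, but with older history grafted underneath the root.
older = node_id(b"imported CVS history")
new_root = node_id(b"initial import", p1=older)
new_child = node_id(b"fix a bug", p1=new_root)

print("root changed: ", root.hex() != new_root.hex())
print("child changed:", child.hex() != new_child.hex())
```

Because the parent ID is an input to the hash, giving the old root a new parent changes its ID, and that change propagates to every descendant.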
Someone should file a good-first-bug.
Nick
What about taking the current m-c repo and a "full history" repo that
ends up with exactly the same bits for the source tree but gets there a
different way (adding in the old history), then merging the two
together? People with a pre-merge m-c checkout would need to merge their
changes in instead of simply committing them, but as soon as everyone
rebased/merged on top of the new repo we'd be ok.
The trick would be to do the merge in the right direction so that hg
annotate/blame would follow the "full history" fork of the graph, not
the current truncated one.
Am I talking sense? It sounds plausible to me, but I'm probably missing
something obvious. I don't know how hg bisect would deal with not having
a common ancestor before the merge, but it seems like you'd have to give
it a starting point on one side or the other anyway.
More than that, changes are also part of the data used to generate the
SHA1 identifier.
> What about taking the current m-c repo and a "full history" repo
> that ends up with exactly the same bits for the source tree but gets
> there a different way (adding in the old history), then merging the
> two together? People with a pre-merge m-c checkout would need to
> merge their changes in instead of simply committing them, but as
> soon as everyone rebased/merged on top of the new repo we'd be ok.
I think that would be a terrible idea. Most operations that deal with
history are already unbearably slow with the current 70,000 changesets
(being an order of magnitude slower than git *is* unbearable). Adding
more changesets is not going to help.
Another reason is that Mercurial is really bad at storing these things,
and a clone would take a whole lot more space than it currently does.
For instance, my mozilla-central .hg directory is currently already
bigger than ehsan's mozilla-history.git repository...
Mike
How long does it take you to hg clone mozilla-central from scratch? For
me, on my VM, it takes a few minutes. Now triple that length of time
(assuming mozilla-history is ~2x the size of m-c right now). The time
spent doing simple hg operations is bad enough right now; your proposal
makes it untenably bad. I've seen proposals in hg to avoid checking out
full history, and a few partial implementations... maybe we should get
somebody to actually implement those.
It would be nice if you could push the mapfile as well in a separate
branch so that anyone can possibly set up hg-git with it :)
[eg. to keep it updated]
By the way, some time ago I also pushed a git mirror whose
particularity is that it contains not only m-c but all of our main
Mercurial repositories (m-c, m-i, fx-team, devtools, ux and so on)
within a single repo, using git's lightweight branches:
https://github.com/neonux/mozilla-all
It might also interest people who prefer git for day-to-day work
but do not usually work on m-c directly.
Cheers,
On Wed, Aug 24, 2011 at 01:57, Ehsan Akhgari <ehsan....@gmail.com> wrote:
> I'm happy to announce https://github.com/ehsan/mozilla-history, which is a
> Git repository containing *the entire* history of the Mozilla project. It
> is very useful to get blames which don't end at the Mercurial migration
> date, and go through all of the history. I hope that it would be useful to
> make our developers more productive when looking at our source code.
>
> I was planning to write a blog post about this but I didn't have time, so I
> just posted something here:
> <https://github.com/ehsan/mozilla-history-tools/blob/master/initial_conversaion/README.md>.
> I will try to set up something to keep this repository up to date with our
> changes on mozilla-central.
>
> Cheers,
> Ehsan
> _______________________________________________
> dev-platform mailing list
> dev-pl...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-platform
>
> How long does it take you to hg clone mozilla-central from scratch? For me,
> on my VM, it takes a few minutes. Now triple that length of time (assuming
> mozilla-history is ~2x the size of m-c right now). The time spent doing
> simple hg operations is bad enough right now; your proposal makes it
> untenably bad. I've seen proposals in hg to avoid checking out full history,
> and a few partial implementations... maybe we should get somebody to
> actually implement those.
Why can't we get somebody to just fix Mercurial to not be slow in the first
place?
- Kyle
> Why can't we get somebody to just fix Mercurial to not be slow in the first
> place?
Why are there 70000 changesets anyway? Is that typical?
--
Warning: May contain traces of nuts.
One significant problem I'm seeing there is that an m-c clone is already
pretty huge (half a GB for the .hg data only), and with that it would
grow much larger still. That was one of the major reasons why we even
started from zero when switching to Mercurial.
If "shallow clones" (i.e. with reduced history) worked, that would
be easier, but we are not there yet, AFAIK.
git has a more efficient and compact storage backend, but I guess that
a mozilla-history clone is still pretty large.
I actually still like the bonsai blame UI over the hgweb and the github
ones, though at least the github one, as space-inefficient as its
display is, makes a bit more data readily visible about the commits that
changed certain lines.
Robert Kaiser.
--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community should think about. And most of the
time, I even appreciate irony and fun! :)
Because then it would just be a git implementation in python (or maybe
it's even python that contributes to it being slow for such large
repos). ;-)
The reason why git is so much faster is very probably mainly because its
(history) storage backend is so much more efficient.
Robert Kaiser
Hey, I said I didn't like git's *user interface*, not its speed or disk
usage :) Give me Mercurial's CLI and extension hooks on top of Git's
storage model and I'll be happy.
(Caveat: I have read, but cannot presently find where I read, that Git
does not record branch information permanently in its revision history;
branches are apparently just ephemeral pointers to tip revisions. If
that's true, it might be a serious problem when doing archaeology.)
zw
For a project of Mozilla's size and age, 70k changesets is on the small
side. As rough points of comparison, GCC's repository is almost 180k
revisions and LLVM's repository (which hosts several projects) is
approaching 140k revisions.
-Nathan
The Linux kernel is 263K revisions, and only starts in 2005.
Mike
Do you have a plan to set up *comm-history* too?
--
hiro
And give me a CVS blame/bonsai-like UI as well, then I'd be happy.
It is surprising that both hg and git are missing a good UI for
blame/annotate.
Yes, I'm missing that too.
With regression range finding, I need to see what files are being
changed. Current UI is not really helpful for that.
Regards,
Martijn
>>
>> (Caveat: I have read, but cannot presently find where I read, that Git
>> does not record branch information permanently in its revision history;
>> branches are apparently just ephemeral pointers to tip revisions. If
>> that's true, it might be a serious problem when doing archaeology.)
>>
>> zw
--
Martijn Wargers - Help Mozilla!
http://quality.mozilla.org/
http://wiki.mozilla.org/Mozilla_QA_Community
irc://irc.mozilla.org/qa - /nick mw22
That's because in March of 2007, after roughly 9 years of Mozilla
history in CVS, we split everything other than the core platform and
Firefox into separate repositories and started the new
mozilla-central repo in hg without any history.
Incidentally, this is the topic the OP started this thread with, as he
connected the old CVS history to the new hg history in a common git
repo, and there it's "a few" more than 70k changesets, I guess...
Seconded.
You can get that from the command line pretty easily, though, with
"hg log -v". hg log can also take revision ranges and revsets (for
some fun with revsets, see http://www.selenic.com/blog/?p=744 ).
-David
--
𝄞 L. David Baron http://dbaron.org/ 𝄂
𝄢 Mozilla Corporation http://www.mozilla.com/ 𝄂
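As a small illustration of the suggestion above, here is a sketch that builds an `hg log -v` command line for a regression range. The helper name `hg_log_command` and the two revision identifiers are hypothetical; `-v`, `-r` and the `A::B` revset are standard Mercurial features.

```python
import shlex


def hg_log_command(good, bad):
    """Build an hg log invocation covering a regression range.

    -v makes hg log list the files touched by each changeset, and the
    revset "good::bad" selects changesets that are descendants of `good`
    and ancestors of `bad` (endpoints included).
    """
    return ["hg", "log", "-v", "-r", "{}::{}".format(good, bad)]


# Placeholder revision ids; substitute the ends of a real regression range.
cmd = hg_log_command("abc123", "def456")
print(shlex.join(cmd))

# To actually run it inside a Mercurial checkout:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if a more elaborate revset is used.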
I have published the fixed git-mapfile here:
<https://github.com/ehsan/mozilla-history-tools/blob/master/initial_conversaion/git-mapfile.new>.
I'm planning to create a script which does the syncing automatically
and also automatically updates the git-mapfile for others to grab.
There seem to be a few bugs with hg-git corrupting the author line. I
haven't gotten around to debugging them yet, but I'll try to look into it next week.
Cheers,
Ehsan
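For anyone who wants to look up commits in the mapfile, here is a minimal sketch of a parser. It assumes that each line holds a git SHA and the matching hg SHA separated by a single space; that layout is an assumption to verify against the actual file. The function name `parse_mapfile` and the sample hashes are invented for illustration.

```python
def parse_mapfile(lines):
    """Return (git_to_hg, hg_to_git) dicts from mapfile lines.

    Assumed line format: "<git sha> <hg sha>".
    """
    git_to_hg = {}
    hg_to_git = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        git_sha, hg_sha = line.split(" ", 1)
        git_to_hg[git_sha] = hg_sha
        hg_to_git[hg_sha] = git_sha
    return git_to_hg, hg_to_git


# Made-up hashes standing in for real mapfile entries.
sample = [
    "a" * 40 + " " + "b" * 40,
    "c" * 40 + " " + "d" * 40,
    "",
]
git_to_hg, hg_to_git = parse_mapfile(sample)
print(len(git_to_hg), "entries")
```

With a real file, `parse_mapfile(open(path))` would give a lookup table in both directions, which is handy when translating an hg.m.o link into a git commit.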