I'm pleased to announce yet another tool for importing darcs repositories
to git. Unlike darcs2git [1] and darcs-to-git [2], it's written in
Haskell, on top of the darcs2 source code. The result is a much faster
program - it can convert the complete ghc 6.9 branch (without libraries)
in less than 15 minutes on my slightly dated machine (Athlon XP 2500+),
which is quite fast [3]. Incremental updates work, too.
The program is still rough around the edges, and there's some cosmetic
work to do, especially with respect to converting author names. The
program should recover from most errors, as long as nobody else modifies
the destination repository.
Nevertheless, it seems quite usable already. I hope somebody finds
this useful.
You can grab the source at
http://int-e.home.tlink.de/haskell/git-darcs-import-0.1.tar.bz2
Look at the README for further information.
Credits go to:
David Roundy and all contributors for darcs2. The code base is
surprisingly pleasant to work with.
And of course, Linus Torvalds, Junio Hamano and all other git
contributors.
Enjoy,
Bertram
[1] http://repo.or.cz/w/darcs2git.git?a=shortlog
[2] http://git.sanityinc.com/?p=darcs-to-git.git
[3] http://nominolo.blogspot.com/2008/05/thing-that-should-not-be-or-how-to.html
_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
> Hi,
>
> I'm pleased to announce yet another tool for importing darcs repositories
> to git. Unlike darcs2git [1] and darcs-to-git [2], it's written in
> Haskell, on top of the darcs2 source code. The result is a much faster
> program - it can convert the complete ghc 6.9 branch (without libraries)
> in less than 15 minutes on my slightly dated machine (Athlon XP 2500+),
> which is quite fast [3]. Incremental updates work, too.
>
Nice! Do you happen to also have a darcs (or Git) repository somewhere?
/ Thomas
--
Monkey killing monkey killing monkey over pieces of the ground.
Silly monkeys give them thumbs they forge a blade
And where there's one they're bound to divide it
Right in two
I've uploaded my (git) repo to repo.or.cz, see
http://repo.or.cz/w/git-darcs-import.git
Patches are welcome.
enjoy,
Bertram
What's the appeal of this? I personally love git, but I thought all
the cool kids at this school used darcs and that was that.
--
Darrin
Disclaimer: I'm no expert, this is what I've heard. Anyone please
confirm or deny the following?
Basically, git is waaay faster than Darcs on a number of use cases.
So, maybe the point of using this converter is when you just cannot
use Darcs any more (too old/big project, merging huge branch with
loads of conflicts, I don't know).
Another point may be "broadcast-ability": It is possible to expose two
repositories: one Darcs, one Git. If I use Git and not Darcs (please
don't sue me), it will be simpler for me to get the source from the
Git snapshot, provided there is one. Well, if I want to contribute
back... maybe I should switch.
I think the True Heresy (and most useful, if practical) would be to
convert back and forth between the two version control systems,
accepting patches from both :-)
Loup
Another reason can be "git rebase". Of course, there is the question of
how good a practice it is ... but it is being used.
Peter.
Darcs patches are pretty much an implicit rebase.
--
Aaron Denney
-><-
You cannot push patch B if it depends on patch A without also
pushing A. And darcs currently does not allow you to reorder
B before A (which is what git rebase actually does). Git rebase
works quite well even in cloned repositories.
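The dependency rule above can be illustrated with a toy model. This is only a sketch, not darcs code: the function name `push_closure` and the `depends_on` table are hypothetical, and real darcs derives dependencies from the patch contents rather than from a hand-written table.

```python
def push_closure(patch, depends_on):
    """Return the set of patches that must travel together with `patch`."""
    needed = set()
    stack = [patch]
    while stack:
        current = stack.pop()
        if current not in needed:
            needed.add(current)
            # Everything `current` depends on has to be pushed as well.
            stack.extend(depends_on.get(current, ()))
    return needed


# Hypothetical repository: B depends on A, C is independent.
depends_on = {"A": set(), "B": {"A"}, "C": set()}

print(sorted(push_closure("B", depends_on)))  # ['A', 'B']
print(sorted(push_closure("C", depends_on)))  # ['C']
```

In this model, pushing B alone is impossible by construction, which is the behaviour described above; reordering B before A would need a different table.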
See: http://bugs.darcs.net/issue891
Some discussion about it is also here:
http://lists.osuosl.org/pipermail/darcs-users/2008-February/011564.html
When the issue is fixed then darcs will be really patch based and
will become the ultimate DSCM :-)
True. This is a *feature* not a bug. You shouldn't be able to do this
automatically, because it can't be done right. You need to do this sort
of thing manually. If you don't, the heuristics used will bite you at
some point. When they do commute, there is no problem.
> Git rebase works quite well even in cloned repositories.
Meh. It can, if you're really really lucky.
> See: http://bugs.darcs.net/issue891
> Some discussion about it is also here:
> http://lists.osuosl.org/pipermail/darcs-users/2008-February/011564.html
>
> When the issue is fixed then darcs will be really patch based and
> will become the ultimate DSCM :-)
Rebasing is doable in git as a one-repository operation because each
repository has multiple branches. As darcs has one repo per branch,
it fundamentally needs to be done in multiple repos.
There are naturally two repos, upstream, and your-feature-development.
your-feature-development has a patch A that you want to rebase.
What you should do is pull upstream into new-tracking, then pull patch A
from your-feature-development into new-tracking.
If it applies with no problem, great: mv your-feature-development
your-feature-development-old; mv new-tracking your-feature-development.
Of course, in this case, you could have just pulled into
your-feature-development. If there weren't any other patches to save in
the old your-feature-development, you can delete it instead of moving
it.
When there is a conflict, then you need to handle it somehow. Neither
git nor darcs can do it automatically. You can just record the merge
conflict and your resolution. This keeps repos that pulled from you
valid, but this won't give you the "clean history" that you presumably
want. So you need to combine the merger and cleanup into a new patch
with the same log message, etc. It's true that git does make *this*
process very nice.
There is one thing that git rebase does easily (and correctly) that darcs
doesn't do nicely: rewriting history by merging commits "prior" to the
head. I put prior in quotes, because darcs doesn't preserve history
in the first place. I don't find that a compelling use, as opposed to
maintaining topic branches.
--
Aaron Denney
-><-
For myself, git-darcs-import itself is an opportunity to learn more
about both darcs and git. It wasn't meant to be an argument in the git
vs. darcs discussion, although it was inevitable that it would be
seen as such.
I really like darcs' concepts, but in my opinion, darcs doesn't get
enough power out of the theory of patches to really shine so far.
This is a hard problem, and I can't offer solutions. Ideally, you'd have
semantic patches which just commute with virtually all other patches
because they "know" what they are about. The only thing that darcs
offers in that direction - besides handling conflicts, mergers and
undos gracefully, which is quite useful in itself - is a keyword
substitution patch type.
In the meantime, I prefer git to darcs, mainly because I'm sort of
attached to seeing the development history, i.e. I prefer to think of
patches as (partially) ordered instead of being a cloud of patches
that darcs uses as a model.
Bertram
Sorry, I did not intend to indicate it should be done without doing the
reordering first (by providing manual conflict resolution).
>> Git rebase works quite well even in cloned repositories.
>
> Meh. It can, if you're really really lucky.
Actually you are probably right, I needed to use a non-complicated
workaround once (but I did it only about two times!). I might have
been just lucky. I liked though that it did tell me what was wrong,
in contrast to mercurial queues which just replicated both original
branch and the rebased branch (so I finished with two copies on
both sides at the end :-( ).
<--- cut --->
> Rebasing is doable in git as a one-repository operation because each
> repository has multiple branches. As darcs has one repo per branch,
> it fundamentally needs to be done in multiple repos.
>
> There are naturally two repos, upstream, and your-feature-development.
>
> your-feature-development has a patch A that you want to rebase.
>
> What you should do is pull upstream into new-tracking, then pull patch A
> from your-feature-development into new-tracking.
>
> If it applies with no problem, great: mv your-feature-development
> your-feature-development-old; mv new-tracking your-feature-development.
> Of course, in this case, you could have just pulled into
> your-feature-development. If there weren't any other patches to save in
> the old your-feature-development, you can delete it instead of moving
> it.
>
> When there is a conflict, then you need to handle it somehow. Neither
> git nor darcs can do it automatically. You can just record the merge
> conflict and your resolution. This keeps repos that pulled from you
> valid, but this won't give you the "clean history" that you presumably
> want. So you need to combine the merger and cleanup into a new patch
> with the same log message, etc. It's true that git does make *this*
> process very nice.
Ok, in such a simple case darcs can preserve the message too if the
repository is not cloned (and you indicated that it does not really
work with cloned repositories in git - I'm not an experienced git user).
Just pull to the original repository and use amend-record to resolve
the conflict, and the message will be preserved. So I would say that
for *this* *simple* case darcs is better.
But what about this git rebasing option? How can it be done more easily
in darcs than with the solution I know (described below)? I mean
using "git-rebase --onto master next topic" to get from:
    o---o---o---o---o  master
         \
          o---o---o---o---o  next
                           \
                            o---o---o  topic
to:
    o---o---o---o---o  master
        |            \
        |             o'--o'--o'  topic
         \
          o---o---o---o---o  next
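To make the shape of that operation concrete, here is a toy model of the commit graph only. It is a sketch under stated assumptions: the commit names (m1..m5, n1..n5, t1..t3) and the helpers `ancestors`, `chain` and `rebase_onto` are made up for illustration, and real git replays the actual changes (possibly stopping for conflicts) rather than just relinking names.

```python
def ancestors(commit, parent):
    """Return `commit` and all its ancestors, newest first."""
    result = []
    while commit is not None:
        result.append(commit)
        commit = parent.get(commit)
    return result


def chain(names, start):
    """Build parent links for a straight line of commits on top of `start`."""
    parent = {}
    previous = start
    for name in names:
        parent[name] = previous
        previous = name
    return parent


def rebase_onto(new_base, upstream, branch, parent):
    """Toy version of `git rebase --onto new_base upstream branch`."""
    excluded = set(ancestors(upstream, parent))
    # Commits reachable from `branch` but not from `upstream`, oldest first.
    to_replay = [c for c in ancestors(branch, parent) if c not in excluded]
    to_replay.reverse()
    new_parent = dict(parent)
    tip = new_base
    for commit in to_replay:
        copy = commit + "'"  # the rewritten commit, as in o' in the diagram
        new_parent[copy] = tip
        tip = copy
    return new_parent, tip


# The "before" picture: next forks from master, topic forks from next.
parent = {}
parent.update(chain(["m1", "m2", "m3", "m4", "m5"], None))
parent.update(chain(["n1", "n2", "n3", "n4", "n5"], "m2"))
parent.update(chain(["t1", "t2", "t3"], "n5"))

new_parent, topic = rebase_onto("m5", "n5", "t3", parent)
print(ancestors(topic, new_parent))
```

After the call, the rewritten topic sits directly on top of master, while next is left untouched.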
This is the reason why I mentioned that reordering dependent patches AB
to BA (with manual conflict resolution) would be needed in darcs
to support a (I believe better) alternative to git rebase.
I do not know how to do this in darcs without manually adding the
"topic" changes with the GNU patch utility to a new clone of the darcs
repository that has neither the "topic" changes nor the "next" changes
pulled in, and throwing away the old one at the end.
> There is one thing that git rebase does easily (and correctly) that darcs
> doesn't do nicely: rewriting history by merging commits "prior" to the
> head. I put prior in quotes, because darcs doesn't preserve history
> in the first place. I don't find that a compelling use, as opposed to
> maintaining topic branches.
I do not know what you mean here. Can you point me to some example?
I hope that this is not too off-topic for haskell cafe ... and so far
I believe this is not a flame war :-) I just like that Bertram's code
exists and I think it (as well as git) should not be dismissed, since
AFAIK there is more than performance to git as well as there is more
to darcs than it not imposing patch order on us (which is the darcs
feature I like).
Peter.
I don't understand (probably because I haven't used either DVCS).
Either the changes in the next->topic path don't depend on the changes
in the fork->next path. Then, the patches commute and it's no problem
for darcs.
Or the next->topic path relies on features from next that are not
present in master. But then, you're screwed anyway and should merge
some parts from next into master so as to advance the point where
master and next fork.
    o---o---o---o---o  master
         \
          x---x---o---o---o  next
                           \
                            o---o---o  topic
(Of course, you don't actually advance the fork but rather add patches
at the end of master. Hm, set of patches semantics seem to be a lot
nicer here anyway. To me, the whole point of rebasing seems to be to
somehow bring set semantics into the tree semantics.)
Regards,
apfelmus
apfelmus answered this. I might expand on his reply.
>> There is one thing that git rebase does easily (and correctly) that darcs
>> doesn't do nicely: rewriting history by merging commits "prior" to the
>> head. I put prior in quotes, because darcs doesn't preserve history
>> in the first place. I don't find that a compelling use, as opposed to
>> maintaining topic branches.
>
> I do not know what you mean here. Can you point me to some example?
Letting capitals be commits, and lowercase be trees at the point of
these commits.
Suppose your history is:
A -> B -> C -> D
|    |    |    |
a    b    c    d
And that B somehow doesn't make sense except with the additional changes
in C. You don't want to deal with this, or have anyone see B. All it
does is clutter up the history. So you want to expunge it from the
history.
git rebase can rewrite this to
A ------> C' -> D'
|         |     |
a         c     d
Doing this in darcs would require unrecording B and C, and then
rerecording C'. But, if D is in the repo, then it is likely that B and
C can't be commuted past it to be unrecorded. (If they can, no
problem!)
Unrecording D (and possibly E, F, G, etc.) lets you do this, but if you
then pull it back from another repo, it will depend on B and C, and pull
these in, which are now doppelgangers of C'. Not having used darcs 2,
I'm not sure if that's still quite so fatal, but it remains bad news
AIUI.
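The git side of this rewrite can be sketched with a toy history of (commit, tree) pairs. This is an illustration only: `squash_into_next` is a hypothetical helper, and it models nothing but the bookkeeping, namely that the trees c and d survive unchanged while B disappears and the later commits get new identities.

```python
def squash_into_next(history, name):
    """Drop commit `name`, folding its changes into the commit after it.

    `history` is a list of (commit, tree) pairs, oldest first. Because each
    commit stores a whole tree, removing one only means forgetting its
    snapshot; every later commit keeps its tree but becomes a new commit,
    marked here with a prime.
    """
    index = [commit for commit, _ in history].index(name)
    kept = history[:index]
    rewritten = [(commit + "'", tree) for commit, tree in history[index + 1:]]
    return kept + rewritten


history = [("A", "a"), ("B", "b"), ("C", "c"), ("D", "d")]
print(squash_into_next(history, "B"))
# [('A', 'a'), ("C'", 'c'), ("D'", 'd')]
```

The primes are where the trouble described above comes from: anyone who still holds the old B, C and D holds different commits from C' and D'.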
The bottom line is that darcs is a tool for managing sets of always
existing patches, and ordering them lazily, as needed. In particular,
no history generally exists, unless each patch depends on exactly one
previous. It has a "differential" view of software development, in that
the changes, and not the sum at each point matter (though of course, the
current sum does matter.)
On the other hand, git is a tool for managing (and munging) histories
of development in many weird and wacky ways. It has an "integral"
view of software development: the changes are lazily derived from the
saved state at each point, and are strictly ordered even when they're
independent. It can, when needed, work with these changes to accomplish
fairly interesting history-altering tasks, but as soon as they're used
to construct a new history, they're discarded. (Yes, git uses deltas,
but this is "merely" an optimization.)
The two models are dual to each other in many ways.
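A small sketch of that duality, under heavy simplification: a tree is modelled as a dict from file name to contents, and a change as a dict of new contents (with None meaning "file removed"). The names `diff`, `apply_delta`, `to_changes` and `to_snapshots` are invented for this example and correspond to nothing in either tool.

```python
def diff(old, new):
    """Derive a change from two snapshots (the "integral" to "differential" step)."""
    delta = {path: text for path, text in new.items() if old.get(path) != text}
    for path in old:
        if path not in new:
            delta[path] = None  # None marks a removed file
    return delta


def apply_delta(state, delta):
    """Apply a change to a snapshot (the "differential" to "integral" step)."""
    result = dict(state)
    for path, text in delta.items():
        if text is None:
            result.pop(path, None)
        else:
            result[path] = text
    return result


def to_changes(snapshots):
    """Snapshot history -> list of changes between consecutive snapshots."""
    return [diff(a, b) for a, b in zip(snapshots, snapshots[1:])]


def to_snapshots(initial, changes):
    """Initial snapshot plus ordered changes -> snapshot history."""
    states = [initial]
    for delta in changes:
        states.append(apply_delta(states[-1], delta))
    return states


snapshots = [
    {},
    {"README": "v1"},
    {"README": "v1", "Main.hs": "main = return ()"},
    {"Main.hs": "main = return ()"},
]
changes = to_changes(snapshots)
print(changes)
print(to_snapshots(snapshots[0], changes) == snapshots)  # True
```

Going from snapshots to changes and back loses nothing here, but only because the changes are kept in one fixed order; the sketch says nothing about commuting them.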
--
Aaron Denney
-><-
Right. Then
>>     o---o---o---o---o  master
>>          \
>>           o---o---o---o---o  next
>>                            \
>>                             o---o---o  topic
is not a good model for what darcs has. What it has is more like
>>     o---o---o---o---o  master
>>         |\
>>         | o---o---o---o---o  next
>>         \                 |
>>          o---o---o--------+  topic
The patches in "topic" that are in "next" are independent of the ones that
aren't in "next", so it's another (virtual) line-of-development, that
darcs can lazily construct as needed. These lines-of-development are
similar to branches of git that have been merged, but you also have
access to the "unmerged" versions until a patch comes in that depends on
the merger.
If I commit three new features that don't interact, a darcs repo will
essentially look like:
         ---- topicA -
        /             \
history --- topicB --+--
        \             /
         ---- topicC -
Where the merger is "virtual". Darcs will implicitly linearize this to
any of
history --- topicA --- topicB --- topicC ---
history --- topicA --- topicC --- topicB ---
history --- topicB --- topicA --- topicC ---
history --- topicB --- topicC --- topicA ---
history --- topicC --- topicA --- topicB ---
history --- topicC --- topicB --- topicA ---
/as needed/. git constructs one of these, based on how you did the
commits, and gives you ways to alter it to the others.
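The six orderings above are simply all permutations of three independent patches. A brute-force sketch (the `linearizations` helper and the `depends_on` table are hypothetical, and darcs itself does not enumerate orders like this) shows how a dependency would cut the list down:

```python
from itertools import permutations


def linearizations(patches, depends_on):
    """All orderings in which every patch comes after the patches it depends on."""
    valid = []
    for order in permutations(patches):
        position = {patch: i for i, patch in enumerate(order)}
        if all(
            position[dep] < position[patch]
            for patch in order
            for dep in depends_on.get(patch, ())
        ):
            valid.append(order)
    return valid


topics = ["topicA", "topicB", "topicC"]

# Three features that don't interact: every order is acceptable.
print(len(linearizations(topics, {})))  # 6

# Assumed variation: topicC depends on topicA, so A must come first.
print(len(linearizations(topics, {"topicC": ["topicA"]})))  # 3
```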
> Or the next->topic path relies on features from next that are not
> present in master . But then, you're screwed anyway
Yep.
> and should merge some parts from next into master so as to advance the
> point where master and next fork.
That's one solution. Of course, darcs doesn't have semantic dependency,
but syntactic dependency. (You can add extra dependencies to
model semantic dependencies, but you can't take away the syntactic
dependencies.) Another solution, if there's syntactic,
but not semantic dependencies, is to manually use patch and diff to get
90% there, and then cleanup and record.
--
Aaron Denney
-><-
Well, not really; it depends on what kind of dependency it is. This kind of
rebase is useful when "topic" depends only syntactically (as you pointed out
later) on "next", or when the semantic dependency is only on a small part of
"next". Git rebase allows you to get the syntax or the small part of semantics to the
rebased "topic" by asking you for (manual) conflict resolution. This would
correspond to commuting darcs patches which depend on each other (again
possible by providing manual conflict resolution).
Of course this happens only when it was anticipated that upstream merge
of "next" happens before "topic", but then the upstream maintainers
decided that "topic" should go upstream first. So, not often.
>> and should merge some parts from next into master so as to advance the
>> point where master and next fork.
>
> That's one solution. Of course, darcs doesn't have semantic dependency,
> but syntactic dependency. (You can add extra dependencies to
> model semantic dependencies, but you can't take away the syntactic
> dependencies.) Another solution, if there's syntactic,
> but not semantic dependencies, is to manually use patch and diff to get
> 90% there, and then cleanup and record.
OK, so I think this is what I expected for such a case.
Thanks for the explanation of the meaning of "merging patches prior head".
Peter.
I've never been a cool kid at school, but I switched from Darcs to Git
recently. I have not regretted it. Git has quite a few features Darcs
doesn't by now, and there is a little bit (but not much) in the other
direction. That and the lack of the idempotent merge bug.
Git's interface has really cleaned up in the last year, and it seems to
be well on the way to becoming the de facto DVCS of choice. Maybe next
week, when it's picked up the last of the superdelegates, we can say for
sure, but of course bzr won't concede anything at this point....
(OK, so we've had mind-numbing election coverage here in the US for too
long)
I've blogged about this. http://changelog.complete.org/plugin/tag/git
will get you most of the relevant posts.
-- John