BitBucket: GPL-ed KitBeeper clone

Adam J. Richter

unread,

Mar 1, 2003, 7:15:02 PM3/1/03

to and...@suse.de, linux-...@vger.kernel.org, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

Pavel Machek wrote:
> I've created little project for read-only (for now ;-) kitbeeper
> clone. It is available at www.sf.net/projects/bitbucket (no tar balls,
> just get it fresh from CVS).

Thank you for taking some initiative and improving this
situation by constructive means. You are an example to us all,
as is Andrea Arcangeli with his openbkweb project, which you
will probably want to examine and perhaps integrate
(ftp://ftp.kernel.org/pub/linux/kernel/people/andrea/openbkweb).

bitbucket is about 350 lines of shell scripts, documentation
and diffs, the most interesting file of which is FORMAT, which
documents some reverse engineering efforts on bitkeeper internal file
formats. bitkbucket currently uses rsync to update data from the
repository. openbkweb is 500+ lines of python that implements enough
of the bitkeeper network protocol to do downloads, although perhaps in
inefficiently. That sounds like some functionality that you might be
interested in integrating.

I think the suggestion made by Pavel Janik that it would
be better to work on adding BitKeeper-like functionality to existing
free software packages is a bit misdirected. BitKeeper uses SCCS
format, and we have a GPL'ed SCCS clone ("cssc"), so you are
adding functionality to existing free software version control
code anyhow.

However, I would like to turn Pavel Janik's point in
what I think might be a more constructive direction.

Aegis, BitKeeper and probably other configuration management
tools that use sccs or rcs basically share a common type of lower
layer. This lower layer converts a file-based revision control system
such as sccs to an "uber-cvs", as someone called it in a slashdot
discussion, that can:

1. process a transaction against a group of files atomically,
2. associate a comment with such a transaction rather than
with just one file,
3. represent symbolic links, file protections
4. represent file renames (and perhaps copies?)

You might want to keep in the back of your mind the
possibility of someday splitting off this lower level into a separate
software package that programs like your bitkeeper clone, aegis could
use in common. If the interface to this lower level took cvs
commands, then it could probably replace cvs, although the repository
would probably be incompatible since the meaning of things like
checking in multiple files together with a single comment would be
different, and there would be other kinds of changes to represent
beyond what cvs currently does. Using a repository format that is
compatible with another system (for example bitkeeper or aegis) would
make such a tool more useful, and if such a tool makes it easier for
people to migrate from a prorprietary system to a free one, that's
even better, so your starting with bitkeeper's format seems like an
excellent choice to me.

Thanks again for starting this project. I will at least
try to be a user of it.

Adam J. Richter __ ______________ 575 Oroville Road
ad...@yggdrasil.com \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Larry McVoy

unread,

Mar 1, 2003, 7:21:26 PM3/1/03

to Adam J. Richter, and...@suse.de, linux-...@vger.kernel.org, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

> Thanks again for starting this project. I will at least
> try to be a user of it.

Enjoy yourself.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

David Lang

unread,

Mar 1, 2003, 7:23:01 PM3/1/03

to Adam J. Richter, and...@suse.de, linux-...@vger.kernel.org, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

Adam, the openbkweb project didn't reverse engineer the BK network
protocol, it used the HTTP access that is provided on bkbits.net to
download the individual items and created a repository from that.

unfortunantly the bandwidth requirements to support that are high enough
that Larry indicated that if people keep doing that he would have to
shutdown the HTTP access.

bitbucket uses rsync as that is the most efficiant way to get a copy of
the repository without trying to talk the bitkeeper protocol. it is FAR
more efficiant and accruate then the openbkkweb interface

Davdi Lang

On Sat, 1 Mar 2003, Adam J.
Richter wrote:

> Date: Sat, 1 Mar 2003 16:11:55 -0800
> From: Adam J. Richter <ad...@yggdrasil.com>
> To: and...@suse.de, linux-...@vger.kernel.org, pa...@janik.cz,
> pa...@ucw.cz
> Cc: h...@infradead.org
> Subject: Re: BitBucket: GPL-ed KitBeeper clone

Arador

unread,

Mar 1, 2003, 7:50:34 PM3/1/03

to Adam J. Richter, and...@suse.de, linux-...@vger.kernel.org, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

On Sat, 1 Mar 2003 16:11:55 -0800
"Adam J. Richter" <ad...@yggdrasil.com> wrote:

(Just a very personal suggestion)
Why to waste time trying to clone a
tool such as bitkeeper? Why not to support things like subversion?

Diego Calleja

Jeff Garzik

unread,

Mar 1, 2003, 8:04:43 PM3/1/03

to Arador, Adam J. Richter, and...@suse.de, linux-...@vger.kernel.org, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

Arador wrote:
> On Sat, 1 Mar 2003 16:11:55 -0800
> "Adam J. Richter" <ad...@yggdrasil.com> wrote:
>
> (Just a very personal suggestion)
> Why to waste time trying to clone a
> tool such as bitkeeper? Why not to support things like subversion?

...because, clearly, Pavel is being paid by BitMover to dilute
programmer resources and user mindshare, thus slowing all open source
SCM efforts.

</sarcasm>

That's not Pavel's aim, obviously, but it's the net effect.

Jeff

Alan Cox

unread,

Mar 1, 2003, 8:13:25 PM3/1/03

to Arador, Adam J. Richter, and...@suse.de, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

On Sun, 2003-03-02 at 00:49, Arador wrote:
> On Sat, 1 Mar 2003 16:11:55 -0800
> "Adam J. Richter" <ad...@yggdrasil.com> wrote:
>
> (Just a very personal suggestion)
> Why to waste time trying to clone a
> tool such as bitkeeper? Why not to support things like subversion?

Because the repositories people need to read are in BK format, for better
or worse. It doesn't ultimately matter if you use it as an input filter
for CVS, subversion or no VCS at all.

Jeff Garzik

unread,

Mar 1, 2003, 8:20:59 PM3/1/03

to Alan Cox, Arador, Adam J. Richter, and...@suse.de, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

Alan Cox wrote:
> On Sun, 2003-03-02 at 00:49, Arador wrote:
>
>>On Sat, 1 Mar 2003 16:11:55 -0800
>>"Adam J. Richter" <ad...@yggdrasil.com> wrote:
>>
>>(Just a very personal suggestion)
>>Why to waste time trying to clone a
>>tool such as bitkeeper? Why not to support things like subversion?
>
>
> Because the repositories people need to read are in BK format, for better
> or worse. It doesn't ultimately matter if you use it as an input filter
> for CVS, subversion or no VCS at all.

"BK format"? Not really. Patches have been posted (to lkml, even) to
GNU CSSC which allow it to read SCCS files BK reads and writes.

Since that already exists, a full BitKeeper clone is IMO a bit silly,
because it draws users and programmers away from projects that could
potentially _replace_ BitKeeper.

Jeff

Olivier Galibert

unread,

Mar 1, 2003, 8:27:08 PM3/1/03

to linux-...@vger.kernel.org

On Sat, Mar 01, 2003 at 04:11:55PM -0800, Adam J. Richter wrote:
> Aegis, BitKeeper and probably other configuration management
> tools that use sccs or rcs basically share a common type of lower
> layer. This lower layer converts a file-based revision control system
> such as sccs to an "uber-cvs", as someone called it in a slashdot
> discussion, that can:
>
> 1. process a transaction against a group of files atomically,
> 2. associate a comment with such a transaction rather than
> with just one file,
> 3. represent symbolic links, file protections
> 4. represent file renames (and perhaps copies?)

5. Represent merges. That's what is making cvs branches unusable.

Frankly, if you want all of that you'd better design a repository
format that is actually adapted to it. The RCS format is not very
good, the SCCS weave is a little better but not by much (it reminds me
of Hurd, looks cool but slow by design). Larry did quite a feat
turning it into a distributed DAG of versions but I'm not convinced it
was that smart, technically. In particular, everthing suddendly looks
much nicer when you have one file per DAG node plus a cache zone for
full versions.

But anyway, what made[1] Bitkeeper suck less is the real DAG
structure. Neither arch nor subversion seem to have understood that
and, as a result, don't and won't provide the same level of semantics.
Zero hope for Linus to use them, ever. They're needed for any
decently distributed development process.

Hell, arch is still at the update-before-commit level. I'd have hoped
PRCS would have cured that particular sickness in SCM design ages ago.

Atomicity, symbolic links, file renames, splits (copy) and merges (the
different files suddendly ending up being the same one) are somewhat
important, but not the interesting part. A good distributed DAG
structure and a quality 3-point version "merge" is what you actually
need to build bk-level SCMs.

OG.

[1] 2.1.6-pre5, I don't know about current versions

Filip Van Raemdonck

unread,

Mar 1, 2003, 8:32:08 PM3/1/03

to linux-...@vger.kernel.org

On Sat, Mar 01, 2003 at 04:11:55PM -0800, Adam J. Richter wrote:

> Pavel Machek wrote:
> > I've created little project for read-only (for now ;-) kitbeeper
> > clone. It is available at www.sf.net/projects/bitbucket (no tar balls,
> > just get it fresh from CVS).
>
> Thank you for taking some initiative and improving this
> situation by constructive means. You are an example to us all,
> as is Andrea Arcangeli with his openbkweb project, which you
> will probably want to examine and perhaps integrate
> (ftp://ftp.kernel.org/pub/linux/kernel/people/andrea/openbkweb).

I've said this (indirectly) before, and I'll say it again:
BitBucket, and you, are missing the point here. Openbkweb isn't.
Before one can use bitbucket there still has to be a bkbits mirror first,
which incidentally may be true for the main linux kernel trees but isn't
for other projects developed with the help of bitkeeper.

I've also said this before, and I'll also repeat this again:
While politics & philosophy are my main reasons not to use bitkeeper, I
also am not bothered enough by other issues to use it plain and simple.
Nor to use openbkweb instead. And I'm not going to tell other people what
they should do.

However, until we have a tool (as openbkweb tries to be, although very
inefficiently) which can extract patches from the "main" openlogging
bitkeeper repositories, the schism remains between developers who use BK
and those who cannot use it - be it for political or real legal (i.e.
license violation, because of involvement in another SCM) reasons.

> bitkbucket currently uses rsync to update data from the
> repository.

(...)

> I think the suggestion made by Pavel Janik that it would
> be better to work on adding BitKeeper-like functionality to existing
> free software packages is a bit misdirected. BitKeeper uses SCCS
> format, and we have a GPL'ed SCCS clone ("cssc"), so you are
> adding functionality to existing free software version control
> code anyhow.

Not until you can use that functionality to access the main BK
repositories directly. When you're still accessing mirrors of it, as in
the rsync case, you are - pragmatically speaking - no better of than when
not accessing it at all.

Regards,

Filip

--
"To me it sounds like Cowpland just doesn't know what the hell he is talking
about. That's to be expected: he's CEO, isn't he?"
-- John Hasler

Andrea Arcangeli

unread,

Mar 1, 2003, 8:40:22 PM3/1/03

to Jeff Garzik, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

On Sat, Mar 01, 2003 at 08:19:52PM -0500, Jeff Garzik wrote:
> Alan Cox wrote:
> >On Sun, 2003-03-02 at 00:49, Arador wrote:
> >
> >>On Sat, 1 Mar 2003 16:11:55 -0800
> >>"Adam J. Richter" <ad...@yggdrasil.com> wrote:
> >>
> >>(Just a very personal suggestion)
> >>Why to waste time trying to clone a

> >>tool such as *notrademarkhere*? Why not to support things like subversion?

> >
> >
> >Because the repositories people need to read are in BK format, for better
> >or worse. It doesn't ultimately matter if you use it as an input filter
> >for CVS, subversion or no VCS at all.
>
> "BK format"? Not really. Patches have been posted (to lkml, even) to
> GNU CSSC which allow it to read SCCS files BK reads and writes.

you never tried what you're talking about. there's no way to make any
use of the SCCS tree from Rik's website with only the patched CSSC. The
whole point of bitbucket is to find a way to use CSSC on that tree. And
the longer Larry takes to export the whole data in an open format (CVS,
subversion or whatever), the more progress it will be accomplished in
getting the data out of the only service we have right now (Rik's
server). Sure, CSSC is a foundamental piece to extract the data out of
the single files, but CSSC alone is useless. CSSC only allows you to
work on a single file, you lose the whole view of the tree and in turn
it is completely unusable for doing anything useful like watching
changesets, or checking out a branch or whatever else useful thing. As
Pavel found _all_ the info we are interested about is in the
SCCS/s.ChangeSet file and that has nothing to do with CSSC or SCCS.

>
> Since that already exists, a full BitKeeper clone is IMO a bit silly,
> because it draws users and programmers away from projects that could
> potentially _replace_ BitKeeper.

Jeff, please uninstall *notrademarkhere* from your harddisk, install the
patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
harddisk (like I just did), and then send me via email the diff of the
last Changeset that Linus applied to his tree with author, date,
comments etc... If you can do that, you're completely right and at
least personally I will agree 100% with you, again: iff you can.

Andrea

Jeff Garzik

unread,

Mar 1, 2003, 8:46:57 PM3/1/03

to Andrea Arcangeli, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

Andrea Arcangeli wrote:
> Jeff, please uninstall *notrademarkhere* from your harddisk, install the
> patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
> harddisk (like I just did), and then send me via email the diff of the
> last Changeset that Linus applied to his tree with author, date,
> comments etc... If you can do that, you're completely right and at
> least personally I will agree 100% with you, again: iff you can.

You're missing the point:

A BK exporter is useful. A BK clone is not.

If Pavel is _not_ attempting to clone BK, then I retract my arguments. :)

Jeff

Andrea Arcangeli

unread,

Mar 1, 2003, 9:08:09 PM3/1/03

to Jeff Garzik, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

On Sat, Mar 01, 2003 at 08:45:08PM -0500, Jeff Garzik wrote:
> Andrea Arcangeli wrote:
> >Jeff, please uninstall *notrademarkhere* from your harddisk, install the
> >patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
> >harddisk (like I just did), and then send me via email the diff of the
> >last Changeset that Linus applied to his tree with author, date,
> >comments etc... If you can do that, you're completely right and at
> >least personally I will agree 100% with you, again: iff you can.
>
>
> You're missing the point:
>
> A BK exporter is useful. A BK clone is not.
>
> If Pavel is _not_ attempting to clone BK, then I retract my arguments. :)

hey, in your previous email you claimed all we need is the patched CSSC,
you change topic quick! Glad you agree CSSC alone is useless and to make
anything useful with Rik's *notrademarkhere* tree we need a true
*notrademarkhere* exporter (of course the exporter will be backed by
CSSC to extract the single file changes, since they're in SCCS format
and it would be pointless to reinvent the wheel).

Now you say the bitbucket project (you read Pavel's announcement, he
said "read only for now", that means exporter in my vocabulary) is
useful, to me that sounds the opposite of your previous claims, but
again: glad we agree on this too now.

Andrea

Adam J. Richter

unread,

Mar 1, 2003, 9:26:12 PM3/1/03

to die...@teleline.es, and...@suse.de, h...@infradead.org, linux-...@vger.kernel.org, pa...@janik.cz, pa...@ucw.cz

Arador <die...@teleline.es> wrote:
>Why to waste time trying to clone a

>tool such as bitkeeper? Why not to support things like subversion?

"Why" depends on one's priorities.

Some of us the linux-kernel crowd are interested in being able
to interface with the bitkeeper-using kernel developers a bit more
efficiently. The developers currently using bitkeeper started using
it when these other systems were already available, so I doubt that
improving another free version control system will do more for free
software adoption than providing BK compatability (I don't know if
that is a priority for you).

Note that Subversion, in particular, is GPL incompatible and
uses its own underlying repository format that isn't particularly
compatible with anything else and required a web server plus some
minor web server extension when last I checked. As I previously
mentioned, Bitkeeper is based on sccs for which a GPL-compatible clone
exists: cssc, and sccs format is used in a lot of other places as
well. So the result of cloning the "uber-cvs" in bitkeeper might
actually have more applicability than trying to extract the same layer
from subversion, even though Subversion is freer than BitKeeper.

Different people have different priorities or order them
differently. If you contribute to aegis, prcs, or even Subversion, I
think that's great. If you produce a separate GPL compatible
"uber-CVS" layer that way, I would be interested in hearing about it.

Adam J. Richter __ ______________ 575 Oroville Road
ad...@yggdrasil.com \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."

Christoph Hellwig

unread,

Mar 2, 2003, 4:34:38 AM3/2/03

to John Bradford, Christoph Hellwig, Pa...@janik.cz, pa...@ucw.cz, linux-...@vger.kernel.org

On Sun, Mar 02, 2003 at 09:30:00AM +0000, John Bradford wrote:
> > But anyway, don't you think Pavel is free to develop whatever free
> > software he likes to develop instead of following you political
> > agenda?
>
> What is the goal of the BitBucket project, though?
>
> To develop a version control system, or to annoy Larry?

Given the annoucement probably access the kernel BK tree in real time.

> Why doesn't somebody start working on a dedicated kernel version
> control system?

It's called BitKeeper :)

Jeff Garzik

unread,

Mar 2, 2003, 12:14:01 PM3/2/03

to H. Peter Anvin, linux-...@vger.kernel.org

H. Peter Anvin wrote:
> Followup to: <3E616224...@pobox.com>
> By author: Jeff Garzik <jga...@pobox.com>
> In newsgroup: linux.dev.kernel

>
>>You're missing the point:
>>
>>A BK exporter is useful. A BK clone is not.
>>
>
>

> I disagree. A BK clone would almost certainly be highly useful. The
> fact that it would happen to be compatible with one particular
> proprietary tool released by one particular company doesn't change
> that fact one iota; in fact, some people might find value in using the
> proprietary tool for whatever reason (snazzy GUI, keeping the suits
> happy, who knows...)

While people would certainly use it, I can't help but think that a BK
clone would damage other open source SCM efforts. I call this the
"SourceForge Syndrome":

Q. I found a problem/bug/annoyance, how do I solve it?
A. Clearly, a brand new sourceforge project is called for.

My counter-question is, why not improve an _existing_ open source SCM to
read and write BitKeeper files? Why do we need yet another brand new
project?

AFAICS, a BK clone would just further divide resources and mindshare. I
personally _want_ an open source SCM that is as good as, or better, than
BitKeeper. The open source world needs that, and BitKeeper needs the
competition. A BK clone may work with BitKeeper files, but I don't see
it ever being as good as BK, because it will always be playing catch-up.

Jeff

Jeff Garzik

unread,

Mar 2, 2003, 12:29:34 PM3/2/03

to Andrea Arcangeli, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

Andrea Arcangeli wrote:
> On Sat, Mar 01, 2003 at 08:45:08PM -0500, Jeff Garzik wrote:
>
>>Andrea Arcangeli wrote:
>>
>>>Jeff, please uninstall *notrademarkhere* from your harddisk, install the
>>>patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
>>>harddisk (like I just did), and then send me via email the diff of the
>>>last Changeset that Linus applied to his tree with author, date,
>>>comments etc... If you can do that, you're completely right and at
>>>least personally I will agree 100% with you, again: iff you can.
>>
>>
>>You're missing the point:
>>
>>A BK exporter is useful. A BK clone is not.
>>
>>If Pavel is _not_ attempting to clone BK, then I retract my arguments. :)
>
>
> hey, in your previous email you claimed all we need is the patched CSSC,
> you change topic quick! Glad you agree CSSC alone is useless and to make
> anything useful with Rik's *notrademarkhere* tree we need a true
> *notrademarkhere* exporter (of course the exporter will be backed by
> CSSC to extract the single file changes, since they're in SCCS format
> and it would be pointless to reinvent the wheel).

I have not changed the topic, you are still missing my point.

Let us get this small point out of the way: I agree that GNU CSSC
cannot read the BitKeeper ChangeSet file, which is a file critical for
getting the "weave" correct.

But that point is not relevant to my thread of discussion.

Let us continue in the below paragraph...

> Now you say the bitbucket project (you read Pavel's announcement, he
> said "read only for now", that means exporter in my vocabulary) is
> useful, to me that sounds the opposite of your previous claims, but
> again: glad we agree on this too now.

I disagree with your translation. Maybe this is the source of
misunderstand.

To me, a "BK clone, read only for now" is vastly different from a "BK
exporter". The "for now" clearly implies that it will eventually
attempt to be a full SCM.

Why do we need Yet Another Open Source SCM?
Why does Pavel not work on an existing open source SCM, to enable it to
read/write BitKeeper files?

These are the key questions which bother me.

Why do they bother me?

The open source world does not need yet another project that is "not
quite as good as BitKeeper." The open source world needs something that
can do all that BitKeeper does, and more :) A BK clone would be in a
perpetual state of "not quite as good as BitKeeper".

Jeff

Andrea Arcangeli

unread,

Mar 2, 2003, 1:15:36 PM3/2/03

to Jeff Garzik, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

On Sun, Mar 02, 2003 at 12:28:23PM -0500, Jeff Garzik wrote:
> Andrea Arcangeli wrote:
> >On Sat, Mar 01, 2003 at 08:45:08PM -0500, Jeff Garzik wrote:
> >
> >>Andrea Arcangeli wrote:
> >>
> >>>Jeff, please uninstall *notrademarkhere* from your harddisk, install the
> >>>patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
> >>>harddisk (like I just did), and then send me via email the diff of the
> >>>last Changeset that Linus applied to his tree with author, date,
> >>>comments etc... If you can do that, you're completely right and at
> >>>least personally I will agree 100% with you, again: iff you can.
> >>
> >>
> >>You're missing the point:
> >>
> >>A BK exporter is useful. A BK clone is not.
> >>
> >>If Pavel is _not_ attempting to clone BK, then I retract my arguments. :)
> >
> >
> >hey, in your previous email you claimed all we need is the patched CSSC,
> >you change topic quick! Glad you agree CSSC alone is useless and to make
> >anything useful with Rik's *notrademarkhere* tree we need a true
> >*notrademarkhere* exporter (of course the exporter will be backed by
> >CSSC to extract the single file changes, since they're in SCCS format
> >and it would be pointless to reinvent the wheel).
>
> I have not changed the topic, you are still missing my point.

your point is purerly theorical at this point in time. bitbucker is so
far from being an efficient exporter that arguing right now about
stopping at the exporter or going ahead to clone it completely is a
totally pointless discussion at this point in time.

Once it will be a fully functional exporter please raise your point
again, only then it will make sense to discuss your point.

I'm not even convinced it will become a full exporter if Larry finally
provides the kernel data via an open protocol stored in an open format
as he promised us some week ago, go figure how much I can care what it
will become after it has the readonly capability.

> Let us get this small point out of the way: I agree that GNU CSSC
> cannot read the BitKeeper ChangeSet file, which is a file critical for
> getting the "weave" correct.

This is not what I understood from your previous email:

"BK format"? Not really. Patches have been posted (to lkml, even) to
GNU CSSC which allow it to read SCCS files BK reads and writes.

Since that already exists, a full BitKeeper clone is IMO a bit silly,

now you're saying something completely different, you're saying, "yes the
CSSC obviously isn't enough and we _only_ _need_ the exporter but please
don't do more than the exporter or it will waste developement
resources". This is why you changed topic as far as I'm concerned, but
no problem, I'm glad we agree the exporter is useful now.

> To me, a "BK clone, read only for now" is vastly different from a "BK
> exporter". The "for now" clearly implies that it will eventually
> attempt to be a full SCM.

Why do you care that much now? I can't care less. Period. I need the
exporter and for me the exporter or the bk-clone-read-only is the same
thing, I don't mind if I've to run `bk` or `exportbk` or rsync or
whatever to get the data out.

If bitbucket will become much better than bitkeeper 100 years from now,
much better than a clone, is something I can't care less at this point
in time, and it may be the best or worst thing it will happen to the
whole SCM open source arena, you can't know, I can't know, nobody can
know at this point in time.

You agreed the exporter is useful, so we agree, I don't mind what will
happen after the useful thing is avaialble, it's the last of my worries,
and until we reach that point obviously there is no risk to reinvent the
wheel (unless the data become available in a open protocol first).

> Why do we need Yet Another Open Source SCM?
> Why does Pavel not work on an existing open source SCM, to enable it to
> read/write BitKeeper files?

bitbucket could be merged into any SCM at any time, it is _the
exporter_ that the other SCM needs to import from the *notrademarkhere*
trees.

> These are the key questions which bother me.
>
> Why do they bother me?
>
> The open source world does not need yet another project that is "not
> quite as good as BitKeeper." The open source world needs something that
> can do all that BitKeeper does, and more :) A BK clone would be in a
> perpetual state of "not quite as good as BitKeeper".

Disagree, if it will become more than an read-only thing, it will likely
become as good and most probably better than bitkeeper (maybe not
graphical but still usable) because it means it has the critical mass of
developement power _iff_ it can reach that point. But at this point in time
I doubt it will become more than an exporter, infact I even doubt it
will become a fully exporter if Larry avoids us to waste time. I
personally would have no interest in bitbucket if Linus would provide
the data in a open protocol for efficient downloads and in a open format
for backup-archive downloads as we discussed some week ago.

But again, what bitbucket will become after it will be a function
exporter (i.e. your "point") is enterely pointless to argue about right
now IMHO. But feel free to keep discussing it with others if you think
it matters right now (now that I made my point clear, I probably won't
feel the need to answer since my interest in that matter is so low).

Andrea

H. Peter Anvin

unread,

Mar 2, 2003, 1:40:48 PM3/2/03

to Jeff Garzik, linux-...@vger.kernel.org

Jeff Garzik wrote:
>
> My counter-question is, why not improve an _existing_ open source SCM to
> read and write BitKeeper files? Why do we need yet another brand new
> project?
>

I don't disagree with that. However, the question you posited was
"would one be useful", and I think the answer is unequivocally yes.
Furthermore, I don't agree with the "compatibility == bad" assumption I
read into your message.

> AFAICS, a BK clone would just further divide resources and mindshare. I
> personally _want_ an open source SCM that is as good as, or better, than
> BitKeeper. The open source world needs that, and BitKeeper needs the
> competition. A BK clone may work with BitKeeper files, but I don't see
> it ever being as good as BK, because it will always be playing catch-up.

Yes. Personally, I've spent quite a bit of time with OpenCM after a
suggestion from Ted T'so. It's looking quite promising to me, although
I haven't yet used it to maintain a large project.

-hpa

Jeff Garzik

unread,

Mar 2, 2003, 3:02:51 PM3/2/03

to H. Peter Anvin, linux-...@vger.kernel.org

H. Peter Anvin wrote:
> Jeff Garzik wrote:
>
>>
>> My counter-question is, why not improve an _existing_ open source SCM
>> to read and write BitKeeper files? Why do we need yet another brand
>> new project?
>>
>
> I don't disagree with that. However, the question you posited was
> "would one be useful", and I think the answer is unequivocally yes.

Ok, I'll grant that. :)

I think a BK clone is detrimental to the overall open source SCM world,
is my main point. I was thinking more along the lines of "useful to
'the cause'" ;-)

> Furthermore, I don't agree with the "compatibility == bad" assumption I
> read into your message.

Well, I disagree with that assumption too :) My main objection is that
a BK clone would divert attention from another effort (such as OpenCM),
with the end result that neither the BK clone nor OpenCM are as good (or
better) than BitKeeper.

>> AFAICS, a BK clone would just further divide resources and mindshare.
>> I personally _want_ an open source SCM that is as good as, or better,
>> than BitKeeper. The open source world needs that, and BitKeeper needs
>> the competition. A BK clone may work with BitKeeper files, but I
>> don't see it ever being as good as BK, because it will always be
>> playing catch-up.
>
>
> Yes. Personally, I've spent quite a bit of time with OpenCM after a
> suggestion from Ted T'so. It's looking quite promising to me, although
> I haven't yet used it to maintain a large project.

Interesting... Here's the link, in case others want to check it out:

http://www.opencm.org/

Jeff Garzik

unread,

Mar 2, 2003, 3:14:11 PM3/2/03

to Andrea Arcangeli, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz

Andrea Arcangeli wrote:
> your point is purerly theorical at this point in time. bitbucker is so
> far from being an efficient exporter that arguing right now about
> stopping at the exporter or going ahead to clone it completely is a
> totally pointless discussion at this point in time.
>
> Once it will be a fully functional exporter please raise your point
> again, only then it will make sense to discuss your point.

Ok, fair enough ;)

> I'm not even convinced it will become a full exporter if Larry finally
> provides the kernel data via an open protocol stored in an open format
> as he promised us some week ago, go figure how much I can care what it
> will become after it has the readonly capability.

I think this is a fair request.

IMO a good start would be to get BK to export its metadata for each
changeset in XML. Once that is accomplished, (a) nobody gives a damn
about BK file format, and (b) it is easy to set up an automated, public
distribution of XML changesets that can be imported into OpenCM, cvs, or
whatever.

>>Let us get this small point out of the way: I agree that GNU CSSC
>>cannot read the BitKeeper ChangeSet file, which is a file critical for
>>getting the "weave" correct.
>
>
> This is not what I understood from your previous email:
>
> "BK format"? Not really. Patches have been posted (to lkml, even) to
> GNU CSSC which allow it to read SCCS files BK reads and writes.
>
> Since that already exists, a full BitKeeper clone is IMO a bit silly,
>
> now you're saying something completely different, you're saying, "yes the
> CSSC obviously isn't enough and we _only_ _need_ the exporter but please
> don't do more than the exporter or it will waste developement
> resources". This is why you changed topic as far as I'm concerned, but
> no problem, I'm glad we agree the exporter is useful now.

I am sorry for the misunderstanding then. Let me quote from an email I
sent to you yesterday:

A BK exporter is useful.

So I think we do agree :)

>>To me, a "BK clone, read only for now" is vastly different from a "BK
>>exporter". The "for now" clearly implies that it will eventually
>>attempt to be a full SCM.
>
>
> Why do you care that much now? I can't care less. Period. I need the
> exporter and for me the exporter or the bk-clone-read-only is the same
> thing, I don't mind if I've to run `bk` or `exportbk` or rsync or
> whatever to get the data out.
>
> If bitbucket will become much better than bitkeeper 100 years from now,
> much better than a clone, is something I can't care less at this point
> in time, and it may be the best or worst thing it will happen to the
> whole SCM open source arena, you can't know, I can't know, nobody can
> know at this point in time.
>
> You agreed the exporter is useful, so we agree, I don't mind what will
> happen after the useful thing is avaialble, it's the last of my worries,
> and until we reach that point obviously there is no risk to reinvent the
> wheel (unless the data become available in a open protocol first).

Yes. As you see, I care about the future and not the present, in my
arguments: I believe that a BK clone may hurt the overall [future]
effort of creating a good quality open source SCM. So, in my mind I
separate the two topics of "BK exporter" and "future BK clone."

To get back to the topic of "BK exporter", I think it is more productive
to get Larry to export in an open file format. I will work with him
this week to do that. Reading the BK format itself may be interesting
to some, but I would rather have BitMover do the work and export in an
open file format ;-) Reading BK format directly is "chasing a moving
target" in my opinion.

Jeff

Geert Uytterhoeven

unread,

Mar 2, 2003, 4:53:00 PM3/2/03

to Jeff Garzik, Andrea Arcangeli, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz

On Sun, 2 Mar 2003, Jeff Garzik wrote:
> Andrea Arcangeli wrote:

> > I'm not even convinced it will become a full exporter if Larry finally
> > provides the kernel data via an open protocol stored in an open format
> > as he promised us some week ago, go figure how much I can care what it
> > will become after it has the readonly capability.
>
> I think this is a fair request.
>
> IMO a good start would be to get BK to export its metadata for each
> changeset in XML. Once that is accomplished, (a) nobody gives a damn
> about BK file format, and (b) it is easy to set up an automated, public
> distribution of XML changesets that can be imported into OpenCM, cvs, or
> whatever.

Read: an XML scheme with a public, open specification?

Ask Microsoft how to `encrypt' documents using an `open' standard like XML...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Pavel Machek

unread,

Mar 2, 2003, 7:04:50 PM3/2/03

to Pavel Janík, linux-...@vger.kernel.org

Hi!

> > I've created little project for read-only (for now ;-) kitbeeper
> > clone. It is available at www.sf.net/projects/bitbucket (no tar balls,
> > just get it fresh from CVS).
>

> I think that it is waste of your valuable time to create clones of
> proprietary software like KitBeeper (I do not want to infringe on Larry's
> trademark) is. It is not worth the effort and in fact your work supports
> using that proprietary tool. I suggest to completely remove the
> project.

Actually bk's on-disk format is quite reasonable, and there's a *lot*
of example data in that format, so it might be easier to develop
free version control system this way.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

nickn

unread,

Mar 2, 2003, 7:48:43 PM3/2/03

to Jeff Garzik, H. Peter Anvin, linux-...@vger.kernel.org

On Sun, Mar 02, 2003 at 12:12:58PM -0500, Jeff Garzik wrote:
> My counter-question is, why not improve an _existing_ open source SCM to
> read and write BitKeeper files? Why do we need yet another brand new
> project?

Or improve BK to export and import on demand of an existing open source SCM.

David Lang

unread,

Mar 2, 2003, 7:58:05 PM3/2/03

to nickn, Jeff Garzik, H. Peter Anvin, linux-...@vger.kernel.org

I'm a little confused about the on-disk format

is it SCCS and the problem is that CSSC doesn't recognise everything that
the latest SCCS does so a patch is needed for CSSC or does it differ
slightly from SCCS?

Larry has mentioned that there were things they changed from the base SCCS
format that they started with, but he indicated that they had fed patches
to SCCS to use the new info.

I'm trying to figure out if the problem is CSSC not being as compatible as
it would like to be or is larry not getting the changes he is proposing
into SCCS, or are there other problems.

David Lang

On Mon, 3 Mar 2003, nickn wrote:

> Date: Mon, 3 Mar 2003 00:47:28 +0000
> From: nickn <ni...@www0.org>
> To: Jeff Garzik <jga...@pobox.com>
> Cc: H. Peter Anvin <h...@zytor.com>, linux-...@vger.kernel.org
> Subject: Re: BitBucket: GPL-ed *notrademarkhere* clone

Jeff Garzik

unread,

Mar 2, 2003, 9:32:29 PM3/2/03

to David Lang, nickn, H. Peter Anvin, linux-...@vger.kernel.org

David Lang wrote:
> I'm a little confused about the on-disk format
>
> is it SCCS and the problem is that CSSC doesn't recognise everything that
> the latest SCCS does so a patch is needed for CSSC or does it differ
> slightly from SCCS?

CSSC can read the sfiles with the patch posted to lkml, but it cannot
read the BitKeeper-specific files such as the all-important ChangeSet
file. ChangeSet is required to build the DAG that weaves all the sfiles
together into the proper order.

Jeff

Jeff Garzik

unread,

Mar 2, 2003, 9:34:01 PM3/2/03

to nickn, H. Peter Anvin, linux-...@vger.kernel.org

nickn wrote:
> On Sun, Mar 02, 2003 at 12:12:58PM -0500, Jeff Garzik wrote:
>
>>My counter-question is, why not improve an _existing_ open source SCM to
>>read and write BitKeeper files? Why do we need yet another brand new
>>project?
>
>
> Or improve BK to export and import on demand of an existing open source SCM.

That may be possible with OpenCM, but it's a bit of a stretch for the
other existing SCMs. Regardless, if BK can export metadata to an open
format (such as a defined XML spec), then the SCM interchange
possibilities are only limited by a programmer's time and imagination.

Jeff

Bernd Eckenfels

unread,

Mar 3, 2003, 3:30:26 AM3/3/03

to linux-...@vger.kernel.org

In article <3E630CD9...@aitel.hist.no> you wrote:
> I wouldn't worry. Linus can develop any SCM software he want to

Linus is not the only kernel hacker, and there are quite a few who work on
SCMs.

Gruss
Bernd
--
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/

John Bradford

unread,

Mar 3, 2003, 6:58:48 AM3/3/03

to Christoph Hellwig, Pa...@janik.cz, pa...@ucw.cz, linux-...@vger.kernel.org

> But anyway, don't you think Pavel is free to develop whatever free
> software he likes to develop instead of following you political
> agenda?

What is the goal of the BitBucket project, though?

To develop a version control system, or to annoy Larry?

I haven't seen a single post to the list saying, "If we were designing
a version control system dedicated to the Linux kernel, what would you
like to see in it?". Before I started work on my bug database, I
spent a week or so discussing it on the list with people.

Why doesn't somebody start working on a dedicated kernel version
control system?

John.

Pavel Machek

unread,

Mar 3, 2003, 7:40:10 AM3/3/03

to John Bradford, Christoph Hellwig, Pa...@janik.cz, pa...@ucw.cz, linux-...@vger.kernel.org

Hi!

> > But anyway, don't you think Pavel is free to develop whatever free
> > software he likes to develop instead of following you political
> > agenda?
>
> What is the goal of the BitBucket project, though?
>
> To develop a version control system, or to annoy Larry?

To be able to access kernel version history without touching
bk. Annoying Larry is just a side effect, altrough I agree selection
of project name was "interesting" ;-).

> I haven't seen a single post to the list saying, "If we were designing
> a version control system dedicated to the Linux kernel, what would you
> like to see in it?". Before I started work on my bug database, I
> spent a week or so discussing it on the list with people.

Apparently Linus, DaveM and Larry did just this, 2 years ago and
offline. bk is result of that discussion.

> Why doesn't somebody start working on a dedicated kernel version
> control system?

That's way harder than accessing kernel version history. I do not have
time for *that*.
Pavel
--
Horseback riding is like software...
...vgf orggre jura vgf serr.

Alan Cox

unread,

Mar 3, 2003, 8:11:00 AM3/3/03

to John Bradford, Christoph Hellwig, Pa...@janik.cz, pa...@ucw.cz, Linux Kernel Mailing List

On Sun, 2003-03-02 at 09:30, John Bradford wrote:
> I haven't seen a single post to the list saying, "If we were designing
> a version control system dedicated to the Linux kernel, what would you
> like to see in it?". Before I started work on my bug database, I
> spent a week or so discussing it on the list with people.

Larry spent a lot of time talking to people directly about such things.

John Bradford

unread,

Mar 3, 2003, 8:59:24 AM3/3/03

to Alan Cox, h...@infradead.org, pa...@janik.cz, pa...@ucw.cz, linux-...@vger.kernel.org

> > I haven't seen a single post to the list saying, "If we were designing
> > a version control system dedicated to the Linux kernel, what would you
> > like to see in it?". Before I started work on my bug database, I
> > spent a week or so discussing it on the list with people.
>
> Larry spent a lot of time talking to people directly about such things.

I meant in relation to Bit Bucket.

If the need for Bit Keeper to make a profit for Bit Mover excludes
Linux developers from using it, the most logical thing to do in my
opinion is to start from scratch and write a version control system
dedicated to furthering Linux kernel development.

Compatibility with Bit Keeper should not be a goal of that project.

John.

Charles Cazabon

unread,

Mar 3, 2003, 9:20:51 AM3/3/03

to linux-...@vger.kernel.org

John Bradford <jo...@grabjohn.com> wrote:
> >
> > Larry spent a lot of time talking to people directly about such things.
>
> I meant in relation to Bit Bucket.
>
> If the need for Bit Keeper to make a profit for Bit Mover excludes
> Linux developers from using it, the most logical thing to do in my
> opinion is to start from scratch and write a version control system
> dedicated to furthering Linux kernel development.
>
> Compatibility with Bit Keeper should not be a goal of that project.

Larry did a pretty good job of determining what the software needed to do to
be useful for the linux-kernel project. Any possible replacement that doesn't
do at least what bitkeeper does is likely dead before it starts.

(nevermind that no replacement is necessary, due to Larry's gratis provision
of the tool to kernel developers...)

Charles
--
-----------------------------------------------------------------------
Charles Cazabon <li...@discworld.dyndns.org>
GPL'ed software available at: http://www.qcc.ca/~charlesc/software/
-----------------------------------------------------------------------

Richard B. Johnson

unread,

Mar 3, 2003, 9:21:55 AM3/3/03

to John Bradford, Alan Cox, h...@infradead.org, pa...@janik.cz, pa...@ucw.cz, linux-...@vger.kernel.org

On Mon, 3 Mar 2003, John Bradford wrote:

> > > I haven't seen a single post to the list saying, "If we were designing
> > > a version control system dedicated to the Linux kernel, what would you
> > > like to see in it?". Before I started work on my bug database, I
> > > spent a week or so discussing it on the list with people.
> >
> > Larry spent a lot of time talking to people directly about such things.
>
> I meant in relation to Bit Bucket.
>
> If the need for Bit Keeper to make a profit for Bit Mover excludes
> Linux developers from using it, the most logical thing to do in my
> opinion is to start from scratch and write a version control system
> dedicated to furthering Linux kernel development.
>
> Compatibility with Bit Keeper should not be a goal of that project.

^^^^^^^^^^^^^^^^^^^^
>
> John.

Hmmm. Compatibility with existing things is always one of the
considerations of any new product (or project). If compatibility
can be achieved without a significant trade-off in performance,
then it should become one of the goals.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.

John Bradford

unread,

Mar 3, 2003, 10:10:33 AM3/3/03

to ro...@chaos.analogic.com, al...@lxorguk.ukuu.org.uk, h...@infradead.org, pa...@janik.cz, pa...@ucw.cz, linux-...@vger.kernel.org

> > > > I haven't seen a single post to the list saying, "If we were designing
> > > > a version control system dedicated to the Linux kernel, what would you
> > > > like to see in it?". Before I started work on my bug database, I
> > > > spent a week or so discussing it on the list with people.
> > >
> > > Larry spent a lot of time talking to people directly about such things.
> >
> > I meant in relation to Bit Bucket.
> >
> > If the need for Bit Keeper to make a profit for Bit Mover excludes
> > Linux developers from using it, the most logical thing to do in my
> > opinion is to start from scratch and write a version control system
> > dedicated to furthering Linux kernel development.
> >
> > Compatibility with Bit Keeper should not be a goal of that project.
> ^^^^^^^^^^^^^^^^^^^^

> Hmmm. Compatibility with existing things is always one of the
> considerations of any new product (or project).

I did consider it, I just don't think it's a good idea.

> If compatibility can be achieved without a significant trade-off in
> performance, then it should become one of the goals.

Agreed, but in a project to develop the best possible version control
system for the Linux kernel [1], I don't see how compatibility with
any other version control system is of any interest at all.

It's only if you want to use several version control systems in
parallel that it might be an issue, but if a single, free, version
control system is in use, I don't see why that would be necessary.

Getting the data out of Bit Keeper and in to a different version
control system is a one-time job, and as far as I am aware there is
no difficulty getting data out of a Bit Keeper repository, (using Bit
Keeper itself), in a suitable format.

[1] As distinct from projects to develop the best possible version
control system in general or to develop a GPLed project to complete
with Bit Keeper.

John.

Edward S. Marshall

unread,

Mar 3, 2003, 10:27:48 AM3/3/03

to Adam J. Richter, die...@teleline.es, and...@suse.de, h...@infradead.org, linux-...@vger.kernel.org, pa...@janik.cz, pa...@ucw.cz

On Sat, Mar 01, 2003 at 06:23:21PM -0800, Adam J. Richter wrote:
> Note that Subversion, in particular, is GPL incompatible and

http://subversion.tigris.org/project_license.html

I don't see anything particularly GPL-incompatible in there; looks pretty
much like a BSD-style license to me. Something that precludes SVN's use
by GPL'd projects, or precludes integration with GPL'd projects, is
something I'm sure CollabNet and the developers on the mailing list would
love to know about (along with all the Apache folks, since it's really
their license), considering that there's already at least on GPL'd
front-end for Subversion (gsvn), and plenty of GPL projects being hosted
in Subversion repositories.

(Not meant as a flame, please don't take it as such. I'd really like to
know where the Apache/Subversion license is "GPL-incompatible".)

> uses its own underlying repository format that isn't particularly
> compatible with anything else

Lacking an on-disk format that's actually useful for storing more
information than files and diffs, they invented one. I don't blame them.
The fun part, of course, is that svn is architected such that bolting up
to another repository storage system (say, an RDBMS, or even, horrors, a
bitkeeper-compatible SCCS derivative) is really just a matter of writing
the code (with a few caveats, obviously, but that's the basic idea).

"svnadmin dump" will provide a dumpfile of the repository, which could
be translated into another format, if that were desirable. Again, just a
simple matter of coding. ;-)

> and required a web server plus some
> minor web server extension when last I checked.

Not everyone is aware of this, but there's a new access method for svn
repositories that works with SSH, or as a standalone pserver-like scheme,
called "ra_svn". Translation: you no longer need Apache 2.0 and mod_dav
to access a Subversion repository; you just don't get some of the cool
features that using Apache gives you (such as all the access controls,
the availability of the repository via DAV and through a normal web
browser, etc).

This came about only a few milestones back, so it's not surprising that
everyone hasn't seen it yet. :-)

--
Edward S. Marshall <e...@logic.net>
http://esm.logic.net/

Felix qui potuit rerum cognoscere causas.

Adam J. Richter

unread,

Mar 3, 2003, 12:04:12 PM3/3/03

to e...@logic.net, linux-...@vger.kernel.org

[I've dropped all of the direct email addresses from the cc list.
I think they'd probably rather just see one copy of this tangential
discussion on the linux-kernel list, if at all. -Adam]

On Mon, 3 Mar 2003, Edward S. Marshall wrote:
>On Sat, Mar 01, 2003 at 06:23:21PM -0800, Adam J. Richter wrote:
>> Note that Subversion, in particular, is GPL incompatible and

>http://subversion.tigris.org/project_license.html

>I don't see anything particularly GPL-incompatible in there;

I think the main incompatibility is part 3 (the advertising
requirement). Regarding the Apache 1.1 license, which is used by
Subversion, http://www.fsf.org/licenses/license-list.html#OriginalBSD
says, "This is a permissive non-copyleft free software license with a
few requirements that render it incompatible with the GNU GPL. We urge
you not to use the Apache licenses for software you write. However,
there is no reason to avoid running programs that have been released
under this license, such as Apache."

>looks pretty much like a BSD-style license to me.

FSF has said that they believe that only the "new style" BSD
licenses are GPL compatible. They consider the "old style" (with the
advertising requirement) to be GPL incompatible.

>Something that precludes SVN's use
>by GPL'd projects, or precludes integration with GPL'd projects, is
>something I'm sure CollabNet and the developers on the mailing list would
>love to know about

That is not what I mean by "GPL incompatible."

http://www.fsf.org/licenses/gpl-faq.html#WhatDoesCompatMean
gives the foundation's definition of "compatible with the GPL":

| What does it mean to say a license is "compatible with the GPL".
|
| It means that the other license and the GNU GPL are
| compatible; you can combine code released under the other license with
| code released under the GNU GPL in one larger program.
|
| The GPL permits such a combination provided it is released
| under the GNU GPL. The other license is compatible with the GPL if it
| permits this too.

When I say "GPL compatible", my meaning is similar, perhaps
identical. By "GPL compatible", I mean that if the contents of a work
are comingled with GPL'ed content, even within the same file, that
obeying the terms of the GPL with regard to the whole resultant work
results in one having at least the permissions of the GPL with regard
to the whole resultant work (this, by the way, is what I believe
"licensed as a whole" under the GPL means).

I mentioned the GPL incompatibility issue to Brian Behlendorf
after a he gave at a trade show once, and he said, approximately, "it
is the GPL's fault, you know." This discussion took place after the
University of California at Berkeley had switched to the new BSD terms
for GPL compatibility. At that point, I believed that investing
further time in arguing would probably not produce useful results.

>Lacking an on-disk format that's actually useful for storing more
>information than files and diffs, they invented one.

There already were free systems that deal with directory
information and change sets on groups of files rather than single
files. The Subversion team chose to use a new format that I've heard
many people, even people who seem not to care about the GPL
incompatibility, says is complex enough to warrant not using svn.
Even when one of the Subversion people gave a talk at the Silicon
Valley Linux User Group, he acknowledged that many people on some
mailing list discussing Subversion design issues, including Larry
McVoy, considered their decision to be a big mistake.

>[...] svn is architected such that bolting up

>to another repository storage system (say, an RDBMS, or even, horrors, a
>bitkeeper-compatible SCCS derivative) is really just a matter of writing
>the code (with a few caveats, obviously, but that's the basic idea).

That is arguably true of most source code control system that
try to deal with multi-file commits. For example, Aegis already lets
you choose between rcs or sccs.

All that said, I do not consider contribution to Subvresion to be
something that has a negative sum effect. If you are contributing to
Subversion, I'm not saying that that is a bad thing.

Adam J. Richter __ ______________ 575 Oroville Road
ad...@yggdrasil.com \ / Milpitas, California 95035
+1 408 309-6081 | g g d r a s i l United States of America
"Free Software For The Rest Of Us."

Larry McVoy

unread,

Mar 3, 2003, 1:38:55 PM3/3/03

to Andrea Arcangeli, Jeff Garzik, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

How close is http://www.bitmover.com/EXPORT to what you want (3MB file).

Note that this is very coarse granularity, it's very 2.5.62 up to 2.5.63,
in practice the granularity would be as least as fine as each of Linus'
pushes and finer if possible. We can't capture all the branching structure
in patches, there is too much parallelism, but what we can do is capture
each push that Linus does and if he did more than one merge in that push,
we can break it up into each merge.

We can also provide this as a BK url on bkbits for any cset or range of
csets (we'll have to get another T1 line but I don't see way around that).

This should give enough information that anyone could build their own
BK 2 SVN gateway (or whatever, we're doing the CVS one).

Also, here's what Linus' recent pushes look like WITHOUT breaking it into
each merge, we're still working on that code:

57 csets on 2003/03/03 08:49:44
5 csets on 2003/03/02 21:30:31
28 csets on 2003/03/02 21:04:02
1 csets on 2003/03/02 10:19:24
49 csets on 2003/03/01 19:03:58
2 csets on 2003/03/01 11:04:04
5 csets on 2003/03/01 09:19:24
1 csets on 2003/02/28 19:34:30
37 csets on 2003/02/28 15:30:29
8 csets on 2003/02/28 15:18:12
23 csets on 2003/02/28 15:05:08
31 csets on 2003/02/27 23:30:05
16 csets on 2003/02/27 09:15:07
11 csets on 2003/02/27 07:45:06
47 csets on 2003/02/26 23:09:53
32 csets on 2003/02/25 21:35:34
24 csets on 2003/02/25 18:34:41
22 csets on 2003/02/25 15:49:41
14 csets on 2003/02/24 21:23:34
3 csets on 2003/02/24 15:19:44
1 csets on 2003/02/24 11:16:14
15 csets on 2003/02/24 11:00:36
4 csets on 2003/02/24 10:48:49
1 csets on 2003/02/24 10:03:36
15 csets on 2003/02/24 09:49:34
1 csets on 2003/02/23 20:33:00
3 csets on 2003/02/23 11:15:28
8 csets on 2003/02/23 11:01:10
6 csets on 2003/02/23 10:49:14
2 csets on 2003/02/22 19:32:35
4 csets on 2003/02/22 16:17:27
1 csets on 2003/02/22 12:45:28
76 csets on 2003/02/22 12:34:13
1 csets on 2003/02/21 20:18:19
6 csets on 2003/02/21 19:49:32
86 csets on 2003/02/21 18:03:23
3 csets on 2003/02/21 16:18:24
30 csets on 2003/02/21 14:14:48
1 csets on 2003/02/21 10:18:19
1 csets on 2003/02/21 09:49:15
etc.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

Larry McVoy

unread,

Mar 3, 2003, 1:47:43 PM3/3/03

to Andrea Arcangeli, Jeff Garzik, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

On Mon, Mar 03, 2003 at 10:37:34AM -0800, Larry McVoy wrote:
> How close is http://www.bitmover.com/EXPORT to what you want (3MB file).
>
> Note that this is very coarse granularity, it's very 2.5.62 up to 2.5.63,

This was too big, I replaced it with the diffs + comments for the last push
Linus did. Even this is pretty big, he pulled 57 csets from DaveM if I
understand things properly.

Pavel Machek

unread,

Mar 3, 2003, 2:39:26 PM3/3/03

to John Bradford, ro...@chaos.analogic.com, al...@lxorguk.ukuu.org.uk, h...@infradead.org, pa...@janik.cz, pa...@ucw.cz, linux-...@vger.kernel.org

Hi!

> It's only if you want to use several version control systems in
> parallel that it might be an issue, but if a single, free, version
> control system is in use, I don't see why that would be necessary.

But you *will* be using bitkeeper along with something else, because
linus is not going to switch unless that something else is well
tested, creating chicken-egg problem.

--
Horseback riding is like software...
...vgf orggre jura vgf serr.

John Bradford

unread,

Mar 3, 2003, 2:51:22 PM3/3/03

to Pavel Machek, ro...@chaos.analogic.com, al...@lxorguk.ukuu.org.uk, h...@infradead.org, pa...@janik.cz, pa...@ucw.cz, linux-...@vger.kernel.org

> > It's only if you want to use several version control systems in
> > parallel that it might be an issue, but if a single, free, version
> > control system is in use, I don't see why that would be necessary.
>
> But you *will* be using bitkeeper along with something else, because
> linus is not going to switch unless that something else is well
> tested, creating chicken-egg problem.

Well, you could try convincing Alan to use it for the -ac trees, or
the 2.2 kernel to get it tested with a real dataset.

John.

Joel Becker

unread,

Mar 3, 2003, 4:55:20 PM3/3/03

to Jeff Garzik, H. Peter Anvin, linux-...@vger.kernel.org

On Sun, Mar 02, 2003 at 12:12:58PM -0500, Jeff Garzik wrote:
> My counter-question is, why not improve an _existing_ open source SCM to
> read and write BitKeeper files? Why do we need yet another brand new
> project?

Normally, I'd agree with you Jeff. However, none of the current
open source SCM systems are architected in a way that can operate like
BK.
I've been using subversion for a while now. It pretty much
fixes all the problems that CVS had, AS LONG AS you accept the CVS style
of version control. That style doesn't work for non-central work like
the kernel.
The one thing BK does that makes it worthwhile is the three-way
merge. This (and the resulting DAG) make handling code from Alan, from
Linus, from Andrew, and from everyone else possible. With CVS,
subversion, or any other SCM I've worked with, you have to hand merge
anything past the first patch. Ugh.
This requires architecture, and (AFAIK) BitBucket is the first
try at it. Compatibility with the proprietary tool that does it already
is a good thing.

Joel

--

"Can any of you seriously say the Bill of Rights could get through
Congress today? It wouldn't even get out of committee."
- F. Lee Bailey

Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: joel....@oracle.com
Phone: (650) 506-8127

Pavel Machek

unread,

Mar 3, 2003, 5:18:42 PM3/3/03

to Jeff Garzik, Alan Cox, Arador, Adam J. Richter, and...@suse.de, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

Hi!

> >>(Just a very personal suggestion)
> >>Why to waste time trying to clone a
> >>tool such as bitkeeper? Why not to support things like subversion?
> >
> >
> >Because the repositories people need to read are in BK format, for better
> >or worse. It doesn't ultimately matter if you use it as an input filter
> >for CVS, subversion or no VCS at all.

>
> "BK format"? Not really. Patches have been posted (to lkml, even) to
> GNU CSSC which allow it to read SCCS files BK reads and writes.
>
> Since that already exists, a full BitKeeper clone is IMO a bit silly,

> because it draws users and programmers away from projects that could
> potentially _replace_ BitKeeper.

Read-only access to the bk repositories is the first goal. Then, I'll
either add write support (unlikely) or feed it into some existing
version control system to work with that. I'm still not sure what's
the best.

[bk's on-disk format is quite reasonable; it might be okay to reuse
that.]

Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

Pavel Machek

unread,

Mar 3, 2003, 5:19:14 PM3/3/03

to Jeff Garzik, Andrea Arcangeli, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

Hi!

> >Jeff, please uninstall *notrademarkhere* from your harddisk, install the
> >patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
> >harddisk (like I just did), and then send me via email the diff of the
> >last Changeset that Linus applied to his tree with author, date,
> >comments etc... If you can do that, you're completely right and at
> >least personally I will agree 100% with you, again: iff you can.
>
>
> You're missing the point:
>
> A BK exporter is useful. A BK clone is not.

I meant exporter.

David Lang

unread,

Mar 3, 2003, 5:48:13 PM3/3/03

to John Bradford, ro...@chaos.analogic.com, al...@lxorguk.ukuu.org.uk, h...@infradead.org, pa...@janik.cz, pa...@ucw.cz, linux-...@vger.kernel.org

the big reason for haveing it be compatable with existing systems is so
that if people move to it and decide they don't like it they can move away
from it again.

you may like to think that your program is so good that people will never
want to move away, but if it isn't an option then people will be very slow
to adopt it as the risk to their source history just went up a few
notches.

David Lang

On Mon, 3 Mar 2003, John Bradford wrote:

> Date: Mon, 3 Mar 2003 15:08:15 +0000 (GMT)
> From: John Bradford <jo...@grabjohn.com>
> To: ro...@chaos.analogic.com
> Cc: al...@lxorguk.ukuu.org.uk, h...@infradead.org, pa...@janik.cz, pa...@ucw.cz,
> linux-...@vger.kernel.org
> Subject: Re: BitBucket: GPL-ed KitBeeper clone

Andrea Arcangeli

unread,

Mar 3, 2003, 5:56:39 PM3/3/03

to Larry McVoy, Jeff Garzik, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz

On Mon, Mar 03, 2003 at 10:37:34AM -0800, Larry McVoy wrote:

> How close is http://www.bitmover.com/EXPORT to what you want (3MB file).
>
> Note that this is very coarse granularity, it's very 2.5.62 up to 2.5.63,

I'm probably missing something obvious but it's not clear to me how to
extract the changeset info from this format.

Let's assume I want to extract this changeset:

hang...@1.1021, 2003-02-24 10:49:30-08:00, randy....@verizon.net
[PATCH] convert /proc/io{mem,ports} to seq_file

This converts /proc/io{mem,ports} to the seq_file interface
(single_open).

How can I?

I mean, the above format is fine, as far as we have a file like that per
changeset (or alternatively per Linus's merge, even if not for every
single changeset, when he does the pulls). Clearly a file of that format
for a 2.5.62->63 diff is not finegrined enough.

Correct me if I'm wrong but if I understand well the changeset numbers
aren't fixed in the bitkeeper tree, a changeset number can change while
the merging happens across different cloned trees. So in short, the
changeset numbers are useless to the outside (but still providing them
won't hurt as far as nobody rely on them).

> in practice the granularity would be as least as fine as each of Linus'
> pushes and finer if possible. We can't capture all the branching structure
> in patches, there is too much parallelism, but what we can do is capture
> each push that Linus does and if he did more than one merge in that push,
> we can break it up into each merge.
>
> We can also provide this as a BK url on bkbits for any cset or range of
> csets (we'll have to get another T1 line but I don't see way around that).

If that hurts, you could simply upload them to kernel.org. Even if it's
not a file, can't you simply checkin into a remote cvs on kernel.org or
osdl.org or sourceforge, or whatever else place, so you won't need to
pay for it. It's up to you of course, but I'm sure you're not forced to
pay for this service (besides for the once-a-time setup of the exports,
that I hope won't generate any maintainance overhead to you).

> This should give enough information that anyone could build their own
> BK 2 SVN gateway (or whatever, we're doing the CVS one).

Yes, as far as this file-format is per-merge I think this is all we
need. This way it will be usable to checkout, browse and regenerate the
tree, unlike the cset directory currently in kernel.org.

Just curious, this also means that at least around the 80% of merges
in Linus's tree is submitted via a bitkeeper pull, right?

Andrea

Pavel Machek

unread,

Mar 3, 2003, 6:15:27 PM3/3/03

to Andrea Arcangeli, Larry McVoy, Jeff Garzik, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz

Hi!

> > How close is http://www.bitmover.com/EXPORT to what you want (3MB file).
> >
> > Note that this is very coarse granularity, it's very 2.5.62 up to 2.5.63,
>
> I'm probably missing something obvious but it's not clear to me how to
> extract the changeset info from this format.

Is that format parsable at all? It looks like strange changeset
comments could confuse parsers...

> Let's assume I want to extract this changeset:
>
> hang...@1.1021, 2003-02-24 10:49:30-08:00, randy....@verizon.net
> [PATCH] convert /proc/io{mem,ports} to seq_file
>
> This converts /proc/io{mem,ports} to the seq_file interface
> (single_open).
>
> How can I?
>
> I mean, the above format is fine, as far as we have a file like that per
> changeset (or alternatively per Linus's merge, even if not for every
> single changeset, when he does the pulls). Clearly a file of that format
> for a 2.5.62->63 diff is not finegrined enough.

Ben's bitsubversion script is somewhat slow, but should be capable of
pulling any diff you want...
Pavel

--
Horseback riding is like software...
...vgf orggre jura vgf serr.

David Lang

unread,

Mar 3, 2003, 6:58:35 PM3/3/03

to Andrea Arcangeli, Larry McVoy, Jeff Garzik, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz

On Mon, 3 Mar 2003, Andrea Arcangeli wrote:

> Just curious, this also means that at least around the 80% of merges
> in Linus's tree is submitted via a bitkeeper pull, right?
>
> Andrea

remember how Linus works, all normal patches get copied into a single
large patch file as he reads his mail then he runs patch to apply them to
the tree. I think this would make the entire batch of messages look like
one cset.

David Lang

Jeff Garzik

unread,

Mar 3, 2003, 7:03:34 PM3/3/03

to David Lang, Andrea Arcangeli, Larry McVoy, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz

David Lang wrote:
> On Mon, 3 Mar 2003, Andrea Arcangeli wrote:
>
>
>>Just curious, this also means that at least around the 80% of merges
>>in Linus's tree is submitted via a bitkeeper pull, right?
>>
>>Andrea
>
>
> remember how Linus works, all normal patches get copied into a single
> large patch file as he reads his mail then he runs patch to apply them to
> the tree. I think this would make the entire batch of messages look like
> one cset.

Not correct. His commits properly separate the patches out into
individual csets.

Jeff

Larry McVoy

unread,

Mar 3, 2003, 7:06:08 PM3/3/03

to Jeff Garzik, David Lang, Andrea Arcangeli, Larry McVoy, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz

On Mon, Mar 03, 2003 at 07:02:28PM -0500, Jeff Garzik wrote:
> David Lang wrote:
> >On Mon, 3 Mar 2003, Andrea Arcangeli wrote:
> >
> >
> >>Just curious, this also means that at least around the 80% of merges
> >>in Linus's tree is submitted via a bitkeeper pull, right?
> >>
> >>Andrea
> >
> >
> >remember how Linus works, all normal patches get copied into a single
> >large patch file as he reads his mail then he runs patch to apply them to
> >the tree. I think this would make the entire batch of messages look like
> >one cset.
>
>
> Not correct. His commits properly separate the patches out into
> individual csets.

And we've written code which finds the longest path through the graph
to get the finest granularity; when run on his tree we get 8138 nodes.
That is 43% of the 18837 nodes possible. The trunk only includes
1068 nodes. So we can a very good job exporting to CVS.

--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

Andrea Arcangeli

unread,

Mar 3, 2003, 7:17:43 PM3/3/03

to Jeff Garzik, David Lang, Larry McVoy, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz

On Mon, Mar 03, 2003 at 07:02:28PM -0500, Jeff Garzik wrote:

> David Lang wrote:
> >On Mon, 3 Mar 2003, Andrea Arcangeli wrote:
> >
> >
> >>Just curious, this also means that at least around the 80% of merges
> >>in Linus's tree is submitted via a bitkeeper pull, right?
> >>
> >>Andrea
> >
> >
> >remember how Linus works, all normal patches get copied into a single
> >large patch file as he reads his mail then he runs patch to apply them to
> >the tree. I think this would make the entire batch of messages look like
> >one cset.
>
>
> Not correct. His commits properly separate the patches out into
> individual csets.

and they're unusable as source to regenerate a tree. I had similar
issues with the web too. to make use of the single csets you need to
implement the internal bitkeeper branching knowledge too. Not to tell
apparently the cset numbers changes all the time.

Andrea

Jeff Garzik

unread,

Mar 3, 2003, 7:31:19 PM3/3/03

to Andrea Arcangeli, David Lang, Larry McVoy, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz

Andrea Arcangeli wrote:
> On Mon, Mar 03, 2003 at 07:02:28PM -0500, Jeff Garzik wrote:
>
>>David Lang wrote:
>>
>>>On Mon, 3 Mar 2003, Andrea Arcangeli wrote:
>>>
>>>
>>>
>>>>Just curious, this also means that at least around the 80% of merges
>>>>in Linus's tree is submitted via a bitkeeper pull, right?
>>>>
>>>>Andrea
>>>
>>>
>>>remember how Linus works, all normal patches get copied into a single
>>>large patch file as he reads his mail then he runs patch to apply them to
>>>the tree. I think this would make the entire batch of messages look like
>>>one cset.
>>
>>
>>Not correct. His commits properly separate the patches out into
>>individual csets.
>
>
> and they're unusable as source to regenerate a tree. I had similar
> issues with the web too. to make use of the single csets you need to
> implement the internal bitkeeper branching knowledge too. Not to tell
> apparently the cset numbers changes all the time.

The "weave", or order of csets, certainly changes each time Linus does a
'bk pull'. I wonder if a 'cset_order' file would be useful -- an
automated job uses BK to export the weave for a specific point in time.
One could use that to glue the csets together, perhaps?

WRT cset numbers, ignore them. Each cset has a unique key. When
setting up the 2.5 snapshot cron job, Linus asked me to export this key
so that the definitive top-of-tree may be identified, regardless of cset
number. Here is an example:
ftp://ftp.kernel.org/pub/linux/kernel/v2.5/snapshots/patch-2.5.63-bk6.key

Jeff

Horst von Brand

unread,

Mar 3, 2003, 8:09:28 PM3/3/03

to Jeff Garzik, nickn, H. Peter Anvin, linux-...@vger.kernel.org

Jeff Garzik <jga...@pobox.com> said:

[...]

> That may be possible with OpenCM, but it's a bit of a stretch for the
> other existing SCMs. Regardless, if BK can export metadata to an open
> format (such as a defined XML spec),

Like something quite as obscure as unidiff?

> then the SCM interchange
> possibilities are only limited by a programmer's time and imagination.

Then we are done ;-)
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

H. Peter Anvin

unread,

Mar 3, 2003, 8:11:21 PM3/3/03

to Horst von Brand, Jeff Garzik, nickn, linux-...@vger.kernel.org

Horst von Brand wrote:
> Jeff Garzik <jga...@pobox.com> said:
>
> [...]
>
>
>>That may be possible with OpenCM, but it's a bit of a stretch for the
>>other existing SCMs. Regardless, if BK can export metadata to an open
>>format (such as a defined XML spec),
>
> Like something quite as obscure as unidiff?
>

Unidiff isn't metadata. Jeff is talking about the metadata which gives
context to the unidiffs.

-hpa

Martin J. Bligh

unread,

Mar 3, 2003, 9:31:50 PM3/3/03

to David Lang, Andrea Arcangeli, Larry McVoy, Jeff Garzik, Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz

>> Just curious, this also means that at least around the 80% of merges
>> in Linus's tree is submitted via a bitkeeper pull, right?
>>
>> Andrea
>
> remember how Linus works, all normal patches get copied into a single
> large patch file as he reads his mail then he runs patch to apply them to
> the tree. I think this would make the entire batch of messages look like
> one cset.

I think he also creates subtrees, applies flat patches to those, then
merges the subtrees back into his main tree as a bk-merge ... won't that
distort the stats?

M.

Linus Torvalds

unread,

Mar 4, 2003, 12:31:49 AM3/4/03

to linux-...@vger.kernel.org

In article <592860000.1046744403@flay>,

Martin J. Bligh <mbl...@aracnet.com> wrote:
>>> Just curious, this also means that at least around the 80% of merges
>>> in Linus's tree is submitted via a bitkeeper pull, right?
>>>
>>> Andrea
>>
>> remember how Linus works, all normal patches get copied into a single
>> large patch file as he reads his mail then he runs patch to apply them to
>> the tree. I think this would make the entire batch of messages look like
>> one cset.

Nope. All my tools are very careful about making single cset's from
single patches. Without that, you can't get good history and changelog
files, and you can't undo or test single patches.

What I _do_ do is to "batch up" patches, which you can see if you take a
look at the times for various changesets. I will save many emails to one
single "pending" file (I call it "doit"), and then my tools will apply
each of them in sequence as one "batch" of files. You can see the effect
of this by just doing

bk changes | grep ChangeSet | less

and seeing how the changes are just a few seconds apart. For example,
here's the last batch I have from Andrew, and you can see that my
scripts applied 19 patches in sequence:

Chan...@1.1058, 2003-03-02 20:38:36-08:00, ak...@digeo.com
Chan...@1.1057, 2003-03-02 20:38:23-08:00, ak...@digeo.com
Chan...@1.1056, 2003-03-02 20:38:15-08:00, ak...@digeo.com
Chan...@1.1055, 2003-03-02 20:38:09-08:00, ak...@digeo.com
Chan...@1.1054, 2003-03-02 20:38:02-08:00, ak...@digeo.com
Chan...@1.1053, 2003-03-02 20:37:54-08:00, ak...@digeo.com
Chan...@1.1052, 2003-03-02 20:37:48-08:00, ak...@digeo.com
Chan...@1.1051, 2003-03-02 20:37:41-08:00, ak...@digeo.com
Chan...@1.1050, 2003-03-02 20:37:34-08:00, ak...@digeo.com
Chan...@1.1049, 2003-03-02 20:37:26-08:00, ak...@digeo.com
Chan...@1.1048, 2003-03-02 20:37:19-08:00, ak...@digeo.com
Chan...@1.1047, 2003-03-02 20:37:13-08:00, ak...@digeo.com
Chan...@1.1046, 2003-03-02 20:37:07-08:00, ak...@digeo.com
Chan...@1.1045, 2003-03-02 20:36:59-08:00, ak...@digeo.com
Chan...@1.1044, 2003-03-02 20:36:51-08:00, ak...@digeo.com
Chan...@1.1043, 2003-03-02 20:36:44-08:00, ak...@digeo.com
Chan...@1.1042, 2003-03-02 20:36:38-08:00, ak...@digeo.com
Chan...@1.1041, 2003-03-02 20:36:31-08:00, ak...@digeo.com
Chan...@1.1040, 2003-03-02 20:36:23-08:00, ak...@digeo.com

roughly 8 seconds between patch (that's how long it takes the scripts to
commit between each change. Imagine doing a commit in 8 seconds using
CVS..)

But all 19 emails ended up as separate changesets, and the only thing
the "batching" does is to make _me_ work more efficiently (ie I don't go
back and forth between reading email and applying one patch: I save the
batch away, I then look through the patches individually and possibly
edit and clean up the email commentary on it, and then I apply them all
in one go).

>I think he also creates subtrees, applies flat patches to those, then
>merges the subtrees back into his main tree as a bk-merge ... won't that
>distort the stats?

Yes, it will.

I try to generally avoid doing parallell development with myself, partly
because it ends up _looking_ really confusing in revtool and thus
sometimes hard to find stuff, but partly just because I'm lazy and I
consider my main tree to be the "merge tree", so by default everything
_should_ go into that one tree if I really do want to merge it.

However, sometimes I get a big series of patches that was generated
against some specific kernel version, and then I'll set up a parallell
tree with the top at that specific version, so that I re-create exactly
what the original developer was working on. That way I avoid patch
rejects, and can take advantage of the automatic BK merge features.

It's not that common, though - I do it mostly if I know or suspect that
something will clash with existing changes in my tree, or if it's
something so fundamental that I want a separate branch for it (this was
the case for a lot of the fundamental VFS stuff Al Viro did earlier in
2.5.x, for example).

Linus

Jeff Garzik

unread,

Mar 4, 2003, 9:52:08 AM3/4/03

to dp...@rogers.com, Linus Torvalds, linux-...@vger.kernel.org

Dimitrie O. Paun wrote:

> On March 4, 2003 12:29 am, Linus Torvalds wrote:
>
>>the case for a lot of the fundamental VFS stuff Al Viro did earlier in
>
>

> Whatever happened to Al BTW? I really miss his patches as well as his
> comments -- for the longest time I would follow l-k just to read his
> posts! I really hope he's alright, and that we'll hear from him soon...

He's still alive and still smoking cigarettes, at least... ;-)

Jeff

David Woodhouse

unread,

Mar 4, 2003, 11:17:38 AM3/4/03

to Pavel Machek, Jeff Garzik, Alan Cox, Arador, Adam J. Richter, and...@suse.de, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

On Mon, 2003-03-03 at 00:10, Pavel Machek wrote:
> [bk's on-disk format is quite reasonable; it might be okay to reuse
> that.]

I disagree. Keeping the checked-out files _outside_ the repository, and
being able to have multiple checked-out trees from the same repository
with uncommitted changes outstanding while you pull from a remote
repository, etc, is useful.

cvs with cvsup does some of this but has obvious disadvantages, not
least of which being the one-way nature of change propagation. SVN and a
yet-to-be-invented SVNup (hopefully not in Modula-3) this time) may be a
lot closer to what we want.

--
dwmw2

Pavel Machek

unread,

Mar 4, 2003, 11:28:47 AM3/4/03

to David Woodhouse, Jeff Garzik, Alan Cox, Arador, Adam J. Richter, and...@suse.de, Linux Kernel Mailing List, pa...@janik.cz, pa...@ucw.cz, h...@infradead.org

Hi!

> > [bk's on-disk format is quite reasonable; it might be okay to reuse
> > that.]
>
> I disagree. Keeping the checked-out files _outside_ the repository, and
> being able to have multiple checked-out trees from the same repository
> with uncommitted changes outstanding while you pull from a remote
> repository, etc, is useful.

Agreed, but bk's SCCS-based format does not prevent you from keeping
checked-out files outside repository or from having multiple
checked-out trees. In fact I'm doing exactly that with bitbucket.

Pavel
--
Horseback riding is like software...
...vgf orggre jura vgf serr.

Olaf Hering

unread,

Mar 4, 2003, 6:40:52 PM3/4/03

to Joel Becker, Jeff Garzik, H. Peter Anvin, linux-...@vger.kernel.org

On Mon, Mar 03, Joel Becker wrote:

> On Sun, Mar 02, 2003 at 12:12:58PM -0500, Jeff Garzik wrote:
> > My counter-question is, why not improve an _existing_ open source SCM to
> > read and write BitKeeper files? Why do we need yet another brand new
> > project?
>
> Normally, I'd agree with you Jeff. However, none of the current
> open source SCM systems are architected in a way that can operate like
> BK.

Ah, finally we got to the root of the "problem".

--
A: No.
Q: Should I include quotations after my reply?

Pavel Machek

unread,

Mar 7, 2003, 6:09:06 AM3/7/03

to Jeff Garzik, H. Peter Anvin, linux-...@vger.kernel.org

Hi!

> While people would certainly use it, I can't help but think that a BK
> clone would damage other open source SCM efforts. I call this the
> "SourceForge Syndrome":

Where do I get pills for that one? :-)

> Q. I found a problem/bug/annoyance, how do I solve it?
> A. Clearly, a brand new sourceforge project is called for.

>
> My counter-question is, why not improve an _existing_ open source SCM
> to read and write BitKeeper files?

I of course thought about that (I'm not yet
hit *that* hard by sf syndrome :-), but:

a) I might extend cssc, but bitbucket is
naturally layer *over* cssc, and cssc
is GNU program (copyright assignment
needed) and is C++. I do not feel like
writing new code in C++, and I do not
like their codingstyle.

b) take something else, *merge cssc to
it*, then add my stuff. Ouch. svn is out
because of licensing, cvs is not powerfull
enough, and I do not like arch. (I did
not know abojt opencm, sorry). Ouch
and this would mean fork, anyway,
because developers of that project
would probably not be happy about
those copyrights (FSF!).

c) so new sf project is indeed way to go :-(.

I hope you understand now,

Pavel

--
Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...

Pavel Machek

unread,

Mar 7, 2003, 6:12:00 AM3/7/03

to Olivier Galibert, linux-...@vger.kernel.org

Hi!

> But anyway, what made[1] Bitkeeper suck less is the real DAG
> structure. Neither arch nor subversion seem to have understood that
> and, as a result, don't and won't provide the same level of semantics.
> Zero hope for Linus to use them, ever. They're needed for any
> decently distributed development process.

Can you elaborate? I thought that this
"real DAG" structure is more or less
equivalent to each developer having
his owm CVS repository...

> Hell, arch is still at the update-before-commit level. I'd have hoped
> PRCS would have cured that particular sickness in SCM design ages ago.
>
> Atomicity, symbolic links, file renames, splits (copy) and merges (the
> different files suddendly ending up being the same one) are somewhat
> important, but not the interesting part. A good distributed DAG
> structure and a quality 3-point version "merge" is what you actually
> need to build bk-level SCMs.

If I fixed CVS renames, added atomic
commits, splits and merges, and gave each
developer his own CVS repository,
would I be in same league as bk?
Ie 10 times slower but equivalent
functionality?

(3 point merge should be doable for CVS
to and would be good thing anyway,
right?)

Pavel Machek

unread,

Mar 7, 2003, 6:14:20 AM3/7/03

to Joel Becker, Jeff Garzik, H. Peter Anvin, linux-...@vger.kernel.org

Hi!

> The one thing BK does that makes it worthwhile is the three-way
> merge. This (and the resulting DAG) make handling code from Alan, from
> Linus, from Andrew, and from everyone else possible. With CVS,
> subversion, or any other SCM I've worked with, you have to hand merge
> anything past the first patch. Ugh.

What's so magical about 3way merge?
I thought it would be easy to do even
in CVS...

Pavel
--
Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...

-

Tupshin Harper

unread,

Mar 7, 2003, 6:26:38 AM3/7/03

to Pavel Machek, linux-...@vger.kernel.org

Pavel Machek wrote:

>it*, then add my stuff. Ouch. svn is out
>because of licensing, cvs is not powerfull
>

Could you or somebody else explain this repeated claim that the
Subversion licensing is problematic?

I don't have anything to do with the project, but a quick perusal of the
license doesn't reveal any problems. It's basically an Apache/BSD
license (pasted below for your reading pleasure).

-Tupshin

------------Subversion copyright---------------

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.

3. The end-user documentation included with the redistribution, if
any, must include the following acknowledgment: "This product includes
software developed by CollabNet (http://www.Collab.Net/)."
Alternately, this acknowledgment may appear in the software itself, if
and wherever such third-party acknowledgments normally appear.

4. The hosted project names must not be used to endorse or promote
products derived from this software without prior written
permission. For written permission, please contact in...@collab.net.

5. Products derived from this software may not use the "Tigris" name
nor may "Tigris" appear in their names without prior written
permission of CollabNet.

THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL COLLABNET OR ITS CONTRIBUTORS BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

This software consists of voluntary contributions made by many
individuals on behalf of CollabNet.

Pavel Machek

unread,

Mar 7, 2003, 6:31:06 AM3/7/03

to Tupshin Harper, linux-...@vger.kernel.org

Hi!

> >it*, then add my stuff. Ouch. svn is out
> >because of licensing, cvs is not powerfull
> >
> Could you or somebody else explain this repeated claim that the
> Subversion licensing is problematic?
>
> I don't have anything to do with the project, but a quick perusal of the
> license doesn't reveal any problems. It's basically an Apache/BSD
> license (pasted below for your reading pleasure).

You snipped what you should not snip: I'd need to merge CSSC (GPL-ed)
and svn (BSD with advertising). I may not do that. svn license is
okay, but merging it with CSSC is not possible.

Pavel
--
Horseback riding is like software...
...vgf orggre jura vgf serr.

Olivier Galibert

unread,

Mar 7, 2003, 7:13:36 AM3/7/03

to Pavel Machek, linux-...@vger.kernel.org

On Thu, Mar 06, 2003 at 05:18:53PM +0100, Pavel Machek wrote:
> Can you elaborate? I thought that this
> "real DAG" structure is more or less
> equivalent to each developer having
> his owm CVS repository...

Nope. CVS uses RCS, and RCS only knows about trees, not graphs.
Specifically, branch merges are not tagged as such, and as a result
CVS is unable to pick up the best grandparent when doing a merge.
That's the main reason of why branching under CVS is so painful
(forgetting about the performance issues).

> If I fixed CVS renames, added atomic
> commits, splits and merges, and gave each
> developer his own CVS repository,
> would I be in same league as bk?
> Ie 10 times slower but equivalent
> functionality?

Nope. You'll find out that this per-developper repository quickly
needs to become a per-branch repository, and even need you need to
write somewhere when the merges with other repositories happen, and
you end up with the DAG again.

Another way to see it is that CVS and friends use an
update-then-commit scheme, which is proven crap because you lose the
working version you had when you do the update to get a result that is
sometimes interesting. Nice systems, like PRCS and bk, first commit
to a new branch (no update necessary obviously) then merge in the
mainline. As a side effect, they are Good with branches. Bk's main
quality over PRCS is the distribution. This lack is what makes PRCS
essentially unusable for serious open source projects. Otherwise
they're semantically the same.

> (3 point merge should be doable for CVS
> to and would be good thing anyway,
> right?)

Technically, CVS does 3-point merge, it's just crap at finding the
third point, and diff3 -m (which is what is used under the hood) isn't
that spectacular either.

You can see the merge operation in a different way. You take 3
versions of your complete repository A, B and R (reference). You
compute the deltas dA and dB so that A=dA(R) and B=dB(R). Then you
try to build M=dA(dB(R))=dB(dA(R)), when it makes sense (not only the
deltas aren't necessarily commutative, they can't even always apply
one after the other). When it doesn't work there are conflicts to be
resolved by the user. You can see that when it workds M=dA(B)=dB(A).

You can do a lot of things with that, merging branches is just one of
them. You can back out patches from within the history for instance
(D->E->F, merge D and F using E as reference removes the D->E patch
from F).

The trick is, the "simplest" your deltas are the lowest the conflict
probability is. That's where the DAG kicks in. For a branch merge,
the lowest conflict probability of conflict tends to occur when the
two deltas are a linear combination of small user-made deltas, with no
delta common between the two chains. I.e. the best reference to use
is the latest merge point. The DAG allows you to find it. CVS
doesn't note the merge points so it always goes all the way where the
branch is rooted, ensuring that the two delta chains have a large
common prefix.

Sub-optimal reference point plus diff3's algorithm being what it is
makes the CVS branches plain unusable. Multiple repositories won't
fix that, since you'll need to merge between repositories anyway.

OG.

Pavel Machek

unread,

Mar 7, 2003, 7:33:55 AM3/7/03

to Olivier Galibert, linux-...@vger.kernel.org

Hi!

> > Can you elaborate? I thought that this
> > "real DAG" structure is more or less
> > equivalent to each developer having
> > his owm CVS repository...
>
> Nope. CVS uses RCS, and RCS only knows about trees, not graphs.
> Specifically, branch merges are not tagged as such, and as a result
> CVS is unable to pick up the best grandparent when doing a merge.
> That's the main reason of why branching under CVS is so painful
> (forgetting about the performance issues).

I see. But I still somehow can not understand how merging is
possible. Merge possibly means work-by-hand, right? So it is not as
simple as noting that 1.8 and 1.7.1.1 were merged into 1.9, no? [And
what if developer did really crap job at merging that, like dropping
all changes from 1.7.1.1?]

> > If I fixed CVS renames, added atomic
> > commits, splits and merges, and gave each
> > developer his own CVS repository,
> > would I be in same league as bk?
> > Ie 10 times slower but equivalent
> > functionality?
>
> Nope. You'll find out that this per-developper repository quickly
> needs to become a per-branch repository, and even need you need to
> write somewhere when the merges with other repositories happen, and
> you end up with the DAG again.

Yep, that's what I wanted to know. [I see per-branch repository is
pain, but it helps me to understand that.]

Thanx for your explanations,

Pavel
--
Horseback riding is like software...
...vgf orggre jura vgf serr.

Olivier Galibert

unread,

Mar 7, 2003, 11:55:07 AM3/7/03

to Pavel Machek, linux-...@vger.kernel.org

On Fri, Mar 07, 2003 at 01:32:37PM +0100, Pavel Machek wrote:
> > Nope. CVS uses RCS, and RCS only knows about trees, not graphs.
> > Specifically, branch merges are not tagged as such, and as a result
> > CVS is unable to pick up the best grandparent when doing a merge.
> > That's the main reason of why branching under CVS is so painful
> > (forgetting about the performance issues).
>
> I see. But I still somehow can not understand how merging is
> possible. Merge possibly means work-by-hand, right? So it is not as
> simple as noting that 1.8 and 1.7.1.1 were merged into 1.9, no? [And
> what if developer did really crap job at merging that, like dropping
> all changes from 1.7.1.1?]

Calling A and B the versions to merge and R the reference, diff3 uses
this algorithm (probably the simplest possible):
- Compute the diff between A and R, call it dA
- Compute the diff between B and R, call it dB
- Merge the two diffs into one (and conflict where you can't)
- Apply the merged diff to R

Better algorithms do the alignments per-character instead of per-line,
detect moved and changed functions, detect duplicate inserts, etc.
None, of course, is perfect, as Larry could tell you.

Now if the development went that way:

1.7 -> 1.7.1.1 (branching, i.e. copy)
v v
v 1.7.1.2
1.8 v
v -> 1.7.1.3 (merge)
1.9 v
v v
1.10 v
v -> 1.7.1.4 (merge)
v v
v 1.7.1.5
v v
1.11 <- (merge)

Pretty much standard, a developper created a new branch, made some
changes in it, synced with mainline, synced with mailine again a
little later, made some new changes and finally folded the branch back
in the mainline. Let's admit the developper changes don't conflict by
themselves with the mainline changes.

CVS, for all the merges, is going to pick 1.7 as the reference. The
first time, for 1.7.1.3, it's going to work correctly. It will fuse
the 1.7->1.8 patch with the 1.7.1.1->1.7.1.2 patch and apply the
result to 1.7 to get 1.7.1.3. The two patches have no reason to
overlap. 1.7.1.2->1.7.1.3 will essentially be identical to 1.7->1.8,
and 1.8->1.7.1.3 will essentially be identical to 1.7.1.2->1.7.1.3.

As soon as the next merge, i.e 1.7.1.4, it breaks. CVS is going to
try to fuse the 1.7->1.10 patch with the 1.7->1.7.1.3 patch. But
1.7->1.10 = 1.7->1.8+1.8->1.10 and 1.7->1.7.1.3 ~= 1.7->1.7.1.2+1.7->1.8.
So they have components in common, hance they _will_ conflict.

If CVS had taken the latest common ancestor by keeping in the
repository the existence of the 1.8->1.7.1.3 link, it would have taken
the 1.8 version as the reference. The patches to fuse would have been
1.8->1.10 and 1.8->1.7.1.3, which have no reason to conflict.

Same for the next merge, the optimal merge point is in that case 1.10,
and it ends up being a null merge, i.e. 1.11 is a copy of 1.7.1.5.

You can see the final structure is a DAG, with each node having a max
of 2 ancestors. And that's what PRCS and bk are working with,
fundamentally.

OG.

Geert Uytterhoeven

unread,

Mar 7, 2003, 12:16:52 PM3/7/03

to Olivier Galibert, Pavel Machek, Linux Kernel Development

^^^^^^^^^^^^^^^^
1.7.1.1->1.7.1.2, I assume?

> As soon as the next merge, i.e 1.7.1.4, it breaks. CVS is going to
> try to fuse the 1.7->1.10 patch with the 1.7->1.7.1.3 patch. But
> 1.7->1.10 = 1.7->1.8+1.8->1.10 and 1.7->1.7.1.3 ~= 1.7->1.7.1.2+1.7->1.8.
> So they have components in common, hance they _will_ conflict.
>
> If CVS had taken the latest common ancestor by keeping in the
> repository the existence of the 1.8->1.7.1.3 link, it would have taken
> the 1.8 version as the reference. The patches to fuse would have been
> 1.8->1.10 and 1.8->1.7.1.3, which have no reason to conflict.
>
> Same for the next merge, the optimal merge point is in that case 1.10,
> and it ends up being a null merge, i.e. 1.11 is a copy of 1.7.1.5.
>
> You can see the final structure is a DAG, with each node having a max
> of 2 ancestors. And that's what PRCS and bk are working with,
> fundamentally.

Aha, so that's why my `mergetree' script (which basically is some directory
recursion around plain RCS merge, with additional support for hardlinking
identical files) works better than CVS, when I merge e.g. linux-2.5.64 and
linux-m68k-2.5.63 into linux-m68k-2.5.64. It always uses the latest common
ancestor (linux-2.5.63)...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Pavel Machek

unread,

Mar 7, 2003, 2:09:40 PM3/7/03

to Olivier Galibert, Pavel Machek, linux-...@vger.kernel.org

Hi!

So, basically, if branch was killed and recreated after each merge
from mainline, problem would be solved, right?

Pavel
--
Horseback riding is like software...
...vgf orggre jura vgf serr.

Eli Carter

unread,

Mar 7, 2003, 2:26:20 PM3/7/03

to Pavel Machek, Olivier Galibert, linux-...@vger.kernel.org

You would lose the history that branch gave you.
Or do you mean create a new branch (with a new name) at the point where
the old branch was merged, and no longer use the old branch for commits?

Eli
--------------------. "If it ain't broke now,
Eli Carter \ it will be soon." -- crypto-gram
eli.carter(a)inet.com `-------------------------------------------------

Pavel Machek

unread,

Mar 7, 2003, 3:30:37 PM3/7/03

to Eli Carter, Pavel Machek, Olivier Galibert, linux-...@vger.kernel.org

Hi!

> >So, basically, if branch was killed and recreated after each merge
> >from mainline, problem would be solved, right?
> >
> > Pavel
>
> You would lose the history that branch gave you.
> Or do you mean create a new branch (with a new name) at the point where
> the old branch was merged, and no longer use the old branch for
> >commits?

Yes, that's what I meant.

Pavel
--
Horseback riding is like software...
...vgf orggre jura vgf serr.

Linus Torvalds

unread,

Mar 7, 2003, 6:18:34 PM3/7/03

to linux-...@vger.kernel.org

In article <20030307190...@atrey.karlin.mff.cuni.cz>,

Pavel Machek <pa...@suse.cz> wrote:
>
>So, basically, if branch was killed and recreated after each merge
>from mainline, problem would be solved, right?

Wrong.

Now think three trees. Each merging back and forth between each other.

Or, in the case of something like the Linux kernel tree, where you don't
have two or three trees. You've got at least 20 actively developed
concurrent trees with branches at different points.

Trust me. CVS simple CANNOT do this. You need the full information.

Give it up. BitKeeper is simply superior to CVS/SVN, and will stay that
way indefinitely since most people don't seem to even understand _why_
it is superior.

Linus

Olaf Dietsche

unread,

Mar 7, 2003, 7:20:19 PM3/7/03

to Olivier Galibert, Pavel Machek, linux-...@vger.kernel.org

Olivier Galibert <gali...@pobox.com> writes:

> sometimes interesting. Nice systems, like PRCS and bk, first commit
> to a new branch (no update necessary obviously) then merge in the
> mainline. As a side effect, they are Good with branches. Bk's main
> quality over PRCS is the distribution. This lack is what makes PRCS
> essentially unusable for serious open source projects. Otherwise
> they're semantically the same.

So, what you say is: add distribution to PRCS and you're done?

Regards, Olaf.

Zack Brown

unread,

Mar 8, 2003, 5:53:56 PM3/8/03

to Linus Torvalds, linux-...@vger.kernel.org

Hi Linus,

On Fri, Mar 07, 2003 at 11:16:47PM +0000, Linus Torvalds wrote:
> In article <20030307190...@atrey.karlin.mff.cuni.cz>,
> Pavel Machek <pa...@suse.cz> wrote:
> >
> >So, basically, if branch was killed and recreated after each merge
> >from mainline, problem would be solved, right?
>
> Wrong.
>
> Now think three trees. Each merging back and forth between each other.
>
> Or, in the case of something like the Linux kernel tree, where you don't
> have two or three trees. You've got at least 20 actively developed
> concurrent trees with branches at different points.
>
> Trust me. CVS simple CANNOT do this. You need the full information.
>
> Give it up. BitKeeper is simply superior to CVS/SVN, and will stay that
> way indefinitely since most people don't seem to even understand _why_
> it is superior.

You make it sound like no one is even interested ;-). But it's not true! A
lot of people currently working on alternative version control systems would
like very much to know what it would take to satisfy the needs of kernel
development. Maybe, being on the inside of the process and well aware of
your own needs, you don't realize how difficult it is to figure these things
out from the outside. I think only very few people (perhaps only one) really
understand this issue, and they aren't communicating with the horde of people
who really want to help, if only they knew how.

My impression is that Pavel is really smart and pretty close to the core of
kernel development. But you say even he doesn't get it? Come on! Throw
us a bone, willya!? ;-)

Be well,
Zack

>
> Linus
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majo...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
Zack Brown

Larry McVoy

unread,

Mar 8, 2003, 7:07:32 PM3/8/03

to Zack Brown, Linus Torvalds, linux-...@vger.kernel.org

> > Give it up. BitKeeper is simply superior to CVS/SVN, and will stay that
> > way indefinitely since most people don't seem to even understand _why_
> > it is superior.
>
> You make it sound like no one is even interested ;-). But it's not true! A
> lot of people currently working on alternative version control systems would
> like very much to know what it would take to satisfy the needs of kernel
> development. Maybe, being on the inside of the process and well aware of
> your own needs, you don't realize how difficult it is to figure these things
> out from the outside. I think only very few people (perhaps only one) really
> understand this issue, and they aren't communicating with the horde of people
> who really want to help, if only they knew how.

[Long rant, summary: it's harder than you think, read on for the details]

There are parts of BitKeeper which required multiple years of thought by
people a lot smarter than me. You guys are under the mistaken impression
that BitKeeper is my doing; it's not. There are a lot of people who
work here and they have some amazing brains. To create something like
BK is actually more difficult than creating a kernel.

To understand why, think of BK as a distributed, replicated, version
controlled user level file system with no limits on any of the file system
events which may happened in parallel. Now put the changes back together,
correctly, no matter how much parallelism there has been. Pavel hasn't
understood anything but a tiny fraction of the problem space yet, he
just doesn't realize it. Even Linus doesn't know how BitKeeper works,
we haven't told him and I can tell from his explanations that he gets
part of it but not most of it. That's not a slam on Linus or Pavel or
anyone else. I'm just trying to tell you guys that this stuff is a lot
harder than you think. I've told people that before, like the SVN and
OpenCM guys, and the leaders of both those efforts showed up later and
said "yup, you're right, it is a hell of a lot harder than it looks".
And they are nowhere near being able to do what BK does. Ask them if
you have doubts about what I am saying.

Merging is just one of the complex areas. It gets all the attention
because it is hard enough but easy enough that people like to work on it.
It's actually fun to work on merging. Ditto for the graph structure,
that's trivial. The other parts aren't fun and they are more difficult
so they don't get talked about. But they are more important because
the user has no idea how to deal with them and users do know how to deal
with merge problems, lots of you understand patch rejects.

Rename handling in a distributed system is actually much harder than
getting the merging done. It doesn't seem like it is, but we've rewritten
how we do it 3 times and are working on a 4th all because we've been
forced to learn about all the different ways that people move things
around. CVS doesn't have any of the rename problems because it doesn't
do them, and SVN doesn't have 1/1000th of the problems we do because it
is centralized. Centralized means that there is never any confusion
about where something should go, you can only create one file in one
directory entry because there is only one directory entry available.
In BK's case, there can be an infinite number of different files which
all want to be src/foo.c.

Symbolic tags are really hard. What?!? What could be easier than adding
a symbolic label on a revision? Well, in a centralized system it is
trivial but in a distributed system you have to handle the fact that
the same symbol can be put on multiple revs. It's the same problem as
the file names, just a variation. Add to that the fact that time can
march forward or backwards in a distributed system, even if all the
events were marching forward, and the fun really starts. I personally
have redone the tags support about 6 times and it still isn't right.

Security semantics are hard in a distributed system. Where do you
put them, how do you integrate them into the system, what happens when
people try and work around them? In CVS or SVN you can simply lock down
the server and not worry about it, but in BK, the user has the revision
history and they are root, they can do whatever they want.

Time semantics are the hardest of all. You simply can't depend on time
being correct. It goes forwards, backwards, and sideways on you and
if you think you can use time you don't have the slightest idea of the
scope of the problem. Again, not a problem for CVS/SVN/whatever, all the
deltas are made against the same clock. Not true in a distributed system.

That's a taste of what it is like. You have to get all of those right
and the many other ones that I didn't tell you about or you might as
well not bother. Why? Because the problems are very subtle and there
isn't any hope of getting an end user to figure out a subtle problem,
they don't have the time or the inclination. We've seen users throw away
weeks of work just because they didn't understand the merge conflict so
they start over on an updated tree. And those people will understand
the rename corner cases? Not a chance.

The main point here is that if you think that BK happened quickly,
by one guy, you are nuts. It started in May of 1997, that's almost 6
years ago, not the 2 years that Pavel thinks, and I had already written
a complete version control system prior to that, so this was round two.
Even with that knowledge, I wasn't near enough to get BK to where it is
today, there is more than 40 man years of effort in BK so far. A bunch
of people, working 60-90 hour weeks, for almost 6 years. Not average
people, either, any one of these people would be a staff engineer or
better at Sun (salaries for those people are in the $115K - $140K range).

The disbelievers think that I'm out here waving the "it's too hard"
flag so you'll go away. And the arrogant people think that they are
smarter than us and can do it quicker. I doubt it but by all means go
for it and see what you can do. Just file away a copy of this and let
me know what you think three or four years from now.

Oh, by the way, you'll need a business model, I found that out 2 or 3
years into it when my savings ran out. Oh, my, you might not be able
to GPL it! Why it might even end up being just like BitKeeper with
an evil corporate dude named Pavel running the show. Believe me, if
that happens, I'll be here to rake him over the coals on a daily basis
for being such an evil person who doesn't understand the point of free
software. I can't wait.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

Davide Libenzi

unread,

Mar 8, 2003, 8:13:32 PM3/8/03

to Larry McVoy, Zack Brown, Linus Torvalds, Linux Kernel Mailing List

On Sat, 8 Mar 2003, Larry McVoy wrote:

> > > Give it up. BitKeeper is simply superior to CVS/SVN, and will stay that
> > > way indefinitely since most people don't seem to even understand _why_
> > > it is superior.
> >
> > You make it sound like no one is even interested ;-). But it's not true! A
> > lot of people currently working on alternative version control systems would
> > like very much to know what it would take to satisfy the needs of kernel
> > development. Maybe, being on the inside of the process and well aware of
> > your own needs, you don't realize how difficult it is to figure these things
> > out from the outside. I think only very few people (perhaps only one) really
> > understand this issue, and they aren't communicating with the horde of people
> > who really want to help, if only they knew how.
>
> [Long rant, summary: it's harder than you think, read on for the details]
>
> There are parts of BitKeeper which required multiple years of thought by
> people a lot smarter than me. You guys are under the mistaken impression
> that BitKeeper is my doing; it's not. There are a lot of people who
> work here and they have some amazing brains. To create something like
> BK is actually more difficult than creating a kernel.

Larry, how many years are that you're working as a developer and side by
side with developers ? 15 maybe 20 ? Do you know what's the best way to
keep developers out of doing something ? Well, just say the task is
trivial, easy, for dummies. And you will see developers stay away from the
project like cats from water. Try, even remotely, to dress the project
with complexity, and they'll come in storms ...

- Davide

Horst von Brand

unread,

Mar 8, 2003, 9:08:34 PM3/8/03

to Pavel Machek, Olivier Galibert, linux-...@vger.kernel.org

Pavel Machek <pa...@suse.cz> said:

[...]

> So, basically, if branch was killed and recreated after each merge
> from mainline, problem would be solved, right?

Who is branch, who is mainline? The branch owner _will_ be pissed off if
his head version changes each time he syncronizes. What if mainline dies,
and the official line moves to one of the branches? What happens when there
aren't just two, but a dozen developers swizzling individual csets from
each other (not necesarily just resyncing with each other)? If said
developers also apply random patches from a common mailing list?

This is _much_ harder than it looks on the surface.

--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

Zack Brown

unread,

Mar 8, 2003, 9:47:28 PM3/8/03

to Larry McVoy, Linus Torvalds, linux-...@vger.kernel.org

On Sat, Mar 08, 2003 at 04:05:14PM -0800, Larry McVoy wrote:
> Zack Brown wrote:

> > Linus Torvalds wrote:
> > > Give it up. BitKeeper is simply superior to CVS/SVN, and will stay that
> > > way indefinitely since most people don't seem to even understand _why_
> > > it is superior.
> >
> > You make it sound like no one is even interested ;-). But it's not true! A
> > lot of people currently working on alternative version control systems would
> > like very much to know what it would take to satisfy the needs of kernel
> > development.
>

> [Long rant, summary: it's harder than you think, read on for the details]

[skipping long description]

OK, so here is my distillation of Larry's post.

Basic summary: a distributed, replicated, version controlled user level file

system with no limits on any of the file system events which may happened

in parallel. All changes must be put correctly back together, no matter how

much parallelism there has been.

* Merging.

* The graph structure.

* Distributed rename handling. Centralized systems like Subversion don't
have as many problems with this because you can only create one file in

one directory entry because there is only one directory entry available.

In distributed rename handling, there can be an infinite number of different
files which all want to be src/foo.c. There are also many rename corner-cases.

* Symbolic tags. This is adding a symbolic label on a revision. A distributed
system must handle the fact that the same symbol can be put on multiple
revisions. This is a variation of file renaming. One important thing to
consider is that time can go forward or backward.

* Security semantics. Where should they go? How can they be integrated
into the system? How are hostile users handled when there is no central
server to lock down?

* Time semantics. A distributed system cannot depend on reported time
being correct. It can go forward or backward at any rate.

I'd be willing to maintain this as the beginning of a feature list and
post it regularly to lkml if enough people feel it would be useful and not
annoying. The goal would be to identify the features/problems that would
need to be handled by a kernel-ready version control system.

Be well,
Zack

--
Zack Brown

Roman Zippel

unread,

Mar 8, 2003, 10:20:17 PM3/8/03

to Zack Brown, Larry McVoy, Linus Torvalds, linux-...@vger.kernel.org

Hi,

On Sat, 8 Mar 2003, Zack Brown wrote:

> * Distributed rename handling. Centralized systems like Subversion don't
> have as many problems with this because you can only create one file in
> one directory entry because there is only one directory entry available.
> In distributed rename handling, there can be an infinite number of different
> files which all want to be src/foo.c. There are also many rename corner-cases.

This actually a very bk specific problem, because the real problem under
bk there can be only one src/SCCS/s.foo.c. A separate repository doesn't
have this problem, because it has control over the naming in the
repository and the original naming is restored with an explicit checkout.
In this context it will be really interesting to see how Larry wants to
implement "lines of development" (aka branches which don't suck) and
also maintain SCCS compatibility.

bye, Roman

Linus Torvalds

unread,

Mar 8, 2003, 10:47:56 PM3/8/03

to Roman Zippel, Zack Brown, Larry McVoy, linux-...@vger.kernel.org

On Sun, 9 Mar 2003, Roman Zippel wrote:
> On Sat, 8 Mar 2003, Zack Brown wrote:
>
> > * Distributed rename handling.
>

> This actually a very bk specific problem, because the real problem under
> bk there can be only one src/SCCS/s.foo.c.

I don't think that is the issue.

[ Well, yes, I agree that the SCCS format is bad, but for other reasons ]

> A separate repository doesn't have this problem

You're wrong.

The problem is _distribution_. In other words, two people rename the same
file. Or two people rename two _different_ files to the same name. Or two
people create two different files with the same name. What happens when
you merge?

None of these are issues for broken systems like CVS or SVN, since they
have a central repository, so there _cannot_ be multiple concurrent
renames that have to be merged much later (well, CVS cannot handle renames
at all, but the "same name creation" issue you can see even with CVS).

With a central repostory, you avoid a lot of the problems, because the
conflicts must have been resolved _before_ the commit ever happens - put
another way, you can never have a conflict in the revision history.

Sepoarate repostitories and SCCS file formats have nothing to do with the
real problem. Distribution is key, not the repository format.

Linus

Roman Zippel

unread,

Mar 8, 2003, 11:35:06 PM3/8/03

to Linus Torvalds, Zack Brown, Larry McVoy, linux-...@vger.kernel.org

Hi,

On Sat, 8 Mar 2003, Linus Torvalds wrote:

> None of these are issues for broken systems like CVS or SVN, since they
> have a central repository, so there _cannot_ be multiple concurrent

> renames that have to be merged much later.

It is possible, you only have to remember that the file foo.c doesn't have
to be called foo.c,v in the repository. SVN should be able to handle this,
it's just lacking important merging mechanisms.
This is actually a key feature I want to see in a SCM system - the ability
to keep multiple developments within the same repository. I want to pull
other source tress into a branch and compare them with other branches and
merge them into new branches.

> Sepoarate repostitories and SCCS file formats have nothing to do with the
> real problem. Distribution is key, not the repository format.

I agree, what I was trying to say is that the SCCS format makes a few
things more complex than they had to be.

bye, Roman

Eric W. Biederman

unread,

Mar 9, 2003, 8:36:14 AM3/9/03

to Roman Zippel, Linus Torvalds, Zack Brown, Larry McVoy, linux-...@vger.kernel.org

Roman Zippel <zip...@linux-m68k.org> writes:

> Hi,
>
> On Sat, 8 Mar 2003, Linus Torvalds wrote:
>
> > None of these are issues for broken systems like CVS or SVN, since they
> > have a central repository, so there _cannot_ be multiple concurrent
> > renames that have to be merged much later.
>
> It is possible, you only have to remember that the file foo.c doesn't have
> to be called foo.c,v in the repository. SVN should be able to handle this,
> it's just lacking important merging mechanisms.
> This is actually a key feature I want to see in a SCM system - the ability
> to keep multiple developments within the same repository. I want to pull
> other source tress into a branch and compare them with other branches and
> merge them into new branches.

In a distributed system everything happens on a branch.

> > Sepoarate repostitories and SCCS file formats have nothing to do with the
> > real problem. Distribution is key, not the repository format.
>
> I agree, what I was trying to say is that the SCCS format makes a few
> things more complex than they had to be.

I don't know, if the problem really changes that much. How do
you pick a globally unique inode number for a file? And then
how do you reconcile this when people on 2 different branches create
the same file and want to merge their versions together?

So as a very rough approximation.
- Distribution is the problem.
- Powerful branching is the only thing that helps this
- Non branch local data (labels/tags) is very difficult.

Eric

Olivier Galibert

unread,

Mar 9, 2003, 9:51:29 AM3/9/03

to linux-...@vger.kernel.org

On Sat, Mar 08, 2003 at 07:42:24PM -0800, Linus Torvalds wrote:
>
> On Sun, 9 Mar 2003, Roman Zippel wrote:
> > On Sat, 8 Mar 2003, Zack Brown wrote:
> >
> > > * Distributed rename handling.
> >
> > This actually a very bk specific problem, because the real problem under
> > bk there can be only one src/SCCS/s.foo.c.
>
> I don't think that is the issue.
>
> [ Well, yes, I agree that the SCCS format is bad, but for other reasons ]

It is a large part of the issue though. If you don't have one
repository file per project file with a name that resembles the
repository's one you find out that the project file name is somewhat
unimportant, just yet another of the metadata to track.

> The problem is _distribution_.

The only problem with distribution is sending as little as possible
over the network. All the problems you're talking about exist with a
single repository as soon as you have decent branches.

> In other words, two people rename the same
> file. Or two people rename two _different_ files to the same name. Or two
> people create two different files with the same name. What happens when
> you merge?

A conflict, what else? The file name is only one of the
characteristics of a file. And BTW, the interesting problem which is
what to do when you find out two different files end up being in fact
the same one is not covered by bk (or wasn't).

OG.

Roman Zippel

unread,

Mar 9, 2003, 10:37:50 AM3/9/03

to Eric W. Biederman, Linus Torvalds, Zack Brown, Larry McVoy, linux-...@vger.kernel.org

Hi,

On 9 Mar 2003, Eric W. Biederman wrote:

> > This is actually a key feature I want to see in a SCM system - the ability
> > to keep multiple developments within the same repository. I want to pull
> > other source tress into a branch and compare them with other branches and
> > merge them into new branches.
>
> In a distributed system everything happens on a branch.

That's true, but with bk you have to use separate directories for that,
which makes cross references between branches more difficult.

> > I agree, what I was trying to say is that the SCCS format makes a few
> > things more complex than they had to be.
>
> I don't know, if the problem really changes that much. How do
> you pick a globally unique inode number for a file? And then
> how do you reconcile this when people on 2 different branches create
> the same file and want to merge their versions together?

Unique identifier are needed for change sets anyway and if you decide
during merge, that two files are identical, at least one branch has to
carry the information that these identifiers point to the same file.

> So as a very rough approximation.
> - Distribution is the problem.

I would rather say, that it's only one (although very important) problem.

bye, Roman

Martin J. Bligh

unread,

Mar 9, 2003, 11:56:57 AM3/9/03

to Roman Zippel, Eric W. Biederman, Linus Torvalds, Zack Brown, Larry McVoy, linux-...@vger.kernel.org

>> > This is actually a key feature I want to see in a SCM system - the ability
>> > to keep multiple developments within the same repository. I want to pull
>> > other source tress into a branch and compare them with other branches and
>> > merge them into new branches.
>>
>> In a distributed system everything happens on a branch.
>
> That's true, but with bk you have to use separate directories for that,
> which makes cross references between branches more difficult.
>
>> > I agree, what I was trying to say is that the SCCS format makes a few
>> > things more complex than they had to be.
>>
>> I don't know, if the problem really changes that much. How do
>> you pick a globally unique inode number for a file? And then
>> how do you reconcile this when people on 2 different branches create
>> the same file and want to merge their versions together?
>
> Unique identifier are needed for change sets anyway and if you decide
> during merge, that two files are identical, at least one branch has to
> carry the information that these identifiers point to the same file.
>
>> So as a very rough approximation.
>> - Distribution is the problem.
>
> I would rather say, that it's only one (although very important) problem.

I think it's possible to get 90% of the functionality that most of us
(or at least I) want without the distributed stuff. If that's 10% of
the effort, would be really nice to have the auto-merging type of
functionality at least.

If the "maintainer" heirarchy was a strict tree structure, where you
send patches to your parent, and receive them from your children, that
doesn't seem to need anything particularly fancy to me.

Personally, I just collect together patches mainly from IBM people here,
test them for functionality and performance, and sync up with Linus every
new release by reapplying them on top of the new tree, and fix the conflicts
by hand. Then I just email the patches as flat diffs to Linus. If I could
get some really basic auto-merge functionality, that would get rid of 90%
of the work, even if it only worked 95% of the time, and showed me what
it had done that patch couldn't have done by itself. I don't see why that
requires all this distributed stuff. If I resync with the latest -bk
snapshot just before I send, the chances of Linus having to do much merge
work is pretty small.

I'm sure Bitkeeper is better than that, and has all sorts of fancy features,
and perhaps Linus even uses some of them. But if I can get 90% of that for
10% of the effort, I'd be happy. Some way to pass Linus some basic metadata
like changelog comments would be good (at the moment, I just slap those atop
the patch, and he edits them, but a basic perl script could hack off a
"comment to Linus" section from a "changelog section", which might save
Linus some editing).

Andrew and Alan seem to work pretty well with flat patches too - Larry
seemed to imply that he thought the merge part of the problem was easy
enough in a non-distributed system ... if anything existant has or could
have that without the distributed stuff and the complexity, would be cool.

If I'm missing something fundamental here, it wouldn't suprise me ;-)

M.

Zack Brown

unread,

Mar 9, 2003, 12:23:21 PM3/9/03

to Martin J. Bligh, Roman Zippel, Eric W. Biederman, Linus Torvalds, Larry McVoy, linux-...@vger.kernel.org

On Sun, Mar 09, 2003 at 08:55:44AM -0800, Martin J. Bligh wrote:
> I think it's possible to get 90% of the functionality that most of us
> (or at least I) want without the distributed stuff. If that's 10% of
> the effort, would be really nice to have the auto-merging type of
> functionality at least.

> If I'm missing something fundamental here, it wouldn't suprise me ;-)

I think the fundamental thing you're missing is that Linus doesn't want it. ;-)

As long as people keep trying to avoid the hard problems that Linus and Larry
keep pointing out, I doubt any effort will get very far. I see a lot of cases
where someone says, "yeah, but we can side-step that problem if we do x,
y, or z." That doesn't help. The question is, what are the actual features
required for a version control system that could win support among the top
kernel developers?

People in the know hint at these features ("naming is really important"),
but the details are apparently complicated enough that no one wants to sit
down and actually describe them. They just hint at the *sort* of problems
they are, and then someone says, "but that's not really a problem because
of x, y, or z that can be done instead."

Then people get sidetracked on the features they personally would settle for,
and the real point gets lost in the fog. Or else they start dreaming
about what the perfect system would be like, describing features that
would not actually be required for a kernel-ready version control
system.

Unless the people in the know actually speak up, the rest of us just won't
be able to figure out what they need. A lot of projects are chasing their
tails right now, trying to do something, but lacking the direction they need
in order to do it.

Be well,
Zack

>
> M.

--
Zack Brown

Linus Torvalds

unread,

Mar 9, 2003, 12:42:22 PM3/9/03

to Martin J. Bligh, Roman Zippel, Eric W. Biederman, Zack Brown, Larry McVoy, linux-...@vger.kernel.org

On Sun, 9 Mar 2003, Martin J. Bligh wrote:
>
> I think it's possible to get 90% of the functionality that most of us
> (or at least I) want without the distributed stuff.

No, I really don't think so.

The distribution is absolutely fundamental, and _the_ reason why I use BK.

Now, it's true that in 90% of all cases (probably closer to 99%) you will
never see the really nasty cases Larry was talking about. People just
don't rename files that much, and more importantly: then whey do, they
very very seldom have anybody else doing the same.

But what are you going to do when it happens? Because it _does_ happen:
different people pick up the same patch or fix suggestion from the mailing
list, and do that as just a small part of normal development. Are the
tools now going to break down?

BK doesn't. That' skind of the point. Larry

> If the "maintainer" heirarchy was a strict tree structure, where you
> send patches to your parent, and receive them from your children, that
> doesn't seem to need anything particularly fancy to me.

But it's not, and the above would make BK much less than it is today.

On eof the things I always hated about CVS is how it makes it IMPOSSIBLE
to "work together" on something between two different random people. Take
one person who's been working on something for a while, but is chansing
that one final bug, and asks another person for help. It just DOES NOT
WORK in the CVS mentality (or _any_ centralized setup).

You have to either share the same sandbox (without any source control
support AT ALL), or you have to go to the central repository and create a
branch (never mind that you may not have write permissions, or that you
may not know whether it's going to ever be something worthwhile yet).

With BK, the receiver just "bk pull"s. And if he is smart, he does that
from a cloned repository so that after he's done with it he will just do a
"rm -rf" or something.

This is FUNDAMENTAL.

And yes, maybe the really hard cases are rare. But does that mean that you
aren't going to do it?

Linus

Martin J. Bligh

unread,

Mar 9, 2003, 12:49:01 PM3/9/03

to Zack Brown, Roman Zippel, Eric W. Biederman, Linus Torvalds, Larry McVoy, linux-...@vger.kernel.org

>> I think it's possible to get 90% of the functionality that most of us
>> (or at least I) want without the distributed stuff. If that's 10% of
>> the effort, would be really nice to have the auto-merging type of
>> functionality at least.
>
>> If I'm missing something fundamental here, it wouldn't suprise me ;-)
>
> I think the fundamental thing you're missing is that Linus doesn't want it. ;-)

Depends what your goal is ;-) I'm not on a holy quest to stop Linus using
Bitkeeper .... I'm just trying to make the non-Bitkeeper users' life a
little easier.

M.

Martin J. Bligh

unread,

Mar 9, 2003, 12:59:17 PM3/9/03

to Linus Torvalds, Roman Zippel, Eric W. Biederman, Zack Brown, Larry McVoy, linux-...@vger.kernel.org

>> I think it's possible to get 90% of the functionality that most of us
>> (or at least I) want without the distributed stuff.
>
> No, I really don't think so.
>
> The distribution is absolutely fundamental, and _the_ reason why I use BK.
>
> Now, it's true that in 90% of all cases (probably closer to 99%) you will
> never see the really nasty cases Larry was talking about. People just
> don't rename files that much, and more importantly: then whey do, they
> very very seldom have anybody else doing the same.
>
> But what are you going to do when it happens? Because it _does_ happen:
> different people pick up the same patch or fix suggestion from the mailing
> list, and do that as just a small part of normal development. Are the
> tools now going to break down?

I'm going to fix it by hand ;-) As long as it stops at a sensible point,
and clearly barfs and says what the problem is, that's fine by me.

> BK doesn't. That' skind of the point. Larry

Right ... I appreciate that. I'd just rather fix things up by hand 1% of
the time than use Bitkeeper myself. I'm not trying to stop *you* using
Bitkeeper by any stretch of the imagination ... you probably need the
heavyweight tools, but I'm OK without them.

> This is FUNDAMENTAL.
>
> And yes, maybe the really hard cases are rare. But does that mean that you
> aren't going to do it?

Yup, that's exactly what I'm saying. I'm not saying this as good as bitkeeper,
I'm saying it's "good enough" for me and I suspect several others (not saying
it's good enough for you), and significantly better than diff and patch.
(though cp -lR is *blindingly* fast, and diff understands hard links).

M.

Larry McVoy

unread,

Mar 9, 2003, 1:21:01 PM3/9/03

to Linus Torvalds, Martin J. Bligh, Roman Zippel, Eric W. Biederman, Zack Brown, Larry McVoy, linux-...@vger.kernel.org

> And yes, maybe the really hard cases are rare. But does that mean that you
> aren't going to do it?

This is sort of the point I've been trying to make for years. It is
unlikely that an open source project is going to solve these problems.
It's possible, but unlikely because the problems are rare and the code to
solve them is incredibly difficult. It isn't obvious at all, it wasn't
obvious to me the first time around, it's only after you've done it that
you can see how something that appeared really simple wasted 6 months.

In the open source model, the portion of the work which is relatively
easy gets done, but the remaining part only gets done if there is a
huge amount of pressure to do so. If you take a problem which occurs
only rarely, is difficult to solve, and has only a small set of users,
that's a classic example of something that just isn't going to get fixed
in the open source environment.

It's a lot different when you have a very small set of users and the
solutions are very expensive. I'm not saying that people don't solve hard
problems in open source projects, they do, the kernel is a good example.
The kernel also has millions of users, gets all sorts of friendly press
every day, and is fun. In the SCM space, there are hundreds of products
for a potential market that is about 4000 times smaller than the potential
market for the kernel.

SVN is a good example. They side stepped almost all of the problems
that BK solves and it was absolutely the right call. It would have cost
them millions to solve them and their product is free, it would take
decades to recoup the investment at the low rates they can charge for
support or bundling or hosting.

Going back to the engineering problems, those problems are not going to
get fixed by people working on them in their spare time, that's for sure,
it's just not fun enough nor are they important enough. Who wants to
spend a year working on a problem which only 10 people see in the world
each year? And commercial customers aren't going to pay for this either
if the model is the traditional open source support model. If you hit a
problem and it costs us $200K to fix it and you only hit it a few times
a year, if that, then you are not going to be OK with us billing you
that $200K, there isn't a chance that will work.

I'm starting to think that the best thing I could do is encourage Pavel &
Co to work as hard as they can to solve these problems. Telling them that
it is too hard is just not believable, they are convinced I'm trying to
make them go away. The fastest way to make them go away is to get them
to start solving the problems. Let's see how well Pavel likes it when
people bitch at him that BitBucket doesn't handle problem XYZ and he
realizes that he needs to take another year of 80 hour weeks to fix it.
Go for it, dude, here's hoping that we can make it as pleasant for you
as you have made it for us. Looking forward to it.

--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

Larry McVoy

unread,

Mar 9, 2003, 2:59:55 PM3/9/03

to Zack Brown, Martin J. Bligh, Roman Zippel, Eric W. Biederman, Linus Torvalds, Larry McVoy, linux-...@vger.kernel.org

On Sun, Mar 09, 2003 at 09:20:45AM -0800, Zack Brown wrote:
> People in the know hint at these features ("naming is really important"),
> but the details are apparently complicated enough that no one wants to sit
> down and actually describe them.

What part of "40 man years" did you not understand? Do you seriously
think that it is easy to "sit down and actually describe them"? And if
you think I would do so just so you can go try to copy our solution
you have to be nuts, of course we aren't going to do that. It took
us year to figure it out, we're still figuring things out every day,
if you want a free SCM you can bloody well go figure it out yourself.
The whole point of the non-compete clause in the well loved BK license
is to say "this stuff is hard. If you want to create a similar product,
do it without the benefit of looking at our product". That seems to be
lost on you and a lot of other people as well.

It's perfectly OK for you to go invent a new SCM system. Go for it.
But stop asking for help from the BK crowd. Not only will we not
give you that help, we will do absolutely everything we can to make
sure that you can't copy BK. Everything up to and including selling
the company to the highest bidder and letting them chase after you.

Get it through your thick head that BK is something valuable to this
community, even if you don't use it you directly benefit from its use.
All you people trying to copy BK are just shooting yourself in the foot
unless you can come up with a solution that Linus will use in the short
term. And nobody but an idiot believes that is possible. So play nice.
Playing nice means you can use it, you can't copy it. You can also
go invent your own SCM system, go for it, it's a challenging problem,
just don't use BK's files, commands, or anything else in the process.
We didn't have the benefit of copying something that you wrote, you
don't get the benefit of copying something we wrote.

You don't have to agree with us, you can do whatever you want, but do
so realizing that if you become too annoying we'll simple decide that
supporting the kernel isn't worth the aggravation. As for you armchair
CEO's who think we're racking in the bucks because of the kernel's usage
of BK, think again. That is not how sales are made in this space, sales
are made at the VP of engineering, CTO, CIO, and/or CEO level. If you
think those guys read this list or slashdot or care about the kernel
using BK, think again, they don't. All they care about it is how much
it costs and how much effort it will save them. And they all know that
their development model is dramatically different than that of the
kernel so any BK success here is of marginal interest at best.

BK is made available for free for one reason and one reason only: to
help Linus not burn out. That's based on my personal belief that he is
critical to success of the Linux effort, he is a unique resource and has
to be protected. I've paid a very heavy price for that belief and I'm
telling you that you are right on the edge of making that price too high.

--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

Roman Zippel

unread,

Mar 9, 2003, 3:02:27 PM3/9/03

to Linus Torvalds, Martin J. Bligh, Eric W. Biederman, Zack Brown, Larry McVoy, linux-...@vger.kernel.org

Hi,

On Sun, 9 Mar 2003, Linus Torvalds wrote:

> The distribution is absolutely fundamental, and _the_ reason why I use BK.

I agree, that this is an important aspect and for your kind of work it's
absolutely necessary.
But source management is more than just distributing and merging changes.
E.g. if I want to develop a driver, I would start like most people from a
stable base, that means 2.4. At a certain point the development splits
into a development and stable branch, eventually I also want to merge my
driver into the 2.5 tree.
This means I have to deal with 5 different source trees (branches), two
branches track external trees and I want to know what has been merged from
my development into my 2.4 and 2.5 stable branches, which I can use to
make official releases. I want to be able to push multiple changes as a
single change into the stable branches and it should be able to tell me
which changes are still left.
If there would be a free SCM system, which could do this, I could easily
do without a distributed option. Although I think as soon as it would be
this far it should be relatively easy to add a distribution mechanism (by
using a separate branch, which is only used for pulling changes). OTOH I
suspect that it will be very hard to add the other capabilities to bk
without a major redesign, as it's not a simple hierarchic structure
anymore.

bye, Roman

Zack Brown

unread,

Mar 9, 2003, 4:34:35 PM3/9/03

to Larry McVoy, Martin J. Bligh, Roman Zippel, Eric W. Biederman, Linus Torvalds, Larry McVoy, linux-...@vger.kernel.org

On Sun, Mar 09, 2003 at 11:58:52AM -0800, Larry McVoy wrote:
> On Sun, Mar 09, 2003 at 09:20:45AM -0800, Zack Brown wrote:
> > People in the know hint at these features ("naming is really important"),
> > but the details are apparently complicated enough that no one wants to sit
> > down and actually describe them.
>

> It's perfectly OK for you to go invent a new SCM system. Go for it.
> But stop asking for help from the BK crowd.

I haven't been asking you for help. I've been asking Linus and other
kernel developers to describe their needs. There seems to be three
camps in this discussion:

1) the people who feel that the hard problems solved by BitKeeper are
crucial

2) the people who feel that the hard problems are not that important,
and that a decent feature set could be designed to handle pretty much
everything anyone might normally need

3) the people who want features that are not really related to finding a
BitKeeper alternative.

My own opinion is that the people in camp (2) are falling into the trap which
has been described often enough, in which they will realize their design
mistakes too late to do anything about them. Whil the people in camp (3)
seem to be getting ahead of the game. The features they want are all great,
but the question of the basic structure still remains.

I think what needs to be done is to identify the hard problems, so that
any version control project that starts up can avoid mistakes that will
put a glass ceiling over their heads. Even if they choose not to implement
everything, or if they choose to implement features orthogonal to a real
BitKeeper alternative, they would still have the proper framework to raise
the project to the highest level later.

Of kernel developers, only Linus seems to have a clear idea of what the kernel
development process' needs are; but aside from insisting that distribution
is key (which people in camp (1) know already), he hasn't gone into the kind
of detail that folks would need in order to actually make a decent attempt.

Be well,
Zack

--
Zack Brown

Valdis.K...@vt.edu

unread,

Mar 9, 2003, 4:56:25 PM3/9/03

to Zack Brown, Larry McVoy, Martin J. Bligh, Roman Zippel, Eric W. Biederman, Linus Torvalds, Larry McVoy, linux-...@vger.kernel.org

On Sun, 09 Mar 2003 13:32:46 PST, Zack Brown <zbr...@tumblerings.org> said:

> Of kernel developers, only Linus seems to have a clear idea of what the kerne
l
> development process' needs are; but aside from insisting that distribution
> is key (which people in camp (1) know already), he hasn't gone into the kind
> of detail that folks would need in order to actually make a decent attempt.

It's quite possible that even Linus doesn't have a clear cognitive grasp of
all the problems - Larry gave BK to Linus to prevent burn-out. I'd not be
surprised if Linus was so busy dealing with the *first* order problems in
the pre-BK world (just getting patches to apply to his tree) that he never
encountered all the 'tough problems', and once he started using BK, he
also never hit any of the 'tough problems' because Larry's crew had already
spent 40 man-years making sure Linus *didnt* hit them.

fs

unread,

Mar 9, 2003, 6:20:38 PM3/9/03

to Larry McVoy, Linus Torvalds, Martin J. Bligh, Roman Zippel, Eric W. Biederman, Zack Brown, Larry McVoy, linux-...@vger.kernel.org

On Sun, Mar 09, 2003 at 10:20:09AM -0800, Larry McVoy wrote:
> In the open source model, the portion of the work which is relatively
> easy gets done, but the remaining part only gets done if there is a
> huge amount of pressure to do so. If you take a problem which occurs
> only rarely, is difficult to solve, and has only a small set of users,
> that's a classic example of something that just isn't going to get fixed
> in the open source environment.

You are wrong. The choice of you and your team for a license is well respected
here both by the tree maintainer and its users, but we don't need to go
further into pissing on open source projects because your project wouldn't
make it if it was. I(an almost anonymous reader), and most here respect both
your work and your honesty in describing why you did it commercial but this
is one thing, and generalizing is another.

The Linux kernel by itself is a good example. It has code for things
that Microsoft will create when people need it in great extend like
ipv6, encryption API and IA-64/x64 support. Well, the examples are
numerous and I'm sure some experienced hackers can enlighten you
better.

The Grub bootloader is another example. An Open Source project that
provides support for almost any kernel there exists having command line
and autocomplete support on demand. Features that *nobody asked* but
they exist.

More experienced people on open source projects I'm sure will say "wtf,
there are plenty of better examples".

And think it otherwise. If a closed source project is more advanced on
something is a result of what *its* users want. If Microsoft is better on GUI
is a result of what its users want. The Open Source operating systems
are traditionally (as for the past 10 years) better on networking and
multiuser capabilities because what's what users want.

That of course comes into you words but the fact that most closed source
projects are indeed follow what their users want, that doesn't make a
difference.

So, if your project is better that's another thing. If you and team chose
to make it commercial is well respected and understood. More understood
is the fact that you actuall *spend money* on it. It is a fundamental
right of yours to do what you want with your code especially when it is
a matter of personal economic health. But getting it generalised and
say that every open source project is just a hobbyish thing that is
always inferior to closed source unless 2^64 people ask for a feature?

no sir, real examples show things different.

-fs

Larry McVoy

unread,

Mar 9, 2003, 6:29:37 PM3/9/03

to Valdis.K...@vt.edu, Zack Brown, Larry McVoy, Martin J. Bligh, Roman Zippel, Eric W. Biederman, Linus Torvalds, linux-...@vger.kernel.org

Bingo. We work hard to make sure that we've thought of and solved the
problems *before* they are hit in the field. We try to be proactive,
not reactive (at least in coding, mailing lists are another matter).
We're not that great at it, but we've definitely solved all sorts of
problems long before Linus did anything to hit them.

--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

Petr Baudis

unread,

Mar 9, 2003, 7:03:35 PM3/9/03

to Zack Brown, Larry McVoy, Linus Torvalds, linux-...@vger.kernel.org

Dear diary, on Sun, Mar 09, 2003 at 03:45:22AM CET, I got a letter,
where Zack Brown <zbr...@tumblerings.org> told me, that...

> On Sat, Mar 08, 2003 at 04:05:14PM -0800, Larry McVoy wrote:
> > Zack Brown wrote:
> > > Linus Torvalds wrote:
> > > > Give it up. BitKeeper is simply superior to CVS/SVN, and will stay that
> > > > way indefinitely since most people don't seem to even understand _why_
> > > > it is superior.
> > >
> > > You make it sound like no one is even interested ;-). But it's not true! A
> > > lot of people currently working on alternative version control systems would
> > > like very much to know what it would take to satisfy the needs of kernel
> > > development.
> >
> > [Long rant, summary: it's harder than you think, read on for the details]
> [skipping long description]
>
> OK, so here is my distillation of Larry's post.

I've decided to elaborate a little more how BK in fact works for those who
don't use it and don't want to read over all the documentation, and also share
some thoughts and possible solutions of the individual problems.

All this is derived from various LKML threads and BK.com's documentation as I'm
not permitted to use BK myself, corrections are more than welcome.

> Basic summary: a distributed, replicated, version controlled user level file
> system with no limits on any of the file system events which may happened
> in parallel. All changes must be put correctly back together, no matter how
> much parallelism there has been.

[in the following text, "checkin" and "commit" are not inter-exchangable;
"checkin" means to one-time get some changes to one file, "commit" means to
form a changeset from several checked in changes in several files; this mirrors
BK's semantics]

I'd add

* ChangeSets.

at the top. Unlike ie. SVN, changes checkins and changesets commits are
separated in BK and that sounds as a good thing to do --- it encourages people
to checkin more frequently and group a changeset from the uncommitted changes
when the changes are finished and good enough. See also
http://www.bitkeeper.com/UG/Getting.Working.Checking.html. Basically, you
checkin files as you want and the checkins to individual files are independent.
When you finish some logical change over several files, you use bk commit and
the checkins which aren't part of any changeset yet are automagically grouped
to one, you write a summary comment of the changeset and then ChangeSet
revision number will increase and somewhere will be written down which checkins
are part of this ChangeSet. One changeset is then an atomic unit when sharing
the changes with others, that is you must form one in order to make the changes
available.

The more-or-less opposite concept is to have each checkin(s, when you checking
multiple files at once) as a changeset (this is what SVN does) --- then you
don't need per-file revision numbers but just one per-repository revision
number which is increased by each checkin (which is also commit in SVN). This
can seem more elegant and generic, but I personally believe that it's better to
have release checkins and changeset commits separated. Then per-repository
revision numbers should obviously increase by each commit, not checkin.

In BK, you usually work with the changeset numbers, but for the internal
structure the revision numbers are also important. Since changeset number can
be taken as revision number of the ChangeSet metafile, I will operate mostly
with revision numbers below.

> * Merging.
>
> * The graph structure.

About these two, it could be worth noting how BK works now, how looks branching
and merging and how could it be done internally.

When you want to branch, you just clone the repository --- each clone
automatically creates a branch of the parent repository. Similiarly, you merge
the branch by simply pulling the "branch repository" back. This way the
distributed repositories concept is tightly connected with the branch&merge
concept. When I will talk about merging below, it doesn't matter if it will
happen from the cloned repository just one directory far away or over network
from a different machine.

[note that the following is figured out from various resources but not from the
documentation where I didn't find it; thus I may be well completely wrong here;
please substitute "is" by "looks to be", "i think" etc in the following text]

BK works with a DAG (Directed Acyclic Graph) formed from the versions, however
the graph looks differently from each repository (diagrams show ChangeSet
numbers).

From the imaginary Linus' branch, it looks like:

linus 1.1 -> 1.2 -> 1.3 -----> 1.4 -> 1.5 -----> 1.6 -----> 1.7
\ / \ /
alan \-> 1.2.1.1 --/---\-> 1.2.1.2 -> 1.2.1.3 --/

But from the Alan' branch, it looks like:

linus 1.1 -> 1.2 -> 1.2.1.1 -> 1.2.1.2 -> 1.2.1.3 -> 1.2.1.4 -> 1.2.1.5
\ / \ /
alan \-> 1.3 ------/---\-----> 1.4 -----> 1.5 ------/

But now, how does merging happen? One of the goals is to preserve history even
when merging. Thus you merge individual per-file checkins of the changeset
one-by-one, each checkin receiving own revision in the target tree as well ---
this means the revision numbers of individual checkins change during merge if
there were other checkins to the file between fork and merge.

But it's a bit more complicated: ChangeSet revision number is not globally
unique neither and it changes. You cannot decide it to be globally unique
during clone, because then you would have to increase the branch identifier
after each clone (most of them are probably just read-only). Thus in the cloned
repository, you work like you would continue in the branch you just cloned, and
the branch number is assigned during merge.

A virtual branch (used only to track ChangeSets, not per-file revisions) is
created in the parent repository during merge, where the merged changesets
receive new numbers appropriate for the branch. However the branch is really
only virtual and there is still only one line of development in the repository.
If you want to see the ChangeSets in order they were applied and present in the
files, you have not to sort them by revision, but by merge time. Thus the order
in which they are applied to the files is (from Linus' POV):

1.1 1.2 1.3 1.2.1.1 1.4 1.5 1.6 1.2.1.2 1.2.1.3 1.7

> * Distributed rename handling. Centralized systems like Subversion don't
> have as many problems with this because you can only create one file in
> one directory entry because there is only one directory entry available.
> In distributed rename handling, there can be an infinite number of different
> files which all want to be src/foo.c. There are also many rename corner-cases.

One obvious solution is hitting me here. First, you virtualize files to inodes
and give them numbers (in practice it's not necessary and in fact it could be
better not to do that, but it can be much easier to think about it as if it
would be this way) --- the numbers don't have to be globally unique, they are
just convience abstraction; they are inherited upon clone, though. Then in
repository you have each file name being just that inode number, and for each
inode you keep history of names it had and in which revision the name was
assigned (thus you also know in what changeset it was assigned).

When you are merging an "inode", you just go back to the last common ChangeSet
revision in the names history and look what the name is. If there's no name for
that changeset yet, it's a new file and if there's filename conflict, you
cannot do much with it. Otherwise you know that the inode number has to be same
for both repositories. Then you just do the rename of inode in the target
repository to the current name in the source repository. If there is a
conflict, you check if you can't repeat this whole operation on the file in the
way in the target repository --- if not (or you can but the conflict was not
solved anyway), you again probably cannot do much with this again and you have
to let the user decide.

What am I missing?

> * Symbolic tags. This is adding a symbolic label on a revision. A distributed
> system must handle the fact that the same symbol can be put on multiple
> revisions. This is a variation of file renaming. One important thing to
> consider is that time can go forward or backward.

You remap the tags when you remap the changeset numbers, and? BK seems to allow
one tag to be on multiple changesets and I presume that then the latest one is
normally used --- you can do the similiar here, the latest such-named tag is
used normally, the merged ones are just preserved in the history.

> * Security semantics. Where should they go? How can they be integrated
> into the system? How are hostile users handled when there is no central
> server to lock down?

I'm not sure which points exactly this attempts to bring up. Which particular
issues are open here? This is mostly question of configuration of individual
repositories (if you allow push and from who) and trust (if you will do pull
and from who), isn't it?

> * Time semantics. A distributed system cannot depend on reported time
> being correct. It can go forward or backward at any rate.

Yes, then let's not depend on the time ;-).

Kind regards,

--

Petr "Pasky" Baudis
.
When in doubt, use brute force.
-- Ken Thompson
.
Crap: http://pasky.ji.cz/

Larry McVoy

unread,

Mar 9, 2003, 7:33:32 PM3/9/03

to Zack Brown, Linus Torvalds, linux-...@vger.kernel.org

> What am I missing?

Nothing that a half of decade of work wouldn't fill in :)

More seriously, lots. I keep saying this and people keep not hearing it,
but it's the corner cases that get you. You seem to have a healthy grasp
of the problem space on the surface but in reading over your stuff, you
aren't even where I was before BK was started. That's not supposed to be
offensive, just an observation. As you try out the ideas you described
you'll find that they don't work in all sorts of corner cases and the
problem is that there are a zillion of them. And the solutions have
this nasty habit of fighting with each other, you solve one problem
and that creates another.

The thing we've found is that this problem space is much bigger than one
person can handle, even an exceptionally talented person. The number of
variables are such that you can't do it in your head, you need to have a
muse for each area and both of you have to be thinking about it full time.

This isn't a case of "oh, I get it, now I'll write the code". It's a
case of "write the code, deploy the code, get taught that it didn't work,
get the insight from that, write new code, repeat". And the problems are
such that if you aren't on them all the time then you work very slowly,
99% of the work is recreating the state you had in your brain the last
time you were here.

I strongly urge you to wander off and talk to people who are actually
writing code for real users. Arch, SVN, CVS, whatever. Get deeply
involved and understand their choices. Personally, I'd suggest the SVN
guys because I think they are the most serious, they have a team which
has been together for a long time and thought hard about it. On the
other hand, Arch is probably closer to mimicing how development really
happens in the real world, in theory, at least, it can do better than BK,
it lets you take out of order changesets and BK does not. But it is light
years behind SVN in terms of actually working and being a real product.
SVN is almost there, they are self hosting, they eat their own dog food,
Arch is more a collection of ideas and some shell scripts. From SVN,
you're going to learn more of the hard problems that actually occur,
but Arch might be a better long term investment, hard to say.

--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

Horst von Brand

unread,

Mar 9, 2003, 10:42:59 PM3/9/03

to Zack Brown, Larry McVoy, linux-...@vger.kernel.org

Zack Brown <zbr...@tumblerings.org> said:

[...]

> I'd be willing to maintain this as the beginning of a feature list and
> post it regularly to lkml if enough people feel it would be useful and not
> annoying. The goal would be to identify the features/problems that would
> need to be handled by a kernel-ready version control system.

I believe that has very little relevance to lkml, only perhaps to a mailing
list for a bk replacement. For the kernel this work has already been done
(by Larry and the head penguins).

--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

Jamie Lokier

unread,

Mar 10, 2003, 8:53:53 AM3/10/03

to Horst von Brand, Zack Brown, Larry McVoy, linux-...@vger.kernel.org

Horst von Brand wrote:

> Zack Brown <zbr...@tumblerings.org> said:
> > I'd be willing to maintain this as the beginning of a feature list and
> > post it regularly to lkml if enough people feel it would be useful and not
> > annoying. The goal would be to identify the features/problems that would
> > need to be handled by a kernel-ready version control system.
>
> I believe that has very little relevance to lkml, only perhaps to a mailing
> list for a bk replacement. For the kernel this work has already been done
> (by Larry and the head penguins).

I'd like to thank those kind souls who explained how branch _and_
merge history is used by the better merging utilities. Now I see why
tracking merge history is so helpful. (Tracking it for credit and
blame history was obvious, but tracking it to enable tools to be
better at resolving conflicts was not something I'd thought of).

Of course there will be times when two or more people apply a patch
without the history of that patch being tracked, and then try to merge
both changes - any version control system should handle that as
gracefully as it can. However I now see how much actively tracking
the history of those operations can help tools to reduce the amount of
human effort required to combine changes from different places.

So thank you for illustrating that.

ps. Yes I know that CVS sucks at these things. I've seen _awful_
software engineering disasters due to the difficulty of tracking
different lines of development through CVS, first hand :)

-- Jamie