Merging troubles?

Koen

Jun 13, 2003, 6:59:44 AM

Could someone help me out on this one please?

I'm trying to understand why branching / merging gives some strange
behavior (well, strange to me at least).

Say I have a file "blabla.cpp" in my cvs repository which was added on
the main branch.

Then I made a new branch where I could work on extensions without
disturbing the other developers (say "KoenWork_branch"). So now I've
done the extensions/changes on the file and committed the file on my own
working branch.

After testing successfully (let's say that no other files were
added/changed in the main branch for simplicity) I of course want to
merge the changes into the main branch. So, I go to a checked out copy
of the main branch and do "CVS --> merge" from branch "KoenWork_branch"
(using TortoiseCVS, which uses: "cvs -q update -j KoenWork_branch"). I
get to see the changes as expected, and I commit them on the main branch.

After that, people start working on new files or change existing files,
all committed on the main branch.

Then I need to do another change/extension to some of the files in the
main branch, so I intend to do that on my working branch (again not to
disturb the others while working on it, and still have a way to commit
intermediate versions when parts of the changes/extensions are
finished). So, now I first need to make sure I have all the latest
changes from the main branch in my working branch, and I do a "CVS -->
merge" from branch "HEAD" on my checked out working branch copy ("cvs -q
update -j HEAD").

Now, the trouble I have with that is that the merge shows that my
"blabla.cpp" file has changed, although I didn't touch it at all (not in
the main branch, and not in the working branch) since I merged it from
the working branch in the main branch and committed it in the main
branch?!

Why is that, and how can I avoid this?

I mean: it surely can't be that for the rest of the life span of my
repository I will get to see that a file changed, but in fact didn't,
right?

Any ideas / help?

Koen

Pierre Asselin

Jun 13, 2003, 8:53:04 PM

Do yourself a favor and abandon your branch after you've merged it.
If you need to isolate yourself again, start a new branch.

For branches that continue after the merge (for example a bugfix branch
to a shipped release) you need to maintain tags so you don't later
merge the same changes twice.
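
A minimal sketch of that tag bookkeeping for one-directional merges (branch to trunk only). The function names, the `CVS_BIN` override variable and the tag and module names in the usage note are illustrative assumptions; `cvs rtag -r` and `cvs update -j` are the actual cvs commands being wrapped.

```shell
#!/bin/sh
# Sketch only: helper names, CVS_BIN and all tag names are hypothetical.
# CVS_BIN can be overridden for a dry run; it defaults to the real client.
CVS_BIN="${CVS_BIN:-cvs}"

# Freeze the current tip of a branch under an ordinary (non-branch) tag.
# usage: mark_branch_tip BRANCH MARKER_TAG MODULE
mark_branch_tip() {
    "$CVS_BIN" -q rtag -r "$1" "$2" "$3"
}

# Run inside a trunk sandbox: apply only the branch changes made between
# two marker tags, so changes merged earlier are not applied a second time.
# usage: merge_between OLD_MARKER NEW_MARKER
merge_between() {
    "$CVS_BIN" -q update -j "$1" -j "$2"
}
```

For example, a first merge could be `mark_branch_tip bugfix bugfix_merged_1 mymodule`, then `cvs update -j bugfix_merged_1` in the trunk sandbox and a commit. A later merge would be `mark_branch_tip bugfix bugfix_merged_2 mymodule`, then `merge_between bugfix_merged_1 bugfix_merged_2` and a commit.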

Under no circumstance should you merge bidirectionally, from trunk
to branch as well as from branch to trunk. To do this right, you
need to maintain too many tags and it's just too easy to slip.
I find it easier to rejuvenate the branch by re-grafting it higher
up on the trunk.

Koen

Jun 15, 2003, 5:41:09 PM

"Pierre Asselin" <p...@invalid.invalid> wrote in message
news:glrdcb...@brick.verano.sba.ca.us...

Now, I'm a bit confused...
What do you mean (in CVS terms and in terms of CVS commands) by:
1. "abandon a branch"
2. "rejuvenate the branch by re-grafting it higher up on the trunk."

Also, "If you need to isolate yourself again, start a new branch" seems a
bit strange to me...
What I want to do, is make sure other people are not disturbed while I'm
making changes/enhancements to some files in the source tree. So, I create a
branch. Until here, you're saying the same thing. The difference is that you
seem to advise me to create a new branch each time I need to make a change.
Now, this happens to be about twice a week or so as we're in the early
stages of development, so that would mean that within 2 months, there will
be about 16 branches in the repository? Can't I just create one branch
called KoenWork once, and then start off that branch for doing my changes?
I thought that I needed to merge the MAIN branch into my branch whenever I
wanted to do some new changes starting from the current state of the MAIN
branch. You seem to tell me not to do that, but create a new branch (with a
new name?) instead. Or do you mean I should create a new branch *each time
with the same name* (didn't know that was possible)?

Sorry if these questions seem strange, but I'm still a bit in a learning
phase with CVS...

Koen

Peter S. Shenkin

Jun 16, 2003, 8:56:34 PM

"Koen" <n...@ssppaamm.com> wrote in message news:<3eece7f7$0$28025$ba62...@reader1.news.skynet.be>...

I'm not Pierre, but...

> Now, I'm a bit confused...
> What do you mean (in CVS terms and in terms of CVS commands) by:
> 1. "abandon a branch"

Stop working in that branch.

> 2. "rejuvenate the branch by re-grafting it higher up on the trunk."

Create a new branch.

I think Pierre is advocating the following, where B is
a branch and O---M means a merge from the O branch to
the M branch:

main_trunk  ----B--O-----O--M-----B---O---
                |  |     |  |     |   |   etc.
your_branch     |--M-----M--O     |---M---

You can manage two-way merges if you're doing it all yourself
and being very careful, but something like the above says
"keep your own branch up-to-date based on main_trunk changes;
but after you merge a body of work back to the main_trunk,
start a new branch for the next body of work."

-P.

Pierre Asselin

Jun 16, 2003, 11:55:32 PM

Peter S. Shenkin <she...@mindspring.com> wrote:
> "Koen" <n...@ssppaamm.com> wrote in message news:<3eece7f7$0$28025$ba62...@reader1.news.skynet.be>...

> I'm not Pierre, but...

>> Now, I'm a bit confused...
>> What do you mean (in CVS terms and in terms of CVS commands) by:
>> 1. "abandon a branch"

> Stop working in that branch.

Right.

>> 2. "rejuvenate the branch by re-grafting it higher up on the trunk."

> Create a new branch.

> I think Pierre is advocating the following, where B is
> a branch and O---M means a merge from the O branch to
> the M branch:

> main_trunk  ----B--O-----O--M-----B---O---
>                 |  |     |  |     |   |   etc.
> your_branch     |--M-----M--O     |---M---
                              ^
That can work, but this ^ merge will give you a bit of trouble
because now your branch contains changes that are already on the trunk.
You're likely to get nuisance conflicts where the text is identical
on both sides.

> You can manage two-way merges if you're doing it all yourself
> and being very careful, but something like the above says
> "keep your own branch up-to-date based on main_trunk changes;
> but after you merge a body of work back to the main_trunk,
> start a new branch for the next body of work."

Actually, I advocate *not* keeping your branch up to date with the trunk,
so the final merge will be easier. If you need the latest and greatest,
you should probably work on the trunk. If you work on a branch, and you
must have your reasons, you have to accept a certain amount of isolation.

Here's what I mean by "rejuvenate", using Peter's notation:

    -----B--------------------------B--
         |                          |
         +---------------O - - - -> M--

See how this works? The first "B" is the start of your branch, which
isn't ready to merge to the trunk yet but is beginning to seriously
lag behind. Solution: return to the trunk and create a new branch,
the second "B". Switch your sandbox to that branch, which is still
identical to the trunk. Now merge the old branch, commit, and declare
the old branch dead.

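    cvs update -A              # return the sandbox to the trunk
    cvs tag newbranch_start    # mark where the new branch starts
    cvs tag -b newbranch       # create the new branch at that point
    cvs update -r newbranch    # switch the sandbox to the new branch
    cvs update -j oldbranch    # merge the old branch's changes into it
    (fix conflicts)
    cvs commit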

Koen

Jun 17, 2003, 12:15:08 PM

"Pierre Asselin" <p...@invalid.invalid> wrote in message

news:kf3mcb...@brick.verano.sba.ca.us...

> Peter S. Shenkin <she...@mindspring.com> wrote:
> > "Koen" <n...@ssppaamm.com> wrote in message news:<3eece7f7$0$28025$ba62...@reader1.news.skynet.be>...
>
> > I'm not Pierre, but...
>
> >> Now, I'm a bit confused...
> >> What do you mean (in CVS terms and in terms of CVS commands) by:
> >> 1. "abandon a branch"
>
> > Stop working in that branch.
>
> Right.

So, no "single working branch with possible subbranches" for each developer
then...

> >> 2. "rejuvenate the branch by re-grafting it higher up on the trunk."
>
> > Create a new branch.
>
> > I think Pierre is advocating the following, where B is
> > a branch and O---M means a merge from the O branch to
> > the M branch:
>
> > main_trunk  ----B--O-----O--M-----B---O---
> >                 |  |     |  |     |   |   etc.
> > your_branch     |--M-----M--O     |---M---
>                               ^
> That can work, but this ^ merge will give you a bit of trouble
> because now your branch contains changes that are already on the trunk.
> You're likely to get nuisance conflicts where the text is identical
> on both sides.

Hmmm... That's exactly the problem I had indeed, and that's my whole point:
why by any common sense should I get a conflict here? CVS did get all the
information to know that the two files are exactly the same right? Is this
really normal? And WHY (I just ask because it doesn't seem to be normal to
me ;-) )?

> Actually, I advocate *not* keeping your branch up to date with the trunk,
> so the final merge will be easier. If you need the latest and greatest,
> you should probably work on the trunk.

And how can you do intermediate commits until you're really finished (like
for when you're working on your code at different physical locations)
without disturbing the other developers then?

Strange, I have read in several places that "the way to do it right" is to
work on a separate branch and before putting your changed code back into the
trunk, you should get all the latest updates from the trunk, test your code
with it, verify compilation and semantics (possibly a review by others), and
only then merge your code into the trunk (which seems reasonable to me).
This seems to be in conflict with the advice to work on the trunk (even if
it's only when you want the latest and greatest).

Don't get me wrong: I'm not really experienced with branching/merging, so I
can only trust my common sense here (but that's no guarantee ;-) )...

> If you work on a branch, and you
> must have your reasons, you have to accept a certain amount of isolation.
>
> Here's what I mean by "rejuvenate", using Peter's notation:
>
>     -----B--------------------------B--
>          |                          |
>          +---------------O - - - -> M--
>
> See how this works? The first "B" is the start of your branch, which
> isn't ready to merge to the trunk yet but is beginning to seriously
> lag behind. Solution: return to the trunk and create a new branch,
> the second "B". Switch your sandbox to that branch, which is still
> identical to the trunk. Now merge the old branch, commit, and declare
> the old branch dead.
>
> cvs update -A
> cvs tag newbranch_start
> cvs tag -b newbranch
> cvs update -r newbranch
> cvs update -j oldbranch
> (fix conflicts)
> cvs commit

Yes, I see what you mean: instead of merging from the trunk into the branch,
you create a new branch in the trunk and merge the old branch into the new
one. And then you only merge into the trunk when all work for which the
branch was created is finished.
You will end up with lots of "dead" branches this way, no? Do you remove
these after creating the new branch (if so, how)? (this also seems to answer
my question about whether you can create a new branch with a name that
already exists: no, otherwise you can't refer to the old branch anymore)

I guess my idea of making a branch for each developer (say "b_dev1") and let
him work on that branch and merge from there looked like this:

trunk   ----B----------------------O--M-----
            |                      |  |
b_dev1      +-B-------M--B------M--M--O-----
              |       |  |      |
              +-------O  +------O
              feature1   feature2

If I understand your reasoning, I guess you would say not to do the last
merge from trunk into b_dev1, but create a new branch instead (so I'll get
something like "b_dev1_1", "b_dev1_2", "b_dev1_3", "b_dev1_4", ... and so on
and for each developer).

Maybe I should try your method for a while and see what works best. If it
wasn't for the "false conflicts" I think I'd like the method I've shown best
(I just wouldn't expect any conflict from that last merge, but that doesn't
seem to be true...)
We'll see.

Any other advice from other experienced CVS users? Maybe other people have
other habits for some reason?

Thanks for taking the time to explain this!

Koen

Peter S. Shenkin

Jun 18, 2003, 7:13:36 PM

> Peter S. Shenkin <she...@mindspring.com> wrote:
> > "Koen" <n...@ssppaamm.com> wrote in message news:<3eece7f7$0$28025$ba62...@reader1.news.skynet.be>...

> >> 2. "rejuvenate the branch by re-grafting it higher up on the trunk."

>
> > Create a new branch.
>
> > I think Pierre is advocating the following, where B is
> > a branch and O---M means a merge from the O branch to
> > the M branch:
>
> > main_trunk  ----B--O-----O--M-----B---O---
> >                 |  |     |  |     |   |   etc.
> > your_branch     |--M-----M--O     |---M---
>                               ^
> That can work, but this ^ merge will give you a bit of trouble
> because now your branch contains changes that are already on the trunk.
> You're likely to get nuisance conflicts where the text is identical
> on both sides.

I don't think so. You need to keep a tag at the last merge
point in any branch from which you're merging. Thus, at the last
O in the main_trunk before the ^ merge you'd have a tag called,
say, "last_merge_from_main_trunk". Then you'd do the ^ merge
as:

cd main_trunk
cvs update -kk -dP -j last_merge_from_main_trunk -j your_branch
cvs commit
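
The bookkeeping behind that command could be wrapped as below. This is a sketch under assumptions: the function names and the `CVS_BIN` variable are made up for illustration, and it presumes the branch has already absorbed everything on the trunk up to the marker tag.

```shell
#!/bin/sh
# Sketch only: function names and CVS_BIN are hypothetical.
CVS_BIN="${CVS_BIN:-cvs}"
MARKER=last_merge_from_main_trunk

# Run in a trunk sandbox right before merging the trunk into your branch:
# move the marker tag to the trunk revisions that are about to be merged.
mark_trunk_merge_point() {
    "$CVS_BIN" -q update -A &&
    "$CVS_BIN" -q tag -F "$MARKER"
}

# Run in a trunk sandbox: apply the difference between the marked trunk
# state and the branch tip, i.e. only the work done on the branch itself.
# usage: merge_branch_into_trunk BRANCH
merge_branch_into_trunk() {
    "$CVS_BIN" -q update -kk -dP -j "$MARKER" -j "$1"
}
```

After `merge_branch_into_trunk your_branch`, resolve any conflicts and run `cvs commit` as in the commands above.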

But for the use that Koen has in mind, it's not actually
clear that he needs a branch at all. If only he is working
on his own revisions, he can just keep his revisions in his
working repository until he's ready to do a checkin, and
then he can check them into the main trunk.

The only reason to create a branch is to allow multiple people
to work in it.

-P.

Pierre Asselin

Jun 18, 2003, 11:15:12 AM

There's a lot to cover here. I'll shuffle the order a bit.

Koen <n...@ssppaamm.com> wrote:

> Strange, I have read in several places that "the way to do it right" is to
> work on a separate branch and before putting your changed code back into the
> trunk, you should get all the latest updates from the trunk, test your code
> with it, verify compilation and semantics (possibly a review by others), and
> only then merge your code into the trunk (which seems reasonable to me).
> This seems to be in conflict with the advice to work on the trunk (even if
> it's only when you want the latest and greatest).

It depends. Your site can choose to work that way if it suits you better.
Personally, I'm used to working on the trunk with others, committing
early and committing often, and my advice assumed that model. There was
a long thread last October in c.s.config-mgmt (I think) titled "Branching
philosophy" or some such. Find it in groups.google.com and have fun.

Here's the tradeoff as I remember it.
Shared trunk:
*) Programmers interfere with one another.
*) Conflicts are resolved immediately, no integration issues.
Branch per programmer:
*) Programmers work independently.
*) Integration is delayed to merge time.

Working on the trunk, your group fully utilizes the "C" in "CVS" ==
"Concurrent Versions System". It's scary at first, but I always found
it to work incredibly well. True conflicts are rare, and communications
improve. Programmers even end up fixing each other's mistakes.

Working on a branch per programmer, your group loses the constant
communication and you use only the merge functionality. One of you
becomes system integrator and is responsible for merging the branches
to the trunk.

You've been doing the merges, right? Two questions for you: 1) Aren't
those merges scary? 2) Aren't they in fact easy to do? Well, the same
thing happens on a shared trunk. The constant updates are scary, but
they are even easier than branch merges because they are smaller.

I do create personal branches when I'm about to create badly broken code
(as opposed to working but incomplete code) and I don't want to really
commit it, but I still want the CVS safety net. I try to keep those
branches short because the constant interaction on the trunk is so
incredibly valuable.

Also, if you have a sizeable staff, say more than a dozen people, it can
become impossible to squeeze in your commits edgewise. In this case I
would create task branches, but I would still assign more than one staff
member per branch. (And maybe more than one branch to some people, too!)

> I guess my idea of making a branch for each developer (say "b_dev1") and let
> him work on that branch and merge from there looked like this:

> trunk   ----B----------------------O--M-----
>             |                      |  |
> b_dev1      +-B-------M--B------M--M--O-----
>               |       |  |      |
>               +-------O  +------O
>               feature1   feature2

> If I understand your reasoning, I guess you would say not to do the last
> merge from trunk into b_dev1, but create a new branch instead (so I'll get
> something like "b_dev1_1", "b_dev1_2", "b_dev1_3", "b_dev1_4", ... and so on
> and for each developer).

That's one way to do it, but I would advise creating feature branches
directly off the trunk. Your diagram implies that only one programmer
works on each feature, so operationally it wouldn't be a big change.
Your developers may fight the idea of losing "their" branches where they
are kings, but that's an education issue. Ask yourselves how disruptive
it is to return to the wild trunk, compared to merging a bunch of wild
changes from the trunk?

> Maybe I should try your method for a while and see what works best. If it
> wasn't for the "false conflicts" I think I'd like the method I've shown best
> (I just wouldn't expect any conflict from that last merge, but that doesn't
> seem to be true...)
> We'll see.

The spurious conflicts are an imperfection of the tool. It doesn't
keep track of past merges, so it doesn't pick the best common ancestor
on which to base the three-way merge. Even so the diff3 engine should
detect that two changes are identical and merge them silently, but I
think it gets confused when the changes occur in different locations in
the two branches.

Only you can decide how to use CVS, but the spurious conflicts are an
indication that your usage is not what the original developers had in
mind...

> You will end up with lots of "dead" branches this way, no? Do you remove
> these after creating the new branch (if so, how)?

Nah, let them be. If you want to delete the branch tags, there's "cvs
tag -d". If you want to wipe out the actual revisions on the branches,
there's "cvs admin -o". Think twice about doing that, sooner or later
you'll delete something you shouldn't.

> (this also seems to answer
> my question about whether you can create a new branch with a name that
> already exists: no, otherwise you can't refer to the old branch anymore)

You can't have two branches with the same name for the reason you
indicated, but you can reuse the names. Newer releases of cvs let you
create a new name for a branch tag, but on earlier releases you can do

cvs admin -nnew_branch_name:old_branch_name

or even

cvs admin -Nnew_branch_name:old_branch_name

if "new_branch_name" already exists and you want to reseat it.
You can then delete the "old_branch_name" with a normal "cvs tag -d",
effectively renaming the branch and making the old name available.

> So, no "single working branch with possible subbranches" for each developer
> then...

I like my branches short-lived. You want one long-lived branch per
developer. It took me years to figure out a way to maintain long-lived
parallel branches, and I don't like it. You could say it's a limitation
of CVS that this is so hard to do. My idea of re-grafting the branch
further up the trunk was a workaround for branches that stretch longer
than planned.

Koen

Jun 23, 2003, 5:22:40 AM

"Pierre Asselin" <p...@invalid.invalid> wrote in message

news:0mvpcb...@brick.verano.sba.ca.us...

> There's a lot to cover here. I'll shuffle the order a bit.

Thanks *a lot* for the extended info and thoughts!
It's this kind of thing that is really helpful in practice (after
going through a reference manual first).

Koen

Jun 23, 2003, 5:36:00 AM

"Peter S. Shenkin" <she...@mindspring.com> wrote in message
news:b864a1a0.0306...@posting.google.com...

> I don't think so. You need to keep a tag at the last merge
> point in any branch from which you're merging. Thus, at the last
> O in the main_trunk before the ^ merge you'd have a tag called,
> say, "last_merge_from_main_trunk". Then you'd do the ^ merge
> as:

OK, that seems like a good idea yes.

> But for the use that Koen has in mind, it's not actually
> clear that he needs a branch at all. If only he is working
> on his own revisions, he can just keep his revisions in his
> working repository until he's ready to do a checkin, and
> then he can check them into the main trunk.

Well, I sometimes work here at the office and sometimes at home, and I
don't want to be sending myself emails with code to keep both of my
computers up to date.
The reasons I need a branch (feature branch or developer branch, doesn't
matter for this issue) are:
- I don't want other people to be disturbed by my changes until I'm
finished
- I do want to be able to do intermediate commits for the code *parts*
that are finished
- I do want to be able to work on my changes from different places
I just can't do this by only doing the changes in my local copy and then
commit them only when I'm really finished.

Koen

Peter S. Shenkin

Jun 27, 2003, 10:56:03 PM

"Koen" <n...@ssppaamm.com> wrote in message news:<bd6h1m$n31$1...@gaudi2.UGent.be>...

> "Peter S. Shenkin" <she...@mindspring.com> wrote in message
> news:b864a1a0.0306...@posting.google.com...

> > But for the use that Koen has in mind, it's not actually

> > clear that he needs a branch at all. If only he is working
> > on his own revisions, he can just keep his revisions in his
> > working repository until he's ready to do a checkin, and
> > then he can check them into the main trunk.
>
> Well, I sometimes work here at the office and sometimes at home, and I
> don't want to be sending myself emails with code to keep both of my
> computers up to date.
> The reasons I need a branch (feature branch or developer branch, doesn't
> matter for this issue) are:
> - I don't want other people to be disturbed by my changes until I'm
> finished

Local working dir does this.

> - I do want to be able to do intermediate commits for the code *parts*
> that are finished

Local working dir does this. Just commit those changes as you finish
them.

> - I do want to be able to work on my changes from different places

See below.

> I just can't do this by only doing the changes in my local copy and then
> commit them only when I'm really finished.

A checked out working directory is by far the best way to handle
your first two needs; however, your last need requires discussion.

A branch isn't necessarily the best solution even for that. The problem
is that if you forget to update, or whatever, you can get into scenarios
where code changes you made in one location get lost when you update
at the other location.

It's better to login remotely to the single location and work there.
Second best is to keep a primary checked out location (the only
place you'll run CVS commands from) and rsync to/from that when you
need to. A branch is a distinct third choice. Doable, but not
the best solution, IMO, unless the others are not doable.
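
The rsync option might look like the sketch below. The host name, paths, variable names and function names are placeholders, not anything from this thread; `-a`, `-z` and `-e ssh` are standard rsync options. Because the `CVS/` administrative directories travel along with the files, cvs should still be run only in the primary copy, as suggested above.

```shell
#!/bin/sh
# Sketch only: host, paths and helper names are hypothetical placeholders.
RSYNC_BIN="${RSYNC_BIN:-rsync}"
PRIMARY="${PRIMARY:-office.example.com:work/project/}"
LOCAL_COPY="${LOCAL_COPY:-$HOME/work/project/}"

# Pull the primary sandbox to this machine before starting work here.
pull_from_primary() {
    "$RSYNC_BIN" -az -e ssh "$PRIMARY" "$LOCAL_COPY"
}

# Push local edits back to the primary sandbox when done.
push_to_primary() {
    "$RSYNC_BIN" -az -e ssh "$LOCAL_COPY" "$PRIMARY"
}
```

The trailing slashes make rsync copy the directory contents rather than nesting one directory inside the other.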