One of the problems we have with CVS is that we try to perform too much
integration in one place: the trunk. This creates a lot of pressure to
accept patches that we really shouldn't, because people need their
patches in the tree to make collaboration easy.
              __________________
              |                |
              | Shipping Build |
              |________________|
                /            \
               /              \
              /                \
             /                  \
            /                    \
__________________          __________________
|                |          |                |
| Backend Build  |          | Frontend Build |
|________________|          |________________|
In this arrangement, developers would focus their efforts on one of the
lower trees, and the shipping build would basically always require
approval in some form (maybe just informal: "it's been in the frontend
build for a while").
This would require 3x the build infrastructure, but I think it would
have a positive effect on most other aspects of the operation. So, are
we thinking about how we can scale our tinderbox operation? :)
- Rob
See for example
http://moishelettvin.blogspot.com/2006/11/windows-shutdown-crapfest.html
Rob
I would say that keeping back-end/front-end divergence under control is
going to be a major headache in the proposed layout.
If I understand correctly, no API is frozen in mozilla2, so we should
expect numerous back-end changes that require coordinated front-end
adjustments, and probably vice versa. If that were not the case, why
would we need different trees at all?
On the other hand, keeping the development trees too close together
defeats the purpose of using a distributed VCS.
There is an existing repository layout which is working acceptably well
for an open-source project of a comparable scale. In Robert Sayre's
notation, it looks like this:
              __________________
              |                |
              | Shipping Build |
              |________________|
                /            \
               /              \
              /                \
             /                  \
            /                    \
__________________          __________________
|                |          |                |
| Module 1 Build |   ...    | Module N Build |
|________________|          |________________|
            \                    /
             \                  /
              \                /
               \              /
              _________________
              |               |
              | Nightly build |
              |_______________|
In this layout, changes from the module builds are first integrated into
the nightly build for field testing and, after successful testing, into
the shipping build. The idea is that the changes between each module tree
and the shipping tree are "human scale", so a single merge is a piece of
cake. The nightly build is going to be quite a mess, but new ideas will be
pushed out for field testing early and often.
In this context, the typical bugzilla flags acquire a VCS meaning:
    patch submitted    no changes
    r+                 merged into module tree
    sr+                merged into nightly tree
    a(xxx)+            merged into shipping tree
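To make the flag-to-tree mapping concrete, here is a minimal Python sketch. It only models the table above; the names (Tree, FLAG_TO_TREE, furthest_tree) are made up for illustration and do not correspond to any existing bugzilla or tinderbox code.

```python
from enum import Enum


class Tree(Enum):
    NONE = "no changes"
    MODULE = "module tree"
    NIGHTLY = "nightly tree"
    SHIPPING = "shipping tree"


# Mapping copied from the flag table; everything else is illustrative.
FLAG_TO_TREE = {
    "patch submitted": Tree.NONE,
    "r+": Tree.MODULE,
    "sr+": Tree.NIGHTLY,
    "a(xxx)+": Tree.SHIPPING,
}

# Trees ordered from "furthest from shipping" to "shipping".
ORDER = [Tree.NONE, Tree.MODULE, Tree.NIGHTLY, Tree.SHIPPING]


def furthest_tree(flags):
    """Return the furthest tree a patch has reached, given its flags."""
    reached = Tree.NONE
    for flag in flags:
        tree = FLAG_TO_TREE[flag]
        if ORDER.index(tree) > ORDER.index(reached):
            reached = tree
    return reached
```

A bug carrying "patch submitted" and "r+" would then report the module tree, and adding "sr+" would move it to the nightly tree.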
Module owners can borrow changes from other module trees as they see
fit, establishing dependencies between patches, which can be tracked
with bugzilla.
Bottom line: it looks like it may be possible to get the advantages of
the "tree closed" and "tree open" cvs modes without their respective
disadvantages.
--
Sergey
I don't think we have much pressure in times where we are not freezing
or pushing for a release, so I don't see the actual issue you want to
address here.
Robert Kaiser
What's not clear to me is what the benefit is of splitting the
"integration builds" into frontend and backend pieces. If you take that
out of the picture, what you're suggesting is isomorphic to branching
1.9 now, which might indeed be the right thing to do.
Dan
> One of the problems we have with CVS is that we try to perform too much
> integration in one place: the trunk. This creates a lot of pressure to
> accept patches that we really shouldn't, because people need their
> patches in the tree to make collaboration easy.
>
>               __________________
>               |                |
>               | Shipping Build |
>               |________________|
>                 /            \
>                /              \
>               /                \
>              /                  \
>             /                    \
> __________________          __________________
> |                |          |                |
> | Backend Build  |          | Frontend Build |
> |________________|          |________________|
>
>
> In this arrangement, developers would focus their efforts on one of the
> lower trees, and the shipping build would basically always require
> approval in some form (maybe just informal: "it's been in the frontend
> build for a while").
I agree with other replies that a permanent split of this sort is probably
not productive. But I think there are other options we can consider,
especially as we get better release tooling:
              +----------------------------------+
              | Shipping Build (mozilla-central) |
              +----------------------------------+
                 |              |              \
                 |              |               \
+--------------------------+    |    +----------------------------+
| Larry Integration Branch |    |    | <video> integration branch |
+--------------------------+    |    +----------------------------+
                                |
                       +------------------+
                       | New theme branch |
                       +------------------+
Each branch could contain related FE and BE changes (e.g. a new theme branch
could contain new features for nsITheme native theming, in addition to the
FE CSS/XUL changes).
The try server recently gained the ability to build from arbitrary hg
branches; the Q4 QA goals include a perftest tryserver that will run
performance tests on these builds... so rather than keeping tinderboxes
constantly cycling with these builds, the maintainer of the branch would
occasionally kick off a feature build.
There are some issues to think about: the try builds don't get any updates;
if they are installed by a significant audience we might want to consider
whether there is a way to set special update channels for feature branches.
The hard part about updates is generally partials, so if we limit it to full
updates this probably isn't too hard.
For the XPCOMGC work, we aren't using Hg branches but rather a shared Hg
patch queue... this requires some discipline so that you handle merges sanely.
--BDS
I have to say I'm really surprised no one else thinks there's a problem
here. We are already doing separate repositories for things like
<video>, and we still have a traffic jam on the trunk.
Maybe there's no need to make a rule about this, though. We will still
need to scale out our build infrastructure so that ad-hoc integration
repositories (say, for a new gfx layer) will have adequate performance
testing and unit testing. The goal here is to reduce the number of
regressions landing on the trunk and surprising us.
- Rob
I know that in the performance meeting there has been talk about being
able to scale to multiple branches. I think it's expected that multiple
development branches will happen more often in mozilla 2; this was one
of the major reasons to switch to mercurial.
So I think that to some extent what you are describing is already
happening. We are ramping up to be able to deal with multiple branches,
but so far no rules have been set up around it.
/ Jonas
It seems to me that we could easily set up a buildbot master with a
pool of slaves and the ability to watch multiple Hg trees and build
any of them on checkin. This way anyone could create an ad-hoc branch
for any work (which is already happening in Hg), and get the benefits
of building each platform per-checkin and running unit/perf tests.
AFAIK, the only thing that isn't doable immediately with our buildbot
setups is having a buildbot master where you can add/remove repos to
watch from a web interface or something, but I don't think that's a lot
of effort.
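Roughly what I have in mind, as a toy Python model. This is not buildbot's API; BuildMaster and its methods are hypothetical names, and real slaves, platforms and test runs are reduced to bookkeeping. It only shows the shape: a watch list that can change at run time, a queue of (repo, revision) builds, and a pool of slaves.

```python
from collections import deque


class BuildMaster:
    """Toy model of a master that watches several trees (hypothetical)."""

    def __init__(self, slaves):
        self.idle = deque(slaves)   # slaves ready for work
        self.pending = deque()      # (repo, revision) waiting for a slave
        self.last_seen = {}         # watched repo -> last queued revision
        self.running = {}           # slave -> (repo, revision)

    def watch(self, repo):
        """Start watching a repo; this is the 'add from a web UI' part."""
        self.last_seen.setdefault(repo, None)

    def unwatch(self, repo):
        self.last_seen.pop(repo, None)

    def on_checkin(self, repo, revision):
        """Queue a build if the repo is watched and the revision is new."""
        if repo not in self.last_seen or self.last_seen[repo] == revision:
            return
        self.last_seen[repo] = revision
        self.pending.append((repo, revision))
        self.dispatch()

    def dispatch(self):
        """Hand pending builds to idle slaves, oldest build first."""
        while self.idle and self.pending:
            slave = self.idle.popleft()
            self.running[slave] = self.pending.popleft()

    def finish(self, slave):
        """Mark a slave's build as done and give it the next one, if any."""
        self.running.pop(slave)
        self.idle.append(slave)
        self.dispatch()
```

Checkins to unwatched repos are simply ignored, so adding an ad-hoc branch is one watch() call and costs nothing until somebody pushes to it.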
-Ted
Maybe you should elaborate on what "traffic jam" you're seeing... do you
mean just the frequency of checkins, or frequency of automatically-detected
regressions, or the frequency of user-detected regressions?
I think we should use the automated tests (unit and performance) on as many
testing branches as possible... but I don't think that splitting into
branches will help with the user-found regression rate. Rather, I suspect
that it would either
1) fragment and confuse our nightly testing community and make it harder to
figure out regression dates and causes
2) delay the discovery of regressions until they were merged "up" into the
integration tree, which would probably lump many checkins together and make
it harder to isolate regressions
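Isolating a regression inside a lumped range comes down to bisecting the checkins in it. A minimal Python sketch, assuming every checkin in the range can be built and tested on its own, that the last one is known bad, and that the regression persists once introduced (first_bad and is_bad are hypothetical names):

```python
def first_bad(checkins, is_bad):
    """Binary search for the first bad checkin.

    Assumes checkins are ordered, the last one is bad, and every
    checkin after the first bad one is bad too.
    Returns (checkin, number_of_builds_tested).
    """
    lo, hi = 0, len(checkins) - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(checkins[mid]):
            hi = mid          # regression is at mid or earlier
        else:
            lo = mid + 1      # regression is after mid
    return checkins[lo], tests
```

The number of builds needed grows with the logarithm of the range, so a bigger lump costs extra builds rather than making isolation impossible, but only if each checkin in the lump is still individually buildable.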
--BDS
Frequency of automatically-detected regressions, especially performance
regressions. We face pressure to include them in order to facilitate
collaboration.
>
> I think we should use the automated tests (unit and performance) on as many
> testing branches as possible... but I don't think that splitting into
> branches will help with the user-found regression rate.
No, probably not.
> Rather, I suspect
> that it would either
>
> 1) fragment and confuse our nightly testing community and make it harder to
> figure out regression dates and causes
Well, that all depends on whether we encourage them to use the builds
from the integration tree.
> 2) delay the discovery of regressions until they were merged "up" into the
> integration tree, which would probably lump many checkins together and make
> it harder to isolate regressions
We already lump checkins pretty often with the current process. I'm not
convinced this would be worse.
- Rob
Did I miss a discussion about this somewhere?
For what it's worth, I've been seeing a number of things go by recently that
really should have been caught by sr (and were not, because they were happening
in modules with no sr).
-Boris
Sure, I agree with this in principle. I think my proposed branch
topology is simple to implement, understand, and observe, while
"everyone doing any non-trivial work" is not. Perhaps the arbitrary
division is a bit impolitic. I'm not claiming it should be permanent.
> We're gearing up, from what I understand,
> but newer systems such as talos are not quite turnkey to set up or
> clone.
There are other threads floating around about this stuff. Can we get 10x
the capacity up, running, and managed?
- Rob
> I agree with other replies that a permanent split of this sort is probably
> not productive. But I think there are other options we can consider,
> especially as we get better release tooling:
>
>               +----------------------------------+
>               | Shipping Build (mozilla-central) |
>               +----------------------------------+
>                  |              |              \
>                  |              |               \
> +--------------------------+    |    +----------------------------+
> | Larry Integration Branch |    |    | <video> integration branch |
> +--------------------------+    |    +----------------------------+
>                                 |
>                        +------------------+
>                        | New theme branch |
>                        +------------------+
The main problem I see with this is that to get testing, you still need
to check in to mozilla-central.
The Mercurial guys have a setup like this:
hg:
    main repository.
hg-stable:
    stable repository: latest release plus bugfixes only. A subset of hg.
crew:
    shared repository; a couple of people have push access. Things get
    tested and reviewed here. For more info, see
    http://www.selenic.com/mercurial/wiki/index.cgi/CrewRepository
crew-stable:
    same thing as crew, but for -stable.
Additionally, most developers have a repo with outgoing changes; many
have several for various features they're working on. See
http://www.selenic.com/mercurial/wiki/index.cgi/DeveloperRepos
The nice part about Mercurial, and all DVCSes actually, is that you don't
need to have commit access to publish a repository with your changes, so
anyone can download that repository, build from it, or even pull the
changes from your repository into their own.
Say I'm working on refactoring something, and Fred wants to help.
I publish my work on my website.
Fred clones that repository, makes changes.
Then he publishes his changes on his website and emails me the URL.
I pull from Fred's clone.
Similarly, if Joe is working on a new feature and wants to use an API I
added in my refactoring, he can pull all of my changes into his repository.
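The exchange above can be sketched in a few lines of Python. This is a deliberately simplified model, not Mercurial's implementation: a repository is treated as a plain set of changeset names (real history is a graph), and Repo, clone, commit and pull are illustrative names only.

```python
class Repo:
    """A published repository, modelled as a set of changeset names."""

    def __init__(self, name, changesets=()):
        self.name = name
        self.changesets = set(changesets)

    def clone(self, name):
        """Anyone can copy a published repo; no commit access needed."""
        return Repo(name, self.changesets)

    def commit(self, changeset):
        self.changesets.add(changeset)

    def pull(self, other):
        """Copy changesets we lack from `other`; return what was new."""
        new = other.changesets - self.changesets
        self.changesets |= new
        return new


# Me, Fred and Joe, as in the walkthrough.
mine = Repo("mine", {"base", "refactor-1"})
fred = mine.clone("fred")          # Fred clones my published repo
fred.commit("fred-fix")            # ...and makes a change
from_fred = mine.pull(fred)        # I pull from Fred's clone

joe = Repo("joe", {"base", "feature-1"})
from_me = joe.pull(mine)           # Joe picks up the refactoring
```

Nobody in this exchange needed push access to anybody else's repository; every transfer is a pull initiated by the receiver.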
To really get the full mileage out of a DVCS, people should be creating
*and publishing* changes they make, as they make them. It sounds like
with the try server we'll have a reliable way to get perf data for a
mercurial repository. I think this will eliminate a lot of the urge to
get something checked in to the main tree ASAP, which is what causes the
traffic jams sayrer was talking about.
Maybe in the New World, you'll need to have run the performance & unit
tests *before* you're allowed to push to mozilla-crew (or
mozilla-crew-{widget,content,toolkit,browser,...}, maybe) which has even
more runs of the tests run on it, so we're *really sure*, before
anything goes into mozilla-central.
It's pretty common right now for a particularly invasive patch to get
backed out and re-landed multiple times. There's really no reason this
should happen as often as it does now once we're using Mercurial.
I did use kernel.org as a starting point. They have an equivalent of a
nightly build. It is Andrew Morton's -mm tree.
> "Module" may be more than just a module or subsystem, of course. It
> could be a cross-cutting project that will reshape several modules at
> once.
IIUC, modules are used to avoid the "bus effect". Even the most ambitious
project-wide changes, like out-parameter removal and the switch from
reference counting to garbage collection, reshape only one module. All
the rest is changes to client calls, which can be huge, of course.
>> In this layout, module builds changes are first integrated to the
>> nightly builds for field testing, and after successful testing into the
>> shipping build. The idea is that changes between each module tree and
>> shipping tree are "human scale", so single merge is a piece of cake.
>
> This sounds like the kernel.org subsystem branch merging without
> conflict testimony I've heard (second hand).
I have had a bit of experience with kernel.org, from when I submitted a
device driver patch there. The patch was not accepted, but I learned a
lot about their practices.
They don't have conflicts at all when merging to the release branch
(Linus Torvalds's tree), because the whole development process is
designed to prevent this type of conflict.
They have at least three branches (or, better, moving virtual tags, or
vtags) for each subsystem: unstable, testing and stable.
Development happens in unstable. From time to time, the maintainer of
the nightly tree (Andrew Morton) pulls the unstable subsystem branches
for a first integration. If an unstable subsystem branch does not break
the nightly in a severe way, it is accepted for testing and the
subsystem's testing vtag is updated.
Testing vtags are then prepared for release integration. The difference
between the current release tree and the testing vtag is split into
small "obvious" patches. What each patch does should be clear from its
content, and each patch should also carry an annotation explaining why
it is needed at all. When the preparation is complete, the stable vtag
is updated.
Finally, the stable vtag is merged into the release tree. They use a
bullet-proof merging algorithm, which uses SHA1 hash sums to determine
versions of merged files. This way they ensure that each file is patched
at the correct location.
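To illustrate the hash idea, here is a small Python sketch. It assumes the tool in question is Git, which names each file version by the SHA-1 of a short header plus the file content; the three-way rule in merge_choice is the usual "trivial merge" logic, and the function names are mine, not Git's.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """SHA-1 id of a file version, computed the way Git hashes a blob."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def merge_choice(base: str, ours: str, theirs: str) -> str:
    """Decide a file-level three-way merge from version ids alone."""
    if ours == theirs:
        return "either"      # both sides ended up with the same content
    if ours == base:
        return "theirs"      # only their side changed the file
    if theirs == base:
        return "ours"        # only our side changed the file
    return "conflict"        # both changed it differently; merge content
```

Because the id depends only on the content, two trees can tell whether they hold the same version of a file without comparing the files themselves.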
>> Nightly build is going to be quite a mess, but new ideas will be pushed
>> for field testing early and often.
>
> We can't stop doing that, yes. But each module branch or repo needs to
> have its testing tinderboxes, too.
>
>> In this context, typical bugzilla flags obtain VCS meaning:
>> patch submitted no changes
>> r+ merged into module tree
>> sr+ merged into nightly tree
>> a(xxx)+ merged into shipping tree
>
> I'm pretty sure we are *not* going to keep sr going for many modules
> in Mozilla 2.
I was not aware of the idea to abandon the sr+ flag. As an indicator of
a module-to-nightly merge, it could be used to trigger tinderbox testing
of that module's branches.
Right now, the number of r+ (cvs trunk merges) per business day is around
50. I would expect the number of sr+ and a(xxx)+ in the proposed scheme
to be considerably lower. As a result, it will be technically possible
to test every critical point, rather than some random points, as happens
today.
If the pace of development increases, we will need to use virtual tinderboxen.
IIUC, it is not a problem for mochitest. I see two possible solutions
for perf testing: either only test a(xxx)+ for perf regressions or
rewrite perf tests to measure spent processor time, not elapsed time.
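The difference between the two measurements can be shown with Python's standard time module. This is only a sketch of the idea, not a proposal for how the existing perf tests should be rewritten; measure and the two workloads are hypothetical.

```python
import time


def measure(fn):
    """Run fn once; return (cpu_seconds, elapsed_seconds)."""
    wall_start = time.perf_counter()   # wall-clock (elapsed) time
    cpu_start = time.process_time()    # CPU time used by this process
    fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall


def busy():
    """Pure computation: CPU time and elapsed time should be close."""
    sum(i * i for i in range(200_000))


def busy_then_wait():
    """Same computation plus a wait, standing in for a loaded host."""
    busy()
    time.sleep(0.2)


cpu_a, wall_a = measure(busy)
cpu_b, wall_b = measure(busy_then_wait)
```

The wait shows up in the elapsed time of the second run but barely in its CPU time, which is why CPU time is the less noisy number on a shared or virtual machine. It does hide regressions that come from waiting on I/O, so it is a trade-off.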
Even if it requires certain investments in tools and equipment, the code
quality management results will be *much* better.
--
Sergey Yanovich
With all due respect, Mercurial is not even close to Mozilla in terms of
size and complexity. When the number of active developers is small
enough, they can communicate efficiently enough to get by with a single
testing branch/repo (crew in this example). In our case, crew would
diverge from main beyond comprehension, and it would become the same as
mozilla-central, with all its traffic jams.
> To really get the full mileage out of a DVCS, people should be creating
> *and publishing* changes they make, as they make them. It sounds like
> with the try server we'll have a reliable way to get perf data for a
> mercurial repository.
This part is true.
> I think this will eliminate a lot of the urge to get something checked
> in to the main tree ASAP, which is what causes the traffic jams sayrer
> was talking about.
But this is not. Merely publishing individual branches without a proper
integration strategy won't solve anything. Right now, we have "tree
open"/"tree closed" modes of operation. Each mode has its pros and
cons, and neither is acceptable.
People at kernel.org are developing a comparably sized project with a
distributed VCS. Their process combines the advantages of both modes
with the disadvantages of neither.
> Maybe in the New World, you'll need to have run the performance & unit
> tests *before* you're allowed to push to mozilla-crew (or
> mozilla-crew-{widget,content,toolkit,browser,...}, maybe) which has even
> more runs of the tests run on it, so we're *really sure*, before
> anything goes into mozilla-central.
I would say it is mission impossible to test everything *before* it
lands in the shipping build. Also, the current number of accepted
patches and the current build/test duration effectively rule out a
per-module tinderbox structure.
However, a more careful selection of critical points for testing in the
development/integration trees will solve this problem.
> It's pretty common right now for a particularly invasive patch to get
> backed out and re-landed multiple times. There's really no reason this
> should happen as often as it does now when we're using Mercurial.
Again, merely using a DVCS is not a cure for the described problem.
But a DVCS allows for more efficient code management, which will help.
--
Sergey Yanovich
As Brendan apparently said, the solution to this is just a try-server
perf-testing farm that people can use when they think their changes
might affect perf. If it runs the complete set of regression tests
too, that'd be even better. Allowing the try server to pull directly
from a Mercurial branch would help but is not strictly necessary.
Being able to publish private feature branches is great, I'm looking
forward to that. But I don't want a staged-checkin architecture, at
least until we've definitely exhausted all other options to address
checkin bandwidth.
Rob