Gitflow Python Rewrite

Sebastian Thiel

Oct 29, 2010, 9:56:04 AM
to gitflow-users
Hi,

A few days ago Vincent Driessen ( that name won't need an introduction,
I assume :) ) contacted me to ask about the maturity of git-python, as
he was looking for a cross-platform Python library for interacting
with git. He mentioned gitflow, which, to my shame, I had not heard of
until then. When I looked it up and read his convincing blog post, it
is fair to say I caught fire.

My reply about the said maturity was lengthy, and apparently good
enough to make him choose git-python for gitflow's Python rewrite.
Admittedly, this was very much to the liking of git-python's
maintainer, who happens to be me. Gitflow's unique needs will help
git-python to evolve in new directions, and to improve in many ways.

After fixing a few issues in git-python that came up during Vincent's
first ride with it, I started to realize the possibilities slumbering
within py-gitflow. Not only will it support the "successful git
branching model" as well as its bash counterpart does, but it might
also be able to take it to the next level.

It must have been the third day after first contact when I dedicated
myself to the py-gitflow project with first-class git-python support,
git-python-centric reviews of the py-gitflow code, and one patch or
another. This should help to get the best possible initial
implementation, and lay a good foundation for that 'next level'.

But what is that 'next level' I keep bragging about ? First of all, I
don't quite know whether it really is a next level, or whether gitflow
can already do it. If it can, the following is just a use case
describing existing gitflow features; otherwise it might be worth
further study.

( What follows is that study. It is a long text that really needs
images, and without them it might be hard to follow. I will re-release
this document with images, better structure, and more consistent terms
soon ).

Imagine you are a single person developing software in a single
repository. You use gitflow and the respective branching model, and
everything is a snap. This is equivalent to one development stream,
called the main stream, in a single repository with no remotes.

You sell the software to one client, called C1, which enables you to
expand the team. Now several people are developing collaboratively,
pulling from and pushing to each other, and finally transferring their
changes to a central repository. They all use gitflow, and are quite
happy as the workflow is already supported. Now we have multiple
repositories, each of which has one or more remotes, but the number
of development streams is still one (i.e. there is just one main
master and one main develop branch).

Wow, hiring these sales people really paid off, because you got
another client. That client, C2, has very special needs, and wants you
to develop a feature, called C2F1, exclusively for him; the rest of
the software is fine though. Now you are somewhat in trouble, because
you need to create a special version of the software just for C2,
which effectively opens a second development stream, especially
because C2F1 must not be merged into C1's development stream.

The sales people are still at work, and there you have C3 as a new
client. He is low on cash, and only makes a deal if he pays solely
for the features he needs. Out of three, he takes F1 and F3, but
doesn't want F2. Now you would add a third development stream, and
C3's version would even need to be recreated by repeating the work
done up to and including F1, leaving out F2, and finally integrating
F3.

As you see, the next level really is to increase the number of
development streams from 1 to N, N being the maximum number of
development streams you can possibly manage with gitflow's next-level
implementation.
All this sounds neat, but wouldn't this clutter up gitflow's simple
and easy-to-use interface ? The answer is: Not if done right !

Remember the second step of your company expansion, when new
developers were added to the team, multiplying the number of
repositories, and adding remotes to each ? The use case stated that
there is still just one development stream, but that's not quite
right. There are multiple streams, one for each repository, and each
of the streams is unique for a while. The reason they don't diverge
into separate development streams is the contract they have with each
other. That contract forces them to stay in sync, so they merge with
each other to effectively maintain one common development stream.
Adding the repositories of C2 and C3 adds new development streams just
because that constraint is not in place; that is, you don't have to
merge them, and in fact you may not even be allowed to.

Moving on, the next 'issue' to think about is changes introduced
through fixes/hotfixes. In gitflow, a hotfix is forked off the release
version which sits in the master branch, and is subsequently merged
back into that branch, as well as into the respective develop branch.
This is good as long as the hotfix fixes the system/framework, as its
position in the commit graph clearly indicates that: the hotfix was
forked off the master branch. Considering we have multiple independent
development streams, this would imply we have to cherry-pick the
hotfix and put it onto the most similar spot on the master branches of
the other streams, into which it is subsequently merged, as well as
into their respective develop branches.

When thinking about C3, which lacks F2, we realize that hotfixes that
in fact only fix something in F2 are of no interest to C3. In that
case, gitflow would fail to find a good spot to apply the hotfix to.
But how can gitflow possibly figure out the 'sweet spot' to apply
hotfixes to, if the streams are independent ? To my mind, only by
relating the position of the hotfix in the main stream to the commit
graph around it, trying to find similar spots in the graphs of the
other streams. Prominent landmarks would be merges with feature
branches, for instance.
To help gitflow narrow down the target position, hotfixes that are
meant to fix a specific feature should be based on the remains of
that feature. The respective feature branches in other streams can
then be found more easily.

Speaking about positions: It's very important which commit you
actually base your hotfix on. The best position is not simply the
latest commit, but the earliest commit which makes the hotfix
necessary, that is, the one where the bug was introduced. This allows
you to find all release versions ( in the master branch ) which still
need that particular hotfix applied.
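
A minimal sketch of that last query in Python, assuming the commit
graph is available as a plain mapping from commit id to parent ids. The
graph, the release names and the helper names below are hypothetical
and only illustrate the idea: a release needs the hotfix if it can
reach the commit that introduced the bug, but not the hotfix itself.

```python
from typing import Dict, List, Set

# Hypothetical commit graph: commit id -> list of parent commit ids.
Graph = Dict[str, List[str]]


def ancestors(graph: Graph, commit: str) -> Set[str]:
    """Return the commit itself plus everything reachable via parents."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, []))
    return seen


def releases_needing_hotfix(graph: Graph, releases: Dict[str, str],
                            bug_commit: str, hotfix_commit: str) -> List[str]:
    """Releases that contain the faulty commit but not the hotfix."""
    affected = []
    for name, tip in sorted(releases.items()):
        reachable = ancestors(graph, tip)
        if bug_commit in reachable and hotfix_commit not in reachable:
            affected.append(name)
    return affected


# Made-up history: the bug enters at "b", the hotfix is based on "b".
GRAPH: Graph = {
    "a": [],
    "b": ["a"],         # bug introduced here
    "c": ["b"],
    "d": ["c"],
    "fix": ["b"],       # hotfix forked off the faulty commit
    "e": ["d", "fix"],  # merge that brings the hotfix into master
}
RELEASES = {"0.9": "a", "1.0": "c", "1.1": "d", "1.2": "e"}

print(releases_needing_hotfix(GRAPH, RELEASES, "b", "fix"))
```

Release 0.9 predates the bug and 1.2 already contains the fix, so only
the releases in between are reported.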

Another kind of change is features themselves. The initially striking
difference between features and hotfixes is that features are usually
composed of multiple commits, not just one. Thinking back to C3, its
development stream was rebuilt from the spot at which F1 started its
development. This is the earliest spot at which a new stream can
safely support that feature. Then F1, as it is a stream of commits,
was rebased ( as opposed to cherry-picked ) onto the new stream,
creating the copy F1'. The same rebasing was done for F3, to create
F3'. The exact same thing happens once a new feature, F4, is merged
into the main stream. Gitflow would try to find a good spot for the
root of the F4 commit stream, and rebase it onto all viable
development streams to create F4' copies.

Although this example uses the term 'main stream' to describe the
stream where everything started, technically all streams have equal
rights, and hotfixes and features can be implemented on any stream, to
be copied back into the other development streams. This is the idea of
being distributed, isn't it :) .

To enable gitflow to do all this, it needs to have an understanding of
which 'substreams' ( a feature stream, for instance ) and which
hotfixes ( i.e. streams with just one commit ) are already merged into
the development stream at hand. The great thing is that git can
already do that, by comparing the 'patch-ids' of each commit ( see
git-cherry for further reference ). Even without additional meta-data
kept in some hidden file, gitflow could retrieve all this information
by just analysing the commit graph. With that technique, queries like
"Which features are in the stream at 'master' ?" or "Which hotfixes
have been applied to feature X ?" are entirely possible, and are
additionally supported by certain assumptions which hold true in the
'successful git branching model'. It is interesting to see that git
already does so much of what is needed, which is because the use cases
mentioned here are already at work in the open-source world, but
without specific tool support.
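
To make the patch-id idea concrete, here is a small Python sketch. It
does not call git: toy_patch_id is a made-up stand-in for what
git-patch-id computes (a hash of the diff that roughly ignores
whitespace), and the feature names and diff strings are invented. The
point is only that a rebased copy such as F1' has new commit ids but
the same patch-ids, so membership can be decided by comparing sets.

```python
import hashlib
from typing import Dict, Iterable, List, Set


def toy_patch_id(diff_text: str) -> str:
    """Stand-in for `git patch-id`: hash the diff with whitespace removed."""
    normalized = "".join(diff_text.split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def features_in_stream(stream_diffs: Iterable[str],
                       features: Dict[str, List[str]]) -> List[str]:
    """Names of the features whose patches are all present in the stream."""
    stream_ids: Set[str] = {toy_patch_id(diff) for diff in stream_diffs}
    present = []
    for name, diffs in sorted(features.items()):
        if all(toy_patch_id(diff) in stream_ids for diff in diffs):
            present.append(name)
    return present


# Hypothetical feature streams, each a list of diffs (one per commit).
FEATURES = {
    "F1": ["+ add login form", "+ validate password"],
    "F2": ["+ add pdf export"],
    "F3": ["+ add csv export"],
}

# C3's stream holds rebased copies of F1 and F3; one diff differs only
# in whitespace, which the toy patch-id ignores.
C3_STREAM = [
    "+ initial import",
    "+ add login form",
    "+  validate password",
    "+ add csv export",
]

print(features_in_stream(C3_STREAM, FEATURES))
```

Against a real repository, the diffs would come from git itself, and
git-cherry already performs a similar comparison between two branches.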

The most troublesome issue to think about was left for the end:
Dependencies. What if F3 depends on F2, and cannot work without it ?
C3 would have a problem, as he requested to get only F1 and F3. The
only way for gitflow to track these dependencies is if they are
actually represented by the commit graph itself. This implies that, if
F3 depends on F2, it has to be based not on the latest develop
branch, but on the feature branch it directly depends on.
Changes in the form of hotfixes, based on F2, would be merged into F2,
then into all its child features, which is only F3 in this example.
From there, they get propagated into all streams that contain the
feature streams in question. As the point in time at which
the hotfixed feature stream was originally merged into its develop
branch is clear, one can easily find all affected releases and the
respective commits in the master branch, into which the hotfix would
have to be merged as well.
If F3 had been based on develop, it would in fact depend on the
particular spot it forked off from, which makes develop its parent
stream. Hotfix propagation would work exactly as described before.
Compared to the way hotfixes are applied in the default gitflow
implementation, one would now get the option of applying hotfixes
at a lower level in the stream hierarchy, and propagating them
upwards.
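
The dependency check and the upward propagation can be sketched in a
few lines of Python. This is only an illustration under the assumption
that the stream hierarchy is known as a child-to-parent mapping; the
names PARENT, STREAMS and both functions are hypothetical, and the
feature sets per stream are made up.

```python
from typing import Dict, List, Set

# Hypothetical hierarchy: stream name -> the stream it was forked from.
PARENT: Dict[str, str] = {"F1": "develop", "F2": "develop", "F3": "F2"}

# Hypothetical development streams and the feature streams they contain.
STREAMS: Dict[str, Set[str]] = {
    "main": {"F1", "F2", "F3"},
    "C2": {"F1", "F2"},
    "C3": {"F1"},
}


def missing_dependencies(parent: Dict[str, str], wanted: Set[str]) -> Set[str]:
    """Features that the wanted set is based on but does not include."""
    missing: Set[str] = set()
    for feature in wanted:
        base = parent.get(feature)
        while base is not None and base != "develop":
            if base not in wanted:
                missing.add(base)
            base = parent.get(base)
    return missing


def descendants(parent: Dict[str, str], root: str) -> Set[str]:
    """The root stream plus every stream forked (transitively) from it."""
    found = {root}
    changed = True
    while changed:
        changed = False
        for child, base in parent.items():
            if base in found and child not in found:
                found.add(child)
                changed = True
    return found


def propagate_hotfix(parent: Dict[str, str], streams: Dict[str, Set[str]],
                     fixed_feature: str) -> Dict[str, List[str]]:
    """Per development stream, the feature streams that receive the hotfix."""
    affected = descendants(parent, fixed_feature)
    return {
        name: sorted(features & affected)
        for name, features in sorted(streams.items())
        if features & affected
    }


print(missing_dependencies(PARENT, {"F1", "F3"}))  # C3's original wish
print(propagate_hotfix(PARENT, STREAMS, "F2"))
```

The first call reports that a stream with only F1 and F3 lacks F2; the
second shows a hotfix on F2 reaching F2 and its child F3, in every
stream that contains them.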

Even though I might have lost 95% of the readers by now, the last
example should have made one thing clear: What gitflow has to deal
with is not a static list of branches, but a hierarchy of commit
streams that follows the rules of a Directed Acyclic Graph. That
hierarchy is extended whenever someone branches/forks off a commit
in the graph.
When these streams are merged, they form something that looks like an
acyclic Dependency Graph. This graph is extended by each merge
commit.
One major assumption I make is that the individual streams have an ID
that identifies them both in the Merge DG and in the Hierarchy DAG,
in all development streams. A simple way to track a stream, for
instance, is the name the user assigned to the branch when forking
it. Maybe a better and automatic way to get a unique ID would be to
accumulate all patch-ids in that stream into a SHA1. Using this ID
enables gitflow to transfer applicable hotfixes and features from one
development stream into another, and even to compute how good the
match is (e.g. there could be features with similar names, but many
differing patch-ids).
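
As a sketch of that ID scheme in Python: stream_id feeds the patch-ids
of a stream, in order, into one SHA1, and match_score rates how well
two streams agree. Both function names are hypothetical, and using the
Jaccard overlap of the patch-id sets as the score is my assumption; the
text above does not prescribe a particular measure.

```python
import hashlib
from typing import Iterable


def stream_id(patch_ids: Iterable[str]) -> str:
    """Accumulate the patch-ids of a stream, in order, into one SHA1."""
    digest = hashlib.sha1()
    for patch_id in patch_ids:
        digest.update(patch_id.encode("ascii"))
        digest.update(b"\n")  # separator, so "ab"+"c" differs from "a"+"bc"
    return digest.hexdigest()


def match_score(left: Iterable[str], right: Iterable[str]) -> float:
    """Jaccard overlap of two patch-id sets: 1.0 is a perfect match."""
    a, b = set(left), set(right)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Made-up patch-ids for a feature in two development streams.
F1_MAIN = ["p1", "p2", "p3"]
F1_COPY = ["p1", "p2", "p3"]   # rebased copy: same patches, same ID
F1_OTHER = ["p1", "p2", "p4"]  # same name, but one differing patch

print(stream_id(F1_MAIN) == stream_id(F1_COPY))
print(match_score(F1_MAIN, F1_OTHER))
```

An exact ID match identifies a copied stream cheaply, while the score
helps when two streams share a name but have drifted apart.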

With such an abstraction in place, gitflow could easily support the
workflow it dealt with before, but would also grow incredibly more
powerful as a tool to aid software development, as it changes the way
releases can be handled towards the client.

A critical part of any project using such sophisticated workflows is
unit tests with good coverage, because even though gitflow can handle
the merges/cherry-picks and rebases for you, it cannot guarantee that
the resulting code still works as intended - verifying that would be
the job of the unit tests, which could even be run automatically in
all affected streams.

All this might sound terribly complicated, but I believe 'complex' is
the better word here. Thus I am optimistic that it can actually be
implemented, even though not all of this might be available in
version one.
My future work shall be to improve this document to make my point
clearer, and to help get a deep understanding of the commit graph
into gitflow, which would be needed to have a chance of dealing with
the presented use cases.

Regards,
Sebastian

PS: There are many possible issues and constraints that go with the
use cases mentioned here, but delving into these would easily have made
the document totally unreadable ( for a mailing list ).

Mark Derricutt

Oct 29, 2010, 10:47:03 PM
to gitflo...@googlegroups.com
Actually - I just skimmed the latter part of the message and saw the mention of dependency graphs.  Combined with making one's application modular, dependency management tools such as maven, gems or cpan are often used as well.

Making git-flow also play in this area sounds (at first thought) like it could make things rather complicated, and maybe overkill?

Mark

--
"Great artists are extremely selfish and arrogant things" — Steven Wilson, Porcupine Tree

Mark Derricutt

Oct 29, 2010, 10:42:47 PM
to gitflo...@googlegroups.com
Wow talk about a LONG email ;-)

I'll admit that I've only read the first 1/2 of this and then stopped ( it was 6am and I was reading on my phone ), but I pretty much came to the thought that this is almost the wrong fight to be fighting ( do correct me if I'm wrong tho - repository layout and project setups are a mighty complex messy business ).

But....

Surely this is just an argument for writing modular, compose-able applications?

At work we have our -core project, our -userinterface, and our -reports modules.  Each in separate repositories, with separate release cycles.

The packaging up of these groups of modules is how we handle the situation of customer X wanting core+feature1+feature2 and customer Y wanting core+feature3   ( in our situation that's actually dev/staging/demo/production-nz/production-india/production-japan ).

Sometimes a change in feature-X necessitates a change in -core, but that should hopefully be kept to a minimum; if not, those changes in core get provided to all customers, even if the functionality they expose/provide isn't.

Now that I'm awake I'll go read the rest of the email tho :)

Mark

--
"Great artists are extremely selfish and arrogant things" — Steven Wilson, Porcupine Tree



On Sat, Oct 30, 2010 at 2:56 AM, Sebastian Thiel <byro...@googlemail.com> wrote:

Sebastian Thiel

Oct 30, 2010, 6:08:35 AM
to gitflo...@googlegroups.com
Hi Mark,

Creating modular applications in the first place - that is exactly what
I thought about when finishing that email, and I do agree that doing
that is very common, and it's quite well known how to do it. It could be
as easy as writing a few dlls/sos that link to their common
application core. The code for each of these would be kept in one
repository each, where git-flow is used for the branching. The
distribution to the client is done by individual scripts/build-steps
(one per client), which take care of putting the right files into the
deliverable, according to the features the client has chosen. At the
end, this deliverable is sent as a whole to each client, who fully
replaces his application with the new version. In such a system, you
wouldn't really track the inter-dll dependencies, as you know that if a
dll changes, all clients which use it need a new deliverable, and if the
core changes, every client needs to get a new deliverable - problem solved.

This kind of workflow usually works well for C++ and C programs.

Let's look at a program based on Python, and let's assume that a package
is set up such that each module within it provides new functionality to
some framework it is built upon. Each of these modules adds a distinct
feature, comparable to a separate dll that is being sold
separately. To keep things very simple, once again the developer just
keeps one development stream, and has deployment scripts which compose
the application for individual clients, by simply removing the Python
modules they didn't pay for, for instance. Whenever the developer has a
stable version he wants to share, he runs all his deployment scripts,
which create (possibly byte-compiled) distributions. He puts these into
their own 'distribution git repositories', and ships the new version to
the client only if it actually changed ( the feature he worked on to
create a new release could have been completely removed in one client's
version, so for this one, it didn't change at all ).

In both examples, the knowledge about which client gets which features
is kept in some sort of script ( or in some database that the
build system uses to determine the feature set ).
Yet again, a system like that appears to work well enough to be usable,
and it's relatively simple to implement and to use.
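
Such a script could be as small as the following Python sketch. The
manifest CLIENT_MODULES and the function clients_needing_release are
hypothetical; the client and feature names merely reuse the ones from
my first mail. A client gets a new deliverable only if a module in his
feature set changed.

```python
from typing import Dict, List, Set

# Hypothetical manifest: client -> modules contained in his deliverable.
CLIENT_MODULES: Dict[str, Set[str]] = {
    "C1": {"core", "F1", "F2", "F3"},
    "C2": {"core", "F1", "F2", "F3", "C2F1"},
    "C3": {"core", "F1", "F3"},
}


def clients_needing_release(manifest: Dict[str, Set[str]],
                            changed_modules: Set[str]) -> List[str]:
    """Clients whose deliverable contains at least one changed module."""
    return sorted(
        client
        for client, modules in manifest.items()
        if modules & changed_modules
    )


print(clients_needing_release(CLIENT_MODULES, {"F2"}))    # C3 is skipped
print(clients_needing_release(CLIENT_MODULES, {"core"}))  # everyone
```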

When reading this, I would come to exactly your conclusion: Why
complicate things in git-flow if we already have it working ? Git-flow
does the branching, the build-system cares about the distribution, et voilà.

Now let's have a look at the dependency graph formed by the commits in
our repositories. When using gitflow, we have multiple branches, which
represent distinct entities, like the development stream, feature
streams, hotfixes or streams of hotfixes, release streams, and the
stream containing the deliverable itself, master. It's great to have
that, and a lot of information is contained in there: dependency
information, and information about issues that arose ( hotfixes ),
features you added, and the time-relationships of all these.

At this point, there is no software that would put that
information to good use. For us, currently, it's just a bunch of commits
linked to each other, and all we really care about is the develop and
the master branch. But in fact this dependency graph contains the
dependencies of your code, as well as the 'flow' of the development.
Realizing what actually happened to your code allows you to assemble and
combine it, while keeping complete track of said dependencies.
What's even better is that we have all this in a single integrated
database, which is our git repository, and we can tell exactly what
features and fixes are in each development stream, because git-flow
understands them.

Let's add just one more example: You have a client who is in the middle
of a production, he uses software version 1, and he really wants to
stick with it as he is afraid that any change would have a
negative/unforeseen impact on all the tools that depend on your software,
at version 1. Now he runs into a bug, and asks for help. You know this
bug is already fixed in version 1.5 of your software, but besides that,
many things changed.
If you didn't have git-flow's new abilities, you would tell the client to
risk it, and use version 1.5 of your software, which might involve
relinking his tools against your library as many other things changed.
This essentially shifts the risk to the client, and he doesn't like that
at all. Also, the client had to run into the bug that you obviously
fixed a while ago in version 1.3 - as we know he didn't want to upgrade
to a new version anyway so he ignored all releases above 1.0.

With the new features, and provided the hotfix required to fix the bug
was applied at the point where the bug was introduced, git-flow could have
transferred the fix into all development streams that are affected by
it. You, as the developer of the software, are able to roll out new
point releases of all affected releases for all your clients, which just
add the bugfix but nothing more. You would have been the one to inform
the client about this bugfix, and would have made it available in
version 1.0.1, which is guaranteed to just fix a single bug, nothing else.

With the system I proposed for git-flow, something like backports and
feature-ports (i.e. the transfer of these into other development
streams) can be automated to a great extent, which is something I find
very valuable.
The service that you, as a software company, can provide to the client
is superior to what the others can do, which gives you an edge over
the competition. You basically acknowledge that there are clients who
use an older version of the software, but using the new git-flow, you
can still support it with bugfixes, and even compatible features ( if
you like ).
What a great service :).

Cheers,
Sebastian

PS: I must admit that the point I make here is always based on the
assumption that you can find the spot in time ( i.e. the commit ) that
introduced a certain bug, so that you can fix it right there. If you
always apply hotfixes at the newest commit ( which is easiest ), even
git-flow could only assume that this hotfix really needs the latest
version of your software, in which case it could only port/transfer it
into compatible development streams which are at the same 'level'.
Finding that spot in time could be done using bisection + a custom test
which you run whenever you have checked out a certain version of the
software. All this needs to be automated to be feasible, but ... it can
be automated, it's all possible, someone just has to do it :).
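
To illustrate the bisection, here is a Python sketch over a linear
history. The history list and the is_bad test are made up; the sketch
assumes the newest commit shows the bug and that, once introduced, the
bug stays present in all later commits. On a real repository, git
bisect run performs the same search with a custom test command.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str],
                     is_bad: Callable[[str], bool]) -> str:
    """Binary search for the earliest bad commit (history is oldest first)."""
    if not commits or not is_bad(commits[-1]):
        raise ValueError("the newest commit must show the bug")
    low, high = 0, len(commits) - 1
    while low < high:
        mid = (low + high) // 2
        if is_bad(commits[mid]):
            high = mid      # bug already present: look at older commits
        else:
            low = mid + 1   # still fine: the bug came in later
    return commits[low]


# Hypothetical history between two releases; the bug enters at "c2".
HISTORY = ["v1.0", "c1", "c2", "c3", "c4", "v1.5"]


def is_bad(commit: str) -> bool:
    """Stand-in for the custom test run on each checked-out version."""
    return HISTORY.index(commit) >= HISTORY.index("c2")


print(first_bad_commit(HISTORY, is_bad))
```

The commit found this way is the spot on which the hotfix should be
based, so that it can be transferred into every stream containing it.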

PPS: Yet another LONG email, can't help it, must be the topic ;).

Vincent Driessen

Oct 30, 2010, 8:27:02 AM
to gitflow-users
Hi,

Wow, let me first say that I'm really happy with your involvement and
enthusiasm to contribute to gitflow and I'm glad to receive thoughts
on alternative implementations.

I read your e-mail top to bottom and tried to understand the ideas
that you are proposing. Although it sounds pretty complicated, I think
I know what you are trying to say here, but I think we have to make
sure that it doesn't get any more complicated than what gitflow is
right now. This doesn't mean however, that we shouldn't add new
features or behaviour to gitflow.

Let me start by pointing out what gitflow is about and what I think
should not be gitflow's responsibilities. The original blog post
describing the workflow proposed a simple set of rules to formalise a
uniform release management process. gitflow was simply the next
evolutionary step to ease working with these rules, and to prevent
users from accidentally skipping/forgetting steps.

Although I can understand your proposal to let gitflow do more than
"just that", I think it definitely helps to keep these core objectives
clearly visible and to ask questions about whether or not this should
(or even *can*) be gitflow's responsibility. My gut feeling says this
is something that gitflow was never developed for. I always trust my
gut feeling when I'm developing software since it rarely lets me
down, but I feel obligated to rationalise what exactly it is that
causes it. So here we go.

If I understood your proposal correctly, a good chunk of it is
dedicated to something that I would call "feature management"—the act
of managing what features land in what version of the software you are
selling. Essentially, you have a piece of reusable software that you
sell in different "flavours" to clients, where each flavour
essentially is a core product and a fixed set of features/add-ons. The
term "feature" is what causes the confusion here, since you seem to
treat a feature branch as if it were an integral set of commits
that together implement a complete end-user feature (i.e. what a
customer would call a feature). This is a false assumption!

In the gitflow model, a feature branch is used to create an isolated
environment to work on a non-trivial change to the project's source
code (this is what we call a "feature"). When a feature is ready/
stable enough, it is merged into the develop branch. As a consequence,
this means that it is included in the next release (by definition). It
does, however, NOT mean that the end-user feature is finished in any
way. There can be further work on develop, for example, only to
later change source files of the same end-user feature again, which
would require *another* feature branch with gitflow. Therefore,
gitflow is unable to ever "relate" these feature branches to the
same "end-user feature". Hence, the nature of the model, and of the
tool, makes it unsuited to bear this kind of responsibility.

As Mark proposes correctly, there are better ways of dealing with this
kind of managerial problem, but all variants of that require you to
structure your source files in order to support this kind of
management. Examples are feature switches [1], or using different Git
repos for each client, each "including" your core product using a Git
submodule. Yet another one is restructuring your project using add-ons/
plugins, and making distributions for your customers by assembling
packages to contain fixed sets of these plugins. Of course, your
customers don't need to know anything about this implementation. If
you know the Django framework, you know how it makes applications
pluggable [2], which you can really learn a lot from.
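
A feature switch can be very small. The following Python sketch is
only an illustration of the pattern and not taken from [1] or [2]; the
switch table, the client names and the menu entries are hypothetical.
All code lives in one development stream, and a per-client table
decides at runtime which parts are active.

```python
from typing import Dict, List, Set

# Hypothetical switch table: client -> features that are turned on.
SWITCHES: Dict[str, Set[str]] = {
    "C1": {"F1", "F2", "F3"},
    "C3": {"F1", "F3"},
}


def is_enabled(client: str, feature: str) -> bool:
    """True if the feature is switched on for this client."""
    return feature in SWITCHES.get(client, set())


def export_menu(client: str) -> List[str]:
    """Build a menu; the code for every feature ships to everyone."""
    entries = ["save"]
    if is_enabled(client, "F2"):
        entries.append("export-pdf")
    if is_enabled(client, "F3"):
        entries.append("export-csv")
    return entries


print(export_menu("C1"))
print(export_menu("C3"))
```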

So I'd advise you to manage these changes in space (directories), not
time (branches). To make it a bit more visual: finishing feature
branches in gitflow is a bit like pissing into a swimming pool.
Filling up the swimming pool requires multiple pisses, possibly by
multiple people. But if you want to track what pisses are for which
customer, really the only way is to piss in another pool :-)

I like the idea, though, of gitflow being able to better deal with
multiple remotes and to assist in getting a better understanding of
which remotes are ahead/behind on which branches. I see a perfect fit
for this in gitflow. I encourage you to take a look at the
possibilities that the "feature pull" subcommand already offers [3],
as this is currently the kind of remote-friendly command that I'm
targeting with the rewrite.
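
To sketch what such an ahead/behind overview could compute, here is a
small Python example. It is not how gitflow or "feature pull" works;
the commit graph, the remote names and the helpers are hypothetical.
Given the tips of a branch locally and on each remote, it counts the
commits only one side has, which is what
git rev-list --left-right --count reports for two refs.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical commit graph: commit id -> list of parent commit ids.
Graph = Dict[str, List[str]]


def ancestors(graph: Graph, tip: str) -> Set[str]:
    """Return the tip itself plus everything reachable via parents."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, []))
    return seen


def ahead_behind(graph: Graph, local_tip: str,
                 remote_tip: str) -> Tuple[int, int]:
    """(commits only we have, commits only the remote has)."""
    local = ancestors(graph, local_tip)
    remote = ancestors(graph, remote_tip)
    return len(local - remote), len(remote - local)


def remote_status(graph: Graph, local_tip: str,
                  remote_tips: Dict[str, str]) -> Dict[str, Tuple[int, int]]:
    """Ahead/behind counts of one local branch against each remote."""
    return {
        name: ahead_behind(graph, local_tip, tip)
        for name, tip in sorted(remote_tips.items())
    }


# Made-up state of 'develop': we are at "c", origin at "d", alice at "b".
GRAPH: Graph = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"]}
REMOTE_TIPS = {"origin": "d", "alice": "b"}

print(remote_status(GRAPH, "c", REMOTE_TIPS))
```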

Let me know what you think of this!

Cheers,
Vincent

[1] http://blog.disqus.com/post/789540337/partial-deployment-with-feature-switches
[2] http://charlesleifer.com/blog/django-patterns-pluggable-backends/
[3] http://github.com/nvie/gitflow/commit/f68d405cc3a11e9df3671f567658a6ab6ed8e0a1

Vincent Driessen

Oct 30, 2010, 8:48:16 AM
to gitflow-users
Hi Sebastian,

On Oct 30, 12:08 pm, Sebastian Thiel <byron...@googlemail.com> wrote:
> Let's add just one more example: You have a client who is in the middle
> of a production, he uses software version 1, and he really wants to
> stick with it as he is afraid that any change would have a
> negative/unforeseen impact on all the tools that depend on your software,
> at version 1. Now he runs into a bug, and asks for help. You know this
> bug is already fixed in version 1.5 of your software, but besides that,
> many things changed.
> If you didn't have git-flow's new abilities, you would tell the client to
> risk it, and use version 1.5 of your software, which might involve
> relinking his tools against your library as many other things changed.
> This essentially shifts the risk to the client, and he doesn't like that
> at all. Also, the client had to run into the bug that you obviously
> fixed a while ago in version 1.3 - as we know he didn't want to upgrade
> to a new version anyway so he ignored all releases above 1.0.

In the original blog post's comment stream, a few people suggested
adding support for exactly this, which we decided to call "support
branches". Please read [1] and my reply to this [2] (sorry, old Disqus
threaded comments are a bit hard to link to). Maybe this can already
deal with your case?

Cheers,
Vincent

[1] http://nvie.com/posts/a-successful-git-branching-model/#comment-72478932
[2] http://nvie.com/posts/a-successful-git-branching-model/#comment-72478934

Sebastian Thiel

Nov 2, 2010, 5:04:58 AM
to gitflo...@googlegroups.com
Hi Vincent,

This is where discussions get difficult: mailing lists usually create dialogues by interleaving monologues, and in this case these are rather long and contain a lot of information. This makes it hard to still see it as a dialogue, which is why I will comment on some of your ideas, quoting them beforehand. I do hope that the formatting will persist on the mailing list.

<vincent>I read your e-mail top to bottom and tried to understand the ideas
that you are proposing. Although it sounds pretty complicated, I think
I know what you are trying to say here, but I think we have to make
sure that it doesn't get any more complicated than what gitflow is
right now. This doesn't mean however, that we shouldn't add new
features or behaviour to gitflow.</vincent>
Keeping it simple ( for the user ) is something I agree with. To my mind, it is not necessary at all to change the user interface in the first version.
<vincent>Let me start by pointing out what gitflow is about and what I think
should not be gitflow's responsibilities. The original blog post
describing the workflow proposed a simple set of rules to formalise a
uniform release management process. gitflow was simply the next
evolutionary step to ease working with these rules, and to prevent
users from accidentally skipping/forgetting steps.</vincent>
You see gitflow, in its first form, as a necessary innovation, since you first formalised a good branching workflow for working with git. You call it a necessary evolutionary step, like the fish which eventually came out of the water as reptiles. What I am proposing is to apply yet another evolutionary step, which doesn't turn the reptile into a bird, but into something like a crocodile, which is faster and better than its predecessors, and ... it can bite ;). Before I go into details about this, I will pick up more of your statements.

<vincent>Although I can understand your proposal to let gitflow do more than
"just that", I think it definitely helps to keep these core objectives
clearly visible and to ask questions about whether or not this should
(or even *can*) be gitflow's responsibility.</vincent>

This is a very strong statement, as you seriously doubt whether the workflow is achievable at all, and whether it is gitflow's job to try. Gitflow formalised a way to work with branches, and its set of rules also keeps things as simple as possible. The resulting commit graph is relatively easy to read, even when viewed in gitk, and I believe such a design is good.
What I would like to do is to formalise the way you work and *can* work with branches even more, which covers gitflow's previous workflow, but also allows new ones if the user wishes.
<vincent>My gut feeling says this
is something that gitflow was never developed for. I always trust my
gut feeling when I'm developing software since it rarely lets me
down, but I feel obligated to rationalise what exactly it is that
causes it.</vincent>
A gut feeling is something very healthy - it prevents you from doing dangerous things, and definitely helps to keep things sane. If you told me that I should write a fluid simulation solver, my gut feeling would tell me that there is no way I could do it; there is no light at the end of the tunnel for me. This feeling is based on fear of something that I do not (yet) understand, so I have no idea how to possibly tackle it.
My gut feeling about my proposed changes is a mixed bag - I believe an implementation is possible, even though I am not able to write a fluid solver. What I am as yet unsure about is how much benefit such a complex implementation will actually bring to the community. Will it be worth it ? In the first version, maybe it won't ... .

The following quote will be a rather long one. It defines what a 'feature' is in gitflow speak, and regards my previously proposed workflow as another way to do client-feature-management (which is not to be confused with gitflow's term 'feature'). It closes by proposing different patterns to aid the said client-feature-management.

<vincent>If I understood your proposal correctly, a good chunk of it is
dedicated to something that I would call "feature management" - the act
of managing what features land in what version of the software you are
selling. Essentially, you have a piece of reusable software that you
sell in different "flavours" to clients, where each flavour
essentially is a core product and a fixed set of features/add-ons. The
term "feature" is what causes the confusion here, since you seem to
treat a feature branch as if it were an integral set of commits
that together implement a complete end-user feature (i.e. what a
customer would call a feature). This is a false assumption!
As Mark proposes correctly, there are better ways of dealing with this
kind of managerial problem, but all variants of that require you to
structure your source files in order to support this kind of
management. Examples are feature switches [1], or using different Git
repos for each client, each "including" your core product using a Git
submodule. Yet another one is restructuring your project using add-ons/
plugins, and making distributions for your customers by assembling
packages to contain fixed sets of these plugins. Of course, your
customers don't need to know anything about this implementation. If
you know the Django framework, you know how it makes the application
pluggable [2], which you can really learn a lot from.</vincent>
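
As a rough illustration of the feature-switch idea mentioned in the quote above, here is a minimal Python sketch. The flavour names, the feature names and the `FLAVOURS` table are all hypothetical and only serve the example; nothing here is part of gitflow or of any setup described in this thread.

```python
# Hypothetical flavour table: which optional features each client build enables.
# All flavours are built from the same single-master source tree.
FLAVOURS = {
    "client_a": frozenset({"reporting", "export"}),
    "client_b": frozenset({"reporting"}),
    "core_only": frozenset(),
}


def is_enabled(flavour, feature):
    """Return True if the given flavour ships with the given feature."""
    return feature in FLAVOURS[flavour]


def menu_for(flavour):
    """Assemble a menu from the core entries plus whatever is switched on."""
    menu = ["open", "save"]  # core product, always present
    if is_enabled(flavour, "reporting"):
        menu.append("report")
    if is_enabled(flavour, "export"):
        menu.append("export")
    return menu


if __name__ == "__main__":
    for name in FLAVOURS:
        print(name, menu_for(name))
```

The point of the sketch is only that the difference between flavours lives in data (space), not in branches (time).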

The proposed workflow indeed has a great deal to do with feature management; without feature management, it offers no benefit.
A feature, in my sense, is code in one or more files which can run in the context of the (other) code that surrounds it. For instance, the implementation of a new pluggable node, which may be spread across X files, can be placed at the 'time' when the 'pluggable node framework' was first implemented. Hence, when designing your system, you already had frameworks and their extension with particular implementations in mind - extension code would only need the context provided by its framework, but nothing more. The dependencies of every line of code are not known by anyone except the compiler, but the developer at least knows the conceptual dependency, and expresses it by forking off a new 'feature' stream at the earliest point/commit where that dependency is met. This is like travelling back in time, somewhat pretending you had this new 'feature' in the first place. The 'pluggable node' in that example is a child-stream which is based on a parent-stream.
The benefit of this is that gitflow now understands the conceptual dependencies in your code by understanding the commit graph, and is able to track the streams into which your parent stream was merged. Hence it knows which streams your latest child-stream must be merged into to update them accordingly.

The notion of a 'stream' is new, so let's define it a little more. A stream is not a branch, but a consecutive line of one or more commits, where each commit has only one, exclusive parent commit. By that definition, a new stream starts whenever you fork a new branch from a commit which has already been merged, i.e. the end of another stream, and is finished when you merge the stream itself.
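
To make the definition a little more tangible, here is a minimal Python sketch of one possible reading of it. It is not an existing gitflow or git-python feature. The commit graph is a plain dict mapping a commit id to its list of parent ids (an assumption for the example), listed parents-first. A commit continues its parent's stream only if it has exactly one parent and is that parent's only child; fork points and merge commits open a new stream, which is a simplification of my own.

```python
from collections import defaultdict


def split_into_streams(parents):
    """Split a commit graph into linear 'streams'.

    `parents` maps commit id -> list of parent ids and must be ordered
    parents-first (dicts keep insertion order in Python 3.7+).
    """
    # Invert the graph so we can tell whether a parent is shared.
    children = defaultdict(list)
    for commit, commit_parents in parents.items():
        for parent in commit_parents:
            children[parent].append(commit)

    streams = []
    stream_of = {}
    for commit, commit_parents in parents.items():
        exclusive_parent = (
            len(commit_parents) == 1
            and len(children[commit_parents[0]]) == 1
        )
        if exclusive_parent:
            stream = stream_of[commit_parents[0]]  # continue parent's stream
        else:
            stream = []  # root, fork or merge: a new stream starts here
            streams.append(stream)
        stream.append(commit)
        stream_of[commit] = stream
    return streams


# Hypothetical history: fork at "b", merge in "m".
EXAMPLE = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "f1": ["b"],
    "f2": ["f1"],
    "m": ["c", "f2"],
    "d": ["m"],
}

if __name__ == "__main__":
    for stream in split_into_streams(EXAMPLE):
        print(stream)
```

Whether a merge commit should belong to the stream it finishes or to the one it starts is exactly the kind of rule that still needs working out.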

If gitflow starts to understand the relationships between these streams, it suddenly becomes obvious where, for example, hotfixes have to be merged. That no longer needs to be hardcoded. Also, the limitations on where you may fork off a new branch can be lifted, as there is no 'hard-coded' component anymore.
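
A small sketch of how merge targets could be derived from the graph instead of being hardcoded: every branch whose tip can reach the commit that introduced a bug needs the hotfix. This is roughly the question `git branch --contains <commit>` answers. The dict-based graph, the branch names and the function names are assumptions made for the example, not gitflow code.

```python
def reachable(parents, tip):
    """Return the set of commits reachable from `tip`, including `tip`."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def branches_needing_fix(parents, tips, buggy_commit):
    """Names of all branches whose history contains `buggy_commit`."""
    return sorted(
        name
        for name, tip in tips.items()
        if buggy_commit in reachable(parents, tip)
    )


# Hypothetical history: the bug was introduced in "b".
PARENTS = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"], "r1": ["b"]}
TIPS = {"master": "a", "develop": "d", "release/1.0": "r1"}

if __name__ == "__main__":
    print(branches_needing_fix(PARENTS, TIPS, "b"))
```

Here `master` still points at "a" and never received the buggy commit, so it is not a merge target.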


<vincent>So I'd advise you to manage these changes in space (directories), not
time (branches). To make it a bit more visual: finishing feature
branches in gitflow is a bit like pissing into a swimming pool.
Filling up the swimming pool requires multiple pisses, possibly by
multiple people. But if you want to track what pisses are for which
customer, really the only way is to piss in another pool :-)</vincent>
I like this image :) ! If the pool was just a directory structure on some network drive, I would totally agree that there is no way to track these pisses. But it is git we are talking about, and the piss isn't really piss, as code by its nature doesn't diffuse into an unstructured mess.
Of course, currently, for gitflow the development process is indeed more like that pool, and the only thing gitflow does is to guide the pisses, preventing the developers from pissing at each other all the time :P.
But ... by understanding these streams of piss ... ok, let's stop with this analogy here ;).


<vincent>I like the idea, though, of gitflow being able to better deal with
multiple remotes and to assist in getting a better understanding of
what remotes are ahead/behind on what branches. I see a perfect fit
for this in gitflow. I encourage you to take a look at the
possibilities that the "feature pull" subcommand already offers [3],
as this is currently the kind of remote-friendly command that I'm
targeting with the rewrite.</vincent>
Yes, remotes are necessary, and I am happy that they should now be an integral part of gitflow. In my previous examples, I took remote handling for granted, but maybe I never stated that explicitly. To me, remotes are just a namespace for branches with an associated URL to fetch from and push to. These boil down to branches and streams, which the proposed system could handle and understand naturally.
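
Once remote branches are treated as just more tips in the same graph, "ahead/behind" reduces to comparing two sets of reachable commits. The sketch below shows that idea on a plain dict graph. The graph, the commit ids and the function names are assumptions for illustration only, not what gitflow or git-python actually do.

```python
def reachable(parents, tip):
    """Return the set of commits reachable from `tip`, including `tip`."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def ahead_behind(parents, local_tip, remote_tip):
    """Count commits only in the local branch and only in the remote one."""
    local = reachable(parents, local_tip)
    remote = reachable(parents, remote_tip)
    return len(local - remote), len(remote - local)


# Hypothetical history: local "develop" is at "c", "origin/develop" at "y";
# both diverged after "b".
PARENTS = {"a": [], "b": ["a"], "c": ["b"], "x": ["b"], "y": ["x"]}

if __name__ == "__main__":
    ahead, behind = ahead_behind(PARENTS, "c", "y")
    print(f"ahead {ahead}, behind {behind}")
```

In this example the local branch has one commit the remote lacks ("c") and lacks two that the remote has ("x" and "y").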


To come to a conclusion, for myself at least: without further research into the matter, namely regarding the commit graph as a hierarchy of streams and their flow in time, gitflow can't possibly commit to this. Even though I believe it's gitflow's next step, the effort required to build such a system might be larger than what was initially intended when developing gitflow.

For my part, I see my role as providing git-python support and occasional patches, and additionally as getting a first implementation of something that could be called a 'StreamDB', which would allow navigating the commit graph more easily. In the meantime, I am sure there will be a first release of py-gitflow, and maybe at that time we can come back to the initial proposal. Then it will be backed by a proof of concept, so it will be easier to grasp and to see where such a step would lead gitflow.

Cheers,
Sebastian

  

Mark Derricutt

Nov 2, 2010, 6:13:34 AM11/2/10
to gitflo...@googlegroups.com
Just a quick response before fully rereading and formulating more thoughts.

I wonder if part of the problem we're having in this discussion is terminology.

Under Sebastian's proposed changes, I see the current gitflow "feature" being more a "task" - a short-lived set of commits that introduce functionality [1], whereas a feature is more a block of code with its own lifecycle of tasks, releases and hotfixes (which I still think should live in a separate repository).

Would renaming "feature" to "task" in gitflow be a viable stepping stone to introducing the new ideas?

Mark




[1] A good post on this is from the guys at Plastic SCM: http://codicesoftware.blogspot.com/2010/08/branch-per-task-workflow-explained.html


--
"Great artists are extremely selfish and arrogant things" — Steven Wilson, Porcupine Tree




Sebastian Thiel

Nov 4, 2010, 1:53:55 PM11/4/10
to gitflo...@googlegroups.com, Mark Derricutt
And just a quick response from my side as well. My whiteboard has just been installed, and I couldn't resist scribbling the commit graph that would be created if I tried that next-level workflow. Even though I can see why I do these things, and why they work and make sense, for now I am unable to see all the rules behind my natural human thinking, which makes it impossible to program a system that could do the same, and more.

Basically this reinforces my previous conclusion: back to the drawing board, or whiteboard in this case. I believe I will scribble there from time to time, to hopefully figure out what it means in the logical world of a computer.
I truly hope this will happen soon :).

Gitflow will be what it was before, but stronger with remotes, and git-python will hopefully be a good partner in that endeavour.

Cheers,
Sebastian

Phil Hord

Nov 10, 2010, 2:45:36 AM11/10/10
to gitflow-users
Sebastian,

Please continue on your quest. I, for one, share your vision. I feel
stymied by the hard-coded nature of git-flow (as I understand it), and
I wish, for example, that it could be developed more generically so
that this kind of "meta-level" idea could be added easily.

Briefly, in our environment, we have some common codebase to implement
a custom bootloader in some embedded linux hardware devices. It is
all written as cross-platform C code running with no OS (as it is a
bootloader). It is now more than four years old and has branched into
about 6 different "live" branches managed by 6 different teams. Let's
call the branches A..F.

Branch A doesn't get many new features/fixes anymore.
Branch B gets some.
Branch C is pretty stable with few changes allowed by management.
Branch D is still evolving as needed.
Branch E is very active, but with some previous features removed for
space.
Branch F is where most "new" code goes as this product is not yet
released.

I would like to use the git-flow model for each of these branches.
Currently the only way I see to do that is to break these into 6
different repositories and adopt the git-flow branch naming
structure. But this makes promulgation of new features across teams
more difficult. It also needlessly discards the potential automation
of task/feature management using existing git functionality.

So I'd like to keep all these branches in one repository. Thus, git-
flow might have to deal with one team using "A/master", "A/develop",
"A/feature", etc., while another uses "C/master", "C/develop", "C/
feature". And I suppose this is rather trivial, when you get down to
it.
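
To show how small the naming part of this is, here is a minimal Python sketch that derives the per-team branch names described above. The `team_branch` helper and the example feature name are hypothetical. Git-flow itself does not offer this; it is only a sketch of the convention.

```python
def team_branch(team, kind, name=None):
    """Build a team-prefixed branch name such as 'A/master' or 'C/feature/x'."""
    base = f"{team}/{kind}"
    return f"{base}/{name}" if name else base


if __name__ == "__main__":
    for team in ("A", "C"):
        print(team_branch(team, "master"))
        print(team_branch(team, "develop"))
        print(team_branch(team, "feature", "faster-boot"))
```

The naming is the trivial part; deciding how features travel between the team prefixes is where the real work lies.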

But maybe Vincent is correct. Maybe this is a space-thing. Maybe I
should have a common master repo that everyone pushes to. Team A will
push master:A/master and Team C will push master:C/master. But to
what benefit? I still feel a loss of potential git-power here. But
like you, I am uncertain if it can be realized in a clever tool.

git-flow is only a workflow imposition tool. And maybe that is where
it stays. Perhaps this tool you and I are imagining is another tool
altogether. Perhaps it is used to manage task inclusion (see the
PlasticSCM blog entry linked earlier in this thread). Perhaps it has
a nice GUI. Perhaps it can be driven by managers using Excel
printouts in weekly meetings (a common practice).

But maybe it is ancillary to git-flow.

Or maybe it is the core technology on which py-git-flow could be
based.

Don't lose it. It is important, I think.

Thanks,
Phil

Ryan Cross

Nov 10, 2010, 3:32:44 AM11/10/10
to gitflo...@googlegroups.com
My 2 cents... 

I think Phil nails it when he says, "git-flow is only a workflow imposition tool."

git flow comes from a well defined "successful" workflow using git and thus is just some helper code on top of git. It has the following properties that make it useful (and successful?): 

1) it is based on a workflow that is good practice for "most" software projects
2) it doesn't prevent you from using git normally if you are advanced
3) it provides structure (like training wheels) for git newbies and convenience for undisciplined developers

This is all premised on the fact that your software project either:
1) Uses the same workflow that git flow is based on
2) You can modify your workflow or otherwise fit your workflow into the git flow model
3) You don't have a workflow  
4) You don't want to spend any time determining what is best practice for your situation (or you're not experienced enough to know what best practice might be for your project)

If your workflow doesn't fit into the defined model, then git flow is not going to help you. You do not have to use git flow in order to successfully work with git. I personally don't feel it makes sense to unnecessarily complicate the model just to accommodate edge cases. I do look forward to further development around the support branch concept/function, but I think this is something that will likely benefit many projects, not just edge cases. Considering git offers all the flexibility needed (perhaps this is an assumption), I see git flow as something that simplifies the tool, and thus making it more complex goes against its main aim. KISS really applies here. 

Considering the various use cases that Sebastian and Phil have proposed, I would suggest that they may be working against the principles of the project that I described above. The conflict is at the workflow model level, not with the software. 

My recommendation would be to first write up what your proposed workflow model is - what your "best practice" is (look at Vincent's original post). Then it should be easy to compare it against the current git flow model and discuss where modifications of either model might make sense. Also, at that point (regardless of whether modifying anyone's model is helpful) it would be relatively easy to simply write up your own helper scripts to match your workflow... git myflow if you will - the same way the original scripts were born. 

For any sufficiently complex project, with enough developers on it to handle 6 active versions (as Phil mentioned), this shouldn't be a significant amount of overhead to reap the gains provided by the script's efficiency for your team. 

Looking forward to further contributions here. 

-Ryan

Vincent Driessen

Nov 10, 2010, 5:05:21 AM11/10/10
to gitflo...@googlegroups.com
Thanks, guys, for your input. I couldn't have put it as clearly as Ryan did. His mail really helps to get a bit of a grip on the discussion, I think. I'm especially interested in looking at an example of the more complicated workflow that Sebastian and Phil are aiming for to successfully manage the multi-master setup.

I think the list of premises that Ryan put in place is something that helps to scope the git-flow tool, too. In fact, premise 1 ("Your software project uses the same workflow that git flow is based on") is the most important one. But yet, it might be a bit unclear what that means exactly.

One of the key aspects of why this workflow has turned out to be "successful" is that it fits single-master projects perfectly. Branching off in terms of different distributions/flavours is something that the workflow never considered in the first place. Trying to map git-flow onto this workflow is what now causes the pain and confusion.

As long as you develop a single-flavour application, the workflow is well-defined. But if you intend to keep many supported master branches around, this fundamentally erodes the principles of git-flow. I'm not saying doing this is wrong in any way, I'm just saying that git-flow might not be the tool that is going to help you.

I still believe that VCSes are way better at tracking changes over time, rather than space. This is not a Git-specific issue. Therefore, if you manage to *build* flavours from the same, single-master source code, the workflow will fit your project fine again. It's the same space/time discussion.

I'm looking forward to your thoughts.

Cheers,
Vincent

Sebastian Thiel

Nov 12, 2010, 6:48:40 PM11/12/10
to gitflo...@googlegroups.com, Vincent Driessen
The workflow that is currently supported by git-flow may just be the
'hard-coded tip of the iceberg' of the rules that lie underneath it.
Understanding these rules and using them to get a deeper understanding
of the Commit-Graph at hand would allow the elaborate workflows that
could help Phil and me, on top of what git-flow can do already.

The problem here is that such a system, and I am somewhat repeating
myself, still needs to be done, and I don't believe anyone will have the
time for it, especially when you consider that the current git-flow
branching model already suits the majority of the people.

Phil said: "Don't lose it. It is important, I think." - and this is
exactly what I will do (Phil's post was quite encouraging by the way :)
). I believe I was onto something there, and hopefully I will be able to
implement a first prototype at some point. Until then, I am sure the
first version of pygit-flow will be available already, and it will be
great !

To provide a playground for future endeavours, in that direction or in
others, I hereby introduce a new pet-project of mine: pgit (PeeGit -
https://github.com/Byron/pgit).
It's something like a python version of the git command-line tools, and
the most important reason for its existence is the improvement to
git-submodule's features that I have to implement to support the
distribution of one of my internal projects.
Maybe one day, it could host that 'first prototype' I was talking about,
but for now, I shall only add the advanced submodule handling, hopefully
by the end of the next week.

Cheers,
Sebastian

pablo

Nov 22, 2010, 8:35:11 AM11/22/10
to gitflow-users
Hi guys,

We really like the gitflow idea you proposed. We (Codice)
already do much the same graphically, but some of our customers
asked for a CLI too. We've developed (internally, so far) a
workflow (using our workflow library) to support a "flow" from the
CLI. We will post it when we have results.

Regards,

pablo

www.plasticscm.com


On Nov 2, 11:13 am, Mark Derricutt <m...@talios.com> wrote:
> Just a quick response before fully rereading and formulating more thoughts.
>
> I wonder if part of the problem we're having in this discussion is
> terminology.
>
> Under Sebastian's proposed changes, I see the current gitflow "feature"
> being more a "task" - a short lived set of commits that introduce
> functionality [1], whereas a feature is more a block of code with its own
> lifecycle of tasks, releases, hotfixes ( which I still think should live in
> a separate repository ).
>
> Would the renaming of "feature" in to "task" in gitflow be a viable stepping
> stone to introducing the new ideas?
>
> Mark
>
> [1] A good post on this is from the guys at Plastic SCM:http://codicesoftware.blogspot.com/2010/08/branch-per-task-workflow-e...
>
> --
> "Great artists are extremely selfish and arrogant things" — Steven Wilson,
> Porcupine Tree
>
> On Tue, Nov 2, 2010 at 10:04 PM, Sebastian Thiel <byron...@googlemail.com> wrote:
> > Hi Vincent,
> > (or even **can**) be gitflow's responsibility.</vincent>

einm...@gmail.com

Feb 15, 2015, 8:16:45 AM2/15/15
to gitflo...@googlegroups.com, byro...@googlemail.com
Hello everyone!

I'm developing a branching automation tool, similar to git-flow, that solves some problems mentioned in this thread (sorry, I didn't read it entirely):
- it allows maintainers and release managers to pick topics (features in git-flow) into integration/release branches one-by-one or by groups.
- it resolves dependencies between topics

Another feature worth mentioning is that it makes developers solve potential merge conflicts early, before a topic is finished. This allows topics to be mixed and matched into integration/release branches without merge conflicts.

By the way, it's written in python.

I posted an announcement in the git-users group.

I really need some feedback, would appreciate any criticism/comments,
Vasily.

Hartmut Goebel

Feb 16, 2015, 5:11:47 AM2/16/15
to gitflo...@googlegroups.com
Am 15.02.2015 um 14:16 schrieb einm...@gmail.com:
> I'm developing branching automation tool, similar to git-flow, that
> solves some problems mentioned in this thread (sorry, didn't read it
> entirely):
> [...]
> By the way, it's written in python.

A pure-Python implementation of gitflow has been complete for two years now:

https://github.com/htgoebel/gitflow

I invite you to join that project instead of reinventing the wheel.


--
Regards
Hartmut Goebel

| Hartmut Goebel | h.go...@crazy-compilers.com |
| www.crazy-compilers.com | compilers which you thought are impossible |

einm...@gmail.com

Feb 16, 2015, 6:27:06 AM2/16/15
to gitflo...@googlegroups.com, h.go...@crazy-compilers.com

Hello Hartmut!


I’m aware of your gitflow fork.

My intent is not to rewrite gitflow in Python, but to implement an alternative tool to gitflow.

It is similar to gitflow in the sense that it is also a git workflow automation tool, and it also solves some problems mentioned in this thread.


Please take a look at the link I posted.