git version in output files

Marios Chatzikos

Dec 5, 2020, 11:48:09 AM
to cloud...@googlegroups.com
Hello,

With the switch to Git, the infrastructure for reporting the Subversion
version number obviously fails.  Under the old system, Cloudy output
files started with a statement like this:

                                              Cloudy (trunk, r14363, experimental)
www.nublado.org

Instead, what is now reported under Git is the default release date
hardcoded in date.h, which looks like this:

Cloudy 13.06.01
www.nublado.org

The problem is that the SVN-based infrastructure (implemented in
version.cpp) is based on SVN keywords, described here:

http://svnbook.red-bean.com/en/1.7/svn.advanced.props.special.keywords.html

which do not exist in Git.  In particular, the infrastructure relies on
$HeadURL, which expands to, for example:

$HeadURL: svn://svn.nublado.org/cloudy/trunk/source/version.cpp $
$HeadURL: svn://svn.nublado.org/cloudy/branches/newdyna/source/version.cpp $

and which can be parsed to extract the branch name (trunk and newdyna,
respectively).
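
The parsing step described above could be sketched as follows. This is a minimal illustration in Python, not the logic in version.cpp; the helper name branch_from_headurl and the assumption that every URL contains either /trunk/ or /branches/<name>/ are hypothetical.

```python
import re
from typing import Optional

# Hypothetical helper (not taken from version.cpp): extract the branch
# name from an expanded SVN $HeadURL$ keyword.  Assumes the repository
# layout uses /trunk/ or /branches/<name>/ somewhere in the URL.
_HEADURL = re.compile(r"\$HeadURL:\s*\S+?/(trunk|branches/([^/\s]+))/\S*\s*\$")


def branch_from_headurl(keyword: str) -> Optional[str]:
    """Return 'trunk', the branch name, or None if the keyword is unexpanded."""
    match = _HEADURL.search(keyword)
    if match is None:
        return None
    if match.group(1) == "trunk":
        return "trunk"
    return match.group(2)


print(branch_from_headurl(
    "$HeadURL: svn://svn.nublado.org/cloudy/branches/newdyna/source/version.cpp $"))
```

An unexpanded keyword, which is what a Git checkout contains, yields None, and that is why the old infrastructure falls back to the default in date.h.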

With Git, this does not seem possible.  The most directly comparable
entity is the remote.origin.url configuration parameter, which is the
same for all branches, and does not even exist for branches handpicked
from the entire repository (with git pull ... <branch>).  (If you're
aware of some other method, do let me know!)

With this in mind, I've put together the attached script that produces a
version string in the format <branch>-<git rev>-<svn-like rev>.  For
example:

master-19b5e4e-r3
master-76a53b3-r5M
Priyanka-9b468b0-r1M
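
For readers without the attachment, a string in this format could be assembled roughly as below. This is a sketch in Python rather than the attached shell script, whose contents are not shown in this thread. The function names are hypothetical, and two choices are assumptions: the commit count uses git rev-list --count HEAD (every commit reachable from HEAD), and "modified" considers tracked files only.

```python
import subprocess


def format_version(branch: str, short_sha: str, commit_count: int, modified: bool) -> str:
    """Build '<branch>-<git rev>-r<count>' with an 'M' suffix for modified trees."""
    suffix = "M" if modified else ""
    return f"{branch}-{short_sha}-r{commit_count}{suffix}"


def _git(*args: str) -> str:
    """Run a git command in the current directory and return its stripped output."""
    result = subprocess.run(["git", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def version_from_repo() -> str:
    """Collect the pieces from the working copy (assumed choices, see lead-in)."""
    branch = _git("rev-parse", "--abbrev-ref", "HEAD")
    short_sha = _git("rev-parse", "--short", "HEAD")
    # Assumption: count every commit reachable from HEAD.
    commit_count = int(_git("rev-list", "--count", "HEAD"))
    # Assumption: only changes to tracked files mark the tree as modified.
    modified = bool(_git("status", "--porcelain", "--untracked-files=no"))
    return format_version(branch, short_sha, commit_count, modified)


print(format_version("master", "76a53b3", 5, True))
```

Keeping format_version separate from the git calls means the format can be changed, or tested, without a repository at hand.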

A few questions that come to mind:

- Do we want the git revision (SHA1) string reported?  It is definitely
useful, although it might seem bizarre to casual users.

- What about the SVN-like revision number?  This simply counts the
number of commits on the branch.  It carries no real meaning, but it
gives you the old SVN feel.

- Notice that modified repos carry an "M" suffix.  If we were to drop
the SVN-like revision, I could alternatively just print "...-modified"
instead of "...-r1M".


As a proof of concept, the output in the .out file looks like this:

                                                    Cloudy (master-76a53b3-r5M)
www.nublado.org

but I will make it more legible, like the printout with the current
system, once we decide on the format.

Notice that version.cpp is designed to provide much more information to
the user, which would be better retained.  However, some of it assumes
the use of tags, release branches, etc, which I'm not sure if or how we
will port to git.  I just wanted to raise the issue now, but we should
get back to it later.

Also, the information in date.h is badly out of date, and requires
something more current.  I understand this is important only for release
branches, but we might as well use the most recent release, as the
output is confusing (several of us tripped over it yesterday).

Thanks,

Marios

gitversion.sh

Gary J. Ferland

Dec 5, 2020, 12:25:36 PM
to cloud...@googlegroups.com
Hi Marios,

Very informative.  The proposed header looks good - people will eventually become comfortable with a change of format once they have seen it often enough.  So I would go with the most natural git info and not try to make it look like svn.

I want to propose a learning experience.  This page outlines how to get team feedback on pull requests.  As we have discussed, we have to be much more careful about what happens to the master.  Perhaps for this change, and the dynabranch change from a few days ago, we could try to use this method - I would find it very helpful as I learn git.

thanks
Gary

--
Gary J. Ferland
Physics, Univ of Kentucky
Lexington KY 40506 USA
Tel: 859 257-8795
https://pa.as.uky.edu/users/gary

Robin Williams

Dec 5, 2020, 1:14:24 PM
to cloud...@googlegroups.com
Hi Marios --

I think you need to ask what you need the information for.  FWIW, my recollection is that this information was included mostly to get a robust provenance of what version was being used for questions asked by users on the mailing lists.  I guess it's also useful to have this information to hand in a format to be included in publications, as in the example with Nigel's manuscript this week.

The 40-character Git commit hash is designed to be a "universally unique" identifier, which you can use to check out the relevant revision or locate it using a source browser.  I remember one or other git book had an appendix discussing the statistical likelihood of a collision.  So that's all you really need.  It doesn't give that much of a hint to a human, though.  Branch/tag names might be some use for that purpose.
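
To give a feel for the collision odds mentioned above, here is a back-of-the-envelope estimate using the standard birthday-bound approximation. It assumes 160-bit hashes (40 hex characters) that behave as uniformly random values, and it ignores deliberately constructed collisions. The function name is hypothetical.

```python
import math


def collision_probability(n_objects: int, hash_bits: int = 160) -> float:
    """Birthday-bound estimate: p ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = n_objects * (n_objects - 1) / 2.0 ** (hash_bits + 1)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)


# Chance of any accidental collision among a million objects.
print(collision_probability(10**6))
```

Under these assumptions the probability for a repository of realistic size is far below anything that matters in practice.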

I'm not sure how much use finer-scale information than this is.  The nature of a DVCS means that commit hashes and revision numbers along branches are liable to change depending on how you choose to resolve merge conflicts -- projects often decide to rebase & flatten history, which makes things easier to navigate but loses granularity (and changes all the commit IDs, perhaps removing commits altogether).  If you don't rebase, then a branch can have multiple threads leading to a given revision, and multiple revisions at a given depth.  That said, generally runs with these versions will only be of interest to developers, and while they are under active development, so their provenance may be more obvious.

  Robin

Marios Chatzikos

Dec 5, 2020, 3:32:16 PM
to 'Robin Williams' via cloudy-dev
Hi Gary & Robin,

I am aware that git commits may be changed, and that's why I included the SVN-like revision -- being sequential, it's more robust.  The downside is that, unless commit hashes actually change, it is pretty much useless.

Gary, yes, I thought about that, too.  Once we decide on the revision signature, and maybe update the current release version, I could create a branch to commit the changes to the nublado repository, and then submit a pull request for a review.

Robin, take a look at version.cpp to see how fine the information is.  I think there is value in this practice, and if possible, we should continue it.  The question is how to handle tags, and branches.  I think git-describe can be used to get the most recent tag, but I haven't experimented with that.  In any case, one design change will be that the script will have to prepare much of that information, as git does not provide a unique URL for each branch, and it certainly does not support keyword expansion.
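
As a starting point for the git-describe idea, the sketch below parses the output of "git describe --tags --long --dirty", which has the form <tag>-<commits since tag>-g<abbreviated hash>, optionally followed by -dirty. The parser and its names are hypothetical, the tag c20.12 is only an example, and git describe fails outright when no tag is reachable, a case the caller would need to handle.

```python
import re
from typing import NamedTuple, Optional


class Described(NamedTuple):
    tag: str                # most recent reachable tag
    commits_since_tag: int  # 0 means HEAD is exactly the tagged commit
    short_sha: str          # abbreviated commit hash (without the 'g' prefix)
    dirty: bool             # True if the working tree has local changes


_DESCRIBE = re.compile(
    r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)(?P<dirty>-dirty)?$")


def parse_describe(text: str) -> Optional[Described]:
    """Parse 'git describe --tags --long --dirty' output; None if it does not match."""
    match = _DESCRIBE.match(text.strip())
    if match is None:
        return None
    return Described(match.group("tag"), int(match.group("count")),
                     match.group("sha"), match.group("dirty") is not None)


print(parse_describe("c20.12-4-g351120e-dirty"))
```

A commits_since_tag of zero together with dirty == False would identify a pristine tagged version.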

Thanks,
Marios

Robin Williams

Dec 5, 2020, 4:07:21 PM
to cloud...@googlegroups.com
Hi Marios --

I guess my general point is that it's best to concentrate on use cases, rather than trying to necessarily replicate what there is at present -- which was constrained/directed by what could be easily achieved with SVN.

  Robin

Marios Chatzikos

Dec 5, 2020, 4:14:26 PM
to 'Robin Williams' via cloudy-dev
Hi Robin,

I agree, and I said as much in the original message.  Replicating the 'experimental' etc labels should be straightforward, but what to do about tags and release branches remains to be seen.

Thanks again,
Marios

Francisco Guzman Fulgencio

Dec 7, 2020, 8:35:43 AM
to cloud...@googlegroups.com
Hi,

Sequential commit numbers have helped us more than once to bisect quickly for a change and fix the bug it introduced. I remember the problem with the b-factors 5 years ago, where we were also bisecting specific branches.

Is there a way to do this with git tagging? I guess my question is: if we need to bisect the git repository in the future, is there a fast, human-friendly way to do it?

Thanks!

==========================
Francisco Guzmán
Assistant Professor of Physics
Rogers Hall, 106B,
Dahlonega Campus
University of North Georgia


Robin Williams

Dec 7, 2020, 8:40:39 AM
to cloud...@googlegroups.com
Git has a facility "git bisect" which is intended to deal with this automatically.
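
The idea behind git bisect is a binary search between a known-good and a known-bad commit, so sequential revision numbers are not needed. The sketch below shows that search on a simplified linear history; real git bisect works on the full commit graph and is driven by "git bisect start", "git bisect good <rev>", "git bisect bad <rev>", or automatically with "git bisect run <command>". The function and the sample hashes are hypothetical.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Find the first bad commit in a linear history ordered oldest to newest.

    Assumes commits[0] is known good and commits[-1] is known bad.
    """
    lo, hi = 0, len(commits) - 1   # lo: newest known good, hi: oldest known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Made-up history in which the bug first appears in "d4".
history = ["a1", "b2", "c3", "d4", "e5", "f6"]
print(first_bad(history, lambda commit: history.index(commit) >= 3))
```

Each step halves the remaining range, so even a few thousand commits need only about a dozen builds and test runs.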

  Robin

Marios Chatzikos

Dec 7, 2020, 8:52:18 AM
to cloud...@googlegroups.com

Hello,

Feel free to pull the 'version' branch (if you already have a cloudy repo, just do 'git checkout version' -- I could have picked a better name, apologies!) to review the changes I've made.  I have removed the SVN-like revision number, and adjusted the printout to be:

                                                    Cloudy (version, abfc681)
                                               Cloudy (version, 351120e, modified)

depending on the state of the branch.


Regarding tags and release branches, are we going to maintain the previous practice, or are we going toward a 'living' production branch?  I've looked around and found two practices on this:

https://nvie.com/posts/a-successful-git-branching-model/
https://guides.github.com/introduction/flow/

The first is closer to our model under SVN, while the second is more 'modern', but it has a web-apps mentality (as the first explains).  Both treat the master as something sacred, which I guess is in line with our stated goal of not directly committing to it.

Thanks,

Marios

Marios Chatzikos

Dec 7, 2020, 4:41:53 PM
to cloud...@googlegroups.com

And here is the gitlab rebuttal to git-flow:

https://about.gitlab.com/blog/2014/09/29/gitlab-flow/

Gary J. Ferland

Dec 7, 2020, 4:58:03 PM
to cloud...@googlegroups.com
The world around us has changed a lot in the fifteen years since we set up the svn / trac / viewvc & Major-Release-Every-Few-Years-With-Major-Review-Paper workflow model.  Neither is used anymore by the software I track, although this was the model some time ago.  MS announced that Windows 10 would be the last 'version' of windows and that it would evolve incrementally going forward.  I have had several W10 installs here, two on hardware and others in VMs, and there are significant updates to W10 a few times per year and smaller updates every month.  I could name a number of other projects that do this as well.

I would favor going over to a model with much more frequent updates, with the updates being modest and targeted at some feature.  These updates could be done every few months.  That would be more sustainable than our current model.

Francisco Guzman Fulgencio

Dec 8, 2020, 8:17:49 AM
to cloud...@googlegroups.com
Hi,

Here is my point of view.

Maybe I have not understood it well, but it seems to me that the Git flow model and the environment branches model are the same, just with the master swapped for the production branch?

I don't see why we could not have our "production" branch into which we merge ("deploy") from the master branch from time to time (I agree with Gary, maybe deploying twice a year). Meanwhile, we work in our own branches for major projects and commit to the master for fixes.

I also think we should make a point of opening a pull request every time we are going to commit to the master.

What do you think?

Fran


==========================
Francisco Guzmán
Assistant Professor of Physics
Rogers Hall, 106B,
Dahlonega Campus
University of North Georgia


Gary J. Ferland

Dec 8, 2020, 8:39:30 AM
to cloud...@googlegroups.com
Hi Fran,
Perhaps we could talk about this some time?  You mentioned that times after 5:30 don't work for you - when does work?  I found our last discussion very useful.
Gary

Marios Chatzikos

Dec 8, 2020, 2:29:14 PM
to cloud...@googlegroups.com

Hi Gary,

My point was more about how to organize the repository, and less about changing our work flow, although the two are connected at some level.

Regardless of how often we release, the question remains: do we branch off the new version (as we have been), or do we point to a particular (tagged) commit on the master (or a separate release) branch?  I suspect you prefer the latter.

What happens with bug fixes in this model?  I suppose we could point to a newer tag on the branch, but if that branch is master and an intervening branch merge has occurred, the patched version would carry features that "ought to" become public with the next release (in 6 months, say).  Or, we could leave the bug as-is until the next release, which is far from ideal.

With that in mind, I think we should have a release branch parallel to master, and all user-facing bug fixes be applied to it.  Master and feature branches should operate the way you and Fran describe -- which has anyway been the status quo.

I agree with Fran that we should adopt merge/pull requests to merge to master.  In fact, to force the policy, we can make master write-protected.

Regarding the release frequency, I agree that the current model makes releasing code a burden, and that having two rounds a year is reasonable -- that's the Ubuntu model.  We could start with a release in July, c21.07, which should give us enough time to fix anything we want fixed on master, and follow it with a release in January of 2022, c22.01.  Or something like that.
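
The naming scheme suggested by c21.07 and c22.01 (two-digit year, a dot, two-digit month) could be generated mechanically. This is a minimal sketch; the function name is hypothetical and the scheme itself is only the one implied by those two examples.

```python
import datetime


def release_tag(when: datetime.date) -> str:
    """Return a tag of the form cYY.MM for the given release date."""
    return f"c{when.year % 100:02d}.{when.month:02d}"


print(release_tag(datetime.date(2021, 7, 1)))
```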

Thanks,

Marios

Gary J. Ferland

Dec 8, 2020, 2:47:45 PM
to cloud...@googlegroups.com
The current trunk / master is broken.  It is in an unacceptable state.  We have several blocking tickets against it but nobody has time to work on it.  If we can get the master back into shape we could use it as the release.

I pretty much agree with what you said.  We must only work on branches, then carefully decide exactly what goes onto the master.  The branch would have to be carefully studied before reintegration.  With that model, the evolving master could be the release version.  The workflow could be a carefully thought-out reintegration of the branch, a tag created at that point, then an announcement of the tag as the next incremental release.

Marios Chatzikos

Dec 11, 2020, 8:37:23 PM
to cloud...@googlegroups.com

Hello,

The attached patch operates under the assumption that there is a separate release branch into which master reintegrates, say, every 6 months, and on which git tags may be applied.

The printed version in the .out file can be one of the following:

- tagged version on release branch, modified:

                                                    Cloudy (c20.12, modified)

- tagged version on release branch, pristine:

                                                         Cloudy (c20.12)

Other branches are as before; e.g., a modified one:

                                               Cloudy (version, abfc681, modified)

while a pristine one does not carry the last string inside the parentheses.
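
The four cases above can be summarized in a few lines. This is an illustrative sketch in Python, not the logic in version.patch; the function name and the default release branch name "release" are hypothetical.

```python
from typing import Optional


def version_label(branch: str, short_sha: str, modified: bool,
                  tag: Optional[str] = None,
                  release_branch: str = "release") -> str:
    """Return the string printed in the .out header for the given repository state."""
    if tag is not None and branch == release_branch:
        # Tagged version on the release branch: the tag alone identifies it.
        parts = [tag]
    else:
        # Any other branch: report the branch name and abbreviated hash.
        parts = [branch, short_sha]
    if modified:
        parts.append("modified")
    return "Cloudy (" + ", ".join(parts) + ")"


print(version_label("version", "abfc681", modified=True))
print(version_label("release", "abfc681", modified=False, tag="c20.12"))
```

Centering the label in the header is left out here, as it depends on the page width used by the output routines.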

Let me know what you think.

Marios

version.patch

Gary J. Ferland

Dec 11, 2020, 8:53:02 PM
to cloud...@googlegroups.com
This looks good.  Really, it has to go onto the master, then onto c17_branch, if we plan on doing a release of c17.03 soon.  Its next release will be in git.

Marios Chatzikos

Dec 11, 2020, 9:37:17 PM
to 'Robin Williams' via cloudy-dev
Okay, I'll prepare a pull request, and we can review it one final time before we merge.

Marios Chatzikos

Dec 11, 2020, 10:32:26 PM
to cloud...@googlegroups.com

Okay, submitted.

Robin Williams

Dec 12, 2020, 8:14:06 AM
to cloud...@googlegroups.com
FWIW, the GitLab flow discussion all seems sensible to me.  On other projects with dev branches, those branches seem to end up reflecting the kind of project focus that makes sense as the default base branch for ongoing work.  Effective review and testing are a better way of ensuring stability than introducing long feedback loops.

One major issue can be ensuring that merge reviews are done in a timely manner -- otherwise, things are forced to either stall or turn into larger blobs.  On another project I helped out with for a while, it was somewhat frustrating when the lead tended to prioritize merging their own work, meaning repeated rework to my pending requests -- but that was a relatively small project undergoing structural improvements, so that might be less of an issue for Cloudy.

Best wishes
  Robin
