CFEngine Git pull from "branch"


Todd Erwin

Sep 28, 2016, 4:26:01 PM
to help-cfengine
I have a unique situation that I'm trying to remediate, and I'm drawing a blank on a solution.

Using the documentation as a template, let's say I want to update the masterfiles from a GitHub repo.

bundle agent vcs_update
{
  commands:
      "/usr/bin/git"
        args => "pull --ff-only upstream master",
        contain => masterfiles_contain;
}

body contain masterfiles_contain
{
      chdir => "/var/cfengine/masterfiles";
}

Insert this code and things work great. However...

I want to be able to create a "flag file" that names the branch I actually want to pull the update from.

Say I have a pull request and I want to test the new masterfiles that someone is proposing in my environment. I want to be able to set a flag in a file somewhere that tells vcs_update to update from that branch.

Example:

/var/cfengine/tmp/REPOHEAD
Its contents would be "testing-new-masterfiles",
and of course there would be a git branch called testing-new-masterfiles.
Now I need vcs_update to look like this:

bundle agent vcs_update
{
  commands:
      "/usr/bin/git"
        args => "pull upstream testing-new-masterfiles",
        contain => masterfiles_contain;
}

body contain masterfiles_contain
{
      chdir => "/var/cfengine/masterfiles";
}


It would be possible to pass the branch when I call the bundle:

body common control
{
      bundlesequence => { ... vcs_update("testing-new-masterfiles"), };
      .... rest of body code
}

Then the agent bundle would be:

bundle agent vcs_update(branch)
{
  commands:
      "/usr/bin/git"
        args => "pull upstream $(branch)",
        contain => masterfiles_contain;
}

body contain masterfiles_contain
{
      chdir => "/var/cfengine/masterfiles";
}

However, in this case I would have to tweak the bundlesequence every time to include the testing branch, because it is hardcoded.

What I need is an automated way to read the file and set its contents as a variable that I can then feed into the bundlesequence.

Does anyone have ideas on how to do that?

Thanks.

Eric O'Connor

Sep 28, 2016, 4:30:36 PM
to Todd Erwin, help-cfengine
You can create something like this:

bundle agent my_vcs_update {
  vars:
      "branch" string => ... file read;
  methods:
      "call_vcs" usebundle => vcs_update("$(branch)");
}

And then just include "my_vcs_update" in your bundlesequence.
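
One way the elided file read could be filled in, as a minimal sketch rather than a tested implementation. It assumes the flag file path from the question, falls back to "master" when the file is absent (an assumption, not something stated in the thread), and relies on regex_replace(), which is only available in recent CFEngine releases:

```cf3
body common control
{
      bundlesequence => { "my_vcs_update" };
}

bundle agent my_vcs_update
{
  vars:
      # Flag file path taken from the question.
      "flag_file" string => "/var/cfengine/tmp/REPOHEAD";

      # Default branch when no flag file exists (assumption).
      "branch" string => "master";

      # Override from the flag file: read at most 4096 bytes and strip
      # all whitespace, including any trailing newline.
      "branch"
        string => regex_replace(readfile("$(flag_file)", "4096"), "\s+", "", "g"),
        ifvarclass => fileexists("$(flag_file)");

  methods:
      "call_vcs" usebundle => vcs_update("$(branch)");
}

bundle agent vcs_update(branch)
{
  commands:
      "/usr/bin/git"
        args => "pull upstream $(branch)",
        contain => masterfiles_contain;
}

body contain masterfiles_contain
{
      chdir => "/var/cfengine/masterfiles";
}
```

With this layout, switching branches only requires writing a branch name into the flag file; removing the file returns the host to the default. An empty flag file would yield an empty branch name, so a real policy would probably want to validate the value before calling git.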




mike.w...@verticalsysadmin.com

Sep 29, 2016, 3:19:35 AM
to help-cfengine
Hi Todd,

It's less unique than you might think.  ;)  I've worked on this extensively and have an entire suite of scripts relating to this.

On the hubs I work on, the "masterfiles" directory is updated during every "update.cf" run on the policy server to match a specified git revision (whether branch, tag, or commit hash).  (All CFEngine policy work I do is done through Git, of course.)

But I went further than that—that much is available using OOTB code.  I added the capability to have "policy channels."  Each host defaults to pulling policy from /var/cfengine/masterfiles, but can be reassigned (by creating a flag file on the hub) to point it to another directory, e.g. /var/cfengine/policy_channels/experimental_channel/.

Further, the host channel assignments (stored in the /var/cfengine/host_channel_assignments directory) don't specify a git reference—just a policy channel (i.e. a directory in /var/cfengine/policy_channels).  Each policy channel can be independently assigned to a specific git revision, from a single "parameters" file.

As I said, there are a great many complementary scripts that go along with this framework, but you can see the initial versions (of the basic "staging" script that pulls from git), here: https://github.com/cfengine/core/pull/2465  (Warning: This is *not* the user-friendly version for reading!)

Does that sound like something you'd be interested in?

If so, give me a nudge.  I've been meaning to do a proper writeup and post the framework, including the scripts and policies that make it easy to work with, to help the wider CFEngine community.  If you're interested, it will be a nice little incentive for me to actually invest the work to polish up the documentation and publish it.  ;)

Best,
--Mike Weilgart
Vertical Sysadmin, Inc.

Eric O'Connor

Sep 29, 2016, 4:46:24 AM
to mikeweilgart, help-cfengine
Heh, here is another stab at this problem:

https://gist.github.com/oconnore/7ddc34aaff280ae38a20e313f8270b92



Ted Zlatanov

Sep 29, 2016, 8:37:32 AM
to help-c...@googlegroups.com
On Thu, 29 Sep 2016 00:19:34 -0700 (PDT) mike.w...@verticalsysadmin.com wrote:

mw> But I went further than that—that much is available using OOTB code. I
mw> added the capability to have "policy channels." Each host defaults to
mw> pulling policy from /var/cfengine/masterfiles, but can be reassigned (by
mw> creating a flag file on the hub) to point it to another directory, e.g.
mw> /var/cfengine/policy_channels/experimental_channel/.

mw> Further, the host channel assignments (stored in the
mw> /var/cfengine/host_channel_assignments directory) don't specify a git
mw> reference—just a policy channel (i.e. a directory in
mw> /var/cfengine/policy_channels). Each policy channel can be independently
mw> assigned to a specific git revision, from a single "parameters" file.

I know there's a cf-serverd "shortcut" option to do exactly this on the
server side. Was it not useful?

Ted

Nick Anderson

Sep 30, 2016, 11:17:24 AM
to help-c...@googlegroups.com

Ted Zlatanov <t...@lifelogs.com> writes:
> I know there's a cf-serverd "shortcut" option to do exactly this on the
> server side. Was it not useful?

For the "shortcut" feature to be useful, the server must map the client
requests to a local directory. I believe Mike's current implementation
makes these decisions entirely on the agent side. I think a shortcut for
each policy channel could be implemented, but that would add complexity
that might not be necessary right now.
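
For readers unfamiliar with the feature, a server-side shortcut for one channel might look roughly like the sketch below. The bundle name and the client network are hypothetical, the directory is the example path from Mike's description, and this is not taken from his implementation:

```cf3
bundle server policy_channel_access_sketch
{
  access:
      # Clients may request this directory by the short name
      # "experimental_channel" instead of the full server-side path.
      "/var/cfengine/policy_channels/experimental_channel/"
        shortcut => "experimental_channel",
        admit_ips => { "192.0.2.0/24" };   # hypothetical client network
}
```

A client's copy_from body would then name "experimental_channel" as its source. The point raised in this thread still applies: each new channel means another access promise on the server.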

Aleksey Tsalolikhin

Sep 30, 2016, 11:27:03 AM
to Nick Anderson, help-cfengine
Exactly, we did not want to have to update CFEngine policy code (e.g., add or change shortcuts) whenever we wanted to add hosts, or re-assign hosts from one channel to another (e.g., to try out a new feature).

We did consider the shortcuts feature when you offered it, thanks, Ted.  :)



Nick Anderson

Sep 30, 2016, 11:27:15 AM
to Eric O'Connor, mikeweilgart, help-cfengine

Eric O'Connor <er...@oco.nnor.org> writes:

> Heh, here is another stab at this problem:
>
> https://gist.github.com/oconnore/7ddc34aaff280ae38a20e313f8270b92

Do you think any of that work would fit well with the vcs bundles in the stdlib [1]?

[1] https://github.com/cfengine/masterfiles/blob/master/lib/vcs.cf

Eric O'Connor

Sep 30, 2016, 12:14:22 PM
to Nick Anderson, mikeweilgart, help-cfengine
I'm not sure, but I'm interested! I think trivially the umask and workdir support could be pulled over.

Trickier is, I wrote this because I wanted to support convergent git operations for _consuming_ an external git repo:

A promise that the git repository should have checked out, in this case, the latest "unstable" branch of https://github.com/NixOS/nixpkgs.
A promise that the git repository is readable by anyone in the "nixpkgs" group.
A promise that there is also a worktree (so I don't duplicate commit data) with the stable branch checked out.

The vcs.cf stuff is, it seems, geared towards using CFEngine to automate _producing_ a local git repo (presumably without collaboration, since automatically resolving merge conflicts is an unsolved problem).

For example:

vcs.cf :: git_checkout works exactly like you would expect, running git checkout <branch>
my git.cf :: git_checkout obliterates everything except for a precise remote/branch with an optional commit id.

Using either method for the opposite purpose will do the wrong thing.
So, I guess the question is, how would we build a convergent interface that addresses both concerns, and make it clear how they're different?

Eric


Ted Zlatanov

Oct 1, 2016, 8:00:28 AM
to help-c...@googlegroups.com
On Fri, 30 Sep 2016 10:14:14 -0600 Eric O'Connor <er...@oco.nnor.org> wrote:

EO> vcs.cf :: git_checkout works exactly like you would expect, running git checkout <branch>
EO> my git.cf :: git_checkout obliterates everything except for a precise remote/branch with an optional commit id.

EO> Using either method for the opposite purpose will do the wrong thing.
EO> So, I guess the question is, how would we build a convergent interface that addresses both concerns, and make it clear how they're different?

The vcs.cf bundles replicate Git commands. For instance git_checkout()
runs a `git checkout`. Deploying a repository in a clean way is a nice
use case that's not handled currently by any single Git command.

So I would take your git_checkout bundle, put it in vcs.cf, and call it
git_mirror or something like that. I think it will be useful!
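
To make the idea concrete, here is a rough sketch of what a mirroring bundle could look like. It is not Eric's gist and not part of vcs.cf; the bundle and body names are hypothetical, and it assumes the repository has already been cloned at repo_path:

```cf3
bundle agent git_mirror_sketch(repo_path, remote, branch)
{
  commands:
      # Promises of one type run in the order written: fetch first, then
      # force the work tree onto the fetched branch, then drop untracked files.
      "/usr/bin/git fetch $(remote)"
        contain => git_mirror_sketch_dir("$(repo_path)");

      "/usr/bin/git checkout --force -B $(branch) $(remote)/$(branch)"
        contain => git_mirror_sketch_dir("$(repo_path)");

      "/usr/bin/git clean -fd"
        contain => git_mirror_sketch_dir("$(repo_path)");
}

body contain git_mirror_sketch_dir(dir)
{
      chdir => "$(dir)";
}
```

This is deliberately destructive: local commits and uncommitted changes on the branch are discarded. It also reruns the git commands on every agent run instead of first checking whether the work tree already matches, does not handle a pinned commit id, and leaves ignored files alone.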

I'd look at the Ansible Git module as an example of the typical options
and behavior users might want in a mirroring module:
http://docs.ansible.com/ansible/git_module.html

They are currently: accept hostkey; use a private key; bare repo; do not
clone if it doesn't exist; clone depth; force clean; recursive or not;
clone by branch, tag, or refspec; name a specific origin to use;
ssh_opts wrapper for GIT_SSH; track submodules; retrieve new revisions
or not; track submodules or not; GPG verify commits.

These options don't have to all be parameters to the bundle. You can
just have an "options" parameter, a data container, which has these (and
you could even have the same option names as the Ansible module, if that
makes sense).

Then just merge the options data container with your defaults and you're
all set. But implementing the behavior for each of these options is not
trivial, so I hope you find it interesting. If you're not interested in
doing them all, put them as TODO items in the bundle docs, and others
can then implement them.
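
As a small illustration of the merge step, assuming option names borrowed from the Ansible module and purely illustrative values (the bundle name is hypothetical):

```cf3
bundle agent git_mirror_options_sketch
{
  vars:
      # Illustrative defaults for a mirroring bundle.
      "defaults" data => parsejson('{ "remote": "origin", "version": "master", "depth": "0", "force": "no" }');

      # What a caller might supply; hypothetical values.
      "options" data => parsejson('{ "version": "testing-new-masterfiles", "force": "yes" }');

      # Later arguments win, so caller options override the defaults.
      "effective" data => mergedata("defaults", "options");

  reports:
      "remote=$(effective[remote]) version=$(effective[version]) depth=$(effective[depth]) force=$(effective[force])";
}
```

Options that are not implemented yet can simply be left out of the defaults and listed as TODO items until someone picks them up.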

Ted

Ted Zlatanov

Oct 1, 2016, 8:15:48 AM
to help-c...@googlegroups.com
AT> On Fri, Sep 30, 2016 at 8:17 AM, Nick Anderson <nick.a...@cfengine.com> wrote:
>> Ted Zlatanov <t...@lifelogs.com> writes:
>> > I know there's a cf-serverd "shortcut" option to do exactly this on the
>> > server side. Was it not useful?
>>
>> For the "shortcut" feature to be useful the server must map the client
>> requests to a local directory. I believe Mikes current implementation is
>> completely agent side decisions. I think that shortcut for each policy
>> channel could be implemented, but that would add additional complexity
>> that might not be necessary right now.

On Fri, 30 Sep 2016 08:26:41 -0700 Aleksey Tsalolikhin <ale...@verticalsysadmin.com> wrote:

AT> Exactly, we did not want to have to update CFEngine policy code (e.g., add
AT> or change shortcuts) whenever we wanted to add hosts, or re-assign hosts
AT> from one channel to another (e.g., to try out a new feature).

AT> We did consider the shortcuts feature when you offered it, thanks, Ted. :)

I've seen policy distribution either fully decentralized (Git checkouts)
or fully centralized (directories on a central server). So without
talking about the "shortcut" feature, I wanted to understand policy
channels better. I saw Mike's pull request but only looked at it as
code, not to see the bigger picture. I'm also a bit confused because I
remember there was some discussion of policy channels on the issue
tracker:

https://tracker.mender.io/browse/CFE-2069
https://tracker.mender.io/browse/CFE-2095

So first of all, I hope this work ends up in a place where Community and
Enterprise users can benefit from it. I think it's valuable.

To my mind, the benefits of fully decentralized are: independent agents;
easy switching to different policies, especially for testing; fewer
points of failure; local validation of policy (especially useful when
clients can run different CFEngine versions and platforms). The benefits
of fully centralized are a consistent security model; central policy
validation before it's pushed out; and better knowledge of the hosts.

The model Mike described seems halfway between those two. Can you
explain how it rates or prioritizes those features? For instance, when
and where is policy validated?

Thanks
Ted

Ted Zlatanov

Oct 4, 2016, 6:06:14 AM
to help-c...@googlegroups.com
On Sat, 01 Oct 2016 07:59:50 -0400 Ted Zlatanov <t...@lifelogs.com> wrote:

TZ> On Fri, 30 Sep 2016 10:14:14 -0600 Eric O'Connor <er...@oco.nnor.org> wrote:
EO> vcs.cf :: git_checkout works exactly like you would expect, running git checkout <branch>
EO> my git.cf :: git_checkout obliterates everything except for a precise remote/branch with an optional commit id.

...
TZ> So I would take your git_checkout bundle, put it in vcs.cf, and call it
TZ> git_mirror or something like that. I think it will be useful!

Eric, are you interested in doing this? If not, I can take a look, but
it's your code...

Thanks
Ted

Eric O'Connor

Oct 4, 2016, 10:04:02 AM
to help-cfengine, help-c...@googlegroups.com
I was planning on looking at it later this week, but I'm happy to collaborate on it with you if you'd like to get moving sooner (let's keep a git repo synced up -- in general but also while we work on this!).
If I do get started first, I'm sure I could use help matching features in Puppet or Ansible :)

Do you have experience contributing to the standard library? Is it kosher to use things like execresult and awk here?

Eric



Ted Zlatanov

Oct 4, 2016, 10:29:25 AM
to help-c...@googlegroups.com
On Tue, 04 Oct 2016 08:03:49 -0600 Eric O'Connor <er...@oco.nnor.org> wrote:

EO> I was planning on looking at it later this week, but I'm happy to collaborate on
EO> it with you if you'd like to get moving sooner (lets keep a git repo synced up
EO> -- in general but also while we work on this!).
EO> If I do get started first, I'm sure I could use help matching features in Puppet or Ansible :)

The easiest way is on Github: fork
https://github.com/cfengine/masterfiles and make a pull request against
it with your changes. Add me as a write collaborator to your repo
(username tzz) and I'll be able to modify the pull request as well.

EO> Do you have experience contributing to the standard library?

Yes, the latest ones were the testing bundles:
https://github.com/cfengine/masterfiles/pull/766 but historically I've
touched the masterfiles quite a bit :)

EO> Is it kosher to use things like execresult and awk here?

It really depends. The latest CFEngine has regex_replace() and other
useful functions, so you can do a lot internally. Do your pull request
the way it works for you, and I can make changes to it.
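
A small sketch of the trade-off, with a hypothetical bundle name: execresult() is still needed to ask git a question, but the text processing that might otherwise be piped through awk can stay inside the agent with regex_replace(). It assumes a git new enough to support the -C option:

```cf3
bundle agent current_branch_sketch
{
  vars:
      # Ask git which ref HEAD points at, e.g. "refs/heads/master".
      # "noshell" avoids spawning an extra /bin/sh.
      "ref" string => execresult("/usr/bin/git -C /var/cfengine/masterfiles symbolic-ref HEAD", "noshell");

      # Trim the prefix internally instead of piping through awk.
      "branch" string => regex_replace("$(ref)", "^refs/heads/", "", "g");

  reports:
      "masterfiles is on branch $(branch)";
}
```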

Similarly, don't worry about all the options I mentioned, just do what
works for you and add TODO items for the rest. I will help you fill
those out, and I'm sure others will be glad to give us a hand as well.

Ted

Ted Zlatanov

Oct 13, 2016, 1:38:18 PM
to help-c...@googlegroups.com

I haven't heard back from Mike or Aleksey, so I wanted to post an
article that explains some of the issues with deploying from Git, and
why it's not for everyone (especially without a staging area):
http://gitolite.com/deploy.html

Thanks
Ted

Mike Weilgart

Oct 13, 2016, 2:02:24 PM
to help-c...@googlegroups.com
Hi Ted,

Nice summary! Yes, what we're doing is the closest to: "archive dump with staging -- the absolute simplest in terms of git concepts. It's inefficient but compensates by allowing atomic switchovers." Except that currently we are using CFEngine update policy to get the policy *from* the various directories on the policy server—we are only using Git to *populate* those directories. So we're not doing archive dumps.

The inefficiency of the "churn" caused by simply replacing each entire policy channel directory on every run of the staging script was the one complaint Nick Anderson had about the code. I did it that way because of the complexity you mentioned.

Actually, you missed some other edge cases in your writeup: if the contents of ".gitignore" in a directory you are trying to update don't match what you expect, you can get even stranger results. Your check for "untracked files" may fail; your "force checkout" may not overwrite files that are being ignored (but aren't supposed to be). Ultimately, it is so complex to consider all the possibilities that I didn't even bother listing them all out; I just skipped the whole set of problems that comes with trying to update the files in place.

In other words, my staging script that sets up the policy channel directories is idempotent, but it is NOT convergent.

Using Git repositories over the network for *convergent* policy channel deployments will probably include use of plumbing commands "git write-tree" and "git update-index", as I mentioned earlier, rather than just bog-standard porcelain command use. (Remember, porcelain commands are designed for *collaboration*, not so much for deployments.)

I'll check out git-deploy; thanks for the tip.

Best,
—Mike Weilgart
Vertical Sysadmin, Inc.