cf keys on agent node

Rad

Sep 27, 2015, 9:32:14 AM

to help-cfengine

Greetings.

We have a situation where the device running cf-serverd may need to be swapped out with no chance to retrieve the keys from the old device. Every agent device then needs to delete the hub's public keys (the root-* files in ppkeys); a re-bootstrap should be unnecessary, since trustkeys is configured.

Without implementing a complex mechanism for the agent devices to detect this swap-out, the agent nodes can be rebooted, and on startup they should be able to detect that the hub has been swapped and delete the keys. Is there a way to detect an authentication failure on the agent nodes (other than messy parsing of output logs)?

Cheers

Rad

Brian Bennett

Sep 27, 2015, 12:35:31 PM

to Rad, help-cfengine

What I usually do is bootstrap a new policy_hub alongside the old policy_hub. Then, on the old policy_hub, I add the following commands promise, which applies to all nodes:

"$(sys.cf_agent) -B new_policy_hub.example.com";

Using cf-key -s on the new hub, you can tell when all nodes have re-bootstrapped to the new hub, so you know when it's safe to shut off the old one.

The other thing you can do is make sure the new hub has the same IP as the old one and copy the keys in $(sys.workdir)/ppkeys from the old hub to the new hub. In this case the agents won't know the difference.

--
Brian Bennett
Looking for CFEngine training?
http://www.verticalsysadmin.com/

Ramakant Duggal

Sep 27, 2015, 5:31:04 PM

to Brian Bennett, help-cfengine

Thanks Brian.
In the operational scenario where a faulty device is replaced, this solution isn't feasible. It's not even possible to know which device will replace the faulty one before the faulty one dies.
There might be up to 20 agent nodes attached to this hub on a private LAN.
I am tempted to have the startup script on the agents blow away the root-* keys at each restart.

Unless there is a tool that can detect an authentication failure caused by incorrect keys?
Perhaps a "checkauth" promise could be written to check key validity at startup on the agent nodes?

Brian Bennett

Sep 27, 2015, 7:28:41 PM

to Ramakant Duggal, help-cfengine

Is this a hypothetical question or did this actually happen and now you need to recover?

--

Brian Bennett

Looking for CFEngine training?

http://www.verticalsysadmin.com/

Ramakant Duggal

Sep 27, 2015, 7:44:39 PM

to Brian Bennett, help-cfengine

It's not hypothetical; this will happen as part of regular operations. All this stuff is on a public transport vehicle (e.g., a tram or bus). The driver console (which has more grunt than the ticket validators) controls up to 20 ticket validators on the vehicle LAN. If the console breaks, it gets swapped out; the new unit commissions itself by getting stuff over 3G and steps in as the policy hub for the on-board devices.

I am wondering why there is no way (perhaps there is) of detecting broken auth on the agent devices and dumping the root-* files once it's known that auth is failing. The trustkeys setting should ensure smooth operation from that point on.

Brian Bennett

Sep 27, 2015, 10:07:55 PM

to Ramakant Duggal, help-cfengine

Ok, you're in a pretty atypical situation.

In general, it's expected that the policy hub will be stable and that any replacement will happen in an orderly way. You'll need to work out a mechanism of your own for detecting a dead policy_hub (because I don't know exactly what that means for your environment). If cf-serverd is linked with avahi, you can configure it to publish an mDNS name (I believe it's _cfenginehub._tcp.local); then, when a dead hub is detected, (optionally) clear the keys and re-bootstrap.

Here's something that may work, but is completely untested. In particular, I don't know if host2ip works for mDNS names. I think it should as long as your system resolver can. But YMMV.

# Copyright 2015, Brian Bennett <bah...@digitalelf.net>

#   Licensed under the Apache License, Version 2.0 (the "License");
#   you may not use this file except in compliance with the License.
#   You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
#   Unless required by applicable law or agreed to in writing, software
#   distributed under the License is distributed on an "AS IS" BASIS,
#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#   See the License for the specific language governing permissions and
#   limitations under the License.
bundle agent rebootstrap
{
  vars:
      # Untested: relies on the system resolver handling the mDNS name.
      "hub_mdns_ip" string => host2ip("_cfenginehub._tcp.local"),
        comment => "Find the currently published policy hub IP address";

  classes:
      # strcmp() is true when the two strings are equal, so negate it with "not".
      "hub_has_changed" not => strcmp("$(hub_mdns_ip)", "$(sys.policy_hub)"),
        comment => "Flag a change if the mDNS address does not match the system policy hub";

  commands:
    hub_has_changed::
      "$(sys.cf_agent) -B $(hub_mdns_ip)"
        comment => "Bootstrap to the newly detected mDNS advertised policy hub";
}

Be warned that this has a huge security hole. Anybody joining your network can advertise a policy hub and hijack all of your nodes. This should only be used on a tightly controlled (and preferably air-gapped) LAN segment.

Note, I feel a little weird putting in an explicit license, but in this case I feel compelled to. It's just meant to call out that this isn't supported or warranted by either CFEngine or Vertical Sysadmin, and it also protects me, because your environment seems a little high stakes. Not that I distrust you, but I don't know you or your employer. I hope you understand.

At any rate, I find your problem very interesting. I'm also very interested to know if you're able to work out a viable solution.

Good luck =)

--
Brian Bennett
Looking for CFEngine training?
http://www.verticalsysadmin.com/

Ramakant Duggal

Sep 29, 2015, 6:33:45 AM

to Brian Bennett, help-cfengine

Hi Brian,

Given the butt-crazy deadlines we have, I am doing the quick-and-dirty (but "viable") thing. A start-up script can detect an auth failure through a cf-agent dry run, clear the public keys (using cf-key -r to keep the db clean), and re-bootstrap (if necessary). The approach treats every auth failure as caused by an invalid key. Clearing the key might not fix an auth failure arising from other causes, but I guess we are no worse off for that.
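
The start-up flow described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the binary paths assume a default /var/cfengine install, and `auth_ok` is a hypothetical callback standing in for whatever check the start-up script uses to decide that authentication against the hub is failing (the thread does not settle how that is detected). The two commands themselves, `cf-key -r` and `cf-agent -B`, are the ones mentioned in this thread.

```python
import subprocess
from typing import Callable, List

# Assumed install locations; adjust for the actual work directory.
CF_KEY = "/var/cfengine/bin/cf-key"
CF_AGENT = "/var/cfengine/bin/cf-agent"


def recovery_commands(hub: str) -> List[List[str]]:
    """Commands that forget the old hub key and bootstrap to the hub again."""
    return [
        [CF_KEY, "-r", hub],    # remove the stored key(s) for this hub
        [CF_AGENT, "-B", hub],  # bootstrap, which re-establishes trust
    ]


def recover_if_needed(
    hub: str,
    auth_ok: Callable[[str], bool],
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> bool:
    """Run the recovery commands only when auth_ok(hub) reports a failure.

    auth_ok is a hypothetical, caller-supplied check. Returns True if
    recovery was attempted, False if nothing was done.
    """
    if auth_ok(hub):
        return False
    for cmd in recovery_commands(hub):
        run(cmd)
    return True
```

A start-up script would call `recover_if_needed(hub_address, my_check)` once at boot. Passing `run` as a parameter keeps the decision logic testable without touching real keys.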

I wish there were a "cf-ping" that would diagnose connection/authentication problems and could be used here.
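
As a very rough starting point for such a tool, here is a sketch that covers only the connection half of the problem: it checks whether something accepts TCP connections on the hub's cf-serverd port (5308 by default). It does not speak the CFEngine protocol and cannot tell whether the keys are valid, so the function name and its use as a "ping" are illustrative assumptions, not an existing CFEngine utility.

```python
import socket

CFENGINE_PORT = 5308  # default cf-serverd port


def hub_reachable(host: str, port: int = CFENGINE_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only shows that a listener is present; it says nothing about
    whether key-based authentication would succeed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A check like this can at least separate "hub is down or unreachable" from "hub is up but something else, possibly the keys, is wrong".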

What you suggest is a great idea for the central hub(s), which are VM(s) in the customer's private cloud. They have a fairly large network of transit devices, fixed and mobile.

Actually, your suggested solution also looks good for the on-board LAN of 20-odd devices, which are PPC hardware. In the longer term, avahi with mDNS could facilitate smooth replacement of the hub (driver console) on vehicles. BTW, removing the public keys is necessary in the above to ensure the bootstrap succeeds. Is the re-bootstrap actually needed only if the new hub is expected to deliver a different policy set?

Thanks for your input on this.

Cheers

Rad

Brian Bennett

Sep 29, 2015, 8:58:09 PM

to Ramakant Duggal, help-cfengine

Re-bootstrapping isn't strictly necessary, but it's a single instruction that performs all of the steps you need to get set up with a new hub, so it's the easiest way. Performing a bootstrap is what you want to do; merely clearing keys is not enough. CFEngine uses a special trust mode during bootstrap to exchange keys that isn't available at other times.

--

Brian Bennett

Looking for CFEngine training?

http://www.verticalsysadmin.com/

Ramakant Duggal

Sep 30, 2015, 12:55:10 AM

to Brian Bennett, help-cfengine

It just dawned on me that you are right: bootstrap on the client side is the equivalent of the "trustkeys" ACL in the server control body, and there is no client-side trustkeys ACL.

cf-key --trust-key <key> was looking promising, but apparently the "<key>" to be specified is a file, which hasn't landed on the agent node yet, so it's a chicken-and-egg problem. I wonder what the purpose of cf-key -t is, then?

cf-ping sounds like a useful idea; perhaps I will write it.
