Good documentation somewhere for doing a cert-roll?

Dan Mahoney, System Admin

May 13, 2016, 5:57:57 PM
to puppet...@googlegroups.com
Hey there puppet-users denizens.

One of my puppet agents helpfully reminded me that my root CA cert is due
to expire within a few months, and I'm wondering what the best way to go
about rolling it over would be.

A lot of my reading suggests something like "burn everything involving
certificates to the ground and start your entire CA infrastructure over
from scorched earth" is an approximation of the way to go.

From the various looks and reading I've done, this was one of those parts
of puppet that had some serious technical debt involved in authoring it.

I've likened puppet's SSL config to how I might manage an SSL cert on my
webserver/clients, and I'm seeing a disconnect, since many of the things
I'd do in those cases don't work here..

In short -- I think the following problems still exist:

* There's still no support for putting multiple certificate files as the
puppet CA -- all must still be signed by a common root entity. Is this
correct? (In the "web" analogy, my browser could have lots of built-in
and additional trust-points, both corporate and as-shipped).

* There's no directive I can find whereby puppet agents can, within N days
of expiry, re-request their certificate, while maintaining a valid one in
the meantime. On the puppet master, a duplicate cert is treated as an
absolute error and must be purged from both sides with extreme prejudice
and started over.

* There's no way the puppet master itself can have multiple trust points.
(I.e. old CA and new CA) -- in the real world, of course, I can have
multiple CA files from which I can trust clients, for example, for SMTP
auth.

* Puppet has no concept of a CA Path, rather than a CA file. And since
certificates are multi-line blocks in text files, they're a real pain to
manipulate with Augeas or shell scripts.

* There's no way the master can say "multiple public keys for the same
cert are bad, but we will re-sign *existing* keys that are merely near
expiry." (Which is a thing we might do in PGP). And even if we could
define such a policy, there's no support in the agent to do such a thing.

* There's no way to have the puppet-master auto-sign a cert, based on the
presence of some sort of file or hash on the node, similar to the above.

* We blindly trust the first CA we get (using the default options), but
then have NO real method for accepting a second CA without manually
manipulating the CA files directly. (DANE, anyone?)

* Your old CA, Puppet Master Certs, and Client Certs, are all an
ecosystem, and there's no way to replace any one of them without having to
replace the whole thing.

Am I getting this wrong? Please tell me I am.

-Dan

--

--------Dan Mahoney--------
Techie, Sysadmin, WebGeek
Gushi on efnet/undernet IRC
ICQ: 13735144 AIM: LarpGM
Site: http://www.gushi.org
---------------------------

Eric Sorenson

May 17, 2016, 3:16:51 AM
to Puppet Users
On Friday, May 13, 2016 at 2:57:57 PM UTC-7, Dan Mahoney wrote:
> Hey there puppet-users denizens.
>
> One of my puppet agents helpfully reminded me that my root CA cert is due
> to expire within a few months, and I'm wondering what the best way to go
> about rolling it over would be.
>
> A lot of my reading suggests something like "burn everything involving
> certificates to the ground and start your entire CA infrastructure over
> from scorched earth" is an approximation of the way to go.

Hi Dan, this is a good and timely post. I'm working on some related issues
regarding Puppet's CA that may help you out. Your thinking on this is 
roughly correct -- things are a lot harder than they need to be, but the
above advice to nuke everything and start over is both overly simplistic
and wrong-headed.

Note that my comments here are specifically about the Clojure CA that
is included in puppetserver, not the Ruby CA; most things apply to both
but the past couple of years of server-side bugfixes and development energy
have gone into the Clojure CA, and Puppet 5 will consolidate
all the CA-side cert lifecycle onto this codebase. 
 

> From the various looks and reading I've done, this was one of those parts
> of puppet that had some serious technical debt involved in authoring it.
>
> I've likened puppet's SSL config to how I might manage an SSL cert on my
> webserver/clients, and I'm seeing a disconnect, since many of the things
> I'd do in those cases don't work here..

You're right that the agent SSL code is very old and badly needs an overhaul.
For some interesting historical context, check out this Redmine bug and
the related issues that it links to:

 
> In short -- I think the following problems still exist:
>
> * There's still no support for putting multiple certificate files as the
> puppet CA -- all must still be signed by a common root entity.  Is this
> correct?  (In the "web" analogy, my browser could have lots of built-in
> and additional trust-points, both corporate and as-shipped).

Have you verified experientially that this doesn't work in current Puppet
versions? I am working on one variant of this (chain-of-trust with root
and intermediate CA in $ssldir/certs/ca.pem) and it does work. That's
slightly different to what you're saying though, which is that any issuer
in that file should be considered valid. Due to some confusion, the
ca_crt.pem which the agent downloads can't contain a bundle, but I believe
if you "pre-seed" a valid bundle into that location the agent code will do
the right thing.

You're right that the agent does not support a CApath, in openssl parlance: a directory
of hashed CA certs, any of which are valid. The server side farms out its SSL verification
to the underlying web stack, so it ought to be tolerant of agents issued from
multiple CAs checking in. I haven't tried this angle yet.
 

> * There's no directive I can find whereby puppet agents can, within N days
> of expiry, re-request their certificate, while maintaining a valid one in
> the meantime.  On the puppet master, a duplicate cert is treated as an
> absolute error and must be purged from both sides with extreme prejudice
> and started over.

The first part is true, the second is controlled by the 'allow-duplicate-certs' CA setting
which will allow later requests to overwrite newer ones. 
 

> * There's no way the puppet master itself can have multiple trust points.
> (I.e. old CA and new CA) -- in the real world, of course, I can have
> multiple CA files from which I can trust clients, for example, for SMTP
> auth.
>
> * Puppet has no concept of a CA Path, rather than a CA file.  And since
> certificates are multi-line blocks in text files, they're a real pain to
> manipulate with Augeas or shell scripts.

As I said above, on the master the cert verification is delegated to the
web server layer (jetty in the case of the puppetserver, apache or nginx
or (gah) webrick for non-puppetserver setups). So agent verification on the 
master has a lot more going for it than the agents verifying the master's
identity. 
 

> * There's no way the master can say "multiple public keys for the same
> cert are bad, but we will re-sign *existing* keys that are merely near
> expiry." (Which is a thing we might do in PGP).  And even if we could
> define such a policy, there's no support in the agent to do such a thing.
>
> * There's no way to have the puppet-master auto-sign a cert, based on the
> presence of some sort of file or hash on the node, similar to the above.

There's nothing built-in that does either of these things. But policy-based
autosigning provides an API that lets you do this based on some
'a priori' knowledge you have of the node:

https://docs.puppet.com/puppet/4.4/reference/ssl_autosign.html

This is an interesting line of thought that I'm looking into more on the CA side:
you can re-use the same private key to generate a new certificate that would
have an extended lifetime but not require a complete re-key.

There isn't an API guarantee that agents' certs (and therefore their public keys)
are collected and stored in the CA, though such a thing would be very useful
and is on deck for future work - you can see the whole list of these things at
https://tickets.puppetlabs.com/browse/SERVER-974

> * We blindly trust the first CA we get (using the default options), but
> then have NO real method for accepting a second CA without manually
> manipulating the CA files directly.  (DANE, anyone?)

I don't know what DANE means in this context, but this statement is true.
 

> * Your old CA, Puppet Master Certs, and Client Certs, are all an
> ecosystem, and there's no way to replace any one of them without having to
> replace the whole thing.
>
> Am I getting this wrong?  Please tell me I am.

I have a WIP doc adding support for intermediate CAs up at:

https://gist.github.com/ahpook/06d4cfda1d68c08bc82fbfdc40123b28

This may help as many of the goals (and the corresponding implementation bugs or missing features) overlap with your use-case.

There's also some tooling that may be helpful from an old and incredibly annoying CVE


(tar gz linked off that page)

I'm definitely interested in keeping in touch as you go through this, to make things easier for other people. I think a lot of sites are coming up on the five year anniversary of their installation and the easier this kind of re-keying can be, the better.

--eric0

Dan Mahoney, System Admin

Jun 15, 2016, 2:21:24 AM
to Puppet Users
On Tue, 17 May 2016, Eric Sorenson wrote:

> Hi Dan, this is a good and timely post.

I apologize for the lack of response. Health issues have taken a front
seat for a while.

> I'm working on some related issues regarding Puppet's CA that may help
> you out. Your thinking on this is roughly correct -- things are a lot
> harder than they need to be, but the above advice to nuke everything and
> start over is both overly simplistic and wrong-headed.

Funny, that's pretty much exactly the advice I'm seeing here:

https://docs.puppet.com/puppet/4.4/reference/ssl_regenerate_certificates.html

Once you blow away your old CA, *none* of your agents work. If you've
been running puppet for five years, not everything is due to expire all at
once.

I've found a way forward that I think is reasonably clever, and I'll go
into it below.

> Note that my comments here are specifically about the Clojure CA that
> is included in puppetserver, not the Ruby CA; most things apply to both
> but the past couple of years of server-side bugfixes and development energy
> have gone into the Clojure CA, and Puppet 5 will consolidate
> all the CA-side cert lifecycle onto this codebase. 

I'm largely running Puppet 3.8, open source. My certs expire sooner than
5 will be released.

> You're right that the agent SSL code is very old and badly needs an overhaul.
>
> * There's still no support for putting multiple certificate files as the
> puppet CA -- all must still be signed by a common root entity.  Is this
> correct?  (In the "web" analogy, my browser could have lots of built-in
> and additional trust-points, both corporate and as-shipped).
>
> Have you verified experientially that this doesn't work in current Puppet
> versions?

I have verified that multiple root certificates in the file will at the
very least not crash the agent. Which means, I guess, if you're rolling
from one master to another, you can seed out a ca file with two roots in
it, via puppet itself (but *not* via the auto-download).

> You're right that the agent does not support a CApath, in openssl parlance: a directory
> of hashed CA certs, any of which are valid. The server side farms out its SSL verification
> to the underlying web stack, so it ought to be tolerant of agents issued from
> multiple CAs checking in. I haven't tried this angle yet.

Not every OS uses CA pathing. Some of the linuxes do. FreeBSD uses a
monolithic cert. As far as I understood it, it's a function of the
underlying SSL library. It would be nice -- at least that way you could
deploy certs atomically.

> * There's no directive I can find whereby puppet agents can, within N days
> of expiry, re-request their certificate, while maintaining a valid one in
> the meantime.  On the puppet master, a duplicate cert is treated as an
> absolute error and must be purged from both sides with extreme prejudice
> and started over.
>
>
> The first part is true, the second is controlled by the 'allow-duplicate-certs' CA setting
> which will allow later requests to overwrite newer ones. 

I think you mean older? Also, do you really mean overwrite, or do you
mean the two certs can coexist?

Presumably, the master needs to keep a list of everything it's signed, so
that it can later revoke a given certificate. That's all listed in the
inventory.txt file, as well as a copy cached in the "signed" directory.

> As I said above, on the master the cert verification is delegated to the
> web server layer (jetty in the case of the puppetserver, apache or nginx
> or (gah) webrick for non-puppetserver setups).
>
> There's nothing built-in that does either of these things. But policy-based
> autosigning provides an API that lets you do this based on some
> 'a priori' knowledge you have of the node: 
>
> https://docs.puppet.com/puppet/4.4/reference/ssl_autosign.html

Frustrating. When you're using an external signing executable -- unless
you've had the forethought to tell all your agents to seed extra,
non-default information into their CSR, the only info you get is the CSR
and the common-name.
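
To make that limitation concrete, here is a minimal sketch of such an
executable. It assumes only the contract described in the autosign docs
(certname as the single argument, the PEM CSR on stdin, exit status 0 to
sign). The domain suffix and the names ALLOWED_SUFFIX, should_sign and main
are made up for illustration.

```python
import sys

# Assumption for this sketch: only hosts under this domain are autosigned.
ALLOWED_SUFFIX = ".foo.org"


def should_sign(certname, csr_pem):
    """Decide using the only two inputs the master hands over."""
    # The CSR must at least look like a PEM certificate request.
    if "BEGIN CERTIFICATE REQUEST" not in csr_pem:
        return False
    # Without extra attributes seeded into the CSR, the name is all we have.
    return certname.endswith(ALLOWED_SUFFIX)


def main(argv, stdin):
    """Return the exit status: 0 means sign, anything else means don't."""
    if len(argv) != 2:
        return 2
    return 0 if should_sign(argv[1], stdin.read()) else 1

# As a real executable this would finish with:
#   sys.exit(main(sys.argv, sys.stdin))
```

Note how little there is to go on: a name check like this is trivially
spoofable, which is exactly the frustration.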

> This is an interesting line of thought that I'm looking into more on the CA side:
> you can re-use the same private key to generate a new certificate that would
> have an extended lifetime but not require a complete re-key.

> There isn't an API guarantee that agents' certs (and therefore their public keys)
> are collected and stored in the CA, though such a thing would be very useful
> and is on deck for future work - you can see the whole list of these things at
> https://tickets.puppetlabs.com/browse/SERVER-974 



> * We blindly trust the first CA we get (using the default options), but
> then have NO real method for accepting a second CA without manually
> manipulating the CA files directly.  (DANE, anyone?)

> I don't know what DANE means in this context, but this statement is true.

DANE (and the DNS TLSA Record) is a way of putting your SSL certificate
(either your public key, or your entire cert) into the DNS. It hinges on
using DNSSEC as a trust-anchor point, rather than a pre-seeded cert. The
idea is that with a validating resolver, as you'd have in an enterprise,
you'll either get the correct record, or servfail. Postfix can make use
of this, and there's a standards-track RFC for it as well. (RFC6698, and
others). Assuming you have good DNSSEC, it's a useful method of
bootstrapping your trust chain -- similar to putting SSHFP records in DNS.
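
As a rough illustration of what would go into the DNS, the sketch below
builds TLSA record data of the "3 0 1" form from RFC 6698 (match this exact
cert, full certificate, SHA-256). Puppet does not look these records up; the
function names, the host and the port 8140 owner name are assumptions for
the example.

```python
import hashlib
import ssl


def tlsa_rdata(cert_pem):
    """Build '3 0 1 <sha256>' TLSA record data for one PEM certificate.

    3 = DANE-EE (trust exactly this cert), 0 = full certificate,
    1 = SHA-256 digest of the DER encoding.
    """
    der = ssl.PEM_cert_to_DER_cert(cert_pem)
    return "3 0 1 " + hashlib.sha256(der).hexdigest()


def tlsa_record(host, port, cert_pem):
    """Format a zone-file style line; the owner name is _port._tcp.host."""
    return "_%d._tcp.%s. IN TLSA %s" % (port, host, tlsa_rdata(cert_pem))
```

Something like tlsa_record("pm.foo.org", 8140, open("ca_crt.pem").read())
would give a line to publish in a DNSSEC-signed zone.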

> I have a WIP doc adding support for intermediate CAs up at:
>
> https://gist.github.com/ahpook/06d4cfda1d68c08bc82fbfdc40123b28
>
> This may help as many of the goals (and the corresponding implementation bugs or missing features) overlap with your use-case.

Somewhat, but I went a slightly different way.

> I'm definitely interested in keeping in touch as you go through this, to
> make things easier for other people. I think a lot of sites are coming
> up on the five year anniversary of their installation and the easier
> this kind of re-keying can be, the better.

So, here's something insane I discovered today.

Puppet defaults all certs. Including the CA. And the cert on the master.
To five years.

Go open firefox, and look at the expiry dates of the CA certs they ship.
Some of them are valid till the 2030's. A root cert expiring is a major,
major pain.

No commercial issuer will issue a cert with an expiry date past the expiry
of its parent.

The Puppet CA will actually issue certs that have validity dates outside
the validity of the root cert. If you have had your root cert for four
years, a cert you request now will be valid for four years after your root
cert expires (and thus, won't validate anyway).
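
The arithmetic, as a small sketch with made-up dates (the install date and
the helper names are hypothetical; only the five-year default comes from
the above):

```python
from datetime import date


def add_years(d, years):
    """Shift a date by whole years (sketch: ignores the Feb 29 edge case)."""
    return d.replace(year=d.year + years)


def usable_until(leaf_not_after, ca_not_after):
    """A leaf cert stops validating once either it or its CA has expired."""
    return min(leaf_not_after, ca_not_after)


ca_issued = date(2011, 6, 15)             # hypothetical install date
ca_expires = add_years(ca_issued, 5)      # five-year default
leaf_issued = add_years(ca_issued, 4)     # cert requested four years in
leaf_expires = add_years(leaf_issued, 5)  # nominal expiry written in the cert

# How far the leaf's nominal validity runs past the CA's.
overshoot_days = (leaf_expires - ca_expires).days
```

Here the cert claims to be good until 2020, but usable_until() says it is
really only good until the CA's 2016 expiry.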

That said, we can take advantage of that, as long as we pay attention to
something fundamental in SSL. Keys don't expire. Signatures do. And
certificates don't sign other certificates -- private keys sign certs.

So this was my plan of attack, which mostly worked.

0) Back everything up.

1) Get onto the master, and generate a new CA certificate, using the exact
same private key:

openssl req -new -key ca_key.pem -out /tmp/ca_cert.csr \
    -subj '/CN=Puppet CA: pm.foo.org'

openssl x509 -req -in /tmp/ca_cert.csr \
    -signkey /var/puppet/ssl/ca/ca_key.pem -days 3500 \
    -out /tmp/ca_crt.pem \
    -extfile /etc/ssl/openssl.cnf -extensions v3_ca

...and then copied it into place in /var/puppet.

We're re-using existing keying material. Assuming our puppetmaster hasn't
been compromised, there's not a lot more risk here than there would have
been if we had originally signed our cert for ten years. The main loss
here is that it's still 1024 bit and sha1 -- which I consider acceptable,
at least to buy some time. (We can't change the keysize and still keep
the same key).

2) Since the puppetmaster's SSL cert is only a few minutes younger than
the CA cert, we need a new one of those too.

I tried manually regenerating those, but puppet refused to start.
However, letting puppet magically build its own DID work.

service puppetmaster stop
puppet cert clean pm.isc.org
service puppetmaster start

3) Experimentally replaced the new ca_crt.pem from the puppetmaster onto a
few agents. No expiry warnings were issued -- the new dates were honored.

4) Tried doing an agent run on machines where I hadn't replaced the CA
file. I got the expiry warning, but an agent run still worked.

5) Defined a fairly simple manifest that replaced
/var/puppet/ssl/ca/ca.pem with my new version, and ran it in a few nodes.
It worked -- nodes that had been warning previously, no longer did.

So, on my TODO list, is the following:

* Look over the inventory.txt file, parsing it with some code, such that
we look at the *last* match for a given host to find when a machine
expires. (By definition, no machine will expire sooner than the CA was
due to, so that was my major hurdle).

* Use that to trigger an SSH push to delete and regenerate my certs, for
those machines. This can go in cron, perhaps only running during business
hours. Rather than using any kind of policy-based-autosigning, such a
tool will simply chain the commands (clean the agent, clean the master,
run the agent, sign the cert, rerun the agent to make sure all's well).
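
A first cut at the inventory.txt part might look like the sketch below. It
assumes each line has the form "SERIAL NOT_BEFORE NOT_AFTER /CN=name" with
timestamps like 2016-06-30T00:00:00GMT, so check that against your own file
first. The sample data and function names are invented.

```python
from datetime import datetime, timedelta, timezone

# Invented sample in the assumed inventory.txt layout.
SAMPLE = """# Inventory of signed certificates
# SERIAL NOT_BEFORE NOT_AFTER SUBJECT
0x0001 2011-06-01T00:00:00GMT 2016-05-31T00:00:00GMT /CN=Puppet CA: pm.foo.org
0x0002 2011-06-02T00:00:00GMT 2016-06-01T00:00:00GMT /CN=web1.foo.org
0x0003 2015-01-10T00:00:00GMT 2020-01-09T00:00:00GMT /CN=web1.foo.org
0x0004 2011-07-01T00:00:00GMT 2016-06-30T00:00:00GMT /CN=db1.foo.org
"""


def parse_inventory(lines):
    """Return {certname: not_after}, keeping the *last* entry per host."""
    latest = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 3)  # the subject itself may contain spaces
        if len(parts) != 4:
            continue
        _serial, _not_before, not_after, subject = parts
        expires = datetime.strptime(not_after, "%Y-%m-%dT%H:%M:%SGMT")
        name = subject.split("CN=", 1)[-1]
        latest[name] = expires.replace(tzinfo=timezone.utc)  # later lines win
    return latest


def expiring_within(latest, days, now):
    """Names whose most recent cert expires on or before now + days."""
    cutoff = now + timedelta(days=days)
    return sorted(name for name, exp in latest.items() if exp <= cutoff)
```

The list that comes back from expiring_within() is what the SSH push would
iterate over; web1 drops out of it because its later entry supersedes the
2011 one.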

Note that this way, as I've deployed puppet more widely in my
organization, I don't lose any of the machines that I've enrolled in the
last four years, and I get a nice rolling upgrade, instead of
everything-at-once.

Here's what puppet needs to fix -- rather than open a bunch of feature
requests, I'll list some things here, and see if you think they're viable.

* The five year metric for a root ca cert needs a re-thinking. I'm sure
all the hip young kids don't expect to have a vm in "the cloud" last
longer than a few months, but this stuff, for us, is critical
infrastructure.

* Puppet needs to support varying trust models. Intermediate certs.
Commercially signed certs. Cases where you have two distinct CA's and we
can trust a thing signed by either.

* Puppet needs a metric to trust our initial CA file, and to roll that
trust forward to additional CA files.

* Puppet needs an API-based method to re-request a cert, perhaps keeping
one active while another is pending. If I can go to the DMV and request a
new drivers license, before my old one expires, and they're all still
running windows XP, we can figure this out.

* More data needs to be exposed to the policy-based autosigner. Looking
at the example immediately above, your pending-expiry cert could serve as
a useful passport to get a new cert. Exposing the connecting IP is
useful, as well as things that are machine-unique like mac-addresses, and
the sha256 sum of ssh private keys, which would need to match prior
reports. One could even build a "points" based system that autosigns
things conditionally depending on how much proof is presented.

* Also useful would be being able to use a DNS TXT record as a
bootstrapping method for additional information to supply in the CSR.
Because right now, we're installing a totally stock puppet package, and
having to preconfigure a bunch of things sort of defeats the purpose of
having puppet.

* Puppet really needs to not sign things with dates that exceed ca-expiry
(at least, not by default). Trying to sign something when your root cert
is due to expire within, say, three months, should yield a warning.

* The CA warnings need to happen on the master, as well as on the agents.
The master is where we're typing the signing command. There's LOTS of
stuff going by on the agent and it's easy to miss.

* Rather than (or in addition to) a CRL, perhaps an OCSP responder?

* DANE. Really. It's good stuff.
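
To show what I mean by a "points" based system in the autosigner item above,
here is a purely hypothetical sketch. None of these inputs are exposed by
puppet today, and the weights, threshold and names are invented.

```python
# Hypothetical evidence weights; puppet exposes none of this today.
WEIGHTS = {
    "valid_expiring_cert": 5,  # request accompanied by the still-valid old cert
    "known_source_ip": 2,      # connecting IP matches prior reports
    "known_mac_address": 2,    # MAC address matches prior reports
    "known_ssh_key_hash": 3,   # sha256 sum of the ssh key matches prior reports
}
THRESHOLD = 6


def score(evidence):
    """Sum the weights of every recognised piece of evidence that checked out."""
    return sum(WEIGHTS[name] for name, ok in evidence.items()
               if ok and name in WEIGHTS)


def autosign(evidence, threshold=THRESHOLD):
    """Sign only when enough independent proof has been presented."""
    return score(evidence) >= threshold
```

With these numbers an expiring-but-valid cert plus a matching IP is enough,
while IP and MAC alone are not.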

Thanks for all your help. If you spot any issues with the above, please
let me know.

-Dan

--

John Gelnaw

Jun 20, 2016, 1:43:03 PM
to Puppet Users

Many thanks for the re-signing of the CA idea.

I can report that it worked for me, although I had to run the webrick version of puppetmaster to regenerate the puppet master's certificate.

Since I have a full mcollective deployment as well, I was able to use the following steps to automate the cert regen on my clients:

puppet cert clean <host>
mco puppet resource exec "/bin/rm -rf /var/lib/puppet/ssl/*" -W fqdn=<host>
mco puppet runonce -W fqdn=<host>
puppet cert sign <host>

I think I'll run a nightly cron job off my puppet server to search for certificate files that are within 14 days of expiring, and auto-regen them using this method.
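
A sketch of the expiry check for that cron job, assuming the openssl CLI is
on the path and prints its usual "notAfter=..." line for -enddate. The
function names and the 14-day default are just how I'd lay it out.

```python
import subprocess
from datetime import datetime, timedelta, timezone


def parse_enddate(output):
    """Parse a line like 'notAfter=Jun 15 10:00:00 2016 GMT'."""
    value = output.strip().split("=", 1)[1]
    if value.endswith(" GMT"):
        value = value[:-4]
    parsed = datetime.strptime(value, "%b %d %H:%M:%S %Y")
    return parsed.replace(tzinfo=timezone.utc)


def cert_enddate(path):
    """Ask the openssl CLI for a certificate file's expiry date."""
    result = subprocess.run(
        ["openssl", "x509", "-noout", "-enddate", "-in", path],
        capture_output=True, text=True, check=True,
    )
    return parse_enddate(result.stdout)


def needs_regen(not_after, now, days=14):
    """True when the cert expires within the given number of days."""
    return not_after - now <= timedelta(days=days)
```

The job would loop over the signed certs on the master (on my setup that
would be somewhere under /var/lib/puppet/ssl, but treat the exact path as an
assumption) and run the four commands above for each host where
needs_regen() is true.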

Dan Mahoney

Jun 21, 2016, 2:54:49 PM
to Puppet Users
On Mon, 20 Jun 2016, John Gelnaw wrote:

>
> Many thanks for the re-signing of the CA idea.
>
> I can report that it worked for me, although I had to run the webrick version of puppetmaster to regenerate the puppet master's certificate.

Okay -- so, I discovered a few things that I should share, and that
others should perhaps heed as well.

Take your new certificate, and plug it in here:

https://redkestrel.co.uk/products/decoder/ (Super helpful tool!)

And then try your old cert. You'll notice some differences.

There's a few things that you should do if you're following my
previous instructions.

Additional Certificate Fields
=============================

When you sign a certificate, there can be extra fields in the certificate,
beyond the basic "here's a key, signed by another key".

There's at least a couple fields that we didn't add -- some may matter in
the future, some may not.

For example, there's a comment field (puppet probably will never care
about this), as well as some special attributes that say CA: True.
(Puppet may in the future care about this -- a proper root ca cert will
have this field set).

There's also a few hashes, a "subject key identifier" and an "authority
key identifier".

Finally, there's some certificate purpose fields present that list what a
cert may be used for. (Puppet for the moment doesn't seem to look for or
check these, but if they decide to be more strict in the future, you want
to at least match the puppetmaster's old behavior).

To get these in, I made use of my openssl.cnf (on FreeBSD this is in
/etc/ssl), and added the following fields:

[ v3_ca ]
subjectKeyIdentifier=hash
authorityKeyIdentifier=keyid:always,issuer
nsComment = "Ruby Generated Certificate"
basicConstraints = critical,CA:true
keyUsage = cRLSign, keyCertSign

And ran OpenSSL with the following args:

openssl x509 -req -in /tmp/ca_cert.csr \
    -signkey /var/puppet/ssl/ca/ca_key.pem -days 3500 \
    -out /tmp/ca_crt.pem \
    -extfile /etc/ssl/openssl.cnf -extensions v3_ca -sha256

Better Hashtypes for your certs
===============================

The redkestrel tool will complain about an sha1 hash on your cert (as will
ssllabs, and other tools -- with a commercial certificate, you'd often be
eligible for a free re-issue).

I added -sha256 because openssl defaults to an sha1 signature on my
platform, and this is deprecated. I did make sure my oldest clients could
still validate that cert (your linuxes and other OSes should be tested as
well).

It's possible (but unlikely) that a future update to OpenSSL or
puppet could cause it to no longer like sha1 signatures -- similar to
the way chrome and other browsers are choosing to no longer honor them.

I don't know if current versions of puppet use a better algo.

Note that there's openssl docbugs listed for the fact that -sha256 isn't
listed in the usage messages, but please do feel free to google -- I
wouldn't expect you to randomly trust running undocumented openssl
commands from a stranger on the net. :)

Inventory.txt
=============

Finally, take note of the fact that your new certificate doesn't show up
in inventory.txt -- adding it manually might not be a bad idea, just in
case, but openssl itself doesn't know how to update that file. (I'm not
sure why the puppet authors didn't use the standard openssl CA format for
their key list). Since 'puppet cert clean' uses that file to get the
serial number to revoke, you probably want your new cert there for
completeness.

Moving Forward
==============

At least for me, this is still an older key that I'm using (five years
ago, the default was 1024-bit) -- so there's a plan to replace it,
gradually, with a new one with modern expectations (4096-bit, most
likely). What we've done here is simply made sure that our old key
doesn't expire out from under us while we're rolling this stuff out.

There's still a bunch of questions and problems I've got with this
process, but I do hope my previous statements and the above are helpful.

John Gelnaw

Jun 21, 2016, 7:56:52 PM
to Puppet Users

You can also use:

# openssl x509 -in ca_cert.pem -text -noout

to see all the fields of the SSL cert.