Validation Resources


Spencer Krum

May 14, 2015, 3:11:19 PM
to puppe...@googlegroups.com
Hi Folks,

There is currently a PR against stdlib that I am writing to you today
about: https://github.com/puppetlabs/puppetlabs-stdlib/pull/444
Thanks to Spredzy for making this PR.

This is tracked in jira:
https://tickets.puppetlabs.com/browse/MODULES-1982

This pattern has poked up a few different places. As the PR says, it has
shown up in the mongodb module and the puppetdb module. I know that
Michael Chapman added something like this to his OpenStack things and
Dan Bode as well.

At the modules triage today we had the following reactions (please reply
if there is something you said I didn't get):

* This is a new pattern
* Having it in stdlib means we can't iterate on it quickly
* This is a library thing, and should be a library
* Once standardized, puppetdb and other modules could be retrofitted to
use it
* This will probably change frequently as people use it and explore what
it should/can do

We had the idea that, rather than landing this in puppet-stdlib, we
could create a module in puppet-community to hold this and other
validation/health check resources.

We had some ideas on the name:

puppet-healthcheck
puppet-validation
puppet-external_validate.

It's worth noting that these are primitives for building multi-node
orchestration with Puppet.

What do you think? Do you use these patterns? Would you? What would you
want from your library?

Thanks,
Spencer


--
Spencer Krum
ni...@spencerkrum.com

Clayton O'Neill

May 14, 2015, 3:23:47 PM
to puppe...@googlegroups.com, ni...@spencerkrum.com
We have something almost exactly the same as this that we use internally. We
also have an http_conn_validator that we use for services more like PuppetDB
that need to make an HTTP request against a specific URI, or look for a specific
HTTP result code. This has been huge for us in automating integration
testing for multi-node environments.
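
As a rough illustration of the kind of check described above (not the actual internal validator), here is a minimal Ruby sketch that requests a URI and compares the result code. The helper name `http_ready?` and its parameters are hypothetical; the `fetch` argument is injectable purely so the check can be exercised without a network.

```ruby
require 'net/http'
require 'timeout'
require 'uri'

# Hypothetical helper: true when a GET against `url` returns `expected_code`.
# `fetch` defaults to a real request through Net::HTTP, but any callable
# that takes a URI and returns an object responding to `code` will do.
def http_ready?(url, expected_code: 200, fetch: ->(uri) { Net::HTTP.get_response(uri) })
  response = fetch.call(URI(url))
  response.code.to_i == expected_code
rescue SystemCallError, SocketError, Timeout::Error
  # Connection refused, DNS failure, or a timeout all mean "not ready yet".
  false
end
```

A caller would typically poll this until it returns true or a deadline passes.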

Trevor Vaughan

May 14, 2015, 3:24:48 PM
to puppe...@googlegroups.com
I'd like to counter this limited use case with my rant about semaphores from five years ago: http://comments.gmane.org/gmane.comp.sysutils.puppet.devel/13039.

Followed by the conversation from two years ago. https://projects.puppetlabs.com/issues/16187

What you want is cross-node synchronization and synchronization storage state.

You can sort of do this with exported resources, but it's VERY clumsy.

I know that it's a long shot, but I figure that I'll resurrect it as appropriate every couple of years ;-).

Other than that, why not call it 'haproxy'?

Trevor




--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/1431630674.2625129.268922745.5AA0382C%40webmail.messagingengine.com.
For more options, visit https://groups.google.com/d/optout.



--
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
tvau...@onyxpoint.com

-- This account not approved for unencrypted proprietary information --

Trevor Vaughan

May 14, 2015, 3:29:08 PM
to puppe...@googlegroups.com
Ugh, sorry all, didn't mean to make that so rant-ish.

Anyway, it would seem that you would not want to hold up a catalog compilation or application for this. Instead, you would want to register the check with a service that could drop a queryable entity that Puppet could use for making decisions about the compilation and/or application of the catalog.

PuppetDB may be the ideal place to host this but it could also be a stand-alone, authenticated, service.

Obviously, nodes should only obtain their own data unless explicitly shared between a node group.

In terms of naming, I would probably call it network_service_status or some such.

Thanks,

Trevor

Colleen Murphy

May 14, 2015, 3:32:04 PM
to puppe...@googlegroups.com
As a data point for another use case, currently the OpenStack puppet modules have a while-loop with a timeout buried within the provider itself:


This is useful for cases when puppet restarts a service that doesn't come back up right away, or for when the operation is being done over a laggy network connection. Would love a less hacky way to do this.
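
This is not the OpenStack provider code, only a sketch of the general pattern described: retry an operation until it succeeds or a timeout expires. The helper name `wait_until` and its defaults are assumptions made for illustration.

```ruby
# Hypothetical helper: call the block until it returns a truthy value or
# `timeout` seconds have elapsed, sleeping `interval` seconds between tries.
# Returns true on success and false on timeout.
def wait_until(timeout: 30, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end
```

Pulling the loop out of each provider into one shared helper like this is one way to make the behavior less hacky: the timeout and interval become explicit parameters rather than constants buried in provider code.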

Colleen

Spencer Krum

May 14, 2015, 3:33:07 PM
to puppe...@googlegroups.com
Trevor,
 
I agree that if you take it to its logical conclusion you end up with semaphores stored in consul and a handful of Puppet resources to interact with them. Dan Bode presented on exactly this (and what doesn't work well about it) at the PDX Puppet Users group last month.
 
I think, though, that from a practical standpoint these resources as written have value. Simply waiting for some Java process to start before you do follow-on actions is a common task. Looking to the future, I'd like to see them live in their own module so we can evolve them without semver constraints.
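
In practice, "waiting for some Java process to start" often comes down to waiting for its listening port. A minimal sketch of a single TCP probe, assuming a hypothetical helper named `tcp_open?` (this is not code from the PR):

```ruby
require 'socket'

# Hypothetical helper: true if a TCP connection to host:port can be opened
# within `timeout` seconds, false if it is refused, unreachable, or times out.
def tcp_open?(host, port, timeout: 2)
  # Socket.tcp closes the socket when the block returns.
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end
```

A validator resource would call a probe like this repeatedly, with a retry interval and an overall timeout, before letting dependent resources run.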
 
--
Spencer Krum
 

Trevor Vaughan

May 14, 2015, 3:38:21 PM
to puppe...@googlegroups.com
Hey! Do you have a link to that presentation?

For a Java spin lock, wouldn't it go something like:

* First try -> Wait until timeout
* Timeout -> Drop file
* Second try -> Notice and remove file
* Try again
* etc...

Trevor



Trevor Vaughan

May 14, 2015, 3:45:52 PM
to puppe...@googlegroups.com
Hmm....what about a concept of deferred actions?

I.e. Try this resource, can't do it, shove it (and its dependencies) to the bottom of the stack and do everything else, then come back to it.

You could even technically have a method for simply backgrounding that entire resource chain.

This leads me back to an idea I had: I'd like to be able to apply, by reference, a Puppet resource chain from an excerpt of the catalog.

For instance: As the catalog runs, I would like to have each catalog collection fragmented into a mini catalog that I could use to apply with other utilities.

Since it's a DAG, this should be relatively easy but you could potentially incur a lot of I/O overhead.
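
One way to picture the fragmentation: treat the catalog as a dependency graph and split it into groups of resources that share no edges with each other. The sketch below is a toy model under that assumption, not Puppet's catalog API; `independent_fragments` and the hash-of-arrays input format are hypothetical.

```ruby
require 'set'

# Split a dependency graph into independent fragments (weakly connected
# components). `edges` maps each resource name to the resources it requires.
def independent_fragments(edges)
  # Build an undirected adjacency list so a dependency links both ends.
  neighbors = Hash.new { |hash, key| hash[key] = Set.new }
  edges.each do |resource, requirements|
    neighbors[resource] # touch the key so isolated resources are included
    requirements.each do |req|
      neighbors[resource] << req
      neighbors[req] << resource
    end
  end

  seen = Set.new
  neighbors.keys.each_with_object([]) do |start, fragments|
    next if seen.include?(start)

    # Breadth-first walk collects everything reachable from `start`.
    fragment = []
    queue = [start]
    seen << start
    until queue.empty?
      node = queue.shift
      fragment << node
      neighbors[node].each do |peer|
        next if seen.include?(peer)

        seen << peer
        queue << peer
      end
    end
    fragments << fragment.sort
  end
end
```

Each returned fragment could in principle be serialized as a mini catalog and applied on its own, since nothing in it depends on a resource in another fragment.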

BUT, you could have something like the Puppet deferred action daemon (Puppet-DAD) that would pick those items up after the main Puppet run and execute them, sending a report back after each fragment.

*sigh*....and now more from Puppet 22.

Thanks,

Trevor



Dan Bode

May 14, 2015, 3:48:56 PM
to puppe...@googlegroups.com, ni...@spencerkrum.com


On Thursday, May 14, 2015 at 12:11:19 PM UTC-7, Spencer Krum wrote:
> This pattern has poked up a few different places. As the PR says, it has
> shown up in the mongodb module and the puppetdb module. I know that
> Michael Chapman added something like this to his OpenStack things and
> Dan Bode as well.

FWIW, I've moved away from this pattern for a few reasons:

- Blocking catalog execution led to lots of issues, the worst of which I've seen is that recovery from failure states can take forever. I've moved towards a model of just failing a subgraph quickly if preconditions aren't met.
- I've also moved away from the model of directly querying, and instead perform the same actions through a service registration/discovery system (in my case Consul).

Trevor Vaughan

May 14, 2015, 3:54:26 PM
to puppe...@googlegroups.com, ni...@spencerkrum.com
> - I've also moved away from the model of directly querying to perform the same actions through a service registration/discovery system (in my case consul)

Interesting, were you doing these in the providers/on the clients, or at catalog compile time?

I was thinking it would be a catalog compile service as opposed to a client side operation to minimize distributed complexity.

Thanks,

Trevor


Dan Bode

May 14, 2015, 4:12:24 PM
to puppe...@googlegroups.com, ni...@spencerkrum.com
On Thu, May 14, 2015 at 12:54 PM, Trevor Vaughan <tvau...@onyxpoint.com> wrote:
>> - I've also moved away from the model of directly querying to perform the same actions through a service registration/discovery system (in my case consul)
>
> Interesting, were you doing these in the providers/on the clients, or at catalog compile time?

Sorry if this is a bit of a hijack, and please feel free to tl;dr :)

I've been experimenting with lots of variations. All of them rely on the profiles for externally available services including Puppet code that registers those services in Consul along with service checks.

For the consuming (or dependent) services, I experimented with three different models:

1. (DID NOT WORK) Fail to compile the catalog if external resources are not ready or available. This approach did not work for my applications because it deadlocks if you have any circular cross-host dependencies. It also doesn't allow you to pre-compile a version of a catalog before all its dependencies are ready (which I use to pre-install packages to optimize overall build times).
2. (MIGRATING AWAY FROM) Block until services are discoverable via DNS (or through Consul's service APIs). This model is very similar to the approach in this patch, and it worked, but it was less than ideal for a few reasons:
   - There is no way to mark the blocking resources as "nice". They wind up blocking lots of things that they don't need to if they sit in a place in the graph where the order is ambiguous.
   - They don't handle catastrophic failure well. In cases where everything blows up, Puppet might be able to remediate to a working state, but it winds up being blocked on these resources. This is very frustrating and is the main reason that I wound up moving away from this model.
3. Check state during compile, and pass data to a special resource that fails subgraphs. Puppet doesn't really support passing state info between resources at run time (this seems like the biggest difference between Puppet and Chef). For this reason, we wanted to check state at compile time, where that state can be forwarded. We created a special resource called runtime_fail (https://github.com/JioCloud/puppet-orchestration_utils/blob/master/lib/puppet/type/runtime_fail.rb) that does nothing except take this data and then fail a subgraph.

The typical example is something like:

class someclass(
  $external_service = values(service_discovery_consul('some.dns.address')),
) {

  # fail if service has not reached quorum
  if size($external_service) != 3 {
    $fail = true
  } else {
    $fail = false
  }

  runtime_fail { 'quorum not reached for external service': fail => $fail }

  service {'local_service':
     require => Runtime_fail['quorum not reached for external service']
  }
}
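
To make the "fail a subgraph" behavior concrete, here is a toy Ruby model of it. It is not the runtime_fail type and not Puppet's transaction code; `apply_catalog` and the input format are hypothetical. It relies only on the fact that a resource whose dependency failed is skipped, while unrelated resources are still applied.

```ruby
# Toy model: resources are visited in dependency order, and anything that
# (transitively) requires a failed resource is skipped rather than blocked.
# `resources` maps a name to a hash with optional :requires and :fail keys.
# Assumes the dependency graph has no cycles.
def apply_catalog(resources)
  status = {}
  visit = lambda do |name|
    return status[name] if status.key?(name)

    spec = resources.fetch(name)
    dep_states = spec.fetch(:requires, []).map { |dep| visit.call(dep) }
    status[name] =
      if dep_states.any? { |state| state != :applied }
        :skipped
      elsif spec[:fail]
        :failed
      else
        :applied
      end
  end
  resources.each_key { |name| visit.call(name) }
  status
end
```

With a failing `Runtime_fail[...]` entry, the service that requires it ends up `:skipped` immediately, and everything outside that subgraph still reaches `:applied`, which is the fast-failure property described in model 3.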
 

Erik Dalén

May 14, 2015, 7:00:22 PM
to puppe...@googlegroups.com
On Thu, 14 May 2015 at 21:45 Trevor Vaughan <tvau...@onyxpoint.com> wrote:
> Hmm....what about a concept of deferred actions?
>
> I.e. Try this resource, can't do it, shove it (and its dependencies) to the bottom of the stack and do everything else, then come back to it.

If Puppet thought that the provider was not yet suitable, it would do exactly this, AFAIK. It would be interesting to "exploit" that mechanism to defer the processing of that subgraph instead of just blocking.
 

John Bollinger

May 15, 2015, 4:06:54 PM
to puppe...@googlegroups.com


On Thursday, May 14, 2015 at 2:45:52 PM UTC-5, Trevor Vaughan wrote:
> Hmm....what about a concept of deferred actions?
>
> I.e. Try this resource, can't do it, shove it (and its dependencies) to the bottom of the stack and do everything else, then come back to it.


Yes.  If there's a resource that depends on some piece of machine state that is asynchronous with respect to catalog application, then doing *anything* productive while waiting on that state to change is better than doing nothing.  Inasmuch as there may be plenty of other resources that can be applied without delay, applying all such resources should be the first choice for a time filler.

 

> You could even technically have a method for simply backgrounding that entire resource chain.


That would be easier if resources reliably formed simple chains.  In practice, they too often form overlapping trees.  That doesn't make backgrounding impossible, but it does complicate things.

The logical extension of backgrounding is full-fledged multi-threaded / multi-process catalog application.  In the past I have been rather leery of that idea, but I think I'm warming up to it.


John

Trevor Vaughan

May 15, 2015, 4:27:23 PM
to puppe...@googlegroups.com
I was actually really hoping for multi-threaded catalog application but I realized that 99% of the time I don't want it because I'm already pegging one processor working on the system.

Also, the code complexity that comes with that probably isn't worthwhile.

HOWEVER, if we take the idea above and split everything into independent portions, spin up threads (yeah, I know) that work on that independent portion AND have a shared memory data repository that allows for cross thread synchronization....it could be really awesome.

But, the big thing that I want is the ability to isolate and apply micro portions of the catalog without actually having to load the entire catalog.

Trevor


Luke Kanies

May 15, 2015, 6:23:39 PM
to puppe...@googlegroups.com
On May 14, 2015, at 4:00 PM, Erik Dalén <erik.gus...@gmail.com> wrote:
> On Thu, 14 May 2015 at 21:45 Trevor Vaughan <tvau...@onyxpoint.com> wrote:
>> Hmm....what about a concept of deferred actions?
>>
>> I.e. Try this resource, can't do it, shove it (and its dependencies) to the bottom of the stack and do everything else, then come back to it.
>
> If puppet would think that the provider is not yet suitable it would do exactly this AFAIK. Would be interesting to "exploit" that mechanism to defer the processing of that subgraph instead of just blocking.

It wouldn’t be too hard these days to just support this directly, rather than hacking it with provider suitability.  Last I heard, Nick Lewis or Patrick Carlisle were the best to ping about this.

You basically just need a simple boolean method that indicates whether a resource is ready, then the system to defer resources that aren’t ready.  That patch would surely get accepted. :)
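
A toy sketch of that idea, assuming each resource exposes a boolean readiness check: resources that are not ready go to the back of the queue and are retried after everything else. This is not Puppet's transaction code; `apply_with_deferral`, the `:ready` callable, and `max_passes` are hypothetical, and a real implementation would also have to defer the dependents of an unready resource.

```ruby
# Toy scheduler: each resource is a hash with :name and a :ready callable.
# Unready resources are deferred to the next pass, up to `max_passes` passes.
def apply_with_deferral(resources, max_passes: 3)
  applied = []
  pending = resources.dup
  max_passes.times do
    break if pending.empty?

    deferred = []
    pending.each do |res|
      if res[:ready].call
        applied << res[:name] # stands in for actually applying the resource
      else
        deferred << res
      end
    end
    pending = deferred
  end
  { applied: applied, unready: pending.map { |res| res[:name] } }
end
```

The returned hash separates what was applied from what never became ready, so the run can report the leftovers as failures instead of blocking on them.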

FWIW, I think it’s a great idea, and would really help.

I do think we should support parallelization, but I think that’s also a harder and less universally desired problem.



Luke Kanies

May 15, 2015, 6:52:14 PM
to puppe...@googlegroups.com
(Sorry, coming late to the thread.)

I think this is a great idea.

I built something similar a long time ago:

https://github.com/puppetlabs/puppetlabs-remote_resource/blob/master/ext/example.pp

I agree it should be independent, at least for now, to support faster evolution and because I like small things. :)

However…

This is something we’re working on a ton internally, for related but different reasons. Nothing is in a state that’s worth sharing, unfortunately, because this is just one part of a much larger project and we can’t easily share any of it until we can share all of it (because it’s very confusing in its current state, among other reasons).

Basically, we want to build a special kind of resource for exactly this. All resources of this special type would share some values, like interval and timeout, and then each resource type would provide the parameters for things like IP address, port, etc.

We want to go quite a bit past just checking IP and port, though.

I want a base provider that can just check network status, but why not a provider that can check whether a database is all set up, and a user can connect with a specific password?

Or why not something that validates that a filesystem is available? No network information, just validate that a file is present, or something similar. These often take a while to actually work.
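
The shape described here (shared values such as interval and timeout, with each kind of check supplying its own parameters) could look roughly like the Ruby sketch below. The class names `HealthCheck` and `FileCheck` are hypothetical and say nothing about the design Puppet Labs is actually working on.

```ruby
# Hypothetical base class: every check shares `interval` and `timeout`;
# subclasses only supply their own parameters and the `healthy?` test.
class HealthCheck
  def initialize(interval: 1, timeout: 10)
    @interval = interval
    @timeout = timeout
  end

  def healthy?
    raise NotImplementedError, "#{self.class} must implement healthy?"
  end

  # Poll `healthy?` until it passes or the timeout expires.
  def wait
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + @timeout
    until healthy?
      return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

      sleep @interval
    end
    true
  end
end

# Filesystem flavor: no network information, just "is this path present?"
class FileCheck < HealthCheck
  def initialize(path:, **shared)
    super(**shared)
    @path = path
  end

  def healthy?
    File.exist?(@path)
  end
end
```

A network or database check would be another subclass with its own parameters (host, port, credentials) and the same inherited `wait` behavior.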

Or even more, why not confirm that a configuration file is valid? Wouldn’t you love something that would roll back the sudoers file if you managed to deploy a broken one?
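
A minimal sketch of the rollback idea, assuming the validity test is supplied by the caller: write the new file, validate it, and restore the previous content if validation fails. `deploy_with_rollback` is a hypothetical helper, not an existing Puppet feature.

```ruby
require 'fileutils'

# Hypothetical helper: write `new_content` to `path`, pass the path to the
# validator block, and restore the previous content if the block returns
# false. Returns true when the new content was kept.
def deploy_with_rollback(path, new_content)
  backup = File.exist?(path) ? File.read(path) : nil
  File.write(path, new_content)
  return true if yield(path)

  # Validation failed: put the old file back, or remove a file we created.
  backup.nil? ? FileUtils.rm_f(path) : File.write(path, backup)
  false
end
```

For sudoers, the validator block could shell out to `visudo -c -f` on the path. For a file that sensitive, validating a temporary copy before replacing the live file would avoid even a brief window with a broken configuration.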

In other words, this should be a generic framework, with generic support from Puppet, and we should provide as much power to the framework as possible.

We’re really only focused on the simplest form of health checks at this point, but because we’re explicitly expecting to make some changes to core Puppet to make some of it work, we likely won’t be able to make the whole thing external.

We’re basically at the “plan to build” phase right now, without full designs in place.

David Lutterkort is one of the eng leads, and Ryan Coleman is the product manager.

And I’m off to China for a week in 12 hours, so I’ll be off the grid for the next week, and thus can’t elaborate. :/

And, in regards to Trevor’s rant, and pointing to the old ticket, yeah, it’s exactly that stuff (plus a lot more) that’s led us to this. We’re just finally getting to invest in it. :)



Spencer Krum

May 20, 2015, 7:14:53 AM
to puppe...@googlegroups.com
Thanks everyone for your feedback. It sounds like this is a popular
idea, and one that leads into further discussion.

I've created https://github.com/puppet-community/puppet-healthcheck and
asked spredzy to submit the PR against that repo.

Cheers,
Spencer

--
Spencer Krum
ni...@spencerkrum.com
