Options for notifying external services in case of changing exported resources.


Jelle Smet

Mar 14, 2016, 12:23:53 PM
to Puppet Users
Hi list,


I have a Puppet module (internal to the company) which makes use of exported resources.
The exported resource data stored in PuppetDB is used to configure an external application.

I'm looking for a way to notify this external service in order to trigger it to load the stored exported resources from PuppetDB.
If the "trigger" were a syslog event, that would already be sufficient to work with.

I have found 3 ways to achieve this, but for different reasons I'm not allowed to use them:

  1. Use a Postgres trigger on the PuppetDB database which generates an event when data changes.

  2. https://github.com/ryanuber/puppet-tell.

  3. An exec of the "/usr/bin/logger" command with "refreshonly" which subscribes to a file resource containing the exported resources.
    The data written to the file would have to be the same as the exported resources.
    This would mean that if the exported resources change, the content of the file changes, which only then produces a syslog event.
Unfortunately, the notify resource doesn't support "refreshonly".
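For concreteness, construction 3 would have looked roughly like this (a sketch; the file path and the $exported_data source are made up):

```puppet
# Write the exported resource data to a file; a refreshonly exec then
# emits a syslog event only when the file's content actually changes.
# ($exported_data and the path are placeholders, not real names.)
file { '/var/lib/puppet/exported_resources.json':
  content => $exported_data,
}

exec { 'announce-exported-change':
  command     => '/usr/bin/logger -t puppet "exported resources changed"',
  refreshonly => true,
  subscribe   => File['/var/lib/puppet/exported_resources.json'],
}
```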


Are there any other constructions which would allow me to notify an external application in case exported resources for a host have changed?


Thanks,

Jelle

jcbollinger

Mar 15, 2016, 10:50:28 AM
to Puppet Users


On Monday, March 14, 2016 at 11:23:53 AM UTC-5, Jelle Smet wrote:
Hi list,


I have a Puppet module (internal to the company) which makes use of exported resources.
The exported resource data stored in PuppetDB is used to configure an external application.


That sounds like an odd way to go about it.  One would normally export resources that directly configure the external application, one way or another.  Details vary greatly with the nature of the application that needs to be configured, of course.

 

I'm looking for a way to notify this external service in order to trigger it to load the stored exported resources from PuppetDB.
If the "trigger" were a syslog event, that would already be sufficient to work with.

I have found 3 ways to achieve this, but for different reasons I'm not allowed to use them:

  1. Use a Postgres trigger on the PuppetDB database which generates an event when data changes.

  2. https://github.com/ryanuber/puppet-tell.

  3. An exec of the "/usr/bin/logger" command with "refreshonly" which subscribes to a file resource containing the exported resources.
    The data written to the file would have to be the same as the exported resources.
    This would mean that if the exported resources change, the content of the file changes, which only then produces a syslog event.
Unfortunately, the notify resource doesn't support "refreshonly".


Are there any other constructions which would allow me to notify an external application in case exported resources for a host have changed?



So that we do not waste our time suggesting other non-viable alternatives, how about giving us the constraints on what solutions you will be allowed to use?

So that we can get a handle on what even to consider, how about describing what has to happen to configure the target application?

Without more information, anything we might suggest would be a shot in the dark.


John

Jelle Smet

Mar 16, 2016, 5:36:09 AM
to Puppet Users

On Tuesday, March 15, 2016 at 3:50:28 PM UTC+1, jcbollinger wrote:


On Monday, March 14, 2016 at 11:23:53 AM UTC-5, Jelle Smet wrote:
Hi list,


I have a Puppet module (internal to the company) which makes use of exported resources.
The exported resource data stored in PuppetDB is used to configure an external application.


That sounds like an odd way to go about it.  One would normally export resources that directly configure the external application, one way or another.  Details vary greatly with the nature of the application that needs to be configured, of course.


So we have an internal RESTful service which has to be configured by submitting JSON data to the appropriate API endpoints.
The JSON data can be constructed from data stored in PuppetDB.
We have a process which reads the content of PuppetDB, constructs the JSON and submits it to the necessary API endpoints of the RESTful web service.
The problem with this approach is that the middleware process (responsible for reading PuppetDB, converting and submitting the data to the API endpoints) continuously needs to poll PuppetDB for added and removed hosts.
This is simply not a very practical thing to do, and it causes a lot of IO while nothing needs to be done.

The idea is to trigger this middleware to read from PuppetDB when the exported resources of a host have changed, because only then does it need to update the configuration via the RESTful interface.
That would be better because you don't need to continuously scrape PuppetDB for changes.

 
I'm looking for a way to notify this external service in order to trigger it to load the stored exported resources from PuppetDB.
If the "trigger" were a syslog event, that would already be sufficient to work with.

I have found 3 ways to achieve this, but for different reasons I'm not allowed to use them:

  1. Use a Postgres trigger on the PuppetDB database which generates an event when data changes.

  2. https://github.com/ryanuber/puppet-tell.

  3. An exec of the "/usr/bin/logger" command with "refreshonly" which subscribes to a file resource containing the exported resources.
    The data written to the file would have to be the same as the exported resources.
    This would mean that if the exported resources change, the content of the file changes, which only then produces a syslog event.
Unfortunately, the notify resource doesn't support "refreshonly".


Are there any other constructions which would allow me to notify an external application in case exported resources for a host have changed?


So that we do not waste our time suggesting other non-viable alternatives, how about giving us the constraints on what solutions you will be allowed to use?

If puppet (agent/master/whatever puppet component) could produce a log event (like syslog) when the exported resources of a host have changed, that would be great.

The constraints I got:
  • No execs, because they're considered bad practice.
  • Adding triggers to Postgres is not allowed because it requires modifications to the DB schema and might cause upgrade procedures to deviate from what is provided by Puppet Labs.
  • Puppet is not allowed to contact / have a dependency on any external networked services (local syslog would be fine though), hence no puppet-tell.
 
So that we can get a handle on what even to consider, how about describing what has to happen to configure the target application?

See earlier, though I don't want to elaborate too much on that part.  I simply want to find a way to generate a syslog event when the exported resources of a host change.

Without more information, anything we might suggest would be a shot in the dark.

I didn't want to overload the question, so I stuck to the essence. I hope things are a bit clearer now.
 

jcbollinger

Mar 16, 2016, 10:12:55 AM
to Puppet Users


On Wednesday, March 16, 2016 at 4:36:09 AM UTC-5, Jelle Smet wrote:

On Tuesday, March 15, 2016 at 3:50:28 PM UTC+1, jcbollinger wrote:


On Monday, March 14, 2016 at 11:23:53 AM UTC-5, Jelle Smet wrote:
Hi list,


I have a Puppet module (internal to the company) which makes use of exported resources.
The exported resource data stored in PuppetDB is used to configure an external application.


That sounds like an odd way to go about it.  One would normally export resources that directly configure the external application, one way or another.  Details vary greatly with the nature of the application that needs to be configured, of course.


So we have an internal RESTful service which has to be configured by submitting JSON data to the appropriate API endpoints.
The JSON data can be constructed from data stored in PuppetDB.


As I said, that's a very odd way of doing things.  It's odd on at least two levels:

1. it's a bit odd that you want to configure the application by dynamically submitting data to the service, instead of by directly managing its configuration data store.  This approach has negative consequences, including, but not limited to, the type of problem you have asked about.

2. it's very odd that you construct the JSON data by using an external application to cull it from PuppetDB, especially when you're already using exported resources.  There may be a good reason for this approach -- it's hard to tell since you're so parsimonious with the details -- but I wouldn't generally recommend it.

Overall, it does not mesh well with Puppet's approach to configuration management: state, not actions, is the central focus of Puppet's view of configuration management.  Its basic mode of operation involves cycles of (a) determining what the desired machine state is, (b) comparing the actual state to the desired one, and (c) updating the machine state where necessary.  The process you describe seems grossly inconsistent with that model.

 
We have a process which reads the content of PuppetDB, constructs the JSON and submits it to the necessary API endpoints of the RESTful web service.
The problem with this approach is that the middleware process (responsible for reading PuppetDB, converting and submitting the data to the API endpoints) continuously needs to poll PuppetDB for added and removed hosts.
This is simply not a very practical thing to do, and it causes a lot of IO while nothing needs to be done.


I agree, continuously polling the DB is a terrible idea.

 

The idea is to trigger this middleware to read from PuppetDB when the exported resources of a host have changed, because only then does it need to update the configuration via the RESTful interface.
That would be better because you don't need to continuously scrape PuppetDB for changes.



A better idea would be to cut out the middleware altogether.  Put the host on which your service runs under Puppet management, if it isn't already.  Manage the service's configuration files via Puppet, and signal it to re-read them when Puppet changes them.  This is a pretty standard pattern.

A variation on this pattern that might interest you involves building the configuration file from multiple fragments, via the Concat module.  The fragments can be exported by multiple hosts and collected into the catalog for the machine(s) on which the overall file needs to be managed.  Even if there is no alternative to managing your service via its REST API -- maybe it has no static configuration file, for example -- you should still be able to use this or a similar approach to avoid scraping the needed configuration data from PuppetDB.
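A minimal sketch of that Concat variation (using the puppetlabs-concat module; the paths, tags, and content here are made up):

```puppet
# On each monitored host: export a fragment describing this host.
@@concat::fragment { "monitoring-${::fqdn}":
  target  => '/etc/middleware/hosts.conf',
  content => "host ${::fqdn}\n",
  tag     => 'monitoring',
}

# On the collecting node: declare the target file, pull in every
# exported fragment from all hosts, and refresh the consuming service
# whenever the assembled file changes.
concat { '/etc/middleware/hosts.conf':
  notify => Service['middleware'],
}

Concat::Fragment <<| tag == 'monitoring' |>>

service { 'middleware':
  ensure => running,
}
```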

 
If puppet (agent/master/whatever puppet component) could produce a log event (like syslog) when the exported resources of a host have changed, that would be great.


If you *must* do it this way, then you can configure agents to send reports to the master at the end of each run, and you can configure a report processor on the master that does whatever it needs to do with the resulting data.
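The decision logic such a report processor could apply is sketched below; in a real processor this code would live inside a Puppet::Reports.register_report block and receive a Puppet::Transaction::Report, so the OpenStruct stand-in and the processor name are assumptions:

```ruby
# Sketch of report-processor decision logic. A real processor would be
# registered via Puppet::Reports.register_report(:notify_middleware) and
# its #process method would operate on the actual report object.
require 'ostruct'

def changed_run?(report)
  # The agent marks a run's status 'changed' when resources were modified.
  report.status == 'changed'
end

report = OpenStruct.new(host: 'web01.example.com', status: 'changed')
puts(changed_run?(report) ? "notify middleware about #{report.host}" : 'no-op')
# => notify middleware about web01.example.com
```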

If you can get by with less data, and if your catalog runs normally complete without any changes, then you could perhaps also run the agent via a scheduler (instead of as a daemon), with the --detailed-exitcodes option turned on, and trigger your middleware based on the agent reporting via its exit code that changes were applied.
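A wrapper for that scheduled run might look like this (a sketch; with --detailed-exitcodes the agent exits 2 when changes were applied, 4 on failures, 6 for both, 0 when nothing changed):

```shell
# Hypothetical cron wrapper replacing the agent daemon.
run_and_notify() {
  if "$@"; then rc=0; else rc=$?; fi   # run the agent command passed in
  case "$rc" in
    2|6) echo "changes applied" ;;     # here you would poke the middleware
    0)   echo "no changes" ;;
    *)   echo "run failed (rc=$rc)" ;;
  esac
}

# Real usage would be:
#   run_and_notify puppet agent --onetime --no-daemonize --detailed-exitcodes
# The stand-in below simulates an agent run that applied changes:
run_and_notify sh -c 'exit 2'   # prints "changes applied"
```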

 

The constraints I got:
  • No execs, because they're considered bad practice.

Absurd.  You should use Execs where they make sense.  Execs can be over-used, but the alternative is to use a different resource type, not to avoid modeling your configuration via resources.  If there isn't already a suitable resource type available, then that involves making one.  And don't tell whoever enforces that policy, but you'll see Execs in a lot of popular and well-respected modules, including many of those published by PuppetLabs.  If you rely on many third-party modules then you probably have some Execs in your manifest set from those sources.

If the prohibition against Execs is absolute and incontrovertible, however, then it wouldn't be too hard to write a custom type that just wraps the specific command you need to run, and that executes it (only) when it refreshes.  Deploying an instance of such a type in the right place would be a great deal better than relying on some kind of external middleware to trigger updates based on log traffic.
 
  • Adding triggers to Postgres is not allowed because it requires modifications to the DB schema and might cause upgrade procedures to deviate from what is provided by Puppet Labs.

That's reasonable.  I briefly considered suggesting triggers, but rejected the idea for exactly the reason you give.
 
  • Puppet is not allowed to contact / have a dependency on any external networked services (local syslog would be fine though), hence no puppet-tell.

Ok.


John

Jelle Smet

Mar 16, 2016, 1:05:19 PM
to Puppet Users
As I said, that's a very odd way of doing things.  It's odd on at least two levels:

1. it's a bit odd that you want to configure the application by dynamically submitting data to the service, instead of by directly managing its configuration data store.  This approach has negative consequences, including, but not limited to, the type of problem you have asked about.

If what you're saying is to bypass the API and manipulate the API's datastore directly, that is not an option and generally doesn't sound like a good idea...


2. it's very odd that you construct the JSON data by using an external application to cull it from PuppetDB, especially when you're already using exported resources.  There may be a good reason for this approach -- 

The RESTful API is used to configure monitoring.  If someone wants monitoring configured, they talk to the API.  People use Puppet to manage their hosts.
The manifests describing the state of a host include the exported resources, which in turn describe the state of the monitoring for that host (or its services/applications).
Since the data required to configure monitoring is stored in PuppetDB, we need a way to interface that with the RESTful API.
 
it's hard to tell since you're so parsimonious with the details -- but I wouldn't generally recommend it.

I understand your inquiry into the context of this, but I didn't want this post to focus on why things are as they are.


Overall, it does not mesh well with Puppet's approach to configuration management: state, not actions, is the central focus of Puppet's view of configuration management.  Its basic mode of operation involves cycles of (a) determining what the desired machine state is, (b) comparing the actual state to the desired one, and (c) updating the machine state where necessary.

Agreed.

The process you describe seems grossly inconsistent with that model.

Well, the state of monitoring of a host is managed by an external system and should be brought in line with the state described by Puppet, which is regarded as authoritative.
 

A better idea would be to cut out the middleware altogether.  Put the host on which your service runs under Puppet management, if it isn't already.  Manage the service's configuration files via Puppet, and signal it to re-read them when Puppet changes them.  This is a pretty standard pattern.

Not an option. The API should be used exclusively to manipulate the monitoring configuration.
 
A variation on this pattern that might interest you involves building the configuration file from multiple fragments, via the Concat module.  The fragments can be exported by multiple hosts and collected into the catalog for the machine(s) on which the overall file needs to be managed.  Even if there is no alternative to managing your service via its REST API -- maybe it has no static configuration file, for example -- you should still be able to use this or a similar approach to avoid scraping the needed configuration data from PuppetDB.

Yes, that's an idea.  It would avoid the need to continuously scrape the DB. Food for thought.

If you *must* do it this way, then you can configure agents to send reports to the master at the end of each run, and you can configure a report processor on the master that does whatever it needs to do with the resulting data.

That's the kind of stuff I was looking for.
 
If you can get by with less data, and if your catalog runs normally complete without any changes, then you could perhaps also run the agent via a scheduler (instead of as a daemon), with the --detailed-exitcodes option turned on, and trigger your middleware based on the agent reporting via its exit code that changes were applied.

That's valuable information.  I need to look into this.

Absurd.  You should use Execs where they make sense.  Execs can be over-used, but the alternative is to use a different resource type, not to avoid modeling your configuration via resources.  If there isn't already a suitable resource type available, then that involves making one.  And don't tell whoever enforces that policy, but you'll see Execs in a lot of popular and well-respected modules, including many of those published by PuppetLabs.  If you rely on many third-party modules then you probably have some Execs in your manifest set from those sources.

Agreed.
 
If the prohibition against Execs is absolute and incontrovertible, however, then it wouldn't be too hard to write a custom type that just wraps the specific command you need to run, and that executes it (only) when it refreshes.  Deploying an instance of such a type in the right place would be a great deal better than relying on some kind of external middleware to trigger updates based on log traffic.

Agreed.
 

Thanks, John, for the suggestions and insights.  Much appreciated. This information gives me something to work with.

Jelle Smet

Mar 17, 2016, 10:38:36 AM
to Puppet Users
If you can get by with less data, and if your catalog runs normally complete without any changes, then you could perhaps also run the agent via a scheduler (instead of as a daemon), with the --detailed-exitcodes option turned on, and trigger your middleware based on the agent reporting via its exit code that changes were applied.

That's valuable information.  I need to look into this.

This does not seem to be the case for changing exported resources... which is what I'm looking for.
However, if the exported resource data were also written to a file by the puppet agent, then the above would work, but not by merely exporting them.

jcbollinger

Mar 18, 2016, 9:07:32 AM
to Puppet Users


I'm not sure what you think I suggested.  I said the agent reports when it applies changes.  This includes changes associated with exported (possibly by other nodes) resources.  The agent is not involved at all in exporting resources, nor even directly in collecting exported resources.  If you are nowhere collecting and applying the resources you are exporting, then you are even farther out in the boonies than I thought.  You may be able to make some variation on this scheme work, but you are using Puppet much differently than it was designed to be used.


John

Jelle Smet

Mar 20, 2016, 9:43:12 AM
to Puppet Users
I'm not sure what you think I suggested.  I said the agent reports when it applies changes.  This includes changes associated with exported (possibly by other nodes) resources. 
The agent is not involved at all in exporting resources, nor even directly in collecting exported resources.

So let's say we have the following applied to a host:

class some_application ($tags) {
  @@monitor { $fqdn:
    host_tags => $tags,
  }
}

The value of "tags" is automatically looked up in Hiera.

Let's say that someone alters the value of $tags in Hiera for a specific host.
The next time the manifest is applied to the host, this will result in the "host_tags" value for that host changing and being updated in PuppetDB.

As far as I know, unless I'm overlooking something (hence my question), the puppet agent does not report anything about this change.
I might have misinterpreted your answers as saying that the puppet agent should report on that.
 
If you are nowhere collecting and applying the resources you are exporting, then you are even farther out in the boonies than I thought. 

charming :)

 
You may be able to make some variation on this scheme work, but you are using Puppet much differently than it was designed to be used.

What can I say?

  • Puppet agent is not allowed to communicate directly with the RESTful API in any way.
    Therefore a custom type making the puppet agent interface directly with the API (the cleanest approach) is unfortunately not possible.
So, what options are on the table?
All necessary data to configure monitoring is available as exported resources. I don't think it's unreasonable to find a way to take advantage of this.

By all means, you're free to disagree with this approach.  Heck, I even disagree with this approach, but it's what I have to deal with and have no control over.
But whether this is a good idea or not is not the question I'm seeking help/advice about.

I'm just trying to find a way to get notified when exported resources change, and to move on from there.


Thanks for the feedback and insight John, much appreciated.

- Jelle

jcbollinger

Mar 21, 2016, 11:43:53 AM
to Puppet Users


On Sunday, March 20, 2016 at 8:43:12 AM UTC-5, Jelle Smet wrote:
I'm not sure what you think I suggested.  I said the agent reports when it applies changes.  This includes changes associated with exported (possibly by other nodes) resources. 
The agent is not involved at all in exporting resources, nor even directly in collecting exported resources.

So let's say we have the following applied to a host:

class some_application ($tags) {
  @@monitor { $fqdn:
    host_tags => $tags,
  }
}

The value of "tags" is automatically looked up in Hiera.

Let's say that someone alters the value of $tags in Hiera for a specific host.
The next time the manifest is applied to the host, this will result in the "host_tags" value for that host changing and being updated in PuppetDB.


No.  The next time the master builds a catalog for the target node, PuppetDB is updated with the new exported resource information.  The exported resource is not included in that node's catalog unless it is also collected for that node, and if it is not in the catalog then it is not applied.  The class you present does not, itself, specify that anything whatever is applied to any node, and it has the full effect it ever will have during catalog building, even if the catalogs into which it is included are never applied at all.

 

As far as I know, unless I'm overlooking something (hence my question), the puppet agent does not report anything about this change.
I might have misinterpreted your answers as saying that the puppet agent should report on that.


You appear to have a misunderstanding about the nature of exported resources, and what it means to apply a resource.  Exported resources, like virtual resources, provide a means for decoupling resource declaration from including resources in catalogs.  Unlike ordinary resources, neither virtual nor exported resources are automatically included in the catalog of the node for which they are declared.  Exported resources differ from virtual resources in that once declared, exported resources can be included in any node's catalog, whereas virtual resources can be included only in the catalog of the node for which they were declared.  That exported resource data are recorded in PuppetDB is an internal mechanism for implementing that resource sharing, not in any way an end goal, nor a public interface.

Getting back to your case, I'm not sure what a Monitor resource represents in terms of the state of the machine to which it is applied, but my guess at the moment is that it doesn't represent anything at all, that no node is collecting any of these exported resources (nor, therefore, applying them), and that their whole purpose is simply to inject data into PuppetDB.

 
 
If you are nowhere collecting and applying the resources you are exporting, then you are even farther out in the boonies than I thought. 

charming :)

 
You may be able to make some variation on this scheme work, but you are using Puppet much differently than it was designed to be used.

What can I say?

  • Puppet agent is not allowed to communicate directly with the RESTful API in any way.
    Therefore a custom type making the puppet agent interface directly with the API (the cleanest approach) is unfortunately not possible.
So, what options are on the table?
All necessary data to configure monitoring is available as exported resources. I don't think it's unreasonable to find a way to take advantage of this.


Puppet does not emit any notifications when exported resources change.  It does emit notifications at various levels when changes are applied to a node.  Ergo, if you want Puppet to emit notifications about changes to these data, the data have to be associated with resources that have persistent state reflective of the data, and those resources need to actually be applied to some node.

Moreover, since the states of those resources need to fully capture the data that may change in order to accurately report when changes occur, it follows that those resources can serve as an alternative to PuppetDB as a data source for your middleware.  This is preferable to scraping the data from PuppetDB for the same reason that putting triggers into the DB is not an acceptable solution to your problem.

The family of approaches that looks practical to me involves using exported File (or Concat::Fragment) resources to record the data, and having the node on which your middleware runs collect (all of) them.  You could use the agent's exit status to trigger your middleware, but I'd suggest instead setting the middleware up with a service interface (i.e. an initscript if you have SysV-style service management) that hooks the 'restart' command to compile the data (if necessary) and dispatch it to the appropriate REST endpoint.  Then configure the resources to notify that service.

The only caveat here is that changes will not be delivered to the REST endpoint until the middleware node performs a catalog run.  That will involve more delay than you already have, but clearly you don't need changes to be signaled immediately, because you already wait for other nodes' catalog requests to cause PuppetDB to be updated.  You can modulate that to some extent via the Puppet run intervals of the nodes involved.
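The exported-File variant of this pattern might be sketched as follows (all paths, tags, and the service name are assumptions):

```puppet
# Exported by every monitored node: one file of monitoring data each.
@@file { "/etc/middleware/data/${::fqdn}.json":
  content => $monitoring_json,   # placeholder for the real data
  tag     => 'monitoring-data',
}

# On the middleware node: collect all of them, and have any change
# refresh the middleware service, whose 'restart' hook compiles and
# dispatches the data to the REST endpoint.
File <<| tag == 'monitoring-data' |>> {
  notify => Service['middleware'],
}

service { 'middleware':
  ensure => running,
}
```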

 

By all means, you're free to disagree with this approach.  Heck, I even disagree with this approach, but it's what I have to deal with and have no control over.
But whether this is a good idea or not is not the question I'm seeking help/advice about.



I am not so much criticizing the approach as trying to determine where its boundaries are.  And trying to teach you a bit about the Puppet details that impact the problem.  One of the reasons you are having trouble is that you are trying to use Puppet in a way that it was not designed to support.  Still, if your management is not prepared to listen if you tell them that the requirements and constraints they have laid down cannot all be satisfied simultaneously, then you might consider looking for a better job (even if it turns out that you can find an acceptable solution in this particular case).

 
I'm just trying to find a way to get notified when exported resources change, and to move on from there.



If that's your only alternative, and -- as we've already covered -- you must not use triggers, then you are out of luck.  Puppet does not emit any notifications when exported resource data change.  On the other hand, it does emit notifications when resource changes are applied to a node, whatever the type and origin of the resource.  Changes to exported resource data are associated with applying resource changes, but the two are not directly coupled.


John

Klavs Klavsen

Mar 22, 2016, 11:21:20 AM
to Puppet Users
I configure monitoring by letting puppet create exported resources on all hosts (where my rules then figure out what to monitor), and then I simply pull those resources in on the monitor servers, which results in config files for the things to monitor. Works beautifully with nagios/icinga and other text-based configs.

if I were to interface with something stupid (like stuff needing a GUI/REST API to update config), I'd still make puppet write config files, perhaps in JSON format, and then simply catch the puppet return code in the script running puppet on the "config server that pulls the exported resources", and post those new JSON files (the file timestamp tells it like it is ;)) to the config endpoint.

Jelle Smet

Mar 22, 2016, 2:41:43 PM
to Puppet Users
(like stuff needing a GUI/REST API to update config), I'd still make puppet write config files, perhaps in JSON format, and then simply catch the puppet return code in the script running puppet on the "config server that pulls the exported resources", and post those new JSON files (the file timestamp tells it like it is ;)) to the config endpoint.

Yeah, I see your point.

Thanks for the feedback.
 

Jelle Smet

Mar 22, 2016, 2:43:52 PM
to Puppet Users

If that's your only alternative, and -- as we've already covered -- you must not use triggers, then you are out of luck.  Puppet does not emit any notifications when exported resource data change.  On the other hand, it does emit notifications when resource changes are applied to a node, whatever the type and origin of the resource.  Changes to exported resource data are associated with applying resource changes, but the two are not directly coupled.


Understood, thanks for the clarification. This has been very helpful.

All the effort is much appreciated!

- Jelle