Prerun, Postrun Commands, and Stages

Josh Cooper

Jun 8, 2011, 8:50:58 PM

to puppe...@googlegroups.com, puppet...@googlegroups.com

Hi all,

I'm looking for background information about how bug #7127[1] should
be fixed: prerun_command don't stop puppet on error

I think there's general agreement that if the prerun command fails,
then the catalog should not be applied, but the report should be sent,
and the report's status should be "failed".

However, what about the post-run command? In particular, if the
catalog is applied successfully, but the postrun command fails, should
the overall run be considered a failure? The documentation[2] says it
should be:

"A command to run after every agent run. If this command returns a
non-zero return code, the entire Puppet run will be considered to have
failed, even though it might have performed work during the normal
run."

But there are several problems with the way the code is currently implemented.

* If the postrun command fails, puppet never sends the report.
* Errors that occur while running the prerun and postrun commands are
not captured in the report's log.
* If the catalog is applied successfully, but the postrun command
fails, the report status is not changed to "failed".

Right now it doesn't matter because the report is never sent, but if I
fix that, it could matter.

Thoughts? The only use case I know of is etckeeper, but its postrun
command, etckeeper-commit-post[3], always returns 0 even if the
etckeeper command fails.

Finally, the prerun command is executed after dostorage,
download_plugins, and download_fact_plugins. Is there a reason for the
prerun command to occur first?

It'd be great to hear about your experience with the pre/post run
commands and what use cases you are trying to solve.

Also, is there anything that is being solved with pre/post run
commands that can't be solved using stages? For example, if the prerun
command, catalog, and postrun commands are executed as stages, in that
order, with each stage depending on its predecessor(s), it would
ensure that:

* An error in one stage would prevent the following stage(s) from executing.
* The report would contain all errors from stages that were executed.
* The report status, resource statuses, and metrics would be consistent.
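To make the staged ordering above concrete, here is a minimal Python sketch, not Puppet code. It assumes each stage is a plain callable that raises on failure; the names Report and run_stages and the "unchanged" starting status are placeholders for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Report:
    status: str = "unchanged"  # placeholder starting status
    logs: List[str] = field(default_factory=list)


def run_stages(stages: List[Tuple[str, Callable[[], None]]]) -> Report:
    """Run stages in order; the first failure skips every later stage."""
    report = Report()
    for name, action in stages:
        try:
            action()
            report.logs.append(f"{name}: ok")
        except Exception as err:
            # Record the error in the report, then stop: later stages never run.
            report.logs.append(f"{name}: failed ({err})")
            report.status = "failed"
            break
    return report


def failing_prerun() -> None:
    raise RuntimeError("prerun command exited 1")


applied = []
report = run_stages([
    ("prerun", failing_prerun),
    ("catalog", lambda: applied.append("catalog")),
    ("postrun", lambda: applied.append("postrun")),
])
print(report.status, report.logs, applied)
```

With the prerun stage failing, neither later stage runs, and the single report still carries the error and a "failed" status.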

Thanks,
Josh

[1] http://projects.puppetlabs.com/issues/7127
[2] http://docs.puppetlabs.com/references/stable/configuration.html#postruncommand
[3] https://code.launchpad.net/~soren/ubuntu/lucid/puppet/etckeeper-integration

Nigel Kersten

Jun 9, 2011, 9:29:38 AM

to puppet...@googlegroups.com, puppe...@googlegroups.com

On Thu, Jun 9, 2011 at 3:15 AM, Dean Wilson <dwi...@unixdaemon.net> wrote:

> On Wed, Jun 08, 2011 at 05:50:58PM -0700, Josh Cooper wrote:
>
> > It'd be great to hear about your experience with the pre/post run
> > commands and what use cases you are trying to solve.
>
> We use the feature to generate additional information about how the
> puppet run has changed the system:
>
> http://www.unixdaemon.net/tools/puppet/nagios-wrapped-puppet-runs.html
>
> In our use case we're only using the two stages as information hooks for
> their side-effects - not to alter the puppet run.
>
> > Also, is there anything that is being solved with pre/post run
> > commands that can't be solved using stages? For example, if the prerun
> > command, catalog, and postrun commands are executed as stages, in that
> > order, with each stage depending on its predecessor(s), it would
> > ensure that:
>
> For my example usage I could quite easily move the commands to be execs
> in the pre and post stages.

One use of pre commands that isn't solved with stages is to check "Should I even do a Puppet run right now?" or anything else that is out of band in a similar sense.

I used to do this in a wrapper script where the nodes would basically look and see if the masters were all under too much load to serve this node or not, or to check whether puppet runs had been disabled centrally, but ideally this would just be a prerun command.
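A rough Python sketch of such a "should I run right now?" check, written as something a prerun command could call. Everything named here is an assumption for illustration: the flag path, the load threshold, and the idea that the master's load figure arrives in a MASTER_LOAD environment variable. How that figure is actually obtained is left out.

```python
import os

DISABLE_FLAG = "/etc/puppet/runs_disabled"  # hypothetical path
MAX_MASTER_LOAD = 8.0                        # hypothetical threshold


def should_run(master_load: float, disabled: bool,
               max_load: float = MAX_MASTER_LOAD) -> bool:
    """Return True when an agent run should go ahead."""
    return not disabled and master_load <= max_load


def exit_code(env=os.environ, flag_path: str = DISABLE_FLAG) -> int:
    """Map the decision to a process exit status: 0 = run, 1 = skip."""
    master_load = float(env.get("MASTER_LOAD", "0"))
    disabled = os.path.exists(flag_path)
    return 0 if should_run(master_load, disabled) else 1


print(should_run(2.0, disabled=False))
print(should_run(12.0, disabled=False))
print(should_run(2.0, disabled=True))
```

A real script would end by passing exit_code() to sys.exit, so that a non-zero status signals "skip this run". That only helps once the agent actually honors a failing prerun command.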

--
Nigel Kersten
Product, Puppet Labs
@nigelkersten

Josh Cooper

Jun 9, 2011, 4:22:47 PM

to puppet...@googlegroups.com, puppe...@googlegroups.com

> One use of pre commands that isn't solved with stages is to check "Should I
> even do a Puppet run right now?" or anything else that is out of band in a
> similar sense.

This makes complete sense and is how the feature was intended to work,
but unfortunately, it never has worked that way (for the agent).
Currently, if the prerun command fails, puppet will attempt to apply
the catalog. Puppet will also always attempt to send a report (due to
#1054), which partially breaks the "master under too much load" use
case.

So there are several ways in which pre/post run failure states can be
handled. I'm curious to hear what you think the default behavior
should be and whether you would like to see these other failure states
supported:

If the prerun_command fails:

1. Ignore the failure, continue applying the catalog, send the report, etc.
2. Stop puppet, don't apply the catalog, don't send the report, and exit(1)
3. Stop puppet, don't apply the catalog, but do send the report,
including information about why the prerun command failed, and exit(1)

#1 is the current behavior, but could also be accomplished by
appending "|| true" to the prerun_command option, e.g. prerun_command
= /bin/meow || true.
#2 was how the feature was originally implemented and how it is
documented, but due to the merge with #1054, the default behavior was
changed to #1.
#3 can also be accomplished using stages. This would be best used in
cases where the prerun command should be "in-band" and its failure
should affect the overall report status, resource_statuses, metrics,
etc.
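The three prerun options can be modeled in a few lines of Python. This is only a sketch of the decision logic, not the agent's code: PrerunPolicy and agent_run are made-up names, catalog application and report submission are reduced to flags in a dict, and a POSIX shell is assumed so that "false" and "|| true" behave as usual.

```python
import subprocess
from enum import Enum


class PrerunPolicy(Enum):
    IGNORE = 1            # option 1: carry on regardless
    ABORT_SILENT = 2      # option 2: stop, send no report
    ABORT_AND_REPORT = 3  # option 3: stop, but send a report


def agent_run(prerun_command: str, policy: PrerunPolicy) -> dict:
    """Model one agent run; returns what would have happened."""
    outcome = {"catalog_applied": False, "report_sent": False,
               "exit_code": 0, "log": []}
    # shell=True so that "|| true" is interpreted by the shell.
    status = subprocess.run(prerun_command, shell=True).returncode
    if status != 0 and policy is not PrerunPolicy.IGNORE:
        outcome["exit_code"] = 1
        if policy is PrerunPolicy.ABORT_AND_REPORT:
            outcome["log"].append(f"prerun_command exited with status {status}")
            outcome["report_sent"] = True
        return outcome
    outcome["catalog_applied"] = True
    outcome["report_sent"] = True
    return outcome


silent = agent_run("false", PrerunPolicy.ABORT_SILENT)
reported = agent_run("false", PrerunPolicy.ABORT_AND_REPORT)
masked = agent_run("false || true", PrerunPolicy.ABORT_SILENT)
print(silent, reported, masked, sep="\n")
```

The last call shows why "|| true" recovers option 1 under any policy: the shell reports success, so the run proceeds as if the prerun command had never failed.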

I'd like to propose that we change the default behavior for
prerun_command to #2 and document how to accomplish #1 and #3.

Similarly, if the postrun command fails, there are several different options:

1. Ignore the failure, send the report with whatever status resulted
from applying the catalog, etc.
2. Stop puppet, don't send the report (even though the catalog may
have been applied), and exit(1)
3. Add the postrun command error to the report, change the report
status to "failed", etc., and exit(1)

#1 can be accomplished by appending "|| true" to the postrun_command option.
#2 is the current behavior.
#3 ideally could be handled using stages, but there is no way
currently to ensure a stage is run.

I'm not sure what the default should be here. For example, if the
postrun command "/sbin/iptables -A rule" fails, should the report have
a "failed" status? If we don't send the report, will you ever know,
will you care?

Josh

Nigel Kersten

Jun 9, 2011, 5:37:04 PM

to puppet...@googlegroups.com, puppe...@googlegroups.com

On Thu, Jun 9, 2011 at 1:22 PM, Josh Cooper <jo...@puppetlabs.com> wrote:

> > One use of pre commands that isn't solved with stages is to check "Should I
> > even do a Puppet run right now?" or anything else that is out of band in a
> > similar sense.
>
> This makes complete sense and is how the feature was intended to work,
> but unfortunately, it never has worked that way (for the agent).
> Currently, if the prerun command fails, puppet will attempt to apply
> the catalog. Puppet will also always attempt to send a report (due to
> #1054), which partially breaks the "master under too much load" use
> case.
>
> So there are several ways in which pre/post run failure states can be
> handled. I'm curious to hear what you think the default behavior
> should be and whether you would like to see these other failure states
> supported:
>
> If the prerun_command fails:
>
> 1. Ignore the failure, continue applying the catalog, send the report, etc.
> 2. Stop puppet, don't apply the catalog, don't send the report, and exit(1)
> 3. Stop puppet, don't apply the catalog, but do send the report,
> including information about why the prerun command failed, and exit(1)

3.

We shouldn't ask the master to compile a catalog or sync facts, but we should report. In some cases the report server isn't the same as the master. As you say, 1) is easily achieved.

We had a chat about this in person, and my feeling is that the postrun command shouldn't change the status of the run (as it's already complete), but its stdout, stderr, and exit status should be captured in the report if possible.

If you want a command to change the status of the run, it can be an Exec in a final Stage.
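The postrun suggestion above (capture the command's output and exit status in the report, but leave the run status alone) might look like this in Python. It is a sketch only: the report is a plain dict, run_postrun is a made-up name, and the stand-in postrun command is a small Python one-liner so the example runs anywhere.

```python
import subprocess
import sys


def run_postrun(command: list, report: dict) -> dict:
    """Run the postrun command and attach its results to the report.

    The report's "status" key is deliberately left untouched.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    report["postrun"] = {
        "exit_code": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }
    return report


# Stand-in postrun command that writes to both streams and exits 3.
failing = [sys.executable, "-c",
           "import sys; print('out'); sys.stderr.write('err'); sys.exit(3)"]

report = run_postrun(failing, {"status": "changed"})
print(report["status"], report["postrun"]["exit_code"])
```

Anyone who wants the failure to affect the run status would use an Exec in a final stage instead, as noted above.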
