Any advice on "Puppet::Util::Log requires a message" failure

1,655 views
Skip to first unread message

Roman Shaposhnik

unread,
Dec 11, 2012, 12:21:09 PM12/11/12
to puppe...@googlegroups.com
Hi!

all of a sudden the puppet code that deploys Hadoop
clusters in Bigtop started to trigger:
err: /Stage[main]/Hadoop_head_node/Solr::Server[solrcloud
server]/Solr::Solrcloud_config[collection1]/Exec[ZK collection1 config
upload]: Could not evaluate: Puppet::Util::Log requires a message

The behavior is not 100% reproducible and it doesn't
happen with the same resource in different runs, but
within the entire run it is almost guaranteed to happen.
So far -- I've seen it triggered for Exec and Package
resources.

This happens with quite a few different Puppet versions
I tried (2.6.X, 2.7.X) on various Linux distros (RHEL, SLES)
so I suspect this is a combination of a Puppet issue (perhaps
this one: http://projects.puppetlabs.com/issues/17887 )
and something in my code that is triggering it.

Taking a hint from the proposed fix to the bug report
I quoted above, I was able to generate the stacktrace
at the point where it happens (attached).

I would be happy to provide additional details to help
fix this issue in future versions of Puppet, however,
what I'm really, really in need of is any advice
that would help me rejigger the Puppet code so that
I don't bump into this issue that often.

I'm really quite desperate at this point :-(

Thanks,
Roman.

Here's the snippet of code I inserted in puppet/util/log.rb

begin
raise ArgumentError, "Puppet::Util::Log requires a message" unless msg
@message = msg.to_s
rescue => e
@message = "EMPTY MESSAGE: " + e.backtrace.to_s
end

and here's the resulting stack trace (as you can see it is now failing
in a different resources but failing nonetheless):

notice: /Stage[main]/Hadoop_head_node/Hadoop::Create_hdfs_dirs[/var/log]/Exec[HDFS
init /var/log]/returns: EMPTY MESSAGE:
/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:222:
in `message='/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:203
in `initialize'/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:81
in `new'/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:81
in `create'/usr/lib/ruby/site_ruby/1.8/puppet/util/logging.rb:7
in `send_log'/usr/lib/ruby/site_ruby/1.8/puppet/transaction/event.rb:38
in `send_log'/usr/lib/ruby/site_ruby/1.8/puppet/transaction/resource_harness.rb:126
in `apply_parameter'/usr/lib/ruby/site_ruby/1.8/puppet/transaction/resource_harness.rb:62
in `perform_changes'/usr/lib/ruby/site_ruby/1.8/puppet/transaction/resource_harness.rb:60
in `each'/usr/lib/ruby/site_ruby/1.8/puppet/transaction/resource_harness.rb:60
in `perform_changes'/usr/lib/ruby/site_ruby/1.8/puppet/transaction/resource_harness.rb:133
in `evaluate'/usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:48
in `apply'/usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:113
in `eval_children_and_apply_resource'/usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:91
in `eval_resource'/usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:138
in `evaluate'/usr/lib/ruby/site_ruby/1.8/puppet/util.rb:403
in `thinmark'/usr/lib/ruby/1.8/benchmark.rb:293
in `measure'/usr/lib/ruby/1.8/benchmark.rb:307
in `realtime'/usr/lib/ruby/site_ruby/1.8/puppet/util.rb:402
in `thinmark'/usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:137
in `evaluate'/usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:130
in `each'/usr/lib/ruby/site_ruby/1.8/puppet/transaction.rb:130
in `evaluate'/usr/lib/ruby/site_ruby/1.8/puppet/resource/catalog.rb:150
in `apply'/usr/lib/ruby/site_ruby/1.8/puppet/configurer.rb:120
in `retrieve_and_apply_catalog'/usr/lib/ruby/site_ruby/1.8/puppet/util.rb:178
in `benchmark'/usr/lib/ruby/1.8/benchmark.rb:293
in `measure'/usr/lib/ruby/1.8/benchmark.rb:307
in `realtime'/usr/lib/ruby/site_ruby/1.8/puppet/util.rb:177
in `benchmark'/usr/lib/ruby/site_ruby/1.8/puppet/configurer.rb:119
in `retrieve_and_apply_catalog'/usr/lib/ruby/site_ruby/1.8/puppet/configurer.rb:150
in `run'/usr/lib/ruby/site_ruby/1.8/puppet/application/apply.rb:131
in `main'/usr/lib/ruby/site_ruby/1.8/puppet/application/apply.rb:35
in `run_command'/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:305
in `run'/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:420
in `hook'/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:305
in `run'/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:411
in `exit_on_fail'/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:305
in `run'/usr/lib/ruby/site_ruby/1.8/puppet/util/command_line.rb:62
in `execute'/usr/bin/puppet:4

Dominic Cleal

unread,
Dec 11, 2012, 12:34:57 PM12/11/12
to puppe...@googlegroups.com
On 11/12/12 17:21, Roman Shaposhnik wrote:
> Hi!
>
> all of a sudden the puppet code that deploys Hadoop
> clusters in Bigtop started to trigger:
> err: /Stage[main]/Hadoop_head_node/Solr::Server[solrcloud
> server]/Solr::Solrcloud_config[collection1]/Exec[ZK collection1 config
> upload]: Could not evaluate: Puppet::Util::Log requires a message
>
> The behavior is not 100% reproducible and it doesn't
> happen with the same resource in different runs, but
> within the entire run it is almost guaranteed to happen.
> So far -- I've seen it triggered for Exec and Package
> resources.
>
> This happens with quite a few different Puppet versions
> I tried (2.6.X, 2.7.X) on various Linux distros (RHEL, SLES)
> so I suspect this is a combination of a Puppet issue (perhaps
> this one: http://projects.puppetlabs.com/issues/17887 )
> and something in my code that is triggering it.
>
> Taking a hint from the proposed fix to the bug report
> I quoted above, I was able to generate the stacktrace
> at the point where it happens (attached).

During a run is odd, since I normally see it in a test harness where
rspec/mocha performs some sort of quick exit, bypassing the exception
routines.

Have you tried applying the patch I wrote for the issue?

https://github.com/puppetlabs/puppet/pull/1307/files
https://github.com/domcleal/puppet/commit/6662ef4.patch

> Here's the snippet of code I inserted in puppet/util/log.rb
>
> begin
> raise ArgumentError, "Puppet::Util::Log requires a message" unless msg
> @message = msg.to_s
> rescue => e
> @message = "EMPTY MESSAGE: " + e.backtrace.to_s
> end
>
> and here's the resulting stack trace (as you can see it is now failing
> in a different resources but failing nonetheless):
>
> notice: /Stage[main]/Hadoop_head_node/Hadoop::Create_hdfs_dirs[/var/log]/Exec[HDFS
> init /var/log]/returns: EMPTY MESSAGE:
> /usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:222:
> in `message='/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:203
> in `initialize'/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:81

That's the trouble with the error, it's triggered from within the
logging code and so it doesn't show you the context of what's happening
inside the run.

> in `new'/usr/lib/ruby/site_ruby/1.8/puppet/util/log.rb:81
> in `create'/usr/lib/ruby/site_ruby/1.8/puppet/util/logging.rb:7
> in `send_log'/usr/lib/ruby/site_ruby/1.8/puppet/transaction/event.rb:38
> in `send_log'/usr/lib/ruby/site_ruby/1.8/puppet/transaction/resource_harness.rb:126

This looks like the same area (while applying a property change), try
the patch.

--
Dominic Cleal
Red Hat Engineering

Andy Parker

unread,
Dec 11, 2012, 12:53:01 PM12/11/12
to puppe...@googlegroups.com
On Tue, Dec 11, 2012 at 9:21 AM, Roman Shaposhnik <shapo...@gmail.com> wrote:
Hi!

all of a sudden the puppet code that deploys Hadoop
clusters in Bigtop started to trigger:
    err: /Stage[main]/Hadoop_head_node/Solr::Server[solrcloud
server]/Solr::Solrcloud_config[collection1]/Exec[ZK collection1 config
upload]: Could not evaluate: Puppet::Util::Log requires a message

The behavior is not 100% reproducible and it doesn't
happen with the same resource in different runs, but
within the entire run it is almost guaranteed to happen.
So far -- I've seen it triggered for Exec and Package
resources.

Have you been able to come up with a smaller catalog that reproduces the problem? Or something that doesn't include any sensitive information that you could send?
 

This happens with quite a few different Puppet versions
I tried (2.6.X, 2.7.X) on various Linux distros (RHEL, SLES)
so I suspect this is a combination of a Puppet issue (perhaps
this one: http://projects.puppetlabs.com/issues/17887 )
and something in my code that is triggering it.


I agree with Dominic that this is probably related to the same issue, but I'm curious as to what is causing it.
 
Taking a hint from the proposed fix to the bug report
I quoted above, I was able to generate the stacktrace
at the point where it happens (attached).

I would be happy to provide additional details to help
fix this issue in future versions of Puppet, however,
what I'm really, really  in need of is any advice
that would help me rejigger the Puppet code so that
I don't bump into this issue that often.


Looking over the code that is part of the stack trace, I don't see anything obvious. It does look like something has to be happening in the Puppet::Transaction::ResourceHarness#apply_parameter method. Looking over it, the only way we would be getting an empty message is if auditing is not turned on and two exceptions are raised: one needs to be raised in the main section of the method before it has had the chance to set the message and the second has to happen in the rescue before the message gets set there. Those two together would cause the code to execute the `event.log_send` code without ever having received a message. Then when the exception occurs inside the ensure block the original error is lost and the new one is exposed.

I'd really like to know what the original error was. My suspicion is that because this is happening and various points it is either an out of memory or related error. Applying Dominic's patch should expose the underlying error that is occurring.
 

--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To post to this group, send email to puppe...@googlegroups.com.
To unsubscribe from this group, send email to puppet-dev+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-dev?hl=en.


Roman Shaposhnik

unread,
Dec 11, 2012, 1:21:59 PM12/11/12
to puppe...@googlegroups.com
Andy, Dominic,

thanks a million for replying so quickly!

On Tue, Dec 11, 2012 at 9:53 AM, Andy Parker <an...@puppetlabs.com> wrote:
> Have you been able to come up with a smaller catalog that reproduces the
> problem? Or something that doesn't include any sensitive information that
> you could send?

There's nothing sensitive about that puppet code -- it is all part of
the Apache Bigtop project and is meant to be a portable way of
deploying Hadoop stacks (be gentle -- the code was written
prior to parameterized classes/hiera and it shows ;-)):
https://github.com/apache/bigtop/tree/master/bigtop-deploy/puppet
I will try to reproduce it with smaller catalog -- sure, but I don't think
I can make it small enough for the problem to be obvious :-(

> I'd really like to know what the original error was. My suspicion is that
> because this is happening and various points it is either an out of memory
> or related error. Applying Dominic's patch should expose the underlying
> error that is occurring.

Here's a curious thing that happens when I do that. In the middle of the
run I start seeing the following stacktrace:

/usr/lib/ruby/site_ruby/1.8/puppet/util/errors.rb:60:in `fail'
/usr/lib/ruby/site_ruby/1.8/puppet/provider/service/base.rb:129:in `texecute'
/usr/lib/ruby/site_ruby/1.8/puppet/provider/service/base.rb:141:in `ucommand'
/usr/lib/ruby/site_ruby/1.8/puppet/provider/service/base.rb:82:in `start'
/usr/lib/ruby/site_ruby/1.8/puppet/type/service.rb:66:in `set_running'
/usr/lib/ruby/site_ruby/1.8/puppet/property.rb:81:in `send'
/usr/lib/ruby/site_ruby/1.8/puppet/property.rb:81:in `call_valuemethod'
/usr/lib/ruby/site_ruby/1.8/puppet/property.rb:256:in `set'
/usr/lib/ruby/site_ruby/1.8/puppet/property.rb:310:in `sync'
/usr/lib/ruby/site_ruby/1.8/puppet/type/service.rb:77:in `sync'
/usr/lib/ruby/site_ruby/1.8/puppet/transaction/resource_harness.rb:114:in
`apply_parameter'
/usr/lib/ruby/site_ruby/1.8/puppet/transaction/resource_harness.rb:56:in
`perform_changes'
/usr/lib/ruby/site_ruby/1.8/puppet/transaction/resource_harness.rb:135:in
And at the very end of the run I see:

/usr/lib/ruby/site_ruby/1.8/puppet/util.rb:276:in `waitpid2': SIGHUP
(SignalException)
from /usr/lib/ruby/site_ruby/1.8/puppet/util.rb:276:in `execute'
from /usr/lib/ruby/site_ruby/1.8/puppet/util/suidmanager.rb:150:in
`run_and_capture'
from /usr/lib/ruby/site_ruby/1.8/puppet/provider/exec/posix.rb:59:in `run'
from /usr/lib/ruby/1.8/timeout.rb:56:in `timeout'
from /usr/lib/ruby/site_ruby/1.8/puppet/provider/exec/posix.rb:58:in `run'
from /usr/lib/ruby/site_ruby/1.8/puppet/util/execution.rb:12:in `withenv'
from /usr/lib/ruby/site_ruby/1.8/puppet/provider/exec/posix.rb:57:in `run'
from /usr/lib/ruby/site_ruby/1.8/puppet/provider/exec/posix.rb:36:in `chdir'
... 33 levels...
from /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:411:in `exit_on_fail'
from /usr/lib/ruby/site_ruby/1.8/puppet/application.rb:305:in `run'
from /usr/lib/ruby/site_ruby/1.8/puppet/util/command_line.rb:62:in `execute'
from /usr/bin/puppet:4

Here's what the system is:

$ cat /etc/*release
CentOS release 5.6 (Final)
$ ruby --version
ruby 1.8.5 (2006-08-25) [x86_64-linux]
$ puppet --version
2.6.17
$ rpm -qi puppet
Name : puppet Relocations: (not relocatable)
Version : 2.6.17 Vendor: Fedora Project
Release : 2.el5 Build Date: Thu 19 Jul
2012 01:25:32 PM EDT
Install Date: Mon 10 Dec 2012 11:37:31 AM EST Build Host:
x86-04.phx2.fedoraproject.org
Group : System Environment/Base Source RPM:
puppet-2.6.17-2.el5.src.rpm
Size : 2534255 License: GPLv2
Signature : DSA/SHA1, Thu 19 Jul 2012 05:09:16 PM EDT, Key ID 119cc036217521f6
Packager : Fedora Project
URL : http://puppetlabs.com
Summary : A network tool for managing many disparate systems
Description :
Puppet lets you centrally manage every important aspect of your system using a
cross-platform specification language that manages all the separate elements
normally aggregated in different files, like users, cron jobs, and hosts,
along with obviously discrete elements like packages, services, and files.

Thanks,
Roman.

Andy Parker

unread,
Dec 11, 2012, 2:08:16 PM12/11/12
to puppe...@googlegroups.com
On Tue, Dec 11, 2012 at 10:21 AM, Roman Shaposhnik <shapo...@gmail.com> wrote:
Andy, Dominic,

thanks a million for replying so quickly!

On Tue, Dec 11, 2012 at 9:53 AM, Andy Parker <an...@puppetlabs.com> wrote:
> Have you been able to come up with a smaller catalog that reproduces the
> problem? Or something that doesn't include any sensitive information that
> you could send?

There's nothing sensitive about that puppet code -- it is all part of
the Apache Bigtop project and is meant to be a portable way of
deploying Hadoop stacks (be gentle -- the code was written
prior to parameterized classes/hiera and it shows ;-)):
    https://github.com/apache/bigtop/tree/master/bigtop-deploy/puppet
I will try to reproduce it with smaller catalog -- sure, but I don't think
I can make it small enough for the problem to be obvious :-(

> I'd really like to know what the original error was. My suspicion is that
> because this is happening and various points it is either an out of memory
> or related error. Applying Dominic's patch should expose the underlying
> error that is occurring.

Here's a curious thing that happens when I do that. In the middle of the
run I start seeing the following stacktrace:


That stacktrace is it trying to execute the start command for some service and failing. It should be providing a message like "Could not #{type} #{@resource.ref}: #{detail}". Do you get anything like at in the logs?
Is your catalog maybe installing a new puppet package? That looks like the puppet process is being sent the HUP signal.
 

Roman Shaposhnik

unread,
Dec 13, 2012, 11:54:52 AM12/13/12
to puppe...@googlegroups.com
On Tue, Dec 11, 2012 at 11:08 AM, Andy Parker <an...@puppetlabs.com> wrote:
> That stacktrace is it trying to execute the start command for some service
> and failing. It should be providing a message like "Could not #{type}
> #{@resource.ref}: #{detail}". Do you get anything like at in the logs?

Yes I do. In fact, let me do a couple more runs on a 100% clean VMs
so that we can be certain that there's no interference from anything else.

> Is your catalog maybe installing a new puppet package? That looks like the
> puppet process is being sent the HUP signal.

Nope. Nothing like that. And given that these are EC2 VMs I control 100%
I doubt there's something else going on. But, as I said -- I'll do a couple
more runs today to make 100% sure.

Thanks for all your help so far!

Thanks,
Roman.

Josh Cooper

unread,
Jan 16, 2013, 7:45:03 PM1/16/13
to puppe...@googlegroups.com, ro...@shaposhnik.org
Exec commands are executed within a Timeout::timeout block. If the command times out, a Timeout::Error will be raised. In ruby 1.8 this is not a StandardError, so `rescue => exception` will not catch it, and this is what happens in the resource_harness.

Another place this can happen is when using file resources with remote sources, e.g. puppet:///modules/..., since ruby's HTTP code will raise Timeout::Error on a read timeout.

On ruby 1.9, Timeout::Error is a StandardError, so this problem shouldn't occur there.

I've updated Dominic's ticket http://projects.puppetlabs.com/issues/17887 with this info.

Josh 
Reply all
Reply to author
Forward
0 new messages