Overnight my Facter seems to be reporting new errors. As far as I can see,
neither Puppet, Facter, MCollective nor the facts in question were
updated, so I'm struggling to find a cause. The problem is exacerbated
because MCollective is emailing me with the same error every 15 minutes.
This error is printed when I run facter:
Error loading fact /var/lib/puppet/lib/facter/warranty.rb: no such file to load -- facter/util/warranty
The fact is provided by this module (my module, but not my code):
https://forge.puppetlabs.com/jgazeley/dell
The file /var/lib/puppet/lib/facter/warranty.rb does exist on my system
and is readable, as is the referenced submodule in
/var/lib/puppet/lib/facter/util/warranty.rb, so I am not sure what is
causing the problem.
--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-users/54180673.5030003%40bristol.ac.uk.
For more options, visit https://groups.google.com/d/optout.
Jonathan
Trevor
*slow clap*
Well played sir, well played.
On Tue, Sep 16, 2014 at 4:00 PM, Jonathan Gazeley <Jonathan...@bristol.ac.uk> wrote:
On 16/09/14 20:00, Trevor Vaughan wrote:
I had a problem similar to this when an error was introduced to my puppet master's served code base.
To troubleshoot I did the following:
1) Restart the puppetmaster process (passenger, whatever you're using)
2) Remove everything under /var/lib/puppet/lib/facter/* on the clients that are having the issue.
3) Re-run the puppet agent on the affected node
One of two things should happen: either your facts will be re-synced and everything will be fine, or an error will be thrown indicating that you have a problem somewhere else.
If this *doesn't* happen, check the server logs and see if there's something more insidious happening.
Good luck,
Trevor
Thanks for your advice. I'll try this when I'm back at work in the morning. For now, allow me to leave you with this video, which seems to summarise your advice ;)
https://www.youtube.com/watch?v=kb2gzteVNa4
Cheers,
Jonathan
On 17/09/14 17:29, Trevor Vaughan wrote:
> Ah, so looking at warranty.rb and then stdlib...puppet_vardir.rb, you
> may be missing this:
>
> begin
>   require 'facter/util/puppet_settings'
> rescue LoadError => e
>   # puppet apply does not add module lib directories to the $LOAD_PATH (See
>   # #4248). It should (in the future) but for the time being we need to be
>   # defensive which is what this rescue block is doing.
>   rb_file = File.join(File.dirname(__FILE__), 'util', 'puppet_settings.rb')
>   load rb_file if File.exists?(rb_file) or raise e
> end
>
Thanks. I added a similar block of code to load util/warranty.rb but it
didn't cure the situation. Something is different though. Running facter
by hand no longer complains, but my 15-minutely emails from MCollective
as it tries to cache facts are still being sent:
Could not retrieve fact='warranty_start', resolution='<anonymous>': undefined method `[]' for nil:NilClass
Could not retrieve fact='warranty_end', resolution='<anonymous>': undefined method `[]' for nil:NilClass
Could not retrieve fact='warranty_end', resolution='<anonymous>': undefined method `[]' for nil:NilClass
Could not retrieve fact='warranty_days_left', resolution='<anonymous>': can't dup NilClass
BTW #4248 is a very old bug. Surely it's not still being hit?
https://projects.puppetlabs.com/issues/4248
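For readers following along, the "similar block" for util/warranty.rb might look roughly like the sketch below. It wraps the same require-then-load fallback from the stdlib snippet in a small helper so it can be reused. The helper name `require_with_fallback` is hypothetical and does not come from the module or from stdlib, and `File.exist?` is used here because `File.exists?` is no longer available in recent Ruby versions.

```ruby
# Hypothetical helper: try a normal require first, and if the feature is not
# on $LOAD_PATH, fall back to loading the file directly from base_dir.
def require_with_fallback(feature, base_dir)
  require feature
  :required
rescue LoadError => e
  # e.g. 'facter/util/warranty' -> '<base_dir>/warranty.rb'
  rb_file = File.join(base_dir, "#{File.basename(feature)}.rb")
  # Re-raise the original error if there is nothing to fall back to.
  raise e unless File.exist?(rb_file)
  load rb_file
  :loaded
end

# Assumed usage at the top of facter/warranty.rb:
#   require_with_fallback('facter/util/warranty',
#                         File.join(File.dirname(__FILE__), 'util'))
```

Note that this only addresses the "no such file to load" error. It does nothing about facts that later fail because the loaded code returns nil.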
On Thursday, September 18, 2014 4:36:16 AM UTC-5, Jonathan Gazeley wrote:
Could not retrieve fact='warranty_start', resolution='<anonymous>': undefined method `[]' for nil:NilClass
Could not retrieve fact='warranty_end', resolution='<anonymous>': undefined method `[]' for nil:NilClass
Could not retrieve fact='warranty_end', resolution='<anonymous>': undefined method `[]' for nil:NilClass
Could not retrieve fact='warranty_days_left', resolution='<anonymous>': can't dup NilClass
That suggests to me that you have moved on to a different issue. Matching up the code (facter/warranty.rb and facter/util/warranty.rb) with the error messages you report, it seems likely that the function Facter::Util::Warranty::get_data() is failing on you. That will cause both Facter::Util::Warranty::purchase_date() and Facter::Util::Warranty::warranties() to return nil. The ::warranty_start, ::warranty_end, and ::warranty_days_left facts all depend on those two functions and assume, without testing, that they never return nil; hence the observed errors.
As to why get_data() may be failing, it looks like there are several possibilities:
1) The function relies on calling an HTTP API at Dell to retrieve data (lines 6-12 and 60-72). It may be that that API has been modified or removed, or that it no longer serves data for the particular machines in question.
2) The get_data() function relies on the ::serialnumber fact. I am uncertain how that fact is computed, but maybe something changed that affected its result.
3) The get_data() function uses a local cache to avoid calling the Dell API every time. If that cache has been corrupted but not altogether removed then perhaps get_data() would fail.
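To illustrate the third possibility, here is a minimal sketch of a cache read that treats a corrupted file the same as a missing one. It assumes the cache is a YAML file containing a hash, which is an assumption for illustration and not confirmed from the module's code; the function name `read_cache` is likewise hypothetical.

```ruby
require 'yaml'

# Hypothetical cache reader: returns the cached hash, or nil when the file
# is missing, empty, unparseable, or not a hash. A nil result would tell the
# caller to refresh from the remote API instead of failing.
def read_cache(path)
  return nil unless File.file?(path)
  data = YAML.load_file(path)
  data.is_a?(Hash) ? data : nil
rescue Psych::SyntaxError
  # Corrupted YAML is treated like a missing cache.
  nil
end
```

A quick manual check along the same lines is to move the cache file aside and run facter again, to see whether a fresh fetch succeeds.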
On 18/09/14 14:27, jcbollinger wrote:
2) The get_data() function relies on the ::serialnumber fact. I am uncertain how that fact is computed, but maybe something changed that affected its result.
On Dell machines, the ::serialnumber fact returns the service tag of the hardware. I have no idea how it retrieves that data, but it only prints out the fact when run as root. Perhaps this is breaking it, if this fact can't read other facts.
3) The get_data() function uses a local cache to avoid calling the Dell API every time. If that cache has been corrupted but not altogether removed then perhaps get_data() would fail.
It's a possibility that I will investigate. In the meantime, thanks for your suggestions.
On Thursday, September 18, 2014 9:26:12 AM UTC-5, Jonathan Gazeley wrote:
On 18/09/14 14:27, jcbollinger wrote:
2) The get_data() function relies on the ::serialnumber fact. I am uncertain how that fact is computed, but maybe something changed that affected its result.
On Dell machines, the ::serialnumber fact returns the service tag of the hardware. I have no idea how it retrieves that data, but it only prints out the fact when run as root. Perhaps this is breaking it, if this fact can't read other facts.
If Puppet is running as an unprivileged account on the client then that very likely would explain the problem, but that would be unusual. There could also have been some kind of access control change around the mechanism by which the ::serialnumber is determined, but you already said you weren't seeing anything relevant in your audit log.
3) The get_data() function uses a local cache to avoid calling the Dell API every time. If that cache has been corrupted but not altogether removed then perhaps get_data() would fail.
It's a possibility that I will investigate. In the meantime, thanks for your suggestions.
Any way around, you could consider modifying the implementations of the ::warranty_start, ::warranty_end, and ::warranty_days_left facts to handle this situation more gracefully. I account them flawed for not doing so now.
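As a rough illustration of what "more gracefully" could mean, the sketch below guards against nil before indexing or computing dates. The function names, the 'end' hash key, and the ISO date format are all hypothetical and are not taken from the module. The idea is that a fact's setcode block would return nil when no data is available, which Facter generally treats as an unresolved fact, instead of raising.

```ruby
require 'date'

# Hypothetical: pick the latest warranty end date, or nil when the lookup
# returned nothing usable. Assumes each entry is a hash with an 'end' key
# holding an ISO 8601 date string, so string comparison orders the dates.
def safe_warranty_end(warranties)
  return nil unless warranties.respond_to?(:map)
  dates = warranties.map { |w| w.is_a?(Hash) ? w['end'] : nil }.compact
  dates.empty? ? nil : dates.max
end

# Hypothetical: days remaining until end_date, or nil if it is missing
# or cannot be parsed as a date.
def safe_days_left(end_date, today = Date.today)
  return nil if end_date.nil?
  (Date.parse(end_date.to_s) - today).to_i
rescue ArgumentError
  nil
end
```

With guards like these, a failed lookup would leave the warranty facts unset on that node rather than producing an error on every run.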