PuppetDB low catalog duplication rate (PuppetDB 4.3.0)


Mike Sharpton

unread,
Jun 28, 2017, 2:11:17 PM6/28/17
to Puppet Users
Hey all,

I am hoping there is someone else in the same boat as I am. We are running Puppet 4.2.2 along with PuppetDB 4.3.0. I am seeing a low catalog duplication rate, which I think is contributing to our queuing problems in PuppetDB: the queue fluctuates anywhere from 0-100 queued up to 2000. We have around 4500 nodes, and we are using 8 threads on our PuppetDB server.

From what I can tell, the low duplication rate means catalog hashes are not matching, so a full insert runs (which is expensive on the database) instead of just a timestamp update. I don't know why the hashes would not be matching, and I may need help figuring out how to track something like this down. I see material on this for PuppetDB 3, but not 4. I understand that including timestamps or other values that change on every compile will cause the catalog to never match, but I would expect a 0% duplication rate if that were the case.

I also see that 4.4.0 improves performance and fixes a missing index, which may speed things up. I am wondering what others have done or seen with this, and whether upgrading to 4.4.0 would do me good. I am thinking it would, as many of the fixes appear to be around the issues I am seeing. Thanks in advance,
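(For anyone wanting to check their own rate: PuppetDB publishes storage metrics as JMX mbeans over HTTP, including a catalog duplication percentage. A hedged sketch below; the metric name is the one the PuppetDB dashboard reads, but it may differ by version, and the 0.75 threshold is an arbitrary example. The live curl is commented out and a sample response stands in so the sketch is self-contained.)

```shell
# Live query against a running PuppetDB (commented out here):
# response=$(curl -s 'http://localhost:8080/metrics/v1/mbeans/puppetlabs.puppetdb.storage:name=duplicate-pct')

# Sample response for illustration:
response='{"Value": 0.12}'

# Extract the value and flag a low duplication rate (threshold is arbitrary).
pct=$(printf '%s' "$response" | python3 -c 'import json,sys; print(json.load(sys.stdin)["Value"])')
echo "catalog duplication rate: $pct"
awk -v p="$pct" 'BEGIN { exit !(p < 0.75) }' && echo "low duplication: expect full catalog inserts"
```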

Mike

Christopher Wood

unread,
Jun 28, 2017, 3:26:38 PM6/28/17
to puppet...@googlegroups.com
I had a broadly similar issue: a low catalog duplication rate that I had to change some Puppet manifests around to fix.

Back in 2015 I was doing this to get mcollective plugin sources for the file resource:

source => regsubst(keys($plugins), '^', 'puppet:///modules/mco/plugins/')

But obviously keys() returns things in any old order and every catalog was different. The solution was to sort it:

source => regsubst(sort(keys($plugins)), '^', 'puppet:///modules/mco/plugins/')
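(To illustrate why the unsorted version hurt: the same logical data serialized in two different orders hashes to two different values, so every compile looked like a new catalog. A quick stand-alone demo, not Puppet itself, using JSON key order as the analogue of keys() ordering:)

```shell
# Same logical object, two key orders: the checksums differ,
# just as an unsorted keys() makes every catalog hash different.
h1=$(printf '{"a": 1, "b": 2}' | sha1sum | cut -d' ' -f1)
h2=$(printf '{"b": 2, "a": 1}' | sha1sum | cut -d' ' -f1)
echo "unsorted: $h1 vs $h2"

# Normalizing the order (here with json.tool --sort-keys) makes them match,
# which is what wrapping keys() in sort() achieves inside the manifest.
n1=$(printf '{"a": 1, "b": 2}' | python3 -m json.tool --sort-keys | sha1sum | cut -d' ' -f1)
n2=$(printf '{"b": 2, "a": 1}' | python3 -m json.tool --sort-keys | sha1sum | cut -d' ' -f1)
echo "sorted:   $n1 vs $n2"
```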

In your place I would grab a few different catalogs for the same host, pretty-format them, and diff them to see what's different. That will show you what's changing between runs. The easy way to do this is to trigger several agent runs and use curl on the PuppetDB host in between them.

curl http://localhost:8080/pdb/query/v4/catalogs/host.domain.com | python -m json.tool >/tmp/cat1
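(The whole compare step, sketched end to end. Sample catalogs stand in here so it runs without a live PuppetDB; in a real run each /tmp/catN would come from the curl command above, once per agent run. The "version" field and its values are hypothetical examples of something that changes every compile.)

```shell
# Stand-ins for two fetched catalogs; normalize key order while pretty-printing
# so the diff shows real differences, not serialization noise.
printf '{"version": "1498672277", "resources": ["File[/etc/motd]"]}' \
    | python3 -m json.tool --sort-keys > /tmp/cat1
printf '{"version": "1498672391", "resources": ["File[/etc/motd]"]}' \
    | python3 -m json.tool --sort-keys > /tmp/cat2

# diff exits non-zero when the catalogs differ; the changed lines are the
# values breaking the hash match between compiles.
diff /tmp/cat1 /tmp/cat2 || echo "catalogs differ: inspect the lines above"
```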

Peter Krawetzky

unread,
Jul 7, 2017, 8:03:12 AM7/7/17
to Puppet Users
I ran the curl command listed below and it came back with nothing, so I used pgAdmin to look at the catalogs table, and it's completely empty. The system has been running for almost 24 hours since dropping and recreating the PostgreSQL database. Any idea why the catalogs table would be empty?

Mike Sharpton

unread,
Jul 10, 2017, 10:47:24 AM7/10/17
to Puppet Users
My guess is that if you are not on PuppetDB 4, the URI would be different, specifically the v4 part. I still have not had time to dig into this myself, but I was able to query via the URI given above.

Peter Krawetzky

unread,
Jul 10, 2017, 3:20:39 PM7/10/17
to Puppet Users
Yes, I am on v4, and the query just didn't return any results. No errors, so I assume I am using the correct curl command. Thanks.

