> Thanks kindly for the reply. Sorry for the delayed response.
>
> Yes, this is the old report purge that happens every hour for 14 days.
>
> we are using puppetdb-1.4.0 and the default postgresql-8.4.20 that comes
> with that
Upgrade as soon as you can; that version is almost 2 years old
(original announcement here:
https://groups.google.com/forum/#!searchin/puppet-users/puppetdb$201.4.0/puppet-users/WbqIDlJlol4/MUwMDxPpQoUJ),
and there have been numerous changes since then. It would be difficult
for us to debug any issue against such an old revision, and even if we
did, a later version may well have already solved it; there have been a
great many improvements since then:
http://docs.puppetlabs.com/puppetdb/latest/release_notes.html
While you are at it, upgrade your PostgreSQL instance to something
like 9.4. I generally recommend the PGDG upstream packages for that
purpose rather than whatever the distro ships, so you can get the
latest and greatest:
http://yum.postgresql.org/repopackages.php
http://apt.postgresql.org/pub/repos/apt/README
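As a rough sketch only (assuming an EL6-style box and the 9.4 PGDG
repo; the exact repo RPM and init steps depend on your distro, and you
would still need to pg_dumpall from the old 8.4 instance and restore
into the new one), the install side looks roughly like:

    yum install postgresql94-server postgresql94-contrib
    service postgresql-9.4 initdb
    service postgresql-9.4 start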
> this is a small cluster of about 5 machines, and cpuio is getting to 10
So you only have 5 Puppet clients checking into PuppetDB? Is that
what you mean? That's a little surprising if true. Are you sure you
don't have an IO fault somewhere, or that the hardware configuration
isn't just tiny? A missing disk on a RAID 5 volume, for example, can
cause IO slowness.
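A quick sanity check with standard tools (sysstat's iostat, plus
/proc/mdstat if you are using Linux software RAID; skip the latter if
you are on a hardware controller) would tell you whether the disks are
saturated or an array is degraded:

    iostat -x 5 3        # look at %util and await per device
    cat /proc/mdstat     # a [U_] pattern means a degraded md array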
What kind of hardware is PuppetDB running on? Is it virtual? What kind
of disk access does it have: a physical spinning disk array, a single
disk, or perhaps shared disk (via NFS or something similar)? How often
are the nodes checking in? More detail here will help me understand
... can you provide more information about your topology and the
hardware configuration of each host, please?
It might also be easier to send a screenshot of your PuppetDB
dashboard (the one you find by going to http://localhost:8080/) so we
can see some more numbers.
Where are you getting this CPUIO number from? Is it top or some other
tool? Can you provide the raw output of that tool rather than just the
resulting number?
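For example, a batch capture from top plus a few vmstat samples taken
while the purge is running would be ideal (the file names here are
just placeholders):

    top -b -n 1 > top-output.txt
    vmstat 5 6 > vmstat-output.txt    # the 'wa' column is IO wait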
I can't imagine (and I've not seen it before) that 5 Puppet clients
causing high load on a system has anything to do with PuppetDB itself,
unless you're sending reports every minute or something crazy like
that, or the node itself is so small and has such bad IO that it can't
keep up :-).
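If you want to rule the report frequency out, check the agents'
runinterval (the default is 30 minutes); assuming the stock config
location, something like:

    puppet agent --configprint runinterval
    grep runinterval /etc/puppet/puppet.conf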
My initial presumption would be a) a hardware fault, b) poorly tuned
hardware, or c) something OS related (like partition configuration or
similar), and that's where I'd focus the tuning effort.
There is also a small chance this is a bug in that old revision, but I
can't point to one specifically; I suspect it's the environment.
ken.