puppetmasterd continuously consuming high CPU, with many interrupts


Robin Lee Powell

Jul 2, 2012, 2:42:37 PM
to Puppet Users
So, I have a server at home that has four VMs running inside it.
All are managed via puppet. The physical host runs puppetmasterd.

I don't recall noticing this before, but puppetmasterd has decided
to be kind of crazy. Here's the physical host with no puppetmasterd
running:

top - 11:36:15 up 271 days, 15:16, 1 user, load average: 5.68, 5.50, 6.45
Tasks: 129 total, 1 running, 128 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.6%us, 1.8%sy, 0.0%ni, 80.4%id, 14.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8128776k total, 6991020k used, 1137756k free, 408756k buffers
Swap: 8388604k total, 552356k used, 7836248k free, 185220k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10296 qemu 20 0 2366m 1.5g 8884 S 6.2 19.6 6:37.65 qemu-kvm
17334 qemu 20 0 2788m 1.7g 544 S 2.7 22.3 4576:25 qemu-kvm
9904 qemu 20 0 2358m 581m 8820 S 0.9 7.3 3:55.78 qemu-kvm
1 root 20 0 46880 8076 1372 S 0.0 0.1 0:27.00 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:10.48 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 322:04.84 ksoftirqd/0
6 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
7 root RT 0 0 0 0 S 0.0 0.0 0:11.57 watchdog/0
8 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/1
10 root 20 0 0 0 0 S 0.0 0.0 551:03.31 ksoftirqd/1

And here it is with puppetmasterd running:

top - 11:25:07 up 271 days, 15:05, 1 user, load average: 12.59, 8.68, 7.05
Tasks: 131 total, 3 running, 128 sleeping, 0 stopped, 0 zombie
Cpu(s): 15.2%us, 36.4%sy, 0.0%ni, 6.6%id, 39.7%wa, 0.0%hi, 2.0%si, 0.0%st
Mem: 8128776k total, 6830276k used, 1298500k free, 381356k buffers
Swap: 8388604k total, 555328k used, 7833276k free, 180096k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10660 puppet 20 0 214m 107m 4040 S 61.9 1.3 8:46.81 puppetmasterd
3 root 20 0 0 0 0 S 21.4 0.0 320:38.54 ksoftirqd/0
10 root 20 0 0 0 0 R 20.2 0.0 549:30.88 ksoftirqd/1
10296 qemu 20 0 2470m 1.4g 8888 S 13.1 18.1 4:23.70 qemu-kvm
17334 qemu 20 0 2788m 1.7g 540 S 8.3 22.0 4574:54 qemu-kvm
9904 qemu 20 0 2422m 572m 8820 S 3.6 7.2 3:07.15 qemu-kvm
24980 qemu 20 0 1824m 1.4g 612 S 3.6 18.3 15046:11 qemu-kvm
12209 rlpowell 20 0 15256 1228 908 R 1.2 0.0 0:00.04 top
1 root 20 0 46880 7992 1356 S 0.0 0.1 0:26.97 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:10.48 kthreadd

The high CPU use by puppetmasterd is bad enough, but what makes me
be all like "wait, what?" is the ksoftirqd usage.

Puppet master version is 2.16.

This is *without* a client running; there's no traffic on 8140
according to tcpdump, and there's nothing happening in the log.

http://users.digitalkingdom.org/~rlpowell/media/public/puppetmasterd_strace.txt
has strace output; it's pretty boring, but there are a few select
and rt_sigprocmask calls near the bottom.

I'm totally stumped here. Any ideas?

-Robin

--
http://singinst.org/ : Our last, best hope for a fantastic future.
.i ko na cpedu lo nu stidi vau loi jbopre .i danfu lu na go'i li'u .e
lu go'i li'u .i ji'a go'i lu na'e go'i li'u .e lu go'i na'i li'u .e
lu no'e go'i li'u .e lu to'e go'i li'u .e lu lo mamta be do cu sofybakni li'u

Ashley Penney

Jul 2, 2012, 3:01:34 PM
to puppet...@googlegroups.com
It might be totally unrelated but check for ksoftirqd and see if it's running with high CPU.  The leap second the other day caused all my puppetmasters to spike up to 100% CPU and other people had similar problems.  I notice your server has 271 days uptime so it might not be the cause but it's worth trying to either set the date with date -s or reboot the machine to see if it clears it up.
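
For anyone who wants to script that check, here is a minimal sketch. It assumes procps-style `top` batch output, where %CPU is the ninth column and the command name is the last one. The function name `busy_ksoftirqd` and the default threshold of 10 percent are made up for illustration.

```shell
# Print every ksoftirqd thread whose %CPU is at or above a threshold.
# Reads top-style process lines on stdin.
# Assumption: %CPU is field 9 and the command name is the last field.
busy_ksoftirqd() {
    threshold="${1:-10}"   # hypothetical default, in percent
    awk -v t="$threshold" '$NF ~ /^ksoftirqd/ && $9 + 0 >= t { print $NF, $9 }'
}
```

Something like `top -b -n 1 | busy_ksoftirqd 10` would feed it live data. Run against the "with puppetmasterd running" listing from the original post, it flags both ksoftirqd threads.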

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet...@googlegroups.com.
To unsubscribe from this group, send email to puppet-users...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.


Peter Berghold

Jul 2, 2012, 3:04:08 PM
to puppet...@googlegroups.com
On Mon, Jul 2, 2012 at 3:01 PM, Ashley Penney <ape...@gmail.com> wrote:
> It might be totally unrelated but check for ksoftirqd and see if it's running with high CPU.  The leap second the other day caused all my puppetmasters to spike up to 100% CPU and other people had similar problems.

Glad somebody besides me noticed that.  My Nagios alerts went berserk during the leap second, including CPU alerts for the puppet master.





llo...@oreillyauto.com

Jul 2, 2012, 3:06:19 PM
to puppet...@googlegroups.com


On Monday, July 2, 2012 1:42:37 PM UTC-5, Robin Powell wrote:
> So, I have a server at home that has four VMs running inside it.
> All are managed via puppet.  The physical host runs puppetmasterd.
>
> I don't recall noticing this before, but puppetmasterd has decided
> to be kind of crazy.  Here's the physical host with no puppetmasterd
> running:

 
If this started this weekend, it may be related to the leap second that was applied on June 30 at midnight UTC.

A restart should clear it up, or you can try the items listed here:

http://serverfault.com/q/403732/121905


Lee

Robin Lee Powell

Jul 2, 2012, 3:18:19 PM
to puppet...@googlegroups.com
On Mon, Jul 02, 2012 at 12:06:19PM -0700, llo...@oreillyauto.com
wrote:
Turns out yes, it's the leap second, but boy was the fix I found
easier than that:

http://artipc10.vub.ac.be/wordpress/2012/07/01/leap-second-causing-ksoftirqd-and-java-to-use-lots-of-cpu-time/

$ sudo date -s "`date`"

Cleared it right up.
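
The same one-liner can be wrapped in a small function, in case it needs to go out to several machines. This is a sketch only: it assumes GNU `date` and a locale whose default `date` output can be parsed back by `date -s`. `reset_clock` and the `DRY_RUN` variable are hypothetical names, not part of any tool.

```shell
# Re-set the system clock to the time it already shows.
# Setting the clock needs root, so DRY_RUN=1 (hypothetical switch)
# only prints the command instead of running it.
reset_clock() {
    now="$(date)"    # $(...) does the same job as the backticks
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'date -s "%s"\n' "$now"
    else
        sudo date -s "$now"
    fi
}
```

Running `DRY_RUN=1 reset_clock` first shows exactly what would be executed. A plain `reset_clock` then applies it.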

-Robin

Ken Barber

Jul 2, 2012, 3:23:50 PM
to puppet...@googlegroups.com
> Turns out yes, it's the leap second, but boy was the fix I found
> easier than that:
>
> http://artipc10.vub.ac.be/wordpress/2012/07/01/leap-second-causing-ksoftirqd-and-java-to-use-lots-of-cpu-time/
>
> $ sudo date -s "`date`"
>
> Cleared it right up.

Huh. What a weird fix :-).

ken.

Peter Brown

Jul 2, 2012, 8:19:27 PM
to puppet...@googlegroups.com
Thanks guys.
I have been having a bizarre problem with Java since then and this fixed it.

Walter Heck

Jul 3, 2012, 4:13:49 AM
to puppet...@googlegroups.com
The fix is actually not that weird: a call to the clock_settime()
function is enough (which is what setting the date on the command
line does). More is explained in the comments here:
http://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/

Walter



--
Walter Heck

--
Check out my startup: Puppet training and consulting @ http://www.olindata.com
Follow @olindata on Twitter and/or 'Like' our Facebook page at
http://www.facebook.com/olindata