CPU Spikes

Tim Burkhart

Dec 4, 2014, 1:19:07 PM
to particula...@googlegroups.com
Product name: NServiceBus
Version: 5.1.2
Description:

We have 4 NServiceBus processes on our QA machine running with the RabbitMQ transport. For the most part they run at about 0% CPU; however, last night and again this morning, we noticed that they each spike to 20-30% CPU usage for a very long time (I have to kill the processes to bring the machine back down from 100% CPU usage).

One of the processes had a config of:

<TransportConfig MaximumConcurrencyLevel="32" MaxRetries="2" MaximumMessageThroughputPerSecond="32" />

The others were all right around:

<TransportConfig MaximumConcurrencyLevel="1" MaxRetries="2" MaximumMessageThroughputPerSecond="1" />

Could the concurrency level of 32 be the culprit for this? I read that permissions problems can also cause this, but wouldn't that show up immediately rather than after a long period of time?
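
For comparison, the same settings can also be supplied in code via NServiceBus's IProvideConfiguration hook, which makes it easy to trial a lower concurrency level without redeploying config files. A minimal sketch, assuming NServiceBus 5 and a made-up class name; the property names mirror the XML attributes above, and the values are illustrative only:

using NServiceBus.Config;
using NServiceBus.Config.ConfigurationSource;

// Picked up automatically by NServiceBus because it implements
// IProvideConfiguration<TransportConfig>; it takes the place of the
// <TransportConfig> element. Values below are illustrative, not a recommendation.
class ProvideTransportConfig : IProvideConfiguration<TransportConfig>
{
    public TransportConfig GetConfiguration()
    {
        return new TransportConfig
        {
            MaximumConcurrencyLevel = 8,            // e.g. step down from 32 to test
            MaxRetries = 2,
            MaximumMessageThroughputPerSecond = 8   // 0 would mean unthrottled
        };
    }
}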

Tim Burkhart

Dec 4, 2014, 2:43:59 PM
to particula...@googlegroups.com
They also spike for about 5 seconds roughly every 3-5 minutes. I have ServiceControl/ServicePulse installed, and it is monitoring those endpoints. Could that be the reason for the short bursts of CPU usage?

I have the performance monitoring utility open, and during these 5-second bursts only an inconsequential number of messages are being processed by the services.
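
One way to tell whether those bursts line up with ServiceControl's monitoring traffic rather than message handling is to sample the endpoint process's CPU over time and compare the timestamps. A minimal sketch using the standard System.Diagnostics PerformanceCounter API; "MyEndpoint.Host" is a placeholder for the actual process name:

using System;
using System.Diagnostics;
using System.Threading;

class CpuSampler
{
    static void Main()
    {
        // "MyEndpoint.Host" is a placeholder: use the endpoint's process name
        // exactly as it appears in Task Manager, without the .exe extension.
        var counter = new PerformanceCounter("Process", "% Processor Time", "MyEndpoint.Host");
        counter.NextValue(); // the first sample is always 0, so prime the counter

        while (true)
        {
            Thread.Sleep(1000);
            // The raw value is summed across cores; divide by the core count
            // to get a figure comparable to Task Manager's percentage.
            var cpu = counter.NextValue() / Environment.ProcessorCount;
            Console.WriteLine("{0:HH:mm:ss}  {1,5:F1} %", DateTime.Now, cpu);
        }
    }
}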

Tim Burkhart

Dec 4, 2014, 2:47:24 PM
to particula...@googlegroups.com
Actually, they are 15-second bursts.

Shook, Joseph

Dec 4, 2014, 4:11:54 PM
to particula...@googlegroups.com

Tim, we have also seen CPU spikes. We have the same setup you described. Just confirming you are not the only one.

Because we have so much churn right now, and our integration environment keeps redeploying to the servers, we haven't dug into the problem.

One thing of note from QA: right after creating a VM snapshot, our 5 NSB endpoints were all running at 100% CPU and had to be restarted. Again, I have not investigated this carefully, but I fear there is a problem here that we will have to address soon.

Joseph Shook | Senior Software Engineer | Surescripts LLC |

O: 503-906-6045 | C: 503-784-9357 | Joseph...@Surescripts.com

Tim Burkhart

Dec 4, 2014, 5:49:52 PM
to particula...@googlegroups.com
After digging around for a little bit, I may have a theory for why ours spike and force us to kill the processes...

We have a scheduled job that runs every 5 minutes; it aggregates some data from the database and publishes an enormous message (a single class containing a list of 30,000 items, each a simple DTO). My theory is that the serialization/transfer/deserialization is spiking the CPU, especially since all of the processes handle this message. As a result, the Erlang VM's CPU gets squashed, RabbitMQ starts to look like it is offline, and NServiceBus tries its hardest to reconnect, which crushes RabbitMQ even more.

Then again, this could be a wildly unfounded hypothesis, but I'm going to split this massive message into smaller messages and see whether CPU usage improves. If it does, then maybe the theory is on the right track.
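
For what it's worth, one way to do the splitting is to publish the 30,000 items in fixed-size batches that carry a correlation id and batch counters, so subscribers can tell when they have received everything. A rough sketch assuming an IBus instance and hypothetical message/DTO types (ItemDto, DataAggregatedBatch); the real contract and batch size will differ:

using System;
using System.Collections.Generic;
using System.Linq;
using NServiceBus;

// Hypothetical types for illustration only; the real message contract will differ.
class ItemDto { public int Id { get; set; } }

class DataAggregatedBatch : IEvent
{
    public Guid CorrelationId { get; set; }
    public int BatchNumber { get; set; }
    public int TotalBatches { get; set; }
    public List<ItemDto> Items { get; set; }
}

class AggregationJob
{
    const int BatchSize = 500; // tune so each message stays small on the wire

    public void PublishInBatches(IBus bus, List<ItemDto> allItems)
    {
        var correlationId = Guid.NewGuid();

        // Group the flat list into consecutive chunks of BatchSize items.
        var batches = allItems
            .Select((item, index) => new { item, index })
            .GroupBy(x => x.index / BatchSize, x => x.item)
            .ToList();

        foreach (var batch in batches)
        {
            bus.Publish(new DataAggregatedBatch
            {
                CorrelationId = correlationId,
                BatchNumber = batch.Key + 1,
                TotalBatches = batches.Count,
                Items = batch.ToList()
            });
        }
    }
}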

John Simons

Dec 4, 2014, 6:44:17 PM
to particula...@googlegroups.com
Hi Tim and Joseph,

Thanks a lot for the bug report.
We are working on it right now; any help you can provide to replicate the issue would be really appreciated.

Cheers
John

andreas.ohlund

Dec 5, 2014, 4:46:22 AM
to particula...@googlegroups.com
We've just released a patch for this. Please see the separate release announcement.
