We have an application (actually several, using the same framework)
that started to consume 100% CPU. (It worked normally for years until
now.)
It starts doing so only after running for several days. It consists of
several processes that mainly
do something like
for (;;) {
sleep or poll or select for 25 seconds;
do_little_task();
}
I used gdb and a stopwatch on several processes and found that each one
spends most of its time in sleep or select. The time between cycles is
still about 25 seconds, yet top and ps show 25-50% CPU utilization for
such a process.
After a restart, CPU utilization drops dramatically.
There are no memory problems: the virtual process size is the same after
a restart and after a long run.
Any ideas?
Best regards, Konstantin
PS
Red Hat Enterprise Linux AS release 4 (Nahant Update 7)
uname -a
Linux astrares.domainname.com 2.6.9-78.0.1.ELsmp #1 SMP Tue Jul 22
18:11:48 EDT 2008 i686 i686 i386 GNU/Linux
> We have an application (actually several, using the same framework)
> that started to consume 100% CPU. (It worked normally for years until
> now.)
> It starts doing so only after running for several days. It contains
> several processes, that mainly
> do something like
>
> for (;;) {
> sleep or poll or select for 25 seconds;
> do_little_task();
>
> }
Make absolutely sure that if you 'select' for something, you can do
that something when 'select' returns. If you don't do whatever it is
'select' told you that you could do, your next call to 'select' won't
block either.
If you're too lazy to fix it, you can hide the problem by putting a
100 ms sleep before your call to 'select'. If your application isn't
intended to be high performance, and you can tolerate that delay, it
will solve the CPU problem.
Much better to fix the problem correctly, of course. Don't 'select'
for an event you aren't going to handle.
DS
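To illustrate the advice above, here is a minimal C sketch of one loop
iteration that actually handles what 'select' reports. The helper name
wait_and_drain, the 4096-byte buffer, and the single-descriptor setup are
assumptions made for illustration; none of this is code from the
application discussed in the thread. It rebuilds the fd_set and the timeout
on every call (on Linux, select modifies both), reads the descriptor when it
is reported readable, and reports end-of-file separately, because a
descriptor at EOF stays readable and would make every later 'select'
return immediately.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_sec seconds for fd to become
 * readable, then consume the pending data.
 * Returns the number of bytes read (> 0), 0 on timeout or EINTR,
 * and -1 on error or end-of-file.
 */
static long wait_and_drain(int fd, int timeout_sec)
{
    fd_set rfds;
    struct timeval tv;
    char buf[4096];
    int ready;
    ssize_t n;

    /* select() can only watch descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    /* Rebuild the set and the timeout each time: select() overwrites them. */
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    ready = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ready < 0)
        return (errno == EINTR) ? 0 : -1; /* interrupted: treat like a timeout */
    if (ready == 0)
        return 0;                         /* timeout, nothing to handle */

    /* select() said fd is readable, so read it; otherwise the next
       select() call would return at once and the loop would spin. */
    n = read(fd, buf, sizeof buf);
    if (n <= 0)
        return -1; /* EOF or error: caller should close fd and stop watching it */
    return (long)n;
}
```

In the loop from the original post this would be called as
wait_and_drain(fd, 25) before do_little_task(). On a return value of -1 the
caller should close the descriptor and stop passing it to 'select'.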
The problem was a hidden thread created by Oracle after we changed
the database connection options.
Some problem on the server (still unclear) then caused that thread to run
at 100% CPU.
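For anyone who hits the same symptom (the thread you inspect is asleep, yet
the process shows high CPU), per-thread CPU counters can point at the
culprit. Below is a rough C sketch, assuming Linux with /proc mounted and a
2.6 or later kernel that exposes /proc/<pid>/task. The helper name
print_thread_cpu is hypothetical and is not part of Oracle or of our
framework. It reads utime and stime (fields 14 and 15 of each thread's stat
file, in clock ticks) and prints their sum per thread.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h> /* getpid(), used in the usage note below */

/*
 * Hypothetical helper: print user+system CPU time (in clock ticks) for
 * every thread of process pid. Returns the number of threads reported,
 * or -1 if /proc/<pid>/task cannot be opened.
 */
static int print_thread_cpu(pid_t pid)
{
    char path[64], statpath[512], line[1024];
    struct dirent *de;
    DIR *dir;
    int count = 0;

    snprintf(path, sizeof path, "/proc/%ld/task", (long)pid);
    dir = opendir(path);
    if (dir == NULL)
        return -1;

    while ((de = readdir(dir)) != NULL) {
        unsigned long utime, stime;
        FILE *fp;
        char *p;

        if (de->d_name[0] == '.')
            continue; /* skip "." and ".." */

        snprintf(statpath, sizeof statpath, "%s/%s/stat", path, de->d_name);
        fp = fopen(statpath, "r");
        if (fp == NULL)
            continue; /* the thread may have exited meanwhile */

        /* The command name (field 2) may contain spaces, so parse from
           the last ')'. Fields 3..13 are skipped; 14 and 15 are kept. */
        if (fgets(line, sizeof line, fp) != NULL &&
            (p = strrchr(line, ')')) != NULL &&
            sscanf(p + 1,
                   " %*c %*d %*d %*d %*d %*d %*lu %*lu %*lu %*lu %*lu %lu %lu",
                   &utime, &stime) == 2) {
            printf("tid %s: %lu ticks\n", de->d_name, utime + stime);
            count++;
        }
        fclose(fp);
    }
    closedir(dir);
    return count;
}
```

Calling print_thread_cpu(getpid()) from inside the process, or passing the
pid of another process you are allowed to inspect, twice a few seconds apart
shows which thread's tick count keeps growing while the main loop sleeps.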