So the 1-minute load (e.g. 17) is far greater than the mean runq (e.g.
(55+28)/60=1.4).
That's not what I expected, given the definition of load in terms of
the run queue on Wikipedia, cf.
http://en.wikipedia.org/wiki/Load_%28computing%29
"An idle computer has a load number of 0 and each process using or
waiting for CPU (the ready queue or run queue) increments the load
number by 1."
It is also said in the article that "However, Linux also includes
processes in uninterruptible sleep states (usually waiting for disk
activity), which can lead to markedly different results if many
processes remain blocked in I/O due to a busy or stalled I/O system.
(...) Such circumstances can result in an elevated load average"
But even if I add the 'b' processes to the 'r' processes, the load
average is still far greater.
I would be thankful if someone has a logical explanation for these
stats.
Hello Luis,
if I read correctly, you are summing up the first field (or the first
and second) of vmstat output for a minute and comparing it with the
uptime load average for a minute, and the former is bigger than the
latter.
At a quick glance, the wikipedia page gives you that processes
"running or waiting" are the ones being counted. In vmstat, the man
page defines r as giving you processes waiting for run time, and b as
giving you those sleeping waiting for IO. You seem to be missing those
actually running in your count, or the man page may be inaccurate.
Of course, if processes are missing, adding them would only *increase*
your count, and therefore increase your discrepancy.
But you are skipping over the math: while count_active_tasks is
actually counting processes in RUNNING or UNINTERRUPTIBLE as you are
trying to:
    for_each_task(p) {
            if ((p->state == TASK_RUNNING ||
                 (p->state & TASK_UNINTERRUPTIBLE)))
                    nr += FIXED_1;
    }
calc_load feeds the result of that count to the exponential damping
function (CALC_LOAD) that updates avenrun, and only the output of that
is the load average.
So, even if your count were correct, you would still need to feed it to
CALC_LOAD to get the load average that you are trying to match against
uptime's output.
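
To make the damping step concrete, here is a minimal user-space sketch of what CALC_LOAD does, assuming the 2.4-era constants (FSHIFT = 11, EXP_1 = 1884) and one sample every 5 seconds. The names calc_load_step and simulate_load1 are made up for illustration and are not kernel functions.

```c
/* Constants as I recall them from the 2.4-era include/linux/sched.h. */
#define FSHIFT   11                /* bits of fractional precision       */
#define FIXED_1  (1UL << FSHIFT)   /* 1.0 in fixed point                 */
#define EXP_1    1884UL            /* exp(-5s/60s) * FIXED_1, truncated  */

/* One CALC_LOAD step in fixed point:
 * load = load * exp + n * (1 - exp)
 * (hypothetical function wrapper around the macro's arithmetic). */
static unsigned long calc_load_step(unsigned long load,
                                    unsigned long exp_factor,
                                    unsigned long n_fixed)
{
    load *= exp_factor;
    load += n_fixed * (FIXED_1 - exp_factor);
    return load >> FSHIFT;
}

/* Hypothetical helper: run the 1-minute average over a series of
 * active-task counts, one count per 5-second sample, starting from 0. */
static double simulate_load1(const unsigned *active, int samples)
{
    unsigned long load = 0;
    for (int i = 0; i < samples; i++)
        load = calc_load_step(load, EXP_1, active[i] * FIXED_1);
    return (double)load / FIXED_1;
}
```

Each sample only moves the average by about 8% of the gap between the current value and the new count, so the result depends on the whole recent history of counts taken at the kernel's own sampling instants. That is why it need not match an arithmetic mean of the per-second vmstat columns.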
Hope this helps.
Best -Federico
PS: Load average is always a fun subject. Besides the Wikipedia page,
there is another interesting treatment (although not necessarily fully
up to date) on Linux Journal:
Hello Computing Performance, I just noticed that the kernel now has distributed logic for computing these numbers, and there is an update pending for the next kernel, too. It looks like the old code was too taxing, sampling-wise, for many-core CPUs.
The patch for the upcoming 3.5 kernel is here (RC7), for those interested.